From: Yu Zhao <yuzhao@google.com>
To: Yin Fengwei <fengwei.yin@intel.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
akpm@linux-foundation.org, willy@infradead.org,
david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Subject: Re: [RFC PATCH v2 3/3] mm: mlock: update mlock_pte_range to handle large folio
Date: Wed, 12 Jul 2023 00:31:54 -0600
Message-ID: <CAOUHufYef--8MxFettL6fOGjVx2vyZHZQU6EEaTCoW0XBvuC8Q@mail.gmail.com>
In-Reply-To: <20230712060144.3006358-4-fengwei.yin@intel.com>
On Wed, Jul 12, 2023 at 12:02 AM Yin Fengwei <fengwei.yin@intel.com> wrote:
>
> Current kernel only lock base size folio during mlock syscall.
> Add large folio support with following rules:
> - Only mlock large folio when it's in VM_LOCKED VMA range
>
> - If there is cow folio, mlock the cow folio as cow folio
> is also in VM_LOCKED VMA range.
>
> - munlock will apply to the large folio which is in VMA range
> or cross the VMA boundary.
>
> The last rule is used to handle the case that the large folio is
> mlocked, later the VMA is split in the middle of large folio
> and this large folio become cross VMA boundary.
>
> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
> ---
> mm/mlock.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 99 insertions(+), 5 deletions(-)
>
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 0a0c996c5c214..f49e079066870 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -305,6 +305,95 @@ void munlock_folio(struct folio *folio)
> local_unlock(&mlock_fbatch.lock);
> }
>
> +static inline bool should_mlock_folio(struct folio *folio,
> + struct vm_area_struct *vma)
> +{
> + if (vma->vm_flags & VM_LOCKED)
> + return (!folio_test_large(folio) ||
> + folio_within_vma(folio, vma));
> +
> + /*
> + * For unlock, allow munlock large folio which is partially
> + * mapped to VMA. As it's possible that large folio is
> + * mlocked and VMA is split later.
> + *
> + * During memory pressure, such kind of large folio can
> + * be split. And the pages are not in VM_LOCKed VMA
> + * can be reclaimed.
> + */
> +
> + return true;
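
Looks good, or just

static inline bool should_mlock_folio(struct folio *folio,
				      struct vm_area_struct *vma)
{
	/*
	 * Or whatever name you see fit, e.g. can_mlock_folio().
	 * Assumes folio_within_vma() is always true for a base-size
	 * folio mapped in this VMA.
	 */
	return !(vma->vm_flags & VM_LOCKED) || folio_within_vma(folio, vma);
}
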
> +}
> +
> +static inline unsigned int get_folio_mlock_step(struct folio *folio,
> + pte_t pte, unsigned long addr, unsigned long end)
> +{
> + unsigned int nr;
> +
> + nr = folio_pfn(folio) + folio_nr_pages(folio) - pte_pfn(pte);
> + return min_t(unsigned int, nr, (end - addr) >> PAGE_SHIFT);
> +}
> +
> +void mlock_folio_range(struct folio *folio, struct vm_area_struct *vma,
> + pte_t *pte, unsigned long addr, unsigned int nr)
> +{
> + struct folio *cow_folio;
> + unsigned int step = 1;
> +
> + mlock_folio(folio);
> + if (nr == 1)
> + return;
> +
> + for (; nr > 0; pte += step, addr += (step << PAGE_SHIFT), nr -= step) {
> + pte_t ptent;
> +
> + step = 1;
> + ptent = ptep_get(pte);
> +
> + if (!pte_present(ptent))
> + continue;
> +
> + cow_folio = vm_normal_folio(vma, addr, ptent);
> + if (!cow_folio || cow_folio == folio) {
> + continue;
> + }
> +
> + mlock_folio(cow_folio);
> + step = get_folio_mlock_step(folio, ptent,
> + addr, addr + (nr << PAGE_SHIFT));
> + }
> +}
> +
> +void munlock_folio_range(struct folio *folio, struct vm_area_struct *vma,
> + pte_t *pte, unsigned long addr, unsigned int nr)
> +{
> + struct folio *cow_folio;
> + unsigned int step = 1;
> +
> + munlock_folio(folio);
> + if (nr == 1)
> + return;
> +
> + for (; nr > 0; pte += step, addr += (step << PAGE_SHIFT), nr -= step) {
> + pte_t ptent;
> +
> + step = 1;
> + ptent = ptep_get(pte);
> +
> + if (!pte_present(ptent))
> + continue;
> +
> + cow_folio = vm_normal_folio(vma, addr, ptent);
> + if (!cow_folio || cow_folio == folio) {
> + continue;
> + }
> +
> + munlock_folio(cow_folio);
> + step = get_folio_mlock_step(folio, ptent,
> + addr, addr + (nr << PAGE_SHIFT));
> + }
> +}

I'll finish the above later.

> static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
> unsigned long end, struct mm_walk *walk)
>
> @@ -314,6 +403,7 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
> pte_t *start_pte, *pte;
> pte_t ptent;
> struct folio *folio;
> + unsigned int step = 1;
>
> ptl = pmd_trans_huge_lock(pmd, vma);
> if (ptl) {
> @@ -329,24 +419,28 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
> goto out;
> }
>
> - start_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> + pte = start_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> if (!start_pte) {
> walk->action = ACTION_AGAIN;
> return 0;
> }
> - for (pte = start_pte; addr != end; pte++, addr += PAGE_SIZE) {
> +
> + for (; addr != end; pte += step, addr += (step << PAGE_SHIFT)) {
> + step = 1;
> ptent = ptep_get(pte);
> if (!pte_present(ptent))
> continue;
> folio = vm_normal_folio(vma, addr, ptent);
> if (!folio || folio_is_zone_device(folio))
> continue;
> - if (folio_test_large(folio))
> + if (!should_mlock_folio(folio, vma))
> continue;
> +
> + step = get_folio_mlock_step(folio, ptent, addr, end);
> if (vma->vm_flags & VM_LOCKED)
> - mlock_folio(folio);
> + mlock_folio_range(folio, vma, pte, addr, step);
> else
> - munlock_folio(folio);
> + munlock_folio_range(folio, vma, pte, addr, step);
> }
> pte_unmap(start_pte);
> out:

Looks good.