From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Baolin Wang <baolin.wang@linux.alibaba.com>, akpm@linux-foundation.org
Cc: catalin.marinas@arm.com, will@kernel.org,
lorenzo.stoakes@oracle.com, ryan.roberts@arm.com,
Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
surenb@google.com, mhocko@suse.com, riel@surriel.com,
harry.yoo@oracle.com, jannh@google.com, willy@infradead.org,
baohua@kernel.org, dev.jain@arm.com, axelrasmussen@google.com,
yuanchu@google.com, weixugc@google.com, hannes@cmpxchg.org,
zhengqi.arch@bytedance.com, shakeel.butt@linux.dev,
linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/5] mm: add a batched helper to clear the young flag for large folios
Date: Wed, 25 Feb 2026 15:04:17 +0100
Message-ID: <d172d6bf-c60c-4cf5-9da9-f30de38cdfed@kernel.org>
In-Reply-To: <bfbe28e381b02452b455498e7ea82662e83a3865.1771897150.git.baolin.wang@linux.alibaba.com>

On 2/24/26 02:56, Baolin Wang wrote:
> Currently, MGLRU will call ptep_clear_young_notify() to check and clear the
> young flag for each PTE sequentially, which is inefficient for large folios
> reclamation.
>
> Moreover, on Arm64 architecture, which supports contiguous PTEs, the Arm64-
> specific ptep_test_and_clear_young() already implements an optimization to
> clear the young flags for PTEs within a contiguous range. However, this is not
> sufficient. Similar to the Arm64 specific clear_flush_young_ptes(), we can
> extend this to perform batched operations for the entire large folio (which
> might exceed the contiguous range: CONT_PTE_SIZE).
>
> Thus, we can introduce a new batched helper: test_and_clear_young_ptes() and
> its wrapper clear_young_ptes_notify(), to perform batched checking of the young
> flags for large folios, which can help improve performance during large folio
> reclamation when MGLRU is enabled. And it will be overridden by the architecture
> that implements a more efficient batch operation in the following patches.
>

Maybe mention in the patch description that the implementation follows the
other existing functions.

> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> include/linux/pgtable.h | 36 ++++++++++++++++++++++++++++++++++++
> mm/internal.h | 23 ++++++++++++++++++-----
> 2 files changed, 54 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 776993d4567b..0bcd3be524d3 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -1103,6 +1103,42 @@ static inline int clear_flush_young_ptes(struct vm_area_struct *vma,
> }
> #endif
>
> +#ifndef test_and_clear_young_ptes
> +/**
> + * test_and_clear_young_ptes - Mark PTEs that map consecutive pages of the same
> + * folio as old
> + * @vma: The virtual memory area the pages are mapped into.
> + * @addr: Address the first page is mapped at.
> + * @ptep: Page table pointer for the first entry.
> + * @nr: Number of entries to clear access bit.
> + *
> + * May be overridden by the architecture; otherwise, implemented as a simple
> + * loop over ptep_test_and_clear_young().
> + *
> + * Note that PTE bits in the PTE range besides the PFN can differ. For example,
> + * some PTEs might be write-protected.

Document the return value?

Returns: whether any PTE was young.

Or something like that.

> + *
> + * Context: The caller holds the page table lock. The PTEs map consecutive
> + * pages that belong to the same folio. The PTEs are all in the same PMD.
> + */
> +static inline int test_and_clear_young_ptes(struct vm_area_struct *vma,
> + unsigned long addr, pte_t *ptep,
> + unsigned int nr)

Please indent the continuation lines with two tabs ...

> +{
> + int young = 0;
> +
> + for (;;) {
> + young |= ptep_test_and_clear_young(vma, addr, ptep);
> + if (--nr == 0)
> + break;
> + ptep++;
> + addr += PAGE_SIZE;
> + }
> +
> + return young;

BTW: can this function simply return (and use) a bool instead?

Likely we should do the same for the other functions, but that can be
done separately.

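For illustration only, a rough userspace sketch of what a bool-returning
variant could look like. It is not compiled against the kernel tree:
mock_pte_t and the mock_* helpers are made-up stand-ins that model nothing
but the young bit, and the vma/addr parameters are dropped for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for pte_t: only the young (accessed) bit is modelled. */
typedef struct {
	bool young;
} mock_pte_t;

/* Mock of ptep_test_and_clear_young(): report the old young bit, then clear it. */
static bool mock_ptep_test_and_clear_young(mock_pte_t *ptep)
{
	bool young = ptep->young;

	ptep->young = false;
	return young;
}

/*
 * Batched variant returning bool. Same loop shape as in the patch; the
 * caller must pass nr >= 1, otherwise the unsigned counter would wrap.
 */
static bool mock_test_and_clear_young_ptes(mock_pte_t *ptep, unsigned int nr)
{
	bool young = false;

	assert(nr >= 1);
	for (;;) {
		young |= mock_ptep_test_and_clear_young(ptep);
		if (--nr == 0)
			break;
		ptep++;
	}

	return young;
}
```

The loop still visits every entry even after one young PTE was found, since
all of them have to be marked old.
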
> +}
> +#endif
> +
> /*
> * On some architectures hardware does not set page access bit when accessing
> * memory page, it is responsibility of software setting this bit. It brings
> diff --git a/mm/internal.h b/mm/internal.h
> index 1ba175b8d4f1..1b59be99dc3f 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1813,16 +1813,23 @@ static inline int pmdp_clear_flush_young_notify(struct vm_area_struct *vma,
> return young;
> }
>
> -static inline int ptep_clear_young_notify(struct vm_area_struct *vma,
> - unsigned long addr, pte_t *ptep)
> +static inline int clear_young_ptes_notify(struct vm_area_struct *vma,
> + unsigned long addr, pte_t *ptep,
> + unsigned int nr)
> {
> int young;
>
> - young = ptep_test_and_clear_young(vma, addr, ptep);
> - young |= mmu_notifier_clear_young(vma->vm_mm, addr, addr + PAGE_SIZE);
> + young = test_and_clear_young_ptes(vma, addr, ptep, nr);
> + young |= mmu_notifier_clear_young(vma->vm_mm, addr, addr + nr * PAGE_SIZE);
> return young;
> }
>
> +static inline int ptep_clear_young_notify(struct vm_area_struct *vma,
> + unsigned long addr, pte_t *ptep)
> +{
> + return clear_young_ptes_notify(vma, addr, ptep, 1);
> +}
> +
> static inline int pmdp_clear_young_notify(struct vm_area_struct *vma,
> unsigned long addr, pmd_t *pmdp)
> {
> @@ -1837,9 +1844,15 @@ static inline int pmdp_clear_young_notify(struct vm_area_struct *vma,
>
> #define clear_flush_young_ptes_notify clear_flush_young_ptes
> #define pmdp_clear_flush_young_notify pmdp_clear_flush_young
> -#define ptep_clear_young_notify ptep_test_and_clear_young
> +#define clear_young_ptes_notify test_and_clear_young_ptes
> #define pmdp_clear_young_notify pmdp_test_and_clear_young
>
> +static inline int ptep_clear_young_notify(struct vm_area_struct *vma,
> + unsigned long addr, pte_t *ptep)
> +{
> + return test_and_clear_young_ptes(vma, addr, ptep, 1);
> +}

Why not have a single generic helper outside of the ifdef:

static inline int ptep_clear_young_notify(struct vm_area_struct *vma,
		unsigned long addr, pte_t *ptep)
{
	return clear_young_ptes_notify(vma, addr, ptep, 1);
}

Same comment regarding bool.

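To make the layout concrete, here is a small userspace mock of that
structure, purely as a sketch. MOCK_MMU_NOTIFIER, MOCK_PAGE_SIZE, mock_pte_t
and all mock_* functions are hypothetical names invented for this example;
only the shape (batched helper selected inside the ifdef, one single-entry
wrapper defined once after it) mirrors the suggestion.

```c
#include <stdbool.h>

#define MOCK_PAGE_SIZE 4096UL	/* assumed page size, illustration only */

/* Hypothetical stand-in for pte_t: only the young bit is modelled. */
typedef struct {
	bool young;
} mock_pte_t;

/* Mock batched helper: clear the young bit of nr entries (nr >= 1). */
static bool mock_test_and_clear_young_ptes(unsigned long addr,
		mock_pte_t *ptep, unsigned int nr)
{
	bool young = false;

	(void)addr;	/* the real helper would pass this on per PTE */
	for (;;) {
		young |= ptep->young;
		ptep->young = false;
		if (--nr == 0)
			break;
		ptep++;
	}

	return young;
}

#ifdef MOCK_MMU_NOTIFIER
/* Pretend secondary-MMU state, standing in for what a notifier would query. */
static bool mock_secondary_young;

static bool mock_mmu_notifier_clear_young(unsigned long start,
		unsigned long end)
{
	bool young = mock_secondary_young;

	(void)start;
	(void)end;
	mock_secondary_young = false;
	return young;
}

static bool mock_clear_young_ptes_notify(unsigned long addr,
		mock_pte_t *ptep, unsigned int nr)
{
	bool young;

	young = mock_test_and_clear_young_ptes(addr, ptep, nr);
	young |= mock_mmu_notifier_clear_young(addr, addr + nr * MOCK_PAGE_SIZE);
	return young;
}
#else
#define mock_clear_young_ptes_notify	mock_test_and_clear_young_ptes
#endif

/* Defined once, outside the ifdef: works with either definition above. */
static bool mock_ptep_clear_young_notify(unsigned long addr, mock_pte_t *ptep)
{
	return mock_clear_young_ptes_notify(addr, ptep, 1);
}
```

With this layout the single-entry wrapper does not need to be duplicated in
both branches.
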
--
Cheers,
David