From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Baolin Wang <baolin.wang@linux.alibaba.com>, akpm@linux-foundation.org
Cc: catalin.marinas@arm.com, will@kernel.org,
	lorenzo.stoakes@oracle.com, ryan.roberts@arm.com,
	Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, riel@surriel.com,
	harry.yoo@oracle.com, jannh@google.com, willy@infradead.org,
	baohua@kernel.org, dev.jain@arm.com, axelrasmussen@google.com,
	yuanchu@google.com, weixugc@google.com, hannes@cmpxchg.org,
	zhengqi.arch@bytedance.com, shakeel.butt@linux.dev,
	linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 6/6] arm64: mm: implement the architecture-specific test_and_clear_young_ptes()
Date: Fri, 6 Mar 2026 15:47:35 +0100
Message-ID: <6305e05e-2911-42b0-b6f5-7fdde787b778@kernel.org>
In-Reply-To: <7f891d42a720cc2e57862f3b79e4f774404f313c.1772778858.git.baolin.wang@linux.alibaba.com>

On 3/6/26 07:43, Baolin Wang wrote:
> Implement the Arm64 architecture-specific test_and_clear_young_ptes() to enable
> batched checking of young flags, improving performance during large folio
> reclamation when MGLRU is enabled.
> 
> While we're at it, simplify ptep_test_and_clear_young() by calling
> test_and_clear_young_ptes(). Since callers guarantee that PTEs are present
> before calling these functions, we can use pte_cont() to check the CONT_PTE
> flag instead of pte_valid_cont().
> 
> Performance testing:
> Enable MGLRU, then allocate 10G clean file-backed folios by mmap() in a memory
> cgroup, and try to reclaim 8G file-backed folios via the memory.reclaim interface.
> I can observe 60%+ performance improvement on my Arm64 32-core server (and about
> 15% improvement on my X86 machine).
> 
> W/o patchset:
> real	0m0.470s
> user	0m0.000s
> sys	0m0.470s
> 
> W/ patchset:
> real	0m0.180s
> user	0m0.001s
> sys	0m0.179s
> 
> Reviewed-by: Rik van Riel <riel@surriel.com>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
>  arch/arm64/include/asm/pgtable.h | 18 ++++++++++++------
>  1 file changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index aa4b13da6371..ab451d20e4c5 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1812,16 +1812,22 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>  	return __ptep_get_and_clear(mm, addr, ptep);
>  }
>  
> +#define test_and_clear_young_ptes test_and_clear_young_ptes
> +static inline int test_and_clear_young_ptes(struct vm_area_struct *vma,
> +					    unsigned long addr, pte_t *ptep,
> +					    unsigned int nr)
> +{
> +	if (likely(nr == 1 && !pte_cont(__ptep_get(ptep))))
> +		return __ptep_test_and_clear_young(vma, addr, ptep);
> +
> +	return contpte_test_and_clear_young_ptes(vma, addr, ptep, nr);
> +}

Thinking out loud: what would happen if

(a) the range spans multiple possible cont ranges (e.g., 64 PTEs), or

(b) the first PTE is !pte_cont(), but some of the others in the range are
    pte_cont()?
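
To make the two cases concrete, here is a minimal userspace sketch (not
kernel code) that models only the dispatch condition from the quoted hunk.
The names model_pte, model_path, model_dispatch(), model_case_a() and
model_case_b() are hypothetical and exist purely for illustration; the
sketch makes no claim about what contpte_test_and_clear_young_ptes() does
with the range once it is called.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PTE: only the CONT bit is modelled. */
struct model_pte {
	bool cont;
};

enum model_path {
	MODEL_PATH_SINGLE,	/* stands for __ptep_test_and_clear_young() */
	MODEL_PATH_CONTPTE,	/* stands for contpte_test_and_clear_young_ptes() */
};

/*
 * Mirrors only the condition in the quoted hunk:
 *	nr == 1 && !pte_cont(first pte)
 * Only the first entry is inspected; later entries never affect the choice.
 */
enum model_path model_dispatch(const struct model_pte *ptep, unsigned int nr)
{
	assert(nr >= 1);

	if (nr == 1 && !ptep[0].cont)
		return MODEL_PATH_SINGLE;

	return MODEL_PATH_CONTPTE;
}

/* Case (a): a 64-entry range in which every entry has the CONT bit set. */
enum model_path model_case_a(void)
{
	struct model_pte ptes[64];
	unsigned int i;

	for (i = 0; i < 64; i++)
		ptes[i].cont = true;

	return model_dispatch(ptes, 64);
}

/* Case (b): the first entry is not CONT, but later entries in the range are. */
enum model_path model_case_b(void)
{
	struct model_pte ptes[32];
	unsigned int i;

	for (i = 0; i < 32; i++)
		ptes[i].cont = (i >= 16);

	return model_dispatch(ptes, 32);
}
```

Both cases have nr > 1, so in this model they always take the contpte
path, and the CONT bit of the first entry only matters when nr == 1. The
open question is therefore entirely about how the contpte helper treats a
range that crosses several cont blocks or starts on a non-cont entry.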

-- 
Cheers,

David

