From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: akpm@linux-foundation.org, david@kernel.org,
catalin.marinas@arm.com, will@kernel.org, ryan.roberts@arm.com,
Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
surenb@google.com, mhocko@suse.com, riel@surriel.com,
harry.yoo@oracle.com, jannh@google.com, willy@infradead.org,
baohua@kernel.org, dev.jain@arm.com, linux-mm@kvack.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 0/5] support batch checking of references and unmapping for large folios
Date: Fri, 16 Jan 2026 08:41:44 +0000
Message-ID: <c285f2fb-b64d-4932-b9ae-ef420097728e@lucifer.local>
In-Reply-To: <cover.1766631066.git.baolin.wang@linux.alibaba.com>
Andrew -
I know this has had a lot of attention, but can we hold off on sending this
upstream until either David or I have had a chance to review it?
Also note that Dev has discovered an issue with how this interacts with the
accursed uffd-wp logic (see [0]), so the series needs a respin anyway.
Thanks, Lorenzo
[0]: https://lore.kernel.org/linux-mm/20260116082721.275178-1-dev.jain@arm.com/
On Fri, Dec 26, 2025 at 02:07:54PM +0800, Baolin Wang wrote:
> Currently, folio_referenced_one() always checks the young flag for each PTE
> sequentially, which is inefficient for large folios. This inefficiency is
> especially noticeable when reclaiming clean file-backed large folios, where
> folio_referenced() is observed as a significant performance hotspot.
>
> Moreover, on Arm architecture, which supports contiguous PTEs, there is already
> an optimization to clear the young flags for PTEs within a contiguous range.
> However, this is not sufficient. We can extend this to perform batched operations
> for the entire large folio (which might exceed the contiguous range: CONT_PTE_SIZE).
>
> Similar to folio_referenced_one(), we can also apply batched unmapping for large
> file folios to optimize the performance of file folio reclamation. By supporting
> batched checking of the young flags, flushing TLB entries, and unmapping, I
> observed significant performance improvements in my tests of file folio
> reclamation. Please check the performance data in the commit message of
> each patch.
>
> Ran stress-ng and the mm selftests; no issues were found.
>
> Patch 1: Add a new generic batched PTE helper that supports batched checks of
> the references for large folios.
> Patch 2 - 3: Preparation patches.
> Patch 4: Implement the Arm64 arch-specific clear_flush_young_ptes().
> Patch 5: Support batched unmapping for file large folios.
>
> Changes from v4:
> - Fix passing the incorrect 'CONT_PTES' for non-batched APIs.
> - Rename ptep_clear_flush_young_notify() to clear_flush_young_ptes_notify() (per Ryan).
> - Fix some coding style issues (per Ryan).
> - Add reviewed tag from Ryan. Thanks.
>
> Changes from v3:
> - Fix using an incorrect parameter in ptep_clear_flush_young_notify()
> (per Liam).
>
> Changes from v2:
> - Rearrange the patch set (per Ryan).
> - Add pte_cont() check in clear_flush_young_ptes() (per Ryan).
> - Add a helper to do contpte block alignment (per Ryan).
> - Fix some coding style issues (per Lorenzo and Ryan).
> - Add more comments and update the commit message (per Lorenzo and Ryan).
> - Add acked tag from Barry. Thanks.
>
> Changes from v1:
> - Add a new patch to support batched unmapping for file large folios.
> - Update the cover letter
>
> Baolin Wang (5):
> mm: rmap: support batched checks of the references for large folios
> arm64: mm: factor out the address and ptep alignment into a new helper
> arm64: mm: support batch clearing of the young flag for large folios
> arm64: mm: implement the architecture-specific
> clear_flush_young_ptes()
> mm: rmap: support batched unmapping for file large folios
>
> arch/arm64/include/asm/pgtable.h | 23 ++++++++----
> arch/arm64/mm/contpte.c | 62 ++++++++++++++++++++------------
> include/linux/mmu_notifier.h | 9 ++---
> include/linux/pgtable.h | 31 ++++++++++++++++
> mm/rmap.c | 38 ++++++++++++++++----
> 5 files changed, 125 insertions(+), 38 deletions(-)
>
> --
> 2.47.3
>
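For anyone skimming the thread, the batching idea described in the quoted cover letter can be illustrated with a small userspace simulation. This is only a sketch under stated assumptions: the sim_* names, the single-bit young flag, and the flush counter are hypothetical stand-ins invented here, not the helpers added by the series, and the model ignores locking, MMU notifiers, and contpte handling entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not kernel definitions. */
#define SIM_PTE_YOUNG (UINT64_C(1) << 0)

typedef uint64_t sim_pte_t;

/* Counts simulated TLB flushes so the two strategies can be compared. */
static unsigned int sim_tlb_flushes;

static void sim_flush_tlb_range(size_t first, size_t nr)
{
	(void)first;
	(void)nr;
	sim_tlb_flushes++;
}

/* Per-PTE approach: test and clear one entry, flushing if it was young. */
static bool sim_clear_flush_young_one(sim_pte_t *ptes, size_t idx)
{
	bool young = ptes[idx] & SIM_PTE_YOUNG;

	if (young) {
		ptes[idx] &= ~SIM_PTE_YOUNG;
		sim_flush_tlb_range(idx, 1);
	}
	return young;
}

/* Batched approach: clear the young bit across nr entries, then flush once. */
static bool sim_clear_flush_young_batch(sim_pte_t *ptes, size_t nr)
{
	bool young = false;
	size_t i;

	assert(nr > 0);
	for (i = 0; i < nr; i++) {
		if (ptes[i] & SIM_PTE_YOUNG) {
			ptes[i] &= ~SIM_PTE_YOUNG;
			young = true;
		}
	}
	if (young)
		sim_flush_tlb_range(0, nr);
	return young;
}
```

In this model, calling sim_clear_flush_young_one() on each of 16 young entries issues 16 simulated flushes, while a single sim_clear_flush_young_batch() call over the same range issues one and leaves the entries in the same final state.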