linux-mm.kvack.org archive mirror
* [PATCH v11 0/8] mm: folio_zero_user: clear page ranges
@ 2026-01-07  7:20 Ankur Arora
  2026-01-07  7:20 ` [PATCH v11 1/8] treewide: provide a generic clear_user_page() variant Ankur Arora
                   ` (8 more replies)
  0 siblings, 9 replies; 21+ messages in thread
From: Ankur Arora @ 2026-01-07  7:20 UTC (permalink / raw)
  To: linux-kernel, linux-mm, x86
  Cc: akpm, david, bp, dave.hansen, hpa, mingo, mjguzik, luto, peterz,
	tglx, willy, raghavendra.kt, chleroy, ioworker0, lizhe.67,
	boris.ostrovsky, konrad.wilk, ankur.a.arora

Hi,

This series adds clearing of contiguous page ranges for hugepages.

Major changes over v10: none in logic. On Andrew's prompting,
however, this version adds a lot more detail about the causes of
the increased performance, some of which was handwaved away in
earlier versions.

More detail in the changelog.

The series improves on the current discontiguous clearing approach in
two ways:

  - clear pages in a contiguous fashion.
  - use batched clearing via clear_pages() wherever exposed.

The first is useful because it allows us to make much better use
of hardware prefetchers.

The second enables advertising the real extent to the processor.
Where specific instructions support it (e.g. string instructions on
x86, "mops" on arm64), the processor can optimize based on this
because, instead of seeing a sequence of 8-byte stores or a sequence
of 4KB pages, it sees a larger unit being operated on.

For instance, AMD Zen uarchs (for extents larger than LLC-size) switch
to a mode where they start eliding cacheline allocation. This is
helpful not just because it results in higher bandwidth, but also
because now the cache is not evicting useful cachelines and replacing
them with zeroes.

Demand faulting a 64GB region shows performance improvement:

 $ perf bench mem mmap -p $pg-sz -f demand -s 64GB -l 5

                       baseline              +series
                   (GBps +- %stdev)      (GBps +- %stdev)

   pg-sz=2MB       11.76 +- 1.10%        25.34 +- 1.18% [*]   +115.47%  	preempt=*

   pg-sz=1GB       24.85 +- 2.41%        39.22 +- 2.32%       + 57.82%  	preempt=none|voluntary
   pg-sz=1GB         (similar)           52.73 +- 0.20% [#]   +112.19%  	preempt=full|lazy

 [*] This improvement is because switching to sequential clearing
  allows the hardware prefetchers to do a much better job.

 [#] For pg-sz=1GB a large part of the improvement is because of the
  cacheline elision mentioned above. preempt=full|lazy improves upon
  that because, not needing explicit invocations of cond_resched() to
  ensure reasonable preemption latency, it can clear the full extent
  as a single unit. In comparison the maximum extent used for
  preempt=none|voluntary is PROCESS_PAGES_NON_PREEMPT_BATCH (32MB).

  When provided the full extent the processor forgoes allocating
  cachelines on this path almost entirely.

  (The hope is that eventually, in the fullness of time, the lazy
   preemption model will be able to do the same job that none or
   voluntary models are used for, allowing us to do away with
   cond_resched().)

Raghavendra also tested a previous version of the series on AMD Genoa
and saw a similar improvement [1] with preempt=lazy.

  $ perf bench mem mmap -p $page-size -f populate -s 64GB -l 10

                    base               patched              change
   pg-sz=2MB       12.731939 GB/sec    26.304263 GB/sec     106.6%
   pg-sz=1GB       26.232423 GB/sec    61.174836 GB/sec     133.2%

Comments appreciated!

Also at:
  github.com/terminus/linux clear-pages.v11

Thanks
Ankur

Changelog:

v11:
  - folio_zero_user(): unified the special casing of the gigantic page
    with the hugetlb handling. Plus cleanups.
  - highmem: unify clear_user_highpages() changes.

   (Both suggested by David Hildenbrand).

  - split patch "mm, folio_zero_user: support clearing page ranges"
    from v10 into two separate patches:

      - patch-6 "mm: folio_zero_user: clear pages sequentially", which
        switches to doing sequential clearing from process_huge_pages().

      - patch-7: "mm: folio_zero_user: clear page ranges", which
        switches to clearing in batches.

  - PROCESS_PAGES_NON_PREEMPT_BATCH: define it as 32MB instead of the
    earlier 8MB.

    (Both of these came out of a discussion with Andrew Morton.)

  (https://lore.kernel.org/lkml/20251215204922.475324-1-ankur.a.arora@oracle.com/)

v10:
 - Condition the definition of clear_user_page(), clear_user_pages()
   on the architecture code defining clear_user_highpage(). This
   sidesteps issues with architectures that do not define
   clear_user_page() but do define clear_user_highpage().

   Also, instead of splitting them up across files, move them to
   linux/highmem.h. This gets rid of build errors when using
   clear_user_pages() on architectures that use macro magic (such as
   sparc, m68k).
   (Suggested by Christophe Leroy.)

 - flesh out some of the comments around the x86 clear_pages()
   definition (Suggested by Borislav Petkov and Mateusz Guzik.)
 (https://lore.kernel.org/lkml/20251121202352.494700-1-ankur.a.arora@oracle.com/)

v9:
 - Define PROCESS_PAGES_NON_PREEMPT_BATCH in common code (instead of
   inheriting ARCH_PAGE_CONTIG_NR.)
    - Also document this in much greater detail, since clearing pages
      needing a constant dependent on the preemption model is
      facially quite odd.
     (Suggested by David Hildenbrand, Andrew Morton, Borislav Petkov.)

 - Switch architectural markers from __HAVE_ARCH_CLEAR_USER_PAGE (and
   similar) to clear_user_page etc. (Suggested by David Hildenbrand)

 - s/memzero_page_aligned_unrolled/__clear_pages_unrolled/
   (Suggested by Borislav Petkov.)
 - style, comment fixes
 (https://lore.kernel.org/lkml/20251027202109.678022-1-ankur.a.arora@oracle.com/)

v8:
 - make clear_user_highpages(), clear_user_pages() and clear_pages()
   more robust across architectures. (Thanks David!)
 - split up folio_zero_user() changes into ones for clearing contiguous
   regions and those for maintaining temporal locality since they have
   different performance profiles (Suggested by Andrew Morton.)
 - added Raghavendra's Reviewed-by, Tested-by.
 - get rid of nth_page()
 - perf related patches have been pulled already. Remove them.

v7:
 - interface cleanups, comments for clear_user_highpages(), clear_user_pages(),
   clear_pages().
 - fixed build errors flagged by kernel test robot
 (https://lore.kernel.org/lkml/20250917152418.4077386-1-ankur.a.arora@oracle.com/)

v6:
 - perf bench mem: update man pages and other cleanups (Namhyung Kim)
 - unify folio_zero_user() for HIGHMEM, !HIGHMEM options instead of
   working through a new config option (David Hildenbrand).
   - cleanups and simplification around that.
 (https://lore.kernel.org/lkml/20250902080816.3715913-1-ankur.a.arora@oracle.com/)

v5:
 - move the non HIGHMEM implementation of folio_zero_user() from x86
   to common code (Dave Hansen)
 - Minor naming cleanups, commit messages etc
 (https://lore.kernel.org/lkml/20250710005926.1159009-1-ankur.a.arora@oracle.com/)

v4:
 - adds perf bench workloads to exercise mmap() populate/demand-fault (Ingo)
 - inline stosb etc (PeterZ)
 - handle cooperative preemption models (Ingo)
 - interface and other cleanups all over (Ingo)
 (https://lore.kernel.org/lkml/20250616052223.723982-1-ankur.a.arora@oracle.com/)

v3:
 - get rid of preemption dependency (TIF_ALLOW_RESCHED); this version
   was limited to preempt=full|lazy.
 - override folio_zero_user() (Linus)
 (https://lore.kernel.org/lkml/20250414034607.762653-1-ankur.a.arora@oracle.com/)

v2:
 - addressed review comments from peterz, tglx.
 - Removed clear_user_pages(), and CONFIG_X86_32:clear_pages()
 - General code cleanup
 (https://lore.kernel.org/lkml/20230830184958.2333078-1-ankur.a.arora@oracle.com/)

[1] https://lore.kernel.org/lkml/fffd4dad-2cb9-4bc9-8a80-a70be687fd54@amd.com/

Ankur Arora (7):
  mm: introduce clear_pages() and clear_user_pages()
  highmem: introduce clear_user_highpages()
  x86/mm: Simplify clear_page_*
  x86/clear_page: Introduce clear_pages()
  mm: folio_zero_user: clear pages sequentially
  mm: folio_zero_user: clear page ranges
  mm: folio_zero_user: cache neighbouring pages

David Hildenbrand (1):
  treewide: provide a generic clear_user_page() variant

 arch/alpha/include/asm/page.h      |  1 -
 arch/arc/include/asm/page.h        |  2 +
 arch/arm/include/asm/page-nommu.h  |  1 -
 arch/arm64/include/asm/page.h      |  1 -
 arch/csky/abiv1/inc/abi/page.h     |  1 +
 arch/csky/abiv2/inc/abi/page.h     |  7 ---
 arch/hexagon/include/asm/page.h    |  1 -
 arch/loongarch/include/asm/page.h  |  1 -
 arch/m68k/include/asm/page_no.h    |  1 -
 arch/microblaze/include/asm/page.h |  1 -
 arch/mips/include/asm/page.h       |  1 +
 arch/nios2/include/asm/page.h      |  1 +
 arch/openrisc/include/asm/page.h   |  1 -
 arch/parisc/include/asm/page.h     |  1 -
 arch/powerpc/include/asm/page.h    |  1 +
 arch/riscv/include/asm/page.h      |  1 -
 arch/s390/include/asm/page.h       |  1 -
 arch/sparc/include/asm/page_64.h   |  1 +
 arch/um/include/asm/page.h         |  1 -
 arch/x86/include/asm/page.h        |  6 --
 arch/x86/include/asm/page_32.h     |  6 ++
 arch/x86/include/asm/page_64.h     | 76 ++++++++++++++++++-----
 arch/x86/lib/clear_page_64.S       | 39 +++---------
 arch/xtensa/include/asm/page.h     |  1 -
 include/linux/highmem.h            | 98 +++++++++++++++++++++++++++++-
 include/linux/mm.h                 | 56 +++++++++++++++++
 mm/memory.c                        | 75 +++++++++++++++++------
 27 files changed, 290 insertions(+), 93 deletions(-)

-- 
2.31.1




Thread overview: 21+ messages (newest: 2026-01-08  6:22 UTC)
2026-01-07  7:20 [PATCH v11 0/8] mm: folio_zero_user: clear page ranges Ankur Arora
2026-01-07  7:20 ` [PATCH v11 1/8] treewide: provide a generic clear_user_page() variant Ankur Arora
2026-01-07  7:20 ` [PATCH v11 2/8] mm: introduce clear_pages() and clear_user_pages() Ankur Arora
2026-01-07 22:06   ` David Hildenbrand (Red Hat)
2026-01-07  7:20 ` [PATCH v11 3/8] highmem: introduce clear_user_highpages() Ankur Arora
2026-01-07 22:08   ` David Hildenbrand (Red Hat)
2026-01-08  6:10     ` Ankur Arora
2026-01-07  7:20 ` [PATCH v11 4/8] x86/mm: Simplify clear_page_* Ankur Arora
2026-01-07  7:20 ` [PATCH v11 5/8] x86/clear_page: Introduce clear_pages() Ankur Arora
2026-01-07  7:20 ` [PATCH v11 6/8] mm: folio_zero_user: clear pages sequentially Ankur Arora
2026-01-07 22:10   ` David Hildenbrand (Red Hat)
2026-01-07  7:20 ` [PATCH v11 7/8] mm: folio_zero_user: clear page ranges Ankur Arora
2026-01-07 22:16   ` David Hildenbrand (Red Hat)
2026-01-08  0:44     ` Ankur Arora
2026-01-08  0:43   ` [PATCH] mm: folio_zero_user: (fixup) cache neighbouring pages Ankur Arora
2026-01-08  0:53     ` Ankur Arora
2026-01-08  6:04   ` [PATCH] mm: folio_zero_user: (fixup) cache page ranges Ankur Arora
2026-01-07  7:20 ` [PATCH v11 8/8] mm: folio_zero_user: cache neighbouring pages Ankur Arora
2026-01-07 22:18   ` David Hildenbrand (Red Hat)
2026-01-07 18:09 ` [PATCH v11 0/8] mm: folio_zero_user: clear page ranges Andrew Morton
2026-01-08  6:21   ` Ankur Arora
