linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Ankur Arora <ankur.a.arora@oracle.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	mingo@redhat.com, luto@kernel.org, peterz@infradead.org,
	paulmck@kernel.org, rostedt@goodmis.org, tglx@linutronix.de,
	willy@infradead.org, jon.grimm@amd.com, bharata@amd.com,
	raghavendra.kt@amd.com, boris.ostrovsky@oracle.com,
	konrad.wilk@oracle.com, ankur.a.arora@oracle.com
Subject: [PATCH v3 0/4] mm/folio_zero_user: add multi-page clearing
Date: Sun, 13 Apr 2025 20:46:03 -0700	[thread overview]
Message-ID: <20250414034607.762653-1-ankur.a.arora@oracle.com> (raw)

This series adds multi-page clearing for hugepages. It is a rework
of [1] which took a detour through PREEMPT_LAZY [2].

Why multi-page clearing?: multi-page clearing improves upon the
current page-at-a-time approach by providing the processor with a
hint as to the real region size. A processor could use this hint to,
for instance, elide cacheline allocation when clearing a large
region.

This optimization in particular is done by REP; STOS on AMD Zen
where regions larger than L3-size use non-temporal stores.

This results in significantly better performance. 

We also see performance improvement for cases where this optimization is
unavailable (pg-sz=2MB on AMD, and pg-sz=2MB|1GB on Intel) because
REP; STOS is typically microcoded which can now be amortized over
larger regions and the hint allows the hardware prefetcher to do a
better job.

Milan (EPYC 7J13, boost=0, preempt=full|lazy):

                 mm/folio_zero_user    x86/folio_zero_user     change
                  (GB/s  +- stddev)      (GB/s  +- stddev)

  pg-sz=1GB       16.51  +- 0.54%        42.80  +-  3.48%    + 159.2%
  pg-sz=2MB       11.89  +- 0.78%        16.12  +-  0.12%    +  35.5%

Icelakex (Platinum 8358, no_turbo=1, preempt=full|lazy):

                 mm/folio_zero_user    x86/folio_zero_user     change
                  (GB/s +- stddev)      (GB/s +- stddev)

  pg-sz=1GB       8.01  +- 0.24%        11.26 +- 0.48%       + 40.57%
  pg-sz=2MB       7.95  +- 0.30%        10.90 +- 0.26%       + 37.10%

Interaction with preemption: as discussed in [3], zeroing large
regions with string instructions doesn't work well with cooperative
preemption models which need regular invocations of cond_resched(). So,
this optimization is limited to only preemptible models (full, lazy).

This is done by overriding __folio_zero_user() -- which does the usual
page-at-a-time zeroing -- by an architecture optimized version but
only when running under preemptible models.
As such this ties an architecture specific optimization too closely
to preemption. Should be easy enough to change but seemed like the
simplest approach.

Comments appreciated!

Also at:
  github.com/terminus/linux clear-pages-preempt.v1


[1] https://lore.kernel.org/lkml/20230830184958.2333078-1-ankur.a.arora@oracle.com/
[2] https://lore.kernel.org/lkml/87cyyfxd4k.ffs@tglx/
[3] https://lore.kernel.org/lkml/CAHk-=wj9En-BC4t7J9xFZOws5ShwaR9yor7FxHZr8CTVyEP_+Q@mail.gmail.com/

Ankur Arora (4):
  x86/clear_page: extend clear_page*() for multi-page clearing
  x86/clear_page: add clear_pages()
  huge_page: allow arch override for folio_zero_user()
  x86/folio_zero_user: multi-page clearing

 arch/x86/include/asm/page_32.h |  6 ++++
 arch/x86/include/asm/page_64.h | 27 +++++++++------
 arch/x86/lib/clear_page_64.S   | 52 +++++++++++++++++++++--------
 arch/x86/mm/Makefile           |  1 +
 arch/x86/mm/memory.c           | 60 ++++++++++++++++++++++++++++++++++
 include/linux/mm.h             |  1 +
 mm/memory.c                    | 38 ++++++++++++++++++---
 7 files changed, 156 insertions(+), 29 deletions(-)
 create mode 100644 arch/x86/mm/memory.c

-- 
2.31.1



             reply	other threads:[~2025-04-14  3:46 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-14  3:46 Ankur Arora [this message]
2025-04-14  3:46 ` [PATCH v3 1/4] x86/clear_page: extend clear_page*() for " Ankur Arora
2025-04-14  6:32   ` Ingo Molnar
2025-04-14 11:02     ` Peter Zijlstra
2025-04-14 11:14       ` Ingo Molnar
2025-04-14 19:46       ` Ankur Arora
2025-04-14 22:26       ` Mateusz Guzik
2025-04-15  6:14         ` Ankur Arora
2025-04-15  8:22           ` Mateusz Guzik
2025-04-15 20:01             ` Ankur Arora
2025-04-15 20:32               ` Mateusz Guzik
2025-04-14 19:52     ` Ankur Arora
2025-04-14 20:09       ` Matthew Wilcox
2025-04-15 21:59         ` Ankur Arora
2025-04-14  3:46 ` [PATCH v3 2/4] x86/clear_page: add clear_pages() Ankur Arora
2025-04-14  3:46 ` [PATCH v3 3/4] huge_page: allow arch override for folio_zero_user() Ankur Arora
2025-04-14  3:46 ` [PATCH v3 4/4] x86/folio_zero_user: multi-page clearing Ankur Arora
2025-04-14  6:53   ` Ingo Molnar
2025-04-14 21:21     ` Ankur Arora
2025-04-14  7:05   ` Ingo Molnar
2025-04-15  6:36     ` Ankur Arora
2025-04-22  6:36     ` Raghavendra K T
2025-04-22 19:14       ` Ankur Arora
2025-04-15 10:16   ` Mateusz Guzik
2025-04-15 21:46     ` Ankur Arora
2025-04-15 22:01       ` Mateusz Guzik
2025-04-16  4:46         ` Ankur Arora
2025-04-17 14:06           ` Mateusz Guzik
2025-04-14  5:34 ` [PATCH v3 0/4] mm/folio_zero_user: add " Ingo Molnar
2025-04-14 19:30   ` Ankur Arora
2025-04-14  6:36 ` Ingo Molnar
2025-04-14 19:19   ` Ankur Arora
2025-04-15 19:10 ` Zi Yan
2025-04-22 19:32   ` Ankur Arora
2025-04-22  6:23 ` Raghavendra K T
2025-04-22 19:22   ` Ankur Arora
2025-04-23  8:12     ` Raghavendra K T
2025-04-23  9:18       ` Raghavendra K T

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250414034607.762653-1-ankur.a.arora@oracle.com \
    --to=ankur.a.arora@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=bharata@amd.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jon.grimm@amd.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=raghavendra.kt@amd.com \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox