From: Raghavendra K T <raghavendra.kt@amd.com>
To: Ankur Arora <ankur.a.arora@oracle.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
mingo@redhat.com, luto@kernel.org, peterz@infradead.org,
paulmck@kernel.org, rostedt@goodmis.org, tglx@linutronix.de,
willy@infradead.org, jon.grimm@amd.com, bharata@amd.com,
boris.ostrovsky@oracle.com, konrad.wilk@oracle.com
Subject: Re: [PATCH v3 0/4] mm/folio_zero_user: add multi-page clearing
Date: Tue, 22 Apr 2025 11:53:06 +0530
Message-ID: <0d6ba41c-0c90-4130-896a-26eabbd5bd24@amd.com>
In-Reply-To: <20250414034607.762653-1-ankur.a.arora@oracle.com>
On 4/14/2025 9:16 AM, Ankur Arora wrote:
> This series adds multi-page clearing for hugepages. It is a rework
> of [1] which took a detour through PREEMPT_LAZY [2].
>
> Why multi-page clearing? Multi-page clearing improves upon the
> current page-at-a-time approach by providing the processor with a
> hint as to the real region size. A processor could use this hint to,
> for instance, elide cacheline allocation when clearing a large
> region.
>
> In particular, this optimization is done by REP; STOS on AMD Zen,
> where regions larger than the L3 size are cleared with non-temporal
> stores.
>
> This results in significantly better performance.
>
> We also see a performance improvement for cases where this
> optimization is unavailable (pg-sz=2MB on AMD, and pg-sz=2MB|1GB on
> Intel): REP; STOS is typically microcoded, and that overhead can now
> be amortized over larger regions, while the hint allows the hardware
> prefetcher to do a better job.
>
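To make the above concrete for readers skimming the thread, roughly
(a sketch only, assuming the clear_pages() helper that patch 2 of the
series adds; the call signatures here are illustrative, not the
actual patch code):

	unsigned long i;

	/* Before: page-at-a-time. Each clear_page() looks to the CPU
	 * like an independent 4K store stream, so a 1GB clear is
	 * indistinguishable from 262144 unrelated page clears. */
	for (i = 0; i < nr_pages; i++)
		clear_page(page_address(folio_page(folio, i)));

	/* After: one call covering the whole extent, so the REP; STOS
	 * microcode sees the real region size and, on AMD Zen, can use
	 * non-temporal stores for regions larger than L3. */
	clear_pages(page_address(folio_page(folio, 0)), nr_pages);
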
> Milan (EPYC 7J13, boost=0, preempt=full|lazy):
>
>                mm/folio_zero_user     x86/folio_zero_user     change
>                 (GB/s +- stddev)       (GB/s +- stddev)
>
>   pg-sz=1GB    16.51 +- 0.54%         42.80 +- 3.48%          +159.2%
>   pg-sz=2MB    11.89 +- 0.78%         16.12 +- 0.12%          + 35.5%
>
> Icelakex (Platinum 8358, no_turbo=1, preempt=full|lazy):
>
>                mm/folio_zero_user     x86/folio_zero_user     change
>                 (GB/s +- stddev)       (GB/s +- stddev)
>
>   pg-sz=1GB     8.01 +- 0.24%         11.26 +- 0.48%          +40.57%
>   pg-sz=2MB     7.95 +- 0.30%         10.90 +- 0.26%          +37.10%
>
[...]
Hello Ankur,

Thank you for the patches. I was able to test them briefly with lazy
preempt mode. (I do understand that there could be a lot of churn
based on Ingo's, Mateusz's, and others' comments.) But here goes:
SUT: AMD EPYC 9B24 (Genoa), preempt=lazy.
Metric: time taken in seconds (lower is better); total SIZE=64GB.

              mm/folio_zero_user     x86/folio_zero_user     change
 pg-sz=1GB    2.470440 +- 0.38%      1.060877 +- 0.07%       -57.06%
 pg-sz=2MB    5.098403 +- 0.01%      2.520150 +- 0.36%       -50.57%
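
(The "change" column is the reduction in elapsed time, e.g. for
pg-sz=1GB: (2.470440 - 1.060877) / 2.470440 ~= 57.06%.)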
More details (1G example run):
base kernel = 6.14 (preempt = lazy)
mm/folio_zero_user:

 Performance counter stats for 'numactl -m 0 -N 0 map_hugetlb_1G' (10 runs):

          2,476.47 msec task-clock                #     1.002 CPUs utilized               ( +-  0.39% )
                 5      context-switches          #     2.025 /sec                        ( +- 29.70% )
                 2      cpu-migrations            #     0.810 /sec                        ( +- 21.15% )
               202      page-faults               #    81.806 /sec                        ( +-  0.18% )
     7,348,664,233      cycles                    #     2.976 GHz                         ( +-  0.38% )  (38.39%)
       878,805,326      stalled-cycles-frontend   #    11.99% frontend cycles idle        ( +-  0.74% )  (38.43%)
       339,023,729      instructions              #     0.05  insn per cycle
                                                  #     2.53  stalled cycles per insn     ( +-  0.08% )  (38.47%)
        88,579,915      branches                  #    35.873 M/sec                       ( +-  0.06% )  (38.51%)
        17,369,776      branch-misses             #    19.55% of all branches             ( +-  0.04% )  (38.55%)
     2,261,339,695      L1-dcache-loads           #   915.795 M/sec                       ( +-  0.06% )  (38.56%)
     1,073,880,164      L1-dcache-load-misses     #    47.48% of all L1-dcache accesses   ( +-  0.05% )  (38.56%)
       511,231,988      L1-icache-loads           #   207.038 M/sec                       ( +-  0.25% )  (38.52%)
           128,533      L1-icache-load-misses     #     0.02% of all L1-icache accesses   ( +-  0.40% )  (38.48%)
            38,134      dTLB-loads                #    15.443 K/sec                       ( +-  4.22% )  (38.44%)
            33,992      dTLB-load-misses          #   114.39% of all dTLB cache accesses  ( +-  9.42% )  (38.40%)
               156      iTLB-loads                #    63.177 /sec                        ( +- 13.34% )  (38.36%)
               156      iTLB-load-misses          #   102.50% of all iTLB cache accesses  ( +- 25.98% )  (38.36%)

           2.47044 +- 0.00949 seconds time elapsed  ( +-  0.38% )
x86/folio_zero_user:

          1,056.72 msec task-clock                #     0.996 CPUs utilized               ( +-  0.07% )
                10      context-switches          #     9.436 /sec                        ( +-  3.59% )
                 3      cpu-migrations            #     2.831 /sec                        ( +- 11.33% )
               200      page-faults               #   188.718 /sec                        ( +-  0.15% )
     3,146,571,264      cycles                    #     2.969 GHz                         ( +-  0.07% )  (38.35%)
        17,226,261      stalled-cycles-frontend   #     0.55% frontend cycles idle        ( +-  4.12% )  (38.44%)
        14,130,553      instructions              #     0.00  insn per cycle
                                                  #     1.39  stalled cycles per insn     ( +-  1.59% )  (38.53%)
         3,578,614      branches                  #     3.377 M/sec                       ( +-  1.54% )  (38.62%)
           415,807      branch-misses             #    12.45% of all branches             ( +-  1.17% )  (38.62%)
        22,208,699      L1-dcache-loads           #    20.956 M/sec                       ( +-  5.27% )  (38.60%)
         7,312,684      L1-dcache-load-misses     #    27.79% of all L1-dcache accesses   ( +-  8.46% )  (38.51%)
         4,032,315      L1-icache-loads           #     3.805 M/sec                       ( +-  1.29% )  (38.48%)
            15,094      L1-icache-load-misses     #     0.38% of all L1-icache accesses   ( +-  1.14% )  (38.39%)
            14,365      dTLB-loads                #    13.555 K/sec                       ( +-  7.23% )  (38.38%)
             9,477      dTLB-load-misses          #    65.36% of all dTLB cache accesses  ( +- 12.05% )  (38.38%)
                18      iTLB-loads                #    16.985 /sec                        ( +- 34.84% )  (38.38%)
                67      iTLB-load-misses          #   158.39% of all iTLB cache accesses  ( +- 48.32% )  (38.32%)

          1.060877 +- 0.000766 seconds time elapsed  ( +-  0.07% )
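
One observation: consistent with the REP; STOS amortization argument
in the cover letter, almost all of the win shows up as retired
instructions (339,023,729 vs 14,130,553, roughly 24x fewer), along
with frontend stalls dropping from 11.99% to 0.55% of cycles idle.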
Thanks and Regards
- Raghu