From: Peter Zijlstra <peterz@infradead.org>
To: Ankur Arora <ankur.a.arora@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
torvalds@linux-foundation.org, akpm@linux-foundation.org,
luto@kernel.org, bp@alien8.de, dave.hansen@linux.intel.com,
hpa@zytor.com, mingo@redhat.com, juri.lelli@redhat.com,
willy@infradead.org, mgorman@suse.de, rostedt@goodmis.org,
tglx@linutronix.de, vincent.guittot@linaro.org,
jon.grimm@amd.com, bharata@amd.com, boris.ostrovsky@oracle.com,
konrad.wilk@oracle.com
Subject: Re: [PATCH 9/9] x86/clear_huge_page: make clear_contig_region() preemptible
Date: Wed, 5 Apr 2023 22:27:04 +0200
Message-ID: <20230405202704.GF365912@hirez.programming.kicks-ass.net>
In-Reply-To: <20230403052233.1880567-10-ankur.a.arora@oracle.com>
On Sun, Apr 02, 2023 at 10:22:33PM -0700, Ankur Arora wrote:
> clear_contig_region() can be used to clear up to a huge-page (2MB/1GB)
> chunk. Allow preemption in the irqentry_exit path to make sure we don't
> hold on to the CPU for an arbitrarily long period.
>
> Performance: vm-scalability/case-anon-w-seq-hugetlb mmaps an anonymous
> hugetlb-2mb region, and then writes sequentially to the region, demand
> faulting pages on the way.
>
> This test, with a CONFIG_PREEMPT_VOLUNTARY config, shows the effects of
> this change: stime drops (~18% on Icelakex, ~5% on Milan), while the
> utime goes up (~15% on Icelakex, ~13% on Milan.)
>
> *Icelakex*                mm/clear_huge_page    x86/clear_huge_page    change
> (mem=4GB/task, tasks=128)
>
>  stime            293.02 +- .49%        239.39 +- .83%       -18.30%
>  utime            440.11 +- .28%        508.74 +- .60%       +15.59%
>  wall-clock         5.96 +- .33%          6.27 +- 2.23%       +5.20%
>
>
>
> *Milan*                   mm/clear_huge_page    x86/clear_huge_page    change
> (mem=1GB/task, tasks=512)
>
>  stime            490.95 +- 3.55%       466.90 +- 4.79%       -4.89%
>  utime            276.43 +- 2.85%       311.97 +- 5.15%      +12.85%
>  wall-clock         3.74 +- 6.41%         3.58 +- 7.82%       -4.27%
>
> The drop in stime is due to REP; STOS being more efficient for bigger
> extents. The increase in utime is due to the cache effects of that
> change: mm/clear_huge_page() clears a page at a time, narrowing towards
> the faulting page, while x86/clear_huge_page() only optimizes for cache
> locality in the local neighbourhood of the faulting address.
>
> This effect on utime is visible via the increased L1-dcache-load-misses
> and LLC-load* counts, and an increased backend boundedness, under
> perf stat --all-user on Icelakex. The effect is slight but, given the
> heavy cache pressure generated by the test, shows up as a drop in user
> IPC:
>
> - 9,455,243,414,829 instructions # 2.75 insn per cycle ( +- 14.14% ) (46.17%)
> - 2,367,920,864,112 L1-dcache-loads # 1.054 G/sec ( +- 14.14% ) (69.24%)
> - 42,075,182,813 L1-dcache-load-misses # 2.96% of all L1-dcache accesses ( +- 14.14% ) (69.24%)
> - 20,365,688 LLC-loads # 9.064 K/sec ( +- 13.98% ) (69.24%)
> - 890,382 LLC-load-misses # 7.18% of all LL-cache accesses ( +- 14.91% ) (69.24%)
>
> + 9,467,796,660,698 instructions # 2.37 insn per cycle ( +- 14.14% ) (46.16%)
> + 2,369,973,307,561 L1-dcache-loads # 1.027 G/sec ( +- 14.14% ) (69.24%)
> + 42,155,621,201 L1-dcache-load-misses # 2.96% of all L1-dcache accesses ( +- 14.14% ) (69.24%)
> + 22,116,300 LLC-loads # 9.588 K/sec ( +- 14.20% ) (69.24%)
> + 1,355,607 LLC-load-misses # 10.29% of all LL-cache accesses ( +- 15.49% ) (69.25%)
>
> Given that the stime improves for all loads using this path, while the
> utime increase is load dependent, add this change.
Either I really need sleep, or *NONE* of the above is actually relevant
to what the patch below actually does!
The above talks about the glories of using large clears, while the patch
allows reschedules, which are about latency.
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> arch/x86/mm/hugetlbpage.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
> index 4294b77c4f18..c8564b0552e5 100644
> --- a/arch/x86/mm/hugetlbpage.c
> +++ b/arch/x86/mm/hugetlbpage.c
> @@ -158,7 +158,17 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
> static void clear_contig_region(struct page *page, unsigned long vaddr,
> unsigned int npages)
> {
> + might_sleep();
> +
> + /*
> + * We might be clearing a large region.
> + * Allow rescheduling.
> + */
> + allow_resched();
> clear_user_pages(page_address(page), vaddr, page, npages);
> + disallow_resched();
> +
> + cond_resched();
> }
>
> void clear_huge_page(struct page *page,
> --
> 2.31.1
>
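
For reference, the mechanism the commit message leans on ("allow
preemption in the irqentry_exit path") presumably amounts to something
like the below on the irq-exit side. This is a guess for illustration,
not the actual code from the series; it assumes the TIF_ALLOW_RESCHED
flag and the irqentry_exit_allow_resched() hook introduced by the
earlier patches in this series:

/*
 * Guess for illustration, not the series' actual code: preempt on irq
 * exit only if the interrupted task explicitly opted in via
 * allow_resched().
 */
void irqentry_exit_allow_resched(void)
{
	if (!preempt_count() && test_thread_flag(TIF_ALLOW_RESCHED) &&
	    need_resched())
		preempt_schedule_irq();
}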