From: Harry Yoo <harry.yoo@oracle.com>
To: Gabriel Krisman Bertazi <krisman@suse.de>
Cc: Mateusz Guzik <mjguzik@gmail.com>, Jan Kara <jack@suse.cz>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Shakeel Butt <shakeel.butt@linux.dev>,
Michal Hocko <mhocko@kernel.org>, Dennis Zhou <dennis@kernel.org>,
Tejun Heo <tj@kernel.org>, Christoph Lameter <cl@gentwo.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@redhat.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [RFC PATCH 0/4] Optimize rss_stat initialization/teardown for single-threaded tasks
Date: Tue, 2 Dec 2025 04:16:08 +0900
Message-ID: <aS3peFi9SqzcOoEd@hyeyoo>
In-Reply-To: <877bv6i5ts.fsf@mailhost.krisman.be>

On Mon, Dec 01, 2025 at 10:23:43AM -0500, Gabriel Krisman Bertazi wrote:
> Mateusz Guzik <mjguzik@gmail.com> writes:
>
> > On Fri, Nov 28, 2025 at 9:10 PM Jan Kara <jack@suse.cz> wrote:
> >> On Fri 28-11-25 08:30:08, Mathieu Desnoyers wrote:
> >> > What would really reduce memory allocation overhead on fork
> >> > is to move all those fields into a top level
> >> > "struct mm_percpu_struct" as a first step. This would
> >> > merge 3 per-cpu allocations into one when forking a new
> >> > task.
> >> >
> >> > Then the second step is to create a mm_percpu_struct
> >> > cache to bypass the per-cpu allocator.
> >> >
> >> > I suspect that by doing just that we'd get most of the
> >> > performance benefits provided by the single-threaded special-case
> >> > proposed here.
> >>
> >> I don't think so. Because in the profiles I have been doing for these
> >> loads the biggest cost wasn't actually the per-cpu allocation itself but
> >> the cost of zeroing the allocated counter for many CPUs (and then the
> >> counter summarization on exit) and you're not going to get rid of that with
> >> just reshuffling per-cpu fields and adding slab allocator in front.
> >>
>
> Hi Mateusz,
>
> > The major claims (by me anyway) are:
> > 1. single-threaded operation for fork + exec suffers avoidable
> > overhead even without the rss counter problem, which are tractable
> > with the same kind of thing which would sort out the multi-threaded
> > problem
>
> Agreed, there are more issues in the fork/exec path than just the
> rss_stat. The rss_stat performance is particularly relevant to us,
> though, because it is a clear regression for single-threaded introduced
> in 6.2.
>
> I took the time to test the slab constructor approach with the
> /sbin/true microbenchmark. I've seen only 2% gain on that tight loop in
> the 80c machine, which, granted, is an artificial benchmark, but still a
> good stressor of the single-threaded case. With this patchset, I
> reported 6% improvement, getting it close to the performance before the
> pcpu rss_stats introduction.

Hi Gabriel,

I don't want to argue about which approach is better, but I wanted to
mention that this may not be a fair comparison, because we can (almost)
eliminate the initialization cost with a slab ctor & dtor pair. As
Mateusz pointed out, under normal conditions we know that the sum of
each rss_stat counter is zero when it's freed.

That is what a slab constructor is for: if we know that certain fields
of a type are always in a particular state when the object is freed,
then we only need to initialize them once, in the constructor, when the
object is first created, and no initialization is needed for subsequent
allocations.

We couldn't use a slab constructor alone to do this because the percpu
memory is not allocated when it's called, but with a ctor/dtor pair we
can.

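To make the ctor/dtor idea concrete, here is a minimal userspace sketch
in C. It is not the kernel slab API and not the code from any patch in
this thread; every name in it (sketch_obj, obj_ctor, obj_dtor,
cache_alloc, cache_free, cache_shrink, SKETCH_NR_CPUS) is hypothetical,
and a plain array stands in for the percpu memory. It only illustrates
the invariant: the counter sums to zero at free time, so a reused object
needs no re-initialization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Hypothetical object; 'counters' stands in for a percpu counter. */
struct sketch_obj {
	long *counters;			/* one slot per CPU */
	struct sketch_obj *next_free;	/* link while the object is cached */
};

static struct sketch_obj *free_list;
static int ctor_calls, dtor_calls;

/* Sum across all slots, the way a percpu counter is read. */
static long counter_sum(const struct sketch_obj *o)
{
	long sum = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += o->counters[cpu];
	return sum;
}

/* Runs once per object: allocate and zero the counter slots. */
static int obj_ctor(struct sketch_obj *o)
{
	o->counters = calloc(SKETCH_NR_CPUS, sizeof(*o->counters));
	if (!o->counters)
		return -1;
	ctor_calls++;
	return 0;
}

/* Runs once per object, when the cache finally drops it. */
static void obj_dtor(struct sketch_obj *o)
{
	free(o->counters);
	dtor_calls++;
}

static struct sketch_obj *cache_alloc(void)
{
	struct sketch_obj *o = free_list;

	if (o) {
		/* Reuse: the counter already sums to zero, no re-init. */
		free_list = o->next_free;
		return o;
	}
	o = malloc(sizeof(*o));
	if (!o)
		return NULL;
	if (obj_ctor(o)) {
		free(o);
		return NULL;
	}
	return o;
}

static void cache_free(struct sketch_obj *o)
{
	/* The invariant that makes skipping re-initialization safe. */
	assert(counter_sum(o) == 0);
	o->next_free = free_list;
	free_list = o;
}

/* Release all cached objects; this is the only place the dtor runs. */
static void cache_shrink(void)
{
	while (free_list) {
		struct sketch_obj *o = free_list;

		free_list = o->next_free;
		obj_dtor(o);
		free(o);
	}
}
```

In this sketch, allocating an object, adjusting its slots so they still
sum to zero, freeing it and allocating again hands back the same object
without a second obj_ctor() call. Individual slots may be nonzero on
reuse; only the sum is guaranteed, which is all a summed counter reads.
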
> This is expected, as avoiding the pcpu
> allocation and initialization all together for the single-threaded case,
> where it is not necessary, will always be better than speeding up the
> allocation (even though that a worthwhile effort itself, as Mathieu
> pointed out).
--
Cheers,
Harry / Hyeonggon