From: Gabriel Krisman Bertazi <krisman@suse.de>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: linux-mm@kvack.org, lsf-pc@lists.linux-foundation.org,
Mateusz Guzik <mjguzik@gmail.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Tejun Heo <tj@kernel.org>, Christoph Lameter <cl@gentwo.org>,
Dennis Zhou <dennis@kernel.org>,
Vlastimil Babka <vbabka@suse.cz>, Hao Li <hao.li@linux.dev>,
Jan Kara <jack@suse.cz>
Subject: Re: [LSF/MM/BPF TOPIC] Ways to mitigate limitations of percpu memory allocator
Date: Wed, 04 Mar 2026 12:50:42 -0500
Message-ID: <87ikbb1o25.fsf@mailhost.krisman.be>
In-Reply-To: <aaE8rkGzxlB6wCTt@hyeyoo> (Harry Yoo's message of "Fri, 27 Feb 2026 15:41:50 +0900")

Harry Yoo <harry.yoo@oracle.com> writes:

> Hi folks, I'd like to discuss ways to mitigate limitations of
> percpu memory allocator.
>
> While the percpu memory allocator has served its role well,
> it has a few problems: 1) its global lock contention, and
> 2) lack of features to avoid high initialization cost of percpu
> memory.

I won't be going to LSF this year, but Jan was the proponent of my
dual-mode pcpu initialization work and he'll be around. I'm not sure
this requires a full session either, as it might not attract broader
interest. Are Mathieu and Mateusz attending?

>
> Global lock contention
> =======================
>
> The percpu allocator takes a global lock when allocating or freeing memory.
> Of course, caching percpu memory is not always worth it, because
> it would meaningfully increase memory usage.
>
> However, some users (e.g., fork+exec, tc filter) suffer from
> the lock contention when many CPUs allocate / free percpu memory
> concurrently.
>
> That said, we need a way to cache percpu memory per cpu, in a selective
> way. As an opt-in approach, Mateusz Guzik proposed [1] keeping percpu
> memory in slab objects and letting slab cache them per cpu,
> with slab ctor+dtor pair: allocate percpu memory and
> associate it with slab object in constructor, and free it when
> deallocating slabs (by resurrecting the slab destructor feature).
>
> This only works when percpu memory is associated with slab objects.
> I would like to hear if anybody thinks it's still worth redesigning
> percpu memory allocator for better scalability.
>
> Initialization of percpu data has high overhead
> ===============================================
>
> Initializing percpu data has non-negligible overhead on systems with
> many CPUs. There have been a few approaches proposed to mitigate this.
> I'd like to discuss the status of ideas proposed, and potentially
> whether there are other approaches worth exploring.
>
> Slab constructor + destructor pair
> ----------------------------------
>
> Unlike slab, the percpu allocator doesn't distinguish between object
> types, and it doesn't support constructors that could avoid
> re-initializing objects on every allocation.
> One solution to this is using slab ctor+dtor pair; as long as a certain
> state is preserved on free (e.g. sum of percpu counter is zero),
> initialization needs to be done only once on construction.
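
To make the ctor+dtor idea above concrete, here is a rough userspace
sketch. It is only an illustration: obj_cache, cache_alloc(),
cache_free() and cache_shrink() are hypothetical names, NR_CPUS is a
made-up constant, and a plain array stands in for real percpu memory.
It is not the slab API nor the code from [1]. The point is that the
constructor runs once per object, and a recycled object skips
re-initialization because the caller frees it with the counters
summing to zero.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 8	/* hypothetical CPU count for this sketch */

struct obj {
	long *percpu;		/* stand-in for a percpu counter, one slot per CPU */
	struct obj *next_free;
};

struct obj_cache {
	struct obj *free_list;	/* constructed objects waiting for reuse */
	int ctor_calls;		/* how many times the constructor actually ran */
};

/* Constructor: the expensive part, allocate and zero every per-CPU slot. */
static int obj_ctor(struct obj *o)
{
	o->percpu = calloc(NR_CPUS, sizeof(*o->percpu));
	return o->percpu ? 0 : -1;
}

/* Destructor: only runs when the cache gives memory back. */
static void obj_dtor(struct obj *o)
{
	free(o->percpu);
}

static struct obj *cache_alloc(struct obj_cache *c)
{
	struct obj *o = c->free_list;

	if (o) {		/* fast path: already constructed, no re-init */
		c->free_list = o->next_free;
		return o;
	}
	o = malloc(sizeof(*o));
	if (!o)
		return NULL;
	if (obj_ctor(o)) {
		free(o);
		return NULL;
	}
	c->ctor_calls++;
	return o;
}

static void cache_free(struct obj_cache *c, struct obj *o)
{
	long sum = 0;

	/* Contract: the preserved state (sum == 0) must hold on free. */
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += o->percpu[cpu];
	assert(sum == 0);

	o->next_free = c->free_list;
	c->free_list = o;
}

/* Release cached objects, running the destructor on each. */
static void cache_shrink(struct obj_cache *c)
{
	while (c->free_list) {
		struct obj *o = c->free_list;

		c->free_list = o->next_free;
		obj_dtor(o);
		free(o);
	}
}
```

Allocating, freeing and allocating again hands back the same object
with a single constructor call, which is where the saving would come
from on machines with many CPUs.
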
>
> Dual-mode percpu counters
> -------------------------
>
> Gabriel Krisman Bertazi proposed [2] introducing dual-mode percpu
> counters; single-threaded tasks use a simple counter, which is cheaper
> to initialize. Later when a new task is spawned, upgrade it to a more
> expensive, full-fledged counter.
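
For readers who haven't looked at [2], the dual-mode idea can be
sketched in userspace roughly as follows. This is a simplified model
under my own assumptions, not the kernel API from the series:
dual_counter and its helpers are hypothetical names, NR_CPUS is a
made-up constant, a heap array stands in for percpu memory, and the
locking a real upgrade would need is left out.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 8	/* hypothetical CPU count for this sketch */

struct dual_counter {
	long single;	/* cheap mode: one plain counter */
	long *percpu;	/* NULL until upgraded, then one slot per CPU */
};

/* Cheap init: no per-CPU memory is allocated or touched. */
static void dual_counter_init(struct dual_counter *c)
{
	c->single = 0;
	c->percpu = NULL;
}

/* Upgrade to the full counter, e.g. when a second task is spawned. */
static int dual_counter_upgrade(struct dual_counter *c)
{
	long *slots;

	if (c->percpu)
		return 0;	/* already upgraded */
	slots = calloc(NR_CPUS, sizeof(*slots));
	if (!slots)
		return -1;
	slots[0] = c->single;	/* carry the accumulated value over */
	c->single = 0;
	c->percpu = slots;
	return 0;
}

static void dual_counter_add(struct dual_counter *c, int cpu, long n)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (c->percpu)
		c->percpu[cpu] += n;
	else
		c->single += n;
}

static long dual_counter_sum(const struct dual_counter *c)
{
	long sum = c->single;

	if (c->percpu)
		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			sum += c->percpu[cpu];
	return sum;
}

static void dual_counter_destroy(struct dual_counter *c)
{
	free(c->percpu);
	c->percpu = NULL;
}
```

A single-threaded user only ever pays for dual_counter_init(); the
per-CPU allocation and zeroing are deferred to dual_counter_upgrade().
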
>
> On-demand initialization of mm_cid counters
> -------------------------------------------
>
> Mathieu Desnoyers proposed [3] initializing mm_cid counters on-demand
> on clone instead of initializing for all CPUs on every allocation.
>
> [1] https://lore.kernel.org/linux-mm/20250424080755.272925-1-harry.yoo@oracle.com
> [2] https://lore.kernel.org/linux-mm/20251127233635.4170047-1-krisman@suse.de
> [3] https://lore.kernel.org/linux-mm/355143c9-78c7-4da1-9033-5ae6fa50efad@efficios.com
--
Gabriel Krisman Bertazi