From: Gabriel Krisman Bertazi <krisman@suse.de>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: linux-mm@kvack.org,  lsf-pc@lists.linux-foundation.org,
	 Mateusz Guzik <mjguzik@gmail.com>,
	 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Tejun Heo <tj@kernel.org>,  Christoph Lameter <cl@gentwo.org>,
	 Dennis Zhou <dennis@kernel.org>,
	 Vlastimil Babka <vbabka@suse.cz>,  Hao Li <hao.li@linux.dev>,
	 Jan Kara <jack@suse.cz>
Subject: Re: [LSF/MM/BPF TOPIC] Ways to mitigate limitations of percpu memory allocator
Date: Wed, 04 Mar 2026 12:50:42 -0500
Message-ID: <87ikbb1o25.fsf@mailhost.krisman.be>
In-Reply-To: <aaE8rkGzxlB6wCTt@hyeyoo> (Harry Yoo's message of "Fri, 27 Feb 2026 15:41:50 +0900")

Harry Yoo <harry.yoo@oracle.com> writes:

> Hi folks, I'd like to discuss ways to mitigate limitations of
> the percpu memory allocator.
>
> While the percpu memory allocator has served its role well,
> it has a few problems: 1) contention on its global lock, and
> 2) a lack of features to avoid the high initialization cost of
> percpu memory.

I won't be going to LSF this year.  But Jan was the proponent of my
dual-mode pcpu initialization work and he'll be around.  I'm not sure
this requires a full session either, as it might not attract broader
interest.  Are Mathieu and Mateusz attending?

>
> Global lock contention
> =======================
>
> The percpu allocator takes a global lock when allocating or freeing
> memory. Of course, caching percpu memory is not always worth it,
> because it would meaningfully increase memory usage.
>
> However, some users (e.g., fork+exec, tc filter) suffer from
> the lock contention when many CPUs allocate / free percpu memory
> concurrently.
>
> That said, we need a way to selectively cache percpu memory per CPU.
> As an opt-in approach, Mateusz Guzik proposed [1] keeping percpu
> memory in slab objects and letting slab cache them per CPU, using a
> slab ctor+dtor pair: allocate percpu memory and associate it with the
> slab object in the constructor, and free it when deallocating slabs
> (which requires resurrecting the slab destructor feature).
>
> This only works when percpu memory is associated with slab objects.
> I would like to hear if anybody thinks it's still worth redesigning
> the percpu memory allocator for better scalability.
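
To make the contention point concrete, here is a minimal user-space
sketch of a per-CPU cache placed in front of a globally locked
allocator.  It is only a single-threaded model: `struct pool`,
`NR_CPUS_MODEL`, `CACHE_DEPTH` and the `global_lock_taken` counter are
hypothetical names, malloc() stands in for the percpu allocator, and the
counter stands in for taking the global lock.  It is not the slab or
percpu code.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS_MODEL 4   /* hypothetical CPU count for the model */
#define CACHE_DEPTH   2   /* hypothetical per-CPU cache capacity */

struct cpu_cache {
	void *objs[CACHE_DEPTH];
	int nr;                   /* number of cached objects */
};

struct pool {
	size_t obj_size;          /* all objects in a pool share one size */
	struct cpu_cache cache[NR_CPUS_MODEL];
	long global_lock_taken;   /* counts slow-path entries */
};

/* Fast path: pop from this CPU's cache. Slow path: "take the lock". */
static void *pool_alloc(struct pool *p, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	struct cpu_cache *cc = &p->cache[cpu];

	if (cc->nr > 0)
		return cc->objs[--cc->nr];

	p->global_lock_taken++;
	return malloc(p->obj_size);
}

/* Fast path: push to this CPU's cache. Slow path: "take the lock". */
static void pool_free(struct pool *p, int cpu, void *obj)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	struct cpu_cache *cc = &p->cache[cpu];

	if (cc->nr < CACHE_DEPTH) {
		cc->objs[cc->nr++] = obj;
		return;
	}

	p->global_lock_taken++;
	free(obj);
}

/* Release everything still held in the per-CPU caches. */
static void pool_drain(struct pool *p)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		struct cpu_cache *cc = &p->cache[cpu];

		while (cc->nr > 0)
			free(cc->objs[--cc->nr]);
	}
}
```

In this model an alloc/free/alloc sequence on the same CPU enters the
slow path only once.  The cost is the one the quoted text warns about:
up to NR_CPUS_MODEL * CACHE_DEPTH objects sit idle in the caches.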
>
> Initialization of percpu data has high overhead
> ===============================================
>
> Initializing percpu data has non-negligible overhead on systems with
> many CPUs. A few approaches have been proposed to mitigate this.
> I'd like to discuss the status of ideas proposed, and potentially
> whether there are other approaches worth exploring.
>
> Slab constructor + destructor pair
> ----------------------------------
>
> Unlike slab, the percpu allocator doesn't distinguish between object
> types, and it doesn't support constructors that could avoid
> re-initializing objects on every allocation.
> One solution to this is using slab ctor+dtor pair; as long as a certain
> state is preserved on free (e.g. sum of percpu counter is zero),
> initialization needs to be done only once on construction.
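
As a rough illustration of the ctor+dtor idea quoted above, the
user-space sketch below models an object cache whose constructor
performs the expensive percpu setup once, and whose free path only
checks that the agreed invariant (counter sum is zero) still holds.
All names (`struct obj_cache`, `obj_ctor`, `NR_CPUS_MODEL`, ...) are
hypothetical, calloc() stands in for a percpu allocation, and there is
no locking.  It is a model of the concept, not the slab API.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS_MODEL 64   /* hypothetical CPU count for the model */

struct obj {
	long *pcpu;            /* stands in for a percpu allocation */
	struct obj *next_free; /* free-list link while cached */
};

struct obj_cache {
	struct obj *free_list;
	int ctor_calls;        /* how many times the expensive path ran */
};

/* Constructor: the expensive part, run only when a new object is built. */
static int obj_ctor(struct obj_cache *c, struct obj *o)
{
	o->pcpu = calloc(NR_CPUS_MODEL, sizeof(*o->pcpu));
	if (!o->pcpu)
		return -1;
	c->ctor_calls++;
	return 0;
}

/* Destructor: run only when the cache gives the object back for good. */
static void obj_dtor(struct obj *o)
{
	free(o->pcpu);
}

static long obj_sum(const struct obj *o)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		sum += o->pcpu[cpu];
	return sum;
}

/* Reuse a cached object as-is, or construct a new one. */
static struct obj *cache_alloc(struct obj_cache *c)
{
	struct obj *o = c->free_list;

	if (o) {
		c->free_list = o->next_free;
		return o;
	}

	o = malloc(sizeof(*o));
	if (!o)
		return NULL;
	if (obj_ctor(c, o)) {
		free(o);
		return NULL;
	}
	return o;
}

/* The caller must hand the object back with the invariant intact. */
static void cache_free(struct obj_cache *c, struct obj *o)
{
	assert(obj_sum(o) == 0);
	o->next_free = c->free_list;
	c->free_list = o;
}

static void cache_destroy(struct obj_cache *c)
{
	while (c->free_list) {
		struct obj *o = c->free_list;

		c->free_list = o->next_free;
		obj_dtor(o);
		free(o);
	}
}
```

Note that the invariant is on the sum, not on each slot: an object
freed with +5 on one CPU and -5 on another is reused without touching
any of the NR_CPUS_MODEL slots.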
>
> Dual-mode percpu counters
> -------------------------
>
> Gabriel Krisman Bertazi proposed [2] introducing dual-mode percpu
> counters: single-threaded tasks use a simple counter, which is cheaper
> to initialize. Later, when a new task is spawned, the counter is
> upgraded to a more expensive, full-fledged percpu counter.
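
For readers who have not followed [2], the dual-mode idea can be
modelled in user space roughly as below.  This is a deliberately
simplified sketch, not the code from the series: `struct dual_counter`,
`dc_upgrade()` and `NR_CPUS_MODEL` are hypothetical names, calloc()
stands in for the percpu allocation, and the model ignores the
concurrency that a real upgrade path has to handle.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS_MODEL 64   /* hypothetical CPU count for the model */

struct dual_counter {
	long single;   /* the whole count while pcpu == NULL */
	long *pcpu;    /* NULL in single mode, per-CPU slots once upgraded */
};

/* Cheap initialization: no per-CPU memory is allocated or touched. */
static void dc_init(struct dual_counter *dc)
{
	dc->single = 0;
	dc->pcpu = NULL;
}

/* Switch to per-CPU mode, carrying the current value over to slot 0. */
static int dc_upgrade(struct dual_counter *dc)
{
	long *p;

	if (dc->pcpu)
		return 0;   /* already upgraded */

	p = calloc(NR_CPUS_MODEL, sizeof(*p));
	if (!p)
		return -1;

	p[0] = dc->single;
	dc->single = 0;
	dc->pcpu = p;
	return 0;
}

static void dc_add(struct dual_counter *dc, int cpu, long delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	if (dc->pcpu)
		dc->pcpu[cpu] += delta;
	else
		dc->single += delta;
}

static long dc_sum(const struct dual_counter *dc)
{
	long sum = dc->single;

	if (dc->pcpu)
		for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
			sum += dc->pcpu[cpu];
	return sum;
}

static void dc_destroy(struct dual_counter *dc)
{
	free(dc->pcpu);
	dc->pcpu = NULL;
}
```

In this model a counter that is never upgraded costs one store to
initialize, and the per-CPU allocation is paid only by users that
actually need it.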
>
> On-demand initialization of mm_cid counters
> -------------------------------------------
>
> Mathieu Desnoyers proposed [3] initializing mm_cid counters on-demand
> on clone instead of initializing for all CPUs on every allocation.
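
The general shape of on-demand initialization can be sketched in user
space as below.  To be clear, this is a generic first-use model and not
the mm_cid code from [3]: the structure, the `inited` bitmask, the
`SLOT_UNSET` value and `NR_CPUS_MODEL` are all assumptions made for
illustration.  Setup clears one mask word instead of walking every
slot, and each slot is initialized the first time it is looked up.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_MODEL 64   /* hypothetical; must fit the 64-bit mask */
#define SLOT_UNSET    (-1) /* hypothetical "no value assigned" marker */

struct lazy_pcpu {
	uint64_t inited;           /* bit n is set once slot n is initialized */
	int slot[NR_CPUS_MODEL];   /* deliberately not initialized at setup */
	int init_calls;            /* how many slots were actually initialized */
};

/* Setup cost is independent of the number of CPUs. */
static void lazy_setup(struct lazy_pcpu *lp)
{
	lp->inited = 0;
	lp->init_calls = 0;
}

/* Return the slot for @cpu, initializing it on first use. */
static int *lazy_slot(struct lazy_pcpu *lp, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	uint64_t bit = UINT64_C(1) << cpu;

	if (!(lp->inited & bit)) {
		lp->slot[cpu] = SLOT_UNSET;
		lp->inited |= bit;
		lp->init_calls++;
	}
	return &lp->slot[cpu];
}
```

A user that only ever runs on two CPUs initializes two slots in this
model, rather than all NR_CPUS_MODEL of them.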
>
> [1] https://lore.kernel.org/linux-mm/20250424080755.272925-1-harry.yoo@oracle.com
> [2] https://lore.kernel.org/linux-mm/20251127233635.4170047-1-krisman@suse.de
> [3] https://lore.kernel.org/linux-mm/355143c9-78c7-4da1-9033-5ae6fa50efad@efficios.com

-- 
Gabriel Krisman Bertazi

