From: Jan Kara <jack@suse.cz>
To: Pedro Falcato <pfalcato@suse.de>
Cc: Harry Yoo <harry.yoo@oracle.com>,
	linux-mm@kvack.org,  lsf-pc@lists.linux-foundation.org,
	Mateusz Guzik <mjguzik@gmail.com>,
	 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Gabriel Krisman Bertazi <krisman@suse.de>,
	 Tejun Heo <tj@kernel.org>, Christoph Lameter <cl@gentwo.org>,
	 Dennis Zhou <dennis@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>, Hao Li <hao.li@linux.dev>,
	 Jan Kara <jack@suse.cz>
Subject: Re: [LSF/MM/BPF TOPIC] Ways to mitigate limitations of percpu memory allocator
Date: Thu, 5 Mar 2026 12:48:21 +0100
Message-ID: <z7bjxfk7jah6zgyikhiz6eqxd3xwywxp745bykcr3sm3p525yi@4diwkjsoyckl>
In-Reply-To: <ji4ddtoo7mebygmytprcp5kpnu5y2ddthk5zhacshods37xij2@fs2sizuagdvz>

On Thu 05-03-26 11:33:21, Pedro Falcato wrote:
> On Fri, Feb 27, 2026 at 03:41:50PM +0900, Harry Yoo wrote:
> > Hi folks, I'd like to discuss ways to mitigate limitations of
> > percpu memory allocator. 
> > 
> > While the percpu memory allocator has served its role well,
> > it has a few problems: 1) its global lock contention, and
> > 2) lack of features to avoid high initialization cost of percpu memory.
> > 
> > Global lock contention
> > =======================
> > 
> > Percpu allocator has a global lock when allocating or freeing memory.
> > Of course, caching percpu memory is not always worth it, because
> > it would meaningfully increase memory usage.
> > 
> > However, some users (e.g., fork+exec, tc filter) suffer from
> > the lock contention when many CPUs allocate / free percpu memory
> > concurrently.
> > 
> > That said, we need a way to cache percpu memory per cpu, in a selective
> > way. As an opt-in approach, Mateusz Guzik proposed [1] keeping percpu
> > memory in slab objects and letting slab cache them per cpu,
> > with slab ctor+dtor pair: allocate percpu memory and
> > associate it with slab object in constructor, and free it when
> > deallocating slabs (with resurrecting slab destructor feature).
> > 
> > This only works when percpu memory is associated with slab objects.
> > I would like to hear if anybody thinks it's still worth redesigning
> > percpu memory allocator for better scalability.
> 
> I think this (make alloc_percpu actually scale) is the obvious suggestion.
> Everything else is just papering over the cracks.

I disagree. There are two separate (although related) issues that need
solving. One is certainly the scalability of the percpu allocator. The
other, which is visible even in single-threaded workloads, is that
creating a percpu counter is rather expensive even when the allocator is
totally uncontended, because of the initialization (and final
summarization) cost. This is very visible e.g. in fork()-intensive loads
such as shell scripts: we currently allocate several percpu arrays for
each fork(), and on larger machines a significant part of the fork() cost
is the initialization of those arrays. Reducing this overhead is a
separate goal.
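
A rough userspace sketch of where that cost comes from, even with no
lock contention at all. This is a hypothetical model, not the kernel's
percpu_counter code: the struct, the function names and the init_writes
field are all made up for illustration. The point is only that both
setup and teardown have to touch one slot per CPU.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace model; NOT the kernel's struct percpu_counter. */
struct model_pcpu_counter {
	unsigned int nr_cpus;      /* number of CPUs being modeled */
	int64_t *slots;            /* one counter slot per CPU */
	unsigned long init_writes; /* slot writes performed by init */
};

/* Setup: every per-CPU slot is written once, so cost grows with nr_cpus. */
static int model_counter_init(struct model_pcpu_counter *c,
			      unsigned int nr_cpus)
{
	assert(nr_cpus > 0);
	c->slots = malloc(nr_cpus * sizeof(*c->slots));
	if (!c->slots)
		return -1;
	c->nr_cpus = nr_cpus;
	c->init_writes = 0;
	for (unsigned int cpu = 0; cpu < nr_cpus; cpu++) {
		c->slots[cpu] = 0;
		c->init_writes++;
	}
	return 0;
}

/* Fast path: only the slot of the given CPU is touched. */
static void model_counter_add(struct model_pcpu_counter *c,
			      unsigned int cpu, int64_t delta)
{
	assert(cpu < c->nr_cpus);
	c->slots[cpu] += delta;
}

/* Teardown: the final summarization walks every slot again. */
static int64_t model_counter_destroy(struct model_pcpu_counter *c)
{
	int64_t sum = 0;

	for (unsigned int cpu = 0; cpu < c->nr_cpus; cpu++)
		sum += c->slots[cpu];
	free(c->slots);
	c->slots = NULL;
	return sum;
}
```

In this model a short-lived counter on a 256-CPU machine pays 256 slot
writes at init and 256 slot reads at destroy, regardless of how few
updates happen in between.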

> > Slab constructor + destructor Pair
> > ----------------------------------
> > 
> > Unlike slab, the percpu allocator doesn't distinguish types of
> > objects, and it doesn't support constructors that could avoid
> > re-initializing them on every allocation.
> > One solution to this is using slab ctor+dtor pair; as long as a certain
> > state is preserved on free (e.g. sum of percpu counter is zero),
> > initialization needs to be done only once on construction.
> 
> As I said way back when, making an object permanently accessible
> (a-la TYPESAFE_BY_RCU) is screwy and messes with the object lifetime
> too much. Not to mention the locking problems that we discussed
> back-and-forth.

Yeah, I was not enthusiastic about this solution either.
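
For reference, the ctor+dtor idea quoted above can be modeled roughly as
below. This is a userspace sketch under stated assumptions, not the slab
API and not the proposed patches: model_cache, model_obj, MODEL_NR_CPUS
and all function names are hypothetical. The constructor runs once when
an object is first created, and a freed object is cached as long as the
agreed invariant (counter sum is zero) holds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 64 /* assumed CPU count for this model */

/* Hypothetical object; the array stands in for its percpu memory. */
struct model_obj {
	int64_t slots[MODEL_NR_CPUS];
	struct model_obj *next_free;
};

struct model_cache {
	struct model_obj *free_list; /* constructed objects kept for reuse */
	unsigned long ctor_calls;    /* how often the expensive init ran */
};

/* Constructor: the O(nr_cpus) initialization, done once per object. */
static void model_ctor(struct model_cache *cache, struct model_obj *obj)
{
	for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		obj->slots[cpu] = 0;
	cache->ctor_calls++;
}

static int64_t model_obj_sum(const struct model_obj *obj)
{
	int64_t sum = 0;

	for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += obj->slots[cpu];
	return sum;
}

/* Alloc: reuse a cached object without running the constructor again. */
static struct model_obj *model_cache_alloc(struct model_cache *cache)
{
	struct model_obj *obj = cache->free_list;

	if (obj) {
		cache->free_list = obj->next_free;
		return obj;
	}
	obj = malloc(sizeof(*obj));
	if (obj)
		model_ctor(cache, obj);
	return obj;
}

/* Free: the caller must have restored the invariant (sum == 0). */
static void model_cache_free(struct model_cache *cache, struct model_obj *obj)
{
	assert(model_obj_sum(obj) == 0);
	obj->next_free = cache->free_list;
	cache->free_list = obj;
}

/* Destructor side: backing memory is released only when the cache shrinks. */
static void model_cache_destroy(struct model_cache *cache)
{
	while (cache->free_list) {
		struct model_obj *obj = cache->free_list;

		cache->free_list = obj->next_free;
		free(obj);
	}
}
```

The model deliberately ignores the lifetime and locking concerns raised
above; it only shows why the initialization cost disappears on reuse.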

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

