From: Harry Yoo <harry.yoo@oracle.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Suren Baghdasaryan <surenb@google.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Christoph Lameter <cl@linux.com>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Hyeonggon Yoo <42.hyeyoo@gmail.com>,
Uladzislau Rezki <urezki@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
rcu@vger.kernel.org, maple-tree@lists.infradead.org
Subject: Re: [PATCH RFC v2 01/10] slab: add opt-in caching layer of percpu sheaves
Date: Mon, 24 Feb 2025 17:04:30 +0900
Message-ID: <Z7woDjICqD0fkghA@harry>
In-Reply-To: <20250214-slub-percpu-caches-v2-1-88592ee0966a@suse.cz>
On Fri, Feb 14, 2025 at 05:27:37PM +0100, Vlastimil Babka wrote:
> Specifying a non-zero value for a new struct kmem_cache_args field
> sheaf_capacity will set up a caching layer of percpu arrays called
> sheaves of given capacity for the created cache.
>
> Allocations from the cache will allocate via the percpu sheaves (main or
> spare) as long as they have no NUMA node preference. Frees will also
> refill one of the sheaves.
>
> When both percpu sheaves are found empty during an allocation, an empty
> sheaf may be replaced with a full one from the per-node barn. If none
> are available and the allocation is allowed to block, an empty sheaf is
> refilled from slab(s) by an internal bulk alloc operation. When both
> percpu sheaves are full during freeing, the barn can replace a full one
> with an empty one, unless over a full sheaves limit. In that case a
> sheaf is flushed to slab(s) by an internal bulk free operation. Flushing
> sheaves and barns is also wired to the existing cpu flushing and cache
> shrinking operations.
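
For anyone following along, the allocation order described above can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not code from the patch: pcs_model, barn_model, model_alloc(), SHEAF_CAPACITY and BARN_SLOTS are made-up names, locking is left out, and the "refill from slab(s)" step is reduced to returning NULL.

```c
#include <assert.h>
#include <stddef.h>

#define SHEAF_CAPACITY 4	/* hypothetical; the patch takes this from sheaf_capacity */
#define BARN_SLOTS 2		/* hypothetical bound on sheaves kept in the barn model */

struct sheaf {
	unsigned int size;		/* number of objects currently cached */
	void *objects[SHEAF_CAPACITY];
};

struct pcs_model {			/* stands in for the percpu main/spare pair */
	struct sheaf *main;
	struct sheaf *spare;
};

struct barn_model {			/* stands in for the per-node barn */
	struct sheaf *full[BARN_SLOTS];
	unsigned int nr_full;
	struct sheaf *empty[BARN_SLOTS];
	unsigned int nr_empty;
};

/* Returns an object, or NULL where the real code would go on to refill from slabs. */
static void *model_alloc(struct pcs_model *pcs, struct barn_model *barn)
{
	struct sheaf *tmp;

	assert(pcs->main != NULL);

	if (pcs->main->size == 0) {
		if (pcs->spare && pcs->spare->size > 0) {
			/* main is empty but spare is not: swap the two */
			tmp = pcs->main;
			pcs->main = pcs->spare;
			pcs->spare = tmp;
		} else if (barn->nr_full > 0 && barn->nr_empty < BARN_SLOTS) {
			/* both are empty: trade the empty main for a full sheaf */
			tmp = pcs->main;
			pcs->main = barn->full[--barn->nr_full];
			barn->empty[barn->nr_empty++] = tmp;
		} else {
			return NULL;
		}
	}

	return pcs->main->objects[--pcs->main->size];
}
```

With an empty main, a spare holding one object and one full sheaf in the barn, the first call swaps main and spare, the second call goes to the barn, and the third returns NULL.
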
>
> The sheaves do not distinguish NUMA locality of the cached objects. If
> an allocation is requested with kmem_cache_alloc_node() with a specific
> node (not NUMA_NO_NODE), sheaves are bypassed.
>
> The bulk operations exposed to slab users also try to utilize the
> sheaves as long as the necessary (full or empty) sheaves are available
> on the cpu or in the barn. Once depleted, they will fall back to bulk
> alloc/free to slabs directly to avoid double copying.
>
> Sysfs stat counters alloc_cpu_sheaf and free_cpu_sheaf count objects
> allocated or freed using the sheaves. Counters sheaf_refill,
> sheaf_flush_main and sheaf_flush_other count objects filled or flushed
> from or to slab pages, and can be used to assess how effective the
> caching is. The refill and flush operations will also count towards the
> usual alloc_fastpath/slowpath, free_fastpath/slowpath and other
> counters.
>
> Access to the percpu sheaves is protected by local_lock_irqsave()
> operations; each per-NUMA-node barn has a spin_lock.
>
> A current limitation is that when slub_debug is enabled for a cache with
> percpu sheaves, the objects in the array are considered as allocated from
> the slub_debug perspective, and the alloc/free debugging hooks occur
> when moving the objects between the array and slab pages. This means
> that e.g. a use-after-free that occurs for an object cached in the
> array is undetected. Collected alloc/free stacktraces might also be less
> useful. This limitation could be changed in the future.
>
> On the other hand, KASAN, kmemcg and other hooks are executed on actual
> allocations and frees by kmem_cache users even if those use the array,
> so their debugging or accounting accuracy should be unaffected.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> include/linux/slab.h | 34 ++
> mm/slab.h | 2 +
> mm/slab_common.c | 5 +-
> mm/slub.c | 982 ++++++++++++++++++++++++++++++++++++++++++++++++---
> 4 files changed, 973 insertions(+), 50 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index e8273f28656936c05d015c53923f8fe69cd161b2..c06734912972b799f537359f7fe6a750918ffe9e 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
>
> /********************************************************************
> * Core slab cache functions
> +static void __pcs_flush_all_cpu(struct kmem_cache *s, unsigned int cpu)
> +{
> +	struct slub_percpu_sheaves *pcs;
> +
> +	pcs = per_cpu_ptr(s->cpu_sheaves, cpu);
> +
> +	if (pcs->spare) {
> +		sheaf_flush(s, pcs->spare);
> +		free_empty_sheaf(s, pcs->spare);
> +		pcs->spare = NULL;
> +	}
> +
> +	// TODO: handle rcu_free
> +	BUG_ON(pcs->rcu_free);
> +
> +	sheaf_flush_main(s);
> +}
+1 on what Suren mentioned.
> +static void barn_shrink(struct kmem_cache *s, struct node_barn *barn)
> +{
> +	struct list_head empty_list;
> +	struct list_head full_list;
> +	struct slab_sheaf *sheaf, *sheaf2;
> +	unsigned long flags;
> +
> +	INIT_LIST_HEAD(&empty_list);
> +	INIT_LIST_HEAD(&full_list);
> +
> +	spin_lock_irqsave(&barn->lock, flags);
> +
> +	list_splice_init(&barn->sheaves_full, &full_list);
> +	barn->nr_full = 0;
> +	list_splice_init(&barn->sheaves_empty, &empty_list);
> +	barn->nr_empty = 0;
> +
> +	spin_unlock_irqrestore(&barn->lock, flags);
> +
> +	list_for_each_entry_safe(sheaf, sheaf2, &full_list, barn_list) {
> +		sheaf_flush(s, sheaf);
> +		list_move(&sheaf->barn_list, &empty_list);
> +	}
nit: is this list_move() necessary?
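
One way to read this nit: since both lists are already private to the function at this point, the first loop could flush each sheaf and free it right away instead of moving it to empty_list first. Below is a userspace sketch of that variant, assuming a flushed sheaf can be freed immediately. It is not kernel code: sheaf_model and the model_*() helpers are hypothetical stand-ins, and a plain singly linked list replaces struct list_head.

```c
#include <assert.h>
#include <stdlib.h>

struct sheaf_model {
	struct sheaf_model *next;
	unsigned int size;		/* objects still cached in this sheaf */
};

static unsigned int flushed_objects;
static unsigned int freed_sheaves;

/* Stand-in for sheaf_flush(): hand all cached objects back. */
static void model_flush(struct sheaf_model *sheaf)
{
	flushed_objects += sheaf->size;
	sheaf->size = 0;
}

/* Stand-in for free_empty_sheaf(): only empty sheaves may be freed. */
static void model_free_empty(struct sheaf_model *sheaf)
{
	assert(sheaf->size == 0);
	free(sheaf);
	freed_sheaves++;
}

/* Push a new sheaf holding 'size' objects onto the front of a list. */
static struct sheaf_model *model_push(struct sheaf_model *head, unsigned int size)
{
	struct sheaf_model *sheaf = malloc(sizeof(*sheaf));

	assert(sheaf != NULL);
	sheaf->next = head;
	sheaf->size = size;
	return sheaf;
}

/* Variant without the intermediate move: flush and free in one pass. */
static void model_shrink(struct sheaf_model *full, struct sheaf_model *empty)
{
	struct sheaf_model *sheaf, *next;

	for (sheaf = full; sheaf; sheaf = next) {
		next = sheaf->next;	/* saved before freeing, like the _safe iterator */
		model_flush(sheaf);
		model_free_empty(sheaf);
	}

	for (sheaf = empty; sheaf; sheaf = next) {
		next = sheaf->next;
		model_free_empty(sheaf);
	}
}
```

Either form ends with every sheaf flushed and freed; the difference is only whether the full sheaves take a detour through empty_list.
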
> +
> +	list_for_each_entry_safe(sheaf, sheaf2, &empty_list, barn_list)
> +		free_empty_sheaf(s, sheaf);
> +}
Otherwise looks good to me.
--
Cheers,
Harry