From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Ming Lei <ming.lei@redhat.com>, Harry Yoo <harry.yoo@oracle.com>
Cc: Hao Li <hao.li@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@gentwo.org>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
"Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Subject: [PATCH 2/3] slab: create barns for online memoryless nodes
Date: Wed, 11 Mar 2026 09:25:56 +0100
Message-ID: <20260311-b4-slab-memoryless-barns-v1-2-70ab850be4ce@kernel.org>
In-Reply-To: <20260311-b4-slab-memoryless-barns-v1-0-70ab850be4ce@kernel.org>
Ming Lei reported [1] a performance regression caused by replacing cpu
(partial) slabs with sheaves. With slub stats enabled, a large number of
slowpath allocations was observed. The affected system has 8 online
NUMA nodes, but only 2 of them have memory.
For sheaves to work effectively on a given cpu, its NUMA node has to
have a struct node_barn allocated. Those are currently allocated only on
nodes with memory (N_MEMORY), where kmem_cache_node also exists, as the
goal is to cache only node-local objects. But in order to get good
performance on a memoryless node, we need its barn to exist and to use
sheaves there to cache non-local objects (as no local objects can exist
anyway).
Therefore change the implementation to allocate barns on all online
nodes, tracked in a new nodemask slab_barn_nodes. Also add a cpu hotplug
callback as that's when a memoryless node can become online.
Change the rcu_sheaf->node assignment to numa_node_id() so the sheaf is
returned to the barn of the local cpu's (potentially memoryless) node,
and no longer to the nearest node with memory.
Reported-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/all/aZ0SbIqaIkwoW2mB@fedora/ [1]
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/slub.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 59 insertions(+), 4 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 609a183f8533..d8496b37e364 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -472,6 +472,12 @@ static inline struct node_barn *get_barn(struct kmem_cache *s)
*/
static nodemask_t slab_nodes;
+/*
+ * Similar to slab_nodes but for where we have node_barn allocated.
+ * Corresponds to N_ONLINE nodes.
+ */
+static nodemask_t slab_barn_nodes;
+
/*
* Workqueue used for flushing cpu and kfree_rcu sheaves.
*/
@@ -4084,6 +4090,51 @@ void flush_all_rcu_sheaves(void)
rcu_barrier();
}
+static int slub_cpu_setup(unsigned int cpu)
+{
+ int nid = cpu_to_node(cpu);
+ struct kmem_cache *s;
+ int ret = 0;
+
+ /*
+ * we never clear a nid so it's safe to do a quick check before taking
+ * the mutex, and then recheck to handle parallel cpu hotplug safely
+ */
+ if (node_isset(nid, slab_barn_nodes))
+ return 0;
+
+ mutex_lock(&slab_mutex);
+
+ if (node_isset(nid, slab_barn_nodes))
+ goto out;
+
+ list_for_each_entry(s, &slab_caches, list) {
+ struct node_barn *barn;
+
+ /*
+ * barn might already exist if a previous callback failed midway
+ */
+ if (!cache_has_sheaves(s) || get_barn_node(s, nid))
+ continue;
+
+ barn = kmalloc_node(sizeof(*barn), GFP_KERNEL, nid);
+
+ if (!barn) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ barn_init(barn);
+ s->per_node[nid].barn = barn;
+ }
+ node_set(nid, slab_barn_nodes);
+
+out:
+ mutex_unlock(&slab_mutex);
+
+ return ret;
+}
+
/*
* Use the cpu notifier to insure that the cpu slabs are flushed when
* necessary.
@@ -5936,7 +5987,7 @@ bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj)
rcu_sheaf = NULL;
} else {
pcs->rcu_free = NULL;
- rcu_sheaf->node = numa_mem_id();
+ rcu_sheaf->node = numa_node_id();
}
/*
@@ -7597,7 +7648,7 @@ static int init_kmem_cache_nodes(struct kmem_cache *s)
if (slab_state == DOWN || !cache_has_sheaves(s))
return 1;
- for_each_node_mask(node, slab_nodes) {
+ for_each_node_mask(node, slab_barn_nodes) {
struct node_barn *barn;
barn = kmalloc_node(sizeof(*barn), GFP_KERNEL, node);
@@ -8250,6 +8301,7 @@ static int slab_mem_going_online_callback(int nid)
* and barn initialized for the new node.
*/
node_set(nid, slab_nodes);
+ node_set(nid, slab_barn_nodes);
out:
mutex_unlock(&slab_mutex);
return ret;
@@ -8328,7 +8380,7 @@ static void __init bootstrap_cache_sheaves(struct kmem_cache *s)
if (!capacity)
return;
- for_each_node_mask(node, slab_nodes) {
+ for_each_node_mask(node, slab_barn_nodes) {
struct node_barn *barn;
barn = kmalloc_node(sizeof(*barn), GFP_KERNEL, node);
@@ -8400,6 +8452,9 @@ void __init kmem_cache_init(void)
for_each_node_state(node, N_MEMORY)
node_set(node, slab_nodes);
+ for_each_online_node(node)
+ node_set(node, slab_barn_nodes);
+
create_boot_cache(kmem_cache_node, "kmem_cache_node",
sizeof(struct kmem_cache_node),
SLAB_HWCACHE_ALIGN | SLAB_NO_OBJ_EXT, 0, 0);
@@ -8426,7 +8481,7 @@ void __init kmem_cache_init(void)
/* Setup random freelists for each cache */
init_freelist_randomization();
- cpuhp_setup_state_nocalls(CPUHP_SLUB_DEAD, "slub:dead", NULL,
+ cpuhp_setup_state_nocalls(CPUHP_SLUB_DEAD, "slub:dead", slub_cpu_setup,
slub_cpu_dead);
pr_info("SLUB: HWalign=%d, Order=%u-%u, MinObjects=%u, CPUs=%u, Nodes=%u\n",
--
2.53.0
Thread overview: 22+ messages
2026-03-11 8:25 [PATCH 0/3] slab: support memoryless nodes with sheaves Vlastimil Babka (SUSE)
2026-03-11 8:25 ` [PATCH 1/3] slab: decouple pointer to barn from kmem_cache_node Vlastimil Babka (SUSE)
2026-03-13 9:27 ` Harry Yoo
2026-03-13 9:46 ` Vlastimil Babka (SUSE)
2026-03-13 11:48 ` Harry Yoo
2026-03-16 13:19 ` Vlastimil Babka (SUSE)
2026-03-11 8:25 ` Vlastimil Babka (SUSE) [this message]
2026-03-16 3:25 ` [PATCH 2/3] slab: create barns for online memoryless nodes Harry Yoo
2026-03-18 9:27 ` Hao Li
2026-03-18 12:11 ` Vlastimil Babka (SUSE)
2026-03-19 7:01 ` Hao Li
2026-03-19 9:56 ` Vlastimil Babka (SUSE)
2026-03-19 11:27 ` Hao Li
2026-03-19 12:25 ` Vlastimil Babka (SUSE)
2026-03-11 8:25 ` [PATCH 3/3] slab: free remote objects to sheaves on " Vlastimil Babka (SUSE)
2026-03-16 3:48 ` Harry Yoo
2026-03-11 9:49 ` [PATCH 0/3] slab: support memoryless nodes with sheaves Ming Lei
2026-03-11 17:22 ` Vlastimil Babka (SUSE)
2026-04-08 13:04 ` Jon Hunter
2026-04-08 14:06 ` Hao Li
2026-04-08 14:31 ` Harry Yoo (Oracle)
2026-03-16 13:33 ` Vlastimil Babka (SUSE)