From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Ming Lei <ming.lei@redhat.com>, Harry Yoo <harry.yoo@oracle.com>
Cc: Hao Li <hao.li@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@gentwo.org>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
"Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Subject: [PATCH 3/3] slab: free remote objects to sheaves on memoryless nodes
Date: Wed, 11 Mar 2026 09:25:57 +0100
Message-ID: <20260311-b4-slab-memoryless-barns-v1-3-70ab850be4ce@kernel.org>
In-Reply-To: <20260311-b4-slab-memoryless-barns-v1-0-70ab850be4ce@kernel.org>
On memoryless nodes we can now allocate from cpu sheaves and refill them
normally. But when a node is memoryless on a system without actual
CONFIG_HAVE_MEMORYLESS_NODES support, freeing always uses the slowpath
because all objects appear as remote. We could instead benefit from the
freeing fastpath, because the allocations can't obtain local objects
anyway if the node is memoryless.
Thus adapt the locality checks when freeing, and move them into an inline
function can_free_to_pcs() for a single shared implementation.
On configurations with CONFIG_HAVE_MEMORYLESS_NODES=y, continue using
numa_mem_id() so the percpu sheaves and barn on a memoryless node will
contain mostly objects from the closest memory node (returned by
numa_mem_id()). No change is thus intended for such configurations.
On systems with CONFIG_HAVE_MEMORYLESS_NODES=n use numa_node_id() (the
cpu's node) since numa_mem_id() just aliases it anyway. But if we are
freeing on a memoryless node, allow the freeing to use percpu sheaves
for objects from any node, since they are all remote anyway.
This way we avoid the slowpath and get more performant freeing. The
potential downside is that allocations will obtain objects with a larger
average distance. If we kept bypassing the sheaves on freeing, a refill
of sheaves from slabs would tend to get closer objects thanks to the
ordering of the zonelist. Architectures that allow de-facto memoryless
nodes without proper CONFIG_HAVE_MEMORYLESS_NODES support should perhaps
consider adding such support.
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
---
mm/slub.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++++++------------
1 file changed, 55 insertions(+), 12 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index d8496b37e364..2e095ce76dd0 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -6009,6 +6009,56 @@ bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj)
return false;
}
+static __always_inline bool can_free_to_pcs(struct slab *slab)
+{
+ int slab_node;
+ int numa_node;
+
+ if (!IS_ENABLED(CONFIG_NUMA))
+ goto check_pfmemalloc;
+
+ slab_node = slab_nid(slab);
+
+#ifdef CONFIG_HAVE_MEMORYLESS_NODES
+ /*
+ * numa_mem_id() points to the closest node with memory so only allow
+ * objects from that node to the percpu sheaves
+ */
+ numa_node = numa_mem_id();
+
+ if (likely(slab_node == numa_node))
+ goto check_pfmemalloc;
+#else
+
+ /*
+ * numa_mem_id() is only a wrapper to numa_node_id() which is where this
+ * cpu belongs to, but it might be a memoryless node anyway. We don't
+ * know what the closest node is.
+ */
+ numa_node = numa_node_id();
+
+ /* freed object is from this cpu's node, proceed */
+ if (likely(slab_node == numa_node))
+ goto check_pfmemalloc;
+
+ /*
+ * Freed object isn't from this cpu's node, but that node is memoryless.
+ * Proceed as it's better to cache remote objects than falling back to
+ * the slowpath for everything. The allocation side can never obtain
+ * a local object anyway, if none exist. We don't have numa_mem_id() to
+ * point to the closest node as we would on a proper memoryless node
+ * setup.
+ */
+ if (unlikely(!node_isset(numa_node, slab_nodes)))
+ goto check_pfmemalloc;
+#endif
+
+ return false;
+
+check_pfmemalloc:
+ return likely(!slab_test_pfmemalloc(slab));
+}
+
/*
* Bulk free objects to the percpu sheaves.
* Unlike free_to_pcs() this includes the calls to all necessary hooks
@@ -6023,7 +6073,6 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
struct node_barn *barn;
void *remote_objects[PCS_BATCH_MAX];
unsigned int remote_nr = 0;
- int node = numa_mem_id();
next_remote_batch:
while (i < size) {
@@ -6037,8 +6086,7 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
continue;
}
- if (unlikely((IS_ENABLED(CONFIG_NUMA) && slab_nid(slab) != node)
- || slab_test_pfmemalloc(slab))) {
+ if (unlikely(!can_free_to_pcs(slab))) {
remote_objects[remote_nr] = p[i];
p[i] = p[--size];
if (++remote_nr >= PCS_BATCH_MAX)
@@ -6214,11 +6262,8 @@ void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
if (unlikely(!slab_free_hook(s, object, slab_want_init_on_free(s), false)))
return;
- if (likely(!IS_ENABLED(CONFIG_NUMA) || slab_nid(slab) == numa_mem_id())
- && likely(!slab_test_pfmemalloc(slab))) {
- if (likely(free_to_pcs(s, object, true)))
- return;
- }
+ if (likely(can_free_to_pcs(slab)) && likely(free_to_pcs(s, object, true)))
+ return;
__slab_free(s, slab, object, object, 1, addr);
stat(s, FREE_SLOWPATH);
@@ -6589,10 +6634,8 @@ void kfree_nolock(const void *object)
*/
kasan_slab_free(s, x, false, false, /* skip quarantine */true);
- if (likely(!IS_ENABLED(CONFIG_NUMA) || slab_nid(slab) == numa_mem_id())) {
- if (likely(free_to_pcs(s, x, false)))
- return;
- }
+ if (likely(can_free_to_pcs(slab)) && likely(free_to_pcs(s, x, false)))
+ return;
/*
* __slab_free() can locklessly cmpxchg16 into a slab, but then it might
--
2.53.0