From: Hao Ge <hao.ge@linux.dev>
To: Vlastimil Babka <vbabka@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@gentwo.org>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Harry Yoo <harry.yoo@oracle.com>,
Suren Baghdasaryan <surenb@google.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Hao Ge <gehao@kylinos.cn>
Subject: [PATCH v2] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
Date: Mon, 20 Oct 2025 22:30:11 +0800 [thread overview]
Message-ID: <20251020143011.377004-1-hao.ge@linux.dev> (raw)
From: Hao Ge <gehao@kylinos.cn>
In the alloc_slab_obj_exts function, there is a race condition
between the successful allocation of slab->obj_exts and its
setting to OBJEXTS_ALLOC_FAIL due to allocation failure.
When two threads are both allocating objects from the same slab,
they both end up entering the alloc_slab_obj_exts function because
the slab has no obj_exts (allocated yet).
And One call succeeds in allocation, but the racing one overwrites
our obj_ext with OBJEXTS_ALLOC_FAIL. The threads that successfully
allocated will have prepare_slab_obj_exts_hook() return
slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
already sees OBJEXTS_ALLOC_FAIL and thus it returns an offset based
on the zero address.
And then it will call alloc_tag_add, where the member codetag_ref *ref
of obj_exts will be referenced.Thus, a NULL pointer dereference occurs,
leading to a panic.
In order to avoid that, for the case of allocation failure where
OBJEXTS_ALLOC_FAIL is assigned, we use cmpxchg to handle this assignment.
Conversely, in a race condition, if mark_failed_objexts_alloc wins the
race, the other process (that previously succeeded in allocation) will
lose the race. A null pointer dereference may occur in the following
scenario:
Thread1 Thead2
alloc_slab_obj_exts alloc_slab_obj_exts
old_exts = READ_ONCE(slab->obj_exts) = 0
mark_failed_objexts_alloc(slab);
cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts
kfree and return 0;
alloc_tag_add -> a panic occurs.
To fix this, introduce a retry mechanism for the cmpxchg() operation:
1. Add a 'retry' label at the point where READ_ONCE(slab->obj_exts) is
invoked, ensuring the latest value is fetched during subsequent retries.
2. if cmpxchg() fails (indicating a concurrent update), jump back to
"retry" to re-read old_exts and recheck the validity of the obj_exts
allocated in this operation.
Thanks for Vlastimil and Suren's help with debugging.
Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
Suggested-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Hao Ge <gehao@kylinos.cn>
---
v2: Incorporate handling for the scenario where, if mark_failed_objexts_alloc wins the race,
the other process (that previously succeeded in allocation) will lose the race, based on Suren's suggestion.
Add Suggested-by: Suren Baghdasaryan <surenb@google.com>
---
mm/slub.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 2e4340c75be2..fd1b5dda3863 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
static inline void mark_failed_objexts_alloc(struct slab *slab)
{
- slab->obj_exts = OBJEXTS_ALLOC_FAIL;
+ cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
}
static inline void handle_failed_objexts_alloc(unsigned long obj_exts,
@@ -2136,6 +2136,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
#ifdef CONFIG_MEMCG
new_exts |= MEMCG_DATA_OBJEXTS;
#endif
+retry:
old_exts = READ_ONCE(slab->obj_exts);
handle_failed_objexts_alloc(old_exts, vec, objects);
if (new_slab) {
@@ -2145,8 +2146,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
* be simply assigned.
*/
slab->obj_exts = new_exts;
- } else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
- cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
+ } else if (old_exts & ~OBJEXTS_FLAGS_MASK) {
/*
* If the slab is already in use, somebody can allocate and
* assign slabobj_exts in parallel. In this case the existing
@@ -2158,6 +2158,20 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
else
kfree(vec);
return 0;
+ } else if (cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
+ /*
+ * There are some abnormal scenarios caused by race conditions:
+ *
+ * Thread1 Thead2
+ * alloc_slab_obj_exts alloc_slab_obj_exts
+ * old_exts = READ_ONCE(slab->obj_exts) = 0
+ * mark_failed_objexts_alloc(slab);
+ * cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts
+ *
+ * We should retry to ensure the validity of the slab_ext
+ * allocated in this operation.
+ */
+ goto retry;
}
if (allow_spin)
--
2.25.1
next reply other threads:[~2025-10-20 14:31 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-20 14:30 Hao Ge [this message]
2025-10-20 18:52 ` Suren Baghdasaryan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251020143011.377004-1-hao.ge@linux.dev \
--to=hao.ge@linux.dev \
--cc=akpm@linux-foundation.org \
--cc=cl@gentwo.org \
--cc=gehao@kylinos.cn \
--cc=harry.yoo@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=rientjes@google.com \
--cc=roman.gushchin@linux.dev \
--cc=shakeel.butt@linux.dev \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox