[PATCH v2] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts

linux-mm.kvack.org archive mirror

* [PATCH v2] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
@ 2025-10-20 14:30 Hao Ge
  2025-10-20 18:52 ` Suren Baghdasaryan
  0 siblings, 1 reply; 2+ messages in thread
From: Hao Ge @ 2025-10-20 14:30 UTC (permalink / raw)
  To: Vlastimil Babka, Andrew Morton, Christoph Lameter,
	David Rientjes, Roman Gushchin, Harry Yoo, Suren Baghdasaryan
  Cc: Shakeel Butt, linux-mm, linux-kernel, Hao Ge

From: Hao Ge <gehao@kylinos.cn>

In alloc_slab_obj_exts(), there is a race between one thread
successfully installing slab->obj_exts and another thread setting it
to OBJEXTS_ALLOC_FAIL after an allocation failure.

When two threads allocate objects from the same slab, both enter
alloc_slab_obj_exts() because the slab has no obj_exts allocated yet.

One call succeeds in allocating the vector, but the racing call
overwrites slab->obj_exts with OBJEXTS_ALLOC_FAIL. The thread that
allocated successfully then has prepare_slab_obj_exts_hook() return
slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
already sees OBJEXTS_ALLOC_FAIL and thus returns an offset based on
the zero address.

alloc_tag_add() is then called and accesses the codetag_ref *ref member
of that obj_exts entry. Thus, a NULL pointer dereference occurs,
leading to a panic.

To avoid that, mark_failed_objexts_alloc() now uses cmpxchg() to set
OBJEXTS_ALLOC_FAIL, so the failure marker is stored only if
slab->obj_exts is still 0.
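
As a rough userspace illustration of the difference between the plain
store and the cmpxchg-based store, here is a minimal sketch using C11
atomics. It is not the kernel code: struct toy_slab, ALLOC_FAIL_FLAG and
both helper names are hypothetical stand-ins, and
atomic_compare_exchange_strong() is used in place of the kernel's
cmpxchg().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's OBJEXTS_ALLOC_FAIL marker. */
#define ALLOC_FAIL_FLAG ((uintptr_t)1)

/* Hypothetical stand-in for struct slab: only the obj_exts word matters. */
struct toy_slab {
	_Atomic uintptr_t obj_exts;
};

/* Old behaviour: a plain store clobbers whatever a racing thread installed. */
static void mark_failed_plain(struct toy_slab *slab)
{
	atomic_store(&slab->obj_exts, ALLOC_FAIL_FLAG);
}

/* Patched behaviour: record the failure only if the word is still 0. */
static void mark_failed_cmpxchg(struct toy_slab *slab)
{
	uintptr_t expected = 0;

	/* If another thread already stored a value, this leaves it untouched. */
	atomic_compare_exchange_strong(&slab->obj_exts, &expected,
				       ALLOC_FAIL_FLAG);
}
```

With a word that already holds a (fake) vector address, the plain
variant replaces it with the failure marker, while the compare-exchange
variant leaves it in place.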

Conversely, if mark_failed_objexts_alloc() wins the race, the other
thread (whose allocation succeeded) loses it because its cmpxchg()
fails. A NULL pointer dereference may occur in the following scenario:

Thread1                                           Thread2

alloc_slab_obj_exts                               alloc_slab_obj_exts

old_exts = READ_ONCE(slab->obj_exts) = 0

                                                  mark_failed_objexts_alloc(slab);

cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts

kfree and return 0;

alloc_tag_add -> a panic occurs.

To fix this, introduce a retry mechanism for the cmpxchg() operation:
1. Add a 'retry' label at the point where READ_ONCE(slab->obj_exts) is
   invoked, ensuring the latest value is fetched during subsequent retries.
2. If cmpxchg() fails (indicating a concurrent update), jump back to
   "retry" to re-read old_exts and recheck the validity of the obj_exts
   allocated in this operation.
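
The retry described above can be sketched in userspace with C11
atomics. This is only an illustration under stated assumptions, not the
mm/slub.c code: FLAGS_MASK, ALLOC_FAIL_FLAG, install_exts() and the
result enum are hypothetical names, and the failure marker is assumed to
live in the low flag bits so that a failure-marked word still counts as
"no vector installed".

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag layout: low bits are flags, the rest is the pointer. */
#define FLAGS_MASK      ((uintptr_t)0x7)
#define ALLOC_FAIL_FLAG ((uintptr_t)0x1)

enum install_result { INSTALLED, ALREADY_PRESENT };

/*
 * Try to publish new_exts in *obj_exts. If a racing thread changes the
 * word between the read and the compare-exchange, re-read and re-check
 * instead of assuming a valid vector is already there.
 */
static enum install_result install_exts(_Atomic uintptr_t *obj_exts,
					uintptr_t new_exts)
{
	uintptr_t old_exts;

	for (;;) {
		old_exts = atomic_load(obj_exts);

		/* Someone else installed a real vector: caller frees its own. */
		if (old_exts & ~FLAGS_MASK)
			return ALREADY_PRESENT;

		/* Word holds 0 or only flags: try to publish our vector. */
		if (atomic_compare_exchange_strong(obj_exts, &old_exts,
						   new_exts))
			return INSTALLED;

		/* The word changed under us (e.g. failure marker set): retry. */
	}
}
```

In this model a word that holds only the failure marker is replaced by
the new vector, and a second caller then sees the installed vector and
backs off.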

Thanks to Vlastimil and Suren for their help with debugging.

Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
Suggested-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Hao Ge <gehao@kylinos.cn>
---
v2: Handle the case where mark_failed_objexts_alloc() wins the race and the
    thread whose allocation succeeded loses it, based on Suren's suggestion.
    Add Suggested-by: Suren Baghdasaryan <surenb@google.com>
---
 mm/slub.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 2e4340c75be2..fd1b5dda3863 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)

 static inline void mark_failed_objexts_alloc(struct slab *slab)
 {
-	slab->obj_exts = OBJEXTS_ALLOC_FAIL;
+	cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
 }

 static inline void handle_failed_objexts_alloc(unsigned long obj_exts,
@@ -2136,6 +2136,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 #ifdef CONFIG_MEMCG
 	new_exts |= MEMCG_DATA_OBJEXTS;
 #endif
+retry:
 	old_exts = READ_ONCE(slab->obj_exts);
 	handle_failed_objexts_alloc(old_exts, vec, objects);
 	if (new_slab) {
@@ -2145,8 +2146,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 		 * be simply assigned.
 		 */
 		slab->obj_exts = new_exts;
-	} else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
-		   cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
+	} else if (old_exts & ~OBJEXTS_FLAGS_MASK) {
 		/*
 		 * If the slab is already in use, somebody can allocate and
 		 * assign slabobj_exts in parallel. In this case the existing
@@ -2158,6 +2158,20 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 		else
 			kfree(vec);
 		return 0;
+	} else if (cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
+		/*
+		 * There are some abnormal scenarios caused by race conditions:
+		 *
+		 *	Thread1				Thread2
+		 *   alloc_slab_obj_exts		alloc_slab_obj_exts
+		 *   old_exts = READ_ONCE(slab->obj_exts) = 0
+		 *					mark_failed_objexts_alloc(slab);
+		 *   cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts
+		 *
+		 * We should retry to ensure the validity of the slab_ext
+		 * allocated in this operation.
+		 */
+		goto retry;
 	}

 	if (allow_spin)
-- 
2.25.1


* Re: [PATCH v2] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
  2025-10-20 14:30 [PATCH v2] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts Hao Ge
@ 2025-10-20 18:52 ` Suren Baghdasaryan
  0 siblings, 0 replies; 2+ messages in thread
From: Suren Baghdasaryan @ 2025-10-20 18:52 UTC (permalink / raw)
  To: Hao Ge
  Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
	David Rientjes, Roman Gushchin, Harry Yoo, Shakeel Butt,
	linux-mm, linux-kernel, Hao Ge

On Mon, Oct 20, 2025 at 7:31 AM Hao Ge <hao.ge@linux.dev> wrote:
>
> From: Hao Ge <gehao@kylinos.cn>
>
> In the alloc_slab_obj_exts function, there is a race condition
> between the successful allocation of slab->obj_exts and its
> setting to OBJEXTS_ALLOC_FAIL due to allocation failure.
>
> When two threads are both allocating objects from the same slab,
> they both end up entering the alloc_slab_obj_exts function because
> the slab has no obj_exts (allocated yet).
>
> And One call succeeds in allocation, but the racing one overwrites
> our obj_ext with OBJEXTS_ALLOC_FAIL. The threads that successfully
> allocated will have prepare_slab_obj_exts_hook() return
> slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
> already sees OBJEXTS_ALLOC_FAIL and thus it returns an offset based
> on the zero address.
>
> And then it will call alloc_tag_add, where the member codetag_ref *ref
> of obj_exts will be referenced.Thus, a NULL pointer dereference occurs,
> leading to a panic.
>
> In order to avoid that, for the case of allocation failure where
> OBJEXTS_ALLOC_FAIL is assigned, we use cmpxchg to handle this assignment.
>
> Conversely, in a race condition, if mark_failed_objexts_alloc wins the
> race, the other process (that previously succeeded in allocation) will
> lose the race. A null pointer dereference may occur in the following
> scenario:
>
> Thread1                                                 Thead2
>
> alloc_slab_obj_exts                               alloc_slab_obj_exts
>
> old_exts = READ_ONCE(slab->obj_exts) = 0
>
>                                                   mark_failed_objexts_alloc(slab);
>
> cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts
>
> kfree and return 0;
>
> alloc_tag_add -> a panic occurs.

I appreciate the time and effort you put in this description but it
sounds overly-complicated IMHO. IIUC in both cases the issue happens
when a valid slab->obj_exts pointer is overwritten by
OBJEXTS_ALLOC_FAIL. I would simply say:

If two competing threads enter alloc_slab_obj_exts() and one of them
fails to allocate the object extension vector, it might override the
valid slab->obj_exts allocated by the other thread with
OBJEXTS_ALLOC_FAIL. This will cause the thread that lost this race and
expects a valid pointer to dereference a NULL pointer later on.

>
> To fix this, introduce a retry mechanism for the cmpxchg() operation:
> 1. Add a 'retry' label at the point where READ_ONCE(slab->obj_exts) is
>    invoked, ensuring the latest value is fetched during subsequent retries.
> 2. if cmpxchg() fails (indicating a concurrent update), jump back to
>    "retry" to re-read old_exts and recheck the validity of the obj_exts
>    allocated in this operation.

The paragraph above explains "what" you do but we can use the code to
understand that. Changelog should describe "why" not "what" you do. I
would just say:

Update slab->obj_exts atomically using cmpxchg() to avoid
slab->obj_exts overrides by racing threads.

>
> Thanks for Vlastimil and Suren's help with debugging.
>
> Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
> Suggested-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Hao Ge <gehao@kylinos.cn>
> ---
> v2: Incorporate handling for the scenario where, if mark_failed_objexts_alloc wins the race,
>     the other process (that previously succeeded in allocation) will lose the race, based on Suren's suggestion.
>     Add Suggested-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  mm/slub.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 2e4340c75be2..fd1b5dda3863 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
>
>  static inline void mark_failed_objexts_alloc(struct slab *slab)
>  {
> -       slab->obj_exts = OBJEXTS_ALLOC_FAIL;
> +       cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>  }
>
>  static inline void handle_failed_objexts_alloc(unsigned long obj_exts,
> @@ -2136,6 +2136,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>  #ifdef CONFIG_MEMCG
>         new_exts |= MEMCG_DATA_OBJEXTS;
>  #endif
> +retry:
>         old_exts = READ_ONCE(slab->obj_exts);
>         handle_failed_objexts_alloc(old_exts, vec, objects);
>         if (new_slab) {
> @@ -2145,8 +2146,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>                  * be simply assigned.
>                  */
>                 slab->obj_exts = new_exts;
> -       } else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
> -                  cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
> +       } else if (old_exts & ~OBJEXTS_FLAGS_MASK) {
>                 /*
>                  * If the slab is already in use, somebody can allocate and
>                  * assign slabobj_exts in parallel. In this case the existing
> @@ -2158,6 +2158,20 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>                 else
>                         kfree(vec);
>                 return 0;
> +       } else if (cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
> +               /*
> +                * There are some abnormal scenarios caused by race conditions:
> +                *
> +                *      Thread1                         Thead2
> +                *   alloc_slab_obj_exts                alloc_slab_obj_exts
> +                *   old_exts = READ_ONCE(slab->obj_exts) = 0
> +                *                                      mark_failed_objexts_alloc(slab);
> +                *   cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts
> +                *
> +                * We should retry to ensure the validity of the slab_ext
> +                * allocated in this operation.
> +                */

I don't think we need a diagram here. The race is quite trivial. Maybe
a simple comment like this?

/* Retry if a racing thread changed slab->obj_exts from under us. */

> +               goto retry;
>         }
>
>         if (allow_spin)
> --
> 2.25.1
>

