* [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Hao Ge @ 2025-10-17  4:57 UTC
To: Vlastimil Babka, Andrew Morton, Christoph Lameter,
David Rientjes, Roman Gushchin, Harry Yoo, Suren Baghdasaryan
Cc: Shakeel Butt, linux-mm, linux-kernel, Hao Ge

From: Hao Ge <gehao@kylinos.cn>

In alloc_slab_obj_exts(), there is a race between a caller that
successfully allocates and installs slab->obj_exts and a concurrent
caller that sets it to OBJEXTS_ALLOC_FAIL after its own allocation
fails.

When two threads allocate objects from the same slab, both can enter
alloc_slab_obj_exts() because the slab has no obj_exts vector
allocated yet.

One call succeeds in allocating the vector, but the racing one fails
and overwrites slab->obj_exts with OBJEXTS_ALLOC_FAIL. The thread that
allocated successfully then has prepare_slab_obj_exts_hook() return
slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
already sees OBJEXTS_ALLOC_FAIL and thus returns an offset based on
the zero address.

alloc_tag_add() is then called, and it dereferences the codetag_ref
*ref member of the resulting bogus obj_ext; thus a NULL pointer
dereference occurs, leading to a panic.
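
An illustrative interleaving (simplified; both threads race on the
same slab that has no obj_exts vector yet):

  Thread A                              Thread B
  --------                              --------
  alloc_slab_obj_exts()
    allocates vec successfully
    installs vec via cmpxchg()
                                        alloc_slab_obj_exts()
                                          vector allocation fails
                                          mark_failed_objexts_alloc()
                                            slab->obj_exts = OBJEXTS_ALLOC_FAIL
                                            /* plain store clobbers A's vector */
  prepare_slab_obj_exts_hook()
    reads OBJEXTS_ALLOC_FAIL, computes
    an offset from address zero
  alloc_tag_add() -> NULL pointer dereference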

To avoid this, use cmpxchg() for the allocation-failure path that
assigns OBJEXTS_ALLOC_FAIL, so the marker is only stored while
slab->obj_exts is still unset.
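
For reference, cmpxchg(ptr, old, new) atomically stores new only when
*ptr still equals old, and returns the value it observed. A minimal
user-space analogue of the fixed helper (an illustrative sketch using
a GCC builtin, not the kernel primitive or the actual patched code):

	static inline void mark_failed_objexts_alloc(unsigned long *obj_exts)
	{
		/*
		 * Store the failure marker only while *obj_exts is still 0;
		 * if another thread already installed a real vector, the
		 * swap fails and that vector is left untouched.
		 */
		__sync_bool_compare_and_swap(obj_exts, 0UL, OBJEXTS_ALLOC_FAIL);
	}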

Thanks to Vlastimil and Suren for their help with debugging.

Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
Signed-off-by: Hao Ge <gehao@kylinos.cn>
---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/slub.c b/mm/slub.c
index 2e4340c75be2..9e6361796e34 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
 
 static inline void mark_failed_objexts_alloc(struct slab *slab)
 {
-	slab->obj_exts = OBJEXTS_ALLOC_FAIL;
+	cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
 }
 
 static inline void handle_failed_objexts_alloc(unsigned long obj_exts,
--
2.25.1

* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Harry Yoo @ 2025-10-17  6:05 UTC
To: Hao Ge
Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, David Rientjes,
    Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm,
    linux-kernel, Hao Ge

On Fri, Oct 17, 2025 at 12:57:49PM +0800, Hao Ge wrote:
> From: Hao Ge <gehao@kylinos.cn>
>
> [... commit message snipped ...]
>
> Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")

I think we should add Cc: stable as well?
We need an explicit Cc: stable to backport mm patches to -stable.

> Signed-off-by: Hao Ge <gehao@kylinos.cn>
> ---
>  mm/slub.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 2e4340c75be2..9e6361796e34 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
>
>  static inline void mark_failed_objexts_alloc(struct slab *slab)
>  {
> -	slab->obj_exts = OBJEXTS_ALLOC_FAIL;
> +	cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>  }

A silly question:

If mark_failed_objexts_alloc() succeeds and a concurrent
alloc_slab_obj_exts() loses, should we retry cmpxchg() in
alloc_slab_obj_exts()?

> --
> 2.25.1

-- 
Cheers,
Harry / Hyeonggon

* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Hao Ge @ 2025-10-17  6:42 UTC
To: Harry Yoo
Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, David Rientjes,
    Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm,
    linux-kernel, Hao Ge

Hi Harry,

Thank you for your quick response.

On 2025/10/17 14:05, Harry Yoo wrote:
> On Fri, Oct 17, 2025 at 12:57:49PM +0800, Hao Ge wrote:
>> [...]
>> Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
>
> I think we should add Cc: stable as well?
> We need an explicit Cc: stable to backport mm patches to -stable.

Oh sorry, I missed this.

>>  static inline void mark_failed_objexts_alloc(struct slab *slab)
>>  {
>> -	slab->obj_exts = OBJEXTS_ALLOC_FAIL;
>> +	cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>>  }
>
> A silly question:
>
> If mark_failed_objexts_alloc() succeeds and a concurrent
> alloc_slab_obj_exts() loses, should we retry cmpxchg() in
> alloc_slab_obj_exts()?

Great point.

We could modify it like this, perhaps?

static inline void mark_failed_objexts_alloc(struct slab *slab)
{
+	unsigned long old_exts = READ_ONCE(slab->obj_exts);
+	if (old_exts == 0)
+		cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
}

Do you have any better suggestions on your end?

* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Harry Yoo @ 2025-10-17  7:40 UTC
To: Hao Ge
Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, David Rientjes,
    Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm,
    linux-kernel, Hao Ge

On Fri, Oct 17, 2025 at 02:42:56PM +0800, Hao Ge wrote:
> On 2025/10/17 14:05, Harry Yoo wrote:
> > A silly question:
> >
> > If mark_failed_objexts_alloc() succeeds and a concurrent
> > alloc_slab_obj_exts() loses, should we retry cmpxchg() in
> > alloc_slab_obj_exts()?
>
> Great point.
>
> We could modify it like this, perhaps?
>
> static inline void mark_failed_objexts_alloc(struct slab *slab)
> {
> +	unsigned long old_exts = READ_ONCE(slab->obj_exts);
> +	if (old_exts == 0)
> +		cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
> }

I don't think this makes sense.
cmpxchg() fails anyway if old_exts != 0.

> Do you have any better suggestions on your end?

I meant something like this.

But someone might argue that this is not necessary anyway
if there's a severe memory pressure :)

diff --git a/mm/slub.c b/mm/slub.c
index a585d0ac45d4..4354ae68b0e1 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2139,6 +2139,11 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 		slab->obj_exts = new_exts;
 	} else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
 		   cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
+
+		old_exts = READ_ONCE(slab->obj_exts);
+		if (old_exts == OBJEXTS_ALLOC_FAIL &&
+		    cmpxchg(&slab->obj_exts, old_exts, new_exts) == old_exts)
+			goto out;
 		/*
 		 * If the slab is already in use, somebody can allocate and
 		 * assign slabobj_exts in parallel. In this case the existing
@@ -2152,6 +2157,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 		return 0;
 	}
 
+out:
 	kmemleak_not_leak(vec);
 	return 0;
 }

-- 
Cheers,
Harry / Hyeonggon

* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Vlastimil Babka @ 2025-10-17  8:21 UTC
To: Harry Yoo, Hao Ge
Cc: Andrew Morton, Christoph Lameter, David Rientjes, Roman Gushchin,
    Suren Baghdasaryan, Shakeel Butt, linux-mm, linux-kernel, Hao Ge

On 10/17/25 09:40, Harry Yoo wrote:
> On Fri, Oct 17, 2025 at 02:42:56PM +0800, Hao Ge wrote:
>> [...]
>> Do you have any better suggestions on your end?
>
> I meant something like this.
>
> But someone might argue that this is not necessary anyway
> if there's a severe memory pressure :)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index a585d0ac45d4..4354ae68b0e1 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2139,6 +2139,11 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>  		slab->obj_exts = new_exts;
>  	} else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
>  		   cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
> +
> +		old_exts = READ_ONCE(slab->obj_exts);
> +		if (old_exts == OBJEXTS_ALLOC_FAIL &&
> +		    cmpxchg(&slab->obj_exts, old_exts, new_exts) == old_exts)
> +			goto out;

Yeah, but either we make it a full loop or we don't care.
Maybe we could care because even without a severe memory pressure, one side
might be using kmalloc_nolock() and fail more easily. I'd bet it's what's
making this reproducible actually.

>  		/*
>  		 * If the slab is already in use, somebody can allocate and
>  		 * assign slabobj_exts in parallel. In this case the existing
> @@ -2152,6 +2157,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>  		return 0;
>  	}
>  
> +out:
>  	kmemleak_not_leak(vec);
>  	return 0;
>  }

* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Hao Ge @ 2025-10-17 10:02 UTC
To: Vlastimil Babka
Cc: Harry Yoo, Andrew Morton, Christoph Lameter, David Rientjes,
    Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm,
    linux-kernel, Hao Ge

> On Oct 17, 2025, at 16:22, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 10/17/25 09:40, Harry Yoo wrote:
>> [...]
>>
>> I don't think this makes sense.
>> cmpxchg() fails anyway if old_exts != 0.

Aha, sorry I misunderstood what you meant.

>>> Do you have any better suggestions on your end?
>>
>> I meant something like this.
>>
>> [... Harry's alloc_slab_obj_exts() retry diff snipped ...]
>
> Yeah, but either we make it a full loop or we don't care.
> Maybe we could care because even without a severe memory pressure, one side
> might be using kmalloc_nolock() and fail more easily. I'd bet it's what's
> making this reproducible actually.

From my understanding, it only affected the obj_ext associated with this
allocation, which was subsequently deallocated, leading to the loss of
this count. Is this correct?

* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Suren Baghdasaryan @ 2025-10-17 21:52 UTC
To: Vlastimil Babka
Cc: Hao Ge, Harry Yoo, Andrew Morton, Christoph Lameter, David Rientjes,
    Roman Gushchin, Shakeel Butt, linux-mm, linux-kernel, Hao Ge

On Fri, Oct 17, 2025 at 3:40 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 10/17/25 12:02, Hao Ge wrote:
> > [...]
> > From my understanding, it only affected the obj_ext associated with
> > this allocation, which was subsequently deallocated, leading to the
> > loss of this count. Is this correct?
>
> Yes.

I think retrying like this should work:

+retry:
 	old_exts = READ_ONCE(slab->obj_exts);
 	handle_failed_objexts_alloc(old_exts, vec, objects);
 	if (new_slab) {
@@ -2145,8 +2146,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 		 * be simply assigned.
 		 */
 		slab->obj_exts = new_exts;
-	} else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
-		   cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
+	} else if (old_exts & ~OBJEXTS_FLAGS_MASK) {
 		/*
 		 * If the slab is already in use, somebody can allocate and
 		 * assign slabobj_exts in parallel. In this case the existing
@@ -2158,6 +2158,8 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 		else
 			kfree(vec);
 		return 0;
+	} else if (cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
+		goto retry;
 	}
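
Pulled out of diff form, the control flow of this proposal is roughly
(a simplified sketch for illustration; the new_slab path and the
mark_objexts_empty()/allow_spin handling on the free path are omitted):

	for (;;) {
		old_exts = READ_ONCE(slab->obj_exts);
		handle_failed_objexts_alloc(old_exts, vec, objects);
		if (old_exts & ~OBJEXTS_FLAGS_MASK) {
			/* somebody else installed a real vector; drop ours */
			kfree(vec);
			return 0;
		}
		if (cmpxchg(&slab->obj_exts, old_exts, new_exts) == old_exts)
			break;	/* installed; a racing failure marker lost */
		/* lost to e.g. mark_failed_objexts_alloc(); re-read and retry */
	}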

* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Hao Ge @ 2025-10-20  2:01 UTC
To: Suren Baghdasaryan, Vlastimil Babka
Cc: Harry Yoo, Andrew Morton, Christoph Lameter, David Rientjes,
    Roman Gushchin, Shakeel Butt, linux-mm, linux-kernel, Hao Ge

On 2025/10/18 05:52, Suren Baghdasaryan wrote:
> On Fri, Oct 17, 2025 at 3:40 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>> On 10/17/25 12:02, Hao Ge wrote:
>>> From my understanding, it only affected the obj_ext associated with
>>> this allocation, which was subsequently deallocated, leading to the
>>> loss of this count. Is this correct?
>> Yes.

In that case, we may really need to handle this situation and require
a full loop. In theory, this scenario could occur:

Thread1                                      Thread2
alloc_slab_obj_exts()                        alloc_slab_obj_exts()
old_exts = READ_ONCE(slab->obj_exts) == 0
                                             mark_failed_objexts_alloc(slab)
cmpxchg(&slab->obj_exts,
        old_exts, new_exts) != old_exts
kfree(vec) and return 0
alloc_tag_add() ----> a panic occurs

Alternatively, is there any code logic I might have overlooked?

> I think retrying like this should work:
>
> [... retry diff snipped ...]

Agree with this. If there are no issues with my comment above,
I will send V2 based on Suren's suggestion.

Additionally, I believe the "Fixes" field should be written as follows:

Fixes: 09c46563ff6d ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations")

Am I wrong?

* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Vlastimil Babka @ 2025-10-20 10:20 UTC
To: Hao Ge, Suren Baghdasaryan
Cc: Harry Yoo, Andrew Morton, Christoph Lameter, David Rientjes,
    Roman Gushchin, Shakeel Butt, linux-mm, linux-kernel, Hao Ge

On 10/20/25 04:01, Hao Ge wrote:
>> I think retrying like this should work:
>>
>> [... retry diff snipped ...]
>
> Agree with this. If there are no issues with my comment above,
> I will send V2 based on Suren's suggestion.

Great.

> Additionally, I believe the "Fixes" field should be written as follows:
>
> Fixes: 09c46563ff6d ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations")
>
> Am I wrong?

I think it was safe before this one:
f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")