* [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Hao Ge @ 2025-10-17 4:57 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton, Christoph Lameter,
David Rientjes, Roman Gushchin, Harry Yoo, Suren Baghdasaryan
Cc: Shakeel Butt, linux-mm, linux-kernel, Hao Ge
From: Hao Ge <gehao@kylinos.cn>
In the alloc_slab_obj_exts function, there is a race condition
between the successful allocation of slab->obj_exts and its
being set to OBJEXTS_ALLOC_FAIL after a failed allocation.

When two threads are both allocating objects from the same slab,
they both end up entering the alloc_slab_obj_exts function because
the slab has no obj_exts allocated yet.

One call succeeds in allocation, but the racing one overwrites
our obj_ext with OBJEXTS_ALLOC_FAIL. The thread that successfully
allocated will have prepare_slab_obj_exts_hook() return
slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
already sees OBJEXTS_ALLOC_FAIL and thus returns an offset based
on the zero address.

It will then call alloc_tag_add, where the codetag_ref *ref member
of obj_exts is dereferenced. Thus, a NULL pointer dereference occurs,
leading to a panic.

To avoid that, use cmpxchg for the allocation failure case where
OBJEXTS_ALLOC_FAIL is assigned, so the marker is only stored while
obj_exts is still unset.

Thanks to Vlastimil and Suren for their help with debugging.
Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
Signed-off-by: Hao Ge <gehao@kylinos.cn>
---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/slub.c b/mm/slub.c
index 2e4340c75be2..9e6361796e34 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
static inline void mark_failed_objexts_alloc(struct slab *slab)
{
- slab->obj_exts = OBJEXTS_ALLOC_FAIL;
+ cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
}
static inline void handle_failed_objexts_alloc(unsigned long obj_exts,
--
2.25.1
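
To see the lost update outside the kernel, a minimal userspace sketch
of the same pattern may help; the FAKE_* names and the GCC __atomic
builtin below are illustrative stand-ins for slab->obj_exts and the
kernel's cmpxchg(), not the actual mm/slub.c code:

/*
 * Minimal userspace model of the race: a "winner" installs its vector,
 * then a failing racer tries to mark the allocation failure. A plain
 * store clobbers the vector; a compare-and-swap leaves it intact.
 */
#include <stdio.h>
#include <stdbool.h>

#define FAKE_ALLOC_FAIL 0x1UL	/* stand-in for OBJEXTS_ALLOC_FAIL */

static unsigned long obj_exts;	/* plays the role of slab->obj_exts */

/* Buggy variant: unconditional store, as before this patch. */
static void mark_failed_plain(void)
{
	obj_exts = FAKE_ALLOC_FAIL;
}

/* Fixed variant: only mark failure while obj_exts is still unset. */
static void mark_failed_cmpxchg(void)
{
	unsigned long expected = 0;

	__atomic_compare_exchange_n(&obj_exts, &expected, FAKE_ALLOC_FAIL,
				    false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

int main(void)
{
	unsigned long vec = 0xabcd00UL;	/* pretend vector address */

	obj_exts = vec;			/* winner installs its vector */
	mark_failed_plain();		/* racer overwrites it */
	printf("plain store: obj_exts = %#lx (vector lost)\n", obj_exts);

	obj_exts = vec;			/* winner installs its vector */
	mark_failed_cmpxchg();		/* racer's cmpxchg fails, no harm */
	printf("cmpxchg    : obj_exts = %#lx (vector kept)\n", obj_exts);
	return 0;
}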
* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Harry Yoo @ 2025-10-17 6:05 UTC (permalink / raw)
To: Hao Ge
Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
David Rientjes, Roman Gushchin, Suren Baghdasaryan, Shakeel Butt,
linux-mm, linux-kernel, Hao Ge
On Fri, Oct 17, 2025 at 12:57:49PM +0800, Hao Ge wrote:
> From: Hao Ge <gehao@kylinos.cn>
>
> In the alloc_slab_obj_exts function, there is a race condition
> between the successful allocation of slab->obj_exts and its
> being set to OBJEXTS_ALLOC_FAIL after a failed allocation.
>
> When two threads are both allocating objects from the same slab,
> they both end up entering the alloc_slab_obj_exts function because
> the slab has no obj_exts allocated yet.
>
> One call succeeds in allocation, but the racing one overwrites
> our obj_ext with OBJEXTS_ALLOC_FAIL. The thread that successfully
> allocated will have prepare_slab_obj_exts_hook() return
> slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
> already sees OBJEXTS_ALLOC_FAIL and thus returns an offset based
> on the zero address.
>
> It will then call alloc_tag_add, where the codetag_ref *ref member
> of obj_exts is dereferenced. Thus, a NULL pointer dereference occurs,
> leading to a panic.
>
> To avoid that, use cmpxchg for the allocation failure case where
> OBJEXTS_ALLOC_FAIL is assigned, so the marker is only stored while
> obj_exts is still unset.
>
> Thanks to Vlastimil and Suren for their help with debugging.
>
> Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
I think we should add Cc: stable as well?
We need an explicit Cc: stable to backport mm patches to -stable.
> Signed-off-by: Hao Ge <gehao@kylinos.cn>
> ---
> mm/slub.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 2e4340c75be2..9e6361796e34 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
>
> static inline void mark_failed_objexts_alloc(struct slab *slab)
> {
> - slab->obj_exts = OBJEXTS_ALLOC_FAIL;
> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
> }
A silly question:
If mark_failed_objexts_alloc() succeeds and a concurrent
alloc_slab_obj_exts() loses, should we retry cmpxchg() in
alloc_slab_obj_exts()?
> --
> 2.25.1
--
Cheers,
Harry / Hyeonggon
* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Hao Ge @ 2025-10-17 6:42 UTC (permalink / raw)
To: Harry Yoo
Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
David Rientjes, Roman Gushchin, Suren Baghdasaryan, Shakeel Butt,
linux-mm, linux-kernel, Hao Ge
Hi Harry
Thank you for your quick response.
On 2025/10/17 14:05, Harry Yoo wrote:
> On Fri, Oct 17, 2025 at 12:57:49PM +0800, Hao Ge wrote:
>> From: Hao Ge <gehao@kylinos.cn>
>>
>> In the alloc_slab_obj_exts function, there is a race condition
>> between the successful allocation of slab->obj_exts and its
>> being set to OBJEXTS_ALLOC_FAIL after a failed allocation.
>>
>> When two threads are both allocating objects from the same slab,
>> they both end up entering the alloc_slab_obj_exts function because
>> the slab has no obj_exts allocated yet.
>>
>> One call succeeds in allocation, but the racing one overwrites
>> our obj_ext with OBJEXTS_ALLOC_FAIL. The thread that successfully
>> allocated will have prepare_slab_obj_exts_hook() return
>> slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
>> already sees OBJEXTS_ALLOC_FAIL and thus returns an offset based
>> on the zero address.
>>
>> It will then call alloc_tag_add, where the codetag_ref *ref member
>> of obj_exts is dereferenced. Thus, a NULL pointer dereference occurs,
>> leading to a panic.
>>
>> To avoid that, use cmpxchg for the allocation failure case where
>> OBJEXTS_ALLOC_FAIL is assigned, so the marker is only stored while
>> obj_exts is still unset.
>>
>> Thanks to Vlastimil and Suren for their help with debugging.
>>
>> Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
> I think we should add Cc: stable as well?
> We need an explicit Cc: stable to backport mm patches to -stable.
Oh sorry, I missed this.
>
>> Signed-off-by: Hao Ge <gehao@kylinos.cn>
>> ---
>> mm/slub.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 2e4340c75be2..9e6361796e34 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
>>
>> static inline void mark_failed_objexts_alloc(struct slab *slab)
>> {
>> - slab->obj_exts = OBJEXTS_ALLOC_FAIL;
>> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>> }
> A silly question:
>
> If mark_failed_objexts_alloc() succeeds and a concurrent
> alloc_slab_obj_exts() loses, should we retry cmpxchg() in
> alloc_slab_obj_exts()?
Great point.
We could modify it like this, perhaps?
static inline void mark_failed_objexts_alloc(struct slab *slab)
{
+ unsigned long old_exts = READ_ONCE(slab->obj_exts);
+ if (old_exts == 0)
+ cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
}
Do you have any better suggestions on your end?
>
>> --
>> 2.25.1
* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Harry Yoo @ 2025-10-17 7:40 UTC (permalink / raw)
To: Hao Ge
Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
David Rientjes, Roman Gushchin, Suren Baghdasaryan, Shakeel Butt,
linux-mm, linux-kernel, Hao Ge
On Fri, Oct 17, 2025 at 02:42:56PM +0800, Hao Ge wrote:
> Hi Harry
>
>
> Thank you for your quick response.
>
>
> On 2025/10/17 14:05, Harry Yoo wrote:
> > On Fri, Oct 17, 2025 at 12:57:49PM +0800, Hao Ge wrote:
> > > From: Hao Ge <gehao@kylinos.cn>
> > >
> > > In the alloc_slab_obj_exts function, there is a race condition
> > > between the successful allocation of slab->obj_exts and its
> > > being set to OBJEXTS_ALLOC_FAIL after a failed allocation.
> > >
> > > When two threads are both allocating objects from the same slab,
> > > they both end up entering the alloc_slab_obj_exts function because
> > > the slab has no obj_exts allocated yet.
> > >
> > > One call succeeds in allocation, but the racing one overwrites
> > > our obj_ext with OBJEXTS_ALLOC_FAIL. The thread that successfully
> > > allocated will have prepare_slab_obj_exts_hook() return
> > > slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
> > > already sees OBJEXTS_ALLOC_FAIL and thus returns an offset based
> > > on the zero address.
> > >
> > > It will then call alloc_tag_add, where the codetag_ref *ref member
> > > of obj_exts is dereferenced. Thus, a NULL pointer dereference occurs,
> > > leading to a panic.
> > >
> > > To avoid that, use cmpxchg for the allocation failure case where
> > > OBJEXTS_ALLOC_FAIL is assigned, so the marker is only stored while
> > > obj_exts is still unset.
> > >
> > > Thanks to Vlastimil and Suren for their help with debugging.
> > >
> > > Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
> > I think we should add Cc: stable as well?
> > We need an explicit Cc: stable to backport mm patches to -stable.
> Oh sorry, I missed this.
> >
> > > Signed-off-by: Hao Ge <gehao@kylinos.cn>
> > > ---
> > > mm/slub.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 2e4340c75be2..9e6361796e34 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
> > > static inline void mark_failed_objexts_alloc(struct slab *slab)
> > > {
> > > - slab->obj_exts = OBJEXTS_ALLOC_FAIL;
> > > + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
> > > }
> > A silly question:
> >
> > If mark_failed_objexts_alloc() succeeds and a concurrent
> > alloc_slab_obj_exts() loses, should we retry cmpxchg() in
> > alloc_slab_obj_exts()?
>
> Great point.
>
> We could modify it like this, perhaps?
>
> static inline void mark_failed_objexts_alloc(struct slab *slab)
> {
> + unsigned long old_exts = READ_ONCE(slab->obj_exts);
> + if (old_exts == 0)
> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
> }
I don't think this makes sense.
cmpxchg() fails anyway if old_exts != 0.
> Do you have any better suggestions on your end?
I meant something like this.
But someone might argue that this is not necessary anyway
if there's severe memory pressure :)
diff --git a/mm/slub.c b/mm/slub.c
index a585d0ac45d4..4354ae68b0e1 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2139,6 +2139,11 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
slab->obj_exts = new_exts;
} else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
+
+ old_exts = READ_ONCE(slab->obj_exts);
+ if (old_exts == OBJEXTS_ALLOC_FAIL &&
+ cmpxchg(&slab->obj_exts, old_exts, new_exts) == old_exts)
+ goto out;
/*
* If the slab is already in use, somebody can allocate and
* assign slabobj_exts in parallel. In this case the existing
@@ -2152,6 +2157,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
return 0;
}
+out:
kmemleak_not_leak(vec);
return 0;
}
> >
> > > --
> > > 2.25.1
>
--
Cheers,
Harry / Hyeonggon
* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Vlastimil Babka @ 2025-10-17 8:21 UTC (permalink / raw)
To: Harry Yoo, Hao Ge
Cc: Andrew Morton, Christoph Lameter, David Rientjes, Roman Gushchin,
Suren Baghdasaryan, Shakeel Butt, linux-mm, linux-kernel, Hao Ge
On 10/17/25 09:40, Harry Yoo wrote:
> On Fri, Oct 17, 2025 at 02:42:56PM +0800, Hao Ge wrote:
>> Hi Harry
>>
>>
>> Thank you for your quick response.
>>
>>
>> On 2025/10/17 14:05, Harry Yoo wrote:
>> > On Fri, Oct 17, 2025 at 12:57:49PM +0800, Hao Ge wrote:
>> > > From: Hao Ge <gehao@kylinos.cn>
>> > >
>> > > In the alloc_slab_obj_exts function, there is a race condition
>> > > between the successful allocation of slab->obj_exts and its
>> > > being set to OBJEXTS_ALLOC_FAIL after a failed allocation.
>> > >
>> > > When two threads are both allocating objects from the same slab,
>> > > they both end up entering the alloc_slab_obj_exts function because
>> > > the slab has no obj_exts allocated yet.
>> > >
>> > > One call succeeds in allocation, but the racing one overwrites
>> > > our obj_ext with OBJEXTS_ALLOC_FAIL. The thread that successfully
>> > > allocated will have prepare_slab_obj_exts_hook() return
>> > > slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
>> > > already sees OBJEXTS_ALLOC_FAIL and thus returns an offset based
>> > > on the zero address.
>> > >
>> > > It will then call alloc_tag_add, where the codetag_ref *ref member
>> > > of obj_exts is dereferenced. Thus, a NULL pointer dereference occurs,
>> > > leading to a panic.
>> > >
>> > > To avoid that, use cmpxchg for the allocation failure case where
>> > > OBJEXTS_ALLOC_FAIL is assigned, so the marker is only stored while
>> > > obj_exts is still unset.
>> > >
>> > > Thanks to Vlastimil and Suren for their help with debugging.
>> > >
>> > > Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
>> > I think we should add Cc: stable as well?
>> > We need an explicit Cc: stable to backport mm patches to -stable.
>> Oh sorry, I missed this.
>> >
>> > > Signed-off-by: Hao Ge <gehao@kylinos.cn>
>> > > ---
>> > > mm/slub.c | 2 +-
>> > > 1 file changed, 1 insertion(+), 1 deletion(-)
>> > >
>> > > diff --git a/mm/slub.c b/mm/slub.c
>> > > index 2e4340c75be2..9e6361796e34 100644
>> > > --- a/mm/slub.c
>> > > +++ b/mm/slub.c
>> > > @@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
>> > > static inline void mark_failed_objexts_alloc(struct slab *slab)
>> > > {
>> > > - slab->obj_exts = OBJEXTS_ALLOC_FAIL;
>> > > + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>> > > }
>> > A silly question:
>> >
>> > If mark_failed_objexts_alloc() succeeds and a concurrent
>> > alloc_slab_obj_exts() loses, should we retry cmpxchg() in
>> > alloc_slab_obj_exts()?
>>
>> Great point.
>>
>> We could modify it like this, perhaps?
>>
>> static inline void mark_failed_objexts_alloc(struct slab *slab)
>> {
>> + unsigned long old_exts = READ_ONCE(slab->obj_exts);
>> + if (old_exts == 0)
>> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>> }
>
> I don't think this makes sense.
> cmpxchg() fails anyway if old_exts != 0.
>
>> Do you have any better suggestions on your end?
>
> I meant something like this.
>
> But someone might argue that this is not necessary anyway
> if there's severe memory pressure :)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index a585d0ac45d4..4354ae68b0e1 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2139,6 +2139,11 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> slab->obj_exts = new_exts;
> } else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
> cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
> +
> + old_exts = READ_ONCE(slab->obj_exts);
> + if (old_exts == OBJEXTS_ALLOC_FAIL &&
> + cmpxchg(&slab->obj_exts, old_exts, new_exts) == old_exts)
> + goto out;
Yeah, but either we make it a full loop or we don't care.
Maybe we could care because even without severe memory pressure, one side
might be using kmalloc_nolock() and fail more easily. I'd bet it's what's
making this reproducible actually.
> /*
> * If the slab is already in use, somebody can allocate and
> * assign slabobj_exts in parallel. In this case the existing
> @@ -2152,6 +2157,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> return 0;
> }
>
> +out:
> kmemleak_not_leak(vec);
> return 0;
> }
>
>> >
>> > > --
>> > > 2.25.1
>>
>
* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Hao Ge @ 2025-10-17 10:02 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Harry Yoo, Andrew Morton, Christoph Lameter, David Rientjes,
Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm,
linux-kernel, Hao Ge
> On Oct 17, 2025, at 16:22, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 10/17/25 09:40, Harry Yoo wrote:
>>> On Fri, Oct 17, 2025 at 02:42:56PM +0800, Hao Ge wrote:
>>> Hi Harry
>>>
>>>
>>> Thank you for your quick response.
>>>
>>>
>>> On 2025/10/17 14:05, Harry Yoo wrote:
>>>> On Fri, Oct 17, 2025 at 12:57:49PM +0800, Hao Ge wrote:
>>>>> From: Hao Ge <gehao@kylinos.cn>
>>>>>
>>>>> In the alloc_slab_obj_exts function, there is a race condition
>>>>> between the successful allocation of slab->obj_exts and its
>>>>> being set to OBJEXTS_ALLOC_FAIL after a failed allocation.
>>>>>
>>>>> When two threads are both allocating objects from the same slab,
>>>>> they both end up entering the alloc_slab_obj_exts function because
>>>>> the slab has no obj_exts allocated yet.
>>>>>
>>>>> One call succeeds in allocation, but the racing one overwrites
>>>>> our obj_ext with OBJEXTS_ALLOC_FAIL. The thread that successfully
>>>>> allocated will have prepare_slab_obj_exts_hook() return
>>>>> slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
>>>>> already sees OBJEXTS_ALLOC_FAIL and thus returns an offset based
>>>>> on the zero address.
>>>>>
>>>>> It will then call alloc_tag_add, where the codetag_ref *ref member
>>>>> of obj_exts is dereferenced. Thus, a NULL pointer dereference occurs,
>>>>> leading to a panic.
>>>>>
>>>>> To avoid that, use cmpxchg for the allocation failure case where
>>>>> OBJEXTS_ALLOC_FAIL is assigned, so the marker is only stored while
>>>>> obj_exts is still unset.
>>>>>
>>>>> Thanks to Vlastimil and Suren for their help with debugging.
>>>>>
>>>>> Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
>>>> I think we should add Cc: stable as well?
>>>> We need an explicit Cc: stable to backport mm patches to -stable.
>>> Oh sorry, I missed this.
>>>>
>>>>> Signed-off-by: Hao Ge <gehao@kylinos.cn>
>>>>> ---
>>>>> mm/slub.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/mm/slub.c b/mm/slub.c
>>>>> index 2e4340c75be2..9e6361796e34 100644
>>>>> --- a/mm/slub.c
>>>>> +++ b/mm/slub.c
>>>>> @@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
>>>>> static inline void mark_failed_objexts_alloc(struct slab *slab)
>>>>> {
>>>>> - slab->obj_exts = OBJEXTS_ALLOC_FAIL;
>>>>> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>>>>> }
>>>> A silly question:
>>>>
>>>> If mark_failed_objexts_alloc() succeeds and a concurrent
>>>> alloc_slab_obj_exts() loses, should we retry cmpxchg() in
>>>> alloc_slab_obj_exts()?
>>>
>>> Great point.
>>>
>>> We could modify it like this, perhaps?
>>>
>>> static inline void mark_failed_objexts_alloc(struct slab *slab)
>>> {
>>> + unsigned long old_exts = READ_ONCE(slab->obj_exts);
>>> + if (old_exts == 0)
>>> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>>> }
>>
>> I don't think this makes sense.
>> cmpxchg() fails anyway if old_exts != 0.
Aha, sorry I misunderstood what you meant.
>>
>>> Do you have any better suggestions on your end?
>>
>> I meant something like this.
>>
>> But someone might argue that this is not necessary anyway
>> if there's severe memory pressure :)
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index a585d0ac45d4..4354ae68b0e1 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2139,6 +2139,11 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>> slab->obj_exts = new_exts;
>> } else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
>> cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
>> +
>> + old_exts = READ_ONCE(slab->obj_exts);
>> + if (old_exts == OBJEXTS_ALLOC_FAIL &&
>> + cmpxchg(&slab->obj_exts, old_exts, new_exts) == old_exts)
>> + goto out;
>
> Yeah, but either we make it a full loop or we don't care.
> Maybe we could care because even without severe memory pressure, one side
> might be using kmalloc_nolock() and fail more easily. I'd bet it's what's
> making this reproducible actually.
From my understanding, it only affected the obj_ext associated with this allocation, which was subsequently deallocated, leading to the loss of this count. Is this correct?
>
>> /*
>> * If the slab is already in use, somebody can allocate and
>> * assign slabobj_exts in parallel. In this case the existing
>> @@ -2152,6 +2157,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>> return 0;
>> }
>>
>> +out:
>> kmemleak_not_leak(vec);
>> return 0;
>> }
>>
>>>>
>>>>> --
>>>>> 2.25.1
>>>
>>
>
* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Vlastimil Babka @ 2025-10-17 10:40 UTC (permalink / raw)
To: Hao Ge
Cc: Harry Yoo, Andrew Morton, Christoph Lameter, David Rientjes,
Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm,
linux-kernel, Hao Ge
On 10/17/25 12:02, Hao Ge wrote:
>
>
>> On Oct 17, 2025, at 16:22, Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> On 10/17/25 09:40, Harry Yoo wrote:
>>>> On Fri, Oct 17, 2025 at 02:42:56PM +0800, Hao Ge wrote:
>>>> Hi Harry
>>>>
>>>>
>>>> Thank you for your quick response.
>>>>
>>>>
>>>> On 2025/10/17 14:05, Harry Yoo wrote:
>>>>> On Fri, Oct 17, 2025 at 12:57:49PM +0800, Hao Ge wrote:
>>>>>> From: Hao Ge <gehao@kylinos.cn>
>>>>>>
>>>>>> In the alloc_slab_obj_exts function, there is a race condition
>>>>>> between the successful allocation of slab->obj_exts and its
>>>>>> being set to OBJEXTS_ALLOC_FAIL after a failed allocation.
>>>>>>
>>>>>> When two threads are both allocating objects from the same slab,
>>>>>> they both end up entering the alloc_slab_obj_exts function because
>>>>>> the slab has no obj_exts allocated yet.
>>>>>>
>>>>>> One call succeeds in allocation, but the racing one overwrites
>>>>>> our obj_ext with OBJEXTS_ALLOC_FAIL. The thread that successfully
>>>>>> allocated will have prepare_slab_obj_exts_hook() return
>>>>>> slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
>>>>>> already sees OBJEXTS_ALLOC_FAIL and thus returns an offset based
>>>>>> on the zero address.
>>>>>>
>>>>>> It will then call alloc_tag_add, where the codetag_ref *ref member
>>>>>> of obj_exts is dereferenced. Thus, a NULL pointer dereference occurs,
>>>>>> leading to a panic.
>>>>>>
>>>>>> To avoid that, use cmpxchg for the allocation failure case where
>>>>>> OBJEXTS_ALLOC_FAIL is assigned, so the marker is only stored while
>>>>>> obj_exts is still unset.
>>>>>>
>>>>>> Thanks to Vlastimil and Suren for their help with debugging.
>>>>>>
>>>>>> Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
>>>>> I think we should add Cc: stable as well?
>>>>> We need an explicit Cc: stable to backport mm patches to -stable.
>>>> Oh sorry, I missed this.
>>>>>
>>>>>> Signed-off-by: Hao Ge <gehao@kylinos.cn>
>>>>>> ---
>>>>>> mm/slub.c | 2 +-
>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/mm/slub.c b/mm/slub.c
>>>>>> index 2e4340c75be2..9e6361796e34 100644
>>>>>> --- a/mm/slub.c
>>>>>> +++ b/mm/slub.c
>>>>>> @@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
>>>>>> static inline void mark_failed_objexts_alloc(struct slab *slab)
>>>>>> {
>>>>>> - slab->obj_exts = OBJEXTS_ALLOC_FAIL;
>>>>>> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>>>>>> }
>>>>> A silly question:
>>>>>
>>>>> If mark_failed_objexts_alloc() succeeds and a concurrent
>>>>> alloc_slab_obj_exts() loses, should we retry cmpxchg() in
>>>>> alloc_slab_obj_exts()?
>>>>
>>>> Great point.
>>>>
>>>> We could modify it like this, perhaps?
>>>>
>>>> static inline void mark_failed_objexts_alloc(struct slab *slab)
>>>> {
>>>> + unsigned long old_exts = READ_ONCE(slab->obj_exts);
>>>> + if (old_exts == 0)
>>>> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>>>> }
>>>
>>> I don't think this makes sense.
>>> cmpxchg() fails anyway if old_exts != 0.
>
> Aha, sorry I misunderstood what you meant.
>
>>>
>>>> Do you have any better suggestions on your end?
>>>
>>> I meant something like this.
>>>
>>> But someone might argue that this is not necessary anyway
>>> if there's severe memory pressure :)
>>>
>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index a585d0ac45d4..4354ae68b0e1 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -2139,6 +2139,11 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>>> slab->obj_exts = new_exts;
>>> } else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
>>> cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
>>> +
>>> + old_exts = READ_ONCE(slab->obj_exts);
>>> + if (old_exts == OBJEXTS_ALLOC_FAIL &&
>>> + cmpxchg(&slab->obj_exts, old_exts, new_exts) == old_exts)
>>> + goto out;
>>
>> Yeah, but either we make it a full loop or we don't care.
>> Maybe we could care because even without severe memory pressure, one side
>> might be using kmalloc_nolock() and fail more easily. I'd bet it's what's
>> making this reproducible actually.
>
> From my understanding, it only affected the obj_ext associated with this allocation, which was subsequently deallocated, leading to the loss of this count. Is this correct?
Yes.
>
>>
>>> /*
>>> * If the slab is already in use, somebody can allocate and
>>> * assign slabobj_exts in parallel. In this case the existing
>>> @@ -2152,6 +2157,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>>> return 0;
>>> }
>>>
>>> +out:
>>> kmemleak_not_leak(vec);
>>> return 0;
>>> }
>>>
>>>>>
>>>>>> --
>>>>>> 2.25.1
>>>>
>>>
>>
* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Suren Baghdasaryan @ 2025-10-17 21:52 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Hao Ge, Harry Yoo, Andrew Morton, Christoph Lameter,
David Rientjes, Roman Gushchin, Shakeel Butt, linux-mm,
linux-kernel, Hao Ge
On Fri, Oct 17, 2025 at 3:40 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 10/17/25 12:02, Hao Ge wrote:
> >
> >
> >> On Oct 17, 2025, at 16:22, Vlastimil Babka <vbabka@suse.cz> wrote:
> >>
> >> On 10/17/25 09:40, Harry Yoo wrote:
> >>>> On Fri, Oct 17, 2025 at 02:42:56PM +0800, Hao Ge wrote:
> >>>> Hi Harry
> >>>>
> >>>>
> >>>> Thank you for your quick response.
> >>>>
> >>>>
> >>>> On 2025/10/17 14:05, Harry Yoo wrote:
> >>>>> On Fri, Oct 17, 2025 at 12:57:49PM +0800, Hao Ge wrote:
> >>>>>> From: Hao Ge <gehao@kylinos.cn>
> >>>>>>
> >>>>>> In the alloc_slab_obj_exts function, there is a race condition
> >>>>>> between the successful allocation of slab->obj_exts and its
> >>>>>> being set to OBJEXTS_ALLOC_FAIL after a failed allocation.
> >>>>>>
> >>>>>> When two threads are both allocating objects from the same slab,
> >>>>>> they both end up entering the alloc_slab_obj_exts function because
> >>>>>> the slab has no obj_exts allocated yet.
> >>>>>>
> >>>>>> One call succeeds in allocation, but the racing one overwrites
> >>>>>> our obj_ext with OBJEXTS_ALLOC_FAIL. The thread that successfully
> >>>>>> allocated will have prepare_slab_obj_exts_hook() return
> >>>>>> slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
> >>>>>> already sees OBJEXTS_ALLOC_FAIL and thus returns an offset based
> >>>>>> on the zero address.
> >>>>>>
> >>>>>> It will then call alloc_tag_add, where the codetag_ref *ref member
> >>>>>> of obj_exts is dereferenced. Thus, a NULL pointer dereference occurs,
> >>>>>> leading to a panic.
> >>>>>>
> >>>>>> To avoid that, use cmpxchg for the allocation failure case where
> >>>>>> OBJEXTS_ALLOC_FAIL is assigned, so the marker is only stored while
> >>>>>> obj_exts is still unset.
> >>>>>>
> >>>>>> Thanks to Vlastimil and Suren for their help with debugging.
> >>>>>>
> >>>>>> Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
> >>>>> I think we should add Cc: stable as well?
> >>>>> We need an explicit Cc: stable to backport mm patches to -stable.
> >>>> Oh sorry, I missed this.
> >>>>>
> >>>>>> Signed-off-by: Hao Ge <gehao@kylinos.cn>
> >>>>>> ---
> >>>>>> mm/slub.c | 2 +-
> >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/mm/slub.c b/mm/slub.c
> >>>>>> index 2e4340c75be2..9e6361796e34 100644
> >>>>>> --- a/mm/slub.c
> >>>>>> +++ b/mm/slub.c
> >>>>>> @@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
> >>>>>> static inline void mark_failed_objexts_alloc(struct slab *slab)
> >>>>>> {
> >>>>>> - slab->obj_exts = OBJEXTS_ALLOC_FAIL;
> >>>>>> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
> >>>>>> }
> >>>>> A silly question:
> >>>>>
> >>>>> If mark_failed_objexts_alloc() succeeds and a concurrent
> >>>>> alloc_slab_obj_exts() loses, should we retry cmpxchg() in
> >>>>> alloc_slab_obj_exts()?
> >>>>
> >>>> Great point.
> >>>>
> >>>> We could modify it like this, perhaps?
> >>>>
> >>>> static inline void mark_failed_objexts_alloc(struct slab *slab)
> >>>> {
> >>>> + unsigned long old_exts = READ_ONCE(slab->obj_exts);
> >>>> + if (old_exts == 0)
> >>>> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
> >>>> }
> >>>
> >>> I don't think this makes sense.
> >>> cmpxchg() fails anyway if old_exts != 0.
> >
> > Aha, sorry I misunderstood what you meant.
> >
> >>>
> >>>> Do you have any better suggestions on your end?
> >>>
> >>> I meant something like this.
> >>>
> >>> But someone might argue that this is not necessary anyway
> >>> if there's severe memory pressure :)
> >>>
> >>> diff --git a/mm/slub.c b/mm/slub.c
> >>> index a585d0ac45d4..4354ae68b0e1 100644
> >>> --- a/mm/slub.c
> >>> +++ b/mm/slub.c
> >>> @@ -2139,6 +2139,11 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> >>> slab->obj_exts = new_exts;
> >>> } else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
> >>> cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
> >>> +
> >>> + old_exts = READ_ONCE(slab->obj_exts);
> >>> + if (old_exts == OBJEXTS_ALLOC_FAIL &&
> >>> + cmpxchg(&slab->obj_exts, old_exts, new_exts) == old_exts)
> >>> + goto out;
> >>
> >> Yeah, but either we make it a full loop or we don't care.
> >> Maybe we could care because even without severe memory pressure, one side
> >> might be using kmalloc_nolock() and fail more easily. I'd bet it's what's
> >> making this reproducible actually.
> >
> > From my understanding, it only affected the obj_ext associated with this allocation, which was subsequently deallocated, leading to the loss of this count. Is this correct?
>
> Yes.
I think retrying like this should work:
+retry:
old_exts = READ_ONCE(slab->obj_exts);
handle_failed_objexts_alloc(old_exts, vec, objects);
if (new_slab) {
@@ -2145,8 +2146,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
* be simply assigned.
*/
slab->obj_exts = new_exts;
- } else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
- cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
+ } else if (old_exts & ~OBJEXTS_FLAGS_MASK) {
/*
* If the slab is already in use, somebody can allocate and
* assign slabobj_exts in parallel. In this case the existing
@@ -2158,6 +2158,8 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
else
kfree(vec);
return 0;
+ } else if (cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
+ goto retry;
}
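
For readers following along outside the tree, a rough userspace sketch
of the retry shape above (hypothetical names: a GCC __atomic builtin
stands in for cmpxchg(), FAKE_FLAGS_MASK for OBJEXTS_FLAGS_MASK, and
the folding of old flag bits into new_exts is omitted):

#include <stdio.h>
#include <stdbool.h>

#define FAKE_FLAGS_MASK 0x3UL	/* stand-in for OBJEXTS_FLAGS_MASK */
#define FAKE_ALLOC_FAIL 0x1UL	/* stand-in for OBJEXTS_ALLOC_FAIL */

static unsigned long obj_exts;	/* plays the role of slab->obj_exts */

/* Returns true if our vector was installed, false if we must free it. */
static bool install_vector(unsigned long new_exts)
{
	unsigned long old;

	for (;;) {
		old = __atomic_load_n(&obj_exts, __ATOMIC_ACQUIRE);
		if (old & ~FAKE_FLAGS_MASK)
			return false;	/* somebody else's vector won */
		if (__atomic_compare_exchange_n(&obj_exts, &old, new_exts,
						false, __ATOMIC_SEQ_CST,
						__ATOMIC_SEQ_CST))
			return true;	/* ours is installed */
		/* lost to a concurrent store (e.g. the failure marker):
		 * re-read and try again instead of giving up */
	}
}

int main(void)
{
	obj_exts = FAKE_ALLOC_FAIL;	/* a failing racer marked the slab */
	printf("installed=%d obj_exts=%#lx\n",
	       install_vector(0xabcd00UL), obj_exts);
	return 0;
}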
>
> >
> >>
> >>> /*
> >>> * If the slab is already in use, somebody can allocate and
> >>> * assign slabobj_exts in parallel. In this case the existing
> >>> @@ -2152,6 +2157,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> >>> return 0;
> >>> }
> >>>
> >>> +out:
> >>> kmemleak_not_leak(vec);
> >>> return 0;
> >>> }
> >>>
> >>>>>
> >>>>>> --
> >>>>>> 2.25.1
> >>>>
> >>>
> >>
>
* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Hao Ge @ 2025-10-20 2:01 UTC (permalink / raw)
To: Suren Baghdasaryan, Vlastimil Babka
Cc: Harry Yoo, Andrew Morton, Christoph Lameter, David Rientjes,
Roman Gushchin, Shakeel Butt, linux-mm, linux-kernel, Hao Ge
On 2025/10/18 05:52, Suren Baghdasaryan wrote:
> On Fri, Oct 17, 2025 at 3:40 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>> On 10/17/25 12:02, Hao Ge wrote:
>>>
>>>> On Oct 17, 2025, at 16:22, Vlastimil Babka <vbabka@suse.cz> wrote:
>>>>
>>>> On 10/17/25 09:40, Harry Yoo wrote:
>>>>>> On Fri, Oct 17, 2025 at 02:42:56PM +0800, Hao Ge wrote:
>>>>>> Hi Harry
>>>>>>
>>>>>>
>>>>>> Thank you for your quick response.
>>>>>>
>>>>>>
>>>>>> On 2025/10/17 14:05, Harry Yoo wrote:
>>>>>>> On Fri, Oct 17, 2025 at 12:57:49PM +0800, Hao Ge wrote:
>>>>>>>> From: Hao Ge <gehao@kylinos.cn>
>>>>>>>>
>>>>>>>> In the alloc_slab_obj_exts function, there is a race condition
>>>>>>>> between the successful allocation of slab->obj_exts and its
>>>>>>>> being set to OBJEXTS_ALLOC_FAIL after a failed allocation.
>>>>>>>>
>>>>>>>> When two threads are both allocating objects from the same slab,
>>>>>>>> they both end up entering the alloc_slab_obj_exts function because
>>>>>>>> the slab has no obj_exts allocated yet.
>>>>>>>>
>>>>>>>> One call succeeds in allocation, but the racing one overwrites
>>>>>>>> our obj_ext with OBJEXTS_ALLOC_FAIL. The thread that successfully
>>>>>>>> allocated will have prepare_slab_obj_exts_hook() return
>>>>>>>> slab_obj_exts(slab) + obj_to_index(s, slab, p), where slab_obj_exts(slab)
>>>>>>>> already sees OBJEXTS_ALLOC_FAIL and thus returns an offset based
>>>>>>>> on the zero address.
>>>>>>>>
>>>>>>>> It will then call alloc_tag_add, where the codetag_ref *ref member
>>>>>>>> of obj_exts is dereferenced. Thus, a NULL pointer dereference occurs,
>>>>>>>> leading to a panic.
>>>>>>>>
>>>>>>>> To avoid that, use cmpxchg for the allocation failure case where
>>>>>>>> OBJEXTS_ALLOC_FAIL is assigned, so the marker is only stored while
>>>>>>>> obj_exts is still unset.
>>>>>>>>
>>>>>>>> Thanks to Vlastimil and Suren for their help with debugging.
>>>>>>>>
>>>>>>>> Fixes: f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")
>>>>>>> I think we should add Cc: stable as well?
>>>>>>> We need an explicit Cc: stable to backport mm patches to -stable.
>>>>>> Oh sorry, I missed this.
>>>>>>>> Signed-off-by: Hao Ge <gehao@kylinos.cn>
>>>>>>>> ---
>>>>>>>> mm/slub.c | 2 +-
>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/mm/slub.c b/mm/slub.c
>>>>>>>> index 2e4340c75be2..9e6361796e34 100644
>>>>>>>> --- a/mm/slub.c
>>>>>>>> +++ b/mm/slub.c
>>>>>>>> @@ -2054,7 +2054,7 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts)
>>>>>>>> static inline void mark_failed_objexts_alloc(struct slab *slab)
>>>>>>>> {
>>>>>>>> - slab->obj_exts = OBJEXTS_ALLOC_FAIL;
>>>>>>>> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>>>>>>>> }
>>>>>>> A silly question:
>>>>>>>
>>>>>>> If mark_failed_objexts_alloc() succeeds and a concurrent
>>>>>>> alloc_slab_obj_exts() loses, should we retry cmpxchg() in
>>>>>>> alloc_slab_obj_exts()?
>>>>>> Great point.
>>>>>>
>>>>>> We could modify it like this, perhaps?
>>>>>>
>>>>>> static inline void mark_failed_objexts_alloc(struct slab *slab)
>>>>>> {
>>>>>> + unsigned long old_exts = READ_ONCE(slab->obj_exts);
>>>>>> + if (old_exts == 0)
>>>>>> + cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
>>>>>> }
>>>>> I don't think this makes sense.
>>>>> cmpxchg() fails anyway if old_exts != 0.
>>> Aha, sorry I misunderstood what you meant.
>>>
>>>>>> Do you have any better suggestions on your end?
>>>>> I meant something like this.
>>>>>
>>>>> But someone might argue that this is not necessary anyway
>>>>> if there's severe memory pressure :)
>>>>>
>>>>> diff --git a/mm/slub.c b/mm/slub.c
>>>>> index a585d0ac45d4..4354ae68b0e1 100644
>>>>> --- a/mm/slub.c
>>>>> +++ b/mm/slub.c
>>>>> @@ -2139,6 +2139,11 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>>>>> slab->obj_exts = new_exts;
>>>>> } else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
>>>>> cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
>>>>> +
>>>>> + old_exts = READ_ONCE(slab->obj_exts);
>>>>> + if (old_exts == OBJEXTS_ALLOC_FAIL &&
>>>>> + cmpxchg(&slab->obj_exts, old_exts, new_exts) == old_exts)
>>>>> + goto out;
>>>> Yeah, but either we make it a full loop or we don't care.
>>>> Maybe we could care because even without severe memory pressure, one side
>>>> might be using kmalloc_nolock() and fail more easily. I'd bet it's what's
>>>> making this reproducible actually.
>>> From my understanding, it only affected the obj_ext associated with this allocation, which was subsequently deallocated, leading to the loss of this count. Is this correct?
>> Yes.
In that case, we may really need to handle this situation with a
full loop.
In theory, this scenario could occur:
Thread1                                        Thread2
alloc_slab_obj_exts                            alloc_slab_obj_exts
old_exts = READ_ONCE(slab->obj_exts) = 0
                                               mark_failed_objexts_alloc(slab);
cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts
kfree and return 0;
alloc_tag_add ----> a panic occurs
Alternatively, is there any code logic I might have overlooked?
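
This interleaving can be replayed deterministically in a small
userspace sketch (hypothetical FAKE_* names; a plain helper models
cmpxchg(), and sequential calls stand in for the two threads):

#include <stdio.h>
#include <stdbool.h>

#define FAKE_ALLOC_FAIL 0x1UL	/* stand-in for OBJEXTS_ALLOC_FAIL */

static unsigned long obj_exts;	/* plays the role of slab->obj_exts */

/* single-threaded model of cmpxchg(): swap only if *p still equals old */
static bool cas(unsigned long *p, unsigned long old, unsigned long new)
{
	if (*p != old)
		return false;
	*p = new;
	return true;
}

int main(void)
{
	unsigned long vec = 0xabcd00UL;		/* Thread1's vector */
	unsigned long old_exts = obj_exts;	/* Thread1 reads 0 */

	cas(&obj_exts, 0, FAKE_ALLOC_FAIL);	/* Thread2 marks failure */

	if (!cas(&obj_exts, old_exts, vec)) {
		/* Thread1: without a retry it frees vec and returns 0,
		 * leaving the failure marker for the caller to trip over */
		printf("lost the race: obj_exts = %#lx\n", obj_exts);
	}
	return 0;
}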
> I think retrying like this should work:
>
> +retry:
> old_exts = READ_ONCE(slab->obj_exts);
> handle_failed_objexts_alloc(old_exts, vec, objects);
> if (new_slab) {
> @@ -2145,8 +2146,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> * be simply assigned.
> */
> slab->obj_exts = new_exts;
> - } else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
> - cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
> + } else if (old_exts & ~OBJEXTS_FLAGS_MASK) {
> /*
> * If the slab is already in use, somebody can allocate and
> * assign slabobj_exts in parallel. In this case the existing
> @@ -2158,6 +2158,8 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> else
> kfree(vec);
> return 0;
> + } else if (cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
> + goto retry;
> }
I agree with this. If there are no issues with my comment above,
I will send a V2 based on Suren's suggestion.
Additionally, I believe the "Fixes" field should be written as follows:
Fixes: 09c46563ff6d ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to
mark failed slab_ext allocations")
Am I wrong?
>
>>>>> /*
>>>>> * If the slab is already in use, somebody can allocate and
>>>>> * assign slabobj_exts in parallel. In this case the existing
>>>>> @@ -2152,6 +2157,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>>>>> return 0;
>>>>> }
>>>>>
>>>>> +out:
>>>>> kmemleak_not_leak(vec);
>>>>> return 0;
>>>>> }
>>>>>
>>>>>>>> --
>>>>>>>> 2.25.1
* Re: [PATCH] slab: Avoid race on slab->obj_exts in alloc_slab_obj_exts
From: Vlastimil Babka @ 2025-10-20 10:20 UTC (permalink / raw)
To: Hao Ge, Suren Baghdasaryan
Cc: Harry Yoo, Andrew Morton, Christoph Lameter, David Rientjes,
Roman Gushchin, Shakeel Butt, linux-mm, linux-kernel, Hao Ge
On 10/20/25 04:01, Hao Ge wrote:
>
>> I think retrying like this should work:
>>
>> +retry:
>> old_exts = READ_ONCE(slab->obj_exts);
>> handle_failed_objexts_alloc(old_exts, vec, objects);
>> if (new_slab) {
>> @@ -2145,8 +2146,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>> * be simply assigned.
>> */
>> slab->obj_exts = new_exts;
>> - } else if ((old_exts & ~OBJEXTS_FLAGS_MASK) ||
>> - cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
>> + } else if (old_exts & ~OBJEXTS_FLAGS_MASK) {
>> /*
>> * If the slab is already in use, somebody can allocate and
>> * assign slabobj_exts in parallel. In this case the existing
>> @@ -2158,6 +2158,8 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>> else
>> kfree(vec);
>> return 0;
>> + } else if (cmpxchg(&slab->obj_exts, old_exts, new_exts) != old_exts) {
>> + goto retry;
>> }
>
> Agree with this. If there are no issues with my comment above,
>
> I will send V2 based on Suren's suggestion.
Great.
> Additionally, I believe the "Fixes" field should be written as follows:
>
> Fixes: 09c46563ff6d ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to
> mark failed slab_ext allocations")
>
> Am I wrong?
I think it was safe before this one:
f7381b911640 ("slab: mark slab->obj_exts allocation failures unconditionally")