* [PATCH] slab: Fix obj_ext is mistakenly considered NULL due to race condition @ 2025-10-23 1:21 Hao Ge 2025-10-23 2:24 ` Harry Yoo 0 siblings, 1 reply; 8+ messages in thread From: Hao Ge @ 2025-10-23 1:21 UTC (permalink / raw) To: Vlastimil Babka, Andrew Morton, Christoph Lameter, David Rientjes, Roman Gushchin, Harry Yoo, Suren Baghdasaryan Cc: Shakeel Butt, linux-mm, linux-kernel, Hao Ge From: Hao Ge <gehao@kylinos.cn> If two competing threads enter alloc_slab_obj_exts(), and the thread that failed to allocate the object extension vector exits after the one that succeeded, it will mistakenly assume slab->obj_ext is still empty due to its own allocation failure. This will then trigger warnings enforced by CONFIG_MEM_ALLOC_PROFILING_DEBUG checks in the subsequent free path. Therefore, let's add an additional check when alloc_slab_obj_exts fails. Signed-off-by: Hao Ge <gehao@kylinos.cn> --- mm/slub.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index d4403341c9df..42276f0cc920 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -2227,9 +2227,12 @@ prepare_slab_obj_exts_hook(struct kmem_cache *s, gfp_t flags, void *p) slab = virt_to_slab(p); if (!slab_obj_exts(slab) && alloc_slab_obj_exts(slab, s, flags, false)) { - pr_warn_once("%s, %s: Failed to create slab extension vector!\n", - __func__, s->name); - return NULL; + /* Recheck if a racing thread has successfully allocated slab->obj_exts. */ + if (!slab_obj_exts(slab)) { + pr_warn_once("%s, %s: Failed to create slab extension vector!\n", + __func__, s->name); + return NULL; + } } return slab_obj_exts(slab) + obj_to_index(s, slab, p); -- 2.25.1 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] slab: Fix obj_ext is mistakenly considered NULL due to race condition 2025-10-23 1:21 [PATCH] slab: Fix obj_ext is mistakenly considered NULL due to race condition Hao Ge @ 2025-10-23 2:24 ` Harry Yoo 2025-10-23 3:11 ` Hao Ge 0 siblings, 1 reply; 8+ messages in thread From: Harry Yoo @ 2025-10-23 2:24 UTC (permalink / raw) To: Hao Ge Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, David Rientjes, Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm, linux-kernel, Hao Ge On Thu, Oct 23, 2025 at 09:21:17AM +0800, Hao Ge wrote: > From: Hao Ge <gehao@kylinos.cn> > > If two competing threads enter alloc_slab_obj_exts(), and the > thread that failed to allocate the object extension vector exits > after the one that succeeded, it will mistakenly assume slab->obj_ext > is still empty due to its own allocation failure. This will then trigger > warnings enforced by CONFIG_MEM_ALLOC_PROFILING_DEBUG checks in > the subsequent free path. > > Therefore, let's add an additional check when alloc_slab_obj_exts fails. > > Signed-off-by: Hao Ge <gehao@kylinos.cn> > --- > mm/slub.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/mm/slub.c b/mm/slub.c > index d4403341c9df..42276f0cc920 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -2227,9 +2227,12 @@ prepare_slab_obj_exts_hook(struct kmem_cache *s, gfp_t flags, void *p) > slab = virt_to_slab(p); > if (!slab_obj_exts(slab) && > alloc_slab_obj_exts(slab, s, flags, false)) { > - pr_warn_once("%s, %s: Failed to create slab extension vector!\n", > - __func__, s->name); > - return NULL; > + /* Recheck if a racing thread has successfully allocated slab->obj_exts. */ > + if (!slab_obj_exts(slab)) { > + pr_warn_once("%s, %s: Failed to create slab extension vector!\n", > + __func__, s->name); > + return NULL; > + } > } Maybe this patch is a bit paranoid... since if mark_failed_objexts_alloc() win cmpxchg() and then someone else allocates the object extension vector, the warning will still be printed anyway. But anyway, I think there is a better way to do this: diff --git a/mm/slub.c b/mm/slub.c index dd4c85ea1038..d08d7580349d 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -2052,9 +2052,9 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts) } } -static inline void mark_failed_objexts_alloc(struct slab *slab) +static inline bool mark_failed_objexts_alloc(struct slab *slab) { - cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL); + return cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL) == 0; } static inline void handle_failed_objexts_alloc(unsigned long obj_exts, @@ -2076,7 +2076,7 @@ static inline void handle_failed_objexts_alloc(unsigned long obj_exts, #else /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts) {} -static inline void mark_failed_objexts_alloc(struct slab *slab) {} +static inline bool mark_failed_objexts_alloc(struct slab *slab) { return true; } static inline void handle_failed_objexts_alloc(unsigned long obj_exts, struct slabobj_ext *vec, unsigned int objects) {} @@ -2125,7 +2125,9 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s, } if (!vec) { /* Mark vectors which failed to allocate */ - mark_failed_objexts_alloc(slab); + if (!mark_failed_objexts_alloc(slab) && + slab_obj_exts(slab)) + return 0; return -ENOMEM; } -- Cheers, Harry / Hyeonggon ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] slab: Fix obj_ext is mistakenly considered NULL due to race condition 2025-10-23 2:24 ` Harry Yoo @ 2025-10-23 3:11 ` Hao Ge 2025-10-23 7:50 ` Harry Yoo 0 siblings, 1 reply; 8+ messages in thread From: Hao Ge @ 2025-10-23 3:11 UTC (permalink / raw) To: Harry Yoo Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, David Rientjes, Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm, linux-kernel, Hao Ge Hi Harry On 2025/10/23 10:24, Harry Yoo wrote: > On Thu, Oct 23, 2025 at 09:21:17AM +0800, Hao Ge wrote: >> From: Hao Ge <gehao@kylinos.cn> >> >> If two competing threads enter alloc_slab_obj_exts(), and the >> thread that failed to allocate the object extension vector exits >> after the one that succeeded, it will mistakenly assume slab->obj_ext >> is still empty due to its own allocation failure. This will then trigger >> warnings enforced by CONFIG_MEM_ALLOC_PROFILING_DEBUG checks in >> the subsequent free path. >> >> Therefore, let's add an additional check when alloc_slab_obj_exts fails. >> >> Signed-off-by: Hao Ge <gehao@kylinos.cn> >> --- >> mm/slub.c | 9 ++++++--- >> 1 file changed, 6 insertions(+), 3 deletions(-) >> >> diff --git a/mm/slub.c b/mm/slub.c >> index d4403341c9df..42276f0cc920 100644 >> --- a/mm/slub.c >> +++ b/mm/slub.c >> @@ -2227,9 +2227,12 @@ prepare_slab_obj_exts_hook(struct kmem_cache *s, gfp_t flags, void *p) >> slab = virt_to_slab(p); >> if (!slab_obj_exts(slab) && >> alloc_slab_obj_exts(slab, s, flags, false)) { >> - pr_warn_once("%s, %s: Failed to create slab extension vector!\n", >> - __func__, s->name); >> - return NULL; >> + /* Recheck if a racing thread has successfully allocated slab->obj_exts. */ >> + if (!slab_obj_exts(slab)) { >> + pr_warn_once("%s, %s: Failed to create slab extension vector!\n", >> + __func__, s->name); >> + return NULL; >> + } >> } > Maybe this patch is a bit paranoid... since if mark_failed_objexts_alloc() > win cmpxchg() and then someone else allocates the object extension vector, > the warning will still be printed anyway. The process that successfully allocates slab_exts will call handle_failed_objexts_alloc, setting ref->ct = CODETAG_EMPTY to prevent the warning from being triggered. > But anyway, I think there is a better way to do this: > > diff --git a/mm/slub.c b/mm/slub.c > index dd4c85ea1038..d08d7580349d 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -2052,9 +2052,9 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts) > } > } > > -static inline void mark_failed_objexts_alloc(struct slab *slab) > +static inline bool mark_failed_objexts_alloc(struct slab *slab) > { > - cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL); > + return cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL) == 0; > } > > static inline void handle_failed_objexts_alloc(unsigned long obj_exts, > @@ -2076,7 +2076,7 @@ static inline void handle_failed_objexts_alloc(unsigned long obj_exts, > #else /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */ > > static inline void mark_objexts_empty(struct slabobj_ext *obj_exts) {} > -static inline void mark_failed_objexts_alloc(struct slab *slab) {} > +static inline bool mark_failed_objexts_alloc(struct slab *slab) { return true; } > static inline void handle_failed_objexts_alloc(unsigned long obj_exts, > struct slabobj_ext *vec, unsigned int objects) {} > > @@ -2125,7 +2125,9 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s, > } > if (!vec) { > /* Mark vectors which failed to allocate */ > - mark_failed_objexts_alloc(slab); > + if (!mark_failed_objexts_alloc(slab) && > + slab_obj_exts(slab)) > + return 0; > > return -ENOMEM; > } > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] slab: Fix obj_ext is mistakenly considered NULL due to race condition 2025-10-23 3:11 ` Hao Ge @ 2025-10-23 7:50 ` Harry Yoo 2025-10-23 8:23 ` Hao Ge 0 siblings, 1 reply; 8+ messages in thread From: Harry Yoo @ 2025-10-23 7:50 UTC (permalink / raw) To: Hao Ge Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, David Rientjes, Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm, linux-kernel, Hao Ge On Thu, Oct 23, 2025 at 11:11:56AM +0800, Hao Ge wrote: > Hi Harry > > > On 2025/10/23 10:24, Harry Yoo wrote: > > On Thu, Oct 23, 2025 at 09:21:17AM +0800, Hao Ge wrote: > > > From: Hao Ge <gehao@kylinos.cn> > > > > > > If two competing threads enter alloc_slab_obj_exts(), and the > > > thread that failed to allocate the object extension vector exits > > > after the one that succeeded, it will mistakenly assume slab->obj_ext > > > is still empty due to its own allocation failure. This will then trigger > > > warnings enforced by CONFIG_MEM_ALLOC_PROFILING_DEBUG checks in > > > the subsequent free path. > > > > > > Therefore, let's add an additional check when alloc_slab_obj_exts fails. > > > > > > Signed-off-by: Hao Ge <gehao@kylinos.cn> > > > --- > > > mm/slub.c | 9 ++++++--- > > > 1 file changed, 6 insertions(+), 3 deletions(-) > > > > > > diff --git a/mm/slub.c b/mm/slub.c > > > index d4403341c9df..42276f0cc920 100644 > > > --- a/mm/slub.c > > > +++ b/mm/slub.c > > > @@ -2227,9 +2227,12 @@ prepare_slab_obj_exts_hook(struct kmem_cache *s, gfp_t flags, void *p) > > > slab = virt_to_slab(p); > > > if (!slab_obj_exts(slab) && > > > alloc_slab_obj_exts(slab, s, flags, false)) { > > > - pr_warn_once("%s, %s: Failed to create slab extension vector!\n", > > > - __func__, s->name); > > > - return NULL; > > > + /* Recheck if a racing thread has successfully allocated slab->obj_exts. */ > > > + if (!slab_obj_exts(slab)) { > > > + pr_warn_once("%s, %s: Failed to create slab extension vector!\n", > > > + __func__, s->name); > > > + return NULL; > > > + } > > > } > > Maybe this patch is a bit paranoid... since if mark_failed_objexts_alloc() > > win cmpxchg() and then someone else allocates the object extension vector, > > the warning will still be printed anyway. Oh, just to be clear I was talking about the other warning: pr_warn_once("%s, %s: Failed to create slab extension vector!", __func__, s->name); > The process that successfully allocates slab_exts will call > handle_failed_objexts_alloc, setting ref->ct = CODETAG_EMPTY > to prevent the warning from being triggered. But yeah I see what you mean. As you mentioned, if the process that failed to allocate the vector wins cmpxchg(), later process that successfully allocate the vector would call set_codetag_empty(), so no warning. But if the process that allocates the vector wins cmpxchg(), then it won't call set_codetag_empty(), so the process that was trying to set OBJEXTS_ALLOC_FAIL now needs to set the tag. > > But anyway, I think there is a better way to do this: What do you think about the diff I suggested below, though? > > diff --git a/mm/slub.c b/mm/slub.c > > index dd4c85ea1038..d08d7580349d 100644 > > --- a/mm/slub.c > > +++ b/mm/slub.c > > @@ -2052,9 +2052,9 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts) > > } > > } > > -static inline void mark_failed_objexts_alloc(struct slab *slab) > > +static inline bool mark_failed_objexts_alloc(struct slab *slab) > > { > > - cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL); > > + return cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL) == 0; > > } > > static inline void handle_failed_objexts_alloc(unsigned long obj_exts, > > @@ -2076,7 +2076,7 @@ static inline void handle_failed_objexts_alloc(unsigned long obj_exts, > > #else /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */ > > static inline void mark_objexts_empty(struct slabobj_ext *obj_exts) {} > > -static inline void mark_failed_objexts_alloc(struct slab *slab) {} > > +static inline bool mark_failed_objexts_alloc(struct slab *slab) { return true; } > > static inline void handle_failed_objexts_alloc(unsigned long obj_exts, > > struct slabobj_ext *vec, unsigned int objects) {} > > @@ -2125,7 +2125,9 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s, > > } > > if (!vec) { > > /* Mark vectors which failed to allocate */ > > - mark_failed_objexts_alloc(slab); > > + if (!mark_failed_objexts_alloc(slab) && > > + slab_obj_exts(slab)) > > + return 0; > > return -ENOMEM; > > } > > > -- Cheers, Harry / Hyeonggon ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] slab: Fix obj_ext is mistakenly considered NULL due to race condition 2025-10-23 7:50 ` Harry Yoo @ 2025-10-23 8:23 ` Hao Ge 2025-10-23 8:46 ` Hao Ge 0 siblings, 1 reply; 8+ messages in thread From: Hao Ge @ 2025-10-23 8:23 UTC (permalink / raw) To: Harry Yoo Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, David Rientjes, Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm, linux-kernel, Hao Ge Hi Harry On 2025/10/23 15:50, Harry Yoo wrote: > On Thu, Oct 23, 2025 at 11:11:56AM +0800, Hao Ge wrote: >> Hi Harry >> >> >> On 2025/10/23 10:24, Harry Yoo wrote: >>> On Thu, Oct 23, 2025 at 09:21:17AM +0800, Hao Ge wrote: >>>> From: Hao Ge <gehao@kylinos.cn> >>>> >>>> If two competing threads enter alloc_slab_obj_exts(), and the >>>> thread that failed to allocate the object extension vector exits >>>> after the one that succeeded, it will mistakenly assume slab->obj_ext >>>> is still empty due to its own allocation failure. This will then trigger >>>> warnings enforced by CONFIG_MEM_ALLOC_PROFILING_DEBUG checks in >>>> the subsequent free path. >>>> >>>> Therefore, let's add an additional check when alloc_slab_obj_exts fails. >>>> >>>> Signed-off-by: Hao Ge <gehao@kylinos.cn> >>>> --- >>>> mm/slub.c | 9 ++++++--- >>>> 1 file changed, 6 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/mm/slub.c b/mm/slub.c >>>> index d4403341c9df..42276f0cc920 100644 >>>> --- a/mm/slub.c >>>> +++ b/mm/slub.c >>>> @@ -2227,9 +2227,12 @@ prepare_slab_obj_exts_hook(struct kmem_cache *s, gfp_t flags, void *p) >>>> slab = virt_to_slab(p); >>>> if (!slab_obj_exts(slab) && >>>> alloc_slab_obj_exts(slab, s, flags, false)) { >>>> - pr_warn_once("%s, %s: Failed to create slab extension vector!\n", >>>> - __func__, s->name); >>>> - return NULL; >>>> + /* Recheck if a racing thread has successfully allocated slab->obj_exts. */ >>>> + if (!slab_obj_exts(slab)) { >>>> + pr_warn_once("%s, %s: Failed to create slab extension vector!\n", >>>> + __func__, s->name); >>>> + return NULL; >>>> + } >>>> } >>> Maybe this patch is a bit paranoid... since if mark_failed_objexts_alloc() >>> win cmpxchg() and then someone else allocates the object extension vector, >>> the warning will still be printed anyway. > Oh, just to be clear I was talking about the other warning: > pr_warn_once("%s, %s: Failed to create slab extension vector!", __func__, s->name); > >> The process that successfully allocates slab_exts will call >> handle_failed_objexts_alloc, setting ref->ct = CODETAG_EMPTY >> to prevent the warning from being triggered. > But yeah I see what you mean. > > As you mentioned, if the process that failed to allocate the vector wins > cmpxchg(), later process that successfully allocate the vector would > call set_codetag_empty(), so no warning. > > But if the process that allocates the vector wins cmpxchg(), > then it won't call set_codetag_empty(), so the process > that was trying to set OBJEXTS_ALLOC_FAIL now needs to set the tag. Yes, the case I'm encountering is exactly this one. > >>> But anyway, I think there is a better way to do this: > What do you think about the diff I suggested below, though? Sorry for the delayed response earlier; I was trying to deduce all possible scenarios. It makes sense to me, and I will submit the V2 version based on this suggestion. Thank you for your help. > >>> diff --git a/mm/slub.c b/mm/slub.c >>> index dd4c85ea1038..d08d7580349d 100644 >>> --- a/mm/slub.c >>> +++ b/mm/slub.c >>> @@ -2052,9 +2052,9 @@ static inline void mark_objexts_empty(struct slabobj_ext *obj_exts) >>> } >>> } >>> -static inline void mark_failed_objexts_alloc(struct slab *slab) >>> +static inline bool mark_failed_objexts_alloc(struct slab *slab) >>> { >>> - cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL); >>> + return cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL) == 0; >>> } >>> static inline void handle_failed_objexts_alloc(unsigned long obj_exts, >>> @@ -2076,7 +2076,7 @@ static inline void handle_failed_objexts_alloc(unsigned long obj_exts, >>> #else /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */ >>> static inline void mark_objexts_empty(struct slabobj_ext *obj_exts) {} >>> -static inline void mark_failed_objexts_alloc(struct slab *slab) {} >>> +static inline bool mark_failed_objexts_alloc(struct slab *slab) { return true; } >>> static inline void handle_failed_objexts_alloc(unsigned long obj_exts, >>> struct slabobj_ext *vec, unsigned int objects) {} >>> @@ -2125,7 +2125,9 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s, >>> } >>> if (!vec) { >>> /* Mark vectors which failed to allocate */ >>> - mark_failed_objexts_alloc(slab); >>> + if (!mark_failed_objexts_alloc(slab) && >>> + slab_obj_exts(slab)) >>> + return 0; >>> return -ENOMEM; >>> } >>> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] slab: Fix obj_ext is mistakenly considered NULL due to race condition 2025-10-23 8:23 ` Hao Ge @ 2025-10-23 8:46 ` Hao Ge 2025-10-23 9:06 ` Harry Yoo 0 siblings, 1 reply; 8+ messages in thread From: Hao Ge @ 2025-10-23 8:46 UTC (permalink / raw) To: Harry Yoo Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, David Rientjes, Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm, linux-kernel, Hao Ge Hi Harry On 2025/10/23 16:23, Hao Ge wrote: > Hi Harry > > > On 2025/10/23 15:50, Harry Yoo wrote: >> On Thu, Oct 23, 2025 at 11:11:56AM +0800, Hao Ge wrote: >>> Hi Harry >>> >>> >>> On 2025/10/23 10:24, Harry Yoo wrote: >>>> On Thu, Oct 23, 2025 at 09:21:17AM +0800, Hao Ge wrote: >>>>> From: Hao Ge <gehao@kylinos.cn> >>>>> >>>>> If two competing threads enter alloc_slab_obj_exts(), and the >>>>> thread that failed to allocate the object extension vector exits >>>>> after the one that succeeded, it will mistakenly assume slab->obj_ext >>>>> is still empty due to its own allocation failure. This will then >>>>> trigger >>>>> warnings enforced by CONFIG_MEM_ALLOC_PROFILING_DEBUG checks in >>>>> the subsequent free path. >>>>> >>>>> Therefore, let's add an additional check when alloc_slab_obj_exts >>>>> fails. >>>>> >>>>> Signed-off-by: Hao Ge <gehao@kylinos.cn> >>>>> --- >>>>> mm/slub.c | 9 ++++++--- >>>>> 1 file changed, 6 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/mm/slub.c b/mm/slub.c >>>>> index d4403341c9df..42276f0cc920 100644 >>>>> --- a/mm/slub.c >>>>> +++ b/mm/slub.c >>>>> @@ -2227,9 +2227,12 @@ prepare_slab_obj_exts_hook(struct >>>>> kmem_cache *s, gfp_t flags, void *p) >>>>> slab = virt_to_slab(p); >>>>> if (!slab_obj_exts(slab) && >>>>> alloc_slab_obj_exts(slab, s, flags, false)) { >>>>> - pr_warn_once("%s, %s: Failed to create slab extension >>>>> vector!\n", >>>>> - __func__, s->name); >>>>> - return NULL; >>>>> + /* Recheck if a racing thread has successfully allocated >>>>> slab->obj_exts. */ >>>>> + if (!slab_obj_exts(slab)) { >>>>> + pr_warn_once("%s, %s: Failed to create slab extension >>>>> vector!\n", >>>>> + __func__, s->name); >>>>> + return NULL; >>>>> + } >>>>> } >>>> Maybe this patch is a bit paranoid... since if >>>> mark_failed_objexts_alloc() >>>> win cmpxchg() and then someone else allocates the object extension >>>> vector, >>>> the warning will still be printed anyway. >> Oh, just to be clear I was talking about the other warning: >> pr_warn_once("%s, %s: Failed to create slab extension vector!", >> __func__, s->name); >> >>> The process that successfully allocates slab_exts will call >>> handle_failed_objexts_alloc, setting ref->ct = CODETAG_EMPTY >>> to prevent the warning from being triggered. >> But yeah I see what you mean. >> >> As you mentioned, if the process that failed to allocate the vector wins >> cmpxchg(), later process that successfully allocate the vector would >> call set_codetag_empty(), so no warning. >> >> But if the process that allocates the vector wins cmpxchg(), >> then it won't call set_codetag_empty(), so the process >> that was trying to set OBJEXTS_ALLOC_FAIL now needs to set the tag. > > Yes, the case I'm encountering is exactly this one. > >> >>>> But anyway, I think there is a better way to do this: >> What do you think about the diff I suggested below, though? > > Sorry for the delayed response earlier; I was trying to deduce all > possible scenarios. > > It makes sense to me, and I will submit the V2 version based on this > suggestion. > > Thank you for your help. > >> >>>> diff --git a/mm/slub.c b/mm/slub.c >>>> index dd4c85ea1038..d08d7580349d 100644 >>>> --- a/mm/slub.c >>>> +++ b/mm/slub.c >>>> @@ -2052,9 +2052,9 @@ static inline void mark_objexts_empty(struct >>>> slabobj_ext *obj_exts) >>>> } >>>> } >>>> -static inline void mark_failed_objexts_alloc(struct slab *slab) >>>> +static inline bool mark_failed_objexts_alloc(struct slab *slab) >>>> { >>>> - cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL); >>>> + return cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL) == 0; >>>> } >>>> static inline void handle_failed_objexts_alloc(unsigned long >>>> obj_exts, >>>> @@ -2076,7 +2076,7 @@ static inline void >>>> handle_failed_objexts_alloc(unsigned long obj_exts, >>>> #else /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */ >>>> static inline void mark_objexts_empty(struct slabobj_ext >>>> *obj_exts) {} >>>> -static inline void mark_failed_objexts_alloc(struct slab *slab) {} >>>> +static inline bool mark_failed_objexts_alloc(struct slab *slab) { >>>> return true; } Maybe it returns false here. When CONFIG_MEM_ALLOC_PROFILING_DEBUG is not enabled, The following condition will never be executed: if (!mark_failed_objexts_alloc(slab) && slab_obj_exts(slab)) if another process that allocates the vector, we will lose one count. >>>> static inline void handle_failed_objexts_alloc(unsigned long >>>> obj_exts, >>>> struct slabobj_ext *vec, unsigned int objects) {} >>>> @@ -2125,7 +2125,9 @@ int alloc_slab_obj_exts(struct slab *slab, >>>> struct kmem_cache *s, >>>> } >>>> if (!vec) { >>>> /* Mark vectors which failed to allocate */ >>>> - mark_failed_objexts_alloc(slab); >>>> + if (!mark_failed_objexts_alloc(slab) && >>>> + slab_obj_exts(slab)) >>>> + return 0; >>>> return -ENOMEM; >>>> } >>>> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] slab: Fix obj_ext is mistakenly considered NULL due to race condition 2025-10-23 8:46 ` Hao Ge @ 2025-10-23 9:06 ` Harry Yoo 2025-10-23 9:11 ` Hao Ge 0 siblings, 1 reply; 8+ messages in thread From: Harry Yoo @ 2025-10-23 9:06 UTC (permalink / raw) To: Hao Ge Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, David Rientjes, Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm, linux-kernel, Hao Ge On Thu, Oct 23, 2025 at 04:46:42PM +0800, Hao Ge wrote: > Hi Harry > > > On 2025/10/23 16:23, Hao Ge wrote: > > Hi Harry > > > > > > On 2025/10/23 15:50, Harry Yoo wrote: > > > On Thu, Oct 23, 2025 at 11:11:56AM +0800, Hao Ge wrote: > > > > Hi Harry > > > > > > > > > > > > On 2025/10/23 10:24, Harry Yoo wrote: > > > > > On Thu, Oct 23, 2025 at 09:21:17AM +0800, Hao Ge wrote: > > > > > > From: Hao Ge <gehao@kylinos.cn> > > > > > > > > > > > > If two competing threads enter alloc_slab_obj_exts(), and the > > > > > > thread that failed to allocate the object extension vector exits > > > > > > after the one that succeeded, it will mistakenly assume slab->obj_ext > > > > > > is still empty due to its own allocation failure. This > > > > > > will then trigger > > > > > > warnings enforced by CONFIG_MEM_ALLOC_PROFILING_DEBUG checks in > > > > > > the subsequent free path. > > > > > > > > > > > > Therefore, let's add an additional check when > > > > > > alloc_slab_obj_exts fails. > > > > > > > > > > > > Signed-off-by: Hao Ge <gehao@kylinos.cn> > > > > > > --- > > > > > > mm/slub.c | 9 ++++++--- > > > > > > 1 file changed, 6 insertions(+), 3 deletions(-) > > > > > > > > > > > > diff --git a/mm/slub.c b/mm/slub.c > > > > > > index d4403341c9df..42276f0cc920 100644 > > > > > > --- a/mm/slub.c > > > > > > +++ b/mm/slub.c > > > > > > @@ -2227,9 +2227,12 @@ prepare_slab_obj_exts_hook(struct > > > > > > kmem_cache *s, gfp_t flags, void *p) > > > > > > slab = virt_to_slab(p); > > > > > > if (!slab_obj_exts(slab) && > > > > > > alloc_slab_obj_exts(slab, s, flags, false)) { > > > > > > - pr_warn_once("%s, %s: Failed to create slab > > > > > > extension vector!\n", > > > > > > - __func__, s->name); > > > > > > - return NULL; > > > > > > + /* Recheck if a racing thread has successfully > > > > > > allocated slab->obj_exts. */ > > > > > > + if (!slab_obj_exts(slab)) { > > > > > > + pr_warn_once("%s, %s: Failed to create slab > > > > > > extension vector!\n", > > > > > > + __func__, s->name); > > > > > > + return NULL; > > > > > > + } > > > > > > } > > > > > Maybe this patch is a bit paranoid... since if > > > > > mark_failed_objexts_alloc() > > > > > win cmpxchg() and then someone else allocates the object > > > > > extension vector, > > > > > the warning will still be printed anyway. > > > Oh, just to be clear I was talking about the other warning: > > > pr_warn_once("%s, %s: Failed to create slab extension vector!", > > > __func__, s->name); > > > > > > > The process that successfully allocates slab_exts will call > > > > handle_failed_objexts_alloc, setting ref->ct = CODETAG_EMPTY > > > > to prevent the warning from being triggered. > > > But yeah I see what you mean. > > > > > > As you mentioned, if the process that failed to allocate the vector wins > > > cmpxchg(), later process that successfully allocate the vector would > > > call set_codetag_empty(), so no warning. > > > > > > But if the process that allocates the vector wins cmpxchg(), > > > then it won't call set_codetag_empty(), so the process > > > that was trying to set OBJEXTS_ALLOC_FAIL now needs to set the tag. > > > > Yes, the case I'm encountering is exactly this one. > > > > > > > > > > But anyway, I think there is a better way to do this: > > > What do you think about the diff I suggested below, though? > > > > Sorry for the delayed response earlier; I was trying to deduce all > > possible scenarios. > > > > It makes sense to me, and I will submit the V2 version based on this > > suggestion. > > > > Thank you for your help. > > > > > > > > > > diff --git a/mm/slub.c b/mm/slub.c > > > > > index dd4c85ea1038..d08d7580349d 100644 > > > > > --- a/mm/slub.c > > > > > +++ b/mm/slub.c > > > > > @@ -2052,9 +2052,9 @@ static inline void > > > > > mark_objexts_empty(struct slabobj_ext *obj_exts) > > > > > } > > > > > } > > > > > -static inline void mark_failed_objexts_alloc(struct slab *slab) > > > > > +static inline bool mark_failed_objexts_alloc(struct slab *slab) > > > > > { > > > > > - cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL); > > > > > + return cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL) == 0; > > > > > } > > > > > static inline void handle_failed_objexts_alloc(unsigned > > > > > long obj_exts, > > > > > @@ -2076,7 +2076,7 @@ static inline void > > > > > handle_failed_objexts_alloc(unsigned long obj_exts, > > > > > #else /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */ > > > > > static inline void mark_objexts_empty(struct slabobj_ext > > > > > *obj_exts) {} > > > > > -static inline void mark_failed_objexts_alloc(struct slab *slab) {} > > > > > +static inline bool mark_failed_objexts_alloc(struct slab > > > > > *slab) { return true; } > > Maybe it returns false here. > > When CONFIG_MEM_ALLOC_PROFILING_DEBUG is not enabled, > > The following condition will never be executed: > > if (!mark_failed_objexts_alloc(slab) && slab_obj_exts(slab)) Good point. But without CONFIG_MEM_ALLOC_PROFILING_DEBUG, we don't know if someone else successfully allocated the vector or not (unlike, with CONFIG_MEM_ALLOC_PROFILING_DEBUG enabled, we know that when we lose cmpxchg()). We cannot "fix" the case where a process fails to allocate the vector but another allocates the vector. So I'm not sure if checking slab_obj_exts() once more is worth it in this case, but I'm fine with either way. > if another process that allocates the vector, we will lose one count. By "one count" you mean skipping accounting the object in memory profiling, right? > > > > > static inline void handle_failed_objexts_alloc(unsigned > > > > > long obj_exts, > > > > > struct slabobj_ext *vec, unsigned int objects) {} > > > > > @@ -2125,7 +2125,9 @@ int alloc_slab_obj_exts(struct slab > > > > > *slab, struct kmem_cache *s, > > > > > } > > > > > if (!vec) { > > > > > /* Mark vectors which failed to allocate */ > > > > > - mark_failed_objexts_alloc(slab); > > > > > + if (!mark_failed_objexts_alloc(slab) && > > > > > + slab_obj_exts(slab)) > > > > > + return 0; > > > > > return -ENOMEM; > > > > > } > > > > > -- Cheers, Harry / Hyeonggon ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] slab: Fix obj_ext is mistakenly considered NULL due to race condition 2025-10-23 9:06 ` Harry Yoo @ 2025-10-23 9:11 ` Hao Ge 0 siblings, 0 replies; 8+ messages in thread From: Hao Ge @ 2025-10-23 9:11 UTC (permalink / raw) To: Harry Yoo Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter, David Rientjes, Roman Gushchin, Suren Baghdasaryan, Shakeel Butt, linux-mm, linux-kernel, Hao Ge Hi Harry On 2025/10/23 17:06, Harry Yoo wrote: > On Thu, Oct 23, 2025 at 04:46:42PM +0800, Hao Ge wrote: >> Hi Harry >> >> >> On 2025/10/23 16:23, Hao Ge wrote: >>> Hi Harry >>> >>> >>> On 2025/10/23 15:50, Harry Yoo wrote: >>>> On Thu, Oct 23, 2025 at 11:11:56AM +0800, Hao Ge wrote: >>>>> Hi Harry >>>>> >>>>> >>>>> On 2025/10/23 10:24, Harry Yoo wrote: >>>>>> On Thu, Oct 23, 2025 at 09:21:17AM +0800, Hao Ge wrote: >>>>>>> From: Hao Ge <gehao@kylinos.cn> >>>>>>> >>>>>>> If two competing threads enter alloc_slab_obj_exts(), and the >>>>>>> thread that failed to allocate the object extension vector exits >>>>>>> after the one that succeeded, it will mistakenly assume slab->obj_ext >>>>>>> is still empty due to its own allocation failure. This >>>>>>> will then trigger >>>>>>> warnings enforced by CONFIG_MEM_ALLOC_PROFILING_DEBUG checks in >>>>>>> the subsequent free path. >>>>>>> >>>>>>> Therefore, let's add an additional check when >>>>>>> alloc_slab_obj_exts fails. >>>>>>> >>>>>>> Signed-off-by: Hao Ge <gehao@kylinos.cn> >>>>>>> --- >>>>>>> mm/slub.c | 9 ++++++--- >>>>>>> 1 file changed, 6 insertions(+), 3 deletions(-) >>>>>>> >>>>>>> diff --git a/mm/slub.c b/mm/slub.c >>>>>>> index d4403341c9df..42276f0cc920 100644 >>>>>>> --- a/mm/slub.c >>>>>>> +++ b/mm/slub.c >>>>>>> @@ -2227,9 +2227,12 @@ prepare_slab_obj_exts_hook(struct >>>>>>> kmem_cache *s, gfp_t flags, void *p) >>>>>>> slab = virt_to_slab(p); >>>>>>> if (!slab_obj_exts(slab) && >>>>>>> alloc_slab_obj_exts(slab, s, flags, false)) { >>>>>>> - pr_warn_once("%s, %s: Failed to create slab >>>>>>> extension vector!\n", >>>>>>> - __func__, s->name); >>>>>>> - return NULL; >>>>>>> + /* Recheck if a racing thread has successfully >>>>>>> allocated slab->obj_exts. */ >>>>>>> + if (!slab_obj_exts(slab)) { >>>>>>> + pr_warn_once("%s, %s: Failed to create slab >>>>>>> extension vector!\n", >>>>>>> + __func__, s->name); >>>>>>> + return NULL; >>>>>>> + } >>>>>>> } >>>>>> Maybe this patch is a bit paranoid... since if >>>>>> mark_failed_objexts_alloc() >>>>>> win cmpxchg() and then someone else allocates the object >>>>>> extension vector, >>>>>> the warning will still be printed anyway. >>>> Oh, just to be clear I was talking about the other warning: >>>> pr_warn_once("%s, %s: Failed to create slab extension vector!", >>>> __func__, s->name); >>>> >>>>> The process that successfully allocates slab_exts will call >>>>> handle_failed_objexts_alloc, setting ref->ct = CODETAG_EMPTY >>>>> to prevent the warning from being triggered. >>>> But yeah I see what you mean. >>>> >>>> As you mentioned, if the process that failed to allocate the vector wins >>>> cmpxchg(), later process that successfully allocate the vector would >>>> call set_codetag_empty(), so no warning. >>>> >>>> But if the process that allocates the vector wins cmpxchg(), >>>> then it won't call set_codetag_empty(), so the process >>>> that was trying to set OBJEXTS_ALLOC_FAIL now needs to set the tag. >>> Yes, the case I'm encountering is exactly this one. >>> >>>>>> But anyway, I think there is a better way to do this: >>>> What do you think about the diff I suggested below, though? >>> Sorry for the delayed response earlier; I was trying to deduce all >>> possible scenarios. >>> >>> It makes sense to me, and I will submit the V2 version based on this >>> suggestion. >>> >>> Thank you for your help. >>> >>>>>> diff --git a/mm/slub.c b/mm/slub.c >>>>>> index dd4c85ea1038..d08d7580349d 100644 >>>>>> --- a/mm/slub.c >>>>>> +++ b/mm/slub.c >>>>>> @@ -2052,9 +2052,9 @@ static inline void >>>>>> mark_objexts_empty(struct slabobj_ext *obj_exts) >>>>>> } >>>>>> } >>>>>> -static inline void mark_failed_objexts_alloc(struct slab *slab) >>>>>> +static inline bool mark_failed_objexts_alloc(struct slab *slab) >>>>>> { >>>>>> - cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL); >>>>>> + return cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL) == 0; >>>>>> } >>>>>> static inline void handle_failed_objexts_alloc(unsigned >>>>>> long obj_exts, >>>>>> @@ -2076,7 +2076,7 @@ static inline void >>>>>> handle_failed_objexts_alloc(unsigned long obj_exts, >>>>>> #else /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */ >>>>>> static inline void mark_objexts_empty(struct slabobj_ext >>>>>> *obj_exts) {} >>>>>> -static inline void mark_failed_objexts_alloc(struct slab *slab) {} >>>>>> +static inline bool mark_failed_objexts_alloc(struct slab >>>>>> *slab) { return true; } >> Maybe it returns false here. >> >> When CONFIG_MEM_ALLOC_PROFILING_DEBUG is not enabled, >> >> The following condition will never be executed: >> >> if (!mark_failed_objexts_alloc(slab) && slab_obj_exts(slab)) > Good point. But without CONFIG_MEM_ALLOC_PROFILING_DEBUG, we don't know > if someone else successfully allocated the vector or not (unlike, with > CONFIG_MEM_ALLOC_PROFILING_DEBUG enabled, we know that when we lose > cmpxchg()). We cannot "fix" the case where a process fails to allocate > the vector but another allocates the vector. > > So I'm not sure if checking slab_obj_exts() once more is worth it in > this case, but I'm fine with either way. > >> if another process that allocates the vector, we will lose one count. > By "one count" you mean skipping accounting the object in memory > profiling, right? Yes. > >>>>>> static inline void handle_failed_objexts_alloc(unsigned >>>>>> long obj_exts, >>>>>> struct slabobj_ext *vec, unsigned int objects) {} >>>>>> @@ -2125,7 +2125,9 @@ int alloc_slab_obj_exts(struct slab >>>>>> *slab, struct kmem_cache *s, >>>>>> } >>>>>> if (!vec) { >>>>>> /* Mark vectors which failed to allocate */ >>>>>> - mark_failed_objexts_alloc(slab); >>>>>> + if (!mark_failed_objexts_alloc(slab) && >>>>>> + slab_obj_exts(slab)) >>>>>> + return 0; >>>>>> return -ENOMEM; >>>>>> } >>>>>> ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-10-23 9:12 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-10-23 1:21 [PATCH] slab: Fix obj_ext is mistakenly considered NULL due to race condition Hao Ge 2025-10-23 2:24 ` Harry Yoo 2025-10-23 3:11 ` Hao Ge 2025-10-23 7:50 ` Harry Yoo 2025-10-23 8:23 ` Hao Ge 2025-10-23 8:46 ` Hao Ge 2025-10-23 9:06 ` Harry Yoo 2025-10-23 9:11 ` Hao Ge
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox