From: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@gentwo.org>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Alexei Starovoitov <ast@kernel.org>, Hao Li <hao.li@linux.dev>,
Suren Baghdasaryan <surenb@google.com>,
Shakeel Butt <shakeel.butt@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
Date: Wed, 25 Feb 2026 14:44:24 +0530 [thread overview]
Message-ID: <84492f08-04c2-485c-9a18-cdafd5a9c3e5@linux.ibm.com> (raw)
In-Reply-To: <aZ2Gwie5dpXotxWc@hyeyoo>
On 24/02/26 4:40 pm, Harry Yoo wrote:
> On Tue, Feb 24, 2026 at 02:34:41PM +0530, Venkat Rao Bagalkote wrote:
>> On 23/02/26 1:28 pm, Harry Yoo wrote:
>>> When alloc_slab_obj_exts() is called later in time (instead of at slab
>>> allocation & initialization step), slab->stride and slab->obj_exts are
>>> set when the slab is already accessible by multiple CPUs.
>>>
>>> The current implementation does not enforce memory ordering between
>>> slab->stride and slab->obj_exts. However, for correctness, slab->stride
>>> must be visible before slab->obj_exts, otherwise concurrent readers
>>> may observe slab->obj_exts as non-zero while stride is still stale,
>>> leading to incorrect reference counting of object cgroups.
>>>
>>> There has been a bug report [1] that showed symptoms of incorrect
>>> reference counting of object cgroups, which could be triggered by
>>> this memory ordering issue.
>>>
>>> Fix this by unconditionally initializing slab->stride in
>>> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
>>> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
>>>
>>> This ensures stride is set before the slab becomes visible to
>>> other CPUs via the per-node partial slab list (protected by a spinlock
>>> with acquire/release semantics), preventing them from observing an
>>> inconsistent stride value.
>>>
>>> Thanks to Shakeel Butt for pointing out this issue [2].
>>>
>>> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
>>> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>>> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com
>>> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo
>>> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
>>> ---
>>>
>>> I tested this patch, but I could not confirm that this actually fixes
>>> the issue reported by [1]. It would be nice if Venkat could help
>>> confirm; but perhaps it's challenging to reliably reproduce...
>>
>> Thanks for the patch. I ran the complete test suite, and unfortunately
>> the issue is still reproducing.
> Oops, thanks for confirming that it's still reproduced!
> That's really helpful.
>
> Perhaps I should start considering cases where it's not a memory
> ordering issue, but let's check one more thing before moving on.
> Could you please test whether it still reproduces with the following patch?
>
> If it's still reproducible, it should not be due to the memory ordering
> issue between obj_exts and stride.
>
> ---8<---
> From: Harry Yoo <harry.yoo@oracle.com>
> Date: Mon, 23 Feb 2026 16:58:09 +0900
> Subject: mm/slab: enforce slab->stride -> slab->obj_exts ordering
>
> I tried to avoid unnecessary memory barriers for efficiency,
> but the original bug is still reproducible.
>
> Probably I missed a case where an object is allocated on one CPU
> and then freed on a different CPU without involving a spinlock.
>
> I'm not sure whether I failed to cover some edge case, or whether it's
> caused by something other than a memory ordering issue.
>
> Anyway, let's find out by introducing heavy memory barriers!
>
> Always ensure that updates to stride are visible before obj_exts.
>
> ---
> mm/slab.h | 1 +
> mm/slub.c | 10 +++++++---
> 2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 71c7261bf822..aacdd9f4e509 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -565,6 +565,7 @@ static inline void slab_set_stride(struct slab *slab, unsigned short stride)
> }
> static inline unsigned short slab_get_stride(struct slab *slab)
> {
> + smp_rmb();
> return slab->stride;
> }
> #else
> diff --git a/mm/slub.c b/mm/slub.c
> index 862642c165ed..c7c8b660a994 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2196,7 +2196,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> retry:
> old_exts = READ_ONCE(slab->obj_exts);
> handle_failed_objexts_alloc(old_exts, vec, objects);
> - slab_set_stride(slab, sizeof(struct slabobj_ext));
>
> if (new_slab) {
> /*
> @@ -2272,6 +2271,10 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
> void *addr;
> unsigned long obj_exts;
>
> + slab_set_stride(slab, sizeof(struct slabobj_ext));
> + /* pairs with smp_rmb() in slab_get_stride() */
> + smp_wmb();
> +
> if (!need_slab_obj_exts(s))
> return;
>
> @@ -2288,7 +2291,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
> obj_exts |= MEMCG_DATA_OBJEXTS;
> #endif
> slab->obj_exts = obj_exts;
> - slab_set_stride(slab, sizeof(struct slabobj_ext));
> } else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
> unsigned int offset = obj_exts_offset_in_object(s);
>
> @@ -2305,8 +2307,10 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
> #ifdef CONFIG_MEMCG
> obj_exts |= MEMCG_DATA_OBJEXTS;
> #endif
> - slab->obj_exts = obj_exts;
> slab_set_stride(slab, s->size);
> + /* pairs with smp_rmb() in slab_get_stride() */
> + smp_wmb();
> + slab->obj_exts = obj_exts;
> }
> }
>
> --
> 2.43.0
>
With this patch, the issue is not reproduced. So it looks good.
Regards,
Venkat.
Thread overview: 8+ messages
2026-02-23 7:58 Harry Yoo
2026-02-23 11:44 ` Harry Yoo
2026-02-23 17:04 ` Vlastimil Babka
2026-02-23 20:23 ` Shakeel Butt
2026-02-24 9:04 ` Venkat Rao Bagalkote
2026-02-24 11:10 ` Harry Yoo
2026-02-25 9:14 ` Venkat Rao Bagalkote [this message]
2026-02-25 10:15 ` Harry Yoo