* [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
From: Harry Yoo @ 2026-02-23 7:58 UTC
To: Vlastimil Babka, Andrew Morton
Cc: Christoph Lameter, David Rientjes, Roman Gushchin, Harry Yoo,
Alexei Starovoitov, Hao Li, Suren Baghdasaryan, Shakeel Butt,
Muchun Song, Johannes Weiner, Michal Hocko, cgroups, linux-mm,
Venkat Rao Bagalkote
When alloc_slab_obj_exts() is called later (instead of during slab
allocation and initialization), slab->stride and slab->obj_exts are
updated after the slab is already accessible by multiple CPUs.
The current implementation does not enforce memory ordering between
slab->stride and slab->obj_exts. However, for correctness, slab->stride
must be visible before slab->obj_exts, otherwise concurrent readers
may observe slab->obj_exts as non-zero while stride is still stale,
leading to incorrect reference counting of object cgroups.
There has been a bug report [1] that showed symptoms of incorrect
reference counting of object cgroups, which could be triggered by
this memory ordering issue.
Fix this by unconditionally initializing slab->stride in
alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
In the case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
This ensures stride is set before the slab becomes visible to
other CPUs via the per-node partial slab list (protected by a spinlock
with acquire/release semantics), preventing them from observing an
inconsistent stride value.
Thanks to Shakeel Butt for pointing out this issue [2].
Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---
I tested this patch, but I could not confirm that it actually fixes
the issue reported in [1]. It would be nice if Venkat could help
confirm; but perhaps it's challenging to reliably reproduce...
Since the fix is logically sound, it is worth applying anyway.
mm/slub.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 18c30872d196..afa98065d74f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2196,7 +2196,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
retry:
old_exts = READ_ONCE(slab->obj_exts);
handle_failed_objexts_alloc(old_exts, vec, objects);
- slab_set_stride(slab, sizeof(struct slabobj_ext));
if (new_slab) {
/*
@@ -2272,6 +2271,9 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
void *addr;
unsigned long obj_exts;
+ /* Initialize stride early to avoid memory ordering issues */
+ slab_set_stride(slab, sizeof(struct slabobj_ext));
+
if (!need_slab_obj_exts(s))
return;
@@ -2288,7 +2290,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
obj_exts |= MEMCG_DATA_OBJEXTS;
#endif
slab->obj_exts = obj_exts;
- slab_set_stride(slab, sizeof(struct slabobj_ext));
} else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
unsigned int offset = obj_exts_offset_in_object(s);
--
2.43.0
* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
From: Harry Yoo @ 2026-02-23 11:44 UTC
To: Vlastimil Babka, Andrew Morton
Cc: Christoph Lameter, David Rientjes, Roman Gushchin, Hao Li,
Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
Michal Hocko, cgroups, linux-mm, Venkat Rao Bagalkote
On Mon, Feb 23, 2026 at 04:58:09PM +0900, Harry Yoo wrote:
> When alloc_slab_obj_exts() is called later in time (instead of at slab
> allocation & initialization step), slab->stride and slab->obj_exts are
> set when the slab is already accessible by multiple CPUs.
>
> The current implementation does not enforce memory ordering between
> slab->stride and slab->obj_exts. However, for correctness, slab->stride
> must be visible before slab->obj_exts, otherwise concurrent readers
> may observe slab->obj_exts as non-zero while stride is still stale,
> leading to incorrect reference counting of object cgroups.
>
> There has been a bug report [1] that showed symptoms of incorrect
> reference counting of object cgroups, which could be triggered by
> this memory ordering issue.
>
> Fix this by unconditionally initializing slab->stride in
> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
>
> This ensures stride is set before the slab becomes visible to
> other CPUs via the per-node partial slab list (protected by spinlock
> with acquire/release semantics), preventing them from observing
> inconsistent stride value.
>
> Thanks to Shakeel Butt for pointing out this issue [2].
>
> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---
Vlastimil, could you please update the changelog when applying this
to the tree? I think this also explains [3] (thanks for raising it
off-list, Vlastimil!):
When alloc_slab_obj_exts() is called later (instead of during slab
allocation and initialization), slab->stride and slab->obj_exts are
updated after the slab is already accessible by multiple CPUs.
The current implementation does not enforce memory ordering between
slab->stride and slab->obj_exts. For correctness, slab->stride must be
visible before slab->obj_exts. Otherwise, concurrent readers may observe
slab->obj_exts as non-zero while stride is still stale.
With stale slab->stride, slab_obj_ext() could return the wrong obj_ext.
This could cause two problems:
- obj_cgroup_put() is called on the wrong objcg, leading to
a use-after-free due to incorrect reference counting [1] by
decrementing the reference count more than it was incremented.
- refill_obj_stock() is called on the wrong objcg, leading to
a page_counter overflow [2] by uncharging more memory than charged.
Fix this by unconditionally initializing slab->stride in
alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
In the case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the function.
This ensures updates to slab->stride become visible before the slab
can be accessed by other CPUs via the per-node partial slab list
(protected by spinlock with acquire/release semantics).
Thanks to Shakeel Butt for pointing out this issue [3].
Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
Closes: https://lore.kernel.org/all/ddff7c7d-c0c3-4780-808f-9a83268bbf0c@linux.ibm.com [2]
Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [3]
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
--
Cheers,
Harry / Hyeonggon
* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
From: Vlastimil Babka @ 2026-02-23 17:04 UTC
To: Harry Yoo, Vlastimil Babka, Andrew Morton
Cc: Christoph Lameter, David Rientjes, Roman Gushchin, Hao Li,
Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
Michal Hocko, cgroups, linux-mm, Venkat Rao Bagalkote
On 2/23/26 12:44, Harry Yoo wrote:
> On Mon, Feb 23, 2026 at 04:58:09PM +0900, Harry Yoo wrote:
>> When alloc_slab_obj_exts() is called later in time (instead of at slab
>> allocation & initialization step), slab->stride and slab->obj_exts are
>> set when the slab is already accessible by multiple CPUs.
>>
>> The current implementation does not enforce memory ordering between
>> slab->stride and slab->obj_exts. However, for correctness, slab->stride
>> must be visible before slab->obj_exts, otherwise concurrent readers
>> may observe slab->obj_exts as non-zero while stride is still stale,
>> leading to incorrect reference counting of object cgroups.
>>
>> There has been a bug report [1] that showed symptoms of incorrect
>> reference counting of object cgroups, which could be triggered by
>> this memory ordering issue.
>>
>> Fix this by unconditionally initializing slab->stride in
>> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
>> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
>>
>> This ensures stride is set before the slab becomes visible to
>> other CPUs via the per-node partial slab list (protected by spinlock
>> with acquire/release semantics), preventing them from observing
>> inconsistent stride value.
>>
>> Thanks to Shakeel Butt for pointing out this issue [2].
>>
>> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
>> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
>> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
>> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
>> ---
>
> Vlastimil, could you please update the changelog when applying this
> to the tree? I think this also explains [3] (thanks for raising it
> off-list, Vlastimil!):
Done, thanks! Added to slab/for-next-fixes
> When alloc_slab_obj_exts() is called later (instead of during slab
> allocation and initialization), slab->stride and slab->obj_exts are
> updated after the slab is already accessible by multiple CPUs.
>
> The current implementation does not enforce memory ordering between
> slab->stride and slab->obj_exts. For correctness, slab->stride must be
> visible before slab->obj_exts. Otherwise, concurrent readers may observe
> slab->obj_exts as non-zero while stride is still stale.
>
> With stale slab->stride, slab_obj_ext() could return the wrong obj_ext.
> This could cause two problems:
>
> - obj_cgroup_put() is called on the wrong objcg, leading to
> a use-after-free due to incorrect reference counting [1] by
> decrementing the reference count more than it was incremented.
>
> - refill_obj_stock() is called on the wrong objcg, leading to
> a page_counter overflow [2] by uncharging more memory than charged.
>
> Fix this by unconditionally initializing slab->stride in
> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> In the case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the function.
>
> This ensures updates to slab->stride become visible before the slab
> can be accessed by other CPUs via the per-node partial slab list
> (protected by spinlock with acquire/release semantics).
>
> Thanks to Shakeel Butt for pointing out this issue [3].
>
> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
> Closes: https://lore.kernel.org/all/ddff7c7d-c0c3-4780-808f-9a83268bbf0c@linux.ibm.com [2]
> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [3]
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>