* [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
@ 2026-02-23 7:58 Harry Yoo
2026-02-23 11:44 ` Harry Yoo
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Harry Yoo @ 2026-02-23 7:58 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton
Cc: Christoph Lameter, David Rientjes, Roman Gushchin, Harry Yoo,
Alexei Starovoitov, Hao Li, Suren Baghdasaryan, Shakeel Butt,
Muchun Song, Johannes Weiner, Michal Hocko, cgroups, linux-mm,
Venkat Rao Bagalkote
When alloc_slab_obj_exts() is called later in time (instead of at slab
allocation & initialization step), slab->stride and slab->obj_exts are
set when the slab is already accessible by multiple CPUs.
The current implementation does not enforce memory ordering between
slab->stride and slab->obj_exts. However, for correctness, slab->stride
must be visible before slab->obj_exts, otherwise concurrent readers
may observe slab->obj_exts as non-zero while stride is still stale,
leading to incorrect reference counting of object cgroups.
There has been a bug report [1] that showed symptoms of incorrect
reference counting of object cgroups, which could be triggered by
this memory ordering issue.
Fix this by unconditionally initializing slab->stride in
alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
This ensures stride is set before the slab becomes visible to
other CPUs via the per-node partial slab list (protected by spinlock
with acquire/release semantics), preventing them from observing
inconsistent stride value.
Thanks to Shakeel Butt for pointing out this issue [2].
Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---
I tested this patch, but I could not confirm that it actually fixes
the issue reported in [1]. It would be nice if Venkat could help
confirm; but perhaps it's challenging to reliably reproduce...
Since this logically makes sense, it is worth fixing anyway.
mm/slub.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 18c30872d196..afa98065d74f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2196,7 +2196,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
retry:
old_exts = READ_ONCE(slab->obj_exts);
handle_failed_objexts_alloc(old_exts, vec, objects);
- slab_set_stride(slab, sizeof(struct slabobj_ext));
if (new_slab) {
/*
@@ -2272,6 +2271,9 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
void *addr;
unsigned long obj_exts;
+ /* Initialize stride early to avoid memory ordering issues */
+ slab_set_stride(slab, sizeof(struct slabobj_ext));
+
if (!need_slab_obj_exts(s))
return;
@@ -2288,7 +2290,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
obj_exts |= MEMCG_DATA_OBJEXTS;
#endif
slab->obj_exts = obj_exts;
- slab_set_stride(slab, sizeof(struct slabobj_ext));
} else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
unsigned int offset = obj_exts_offset_in_object(s);
--
2.43.0
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
2026-02-23 7:58 [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues Harry Yoo
@ 2026-02-23 11:44 ` Harry Yoo
2026-02-23 17:04 ` Vlastimil Babka
2026-02-23 20:23 ` Shakeel Butt
2026-02-24 9:04 ` Venkat Rao Bagalkote
2 siblings, 1 reply; 14+ messages in thread
From: Harry Yoo @ 2026-02-23 11:44 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton
Cc: Christoph Lameter, David Rientjes, Roman Gushchin, Hao Li,
Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
Michal Hocko, cgroups, linux-mm, Venkat Rao Bagalkote
On Mon, Feb 23, 2026 at 04:58:09PM +0900, Harry Yoo wrote:
> When alloc_slab_obj_exts() is called later in time (instead of at slab
> allocation & initialization step), slab->stride and slab->obj_exts are
> set when the slab is already accessible by multiple CPUs.
>
> The current implementation does not enforce memory ordering between
> slab->stride and slab->obj_exts. However, for correctness, slab->stride
> must be visible before slab->obj_exts, otherwise concurrent readers
> may observe slab->obj_exts as non-zero while stride is still stale,
> leading to incorrect reference counting of object cgroups.
>
> There has been a bug report [1] that showed symptoms of incorrect
> reference counting of object cgroups, which could be triggered by
> this memory ordering issue.
>
> Fix this by unconditionally initializing slab->stride in
> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
>
> This ensures stride is set before the slab becomes visible to
> other CPUs via the per-node partial slab list (protected by spinlock
> with acquire/release semantics), preventing them from observing
> inconsistent stride value.
>
> Thanks to Shakeel Butt for pointing out this issue [2].
>
> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---
Vlastimil, could you please update the changelog when applying this
to the tree? I think this also explains [3] (thanks for raising it
off-list, Vlastimil!):
When alloc_slab_obj_exts() is called later (instead of during slab
allocation and initialization), slab->stride and slab->obj_exts are
updated after the slab is already accessible by multiple CPUs.
The current implementation does not enforce memory ordering between
slab->stride and slab->obj_exts. For correctness, slab->stride must be
visible before slab->obj_exts. Otherwise, concurrent readers may observe
slab->obj_exts as non-zero while stride is still stale.
With stale slab->stride, slab_obj_ext() could return the wrong obj_ext.
This could cause two problems:
- obj_cgroup_put() is called on the wrong objcg, leading to
a use-after-free due to incorrect reference counting [1] by
decrementing the reference count more than it was incremented.
- refill_obj_stock() is called on the wrong objcg, leading to
a page_counter overflow [2] by uncharging more memory than charged.
Fix this by unconditionally initializing slab->stride in
alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
In the case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the function.
This ensures updates to slab->stride become visible before the slab
can be accessed by other CPUs via the per-node partial slab list
(protected by spinlock with acquire/release semantics).
Thanks to Shakeel Butt for pointing out this issue [3].
Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
Closes: https://lore.kernel.org/all/ddff7c7d-c0c3-4780-808f-9a83268bbf0c@linux.ibm.com [2]
Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [3]
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
--
Cheers,
Harry / Hyeonggon
* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
2026-02-23 11:44 ` Harry Yoo
@ 2026-02-23 17:04 ` Vlastimil Babka
0 siblings, 0 replies; 14+ messages in thread
From: Vlastimil Babka @ 2026-02-23 17:04 UTC (permalink / raw)
To: Harry Yoo, Vlastimil Babka, Andrew Morton
Cc: Christoph Lameter, David Rientjes, Roman Gushchin, Hao Li,
Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
Michal Hocko, cgroups, linux-mm, Venkat Rao Bagalkote
On 2/23/26 12:44, Harry Yoo wrote:
> On Mon, Feb 23, 2026 at 04:58:09PM +0900, Harry Yoo wrote:
>> When alloc_slab_obj_exts() is called later in time (instead of at slab
>> allocation & initialization step), slab->stride and slab->obj_exts are
>> set when the slab is already accessible by multiple CPUs.
>>
>> The current implementation does not enforce memory ordering between
>> slab->stride and slab->obj_exts. However, for correctness, slab->stride
>> must be visible before slab->obj_exts, otherwise concurrent readers
>> may observe slab->obj_exts as non-zero while stride is still stale,
>> leading to incorrect reference counting of object cgroups.
>>
>> There has been a bug report [1] that showed symptoms of incorrect
>> reference counting of object cgroups, which could be triggered by
>> this memory ordering issue.
>>
>> Fix this by unconditionally initializing slab->stride in
>> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
>> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
>>
>> This ensures stride is set before the slab becomes visible to
>> other CPUs via the per-node partial slab list (protected by spinlock
>> with acquire/release semantics), preventing them from observing
>> inconsistent stride value.
>>
>> Thanks to Shakeel Butt for pointing out this issue [2].
>>
>> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
>> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
>> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
>> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
>> ---
>
> Vlastimil, could you please update the changelog when applying this
> to the tree? I think this also explains [3] (thanks for raising it
> off-list, Vlastimil!):
Done, thanks! Added to slab/for-next-fixes
> When alloc_slab_obj_exts() is called later (instead of during slab
> allocation and initialization), slab->stride and slab->obj_exts are
> updated after the slab is already accessible by multiple CPUs.
>
> The current implementation does not enforce memory ordering between
> slab->stride and slab->obj_exts. For correctness, slab->stride must be
> visible before slab->obj_exts. Otherwise, concurrent readers may observe
> slab->obj_exts as non-zero while stride is still stale.
>
> With stale slab->stride, slab_obj_ext() could return the wrong obj_ext.
> This could cause two problems:
>
> - obj_cgroup_put() is called on the wrong objcg, leading to
> a use-after-free due to incorrect reference counting [1] by
> decrementing the reference count more than it was incremented.
>
> - refill_obj_stock() is called on the wrong objcg, leading to
> a page_counter overflow [2] by uncharging more memory than charged.
>
> Fix this by unconditionally initializing slab->stride in
> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> In the case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the function.
>
> This ensures updates to slab->stride become visible before the slab
> can be accessed by other CPUs via the per-node partial slab list
> (protected by spinlock with acquire/release semantics).
>
> Thanks to Shakeel Butt for pointing out this issue [3].
>
> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
> Closes: https://lore.kernel.org/all/ddff7c7d-c0c3-4780-808f-9a83268bbf0c@linux.ibm.com [2]
> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [3]
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
2026-02-23 7:58 [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues Harry Yoo
2026-02-23 11:44 ` Harry Yoo
@ 2026-02-23 20:23 ` Shakeel Butt
2026-02-24 9:04 ` Venkat Rao Bagalkote
2 siblings, 0 replies; 14+ messages in thread
From: Shakeel Butt @ 2026-02-23 20:23 UTC (permalink / raw)
To: Harry Yoo
Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
David Rientjes, Roman Gushchin, Alexei Starovoitov, Hao Li,
Suren Baghdasaryan, Muchun Song, Johannes Weiner, Michal Hocko,
cgroups, linux-mm, Venkat Rao Bagalkote
On Mon, Feb 23, 2026 at 04:58:09PM +0900, Harry Yoo wrote:
> When alloc_slab_obj_exts() is called later in time (instead of at slab
> allocation & initialization step), slab->stride and slab->obj_exts are
> set when the slab is already accessible by multiple CPUs.
>
> The current implementation does not enforce memory ordering between
> slab->stride and slab->obj_exts. However, for correctness, slab->stride
> must be visible before slab->obj_exts, otherwise concurrent readers
> may observe slab->obj_exts as non-zero while stride is still stale,
> leading to incorrect reference counting of object cgroups.
>
> There has been a bug report [1] that showed symptoms of incorrect
> reference counting of object cgroups, which could be triggered by
> this memory ordering issue.
>
> Fix this by unconditionally initializing slab->stride in
> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
>
> This ensures stride is set before the slab becomes visible to
> other CPUs via the per-node partial slab list (protected by spinlock
> with acquire/release semantics), preventing them from observing
> inconsistent stride value.
>
> Thanks to Shakeel Butt for pointing out this issue [2].
>
> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
2026-02-23 7:58 [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues Harry Yoo
2026-02-23 11:44 ` Harry Yoo
2026-02-23 20:23 ` Shakeel Butt
@ 2026-02-24 9:04 ` Venkat Rao Bagalkote
2026-02-24 11:10 ` Harry Yoo
2 siblings, 1 reply; 14+ messages in thread
From: Venkat Rao Bagalkote @ 2026-02-24 9:04 UTC (permalink / raw)
To: Harry Yoo, Vlastimil Babka, Andrew Morton
Cc: Christoph Lameter, David Rientjes, Roman Gushchin,
Alexei Starovoitov, Hao Li, Suren Baghdasaryan, Shakeel Butt,
Muchun Song, Johannes Weiner, Michal Hocko, cgroups, linux-mm
On 23/02/26 1:28 pm, Harry Yoo wrote:
> When alloc_slab_obj_exts() is called later in time (instead of at slab
> allocation & initialization step), slab->stride and slab->obj_exts are
> set when the slab is already accessible by multiple CPUs.
>
> The current implementation does not enforce memory ordering between
> slab->stride and slab->obj_exts. However, for correctness, slab->stride
> must be visible before slab->obj_exts, otherwise concurrent readers
> may observe slab->obj_exts as non-zero while stride is still stale,
> leading to incorrect reference counting of object cgroups.
>
> There has been a bug report [1] that showed symptoms of incorrect
> reference counting of object cgroups, which could be triggered by
> this memory ordering issue.
>
> Fix this by unconditionally initializing slab->stride in
> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
>
> This ensures stride is set before the slab becomes visible to
> other CPUs via the per-node partial slab list (protected by spinlock
> with acquire/release semantics), preventing them from observing
> inconsistent stride value.
>
> Thanks to Shakeel Butt for pointing out this issue [2].
>
> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---
>
> I tested this patch, but I could not confirm that this actually fixes
> the issue reported by [1]. It would be nice if Venkat could help
> confirm; but perhaps it's challenging to reliably reproduce...
Thanks for the patch. I ran the complete test suite, and
unfortunately the issue is still reproducing.
I applied this patch on the mainline repo for testing.
Traces:
[ 9316.514161] BUG: Kernel NULL pointer dereference on read at 0x00000000
[ 9316.514169] Faulting instruction address: 0xc0000000008b2ff4
[ 9316.514176] Oops: Kernel access of bad area, sig: 7 [#1]
[ 9316.514182] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=8192 NUMA pSeries
[ 9316.514189] Modules linked in: overlay dm_zero dm_thin_pool
dm_persistent_data dm_bio_prison dm_snapshot dm_bufio dm_flakey xfs loop
dm_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set bonding nf_tables tls
sunrpc rfkill nfnetlink pseries_rng vmx_crypto dax_pmem fuse ext4 crc16
mbcache jbd2 nd_pmem papr_scm sd_mod libnvdimm sg ibmvscsi ibmveth
scsi_transport_srp pseries_wdt [last unloaded: scsi_debug]
[ 9316.514295] CPU: 16 UID: 0 PID: 0 Comm: swapper/16 Kdump: loaded
Tainted: G W 7.0.0-rc1+ #1 PREEMPTLAZY
[ 9316.514306] Tainted: [W]=WARN
[ 9316.514311] Hardware name: IBM,9080-HEX Power11 (architected)
0x820200 0xf000007 of:IBM,FW1110.01 (NH1110_069) hv:phyp pSeries
[ 9316.514318] NIP: c0000000008b2ff4 LR: c0000000008b2fec CTR:
c00000000036d680
[ 9316.514326] REGS: c000000d0dcb7870 TRAP: 0300 Tainted: G W
(7.0.0-rc1+)
[ 9316.514333] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR:
84042802 XER: 20040000
[ 9316.514356] CFAR: c000000000862e94 DAR: 0000000000000000 DSISR:
00080000 IRQMASK: 0
[ 9316.514356] GPR00: c0000000008b2fec c000000d0dcb7b10 c00000000243a500
0000000000000001
[ 9316.514356] GPR04: 0000000000000008 0000000000000001 c0000000008b2fec
0000000000000001
[ 9316.514356] GPR08: a80e000000000000 0000000000000001 0000000000000007
a80e000000000000
[ 9316.514356] GPR12: c00e00000e7b6cd5 c000000d0ddf4700 c000000129a98e00
0000000000000006
[ 9316.514356] GPR16: c000000007012fa0 c000000007012fa4 c000000005160980
c000000007012f88
[ 9316.514356] GPR20: c00c000000021bec c000000d0d07f008 0000000000000001
ffffffffffffff78
[ 9316.514356] GPR24: 0000000000000005 c000000d0d58f180 c0000000032cf000
c000000d0ddf4700
[ 9316.514356] GPR28: 0000000000000088 0000000000000000 c000000129a98e00
c000000d0d07f000
[ 9316.514457] NIP [c0000000008b2ff4] refill_obj_stock+0x5b4/0x680
[ 9316.514467] LR [c0000000008b2fec] refill_obj_stock+0x5ac/0x680
[ 9316.514476] Call Trace:
[ 9316.514481] [c000000d0dcb7b10] [c0000000008b2fec]
refill_obj_stock+0x5ac/0x680 (unreliable)
[ 9316.514494] [c000000d0dcb7b90] [c0000000008b9598]
__memcg_slab_free_hook+0x238/0x3ec
[ 9316.514505] [c000000d0dcb7c60] [c0000000007f3d90]
__rcu_free_sheaf_prepare+0x314/0x3e8
[ 9316.514516] [c000000d0dcb7d10] [c0000000007fc2ec]
rcu_free_sheaf+0x38/0x170
[ 9316.514528] [c000000d0dcb7d50] [c000000000334570]
rcu_do_batch+0x2ec/0xfa8
[ 9316.514538] [c000000d0dcb7e50] [c000000000339a08] rcu_core+0x22c/0x48c
[ 9316.514548] [c000000d0dcb7ec0] [c0000000001cfeac]
handle_softirqs+0x1f4/0x74c
[ 9316.514559] [c000000d0dcb7fe0] [c00000000001b0cc]
do_softirq_own_stack+0x60/0x7c
[ 9316.514570] [c0000000096c7930] [c00000000001b0b8]
do_softirq_own_stack+0x4c/0x7c
[ 9316.514581] [c0000000096c7960] [c0000000001cf168]
__irq_exit_rcu+0x268/0x308
[ 9316.514592] [c0000000096c79a0] [c0000000001d0be4] irq_exit+0x20/0x38
[ 9316.514602] [c0000000096c79c0] [c0000000000315f4]
interrupt_async_exit_prepare.constprop.0+0x18/0x2c
[ 9316.514614] [c0000000096c79e0] [c000000000009ffc]
decrementer_common_virt+0x28c/0x290
[ 9316.514626] ---- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[ 9316.514635] NIP: c00000000012d9f0 LR: c00000000135c0a8 CTR:
0000000000000000
[ 9316.514642] REGS: c0000000096c7a10 TRAP: 0900 Tainted: G W
(7.0.0-rc1+)
[ 9316.514649] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
CR: 24000804 XER: 00000000
[ 9316.514678] CFAR: 0000000000000000 IRQMASK: 0
[ 9316.514678] GPR00: 0000000000000000 c0000000096c7cb0 c00000000243a500
0000000000000000
[ 9316.514678] GPR04: 0000000000000000 800400002fe6fc10 0000000000000000
0000000000000001
[ 9316.514678] GPR08: 0000000000000030 0000000000000000 0000000000000090
0000000000000001
[ 9316.514678] GPR12: 800400002fe6fc00 c000000d0ddf4700 0000000000000000
000000002ef01a00
[ 9316.514678] GPR16: 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[ 9316.514678] GPR20: 0000000000000000 0000000000000000 0000000000000000
0000000000000001
[ 9316.514678] GPR24: 0000000000000000 c000000004d7a760 000008792ad04b82
0000000000000000
[ 9316.514678] GPR28: 0000000000000000 0000000000000001 c0000000032b18d8
c0000000032b18e0
[ 9316.514774] NIP [c00000000012d9f0] plpar_hcall_norets_notrace+0x18/0x2c
[ 9316.514782] LR [c00000000135c0a8] cede_processor.isra.0+0x1c/0x34
[ 9316.514792] ---- interrupt: 900
[ 9316.514797] [c0000000096c7cb0] [c0000000096c7cf0] 0xc0000000096c7cf0
(unreliable)
[ 9316.514808] [c0000000096c7d10] [c0000000019af170]
dedicated_cede_loop+0x90/0x170
[ 9316.514819] [c0000000096c7d60] [c0000000019aeb20]
cpuidle_enter_state+0x394/0x480
[ 9316.514830] [c0000000096c7e00] [c00000000135864c] cpuidle_enter+0x64/0x9c
[ 9316.514840] [c0000000096c7e50] [c000000000284b0c] call_cpuidle+0x7c/0xf8
[ 9316.514852] [c0000000096c7e90] [c0000000002903e8]
cpuidle_idle_call+0x1c4/0x2b4
[ 9316.514862] [c0000000096c7f00] [c00000000029060c] do_idle+0x134/0x208
[ 9316.514872] [c0000000096c7f50] [c000000000290a5c]
cpu_startup_entry+0x60/0x64
[ 9316.514882] [c0000000096c7f80] [c000000000074738]
start_secondary+0x3fc/0x400
[ 9316.514894] [c0000000096c7fe0] [c00000000000e258]
start_secondary_prolog+0x10/0x14
[ 9316.514904] Code: eba962a0 4bfffe40 60000000 387e0008 4bfae7c1
60000000 ebbe0008 38800008 7fa3eb78 4bfafe85 60000000 39200001
<7d40e8a8> 7d495214 7d40e9ad 40c2fff4
[ 9316.514941] ---[ end trace 0000000000000000 ]---
Regards,
Venkat.
>
> Since this logically makes sense, it would be worth fix it anyway.
>
> mm/slub.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 18c30872d196..afa98065d74f 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2196,7 +2196,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> retry:
> old_exts = READ_ONCE(slab->obj_exts);
> handle_failed_objexts_alloc(old_exts, vec, objects);
> - slab_set_stride(slab, sizeof(struct slabobj_ext));
>
> if (new_slab) {
> /*
> @@ -2272,6 +2271,9 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
> void *addr;
> unsigned long obj_exts;
>
> + /* Initialize stride early to avoid memory ordering issues */
> + slab_set_stride(slab, sizeof(struct slabobj_ext));
> +
> if (!need_slab_obj_exts(s))
> return;
>
> @@ -2288,7 +2290,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
> obj_exts |= MEMCG_DATA_OBJEXTS;
> #endif
> slab->obj_exts = obj_exts;
> - slab_set_stride(slab, sizeof(struct slabobj_ext));
> } else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
> unsigned int offset = obj_exts_offset_in_object(s);
>
* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
2026-02-24 9:04 ` Venkat Rao Bagalkote
@ 2026-02-24 11:10 ` Harry Yoo
2026-02-25 9:14 ` Venkat Rao Bagalkote
0 siblings, 1 reply; 14+ messages in thread
From: Harry Yoo @ 2026-02-24 11:10 UTC (permalink / raw)
To: Venkat Rao Bagalkote
Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
David Rientjes, Roman Gushchin, Alexei Starovoitov, Hao Li,
Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
Michal Hocko, cgroups, linux-mm
On Tue, Feb 24, 2026 at 02:34:41PM +0530, Venkat Rao Bagalkote wrote:
>
> On 23/02/26 1:28 pm, Harry Yoo wrote:
> > When alloc_slab_obj_exts() is called later in time (instead of at slab
> > allocation & initialization step), slab->stride and slab->obj_exts are
> > set when the slab is already accessible by multiple CPUs.
> >
> > The current implementation does not enforce memory ordering between
> > slab->stride and slab->obj_exts. However, for correctness, slab->stride
> > must be visible before slab->obj_exts, otherwise concurrent readers
> > may observe slab->obj_exts as non-zero while stride is still stale,
> > leading to incorrect reference counting of object cgroups.
> >
> > There has been a bug report [1] that showed symptoms of incorrect
> > reference counting of object cgroups, which could be triggered by
> > this memory ordering issue.
> >
> > Fix this by unconditionally initializing slab->stride in
> > alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> > In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
> >
> > This ensures stride is set before the slab becomes visible to
> > other CPUs via the per-node partial slab list (protected by spinlock
> > with acquire/release semantics), preventing them from observing
> > inconsistent stride value.
> >
> > Thanks to Shakeel Butt for pointing out this issue [2].
> >
> > Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> > Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> > Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com
> > Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo
> > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> > ---
> >
> > I tested this patch, but I could not confirm that this actually fixes
> > the issue reported by [1]. It would be nice if Venkat could help
> > confirm; but perhaps it's challenging to reliably reproduce...
>
>
> Thanks for the patch. I did ran the complete test suite, and unfortunately
> issue is reproducing.
Oops, thanks for confirming that it's still reproduced!
That's really helpful.
Perhaps I should start considering cases where it's not a memory
ordering issue, but let's check one more thing before moving on.
Could you please test whether it still reproduces with the following
patch? If it's still reproducible, it should not be due to the memory
ordering issue between obj_exts and stride.
---8<---
From: Harry Yoo <harry.yoo@oracle.com>
Date: Mon, 23 Feb 2026 16:58:09 +0900
Subject: mm/slab: enforce slab->stride -> slab->obj_exts ordering
I tried to avoid unnecessary memory barriers for efficiency,
but the original bug is still reproducible.
Probably I missed a case where an object is allocated on one CPU
and then freed on a different CPU without involving a spinlock.
I'm not sure whether I failed to cover an edge case or whether it's
caused by something other than a memory ordering issue.
Anyway, let's find out by introducing heavy memory barriers!
Always ensure that updates to stride are visible before obj_exts.
---
mm/slab.h | 1 +
mm/slub.c | 10 +++++++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/mm/slab.h b/mm/slab.h
index 71c7261bf822..aacdd9f4e509 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -565,6 +565,7 @@ static inline void slab_set_stride(struct slab *slab, unsigned short stride)
}
static inline unsigned short slab_get_stride(struct slab *slab)
{
+ smp_rmb();
return slab->stride;
}
#else
diff --git a/mm/slub.c b/mm/slub.c
index 862642c165ed..c7c8b660a994 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2196,7 +2196,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
retry:
old_exts = READ_ONCE(slab->obj_exts);
handle_failed_objexts_alloc(old_exts, vec, objects);
- slab_set_stride(slab, sizeof(struct slabobj_ext));
if (new_slab) {
/*
@@ -2272,6 +2271,10 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
void *addr;
unsigned long obj_exts;
+ slab_set_stride(slab, sizeof(struct slabobj_ext));
+ /* pairs with smp_rmb() in slab_get_stride() */
+ smp_wmb();
+
if (!need_slab_obj_exts(s))
return;
@@ -2288,7 +2291,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
obj_exts |= MEMCG_DATA_OBJEXTS;
#endif
slab->obj_exts = obj_exts;
- slab_set_stride(slab, sizeof(struct slabobj_ext));
} else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
unsigned int offset = obj_exts_offset_in_object(s);
@@ -2305,8 +2307,10 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
#ifdef CONFIG_MEMCG
obj_exts |= MEMCG_DATA_OBJEXTS;
#endif
- slab->obj_exts = obj_exts;
slab_set_stride(slab, s->size);
+ /* pairs with smp_rmb() in slab_get_stride() */
+ smp_wmb();
+ slab->obj_exts = obj_exts;
}
}
--
2.43.0
* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
2026-02-24 11:10 ` Harry Yoo
@ 2026-02-25 9:14 ` Venkat Rao Bagalkote
2026-02-25 10:15 ` Harry Yoo
2026-02-27 3:07 ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
0 siblings, 2 replies; 14+ messages in thread
From: Venkat Rao Bagalkote @ 2026-02-25 9:14 UTC (permalink / raw)
To: Harry Yoo
Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
David Rientjes, Roman Gushchin, Alexei Starovoitov, Hao Li,
Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
Michal Hocko, cgroups, linux-mm
On 24/02/26 4:40 pm, Harry Yoo wrote:
> On Tue, Feb 24, 2026 at 02:34:41PM +0530, Venkat Rao Bagalkote wrote:
>> On 23/02/26 1:28 pm, Harry Yoo wrote:
>>> When alloc_slab_obj_exts() is called later in time (instead of at slab
>>> allocation & initialization step), slab->stride and slab->obj_exts are
>>> set when the slab is already accessible by multiple CPUs.
>>>
>>> The current implementation does not enforce memory ordering between
>>> slab->stride and slab->obj_exts. However, for correctness, slab->stride
>>> must be visible before slab->obj_exts, otherwise concurrent readers
>>> may observe slab->obj_exts as non-zero while stride is still stale,
>>> leading to incorrect reference counting of object cgroups.
>>>
>>> There has been a bug report [1] that showed symptoms of incorrect
>>> reference counting of object cgroups, which could be triggered by
>>> this memory ordering issue.
>>>
>>> Fix this by unconditionally initializing slab->stride in
>>> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
>>> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
>>>
>>> This ensures stride is set before the slab becomes visible to
>>> other CPUs via the per-node partial slab list (protected by spinlock
>>> with acquire/release semantics), preventing them from observing
>>> inconsistent stride value.
>>>
>>> Thanks to Shakeel Butt for pointing out this issue [2].
>>>
>>> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
>>> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>>> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com
>>> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo
>>> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
>>> ---
>>>
>>> I tested this patch, but I could not confirm that this actually fixes
>>> the issue reported by [1]. It would be nice if Venkat could help
>>> confirm; but perhaps it's challenging to reliably reproduce...
>>
>> Thanks for the patch. I did ran the complete test suite, and unfortunately
>> issue is reproducing.
> Oops, thanks for confirming that it's still reproduced!
> That's really helpful.
>
> Perhaps I should start considering cases where it's not a memory
> ordering issue, but let's check one more thing before moving on.
> Could you please test if it still reproduces with the following patch?
>
> If it's still reproducible, it should not be due to the memory ordering
> issue between obj_exts and stride.
>
> ---8<---
> From: Harry Yoo <harry.yoo@oracle.com>
> Date: Mon, 23 Feb 2026 16:58:09 +0900
> Subject: mm/slab: enforce slab->stride -> slab->obj_exts ordering
>
> I tried to avoid unnecessary memory barriers for efficiency,
> but the original bug is still reproducible.
>
> Probably I missed a case where an object is allocated on one CPU
> and then freed on a different CPU without involving a spinlock.
>
> I'm not sure whether I missed some edge case or whether it's caused by
> something other than a memory ordering issue.
>
> Anyway, let's find out by introducing heavy memory barriers!
>
> Always ensure that updates to stride are visible before obj_exts.
>
> ---
> mm/slab.h | 1 +
> mm/slub.c | 10 +++++++---
> 2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 71c7261bf822..aacdd9f4e509 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -565,6 +565,7 @@ static inline void slab_set_stride(struct slab *slab, unsigned short stride)
> }
> static inline unsigned short slab_get_stride(struct slab *slab)
> {
> + smp_rmb();
> return slab->stride;
> }
> #else
> diff --git a/mm/slub.c b/mm/slub.c
> index 862642c165ed..c7c8b660a994 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2196,7 +2196,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> retry:
> old_exts = READ_ONCE(slab->obj_exts);
> handle_failed_objexts_alloc(old_exts, vec, objects);
> - slab_set_stride(slab, sizeof(struct slabobj_ext));
>
> if (new_slab) {
> /*
> @@ -2272,6 +2271,10 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
> void *addr;
> unsigned long obj_exts;
>
> + slab_set_stride(slab, sizeof(struct slabobj_ext));
> + /* pairs with smp_rmb() in slab_get_stride() */
> + smp_wmb();
> +
> if (!need_slab_obj_exts(s))
> return;
>
> @@ -2288,7 +2291,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
> obj_exts |= MEMCG_DATA_OBJEXTS;
> #endif
> slab->obj_exts = obj_exts;
> - slab_set_stride(slab, sizeof(struct slabobj_ext));
> } else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
> unsigned int offset = obj_exts_offset_in_object(s);
>
> @@ -2305,8 +2307,10 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
> #ifdef CONFIG_MEMCG
> obj_exts |= MEMCG_DATA_OBJEXTS;
> #endif
> - slab->obj_exts = obj_exts;
> slab_set_stride(slab, s->size);
> + /* pairs with smp_rmb() in slab_get_stride() */
> + smp_wmb();
> + slab->obj_exts = obj_exts;
> }
> }
>
> --
> 2.43.0
>
With this patch, the issue is not reproduced. So it looks good.
Regards,
Venkat.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
2026-02-25 9:14 ` Venkat Rao Bagalkote
@ 2026-02-25 10:15 ` Harry Yoo
2026-02-27 3:07 ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
1 sibling, 0 replies; 14+ messages in thread
From: Harry Yoo @ 2026-02-25 10:15 UTC (permalink / raw)
To: Venkat Rao Bagalkote
Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
David Rientjes, Roman Gushchin, Alexei Starovoitov, Hao Li,
Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
Michal Hocko, cgroups, linux-mm
On Wed, Feb 25, 2026 at 02:44:24PM +0530, Venkat Rao Bagalkote wrote:
> > > Thanks for the patch. I ran the complete test suite, and unfortunately
> > > the issue is still reproducing.
>
> > Oops, thanks for confirming that it still reproduces!
> > That's really helpful.
> >
> > Perhaps I should start considering cases where it's not a memory
> > ordering issue, but let's check one more thing before moving on.
> > Could you please test if it still reproduces with the following patch?
> >
> > If it's still reproducible, it should not be due to the memory ordering
> > issue between obj_exts and stride.
> >
> > ---8<---
> > From: Harry Yoo <harry.yoo@oracle.com>
> > Date: Mon, 23 Feb 2026 16:58:09 +0900
> > Subject: mm/slab: enforce slab->stride -> slab->obj_exts ordering
> >
> > I tried to avoid unnecessary memory barriers for efficiency,
> > but the original bug is still reproducible.
> >
> > Probably I missed a case where an object is allocated on one CPU
> > and then freed on a different CPU without involving a spinlock.
> >
> > I'm not sure whether I missed some edge case or whether it's caused by
> > something other than a memory ordering issue.
> >
> > Anyway, let's find out by introducing heavy memory barriers!
> >
> > Always ensure that updates to stride are visible before obj_exts.
> >
> > ---
[...]
> With this patch, the issue is not reproduced. So it looks good.
Thanks a lot, Venkat! That's really helpful.
I think that's enough signal to assume that memory ordering is playing
a role here, unless it happens to be masking another issue.
Even so, it's important to enforce the ordering anyway.
But having smp_load_acquire() on every alloc/free fastpath doesn't
sound great to me. Let me think a bit about it and come up with
a reasonable solution (this time, hopefully no hole in the ordering).
Since it's a bug, I'm working on it with high priority.
Again, thanks a lot for testing!
--
Cheers,
Harry / Hyeonggon
* [PATCH] mm/slab: a debug patch to investigate the issue further
2026-02-25 9:14 ` Venkat Rao Bagalkote
2026-02-25 10:15 ` Harry Yoo
@ 2026-02-27 3:07 ` Harry Yoo
2026-02-27 5:52 ` kernel test robot
` (2 more replies)
1 sibling, 3 replies; 14+ messages in thread
From: Harry Yoo @ 2026-02-27 3:07 UTC (permalink / raw)
To: venkat88
Cc: akpm, ast, cgroups, cl, hannes, hao.li, harry.yoo, linux-mm,
mhocko, muchun.song, rientjes, roman.gushchin, shakeel.butt,
surenb, vbabka
Hi Venkat, could you please help test this patch and
check whether it hits any warnings? It's based on the v7.0-rc1 tag.
This (hopefully) should give us more information
that will help us debug the issue.
1. set stride early in alloc_slab_obj_exts_early()
2. move some obj_exts helpers to slab.h
3. in slab_obj_ext(), check three things:
3-1. is the obj_ext address the right one for this object?
3-2. does the obj_ext address change after smp_rmb()?
3-3. does obj_ext->objcg change after smp_rmb()?
No smp_wmb() is used, intentionally.
It is expected that the issue will still reproduce.
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---
mm/slab.h | 131 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
mm/slub.c | 100 ++---------------------------------------
2 files changed, 130 insertions(+), 101 deletions(-)
diff --git a/mm/slab.h b/mm/slab.h
index 71c7261bf822..d1e44cd01ea1 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -578,6 +578,101 @@ static inline unsigned short slab_get_stride(struct slab *slab)
}
#endif
+#ifdef CONFIG_SLAB_OBJ_EXT
+
+/*
+ * Check if memory cgroup or memory allocation profiling is enabled.
+ * If enabled, SLUB tries to reduce memory overhead of accounting
+ * slab objects. If neither is enabled when this function is called,
+ * the optimization is simply skipped to avoid affecting caches that do not
+ * need slabobj_ext metadata.
+ *
+ * However, this may disable optimization when memory cgroup or memory
+ * allocation profiling is used, but slabs are created too early
+ * even before those subsystems are initialized.
+ */
+static inline bool need_slab_obj_exts(struct kmem_cache *s)
+{
+ if (s->flags & SLAB_NO_OBJ_EXT)
+ return false;
+
+ if (memcg_kmem_online() && (s->flags & SLAB_ACCOUNT))
+ return true;
+
+ if (mem_alloc_profiling_enabled())
+ return true;
+
+ return false;
+}
+
+static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
+{
+ return sizeof(struct slabobj_ext) * slab->objects;
+}
+
+static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
+ struct slab *slab)
+{
+ unsigned long objext_offset;
+
+ objext_offset = s->size * slab->objects;
+ objext_offset = ALIGN(objext_offset, sizeof(struct slabobj_ext));
+ return objext_offset;
+}
+
+static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
+ struct slab *slab)
+{
+ unsigned long objext_offset = obj_exts_offset_in_slab(s, slab);
+ unsigned long objext_size = obj_exts_size_in_slab(slab);
+
+ return objext_offset + objext_size <= slab_size(slab);
+}
+
+static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
+{
+ unsigned long obj_exts;
+ unsigned long start;
+ unsigned long end;
+
+ obj_exts = slab_obj_exts(slab);
+ if (!obj_exts)
+ return false;
+
+ start = (unsigned long)slab_address(slab);
+ end = start + slab_size(slab);
+ return (obj_exts >= start) && (obj_exts < end);
+}
+#else
+static inline bool need_slab_obj_exts(struct kmem_cache *s)
+{
+ return false;
+}
+
+static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
+{
+ return 0;
+}
+
+static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
+ struct slab *slab)
+{
+ return 0;
+}
+
+static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
+ struct slab *slab)
+{
+ return false;
+}
+
+static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
+{
+ return false;
+}
+
+#endif
+
/*
* slab_obj_ext - get the pointer to the slab object extension metadata
* associated with an object in a slab.
@@ -592,13 +687,41 @@ static inline struct slabobj_ext *slab_obj_ext(struct slab *slab,
unsigned long obj_exts,
unsigned int index)
{
- struct slabobj_ext *obj_ext;
+ struct slabobj_ext *ext_before;
+ struct slabobj_ext *ext_after;
+ struct obj_cgroup *objcg_before;
+ struct obj_cgroup *objcg_after;
VM_WARN_ON_ONCE(obj_exts != slab_obj_exts(slab));
- obj_ext = (struct slabobj_ext *)(obj_exts +
- slab_get_stride(slab) * index);
- return kasan_reset_tag(obj_ext);
+ ext_before = (struct slabobj_ext *)(obj_exts +
+ slab_get_stride(slab) * index);
+ objcg_before = ext_before->objcg;
+ // re-read things after rmb
+ smp_rmb();
+ // is ext_before the right obj_ext for this object?
+ if (obj_exts_in_slab(slab->slab_cache, slab)) {
+ struct kmem_cache *s = slab->slab_cache;
+
+ if (obj_exts_fit_within_slab_leftover(s, slab))
+ WARN(ext_before != (struct slabobj_ext *)(obj_exts + sizeof(struct slabobj_ext) * index),
+ "obj_exts array in leftover");
+ else
+ WARN(ext_before != (struct slabobj_ext *)(obj_exts + s->size * index),
+ "obj_ext in object");
+
+ } else {
+ WARN(ext_before != (struct slabobj_ext *)(obj_exts + sizeof(struct slabobj_ext) * index),
+ "obj_exts array allocated from slab");
+ }
+
+ ext_after = (struct slabobj_ext *)(obj_exts +
+ slab_get_stride(slab) * index);
+ objcg_after = ext_after->objcg;
+
+ WARN(ext_before != ext_after, "obj_ext pointer has changed");
+ WARN(objcg_before != objcg_after, "obj_ext->objcg has changed");
+ return kasan_reset_tag(ext_before);
}
int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
diff --git a/mm/slub.c b/mm/slub.c
index 862642c165ed..8eb64534370e 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -757,101 +757,6 @@ static inline unsigned long get_orig_size(struct kmem_cache *s, void *object)
return *(unsigned long *)p;
}
-#ifdef CONFIG_SLAB_OBJ_EXT
-
-/*
- * Check if memory cgroup or memory allocation profiling is enabled.
- * If enabled, SLUB tries to reduce memory overhead of accounting
- * slab objects. If neither is enabled when this function is called,
- * the optimization is simply skipped to avoid affecting caches that do not
- * need slabobj_ext metadata.
- *
- * However, this may disable optimization when memory cgroup or memory
- * allocation profiling is used, but slabs are created too early
- * even before those subsystems are initialized.
- */
-static inline bool need_slab_obj_exts(struct kmem_cache *s)
-{
- if (s->flags & SLAB_NO_OBJ_EXT)
- return false;
-
- if (memcg_kmem_online() && (s->flags & SLAB_ACCOUNT))
- return true;
-
- if (mem_alloc_profiling_enabled())
- return true;
-
- return false;
-}
-
-static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
-{
- return sizeof(struct slabobj_ext) * slab->objects;
-}
-
-static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
- struct slab *slab)
-{
- unsigned long objext_offset;
-
- objext_offset = s->size * slab->objects;
- objext_offset = ALIGN(objext_offset, sizeof(struct slabobj_ext));
- return objext_offset;
-}
-
-static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
- struct slab *slab)
-{
- unsigned long objext_offset = obj_exts_offset_in_slab(s, slab);
- unsigned long objext_size = obj_exts_size_in_slab(slab);
-
- return objext_offset + objext_size <= slab_size(slab);
-}
-
-static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
-{
- unsigned long obj_exts;
- unsigned long start;
- unsigned long end;
-
- obj_exts = slab_obj_exts(slab);
- if (!obj_exts)
- return false;
-
- start = (unsigned long)slab_address(slab);
- end = start + slab_size(slab);
- return (obj_exts >= start) && (obj_exts < end);
-}
-#else
-static inline bool need_slab_obj_exts(struct kmem_cache *s)
-{
- return false;
-}
-
-static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
-{
- return 0;
-}
-
-static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
- struct slab *slab)
-{
- return 0;
-}
-
-static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
- struct slab *slab)
-{
- return false;
-}
-
-static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
-{
- return false;
-}
-
-#endif
-
#if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
static bool obj_exts_in_object(struct kmem_cache *s, struct slab *slab)
{
@@ -2196,7 +2101,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
retry:
old_exts = READ_ONCE(slab->obj_exts);
handle_failed_objexts_alloc(old_exts, vec, objects);
- slab_set_stride(slab, sizeof(struct slabobj_ext));
if (new_slab) {
/*
@@ -2272,6 +2176,9 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
void *addr;
unsigned long obj_exts;
+ /* Initialize stride early to avoid memory ordering issues */
+ slab_set_stride(slab, sizeof(struct slabobj_ext));
+
if (!need_slab_obj_exts(s))
return;
@@ -2288,7 +2195,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
obj_exts |= MEMCG_DATA_OBJEXTS;
#endif
slab->obj_exts = obj_exts;
- slab_set_stride(slab, sizeof(struct slabobj_ext));
} else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
unsigned int offset = obj_exts_offset_in_object(s);
--
2.43.0
* Re: [PATCH] mm/slab: a debug patch to investigate the issue further
2026-02-27 3:07 ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
@ 2026-02-27 5:52 ` kernel test robot
2026-02-27 6:02 ` kernel test robot
2026-02-27 8:02 ` Venkat Rao Bagalkote
2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2026-02-27 5:52 UTC (permalink / raw)
To: Harry Yoo, venkat88
Cc: llvm, oe-kbuild-all, akpm, ast, cgroups, cl, hannes, hao.li,
harry.yoo, linux-mm, mhocko, muchun.song, rientjes,
roman.gushchin, shakeel.butt, surenb, vbabka
Hi Harry,
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-everything]
url: https://github.com/intel-lab-lkp/linux/commits/Harry-Yoo/mm-slab-a-debug-patch-to-investigate-the-issue-further/20260227-111246
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20260227030733.9517-1-harry.yoo%40oracle.com
patch subject: [PATCH] mm/slab: a debug patch to investigate the issue further
config: x86_64-allnoconfig (https://download.01.org/0day-ci/archive/20260227/202602271320.ywOCYQx4-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260227/202602271320.ywOCYQx4-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602271320.ywOCYQx4-lkp@intel.com/
All errors (new ones prefixed by >>):
>> mm/slub.c:1330:6: error: call to undeclared function 'obj_exts_in_slab'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
1330 | if (obj_exts_in_slab(s, slab) && !obj_exts_in_object(s, slab)) {
| ^
>> mm/slub.c:1332:16: error: call to undeclared function 'obj_exts_offset_in_slab'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
1332 | remainder -= obj_exts_offset_in_slab(s, slab);
| ^
mm/slub.c:1332:16: note: did you mean 'obj_exts_offset_in_object'?
mm/slub.c:793:28: note: 'obj_exts_offset_in_object' declared here
793 | static inline unsigned int obj_exts_offset_in_object(struct kmem_cache *s)
| ^
>> mm/slub.c:1333:16: error: call to undeclared function 'obj_exts_size_in_slab'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
1333 | remainder -= obj_exts_size_in_slab(slab);
| ^
3 errors generated.
vim +/obj_exts_in_slab +1330 mm/slub.c
81819f0fc8285a Christoph Lameter 2007-05-06 1311
39b264641a0c3b Christoph Lameter 2008-04-14 1312 /* Check the pad bytes at the end of a slab page */
adea9876180664 Ilya Leoshkevich 2024-06-21 1313 static pad_check_attributes void
adea9876180664 Ilya Leoshkevich 2024-06-21 1314 slab_pad_check(struct kmem_cache *s, struct slab *slab)
81819f0fc8285a Christoph Lameter 2007-05-06 1315 {
2492268472e7d3 Christoph Lameter 2007-07-17 1316 u8 *start;
2492268472e7d3 Christoph Lameter 2007-07-17 1317 u8 *fault;
2492268472e7d3 Christoph Lameter 2007-07-17 1318 u8 *end;
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31 1319 u8 *pad;
2492268472e7d3 Christoph Lameter 2007-07-17 1320 int length;
2492268472e7d3 Christoph Lameter 2007-07-17 1321 int remainder;
81819f0fc8285a Christoph Lameter 2007-05-06 1322
81819f0fc8285a Christoph Lameter 2007-05-06 1323 if (!(s->flags & SLAB_POISON))
a204e6d626126d Miaohe Lin 2022-04-19 1324 return;
81819f0fc8285a Christoph Lameter 2007-05-06 1325
bb192ed9aa7191 Vlastimil Babka 2021-11-03 1326 start = slab_address(slab);
bb192ed9aa7191 Vlastimil Babka 2021-11-03 1327 length = slab_size(slab);
39b264641a0c3b Christoph Lameter 2008-04-14 1328 end = start + length;
70089d01880750 Harry Yoo 2026-01-13 1329
a77d6d33868502 Harry Yoo 2026-01-13 @1330 if (obj_exts_in_slab(s, slab) && !obj_exts_in_object(s, slab)) {
70089d01880750 Harry Yoo 2026-01-13 1331 remainder = length;
70089d01880750 Harry Yoo 2026-01-13 @1332 remainder -= obj_exts_offset_in_slab(s, slab);
70089d01880750 Harry Yoo 2026-01-13 @1333 remainder -= obj_exts_size_in_slab(slab);
70089d01880750 Harry Yoo 2026-01-13 1334 } else {
39b264641a0c3b Christoph Lameter 2008-04-14 1335 remainder = length % s->size;
70089d01880750 Harry Yoo 2026-01-13 1336 }
70089d01880750 Harry Yoo 2026-01-13 1337
81819f0fc8285a Christoph Lameter 2007-05-06 1338 if (!remainder)
a204e6d626126d Miaohe Lin 2022-04-19 1339 return;
81819f0fc8285a Christoph Lameter 2007-05-06 1340
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31 1341 pad = end - remainder;
a79316c6178ca4 Andrey Ryabinin 2015-02-13 1342 metadata_access_enable();
aa1ef4d7b3f67f Andrey Konovalov 2020-12-22 1343 fault = memchr_inv(kasan_reset_tag(pad), POISON_INUSE, remainder);
a79316c6178ca4 Andrey Ryabinin 2015-02-13 1344 metadata_access_disable();
2492268472e7d3 Christoph Lameter 2007-07-17 1345 if (!fault)
a204e6d626126d Miaohe Lin 2022-04-19 1346 return;
2492268472e7d3 Christoph Lameter 2007-07-17 1347 while (end > fault && end[-1] == POISON_INUSE)
2492268472e7d3 Christoph Lameter 2007-07-17 1348 end--;
2492268472e7d3 Christoph Lameter 2007-07-17 1349
3f6f32b14ab354 Hyesoo Yu 2025-02-26 1350 slab_bug(s, "Padding overwritten. 0x%p-0x%p @offset=%tu",
e1b70dd1e6429f Miles Chen 2019-11-30 1351 fault, end - 1, fault - start);
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31 1352 print_section(KERN_ERR, "Padding ", pad, remainder);
3f6f32b14ab354 Hyesoo Yu 2025-02-26 1353 __slab_err(slab);
2492268472e7d3 Christoph Lameter 2007-07-17 1354
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31 1355 restore_bytes(s, "slab padding", POISON_INUSE, fault, end);
81819f0fc8285a Christoph Lameter 2007-05-06 1356 }
81819f0fc8285a Christoph Lameter 2007-05-06 1357
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH] mm/slab: a debug patch to investigate the issue further
2026-02-27 3:07 ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
2026-02-27 5:52 ` kernel test robot
@ 2026-02-27 6:02 ` kernel test robot
2026-02-27 8:02 ` Venkat Rao Bagalkote
2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2026-02-27 6:02 UTC (permalink / raw)
To: Harry Yoo, venkat88
Cc: oe-kbuild-all, akpm, ast, cgroups, cl, hannes, hao.li, harry.yoo,
linux-mm, mhocko, muchun.song, rientjes, roman.gushchin,
shakeel.butt, surenb, vbabka
Hi Harry,
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-everything]
url: https://github.com/intel-lab-lkp/linux/commits/Harry-Yoo/mm-slab-a-debug-patch-to-investigate-the-issue-further/20260227-111246
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20260227030733.9517-1-harry.yoo%40oracle.com
patch subject: [PATCH] mm/slab: a debug patch to investigate the issue further
config: nios2-allnoconfig (https://download.01.org/0day-ci/archive/20260227/202602271339.xhIvS2iX-lkp@intel.com/config)
compiler: nios2-linux-gcc (GCC) 11.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260227/202602271339.xhIvS2iX-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602271339.xhIvS2iX-lkp@intel.com/
All errors (new ones prefixed by >>):
mm/slub.c: In function 'slab_pad_check':
>> mm/slub.c:1330:13: error: implicit declaration of function 'obj_exts_in_slab'; did you mean 'obj_exts_in_object'? [-Werror=implicit-function-declaration]
1330 | if (obj_exts_in_slab(s, slab) && !obj_exts_in_object(s, slab)) {
| ^~~~~~~~~~~~~~~~
| obj_exts_in_object
>> mm/slub.c:1332:30: error: implicit declaration of function 'obj_exts_offset_in_slab'; did you mean 'obj_exts_offset_in_object'? [-Werror=implicit-function-declaration]
1332 | remainder -= obj_exts_offset_in_slab(s, slab);
| ^~~~~~~~~~~~~~~~~~~~~~~
| obj_exts_offset_in_object
>> mm/slub.c:1333:30: error: implicit declaration of function 'obj_exts_size_in_slab' [-Werror=implicit-function-declaration]
1333 | remainder -= obj_exts_size_in_slab(slab);
| ^~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
vim +1330 mm/slub.c
81819f0fc8285a Christoph Lameter 2007-05-06 1311
39b264641a0c3b Christoph Lameter 2008-04-14 1312 /* Check the pad bytes at the end of a slab page */
adea9876180664 Ilya Leoshkevich 2024-06-21 1313 static pad_check_attributes void
adea9876180664 Ilya Leoshkevich 2024-06-21 1314 slab_pad_check(struct kmem_cache *s, struct slab *slab)
81819f0fc8285a Christoph Lameter 2007-05-06 1315 {
2492268472e7d3 Christoph Lameter 2007-07-17 1316 u8 *start;
2492268472e7d3 Christoph Lameter 2007-07-17 1317 u8 *fault;
2492268472e7d3 Christoph Lameter 2007-07-17 1318 u8 *end;
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31 1319 u8 *pad;
2492268472e7d3 Christoph Lameter 2007-07-17 1320 int length;
2492268472e7d3 Christoph Lameter 2007-07-17 1321 int remainder;
81819f0fc8285a Christoph Lameter 2007-05-06 1322
81819f0fc8285a Christoph Lameter 2007-05-06 1323 if (!(s->flags & SLAB_POISON))
a204e6d626126d Miaohe Lin 2022-04-19 1324 return;
81819f0fc8285a Christoph Lameter 2007-05-06 1325
bb192ed9aa7191 Vlastimil Babka 2021-11-03 1326 start = slab_address(slab);
bb192ed9aa7191 Vlastimil Babka 2021-11-03 1327 length = slab_size(slab);
39b264641a0c3b Christoph Lameter 2008-04-14 1328 end = start + length;
70089d01880750 Harry Yoo 2026-01-13 1329
a77d6d33868502 Harry Yoo 2026-01-13 @1330 if (obj_exts_in_slab(s, slab) && !obj_exts_in_object(s, slab)) {
70089d01880750 Harry Yoo 2026-01-13 1331 remainder = length;
70089d01880750 Harry Yoo 2026-01-13 @1332 remainder -= obj_exts_offset_in_slab(s, slab);
70089d01880750 Harry Yoo 2026-01-13 @1333 remainder -= obj_exts_size_in_slab(slab);
70089d01880750 Harry Yoo 2026-01-13 1334 } else {
39b264641a0c3b Christoph Lameter 2008-04-14 1335 remainder = length % s->size;
70089d01880750 Harry Yoo 2026-01-13 1336 }
70089d01880750 Harry Yoo 2026-01-13 1337
81819f0fc8285a Christoph Lameter 2007-05-06 1338 if (!remainder)
a204e6d626126d Miaohe Lin 2022-04-19 1339 return;
81819f0fc8285a Christoph Lameter 2007-05-06 1340
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31 1341 pad = end - remainder;
a79316c6178ca4 Andrey Ryabinin 2015-02-13 1342 metadata_access_enable();
aa1ef4d7b3f67f Andrey Konovalov 2020-12-22 1343 fault = memchr_inv(kasan_reset_tag(pad), POISON_INUSE, remainder);
a79316c6178ca4 Andrey Ryabinin 2015-02-13 1344 metadata_access_disable();
2492268472e7d3 Christoph Lameter 2007-07-17 1345 if (!fault)
a204e6d626126d Miaohe Lin 2022-04-19 1346 return;
2492268472e7d3 Christoph Lameter 2007-07-17 1347 while (end > fault && end[-1] == POISON_INUSE)
2492268472e7d3 Christoph Lameter 2007-07-17 1348 end--;
2492268472e7d3 Christoph Lameter 2007-07-17 1349
3f6f32b14ab354 Hyesoo Yu 2025-02-26 1350 slab_bug(s, "Padding overwritten. 0x%p-0x%p @offset=%tu",
e1b70dd1e6429f Miles Chen 2019-11-30 1351 fault, end - 1, fault - start);
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31 1352 print_section(KERN_ERR, "Padding ", pad, remainder);
3f6f32b14ab354 Hyesoo Yu 2025-02-26 1353 __slab_err(slab);
2492268472e7d3 Christoph Lameter 2007-07-17 1354
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31 1355 restore_bytes(s, "slab padding", POISON_INUSE, fault, end);
81819f0fc8285a Christoph Lameter 2007-05-06 1356 }
81819f0fc8285a Christoph Lameter 2007-05-06 1357
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH] mm/slab: a debug patch to investigate the issue further
2026-02-27 3:07 ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
2026-02-27 5:52 ` kernel test robot
2026-02-27 6:02 ` kernel test robot
@ 2026-02-27 8:02 ` Venkat Rao Bagalkote
2026-02-27 8:11 ` Harry Yoo
2 siblings, 1 reply; 14+ messages in thread
From: Venkat Rao Bagalkote @ 2026-02-27 8:02 UTC (permalink / raw)
To: Harry Yoo
Cc: akpm, ast, cgroups, cl, hannes, hao.li, linux-mm, mhocko,
muchun.song, rientjes, roman.gushchin, shakeel.butt, surenb,
vbabka
On 27/02/26 8:37 am, Harry Yoo wrote:
> Hi Venkat, could you please help test this patch and
> check whether it hits any warnings? It's based on the v7.0-rc1 tag.
>
> This (hopefully) should give us more information
> that will help us debug the issue.
>
> 1. set stride early in alloc_slab_obj_exts_early()
> 2. move some obj_exts helpers to slab.h
> 3. in slab_obj_ext(), check three things:
> 3-1. is the obj_ext address the right one for this object?
> 3-2. does the obj_ext address change after smp_rmb()?
> 3-3. does obj_ext->objcg change after smp_rmb()?
>
> No smp_wmb() is used, intentionally.
>
> It is expected that the issue will still reproduce.
>
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
Hello Harry,
I’ve restarted the test, but there are continuous warning prints in the
logs, and they appear to be slowing down the test run significantly.
Warnings:
[ 3215.419760] obj_ext in object
[ 3215.419774] WARNING: mm/slab.h:710 at slab_obj_ext+0x2e0/0x338,
CPU#26: grep/103571
[ 3215.419783] Modules linked in: xfs loop dm_mod bonding tls rfkill
nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink
sunrpc pseries_rng vmx_crypto dax_pmem fuse ext4 crc16 mbcache jbd2
sd_mod nd_pmem sg papr_scm libnvdimm ibmvscsi ibmveth scsi_transport_srp
pseries_wdt
[ 3215.419852] CPU: 26 UID: 0 PID: 103571 Comm: grep Kdump: loaded
Tainted: G W 7.0.0-rc1+ #3 PREEMPTLAZY
[ 3215.419859] Tainted: [W]=WARN
[ 3215.419862] Hardware name: IBM,9080-HEX Power11 (architected)
0x820200 0xf000007 of:IBM,FW1110.01 (NH1110_069) hv:phyp pSeries
[ 3215.419866] NIP: c0000000008a9ff4 LR: c0000000008a9ff0 CTR:
0000000000000000
[ 3215.419870] REGS: c0000001f9d37670 TRAP: 0700 Tainted: G W
(7.0.0-rc1+)
[ 3215.419874] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR:
24002404 XER: 20040000
[ 3215.419889] CFAR: c0000000001bc194 IRQMASK: 0
[ 3215.419889] GPR00: c0000000008a9ff0 c0000001f9d37910 c00000000243a500
c000000127e8d600
[ 3215.419889] GPR04: 0000000000000004 0000000000000001 c0000000001bc164
0000000000000001
[ 3215.419889] GPR08: a80e000000000000 0000000000000000 0000000000000007
a80e000000000000
[ 3215.419889] GPR12: c00e0001a1a48fb2 c000000d0dde7f00 c000000004e49960
0000000000000001
[ 3215.419889] GPR16: c00000006e6e0000 0000000000000010 c000000007017fa0
c000000007017fa4
[ 3215.419889] GPR20: 0000000000000001 c000000007017f88 0000000000080000
c000000007017f80
[ 3215.419889] GPR24: c00000006e6f0010 c0000000aef32800 c00c0000001b9a2c
c00000006e690010
[ 3215.419889] GPR28: 0000000000000003 0000000000080020 c00000006e690010
c00c0000001b9a00
[ 3215.419960] NIP [c0000000008a9ff4] slab_obj_ext+0x2e0/0x338
[ 3215.419966] LR [c0000000008a9ff0] slab_obj_ext+0x2dc/0x338
[ 3215.419972] Call Trace:
[ 3215.419975] [c0000001f9d37910] [c0000000008a9ff0]
slab_obj_ext+0x2dc/0x338 (unreliable)
[ 3215.419983] [c0000001f9d379c0] [c0000000008b9a64]
__memcg_slab_free_hook+0x1a4/0x3dc
[ 3215.419990] [c0000001f9d37a90] [c0000000007f8270] kfree+0x454/0x600
[ 3215.419998] [c0000001f9d37b20] [c000000000989724]
seq_release_private+0x98/0xd4
[ 3215.420005] [c0000001f9d37b60] [c000000000a7adb4]
proc_map_release+0xa4/0xe0
[ 3215.420012] [c0000001f9d37ba0] [c00000000091edf0] __fput+0x1e8/0x5cc
[ 3215.420019] [c0000001f9d37c20] [c000000000915670] sys_close+0x74/0xd0
[ 3215.420025] [c0000001f9d37c50] [c00000000003aeb0]
system_call_exception+0x1e0/0x4b0
[ 3215.420033] [c0000001f9d37e50] [c00000000000d05c]
system_call_vectored_common+0x15c/0x2ec
[ 3215.420041] ---- interrupt: 3000 at 0x7fff9bd34ab4
[ 3215.420045] NIP: 00007fff9bd34ab4 LR: 00007fff9bd34ab4 CTR:
0000000000000000
[ 3215.420050] REGS: c0000001f9d37e80 TRAP: 3000 Tainted: G W
(7.0.0-rc1+)
[ 3215.420054] MSR: 800000000280f033
<SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 44002402 XER: 00000000
[ 3215.420077] IRQMASK: 0
[ 3215.420077] GPR00: 0000000000000006 00007fffe2939800 00007fff9bf37f00
0000000000000003
[ 3215.420077] GPR04: 00007fff9bfe077f 000000000000f881 00007fffe2939820
000000000000f881
[ 3215.420077] GPR08: 000000000000077f 0000000000000000 0000000000000000
0000000000000000
[ 3215.420077] GPR12: 0000000000000000 00007fff9c0ab0e0 0000000000000000
0000000000000000
[ 3215.420077] GPR16: 0000000000000000 00000001235700f0 0000000000000100
0000000000000001
[ 3215.420077] GPR20: 00000000ffffffff 00000001235702ef 0000000000000000
fffffffffffffffd
[ 3215.420077] GPR24: 00007fffe2939890 0000000000000000 00007fffe2939978
00007fff9bf12a88
[ 3215.420077] GPR28: 00007fffe2939974 0000000000010000 0000000000000003
0000000000010000
[ 3215.420144] NIP [00007fff9bd34ab4] 0x7fff9bd34ab4
[ 3215.420148] LR [00007fff9bd34ab4] 0x7fff9bd34ab4
[ 3215.420151] ---- interrupt: 3000
[ 3215.420154] Code: 4e800020 60000000 60000000 7f18e1d6 7b180020
7f18f214 7c3bc000 4182febc 3c62ff7a 386336c0 4b9120a9 60000000
<0fe00000> eac10060 4bffff58 3d200001
[ 3215.420183] ---[ end trace 0000000000000000 ]---
Regards,
Venkat.
> ---
> mm/slab.h | 131 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
> mm/slub.c | 100 ++---------------------------------------
> 2 files changed, 130 insertions(+), 101 deletions(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 71c7261bf822..d1e44cd01ea1 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -578,6 +578,101 @@ static inline unsigned short slab_get_stride(struct slab *slab)
> }
> #endif
>
> +#ifdef CONFIG_SLAB_OBJ_EXT
> +
> +/*
> + * Check if memory cgroup or memory allocation profiling is enabled.
> + * If enabled, SLUB tries to reduce memory overhead of accounting
> + * slab objects. If neither is enabled when this function is called,
> + * the optimization is simply skipped to avoid affecting caches that do not
> + * need slabobj_ext metadata.
> + *
> + * However, this may disable optimization when memory cgroup or memory
> + * allocation profiling is used, but slabs are created too early
> + * even before those subsystems are initialized.
> + */
> +static inline bool need_slab_obj_exts(struct kmem_cache *s)
> +{
> + if (s->flags & SLAB_NO_OBJ_EXT)
> + return false;
> +
> + if (memcg_kmem_online() && (s->flags & SLAB_ACCOUNT))
> + return true;
> +
> + if (mem_alloc_profiling_enabled())
> + return true;
> +
> + return false;
> +}
> +
> +static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
> +{
> + return sizeof(struct slabobj_ext) * slab->objects;
> +}
> +
> +static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
> + struct slab *slab)
> +{
> + unsigned long objext_offset;
> +
> + objext_offset = s->size * slab->objects;
> + objext_offset = ALIGN(objext_offset, sizeof(struct slabobj_ext));
> + return objext_offset;
> +}
> +
> +static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
> + struct slab *slab)
> +{
> + unsigned long objext_offset = obj_exts_offset_in_slab(s, slab);
> + unsigned long objext_size = obj_exts_size_in_slab(slab);
> +
> + return objext_offset + objext_size <= slab_size(slab);
> +}
> +
> +static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
> +{
> + unsigned long obj_exts;
> + unsigned long start;
> + unsigned long end;
> +
> + obj_exts = slab_obj_exts(slab);
> + if (!obj_exts)
> + return false;
> +
> + start = (unsigned long)slab_address(slab);
> + end = start + slab_size(slab);
> + return (obj_exts >= start) && (obj_exts < end);
> +}
> +#else
> +static inline bool need_slab_obj_exts(struct kmem_cache *s)
> +{
> + return false;
> +}
> +
> +static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
> +{
> + return 0;
> +}
> +
> +static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
> + struct slab *slab)
> +{
> + return 0;
> +}
> +
> +static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
> + struct slab *slab)
> +{
> + return false;
> +}
> +
> +static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
> +{
> + return false;
> +}
> +
> +#endif
> +
> /*
> * slab_obj_ext - get the pointer to the slab object extension metadata
> * associated with an object in a slab.
> @@ -592,13 +687,41 @@ static inline struct slabobj_ext *slab_obj_ext(struct slab *slab,
> unsigned long obj_exts,
> unsigned int index)
> {
> - struct slabobj_ext *obj_ext;
> + struct slabobj_ext *ext_before;
> + struct slabobj_ext *ext_after;
> + struct obj_cgroup *objcg_before;
> + struct obj_cgroup *objcg_after;
>
> VM_WARN_ON_ONCE(obj_exts != slab_obj_exts(slab));
>
> - obj_ext = (struct slabobj_ext *)(obj_exts +
> - slab_get_stride(slab) * index);
> - return kasan_reset_tag(obj_ext);
> + ext_before = (struct slabobj_ext *)(obj_exts +
> + slab_get_stride(slab) * index);
> + objcg_before = ext_before->objcg;
> + // re-read things after rmb
> + smp_rmb();
> + // is ext_before the right obj_ext for this object?
> + if (obj_exts_in_slab(slab->slab_cache, slab)) {
> + struct kmem_cache *s = slab->slab_cache;
> +
> + if (obj_exts_fit_within_slab_leftover(s, slab))
> + WARN(ext_before != (struct slabobj_ext *)(obj_exts + sizeof(struct slabobj_ext) * index),
> + "obj_exts array in leftover");
> + else
> + WARN(ext_before != (struct slabobj_ext *)(obj_exts + s->size * index),
> + "obj_ext in object");
> +
> + } else {
> + WARN(ext_before != (struct slabobj_ext *)(obj_exts + sizeof(struct slabobj_ext) * index),
> + "obj_exts array allocated from slab");
> + }
> +
> + ext_after = (struct slabobj_ext *)(obj_exts +
> + slab_get_stride(slab) * index);
> + objcg_after = ext_after->objcg;
> +
> + WARN(ext_before != ext_after, "obj_ext pointer has changed");
> + WARN(objcg_before != objcg_after, "obj_ext->objcg has changed");
> + return kasan_reset_tag(ext_before);
> }
>
> int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> diff --git a/mm/slub.c b/mm/slub.c
> index 862642c165ed..8eb64534370e 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -757,101 +757,6 @@ static inline unsigned long get_orig_size(struct kmem_cache *s, void *object)
> return *(unsigned long *)p;
> }
>
> -#ifdef CONFIG_SLAB_OBJ_EXT
> -
> -/*
> - * Check if memory cgroup or memory allocation profiling is enabled.
> - * If enabled, SLUB tries to reduce memory overhead of accounting
> - * slab objects. If neither is enabled when this function is called,
> - * the optimization is simply skipped to avoid affecting caches that do not
> - * need slabobj_ext metadata.
> - *
> - * However, this may disable optimization when memory cgroup or memory
> - * allocation profiling is used, but slabs are created too early
> - * even before those subsystems are initialized.
> - */
> -static inline bool need_slab_obj_exts(struct kmem_cache *s)
> -{
> - if (s->flags & SLAB_NO_OBJ_EXT)
> - return false;
> -
> - if (memcg_kmem_online() && (s->flags & SLAB_ACCOUNT))
> - return true;
> -
> - if (mem_alloc_profiling_enabled())
> - return true;
> -
> - return false;
> -}
> -
> -static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
> -{
> - return sizeof(struct slabobj_ext) * slab->objects;
> -}
> -
> -static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
> - struct slab *slab)
> -{
> - unsigned long objext_offset;
> -
> - objext_offset = s->size * slab->objects;
> - objext_offset = ALIGN(objext_offset, sizeof(struct slabobj_ext));
> - return objext_offset;
> -}
> -
> -static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
> - struct slab *slab)
> -{
> - unsigned long objext_offset = obj_exts_offset_in_slab(s, slab);
> - unsigned long objext_size = obj_exts_size_in_slab(slab);
> -
> - return objext_offset + objext_size <= slab_size(slab);
> -}
> -
> -static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
> -{
> - unsigned long obj_exts;
> - unsigned long start;
> - unsigned long end;
> -
> - obj_exts = slab_obj_exts(slab);
> - if (!obj_exts)
> - return false;
> -
> - start = (unsigned long)slab_address(slab);
> - end = start + slab_size(slab);
> - return (obj_exts >= start) && (obj_exts < end);
> -}
> -#else
> -static inline bool need_slab_obj_exts(struct kmem_cache *s)
> -{
> - return false;
> -}
> -
> -static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
> -{
> - return 0;
> -}
> -
> -static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
> - struct slab *slab)
> -{
> - return 0;
> -}
> -
> -static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
> - struct slab *slab)
> -{
> - return false;
> -}
> -
> -static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
> -{
> - return false;
> -}
> -
> -#endif
> -
> #if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
> static bool obj_exts_in_object(struct kmem_cache *s, struct slab *slab)
> {
> @@ -2196,7 +2101,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> retry:
> old_exts = READ_ONCE(slab->obj_exts);
> handle_failed_objexts_alloc(old_exts, vec, objects);
> - slab_set_stride(slab, sizeof(struct slabobj_ext));
>
> if (new_slab) {
> /*
> @@ -2272,6 +2176,9 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
> void *addr;
> unsigned long obj_exts;
>
> + /* Initialize stride early to avoid memory ordering issues */
> + slab_set_stride(slab, sizeof(struct slabobj_ext));
> +
> if (!need_slab_obj_exts(s))
> return;
>
> @@ -2288,7 +2195,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
> obj_exts |= MEMCG_DATA_OBJEXTS;
> #endif
> slab->obj_exts = obj_exts;
> - slab_set_stride(slab, sizeof(struct slabobj_ext));
> } else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
> unsigned int offset = obj_exts_offset_in_object(s);
>
* Re: [PATCH] mm/slab: a debug patch to investigate the issue further
2026-02-27 8:02 ` Venkat Rao Bagalkote
@ 2026-02-27 8:11 ` Harry Yoo
2026-02-27 9:36 ` Venkat Rao Bagalkote
0 siblings, 1 reply; 14+ messages in thread
From: Harry Yoo @ 2026-02-27 8:11 UTC (permalink / raw)
To: Venkat Rao Bagalkote
Cc: akpm, ast, cgroups, cl, hannes, hao.li, linux-mm, mhocko,
muchun.song, rientjes, roman.gushchin, shakeel.butt, surenb,
vbabka
On Fri, Feb 27, 2026 at 01:32:29PM +0530, Venkat Rao Bagalkote wrote:
>
> On 27/02/26 8:37 am, Harry Yoo wrote:
> > Hi Venkat, could you please help testing this patch and
> > check if it hits any warning? It's based on v7.0-rc1 tag.
> >
This (hopefully) should give us more information
that would help debug the issue.
> >
> > 1. set stride early in alloc_slab_obj_exts_early()
> > 2. move some obj_exts helpers to slab.h
> > 3. in slab_obj_ext(), check three things:
> > 3-1. is the obj_ext address the right one for this object?
> > 3-2. does the obj_ext address change after smp_rmb()?
> > 3-3. does obj_ext->objcg change after smp_rmb()?
> >
> > No smp_wmb() is used, intentionally.
> >
> > It is expected that the issue will still reproduce.
> >
> > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
>
>
> Hello Harry,
Hello Venkat!
> I’ve restarted the test
Thanks :)
> but there are continuous warning prints in the
> logs, and they appear to be slowing down the test run significantly.
It's okay! The purpose of this patch is to see whether any warning is
hit, rather than to trigger a kernel crash.
> Warnings:
>
> [ 3215.419760] obj_ext in object
The patch adds five different warnings:
1) "obj_exts array in leftover"
2) "obj_ext in object"
3) "obj_exts array allocated from slab"
4) "obj_ext pointer has changed"
5) "obj_ext->objcg has changed"
Is 2) the only warning that is triggered?
Also, the warning below says it's triggered by proc_map_release().
Are there any other call stacks, or is this the only caller that hits
this warning?
Thanks!
> [ 3215.419774] WARNING: mm/slab.h:710 at slab_obj_ext+0x2e0/0x338, CPU#26:
> grep/103571 >
> [ 3215.419783] Modules linked in: xfs loop dm_mod bonding tls rfkill
> nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
> nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc
> pseries_rng vmx_crypto dax_pmem fuse ext4 crc16 mbcache jbd2 sd_mod nd_pmem
> sg papr_scm libnvdimm ibmvscsi ibmveth scsi_transport_srp pseries_wdt
> [ 3215.419852] CPU: 26 UID: 0 PID: 103571 Comm: grep Kdump: loaded Tainted:
> G W 7.0.0-rc1+ #3 PREEMPTLAZY
> [ 3215.419859] Tainted: [W]=WARN
> [ 3215.419862] Hardware name: IBM,9080-HEX Power11 (architected) 0x820200
> 0xf000007 of:IBM,FW1110.01 (NH1110_069) hv:phyp pSeries
> [ 3215.419866] NIP: c0000000008a9ff4 LR: c0000000008a9ff0 CTR:
> 0000000000000000
> [ 3215.419870] REGS: c0000001f9d37670 TRAP: 0700 Tainted: G W
> (7.0.0-rc1+)
> [ 3215.419874] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 24002404
> XER: 20040000
> [ 3215.419889] CFAR: c0000000001bc194 IRQMASK: 0
> [ 3215.419889] GPR00: c0000000008a9ff0 c0000001f9d37910 c00000000243a500
> c000000127e8d600
> [ 3215.419889] GPR04: 0000000000000004 0000000000000001 c0000000001bc164
> 0000000000000001
> [ 3215.419889] GPR08: a80e000000000000 0000000000000000 0000000000000007
> a80e000000000000
> [ 3215.419889] GPR12: c00e0001a1a48fb2 c000000d0dde7f00 c000000004e49960
> 0000000000000001
> [ 3215.419889] GPR16: c00000006e6e0000 0000000000000010 c000000007017fa0
> c000000007017fa4
> [ 3215.419889] GPR20: 0000000000000001 c000000007017f88 0000000000080000
> c000000007017f80
> [ 3215.419889] GPR24: c00000006e6f0010 c0000000aef32800 c00c0000001b9a2c
> c00000006e690010
> [ 3215.419889] GPR28: 0000000000000003 0000000000080020 c00000006e690010
> c00c0000001b9a00
> [ 3215.419960] NIP [c0000000008a9ff4] slab_obj_ext+0x2e0/0x338
> [ 3215.419966] LR [c0000000008a9ff0] slab_obj_ext+0x2dc/0x338
> [ 3215.419972] Call Trace:
> [ 3215.419975] [c0000001f9d37910] [c0000000008a9ff0]
> slab_obj_ext+0x2dc/0x338 (unreliable)
> [ 3215.419983] [c0000001f9d379c0] [c0000000008b9a64]
> __memcg_slab_free_hook+0x1a4/0x3dc
> [ 3215.419990] [c0000001f9d37a90] [c0000000007f8270] kfree+0x454/0x600
> [ 3215.419998] [c0000001f9d37b20] [c000000000989724]
> seq_release_private+0x98/0xd4
> [ 3215.420005] [c0000001f9d37b60] [c000000000a7adb4]
> proc_map_release+0xa4/0xe0
> [ 3215.420012] [c0000001f9d37ba0] [c00000000091edf0] __fput+0x1e8/0x5cc
> [ 3215.420019] [c0000001f9d37c20] [c000000000915670] sys_close+0x74/0xd0
> [ 3215.420025] [c0000001f9d37c50] [c00000000003aeb0]
> system_call_exception+0x1e0/0x4b0
> [ 3215.420033] [c0000001f9d37e50] [c00000000000d05c]
> system_call_vectored_common+0x15c/0x2ec
> [ 3215.420041] ---- interrupt: 3000 at 0x7fff9bd34ab4
> [ 3215.420045] NIP: 00007fff9bd34ab4 LR: 00007fff9bd34ab4 CTR:
> 0000000000000000
> [ 3215.420050] REGS: c0000001f9d37e80 TRAP: 3000 Tainted: G W
> (7.0.0-rc1+)
> [ 3215.420054] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>
> CR: 44002402 XER: 00000000
> [ 3215.420077] IRQMASK: 0
> [ 3215.420077] GPR00: 0000000000000006 00007fffe2939800 00007fff9bf37f00
> 0000000000000003
> [ 3215.420077] GPR04: 00007fff9bfe077f 000000000000f881 00007fffe2939820
> 000000000000f881
> [ 3215.420077] GPR08: 000000000000077f 0000000000000000 0000000000000000
> 0000000000000000
> [ 3215.420077] GPR12: 0000000000000000 00007fff9c0ab0e0 0000000000000000
> 0000000000000000
> [ 3215.420077] GPR16: 0000000000000000 00000001235700f0 0000000000000100
> 0000000000000001
> [ 3215.420077] GPR20: 00000000ffffffff 00000001235702ef 0000000000000000
> fffffffffffffffd
> [ 3215.420077] GPR24: 00007fffe2939890 0000000000000000 00007fffe2939978
> 00007fff9bf12a88
> [ 3215.420077] GPR28: 00007fffe2939974 0000000000010000 0000000000000003
> 0000000000010000
> [ 3215.420144] NIP [00007fff9bd34ab4] 0x7fff9bd34ab4
> [ 3215.420148] LR [00007fff9bd34ab4] 0x7fff9bd34ab4
> [ 3215.420151] ---- interrupt: 3000
> [ 3215.420154] Code: 4e800020 60000000 60000000 7f18e1d6 7b180020 7f18f214
> 7c3bc000 4182febc 3c62ff7a 386336c0 4b9120a9 60000000 <0fe00000> eac10060
> 4bffff58 3d200001
> [ 3215.420183] ---[ end trace 0000000000000000 ]---
>
> Regards,
>
> Venkat.
--
Cheers,
Harry / Hyeonggon
* Re: [PATCH] mm/slab: a debug patch to investigate the issue further
2026-02-27 8:11 ` Harry Yoo
@ 2026-02-27 9:36 ` Venkat Rao Bagalkote
0 siblings, 0 replies; 14+ messages in thread
From: Venkat Rao Bagalkote @ 2026-02-27 9:36 UTC (permalink / raw)
To: Harry Yoo
Cc: akpm, ast, cgroups, cl, hannes, hao.li, linux-mm, mhocko,
muchun.song, rientjes, roman.gushchin, shakeel.butt, surenb,
vbabka
On 27/02/26 1:41 pm, Harry Yoo wrote:
> On Fri, Feb 27, 2026 at 01:32:29PM +0530, Venkat Rao Bagalkote wrote:
>> On 27/02/26 8:37 am, Harry Yoo wrote:
>>> Hi Venkat, could you please help testing this patch and
>>> check if it hits any warning? It's based on v7.0-rc1 tag.
>>>
>>> This (hopefully) should give us more information
>>> that would help debug the issue.
>>>
>>> 1. set stride early in alloc_slab_obj_exts_early()
>>> 2. move some obj_exts helpers to slab.h
>>> 3. in slab_obj_ext(), check three things:
>>> 3-1. is the obj_ext address the right one for this object?
>>> 3-2. does the obj_ext address change after smp_rmb()?
>>> 3-3. does obj_ext->objcg change after smp_rmb()?
>>>
>>> No smp_wmb() is used, intentionally.
>>>
>>> It is expected that the issue will still reproduce.
>>>
>>> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
>>
>> Hello Harry,
> Hello Venkat!
>
>> I’ve restarted the test
> Thanks :)
>
>> but there are continuous warning prints in the
>> logs, and they appear to be slowing down the test run significantly.
> It's okay! The purpose of this patch is to see whether any warning is
> hit, rather than to trigger a kernel crash.
>
>> Warnings:
>>
>> [ 3215.419760] obj_ext in object
> The patch adds five different warnings:
>
> 1) "obj_exts array in leftover"
> 2) "obj_ext in object"
> 3) "obj_exts array allocated from slab"
> 4) "obj_ext pointer has changed"
> 5) "obj_ext->objcg has changed"
>
> Is 2) the only warning that is triggered?
>
> Also, the warning below says it's triggered by proc_map_release().
>
> Are there any other call stacks, or is this the only caller that hits
> this warning?
I’m continuing to see only warning (2) – “obj_ext in object”, but it is
being triggered from multiple different callers.
So far I have observed the warning originating from the following call
paths:
kfree → seq_release_private → proc_map_release → __fput
kfree → seq_release_private → mounts_release → __fput
__memcg_slab_post_alloc_hook → __kvmalloc_node_noprof → seq_read_iter →
vfs_read
There are many other WARN splats in the logs due to repeated hits, but
the only warning string I’ve seen is (2) “obj_ext in object”, just
triggered from different code paths.
Regards,
Venkat.
>
> Thanks!
>
>> [ 3215.419774] WARNING: mm/slab.h:710 at slab_obj_ext+0x2e0/0x338, CPU#26:
>> grep/103571 >
>> [ 3215.419783] Modules linked in: xfs loop dm_mod bonding tls rfkill
>> nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
>> nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
>> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc
>> pseries_rng vmx_crypto dax_pmem fuse ext4 crc16 mbcache jbd2 sd_mod nd_pmem
>> sg papr_scm libnvdimm ibmvscsi ibmveth scsi_transport_srp pseries_wdt
>> [ 3215.419852] CPU: 26 UID: 0 PID: 103571 Comm: grep Kdump: loaded Tainted:
>> G W 7.0.0-rc1+ #3 PREEMPTLAZY
>> [ 3215.419859] Tainted: [W]=WARN
>> [ 3215.419862] Hardware name: IBM,9080-HEX Power11 (architected) 0x820200
>> 0xf000007 of:IBM,FW1110.01 (NH1110_069) hv:phyp pSeries
>> [ 3215.419866] NIP: c0000000008a9ff4 LR: c0000000008a9ff0 CTR:
>> 0000000000000000
>> [ 3215.419870] REGS: c0000001f9d37670 TRAP: 0700 Tainted: G W
>> (7.0.0-rc1+)
>> [ 3215.419874] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 24002404
>> XER: 20040000
>> [ 3215.419889] CFAR: c0000000001bc194 IRQMASK: 0
>> [ 3215.419889] GPR00: c0000000008a9ff0 c0000001f9d37910 c00000000243a500
>> c000000127e8d600
>> [ 3215.419889] GPR04: 0000000000000004 0000000000000001 c0000000001bc164
>> 0000000000000001
>> [ 3215.419889] GPR08: a80e000000000000 0000000000000000 0000000000000007
>> a80e000000000000
>> [ 3215.419889] GPR12: c00e0001a1a48fb2 c000000d0dde7f00 c000000004e49960
>> 0000000000000001
>> [ 3215.419889] GPR16: c00000006e6e0000 0000000000000010 c000000007017fa0
>> c000000007017fa4
>> [ 3215.419889] GPR20: 0000000000000001 c000000007017f88 0000000000080000
>> c000000007017f80
>> [ 3215.419889] GPR24: c00000006e6f0010 c0000000aef32800 c00c0000001b9a2c
>> c00000006e690010
>> [ 3215.419889] GPR28: 0000000000000003 0000000000080020 c00000006e690010
>> c00c0000001b9a00
>> [ 3215.419960] NIP [c0000000008a9ff4] slab_obj_ext+0x2e0/0x338
>> [ 3215.419966] LR [c0000000008a9ff0] slab_obj_ext+0x2dc/0x338
>> [ 3215.419972] Call Trace:
>> [ 3215.419975] [c0000001f9d37910] [c0000000008a9ff0]
>> slab_obj_ext+0x2dc/0x338 (unreliable)
>> [ 3215.419983] [c0000001f9d379c0] [c0000000008b9a64]
>> __memcg_slab_free_hook+0x1a4/0x3dc
>> [ 3215.419990] [c0000001f9d37a90] [c0000000007f8270] kfree+0x454/0x600
>> [ 3215.419998] [c0000001f9d37b20] [c000000000989724]
>> seq_release_private+0x98/0xd4
>> [ 3215.420005] [c0000001f9d37b60] [c000000000a7adb4]
>> proc_map_release+0xa4/0xe0
>> [ 3215.420012] [c0000001f9d37ba0] [c00000000091edf0] __fput+0x1e8/0x5cc
>> [ 3215.420019] [c0000001f9d37c20] [c000000000915670] sys_close+0x74/0xd0
>> [ 3215.420025] [c0000001f9d37c50] [c00000000003aeb0]
>> system_call_exception+0x1e0/0x4b0
>> [ 3215.420033] [c0000001f9d37e50] [c00000000000d05c]
>> system_call_vectored_common+0x15c/0x2ec
>> [ 3215.420041] ---- interrupt: 3000 at 0x7fff9bd34ab4
>> [ 3215.420045] NIP: 00007fff9bd34ab4 LR: 00007fff9bd34ab4 CTR:
>> 0000000000000000
>> [ 3215.420050] REGS: c0000001f9d37e80 TRAP: 3000 Tainted: G W
>> (7.0.0-rc1+)
>> [ 3215.420054] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>
>> CR: 44002402 XER: 00000000
>> [ 3215.420077] IRQMASK: 0
>> [ 3215.420077] GPR00: 0000000000000006 00007fffe2939800 00007fff9bf37f00
>> 0000000000000003
>> [ 3215.420077] GPR04: 00007fff9bfe077f 000000000000f881 00007fffe2939820
>> 000000000000f881
>> [ 3215.420077] GPR08: 000000000000077f 0000000000000000 0000000000000000
>> 0000000000000000
>> [ 3215.420077] GPR12: 0000000000000000 00007fff9c0ab0e0 0000000000000000
>> 0000000000000000
>> [ 3215.420077] GPR16: 0000000000000000 00000001235700f0 0000000000000100
>> 0000000000000001
>> [ 3215.420077] GPR20: 00000000ffffffff 00000001235702ef 0000000000000000
>> fffffffffffffffd
>> [ 3215.420077] GPR24: 00007fffe2939890 0000000000000000 00007fffe2939978
>> 00007fff9bf12a88
>> [ 3215.420077] GPR28: 00007fffe2939974 0000000000010000 0000000000000003
>> 0000000000010000
>> [ 3215.420144] NIP [00007fff9bd34ab4] 0x7fff9bd34ab4
>> [ 3215.420148] LR [00007fff9bd34ab4] 0x7fff9bd34ab4
>> [ 3215.420151] ---- interrupt: 3000
>> [ 3215.420154] Code: 4e800020 60000000 60000000 7f18e1d6 7b180020 7f18f214
>> 7c3bc000 4182febc 3c62ff7a 386336c0 4b9120a9 60000000 <0fe00000> eac10060
>> 4bffff58 3d200001
>> [ 3215.420183] ---[ end trace 0000000000000000 ]---
>>
>> Regards,
>>
>> Venkat.
end of thread, other threads:[~2026-02-27 9:36 UTC | newest]
Thread overview: 14+ messages
-- links below jump to the message on this page --
2026-02-23 7:58 [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues Harry Yoo
2026-02-23 11:44 ` Harry Yoo
2026-02-23 17:04 ` Vlastimil Babka
2026-02-23 20:23 ` Shakeel Butt
2026-02-24 9:04 ` Venkat Rao Bagalkote
2026-02-24 11:10 ` Harry Yoo
2026-02-25 9:14 ` Venkat Rao Bagalkote
2026-02-25 10:15 ` Harry Yoo
2026-02-27 3:07 ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
2026-02-27 5:52 ` kernel test robot
2026-02-27 6:02 ` kernel test robot
2026-02-27 8:02 ` Venkat Rao Bagalkote
2026-02-27 8:11 ` Harry Yoo
2026-02-27 9:36 ` Venkat Rao Bagalkote