linux-mm.kvack.org archive mirror
* [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
@ 2026-02-23  7:58 Harry Yoo
  2026-02-23 11:44 ` Harry Yoo
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Harry Yoo @ 2026-02-23  7:58 UTC (permalink / raw)
  To: Vlastimil Babka, Andrew Morton
  Cc: Christoph Lameter, David Rientjes, Roman Gushchin, Harry Yoo,
	Alexei Starovoitov, Hao Li, Suren Baghdasaryan, Shakeel Butt,
	Muchun Song, Johannes Weiner, Michal Hocko, cgroups, linux-mm,
	Venkat Rao Bagalkote

When alloc_slab_obj_exts() is called later in time (instead of at slab
allocation & initialization step), slab->stride and slab->obj_exts are
set when the slab is already accessible by multiple CPUs.

The current implementation does not enforce memory ordering between
slab->stride and slab->obj_exts. However, for correctness, slab->stride
must be visible before slab->obj_exts, otherwise concurrent readers
may observe slab->obj_exts as non-zero while stride is still stale,
leading to incorrect reference counting of object cgroups.

There has been a bug report [1] that showed symptoms of incorrect
reference counting of object cgroups, which could be triggered by
this memory ordering issue.

Fix this by unconditionally initializing slab->stride in
alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.

This ensures stride is set before the slab becomes visible to
other CPUs via the per-node partial slab list (protected by spinlock
with acquire/release semantics), preventing them from observing
inconsistent stride value.

Thanks to Shakeel Butt for pointing out this issue [2].

Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---

I tested this patch, but I could not confirm that it actually fixes
the issue reported in [1]. It would be nice if Venkat could help
confirm, though it may be challenging to reproduce reliably...

Since this fix logically makes sense, it is worth applying anyway.

 mm/slub.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 18c30872d196..afa98065d74f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2196,7 +2196,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 retry:
 	old_exts = READ_ONCE(slab->obj_exts);
 	handle_failed_objexts_alloc(old_exts, vec, objects);
-	slab_set_stride(slab, sizeof(struct slabobj_ext));
 
 	if (new_slab) {
 		/*
@@ -2272,6 +2271,9 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
 	void *addr;
 	unsigned long obj_exts;
 
+	/* Initialize stride early to avoid memory ordering issues */
+	slab_set_stride(slab, sizeof(struct slabobj_ext));
+
 	if (!need_slab_obj_exts(s))
 		return;
 
@@ -2288,7 +2290,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
 		obj_exts |= MEMCG_DATA_OBJEXTS;
 #endif
 		slab->obj_exts = obj_exts;
-		slab_set_stride(slab, sizeof(struct slabobj_ext));
 	} else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
 		unsigned int offset = obj_exts_offset_in_object(s);
 
-- 
2.43.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
  2026-02-23  7:58 [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues Harry Yoo
@ 2026-02-23 11:44 ` Harry Yoo
  2026-02-23 17:04   ` Vlastimil Babka
  2026-02-23 20:23 ` Shakeel Butt
  2026-02-24  9:04 ` Venkat Rao Bagalkote
  2 siblings, 1 reply; 14+ messages in thread
From: Harry Yoo @ 2026-02-23 11:44 UTC (permalink / raw)
  To: Vlastimil Babka, Andrew Morton
  Cc: Christoph Lameter, David Rientjes, Roman Gushchin, Hao Li,
	Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
	Michal Hocko, cgroups, linux-mm, Venkat Rao Bagalkote

On Mon, Feb 23, 2026 at 04:58:09PM +0900, Harry Yoo wrote:
> When alloc_slab_obj_exts() is called later in time (instead of at slab
> allocation & initialization step), slab->stride and slab->obj_exts are
> set when the slab is already accessible by multiple CPUs.
> 
> The current implementation does not enforce memory ordering between
> slab->stride and slab->obj_exts. However, for correctness, slab->stride
> must be visible before slab->obj_exts, otherwise concurrent readers
> may observe slab->obj_exts as non-zero while stride is still stale,
> leading to incorrect reference counting of object cgroups.
> 
> There has been a bug report [1] that showed symptoms of incorrect
> reference counting of object cgroups, which could be triggered by
> this memory ordering issue.
> 
> Fix this by unconditionally initializing slab->stride in
> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
> 
> This ensures stride is set before the slab becomes visible to
> other CPUs via the per-node partial slab list (protected by spinlock
> with acquire/release semantics), preventing them from observing
> inconsistent stride value.
> 
> Thanks to Shakeel Butt for pointing out this issue [2].
> 
> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---

Vlastimil, could you please update the changelog when applying this
to the tree? I think this also explains [3] (thanks for raising it
off-list, Vlastimil!):

When alloc_slab_obj_exts() is called later (instead of during slab
allocation and initialization), slab->stride and slab->obj_exts are
updated after the slab is already accessible by multiple CPUs.

The current implementation does not enforce memory ordering between
slab->stride and slab->obj_exts. For correctness, slab->stride must be
visible before slab->obj_exts. Otherwise, concurrent readers may observe
slab->obj_exts as non-zero while stride is still stale.

With stale slab->stride, slab_obj_ext() could return the wrong obj_ext.
This could cause two problems:

  - obj_cgroup_put() is called on the wrong objcg, leading to
    a use-after-free due to incorrect reference counting [1] by
    decrementing the reference count more than it was incremented.

  - refill_obj_stock() is called on the wrong objcg, leading to
    a page_counter overflow [2] by uncharging more memory than charged.

Fix this by unconditionally initializing slab->stride in
alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
In the case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the function.

This ensures updates to slab->stride become visible before the slab
can be accessed by other CPUs via the per-node partial slab list
(protected by spinlock with acquire/release semantics).

Thanks to Shakeel Butt for pointing out this issue [3].

Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
Closes: https://lore.kernel.org/all/ddff7c7d-c0c3-4780-808f-9a83268bbf0c@linux.ibm.com [2]
Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [3]
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>

-- 
Cheers,
Harry / Hyeonggon



* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
  2026-02-23 11:44 ` Harry Yoo
@ 2026-02-23 17:04   ` Vlastimil Babka
  0 siblings, 0 replies; 14+ messages in thread
From: Vlastimil Babka @ 2026-02-23 17:04 UTC (permalink / raw)
  To: Harry Yoo, Vlastimil Babka, Andrew Morton
  Cc: Christoph Lameter, David Rientjes, Roman Gushchin, Hao Li,
	Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
	Michal Hocko, cgroups, linux-mm, Venkat Rao Bagalkote

On 2/23/26 12:44, Harry Yoo wrote:
> On Mon, Feb 23, 2026 at 04:58:09PM +0900, Harry Yoo wrote:
>> When alloc_slab_obj_exts() is called later in time (instead of at slab
>> allocation & initialization step), slab->stride and slab->obj_exts are
>> set when the slab is already accessible by multiple CPUs.
>> 
>> The current implementation does not enforce memory ordering between
>> slab->stride and slab->obj_exts. However, for correctness, slab->stride
>> must be visible before slab->obj_exts, otherwise concurrent readers
>> may observe slab->obj_exts as non-zero while stride is still stale,
>> leading to incorrect reference counting of object cgroups.
>> 
>> There has been a bug report [1] that showed symptoms of incorrect
>> reference counting of object cgroups, which could be triggered by
>> this memory ordering issue.
>> 
>> Fix this by unconditionally initializing slab->stride in
>> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
>> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
>> 
>> This ensures stride is set before the slab becomes visible to
>> other CPUs via the per-node partial slab list (protected by spinlock
>> with acquire/release semantics), preventing them from observing
>> inconsistent stride value.
>> 
>> Thanks to Shakeel Butt for pointing out this issue [2].
>> 
>> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
>> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
>> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
>> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
>> ---
> 
> Vlastimil, could you please update the changelog when applying this
> to the tree? I think this also explains [3] (thanks for raising it
> off-list, Vlastimil!):

Done, thanks! Added to slab/for-next-fixes

> When alloc_slab_obj_exts() is called later (instead of during slab
> allocation and initialization), slab->stride and slab->obj_exts are
> updated after the slab is already accessible by multiple CPUs.
> 
> The current implementation does not enforce memory ordering between
> slab->stride and slab->obj_exts. For correctness, slab->stride must be
> visible before slab->obj_exts. Otherwise, concurrent readers may observe
> slab->obj_exts as non-zero while stride is still stale.
> 
> With stale slab->stride, slab_obj_ext() could return the wrong obj_ext.
> This could cause two problems:
> 
>   - obj_cgroup_put() is called on the wrong objcg, leading to
>     a use-after-free due to incorrect reference counting [1] by
>     decrementing the reference count more than it was incremented.
> 
>   - refill_obj_stock() is called on the wrong objcg, leading to
>     a page_counter overflow [2] by uncharging more memory than charged.
> 
> Fix this by unconditionally initializing slab->stride in
> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> In the case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the function.
> 
> This ensures updates to slab->stride become visible before the slab
> can be accessed by other CPUs via the per-node partial slab list
> (protected by spinlock with acquire/release semantics).
> 
> Thanks to Shakeel Butt for pointing out this issue [3].
> 
> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
> Closes: https://lore.kernel.org/all/ddff7c7d-c0c3-4780-808f-9a83268bbf0c@linux.ibm.com [2]
> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [3]
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>






^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
  2026-02-23  7:58 [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues Harry Yoo
  2026-02-23 11:44 ` Harry Yoo
@ 2026-02-23 20:23 ` Shakeel Butt
  2026-02-24  9:04 ` Venkat Rao Bagalkote
  2 siblings, 0 replies; 14+ messages in thread
From: Shakeel Butt @ 2026-02-23 20:23 UTC (permalink / raw)
  To: Harry Yoo
  Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
	David Rientjes, Roman Gushchin, Alexei Starovoitov, Hao Li,
	Suren Baghdasaryan, Muchun Song, Johannes Weiner, Michal Hocko,
	cgroups, linux-mm, Venkat Rao Bagalkote

On Mon, Feb 23, 2026 at 04:58:09PM +0900, Harry Yoo wrote:
> When alloc_slab_obj_exts() is called later in time (instead of at slab
> allocation & initialization step), slab->stride and slab->obj_exts are
> set when the slab is already accessible by multiple CPUs.
> 
> The current implementation does not enforce memory ordering between
> slab->stride and slab->obj_exts. However, for correctness, slab->stride
> must be visible before slab->obj_exts, otherwise concurrent readers
> may observe slab->obj_exts as non-zero while stride is still stale,
> leading to incorrect reference counting of object cgroups.
> 
> There has been a bug report [1] that showed symptoms of incorrect
> reference counting of object cgroups, which could be triggered by
> this memory ordering issue.
> 
> Fix this by unconditionally initializing slab->stride in
> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
> 
> This ensures stride is set before the slab becomes visible to
> other CPUs via the per-node partial slab list (protected by spinlock
> with acquire/release semantics), preventing them from observing
> inconsistent stride value.
> 
> Thanks to Shakeel Butt for pointing out this issue [2].
> 
> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>

Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>



* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
  2026-02-23  7:58 [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues Harry Yoo
  2026-02-23 11:44 ` Harry Yoo
  2026-02-23 20:23 ` Shakeel Butt
@ 2026-02-24  9:04 ` Venkat Rao Bagalkote
  2026-02-24 11:10   ` Harry Yoo
  2 siblings, 1 reply; 14+ messages in thread
From: Venkat Rao Bagalkote @ 2026-02-24  9:04 UTC (permalink / raw)
  To: Harry Yoo, Vlastimil Babka, Andrew Morton
  Cc: Christoph Lameter, David Rientjes, Roman Gushchin,
	Alexei Starovoitov, Hao Li, Suren Baghdasaryan, Shakeel Butt,
	Muchun Song, Johannes Weiner, Michal Hocko, cgroups, linux-mm


On 23/02/26 1:28 pm, Harry Yoo wrote:
> When alloc_slab_obj_exts() is called later in time (instead of at slab
> allocation & initialization step), slab->stride and slab->obj_exts are
> set when the slab is already accessible by multiple CPUs.
>
> The current implementation does not enforce memory ordering between
> slab->stride and slab->obj_exts. However, for correctness, slab->stride
> must be visible before slab->obj_exts, otherwise concurrent readers
> may observe slab->obj_exts as non-zero while stride is still stale,
> leading to incorrect reference counting of object cgroups.
>
> There has been a bug report [1] that showed symptoms of incorrect
> reference counting of object cgroups, which could be triggered by
> this memory ordering issue.
>
> Fix this by unconditionally initializing slab->stride in
> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
>
> This ensures stride is set before the slab becomes visible to
> other CPUs via the per-node partial slab list (protected by spinlock
> with acquire/release semantics), preventing them from observing
> inconsistent stride value.
>
> Thanks to Shakeel Butt for pointing out this issue [2].
>
> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [2]
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---
>
> I tested this patch, but I could not confirm that this actually fixes
> the issue reported by [1]. It would be nice if Venkat could help
> confirm; but perhaps it's challenging to reliably reproduce...


Thanks for the patch. I ran the complete test suite, and unfortunately
the issue is still reproducing.

I applied this patch on the mainline repo for testing.

Traces:

[ 9316.514161] BUG: Kernel NULL pointer dereference on read at 0x00000000
[ 9316.514169] Faulting instruction address: 0xc0000000008b2ff4
[ 9316.514176] Oops: Kernel access of bad area, sig: 7 [#1]
[ 9316.514182] LE PAGE_SIZE=64K MMU=Radix  SMP NR_CPUS=8192 NUMA pSeries
[ 9316.514189] Modules linked in: overlay dm_zero dm_thin_pool 
dm_persistent_data dm_bio_prison dm_snapshot dm_bufio dm_flakey xfs loop 
dm_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet 
nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat 
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set bonding nf_tables tls 
sunrpc rfkill nfnetlink pseries_rng vmx_crypto dax_pmem fuse ext4 crc16 
mbcache jbd2 nd_pmem papr_scm sd_mod libnvdimm sg ibmvscsi ibmveth 
scsi_transport_srp pseries_wdt [last unloaded: scsi_debug]
[ 9316.514295] CPU: 16 UID: 0 PID: 0 Comm: swapper/16 Kdump: loaded 
Tainted: G        W           7.0.0-rc1+ #1 PREEMPTLAZY
[ 9316.514306] Tainted: [W]=WARN
[ 9316.514311] Hardware name: IBM,9080-HEX Power11 (architected) 
0x820200 0xf000007 of:IBM,FW1110.01 (NH1110_069) hv:phyp pSeries
[ 9316.514318] NIP:  c0000000008b2ff4 LR: c0000000008b2fec CTR: 
c00000000036d680
[ 9316.514326] REGS: c000000d0dcb7870 TRAP: 0300   Tainted: G   W        
     (7.0.0-rc1+)
[ 9316.514333] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 
84042802  XER: 20040000
[ 9316.514356] CFAR: c000000000862e94 DAR: 0000000000000000 DSISR: 
00080000 IRQMASK: 0
[ 9316.514356] GPR00: c0000000008b2fec c000000d0dcb7b10 c00000000243a500 
0000000000000001
[ 9316.514356] GPR04: 0000000000000008 0000000000000001 c0000000008b2fec 
0000000000000001
[ 9316.514356] GPR08: a80e000000000000 0000000000000001 0000000000000007 
a80e000000000000
[ 9316.514356] GPR12: c00e00000e7b6cd5 c000000d0ddf4700 c000000129a98e00 
0000000000000006
[ 9316.514356] GPR16: c000000007012fa0 c000000007012fa4 c000000005160980 
c000000007012f88
[ 9316.514356] GPR20: c00c000000021bec c000000d0d07f008 0000000000000001 
ffffffffffffff78
[ 9316.514356] GPR24: 0000000000000005 c000000d0d58f180 c0000000032cf000 
c000000d0ddf4700
[ 9316.514356] GPR28: 0000000000000088 0000000000000000 c000000129a98e00 
c000000d0d07f000
[ 9316.514457] NIP [c0000000008b2ff4] refill_obj_stock+0x5b4/0x680
[ 9316.514467] LR [c0000000008b2fec] refill_obj_stock+0x5ac/0x680
[ 9316.514476] Call Trace:
[ 9316.514481] [c000000d0dcb7b10] [c0000000008b2fec] 
refill_obj_stock+0x5ac/0x680 (unreliable)
[ 9316.514494] [c000000d0dcb7b90] [c0000000008b9598] 
__memcg_slab_free_hook+0x238/0x3ec
[ 9316.514505] [c000000d0dcb7c60] [c0000000007f3d90] 
__rcu_free_sheaf_prepare+0x314/0x3e8
[ 9316.514516] [c000000d0dcb7d10] [c0000000007fc2ec] 
rcu_free_sheaf+0x38/0x170
[ 9316.514528] [c000000d0dcb7d50] [c000000000334570] 
rcu_do_batch+0x2ec/0xfa8
[ 9316.514538] [c000000d0dcb7e50] [c000000000339a08] rcu_core+0x22c/0x48c
[ 9316.514548] [c000000d0dcb7ec0] [c0000000001cfeac] 
handle_softirqs+0x1f4/0x74c
[ 9316.514559] [c000000d0dcb7fe0] [c00000000001b0cc] 
do_softirq_own_stack+0x60/0x7c
[ 9316.514570] [c0000000096c7930] [c00000000001b0b8] 
do_softirq_own_stack+0x4c/0x7c
[ 9316.514581] [c0000000096c7960] [c0000000001cf168] 
__irq_exit_rcu+0x268/0x308
[ 9316.514592] [c0000000096c79a0] [c0000000001d0be4] irq_exit+0x20/0x38
[ 9316.514602] [c0000000096c79c0] [c0000000000315f4] 
interrupt_async_exit_prepare.constprop.0+0x18/0x2c
[ 9316.514614] [c0000000096c79e0] [c000000000009ffc] 
decrementer_common_virt+0x28c/0x290
[ 9316.514626] ---- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[ 9316.514635] NIP:  c00000000012d9f0 LR: c00000000135c0a8 CTR: 
0000000000000000
[ 9316.514642] REGS: c0000000096c7a10 TRAP: 0900   Tainted: G   W        
     (7.0.0-rc1+)
[ 9316.514649] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  
CR: 24000804  XER: 00000000
[ 9316.514678] CFAR: 0000000000000000 IRQMASK: 0
[ 9316.514678] GPR00: 0000000000000000 c0000000096c7cb0 c00000000243a500 
0000000000000000
[ 9316.514678] GPR04: 0000000000000000 800400002fe6fc10 0000000000000000 
0000000000000001
[ 9316.514678] GPR08: 0000000000000030 0000000000000000 0000000000000090 
0000000000000001
[ 9316.514678] GPR12: 800400002fe6fc00 c000000d0ddf4700 0000000000000000 
000000002ef01a00
[ 9316.514678] GPR16: 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[ 9316.514678] GPR20: 0000000000000000 0000000000000000 0000000000000000 
0000000000000001
[ 9316.514678] GPR24: 0000000000000000 c000000004d7a760 000008792ad04b82 
0000000000000000
[ 9316.514678] GPR28: 0000000000000000 0000000000000001 c0000000032b18d8 
c0000000032b18e0
[ 9316.514774] NIP [c00000000012d9f0] plpar_hcall_norets_notrace+0x18/0x2c
[ 9316.514782] LR [c00000000135c0a8] cede_processor.isra.0+0x1c/0x34
[ 9316.514792] ---- interrupt: 900
[ 9316.514797] [c0000000096c7cb0] [c0000000096c7cf0] 0xc0000000096c7cf0 
(unreliable)
[ 9316.514808] [c0000000096c7d10] [c0000000019af170] 
dedicated_cede_loop+0x90/0x170
[ 9316.514819] [c0000000096c7d60] [c0000000019aeb20] 
cpuidle_enter_state+0x394/0x480
[ 9316.514830] [c0000000096c7e00] [c00000000135864c] cpuidle_enter+0x64/0x9c
[ 9316.514840] [c0000000096c7e50] [c000000000284b0c] call_cpuidle+0x7c/0xf8
[ 9316.514852] [c0000000096c7e90] [c0000000002903e8] 
cpuidle_idle_call+0x1c4/0x2b4
[ 9316.514862] [c0000000096c7f00] [c00000000029060c] do_idle+0x134/0x208
[ 9316.514872] [c0000000096c7f50] [c000000000290a5c] 
cpu_startup_entry+0x60/0x64
[ 9316.514882] [c0000000096c7f80] [c000000000074738] 
start_secondary+0x3fc/0x400
[ 9316.514894] [c0000000096c7fe0] [c00000000000e258] 
start_secondary_prolog+0x10/0x14
[ 9316.514904] Code: eba962a0 4bfffe40 60000000 387e0008 4bfae7c1 
60000000 ebbe0008 38800008 7fa3eb78 4bfafe85 60000000 39200001 
<7d40e8a8> 7d495214 7d40e9ad 40c2fff4
[ 9316.514941] ---[ end trace 0000000000000000 ]---


Regards,

Venkat.

>
> Since this fix logically makes sense, it is worth applying anyway.
>
>   mm/slub.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 18c30872d196..afa98065d74f 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2196,7 +2196,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>   retry:
>   	old_exts = READ_ONCE(slab->obj_exts);
>   	handle_failed_objexts_alloc(old_exts, vec, objects);
> -	slab_set_stride(slab, sizeof(struct slabobj_ext));
>   
>   	if (new_slab) {
>   		/*
> @@ -2272,6 +2271,9 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
>   	void *addr;
>   	unsigned long obj_exts;
>   
> +	/* Initialize stride early to avoid memory ordering issues */
> +	slab_set_stride(slab, sizeof(struct slabobj_ext));
> +
>   	if (!need_slab_obj_exts(s))
>   		return;
>   
> @@ -2288,7 +2290,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
>   		obj_exts |= MEMCG_DATA_OBJEXTS;
>   #endif
>   		slab->obj_exts = obj_exts;
> -		slab_set_stride(slab, sizeof(struct slabobj_ext));
>   	} else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
>   		unsigned int offset = obj_exts_offset_in_object(s);
>   



* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
  2026-02-24  9:04 ` Venkat Rao Bagalkote
@ 2026-02-24 11:10   ` Harry Yoo
  2026-02-25  9:14     ` Venkat Rao Bagalkote
  0 siblings, 1 reply; 14+ messages in thread
From: Harry Yoo @ 2026-02-24 11:10 UTC (permalink / raw)
  To: Venkat Rao Bagalkote
  Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
	David Rientjes, Roman Gushchin, Alexei Starovoitov, Hao Li,
	Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
	Michal Hocko, cgroups, linux-mm

On Tue, Feb 24, 2026 at 02:34:41PM +0530, Venkat Rao Bagalkote wrote:
> 
> On 23/02/26 1:28 pm, Harry Yoo wrote:
> > When alloc_slab_obj_exts() is called later in time (instead of at slab
> > allocation & initialization step), slab->stride and slab->obj_exts are
> > set when the slab is already accessible by multiple CPUs.
> > 
> > The current implementation does not enforce memory ordering between
> > slab->stride and slab->obj_exts. However, for correctness, slab->stride
> > must be visible before slab->obj_exts, otherwise concurrent readers
> > may observe slab->obj_exts as non-zero while stride is still stale,
> > leading to incorrect reference counting of object cgroups.
> > 
> > There has been a bug report [1] that showed symptoms of incorrect
> > reference counting of object cgroups, which could be triggered by
> > this memory ordering issue.
> > 
> > Fix this by unconditionally initializing slab->stride in
> > alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
> > In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
> > 
> > This ensures stride is set before the slab becomes visible to
> > other CPUs via the per-node partial slab list (protected by spinlock
> > with acquire/release semantics), preventing them from observing
> > inconsistent stride value.
> > 
> > Thanks to Shakeel Butt for pointing out this issue [2].
> > 
> > Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
> > Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> > Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com
> > Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo
> > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> > ---
> > 
> > I tested this patch, but I could not confirm that this actually fixes
> > the issue reported by [1]. It would be nice if Venkat could help
> > confirm; but perhaps it's challenging to reliably reproduce...
> 
> 
> Thanks for the patch. I ran the complete test suite, and unfortunately
> the issue is still reproducing.

Oops, thanks for confirming that it still reproduces!
That's really helpful.

Perhaps I should start considering causes other than a memory ordering
issue, but let's check one more thing before moving on. Could you
please test whether it still reproduces with the following patch?

If it's still reproducible, it should not be due to the memory ordering
issue between obj_exts and stride.

---8<---
From: Harry Yoo <harry.yoo@oracle.com>
Date: Mon, 23 Feb 2026 16:58:09 +0900
Subject: mm/slab: enforce slab->stride -> slab->obj_exts ordering

I tried to avoid unnecessary memory barriers for efficiency,
but the original bug is still reproducible.

Probably I missed a case where an object is allocated on one CPU
and then freed on a different CPU without involving a spinlock.

I'm not sure whether I missed some edge cases or whether it's caused by
something other than a memory ordering issue.

Anyway, let's find out by introducing heavy memory barriers!

Always ensure that updates to stride are visible before obj_exts.

---
 mm/slab.h |  1 +
 mm/slub.c | 10 +++++++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 71c7261bf822..aacdd9f4e509 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -565,6 +565,7 @@ static inline void slab_set_stride(struct slab *slab, unsigned short stride)
 }
 static inline unsigned short slab_get_stride(struct slab *slab)
 {
+	smp_rmb();
 	return slab->stride;
 }
 #else
diff --git a/mm/slub.c b/mm/slub.c
index 862642c165ed..c7c8b660a994 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2196,7 +2196,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 retry:
 	old_exts = READ_ONCE(slab->obj_exts);
 	handle_failed_objexts_alloc(old_exts, vec, objects);
-	slab_set_stride(slab, sizeof(struct slabobj_ext));

 	if (new_slab) {
 		/*
@@ -2272,6 +2271,10 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
 	void *addr;
 	unsigned long obj_exts;

+	slab_set_stride(slab, sizeof(struct slabobj_ext));
+	/* pairs with smp_rmb() in slab_get_stride() */
+	smp_wmb();
+
 	if (!need_slab_obj_exts(s))
 		return;

@@ -2288,7 +2291,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
 		obj_exts |= MEMCG_DATA_OBJEXTS;
 #endif
 		slab->obj_exts = obj_exts;
-		slab_set_stride(slab, sizeof(struct slabobj_ext));
 	} else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
 		unsigned int offset = obj_exts_offset_in_object(s);

@@ -2305,8 +2307,10 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
 #ifdef CONFIG_MEMCG
 		obj_exts |= MEMCG_DATA_OBJEXTS;
 #endif
-		slab->obj_exts = obj_exts;
 		slab_set_stride(slab, s->size);
+		/* pairs with smp_rmb() in slab_get_stride() */
+		smp_wmb();
+		slab->obj_exts = obj_exts;
 	}
 }

--
2.43.0

* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
  2026-02-24 11:10   ` Harry Yoo
@ 2026-02-25  9:14     ` Venkat Rao Bagalkote
  2026-02-25 10:15       ` Harry Yoo
  2026-02-27  3:07       ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
  0 siblings, 2 replies; 14+ messages in thread
From: Venkat Rao Bagalkote @ 2026-02-25  9:14 UTC (permalink / raw)
  To: Harry Yoo
  Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
	David Rientjes, Roman Gushchin, Alexei Starovoitov, Hao Li,
	Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
	Michal Hocko, cgroups, linux-mm


On 24/02/26 4:40 pm, Harry Yoo wrote:
> On Tue, Feb 24, 2026 at 02:34:41PM +0530, Venkat Rao Bagalkote wrote:
>> On 23/02/26 1:28 pm, Harry Yoo wrote:
>>> When alloc_slab_obj_exts() is called later in time (instead of at slab
>>> allocation & initialization step), slab->stride and slab->obj_exts are
>>> set when the slab is already accessible by multiple CPUs.
>>>
>>> The current implementation does not enforce memory ordering between
>>> slab->stride and slab->obj_exts. However, for correctness, slab->stride
>>> must be visible before slab->obj_exts, otherwise concurrent readers
>>> may observe slab->obj_exts as non-zero while stride is still stale,
>>> leading to incorrect reference counting of object cgroups.
>>>
>>> There has been a bug report [1] that showed symptoms of incorrect
>>> reference counting of object cgroups, which could be triggered by
>>> this memory ordering issue.
>>>
>>> Fix this by unconditionally initializing slab->stride in
>>> alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
>>> In case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the same function.
>>>
>>> This ensures stride is set before the slab becomes visible to
>>> other CPUs via the per-node partial slab list (protected by spinlock
>>> with acquire/release semantics), preventing them from observing
>>> inconsistent stride value.
>>>
>>> Thanks to Shakeel Butt for pointing out this issue [2].
>>>
>>> Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
>>> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>>> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com
>>> Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo
>>> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
>>> ---
>>>
>>> I tested this patch, but I could not confirm that this actually fixes
>>> the issue reported by [1]. It would be nice if Venkat could help
>>> confirm, but perhaps it's challenging to reliably reproduce...
>>
>> Thanks for the patch. I ran the complete test suite, and unfortunately
>> the issue is still reproducing.
> Oops, thanks for confirming that it's still reproduced!
> That's really helpful.
>
> Perhaps I should start considering cases where it's not a memory
> ordering issue, but let's check one more thing before moving on.
> could you please test if it still reproduces with the following patch?
>
> If it's still reproducible, it should not be due to the memory ordering
> issue between obj_exts and stride.
>
> ---8<---
> From: Harry Yoo <harry.yoo@oracle.com>
> Date: Mon, 23 Feb 2026 16:58:09 +0900
> Subject: mm/slab: enforce slab->stride -> slab->obj_exts ordering
>
> I tried to avoid unnecessary memory barriers for efficiency,
> but the original bug is still reproducible.
>
> Probably I missed a case where an object is allocated on a CPU
> and then freed on a different CPU without involving spinlock.
>
> I'm not sure whether I missed an edge case or whether it's caused by
> something other than a memory ordering issue.
>
> Anyway, let's find out by introducing heavy memory barriers!
>
> Always ensure that updates to stride are visible before obj_exts.
>
> ---
>   mm/slab.h |  1 +
>   mm/slub.c | 10 +++++++---
>   2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 71c7261bf822..aacdd9f4e509 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -565,6 +565,7 @@ static inline void slab_set_stride(struct slab *slab, unsigned short stride)
>   }
>   static inline unsigned short slab_get_stride(struct slab *slab)
>   {
> +	smp_rmb();
>   	return slab->stride;
>   }
>   #else
> diff --git a/mm/slub.c b/mm/slub.c
> index 862642c165ed..c7c8b660a994 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2196,7 +2196,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>   retry:
>   	old_exts = READ_ONCE(slab->obj_exts);
>   	handle_failed_objexts_alloc(old_exts, vec, objects);
> -	slab_set_stride(slab, sizeof(struct slabobj_ext));
>
>   	if (new_slab) {
>   		/*
> @@ -2272,6 +2271,10 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
>   	void *addr;
>   	unsigned long obj_exts;
>
> +	slab_set_stride(slab, sizeof(struct slabobj_ext));
> +	/* pairs with smp_rmb() in slab_get_stride() */
> +	smp_wmb();
> +
>   	if (!need_slab_obj_exts(s))
>   		return;
>
> @@ -2288,7 +2291,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
>   		obj_exts |= MEMCG_DATA_OBJEXTS;
>   #endif
>   		slab->obj_exts = obj_exts;
> -		slab_set_stride(slab, sizeof(struct slabobj_ext));
>   	} else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
>   		unsigned int offset = obj_exts_offset_in_object(s);
>
> @@ -2305,8 +2307,10 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
>   #ifdef CONFIG_MEMCG
>   		obj_exts |= MEMCG_DATA_OBJEXTS;
>   #endif
> -		slab->obj_exts = obj_exts;
>   		slab_set_stride(slab, s->size);
> +		/* pairs with smp_rmb() in slab_get_stride() */
> +		smp_wmb();
> +		slab->obj_exts = obj_exts;
>   	}
>   }
>
> --
> 2.43.0
>

With this patch, the issue is not reproduced. So it looks good.


Regards,

Venkat.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues
  2026-02-25  9:14     ` Venkat Rao Bagalkote
@ 2026-02-25 10:15       ` Harry Yoo
  2026-02-27  3:07       ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
  1 sibling, 0 replies; 14+ messages in thread
From: Harry Yoo @ 2026-02-25 10:15 UTC (permalink / raw)
  To: Venkat Rao Bagalkote
  Cc: Vlastimil Babka, Andrew Morton, Christoph Lameter,
	David Rientjes, Roman Gushchin, Alexei Starovoitov, Hao Li,
	Suren Baghdasaryan, Shakeel Butt, Muchun Song, Johannes Weiner,
	Michal Hocko, cgroups, linux-mm

On Wed, Feb 25, 2026 at 02:44:24PM +0530, Venkat Rao Bagalkote wrote:
> > > Thanks for the patch. I ran the complete test suite, and unfortunately
> > > the issue is still reproducing.
>
> > Oops, thanks for confirming that it's still reproduced!
> > That's really helpful.
> > 
> > Perhaps I should start considering cases where it's not a memory
> > ordering issue, but let's check one more thing before moving on.
> > could you please test if it still reproduces with the following patch?
> > 
> > If it's still reproducible, it should not be due to the memory ordering
> > issue between obj_exts and stride.
> > 
> > ---8<---
> > From: Harry Yoo <harry.yoo@oracle.com>
> > Date: Mon, 23 Feb 2026 16:58:09 +0900
> > Subject: mm/slab: enforce slab->stride -> slab->obj_exts ordering
> > 
> > I tried to avoid unnecessary memory barriers for efficiency,
> > but the original bug is still reproducible.
> > 
> > Probably I missed a case where an object is allocated on a CPU
> > and then freed on a different CPU without involving spinlock.
> > 
> > I'm not sure whether I missed an edge case or whether it's caused by
> > something other than a memory ordering issue.
> > 
> > Anyway, let's find out by introducing heavy memory barriers!
> > 
> > Always ensure that updates to stride are visible before obj_exts.
> > 
> > ---

[...]

> With this patch, the issue is not reproduced. So it looks good.

Thanks a lot, Venkat! That's really helpful.

I think that's enough signal to assume that memory ordering is playing
a role here, unless it happens to be masking another issue.
Even so, it's important to enforce the ordering anyway.

But having smp_load_acquire() on every alloc/free fastpath doesn't
sound great to me. Let me think a bit about it and come up with
a reasonable solution (this time, hopefully no hole in the ordering).

Since it's a bug, I'm working on it with high priority.

Again, thanks a lot for testing!

-- 
Cheers,
Harry / Hyeonggon


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] mm/slab: a debug patch to investigate the issue further
  2026-02-25  9:14     ` Venkat Rao Bagalkote
  2026-02-25 10:15       ` Harry Yoo
@ 2026-02-27  3:07       ` Harry Yoo
  2026-02-27  5:52         ` kernel test robot
                           ` (2 more replies)
  1 sibling, 3 replies; 14+ messages in thread
From: Harry Yoo @ 2026-02-27  3:07 UTC (permalink / raw)
  To: venkat88
  Cc: akpm, ast, cgroups, cl, hannes, hao.li, harry.yoo, linux-mm,
	mhocko, muchun.song, rientjes, roman.gushchin, shakeel.butt,
	surenb, vbabka

Hi Venkat, could you please help test this patch and
check if it hits any warning? It's based on the v7.0-rc1 tag.

This (hopefully) should give us more information
that would help debug the issue.

1. set stride early in alloc_slab_obj_exts_early()
2. move some obj_exts helpers to slab.h
3. in slab_obj_ext(), check three things:
   3-1. is the obj_ext address the right one for this object?
   3-2. does the obj_ext address change after smp_rmb()?
   3-3. does obj_ext->objcg change after smp_rmb()?

No smp_wmb() is used, intentionally.

It is expected that the issue will still reproduce.

Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---
 mm/slab.h | 131 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 mm/slub.c | 100 ++---------------------------------------
 2 files changed, 130 insertions(+), 101 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 71c7261bf822..d1e44cd01ea1 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -578,6 +578,101 @@ static inline unsigned short slab_get_stride(struct slab *slab)
 }
 #endif
 
+#ifdef CONFIG_SLAB_OBJ_EXT
+
+/*
+ * Check if memory cgroup or memory allocation profiling is enabled.
+ * If enabled, SLUB tries to reduce memory overhead of accounting
+ * slab objects. If neither is enabled when this function is called,
+ * the optimization is simply skipped to avoid affecting caches that do not
+ * need slabobj_ext metadata.
+ *
+ * However, this may disable optimization when memory cgroup or memory
+ * allocation profiling is used, but slabs are created too early
+ * even before those subsystems are initialized.
+ */
+static inline bool need_slab_obj_exts(struct kmem_cache *s)
+{
+	if (s->flags & SLAB_NO_OBJ_EXT)
+		return false;
+
+	if (memcg_kmem_online() && (s->flags & SLAB_ACCOUNT))
+		return true;
+
+	if (mem_alloc_profiling_enabled())
+		return true;
+
+	return false;
+}
+
+static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
+{
+	return sizeof(struct slabobj_ext) * slab->objects;
+}
+
+static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
+						    struct slab *slab)
+{
+	unsigned long objext_offset;
+
+	objext_offset = s->size * slab->objects;
+	objext_offset = ALIGN(objext_offset, sizeof(struct slabobj_ext));
+	return objext_offset;
+}
+
+static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
+						     struct slab *slab)
+{
+	unsigned long objext_offset = obj_exts_offset_in_slab(s, slab);
+	unsigned long objext_size = obj_exts_size_in_slab(slab);
+
+	return objext_offset + objext_size <= slab_size(slab);
+}
+
+static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
+{
+	unsigned long obj_exts;
+	unsigned long start;
+	unsigned long end;
+
+	obj_exts = slab_obj_exts(slab);
+	if (!obj_exts)
+		return false;
+
+	start = (unsigned long)slab_address(slab);
+	end = start + slab_size(slab);
+	return (obj_exts >= start) && (obj_exts < end);
+}
+#else
+static inline bool need_slab_obj_exts(struct kmem_cache *s)
+{
+	return false;
+}
+
+static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
+{
+	return 0;
+}
+
+static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
+						    struct slab *slab)
+{
+	return 0;
+}
+
+static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
+						     struct slab *slab)
+{
+	return false;
+}
+
+static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
+{
+	return false;
+}
+
+#endif
+
 /*
  * slab_obj_ext - get the pointer to the slab object extension metadata
  * associated with an object in a slab.
@@ -592,13 +687,41 @@ static inline struct slabobj_ext *slab_obj_ext(struct slab *slab,
 					       unsigned long obj_exts,
 					       unsigned int index)
 {
-	struct slabobj_ext *obj_ext;
+	struct slabobj_ext *ext_before;
+	struct slabobj_ext *ext_after;
+	struct obj_cgroup *objcg_before;
+	struct obj_cgroup *objcg_after;
 
 	VM_WARN_ON_ONCE(obj_exts != slab_obj_exts(slab));
 
-	obj_ext = (struct slabobj_ext *)(obj_exts +
-					 slab_get_stride(slab) * index);
-	return kasan_reset_tag(obj_ext);
+	ext_before = (struct slabobj_ext *)(obj_exts +
+					    slab_get_stride(slab) * index);
+	objcg_before = ext_before->objcg;
+	// re-read things after rmb
+	smp_rmb();
+	// is ext_before the right obj_ext for this object?
+	if (obj_exts_in_slab(slab->slab_cache, slab)) {
+		struct kmem_cache *s = slab->slab_cache;
+
+		if (obj_exts_fit_within_slab_leftover(s, slab))
+			WARN(ext_before != (struct slabobj_ext *)(obj_exts + sizeof(struct slabobj_ext) * index),
+			     "obj_exts array in leftover");
+		else
+			WARN(ext_before != (struct slabobj_ext *)(obj_exts + s->size * index),
+			     "obj_ext in object");
+
+	} else {
+		WARN(ext_before != (struct slabobj_ext *)(obj_exts + sizeof(struct slabobj_ext) * index),
+		     "obj_exts array allocated from slab");
+	}
+
+	ext_after = (struct slabobj_ext *)(obj_exts +
+					   slab_get_stride(slab) * index);
+	objcg_after = ext_after->objcg;
+
+	WARN(ext_before != ext_after, "obj_ext pointer has changed");
+	WARN(objcg_before != objcg_after, "obj_ext->objcg has changed");
+	return kasan_reset_tag(ext_before);
 }
 
 int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
diff --git a/mm/slub.c b/mm/slub.c
index 862642c165ed..8eb64534370e 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -757,101 +757,6 @@ static inline unsigned long get_orig_size(struct kmem_cache *s, void *object)
 	return *(unsigned long *)p;
 }
 
-#ifdef CONFIG_SLAB_OBJ_EXT
-
-/*
- * Check if memory cgroup or memory allocation profiling is enabled.
- * If enabled, SLUB tries to reduce memory overhead of accounting
- * slab objects. If neither is enabled when this function is called,
- * the optimization is simply skipped to avoid affecting caches that do not
- * need slabobj_ext metadata.
- *
- * However, this may disable optimization when memory cgroup or memory
- * allocation profiling is used, but slabs are created too early
- * even before those subsystems are initialized.
- */
-static inline bool need_slab_obj_exts(struct kmem_cache *s)
-{
-	if (s->flags & SLAB_NO_OBJ_EXT)
-		return false;
-
-	if (memcg_kmem_online() && (s->flags & SLAB_ACCOUNT))
-		return true;
-
-	if (mem_alloc_profiling_enabled())
-		return true;
-
-	return false;
-}
-
-static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
-{
-	return sizeof(struct slabobj_ext) * slab->objects;
-}
-
-static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
-						    struct slab *slab)
-{
-	unsigned long objext_offset;
-
-	objext_offset = s->size * slab->objects;
-	objext_offset = ALIGN(objext_offset, sizeof(struct slabobj_ext));
-	return objext_offset;
-}
-
-static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
-						     struct slab *slab)
-{
-	unsigned long objext_offset = obj_exts_offset_in_slab(s, slab);
-	unsigned long objext_size = obj_exts_size_in_slab(slab);
-
-	return objext_offset + objext_size <= slab_size(slab);
-}
-
-static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
-{
-	unsigned long obj_exts;
-	unsigned long start;
-	unsigned long end;
-
-	obj_exts = slab_obj_exts(slab);
-	if (!obj_exts)
-		return false;
-
-	start = (unsigned long)slab_address(slab);
-	end = start + slab_size(slab);
-	return (obj_exts >= start) && (obj_exts < end);
-}
-#else
-static inline bool need_slab_obj_exts(struct kmem_cache *s)
-{
-	return false;
-}
-
-static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
-{
-	return 0;
-}
-
-static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
-						    struct slab *slab)
-{
-	return 0;
-}
-
-static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
-						     struct slab *slab)
-{
-	return false;
-}
-
-static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
-{
-	return false;
-}
-
-#endif
-
 #if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
 static bool obj_exts_in_object(struct kmem_cache *s, struct slab *slab)
 {
@@ -2196,7 +2101,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 retry:
 	old_exts = READ_ONCE(slab->obj_exts);
 	handle_failed_objexts_alloc(old_exts, vec, objects);
-	slab_set_stride(slab, sizeof(struct slabobj_ext));
 
 	if (new_slab) {
 		/*
@@ -2272,6 +2176,9 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
 	void *addr;
 	unsigned long obj_exts;
 
+	/* Initialize stride early to avoid memory ordering issues */
+	slab_set_stride(slab, sizeof(struct slabobj_ext));
+
 	if (!need_slab_obj_exts(s))
 		return;
 
@@ -2288,7 +2195,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
 		obj_exts |= MEMCG_DATA_OBJEXTS;
 #endif
 		slab->obj_exts = obj_exts;
-		slab_set_stride(slab, sizeof(struct slabobj_ext));
 	} else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
 		unsigned int offset = obj_exts_offset_in_object(s);
 
-- 
2.43.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] mm/slab: a debug patch to investigate the issue further
  2026-02-27  3:07       ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
@ 2026-02-27  5:52         ` kernel test robot
  2026-02-27  6:02         ` kernel test robot
  2026-02-27  8:02         ` Venkat Rao Bagalkote
  2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2026-02-27  5:52 UTC (permalink / raw)
  To: Harry Yoo, venkat88
  Cc: llvm, oe-kbuild-all, akpm, ast, cgroups, cl, hannes, hao.li,
	harry.yoo, linux-mm, mhocko, muchun.song, rientjes,
	roman.gushchin, shakeel.butt, surenb, vbabka

Hi Harry,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Harry-Yoo/mm-slab-a-debug-patch-to-investigate-the-issue-further/20260227-111246
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20260227030733.9517-1-harry.yoo%40oracle.com
patch subject: [PATCH] mm/slab: a debug patch to investigate the issue further
config: x86_64-allnoconfig (https://download.01.org/0day-ci/archive/20260227/202602271320.ywOCYQx4-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260227/202602271320.ywOCYQx4-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602271320.ywOCYQx4-lkp@intel.com/

All errors (new ones prefixed by >>):

>> mm/slub.c:1330:6: error: call to undeclared function 'obj_exts_in_slab'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
    1330 |         if (obj_exts_in_slab(s, slab) && !obj_exts_in_object(s, slab)) {
         |             ^
>> mm/slub.c:1332:16: error: call to undeclared function 'obj_exts_offset_in_slab'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
    1332 |                 remainder -= obj_exts_offset_in_slab(s, slab);
         |                              ^
   mm/slub.c:1332:16: note: did you mean 'obj_exts_offset_in_object'?
   mm/slub.c:793:28: note: 'obj_exts_offset_in_object' declared here
     793 | static inline unsigned int obj_exts_offset_in_object(struct kmem_cache *s)
         |                            ^
>> mm/slub.c:1333:16: error: call to undeclared function 'obj_exts_size_in_slab'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
    1333 |                 remainder -= obj_exts_size_in_slab(slab);
         |                              ^
   3 errors generated.


vim +/obj_exts_in_slab +1330 mm/slub.c

81819f0fc8285a Christoph Lameter          2007-05-06  1311  
39b264641a0c3b Christoph Lameter          2008-04-14  1312  /* Check the pad bytes at the end of a slab page */
adea9876180664 Ilya Leoshkevich           2024-06-21  1313  static pad_check_attributes void
adea9876180664 Ilya Leoshkevich           2024-06-21  1314  slab_pad_check(struct kmem_cache *s, struct slab *slab)
81819f0fc8285a Christoph Lameter          2007-05-06  1315  {
2492268472e7d3 Christoph Lameter          2007-07-17  1316  	u8 *start;
2492268472e7d3 Christoph Lameter          2007-07-17  1317  	u8 *fault;
2492268472e7d3 Christoph Lameter          2007-07-17  1318  	u8 *end;
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31  1319  	u8 *pad;
2492268472e7d3 Christoph Lameter          2007-07-17  1320  	int length;
2492268472e7d3 Christoph Lameter          2007-07-17  1321  	int remainder;
81819f0fc8285a Christoph Lameter          2007-05-06  1322  
81819f0fc8285a Christoph Lameter          2007-05-06  1323  	if (!(s->flags & SLAB_POISON))
a204e6d626126d Miaohe Lin                 2022-04-19  1324  		return;
81819f0fc8285a Christoph Lameter          2007-05-06  1325  
bb192ed9aa7191 Vlastimil Babka            2021-11-03  1326  	start = slab_address(slab);
bb192ed9aa7191 Vlastimil Babka            2021-11-03  1327  	length = slab_size(slab);
39b264641a0c3b Christoph Lameter          2008-04-14  1328  	end = start + length;
70089d01880750 Harry Yoo                  2026-01-13  1329  
a77d6d33868502 Harry Yoo                  2026-01-13 @1330  	if (obj_exts_in_slab(s, slab) && !obj_exts_in_object(s, slab)) {
70089d01880750 Harry Yoo                  2026-01-13  1331  		remainder = length;
70089d01880750 Harry Yoo                  2026-01-13 @1332  		remainder -= obj_exts_offset_in_slab(s, slab);
70089d01880750 Harry Yoo                  2026-01-13 @1333  		remainder -= obj_exts_size_in_slab(slab);
70089d01880750 Harry Yoo                  2026-01-13  1334  	} else {
39b264641a0c3b Christoph Lameter          2008-04-14  1335  		remainder = length % s->size;
70089d01880750 Harry Yoo                  2026-01-13  1336  	}
70089d01880750 Harry Yoo                  2026-01-13  1337  
81819f0fc8285a Christoph Lameter          2007-05-06  1338  	if (!remainder)
a204e6d626126d Miaohe Lin                 2022-04-19  1339  		return;
81819f0fc8285a Christoph Lameter          2007-05-06  1340  
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31  1341  	pad = end - remainder;
a79316c6178ca4 Andrey Ryabinin            2015-02-13  1342  	metadata_access_enable();
aa1ef4d7b3f67f Andrey Konovalov           2020-12-22  1343  	fault = memchr_inv(kasan_reset_tag(pad), POISON_INUSE, remainder);
a79316c6178ca4 Andrey Ryabinin            2015-02-13  1344  	metadata_access_disable();
2492268472e7d3 Christoph Lameter          2007-07-17  1345  	if (!fault)
a204e6d626126d Miaohe Lin                 2022-04-19  1346  		return;
2492268472e7d3 Christoph Lameter          2007-07-17  1347  	while (end > fault && end[-1] == POISON_INUSE)
2492268472e7d3 Christoph Lameter          2007-07-17  1348  		end--;
2492268472e7d3 Christoph Lameter          2007-07-17  1349  
3f6f32b14ab354 Hyesoo Yu                  2025-02-26  1350  	slab_bug(s, "Padding overwritten. 0x%p-0x%p @offset=%tu",
e1b70dd1e6429f Miles Chen                 2019-11-30  1351  		 fault, end - 1, fault - start);
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31  1352  	print_section(KERN_ERR, "Padding ", pad, remainder);
3f6f32b14ab354 Hyesoo Yu                  2025-02-26  1353  	__slab_err(slab);
2492268472e7d3 Christoph Lameter          2007-07-17  1354  
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31  1355  	restore_bytes(s, "slab padding", POISON_INUSE, fault, end);
81819f0fc8285a Christoph Lameter          2007-05-06  1356  }
81819f0fc8285a Christoph Lameter          2007-05-06  1357  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] mm/slab: a debug patch to investigate the issue further
  2026-02-27  3:07       ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
  2026-02-27  5:52         ` kernel test robot
@ 2026-02-27  6:02         ` kernel test robot
  2026-02-27  8:02         ` Venkat Rao Bagalkote
  2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2026-02-27  6:02 UTC (permalink / raw)
  To: Harry Yoo, venkat88
  Cc: oe-kbuild-all, akpm, ast, cgroups, cl, hannes, hao.li, harry.yoo,
	linux-mm, mhocko, muchun.song, rientjes, roman.gushchin,
	shakeel.butt, surenb, vbabka

Hi Harry,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Harry-Yoo/mm-slab-a-debug-patch-to-investigate-the-issue-further/20260227-111246
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20260227030733.9517-1-harry.yoo%40oracle.com
patch subject: [PATCH] mm/slab: a debug patch to investigate the issue further
config: nios2-allnoconfig (https://download.01.org/0day-ci/archive/20260227/202602271339.xhIvS2iX-lkp@intel.com/config)
compiler: nios2-linux-gcc (GCC) 11.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260227/202602271339.xhIvS2iX-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602271339.xhIvS2iX-lkp@intel.com/

All errors (new ones prefixed by >>):

   mm/slub.c: In function 'slab_pad_check':
>> mm/slub.c:1330:13: error: implicit declaration of function 'obj_exts_in_slab'; did you mean 'obj_exts_in_object'? [-Werror=implicit-function-declaration]
    1330 |         if (obj_exts_in_slab(s, slab) && !obj_exts_in_object(s, slab)) {
         |             ^~~~~~~~~~~~~~~~
         |             obj_exts_in_object
>> mm/slub.c:1332:30: error: implicit declaration of function 'obj_exts_offset_in_slab'; did you mean 'obj_exts_offset_in_object'? [-Werror=implicit-function-declaration]
    1332 |                 remainder -= obj_exts_offset_in_slab(s, slab);
         |                              ^~~~~~~~~~~~~~~~~~~~~~~
         |                              obj_exts_offset_in_object
>> mm/slub.c:1333:30: error: implicit declaration of function 'obj_exts_size_in_slab' [-Werror=implicit-function-declaration]
    1333 |                 remainder -= obj_exts_size_in_slab(slab);
         |                              ^~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +1330 mm/slub.c

81819f0fc8285a Christoph Lameter          2007-05-06  1311  
39b264641a0c3b Christoph Lameter          2008-04-14  1312  /* Check the pad bytes at the end of a slab page */
adea9876180664 Ilya Leoshkevich           2024-06-21  1313  static pad_check_attributes void
adea9876180664 Ilya Leoshkevich           2024-06-21  1314  slab_pad_check(struct kmem_cache *s, struct slab *slab)
81819f0fc8285a Christoph Lameter          2007-05-06  1315  {
2492268472e7d3 Christoph Lameter          2007-07-17  1316  	u8 *start;
2492268472e7d3 Christoph Lameter          2007-07-17  1317  	u8 *fault;
2492268472e7d3 Christoph Lameter          2007-07-17  1318  	u8 *end;
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31  1319  	u8 *pad;
2492268472e7d3 Christoph Lameter          2007-07-17  1320  	int length;
2492268472e7d3 Christoph Lameter          2007-07-17  1321  	int remainder;
81819f0fc8285a Christoph Lameter          2007-05-06  1322  
81819f0fc8285a Christoph Lameter          2007-05-06  1323  	if (!(s->flags & SLAB_POISON))
a204e6d626126d Miaohe Lin                 2022-04-19  1324  		return;
81819f0fc8285a Christoph Lameter          2007-05-06  1325  
bb192ed9aa7191 Vlastimil Babka            2021-11-03  1326  	start = slab_address(slab);
bb192ed9aa7191 Vlastimil Babka            2021-11-03  1327  	length = slab_size(slab);
39b264641a0c3b Christoph Lameter          2008-04-14  1328  	end = start + length;
70089d01880750 Harry Yoo                  2026-01-13  1329  
a77d6d33868502 Harry Yoo                  2026-01-13 @1330  	if (obj_exts_in_slab(s, slab) && !obj_exts_in_object(s, slab)) {
70089d01880750 Harry Yoo                  2026-01-13  1331  		remainder = length;
70089d01880750 Harry Yoo                  2026-01-13 @1332  		remainder -= obj_exts_offset_in_slab(s, slab);
70089d01880750 Harry Yoo                  2026-01-13 @1333  		remainder -= obj_exts_size_in_slab(slab);
70089d01880750 Harry Yoo                  2026-01-13  1334  	} else {
39b264641a0c3b Christoph Lameter          2008-04-14  1335  		remainder = length % s->size;
70089d01880750 Harry Yoo                  2026-01-13  1336  	}
70089d01880750 Harry Yoo                  2026-01-13  1337  
81819f0fc8285a Christoph Lameter          2007-05-06  1338  	if (!remainder)
a204e6d626126d Miaohe Lin                 2022-04-19  1339  		return;
81819f0fc8285a Christoph Lameter          2007-05-06  1340  
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31  1341  	pad = end - remainder;
a79316c6178ca4 Andrey Ryabinin            2015-02-13  1342  	metadata_access_enable();
aa1ef4d7b3f67f Andrey Konovalov           2020-12-22  1343  	fault = memchr_inv(kasan_reset_tag(pad), POISON_INUSE, remainder);
a79316c6178ca4 Andrey Ryabinin            2015-02-13  1344  	metadata_access_disable();
2492268472e7d3 Christoph Lameter          2007-07-17  1345  	if (!fault)
a204e6d626126d Miaohe Lin                 2022-04-19  1346  		return;
2492268472e7d3 Christoph Lameter          2007-07-17  1347  	while (end > fault && end[-1] == POISON_INUSE)
2492268472e7d3 Christoph Lameter          2007-07-17  1348  		end--;
2492268472e7d3 Christoph Lameter          2007-07-17  1349  
3f6f32b14ab354 Hyesoo Yu                  2025-02-26  1350  	slab_bug(s, "Padding overwritten. 0x%p-0x%p @offset=%tu",
e1b70dd1e6429f Miles Chen                 2019-11-30  1351  		 fault, end - 1, fault - start);
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31  1352  	print_section(KERN_ERR, "Padding ", pad, remainder);
3f6f32b14ab354 Hyesoo Yu                  2025-02-26  1353  	__slab_err(slab);
2492268472e7d3 Christoph Lameter          2007-07-17  1354  
5d682681f8a2bd Balasubramani Vivekanandan 2018-01-31  1355  	restore_bytes(s, "slab padding", POISON_INUSE, fault, end);
81819f0fc8285a Christoph Lameter          2007-05-06  1356  }
81819f0fc8285a Christoph Lameter          2007-05-06  1357  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] mm/slab: a debug patch to investigate the issue further
  2026-02-27  3:07       ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
  2026-02-27  5:52         ` kernel test robot
  2026-02-27  6:02         ` kernel test robot
@ 2026-02-27  8:02         ` Venkat Rao Bagalkote
  2026-02-27  8:11           ` Harry Yoo
  2 siblings, 1 reply; 14+ messages in thread
From: Venkat Rao Bagalkote @ 2026-02-27  8:02 UTC (permalink / raw)
  To: Harry Yoo
  Cc: akpm, ast, cgroups, cl, hannes, hao.li, linux-mm, mhocko,
	muchun.song, rientjes, roman.gushchin, shakeel.butt, surenb,
	vbabka


On 27/02/26 8:37 am, Harry Yoo wrote:
> Hi Venkat, could you please help test this patch and
> check if it hits any warning? It's based on the v7.0-rc1 tag.
>
> This (hopefully) should give us more information
> that would help debug the issue.
>
> 1. set stride early in alloc_slab_obj_exts_early()
> 2. move some obj_exts helpers to slab.h
> 3. in slab_obj_ext(), check three things:
>     3-1. is the obj_ext address the right one for this object?
>     3-2. does the obj_ext address change after smp_rmb()?
>     3-3. does obj_ext->objcg change after smp_rmb()?
>
> No smp_wmb() is used, intentionally.
>
> It is expected that the issue will still reproduce.
>
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>


Hello Harry,

I’ve restarted the test, but there are continuous warning prints in the 
logs, and they appear to be slowing down the test run significantly.


Warnings:

[ 3215.419760] obj_ext in object
[ 3215.419774] WARNING: mm/slab.h:710 at slab_obj_ext+0x2e0/0x338, CPU#26: grep/103571
[ 3215.419783] Modules linked in: xfs loop dm_mod bonding tls rfkill nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink sunrpc pseries_rng vmx_crypto dax_pmem fuse ext4 crc16 mbcache jbd2 sd_mod nd_pmem sg papr_scm libnvdimm ibmvscsi ibmveth scsi_transport_srp pseries_wdt
[ 3215.419852] CPU: 26 UID: 0 PID: 103571 Comm: grep Kdump: loaded Tainted: G        W           7.0.0-rc1+ #3 PREEMPTLAZY
[ 3215.419859] Tainted: [W]=WARN
[ 3215.419862] Hardware name: IBM,9080-HEX Power11 (architected) 0x820200 0xf000007 of:IBM,FW1110.01 (NH1110_069) hv:phyp pSeries
[ 3215.419866] NIP:  c0000000008a9ff4 LR: c0000000008a9ff0 CTR: 0000000000000000
[ 3215.419870] REGS: c0000001f9d37670 TRAP: 0700   Tainted: G   W          (7.0.0-rc1+)
[ 3215.419874] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24002404  XER: 20040000
[ 3215.419889] CFAR: c0000000001bc194 IRQMASK: 0
[ 3215.419889] GPR00: c0000000008a9ff0 c0000001f9d37910 c00000000243a500 c000000127e8d600
[ 3215.419889] GPR04: 0000000000000004 0000000000000001 c0000000001bc164 0000000000000001
[ 3215.419889] GPR08: a80e000000000000 0000000000000000 0000000000000007 a80e000000000000
[ 3215.419889] GPR12: c00e0001a1a48fb2 c000000d0dde7f00 c000000004e49960 0000000000000001
[ 3215.419889] GPR16: c00000006e6e0000 0000000000000010 c000000007017fa0 c000000007017fa4
[ 3215.419889] GPR20: 0000000000000001 c000000007017f88 0000000000080000 c000000007017f80
[ 3215.419889] GPR24: c00000006e6f0010 c0000000aef32800 c00c0000001b9a2c c00000006e690010
[ 3215.419889] GPR28: 0000000000000003 0000000000080020 c00000006e690010 c00c0000001b9a00
[ 3215.419960] NIP [c0000000008a9ff4] slab_obj_ext+0x2e0/0x338
[ 3215.419966] LR [c0000000008a9ff0] slab_obj_ext+0x2dc/0x338
[ 3215.419972] Call Trace:
[ 3215.419975] [c0000001f9d37910] [c0000000008a9ff0] slab_obj_ext+0x2dc/0x338 (unreliable)
[ 3215.419983] [c0000001f9d379c0] [c0000000008b9a64] __memcg_slab_free_hook+0x1a4/0x3dc
[ 3215.419990] [c0000001f9d37a90] [c0000000007f8270] kfree+0x454/0x600
[ 3215.419998] [c0000001f9d37b20] [c000000000989724] seq_release_private+0x98/0xd4
[ 3215.420005] [c0000001f9d37b60] [c000000000a7adb4] proc_map_release+0xa4/0xe0
[ 3215.420012] [c0000001f9d37ba0] [c00000000091edf0] __fput+0x1e8/0x5cc
[ 3215.420019] [c0000001f9d37c20] [c000000000915670] sys_close+0x74/0xd0
[ 3215.420025] [c0000001f9d37c50] [c00000000003aeb0] system_call_exception+0x1e0/0x4b0
[ 3215.420033] [c0000001f9d37e50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
[ 3215.420041] ---- interrupt: 3000 at 0x7fff9bd34ab4
[ 3215.420045] NIP:  00007fff9bd34ab4 LR: 00007fff9bd34ab4 CTR: 0000000000000000
[ 3215.420050] REGS: c0000001f9d37e80 TRAP: 3000   Tainted: G   W          (7.0.0-rc1+)
[ 3215.420054] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 44002402  XER: 00000000
[ 3215.420077] IRQMASK: 0
[ 3215.420077] GPR00: 0000000000000006 00007fffe2939800 00007fff9bf37f00 0000000000000003
[ 3215.420077] GPR04: 00007fff9bfe077f 000000000000f881 00007fffe2939820 000000000000f881
[ 3215.420077] GPR08: 000000000000077f 0000000000000000 0000000000000000 0000000000000000
[ 3215.420077] GPR12: 0000000000000000 00007fff9c0ab0e0 0000000000000000 0000000000000000
[ 3215.420077] GPR16: 0000000000000000 00000001235700f0 0000000000000100 0000000000000001
[ 3215.420077] GPR20: 00000000ffffffff 00000001235702ef 0000000000000000 fffffffffffffffd
[ 3215.420077] GPR24: 00007fffe2939890 0000000000000000 00007fffe2939978 00007fff9bf12a88
[ 3215.420077] GPR28: 00007fffe2939974 0000000000010000 0000000000000003 0000000000010000
[ 3215.420144] NIP [00007fff9bd34ab4] 0x7fff9bd34ab4
[ 3215.420148] LR [00007fff9bd34ab4] 0x7fff9bd34ab4
[ 3215.420151] ---- interrupt: 3000
[ 3215.420154] Code: 4e800020 60000000 60000000 7f18e1d6 7b180020 7f18f214 7c3bc000 4182febc 3c62ff7a 386336c0 4b9120a9 60000000 <0fe00000> eac10060 4bffff58 3d200001
[ 3215.420183] ---[ end trace 0000000000000000 ]---

Regards,

Venkat.

> ---
>   mm/slab.h | 131 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
>   mm/slub.c | 100 ++---------------------------------------
>   2 files changed, 130 insertions(+), 101 deletions(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 71c7261bf822..d1e44cd01ea1 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -578,6 +578,101 @@ static inline unsigned short slab_get_stride(struct slab *slab)
>   }
>   #endif
>   
> +#ifdef CONFIG_SLAB_OBJ_EXT
> +
> +/*
> + * Check if memory cgroup or memory allocation profiling is enabled.
> + * If enabled, SLUB tries to reduce memory overhead of accounting
> + * slab objects. If neither is enabled when this function is called,
> + * the optimization is simply skipped to avoid affecting caches that do not
> + * need slabobj_ext metadata.
> + *
> + * However, this may disable optimization when memory cgroup or memory
> + * allocation profiling is used, but slabs are created too early
> + * even before those subsystems are initialized.
> + */
> +static inline bool need_slab_obj_exts(struct kmem_cache *s)
> +{
> +	if (s->flags & SLAB_NO_OBJ_EXT)
> +		return false;
> +
> +	if (memcg_kmem_online() && (s->flags & SLAB_ACCOUNT))
> +		return true;
> +
> +	if (mem_alloc_profiling_enabled())
> +		return true;
> +
> +	return false;
> +}
> +
> +static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
> +{
> +	return sizeof(struct slabobj_ext) * slab->objects;
> +}
> +
> +static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
> +						    struct slab *slab)
> +{
> +	unsigned long objext_offset;
> +
> +	objext_offset = s->size * slab->objects;
> +	objext_offset = ALIGN(objext_offset, sizeof(struct slabobj_ext));
> +	return objext_offset;
> +}
> +
> +static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
> +						     struct slab *slab)
> +{
> +	unsigned long objext_offset = obj_exts_offset_in_slab(s, slab);
> +	unsigned long objext_size = obj_exts_size_in_slab(slab);
> +
> +	return objext_offset + objext_size <= slab_size(slab);
> +}
> +
> +static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
> +{
> +	unsigned long obj_exts;
> +	unsigned long start;
> +	unsigned long end;
> +
> +	obj_exts = slab_obj_exts(slab);
> +	if (!obj_exts)
> +		return false;
> +
> +	start = (unsigned long)slab_address(slab);
> +	end = start + slab_size(slab);
> +	return (obj_exts >= start) && (obj_exts < end);
> +}
> +#else
> +static inline bool need_slab_obj_exts(struct kmem_cache *s)
> +{
> +	return false;
> +}
> +
> +static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
> +{
> +	return 0;
> +}
> +
> +static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
> +						    struct slab *slab)
> +{
> +	return 0;
> +}
> +
> +static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
> +						     struct slab *slab)
> +{
> +	return false;
> +}
> +
> +static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
> +{
> +	return false;
> +}
> +
> +#endif
> +
>   /*
>    * slab_obj_ext - get the pointer to the slab object extension metadata
>    * associated with an object in a slab.
> @@ -592,13 +687,41 @@ static inline struct slabobj_ext *slab_obj_ext(struct slab *slab,
>   					       unsigned long obj_exts,
>   					       unsigned int index)
>   {
> -	struct slabobj_ext *obj_ext;
> +	struct slabobj_ext *ext_before;
> +	struct slabobj_ext *ext_after;
> +	struct obj_cgroup *objcg_before;
> +	struct obj_cgroup *objcg_after;
>   
>   	VM_WARN_ON_ONCE(obj_exts != slab_obj_exts(slab));
>   
> -	obj_ext = (struct slabobj_ext *)(obj_exts +
> -					 slab_get_stride(slab) * index);
> -	return kasan_reset_tag(obj_ext);
> +	ext_before = (struct slabobj_ext *)(obj_exts +
> +					    slab_get_stride(slab) * index);
> +	objcg_before = ext_before->objcg;
> +	/* re-read things after rmb */
> +	smp_rmb();
> +	/* is ext_before the right obj_ext for this object? */
> +	if (obj_exts_in_slab(slab->slab_cache, slab)) {
> +		struct kmem_cache *s = slab->slab_cache;
> +
> +		if (obj_exts_fit_within_slab_leftover(s, slab))
> +			WARN(ext_before != (struct slabobj_ext *)(obj_exts + sizeof(struct slabobj_ext) * index),
> +			     "obj_exts array in leftover");
> +		else
> +			WARN(ext_before != (struct slabobj_ext *)(obj_exts + s->size * index),
> +			     "obj_ext in object");
> +
> +	} else {
> +		WARN(ext_before != (struct slabobj_ext *)(obj_exts + sizeof(struct slabobj_ext) * index),
> +		     "obj_exts array allocated from slab");
> +	}
> +
> +	ext_after = (struct slabobj_ext *)(obj_exts +
> +					   slab_get_stride(slab) * index);
> +	objcg_after = ext_after->objcg;
> +
> +	WARN(ext_before != ext_after, "obj_ext pointer has changed");
> +	WARN(objcg_before != objcg_after, "obj_ext->objcg has changed");
> +	return kasan_reset_tag(ext_before);
>   }
>   
>   int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> diff --git a/mm/slub.c b/mm/slub.c
> index 862642c165ed..8eb64534370e 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -757,101 +757,6 @@ static inline unsigned long get_orig_size(struct kmem_cache *s, void *object)
>   	return *(unsigned long *)p;
>   }
>   
> -#ifdef CONFIG_SLAB_OBJ_EXT
> -
> -/*
> - * Check if memory cgroup or memory allocation profiling is enabled.
> - * If enabled, SLUB tries to reduce memory overhead of accounting
> - * slab objects. If neither is enabled when this function is called,
> - * the optimization is simply skipped to avoid affecting caches that do not
> - * need slabobj_ext metadata.
> - *
> - * However, this may disable optimization when memory cgroup or memory
> - * allocation profiling is used, but slabs are created too early
> - * even before those subsystems are initialized.
> - */
> -static inline bool need_slab_obj_exts(struct kmem_cache *s)
> -{
> -	if (s->flags & SLAB_NO_OBJ_EXT)
> -		return false;
> -
> -	if (memcg_kmem_online() && (s->flags & SLAB_ACCOUNT))
> -		return true;
> -
> -	if (mem_alloc_profiling_enabled())
> -		return true;
> -
> -	return false;
> -}
> -
> -static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
> -{
> -	return sizeof(struct slabobj_ext) * slab->objects;
> -}
> -
> -static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
> -						    struct slab *slab)
> -{
> -	unsigned long objext_offset;
> -
> -	objext_offset = s->size * slab->objects;
> -	objext_offset = ALIGN(objext_offset, sizeof(struct slabobj_ext));
> -	return objext_offset;
> -}
> -
> -static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
> -						     struct slab *slab)
> -{
> -	unsigned long objext_offset = obj_exts_offset_in_slab(s, slab);
> -	unsigned long objext_size = obj_exts_size_in_slab(slab);
> -
> -	return objext_offset + objext_size <= slab_size(slab);
> -}
> -
> -static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
> -{
> -	unsigned long obj_exts;
> -	unsigned long start;
> -	unsigned long end;
> -
> -	obj_exts = slab_obj_exts(slab);
> -	if (!obj_exts)
> -		return false;
> -
> -	start = (unsigned long)slab_address(slab);
> -	end = start + slab_size(slab);
> -	return (obj_exts >= start) && (obj_exts < end);
> -}
> -#else
> -static inline bool need_slab_obj_exts(struct kmem_cache *s)
> -{
> -	return false;
> -}
> -
> -static inline unsigned int obj_exts_size_in_slab(struct slab *slab)
> -{
> -	return 0;
> -}
> -
> -static inline unsigned long obj_exts_offset_in_slab(struct kmem_cache *s,
> -						    struct slab *slab)
> -{
> -	return 0;
> -}
> -
> -static inline bool obj_exts_fit_within_slab_leftover(struct kmem_cache *s,
> -						     struct slab *slab)
> -{
> -	return false;
> -}
> -
> -static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
> -{
> -	return false;
> -}
> -
> -#endif
> -
>   #if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
>   static bool obj_exts_in_object(struct kmem_cache *s, struct slab *slab)
>   {
> @@ -2196,7 +2101,6 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>   retry:
>   	old_exts = READ_ONCE(slab->obj_exts);
>   	handle_failed_objexts_alloc(old_exts, vec, objects);
> -	slab_set_stride(slab, sizeof(struct slabobj_ext));
>   
>   	if (new_slab) {
>   		/*
> @@ -2272,6 +2176,9 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
>   	void *addr;
>   	unsigned long obj_exts;
>   
> +	/* Initialize stride early to avoid memory ordering issues */
> +	slab_set_stride(slab, sizeof(struct slabobj_ext));
> +
>   	if (!need_slab_obj_exts(s))
>   		return;
>   
> @@ -2288,7 +2195,6 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
>   		obj_exts |= MEMCG_DATA_OBJEXTS;
>   #endif
>   		slab->obj_exts = obj_exts;
> -		slab_set_stride(slab, sizeof(struct slabobj_ext));
>   	} else if (s->flags & SLAB_OBJ_EXT_IN_OBJ) {
>   		unsigned int offset = obj_exts_offset_in_object(s);
>   



* Re: [PATCH] mm/slab: a debug patch to investigate the issue further
  2026-02-27  8:02         ` Venkat Rao Bagalkote
@ 2026-02-27  8:11           ` Harry Yoo
  2026-02-27  9:36             ` Venkat Rao Bagalkote
  0 siblings, 1 reply; 14+ messages in thread
From: Harry Yoo @ 2026-02-27  8:11 UTC (permalink / raw)
  To: Venkat Rao Bagalkote
  Cc: akpm, ast, cgroups, cl, hannes, hao.li, linux-mm, mhocko,
	muchun.song, rientjes, roman.gushchin, shakeel.butt, surenb,
	vbabka

On Fri, Feb 27, 2026 at 01:32:29PM +0530, Venkat Rao Bagalkote wrote:
> 
> On 27/02/26 8:37 am, Harry Yoo wrote:
> > Hi Venkat, could you please help testing this patch and
> > check if it hits any warning? It's based on v7.0-rc1 tag.
> > 
> > This (hopefully) should give us more information
> > that would help debugging the issue.
> > 
> > 1. set stride early in alloc_slab_obj_exts_early()
> > 2. move some obj_exts helpers to slab.h
> > 3. in slab_obj_ext(), check three things:
> >     3-1. is the obj_ext address is the right one for this object?
> >     3-2. does the obj_ext address change after smp_rmb()?
> >     3-3. does obj_ext->objcg change after smp_rmb()?
> > 
> > No smp_wmb() is used, intentionally.
> > 
> > It is expected that the issue will still reproduce.
> > 
> > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> 
> 
> Hello Harry,

Hello Venkat!

> I’ve restarted the test

Thanks :)

> but there are continuous warning prints in the
> logs, and they appear to be slowing down the test run significantly.

It's okay! The purpose of this patch is to see whether any warnings are
hit, rather than to trigger the kernel crash.

> Warnings:
> 
> [ 3215.419760] obj_ext in object

The patch adds five different warnings:

1) "obj_exts array in leftover"
2) "obj_ext in object"
3) "obj_exts array allocated from slab"
4) "obj_ext pointer has changed"
5) "obj_ext->objcg has changed"

Is 2) the only warning that is triggered?

Also, the warning below says it's triggered by proc_map_release().

Are there any other call stacks, or is this the only caller that hits
this warning?

Thanks!

> [ 3215.419774] WARNING: mm/slab.h:710 at slab_obj_ext+0x2e0/0x338, CPU#26:
> grep/103571 >
> [...]
> [ 3215.419960] NIP [c0000000008a9ff4] slab_obj_ext+0x2e0/0x338
> [ 3215.419966] LR [c0000000008a9ff0] slab_obj_ext+0x2dc/0x338
> [ 3215.419972] Call Trace:
> [ 3215.419975] [c0000001f9d37910] [c0000000008a9ff0]
> slab_obj_ext+0x2dc/0x338 (unreliable)
> [ 3215.419983] [c0000001f9d379c0] [c0000000008b9a64]
> __memcg_slab_free_hook+0x1a4/0x3dc
> [ 3215.419990] [c0000001f9d37a90] [c0000000007f8270] kfree+0x454/0x600
> [ 3215.419998] [c0000001f9d37b20] [c000000000989724]
> seq_release_private+0x98/0xd4
> [ 3215.420005] [c0000001f9d37b60] [c000000000a7adb4]
> proc_map_release+0xa4/0xe0
> [ 3215.420012] [c0000001f9d37ba0] [c00000000091edf0] __fput+0x1e8/0x5cc
> [ 3215.420019] [c0000001f9d37c20] [c000000000915670] sys_close+0x74/0xd0
> [ 3215.420025] [c0000001f9d37c50] [c00000000003aeb0]
> system_call_exception+0x1e0/0x4b0
> [ 3215.420033] [c0000001f9d37e50] [c00000000000d05c]
> system_call_vectored_common+0x15c/0x2ec
> [...]
> 
> Regards,
> 
> Venkat.

-- 
Cheers,
Harry / Hyeonggon



* Re: [PATCH] mm/slab: a debug patch to investigate the issue further
  2026-02-27  8:11           ` Harry Yoo
@ 2026-02-27  9:36             ` Venkat Rao Bagalkote
  0 siblings, 0 replies; 14+ messages in thread
From: Venkat Rao Bagalkote @ 2026-02-27  9:36 UTC (permalink / raw)
  To: Harry Yoo
  Cc: akpm, ast, cgroups, cl, hannes, hao.li, linux-mm, mhocko,
	muchun.song, rientjes, roman.gushchin, shakeel.butt, surenb,
	vbabka


On 27/02/26 1:41 pm, Harry Yoo wrote:
> On Fri, Feb 27, 2026 at 01:32:29PM +0530, Venkat Rao Bagalkote wrote:
>> On 27/02/26 8:37 am, Harry Yoo wrote:
>>> Hi Venkat, could you please help testing this patch and
>>> check if it hits any warning? It's based on v7.0-rc1 tag.
>>>
>>> This (hopefully) should give us more information
>>> that would help debugging the issue.
>>>
>>> 1. set stride early in alloc_slab_obj_exts_early()
>>> 2. move some obj_exts helpers to slab.h
>>> 3. in slab_obj_ext(), check three things:
>>>      3-1. is the obj_ext address the right one for this object?
>>>      3-2. does the obj_ext address change after smp_rmb()?
>>>      3-3. does obj_ext->objcg change after smp_rmb()?
>>>
>>> No smp_wmb() is used, intentionally.
>>>
>>> It is expected that the issue will still reproduce.
>>>
>>> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
>>
>> Hello Harry,
> Hello Venkat!
>
>> I’ve restarted the test
> Thanks :)
>
>> but there are continuous warning prints in the
>> logs, and they appear to be slowing down the test run significantly.
> It's okay! the purpose of this patch is to see if there's any warning
> hitting, rather than triggering the kernel crash.
>
>> Warnings:
>>
>> [ 3215.419760] obj_ext in object
> The patch adds five different warnings:
>
> 1) "obj_exts array in leftover"
> 2) "obj_ext in object"
> 3) "obj_exts array allocated from slab"
> 4) "obj_ext pointer has changed"
> 5) "obj_ext->objcg has changed"
>
> Is 2) the only warning that is triggered?
>
> Also, the warning below says it's triggered by proc_map_release().
>
> Are there any other call stacks, or is this the only caller that hits
> this warning?

I’m continuing to see only warning (2) – “obj_ext in object”, but it is 
being triggered from multiple different callers.

So far I have observed the warning originating from the following call 
paths:


kfree → seq_release_private → proc_map_release → __fput


kfree → seq_release_private → mounts_release → __fput


__memcg_slab_post_alloc_hook → __kvmalloc_node_noprof → seq_read_iter → 
vfs_read


There are many other WARN splats in the logs due to repeated hits, but 
the only warning string I’ve seen is (2) “obj_ext in object”, just 
triggered from different code paths.


Regards,

Venkat.

>
> Thanks!
>
>> [...]
>>
>> Regards,
>>
>> Venkat.



end of thread, other threads:[~2026-02-27  9:36 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-23  7:58 [PATCH] mm/slab: initialize slab->stride early to avoid memory ordering issues Harry Yoo
2026-02-23 11:44 ` Harry Yoo
2026-02-23 17:04   ` Vlastimil Babka
2026-02-23 20:23 ` Shakeel Butt
2026-02-24  9:04 ` Venkat Rao Bagalkote
2026-02-24 11:10   ` Harry Yoo
2026-02-25  9:14     ` Venkat Rao Bagalkote
2026-02-25 10:15       ` Harry Yoo
2026-02-27  3:07       ` [PATCH] mm/slab: a debug patch to investigate the issue further Harry Yoo
2026-02-27  5:52         ` kernel test robot
2026-02-27  6:02         ` kernel test robot
2026-02-27  8:02         ` Venkat Rao Bagalkote
2026-02-27  8:11           ` Harry Yoo
2026-02-27  9:36             ` Venkat Rao Bagalkote
