* [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
@ 2025-10-14 15:27 Hao Ge
2025-10-14 16:12 ` Suren Baghdasaryan
2025-10-14 21:13 ` Roman Gushchin
0 siblings, 2 replies; 12+ messages in thread
From: Hao Ge @ 2025-10-14 15:27 UTC (permalink / raw)
To: Vlastimil Babka, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Shakeel Butt, Michal Hocko, Roman Gushchin,
Muchun Song, Suren Baghdasaryan
Cc: Harry Yoo, cgroups, linux-mm, linux-kernel, Hao Ge
From: Hao Ge <gehao@kylinos.cn>
Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
the same bit position, we cannot determine whether memcg_data still
points to the slabobj_ext vector simply by checking
folio->memcg_data & MEMCG_DATA_OBJEXTS.
If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
and during the release of the associated folio, the BUG check is triggered
because it was mistakenly assumed that a valid folio->memcg_data
was not cleared before freeing the folio.
So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
Suggested-by: Harry Yoo <harry.yoo@oracle.com>
Signed-off-by: Hao Ge <gehao@kylinos.cn>
---
v3: Simplify the solution, per Harry's suggestion in the v1 comments
Add Suggested-by: Harry Yoo <harry.yoo@oracle.com>
---
include/linux/memcontrol.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 873e510d6f8d..7ed15f858dc4 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -534,7 +534,9 @@ static inline struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *ob
static inline bool folio_memcg_kmem(struct folio *folio)
{
VM_BUG_ON_PGFLAGS(PageTail(&folio->page), &folio->page);
- VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJEXTS, folio);
+ VM_BUG_ON_FOLIO((folio->memcg_data != OBJEXTS_ALLOC_FAIL) &&
+ (folio->memcg_data & MEMCG_DATA_OBJEXTS),
+ folio);
return folio->memcg_data & MEMCG_DATA_KMEM;
}
--
2.25.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
2025-10-14 15:27 [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem Hao Ge
@ 2025-10-14 16:12 ` Suren Baghdasaryan
2025-10-14 20:14 ` Shakeel Butt
2025-10-15 9:25 ` Harry Yoo
2025-10-14 21:13 ` Roman Gushchin
1 sibling, 2 replies; 12+ messages in thread
From: Suren Baghdasaryan @ 2025-10-14 16:12 UTC (permalink / raw)
To: Hao Ge
Cc: Vlastimil Babka, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Shakeel Butt, Michal Hocko, Roman Gushchin,
Muchun Song, Harry Yoo, cgroups, linux-mm, linux-kernel, Hao Ge
On Tue, Oct 14, 2025 at 8:28 AM Hao Ge <hao.ge@linux.dev> wrote:
>
> From: Hao Ge <gehao@kylinos.cn>
>
> Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
> the same bit position, we cannot determine whether memcg_data still
> points to the slabobj_ext vector simply by checking
> folio->memcg_data & MEMCG_DATA_OBJEXTS.
>
> If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
> and during the release of the associated folio, the BUG check is triggered
> because it was mistakenly assumed that a valid folio->memcg_data
> was not cleared before freeing the folio.
>
> So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
>
> Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
> Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> Signed-off-by: Hao Ge <gehao@kylinos.cn>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
nit: I think it would be helpful if the changelog explained why we
need the additional check. We can have the same bit set in two
different situations:
1. object extension vector allocation failure;
2. memcg_data pointing to a valid mem_cgroup.
To distinguish between them, we need to check not only the bit itself
but also the rest of this field. If the rest is NULL, we have case 1,
otherwise case 2.
> ---
> v3: Simplify the solution, per Harry's suggestion in the v1 comments
> Add Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> ---
> include/linux/memcontrol.h | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 873e510d6f8d..7ed15f858dc4 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -534,7 +534,9 @@ static inline struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *ob
> static inline bool folio_memcg_kmem(struct folio *folio)
> {
> VM_BUG_ON_PGFLAGS(PageTail(&folio->page), &folio->page);
> - VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJEXTS, folio);
> + VM_BUG_ON_FOLIO((folio->memcg_data != OBJEXTS_ALLOC_FAIL) &&
> + (folio->memcg_data & MEMCG_DATA_OBJEXTS),
> + folio);
> return folio->memcg_data & MEMCG_DATA_KMEM;
> }
>
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
2025-10-14 16:12 ` Suren Baghdasaryan
@ 2025-10-14 20:14 ` Shakeel Butt
2025-10-14 20:58 ` Vlastimil Babka
2025-10-15 9:25 ` Harry Yoo
1 sibling, 1 reply; 12+ messages in thread
From: Shakeel Butt @ 2025-10-14 20:14 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: Hao Ge, Vlastimil Babka, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Harry Yoo, cgroups, linux-mm, linux-kernel, Hao Ge
On Tue, Oct 14, 2025 at 09:12:43AM -0700, Suren Baghdasaryan wrote:
> On Tue, Oct 14, 2025 at 8:28 AM Hao Ge <hao.ge@linux.dev> wrote:
> >
> > From: Hao Ge <gehao@kylinos.cn>
> >
> > Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
> > the same bit position, we cannot determine whether memcg_data still
> > points to the slabobj_ext vector simply by checking
> > folio->memcg_data & MEMCG_DATA_OBJEXTS.
> >
> > If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
> > and during the release of the associated folio, the BUG check is triggered
> > because it was mistakenly assumed that a valid folio->memcg_data
> > was not cleared before freeing the folio.
> >
> > So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
> >
> > Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
> > Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> > Signed-off-by: Hao Ge <gehao@kylinos.cn>
>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>
> nit: I think it would be helpful if the changelog explained why we
> need the additional check. We can have the same bit set in two
> different situations:
> 1. object extension vector allocation failure;
> 2. memcg_data pointing to a valid mem_cgroup.
> To distinguish between them, we need to check not only the bit itself
> but also the rest of this field. If the rest is NULL, we have case 1,
> otherwise case 2.
With Suren's suggestion, you can add:
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
2025-10-14 20:14 ` Shakeel Butt
@ 2025-10-14 20:58 ` Vlastimil Babka
2025-10-14 22:40 ` Shakeel Butt
0 siblings, 1 reply; 12+ messages in thread
From: Vlastimil Babka @ 2025-10-14 20:58 UTC (permalink / raw)
To: Shakeel Butt, Suren Baghdasaryan
Cc: Hao Ge, Alexei Starovoitov, Andrew Morton, Johannes Weiner,
Michal Hocko, Roman Gushchin, Muchun Song, Harry Yoo, cgroups,
linux-mm, linux-kernel, Hao Ge, David Hildenbrand
On 10/14/25 22:14, Shakeel Butt wrote:
> On Tue, Oct 14, 2025 at 09:12:43AM -0700, Suren Baghdasaryan wrote:
>> On Tue, Oct 14, 2025 at 8:28 AM Hao Ge <hao.ge@linux.dev> wrote:
>> >
>> > From: Hao Ge <gehao@kylinos.cn>
>> >
>> > Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
>> > the same bit position, we cannot determine whether memcg_data still
>> > points to the slabobj_ext vector simply by checking
>> > folio->memcg_data & MEMCG_DATA_OBJEXTS.
>> >
>> > If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
>> > and during the release of the associated folio, the BUG check is triggered
>> > because it was mistakenly assumed that a valid folio->memcg_data
>> > was not cleared before freeing the folio.
>> >
>> > So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
>> >
>> > Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
>> > Suggested-by: Harry Yoo <harry.yoo@oracle.com>
>> > Signed-off-by: Hao Ge <gehao@kylinos.cn>
>>
>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>>
>> nit: I think it would be helpful if the changelog explained why we
>> need the additional check. We can have the same bit set in two
>> different situations:
>> 1. object extension vector allocation failure;
>> 2. memcg_data pointing to a valid mem_cgroup.
>> To distinguish between them, we need to check not only the bit itself
>> but also the rest of this field. If the rest is NULL, we have case 1,
>> otherwise case 2.
>
> With Suren's suggestion, you can add:
>
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Thanks, I added Suren's suggestion and pushed to slab/for-next-fixes:
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/commit/?h=slab/for-next-fixes&id=711c435c89e59ee32bf8bb1c0d875a07931da5a8
Resisted the impulse to change the single VM_BUG_ON_FOLIO to
VM_WARN_ON_ONCE_FOLIO because we're still going to do that systematically,
right?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
2025-10-14 15:27 [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem Hao Ge
2025-10-14 16:12 ` Suren Baghdasaryan
@ 2025-10-14 21:13 ` Roman Gushchin
1 sibling, 0 replies; 12+ messages in thread
From: Roman Gushchin @ 2025-10-14 21:13 UTC (permalink / raw)
To: Hao Ge
Cc: Vlastimil Babka, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Shakeel Butt, Michal Hocko, Muchun Song,
Suren Baghdasaryan, Harry Yoo, cgroups, linux-mm, linux-kernel,
Hao Ge
Hao Ge <hao.ge@linux.dev> writes:
> From: Hao Ge <gehao@kylinos.cn>
>
> Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
> the same bit position, we cannot determine whether memcg_data still
> points to the slabobj_ext vector simply by checking
> folio->memcg_data & MEMCG_DATA_OBJEXTS.
>
> If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
> and during the release of the associated folio, the BUG check is triggered
> because it was mistakenly assumed that a valid folio->memcg_data
> was not cleared before freeing the folio.
>
> So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
>
> Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
> Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> Signed-off-by: Hao Ge <gehao@kylinos.cn>
> ---
> v3: Simplify the solution, per Harry's suggestion in the v1 comments
> Add Suggested-by: Harry Yoo <harry.yoo@oracle.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Thanks!
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
2025-10-14 20:58 ` Vlastimil Babka
@ 2025-10-14 22:40 ` Shakeel Butt
0 siblings, 0 replies; 12+ messages in thread
From: Shakeel Butt @ 2025-10-14 22:40 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Suren Baghdasaryan, Hao Ge, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Harry Yoo, cgroups, linux-mm, linux-kernel, Hao Ge,
David Hildenbrand
On Tue, Oct 14, 2025 at 10:58:25PM +0200, Vlastimil Babka wrote:
> On 10/14/25 22:14, Shakeel Butt wrote:
> > On Tue, Oct 14, 2025 at 09:12:43AM -0700, Suren Baghdasaryan wrote:
> >> On Tue, Oct 14, 2025 at 8:28 AM Hao Ge <hao.ge@linux.dev> wrote:
> >> >
> >> > From: Hao Ge <gehao@kylinos.cn>
> >> >
> >> > Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
> >> > the same bit position, we cannot determine whether memcg_data still
> >> > points to the slabobj_ext vector simply by checking
> >> > folio->memcg_data & MEMCG_DATA_OBJEXTS.
> >> >
> >> > If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
> >> > and during the release of the associated folio, the BUG check is triggered
> >> > because it was mistakenly assumed that a valid folio->memcg_data
> >> > was not cleared before freeing the folio.
> >> >
> >> > So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
> >> >
> >> > Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
> >> > Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> >> > Signed-off-by: Hao Ge <gehao@kylinos.cn>
> >>
> >> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> >>
> >> nit: I think it would be helpful if the changelog explained why we
> >> need the additional check. We can have the same bit set in two
> >> different situations:
> >> 1. object extension vector allocation failure;
> >> 2. memcg_data pointing to a valid mem_cgroup.
> >> To distinguish between them, we need to check not only the bit itself
> >> but also the rest of this field. If the rest is NULL, we have case 1,
> >> otherwise case 2.
> >
> > With Suren's suggestion, you can add:
> >
> > Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
>
> Thanks, I added Suren's suggestion and pushed to slab/for-next-fixes:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/commit/?h=slab/for-next-fixes&id=711c435c89e59ee32bf8bb1c0d875a07931da5a8
>
> Resisted the impulse to change the single VM_BUG_ON_FOLIO to
> VM_WARN_ON_ONCE_FOLIO because we're still going to do that systematically,
> right?
Oh is there some coordinated effort happening for this conversion?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
2025-10-14 16:12 ` Suren Baghdasaryan
2025-10-14 20:14 ` Shakeel Butt
@ 2025-10-15 9:25 ` Harry Yoo
2025-10-15 9:54 ` Vlastimil Babka
1 sibling, 1 reply; 12+ messages in thread
From: Harry Yoo @ 2025-10-15 9:25 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: Hao Ge, Vlastimil Babka, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Shakeel Butt, Michal Hocko, Roman Gushchin,
Muchun Song, cgroups, linux-mm, linux-kernel, Hao Ge
On Tue, Oct 14, 2025 at 09:12:43AM -0700, Suren Baghdasaryan wrote:
> On Tue, Oct 14, 2025 at 8:28 AM Hao Ge <hao.ge@linux.dev> wrote:
> >
> > From: Hao Ge <gehao@kylinos.cn>
> >
> > Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
> > the same bit position, we cannot determine whether memcg_data still
> > points to the slabobj_ext vector simply by checking
> > folio->memcg_data & MEMCG_DATA_OBJEXTS.
> >
> > If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
> > and during the release of the associated folio, the BUG check is triggered
> > because it was mistakenly assumed that a valid folio->memcg_data
> > was not cleared before freeing the folio.
nit: yesterday I was confused that this is sanity checks in buddy complaining
folio->memcg_data not being cleared, but it's actually folio_memcg_kmem()
complaining that MEMCG_OBJEXTS_DATA flag is set on non-slab folios (in
free_pages_prepare(), if PageMemcgKmem(page) -> __memcg_kmem_uncharge_page()))
So the paragraph above should be updated?
And as a side question, we clear slab->obj_exts when freeing obj_ext array,
but don't clear OBJEXTS_ALLOC_FAIL when freeing a slab? That's not good.
> > So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
> >
> > Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
> > Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> > Signed-off-by: Hao Ge <gehao@kylinos.cn>
>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>
> nit: I think it would be helpful if the changelog explained why we
> need the additional check. We can have the same bit set in two
> different situations:
> 1. object extension vector allocation failure;
> 2. memcg_data pointing to a valid mem_cgroup.
> To distinguish between them, we need to check not only the bit itself
> but also the rest of this field. If the rest is NULL, we have case 1,
> otherwise case 2.
Agreed.
In general LGTM,
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
By the way, maybe it'd be nice to introduce a new helper function that
properly checks MEMCG_DATA_OBJEXTS flag.
> ~/slab (slab/for-next-fixes)> git grep -n MEMCG_DATA_OBJEXTS
> include/linux/memcontrol.h:337: MEMCG_DATA_OBJEXTS = (1UL << 0),
> include/linux/memcontrol.h:344:#define __OBJEXTS_ALLOC_FAIL MEMCG_DATA_OBJEXTS
> include/linux/memcontrol.h:358: * MEMCG_DATA_OBJEXTS.
> include/linux/memcontrol.h:400: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
> include/linux/memcontrol.h:421: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
these two,
> include/linux/memcontrol.h:492: if (memcg_data & MEMCG_DATA_OBJEXTS)
this,
> include/linux/memcontrol.h:538: (folio->memcg_data & MEMCG_DATA_OBJEXTS),
> include/linux/memcontrol.h:1491: * if MEMCG_DATA_OBJEXTS is set.
> mm/kfence/core.c:624: MEMCG_DATA_OBJEXTS;
> mm/page_owner.c:513: if (memcg_data & MEMCG_DATA_OBJEXTS)
this,
> mm/slab.h:541: * MEMCG_DATA_OBJEXTS bit set or be equal to OBJEXTS_ALLOC_FAIL.
> mm/slab.h:543: VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS) &&
> mm/slub.c:2137: new_exts |= MEMCG_DATA_OBJEXTS;
> tools/mm/show_page_info.py:55: MEMCG_DATA_OBJEXTS = prog.constant("MEMCG_DATA_OBJEXTS").value_()
> tools/mm/show_page_info.py:59: if memcg_data & MEMCG_DATA_OBJEXTS:
and this do not look good.
I mean technically they are fine since OBJEXTS_ALLOC_FAIL is set on
slabs only, but that's just a coincidence.
> > ---
> > v3: Simplify the solution, per Harry's suggestion in the v1 comments
> > Add Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> > ---
> > include/linux/memcontrol.h | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index 873e510d6f8d..7ed15f858dc4 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -534,7 +534,9 @@ static inline struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *ob
> > static inline bool folio_memcg_kmem(struct folio *folio)
> > {
> > VM_BUG_ON_PGFLAGS(PageTail(&folio->page), &folio->page);
> > - VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJEXTS, folio);
> > + VM_BUG_ON_FOLIO((folio->memcg_data != OBJEXTS_ALLOC_FAIL) &&
> > + (folio->memcg_data & MEMCG_DATA_OBJEXTS),
> > + folio);
> > return folio->memcg_data & MEMCG_DATA_KMEM;
> > }
> >
> > --
> > 2.25.1
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
2025-10-15 9:25 ` Harry Yoo
@ 2025-10-15 9:54 ` Vlastimil Babka
2025-10-15 10:27 ` Harry Yoo
0 siblings, 1 reply; 12+ messages in thread
From: Vlastimil Babka @ 2025-10-15 9:54 UTC (permalink / raw)
To: Harry Yoo, Suren Baghdasaryan
Cc: Hao Ge, Alexei Starovoitov, Andrew Morton, Johannes Weiner,
Shakeel Butt, Michal Hocko, Roman Gushchin, Muchun Song, cgroups,
linux-mm, linux-kernel, Hao Ge
On 10/15/25 11:25, Harry Yoo wrote:
> On Tue, Oct 14, 2025 at 09:12:43AM -0700, Suren Baghdasaryan wrote:
>> On Tue, Oct 14, 2025 at 8:28 AM Hao Ge <hao.ge@linux.dev> wrote:
>> >
>> > From: Hao Ge <gehao@kylinos.cn>
>> >
>> > Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
>> > the same bit position, we cannot determine whether memcg_data still
>> > points to the slabobj_ext vector simply by checking
>> > folio->memcg_data & MEMCG_DATA_OBJEXTS.
>> >
>> > If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
>> > and during the release of the associated folio, the BUG check is triggered
>> > because it was mistakenly assumed that a valid folio->memcg_data
>> > was not cleared before freeing the folio.
>
> nit: yesterday I was confused that this is sanity checks in buddy complaining
> folio->memcg_data not being cleared, but it's actually folio_memcg_kmem()
> complaining that MEMCG_OBJEXTS_DATA flag is set on non-slab folios (in
> free_pages_prepare(), if PageMemcgKmem(page) -> __memcg_kmem_uncharge_page()))
> So the paragraph above should be updated?
>
> And as a side question, we clear slab->obj_exts when freeing obj_ext array,
> but don't clear OBJEXTS_ALLOC_FAIL when freeing a slab? That's not good.
Hm great point. We should rather make sure it's cleared always, instead of
adjusting the debugging check, which shouldn't be then necessary, right?
>> > So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
>> >
>> > Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
>> > Suggested-by: Harry Yoo <harry.yoo@oracle.com>
>> > Signed-off-by: Hao Ge <gehao@kylinos.cn>
>>
>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>>
>> nit: I think it would be helpful if the changelog explained why we
>> need the additional check. We can have the same bit set in two
>> different situations:
>> 1. object extension vector allocation failure;
>> 2. memcg_data pointing to a valid mem_cgroup.
>> To distinguish between them, we need to check not only the bit itself
>> but also the rest of this field. If the rest is NULL, we have case 1,
>> otherwise case 2.
>
> Agreed.
>
> In general LGTM,
> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>
> By the way, maybe it'd be nice to introduce a new helper function that
> properly checks MEMCG_DATA_OBJEXTS flag.
I thought so too at first...
>> ~/slab (slab/for-next-fixes)> git grep -n MEMCG_DATA_OBJEXTS
>> include/linux/memcontrol.h:337: MEMCG_DATA_OBJEXTS = (1UL << 0),
>> include/linux/memcontrol.h:344:#define __OBJEXTS_ALLOC_FAIL MEMCG_DATA_OBJEXTS
>> include/linux/memcontrol.h:358: * MEMCG_DATA_OBJEXTS.
>
>> include/linux/memcontrol.h:400: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
>> include/linux/memcontrol.h:421: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
>
> these two,
>
>> include/linux/memcontrol.h:492: if (memcg_data & MEMCG_DATA_OBJEXTS)
>
> this,
>
>> include/linux/memcontrol.h:538: (folio->memcg_data & MEMCG_DATA_OBJEXTS),
>> include/linux/memcontrol.h:1491: * if MEMCG_DATA_OBJEXTS is set.
>> mm/kfence/core.c:624: MEMCG_DATA_OBJEXTS;
>
>> mm/page_owner.c:513: if (memcg_data & MEMCG_DATA_OBJEXTS)
>
> this,
>
>> mm/slab.h:541: * MEMCG_DATA_OBJEXTS bit set or be equal to OBJEXTS_ALLOC_FAIL.
>> mm/slab.h:543: VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS) &&
>> mm/slub.c:2137: new_exts |= MEMCG_DATA_OBJEXTS;
>> tools/mm/show_page_info.py:55: MEMCG_DATA_OBJEXTS = prog.constant("MEMCG_DATA_OBJEXTS").value_()
>
>> tools/mm/show_page_info.py:59: if memcg_data & MEMCG_DATA_OBJEXTS:
>
> and this do not look good.
>
> I mean technically they are fine since OBJEXTS_ALLOC_FAIL is set on
> slabs only, but that's just a coincidence.
And checked the the other debugging checks too. But then thought it's better
that if these are not expected to see slabs, then they should not be
adjusted. I don't see it as a coincidence but as intention to keep it slab
specific. It will be also more future proof for the upcoming separation of
struct slab from struct page.
>> > ---
>> > v3: Simplify the solution, per Harry's suggestion in the v1 comments
>> > Add Suggested-by: Harry Yoo <harry.yoo@oracle.com>
>> > ---
>> > include/linux/memcontrol.h | 4 +++-
>> > 1 file changed, 3 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>> > index 873e510d6f8d..7ed15f858dc4 100644
>> > --- a/include/linux/memcontrol.h
>> > +++ b/include/linux/memcontrol.h
>> > @@ -534,7 +534,9 @@ static inline struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *ob
>> > static inline bool folio_memcg_kmem(struct folio *folio)
>> > {
>> > VM_BUG_ON_PGFLAGS(PageTail(&folio->page), &folio->page);
>> > - VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJEXTS, folio);
>> > + VM_BUG_ON_FOLIO((folio->memcg_data != OBJEXTS_ALLOC_FAIL) &&
>> > + (folio->memcg_data & MEMCG_DATA_OBJEXTS),
>> > + folio);
>> > return folio->memcg_data & MEMCG_DATA_KMEM;
>> > }
>> >
>> > --
>> > 2.25.1
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
2025-10-15 9:54 ` Vlastimil Babka
@ 2025-10-15 10:27 ` Harry Yoo
2025-10-15 10:37 ` Vlastimil Babka
0 siblings, 1 reply; 12+ messages in thread
From: Harry Yoo @ 2025-10-15 10:27 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Suren Baghdasaryan, Hao Ge, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Shakeel Butt, Michal Hocko, Roman Gushchin,
Muchun Song, cgroups, linux-mm, linux-kernel, Hao Ge
On Wed, Oct 15, 2025 at 11:54:18AM +0200, Vlastimil Babka wrote:
> On 10/15/25 11:25, Harry Yoo wrote:
> > On Tue, Oct 14, 2025 at 09:12:43AM -0700, Suren Baghdasaryan wrote:
> >> On Tue, Oct 14, 2025 at 8:28 AM Hao Ge <hao.ge@linux.dev> wrote:
> >> >
> >> > From: Hao Ge <gehao@kylinos.cn>
> >> >
> >> > Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
> >> > the same bit position, we cannot determine whether memcg_data still
> >> > points to the slabobj_ext vector simply by checking
> >> > folio->memcg_data & MEMCG_DATA_OBJEXTS.
> >> >
> >> > If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
> >> > and during the release of the associated folio, the BUG check is triggered
> >> > because it was mistakenly assumed that a valid folio->memcg_data
> >> > was not cleared before freeing the folio.
> >
> > nit: yesterday I was confused that this is sanity checks in buddy complaining
> > folio->memcg_data not being cleared, but it's actually folio_memcg_kmem()
> > complaining that MEMCG_OBJEXTS_DATA flag is set on non-slab folios (in
> > free_pages_prepare(), if PageMemcgKmem(page) -> __memcg_kmem_uncharge_page()))
> > So the paragraph above should be updated?
> >
> > And as a side question, we clear slab->obj_exts when freeing obj_ext array,
> > but don't clear OBJEXTS_ALLOC_FAIL when freeing a slab? That's not good.
>
> Hm great point. We should rather make sure it's cleared always, instead of
> adjusting the debugging check, which shouldn't be then necessary, right?
Yeah folio_memcg_kmem() isn't supposed to be called on slabs anyway
(it's not a slab at the time we free it to buddy), so we don't have to
adjust the debug check.
> >> > So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
> >> >
> >> > Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
> >> > Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> >> > Signed-off-by: Hao Ge <gehao@kylinos.cn>
> >>
> >> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> >>
> >> nit: I think it would be helpful if the changelog explained why we
> >> need the additional check. We can have the same bit set in two
> >> different situations:
> >> 1. object extension vector allocation failure;
> >> 2. memcg_data pointing to a valid mem_cgroup.
> >> To distinguish between them, we need to check not only the bit itself
> >> but also the rest of this field. If the rest is NULL, we have case 1,
> >> otherwise case 2.
> >
> > Agreed.
> >
> > In general LGTM,
> > Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> >
> > By the way, maybe it'd be nice to introduce a new helper function that
> > properly checks MEMCG_DATA_OBJEXTS flag.
>
> I thought so too at first...
>
> >> ~/slab (slab/for-next-fixes)> git grep -n MEMCG_DATA_OBJEXTS
> >> include/linux/memcontrol.h:337: MEMCG_DATA_OBJEXTS = (1UL << 0),
> >> include/linux/memcontrol.h:344:#define __OBJEXTS_ALLOC_FAIL MEMCG_DATA_OBJEXTS
> >> include/linux/memcontrol.h:358: * MEMCG_DATA_OBJEXTS.
> >
> >> include/linux/memcontrol.h:400: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
> >> include/linux/memcontrol.h:421: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
> >
> > these two,
> >
> >> include/linux/memcontrol.h:492: if (memcg_data & MEMCG_DATA_OBJEXTS)
> >
> > this,
> >
> >> include/linux/memcontrol.h:538: (folio->memcg_data & MEMCG_DATA_OBJEXTS),
> >> include/linux/memcontrol.h:1491: * if MEMCG_DATA_OBJEXTS is set.
> >> mm/kfence/core.c:624: MEMCG_DATA_OBJEXTS;
> >
> >> mm/page_owner.c:513: if (memcg_data & MEMCG_DATA_OBJEXTS)
> >
> > this,
> >
> >> mm/slab.h:541: * MEMCG_DATA_OBJEXTS bit set or be equal to OBJEXTS_ALLOC_FAIL.
> >> mm/slab.h:543: VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS) &&
> >> mm/slub.c:2137: new_exts |= MEMCG_DATA_OBJEXTS;
> >> tools/mm/show_page_info.py:55: MEMCG_DATA_OBJEXTS = prog.constant("MEMCG_DATA_OBJEXTS").value_()
> >
> >> tools/mm/show_page_info.py:59: if memcg_data & MEMCG_DATA_OBJEXTS:
> >
> > and this do not look good.
> >
> > I mean technically they are fine since OBJEXTS_ALLOC_FAIL is set on
> > slabs only, but that's just a coincidence.
>
> And checked the the other debugging checks too. But then thought it's better
> that if these are not expected to see slabs, then they should not be
> adjusted. I don't see it as a coincidence but as intention to keep it slab
> specific. It will be also more future proof for the upcoming separation of
> struct slab from struct page.
Then we're intentionally using (folio->memcg_data & MEMCG_DATA_OBJEXTS) check
as a way to determine whether the folio is a slab (either slabobj_ext array
allocation succeeded or failed).
That makes sense to me!
> >> > ---
> >> > v3: Simplify the solution, per Harry's suggestion in the v1 comments
> >> > Add Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> >> > ---
> >> > include/linux/memcontrol.h | 4 +++-
> >> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> >> > index 873e510d6f8d..7ed15f858dc4 100644
> >> > --- a/include/linux/memcontrol.h
> >> > +++ b/include/linux/memcontrol.h
> >> > @@ -534,7 +534,9 @@ static inline struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *ob
> >> > static inline bool folio_memcg_kmem(struct folio *folio)
> >> > {
> >> > VM_BUG_ON_PGFLAGS(PageTail(&folio->page), &folio->page);
> >> > - VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJEXTS, folio);
> >> > + VM_BUG_ON_FOLIO((folio->memcg_data != OBJEXTS_ALLOC_FAIL) &&
> >> > + (folio->memcg_data & MEMCG_DATA_OBJEXTS),
> >> > + folio);
> >> > return folio->memcg_data & MEMCG_DATA_KMEM;
> >> > }
> >> >
> >> > --
> >> > 2.25.1
> >
>
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
2025-10-15 10:27 ` Harry Yoo
@ 2025-10-15 10:37 ` Vlastimil Babka
2025-10-15 11:22 ` Hao Ge
0 siblings, 1 reply; 12+ messages in thread
From: Vlastimil Babka @ 2025-10-15 10:37 UTC (permalink / raw)
To: Harry Yoo
Cc: Suren Baghdasaryan, Hao Ge, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Shakeel Butt, Michal Hocko, Roman Gushchin,
Muchun Song, cgroups, linux-mm, linux-kernel, Hao Ge
On 10/15/25 12:27, Harry Yoo wrote:
> On Wed, Oct 15, 2025 at 11:54:18AM +0200, Vlastimil Babka wrote:
>> On 10/15/25 11:25, Harry Yoo wrote:
>> > On Tue, Oct 14, 2025 at 09:12:43AM -0700, Suren Baghdasaryan wrote:
>> >> On Tue, Oct 14, 2025 at 8:28 AM Hao Ge <hao.ge@linux.dev> wrote:
>> >> >
>> >> > From: Hao Ge <gehao@kylinos.cn>
>> >> >
>> >> > Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
>> >> > the same bit position, we cannot determine whether memcg_data still
>> >> > points to the slabobj_ext vector simply by checking
>> >> > folio->memcg_data & MEMCG_DATA_OBJEXTS.
>> >> >
>> >> > If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
>> >> > and during the release of the associated folio, the BUG check is triggered
>> >> > because it was mistakenly assumed that a valid folio->memcg_data
>> >> > was not cleared before freeing the folio.
>> >
>> > nit: yesterday I was confused that this is sanity checks in buddy complaining
>> > folio->memcg_data not being cleared, but it's actually folio_memcg_kmem()
>> > complaining that MEMCG_OBJEXTS_DATA flag is set on non-slab folios (in
>> > free_pages_prepare(), if PageMemcgKmem(page) -> __memcg_kmem_uncharge_page()))
>> > So the paragraph above should be updated?
>> >
>> > And as a side question, we clear slab->obj_exts when freeing obj_ext array,
>> > but don't clear OBJEXTS_ALLOC_FAIL when freeing a slab? That's not good.
>>
>> Hm great point. We should rather make sure it's cleared always, instead of
>> adjusting the debugging check, which shouldn't be then necessary, right?
>
> Yeah folio_memcg_kmem() isn't supposed to be called on slabs anyway
> (it's not a slab at the time we free it to buddy), so we don't have to
> adjust the debug check.
Great. Hao Ge, can you please send v4 that instead of adjusting the
VM_BUG_ON modifies free_slab_obj_exts() to always clear slab->obj_exts? Thanks!
>> >> > So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
>> >> >
>> >> > Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
>> >> > Suggested-by: Harry Yoo <harry.yoo@oracle.com>
>> >> > Signed-off-by: Hao Ge <gehao@kylinos.cn>
>> >>
>> >> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>> >>
>> >> nit: I think it would be helpful if the changelog explained why we
>> >> need the additional check. We can have the same bit set in two
>> >> different situations:
>> >> 1. object extension vector allocation failure;
>> >> 2. memcg_data pointing to a valid mem_cgroup.
>> >> To distinguish between them, we need to check not only the bit itself
>> >> but also the rest of this field. If the rest is NULL, we have case 1,
>> >> otherwise case 2.
>> >
>> > Agreed.
>> >
>> > In general LGTM,
>> > Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>> >
>> > By the way, maybe it'd be nice to introduce a new helper function that
>> > properly checks MEMCG_DATA_OBJEXTS flag.
>>
>> I thought so too at first...
>>
>> >> ~/slab (slab/for-next-fixes)> git grep -n MEMCG_DATA_OBJEXTS
>> >> include/linux/memcontrol.h:337: MEMCG_DATA_OBJEXTS = (1UL << 0),
>> >> include/linux/memcontrol.h:344:#define __OBJEXTS_ALLOC_FAIL MEMCG_DATA_OBJEXTS
>> >> include/linux/memcontrol.h:358: * MEMCG_DATA_OBJEXTS.
>> >
>> >> include/linux/memcontrol.h:400: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
>> >> include/linux/memcontrol.h:421: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
>> >
>> > these two,
>> >
>> >> include/linux/memcontrol.h:492: if (memcg_data & MEMCG_DATA_OBJEXTS)
>> >
>> > this,
>> >
>> >> include/linux/memcontrol.h:538: (folio->memcg_data & MEMCG_DATA_OBJEXTS),
>> >> include/linux/memcontrol.h:1491: * if MEMCG_DATA_OBJEXTS is set.
>> >> mm/kfence/core.c:624: MEMCG_DATA_OBJEXTS;
>> >
>> >> mm/page_owner.c:513: if (memcg_data & MEMCG_DATA_OBJEXTS)
>> >
>> > this,
>> >
>> >> mm/slab.h:541: * MEMCG_DATA_OBJEXTS bit set or be equal to OBJEXTS_ALLOC_FAIL.
>> >> mm/slab.h:543: VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS) &&
>> >> mm/slub.c:2137: new_exts |= MEMCG_DATA_OBJEXTS;
>> >> tools/mm/show_page_info.py:55: MEMCG_DATA_OBJEXTS = prog.constant("MEMCG_DATA_OBJEXTS").value_()
>> >
>> >> tools/mm/show_page_info.py:59: if memcg_data & MEMCG_DATA_OBJEXTS:
>> >
>> > and this do not look good.
>> >
>> > I mean technically they are fine since OBJEXTS_ALLOC_FAIL is set on
>> > slabs only, but that's just a coincidence.
>>
>> And checked the the other debugging checks too. But then thought it's better
>> that if these are not expected to see slabs, then they should not be
>> adjusted. I don't see it as a coincidence but as intention to keep it slab
>> specific. It will be also more future proof for the upcoming separation of
>> struct slab from struct page.
>
> Then we're intentionally using (folio->memcg_data & MEMCG_DATA_OBJEXTS) check
> as a way to determine whether the folio is a slab (either slabobj_ext array
> allocation succeeded or failed).
>
> That makes sense to me!
>
>> >> > ---
>> >> > v3: Simplify the solution, per Harry's suggestion in the v1 comments
>> >> > Add Suggested-by: Harry Yoo <harry.yoo@oracle.com>
>> >> > ---
>> >> > include/linux/memcontrol.h | 4 +++-
>> >> > 1 file changed, 3 insertions(+), 1 deletion(-)
>> >> >
>> >> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>> >> > index 873e510d6f8d..7ed15f858dc4 100644
>> >> > --- a/include/linux/memcontrol.h
>> >> > +++ b/include/linux/memcontrol.h
>> >> > @@ -534,7 +534,9 @@ static inline struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *ob
>> >> > static inline bool folio_memcg_kmem(struct folio *folio)
>> >> > {
>> >> > VM_BUG_ON_PGFLAGS(PageTail(&folio->page), &folio->page);
>> >> > - VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJEXTS, folio);
>> >> > + VM_BUG_ON_FOLIO((folio->memcg_data != OBJEXTS_ALLOC_FAIL) &&
>> >> > + (folio->memcg_data & MEMCG_DATA_OBJEXTS),
>> >> > + folio);
>> >> > return folio->memcg_data & MEMCG_DATA_KMEM;
>> >> > }
>> >> >
>> >> > --
>> >> > 2.25.1
>> >
>>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
2025-10-15 10:37 ` Vlastimil Babka
@ 2025-10-15 11:22 ` Hao Ge
2025-10-15 11:40 ` Harry Yoo
0 siblings, 1 reply; 12+ messages in thread
From: Hao Ge @ 2025-10-15 11:22 UTC (permalink / raw)
To: Vlastimil Babka, Harry Yoo
Cc: Suren Baghdasaryan, Alexei Starovoitov, Andrew Morton,
Johannes Weiner, Shakeel Butt, Michal Hocko, Roman Gushchin,
Muchun Song, cgroups, linux-mm, linux-kernel, Hao Ge
Hi Vlastimil and Harry
Thank you for your professional guidance.
On 2025/10/15 18:37, Vlastimil Babka wrote:
> On 10/15/25 12:27, Harry Yoo wrote:
>> On Wed, Oct 15, 2025 at 11:54:18AM +0200, Vlastimil Babka wrote:
>>> On 10/15/25 11:25, Harry Yoo wrote:
>>>> On Tue, Oct 14, 2025 at 09:12:43AM -0700, Suren Baghdasaryan wrote:
>>>>> On Tue, Oct 14, 2025 at 8:28 AM Hao Ge <hao.ge@linux.dev> wrote:
>>>>>> From: Hao Ge <gehao@kylinos.cn>
>>>>>>
>>>>>> Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
>>>>>> the same bit position, we cannot determine whether memcg_data still
>>>>>> points to the slabobj_ext vector simply by checking
>>>>>> folio->memcg_data & MEMCG_DATA_OBJEXTS.
>>>>>>
>>>>>> If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
>>>>>> and during the release of the associated folio, the BUG check is triggered
>>>>>> because it was mistakenly assumed that a valid folio->memcg_data
>>>>>> was not cleared before freeing the folio.
>>>> nit: yesterday I was confused that this is sanity checks in buddy complaining
>>>> folio->memcg_data not being cleared, but it's actually folio_memcg_kmem()
>>>> complaining that MEMCG_OBJEXTS_DATA flag is set on non-slab folios (in
>>>> free_pages_prepare(), if PageMemcgKmem(page) -> __memcg_kmem_uncharge_page()))
>>>> So the paragraph above should be updated?
Hi Harry
We don't need to update the paragraph.
We did have cgroups running at that time, but they had no connection to
this page.
The entry "[ 7108.343500] memcg:1" can also be seen in the v1 logs,
Therefore, the situation at that time was indeed consistent with what I
described above.
As discussed below, this only occurs because the OBJEXTS_ALLOC_FAIL flag
was not cleared when the slab was about to be freed.
Or have I missed anything?
>>>>
>>>> And as a side question, we clear slab->obj_exts when freeing obj_ext array,
>>>> but don't clear OBJEXTS_ALLOC_FAIL when freeing a slab? That's not good.
>>> Hm great point. We should rather make sure it's cleared always, instead of
>>> adjusting the debugging check, which shouldn't be then necessary, right?
>> Yeah folio_memcg_kmem() isn't supposed to be called on slabs anyway
>> (it's not a slab at the time we free it to buddy), so we don't have to
>> adjust the debug check.
> Great. Hao Ge, can you please send v4 that instead of adjusting the
> VM_BUG_ON modifies free_slab_obj_exts() to always clear slab->obj_exts? Thanks!
Okay, I will send v4 as soon as possible.
Thanks
Best Regards
Hao
>
>>>>>> So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
>>>>>>
>>>>>> Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
>>>>>> Suggested-by: Harry Yoo <harry.yoo@oracle.com>
>>>>>> Signed-off-by: Hao Ge <gehao@kylinos.cn>
>>>>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>>>>>
>>>>> nit: I think it would be helpful if the changelog explained why we
>>>>> need the additional check. We can have the same bit set in two
>>>>> different situations:
>>>>> 1. object extension vector allocation failure;
>>>>> 2. memcg_data pointing to a valid mem_cgroup.
>>>>> To distinguish between them, we need to check not only the bit itself
>>>>> but also the rest of this field. If the rest is NULL, we have case 1,
>>>>> otherwise case 2.
>>>> Agreed.
>>>>
>>>> In general LGTM,
>>>> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>>>>
>>>> By the way, maybe it'd be nice to introduce a new helper function that
>>>> properly checks MEMCG_DATA_OBJEXTS flag.
>>> I thought so too at first...
>>>
>>>>> ~/slab (slab/for-next-fixes)> git grep -n MEMCG_DATA_OBJEXTS
>>>>> include/linux/memcontrol.h:337: MEMCG_DATA_OBJEXTS = (1UL << 0),
>>>>> include/linux/memcontrol.h:344:#define __OBJEXTS_ALLOC_FAIL MEMCG_DATA_OBJEXTS
>>>>> include/linux/memcontrol.h:358: * MEMCG_DATA_OBJEXTS.
>>>>> include/linux/memcontrol.h:400: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
>>>>> include/linux/memcontrol.h:421: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
>>>> these two,
>>>>
>>>>> include/linux/memcontrol.h:492: if (memcg_data & MEMCG_DATA_OBJEXTS)
>>>> this,
>>>>
>>>>> include/linux/memcontrol.h:538: (folio->memcg_data & MEMCG_DATA_OBJEXTS),
>>>>> include/linux/memcontrol.h:1491: * if MEMCG_DATA_OBJEXTS is set.
>>>>> mm/kfence/core.c:624: MEMCG_DATA_OBJEXTS;
>>>>> mm/page_owner.c:513: if (memcg_data & MEMCG_DATA_OBJEXTS)
>>>> this,
>>>>
>>>>> mm/slab.h:541: * MEMCG_DATA_OBJEXTS bit set or be equal to OBJEXTS_ALLOC_FAIL.
>>>>> mm/slab.h:543: VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS) &&
>>>>> mm/slub.c:2137: new_exts |= MEMCG_DATA_OBJEXTS;
>>>>> tools/mm/show_page_info.py:55: MEMCG_DATA_OBJEXTS = prog.constant("MEMCG_DATA_OBJEXTS").value_()
>>>>> tools/mm/show_page_info.py:59: if memcg_data & MEMCG_DATA_OBJEXTS:
>>>> and this do not look good.
>>>>
>>>> I mean technically they are fine since OBJEXTS_ALLOC_FAIL is set on
>>>> slabs only, but that's just a coincidence.
>>> And checked the the other debugging checks too. But then thought it's better
>>> that if these are not expected to see slabs, then they should not be
>>> adjusted. I don't see it as a coincidence but as intention to keep it slab
>>> specific. It will be also more future proof for the upcoming separation of
>>> struct slab from struct page.
>> Then we're intentionally using (folio->memcg_data & MEMCG_DATA_OBJEXTS) check
>> as a way to determine whether the folio is a slab (either slabobj_ext array
>> allocation succeeded or failed).
>>
>> That makes sense to me!
>>
>>>>>> ---
>>>>>> v3: Simplify the solution, per Harry's suggestion in the v1 comments
>>>>>> Add Suggested-by: Harry Yoo <harry.yoo@oracle.com>
>>>>>> ---
>>>>>> include/linux/memcontrol.h | 4 +++-
>>>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>>>>>> index 873e510d6f8d..7ed15f858dc4 100644
>>>>>> --- a/include/linux/memcontrol.h
>>>>>> +++ b/include/linux/memcontrol.h
>>>>>> @@ -534,7 +534,9 @@ static inline struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *ob
>>>>>> static inline bool folio_memcg_kmem(struct folio *folio)
>>>>>> {
>>>>>> VM_BUG_ON_PGFLAGS(PageTail(&folio->page), &folio->page);
>>>>>> - VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJEXTS, folio);
>>>>>> + VM_BUG_ON_FOLIO((folio->memcg_data != OBJEXTS_ALLOC_FAIL) &&
>>>>>> + (folio->memcg_data & MEMCG_DATA_OBJEXTS),
>>>>>> + folio);
>>>>>> return folio->memcg_data & MEMCG_DATA_KMEM;
>>>>>> }
>>>>>>
>>>>>> --
>>>>>> 2.25.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem
2025-10-15 11:22 ` Hao Ge
@ 2025-10-15 11:40 ` Harry Yoo
0 siblings, 0 replies; 12+ messages in thread
From: Harry Yoo @ 2025-10-15 11:40 UTC (permalink / raw)
To: Hao Ge
Cc: Vlastimil Babka, Suren Baghdasaryan, Alexei Starovoitov,
Andrew Morton, Johannes Weiner, Shakeel Butt, Michal Hocko,
Roman Gushchin, Muchun Song, cgroups, linux-mm, linux-kernel,
Hao Ge
On Wed, Oct 15, 2025 at 07:22:59PM +0800, Hao Ge wrote:
> Hi Vlastimil and Harry
>
> Thank you for your professional guidance.
>
> On 2025/10/15 18:37, Vlastimil Babka wrote:
> > On 10/15/25 12:27, Harry Yoo wrote:
> > > On Wed, Oct 15, 2025 at 11:54:18AM +0200, Vlastimil Babka wrote:
> > > > On 10/15/25 11:25, Harry Yoo wrote:
> > > > > On Tue, Oct 14, 2025 at 09:12:43AM -0700, Suren Baghdasaryan wrote:
> > > > > > On Tue, Oct 14, 2025 at 8:28 AM Hao Ge <hao.ge@linux.dev> wrote:
> > > > > > > From: Hao Ge <gehao@kylinos.cn>
> > > > > > >
> > > > > > > Since OBJEXTS_ALLOC_FAIL and MEMCG_DATA_OBJEXTS currently share
> > > > > > > the same bit position, we cannot determine whether memcg_data still
> > > > > > > points to the slabobj_ext vector simply by checking
> > > > > > > folio->memcg_data & MEMCG_DATA_OBJEXTS.
> > > > > > >
> > > > > > > If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
> > > > > > > and during the release of the associated folio, the BUG check is triggered
> > > > > > > because it was mistakenly assumed that a valid folio->memcg_data
> > > > > > > was not cleared before freeing the folio.
> > > > > nit: yesterday I was confused that this is sanity checks in buddy complaining
> > > > > folio->memcg_data not being cleared, but it's actually folio_memcg_kmem()
> > > > > complaining that MEMCG_OBJEXTS_DATA flag is set on non-slab folios (in
> > > > > free_pages_prepare(), if PageMemcgKmem(page) -> __memcg_kmem_uncharge_page()))
> > > > > So the paragraph above should be updated?
>
> Hi Harry
>
> We don't need to update the paragraph.
>
> We did have cgroups running at that time, but they had no connection to this
> page.
>
> The entry "[ 7108.343500] memcg:1" can also be seen in the v1 logs,
>
> Therefore, the situation at that time was indeed consistent with what I
> described above.
>
> As discussed below, this only occurs because the OBJEXTS_ALLOC_FAIL flag
>
> was not cleared when the slab was about to be freed.
>
> Or have I missed anything?
Oh, I meant "the BUG check is triggered because it was mistakenly
assumed that a valid folio->memcg_data was not cleared before freeing
the folio" is misleading.
Not clearing folio->memcg_data before freeing is considered an error, and
page_expected_state() indeed checks if ->memcg_data is cleared and reports
an error if it's not cleared. But that's not what we're talking about, right?
Instead, what we're looking at is that "the BUG is triggered because
the OBJEXTS_ALLOC_FAIL flag was not cleared, causing it to be
interpreted as a kmem folio (non-slab) with MEMCG_OBJEXTS_DATA flag set,
which is invalid because MEMCG_OBJEXTS_DATA is supposed to be set only
on slabs."
> > > > > And as a side question, we clear slab->obj_exts when freeing obj_ext array,
> > > > > but don't clear OBJEXTS_ALLOC_FAIL when freeing a slab? That's not good.
> > > > Hm great point. We should rather make sure it's cleared always, instead of
> > > > adjusting the debugging check, which shouldn't be then necessary, right?
> > > Yeah folio_memcg_kmem() isn't supposed to be called on slabs anyway
> > > (it's not a slab at the time we free it to buddy), so we don't have to
> > > adjust the debug check.
> > Great. Hao Ge, can you please send v4 that instead of adjusting the
> > VM_BUG_ON modifies free_slab_obj_exts() to always clear slab->obj_exts? Thanks!
>
>
> Okay, I will send v4 as soon as possible.
>
>
> Thanks
>
> Best Regards
>
> Hao
>
> >
> > > > > > > So let's check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem.
> > > > > > >
> > > > > > > Fixes: 7612833192d5 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
> > > > > > > Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> > > > > > > Signed-off-by: Hao Ge <gehao@kylinos.cn>
> > > > > > Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> > > > > >
> > > > > > nit: I think it would be helpful if the changelog explained why we
> > > > > > need the additional check. We can have the same bit set in two
> > > > > > different situations:
> > > > > > 1. object extension vector allocation failure;
> > > > > > 2. memcg_data pointing to a valid mem_cgroup.
> > > > > > To distinguish between them, we need to check not only the bit itself
> > > > > > but also the rest of this field. If the rest is NULL, we have case 1,
> > > > > > otherwise case 2.
> > > > > Agreed.
> > > > >
> > > > > In general LGTM,
> > > > > Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
> > > > >
> > > > > By the way, maybe it'd be nice to introduce a new helper function that
> > > > > properly checks MEMCG_DATA_OBJEXTS flag.
> > > > I thought so too at first...
> > > >
> > > > > > ~/slab (slab/for-next-fixes)> git grep -n MEMCG_DATA_OBJEXTS
> > > > > > include/linux/memcontrol.h:337: MEMCG_DATA_OBJEXTS = (1UL << 0),
> > > > > > include/linux/memcontrol.h:344:#define __OBJEXTS_ALLOC_FAIL MEMCG_DATA_OBJEXTS
> > > > > > include/linux/memcontrol.h:358: * MEMCG_DATA_OBJEXTS.
> > > > > > include/linux/memcontrol.h:400: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
> > > > > > include/linux/memcontrol.h:421: VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio);
> > > > > these two,
> > > > >
> > > > > > include/linux/memcontrol.h:492: if (memcg_data & MEMCG_DATA_OBJEXTS)
> > > > > this,
> > > > >
> > > > > > include/linux/memcontrol.h:538: (folio->memcg_data & MEMCG_DATA_OBJEXTS),
> > > > > > include/linux/memcontrol.h:1491: * if MEMCG_DATA_OBJEXTS is set.
> > > > > > mm/kfence/core.c:624: MEMCG_DATA_OBJEXTS;
> > > > > > mm/page_owner.c:513: if (memcg_data & MEMCG_DATA_OBJEXTS)
> > > > > this,
> > > > >
> > > > > > mm/slab.h:541: * MEMCG_DATA_OBJEXTS bit set or be equal to OBJEXTS_ALLOC_FAIL.
> > > > > > mm/slab.h:543: VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS) &&
> > > > > > mm/slub.c:2137: new_exts |= MEMCG_DATA_OBJEXTS;
> > > > > > tools/mm/show_page_info.py:55: MEMCG_DATA_OBJEXTS = prog.constant("MEMCG_DATA_OBJEXTS").value_()
> > > > > > tools/mm/show_page_info.py:59: if memcg_data & MEMCG_DATA_OBJEXTS:
> > > > > and this do not look good.
> > > > >
> > > > > I mean technically they are fine since OBJEXTS_ALLOC_FAIL is set on
> > > > > slabs only, but that's just a coincidence.
> > > > And checked the the other debugging checks too. But then thought it's better
> > > > that if these are not expected to see slabs, then they should not be
> > > > adjusted. I don't see it as a coincidence but as intention to keep it slab
> > > > specific. It will be also more future proof for the upcoming separation of
> > > > struct slab from struct page.
> > > Then we're intentionally using (folio->memcg_data & MEMCG_DATA_OBJEXTS) check
> > > as a way to determine whether the folio is a slab (either slabobj_ext array
> > > allocation succeeded or failed).
> > >
> > > That makes sense to me!
> > >
> > > > > > > ---
> > > > > > > v3: Simplify the solution, per Harry's suggestion in the v1 comments
> > > > > > > Add Suggested-by: Harry Yoo <harry.yoo@oracle.com>
> > > > > > > ---
> > > > > > > include/linux/memcontrol.h | 4 +++-
> > > > > > > 1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > > > > > > index 873e510d6f8d..7ed15f858dc4 100644
> > > > > > > --- a/include/linux/memcontrol.h
> > > > > > > +++ b/include/linux/memcontrol.h
> > > > > > > @@ -534,7 +534,9 @@ static inline struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *ob
> > > > > > > static inline bool folio_memcg_kmem(struct folio *folio)
> > > > > > > {
> > > > > > > VM_BUG_ON_PGFLAGS(PageTail(&folio->page), &folio->page);
> > > > > > > - VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJEXTS, folio);
> > > > > > > + VM_BUG_ON_FOLIO((folio->memcg_data != OBJEXTS_ALLOC_FAIL) &&
> > > > > > > + (folio->memcg_data & MEMCG_DATA_OBJEXTS),
> > > > > > > + folio);
> > > > > > > return folio->memcg_data & MEMCG_DATA_KMEM;
> > > > > > > }
> > > > > > >
> > > > > > > --
> > > > > > > 2.25.1
>
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-10-15 11:41 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-10-14 15:27 [PATCH v3] slab: Add check for memcg_data != OBJEXTS_ALLOC_FAIL in folio_memcg_kmem Hao Ge
2025-10-14 16:12 ` Suren Baghdasaryan
2025-10-14 20:14 ` Shakeel Butt
2025-10-14 20:58 ` Vlastimil Babka
2025-10-14 22:40 ` Shakeel Butt
2025-10-15 9:25 ` Harry Yoo
2025-10-15 9:54 ` Vlastimil Babka
2025-10-15 10:27 ` Harry Yoo
2025-10-15 10:37 ` Vlastimil Babka
2025-10-15 11:22 ` Hao Ge
2025-10-15 11:40 ` Harry Yoo
2025-10-14 21:13 ` Roman Gushchin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox