From: Vlastimil Babka <vbabka@suse.cz>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: "Christoph Lameter (Ampere)" <cl@gentwo.org>,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
Suren Baghdasaryan <surenb@google.com>,
Kent Overstreet <kent.overstreet@linux.dev>,
Andrey Ryabinin <ryabinin.a.a@gmail.com>,
Alexander Potapenko <glider@google.com>,
Andrey Konovalov <andreyknvl@gmail.com>,
Dmitry Vyukov <dvyukov@google.com>,
Yeoreum Yun <yeoreum.yun@arm.com>,
Vincenzo Frascino <vincenzo.frascino@arm.com>,
linux-mm@kvack.org
Subject: Re: [RFC PATCH] mm/slab: save memory by allocating slabobj_ext array from leftover
Date: Fri, 8 Aug 2025 16:44:32 +0200 [thread overview]
Message-ID: <c05d3ff6-d668-4317-a5ac-c8943f1bb016@suse.cz> (raw)
In-Reply-To: <aJHxsvU-77D3pfhJ@hyeyoo>
On 8/5/25 13:57, Harry Yoo wrote:
> On Thu, Jun 19, 2025 at 09:56:59AM +0200, Vlastimil Babka wrote:
>> On 6/13/25 19:47, Harry Yoo wrote:
>> > On Fri, Jun 13, 2025 at 09:04:34AM -0700, Christoph Lameter (Ampere) wrote:
>> >> On Fri, 13 Jun 2025, Harry Yoo wrote:
>> >>
>> >> > Allocate slabobj_exts array from this unused space instead of using
>> >> > kcalloc(), when it is large enough.
>> >>
>> >> How does slab debug work in this case? The object layout gets a bit
>> >> complicated with other metadata there as well.
>> >
>> > Oh, the 'leftover' space I mentioned the cover letter refers to the
>> > wasted space after the last object in a slab, not unused bytes within
>> > objects.
>> >
>> > There is no per-object metadata stored there and SLUB simply poisons the area.
>> > I taught slab_pad_check() to skip checking the slabobj_exts array.
>
> Apologies for the late reply. I was sidetracked with multiple things :(
>
> This is definitely worth optimizing, so let me make some progress
> even if it's a bit slow.
>
>> I can imagine going further with this in case where leftover space in slab
>> isn't enough.
>
> Right.
>
>> - indeed use per-object padding to store only single object's slabobj_ext,
>
> I think the most conservative approach is to not increase object_size
> but use wasted area when ALIGN(size, align) is bigger than object_size.
>
> A good candidate for that is xfs inode cache.
Yeah that's what I meant.
>> if it doesn't lead to memory waste
>
> You mean increasing object size but without decreasing the number of
> objects per slab?
Just the ALIGN thing. It we can increase size without decreasing objects it
means we can also just use leftover space for the array, IIUC?
> ...or (maybe) reducing the number of objects but without increasing
> the size of the remainder (same as calculated in calc_slab_order())?
>
>> - if not possible, but object size is small enough so there are many per
>> slab, maybe have one less object per slab to store the array?
>
> If object size is small the array likely does not fit in one object...
Hmm perhaps.
>> - once we have struct slab decoupled from struct page, it could be part of
>> struct slab directly (but it would mean struct slab isn't fixed size)
>
> It can be tried, but variable struct slab size may or may not work.
> That'll depend on the implementation details of how we allocate struct slab
> in the future.
>
>> Of course having multiple variants would risk slower code, so fast paths
>> should not be affected
>
> I agree that affecting fastpath is not great.
So we'll have to measure it.
>> we could have pointer to the 0th slabobj_ext (we
>> already have) and now also stride (to support the "per-object padding case"
>
>> - there's still space in struct slab right?)
>
> which space are you referring to, maybe lower 16 bits of page_type?
Ugh I thought counters used only 32bit of 64... so the space exists only for
64bit kernels? It would be fine to limit the optimization to those only.
>> and then the object alloc/free
>> case could be oblivious to the storage method, with just a bit more
>> arithmetic (stride). Slab folio alloc/free would be more complicated but are
>> not fath path.
>
> So it would be something like (please correct if I misunderstood):
>
> index = obj_to_index(s, slab, object)
> (the struct slabobj_ext pointer for the object at given index)
> == slab->obj_exts + stride * index
Yeah.
>
> slab->obj_exts, stride are determined depending on the case:
>
> - In the normal case (the array is allocated from kmalloc caches),
> stride = sizeof(struct slabobj_ext)
> slab->obj_exts = (the address of the buffer allocated from kmalloc)
>
> - In "the obj_exts array is stored in the leftover space" case,
> stride = sizeof(struct slabobj_ext)
> slab->obj_exts = (the start address of the leftover space)
>
> - In "per-object padding" case,
> stride = s->size
> slab->obj_exts = slab_address(slab) + s->red_left_pad +
> (offset of slabobj_ext);
>
> Ok, I think it will work. Great idea!
Yeah.
>
>> Also some variants would be wasteful if they need to be decided upfront (the
>> 2nd and 3rd above) and then the array is unused
>
> Right.
>
>> so would be only applicable
>> with SLAB_ACCOUNT caches (if kmemcg is active) or when memalloc profiling is
>> active.
>
> Right.
>
>> Shouldn't be a big issue as ad-hoc __GFP_ACCOUNT is handled by
>> different cache selection for kmalloc() and I don't know if anyone is
>> actually doing ad-hoc __GFP_ACCOUNT on named caches.
>
> A while ago I was thinking of getting rid of ad-hoc __GFP_ACCOUNT usage
> for slab allocations, but at least xarray cache appears to use it in
> an ad-hoc manner. (See xas_nomem()).
>
> In that case, not all allocations to the same cache has
> __GFP_ACCOUNT flag set.
Hmm bummer.
next prev parent reply other threads:[~2025-08-08 14:44 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-13 6:33 Harry Yoo
2025-06-13 7:11 ` Harry Yoo
2025-06-13 11:42 ` Yeoreum Yun
2025-06-13 17:58 ` Harry Yoo
2025-06-13 16:04 ` Christoph Lameter (Ampere)
2025-06-13 17:47 ` Harry Yoo
2025-06-16 11:00 ` Harry Yoo
2025-06-19 7:56 ` Vlastimil Babka
2025-08-05 11:57 ` Harry Yoo
2025-08-08 14:44 ` Vlastimil Babka [this message]
2025-08-27 11:40 ` Harry Yoo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c05d3ff6-d668-4317-a5ac-c8943f1bb016@suse.cz \
--to=vbabka@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=andreyknvl@gmail.com \
--cc=cl@gentwo.org \
--cc=dvyukov@google.com \
--cc=glider@google.com \
--cc=hannes@cmpxchg.org \
--cc=harry.yoo@oracle.com \
--cc=kent.overstreet@linux.dev \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=muchun.song@linux.dev \
--cc=rientjes@google.com \
--cc=roman.gushchin@linux.dev \
--cc=ryabinin.a.a@gmail.com \
--cc=shakeel.butt@linux.dev \
--cc=surenb@google.com \
--cc=vincenzo.frascino@arm.com \
--cc=yeoreum.yun@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox