linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka@suse.cz>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: "Christoph Lameter (Ampere)" <cl@gentwo.org>,
	David Rientjes <rientjes@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Muchun Song <muchun.song@linux.dev>,
	Suren Baghdasaryan <surenb@google.com>,
	Kent Overstreet <kent.overstreet@linux.dev>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	Alexander Potapenko <glider@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Yeoreum Yun <yeoreum.yun@arm.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	linux-mm@kvack.org
Subject: Re: [RFC PATCH] mm/slab: save memory by allocating slabobj_ext array from leftover
Date: Fri, 8 Aug 2025 16:44:32 +0200	[thread overview]
Message-ID: <c05d3ff6-d668-4317-a5ac-c8943f1bb016@suse.cz> (raw)
In-Reply-To: <aJHxsvU-77D3pfhJ@hyeyoo>

On 8/5/25 13:57, Harry Yoo wrote:
> On Thu, Jun 19, 2025 at 09:56:59AM +0200, Vlastimil Babka wrote:
>> On 6/13/25 19:47, Harry Yoo wrote:
>> > On Fri, Jun 13, 2025 at 09:04:34AM -0700, Christoph Lameter (Ampere) wrote:
>> >> On Fri, 13 Jun 2025, Harry Yoo wrote:
>> >> 
>> >> > Allocate slabobj_exts array from this unused space instead of using
>> >> > kcalloc(), when it is large enough.
>> >> 
>> >> How does slab debug work in this case? The object layout gets a bit
>> >> complicated with other metadata there as well.
>> > 
>> > Oh, the 'leftover' space I mentioned the cover letter refers to the
>> > wasted space after the last object in a slab, not unused bytes within
>> > objects.
>> > 
>> > There is no per-object metadata stored there and SLUB simply poisons the area.
>> > I taught slab_pad_check() to skip checking the slabobj_exts array.
> 
> Apologies for the late reply. I was sidetracked with multiple things :(
> 
> This is definitely worth optimizing, so let me make some progress
> even if it's a bit slow.
> 
>> I can imagine going further with this in case where leftover space in slab
>> isn't enough.
> 
> Right.
> 
>> - indeed use per-object padding to store only single object's slabobj_ext,
> 
> I think the most conservative approach is to not increase object_size
> but use wasted area when ALIGN(size, align) is bigger than object_size.
> 
> A good candidate for that is xfs inode cache.

Yeah that's what I meant.

>> if it doesn't lead to memory waste
> 
> You mean increasing object size but without decreasing the number of
> objects per slab?

Just the ALIGN thing. It we can increase size without decreasing objects it
means we can also just use leftover space for the array, IIUC?

> ...or (maybe) reducing the number of objects but without increasing
> the size of the remainder (same as calculated in calc_slab_order())?
> 
>> - if not possible, but object size is small enough so there are many per
>> slab, maybe have one less object per slab to store the array?
> 
> If object size is small the array likely does not fit in one object...

Hmm perhaps.

>> - once we have struct slab decoupled from struct page, it could be part of
>> struct slab directly (but it would mean struct slab isn't fixed size)
> 
> It can be tried, but variable struct slab size may or may not work.
> That'll depend on the implementation details of how we allocate struct slab
> in the future.
> 
>> Of course having multiple variants would risk slower code, so fast paths
>> should not be affected 
> 
> I agree that affecting fastpath is not great.

So we'll have to measure it.

>> we could have pointer to the 0th slabobj_ext (we
>> already have) and now also stride (to support the "per-object padding case"
> 
>> - there's still space in struct slab right?)
> 
> which space are you referring to, maybe lower 16 bits of page_type?

Ugh I thought counters used only 32bit of 64... so the space exists only for
64bit kernels? It would be fine to limit the optimization to those only.

>> and then the object alloc/free
>> case could be oblivious to the storage method, with just a bit more
>> arithmetic (stride). Slab folio alloc/free would be more complicated but are
>> not fath path.
> 
> So it would be something like (please correct if I misunderstood):
> 
> index = obj_to_index(s, slab, object)
> (the struct slabobj_ext pointer for the object at given index)
>   == slab->obj_exts + stride * index

Yeah.

> 
> slab->obj_exts, stride are determined depending on the case:
> 
> - In the normal case (the array is allocated from kmalloc caches),
>   stride = sizeof(struct slabobj_ext)
>   slab->obj_exts = (the address of the buffer allocated from kmalloc)
> 
> - In "the obj_exts array is stored in the leftover space" case,
>   stride = sizeof(struct slabobj_ext)
>   slab->obj_exts = (the start address of the leftover space)
> 
> - In "per-object padding" case,
>   stride = s->size
>   slab->obj_exts = slab_address(slab) + s->red_left_pad +
>                        (offset of slabobj_ext);
> 
> Ok, I think it will work. Great idea!

Yeah.

> 
>> Also some variants would be wasteful if they need to be decided upfront (the
>> 2nd and 3rd above) and then the array is unused
> 
> Right.
> 
>> so would be only applicable
>> with SLAB_ACCOUNT caches (if kmemcg is active) or when memalloc profiling is
>> active.
> 
> Right.
> 
>> Shouldn't be a big issue as ad-hoc __GFP_ACCOUNT is handled by
>> different cache selection for kmalloc() and I don't know if anyone is
>> actually doing ad-hoc __GFP_ACCOUNT on named caches.
> 
> A while ago I was thinking of getting rid of ad-hoc __GFP_ACCOUNT usage
> for slab allocations, but at least xarray cache appears to use it in
> an ad-hoc manner. (See xas_nomem()).
> 
> In that case, not all allocations to the same cache has
> __GFP_ACCOUNT flag set.

Hmm bummer.



  reply	other threads:[~2025-08-08 14:44 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-13  6:33 Harry Yoo
2025-06-13  7:11 ` Harry Yoo
2025-06-13 11:42 ` Yeoreum Yun
2025-06-13 17:58   ` Harry Yoo
2025-06-13 16:04 ` Christoph Lameter (Ampere)
2025-06-13 17:47   ` Harry Yoo
2025-06-16 11:00     ` Harry Yoo
2025-06-19  7:56     ` Vlastimil Babka
2025-08-05 11:57       ` Harry Yoo
2025-08-08 14:44         ` Vlastimil Babka [this message]
2025-08-27 11:40           ` Harry Yoo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c05d3ff6-d668-4317-a5ac-c8943f1bb016@suse.cz \
    --to=vbabka@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=cl@gentwo.org \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=harry.yoo@oracle.com \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=ryabinin.a.a@gmail.com \
    --cc=shakeel.butt@linux.dev \
    --cc=surenb@google.com \
    --cc=vincenzo.frascino@arm.com \
    --cc=yeoreum.yun@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox