From: Harry Yoo <harry.yoo@oracle.com>
To: Pedro Falcato <pfalcato@suse.de>
Cc: Christoph Lameter <cl@linux.com>,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
"Tobin C. Harding" <tobin@kernel.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Matthew Wilcox <willy@infradead.org>,
Vlastimil Babka <vbabka@suse.cz>,
Dave Chinner <david@fromorbit.com>,
Rik van Riel <riel@surriel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Jann Horn <jannh@google.com>,
David Hildenbrand <david@redhat.com>,
Oscar Salvador <osalvador@suse.de>,
Michal Hocko <mhocko@kernel.org>,
Byungchul Park <byungchul@sk.com>,
linux-mm@kvack.org
Subject: Re: [DISCUSSION] Revisiting Slab Movable Objects
Date: Wed, 23 Apr 2025 08:17:34 +0900
Message-ID: <aAgjjjy8c9IeP6fm@harry>
In-Reply-To: <yemom5y6p66yq4rq76ot5uyrfp6u3oqlwb72ykg74zywytlo54@6ehom7fi7wlj>
On Mon, Apr 21, 2025 at 05:33:38PM +0100, Pedro Falcato wrote:
> On Mon, Apr 21, 2025 at 10:47:39PM +0900, Harry Yoo wrote:
> > Hi folks,
> >
>
> Hi Harry,
>
> Some passing thoughts...
Hi Pedro, thanks for taking a look.
> > As a long term project, I'm starting to look into resurrecting
> > Slab Movable Objects. The goal is to make certain types of slab memory
> > movable and thus enable targeted reclamation, migration, and
> > defragmentation.
> >
> > The main purpose of this posting is to briefly review what's been tried
> > in the past, ask people why prior efforts have stalled (due to lack of
> > time or insufficient justification for additional complexity?),
> > and discuss what's feasible today.
> >
> > Please add anyone I may have missed to Cc. :)
> >
> > Previous Work on Slab Movable Objects
> > =====================================
> >
> > Christoph Lameter, Slab Defragmentation Reduction, 2007-2017 (V16: [2]):
> > Christoph Lameter, Slab object migration for xarray, 2017-2018 (V2: [3]):
> > Christoph's long-standing effort (since 2007) aiming to defragment
> > slab memory in cases where sparsely populated slabs occupy excessive
> > amount of memory.
> >
> > Early versions of the work focused on defragmenting slab caches
> > for filesystem data structures such as inode, dentry, and buffer head.
> > updatedb was suggested as the standard way to generate
> > sparsely populated slabs on file servers.
> >
> > However, defragmenting slabs for filesystem data structures has proven
> > to be very difficult to fully solve, because inodes and dentries are
> > neither reclaimable nor migratable, limiting the effectiveness of
> > defragmentation.
> >
> > In late 2018, the effort was revived with a new focus on migrating
> > XArray nodes. However, it appears the work was discontinued after
> > V2 [3]?
> >
> > Tobin C. Harding, Slab Movable Objects, 2019 (First Non-RFC: [5])
> > - Tobin C. Harding revived Christoph's earlier work and introduced
> > a few enhancements, including partial shrinking of dentries, moving
> > objects to and from a specific NUMA node, and balancing objects across
> > all NUMA nodes.
> >
> > Also appears to be discontinued after the first non-RFC version [5]?
> >
> > At LSFMM 2017, Andrea Arcangeli suggested [6] virtually mapped slabs,
> > which might be useful since migrating them does not require changing the
> > address of objects. But as Rik van Riel pointed out at that time, it
> > isn't really useful for defragmentation. Andrea Arcangeli responded
> > that it can be beneficial for memory hotplug, compaction and out-of-memory
> > avoidance.
> >
> > The exact mechanism wasn't described in [6], but I assume it'll involve
> > 1) unmap a slab (and page faults after unmap need to wait for migration
> > to complete), 2) copy objects to a new slab, and 3) map the new slab?
> > But the idea hasn't gained enough attention for anyone to actually
> > implement it.
>
> I don't think this is a silver bullet. It opens a whole separate can of worms
> while maintaining similar issues. But instead of worrying about updating pointers,
> you're worrying about locking out _any_ sort of access, which would involve stop_machine().
Haha, yes. When I read the LWN article I was like "Wait, can the kernel
really synchronize access to slab objects while migrating the underlying
pages?" and I sketched a very rough 'mechanism' in my previous email
without carefully considering correctness or feasibility.

stop_machine() just to migrate slab objects sounds like a disaster.
> You can't even atomically replace a PTE without running into issues (arm requires BBM, thus
> this doesn't work) so it cannot be applied to anything that can't page fault (xarray and
> the maple tree are used in IRQ paths, if I'm not mistaken).
Yeah, while the idea is very simple, I can't think of any sane way to
correctly implement this given that slab objects can be accessed in
_any_ context.
> >
> > Potential Candidates of SMO
> > ===========================
> >
> > Basic Rules
> > -----------
> >
> > - Slab memory can only be reclaimed or migrated if the user of the slab
> > provides a way to isolate / migrate objects.
> > - If objects can be reclaimed, it makes sense to simply reclaim them
> > instead of migrating them (unless we know it's better to keep that
> > object in memory).
>
> In any case I think you want to give subsystems the power to decide between
> {RECLAIMED, MIGRATED, SKIPPED}.
Totally agreed.
> > - Some objects can't be reclaimed, but migrating them is (if possible)
> > still useful for defragmentation and compaction.
> > - However it is not always feasible
> >
> > Potential candidates include (but not limited to):
> > --------------------------------------------------
> >
> > - XArray nodes can be migrated (can't be reclaimed as they're being used)
> > - Can be reclaimed if it only includes shadow entries.
> > - Maple tree nodes (if without external locking) and VMAs can be migrated
> > and obviously can't be reclaimed.
> > - Negative dentry should be reclaimed, instead of being migrated.
> > - Only unused dentries can be reclaimed without high cost.
>
> Unused dentries can also have a high cost if they're accessed in the future
> (and the dentry LRU has some difficulty in... LRU'ing - thus the negative dentry problem).
If that's the case, the subsystem can decide to migrate the objects and
return MIGRATED?
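To make the {RECLAIMED, MIGRATED, SKIPPED} idea a bit more concrete, here is
a rough userspace sketch. All names in it (smo_result, smo_handle_fn,
smo_evacuate_slab, toy_dentry) are made up for illustration and are not an
existing or proposed kernel interface. The toy callback only reports a
decision; it does not actually free or copy anything.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: what the cache owner did with one object. */
enum smo_result {
	SMO_RECLAIMED,	/* object was freed */
	SMO_MIGRATED,	/* object was moved to a new location */
	SMO_SKIPPED,	/* object is busy or pinned; left in place */
};

/* Hypothetical per-cache callback supplied by the subsystem. */
typedef enum smo_result (*smo_handle_fn)(void *obj, void *private);

struct smo_stats {
	size_t reclaimed;
	size_t migrated;
	size_t skipped;
};

/*
 * Offer every object in one slab to the owner's callback and tally the
 * results. Returns 1 if nothing was skipped (the slab could be freed),
 * 0 otherwise.
 */
static int smo_evacuate_slab(void **objs, size_t nr, smo_handle_fn handle,
			     void *private, struct smo_stats *stats)
{
	size_t i, skipped = 0;

	assert(handle && stats);
	for (i = 0; i < nr; i++) {
		switch (handle(objs[i], private)) {
		case SMO_RECLAIMED:
			stats->reclaimed++;
			break;
		case SMO_MIGRATED:
			stats->migrated++;
			break;
		case SMO_SKIPPED:
			stats->skipped++;
			skipped++;
			break;
		}
	}
	return skipped == 0;
}

/* Toy stand-in for a dentry; the real structure is far more involved. */
struct toy_dentry {
	int refcount;	/* > 0 means somebody is using it */
	int negative;	/* nonzero means a negative entry */
};

/* Toy policy: drop unused negatives, move other unused ones, skip the rest. */
static enum smo_result toy_dentry_handle(void *obj, void *private)
{
	struct toy_dentry *d = obj;

	(void)private;
	if (d->refcount > 0)
		return SMO_SKIPPED;
	return d->negative ? SMO_RECLAIMED : SMO_MIGRATED;
}
```

The point is only that the policy lives in the subsystem's callback, and the
slab side just counts outcomes to see whether the slab ended up empty.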
> In any case, it would be interesting to see if the existing shrinker interface
> could be used for this stuff. We already best-effort-reclaim objects, maybe we
> could best-effort-migrate objects too? The problem of reclaiming is adjacent to
> migrating, and we already have infrastructure for it...
You mean best-effort migration of objects for defragmentation via the
shrinker interface?
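If so, I picture something shaped like the shrinker's scan budget
(nr_to_scan / nr_scanned in struct shrink_control). Below is a userspace
sketch of that shape only. defrag_control, try_migrate_fn, defrag_scan and
toy_try_migrate are hypothetical names, not an existing interface, and I'm
assuming a failed object is simply left alone rather than retried.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical control block, loosely modelled on the shrinker's budget. */
struct defrag_control {
	size_t nr_to_scan;	/* budget: how many objects we may look at */
	size_t nr_scanned;	/* how many objects we actually looked at */
};

/* Hypothetical: try to move one object; returns 1 if moved, 0 if busy. */
typedef int (*try_migrate_fn)(void *obj);

/*
 * Best effort: stop once the budget is used up and never retry an object
 * that could not be moved. Returns the number of objects moved.
 */
static size_t defrag_scan(void **objs, size_t nr, try_migrate_fn try_migrate,
			  struct defrag_control *dc)
{
	size_t i, moved = 0;

	assert(try_migrate && dc);
	for (i = 0; i < nr && dc->nr_scanned < dc->nr_to_scan; i++) {
		dc->nr_scanned++;
		if (try_migrate(objs[i]))
			moved++;
	}
	return moved;
}

/* Toy object: an int pin count; only unpinned objects can be moved. */
static int toy_try_migrate(void *obj)
{
	return *(int *)obj == 0;
}
```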
--
Cheers,
Harry / Hyeonggon