* [DISCUSSION] Revisiting Slab Movable Objects
@ 2025-04-21 13:47 Harry Yoo
From: Harry Yoo @ 2025-04-21 13:47 UTC (permalink / raw)
  To: Christoph Lameter, David Rientjes, Andrew Morton, Roman Gushchin,
	Tobin C. Harding, Alexander Viro, Matthew Wilcox,
	Vlastimil Babka, Dave Chinner, Rik van Riel, Andrea Arcangeli,
	Liam R. Howlett, Lorenzo Stoakes, Jann Horn, Pedro Falcato,
	David Hildenbrand, Oscar Salvador, Michal Hocko, Byungchul Park,
	linux-mm

Hi folks,

As a long-term project, I'm starting to look into resurrecting
Slab Movable Objects. The goal is to make certain types of slab memory
movable and thus enable targeted reclamation, migration, and
defragmentation.

The main purpose of this posting is to briefly review what's been tried
in the past, ask people why prior efforts have stalled (due to lack of
time or insufficient justification for additional complexity?),
and discuss what's feasible today.

Please add anyone I may have missed to Cc. :)

Previous Work on Slab Movable Objects
=====================================

Christoph Lameter, Slab Defragmentation Reduction, 2007-2017 (V16: [2]):
Christoph Lameter, Slab object migration for xarray, 2017-2018 (V2: [3]):
  Christoph's long-standing effort (since 2007) aiming to defragment
  slab memory in cases where sparsely populated slabs occupy an excessive
  amount of memory.

  Early versions of the work focused on defragmenting slab caches
  for filesystem data structures such as inode, dentry, and buffer head.
  updatedb was suggested as the standard way to generate sparsely
  populated slabs on file servers.

  However, defragmenting slabs for filesystem data structures has proven
  to be very difficult to fully solve, because inodes and dentries are
  neither reclaimable nor migratable, limiting the effectiveness of
  defragmentation.

  In late 2018, the effort was revived with a new focus on migrating
  XArray nodes. However, it appears the work was discontinued after
  V2 [3]?

Tobin C. Harding, Slab Movable Objects, 2019 (First Non-RFC: [5]):
  Tobin C. Harding revived Christoph's earlier work and introduced
  a few enhancements, including partial shrinking of dentries, moving
  objects to and from a specific NUMA node, and balancing objects across
  all NUMA nodes.

  This effort also appears to have been discontinued after the first
  non-RFC version [5]?

At LSFMM 2017, Andrea Arcangeli suggested [6] virtually mapped slabs,
which might be useful since migrating them does not require changing the
address of objects. But as Rik van Riel pointed out at that time, it
isn't really useful for defragmentation. Andrea Arcangeli responded
that it can be beneficial for memory hotplug, compaction and out-of-memory
avoidance.

The exact mechanism wasn't described in [6], but I assume it would involve
1) unmapping a slab (page faults after the unmap would need to wait for
migration to complete), 2) copying objects to a new slab, and 3) mapping
the new slab? The idea hasn't gained enough attention for anyone to
actually implement it, though.
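
To make that guess a bit more concrete, here is a minimal userspace model
of the three steps. It is only a sketch: struct vslab, vslab_init(),
vslab_deref() and vslab_migrate() are hypothetical names, the "mapping" is
modelled as a plain pointer indirection rather than real page tables, and
a fault during migration is modelled by returning NULL instead of waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define VSLAB_SIZE 4096	/* hypothetical slab size in bytes */

/* A "virtual" slab: users hold offsets, never the backing pointer. */
struct vslab {
	unsigned char *backing;	/* stands in for the physical page */
	bool mapped;		/* false while a migration is in flight */
};

static int vslab_init(struct vslab *s)
{
	s->backing = calloc(1, VSLAB_SIZE);
	s->mapped = s->backing != NULL;
	return s->mapped ? 0 : -1;
}

/*
 * Translate a stable offset into the current backing memory.
 * Returns NULL while unmapped; a real implementation would make the
 * faulting task wait for the migration to complete instead.
 */
static void *vslab_deref(struct vslab *s, size_t off)
{
	if (!s->mapped || off >= VSLAB_SIZE)
		return NULL;
	return s->backing + off;
}

/* Move the slab to new backing memory without changing any offset. */
static int vslab_migrate(struct vslab *s)
{
	unsigned char *new_backing = malloc(VSLAB_SIZE);

	if (!new_backing)
		return -1;
	s->mapped = false;				/* 1) unmap */
	memcpy(new_backing, s->backing, VSLAB_SIZE);	/* 2) copy */
	free(s->backing);
	s->backing = new_backing;
	s->mapped = true;				/* 3) map */
	return 0;
}
```

The point of the model is that the offset (standing in for the virtual
address) handed out before vslab_migrate() is still valid afterwards,
while the backing memory has changed underneath it.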

Potential Candidates of SMO
===========================

Basic Rules
-----------

- Slab memory can only be reclaimed or migrated if the user of the slab
  provides a way to isolate / migrate objects.
- If objects can be reclaimed, it makes sense to simply reclaim them
  instead of migrating them (unless we know it's better to keep that
  object in memory).
- Some objects can't be reclaimed, but migrating them is (if possible)
  still useful for defragmentation and compaction.
  - However, it is not always feasible.
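
The rules above could be sketched roughly as follows. This is a userspace
illustration under stated assumptions, not an existing kernel API: struct
smo_ops, smo_decide() and the demo_* callbacks are all hypothetical names
made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cache callbacks supplied by the user of the slab. */
struct smo_ops {
	bool (*isolate)(void *obj);	/* pin obj; false if it is busy */
	bool (*reclaimable)(void *obj);	/* can obj simply be dropped? */
	void *(*migrate)(void *obj);	/* move obj; NULL on failure */
};

enum smo_action { SMO_SKIP, SMO_RECLAIM, SMO_MIGRATE };

/* Decide what to do with one object, following the basic rules. */
static enum smo_action smo_decide(const struct smo_ops *ops, void *obj)
{
	/* Without user-provided callbacks nothing can be done. */
	if (!ops || !ops->isolate)
		return SMO_SKIP;
	if (!ops->reclaimable && !ops->migrate)
		return SMO_SKIP;
	if (!ops->isolate(obj))
		return SMO_SKIP;
	/* Prefer reclaim when the object can just be dropped. */
	if (ops->reclaimable && ops->reclaimable(obj))
		return SMO_RECLAIM;
	/* Otherwise migrate, if the user knows how to. */
	if (ops->migrate)
		return SMO_MIGRATE;
	/* Isolated but unusable: a real caller would undo the isolation. */
	return SMO_SKIP;
}

/* Toy object and callbacks, only here to exercise smo_decide(). */
struct demo_obj {
	bool busy;
	bool droppable;
};

static bool demo_isolate(void *obj)
{
	return !((struct demo_obj *)obj)->busy;
}

static bool demo_reclaimable(void *obj)
{
	return ((struct demo_obj *)obj)->droppable;
}

/* Stub: a real callback would copy obj and fix up its references. */
static void *demo_migrate(void *obj)
{
	return obj;
}

static const struct smo_ops demo_ops = {
	.isolate = demo_isolate,
	.reclaimable = demo_reclaimable,
	.migrate = demo_migrate,
};
```

With demo_ops, an idle droppable object yields SMO_RECLAIM, an idle
non-droppable one yields SMO_MIGRATE, and a busy one yields SMO_SKIP.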

Potential candidates include (but are not limited to):
------------------------------------------------------

- XArray nodes can be migrated (they can't be reclaimed while in use).
  - A node can be reclaimed if it contains only shadow entries.
- Maple tree nodes (when not using external locking) and VMAs can be
  migrated, and obviously can't be reclaimed.
- Negative dentries should be reclaimed instead of being migrated.
- Only unused dentries can be reclaimed without high cost.
  - Dentries with nonzero refcount are not really relocatable? (per [1])
- Even unused inodes can be neither reclaimed nor relocated due to
  external references? (per [4])

Al Viro made it clear [1] that inodes / dentries are not really
relocatable. He also mentioned:
> So from the correctness POV
> 	* you can kick out everything with zero refcount not
> on shrink lists.
> 	* you _might_ try shrink_dcache_parent() on directory
> dentries, in hope to drive their refcount to zero.  However,
> that's almost certainly going to hit too hard and be too costly.
> 	* d_invalidate() is no-go; if anything, you want something
> weaker than shrink_dcache_parent(), not stronger.
> 
> For anything beyond "just kick out everything in that page that
> happens to have zero refcount" I would really like to see the
> stats - how much does it help, how costly it is _and_ how much
> of the cache does it throw away (see above re running into a root
> dentry of some filesystem and essentially trimming dcache for
> that fs down to the unevictable stuff).
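
As a rough illustration of the first point in the quote ("kick out
everything in that page that happens to have zero refcount" and is not on
a shrink list), here is a small userspace model. struct model_dentry,
evict_unreferenced() and slab_is_empty() are hypothetical names; the real
dcache has locking and list handling that this sketch ignores entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry; only the fields that matter. */
struct model_dentry {
	bool in_use;		/* slot holds a live object */
	unsigned int refcount;
	bool on_shrink_list;	/* already owned by another shrinker */
};

/*
 * Evict every object in one slab page that has zero refcount and is
 * not on a shrink list. Returns the number of objects evicted.
 */
static size_t evict_unreferenced(struct model_dentry *slab, size_t nr)
{
	size_t evicted = 0;

	for (size_t i = 0; i < nr; i++) {
		struct model_dentry *d = &slab[i];

		if (!d->in_use || d->refcount != 0 || d->on_shrink_list)
			continue;
		d->in_use = false;
		evicted++;
	}
	return evicted;
}

/* The page can be given back only if no slot is still in use. */
static bool slab_is_empty(const struct model_dentry *slab, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (slab[i].in_use)
			return false;
	return true;
}
```

Even in this toy model, a single referenced object is enough to keep the
whole page allocated, which is why the stats Al asks for matter.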

Dave Chinner mentioned [4] why it is hard to reclaim or migrate (in a
targeted manner) even inodes with no active references:
> On Wed, Dec 27, 2017 at 04:06:36PM -0600, Christoph Lameter wrote:
> > This is a patchset on top of Matthew Wilcox Xarray code and implements
> > object migration of xarray nodes. The migration is integrated into
> > the defragmetation and shrinking logic of the slab allocator.
> .....
> > This is only possible for xarray for now but it would be worthwhile
> > to extend this to dentries and inodes.
> 
> Christoph, you keep saying this is the goal, but I'm yet to see a
> solution proposed for the atomic replacement of all the pointers to
> an inode from external objects.  An inode that has no active
> references still has an awful lot of passive and internal references
> that need to be dealt with.
> 
> e.g. racing page operations accessing mapping->host, the inode in
> various lists (e.g. superblock inode list, writeback lists, etc),
> the inode lookup cache(s), backpointers from LSMs, fsnotify marks,
> crypto information, internal filesystem pointers (e.g. log items,
> journal handles, buffer references, etc) and so on. And each
> filesystem has a different set of passive references, too.
> 
> Oh, and I haven't even mentioned deadlocks yet, either. :P
> 
> IOWs, just saying "it would be worthwhile to extend this to dentries
> and inodes" completely misrepresents the sheer complexity of doing
> so. We've known that atomic replacement is the big problem for
> defragging inodes and dentries since this work was started, what,
> more than 10 years? And while there's been many revisions of the
> core defrag code since then, there has been no credible solution
> presented for atomic replacement of objects with complex external
> references. This is a show-stopper for inode/dentry slab defrag, and
> I don't see that this new patchset is any different...
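
To illustrate the "atomic replacement of all the pointers" problem in the
simplest possible terms, here is a single-threaded userspace sketch. The
names (struct model_inode, struct ref_registry, ref_register(),
relocate()) are hypothetical, and the registry itself is an assumption:
real inodes have no central list of who points at them, which is exactly
the difficulty described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct model_inode {
	unsigned long ino;
};

#define MAX_REFS 8

/* Hypothetical registry of every location that stores a pointer. */
struct ref_registry {
	struct model_inode **slots[MAX_REFS];
	size_t nr;
};

static int ref_register(struct ref_registry *r, struct model_inode **slot)
{
	if (r->nr == MAX_REFS)
		return -1;
	r->slots[r->nr++] = slot;
	return 0;
}

/*
 * Copy the object and repoint every *registered* reference. Any pointer
 * that was never registered keeps pointing at the old location. A real
 * implementation would also have to be atomic with respect to concurrent
 * readers, which this model ignores.
 */
static struct model_inode *relocate(struct ref_registry *r,
				    struct model_inode *old)
{
	struct model_inode *copy = malloc(sizeof(*copy));

	if (!copy)
		return NULL;
	memcpy(copy, old, sizeof(*copy));
	for (size_t i = 0; i < r->nr; i++)
		*r->slots[i] = copy;
	return copy;
}
```

A pointer that was registered follows the object to its new location; one
that was not (think of a backpointer the core code doesn't know about) is
left dangling once the old memory is freed.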

[1] https://lore.kernel.org/linux-mm/20190403190520.GW2217@ZenIV.linux.org.uk
[2] https://lore.kernel.org/linux-mm/20170307212429.044249411@linux.com
[3] https://marc.info/?l=linux-mm&m=154533371911133
[4] https://lore.kernel.org/linux-mm/20171228222419.GQ1871@rh
[5] https://lore.kernel.org/linux-mm/20190603042637.2018-1-tobin@kernel.org
[6] https://lwn.net/Articles/717650

-- 
Cheers,
Harry / Hyeonggon