Hi all,

We have been investigating reclaim performance on mobile systems under
memory pressure and noticed that slab shrinking often accounts for a
significant portion of reclaim time in both direct reclaim and kswapd contexts.
In some cases, shrink_slab() runs for a long time when multiple
shrinkers are active, which causes latency spikes and slows overall
reclaim progress.

To address this, we are considering moving slab shrinking into a
dedicated kernel thread. The intention is to decouple slab reclaim
from the direct reclaim and kswapd paths so that it can proceed
asynchronously under controlled conditions, such as system idle
periods or specific reclaim triggers.

Motivation:

- Reduce latency in direct reclaim paths by offloading potentially
  long-running slab reclaim work.
- Improve overall reclaim efficiency by scheduling slab shrinking
  separately from page reclaim.
- Allow more flexible control over when and how slab caches are aged
  or shrunk.

Proposed direction:

- Introduce a kernel thread responsible for invoking shrink_slab()
  periodically or when signaled.
- Keep the existing shrinker infrastructure intact but move the
  execution context outside of direct reclaim and kswapd.
- Optionally trigger this thread based on system activity (e.g.
  idle detection, vmpressure events, or background reclaim).
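
To make the intended control flow concrete, here is a minimal userspace
model of the idea, using pthreads rather than kernel primitives. It is
only a sketch and not a patch: every name in it (struct slabd,
slabd_kick(), demo_scan(), and so on) is hypothetical, the "shrinkers"
are plain function pointers rather than the kernel's struct shrinker,
and it ignores memcg and NUMA awareness entirely. An in-kernel version
would presumably use a kthread sleeping on a wait queue instead, which
is an assumption on our side rather than something we have implemented.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a shrinker scan callback: returns objects freed. */
typedef unsigned long (*scan_fn)(unsigned long nr_to_scan);

struct slabd {
	pthread_t thread;
	pthread_mutex_t lock;
	pthread_cond_t wake;		/* worker sleeps here */
	bool kick, stop;
	unsigned int period_ms;		/* periodic fallback interval */
	unsigned long batch;		/* nr_to_scan handed to each callback */
	unsigned long passes, freed;
	scan_fn *shrinkers;
	size_t nr;
};

/* Toy cache so the sketch has something to shrink. */
static unsigned long demo_objects = 1000;

static unsigned long demo_scan(unsigned long nr_to_scan)
{
	unsigned long n = nr_to_scan < demo_objects ? nr_to_scan : demo_objects;

	demo_objects -= n;
	return n;
}

static void *slabd_main(void *arg)
{
	struct slabd *d = arg;

	pthread_mutex_lock(&d->lock);
	for (;;) {
		struct timespec t;
		int rc = 0;

		clock_gettime(CLOCK_REALTIME, &t);
		t.tv_sec += d->period_ms / 1000;
		t.tv_nsec += (long)(d->period_ms % 1000) * 1000000L;
		if (t.tv_nsec >= 1000000000L) {
			t.tv_sec++;
			t.tv_nsec -= 1000000000L;
		}
		/* Sleep until kicked, stopped, or the period expires. */
		while (!d->kick && !d->stop && rc == 0)
			rc = pthread_cond_timedwait(&d->wake, &d->lock, &t);
		if (d->stop && !d->kick)
			break;		/* a pending kick is served before exit */
		d->kick = false;

		/* Run the callbacks without holding the lock. */
		pthread_mutex_unlock(&d->lock);
		unsigned long freed = 0;
		for (size_t i = 0; i < d->nr; i++)
			freed += d->shrinkers[i](d->batch);
		pthread_mutex_lock(&d->lock);
		d->freed += freed;
		d->passes++;
	}
	pthread_mutex_unlock(&d->lock);
	return NULL;
}

static int slabd_start(struct slabd *d, scan_fn *shrinkers, size_t nr,
		       unsigned int period_ms, unsigned long batch)
{
	assert(d && shrinkers && nr > 0);
	pthread_mutex_init(&d->lock, NULL);
	pthread_cond_init(&d->wake, NULL);
	d->kick = d->stop = false;
	d->period_ms = period_ms;
	d->batch = batch;
	d->passes = d->freed = 0;
	d->shrinkers = shrinkers;
	d->nr = nr;
	return pthread_create(&d->thread, NULL, slabd_main, d);
}

/* Called from the "reclaim path": request a pass and return immediately. */
static void slabd_kick(struct slabd *d)
{
	pthread_mutex_lock(&d->lock);
	d->kick = true;
	pthread_cond_signal(&d->wake);
	pthread_mutex_unlock(&d->lock);
}

static void slabd_stop(struct slabd *d)
{
	pthread_mutex_lock(&d->lock);
	d->stop = true;
	pthread_cond_signal(&d->wake);
	pthread_mutex_unlock(&d->lock);
	pthread_join(d->thread, NULL);
}
```

In this model the caller of slabd_kick() never waits for the shrinkers
to run, which is the decoupling we are after. It also shows the
trade-off we would like feedback on: the caller gets no immediate
indication of how much was freed.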

We’d like to gather community feedback on:

- Whether decoupling slab reclaim from kswapd and direct reclaim
  makes sense from a design and maintainability perspective.
- Potential implications for fairness, concurrency, and memcg
  accounting.
- Any related prior work or alternative ideas that have been discussed
  in this area.

Thanks for your time and consideration.

Best regards,
Yifan Ji