linux-mm.kvack.org archive mirror
* [RFC PATCH v2 0/4] mm/zsmalloc: reduce zs_free() latency on swap release path
@ 2026-04-21 12:16 Wenchao Hao
From: Wenchao Hao @ 2026-04-21 12:16 UTC
  To: Andrew Morton, Chengming Zhou, Jens Axboe, Johannes Weiner,
	Minchan Kim, Nhat Pham, Sergey Senozhatsky, Yosry Ahmed,
	linux-block, linux-kernel, linux-mm
  Cc: Barry Song, Xueyuan Chen, Wenchao Hao

Swap freeing can be expensive when unmapping a VMA containing
many swap entries. This has been reported to significantly
delay memory reclamation during Android's low-memory killing,
especially when multiple processes are terminated to free
memory, with slot_free() accounting for more than 80% of
the total cost of freeing swap entries.

Two earlier attempts by Zhiguo and Lei added a new thread to the mm core
to asynchronously collect and free swap entries [1][2], but the
design itself is fairly complex.

When anon folios and swap entries are mixed within a
process, reclaiming anon folios from killed processes
helps return memory to the system as quickly as possible,
so that newly launched applications can satisfy their
memory demands. It is not ideal for swap freeing to block
anon folio freeing. On the other hand, swap freeing can
still return memory to the system, although at a slower
rate due to memory compression.

Therefore, we introduce a GC worker that performs slot_free()
asynchronously, so that anon folio freeing and slot_free() can
run in parallel, maximizing the rate at which memory is
returned to the system.

This series takes two complementary approaches to reduce zs_free()
latency:

- Shrink the class->lock critical section in zs_free() by moving
  zspage freeing outside the lock.
- Defer zs_free() to a workqueue via zs_free_deferred(), benefiting
  both zram and zswap.
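
To illustrate the first point, here is a minimal userspace sketch of the
"unlink under the lock, free after unlocking" pattern. It is not the
zsmalloc code: struct page_group, struct size_class, class_add_group()
and class_free_obj() are hypothetical stand-ins, and a pthread mutex
stands in for class->lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real zsmalloc structures. */
struct page_group {
	int inuse;                /* live objects in this group */
	struct page_group *next;  /* linkage in the class list */
	void *backing;            /* memory released once the group is empty */
};

struct size_class {
	pthread_mutex_t lock;     /* stands in for class->lock */
	struct page_group *groups;
	int nr_groups;
};

/* Create a group holding 'inuse' objects and link it into the class. */
static struct page_group *class_add_group(struct size_class *c, int inuse)
{
	struct page_group *g = malloc(sizeof(*g));

	if (!g)
		return NULL;
	g->backing = malloc(64);
	if (!g->backing) {
		free(g);
		return NULL;
	}
	g->inuse = inuse;

	pthread_mutex_lock(&c->lock);
	g->next = c->groups;
	c->groups = g;
	c->nr_groups++;
	pthread_mutex_unlock(&c->lock);
	return g;
}

/* Unlink 'g' from the class list. The caller must hold c->lock. */
static void unlink_group(struct size_class *c, struct page_group *g)
{
	struct page_group **pp = &c->groups;

	while (*pp && *pp != g)
		pp = &(*pp)->next;
	if (*pp) {
		*pp = g->next;
		c->nr_groups--;
	}
}

/* Free one object; returns 1 if the whole group was released. */
static int class_free_obj(struct size_class *c, struct page_group *g)
{
	struct page_group *empty = NULL;

	pthread_mutex_lock(&c->lock);
	assert(g->inuse > 0);
	if (--g->inuse == 0) {
		/* Only the cheap bookkeeping happens under the lock. */
		unlink_group(c, g);
		empty = g;
	}
	pthread_mutex_unlock(&c->lock);

	if (!empty)
		return 0;

	/* The group is unreachable now, so the release needs no lock. */
	free(empty->backing);
	free(empty);
	return 1;
}
```

Once the group has been unlinked, no other thread can find it through
the class list, so releasing its memory after the unlock is safe in this
sketch. Whether the same argument holds for a zspage depends on the
details of patch 1/4.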

The deferred free approach builds on Barry Song's earlier RFC [3], with
changes based on community feedback: the optimization moved to the
zsmalloc layer instead of zram; a fixed array stores handles (not
indices) with O(1) enqueue, which avoids memory allocation on the exit
path and data consistency issues on slot reuse; and capacity is
size-based, scaling with PAGE_SIZE.
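
As a rough illustration of the fixed-array idea, the sketch below queues
handles in a preallocated array and falls back to a synchronous free
when the array is full. It is a single-threaded userspace model under
stated assumptions, not the code in this series: struct defer_queue,
DEFER_CAPACITY, defer_enqueue(), defer_drain(), free_or_defer() and
count_free() are all hypothetical names, the capacity is an arbitrary
constant rather than something derived from PAGE_SIZE, and the locking
and workqueue scheduling a real implementation needs are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary capacity for illustration only. */
#define DEFER_CAPACITY 8

/* Hypothetical queue: a fixed array of handles, no allocation needed. */
struct defer_queue {
	unsigned long handles[DEFER_CAPACITY];
	size_t count;
};

/* Stand-in for the real free routine; it only counts calls. */
static unsigned long freed_count;

static void count_free(unsigned long handle)
{
	(void)handle;
	freed_count++;
}

/* O(1) enqueue; returns false when the array is full. */
static bool defer_enqueue(struct defer_queue *q, unsigned long handle)
{
	if (q->count == DEFER_CAPACITY)
		return false;
	q->handles[q->count++] = handle;
	return true;
}

/* What a worker would do: free every queued handle, then reset. */
static size_t defer_drain(struct defer_queue *q,
			  void (*free_fn)(unsigned long))
{
	size_t n = q->count;

	for (size_t i = 0; i < n; i++)
		free_fn(q->handles[i]);
	q->count = 0;
	return n;
}

/* Caller-side policy: defer when possible, else free synchronously. */
static void free_or_defer(struct defer_queue *q, unsigned long handle,
			  void (*free_fn)(unsigned long))
{
	if (!defer_enqueue(q, handle))
		free_fn(handle);
}
```

Because the queue stores the handle itself rather than a slot index, the
slot can be reused immediately after enqueue without the queued entry
ever referring to new data. The synchronous fallback keeps the exit path
free of memory allocation even when the queue is full.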

Xueyuan's test on RK3588 with Barry's RFC v1 [3] shows that unmapping
a 256MB swap-filled VMA becomes 3.4x faster when pinning tasks to CPU2,
reducing the execution time from 63,102,982 ns to 18,570,726 ns.

A positive side effect is that async GC also slightly improves
do_swap_page() performance, as it no longer has to wait for
slot_free() to complete.

Xueyuan's test with Barry's RFC v1 [3] shows that swapping in 256MB of
data (each page filled with repeating patterns such as "1024 one",
"1024 two", "1024 three", and "1024 four") reduces execution time from
1,358,133,886 ns to 1,104,315,986 ns, a speedup of roughly 1.23x.

[1] https://lore.kernel.org/all/20240805153639.1057-1-justinjiang@vivo.com/
[2] https://lore.kernel.org/all/20250909065349.574894-1-liulei.rjpt@vivo.com/
[3] https://lore.kernel.org/linux-mm/20260412060450.15813-1-baohua@kernel.org/

Xueyuan Chen (1):
  mm:zsmalloc: drop class lock before freeing zspage

Barry Song (Xiaomi) (1):
  zram: defer zs_free() in swap slot free notification path

Wenchao Hao (2):
  mm/zsmalloc: introduce zs_free_deferred() for async handle freeing
  mm/zswap: defer zs_free() in zswap_invalidate() path

 drivers/block/zram/zram_drv.c |  37 ++++++---
 include/linux/zsmalloc.h      |   2 +
 mm/zsmalloc.c                 | 141 ++++++++++++++++++++++++++++++++--
 mm/zswap.c                    |  16 ++++-
 4 files changed, 177 insertions(+), 19 deletions(-)

--
2.34.1


