Re: [RFC PATCH v2 0/4] mm/zsmalloc: reduce zs_free() latency on swap release path

linux-mm.kvack.org archive mirror

From: Kairui Song <ryncsn@gmail.com>
To: Nhat Pham <nphamcs@gmail.com>
Cc: Wenchao Hao <haowenchao22@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Chengming Zhou <chengming.zhou@linux.dev>,
	Jens Axboe <axboe@kernel.dk>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Minchan Kim <minchan@kernel.org>,
	 Sergey Senozhatsky <senozhatsky@chromium.org>,
	Yosry Ahmed <yosry@kernel.org>,
	linux-block@vger.kernel.org,  linux-kernel@vger.kernel.org,
	linux-mm@kvack.org,  Barry Song <baohua@kernel.org>,
	Xueyuan Chen <xueyuan.chen21@gmail.com>,
	 Wenchao Hao <haowenchao@xiaomi.com>
Subject: Re: [RFC PATCH v2 0/4] mm/zsmalloc: reduce zs_free() latency on swap release path
Date: Wed, 22 Apr 2026 01:17:28 +0800
Message-ID: <CAMgjq7Ae_VH6mLQito5MFnxmp7ScZvF9xw4P-yU-yUBkPZ9_YQ@mail.gmail.com>
In-Reply-To: <CAKEwX=M5YpR0cQrryX_y4pm_BuxyUWZ_8MbhWodwbf1Fe=gzew@mail.gmail.com>

On Tue, Apr 21, 2026 at 11:55 PM Nhat Pham <nphamcs@gmail.com> wrote:
>

Thanks for adding me to the Cc list :) Barry started this idea with
ZRAM, and it looks very interesting to me.

> On Tue, Apr 21, 2026 at 5:16 AM Wenchao Hao <haowenchao22@gmail.com> wrote:
> >
> > Swap freeing can be expensive when unmapping a VMA containing
> > many swap entries. This has been reported to significantly
> > delay memory reclamation during Android's low-memory killing,
> > especially when multiple processes are terminated to free
> > memory, with slot_free() accounting for more than 80% of
> > the total cost of freeing swap entries.
> >
> > Two earlier attempts by Lei and Zhiguo added a new thread in the mm core
> > to asynchronously collect and free swap entries [1][2], but the
> > design itself is fairly complex.
> >
> > When anon folios and swap entries are mixed within a
> > process, reclaiming anon folios from killed processes
> > helps return memory to the system as quickly as possible,
> > so that newly launched applications can satisfy their
> > memory demands. It is not ideal for swap freeing to block
> > anon folio freeing. On the other hand, swap freeing can
> > still return memory to the system, although at a slower
> > rate due to memory compression.
>
> Is this correct? I don't think we do decompression in
> zswap_invalidate() path. We do decompression in zswap_load(), but as a
> separate step from zswap_invalidate().

It's not about decompression. I think what Wenchao means here is that
freeing a swap entry also releases the backing compressed data, but
compared to freeing an actual folio (which brings back a free folio to
reduce memory pressure), you may need to free a lot of swap entries to
free one whole folio, because the compressed data can be much smaller
than a folio and is subject to fragmentation. And swap entry freeing
is still too slow to be ignored.
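
To put rough numbers on that, here is a back-of-the-envelope sketch. The
4 KiB page size and the compressed object sizes are assumptions picked
for illustration, not measurements from this thread, and
min_entries_per_page() is a made-up helper name. It only gives a lower
bound: with fragmentation the freed objects are spread over different
zspages, so the real count can be higher.

```python
import math

PAGE_SIZE = 4096  # bytes; assumed page size, for illustration only


def min_entries_per_page(compressed_size: int) -> int:
    """Lower bound on swap entries to free before one page's worth of
    compressed data has been released (ignores fragmentation)."""
    if not 0 < compressed_size <= PAGE_SIZE:
        raise ValueError("compressed_size must be in (0, PAGE_SIZE]")
    return math.ceil(PAGE_SIZE / compressed_size)


if __name__ == "__main__":
    # Hypothetical compressed object sizes in bytes.
    for size in (2048, 1024, 512):
        n = min_entries_per_page(size)
        print(f"{size:>5} B/object: at least {n} entries per page")
```

Freeing one anon folio returns a whole page in a single step, while the
compressed side needs several frees for the same gain even in the best
case.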

>
> zswap/zsmalloc entry freeing is decoupled from decompression. For
> example, on process teardown, we free the zsmalloc memory but never
> decompress (if we do then it's a bug to be fixed lol, but I doubt it).
>
> Zsmalloc freeing might not be worth as much bang-for-your-buck wise
> compared to anon folio freeing, but if it's "expensive", then I think
> that points to a different root-cause: zsmalloc's poor scalability in
> the free path.

That's a very nice insight. I had an idea previously: can we have
something like a bulk zs_free? Freeing handles one by one does seem
expensive.
https://lore.kernel.org/linux-mm/adt3Q_SRToF6fb3W@KASONG-MC4/

It might be tricky to do so though.

It would be best if we could speed up everything. Doing things async
doesn't reduce the total amount of work, and might cause more trouble,
like worker overhead, or delayed freeing causing more memory pressure
if the workqueue doesn't run in time. And if a process is almost
completely swapped out, this won't help at all.

I'm not against the async idea; the two might combine well.

>
> I've stared at this code path for a bit, because my other patch series
> (vswap - see [1]) was reported to show a regression on the free path
> on the usemem benchmark. And one of the issues was the contention
> with compaction (both systemwide compaction, i.e. zs_page_migrate,
> and zsmalloc's internal compaction, but mostly the former):
>
> * zs_free read-acquires pool->lock, and compaction write-acquires the
> same lock. So the compaction thread will make all zs free-ers wait for
> it. I saw this read lock delay when I perfed the free step of usemem.
>
> * If this lock has fair queueing semantics (I have not checked), then
> if a compaction is behind a bunch of zs_free calls in the queue, all
> the subsequent zs_free-ers are blocked :)
>
> * I'm also curious about cache-friendliness of this rwlock, bouncing
> across CPUs, if you have multiple processes being torn down
> concurrently.

That's interesting. When I mentioned a bulk zs_free, I was thinking
that if we have a percpu queue, we could at least try to take the read
lock on every enqueue, free the whole queue if that succeeds, then
release the lock. I'm sure there are more ways to optimize that, just
a random idea :)
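
A toy userspace model of that enqueue-then-trylock idea, only to show
the control flow. Everything here is an assumption for illustration:
Pool, PerCpuFreeQueue and free_deferred() are made-up names, not the
zsmalloc API or the code in this series. The real pool->lock is an
rwlock with zs_free on the read side; Python has no rwlock, so a plain
non-blocking mutex stands in and only models the "compaction holds the
lock" case.

```python
import threading


class Pool:
    """Toy stand-in for a zsmalloc pool; `lock` models pool->lock."""

    def __init__(self, handles):
        self.lock = threading.Lock()  # simplification: mutex, not rwlock
        self.live = set(handles)      # handles that are still allocated

    def free_locked(self, handle):
        # Caller must hold self.lock.
        self.live.discard(handle)


class PerCpuFreeQueue:
    """One queue per CPU in the real idea; a single queue here."""

    def __init__(self, pool):
        self.pool = pool
        self.pending = []

    def free_deferred(self, handle):
        """Queue the handle, then try the lock once without blocking.

        Returns how many handles were actually freed by this call.
        """
        self.pending.append(handle)
        if not self.pool.lock.acquire(blocking=False):
            return 0  # lock is busy: keep the handles queued for later
        try:
            freed = len(self.pending)
            for h in self.pending:
                self.pool.free_locked(h)
            self.pending.clear()
            return freed
        finally:
            self.pool.lock.release()


if __name__ == "__main__":
    pool = Pool(range(8))
    queue = PerCpuFreeQueue(pool)

    pool.lock.acquire()                 # pretend compaction holds the lock
    for h in (0, 1, 2):
        queue.free_deferred(h)          # all three stay queued
    pool.lock.release()                 # "compaction" is done

    drained = queue.free_deferred(3)    # this enqueue drains the backlog
    print(drained, sorted(pool.live))
```

The sketch leaves out what the real thing would need, e.g. a bound on
the queue length and a fallback that drains the queue when no later
enqueue comes along.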

Thread overview: 10+ messages
2026-04-21 12:16 Wenchao Hao
2026-04-21 12:16 ` [RFC PATCH v2 1/4] mm:zsmalloc: drop class lock before freeing zspage Wenchao Hao
2026-04-21 12:16 ` [RFC PATCH v2 2/4] mm/zsmalloc: introduce zs_free_deferred() for async handle freeing Wenchao Hao
2026-04-21 12:16 ` [RFC PATCH v2 3/4] zram: defer zs_free() in swap slot free notification path Wenchao Hao
2026-04-21 12:16 ` [RFC PATCH v2 4/4] mm/zswap: defer zs_free() in zswap_invalidate() path Wenchao Hao
2026-04-21 17:03   ` Nhat Pham
2026-04-21 15:54 ` [RFC PATCH v2 0/4] mm/zsmalloc: reduce zs_free() latency on swap release path Nhat Pham
2026-04-21 17:17   ` Kairui Song [this message]
2026-04-21 18:07     ` Nhat Pham
2026-04-21 18:25       ` Nhat Pham
