From: "Qun-wei Lin (林群崴)" <Qun-wei.Lin@mediatek.com>
To: "hannes@cmpxchg.org" <hannes@cmpxchg.org>
Cc: "Andrew Yang (楊智強)" <Andrew.Yang@mediatek.com>,
"rppt@kernel.org" <rppt@kernel.org>,
"nphamcs@gmail.com" <nphamcs@gmail.com>,
"21cnbao@gmail.com" <21cnbao@gmail.com>,
"James Hsu (徐慶薰)" <James.Hsu@mediatek.com>,
"AngeloGioacchino Del Regno"
<angelogioacchino.delregno@collabora.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mediatek@lists.infradead.org"
<linux-mediatek@lists.infradead.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"Chinwen Chang (張錦文)" <chinwen.chang@mediatek.com>,
"Casper Li (李中榮)" <casper.li@mediatek.com>,
"minchan@kernel.org" <minchan@kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
"matthias.bgg@gmail.com" <matthias.bgg@gmail.com>,
"senozhatsky@chromium.org" <senozhatsky@chromium.org>
Subject: Re: [PATCH] mm: Add Kcompressd for accelerated memory compression
Date: Fri, 2 May 2025 09:16:01 +0000
Message-ID: <bf1db02cc0e7682e8f6eea4d0d61f6f249536163.camel@mediatek.com>
In-Reply-To: <20250501140226.GE2020@cmpxchg.org>
On Thu, 2025-05-01 at 10:02 -0400, Johannes Weiner wrote:
> On Wed, Apr 30, 2025 at 04:26:41PM +0800, Qun-Wei Lin wrote:
>
> > This patch series introduces a new mechanism called kcompressd to
> > improve the efficiency of memory reclaiming in the operating system.
> >
> > Problem:
> > In the current system, the kswapd thread is responsible for both scanning
> > the LRU pages and handling memory compression tasks (such as those
> > involving ZSWAP/ZRAM, if enabled). This combined responsibility can lead
> > to significant performance bottlenecks, especially under high memory
> > pressure. The kswapd thread becomes a single point of contention, causing
> > delays in memory reclaiming and overall system performance degradation.
> >
> > Solution:
> > Introduced kcompressd to handle asynchronous compression during memory
> > reclaim, improving efficiency by offloading compression tasks from
> > kswapd. This allows kswapd to focus on its primary task of page reclaim
> > without being burdened by the additional overhead of compression.
> >
> > In our handheld devices, we found that applying this mechanism under high
> > memory pressure scenarios can increase the rate of pgsteal_anon per second
> > by over 260% compared to the situation with only kswapd. Additionally, we
> > observed a reduction of over 50% in page allocation stall occurrences,
> > further demonstrating the effectiveness of kcompressd in alleviating memory
> > pressure and improving system responsiveness.
>
>
> Yes, I think parallelizing this work makes a lot of sense.
>
>
> > Co-developed-by: Barry Song <21cnbao@gmail.com>
> > Signed-off-by: Barry Song <21cnbao@gmail.com>
> > Signed-off-by: Qun-Wei Lin <qun-wei.lin@mediatek.com>
> > Reference: Re: [PATCH 0/2] Improve Zram by separating compression
> > context from kswapd - Barry Song
> > https://lore.kernel.org/lkml/20250313093005.13998-1-21cnbao@gmail.com/
> > ---
> >  include/linux/mmzone.h |  6 ++++
> >  mm/mm_init.c           |  1 +
> >  mm/page_io.c           | 71 ++++++++++++++++++++++++++++++++++++++++++
> >  mm/swap.h              |  6 ++++
> >  mm/vmscan.c            | 25 +++++++++++++++
> >  5 files changed, 109 insertions(+)
> >
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 6ccec1bf2896..93c9195a54ae 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -23,6 +23,7 @@
> > #include <linux/page-flags.h>
> > #include <linux/local_lock.h>
> > #include <linux/zswap.h>
> > +#include <linux/kfifo.h>
> > #include <asm/page.h>
> >
> > /* Free memory management - zoned buddy allocator. */
> > @@ -1398,6 +1399,11 @@ typedef struct pglist_data {
> >
> > int kswapd_failures; /* Number of 'reclaimed == 0' runs */
> >
> > +#define KCOMPRESS_FIFO_SIZE 256
> > + wait_queue_head_t kcompressd_wait;
> > + struct task_struct *kcompressd;
> > + struct kfifo kcompress_fifo;
>
>
> The way you implemented this adds time-and-space overhead even on
> systems that don't have any sort of swap compression enabled.
>

To address the overhead concern, perhaps we can embed only a single
kcompressd pointer within pglist_data and perform lazy initialization
only when a zram device is added or zswap is enabled.

> That seems unnecessary. There is an existing method for asynchronous
> writeback, and pageout() is naturally fully set up to handle this.
>
> IMO the better way to do this is to make zswap_store() (and
> zram_bio_write()?) asynchronous. Make those functions queue the work
> and wake the compression daemon, and then have the daemon call
> folio_end_writeback() / bio_endio() when it's done with it.

Perhaps we could add an enqueue/wake-up interface for kcompressd and
call it within zswap_store() and zram_bio_write(). This would leverage
the existing obj_cgroup_may_zswap() check in zswap_store(), which
solves the problem, mentioned by Nhat, of zswap being re-compressed
too soon.

In outline:

1. Per-node pointer in pglist_data:

   typedef struct pglist_data {
           ...
           struct kcompressd_node *kcompressd;
           ...
   }

2. Global register/unregister hooks:

   kcompressd_register_backend(): Register a new backend (zram/zswap).
   Initialize the kcompressd structure and kfifo if this is the first
   call.

   kcompressd_unregister_backend(): Unregister a backend (zram/zswap).
   Use a per-node refcount and bitmap to track how many zswap/zram
   instances are active. If the last backend is unregistered, free
   the kcompressd resources.

> > A net loss is possible, but kswapd can sometimes enter sleep contexts,
> > allowing the parallel kcompressd thread to continue compression.
> > This could actually be a win. But I agree that additional testing on
> > single-CPU machines may be necessary.
>
> It could be disabled by the following if we discover any regression on
> single-CPU machines?
>
> if (num_online_cpus() == 1)
>         return false;
>

   We can add this check in the register/unregister functions.

3. Enqueue API:

   kcompressd_enqueue_folio(folio) / kcompressd_enqueue_bio(bio): Push a
   job to kcompressd's FIFO and wake up the kcompressd daemon.

With this approach, there is zero runtime cost on nodes where no backend
is active, and only one allocation per node.
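
To make the outline above easier to discuss, here is a minimal userspace
model of it, not kernel code. Everything in it is an assumption made for
illustration: a single node, a plain counter standing in for the per-node
refcount and bitmap, a hand-rolled ring buffer standing in for kfifo, an
online_cpus variable standing in for num_online_cpus(), and one
kcompressd_enqueue() standing in for both the folio and bio variants.
Locking and the daemon wake-up are left out. Returning false from enqueue
is assumed to mean "fall back to the existing synchronous path".

```c
/* Userspace model only: names and layout are hypothetical, not kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define KCOMPRESS_FIFO_SIZE 256	/* same depth as in the posted patch */

/* Stand-in for the state behind pglist_data->kcompressd. */
struct kcompressd_node {
	void *fifo[KCOMPRESS_FIFO_SIZE];	/* queued jobs (opaque here) */
	size_t head, tail;			/* free-running counters */
	unsigned int nr_backends;		/* active zram/zswap users */
};

static struct kcompressd_node *node_kcompressd;	/* models one node's pointer */
static unsigned int online_cpus = 4;		/* models num_online_cpus() */

/* Register a backend; allocate the per-node state on the first call. */
static bool kcompressd_register_backend(void)
{
	if (online_cpus == 1)
		return false;	/* single CPU: keep compression in kswapd */
	if (!node_kcompressd) {
		node_kcompressd = calloc(1, sizeof(*node_kcompressd));
		if (!node_kcompressd)
			return false;
	}
	node_kcompressd->nr_backends++;
	return true;
}

/* Unregister a backend; free the state when the last one goes away. */
static void kcompressd_unregister_backend(void)
{
	if (!node_kcompressd)
		return;
	assert(node_kcompressd->nr_backends > 0);
	if (--node_kcompressd->nr_backends == 0) {
		free(node_kcompressd);
		node_kcompressd = NULL;
	}
}

/* Queue one job; false tells the caller to compress synchronously. */
static bool kcompressd_enqueue(void *job)
{
	struct kcompressd_node *kc = node_kcompressd;

	if (!kc)
		return false;	/* no backend active: only a NULL check paid */
	if (kc->head - kc->tail == KCOMPRESS_FIFO_SIZE)
		return false;	/* FIFO full */
	kc->fifo[kc->head++ % KCOMPRESS_FIFO_SIZE] = job;
	/* A real implementation would wake the daemon here. */
	return true;
}

/* What the daemon would do: pop the oldest job, or NULL when idle. */
static void *kcompressd_dequeue(void)
{
	struct kcompressd_node *kc = node_kcompressd;

	if (!kc || kc->head == kc->tail)
		return NULL;
	return kc->fifo[kc->tail++ % KCOMPRESS_FIFO_SIZE];
}
```

In this model the cost on a node without a registered backend is a single
NULL-pointer test in the enqueue path, and the only allocation happens on
the first register call.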
Thank you for your feedback!
Please let me know what you think.
Best Regards,
Qun-wei