From: YoungJun Park <youngjun.park@lge.com>
To: lsf-pc@lists.linux-foundation.org
Cc: linux-mm@kvack.org, youngjun.park@lge.com, chrisl@kernel.org
Subject: [LSF/MM/BPF TOPIC] Flash Friendly Swap
Date: Wed, 18 Feb 2026 21:46:54 +0900
Message-ID: <aZW0voL4MmnMQlaR@yjaykim-PowerEdge-T330>

Hello,
I would like to propose a session on a NAND-flash-friendly swap layout.
Similar to how F2FS is designed as a flash-friendly file system, the
goal is to make the swap subsystem write data to NAND flash devices
in a way that causes less wear.
We have been working on this problem in production embedded systems
and have built an out-of-tree solution with RAM buffering, sequential
writeback, and deduplication. I would like to discuss the upstream
path for these capabilities.

Background & Motivation:
We ship embedded products built on eMMC-based NAND flash and have
spent years dealing with memory pressure that demands aggressive
swapping. The limited P/E-cycle endurance of NAND flash makes naive
swap usage a reliability risk -- swap I/O is random, small, and
frequent, which is the worst-case pattern for write amplification.
Even with an FTL in the eMMC controller, random writes from the swap
layer still inflate the write amplification factor (WAF) through
internal garbage collection. Buffering and reordering writes into
sequential streams can complement the FTL and reduce WAF.
Our team has published prior work on this problem[1], covering
techniques such as compression, RAM buffering with sequential writeback,
and flash-aware block management. Based on that work, we built an
internal solution and are now looking at how to bring these capabilities
upstream.

Current Implementation:
The current implementation is a standalone block device driver between
the swap layer and flash storage:
1. RAM swap buffer: A kernel thread accumulates swap-out pages and
flushes them to flash as sequential I/O at controlled intervals.
2. Management layer: Mapping between swap slots and physical flash
locations, with wear-aware allocation and writeback scheduling.
3. Deduplication: Content-hash-based dedup before writing to flash --
swap workloads often contain many zero-filled or duplicate pages.
This works, but as a standalone block device it sits outside mainline
infrastructure. I am seeking feedback on how to upstream this.

Discussion:
I would like to discuss the following topics:
- Flash-friendly swap I/O:
For flash-backed swap, writing sequentially and respecting erase
block boundaries can reduce WAF. What could the swap subsystem do
to better support flash devices?
- Deduplication in the swap layer:
Swap workloads often contain many zero-filled or duplicate pages.
Should dedup be a swap-layer feature rather than reimplemented
per-backend?
- Extending zram/zswap writeback with flash awareness:
zram supports a backing device (CONFIG_ZRAM_WRITEBACK) for writing
idle/incompressible pages to persistent storage, and zswap sits in
front of swap devices with its own writeback path. Could these be
extended with sequential writeback batching, deduplication, and
flash-aware allocation? Our implementation buffers swap-out pages
in RAM before flushing to flash -- this is conceptually similar to
zswap + writeback, but we found the current writeback path
insufficient for our needs because it still issues per-page random
writes without awareness of flash erase block boundaries. I would
like to discuss what gaps remain and whether extending zswap/zram
writeback is the right upstream path.
- Swap abstraction layer:
Recent discussions on reworking the swap subsystem[2][3] aim to
decouple the swap core from its tight binding to swap offsets and
block devices. If such a layer materializes, it could provide
extension points for pluggable swap backends with device-specific
write strategies. I would like to hear the community's view on
whether this direction could also serve flash-friendly swap needs.
Comments or suggestions are welcome.
[1] https://ieeexplore.ieee.org/document/8662047
[2] https://lwn.net/Articles/932077/
[3] https://lwn.net/Articles/974587/