From: YoungJun Park <youngjun.park@lge.com>
To: Pedro Falcato <pfalcato@suse.de>
Cc: Chris Li <chrisl@kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
	nphamcs@gmail.com, bhe@redhat.com, taejoon.song@lge.com,
	ryncsn@gmail.com
Subject: Re: [LSF/MM/BPF TOPIC] Flash Friendly Swap
Date: Tue, 24 Feb 2026 13:02:22 +0900
Message-ID: <aZ0izpnK+QMqxYbM@yjaykim-PowerEdge-T330>
In-Reply-To: <aZ0L48fzXzC9IOfj@yjaykim-PowerEdge-T330>

On Tue, Feb 24, 2026 at 11:24:35AM +0900, YoungJun Park wrote:
> On Mon, Feb 23, 2026 at 06:53:12PM +0000, Pedro Falcato wrote:
> > On Mon, Feb 23, 2026 at 10:15:14AM -0800, Chris Li wrote:
> > > On Mon, Feb 23, 2026 at 5:23 AM Christoph Hellwig <hch@infradead.org> wrote:
> > > >
> > > > On Fri, Feb 20, 2026 at 03:47:18PM -0800, Chris Li wrote:
> > > > > Hi Christoph,
> > > > >
> > > > > On Fri, Feb 20, 2026 at 8:22 AM Christoph Hellwig <hch@infradead.org> wrote:
> > > > > >
> > > > > > Honestly, I think always writing sequentially when swapping and
> > > > > > reclaiming in lumps (I'd call them "zones" :)) is probably the best
> > > > > > idea.  Even for the these days unlikely case of swapping to HDD it
> > > > >
> > > > > For the flash device with FTL, the location of the data written is
> > > > > most likely logical anyway.  The flash devices tend to group the new
> > > > > data internally to the same erase block together even when they are
> > > > > discontinuous from the block device point of view.
> > > >
> > > > Yes, but that's not the point..
> > > >
> > > > > It is easy to write
> > > > > out sequentially when the swap device is mostly empty. That is how the
> > > > > > cluster allocator currently works anyway. However, the tricky part is
> > > > > > what happens when some random 4K blocks get swapped in: that will
> > > > > > create holes on both the swap device and the internally written-out
> > > > > > data. Very quickly the free clusters on the swap device will all get
> > > > > > used up and you will not be able to write out sequentially any more.
> > > > > > The FTL layer internally wants to GC those holes to create a large
> > > > > > empty erase block. I do see that the choice of the next write location
> > > > > > can have a
> > > > > huge impact on the flash internal GC behavior and write amplification
> > > > > factor.
> > > >
> > > > > And that is the point.  The FTL will always do a bad job with these
> > > > > workloads.  You should not do overwrites, and can do much better
> > > 
> > > > I am not sure I understand "You should not do overwrites". Can you
> > > > help clarify it for me? Let's say we always prefer to write to new
> > > > clusters while some swap entries have been freed. What happens when we
> > > > run out of new clusters to write to? Wouldn't we be forced to overwrite
> > > > the previously freed swap locations? It seems to me the "overwrite" is
> > > > unavoidable if you keep swapping in and out. That is the part I am
> > > > missing.
> > 
> > > See log-structured filesystems. I suspect that's close to what we want for flash
> > storage swap.
> > 
> > > Also, FWIW: the cloud vendors have fake SSDs that, while having negligible seek
> > > latency, have extremely low IOPS values (e.g. AWS gp2 can do 100 IOPS on its
> > base setting, and scales up to 16K IOPS. gp3 can do 3000 up to 80K on the
> > maximum size). I suspect swapping on these is a huge slog, and we would also
> > like to write out as much sequentially as we can here (though I hope no one
> > is *actually* swapping on these things). Also mechanical drives. Log-structured
> > filesystems were originally invented for these too :)
> 
> +CC Nhat Pham, He Baoquan, Taejoon 
> 
> Hi Pedro,
> 
> The motivation is indeed similar to that of log-structured filesystems, and it
> employs a similar management mechanism.
> 
> That is why I thought a management style similar to filesystems might be
> necessary at the swap layer as well (the swap abstraction layer mentioned in
> the proposal document).
> 
> Previously, the direction for upstreaming our solution was somewhat ambiguous,
> so we have been maintaining it privately for several years.
> 
> Recently, however, I have wanted to discuss how to proceed with upstreaming in
> the context of Baoquan's "swap_ops and pluggable swap backend"
> (https://lore.kernel.org/linux-mm/aZiFvzlBJiYBUDre@MiWiFi-R3L-srv/) and
> Nhat's "Virtual Swap Space"
> (https://lore.kernel.org/linux-mm/20260208215839.87595-1-nphamcs@gmail.com/).
> 
> Best regards
> Youngjun Park

+CC Kairui

Oops, I forgot to include the discussion involving Kairui (now CC'd). This is
another direction currently under discussion:
https://lore.kernel.org/linux-mm/CAMgjq7D6n0H2=di0SrMQbJ48cVeKhGeQMH_mY0y-au4OJbE2GQ@mail.gmail.com/T/#m2feb4489b29075136169ff3efd28dc365062f66a

I hope our proposal can be considered or aligned with these ongoing
discussions.
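
As an illustration of the fragmentation problem Chris describes in the quoted
text above (random 4K swap-ins punching holes until no free cluster is left),
here is a minimal sketch. It is not taken from any of the proposals in this
thread: the cluster size, the device size and all function names are
hypothetical, and it models only slot occupancy, not the real cluster
allocator or any FTL behavior.

```python
import random

CLUSTER_SLOTS = 64   # hypothetical: number of 4K slots per allocation cluster
NUM_CLUSTERS = 16    # hypothetical device size: 16 clusters = 1024 slots


def fragment(free_fraction, seed=0):
    """Fill the device sequentially, then free a random subset of slots.

    Returns a list of booleans where True means the slot is still in use.
    The random frees stand in for scattered 4K swap-ins.
    """
    total = CLUSTER_SLOTS * NUM_CLUSTERS
    in_use = [True] * total
    rng = random.Random(seed)
    for slot in rng.sample(range(total), int(total * free_fraction)):
        in_use[slot] = False
    return in_use


def fully_free_clusters(in_use):
    """Count clusters in which every slot is free (usable for sequential writes)."""
    count = 0
    for start in range(0, len(in_use), CLUSTER_SLOTS):
        if not any(in_use[start:start + CLUSTER_SLOTS]):
            count += 1
    return count


def longest_free_run(in_use):
    """Length of the longest run of consecutive free slots."""
    best = run = 0
    for used in in_use:
        run = 0 if used else run + 1
        best = max(best, run)
    return best


if __name__ == "__main__":
    slots = fragment(0.25)
    print("free slots:", slots.count(False))
    print("fully free clusters:", fully_free_clusters(slots))
    print("longest contiguous free run:", longest_free_run(slots))
```

With a quarter of the slots freed at random, the free space is plentiful in
total but scattered, so a whole free cluster is extremely unlikely to exist.
That is the situation in which the next write location has to be chosen among
holes, or whole clusters have to be reclaimed first, as a log-structured
design would do.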

