From: Pedro Falcato <pfalcato@suse.de>
To: Chris Li <chrisl@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>,
YoungJun Park <youngjun.park@lge.com>,
lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org
Subject: Re: [LSF/MM/BPF TOPIC] Flash Friendly Swap
Date: Mon, 23 Feb 2026 18:53:12 +0000
Message-ID: <jvbthrkxombekk4bbmdqrwvik2qnd5btafb6q7guimxnioaqom@7kjexmbfuzxx>
In-Reply-To: <CACePvbWkGk5Z+QueOBNCdrr1mzGxQZbRbfeskOrLsmWEvF5ToA@mail.gmail.com>
On Mon, Feb 23, 2026 at 10:15:14AM -0800, Chris Li wrote:
> On Mon, Feb 23, 2026 at 5:23 AM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Fri, Feb 20, 2026 at 03:47:18PM -0800, Chris Li wrote:
> > > Hi Christoph,
> > >
> > > On Fri, Feb 20, 2026 at 8:22 AM Christoph Hellwig <hch@infradead.org> wrote:
> > > >
> > > > Honestly, I think always writing sequentially when swapping and
> > > > reclaiming in lumps (I'd call them "zones" :)) is probably the best
> > > > idea. Even for the these days unlikely case of swapping to HDD it
> > >
> > > For the flash device with FTL, the location of the data written is
> > > most likely logical anyway. The flash devices tend to group the new
> > > data internally to the same erase block together even when they are
> > > discontinuous from the block device point of view.
> >
> > Yes, but that's not the point..
> >
> > > It is easy to write
> > > out sequentially when the swap device is mostly empty. That is how the
> > > cluster allocator does currently any way. However, the tricky part is
> > > what when some random 4K blocks get swapped in, that will create holes
> > > on both the swap device and internal write out data. Very quickly the
> > > free cluster on swap devices will get all used up and that you will
> > > not be able to write out sequentially any more. The FTL layer
> > > internally wants to GC those holes to create a large empty erase
> > > block. I do see where to pick up the next write location can have a
> > > huge impact on the flash internal GC behavior and write amplification
> > > factor.
> >
> > And that is the point. The FTL will always do a bad job with these work
> > loads. You should not do overwrites, and can do much better
>
> I am not sure I understand "You should not do overwrites". Can you
> help clarify it for me? Let say we always prefer to the write to new
> clusters while some swap entries has been free. What happen we run out
> of new cluster to write? Wouldn't we be forced to overwrite the
> previous free swap location? It seems to me the "overwrite" is
> un-avoidable if you keep swapping in and out. That is the part I am
> missing.
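See log-structured filesystems. I suspect that's close to what we want for
flash storage swap. They never overwrite in place: new data is always appended
at the head of the log, and a cleaner copies the still-live blocks out of
mostly-empty segments so that whole segments can be reused.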
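To make that concrete, here is a toy sketch of the idea in Python. It is
purely illustrative and not kernel code: the LogSwap class, its method names
and the "keep one free segment in reserve" rule are all made up for this
example, and victim selection is the simple greedy "fewest live slots" policy.

```python
class LogSwap:
    """Toy log-structured swap slot allocator (illustrative, not kernel code)."""

    def __init__(self, num_segments, slots_per_segment):
        if num_segments < 3:
            raise ValueError("need a head, a reserve and one data segment")
        self.slots_per_segment = slots_per_segment
        # live[seg] maps slot offset -> page id for slots still holding data
        self.live = [{} for _ in range(num_segments)]
        self.free_segments = list(range(1, num_segments))
        self.head_seg = 0    # segment currently being filled
        self.head_off = 0    # next never-written offset in that segment
        self.where = {}      # page id -> (segment, offset)
        self.relocated = 0   # pages copied by the cleaner (extra writes)

    def _append(self, page):
        """Write one page at the log head, opening a free segment if needed."""
        if self.head_off == self.slots_per_segment:
            if not self.free_segments:
                raise RuntimeError("no free segment")
            self.head_seg = self.free_segments.pop(0)
            self.head_off = 0
        self.live[self.head_seg][self.head_off] = page
        self.where[page] = (self.head_seg, self.head_off)
        self.head_off += 1

    def _clean(self):
        """Empty the closed segment with the fewest live slots and free it."""
        closed = [s for s in range(len(self.live))
                  if s != self.head_seg and s not in self.free_segments]
        victim = min(closed, key=lambda s: len(self.live[s]))
        if len(self.live[victim]) == self.slots_per_segment:
            raise RuntimeError("swap full: nothing to reclaim")
        for off, page in list(self.live[victim].items()):
            del self.live[victim][off]
            self._append(page)       # survivors move to the log head
            self.relocated += 1
        self.free_segments.append(victim)

    def swap_in(self, page):
        """Free the page's slot; this only punches a hole, nothing is reused."""
        loc = self.where.pop(page, None)
        if loc is not None:
            del self.live[loc[0]][loc[1]]

    def swap_out(self, page):
        self.swap_in(page)  # drop any stale copy of this page
        head_full = self.head_off == self.slots_per_segment
        if head_full and len(self.free_segments) <= 1:
            self._clean()   # keep one free segment in reserve for relocation
        self._append(page)
```

Holes left by swap-in are never written to directly. A segment only becomes
writable again once the cleaner has emptied it, so the device only ever sees
sequential fills of whole segments. The price is the copying counted in
`relocated`, which is the write amplification we would be trading against.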
Also, FWIW: the cloud vendors have "fake" SSDs that have negligible seek
latency but extremely low IOPS limits (e.g. AWS gp2 can do 100 IOPS on its
base setting and scales up to 16K IOPS; gp3 can do 3000, up to 80K at the
maximum size). I suspect swapping on these is a huge slog, and we would also
like to write out as much as we can sequentially here (though I hope no one
is *actually* swapping on these things). The same goes for mechanical drives;
log-structured filesystems were originally invented for those too :)
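Back-of-the-envelope numbers for why the request size matters on an
IOPS-limited volume. This is a rough sketch under loud assumptions: every
request is charged as exactly one I/O, the 256 KiB batch size is just an
example figure I picked, and any separate per-volume throughput cap is
ignored. The function name is made up for illustration.

```python
def swap_throughput_kib_s(iops_limit, io_size_kib):
    """Upper bound on throughput if every request costs exactly one I/O."""
    return iops_limit * io_size_kib


for iops in (100, 3000):  # the gp2 and gp3 base figures quoted above
    random_4k = swap_throughput_kib_s(iops, 4)    # one 4 KiB page per request
    batched = swap_throughput_kib_s(iops, 256)    # assumed 256 KiB sequential request
    print(f"{iops} IOPS: {random_4k} KiB/s at 4 KiB, {batched} KiB/s at 256 KiB")
```

At 100 IOPS that is 400 KiB/s for page-at-a-time writes versus 25600 KiB/s
when the same I/O budget is spent on large sequential requests.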
--
Pedro
Thread overview: 6+ messages
2026-02-18 12:46 YoungJun Park
2026-02-20 16:22 ` Christoph Hellwig
2026-02-20 23:47 ` Chris Li
2026-02-23 13:23 ` Christoph Hellwig
2026-02-23 18:15 ` Chris Li
2026-02-23 18:53 ` Pedro Falcato [this message]