From: Dave Chinner <dgc@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Andres Freund <andres@anarazel.de>,
Pankaj Raghav <pankaj.raghav@linux.dev>, Jan Kara <jack@suse.cz>,
Ojaswin Mujoo <ojaswin@linux.ibm.com>,
linux-xfs@vger.kernel.org, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org, lsf-pc@lists.linux-foundation.org,
djwong@kernel.org, john.g.garry@oracle.com, willy@infradead.org,
ritesh.list@gmail.com, Luis Chamberlain <mcgrof@kernel.org>,
dchinner@redhat.com, Javier Gonzalez <javier.gonz@samsung.com>,
gost.dev@samsung.com, tytso@mit.edu, p.raghav@samsung.com,
vi.shah@samsung.com
Subject: Re: [LSF/MM/BPF TOPIC] Buffered atomic writes
Date: Thu, 19 Feb 2026 10:42:00 +1100
Message-ID: <aZZOSFdL_L_EoU34@dread>
In-Reply-To: <20260218064739.GA8881@lst.de>
On Wed, Feb 18, 2026 at 07:47:39AM +0100, Christoph Hellwig wrote:
> On Wed, Feb 18, 2026 at 12:04:43PM +1100, Dave Chinner wrote:
> > > > > I'd call it RWF_WRITETHROUGH but otherwise it makes sense.
> > > > >
> > > >
> > > > One naive question: semantically what will be the difference between
> > > > RWF_DSYNC and RWF_WRITETHROUGH?
> >
> > None, except that RWF_DSYNC provides data integrity guarantees.
>
> Which boils down to RWF_DSYNC still writing out the inode and flushing
> the cache.
>
> > > Which
> > > wouldn't be needed for RWF_WRITETHROUGH, right?
> >
> > Correct, there shouldn't be any data integrity guarantees associated
> > with plain RWF_WRITETHROUGH.
>
> Which makes me curious if the plain RWF_WRITETHROUGH would be all
> that useful.
For modern SSDs, I think the answer is yes.
e.g. when you are doing lots of small writes to many files from many
threads, it bottlenecks on single threaded writeback. All of the IO
is submitted by background writeback which runs out of CPU fairly
quickly. We end up dirty throttling and topping out at ~100k random
4kB buffered writes IOPS regardless of how much submitter
concurrency we have.
If we switch that to RWF_WRITETHROUGH, we now have N submitting
threads that can all work in parallel, we get pretty much zero dirty
folio backlog (so no dirty throttling and more consistent IO
latency) and throughput can scale much higher because we have IO
submitter concurrency to spread the CPU load around.
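To make the submission model concrete, here is a minimal userspace
sketch, not the kernel hack described in this mail. RWF_WRITETHROUGH is
only proposed in this thread and does not exist in any released kernel,
so the sketch assumes RWF_DSYNC as a stand-in: it also pushes the IO out
from the calling thread, but adds the data integrity work that plain
write-through would skip. The names write_blocks(), run() and
STAND_IN_FLAG are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096  # assumption: "4kB" means 4096 bytes

# RWF_WRITETHROUGH is a proposal only and has no defined value.
# RWF_DSYNC is used as the closest existing per-IO flag; it is stronger
# than the proposed flag because it also gives integrity guarantees.
# Fall back to 0 (plain buffered write) where the flag is unavailable.
STAND_IN_FLAG = getattr(os, "RWF_DSYNC", 0)


def write_blocks(path, nblocks):
    """Write nblocks 4kB blocks to path, submitting IO from this thread."""
    buf = b"\0" * BLOCK
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for i in range(nblocks):
            # Buffered write through the page cache with a per-IO flag.
            written += os.pwritev(fd, [buf], i * BLOCK, STAND_IN_FLAG)
    finally:
        os.close(fd)
    return written


def run(nthreads, nblocks, directory):
    """Use one file per thread so each submitter works independently."""
    paths = [os.path.join(directory, f"file.{t}") for t in range(nthreads)]
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        return sum(pool.map(write_blocks, paths, [nblocks] * nthreads))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(run(nthreads=4, nblocks=8, directory=d), "bytes written")
```

Each thread issues and waits for its own IO, so submission CPU cost is
spread over N tasks rather than a single flusher. This sketch only shows
the shape of the workload; it says nothing about the performance of the
proposed flag.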
I did an fsmark test of a write-through hack a couple of years back,
creating and writing 4kB data files concurrently in a directory per
thread. With vanilla writeback, it topped out at about 80k 4kB file
creates/s from 4 threads and only went slower the more I increased
the userspace create concurrency.
Using writethrough submission, it topped out at about 400k 4kB file
creates/s from 32 threads and was largely limited in the fsmark
tasks by the CPU overhead for file creation, user data copying and
data extent space allocation.
I also did a multi-file, multi-process random 4kB write test with
fio, using files much larger than memory and long runtimes. Once the
normal background write path started dirty throttling, it ran at
about 100k 4kB write IOPS, again limited by the single threaded writeback
flusher using all its CPU time for allocating blocks during
writeback.
Using writethrough, I saw about 900k IOPS being sustained right from
the start, largely limited by a combination of CPU usage and IO
latency in the fio task context. In comparison, the same workload
with DIO ran to the storage capability of 1.6M IOPS because it had
significantly lower CPU usage and IO latency.
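The ratios implied by the approximate figures quoted above can be worked
out as follows. This is only arithmetic on the numbers in this mail,
assuming 4kB means 4096 bytes; the dictionary and function names are
hypothetical.

```python
BLOCK = 4096  # assumption: "4kB" means 4096 bytes

# Approximate figures as quoted in this mail.
fio_iops = {"writeback": 100_000, "writethrough": 900_000, "dio": 1_600_000}
# (file creates per second, number of threads at the peak)
fsmark = {"writeback": (80_000, 4), "writethrough": (400_000, 32)}


def bandwidth_mib_s(iops, block=BLOCK):
    """Convert an IOPS figure at a fixed block size to MiB/s."""
    return iops * block / (1024 * 1024)


def speedup(table, name, baseline="writeback"):
    """Ratio of one fio result to the vanilla writeback result."""
    return table[name] / table[baseline]


def creates_per_thread(name):
    """Average fsmark create rate per thread at the reported peak."""
    creates, threads = fsmark[name]
    return creates / threads


for name, iops in fio_iops.items():
    print(f"fio {name}: {bandwidth_mib_s(iops):.1f} MiB/s, "
          f"{speedup(fio_iops, name):.1f}x writeback")

for name in fsmark:
    print(f"fsmark {name}: {creates_per_thread(name):.0f} creates/s per thread")
```

The per-thread fsmark rate is lower with writethrough even though the
aggregate is higher, which fits the description of the tasks becoming
CPU bound on create, copy and allocation work.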
I also did some kernel compile tests with writethrough for all
buffered write IO. On fast storage there was negligible
difference in performance between vanilla buffered writes and
submitter driven blocking write-through. This result made me
question the need for caching on modern SSDs at all :)
-Dave.
--
Dave Chinner
dgc@kernel.org