From: Dave Chinner <dgc@kernel.org>
To: Andres Freund <andres@anarazel.de>
Cc: Pankaj Raghav <pankaj.raghav@linux.dev>, Jan Kara <jack@suse.cz>,
	Ojaswin Mujoo <ojaswin@linux.ibm.com>,
	linux-xfs@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, lsf-pc@lists.linux-foundation.org,
	djwong@kernel.org, john.g.garry@oracle.com, willy@infradead.org,
	hch@lst.de, ritesh.list@gmail.com,
	Luis Chamberlain <mcgrof@kernel.org>,
	dchinner@redhat.com, Javier Gonzalez <javier.gonz@samsung.com>,
	gost.dev@samsung.com, tytso@mit.edu, p.raghav@samsung.com,
	vi.shah@samsung.com
Subject: Re: [LSF/MM/BPF TOPIC] Buffered atomic writes
Date: Wed, 18 Feb 2026 12:04:43 +1100
Message-ID: <aZUQKx_C3-qyU4PJ@dread>
In-Reply-To: <ignmsoluhway2yllepl2djcjjaukjijq3ejrlf4uuvh57ru7ur@njkzymuvzfqf>

On Tue, Feb 17, 2026 at 11:21:20AM -0500, Andres Freund wrote:
> Hi,
> 
> On 2026-02-17 13:42:35 +0100, Pankaj Raghav wrote:
> > On 2/17/2026 1:06 PM, Jan Kara wrote:
> > > On Mon 16-02-26 10:45:40, Andres Freund wrote:
> > > > (*) As it turns out, it often seems to improve write throughput as well;
> > > > if writeback is triggered by memory pressure instead of
> > > > SYNC_FILE_RANGE_WRITE, Linux often seems to trigger a lot more small
> > > > random IO.
> > > > 
> > > > > So immediately writing them might be ok as long as we don't remove those
> > > > > pages from the page cache like we do in RWF_UNCACHED.
> > > > 
> > > > Yes, it might.  I actually often have wished for something like a
> > > > RWF_WRITEBACK flag...
> > > 
> > > I'd call it RWF_WRITETHROUGH but otherwise it makes sense.
> > > 
> > 
> > One naive question: semantically what will be the difference between
> > RWF_DSYNC and RWF_WRITETHROUGH?

None, except that RWF_DSYNC provides data integrity guarantees.

> > So RWF_DSYNC will be the sync version and
> > RWF_WRITETHROUGH will be an async version where we kick off writeback
> > immediately in the background and return?

No.

Write-through implies synchronous IO. i.e. that IO errors are
reported immediately to the caller, not reported on the next
operation on the file.

O_DSYNC integrity writes are, by definition, write-through
(synchronous) because they have to report physical IO completion
status to the caller. This is kinda how "synchronous" got associated
with data integrity in the first place.

DIO writes are also write-through - there is nowhere to store an IO
error for later reporting, so they must be executed synchronously to
be able to report IO errors to the caller.

Hence write-through generally implies synchronous IO, but it does
not imply any data integrity guarantees are provided for the IO.

If you want async RWF_WRITETHROUGH semantics, then the IO needs to
be issued through an async IO submission interface (i.e. AIO or
io_uring). In that case, the error status will be reported through
the AIO completion, just like for DIO writes.

IOWs, RWF_WRITETHROUGH should result in buffered writes displaying
identical IO semantics to DIO writes. In doing this, we then only
need one IO path implementation per filesystem for all writethrough
IO (buffered or direct), and the only thing that differs is the folios
we attach to the bios.

> Besides sync vs async:
> 
> If the device has a volatile write cache, RWF_DSYNC will trigger flushes for
> the entire write cache or do FUA writes for just the RWF_DSYNC write.

Yes, that is exactly how the iomap DIO write path optimises
RWF_DSYNC writes. It's much harder to do this for buffered IO using
the generic buffered writeback paths, and buffered writes never use
FUA writes.

i.e., using the iomap DIO path for RWF_WRITETHROUGH | RWF_DSYNC
would bring these significant performance optimisations to buffered
writes as well...

> Which
> wouldn't be needed for RWF_WRITETHROUGH, right?

Correct, there shouldn't be any data integrity guarantees associated
with plain RWF_WRITETHROUGH.

-Dave.
-- 
Dave Chinner
dgc@kernel.org