From: Ojaswin Mujoo <ojaswin@linux.ibm.com>
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: djwong@kernel.org, john.g.garry@oracle.com, willy@infradead.org,
hch@lst.de, ritesh.list@gmail.com, jack@suse.cz,
Luis Chamberlain <mcgrof@kernel.org>,
dgc@kernel.org, tytso@mit.edu, p.raghav@samsung.com,
andres@anarazel.de, brauner@kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v2 5/5] iomap: Add DSYNC support to writethrough
Date: Thu, 9 Apr 2026 00:15:46 +0530 [thread overview]
Message-ID: <162ff37dec9295bb76ebadba6ea72ac72cc3c3df.1775658795.git.ojaswin@linux.ibm.com> (raw)
In-Reply-To: <cover.1775658795.git.ojaswin@linux.ibm.com>
Add DSYNC support to writethrough buffered writes. Unlike the usual
buffered writes where we call generic_write_sync() inline during the
syscall path, for writethrough we instead sync the data during IO
completion path, just like dio.
This allows aio writethrough to be truly async where the syscall can
return after IO submission and the sync can then be done asynchronously
during IO completion time.
Further, just like dio, we utilize the FUA optimization, if available,
to avoid syncing the data for DSYNC operations.
Suggested-by: Dave Chinner <dgc@kernel.org>
Co-developed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
fs/iomap/buffered-io.c | 37 +++++++++++++++++++++++++++++++++----
include/linux/iomap.h | 1 +
2 files changed, 34 insertions(+), 4 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 6937f10e2782..8965f603f2cf 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1119,6 +1119,14 @@ static ssize_t iomap_writethrough_complete(struct iomap_writethrough_ctx *wt_ctx
if (!ret) {
ret = wt_ctx->written;
iocb->ki_pos = wt_ctx->pos + ret;
+
+ /*
+ * If this is a DSYNC write and we couldn't optimize it, make
+ * sure we push it to stable storage now that we've written
+ * data.
+ */
+ if (iocb_is_dsync(wt_ctx->iocb) && !wt_ctx->use_fua)
+ ret = generic_write_sync(iocb, ret);
}
kfree(wt_ctx);
@@ -1173,6 +1181,7 @@ iomap_writethrough_submit_bio(struct iomap_writethrough_ctx *wt_ctx,
struct bio *bio;
unsigned int i;
u64 len = 0;
+ blk_opf_t opf = REQ_OP_WRITE;
if (!wt_ctx->nr_bvecs)
return;
@@ -1184,7 +1193,10 @@ iomap_writethrough_submit_bio(struct iomap_writethrough_ctx *wt_ctx,
wt_ops->writethrough_submit(wt_ctx->inode, iomap, wt_ctx->bio_pos,
len);
- bio = bio_alloc(iomap->bdev, wt_ctx->nr_bvecs, REQ_OP_WRITE, GFP_NOFS);
+ if (wt_ctx->use_fua)
+ opf |= REQ_FUA;
+
+ bio = bio_alloc(iomap->bdev, wt_ctx->nr_bvecs, opf, GFP_NOFS);
bio->bi_iter.bi_sector = iomap_sector(iomap, wt_ctx->bio_pos);
bio->bi_end_io = iomap_writethrough_bio_end_io;
bio->bi_private = wt_ctx;
@@ -1273,6 +1285,19 @@ static int iomap_writethrough_iter(struct iomap_writethrough_ctx *wt_ctx,
if (!(iter->flags & IOMAP_WRITETHROUGH))
return -EINVAL;
+ /*
+ * If we realise that cache flush is neccessary (eg FUA is not present
+ * or we need metadata updates) then we turn off the optimization.
+ */
+ if (wt_ctx->use_fua) {
+ if (iter->iomap.type != IOMAP_MAPPED ||
+ (iter->iomap.flags &
+ (IOMAP_F_NEW | IOMAP_F_SHARED | IOMAP_F_DIRTY)) ||
+ (bdev_write_cache(iter->iomap.bdev) &&
+ !bdev_fua(iter->iomap.bdev)))
+ wt_ctx->use_fua = false;
+ }
+
do {
struct folio *folio;
size_t offset; /* Offset into folio */
@@ -1545,9 +1570,6 @@ ssize_t iomap_file_writethrough_write(struct kiocb *iocb, struct iov_iter *i,
return -EINVAL;
if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_DONTCACHE))
return -EINVAL;
- if (iocb_is_dsync(iocb))
- /* D_SYNC support not implemented yet */
- return -EOPNOTSUPP;
/*
* +1 to max bvecs to account for unaligned write spanning multiple
@@ -1575,6 +1597,13 @@ ssize_t iomap_file_writethrough_write(struct kiocb *iocb, struct iov_iter *i,
wt_ctx->is_aio = !is_sync_kiocb(iocb);
atomic_set(&wt_ctx->ref, 1);
+ /*
+ * Similar to dio, we optimistically set use_fua=true to avoid explicit
+ * sync. In case we later realise cache flush is needed we set it back
+ * to false.
+ */
+ wt_ctx->use_fua = iocb_is_dsync(iocb) && !(iocb->ki_flags & IOCB_SYNC);
+
if (!wt_ctx->is_aio)
wt_ctx->waiter = current;
else
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index e99f7c279dc6..579bc48ed39c 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -487,6 +487,7 @@ struct iomap_writethrough_ctx {
unsigned int flags;
int error;
bool is_aio;
+ bool use_fua;
union {
/* used during submission and for non-aio completion */
--
2.53.0
prev parent reply other threads:[~2026-04-08 18:46 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-08 18:45 [RFC PATCH v2 0/5] Add buffered write-through support to iomap & xfs Ojaswin Mujoo
2026-04-08 18:45 ` [RFC PATCH v2 1/5] mm: Refactor folio_clear_dirty_for_io() Ojaswin Mujoo
2026-04-08 18:45 ` [RFC PATCH v2 2/5] iomap: Add initial support for buffered RWF_WRITETHROUGH Ojaswin Mujoo
2026-04-08 18:45 ` [RFC PATCH v2 3/5] xfs: Add RWF_WRITETHROUGH support to xfs Ojaswin Mujoo
2026-04-08 18:45 ` [RFC PATCH v2 4/5] iomap: Add aio support to RWF_WRITETHROUGH Ojaswin Mujoo
2026-04-08 18:45 ` Ojaswin Mujoo [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=162ff37dec9295bb76ebadba6ea72ac72cc3c3df.1775658795.git.ojaswin@linux.ibm.com \
--to=ojaswin@linux.ibm.com \
--cc=andres@anarazel.de \
--cc=brauner@kernel.org \
--cc=dgc@kernel.org \
--cc=djwong@kernel.org \
--cc=hch@lst.de \
--cc=jack@suse.cz \
--cc=john.g.garry@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-xfs@vger.kernel.org \
--cc=mcgrof@kernel.org \
--cc=p.raghav@samsung.com \
--cc=ritesh.list@gmail.com \
--cc=tytso@mit.edu \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox