From: Ojaswin Mujoo <ojaswin@linux.ibm.com>
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: djwong@kernel.org, john.g.garry@oracle.com, willy@infradead.org,
hch@lst.de, ritesh.list@gmail.com, jack@suse.cz,
Luis Chamberlain <mcgrof@kernel.org>,
dgc@kernel.org, tytso@mit.edu, p.raghav@samsung.com,
andres@anarazel.de, brauner@kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v2 0/5] Add buffered write-through support to iomap & xfs
Date: Thu, 9 Apr 2026 00:15:41 +0530
Message-ID: <cover.1775658795.git.ojaswin@linux.ibm.com>
Hi all,
This is the v2 RFC to add buffered writethrough support to iomap and
xfs. The changes made are mostly to bring the writethrough implementation
more in line with how dio handles writes.
** Changes since RFC v1 [3] **
1. In v1, even the non-aio writethrough syscall returned after IO submission
but before the IO had finished. However, upon revisiting some of the
discussions, we feel that it is cleaner to keep the behavior similar to dio,
i.e. the non-aio variant should only return after the IO completes and report
any errors upon return. Hence v2 now follows the same pattern as dio, where
non-aio writethrough waits for the write to finish whereas aio writethrough
returns after submission. This is in line with the discussion here [2].
2. Instead of submitting a bio per folio, we now submit a bio per iomap.
Only once all IOs are complete do we call the completion function to invoke
the FS specific ->end_io().
3. Instead of reusing dio code, we have open coded the IO submission and
completion. Although this is heavily inspired by dio, trying to hammer
buffered writethrough handling into iomap_dio_rw() was resulting in ugly
if/else chains and hard to follow code. The open coded variant is cleaner
and easier to follow; however, ideally we should try to factor out the
common parts of the dio code to have a cleaner interface.
4. Support for aio and DSYNC writethrough is added, which utilizes FUA
optimizations if available.
5. Added a new ->writethrough_submit() operation which allows FSes to
perform tasks before IO submission, such as converting COW mappings to
written. The motivation is explained in patch 3.
6. Refactored folio_clear_dirty_for_io() so it can be reused without
having to call folio_mkclean(). This is because writethrough mkcleans
the folio in all cases but only clears the dirty bit if the whole folio
is about to become clean.
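To make change 2 above concrete, here is a small userspace model of the
"one bio per iomap, complete only when every bio is done" pattern. It is
only a sketch and is not code from the series: struct wt_ctx and all of the
wt_*() helpers are hypothetical names, and the reference counting scheme
(one reference for the submitter plus one per in-flight bio) is an
assumption borrowed from how dio-style completion is commonly structured.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace model; none of these names come from the patches. */
struct wt_ctx {
	atomic_int ref;     /* one ref for the submitter + one per in-flight bio */
	atomic_int error;   /* first error reported by any bio, 0 if none */
	int end_io_calls;   /* how many times the completion ran */
	int end_io_error;   /* error value seen by the completion */
};

static void wt_ctx_init(struct wt_ctx *ctx)
{
	atomic_init(&ctx->ref, 1);      /* the submitter's reference */
	atomic_init(&ctx->error, 0);
	ctx->end_io_calls = 0;
	ctx->end_io_error = 0;
}

/* Stand-in for the point where the FS specific ->end_io() would run. */
static void wt_complete(struct wt_ctx *ctx)
{
	ctx->end_io_calls++;
	ctx->end_io_error = atomic_load(&ctx->error);
}

/* Drop one reference; whoever drops the last one runs the completion. */
static void wt_put(struct wt_ctx *ctx)
{
	if (atomic_fetch_sub(&ctx->ref, 1) == 1)
		wt_complete(ctx);
}

/* Called once per iomap, before the bio for that mapping is issued. */
static void wt_submit_one(struct wt_ctx *ctx)
{
	atomic_fetch_add(&ctx->ref, 1);
}

/* Called from each bio's completion; only the first error is kept. */
static void wt_bio_end_io(struct wt_ctx *ctx, int error)
{
	int expected = 0;

	if (error)
		atomic_compare_exchange_strong(&ctx->error, &expected, error);
	wt_put(ctx);
}
```

In this model the submitter calls wt_put() once it has issued the last bio,
so the completion runs exactly once regardless of whether the bios finish
before or after submission ends.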
[2] https://lore.kernel.org/all/aZUQKx_C3-qyU4PJ@dread/
[3] https://lore.kernel.org/linux-xfs/cover.1773076216.git.ojaswin@linux.ibm.com/
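The dirty handling described in change 6 can be modelled in userspace as
follows. This is a rough sketch under stated assumptions, not the kernel
implementation: struct model_folio, its per-block dirty bitmap and the
model_*() helpers are all hypothetical, and "mkclean" is reduced to
clearing a single writable_mapped flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a folio with up to 32 filesystem blocks. */
struct model_folio {
	uint32_t dirty_blocks;  /* per-block dirty bitmap (bit i = block i) */
	bool dirty;             /* stands in for the folio dirty flag */
	bool writable_mapped;   /* stands in for writable PTEs on the folio */
};

/* Mask covering blocks [first, first + count). */
static uint32_t model_block_mask(unsigned first, unsigned count)
{
	uint32_t mask = (count >= 32) ? UINT32_MAX
				      : ((UINT32_C(1) << count) - 1);

	return mask << first;
}

/* A buffered write dirties some blocks and therefore the folio. */
static void model_dirty_range(struct model_folio *f,
			      unsigned first, unsigned count)
{
	f->dirty_blocks |= model_block_mask(first, count);
	f->dirty = true;
}

/*
 * Prepare blocks [first, first + count) for write-through.
 * The folio is always write-protected ("mkclean"), but the folio dirty
 * flag is only cleared when no dirty blocks remain afterwards.
 * Returns true if the folio dirty flag was cleared.
 */
static bool model_writethrough_prepare(struct model_folio *f,
				       unsigned first, unsigned count)
{
	f->writable_mapped = false;
	f->dirty_blocks &= ~model_block_mask(first, count);
	if (f->dirty_blocks)
		return false;   /* other blocks still dirty: folio stays dirty */
	f->dirty = false;
	return true;
}
```

Writing through only part of the dirty data leaves the folio dirty for
regular writeback to pick up later, while still write-protecting it so
that subsequent mmap stores fault and redirty it.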
*** Original Cover ***
Hi all,
This patchset implements an early design prototype of buffered I/O
write-through semantics in Linux.
This idea mainly picked up traction as a way to enable RWF_ATOMIC buffered
IO [1]; however, the write-through path can have many use cases beyond
atomic writes, such as:
- enabling truly async AIO buffered I/O when issued with O_DSYNC
- better scalability for buffered I/O
The implementation of write-through combines the buffered IO frontend
with dio backend, which leads to some interesting interactions.
I've added most of the design notes in the respective patches. Please note
that this is an initial RFC to iron out any early design issues. This is
largely based on suggestions from Dave and Jan in [1], so thanks for the
pointers!
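For readers who want to try the interface from userspace, a minimal sketch
of a per-write opt-in via pwritev2() could look like the following. It
assumes RWF_WRITETHROUGH is picked up from uapi headers built with this
series applied; the value is deliberately not defined here, since guessing
one could collide with another RWF_ flag. wt_pwrite() and wt_demo() are
hypothetical helper names, and the fallback policy on
EOPNOTSUPP/EINVAL is an assumption rather than something the patches
prescribe.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Write len bytes at offset off, asking for write-through when the build
 * headers know the flag. Without it this compiles to a plain buffered
 * write.
 */
static ssize_t wt_pwrite(int fd, const void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };

#ifdef RWF_WRITETHROUGH
	ssize_t ret = pwritev2(fd, &iov, 1, off, RWF_WRITETHROUGH);

	/* ASSUMPTION: unsupported kernels/filesystems fail with one of these. */
	if (ret >= 0 || (errno != EOPNOTSUPP && errno != EINVAL))
		return ret;
#endif
	/* Fallback: ordinary buffered write, no write-through semantics. */
	return pwritev2(fd, &iov, 1, off, 0);
}

/* Hypothetical demo helper: write a short message to a temp file. */
static ssize_t wt_demo(void)
{
	char path[] = "/tmp/wt_demo_XXXXXX";
	const char msg[] = "hello";   /* 6 bytes including the NUL */
	int fd = mkstemp(path);
	ssize_t ret;

	if (fd < 0)
		return -1;
	ret = wt_pwrite(fd, msg, sizeof(msg), 0);
	close(fd);
	unlink(path);
	return ret;
}
```

A real caller would also need to handle short writes and decide whether a
silent fallback to a normal buffered write is acceptable for its
durability requirements.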
* Testing Notes (UPDATED) *
- I've added support for RWF_WRITETHROUGH to fsx and fsstress in
xfstests and these patches survive fsx with integrity verification as
well as fsstress parallel stressing.
- -g quick with block size == page size and block size < page size shows
no new regressions.
* Design TODOs (UPDATED) *
- Evaluate if we need to tag the page cache dirty bit in the xarray, since
PG_writeback is already set on the folio.
- Look into a better way to refactor writethrough path by reusing common
parts of dio code.
* Future work (once design is finalized) (UPDATED) *
- Add RWF_ATOMIC support for buffered IO via write-through path
- Add support for other RWF_ flags in the write-through buffered I/O path
- Benchmarking numbers and more thorough testing needed.
- ext4 support for writethrough
- Utilize writethrough for normal buffered DSYNC path to get truly async
semantics for DSYNC
- Look into folio batching support.
As usual, thoughts and suggestions are welcome.
[1] https://lore.kernel.org/all/d0c4d95b-8064-4a7e-996d-7ad40eb4976b@linux.dev/
Regards,
ojaswin
Ojaswin Mujoo (5):
mm: Refactor folio_clear_dirty_for_io()
iomap: Add initial support for buffered RWF_WRITETHROUGH
xfs: Add RWF_WRITETHROUGH support to xfs
iomap: Add aio support to RWF_WRITETHROUGH
iomap: Add DSYNC support to writethrough
fs/iomap/buffered-io.c | 420 ++++++++++++++++++++++++++++++++++++++++
fs/xfs/xfs_file.c | 53 ++++-
include/linux/fs.h | 7 +
include/linux/iomap.h | 45 +++++
include/linux/pagemap.h | 1 +
include/uapi/linux/fs.h | 5 +-
mm/page-writeback.c | 18 +-
7 files changed, 540 insertions(+), 9 deletions(-)
--
2.53.0