From: Ojaswin Mujoo <ojaswin@linux.ibm.com>
To: "Pankaj Raghav (Samsung)" <pankaj.raghav@linux.dev>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
djwong@kernel.org, john.g.garry@oracle.com, willy@infradead.org,
hch@lst.de, ritesh.list@gmail.com, jack@suse.cz,
Luis Chamberlain <mcgrof@kernel.org>,
dgc@kernel.org, tytso@mit.edu, p.raghav@samsung.com,
andres@anarazel.de, brauner@kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH v2 2/5] iomap: Add initial support for buffered RWF_WRITETHROUGH
Date: Tue, 21 Apr 2026 23:45:33 +0530
Message-ID: <aee-xaUCWMM4EV7Z@li-dc0c254c-257c-11b2-a85c-98b6c1322444.ibm.com>
In-Reply-To: <kvxvy26jblhx6sijbhxbpzzwltfoskwkv2suqhop5dq742fn5y@5tlws6pzaol3>
On Mon, Apr 20, 2026 at 01:56:02PM +0200, Pankaj Raghav (Samsung) wrote:
> > +
> > + if (wt_ops->writethrough_submit)
> > + wt_ops->writethrough_submit(wt_ctx->inode, iomap, wt_ctx->bio_pos,
> > + len);
> > +
> > + bio = bio_alloc(iomap->bdev, wt_ctx->nr_bvecs, REQ_OP_WRITE, GFP_NOFS);
>
> We might want to check if bio_alloc succeeded here.
Hi Pankaj, we pass GFP_NOFS, which includes __GFP_DIRECT_RECLAIM, and
according to the comment above bio_alloc_bioset() (which bio_alloc() wraps):

 * If %__GFP_DIRECT_RECLAIM is set then bio_alloc will always be able to
 * allocate a bio. This is due to the mempool guarantees. To make this work,
 * callers must never allocate more than 1 bio at a time from the general pool.

And we seem to be following this.
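To spell the reasoning out, here is a minimal userspace sketch, not kernel
code. The DEMO_* bit values and demo_bio_alloc_may_fail() are made up for
illustration (the real flags live in include/linux/gfp_types.h and their
values differ between kernel versions). The only thing taken from the kernel
headers is the composition: GFP_NOFS is __GFP_RECLAIM | __GFP_IO, and
__GFP_RECLAIM covers both direct and kswapd reclaim.

```c
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real values are different. */
#define DEMO_GFP_IO             0x1u
#define DEMO_GFP_FS             0x2u
#define DEMO_GFP_DIRECT_RECLAIM 0x4u
#define DEMO_GFP_KSWAPD_RECLAIM 0x8u

/* Same composition as the kernel headers, with the demo bits above. */
#define DEMO_GFP_RECLAIM (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)
#define DEMO_GFP_NOFS    (DEMO_GFP_RECLAIM | DEMO_GFP_IO)
#define DEMO_GFP_KERNEL  (DEMO_GFP_RECLAIM | DEMO_GFP_IO | DEMO_GFP_FS)
/* Simplified: the real GFP_NOWAIT carries more than the kswapd bit. */
#define DEMO_GFP_NOWAIT  (DEMO_GFP_KSWAPD_RECLAIM)

/*
 * Hypothetical helper modelling the quoted comment: an allocation backed
 * by a mempool can only come back empty when the caller does not allow
 * direct reclaim, i.e. when it is not allowed to sleep and wait for an
 * element to be returned to the pool.
 */
static bool demo_bio_alloc_may_fail(unsigned int gfp)
{
	return !(gfp & DEMO_GFP_DIRECT_RECLAIM);
}
```

Under this model demo_bio_alloc_may_fail(DEMO_GFP_NOFS) is false, so a NULL
check after the bio_alloc() call would be dead code, as long as we keep to
the "one bio at a time from the general pool" rule from the comment.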
>
> > + bio->bi_iter.bi_sector = iomap_sector(iomap, wt_ctx->bio_pos);
> > + bio->bi_end_io = iomap_writethrough_bio_end_io;
> > + bio->bi_private = wt_ctx;
> > +
> > + for (i = 0; i < wt_ctx->nr_bvecs; i++)
> > + __bio_add_page(bio, wt_ctx->bvec[i].bv_page,
> > + wt_ctx->bvec[i].bv_len,
> > + wt_ctx->bvec[i].bv_offset);
> > +
> > + atomic_inc(&wt_ctx->ref);
> > + submit_bio(bio);
> > + wt_ctx->nr_bvecs = 0;
> > +}
> > +
> <snip>
> > +
> > +/**
> > + * iomap_writethrough_iter - perform RWF_WRITETHROUGH buffered write
> > + * @wt_ctx: writethrough context
> > + * @iter: iomap iter holding mapping information
> > + * @i: iov_iter for write
> > + * @wt_ops: the fs callbacks needed for writethrough
> > + *
> > + * This function copies the user buffer to folio similar to usual buffered
> > + * IO path, with the difference that we immediately issue the IO. For this we
> > + * utilize IO submission and completion mechanism that is inspired by dio.
> > + *
> > + * Folio handling note: We might be writing through a partial folio so we need
> > + * to be careful to not clear the folio dirty bit unless there are no dirty blocks
> > + * in the folio after the writethrough.
> > + */
> > +static int iomap_writethrough_iter(struct iomap_writethrough_ctx *wt_ctx,
> > + struct iomap_iter *iter, struct iov_iter *i,
> > + const struct iomap_writethrough_ops *wt_ops)
> > +
> > +{
> > + ssize_t total_written = 0;
> > + int status = 0;
> > + struct address_space *mapping = iter->inode->i_mapping;
> > + size_t chunk = mapping_max_folio_size(mapping);
> > + unsigned int bdp_flags = (iter->flags & IOMAP_NOWAIT) ? BDP_ASYNC : 0;
> > + unsigned int bs = i_blocksize(iter->inode);
> > +
> > + /* copied over based on DIO handles these flags */
> > + if (iter->iomap.type == IOMAP_UNWRITTEN)
> > + wt_ctx->flags |= IOMAP_DIO_UNWRITTEN;
> > + if (iter->iomap.flags & IOMAP_F_SHARED)
> > + wt_ctx->flags |= IOMAP_DIO_COW;
> > +
> > + if (!(iter->flags & IOMAP_WRITETHROUGH))
> > + return -EINVAL;
> > +
> > + do {
> > + struct folio *folio;
> > + size_t offset; /* Offset into folio */
> > + u64 bytes; /* Bytes to write to folio */
> > + size_t copied; /* Bytes copied from user */
> > + u64 written; /* Bytes have been written */
> > + loff_t pos;
> > + size_t off_aligned, len_aligned;
> > +
> > + bytes = iov_iter_count(i);
> > +retry:
> > + offset = iter->pos & (chunk - 1);
> > + bytes = min(chunk - offset, bytes);
> > + status = balance_dirty_pages_ratelimited_flags(mapping,
> > + bdp_flags);
> > + if (unlikely(status))
> > + break;
> > +
> > + /*
> > + * If completions already occurred and reported errors, give up
> > + * now and don't bother submitting more bios.
> > + */
> > + if (unlikely(data_race(wt_ctx->error))) {
>
> In the unlikely scenario where we encounter an error, do we also have to
> clear the writeback flag on all the folios that have been part of this
> bvec until now?
>
> Something like explicitly iterate over wt_ctx->bvec[0] through
> wt_ctx->bvec[nr_bvecs - 1], manually call folio_end_writeback(bvec[i].bv_page)
> on them, and then discard the bvecs by setting the nr_bvecs = 0;
>
> I am wondering if the folios that were processed until now will be in
> PG_WRITEBACK state which can affect reclaim as we never clear the flag.
Hey Pankaj, yes, you are right. I think the error handling is a bit buggy,
and Sashiko has also pointed out some of these issues. I'll take care of
this in v3. Thanks for pointing this out.
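
For reference, the cleanup suggested above could look roughly like the
following. This is a userspace model under stated assumptions, not the v3
code: struct demo_folio, struct demo_wt_ctx, DEMO_MAX_BVECS and both helpers
are hypothetical stand-ins, and demo_folio_end_writeback() only mimics what
folio_end_writeback() would do in the kernel. In the real code the bvec
holds a page, so it would first have to be converted to its folio.

```c
#include <stdbool.h>

#define DEMO_MAX_BVECS 16	/* hypothetical capacity of the bvec array */

/* Stand-in for a folio; only the writeback state is modelled. */
struct demo_folio {
	bool writeback;
};

struct demo_bvec {
	struct demo_folio *folio;
};

/* Stand-in for the writethrough context quoted in this thread. */
struct demo_wt_ctx {
	struct demo_bvec bvec[DEMO_MAX_BVECS];
	unsigned int nr_bvecs;
	int error;
};

/* Models folio_end_writeback(): clear the writeback state. */
static void demo_folio_end_writeback(struct demo_folio *folio)
{
	folio->writeback = false;
}

/*
 * Drop every bvec that was queued but never submitted: end writeback on
 * each folio that is still under writeback, then empty the array.
 * Returns the number of folios whose writeback state was cleared.
 */
static unsigned int demo_wt_drop_pending(struct demo_wt_ctx *ctx)
{
	unsigned int i, released = 0;

	for (i = 0; i < ctx->nr_bvecs; i++) {
		struct demo_folio *folio = ctx->bvec[i].folio;

		/* The same folio may back several bvecs; end it only once. */
		if (folio && folio->writeback) {
			demo_folio_end_writeback(folio);
			released++;
		}
	}
	ctx->nr_bvecs = 0;
	return released;
}
```

The writeback check in the loop is there because a partially written folio
could show up in more than one bvec, and ending writeback twice on the same
folio would be a bug of its own.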
Regards,
ojaswin
>
> > + wt_ctx->nr_bvecs = 0;
> > + break;
> > + }
> > +
>
> --
> Pankaj
Thread overview: 18+ messages
2026-04-08 18:45 [RFC PATCH v2 0/5] Add buffered write-through support to iomap & xfs Ojaswin Mujoo
2026-04-08 18:45 ` [RFC PATCH v2 1/5] mm: Refactor folio_clear_dirty_for_io() Ojaswin Mujoo
2026-04-15 6:14 ` Christoph Hellwig
2026-04-08 18:45 ` [RFC PATCH v2 2/5] iomap: Add initial support for buffered RWF_WRITETHROUGH Ojaswin Mujoo
2026-04-16 12:05 ` Jan Kara
2026-04-16 12:34 ` Jan Kara
2026-04-17 19:42 ` Ojaswin Mujoo
2026-04-20 11:28 ` Jan Kara
2026-04-21 18:07 ` Ojaswin Mujoo
2026-04-17 4:13 ` Pankaj Raghav (Samsung)
2026-04-18 7:33 ` Ojaswin Mujoo
2026-04-20 11:56 ` Pankaj Raghav (Samsung)
2026-04-21 18:15 ` Ojaswin Mujoo [this message]
2026-04-08 18:45 ` [RFC PATCH v2 3/5] xfs: Add RWF_WRITETHROUGH support to xfs Ojaswin Mujoo
2026-04-08 18:45 ` [RFC PATCH v2 4/5] iomap: Add aio support to RWF_WRITETHROUGH Ojaswin Mujoo
2026-04-08 18:45 ` [RFC PATCH v2 5/5] iomap: Add DSYNC support to writethrough Ojaswin Mujoo
2026-04-17 3:54 ` [RFC PATCH v2 0/5] Add buffered write-through support to iomap & xfs Pankaj Raghav (Samsung)
2026-04-18 7:26 ` Ojaswin Mujoo