From: Brian Foster <bfoster@redhat.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
hannes@cmpxchg.org, clm@meta.com, linux-kernel@vger.kernel.org,
willy@infradead.org, kirill@shutemov.name,
linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH 13/16] iomap: make buffered writes work with RWF_UNCACHED
Date: Tue, 12 Nov 2024 11:37:43 -0500
Message-ID: <ZzOEVwWpGEaq6wE7@bfoster>
In-Reply-To: <20241111234842.2024180-14-axboe@kernel.dk>
On Mon, Nov 11, 2024 at 04:37:40PM -0700, Jens Axboe wrote:
> Add iomap buffered write support for RWF_UNCACHED. If RWF_UNCACHED is
> set for a write, mark the folios being written with drop_writeback. Then
s/drop_writeback/uncached/ ?
BTW, this might be getting into wonky "don't care that much" territory,
but something else to be aware of is that certain writes can potentially
change pagecache state as a side effect outside of the actual buffered
write itself.
For example, xfs calls iomap_zero_range() on write extension (i.e. pos >
isize), which uses buffered writes and thus could populate a pagecache
folio without setting it uncached, even if done on behalf of an uncached
write.
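To make the side effect above concrete, here is a minimal user-space sketch in C. It is a toy model only, not kernel code: `toy_folio`, `toy_get_folio()`, `toy_uncached_write()` and `toy_writeback_complete()` are hypothetical names invented for illustration. It assumes that zeroing on write extension only touches the folio holding the old EOF, and that an existing folio keeps whatever flag it was created with.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE	4096L
#define TOY_NR_FOLIOS	8

/* Toy stand-in for a pagecache folio; not the kernel's struct folio. */
struct toy_folio {
	bool present;	/* folio is in the toy cache */
	bool uncached;	/* dropped when writeback completes */
};

static struct toy_folio cache[TOY_NR_FOLIOS];
static long toy_isize;	/* toy inode size in bytes */

/* Create the folio covering pos if missing; only creation sets the flag. */
void toy_get_folio(long pos, bool uncached)
{
	long idx = pos / TOY_PAGE_SIZE;

	assert(idx >= 0 && idx < TOY_NR_FOLIOS);
	if (!cache[idx].present) {
		cache[idx].present = true;
		cache[idx].uncached = uncached;
	}
}

/* Model an uncached buffered write of len bytes at pos. */
void toy_uncached_write(long pos, long len)
{
	long p;

	/*
	 * Write extension: zero the tail of the old EOF folio through the
	 * plain buffered path, which does not mark the folio uncached.
	 */
	if (pos > toy_isize && toy_isize % TOY_PAGE_SIZE)
		toy_get_folio(toy_isize, false);

	for (p = pos - pos % TOY_PAGE_SIZE; p < pos + len; p += TOY_PAGE_SIZE)
		toy_get_folio(p, true);

	if (pos + len > toy_isize)
		toy_isize = pos + len;
}

/* Model writeback completion: only folios marked uncached are dropped. */
void toy_writeback_complete(void)
{
	int i;

	for (i = 0; i < TOY_NR_FOLIOS; i++)
		if (cache[i].present && cache[i].uncached)
			cache[i].present = cache[i].uncached = false;
}

/* Count folios still sitting in the toy cache. */
int toy_cached_folios(void)
{
	int i, n = 0;

	for (i = 0; i < TOY_NR_FOLIOS; i++)
		n += cache[i].present;
	return n;
}

/* Two small uncached writes, the second one sparse and file extending. */
int toy_demo(void)
{
	memset(cache, 0, sizeof(cache));
	toy_isize = 0;

	toy_uncached_write(0, 100);
	toy_writeback_complete();		/* folio 0 is dropped */

	toy_uncached_write(3 * TOY_PAGE_SIZE, 100);
	toy_writeback_complete();		/* folio 3 is dropped */

	return toy_cached_folios();		/* folio 0 came back via zeroing */
}
```

Calling `toy_demo()` from a `main()` leaves one folio behind in this model: the old EOF folio, repopulated by the zeroing step without the uncached flag, even though both writes were uncached. Whether the real code behaves this way depends on details the toy ignores (extent state, folio size, whether the EOF folio is still resident).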
I've only made a first pass and could be missing some details, but IIUC
I _think_ this means that writing out a stream of small, sparse,
file-extending uncached writes could actually end up behaving more like
sync I/O. Again, not saying that's something we really care about, just
raising it in case it's worth considering or documenting.
Brian
> writeback completion will drop the pages. The write_iter handler simply
> kicks off writeback for the pages, and writeback completion will take
> care of the rest.
>
> This still needs the user of the iomap buffered write helpers to call
> iocb_uncached_write() upon successful issue of the writes.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
> fs/iomap/buffered-io.c | 15 +++++++++++++--
> include/linux/iomap.h | 4 +++-
> 2 files changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index ef0b68bccbb6..2f2a5db04a68 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -603,6 +603,8 @@ struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos, size_t len)
>
> if (iter->flags & IOMAP_NOWAIT)
> fgp |= FGP_NOWAIT;
> + if (iter->flags & IOMAP_UNCACHED)
> + fgp |= FGP_UNCACHED;
> fgp |= fgf_set_order(len);
>
> return __filemap_get_folio(iter->inode->i_mapping, pos >> PAGE_SHIFT,
> @@ -1023,8 +1025,9 @@ ssize_t
> iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
> const struct iomap_ops *ops, void *private)
> {
> + struct address_space *mapping = iocb->ki_filp->f_mapping;
> struct iomap_iter iter = {
> - .inode = iocb->ki_filp->f_mapping->host,
> + .inode = mapping->host,
> .pos = iocb->ki_pos,
> .len = iov_iter_count(i),
> .flags = IOMAP_WRITE,
> @@ -1034,9 +1037,14 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
>
> if (iocb->ki_flags & IOCB_NOWAIT)
> iter.flags |= IOMAP_NOWAIT;
> + if (iocb->ki_flags & IOCB_UNCACHED)
> + iter.flags |= IOMAP_UNCACHED;
>
> - while ((ret = iomap_iter(&iter, ops)) > 0)
> + while ((ret = iomap_iter(&iter, ops)) > 0) {
> + if (iocb->ki_flags & IOCB_UNCACHED)
> + iter.iomap.flags |= IOMAP_F_UNCACHED;
> iter.processed = iomap_write_iter(&iter, i);
> + }
>
> if (unlikely(iter.pos == iocb->ki_pos))
> return ret;
> @@ -1770,6 +1778,9 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
> size_t poff = offset_in_folio(folio, pos);
> int error;
>
> + if (folio_test_uncached(folio))
> + wpc->iomap.flags |= IOMAP_F_UNCACHED;
> +
> if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos)) {
> new_ioend:
> error = iomap_submit_ioend(wpc, 0);
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index f61407e3b121..2efc72df19a2 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -64,6 +64,7 @@ struct vm_fault;
> #define IOMAP_F_BUFFER_HEAD 0
> #endif /* CONFIG_BUFFER_HEAD */
> #define IOMAP_F_XATTR (1U << 5)
> +#define IOMAP_F_UNCACHED (1U << 6)
>
> /*
> * Flags set by the core iomap code during operations:
> @@ -173,8 +174,9 @@ struct iomap_folio_ops {
> #define IOMAP_NOWAIT (1 << 5) /* do not block */
> #define IOMAP_OVERWRITE_ONLY (1 << 6) /* only pure overwrites allowed */
> #define IOMAP_UNSHARE (1 << 7) /* unshare_file_range */
> +#define IOMAP_UNCACHED (1 << 8) /* uncached IO */
> #ifdef CONFIG_FS_DAX
> -#define IOMAP_DAX (1 << 8) /* DAX mapping */
> +#define IOMAP_DAX (1 << 9) /* DAX mapping */
> #else
> #define IOMAP_DAX 0
> #endif /* CONFIG_FS_DAX */
> --
> 2.45.2
>
>