From: Jens Axboe <axboe@kernel.dk>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	hannes@cmpxchg.org, clm@meta.com, linux-kernel@vger.kernel.org,
	willy@infradead.org, kirill@shutemov.name,
	linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 13/16] iomap: make buffered writes work with RWF_UNCACHED
Date: Mon, 11 Nov 2024 18:30:28 -0700
Message-ID: <bc0ea54c-90c0-48f1-a9a1-50463ffc0d97@kernel.dk>
In-Reply-To: <20241112010157.GE9421@frogsfrogsfrogs>

On 11/11/24 6:01 PM, Darrick J. Wong wrote:
> On Mon, Nov 11, 2024 at 04:37:40PM -0700, Jens Axboe wrote:
>> Add iomap buffered write support for RWF_UNCACHED. If RWF_UNCACHED is
>> set for a write, mark the folios being written with drop_writeback. Then
>> writeback completion will drop the pages. The write_iter handler simply
>> kicks off writeback for the pages, and writeback completion will take
>> care of the rest.
>>
>> This still needs the user of the iomap buffered write helpers to call
>> iocb_uncached_write() upon successful issue of the writes.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>> fs/iomap/buffered-io.c | 15 +++++++++++++--
>> include/linux/iomap.h | 4 +++-
>> 2 files changed, 16 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
>> index ef0b68bccbb6..2f2a5db04a68 100644
>> --- a/fs/iomap/buffered-io.c
>> +++ b/fs/iomap/buffered-io.c
>> @@ -603,6 +603,8 @@ struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos, size_t len)
>>
>>  	if (iter->flags & IOMAP_NOWAIT)
>>  		fgp |= FGP_NOWAIT;
>> +	if (iter->flags & IOMAP_UNCACHED)
>> +		fgp |= FGP_UNCACHED;
>>  	fgp |= fgf_set_order(len);
>>
>>  	return __filemap_get_folio(iter->inode->i_mapping, pos >> PAGE_SHIFT,
>> @@ -1023,8 +1025,9 @@ ssize_t
>> iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
>>  		const struct iomap_ops *ops, void *private)
>>  {
>> +	struct address_space *mapping = iocb->ki_filp->f_mapping;
>>  	struct iomap_iter iter = {
>> -		.inode = iocb->ki_filp->f_mapping->host,
>> +		.inode = mapping->host,
>>  		.pos = iocb->ki_pos,
>>  		.len = iov_iter_count(i),
>>  		.flags = IOMAP_WRITE,
>> @@ -1034,9 +1037,14 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
>>
>>  	if (iocb->ki_flags & IOCB_NOWAIT)
>>  		iter.flags |= IOMAP_NOWAIT;
>> +	if (iocb->ki_flags & IOCB_UNCACHED)
>> +		iter.flags |= IOMAP_UNCACHED;
>>
>> -	while ((ret = iomap_iter(&iter, ops)) > 0)
>> +	while ((ret = iomap_iter(&iter, ops)) > 0) {
>> +		if (iocb->ki_flags & IOCB_UNCACHED)
>> +			iter.iomap.flags |= IOMAP_F_UNCACHED;
>>  		iter.processed = iomap_write_iter(&iter, i);
>> +	}
>>
>>  	if (unlikely(iter.pos == iocb->ki_pos))
>>  		return ret;
>> @@ -1770,6 +1778,9 @@ static int iomap_add_to_ioend(struct iomap_writepage_ctx *wpc,
>>  	size_t poff = offset_in_folio(folio, pos);
>>  	int error;
>>
>> +	if (folio_test_uncached(folio))
>> +		wpc->iomap.flags |= IOMAP_F_UNCACHED;
>> +
>>  	if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos)) {
>>  new_ioend:
>>  		error = iomap_submit_ioend(wpc, 0);
>> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
>> index f61407e3b121..2efc72df19a2 100644
>> --- a/include/linux/iomap.h
>> +++ b/include/linux/iomap.h
>> @@ -64,6 +64,7 @@ struct vm_fault;
>> #define IOMAP_F_BUFFER_HEAD 0
>> #endif /* CONFIG_BUFFER_HEAD */
>> #define IOMAP_F_XATTR (1U << 5)
>> +#define IOMAP_F_UNCACHED (1U << 6)
>
> This value ^^^ is set only by the core iomap code, right?

Correct

>> /*
>> * Flags set by the core iomap code during operations:
>
> ...in which case it should be set down here. It probably ought to have
> a description of what it does, too:

Ah yes indeed, good point. I'll move it and add a description.

> "IOMAP_F_UNCACHED is set to indicate that writes to the page cache (and
> hence writeback) will result in folios being evicted as soon as the
> updated bytes are written back to the storage."

Excellent, I'll go with that.
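
For illustration only, a minimal sketch of roughly how the relocated
definition might read once it sits under the core-flags comment with the
suggested description. The IOMAP_F_XATTR value and the comment wording
come from this thread; the bit position chosen for IOMAP_F_UNCACHED and
the example_iomap_is_uncached() helper are assumptions made up for the
sketch, not something the patch defines.

```c
/* Flag reported by the filesystem (value as shown in the quoted hunk). */
#define IOMAP_F_XATTR		(1U << 5)

/*
 * Flags set by the core iomap code during operations:
 *
 * IOMAP_F_UNCACHED is set to indicate that writes to the page cache (and
 * hence writeback) will result in folios being evicted as soon as the
 * updated bytes are written back to the storage.
 */
#define IOMAP_F_UNCACHED	(1U << 10)	/* ASSUMPTION: placeholder bit */

/* Hypothetical helper, illustration only: is the core-set flag present? */
static inline int example_iomap_is_uncached(unsigned int flags)
{
	return (flags & IOMAP_F_UNCACHED) != 0;
}
```

The only point of the sketch is placement: the define lives with the
flags the core code sets, away from the ones a filesystem reports.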

> If the writeback fails, does that mean that the dirty data will /not/ be
> retained in the page cache? IIRC we finally got to the point where the
> major filesystems leave pagecache alone after writeback EIO.

Good question - didn't change any of those bits. It currently relies on
writeback completion to prune the ranges. So if an EIO completion
triggers writeback completion, then it'll get pruned. But for that case,
I suspect the range is still dirty, and hence the pruning would not
succeed, for obvious reasons. So you'd need further things on top of
that, I'm afraid.
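
To make that reasoning concrete, here is a toy user-space model of it.
This is not kernel code: struct toy_folio and the toy_* functions are
invented for the sketch, and the "still dirty after a failed write"
behaviour is exactly the unverified suspicion above, encoded as an
assumption.

```c
#include <stdbool.h>

/* Toy stand-in for a folio; NOT the kernel's types or page flags. */
struct toy_folio {
	bool dirty;	/* data not yet successfully written back */
	bool writeback;	/* write I/O currently in flight */
	bool uncached;	/* folio was created by an uncached write */
	bool in_cache;	/* still present in the toy page cache */
};

/* Pruning refuses folios that are dirty or still under writeback. */
static bool toy_try_prune(struct toy_folio *f)
{
	if (f->dirty || f->writeback)
		return false;
	f->in_cache = false;
	return true;
}

/*
 * Writeback completion; 'error' is 0 on success or a negative errno.
 * ASSUMPTION: after a failed write the range is treated as still dirty.
 */
static void toy_end_writeback(struct toy_folio *f, int error)
{
	f->writeback = false;
	if (error)
		f->dirty = true;
	if (f->uncached)
		toy_try_prune(f);
}
```

In this model a successful completion drops the uncached folio, while an
error completion leaves it in the cache because the prune attempt is
refused. Getting any other behaviour on error would take extra logic on
top.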

> The rest of the mechanics looks nifty to me; there's plenty of places
> where this could be useful to me personally. :)

For sure, I think there are tons of use cases for this as well. Thanks
for taking a look!

-- 
Jens Axboe