linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	hannes@cmpxchg.org, clm@meta.com, linux-kernel@vger.kernel.org,
	willy@infradead.org, kirill@shutemov.name,
	linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 12/16] ext4: add RWF_UNCACHED write support
Date: Tue, 12 Nov 2024 11:47:57 -0700	[thread overview]
Message-ID: <58ebc5a8-941b-4c3d-a3b2-3985d7eeea30@kernel.dk> (raw)
In-Reply-To: <ZzOaaInUHOmlAL-o@bfoster>

On 11/12/24 11:11 AM, Brian Foster wrote:
> On Tue, Nov 12, 2024 at 10:13:12AM -0700, Jens Axboe wrote:
>> On 11/12/24 9:36 AM, Brian Foster wrote:
>>> On Mon, Nov 11, 2024 at 04:37:39PM -0700, Jens Axboe wrote:
>>>> IOCB_UNCACHED IO needs to prune writeback regions on IO completion,
>>>> and hence need the worker punt that ext4 also does for unwritten
>>>> extents. Add an io_end flag to manage that.
>>>>
>>>> If foliop is set to foliop_uncached in ext4_write_begin(), then set
>>>> FGP_UNCACHED so that __filemap_get_folio() will mark newly created
>>>> folios as uncached. That in turn will make writeback completion drop
>>>> these ranges from the page cache.
>>>>
>>>> Now that ext4 supports both uncached reads and writes, add the fop_flag
>>>> FOP_UNCACHED to enable it.
>>>>
>>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>>> ---
>>>>  fs/ext4/ext4.h    |  1 +
>>>>  fs/ext4/file.c    |  2 +-
>>>>  fs/ext4/inline.c  |  7 ++++++-
>>>>  fs/ext4/inode.c   | 18 ++++++++++++++++--
>>>>  fs/ext4/page-io.c | 28 ++++++++++++++++------------
>>>>  5 files changed, 40 insertions(+), 16 deletions(-)
>>>>
>>> ...
>>>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>>>> index 54bdd4884fe6..afae3ab64c9e 100644
>>>> --- a/fs/ext4/inode.c
>>>> +++ b/fs/ext4/inode.c
>>>> @@ -1138,6 +1138,7 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
>>>>  	int ret, needed_blocks;
>>>>  	handle_t *handle;
>>>>  	int retries = 0;
>>>> +	fgf_t fgp_flags;
>>>>  	struct folio *folio;
>>>>  	pgoff_t index;
>>>>  	unsigned from, to;
>>>> @@ -1164,6 +1165,15 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
>>>>  			return 0;
>>>>  	}
>>>>  
>>>> +	/*
>>>> +	 * Set FGP_WRITEBEGIN, and FGP_UNCACHED if foliop contains
>>>> +	 * foliop_uncached. That's how generic_perform_write() informs us
>>>> +	 * that this is an uncached write.
>>>> +	 */
>>>> +	fgp_flags = FGP_WRITEBEGIN;
>>>> +	if (*foliop == foliop_uncached)
>>>> +		fgp_flags |= FGP_UNCACHED;
>>>> +
>>>>  	/*
>>>>  	 * __filemap_get_folio() can take a long time if the
>>>>  	 * system is thrashing due to memory pressure, or if the folio
>>>> @@ -1172,7 +1182,7 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
>>>>  	 * the folio (if needed) without using GFP_NOFS.
>>>>  	 */
>>>>  retry_grab:
>>>> -	folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
>>>> +	folio = __filemap_get_folio(mapping, index, fgp_flags,
>>>>  					mapping_gfp_mask(mapping));
>>>>  	if (IS_ERR(folio))
>>>>  		return PTR_ERR(folio);
>>>
>>> JFYI, I notice that ext4 cycles the folio lock here in this path and
>>> thus follows up with a couple checks presumably to accommodate that. One
>>> is whether i_mapping has changed, which I assume means uncached state
>>> would have been handled/cleared externally somewhere..? I.e., if an
>>> uncached folio is somehow truncated/freed without ever having been
>>> written back?
>>>
>>> The next is a folio_wait_stable() call "in case writeback began ..."
>>> It's not immediately clear to me if that is possible here, but taking
>>> that at face value, is it an issue if we were to create an uncached
>>> folio, drop the folio lock, then have some other task dirty and
>>> writeback the folio (due to a sync write or something), then have
>>> writeback completion invalidate the folio before we relock it here?
>>
>> I don't either of those are an issue. The UNCACHED flag will only be set
>> on a newly created folio, it does not get inherited for folios that
>> already exist.
>>
> 
> Right.. but what I was wondering for that latter case is if the folio is
> created here by ext4, so uncached is set before it is unlocked.
> 
> On second look I guess the uncached completion invalidation should clear
> mapping and thus trigger the retry logic here. That seems reasonable
> enough, but is it still possible to race with writeback?
> 
> Maybe this is a better way to ask.. what happens if a write completes to
> an uncached folio that is already under writeback? For example, uncached
> write 1 completes, submits for writeback and returns to userspace. Then
> write 2 begins and redirties the same folio before the uncached
> writeback completes.
> 
> If I follow correctly, if write 2 is also uncached, it eventually blocks
> in writeback submission (folio_prepare_writeback() ->
> folio_wait_writeback()). It looks like folio lock is held there, so
> presumably that would bypass the completion time invalidation in
> folio_end_uncached(). But what if write 2 was not uncached or perhaps
> writeback completion won the race for folio lock vs. the write side
> (between locking the folio for dirtying and later for writeback
> submission)? Does anything prevent invalidation of the folio before the
> second write is submitted for writeback?
> 
> IOW, I'm wondering if the uncached completion time invalidation also
> needs a folio dirty check..?

Ah ok, I see what you mean. If the folio is dirty, the unmapping will
fail. But I guess with the recent change, we'll actually unmap it first.
I'll add the folio dirty check, thanks!

-- 
Jens Axboe


  reply	other threads:[~2024-11-12 18:48 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-11 23:37 [PATCHSET v3 0/16] Uncached buffered IO Jens Axboe
2024-11-11 23:37 ` [PATCH 01/16] mm/filemap: change filemap_create_folio() to take a struct kiocb Jens Axboe
2024-11-11 23:37 ` [PATCH 02/16] mm/readahead: add folio allocation helper Jens Axboe
2024-11-11 23:37 ` [PATCH 03/16] mm: add PG_uncached page flag Jens Axboe
2024-11-12  9:12   ` Kirill A. Shutemov
2024-11-12 14:07     ` Jens Axboe
2024-11-11 23:37 ` [PATCH 04/16] mm/readahead: add readahead_control->uncached member Jens Axboe
2024-11-11 23:37 ` [PATCH 05/16] mm/filemap: use page_cache_sync_ra() to kick off read-ahead Jens Axboe
2024-11-11 23:37 ` [PATCH 06/16] mm/truncate: add folio_unmap_invalidate() helper Jens Axboe
2024-11-11 23:37 ` [PATCH 07/16] fs: add RWF_UNCACHED iocb and FOP_UNCACHED file_operations flag Jens Axboe
2024-11-11 23:37 ` [PATCH 08/16] mm/filemap: add read support for RWF_UNCACHED Jens Axboe
2024-11-11 23:37 ` [PATCH 09/16] mm/filemap: drop uncached pages when writeback completes Jens Axboe
2024-11-12  9:31   ` Kirill A. Shutemov
2024-11-12 14:09     ` Jens Axboe
2024-11-11 23:37 ` [PATCH 10/16] mm/filemap: make buffered writes work with RWF_UNCACHED Jens Axboe
2024-11-12  0:57   ` Dave Chinner
2024-11-12  1:27     ` Jens Axboe
2024-11-12  8:02       ` Dave Chinner
2024-11-12  9:50         ` Kirill A. Shutemov
2024-11-12 13:36           ` Dave Chinner
2024-11-12 14:51         ` Jens Axboe
2024-11-11 23:37 ` [PATCH 11/16] mm: add FGP_UNCACHED folio creation flag Jens Axboe
2024-11-11 23:37 ` [PATCH 12/16] ext4: add RWF_UNCACHED write support Jens Axboe
2024-11-12 16:36   ` Brian Foster
2024-11-12 17:13     ` Jens Axboe
2024-11-12 18:11       ` Brian Foster
2024-11-12 18:47         ` Jens Axboe [this message]
2024-11-11 23:37 ` [PATCH 13/16] iomap: make buffered writes work with RWF_UNCACHED Jens Axboe
2024-11-12  1:01   ` Darrick J. Wong
2024-11-12  1:30     ` Jens Axboe
2024-11-12 16:37   ` Brian Foster
2024-11-12 17:16     ` Jens Axboe
2024-11-12 18:15       ` Brian Foster
2024-11-11 23:37 ` [PATCH 14/16] xfs: punt uncached write completions to the completion wq Jens Axboe
2024-11-11 23:37 ` [PATCH 15/16] xfs: flag as supporting FOP_UNCACHED Jens Axboe
2024-11-11 23:37 ` [PATCH 16/16] btrfs: add support for uncached writes Jens Axboe
2024-11-12  1:31 ` [PATCHSET v3 0/16] Uncached buffered IO Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=58ebc5a8-941b-4c3d-a3b2-3985d7eeea30@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=bfoster@redhat.com \
    --cc=clm@meta.com \
    --cc=hannes@cmpxchg.org \
    --cc=kirill@shutemov.name \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox