Re: [PATCH] mm: fix pagecache_isize_extended() early-return bypass for large folio mappings

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Brian Foster <bfoster@redhat.com>
To: Morduan Zang <zhangdandan@uniontech.com>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, willy@infradead.org
Subject: Re: [PATCH] mm: fix pagecache_isize_extended() early-return bypass for large folio mappings
Date: Thu, 26 Feb 2026 15:22:32 -0500	[thread overview]
Message-ID: <aaCriOt-CDs4sciP@bfoster> (raw)
In-Reply-To: <3F3A46783F8E9D52+20260226133149.79586-1-zhangdandan@uniontech.com>

On Thu, Feb 26, 2026 at 09:31:49PM +0800, Morduan Zang wrote:
> pagecache_isize_extended() has two early-return guards that were designed
> for the traditional sub-page block-size case:
> 
>   Guard 1:  if (from >= to || bsize >= PAGE_SIZE)
>                 return;
> 
>   Guard 2:  rounded_from = round_up(from, bsize);
>             if (to <= rounded_from || !(rounded_from & (PAGE_SIZE - 1)))
>                 return;
> 
> Guard 1 was originally "bsize == PAGE_SIZE" and was widened to
> "bsize >= PAGE_SIZE" by commit 2ebe90dab980 ("mm: convert
> pagecache_isize_extended to use a folio").  The rationale is correct
> for the traditional buffer_head path: when the block size equals the page
> size, every folio covers exactly one block, so writeback's EOF handling
> (e.g. iomap_writepage_handle_eof()) zeros the post-EOF tail of the folio
> before writing it out, and no action is needed here.
> 
> Guard 2 covers the case where @from rounded up to the next block boundary
> is already PAGE_SIZE-aligned, meaning no hole block straddles a page
> boundary.
> 
> Both guards are correct for the traditional case.  However, commit
> 52aecaee1c26 ("mm: zero range of eof folio exposed by inode size extension")
> added post-EOF zeroing inside pagecache_isize_extended() to
> handle dirty folios that will not go through writeback before the new
> i_size becomes visible.  That zeroing code is placed after both guards,
> so it is unreachable whenever either guard fires.
> 
> The same stale-data window is also covered by xfstests generic/363
> which uses fsx with "-e 1" (EOF pollution mode) and exercises a broad
> range of size-changing operations.
> 

Hi Morduan,

So looking back at the original cover letter for this, this bit was for
the case where we had a dirty folio in pagecache that might be partially
hole backed due to eof, therefore fs zeroing might not occur.  Hence we
do the page zeroing here before exposing this range to the file (i.e.
that writeback would have done if the folio were clean).

I thought at the time this plus the ext4 patch covered the bases for
generic/363 on ext4. You refer to this test above but don't mention if
it fails. Do you reproduce a failure with that test, or is this
something discovered by inspection?

> Fixes: 52aecaee1c26 ("mm: zero range of eof folio exposed by inode size extension")
> Fixes: 2ebe90dab980 ("mm: convert pagecache_isize_extended to use a folio")
> Signed-off-by: Morduan Zang <zhangdandan@uniontech.com>
> ---
>  mm/truncate.c | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 12467c1bd711..d3e473a206b3 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -847,13 +847,32 @@ void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to)
>  
>  	WARN_ON(to > inode->i_size);
>  
> -	if (from >= to || bsize >= PAGE_SIZE)
> +	if (from >= to)
>  		return;
> +
> +	/*
> +	 * For filesystems with bsize >= PAGE_SIZE, the traditional buffer_head
> +	 * path handles post-EOF zeroing correctly at writeback time. However,
> +	 * with large folios enabled, a single folio can span multiple PAGE_SIZE
> +	 * blocks, so mmap writes beyond EOF within the same folio are not zeroed
> +	 * at writeback time before i_size is extended. We must handle this here.
> +	 */
> +	if (bsize >= PAGE_SIZE) {
> +		/*
> +		 * Only needed if the mapping supports large folios, since otherwise
> +		 * each folio is exactly one page and writeback handles EOF zeroing.
> +		 */
> +		if (!mapping_large_folio_support(inode->i_mapping))
> +			return;

Is there currently a case for bsize >= PAGE_SIZE &&
!mapping_large_folio_support()? I thought there was a WIP for
multi-block folios, but I wasn't sure if that actually worked anywhere.

> +		goto find_folio;
> +	}
> +
>  	/* Page straddling @from will not have any hole block created? */
>  	rounded_from = round_up(from, bsize);
>  	if (to <= rounded_from || !(rounded_from & (PAGE_SIZE - 1)))
>  		return;
>  

If I understood this code correctly (and I very well may not), the
purpose of this is to basically filter out cases where a dirty eof folio
doesn't require a refault after the size update for the fs to fully
populate it with blocks. If that is the case, this makes me wonder if
perhaps this check should remain, but instead use folio_size() of the
eof folio (if one exists)..?

My understanding at one point was that we wouldn't have large eof folios
that included a page aligned offset beyond eof, but I also feel like
I've run into that once or twice when dealing with some other oddball fs
related issues, so I'm not really clear on what the expected behavior is
supposed to be there. Maybe it's a corner case (i.e. related to split
failure or some such)..? That is probably a question for Willy..

Brian

> +find_folio:
>  	folio = filemap_lock_folio(inode->i_mapping, from / PAGE_SIZE);
>  	/* Folio not cached? Nothing to do */
>  	if (IS_ERR(folio))
> -- 
> 2.50.1
> 
>

next prev parent reply	other threads:[~2026-02-26 20:22 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-19 16:07 [PATCH 0/2] ext4, mm: improve partial inode eof zeroing Brian Foster
2024-09-19 16:07 ` [PATCH 1/2] ext4: partial zero eof block on unaligned inode size extension Brian Foster
2024-09-19 16:07 ` [PATCH 2/2] mm: zero range of eof folio exposed by " Brian Foster
2026-02-26 13:31   ` [PATCH] mm: fix pagecache_isize_extended() early-return bypass for large folio mappings Morduan Zang
2026-02-26 20:22     ` Brian Foster [this message]
2024-11-07 15:12 ` [PATCH 0/2] ext4, mm: improve partial inode eof zeroing Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aaCriOt-CDs4sciP@bfoster \
    --to=bfoster@redhat.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=willy@infradead.org \
    --cc=zhangdandan@uniontech.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox