linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Brian Foster <bfoster@redhat.com>, Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>, linux-mm@kvack.org
Subject: Re: [PATCH] tmpfs: zero post-eof folio range on file extension
Date: Fri, 27 Jun 2025 11:21:56 +0800	[thread overview]
Message-ID: <fcac9402-04a1-408d-9d7a-8d5a370dac3a@linux.alibaba.com> (raw)
In-Reply-To: <aF1DKcOFYO1jMeBI@bfoster>



On 2025/6/26 20:55, Brian Foster wrote:
> On Wed, Jun 25, 2025 at 10:35:44PM -0700, Hugh Dickins wrote:
>> On Wed, 25 Jun 2025, Matthew Wilcox wrote:
>>> On Wed, Jun 25, 2025 at 02:49:30PM -0400, Brian Foster wrote:
>>>> Most traditional filesystems zero the post-eof portion of the eof
>>>> folio at writeback time, or when the file size is extended by
>>>> truncate or extending writes. This ensures that the previously
>>>> post-eof range of the folio is zeroed before it is exposed to the
>>>> file.
>>>>
>>>> tmpfs doesn't implement the writeback path the way a traditional
>>>> filesystem does, so zeroing behavior won't be exactly the same.
>>>> However, it can still perform explicit zeroing from the various
>>>> operations that extend a file and expose a post-eof portion of the
>>>> eof folio. The current lack of zeroing is observed via failure of
>>>> fstests test generic/363 on tmpfs. This test injects post-eof mapped
>>>> writes in certain situations to detect gaps in zeroing behavior.
>>>>
>>>> Add a new eof zeroing helper for file extending operations. Look up
>>>> the current eof folio, and if one exists, zero the range about to be
>>>> exposed. This allows generic/363 to pass on tmpfs.
>>>
>>> This seems like what I did here?
>>>
>>> https://lore.kernel.org/linux-fsdevel/20230202204428.3267832-4-willy@infradead.org/
>>>
>>> Which fix should we prefer?
>>
> 
> Quite similar, indeed. This is actually about the same as my initial
> prototype when I was just trying to confirm the problem for truncate. As
> Hugh notes, we still need to cover other size extension paths
> (fallocate, buffered write).
> 
>> Thank you Brian, thank you Matthew.
>>
>> Of course my immediate reaction is to prefer the smaller patch,
>> Matthew's, but generic/363 still fails with that one: I need to look
>> into what each of you is doing (and that mail thread from 2023) and
>> make up my own mind.
>>
> 
> FWIW, I started off with something trying to use shmem_undo_range() and
> eventually moved toward the current patch because I couldn't get it
> working quite right. Explicit zeroing seemed like less complexity than
> calling through undo_range() on various operations, primarily due to
> less risk of unintentional side effect. It's possible (likely :P) that
> was just user error on my part, so now that I have a functional patch I
> can revisit that approach if it is preferred.

I also tried using shmem_truncate_range() to fix this issue, but I 
failed. Ultimately, at least for me, I think the fix is simple and works.

> However one thing I wasn't aware of at the time was the additional
> zeroing needed into the target range on fallocate, so that might require
> care to avoid not immediately punching out the folios that were
> fallocated just prior. I suspect this would mean we'd need a helper or
> something to restrict the range to undo to just the eof folio. That
> seems like a plausible approach, I'm just not so sure the final result
> will end up much smaller or simpler than this patch.
> 
>> (I'm worried why I had no copy of Matthew's 2023 patch: it's sadly
>> common for me to bury patches for long periods of time, but not
>> usually to lose them completely. But that is my worry, not yours.)
>>
>> I haven't been much concerned by generic/383 failing on tmpfs:
>> but promise to respect your efforts in due course.
>>
> 
> It's certainly not the bug of the century. ;) I added the test somewhat
> recently because we had bigger zeroing issues on other filesystems and I
> noticed we had no decent test coverage.
> 
>> I procrastinate "in due course" because (a) the full correct answwer
>> will depend on what happens with large folios, and (b) large folio
>> work in 6.14 changed (I'll say broke) the behaviour of huge=always
>> at eof (I expected a danger of that, and thought I checked before
>> 6.14-rc settled, but must have messed up my checking somehow).
>>
> 
> Interesting.. I assume huge=always refers to a mount option. I need to
> give that a test.

I tested your patch by adding the 'transparent_hugepage_tmpfs=always' 
command line parameter, which will change the default huge policy to 
'huge=always' for tmpfs mounts. Your patch also passed the generic/363 
test with the tmpfs 'huge=always' mount option.

> I'm also curious if either of you have any thoughts on the uptodate
> question. Does filtering zeroing on uptodate folios ensure we'd zero
> only folios that were previously written to?

Yes, I think so. Caller will handle the !uptodate folios. So I change 
your patch to:

static void shmem_zero_eof(struct inode *inode, loff_t pos)
{
         struct folio *folio;
         loff_t i_size = i_size_read(inode);
         size_t from, len;

         folio = shmem_get_partial_folio(inode, i_size >> PAGE_SHIFT);
         if (!folio)
                 return;

         if (folio_test_uptodate(folio)) {
                 /* zero to the end of the folio or start of extending 
operation */
                 from = offset_in_folio(folio, i_size);
                 len = min_t(loff_t, folio_size(folio) - from, pos - 
i_size);
                 folio_zero_range(folio, from, len);
         }

         folio_unlock(folio);
         folio_put(folio);
}

>> So there's more (and more urgent) to sort out before settling on
>> the right generic/383 fix.

However, let's still wait to see if Hugh has any better ideas.


  reply	other threads:[~2025-06-27  3:22 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-25 18:49 Brian Foster
2025-06-25 19:21 ` Matthew Wilcox
2025-06-26  5:35   ` Hugh Dickins
2025-06-26 12:55     ` Brian Foster
2025-06-27  3:21       ` Baolin Wang [this message]
2025-06-27 11:54         ` Brian Foster
2025-07-09  7:57 ` Hugh Dickins
2025-07-10  6:47   ` Baolin Wang
2025-07-10 22:20     ` Hugh Dickins
2025-07-11  3:50       ` Baolin Wang
2025-07-11  7:50         ` Hugh Dickins
2025-07-11  8:42           ` Baolin Wang
2025-07-11 16:08         ` Brian Foster
2025-07-11 20:15           ` Brian Foster
2025-07-14  3:05             ` Baolin Wang
2025-07-14 14:38               ` Brian Foster
2025-07-10 12:36   ` Brian Foster
2025-07-10 23:02     ` Hugh Dickins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fcac9402-04a1-408d-9d7a-8d5a370dac3a@linux.alibaba.com \
    --to=baolin.wang@linux.alibaba.com \
    --cc=bfoster@redhat.com \
    --cc=hughd@google.com \
    --cc=linux-mm@kvack.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox