linux-mm.kvack.org archive mirror
From: Brian Foster <bfoster@redhat.com>
To: linux-mm@kvack.org
Cc: Hugh Dickins <hughd@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH] tmpfs: zero post-eof folio range on file extension
Date: Wed, 25 Jun 2025 14:49:30 -0400
Message-ID: <20250625184930.269727-1-bfoster@redhat.com>

Most traditional filesystems zero the post-eof portion of the eof
folio at writeback time, or when the file size is extended by
truncate or extending writes. This ensures that the previously
post-eof range of the folio is zeroed before it is exposed to the
file.

tmpfs doesn't implement the writeback path the way a traditional
filesystem does, so zeroing behavior won't be exactly the same.
However, it can still perform explicit zeroing from the various
operations that extend a file and expose a post-eof portion of the
eof folio. The current lack of zeroing is observed via failure of
fstests test generic/363 on tmpfs. This test injects post-eof mapped
writes in certain situations to detect gaps in zeroing behavior.

Add a new eof zeroing helper for file extending operations. Look up
the current eof folio, and if one exists, zero the range about to be
exposed. This allows generic/363 to pass on tmpfs.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---

Hi all,

This survives the aforementioned reproducer, an fstests regression run,
and ~100m fsx operations without issues. Let me know if there are any
other recommended tests for tmpfs and I'm happy to run them. Otherwise, a
couple of notes, as I'm not terribly familiar with tmpfs...

First, I used shmem_get_partial_folio() because we really only want to zero
an eof folio if one has been previously allocated. My understanding is
that lookup path will avoid unnecessary folio allocation in such cases,
but let me know if that's wrong.

Also, it seems that the falloc path leaves newly preallocated folios
!uptodate until they are used. This had me wondering if perhaps
shmem_zero_eof() could just skip out if the eof folio happens to be
!uptodate. Hm?
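
To make the zeroing range concrete, here is a rough userspace sketch of
the arithmetic in shmem_zero_eof(). It is not kernel code: struct
fake_folio, eof_zero_range() and fake_zero_eof() are hypothetical
stand-ins invented for illustration, and the sketch assumes the old
i_size falls inside the folio that was looked up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for a page cache folio. */
struct fake_folio {
	uint64_t pos;		/* file offset of the folio's first byte */
	size_t size;		/* folio size in bytes */
	unsigned char *data;	/* folio contents */
};

/*
 * Compute the range to zero when the file grows from i_size to new_pos.
 * Stores the starting offset within the folio in *from and returns the
 * length: up to the end of the folio or the start of the extending
 * operation, whichever comes first.
 */
static size_t eof_zero_range(const struct fake_folio *f, uint64_t i_size,
			     uint64_t new_pos, size_t *from)
{
	uint64_t to_end, exposed;

	/* Assumption: the old EOF lies inside this folio. */
	assert(i_size >= f->pos && i_size - f->pos < f->size);
	assert(new_pos > i_size);

	*from = (size_t)(i_size - f->pos);
	to_end = f->size - *from;
	exposed = new_pos - i_size;
	return (size_t)(to_end < exposed ? to_end : exposed);
}

/* Zero the post-EOF bytes that the size extension is about to expose. */
static void fake_zero_eof(struct fake_folio *f, uint64_t i_size,
			  uint64_t new_pos)
{
	size_t from, len;

	if (new_pos <= i_size)
		return;
	len = eof_zero_range(f, i_size, new_pos, &from);
	memset(f->data + from, 0, len);
}
```

For example, with a 4096-byte folio at offset 0 and i_size of 100,
extending to 200 zeroes bytes 100..199 only, while extending to 10000
zeroes through the end of the folio (3996 bytes).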

Thoughts, reviews, flames appreciated.

Brian

 mm/shmem.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 3a5a65b1f41a..4bb96c24fb9e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1077,6 +1077,29 @@ static struct folio *shmem_get_partial_folio(struct inode *inode, pgoff_t index)
 	return folio;
 }
 
+/*
+ * Zero any post-EOF range of the EOF folio about to be exposed by size
+ * extension.
+ */
+static void shmem_zero_eof(struct inode *inode, loff_t pos)
+{
+	struct folio *folio;
+	loff_t i_size = i_size_read(inode);
+	size_t from, len;
+
+	folio = shmem_get_partial_folio(inode, i_size >> PAGE_SHIFT);
+	if (!folio)
+		return;
+
+	/* zero to the end of the folio or start of extending operation */
+	from = offset_in_folio(folio, i_size);
+	len = min_t(loff_t, folio_size(folio) - from, pos - i_size);
+	folio_zero_range(folio, from, len);
+
+	folio_unlock(folio);
+	folio_put(folio);
+}
+
 /*
  * Remove range of pages and swap entries from page cache, and free them.
  * If !unfalloc, truncate or punch hole; if unfalloc, undo failed fallocate.
@@ -1302,6 +1325,8 @@ static int shmem_setattr(struct mnt_idmap *idmap,
 			return -EPERM;
 
 		if (newsize != oldsize) {
+			if (newsize > oldsize)
+				shmem_zero_eof(inode, newsize);
 			error = shmem_reacct_size(SHMEM_I(inode)->flags,
 					oldsize, newsize);
 			if (error)
@@ -3464,6 +3489,8 @@ static ssize_t shmem_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	ret = file_update_time(file);
 	if (ret)
 		goto unlock;
+	if (iocb->ki_pos > i_size_read(inode))
+		shmem_zero_eof(inode, iocb->ki_pos);
 	ret = generic_perform_write(iocb, from);
 unlock:
 	inode_unlock(inode);
@@ -3791,8 +3818,15 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
 		cond_resched();
 	}
 
-	if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + len > inode->i_size)
+	/*
+	 * The post-eof portion of the eof folio isn't guaranteed to be zeroed
+	 * by fallocate, so zero through the end of the fallocated range
+	 * instead of the start.
+	 */
+	if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + len > inode->i_size) {
+		shmem_zero_eof(inode, offset + len);
 		i_size_write(inode, offset + len);
+	}
 undone:
 	spin_lock(&inode->i_lock);
 	inode->i_private = NULL;
-- 
2.49.0


