From: Brian Foster <bfoster@redhat.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <21cnbao@gmail.com>,
Matthew Wilcox <willy@infradead.org>,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, Hugh Dickins <hughd@google.com>,
syzbot+178fff6149127421c2cc@syzkaller.appspotmail.com
Subject: Re: [PATCH] mm/shmem: fix uninitialized folio in shmem_symlink
Date: Tue, 6 Jan 2026 10:56:25 -0500 [thread overview]
Message-ID: <aV0wqdi3L67LQrcF@bfoster> (raw)
In-Reply-To: <2b12f63b-ceab-4d3e-a06f-f41e6b1b2d23@linux.alibaba.com>
On Tue, Jan 06, 2026 at 11:47:44AM +0800, Baolin Wang wrote:
>
>
> On 1/5/26 9:58 PM, Brian Foster wrote:
> > On Thu, Dec 25, 2025 at 06:08:08PM +0800, Baolin Wang wrote:
> > >
> > >
> > > On 2025/12/25 12:04, Barry Song wrote:
> > > > On Thu, Dec 25, 2025 at 6:01 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > >
> > > > > On Wed, Dec 24, 2025 at 10:40:27PM +1300, Barry Song wrote:
> > > > > > From: Barry Song <baohua@kernel.org>
> > > > > >
> > > > > > Uninitialized folio allocated in shmem_symlink() may be accessed
> > > > > > during swap-out, causing KMSAN BUG:
> > > > >
> > > > > This would be an unfortunate way to fix it. The vast majority of
> > > > > symlinks are short, and we'll never access past the \0 in normal
> > > > > operation, so we'll be dirtying a lot of cachelines essentially to (1)
> > > > > shut up an automated tool and (2) optimise a corner case.
> > > > >
> > > > > How about this instead which delays zeroing to swapout?
> > > >
> > > > Matthew, thank you very much for your review, even during Christmas.
> > > > I would like to wish you a happy holiday!
> > > >
> > > > I am not quite sure, as shm symlinks do not seem very common. Since
> > > > allocating a folio requires a symname longer than 128 bytes (where
> > > > 128 == SHORT_SYMLINK_LEN), such cases appear even rarer.
> > > >
> > > > BTW, do we need to migrate the owner_2 flag in folio_migrate_flags()?
> > > > If so, I am not quite sure it is worth changing the hotpath to
> > > > accommodate this.
> > >
> > > +1. At least for me, using the 'PG_owner_2' flag alone to mark this uncommon
> > > case doesn't seem quite worthwhile.
> > >
> >
> > Also JFYI the post-eof swapout zeroing work (still pending) looks to me
> > like it would cover the swapout time case [1]. That's just if you wanted
> > to go that route here; creation time zeroing for the large symlink case
> > seems reasonable enough to me as well.
>
> IMHO, your post-eof zeroing work is intended to zero !uptodate or beyond EOF
> folios before swap-out, however, the symlink folio is uptodate and not
> beyond the EOF. I am not sure if it will make your code more complicated
> when you want to cover this symlink case.
>
It already handles this particular case. The primary intent is to zero
all post-eof ranges that can be written out before that write out
happens. I.e., if you take a look at shmem_writeout(), if the folio
happens to straddle i_size we zero from i_size to the end of the folio.
> Why I prefer Barry's fix: First, the symlink folio is marked Uptodate after
> copying the symlink name, but the whole folio hasn’t been initialized, which
> seems unreasonable to me. Second, as I said before, using the 'PG_owner_2'
> flag to mark this uncommon case doesn’t seem worthwhile. Currently, IIUC the
> 'PG_owner_2' is only used by btrfs; if we ever want to remove the
> 'PG_owner_2', this uncommon symlink case shouldn’t block its removal.
>
I think we're talking past eachother a bit here. ;) I'm pointing out
that outstanding work already covers writeout time zeroing so as to
avoid duplicate effort if this was trending that way. IOW, I think
there's no reason to consider the page flag thing, it's more of a
question of do you want the zeroing at creation time or not..
I again don't have a strong opinion on this, but I think I can see
justification either way. If the focus is a KMSAN splat at writeout
time, then it makes some sense to address it in that slower/less likely
path.
That said, if we step back and consider that if this were a buffered
write we'd tail zero the folio at write time before marking it uptodate,
this seems like more broadly consistent behavior with other fs' and
operations. This would be analogous to say XFS allocating a zeroed
metadata buffer on symlink creation to ensure the underlying block is
tail zeroed before it is eventually written to disk.
So to your point, maybe the most appropriate thing to do is cover
zeroing in both places..
> BTW, as I said before, I've reviewed and tested your post-eof work[1]. Maybe
> you could resend the patch set so that Hugh can take a look when he has
> time.
>
Yeah.. I know Hugh is busy and that isn't the highest priority series.
I'm just catching back up from time off myself so I can give it more
time before I repost it or anything.
Brian
> > [1] https://lore.kernel.org/linux-mm/20251121152246.1023918-3-bfoster@redhat.com/
> >
> > > > > diff --git a/mm/shmem.c b/mm/shmem.c
> > > > > index ec6c01378e9d..f3b3be1b50fe 100644
> > > > > --- a/mm/shmem.c
> > > > > +++ b/mm/shmem.c
> > > > > @@ -1636,6 +1636,13 @@ int shmem_writeout(struct folio *folio, struct swap_iocb **plug,
> > > > > folio_mark_uptodate(folio);
> > > > > }
> > > > >
> > > > > + /* Zero out symlink tails to help with compression */
> > > > > + if (folio_test_owner_2(folio)) {
> > > > > + struct inode *inode = folio->mapping->host;
> > > > > + folio_zero_segment(folio, inode->i_size, folio_size(folio));
> > > > > + folio_clear_owner_2(folio);
> > > > > + }
> > > > > +
> > > > > if (!folio_alloc_swap(folio)) {
> > > > > bool first_swapped = shmem_recalc_inode(inode, 0, nr_pages);
> > > > > int error;
> > > > > @@ -4133,6 +4140,7 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
> > > > > memcpy(folio_address(folio), symname, len);
> > > > > folio_mark_uptodate(folio);
> > > > > folio_mark_dirty(folio);
> > > > > + folio_set_owner_2(folio);
> > > > > folio_unlock(folio);
> > > > > folio_put(folio);
> > > > > }
> > > >
> > > > Thanks
> > > > Barry
> > >
> > >
>
next prev parent reply other threads:[~2026-01-06 15:56 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-24 9:40 Barry Song
2025-12-24 17:01 ` Matthew Wilcox
2025-12-25 4:04 ` Barry Song
2025-12-25 10:08 ` Baolin Wang
2026-01-05 13:58 ` Brian Foster
2026-01-05 17:50 ` Matthew Wilcox
2026-01-06 3:47 ` Baolin Wang
2026-01-06 15:56 ` Brian Foster [this message]
2026-01-07 1:12 ` Baolin Wang
2026-01-06 18:46 ` Matthew Wilcox
2026-01-07 1:16 ` Baolin Wang
2025-12-28 4:29 ` Matthew Wilcox
2026-01-05 8:54 ` Baolin Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aV0wqdi3L67LQrcF@bfoster \
--to=bfoster@redhat.com \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=syzbot+178fff6149127421c2cc@syzkaller.appspotmail.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox