From: Dave Chinner <david@fromorbit.com>
To: Kiryl Shutsemau <kirill@shutemov.name>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
Matthew Wilcox <willy@infradead.org>,
Luis Chamberlain <mcgrof@kernel.org>,
Pankaj Raghav <p.raghav@samsung.com>,
Zorro Lang <zlang@redhat.com>,
akpm@linux-foundation.org, linux-mm <linux-mm@kvack.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
xfs <linux-xfs@vger.kernel.org>
Subject: Re: Regression in generic/749 with 8k fsblock size on 6.18-rc1
Date: Fri, 17 Oct 2025 09:33:15 +1100
Message-ID: <aPFyqwdv1prLXw5I@dread.disaster.area>
In-Reply-To: <bknltdsmeiapy37jknsdr2gat277a4ytm5dzj3xrcbjdf3quxm@ej2anj5kqspo>

On Thu, Oct 16, 2025 at 11:22:00AM +0100, Kiryl Shutsemau wrote:
> On Wed, Oct 15, 2025 at 10:57:26AM -0700, Darrick J. Wong wrote:
> > On Wed, Oct 15, 2025 at 04:59:03PM +0100, Kiryl Shutsemau wrote:
> > > On Tue, Oct 14, 2025 at 10:52:14AM -0700, Darrick J. Wong wrote:
> > > > Hi there,
> > > >
> > > > On 6.18-rc1, generic/749[1] running on XFS with an 8k fsblock size fails
> > > > with the following:
> > > >
> > > > --- /run/fstests/bin/tests/generic/749.out 2025-07-15 14:45:15.170416031 -0700
> > > > +++ /var/tmp/fstests/generic/749.out.bad 2025-10-13 17:48:53.079872054 -0700
> > > > @@ -1,2 +1,10 @@
> > > > QA output created by 749
> > > > +Expected SIGBUS when mmap() reading beyond page boundary
> > > > +Expected SIGBUS when mmap() writing beyond page boundary
> > > > +Expected SIGBUS when mmap() reading beyond page boundary
> > > > +Expected SIGBUS when mmap() writing beyond page boundary
> > > > +Expected SIGBUS when mmap() reading beyond page boundary
> > > > +Expected SIGBUS when mmap() writing beyond page boundary
> > > > +Expected SIGBUS when mmap() reading beyond page boundary
> > > > +Expected SIGBUS when mmap() writing beyond page boundary
> > > > Silence is golden
> > > >
> > > > This test creates small files of various sizes, maps the EOF block, and
> > > > checks that you can read and write to the mmap'd page up to (but not
> > > > beyond) the next page boundary.
> > > >
> > > > For 8k fsblock filesystems on x86, the pagecache creates a single 8k
> > > > folio to cache the entire fsblock containing EOF. If EOF is in the
> > > > first 4096 bytes of that 8k fsblock, then it should be possible to do a
> > > > mmap read/write of the first 4k, but not the second 4k. Memory accesses
> > > > to the second 4096 bytes should produce a SIGBUS.
> > >
> > > Does anybody actually rely on this behaviour (beyond xfstests)?
> >
> > Beats me, but the mmap manpage says:
> ...
> > POSIX 2024 says:
> ...
> > From both I would surmise that it's a reasonable expectation that you
> > can't map basepages beyond EOF and have page faults on those pages
> > succeed.
>
> <Added folks from the commit that introduced generic/749>
>
> A modern kernel with large folios blurs the line of what a page is.
>
> I don't want to play spec lawyer. Let's look at real workloads.

Or, more importantly, consider the security-related implications of
the change....

> If there's anything that actually relies on this SIGBUS corner case,
> let's see how we can fix the kernel. But it will cost some CPU cycles.
>
> If it only broke a synthetic test case, I'm inclined to say WONTFIX.
>
> Any opinions?

Mapping beyond EOF ranges into userspace address spaces is a
potential security risk. If there is ever a zeroing-beyond-EOF bug
related to large folios (history tells us we are *guaranteed* to
screw this up somewhere in the future), then allowing mapping all the
way to the end of the large folio could expose a -lot more- stale
kernel data to userspace than just what the tail of a PAGE_SIZE
faulted region would expose.

Hence allowing applications to successfully fault an (unpredictable)
distance far beyond EOF because the page cache used a large folio
spanning EOF seems, to me, to be a very undesirable behaviour to
expose to userspace.

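
To put rough numbers on that, here is a minimal sketch of the
arithmetic. It is not kernel code: the helper names are hypothetical,
and it assumes the mapping unit (base page or folio) is a power-of-two
size and is aligned to its own size in the file. It only computes how
many bytes past EOF become addressable if faults are allowed to
succeed up to the end of the unit containing EOF.

```c
#include <stddef.h>

/*
 * Round x up to the next multiple of align.
 * Assumption: align is a non-zero power of two.
 */
static inline size_t round_up_pow2(size_t x, size_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: number of bytes beyond EOF that userspace can
 * touch without SIGBUS if faults succeed up to the end of the
 * map_unit-sized, map_unit-aligned region that contains EOF.
 *
 * eof is the file size in bytes; map_unit is either the base page
 * size or the size of the folio spanning EOF.
 */
static inline size_t bytes_mappable_past_eof(size_t eof, size_t map_unit)
{
	return round_up_pow2(eof, map_unit) - eof;
}
```

For a 100 byte file, bytes_mappable_past_eof(100, 4096) is 3996,
while a 2MiB folio spanning EOF gives
bytes_mappable_past_eof(100, 2097152) == 2097052. That difference is
the extra window a zeroing-beyond-EOF bug could expose.
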
-Dave.
--
Dave Chinner
david@fromorbit.com