From: Peter Xu <peterx@redhat.com>
To: Axel Rasmussen <axelrasmussen@google.com>
Cc: James Houghton <jthoughton@google.com>,
"David P. Reed" <dpreed@deepplum.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org
Subject: Re: PROBLEM: userfaultfd REGISTER minor mode on MAP_PRIVATE range fails
Date: Fri, 26 Sep 2025 18:00:15 -0400
Message-ID: <aNcM7yqj36u37LFv@x1.local>
In-Reply-To: <CAJHvVcjznC9KdMDMLPCP7W5Aq_u3GobPVZQoUF=wrhUN3OL9VQ@mail.gmail.com>
On Tue, Sep 16, 2025 at 03:04:46PM -0700, Axel Rasmussen wrote:
> On Tue, Sep 16, 2025 at 12:11 PM James Houghton <jthoughton@google.com>
> wrote:
>
> > On Tue, Sep 16, 2025 at 11:35 AM Axel Rasmussen
> > <axelrasmussen@google.com> wrote:
> > >
> > >
> > >
> > > On Tue, Sep 16, 2025 at 10:27 AM David P. Reed <dpreed@deepplum.com> wrote:
> > >>
> > >> Thanks -
> > >>
> > >> Just to clarify -
> > >> Looking at the man page for UFFDIO_API, there are two "feature bits"
> > >> that indicate cases where "minor" handling is now supported, and can be
> > >> enabled.
> > >> UFFD_FEATURE_MINOR_HUGETLBFS and UFFD_FEATURE_MINOR_SHMEM
> > >> In my reading of the documents, these seem to imply that before they
> > >> were added as new features, that MAP_PRIVATE|MAP_ANONYMOUS mappings were
> > >> supported, and that the "new" additions to the MINOR mode were just for
> > >> HUGETLBFS and MAP_SHARED cases.
> > >
> > >
> > > Actually minor fault support didn't exist at all before those two
> > > features were added. :)
> > >
> > > You are right that userfaultfd's use of "minor fault" is (unfortunately)
> > > slightly different from the meaning in other contexts. I think the more
> > > normal meaning is, faults which do not incur I/O (i.e., swap faults and
> > > file faults [i.e., faults on non-swap-backed pages] are major, other faults
> > > are minor).
> > >
> > > For userfaultfd, a minor fault is a fault where the page already exists
> > > in the page cache, but the page table entry wasn't setup. I don't think
> > > that scenario can ever happen for anonymous, private mappings, so it
> > > doesn't really make sense to be able to register such mappings in this
> > > mode. If you create a mapping with mmap(MAP_ANON|MAP_PRIVATE) and then
> > > access it (read or write), that fault requires allocation of a new page, so
> > > userfaultfd does not consider that a "minor fault". My recollection though
> > > is if you make a file on tmpfs or hugetlbfs, fallocate() it or whatever,
> > > and you MAP_PRIVATE that file, *that* registration will work.
> >
> > Ah! You're right... MAP_PRIVATE *is* supported (for tmpfs and
> > hugetlbfs only), and UFFDIO_CONTINUE will, upon finding the page in
> > the page cache, install a RO PTE for it.
> >
>
> Why does it have to be RO? I think it depends on the PROT_ flag you
> specified when you created the private mapping.
It needs to be RO because we're installing a page cache page into a PRIVATE
mapping. We don't want the private mapper to modify the page cache, so the
first write has to CoW instead. I believe you wrote the code. :)
Relevant lines in mfill_atomic_install_pte():

	if (page_in_cache && !vm_shared)
		writable = false;
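
To spell out the rule those two lines enforce, here is a minimal user-space
sketch. It is not the kernel source: pte_writable_model() and its plain bool
parameters are hypothetical names, and the assumption that "writable" starts
out from the VMA's write permission is only a simplification for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the writability decision quoted above.
 *   vma_writable:  assumed to mirror VM_WRITE on the destination VMA
 *   vm_shared:     assumed to mirror VM_SHARED on the destination VMA
 *   page_in_cache: the page being installed belongs to the page cache
 */
bool pte_writable_model(bool vma_writable, bool vm_shared, bool page_in_cache)
{
	bool writable = vma_writable;

	/*
	 * A private mapping must never write into the page cache, so the
	 * entry is forced read-only and the first write takes a CoW fault.
	 */
	if (page_in_cache && !vm_shared)
		writable = false;

	return writable;
}

/* Spot checks for the two cases discussed in this thread. */
void pte_writable_model_examples(void)
{
	/* MAP_PRIVATE + page cache page: read-only even with PROT_WRITE. */
	assert(!pte_writable_model(true, false, true));
	/* MAP_SHARED + page cache page: follows the VMA permission. */
	assert(pte_writable_model(true, true, true));
}
```

In this model the PROT_ flags only matter for shared mappings, or for pages
that do not come from the page cache.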
>
>
> >
> > But what happens when the write comes after installing the RO PTE? My
> > reading of the code today makes me think that we'd get a minor
> > userfault and then be unable to continue...! (The only reasonable
> > behavior is that CoW is done without triggering a userfault... I
> > assumed/thought this was the behavior today. I wish I had time to test
> > this -- I hope I'm misreading it.)
> >
>
> It's possible my memory is wrong, but I don't think UFFD minor fault
> handling really interacts with CoW faults. IOW, I think you get a UFFD
> minor fault when the PTE is missing, not when it's RO resulting in CoW. I
> think there we just CoW the page as per normal and no fault is reported via
> UFFD?
Yes. Even though I don't think PRIVATE was a goal for minor faults, IIUC we
do support it. If the minor fault was triggered by a write, a CoW fault
should follow once the RO entry is installed. If it was a read, the RO entry
should just work.
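
The sequence described here can be sketched as a tiny state model. This is
only an illustration of the behavior as discussed in this thread, not kernel
code and not something tested against a kernel. Every name in it (pte_state,
fault_outcome, access_outcome(), faults_until_done()) is made up, and it
assumes the page is already in the page cache and the VMA allows writes.

```c
#include <assert.h>
#include <stdbool.h>

enum pte_state { PTE_NONE, PTE_READ_ONLY, PTE_WRITABLE };
enum fault_outcome { NO_FAULT, UFFD_MINOR_FAULT, COW_FAULT };

/* What a single access does, given the current (modelled) PTE state. */
enum fault_outcome access_outcome(enum pte_state pte, bool is_write)
{
	if (pte == PTE_NONE)
		return UFFD_MINOR_FAULT;	/* reported via userfaultfd */
	if (is_write && pte == PTE_READ_ONLY)
		return COW_FAULT;		/* handled by the kernel, no userfault */
	return NO_FAULT;
}

/* PTE state once the given fault has been resolved. */
enum pte_state resolve_fault(enum pte_state pte, enum fault_outcome out)
{
	switch (out) {
	case UFFD_MINOR_FAULT:
		return PTE_READ_ONLY;	/* UFFDIO_CONTINUE installs an RO entry */
	case COW_FAULT:
		return PTE_WRITABLE;	/* private copy, writable by assumption */
	default:
		return pte;
	}
}

/* Number of faults taken before one access to a missing PTE completes. */
int faults_until_done(bool is_write)
{
	enum pte_state pte = PTE_NONE;
	enum fault_outcome out;
	int faults = 0;

	while ((out = access_outcome(pte, is_write)) != NO_FAULT) {
		pte = resolve_fault(pte, out);
		faults++;
	}
	return faults;
}
```

In this model a read takes one fault (the minor userfault) and a write takes
two (the minor userfault, then a CoW fault that userfaultfd never sees).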
Thanks,
--
Peter Xu