From: "David P. Reed" <dpreed@deepplum.com>
To: "James Houghton" <jthoughton@google.com>
Cc: "Axel Rasmussen" <axelrasmussen@google.com>,
"Peter Xu" <peterx@redhat.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
linux-mm@kvack.org
Subject: Re: PROBLEM: userfaultfd REGISTER minor mode on MAP_PRIVATE range fails
Date: Tue, 16 Sep 2025 15:47:40 -0400 (EDT)
Message-ID: <1758052060.67927840@apps.rackspace.com>
In-Reply-To: <CADrL8HWS7h0e0AcjbE0npoVwqvWK6c_aqML4T+MJnTkCz5AcNg@mail.gmail.com>
On Tuesday, September 16, 2025 15:10, "James Houghton" <jthoughton@google.com> said:
> On Tue, Sep 16, 2025 at 11:35 AM Axel Rasmussen
> <axelrasmussen@google.com> wrote:
>>
>>
>>
>> On Tue, Sep 16, 2025 at 10:27 AM David P. Reed <dpreed@deepplum.com>
>> wrote:
>>>
>>> Thanks -
>>>
>>> Just to clarify -
>>> Looking at the man page for UFFDIO_API, there are two "feature bits" that
>>> indicate cases where "minor" handling is now supported, and can be enabled.
>>> UFFD_FEATURE_MINOR_HUGETLBFS and UFFD_FEATURE_MINOR_SHMEM
>>> In my reading of the documents, these seem to imply that before they were added
>>> as new features, that MAP_PRIVATE|MAP_ANONYMOUS mappings were supported, and
>>> that the "new" additions to the MINOR mode were just for HUGETLBFS and
>>> MAP_SHARED cases.
>>
>>
>> Actually minor fault support didn't exist at all before those two features were
>> added. :)
>>
>> You are right that userfaultfd's use of "minor fault" is (unfortunately) slightly
>> different from the meaning in other contexts. I think the more normal meaning is,
>> faults which do not incur I/O (i.e., swap faults and file faults [i.e., faults on
>> non-swap-backed pages] are major, other faults are minor).
>>
>> For userfaultfd, a minor fault is a fault where the page already exists in the
>> page cache, but the page table entry wasn't setup. I don't think that scenario
>> can ever happen for anonymous, private mappings, so it doesn't really make sense
>> to be able to register such mappings in this mode. If you create a mapping with
>> mmap(MAP_ANON|MAP_PRIVATE) and then access it (read or write), that fault
>> requires allocation of a new page, so userfaultfd does not consider that a "minor
>> fault". My recollection though is if you make a file on tmpfs or hugetlbfs,
>> fallocate() it or whatever, and you MAP_PRIVATE that file, *that* registration
>> will work.
>
> Ah! You're right... MAP_PRIVATE *is* supported (for tmpfs and
> hugetlbfs only), and UFFDIO_CONTINUE will, upon finding the page in
> the page cache, install a RO PTE for it.
>
> But what happens when the write comes after installing the RO PTE? My
> reading of the code today makes me think that we'd get a minor
> userfault and then be unable to continue...! (The only reasonable
> behavior is that CoW is done without triggering a userfault... I
> assumed/thought this was the behavior today. I wish I had time to test
> this -- I hope I'm misreading it.)
>
> :( Here I was thinking I understood how userfaultfd minor faults worked.
>
So did I. It's kind of confusing. I suppose `git blame` might show when
the check that rejects MAP_PRIVATE|MAP_ANONYMOUS mappings in a
minor-mode UFFDIO_REGISTER call was introduced... I know that
MAP_SHARED|MAP_ANONYMOUS mappings can be registered for minor faults.
The only real difference is that the mapping is passed to forked clones
(plus the COW behavior that results from sharing until written), but
minor mode has nothing in particular to do with write faults: reads can
cause minor faults too.
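
A quick way to see which mappings accept minor-mode registration is to
probe them directly. The following is a minimal sketch, not code from
the kernel tree or from this thread: the helper names open_uffd_minor,
try_minor_register and probe_mapping are hypothetical, the #ifndef
fallbacks assume the mainline uapi values, and UFFD_USER_MODE_ONLY is
used only so that the probe does not need extra privileges.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older uapi headers (assumed to match mainline values). */
#ifndef UFFD_USER_MODE_ONLY
#define UFFD_USER_MODE_ONLY 1
#endif
#ifndef UFFD_FEATURE_MINOR_SHMEM
#define UFFD_FEATURE_MINOR_SHMEM (1 << 10)
#endif
#ifndef UFFDIO_REGISTER_MODE_MINOR
#define UFFDIO_REGISTER_MODE_MINOR ((__u64)1 << 2)
#endif

/* Hypothetical helper: open a userfaultfd and negotiate shmem minor-fault
 * support. Returns the fd, or -1 if userfaultfd or the feature is
 * unavailable on this kernel. */
int open_uffd_minor(void)
{
    int fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | UFFD_USER_MODE_ONLY);
    if (fd < 0)
        return -1;

    struct uffdio_api api = {
        .api = UFFD_API,
        .features = UFFD_FEATURE_MINOR_SHMEM,
    };
    if (ioctl(fd, UFFDIO_API, &api) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: attempt a minor-mode UFFDIO_REGISTER on a range.
 * Returns 0 on success, otherwise the errno value from the ioctl. */
int try_minor_register(int uffd, void *addr, size_t len)
{
    struct uffdio_register reg = {
        .range = { .start = (unsigned long)addr, .len = len },
        .mode = UFFDIO_REGISTER_MODE_MINOR,
    };
    return ioctl(uffd, UFFDIO_REGISTER, &reg) == 0 ? 0 : errno;
}

/* Hypothetical helper: map one page with the given flags, try to register
 * it in minor mode, then unmap it. Returns -1 if mmap itself failed,
 * otherwise the result of try_minor_register(). */
int probe_mapping(int uffd, int map_flags, int fd)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, map_flags, fd, 0);
    if (p == MAP_FAILED)
        return -1;

    int rc = try_minor_register(uffd, p, len);
    munmap(p, len);
    return rc;
}
```

With a descriptor from open_uffd_minor(), the call
probe_mapping(uffd, MAP_PRIVATE | MAP_ANONYMOUS, -1) is the case from
the subject line and is expected to return an errno value, while
probe_mapping(uffd, MAP_SHARED | MAP_ANONYMOUS, -1) should return 0.
Passing MAP_PRIVATE with an fd from memfd_create() would exercise the
tmpfs case Axel describes; that result is not verified here.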