linux-mm.kvack.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	David Hildenbrand <david@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Mike Rapoport <rppt@kernel.org>,
	Muchun Song <muchun.song@linux.dev>,
	Nikita Kalyazin <kalyazin@amazon.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	James Houghton <jthoughton@google.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Hugh Dickins <hughd@google.com>, Michal Hocko <mhocko@suse.com>,
	Ujwal Kundur <ujwal.kundur@gmail.com>,
	Oscar Salvador <osalvador@suse.de>,
	Suren Baghdasaryan <surenb@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>
Subject: Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
Date: Tue, 21 Oct 2025 12:28:17 -0400
Message-ID: <aPe0oWR9-Oj58Asz@x1.local>
In-Reply-To: <dtepn7obw5syd47uhyxavytodp7ws2pzr2yuchda32wcwn4bj4@wazn24gijumu>

On Tue, Oct 21, 2025 at 11:51:33AM -0400, Liam R. Howlett wrote:
> * Peter Xu <peterx@redhat.com> [251020 10:12]:
> > On Mon, Oct 20, 2025 at 03:34:47PM +0200, David Hildenbrand wrote:
> > > On 15.10.25 01:14, Peter Xu wrote:
> > > > [based on latest akpm/mm-new of Oct 14th, commit 36c6c5ce1b275]
> > > > 
> > > > v4:
> > > > - Some cleanups within vma_can_userfault() [David]
> > > > - Rename uffd_get_folio() to minor_get_folio() [David]
> > > > - Remove uffd_features in vm_uffd_ops, deduce it from supported ioctls [David]
> > > > 
> > > > v1: https://lore.kernel.org/r/20250620190342.1780170-1-peterx@redhat.com
> > > > v2: https://lore.kernel.org/r/20250627154655.2085903-1-peterx@redhat.com
> > > > v3: https://lore.kernel.org/r/20250926211650.525109-1-peterx@redhat.com
> > > > 
> > > > This series is an alternative proposal of what Nikita proposed here on the
> > > > initial three patches:
> > > > 
> > > >    https://lore.kernel.org/r/20250404154352.23078-1-kalyazin@amazon.com
> > > > 
> > > > This is not yet relevant to any guest-memfd support, but paves the way
> > > > for it.  Here, the major goal is to let kernel modules opt in to any
> > > > form of userfaultfd support, like guest-memfd.  This alternative option
> > > > should hopefully be cleaner, and avoid leaking userfault details into
> > > > vm_ops.fault().
> > > > 
> > > > It also means this series does not depend on anything.  It's a pure
> > > > refactoring of userfaultfd internals to provide a generic API, so that
> > > > other types of files, especially RAM based, can support userfaultfd without
> > > > touching mm/ at all.
> > > > 
> > > > To achieve that, this series introduces a file operation called
> > > > vm_uffd_ops.  The ops must be provided when a file type supports any
> > > > form of userfaultfd.
> > > > 
> > > > With that, I moved both hugetlbfs and shmem over, whenever possible.  So
> > > > far due to concerns on exposing an uffd_copy() API, the MISSING faults are
> > > > still separately processed and can only be done within mm/.  Hugetlbfs kept
> > > > its special paths untouched.
> > > > 
> > > > An example of shmem uffd_ops:
> > > > 
> > > > static const struct vm_uffd_ops shmem_uffd_ops = {
> > > > 	.supported_ioctls	=	BIT(_UFFDIO_COPY) |
> > > > 					BIT(_UFFDIO_ZEROPAGE) |
> > > > 					BIT(_UFFDIO_WRITEPROTECT) |
> > > > 					BIT(_UFFDIO_CONTINUE) |
> > > > 					BIT(_UFFDIO_POISON),
> > > > 	.minor_get_folio	=	shmem_uffd_get_folio,
> > > > };
> 
> I think you forgot to add the link to the guest_memfd implementation [1]
> to your cover letter.

I didn't.

https://lore.kernel.org/all/20251014231501.2301398-1-peterx@redhat.com/

    To show another sample, this is the patch that Nikita posted to implement
    minor fault for guest-memfd (on top of older versions of this series):

      https://lore.kernel.org/all/114133f5-0282-463d-9d65-3143aa658806@amazon.com/


> 
> > > 
> > > This looks better than the previous version to me.
> > > 
> > > Long term the goal should be to move all hugetlb/shmem specific stuff out of
> > > mm/hugetlb.c and of course, we won't be adding any new ones to
> > > mm/userfaultfd.c
> > > 
> > > I agree with Liam that a better interface could be providing default
> > > handlers for the separate ioctls [1], but there is always the option to
> > > evolve this interface into something like that later.
> > 
> > Thanks for accepting this current form.
> > 
> > > 
> > > 
> > > [1] https://lkml.kernel.org/r/frnos5jtmlqvzpcrredcoummuzvllweku5dgp5ii5in6epwnw5@anu4dqsz6shy
> > 
> > I have replied to that, here:
> > 
> > https://lore.kernel.org/all/aOVEDii4HPB6outm@x1.local/
> > 
> > If we ignore hugetlbfs, most of the hooks may not be needed, as explained.
> 
> Those were examples.
> 
> Hooks allow for all the memory type checking to go away in the code,
> which allows for more readable code and fewer operations per call.
> 
> > 
> > If we introduce hooks only for hugetlbfs, IMHO it's going backwards.  When
> > we want to get rid of hugetlbfs paths, we will have something more to get
> > rid of..
> 
> This is just wrong.
> 
> It is far easier to remove one function pointer than go through all the
> code and remove the checks for hugetlbfs.
> 
> Are you thinking the hooks will just point to the generic function?
> This is the only way I can see your statement making sense.  That's not
> the idea I'm trying to communicate.
> 
> The idea is that you split the functions into parts that everyone does
> and special parts, then call them in the correct sequence for each type.
> New types need new special parts while using the generic code for the
> majority of the work.
> 
> In this way, the memory types are modularized into function pointers
> that all use common code without adding complexity.  In fact, implicitly
> knowing the context from the call path means we don't need to check the
> types and should be able to reduce the complexity.
> 
> Then adding a new memory type will call almost all the same functions
> except for special areas.
> 
> Removing old memory types would mean removing the special areas only - and
> maybe a function pointer if they are the only user.
> 
> The current patch set does not modularize memory, it is creating a
> middleware level where we have to parse a value to figure out what to
> do.
> 
> These patches DO expose a method for memory types to be coded in a
> kernel module, which is fundamentally different than modularizing the
> memory types.  Different enough to be glossed over on a ML by looking at
> the subject alone.
> 
> Yes, one value is better than two values, but no magic values is ideal.
> 
> Is it a significant amount of work to remove the magic value by
> fragmenting the code into memory type specific function pointers?
> 
> IOW, instead of decoding the value to figure out where to route calls,
> just expose the calls directly in the function pointer layer that you
> are creating?  What is the minimum number of function pointers to get
> guest_memfd to work without this value being parsed?
> 
> [1].  https://lore.kernel.org/all/114133f5-0282-463d-9d65-3143aa658806@amazon.com/

I don't know what you're looking for.

I think I got acks from most of the userfaultfd developers who were active
in the past few years, ever since v1...

Then we got some concerns about the uffd_copy() API being complicated.
That's fine, I dropped it.

We got another concern about having a function that returns a folio
pointer.  Luckily, we talked it all through, even if I do not know what
really happened.

Now, I really don't know what you're suggesting here.

Can you please send some patches and show us the code, to help everyone
support guest-memfd minor faults?

-- 
Peter Xu
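
For readers comparing the two designs argued over in this thread, here is
a minimal userspace sketch in C.  It is not kernel code: every name in it
(mock_ioctl, mock_ops_mask, mock_ops_hooks, the generic_*() helpers) is
hypothetical and only stands in for the vm_uffd_ops discussion.  Style A
mirrors the idea of a supported_ioctls bitmask that core code decodes;
style B mirrors the idea of exposing per-operation function pointers so
that no value needs parsing.  The example type supports only a CONTINUE
operation, assuming that is all a minor-fault-only memory type would need.

```c
#include <stddef.h>

/* Userspace mock only; none of these names exist in the kernel. */
#define BIT(n) (1UL << (n))

/* Hypothetical operation indices, standing in for _UFFDIO_COPY etc. */
enum mock_ioctl { MOCK_COPY, MOCK_ZEROPAGE, MOCK_CONTINUE, MOCK_NR };

/* Shared "generic" handlers; distinct return values identify them. */
static int generic_copy(void)     { return 1; }
static int generic_zeropage(void) { return 2; }
static int generic_continue(void) { return 3; }

/* Style A: the memory type advertises a bitmask of supported operations. */
struct mock_ops_mask {
	unsigned long supported_ioctls;
};

/* Style B: the memory type provides one hook per operation (NULL = none). */
struct mock_ops_hooks {
	int (*hook[MOCK_NR])(void);
};

/* Style A dispatch: core code decodes the mask, then routes the call. */
static int mask_dispatch(const struct mock_ops_mask *ops, enum mock_ioctl nr)
{
	if (nr >= MOCK_NR || !(ops->supported_ioctls & BIT(nr)))
		return -1;		/* operation not supported */

	switch (nr) {
	case MOCK_COPY:		return generic_copy();
	case MOCK_ZEROPAGE:	return generic_zeropage();
	case MOCK_CONTINUE:	return generic_continue();
	default:		return -1;
	}
}

/* Style B dispatch: the hook table itself is the routing decision. */
static int hook_dispatch(const struct mock_ops_hooks *ops, enum mock_ioctl nr)
{
	int (*fn)(void) = nr < MOCK_NR ? ops->hook[nr] : NULL;

	return fn ? fn() : -1;	/* missing hook means not supported */
}

/* A hypothetical minor-fault-only type, expressed in both styles. */
const struct mock_ops_mask gmem_mask = {
	.supported_ioctls = BIT(MOCK_CONTINUE),
};

const struct mock_ops_hooks gmem_hooks = {
	.hook = { [MOCK_CONTINUE] = generic_continue },
};
```

Both dispatchers reject operations the type did not opt in to; the only
difference in this sketch is whether the routing lives in core code that
parses a value or in the table the memory type supplies.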


