* [RFC PATCH v6 0/2] mm: Refactor KVM guest_memfd to introduce guestmem library
@ 2025-09-15 16:18 Kalyazin, Nikita
From: Kalyazin, Nikita @ 2025-09-15 16:18 UTC (permalink / raw)
  To: akpm, david, pbonzini, seanjc, viro, brauner
  Cc: peterx, lorenzo.stoakes, Liam.Howlett, willy, vbabka, rppt,
	surenb, mhocko, jack, linux-mm, kvm, linux-kernel, linux-fsdevel,
	jthoughton, tabba, vannapurve, Roy, Patrick, Thomson, Jack,
	Manwaring, Derek, Cali, Marco, Kalyazin, Nikita

This is a revival of the guestmem library patch series originally posted
by Elliot [1].  The reason I am bringing it up now is that it would help
implement UserfaultFD minor mode support in guest_memfd.

Background

We are building a Firecracker version that uses guest_memfd to back
guest memory [2].  The main objective is to use guest_memfd to remove
guest memory from host kernel's direct map to reduce the surface for
Spectre-style transient execution issues [3].  Currently, Firecracker
supports restoring VMs from snapshots using UserfaultFD [4], which is
similar to the postcopy phase of live migration.  During restoration,
while we rely on a separate mechanism to handle stage-2 faults in
guest_memfd [5], UserfaultFD support in guest_memfd is still required to
handle faults caused either by the VMM itself or by MMIO access handling
on x86.

The major problem in implementing UserfaultFD for guest_memfd is that
the MM code (UserfaultFD) needs to call KVM-specific interfaces.
Particularly for the minor mode, these are 1) determining the type of
the VMA (e.g. is_vma_guest_memfd()) and 2) obtaining a folio (i.e.
kvm_gmem_get_folio()).  Those may not always be available as KVM can be
compiled as a module.  Peter attempted to approach it via exposing an
ops structure where modules (such as KVM) could provide their own
callbacks, but it was not deemed to be sufficiently safe as it opens up
an unrestricted interface for all modules and may leave MM in an
inconsistent state [6].

An alternative way to make these interfaces available to the UserfaultFD
code is extracting generic-MM guest_memfd parts into a library
(guestmem) under MM where they can be safely consumed by the UserfaultFD
code.  As far as I know, the original guestmem library series was
motivated by adding guest_memfd support in the Gunyah hypervisor [7].
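
To illustrate why a built-in library sidesteps the module problem, here
is a minimal userspace sketch of a VMA type check.  It assumes the check
can be done by comparing the VMA's ops pointer against a table that
lives in always-built-in code; the toy_* names are hypothetical and are
not taken from the series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's vm_operations_struct and vm_area_struct. */
struct toy_vm_ops {
	const char *name;
};

struct toy_vma {
	const struct toy_vm_ops *vm_ops;
};

/*
 * Assumption: this table is defined in the built-in library, so its
 * address is always known to core MM code, whether or not the
 * hypervisor module is loaded.
 */
static const struct toy_vm_ops toy_guestmem_vm_ops = { .name = "guestmem" };

/* An unrelated mapping type, used only for comparison. */
static const struct toy_vm_ops toy_other_vm_ops = { .name = "other" };

/* Type check that core MM code could call without referencing module symbols. */
static bool toy_vma_is_guestmem(const struct toy_vma *vma)
{
	return vma && vma->vm_ops == &toy_guestmem_vm_ops;
}
```

The point of the sketch is only that the predicate and the table it
compares against sit on the same (built-in) side of the module boundary.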

This RFC

I took Elliot's v5 (the latest) and rebased it on top of the guest_memfd
preview branch [8] because I also wanted to see how it would work with
direct map removal [3] and the write syscall [9], which are building blocks
for the guest_memfd-based Firecracker version.  On top of it I added a
patch that implements UserfaultFD support for guest_memfd using
interfaces provided by the guestmem library to illustrate the complete
idea.

I made the following modifications along the way:
 - Following a comment from Sean, converted the invalidate_begin()
   callback back to void, as it cannot fail in KVM and the related
   Gunyah requirement is unknown to me
 - Extended the guestmem_ops structure with the supports_mmap() callback
   to provide conditional mmap support in guestmem
 - Extended the guestmem library interface with guestmem_allocate(),
   guestmem_test_no_direct_map(), guestmem_mark_prepared(),
   guestmem_mmap(), and guestmem_vma_is_guestmem()
 - Made kvm_gmem_test_no_direct_map() and guestmem_test_no_direct_map()
   use mapping_no_direct_map() instead of the KVM-specific flag
   GUEST_MEMFD_FLAG_NO_DIRECT_MAP to make them KVM-independent
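
The callback changes above can be pictured with a small userspace model.
This is a sketch under stated assumptions, not the code from the series:
all toy_* names, the struct layout, and the -ENODEV return value for an
unsupported mmap are hypothetical.  Only the callback names
invalidate_begin() and supports_mmap(), and the fact that
invalidate_begin() returns void, come from the list above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_file;

/* Callbacks supplied by the owner of the memory (for example, a hypervisor). */
struct toy_guestmem_ops {
	/* Returns void: the owner is notified but cannot fail the invalidation. */
	void (*invalidate_begin)(struct toy_file *file,
				 unsigned long start, unsigned long end);
	/* Lets the owner allow or refuse mmap() for a given file. */
	bool (*supports_mmap)(struct toy_file *file);
};

struct toy_file {
	const struct toy_guestmem_ops *ops;
	bool mmap_allowed;           /* hypothetical owner-side state */
	unsigned long invalidations; /* counts invalidate_begin() calls */
};

/* Library side: mmap succeeds only if the owner opts in (assumed errno). */
static int toy_guestmem_mmap(struct toy_file *file)
{
	if (!file->ops->supports_mmap || !file->ops->supports_mmap(file))
		return -ENODEV;
	return 0;
}

/* Library side: notify the owner; there is no error to propagate. */
static void toy_guestmem_invalidate(struct toy_file *file,
				    unsigned long start, unsigned long end)
{
	if (file->ops->invalidate_begin)
		file->ops->invalidate_begin(file, start, end);
}

/* Owner side: record that an invalidation was requested. */
static void toy_owner_invalidate_begin(struct toy_file *file,
				       unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	file->invalidations++;
}

/* Owner side: report the per-file mmap policy. */
static bool toy_owner_supports_mmap(struct toy_file *file)
{
	return file->mmap_allowed;
}

static const struct toy_guestmem_ops toy_owner_ops = {
	.invalidate_begin = toy_owner_invalidate_begin,
	.supports_mmap = toy_owner_supports_mmap,
};
```

In this model the library never needs to know why the owner refuses
mmap; it only asks through the callback, which is what makes the mmap
support conditional.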

Feedback that I would like to receive:
 - Is this the right solution to the "UserfaultFD in guest_memfd"
   problem?
 - What requirements from hypervisors other than KVM do we need to
   consider at this point?
 - Does the line between generic-MM and KVM-specific guest_memfd parts
   look sensible?

Previous iterations of UserfaultFD support in guest_memfd patches:
v3:
 - https://lore.kernel.org/kvm/20250404154352.23078-1-kalyazin@amazon.com
 - minor changes to address review comments (James)
v2:
 - https://lore.kernel.org/kvm/20250402160721.97596-1-kalyazin@amazon.com
 - implement a full minor trap instead of hybrid missing/minor trap
   (James/Peter)
 - make UFFDIO_CONTINUE implementation generic calling vm_ops->fault()
v1:
 - https://lore.kernel.org/kvm/20250303133011.44095-1-kalyazin@amazon.com

Nikita

[1]: https://lore.kernel.org/kvm/20241122-guestmem-library-v5-2-450e92951a15@quicinc.com
[2]: https://github.com/firecracker-microvm/firecracker/tree/feature/secret-hiding
[3]: https://lore.kernel.org/kvm/20250912091708.17502-1-roypat@amazon.co.uk
[4]: https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/handling-page-faults-on-snapshot-resume.md
[5]: https://lore.kernel.org/kvm/20250618042424.330664-1-jthoughton@google.com
[6]: https://lore.kernel.org/linux-mm/20250627154655.2085903-1-peterx@redhat.com
[7]: https://lore.kernel.org/lkml/20240222-gunyah-v17-0-1e9da6763d38@quicinc.com
[8]: https://git.kernel.org/pub/scm/linux/kernel/git/david/linux.git/log/?h=guestmemfd-preview
[9]: https://lore.kernel.org/kvm/20250902111951.58315-1-kalyazin@amazon.com

Nikita Kalyazin (2):
  mm: guestmem: introduce guestmem library
  userfaulfd: add minor mode for guestmem

 Documentation/admin-guide/mm/userfaultfd.rst |   4 +-
 MAINTAINERS                                  |   2 +
 fs/userfaultfd.c                             |   3 +-
 include/linux/guestmem.h                     |  46 +++
 include/linux/userfaultfd_k.h                |   8 +-
 include/uapi/linux/userfaultfd.h             |   8 +-
 mm/Kconfig                                   |   3 +
 mm/Makefile                                  |   1 +
 mm/guestmem.c                                | 380 +++++++++++++++++++
 mm/userfaultfd.c                             |  14 +-
 virt/kvm/Kconfig                             |   1 +
 virt/kvm/guest_memfd.c                       | 303 ++-------------
 12 files changed, 493 insertions(+), 280 deletions(-)
 create mode 100644 include/linux/guestmem.h
 create mode 100644 mm/guestmem.c


base-commit: 911634bac3107b237dcd8fdcb6ac91a22741cbe7
-- 
2.50.1


