linux-mm.kvack.org archive mirror
* [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs
@ 2025-02-18 17:24 Fuad Tabba
  2025-02-18 17:24 ` [PATCH v4 01/10] mm: Consolidate freeing of typed folios on final folio_put() Fuad Tabba
                   ` (9 more replies)
  0 siblings, 10 replies; 24+ messages in thread
From: Fuad Tabba @ 2025-02-18 17:24 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, isaku.yamahata, mic,
	vbabka, vannapurve, ackerleytng, mail, david, michael.roth,
	wei.w.wang, liam.merwick, isaku.yamahata, kirill.shutemov,
	suzuki.poulose, steven.price, quic_eberman, quic_mnalajal,
	quic_tsoni, quic_svaddagi, quic_cvanscha, quic_pderrin,
	quic_pheragu, catalin.marinas, james.morse, yuzenghui,
	oliver.upton, maz, will, qperret, keirf, roypat, shuah, hch, jgg,
	rientjes, jhubbard, fvdl, hughd, jthoughton, tabba

Main changes since v3 [1]:
- Dropped the arm64 software-protected VM type. Instead,
non-confidential arm64 VM types support guest_memfd with in-place
sharing when the configuration option is enabled. Future VM
types can restrict that.
- Expanded the guest_memfd host fault error return values to
cover more cases.
- Fixed faulting in guest_memfd pages on arm64.
- Rebased on Linux 6.14-rc3.

The purpose of this series is to serve as a base for _restricted_
mmap() support for guest_memfd backed memory at the host [2]. It
allows experimentation with what that support would be like in
the safe environment of software and non-confidential VM types.

For more background and for how to test this series, please refer
to v2 [3]. Note that an updated version of kvmtool that works
with this series is available here [4].

Cheers,
/fuad

[1] https://lore.kernel.org/all/20250211121128.703390-1-tabba@google.com/
[2] https://lore.kernel.org/all/20250117163001.2326672-1-tabba@google.com/
[3] https://lore.kernel.org/all/20250129172320.950523-1-tabba@google.com/
[4] https://android-kvm.googlesource.com/kvmtool/+/refs/heads/tabba/guestmem-6.14

Fuad Tabba (10):
  mm: Consolidate freeing of typed folios on final folio_put()
  KVM: guest_memfd: Handle final folio_put() of guest_memfd pages
  KVM: guest_memfd: Allow host to map guest_memfd() pages
  KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  KVM: guest_memfd: Handle in-place shared memory as guest_memfd backed
    memory
  KVM: x86: Mark KVM_X86_SW_PROTECTED_VM as supporting guest_memfd
    shared memory
  KVM: arm64: Refactor user_mem_abort() calculation of force_pte
  KVM: arm64: Handle guest_memfd()-backed guest page faults
  KVM: arm64: Enable mapping guest_memfd in arm64
  KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is
    allowed

 arch/arm64/include/asm/kvm_host.h             |  10 ++
 arch/arm64/kvm/Kconfig                        |   1 +
 arch/arm64/kvm/mmu.c                          |  83 ++++++++-----
 arch/x86/include/asm/kvm_host.h               |   5 +
 arch/x86/kvm/Kconfig                          |   3 +-
 include/linux/kvm_host.h                      |  23 +++-
 include/linux/page-flags.h                    |  32 +++++
 include/uapi/linux/kvm.h                      |   1 +
 mm/debug.c                                    |   1 +
 mm/swap.c                                     |  32 ++++-
 tools/testing/selftests/kvm/Makefile.kvm      |   1 +
 .../testing/selftests/kvm/guest_memfd_test.c  |  75 +++++++++++-
 virt/kvm/Kconfig                              |   5 +
 virt/kvm/guest_memfd.c                        | 110 ++++++++++++++++++
 virt/kvm/kvm_main.c                           |   9 +-
 15 files changed, 345 insertions(+), 46 deletions(-)


base-commit: 0ad2507d5d93f39619fc42372c347d6006b64319
-- 
2.48.1.601.g30ceb7b040-goog



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v4 01/10] mm: Consolidate freeing of typed folios on final folio_put()
  2025-02-18 17:24 [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
@ 2025-02-18 17:24 ` Fuad Tabba
  2025-02-20 11:53   ` David Hildenbrand
  2025-02-18 17:24 ` [PATCH v4 02/10] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages Fuad Tabba
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Fuad Tabba @ 2025-02-18 17:24 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm

Some folio types, such as hugetlb, handle freeing their own
folios. Moreover, guest_memfd will require being notified once a
folio's reference count reaches 0 to facilitate shared-to-private
folio conversion, without the folio actually being freed at that
point.

As a first step towards that, this patch consolidates the
freeing of folios that have a type. The first user is hugetlb
folios. Later in this series, guest_memfd will become the second
user.

Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/linux/page-flags.h | 15 +++++++++++++++
 mm/swap.c                  | 23 ++++++++++++++++++-----
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 36d283552f80..6dc2494bd002 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -953,6 +953,21 @@ static inline bool page_has_type(const struct page *page)
 	return page_mapcount_is_type(data_race(page->page_type));
 }
 
+static inline int page_get_type(const struct page *page)
+{
+	return page->page_type >> 24;
+}
+
+static inline bool folio_has_type(const struct folio *folio)
+{
+	return page_has_type(&folio->page);
+}
+
+static inline int folio_get_type(const struct folio *folio)
+{
+	return page_get_type(&folio->page);
+}
+
 #define FOLIO_TYPE_OPS(lname, fname)					\
 static __always_inline bool folio_test_##fname(const struct folio *folio) \
 {									\
diff --git a/mm/swap.c b/mm/swap.c
index fc8281ef4241..47bc1bb919cc 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -94,6 +94,19 @@ static void page_cache_release(struct folio *folio)
 		unlock_page_lruvec_irqrestore(lruvec, flags);
 }
 
+static void free_typed_folio(struct folio *folio)
+{
+	switch (folio_get_type(folio)) {
+#ifdef CONFIG_HUGETLBFS
+	case PGTY_hugetlb:
+		free_huge_folio(folio);
+		return;
+#endif
+	default:
+		WARN_ON_ONCE(1);
+	}
+}
+
 void __folio_put(struct folio *folio)
 {
 	if (unlikely(folio_is_zone_device(folio))) {
@@ -101,8 +114,8 @@ void __folio_put(struct folio *folio)
 		return;
 	}
 
-	if (folio_test_hugetlb(folio)) {
-		free_huge_folio(folio);
+	if (unlikely(folio_has_type(folio))) {
+		free_typed_folio(folio);
 		return;
 	}
 
@@ -966,13 +979,13 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
 		if (!folio_ref_sub_and_test(folio, nr_refs))
 			continue;
 
-		/* hugetlb has its own memcg */
-		if (folio_test_hugetlb(folio)) {
+		if (unlikely(folio_has_type(folio))) {
+			/* typed folios have their own memcg, if any */
 			if (lruvec) {
 				unlock_page_lruvec_irqrestore(lruvec, flags);
 				lruvec = NULL;
 			}
-			free_huge_folio(folio);
+			free_typed_folio(folio);
 			continue;
 		}
 		folio_unqueue_deferred_split(folio);
-- 
2.48.1.601.g30ceb7b040-goog




* [PATCH v4 02/10] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages
  2025-02-18 17:24 [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
  2025-02-18 17:24 ` [PATCH v4 01/10] mm: Consolidate freeing of typed folios on final folio_put() Fuad Tabba
@ 2025-02-18 17:24 ` Fuad Tabba
  2025-02-20 11:54   ` David Hildenbrand
  2025-02-18 17:24 ` [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages Fuad Tabba
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Fuad Tabba @ 2025-02-18 17:24 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm

Before transitioning a guest_memfd folio to unshared, thereby
disallowing access by the host and allowing the hypervisor to
transition its view of the guest page to private, we need to be
sure that the host doesn't have any references to the folio.

This patch introduces a new type for guest_memfd folios. It
isn't activated in this series, but serves as a placeholder and
facilitates the code in subsequent patch series. It will be used
in the future to register a callback that informs the
guest_memfd subsystem when the last reference to a folio is
dropped, so that it knows the host doesn't have any remaining
references.

This patch also introduces the configuration option,
KVM_GMEM_SHARED_MEM, which toggles support for mapping
guest_memfd shared memory at the host.

Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/kvm_host.h   |  4 ++++
 include/linux/page-flags.h | 17 +++++++++++++++++
 mm/debug.c                 |  1 +
 mm/swap.c                  |  9 +++++++++
 virt/kvm/Kconfig           |  5 +++++
 virt/kvm/guest_memfd.c     |  7 +++++++
 6 files changed, 43 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f34f4cfaa513..3ad0719bfc4f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2571,4 +2571,8 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 				    struct kvm_pre_fault_memory *range);
 #endif
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+void kvm_gmem_handle_folio_put(struct folio *folio);
+#endif
+
 #endif
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 6dc2494bd002..734afda268ab 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -933,6 +933,17 @@ enum pagetype {
 	PGTY_slab	= 0xf5,
 	PGTY_zsmalloc	= 0xf6,
 	PGTY_unaccepted	= 0xf7,
+	/*
+	 * guestmem folios are used to back VM memory as managed by guest_memfd.
+	 * Once the last reference is put, instead of freeing these folios back
+	 * to the page allocator, they are returned to guest_memfd.
+	 *
+	 * For now, guestmem will only be set on these folios as long as they
+	 * cannot be mapped to user space ("private state"), with the plan of
+	 * always setting that type once typed folios can be mapped to user
+	 * space cleanly.
+	 */
+	PGTY_guestmem	= 0xf8,
 
 	PGTY_mapcount_underflow = 0xff
 };
@@ -1082,6 +1093,12 @@ FOLIO_TYPE_OPS(hugetlb, hugetlb)
 FOLIO_TEST_FLAG_FALSE(hugetlb)
 #endif
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+FOLIO_TYPE_OPS(guestmem, guestmem)
+#else
+FOLIO_TEST_FLAG_FALSE(guestmem)
+#endif
+
 PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
 
 /*
diff --git a/mm/debug.c b/mm/debug.c
index 8d2acf432385..08bc42c6cba8 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -56,6 +56,7 @@ static const char *page_type_names[] = {
 	DEF_PAGETYPE_NAME(table),
 	DEF_PAGETYPE_NAME(buddy),
 	DEF_PAGETYPE_NAME(unaccepted),
+	DEF_PAGETYPE_NAME(guestmem),
 };
 
 static const char *page_type_name(unsigned int page_type)
diff --git a/mm/swap.c b/mm/swap.c
index 47bc1bb919cc..241880a46358 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -38,6 +38,10 @@
 #include <linux/local_lock.h>
 #include <linux/buffer_head.h>
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+#include <linux/kvm_host.h>
+#endif
+
 #include "internal.h"
 
 #define CREATE_TRACE_POINTS
@@ -101,6 +105,11 @@ static void free_typed_folio(struct folio *folio)
 	case PGTY_hugetlb:
 		free_huge_folio(folio);
 		return;
+#endif
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+	case PGTY_guestmem:
+		kvm_gmem_handle_folio_put(folio);
+		return;
 #endif
 	default:
 		WARN_ON_ONCE(1);
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 54e959e7d68f..37f7734cb10f 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -124,3 +124,8 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
 config HAVE_KVM_ARCH_GMEM_INVALIDATE
        bool
        depends on KVM_PRIVATE_MEM
+
+config KVM_GMEM_SHARED_MEM
+       select KVM_PRIVATE_MEM
+       depends on !KVM_GENERIC_MEMORY_ATTRIBUTES
+       bool
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index b2aa6bf24d3a..c6f6792bec2a 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -312,6 +312,13 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
 	return gfn - slot->base_gfn + slot->gmem.pgoff;
 }
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+void kvm_gmem_handle_folio_put(struct folio *folio)
+{
+	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
+}
+#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
+
 static struct file_operations kvm_gmem_fops = {
 	.open		= generic_file_open,
 	.release	= kvm_gmem_release,
-- 
2.48.1.601.g30ceb7b040-goog




* [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-02-18 17:24 [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
  2025-02-18 17:24 ` [PATCH v4 01/10] mm: Consolidate freeing of typed folios on final folio_put() Fuad Tabba
  2025-02-18 17:24 ` [PATCH v4 02/10] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages Fuad Tabba
@ 2025-02-18 17:24 ` Fuad Tabba
  2025-02-20 11:58   ` David Hildenbrand
  2025-02-18 17:24 ` [PATCH v4 04/10] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared Fuad Tabba
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Fuad Tabba @ 2025-02-18 17:24 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm

Add support for mmap() and fault() for guest_memfd backed memory
in the host for VMs that support in-place conversion between
shared and private. To that end, this patch adds the ability to
check whether the VM type supports in-place conversion, and only
allows mapping its memory if that's the case.

This behavior is also gated by the configuration option
KVM_GMEM_SHARED_MEM.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/linux/kvm_host.h |  11 +++++
 virt/kvm/guest_memfd.c   | 103 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 114 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3ad0719bfc4f..f9e8b10a4b09 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -728,6 +728,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
 }
 #endif
 
+/*
+ * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
+ * private memory is enabled and it supports in-place shared/private conversion.
+ */
+#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
+static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
+{
+	return false;
+}
+#endif
+
 #ifndef kvm_arch_has_readonly_mem
 static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
 {
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index c6f6792bec2a..30b47ff0e6d2 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -317,9 +317,112 @@ void kvm_gmem_handle_folio_put(struct folio *folio)
 {
 	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
 }
+
+static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
+{
+	struct kvm_gmem *gmem = file->private_data;
+
+	/* For now, VMs that support shared memory share all their memory. */
+	return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
+}
+
+static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
+{
+	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct folio *folio;
+	vm_fault_t ret = VM_FAULT_LOCKED;
+
+	filemap_invalidate_lock_shared(inode->i_mapping);
+
+	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
+	if (IS_ERR(folio)) {
+		switch (PTR_ERR(folio)) {
+		case -EAGAIN:
+			ret = VM_FAULT_RETRY;
+			break;
+		case -ENOMEM:
+			ret = VM_FAULT_OOM;
+			break;
+		default:
+			ret = VM_FAULT_SIGBUS;
+			break;
+		}
+		goto out_filemap;
+	}
+
+	if (folio_test_hwpoison(folio)) {
+		ret = VM_FAULT_HWPOISON;
+		goto out_folio;
+	}
+
+	/* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
+	if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	/*
+	 * Only private folios are marked as "guestmem" so far, and we never
+	 * expect private folios at this point.
+	 */
+	if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	/* No support for huge pages. */
+	if (WARN_ON_ONCE(folio_test_large(folio))) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	if (!folio_test_uptodate(folio)) {
+		clear_highpage(folio_page(folio, 0));
+		kvm_gmem_mark_prepared(folio);
+	}
+
+	vmf->page = folio_file_page(folio, vmf->pgoff);
+
+out_folio:
+	if (ret != VM_FAULT_LOCKED) {
+		folio_unlock(folio);
+		folio_put(folio);
+	}
+
+out_filemap:
+	filemap_invalidate_unlock_shared(inode->i_mapping);
+
+	return ret;
+}
+
+static const struct vm_operations_struct kvm_gmem_vm_ops = {
+	.fault = kvm_gmem_fault,
+};
+
+static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct kvm_gmem *gmem = file->private_data;
+
+	if (!kvm_arch_gmem_supports_shared_mem(gmem->kvm))
+		return -ENODEV;
+
+	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) !=
+	    (VM_SHARED | VM_MAYSHARE)) {
+		return -EINVAL;
+	}
+
+	file_accessed(file);
+	vm_flags_set(vma, VM_DONTDUMP);
+	vma->vm_ops = &kvm_gmem_vm_ops;
+
+	return 0;
+}
+#else
+#define kvm_gmem_mmap NULL
 #endif /* CONFIG_KVM_GMEM_SHARED_MEM */
 
 static struct file_operations kvm_gmem_fops = {
+	.mmap		= kvm_gmem_mmap,
 	.open		= generic_file_open,
 	.release	= kvm_gmem_release,
 	.fallocate	= kvm_gmem_fallocate,
-- 
2.48.1.601.g30ceb7b040-goog




* [PATCH v4 04/10] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  2025-02-18 17:24 [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (2 preceding siblings ...)
  2025-02-18 17:24 ` [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages Fuad Tabba
@ 2025-02-18 17:24 ` Fuad Tabba
  2025-02-28 16:23   ` Peter Xu
  2025-02-18 17:24 ` [PATCH v4 05/10] KVM: guest_memfd: Handle in-place shared memory as guest_memfd backed memory Fuad Tabba
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Fuad Tabba @ 2025-02-18 17:24 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm

Add the KVM capability KVM_CAP_GMEM_SHARED_MEM, which indicates
that the VM supports shared memory in guest_memfd, or that the
host can create VMs that support shared memory. Supporting shared
memory implies that memory can be mapped when shared with the
host.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/uapi/linux/kvm.h | 1 +
 virt/kvm/kvm_main.c      | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 45e6d8fca9b9..117937a895da 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -929,6 +929,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_PRE_FAULT_MEMORY 236
 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
 #define KVM_CAP_X86_GUEST_MODE 238
+#define KVM_CAP_GMEM_SHARED_MEM 239
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ba0327e2d0d3..38f0f402ea46 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4830,6 +4830,10 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 #ifdef CONFIG_KVM_PRIVATE_MEM
 	case KVM_CAP_GUEST_MEMFD:
 		return !kvm || kvm_arch_has_private_mem(kvm);
+#endif
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+	case KVM_CAP_GMEM_SHARED_MEM:
+		return !kvm || kvm_arch_gmem_supports_shared_mem(kvm);
 #endif
 	default:
 		break;
-- 
2.48.1.601.g30ceb7b040-goog




* [PATCH v4 05/10] KVM: guest_memfd: Handle in-place shared memory as guest_memfd backed memory
  2025-02-18 17:24 [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (3 preceding siblings ...)
  2025-02-18 17:24 ` [PATCH v4 04/10] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared Fuad Tabba
@ 2025-02-18 17:24 ` Fuad Tabba
  2025-02-18 17:24 ` [PATCH v4 06/10] KVM: x86: Mark KVM_X86_SW_PROTECTED_VM as supporting guest_memfd shared memory Fuad Tabba
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Fuad Tabba @ 2025-02-18 17:24 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm

For VMs that allow sharing guest_memfd backed memory in-place,
handle that memory the same as "private" guest_memfd memory. This
means that faulting that memory in the host or in the guest will
go through the guest_memfd subsystem.

Note that the word "private" in the name of the function
kvm_mem_is_private() doesn't necessarily mean that the memory
isn't shared; the name is a result of the history and evolution
of guest_memfd and the various names it has received. In effect,
this function multiplexes between the path of a normal page
fault and the path of a guest_memfd backed page fault.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/linux/kvm_host.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f9e8b10a4b09..83f65c910ccb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2521,7 +2521,8 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
 #else
 static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
 {
-	return false;
+	return kvm_arch_gmem_supports_shared_mem(kvm) &&
+	       kvm_slot_can_be_private(gfn_to_memslot(kvm, gfn));
 }
 #endif /* CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES */
 
-- 
2.48.1.601.g30ceb7b040-goog




* [PATCH v4 06/10] KVM: x86: Mark KVM_X86_SW_PROTECTED_VM as supporting guest_memfd shared memory
  2025-02-18 17:24 [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (4 preceding siblings ...)
  2025-02-18 17:24 ` [PATCH v4 05/10] KVM: guest_memfd: Handle in-place shared memory as guest_memfd backed memory Fuad Tabba
@ 2025-02-18 17:24 ` Fuad Tabba
  2025-02-18 17:24 ` [PATCH v4 07/10] KVM: arm64: Refactor user_mem_abort() calculation of force_pte Fuad Tabba
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Fuad Tabba @ 2025-02-18 17:24 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm

The KVM_X86_SW_PROTECTED_VM type is meant for experimentation and
does not have any underlying support for protected guests. This
makes it a good candidate for testing mapping shared memory.
Therefore, when the kconfig option is enabled, mark
KVM_X86_SW_PROTECTED_VM as supporting shared memory.

This means that this memory is considered by guest_memfd to be
shared with the host, with the possibility of in-place conversion
between shared and private. This allows the host to map and fault
in guest_memfd memory belonging to this VM type.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/x86/include/asm/kvm_host.h | 5 +++++
 arch/x86/kvm/Kconfig            | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0b7af5902ff7..c6e4925bdc8a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2245,8 +2245,13 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
 
 #ifdef CONFIG_KVM_PRIVATE_MEM
 #define kvm_arch_has_private_mem(kvm) ((kvm)->arch.has_private_mem)
+
+#define kvm_arch_gmem_supports_shared_mem(kvm)         \
+	(IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM) &&      \
+	 ((kvm)->arch.vm_type == KVM_X86_SW_PROTECTED_VM))
 #else
 #define kvm_arch_has_private_mem(kvm) false
+#define kvm_arch_gmem_supports_shared_mem(kvm) false
 #endif
 
 #define kvm_arch_has_readonly_mem(kvm) (!(kvm)->arch.has_protected_state)
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index ea2c4f21c1ca..22d1bcdaad58 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -45,7 +45,8 @@ config KVM_X86
 	select HAVE_KVM_PM_NOTIFIER if PM
 	select KVM_GENERIC_HARDWARE_ENABLING
 	select KVM_GENERIC_PRE_FAULT_MEMORY
-	select KVM_GENERIC_PRIVATE_MEM if KVM_SW_PROTECTED_VM
+	select KVM_PRIVATE_MEM if KVM_SW_PROTECTED_VM
+	select KVM_GMEM_SHARED_MEM if KVM_SW_PROTECTED_VM
 	select KVM_WERROR if WERROR
 
 config KVM
-- 
2.48.1.601.g30ceb7b040-goog




* [PATCH v4 07/10] KVM: arm64: Refactor user_mem_abort() calculation of force_pte
  2025-02-18 17:24 [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (5 preceding siblings ...)
  2025-02-18 17:24 ` [PATCH v4 06/10] KVM: x86: Mark KVM_X86_SW_PROTECTED_VM as supporting guest_memfd shared memory Fuad Tabba
@ 2025-02-18 17:24 ` Fuad Tabba
  2025-02-18 17:24 ` [PATCH v4 08/10] KVM: arm64: Handle guest_memfd()-backed guest page faults Fuad Tabba
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Fuad Tabba @ 2025-02-18 17:24 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm

To simplify the code and to make the assumptions clearer,
refactor user_mem_abort() by immediately setting force_pte to
true if the conditions are met. Also, add a check to enforce the
assumption that logging_active is never true for VM_PFNMAP
memslots.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 1f55b0c7b11d..b6c0acb2311c 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1460,7 +1460,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  bool fault_is_perm)
 {
 	int ret = 0;
-	bool write_fault, writable, force_pte = false;
+	bool write_fault, writable;
 	bool exec_fault, mte_allowed;
 	bool device = false, vfio_allow_any_uc = false;
 	unsigned long mmu_seq;
@@ -1472,6 +1472,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	gfn_t gfn;
 	kvm_pfn_t pfn;
 	bool logging_active = memslot_is_logging(memslot);
+	bool force_pte = logging_active || is_protected_kvm_enabled();
 	long vma_pagesize, fault_granule;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
@@ -1525,12 +1526,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * logging_active is guaranteed to never be true for VM_PFNMAP
 	 * memslots.
 	 */
-	if (logging_active || is_protected_kvm_enabled()) {
-		force_pte = true;
+	if (WARN_ON_ONCE(logging_active && (vma->vm_flags & VM_PFNMAP)))
+		return -EFAULT;
+
+	if (force_pte)
 		vma_shift = PAGE_SHIFT;
-	} else {
+	else
 		vma_shift = get_vma_page_shift(vma, hva);
-	}
 
 	switch (vma_shift) {
 #ifndef __PAGETABLE_PMD_FOLDED
-- 
2.48.1.601.g30ceb7b040-goog




* [PATCH v4 08/10] KVM: arm64: Handle guest_memfd()-backed guest page faults
  2025-02-18 17:24 [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (6 preceding siblings ...)
  2025-02-18 17:24 ` [PATCH v4 07/10] KVM: arm64: Refactor user_mem_abort() calculation of force_pte Fuad Tabba
@ 2025-02-18 17:24 ` Fuad Tabba
  2025-02-18 17:24 ` [PATCH v4 09/10] KVM: arm64: Enable mapping guest_memfd in arm64 Fuad Tabba
  2025-02-18 17:25 ` [PATCH v4 10/10] KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed Fuad Tabba
  9 siblings, 0 replies; 24+ messages in thread
From: Fuad Tabba @ 2025-02-18 17:24 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, isaku.yamahata, mic,
	vbabka, vannapurve, ackerleytng, mail, david, michael.roth,
	wei.w.wang, liam.merwick, isaku.yamahata, kirill.shutemov,
	suzuki.poulose, steven.price, quic_eberman, quic_mnalajal,
	quic_tsoni, quic_svaddagi, quic_cvanscha, quic_pderrin,
	quic_pheragu, catalin.marinas, james.morse, yuzenghui,
	oliver.upton, maz, will, qperret, keirf, roypat, shuah, hch, jgg,
	rientjes, jhubbard, fvdl, hughd, jthoughton, tabba

Add arm64 support for handling guest page faults on guest_memfd
backed memslots.

For now, the fault granule is restricted to PAGE_SIZE.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c     | 79 ++++++++++++++++++++++++++--------------
 include/linux/kvm_host.h |  5 +++
 virt/kvm/kvm_main.c      |  5 ---
 3 files changed, 57 insertions(+), 32 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index b6c0acb2311c..d57a70f19aac 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1454,6 +1454,30 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_MTE_ALLOWED;
 }
 
+static kvm_pfn_t faultin_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+			     gfn_t gfn, bool write_fault, bool *writable,
+			     struct page **page, bool is_private)
+{
+	kvm_pfn_t pfn;
+	int ret;
+
+	if (!is_private)
+		return __kvm_faultin_pfn(slot, gfn, write_fault ? FOLL_WRITE : 0, writable, page);
+
+	*writable = false;
+
+	ret = kvm_gmem_get_pfn(kvm, slot, gfn, &pfn, page, NULL);
+	if (!ret) {
+		*writable = !memslot_is_readonly(slot);
+		return pfn;
+	}
+
+	if (ret == -EHWPOISON)
+		return KVM_PFN_ERR_HWPOISON;
+
+	return KVM_PFN_ERR_NOSLOT_MASK;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_s2_trans *nested,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1461,19 +1485,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 {
 	int ret = 0;
 	bool write_fault, writable;
-	bool exec_fault, mte_allowed;
+	bool exec_fault, mte_allowed = false;
 	bool device = false, vfio_allow_any_uc = false;
 	unsigned long mmu_seq;
 	phys_addr_t ipa = fault_ipa;
 	struct kvm *kvm = vcpu->kvm;
-	struct vm_area_struct *vma;
+	struct vm_area_struct *vma = NULL;
 	short vma_shift;
 	void *memcache;
-	gfn_t gfn;
+	gfn_t gfn = ipa >> PAGE_SHIFT;
 	kvm_pfn_t pfn;
 	bool logging_active = memslot_is_logging(memslot);
-	bool force_pte = logging_active || is_protected_kvm_enabled();
-	long vma_pagesize, fault_granule;
+	bool is_gmem = kvm_mem_is_private(kvm, gfn);
+	bool force_pte = logging_active || is_gmem || is_protected_kvm_enabled();
+	long vma_pagesize, fault_granule = PAGE_SIZE;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
 	struct page *page;
@@ -1510,24 +1535,30 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			return ret;
 	}
 
+	mmap_read_lock(current->mm);
+
 	/*
 	 * Let's check if we will get back a huge page backed by hugetlbfs, or
 	 * get block mapping for device MMIO region.
 	 */
-	mmap_read_lock(current->mm);
-	vma = vma_lookup(current->mm, hva);
-	if (unlikely(!vma)) {
-		kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
-		mmap_read_unlock(current->mm);
-		return -EFAULT;
-	}
+	if (!is_gmem) {
+		vma = vma_lookup(current->mm, hva);
+		if (unlikely(!vma)) {
+			kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
+			mmap_read_unlock(current->mm);
+			return -EFAULT;
+		}
 
-	/*
-	 * logging_active is guaranteed to never be true for VM_PFNMAP
-	 * memslots.
-	 */
-	if (WARN_ON_ONCE(logging_active && (vma->vm_flags & VM_PFNMAP)))
-		return -EFAULT;
+		/*
+		 * logging_active is guaranteed to never be true for VM_PFNMAP
+		 * memslots.
+		 */
+		if (WARN_ON_ONCE(logging_active && (vma->vm_flags & VM_PFNMAP)))
+			return -EFAULT;
+
+		vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
+		mte_allowed = kvm_vma_mte_allowed(vma);
+	}
 
 	if (force_pte)
 		vma_shift = PAGE_SHIFT;
@@ -1597,18 +1628,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ipa &= ~(vma_pagesize - 1);
 	}
 
-	gfn = ipa >> PAGE_SHIFT;
-	mte_allowed = kvm_vma_mte_allowed(vma);
-
-	vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
-
 	/* Don't use the VMA after the unlock -- it may have vanished */
 	vma = NULL;
 
 	/*
 	 * Read mmu_invalidate_seq so that KVM can detect if the results of
-	 * vma_lookup() or __kvm_faultin_pfn() become stale prior to
-	 * acquiring kvm->mmu_lock.
+	 * vma_lookup() or faultin_pfn() become stale prior to acquiring
+	 * kvm->mmu_lock.
 	 *
 	 * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
 	 * with the smp_wmb() in kvm_mmu_invalidate_end().
@@ -1616,8 +1642,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
 	mmap_read_unlock(current->mm);
 
-	pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
-				&writable, &page);
+	pfn = faultin_pfn(kvm, memslot, gfn, write_fault, &writable, &page, is_gmem);
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(hva, vma_shift);
 		return 0;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 83f65c910ccb..04f998476bf9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1882,6 +1882,11 @@ static inline int memslot_id(struct kvm *kvm, gfn_t gfn)
 	return gfn_to_memslot(kvm, gfn)->id;
 }
 
+static inline bool memslot_is_readonly(const struct kvm_memory_slot *slot)
+{
+	return slot->flags & KVM_MEM_READONLY;
+}
+
 static inline gfn_t
 hva_to_gfn_memslot(unsigned long hva, struct kvm_memory_slot *slot)
 {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 38f0f402ea46..3e40acb9f5c0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2624,11 +2624,6 @@ unsigned long kvm_host_page_size(struct kvm_vcpu *vcpu, gfn_t gfn)
 	return size;
 }
 
-static bool memslot_is_readonly(const struct kvm_memory_slot *slot)
-{
-	return slot->flags & KVM_MEM_READONLY;
-}
-
 static unsigned long __gfn_to_hva_many(const struct kvm_memory_slot *slot, gfn_t gfn,
 				       gfn_t *nr_pages, bool write)
 {
-- 
2.48.1.601.g30ceb7b040-goog




* [PATCH v4 09/10] KVM: arm64: Enable mapping guest_memfd in arm64
  2025-02-18 17:24 [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (7 preceding siblings ...)
  2025-02-18 17:24 ` [PATCH v4 08/10] KVM: arm64: Handle guest_memfd()-backed guest page faults Fuad Tabba
@ 2025-02-18 17:24 ` Fuad Tabba
  2025-02-18 17:25 ` [PATCH v4 10/10] KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed Fuad Tabba
  9 siblings, 0 replies; 24+ messages in thread
From: Fuad Tabba @ 2025-02-18 17:24 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, isaku.yamahata, mic,
	vbabka, vannapurve, ackerleytng, mail, david, michael.roth,
	wei.w.wang, liam.merwick, isaku.yamahata, kirill.shutemov,
	suzuki.poulose, steven.price, quic_eberman, quic_mnalajal,
	quic_tsoni, quic_svaddagi, quic_cvanscha, quic_pderrin,
	quic_pheragu, catalin.marinas, james.morse, yuzenghui,
	oliver.upton, maz, will, qperret, keirf, roypat, shuah, hch, jgg,
	rientjes, jhubbard, fvdl, hughd, jthoughton, tabba

Enable mapping guest_memfd on arm64. For now, this applies to all
arm64 VMs that use guest_memfd. In the future, new VM types can
restrict it via kvm_arch_gmem_supports_shared_mem().

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_host.h | 10 ++++++++++
 arch/arm64/kvm/Kconfig            |  1 +
 2 files changed, 11 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 3a7ec98ef123..e722a9982647 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1543,4 +1543,14 @@ void kvm_set_vm_id_reg(struct kvm *kvm, u32 reg, u64 val);
 #define kvm_has_s1poe(k)				\
 	(kvm_has_feat((k), ID_AA64MMFR3_EL1, S1POE, IMP))
 
+static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
+{
+	return IS_ENABLED(CONFIG_KVM_PRIVATE_MEM);
+}
+
+static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
+{
+	return IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM);
+}
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index ead632ad01b4..4830d8805bed 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -38,6 +38,7 @@ menuconfig KVM
 	select HAVE_KVM_VCPU_RUN_PID_CHANGE
 	select SCHED_INFO
 	select GUEST_PERF_EVENTS if PERF_EVENTS
+	select KVM_GMEM_SHARED_MEM
 	help
 	  Support hosting virtualized guest machines.
 
-- 
2.48.1.601.g30ceb7b040-goog




* [PATCH v4 10/10] KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed
  2025-02-18 17:24 [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (8 preceding siblings ...)
  2025-02-18 17:24 ` [PATCH v4 09/10] KVM: arm64: Enable mapping guest_memfd in arm64 Fuad Tabba
@ 2025-02-18 17:25 ` Fuad Tabba
  9 siblings, 0 replies; 24+ messages in thread
From: Fuad Tabba @ 2025-02-18 17:25 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, isaku.yamahata, mic,
	vbabka, vannapurve, ackerleytng, mail, david, michael.roth,
	wei.w.wang, liam.merwick, isaku.yamahata, kirill.shutemov,
	suzuki.poulose, steven.price, quic_eberman, quic_mnalajal,
	quic_tsoni, quic_svaddagi, quic_cvanscha, quic_pderrin,
	quic_pheragu, catalin.marinas, james.morse, yuzenghui,
	oliver.upton, maz, will, qperret, keirf, roypat, shuah, hch, jgg,
	rientjes, jhubbard, fvdl, hughd, jthoughton, tabba

Expand the guest_memfd selftests to cover mapping guest memory for
VM types that support it.

Also, build the guest_memfd selftest for aarch64.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 tools/testing/selftests/kvm/Makefile.kvm      |  1 +
 .../testing/selftests/kvm/guest_memfd_test.c  | 75 +++++++++++++++++--
 2 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index 4277b983cace..c9a3f30e28dd 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -160,6 +160,7 @@ TEST_GEN_PROGS_arm64 += coalesced_io_test
 TEST_GEN_PROGS_arm64 += demand_paging_test
 TEST_GEN_PROGS_arm64 += dirty_log_test
 TEST_GEN_PROGS_arm64 += dirty_log_perf_test
+TEST_GEN_PROGS_arm64 += guest_memfd_test
 TEST_GEN_PROGS_arm64 += guest_print_test
 TEST_GEN_PROGS_arm64 += get-reg-list
 TEST_GEN_PROGS_arm64 += kvm_create_max_vcpus
diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index ce687f8d248f..38c501e49e0e 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -34,12 +34,48 @@ static void test_file_read_write(int fd)
 		    "pwrite on a guest_mem fd should fail");
 }
 
-static void test_mmap(int fd, size_t page_size)
+static void test_mmap_allowed(int fd, size_t total_size)
 {
+	size_t page_size = getpagesize();
+	const char val = 0xaa;
+	char *mem;
+	int ret;
+	int i;
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT(mem != MAP_FAILED, "mmap() of guest memory should succeed.");
+
+	memset(mem, val, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, 0,
+			page_size);
+	TEST_ASSERT(!ret, "fallocate the first page should succeed");
+
+	for (i = 0; i < page_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0x00);
+	for (; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	memset(mem, val, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	ret = munmap(mem, total_size);
+	TEST_ASSERT(!ret, "munmap should succeed");
+}
+
+static void test_mmap_denied(int fd, size_t total_size)
+{
+	size_t page_size = getpagesize();
 	char *mem;
 
 	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
 	TEST_ASSERT_EQ(mem, MAP_FAILED);
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT_EQ(mem, MAP_FAILED);
 }
 
 static void test_file_size(int fd, size_t page_size, size_t total_size)
@@ -170,19 +206,27 @@ static void test_create_guest_memfd_multiple(struct kvm_vm *vm)
 	close(fd1);
 }
 
-int main(int argc, char *argv[])
+unsigned long get_shared_type(void)
 {
-	size_t page_size;
+#ifdef __x86_64__
+	return KVM_X86_SW_PROTECTED_VM;
+#endif
+	return 0;
+}
+
+void test_vm_type(unsigned long type, bool is_shared)
+{
+	struct kvm_vm *vm;
 	size_t total_size;
+	size_t page_size;
 	int fd;
-	struct kvm_vm *vm;
 
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD));
 
 	page_size = getpagesize();
 	total_size = page_size * 4;
 
-	vm = vm_create_barebones();
+	vm = vm_create_barebones_type(type);
 
 	test_create_guest_memfd_invalid(vm);
 	test_create_guest_memfd_multiple(vm);
@@ -190,10 +234,29 @@ int main(int argc, char *argv[])
 	fd = vm_create_guest_memfd(vm, total_size, 0);
 
 	test_file_read_write(fd);
-	test_mmap(fd, page_size);
+
+	if (is_shared)
+		test_mmap_allowed(fd, total_size);
+	else
+		test_mmap_denied(fd, total_size);
+
 	test_file_size(fd, page_size, total_size);
 	test_fallocate(fd, page_size, total_size);
 	test_invalid_punch_hole(fd, page_size, total_size);
 
 	close(fd);
+	kvm_vm_release(vm);
+}
+
+int main(int argc, char *argv[])
+{
+#ifndef __aarch64__
+	/* For now, arm64 only supports shared guest memory. */
+	test_vm_type(VM_TYPE_DEFAULT, false);
+#endif
+
+	if (kvm_has_cap(KVM_CAP_GMEM_SHARED_MEM))
+		test_vm_type(get_shared_type(), true);
+
+	return 0;
 }
-- 
2.48.1.601.g30ceb7b040-goog




* Re: [PATCH v4 01/10] mm: Consolidate freeing of typed folios on final folio_put()
  2025-02-18 17:24 ` [PATCH v4 01/10] mm: Consolidate freeing of typed folios on final folio_put() Fuad Tabba
@ 2025-02-20 11:53   ` David Hildenbrand
  0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2025-02-20 11:53 UTC (permalink / raw)
  To: Fuad Tabba, kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, isaku.yamahata, mic,
	vbabka, vannapurve, ackerleytng, mail, michael.roth, wei.w.wang,
	liam.merwick, isaku.yamahata, kirill.shutemov, suzuki.poulose,
	steven.price, quic_eberman, quic_mnalajal, quic_tsoni,
	quic_svaddagi, quic_cvanscha, quic_pderrin, quic_pheragu,
	catalin.marinas, james.morse, yuzenghui, oliver.upton, maz, will,
	qperret, keirf, roypat, shuah, hch, jgg, rientjes, jhubbard,
	fvdl, hughd, jthoughton

On 18.02.25 18:24, Fuad Tabba wrote:
> Some folio types, such as hugetlb, handle freeing their own
> folios. Moreover, guest_memfd will require being notified once a
> folio's reference count reaches 0 to facilitate shared to private
> folio conversion, without the folio actually being freed at that
> point.
> 
> As a first step towards that, this patch consolidates freeing
> folios that have a type. The first user is hugetlb folios. Later
> in this patch series, guest_memfd will become the second user of
> this.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---

(again on current patch series)

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb




* Re: [PATCH v4 02/10] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages
  2025-02-18 17:24 ` [PATCH v4 02/10] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages Fuad Tabba
@ 2025-02-20 11:54   ` David Hildenbrand
  0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2025-02-20 11:54 UTC (permalink / raw)
  To: Fuad Tabba, kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, isaku.yamahata, mic,
	vbabka, vannapurve, ackerleytng, mail, michael.roth, wei.w.wang,
	liam.merwick, isaku.yamahata, kirill.shutemov, suzuki.poulose,
	steven.price, quic_eberman, quic_mnalajal, quic_tsoni,
	quic_svaddagi, quic_cvanscha, quic_pderrin, quic_pheragu,
	catalin.marinas, james.morse, yuzenghui, oliver.upton, maz, will,
	qperret, keirf, roypat, shuah, hch, jgg, rientjes, jhubbard,
	fvdl, hughd, jthoughton

On 18.02.25 18:24, Fuad Tabba wrote:
> Before transitioning a guest_memfd folio to unshared, thereby
> disallowing access by the host and allowing the hypervisor to
> transition its view of the guest page as private, we need to be
> sure that the host doesn't have any references to the folio.
> 
> This patch introduces a new type for guest_memfd folios, which
> isn't activated in this series but is here as a placeholder and
> to facilitate the code in the subsequent patch series. This will
> be used in the future to register a callback that informs the
> guest_memfd subsystem when the last reference is dropped,
> therefore knowing that the host doesn't have any remaining
> references.
> 
> This patch also introduces the configuration option,
> KVM_GMEM_SHARED_MEM, which toggles support for mapping
> guest_memfd shared memory at the host.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>   include/linux/kvm_host.h   |  4 ++++
>   include/linux/page-flags.h | 17 +++++++++++++++++
>   mm/debug.c                 |  1 +
>   mm/swap.c                  |  9 +++++++++
>   virt/kvm/Kconfig           |  5 +++++
>   virt/kvm/guest_memfd.c     |  7 +++++++
>   6 files changed, 43 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index f34f4cfaa513..3ad0719bfc4f 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2571,4 +2571,8 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
>   				    struct kvm_pre_fault_memory *range);
>   #endif
>   
> +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> +void kvm_gmem_handle_folio_put(struct folio *folio);
> +#endif
> +
>   #endif
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 6dc2494bd002..734afda268ab 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -933,6 +933,17 @@ enum pagetype {
>   	PGTY_slab	= 0xf5,
>   	PGTY_zsmalloc	= 0xf6,
>   	PGTY_unaccepted	= 0xf7,
> +	/*
> +	 * guestmem folios are used to back VM memory as managed by guest_memfd.
> +	 * Once the last reference is put, instead of freeing these folios back
> +	 * to the page allocator, they are returned to guest_memfd.
> +	 *
> +	 * For now, guestmem will only be set on these folios as long as they
> +	 * cannot be mapped to user space ("private state"), with the plan of
> +	 * always setting that type once typed folios can be mapped to user
> +	 * space cleanly.
> +	 */

Same comment as to v3 regarding moving the comment.

kvm_gmem_handle_folio_put() might be fixed by making it an inline
function for the time being, as discussed.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb




* Re: [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-02-18 17:24 ` [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages Fuad Tabba
@ 2025-02-20 11:58   ` David Hildenbrand
  2025-02-20 12:04     ` Fuad Tabba
  0 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2025-02-20 11:58 UTC (permalink / raw)
  To: Fuad Tabba, kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, isaku.yamahata, mic,
	vbabka, vannapurve, ackerleytng, mail, michael.roth, wei.w.wang,
	liam.merwick, isaku.yamahata, kirill.shutemov, suzuki.poulose,
	steven.price, quic_eberman, quic_mnalajal, quic_tsoni,
	quic_svaddagi, quic_cvanscha, quic_pderrin, quic_pheragu,
	catalin.marinas, james.morse, yuzenghui, oliver.upton, maz, will,
	qperret, keirf, roypat, shuah, hch, jgg, rientjes, jhubbard,
	fvdl, hughd, jthoughton

On 18.02.25 18:24, Fuad Tabba wrote:
> Add support for mmap() and fault() for guest_memfd backed memory
> in the host for VMs that support in-place conversion between
> shared and private. To that end, this patch adds the ability to
> check whether the VM type supports in-place conversion, and only
> allows mapping its memory if that's the case.
> 
> This behavior is also gated by the configuration option
> KVM_GMEM_SHARED_MEM.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>   include/linux/kvm_host.h |  11 +++++
>   virt/kvm/guest_memfd.c   | 103 +++++++++++++++++++++++++++++++++++++++
>   2 files changed, 114 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 3ad0719bfc4f..f9e8b10a4b09 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -728,6 +728,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
>   }
>   #endif
>   
> +/*
> + * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
> + * private memory is enabled and it supports in-place shared/private conversion.
> + */
> +#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
> +static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
> +{
> +	return false;
> +}
> +#endif
> +
>   #ifndef kvm_arch_has_readonly_mem
>   static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
>   {
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index c6f6792bec2a..30b47ff0e6d2 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -317,9 +317,112 @@ void kvm_gmem_handle_folio_put(struct folio *folio)
>   {
>   	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
>   }
> +
> +static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
> +{
> +	struct kvm_gmem *gmem = file->private_data;
> +
> +	/* For now, VMs that support shared memory share all their memory. */
> +	return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
> +}
> +
> +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
> +{
> +	struct inode *inode = file_inode(vmf->vma->vm_file);
> +	struct folio *folio;
> +	vm_fault_t ret = VM_FAULT_LOCKED;
> +
> +	filemap_invalidate_lock_shared(inode->i_mapping);
> +
> +	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> +	if (IS_ERR(folio)) {
> +		switch (PTR_ERR(folio)) {
> +		case -EAGAIN:
> +			ret = VM_FAULT_RETRY;
> +			break;
> +		case -ENOMEM:
> +			ret = VM_FAULT_OOM;
> +			break;
> +		default:
> +			ret = VM_FAULT_SIGBUS;
> +			break;
> +		}
> +		goto out_filemap;
> +	}
> +
> +	if (folio_test_hwpoison(folio)) {
> +		ret = VM_FAULT_HWPOISON;
> +		goto out_folio;
> +	}
> +
> +	/* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
> +	if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
> +		ret = VM_FAULT_SIGBUS;
> +		goto out_folio;
> +	}
> +
> +	/*
> +	 * Only private folios are marked as "guestmem" so far, and we never
> +	 * expect private folios at this point.
> +	 */
> +	if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
> +		ret = VM_FAULT_SIGBUS;
> +		goto out_folio;
> +	}
> +
> +	/* No support for huge pages. */
> +	if (WARN_ON_ONCE(folio_test_large(folio))) {
> +		ret = VM_FAULT_SIGBUS;
> +		goto out_folio;
> +	}
> +
> +	if (!folio_test_uptodate(folio)) {
> +		clear_highpage(folio_page(folio, 0));
> +		kvm_gmem_mark_prepared(folio);
> +	}

kvm_gmem_get_pfn()->__kvm_gmem_get_pfn() seems to call 
kvm_gmem_prepare_folio() instead.

Could we do the same here?

-- 
Cheers,

David / dhildenb




* Re: [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-02-20 11:58   ` David Hildenbrand
@ 2025-02-20 12:04     ` Fuad Tabba
  2025-02-20 15:45       ` Fuad Tabba
  0 siblings, 1 reply; 24+ messages in thread
From: Fuad Tabba @ 2025-02-20 12:04 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On Thu, 20 Feb 2025 at 11:58, David Hildenbrand <david@redhat.com> wrote:
>
> On 18.02.25 18:24, Fuad Tabba wrote:
> > Add support for mmap() and fault() for guest_memfd backed memory
> > in the host for VMs that support in-place conversion between
> > shared and private. To that end, this patch adds the ability to
> > check whether the VM type supports in-place conversion, and only
> > allows mapping its memory if that's the case.
> >
> > This behavior is also gated by the configuration option
> > KVM_GMEM_SHARED_MEM.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >   include/linux/kvm_host.h |  11 +++++
> >   virt/kvm/guest_memfd.c   | 103 +++++++++++++++++++++++++++++++++++++++
> >   2 files changed, 114 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 3ad0719bfc4f..f9e8b10a4b09 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -728,6 +728,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
> >   }
> >   #endif
> >
> > +/*
> > + * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
> > + * private memory is enabled and it supports in-place shared/private conversion.
> > + */
> > +#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
> > +static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
> > +{
> > +     return false;
> > +}
> > +#endif
> > +
> >   #ifndef kvm_arch_has_readonly_mem
> >   static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
> >   {
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index c6f6792bec2a..30b47ff0e6d2 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -317,9 +317,112 @@ void kvm_gmem_handle_folio_put(struct folio *folio)
> >   {
> >       WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
> >   }
> > +
> > +static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
> > +{
> > +     struct kvm_gmem *gmem = file->private_data;
> > +
> > +     /* For now, VMs that support shared memory share all their memory. */
> > +     return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
> > +}
> > +
> > +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
> > +{
> > +     struct inode *inode = file_inode(vmf->vma->vm_file);
> > +     struct folio *folio;
> > +     vm_fault_t ret = VM_FAULT_LOCKED;
> > +
> > +     filemap_invalidate_lock_shared(inode->i_mapping);
> > +
> > +     folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> > +     if (IS_ERR(folio)) {
> > +             switch (PTR_ERR(folio)) {
> > +             case -EAGAIN:
> > +                     ret = VM_FAULT_RETRY;
> > +                     break;
> > +             case -ENOMEM:
> > +                     ret = VM_FAULT_OOM;
> > +                     break;
> > +             default:
> > +                     ret = VM_FAULT_SIGBUS;
> > +                     break;
> > +             }
> > +             goto out_filemap;
> > +     }
> > +
> > +     if (folio_test_hwpoison(folio)) {
> > +             ret = VM_FAULT_HWPOISON;
> > +             goto out_folio;
> > +     }
> > +
> > +     /* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
> > +     if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
> > +             ret = VM_FAULT_SIGBUS;
> > +             goto out_folio;
> > +     }
> > +
> > +     /*
> > +      * Only private folios are marked as "guestmem" so far, and we never
> > +      * expect private folios at this point.
> > +      */
> > +     if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
> > +             ret = VM_FAULT_SIGBUS;
> > +             goto out_folio;
> > +     }
> > +
> > +     /* No support for huge pages. */
> > +     if (WARN_ON_ONCE(folio_test_large(folio))) {
> > +             ret = VM_FAULT_SIGBUS;
> > +             goto out_folio;
> > +     }
> > +
> > +     if (!folio_test_uptodate(folio)) {
> > +             clear_highpage(folio_page(folio, 0));
> > +             kvm_gmem_mark_prepared(folio);
> > +     }
>
> kvm_gmem_get_pfn()->__kvm_gmem_get_pfn() seems to call
> kvm_gmem_prepare_folio() instead.
>
> Could we do the same here?

Will do.

Thanks,
/fuad

> --
> Cheers,
>
> David / dhildenb
>



* Re: [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-02-20 12:04     ` Fuad Tabba
@ 2025-02-20 15:45       ` Fuad Tabba
  2025-02-20 15:58         ` David Hildenbrand
  0 siblings, 1 reply; 24+ messages in thread
From: Fuad Tabba @ 2025-02-20 15:45 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

Hi David,

On Thu, 20 Feb 2025 at 12:04, Fuad Tabba <tabba@google.com> wrote:
>
> On Thu, 20 Feb 2025 at 11:58, David Hildenbrand <david@redhat.com> wrote:
> >
> > On 18.02.25 18:24, Fuad Tabba wrote:
> > > Add support for mmap() and fault() for guest_memfd backed memory
> > > in the host for VMs that support in-place conversion between
> > > shared and private. To that end, this patch adds the ability to
> > > check whether the VM type supports in-place conversion, and only
> > > allows mapping its memory if that's the case.
> > >
> > > This behavior is also gated by the configuration option
> > > KVM_GMEM_SHARED_MEM.
> > >
> > > Signed-off-by: Fuad Tabba <tabba@google.com>
> > > ---
> > >   include/linux/kvm_host.h |  11 +++++
> > >   virt/kvm/guest_memfd.c   | 103 +++++++++++++++++++++++++++++++++++++++
> > >   2 files changed, 114 insertions(+)
> > >
> > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > index 3ad0719bfc4f..f9e8b10a4b09 100644
> > > --- a/include/linux/kvm_host.h
> > > +++ b/include/linux/kvm_host.h
> > > @@ -728,6 +728,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
> > >   }
> > >   #endif
> > >
> > > +/*
> > > + * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
> > > + * private memory is enabled and it supports in-place shared/private conversion.
> > > + */
> > > +#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
> > > +static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
> > > +{
> > > +     return false;
> > > +}
> > > +#endif
> > > +
> > >   #ifndef kvm_arch_has_readonly_mem
> > >   static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
> > >   {
> > > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > > index c6f6792bec2a..30b47ff0e6d2 100644
> > > --- a/virt/kvm/guest_memfd.c
> > > +++ b/virt/kvm/guest_memfd.c
> > > @@ -317,9 +317,112 @@ void kvm_gmem_handle_folio_put(struct folio *folio)
> > >   {
> > >       WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
> > >   }
> > > +
> > > +static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
> > > +{
> > > +     struct kvm_gmem *gmem = file->private_data;
> > > +
> > > +     /* For now, VMs that support shared memory share all their memory. */
> > > +     return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
> > > +}
> > > +
> > > +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
> > > +{
> > > +     struct inode *inode = file_inode(vmf->vma->vm_file);
> > > +     struct folio *folio;
> > > +     vm_fault_t ret = VM_FAULT_LOCKED;
> > > +
> > > +     filemap_invalidate_lock_shared(inode->i_mapping);
> > > +
> > > +     folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> > > +     if (IS_ERR(folio)) {
> > > +             switch (PTR_ERR(folio)) {
> > > +             case -EAGAIN:
> > > +                     ret = VM_FAULT_RETRY;
> > > +                     break;
> > > +             case -ENOMEM:
> > > +                     ret = VM_FAULT_OOM;
> > > +                     break;
> > > +             default:
> > > +                     ret = VM_FAULT_SIGBUS;
> > > +                     break;
> > > +             }
> > > +             goto out_filemap;
> > > +     }
> > > +
> > > +     if (folio_test_hwpoison(folio)) {
> > > +             ret = VM_FAULT_HWPOISON;
> > > +             goto out_folio;
> > > +     }
> > > +
> > > +     /* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
> > > +     if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
> > > +             ret = VM_FAULT_SIGBUS;
> > > +             goto out_folio;
> > > +     }
> > > +
> > > +     /*
> > > +      * Only private folios are marked as "guestmem" so far, and we never
> > > +      * expect private folios at this point.
> > > +      */
> > > +     if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
> > > +             ret = VM_FAULT_SIGBUS;
> > > +             goto out_folio;
> > > +     }
> > > +
> > > +     /* No support for huge pages. */
> > > +     if (WARN_ON_ONCE(folio_test_large(folio))) {
> > > +             ret = VM_FAULT_SIGBUS;
> > > +             goto out_folio;
> > > +     }
> > > +
> > > +     if (!folio_test_uptodate(folio)) {
> > > +             clear_highpage(folio_page(folio, 0));
> > > +             kvm_gmem_mark_prepared(folio);
> > > +     }
> >
> > kvm_gmem_get_pfn()->__kvm_gmem_get_pfn() seems to call
> > kvm_gmem_prepare_folio() instead.
> >
> > Could we do the same here?
>
> Will do.

I realized it's not that straightforward. __kvm_gmem_prepare_folio()
requires the kvm_memory_slot, which is used to calculate the gfn. At
that point we have neither the slot nor the gfn; it's not just an issue
of access, since there might not be a slot associated with that offset yet.

Cheers,
/fuad
>
> Thanks,
> /fuad
>
> > --
> > Cheers,
> >
> > David / dhildenb
> >



* Re: [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-02-20 15:45       ` Fuad Tabba
@ 2025-02-20 15:58         ` David Hildenbrand
  2025-02-20 17:10           ` Fuad Tabba
  0 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2025-02-20 15:58 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On 20.02.25 16:45, Fuad Tabba wrote:
> Hi David,
> 
> On Thu, 20 Feb 2025 at 12:04, Fuad Tabba <tabba@google.com> wrote:
>>
>> On Thu, 20 Feb 2025 at 11:58, David Hildenbrand <david@redhat.com> wrote:
>>>
>>> On 18.02.25 18:24, Fuad Tabba wrote:
>>>> Add support for mmap() and fault() for guest_memfd backed memory
>>>> in the host for VMs that support in-place conversion between
>>>> shared and private. To that end, this patch adds the ability to
>>>> check whether the VM type supports in-place conversion, and only
>>>> allows mapping its memory if that's the case.
>>>>
>>>> This behavior is also gated by the configuration option
>>>> KVM_GMEM_SHARED_MEM.
>>>>
>>>> Signed-off-by: Fuad Tabba <tabba@google.com>
>>>> ---
>>>>    include/linux/kvm_host.h |  11 +++++
>>>>    virt/kvm/guest_memfd.c   | 103 +++++++++++++++++++++++++++++++++++++++
>>>>    2 files changed, 114 insertions(+)
>>>>
>>>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>>>> index 3ad0719bfc4f..f9e8b10a4b09 100644
>>>> --- a/include/linux/kvm_host.h
>>>> +++ b/include/linux/kvm_host.h
>>>> @@ -728,6 +728,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
>>>>    }
>>>>    #endif
>>>>
>>>> +/*
>>>> + * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
>>>> + * private memory is enabled and it supports in-place shared/private conversion.
>>>> + */
>>>> +#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
>>>> +static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
>>>> +{
>>>> +     return false;
>>>> +}
>>>> +#endif
>>>> +
>>>>    #ifndef kvm_arch_has_readonly_mem
>>>>    static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
>>>>    {
>>>> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
>>>> index c6f6792bec2a..30b47ff0e6d2 100644
>>>> --- a/virt/kvm/guest_memfd.c
>>>> +++ b/virt/kvm/guest_memfd.c
>>>> @@ -317,9 +317,112 @@ void kvm_gmem_handle_folio_put(struct folio *folio)
>>>>    {
>>>>        WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
>>>>    }
>>>> +
>>>> +static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
>>>> +{
>>>> +     struct kvm_gmem *gmem = file->private_data;
>>>> +
>>>> +     /* For now, VMs that support shared memory share all their memory. */
>>>> +     return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
>>>> +}
>>>> +
>>>> +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
>>>> +{
>>>> +     struct inode *inode = file_inode(vmf->vma->vm_file);
>>>> +     struct folio *folio;
>>>> +     vm_fault_t ret = VM_FAULT_LOCKED;
>>>> +
>>>> +     filemap_invalidate_lock_shared(inode->i_mapping);
>>>> +
>>>> +     folio = kvm_gmem_get_folio(inode, vmf->pgoff);
>>>> +     if (IS_ERR(folio)) {
>>>> +             switch (PTR_ERR(folio)) {
>>>> +             case -EAGAIN:
>>>> +                     ret = VM_FAULT_RETRY;
>>>> +                     break;
>>>> +             case -ENOMEM:
>>>> +                     ret = VM_FAULT_OOM;
>>>> +                     break;
>>>> +             default:
>>>> +                     ret = VM_FAULT_SIGBUS;
>>>> +                     break;
>>>> +             }
>>>> +             goto out_filemap;
>>>> +     }
>>>> +
>>>> +     if (folio_test_hwpoison(folio)) {
>>>> +             ret = VM_FAULT_HWPOISON;
>>>> +             goto out_folio;
>>>> +     }
>>>> +
>>>> +     /* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
>>>> +     if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
>>>> +             ret = VM_FAULT_SIGBUS;
>>>> +             goto out_folio;
>>>> +     }
>>>> +
>>>> +     /*
>>>> +      * Only private folios are marked as "guestmem" so far, and we never
>>>> +      * expect private folios at this point.
>>>> +      */
>>>> +     if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
>>>> +             ret = VM_FAULT_SIGBUS;
>>>> +             goto out_folio;
>>>> +     }
>>>> +
>>>> +     /* No support for huge pages. */
>>>> +     if (WARN_ON_ONCE(folio_test_large(folio))) {
>>>> +             ret = VM_FAULT_SIGBUS;
>>>> +             goto out_folio;
>>>> +     }
>>>> +
>>>> +     if (!folio_test_uptodate(folio)) {
>>>> +             clear_highpage(folio_page(folio, 0));
>>>> +             kvm_gmem_mark_prepared(folio);
>>>> +     }
>>>
>>> kvm_gmem_get_pfn()->__kvm_gmem_get_pfn() seems to call
>>> kvm_gmem_prepare_folio() instead.
>>>
>>> Could we do the same here?
>>
>> Will do.
> 
> I realized it's not that straightforward. __kvm_gmem_prepare_folio()
> requires the kvm_memory_slot, which is used to calculate the gfn. At
> that point we have neither the slot nor the gfn; it's not just an issue
> of access, since there might not be a slot associated with that offset yet.

Hmm, right ... I wonder if that might be problematic. I assume no 
memslot == no memory attribute telling us if it is private or shared at 
least for now?

Once guest_memfd maintains that state, it might be "cleaner" ? What's 
your thought?

-- 
Cheers,

David / dhildenb




* Re: [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-02-20 15:58         ` David Hildenbrand
@ 2025-02-20 17:10           ` Fuad Tabba
  2025-02-20 17:12             ` David Hildenbrand
  0 siblings, 1 reply; 24+ messages in thread
From: Fuad Tabba @ 2025-02-20 17:10 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On Thu, 20 Feb 2025 at 15:58, David Hildenbrand <david@redhat.com> wrote:
>
> On 20.02.25 16:45, Fuad Tabba wrote:
> > Hi David,
> >
> > On Thu, 20 Feb 2025 at 12:04, Fuad Tabba <tabba@google.com> wrote:
> >>
> >> On Thu, 20 Feb 2025 at 11:58, David Hildenbrand <david@redhat.com> wrote:
> >>>
> >>> On 18.02.25 18:24, Fuad Tabba wrote:
> >>>> Add support for mmap() and fault() for guest_memfd backed memory
> >>>> in the host for VMs that support in-place conversion between
> >>>> shared and private. To that end, this patch adds the ability to
> >>>> check whether the VM type supports in-place conversion, and only
> >>>> allows mapping its memory if that's the case.
> >>>>
> >>>> This behavior is also gated by the configuration option
> >>>> KVM_GMEM_SHARED_MEM.
> >>>>
> >>>> Signed-off-by: Fuad Tabba <tabba@google.com>
> >>>> ---
> >>>>    include/linux/kvm_host.h |  11 +++++
> >>>>    virt/kvm/guest_memfd.c   | 103 +++++++++++++++++++++++++++++++++++++++
> >>>>    2 files changed, 114 insertions(+)
> >>>>
> >>>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> >>>> index 3ad0719bfc4f..f9e8b10a4b09 100644
> >>>> --- a/include/linux/kvm_host.h
> >>>> +++ b/include/linux/kvm_host.h
> >>>> @@ -728,6 +728,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
> >>>>    }
> >>>>    #endif
> >>>>
> >>>> +/*
> >>>> + * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
> >>>> + * private memory is enabled and it supports in-place shared/private conversion.
> >>>> + */
> >>>> +#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
> >>>> +static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
> >>>> +{
> >>>> +     return false;
> >>>> +}
> >>>> +#endif
> >>>> +
> >>>>    #ifndef kvm_arch_has_readonly_mem
> >>>>    static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
> >>>>    {
> >>>> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> >>>> index c6f6792bec2a..30b47ff0e6d2 100644
> >>>> --- a/virt/kvm/guest_memfd.c
> >>>> +++ b/virt/kvm/guest_memfd.c
> >>>> @@ -317,9 +317,112 @@ void kvm_gmem_handle_folio_put(struct folio *folio)
> >>>>    {
> >>>>        WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
> >>>>    }
> >>>> +
> >>>> +static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
> >>>> +{
> >>>> +     struct kvm_gmem *gmem = file->private_data;
> >>>> +
> >>>> +     /* For now, VMs that support shared memory share all their memory. */
> >>>> +     return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
> >>>> +}
> >>>> +
> >>>> +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
> >>>> +{
> >>>> +     struct inode *inode = file_inode(vmf->vma->vm_file);
> >>>> +     struct folio *folio;
> >>>> +     vm_fault_t ret = VM_FAULT_LOCKED;
> >>>> +
> >>>> +     filemap_invalidate_lock_shared(inode->i_mapping);
> >>>> +
> >>>> +     folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> >>>> +     if (IS_ERR(folio)) {
> >>>> +             switch (PTR_ERR(folio)) {
> >>>> +             case -EAGAIN:
> >>>> +                     ret = VM_FAULT_RETRY;
> >>>> +                     break;
> >>>> +             case -ENOMEM:
> >>>> +                     ret = VM_FAULT_OOM;
> >>>> +                     break;
> >>>> +             default:
> >>>> +                     ret = VM_FAULT_SIGBUS;
> >>>> +                     break;
> >>>> +             }
> >>>> +             goto out_filemap;
> >>>> +     }
> >>>> +
> >>>> +     if (folio_test_hwpoison(folio)) {
> >>>> +             ret = VM_FAULT_HWPOISON;
> >>>> +             goto out_folio;
> >>>> +     }
> >>>> +
> >>>> +     /* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
> >>>> +     if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
> >>>> +             ret = VM_FAULT_SIGBUS;
> >>>> +             goto out_folio;
> >>>> +     }
> >>>> +
> >>>> +     /*
> >>>> +      * Only private folios are marked as "guestmem" so far, and we never
> >>>> +      * expect private folios at this point.
> >>>> +      */
> >>>> +     if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
> >>>> +             ret = VM_FAULT_SIGBUS;
> >>>> +             goto out_folio;
> >>>> +     }
> >>>> +
> >>>> +     /* No support for huge pages. */
> >>>> +     if (WARN_ON_ONCE(folio_test_large(folio))) {
> >>>> +             ret = VM_FAULT_SIGBUS;
> >>>> +             goto out_folio;
> >>>> +     }
> >>>> +
> >>>> +     if (!folio_test_uptodate(folio)) {
> >>>> +             clear_highpage(folio_page(folio, 0));
> >>>> +             kvm_gmem_mark_prepared(folio);
> >>>> +     }
> >>>
> >>> kvm_gmem_get_pfn()->__kvm_gmem_get_pfn() seems to call
> >>> kvm_gmem_prepare_folio() instead.
> >>>
> >>> Could we do the same here?
> >>
> >> Will do.
> >
> > I realized it's not that straightforward. __kvm_gmem_prepare_folio()
> > requires the kvm_memory_slot, which is used to calculate the gfn. At
> > that point we have neither the slot nor the gfn; it's not just an issue
> > of access, since there might not be a slot associated with that offset yet.
>
> Hmm, right ... I wonder if that might be problematic. I assume no
> memslot == no memory attribute telling us if it is private or shared at
> least for now?
>
> Once guest_memfd maintains that state, it might be "cleaner" ? What's
> your thought?

The idea is that this doesn't determine whether it's shared or private
by the guest_memfd's attributes, but by the new state added in the
other patch series. That's independent of memslots and guest addresses
altogether.

One scenario you can imagine is the host wanting to fault in memory to
initialize it before associating it with a memslot. I guess we could
make it a requirement that you cannot fault-in pages unless they are
associated with a memslot, but that might be too restrictive.

Cheers,
/fuad



> --
> Cheers,
>
> David / dhildenb
>



* Re: [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-02-20 17:10           ` Fuad Tabba
@ 2025-02-20 17:12             ` David Hildenbrand
  0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2025-02-20 17:12 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On 20.02.25 18:10, Fuad Tabba wrote:
> On Thu, 20 Feb 2025 at 15:58, David Hildenbrand <david@redhat.com> wrote:
>>
>> On 20.02.25 16:45, Fuad Tabba wrote:
>>> Hi David,
>>>
>>> On Thu, 20 Feb 2025 at 12:04, Fuad Tabba <tabba@google.com> wrote:
>>>>
>>>> On Thu, 20 Feb 2025 at 11:58, David Hildenbrand <david@redhat.com> wrote:
>>>>>
>>>>> On 18.02.25 18:24, Fuad Tabba wrote:
>>>>>> Add support for mmap() and fault() for guest_memfd backed memory
>>>>>> in the host for VMs that support in-place conversion between
>>>>>> shared and private. To that end, this patch adds the ability to
>>>>>> check whether the VM type supports in-place conversion, and only
>>>>>> allows mapping its memory if that's the case.
>>>>>>
>>>>>> This behavior is also gated by the configuration option
>>>>>> KVM_GMEM_SHARED_MEM.
>>>>>>
>>>>>> Signed-off-by: Fuad Tabba <tabba@google.com>
>>>>>> ---
>>>>>>     include/linux/kvm_host.h |  11 +++++
>>>>>>     virt/kvm/guest_memfd.c   | 103 +++++++++++++++++++++++++++++++++++++++
>>>>>>     2 files changed, 114 insertions(+)
>>>>>>
>>>>>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>>>>>> index 3ad0719bfc4f..f9e8b10a4b09 100644
>>>>>> --- a/include/linux/kvm_host.h
>>>>>> +++ b/include/linux/kvm_host.h
>>>>>> @@ -728,6 +728,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
>>>>>>     }
>>>>>>     #endif
>>>>>>
>>>>>> +/*
>>>>>> + * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
>>>>>> + * private memory is enabled and it supports in-place shared/private conversion.
>>>>>> + */
>>>>>> +#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
>>>>>> +static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
>>>>>> +{
>>>>>> +     return false;
>>>>>> +}
>>>>>> +#endif
>>>>>> +
>>>>>>     #ifndef kvm_arch_has_readonly_mem
>>>>>>     static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
>>>>>>     {
>>>>>> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
>>>>>> index c6f6792bec2a..30b47ff0e6d2 100644
>>>>>> --- a/virt/kvm/guest_memfd.c
>>>>>> +++ b/virt/kvm/guest_memfd.c
>>>>>> @@ -317,9 +317,112 @@ void kvm_gmem_handle_folio_put(struct folio *folio)
>>>>>>     {
>>>>>>         WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
>>>>>>     }
>>>>>> +
>>>>>> +static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
>>>>>> +{
>>>>>> +     struct kvm_gmem *gmem = file->private_data;
>>>>>> +
>>>>>> +     /* For now, VMs that support shared memory share all their memory. */
>>>>>> +     return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
>>>>>> +}
>>>>>> +
>>>>>> +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
>>>>>> +{
>>>>>> +     struct inode *inode = file_inode(vmf->vma->vm_file);
>>>>>> +     struct folio *folio;
>>>>>> +     vm_fault_t ret = VM_FAULT_LOCKED;
>>>>>> +
>>>>>> +     filemap_invalidate_lock_shared(inode->i_mapping);
>>>>>> +
>>>>>> +     folio = kvm_gmem_get_folio(inode, vmf->pgoff);
>>>>>> +     if (IS_ERR(folio)) {
>>>>>> +             switch (PTR_ERR(folio)) {
>>>>>> +             case -EAGAIN:
>>>>>> +                     ret = VM_FAULT_RETRY;
>>>>>> +                     break;
>>>>>> +             case -ENOMEM:
>>>>>> +                     ret = VM_FAULT_OOM;
>>>>>> +                     break;
>>>>>> +             default:
>>>>>> +                     ret = VM_FAULT_SIGBUS;
>>>>>> +                     break;
>>>>>> +             }
>>>>>> +             goto out_filemap;
>>>>>> +     }
>>>>>> +
>>>>>> +     if (folio_test_hwpoison(folio)) {
>>>>>> +             ret = VM_FAULT_HWPOISON;
>>>>>> +             goto out_folio;
>>>>>> +     }
>>>>>> +
>>>>>> +     /* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
>>>>>> +     if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
>>>>>> +             ret = VM_FAULT_SIGBUS;
>>>>>> +             goto out_folio;
>>>>>> +     }
>>>>>> +
>>>>>> +     /*
>>>>>> +      * Only private folios are marked as "guestmem" so far, and we never
>>>>>> +      * expect private folios at this point.
>>>>>> +      */
>>>>>> +     if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
>>>>>> +             ret = VM_FAULT_SIGBUS;
>>>>>> +             goto out_folio;
>>>>>> +     }
>>>>>> +
>>>>>> +     /* No support for huge pages. */
>>>>>> +     if (WARN_ON_ONCE(folio_test_large(folio))) {
>>>>>> +             ret = VM_FAULT_SIGBUS;
>>>>>> +             goto out_folio;
>>>>>> +     }
>>>>>> +
>>>>>> +     if (!folio_test_uptodate(folio)) {
>>>>>> +             clear_highpage(folio_page(folio, 0));
>>>>>> +             kvm_gmem_mark_prepared(folio);
>>>>>> +     }
>>>>>
>>>>> kvm_gmem_get_pfn()->__kvm_gmem_get_pfn() seems to call
>>>>> kvm_gmem_prepare_folio() instead.
>>>>>
>>>>> Could we do the same here?
>>>>
>>>> Will do.
>>>
>>> I realized it's not that straightforward. __kvm_gmem_prepare_folio()
>>> requires the kvm_memory_slot, which is used to calculate the gfn. At
>>> that point we have neither the slot nor the gfn; it's not just an issue
>>> of access, since there might not be a slot associated with that offset yet.
>>
>> Hmm, right ... I wonder if that might be problematic. I assume no
>> memslot == no memory attribute telling us if it is private or shared at
>> least for now?
>>
>> Once guest_memfd maintains that state, it might be "cleaner" ? What's
>> your thought?
> 
> The idea is that this doesn't determine whether it's shared or private
> by the guest_memfd's attributes, but by the new state added in the
> other patch series. That's independent of memslots and guest addresses
> altogether.
> 
> One scenario you can imagine is the host wanting to fault in memory to
> initialize it before associating it with a memslot. I guess we could
> make it a requirement that you cannot fault-in pages unless they are
> associated with a memslot, but that might be too restrictive.

Okay, just what I thought, thanks!

-- 
Cheers,

David / dhildenb




* Re: [PATCH v4 04/10] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  2025-02-18 17:24 ` [PATCH v4 04/10] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared Fuad Tabba
@ 2025-02-28 16:23   ` Peter Xu
  2025-02-28 17:22     ` Fuad Tabba
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Xu @ 2025-02-28 16:23 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On Tue, Feb 18, 2025 at 05:24:54PM +0000, Fuad Tabba wrote:
> Add the KVM capability KVM_CAP_GMEM_SHARED_MEM, which indicates
> that the VM supports shared memory in guest_memfd, or that the
> host can create VMs that support shared memory. Supporting shared
> memory implies that memory can be mapped when shared with the
> host.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  include/uapi/linux/kvm.h | 1 +
>  virt/kvm/kvm_main.c      | 4 ++++
>  2 files changed, 5 insertions(+)
> 
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 45e6d8fca9b9..117937a895da 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -929,6 +929,7 @@ struct kvm_enable_cap {
>  #define KVM_CAP_PRE_FAULT_MEMORY 236
>  #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
>  #define KVM_CAP_X86_GUEST_MODE 238
> +#define KVM_CAP_GMEM_SHARED_MEM 239

I think SHARED_MEM is ok.  That said, to me the use case in this series is
more about "in-place" rather than "shared".

In comparison, what I'm recently looking at is a "more" shared mode of
guest-memfd where it works almost like memfd.  So all pages will be shared
there.

That helps me e.g. for the N:1 kvm binding issue I mentioned in another
email (in one of my replies in the previous version), in which case I want to
enable gmemfd folios to be mapped more than once in a process.

That'll work there as long as it's fully shared, because all things can be
registered in the old VA way, so there's no need for the N:1 restriction.
IOW, gmemfd will still rely on mmu notifiers for teardowns, and the
gmem->bindings will always be empty.

So if this one would be called "in-place", then I'll have my use case as
"shared".

I don't want to add any burden to your series; I think I can still make
that one "shared-full".  So it's more of a pure comment just in case you
also think "in-place" suits more, or any name you think can identify
"in-place conversions" use case and "complete sharable" use cases.

Please also feel free to copy me for newer posts.  I'd be more than happy
to know when gmemfd will have a basic fault() function.

Thanks,

-- 
Peter Xu




* Re: [PATCH v4 04/10] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  2025-02-28 16:23   ` Peter Xu
@ 2025-02-28 17:22     ` Fuad Tabba
  2025-02-28 17:33       ` David Hildenbrand
  0 siblings, 1 reply; 24+ messages in thread
From: Fuad Tabba @ 2025-02-28 17:22 UTC (permalink / raw)
  To: Peter Xu
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

Hi Peter,

On Fri, 28 Feb 2025 at 08:24, Peter Xu <peterx@redhat.com> wrote:
>
> On Tue, Feb 18, 2025 at 05:24:54PM +0000, Fuad Tabba wrote:
> > Add the KVM capability KVM_CAP_GMEM_SHARED_MEM, which indicates
> > that the VM supports shared memory in guest_memfd, or that the
> > host can create VMs that support shared memory. Supporting shared
> > memory implies that memory can be mapped when shared with the
> > host.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  include/uapi/linux/kvm.h | 1 +
> >  virt/kvm/kvm_main.c      | 4 ++++
> >  2 files changed, 5 insertions(+)
> >
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index 45e6d8fca9b9..117937a895da 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -929,6 +929,7 @@ struct kvm_enable_cap {
> >  #define KVM_CAP_PRE_FAULT_MEMORY 236
> >  #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
> >  #define KVM_CAP_X86_GUEST_MODE 238
> > +#define KVM_CAP_GMEM_SHARED_MEM 239
>
> I think SHARED_MEM is ok.  That said, to me the use case in this series is
> more about "in-place" rather than "shared".
>
> In comparison, what I'm recently looking at is a "more" shared mode of
> guest-memfd where it works almost like memfd.  So all pages will be shared
> there.
>
> That helps me e.g. for the N:1 kvm binding issue I mentioned in another
> email (in one of my replies in the previous version), in which case I want to
> enable gmemfd folios to be mapped more than once in a process.
>
> That'll work there as long as it's fully shared, because all things can be
> registered in the old VA way, so there's no need for the N:1 restriction.
> IOW, gmemfd will still rely on mmu notifiers for teardowns, and the
> gmem->bindings will always be empty.
>
> So if this one would be called "in-place", then I'll have my use case as
> "shared".

I understand what you mean. The naming here is to be consistent with
the rest of the series. I don't really have a strong opinion. It means
SHARED_IN_PLACE, but then that would be a mouthful. :)

> I don't want to add any burden to your series; I think I can still make
> that one "shared-full".  So it's more of a pure comment just in case you
> also think "in-place" suits more, or any name you think can identify
> "in-place conversions" use case and "complete sharable" use cases.
>
> Please also feel free to copy me for newer posts.  I'd be more than happy
> to know when gmemfd will have a basic fault() function.

I definitely will. Thanks for your comments.

Cheers,
/fuad

> Thanks,
>
> --
> Peter Xu
>



* Re: [PATCH v4 04/10] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  2025-02-28 17:22     ` Fuad Tabba
@ 2025-02-28 17:33       ` David Hildenbrand
  2025-03-06 15:48         ` Ackerley Tng
  0 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2025-02-28 17:33 UTC (permalink / raw)
  To: Fuad Tabba, Peter Xu
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On 28.02.25 18:22, Fuad Tabba wrote:
> Hi Peter,
> 
> On Fri, 28 Feb 2025 at 08:24, Peter Xu <peterx@redhat.com> wrote:
>>
>> On Tue, Feb 18, 2025 at 05:24:54PM +0000, Fuad Tabba wrote:
>>> Add the KVM capability KVM_CAP_GMEM_SHARED_MEM, which indicates
>>> that the VM supports shared memory in guest_memfd, or that the
>>> host can create VMs that support shared memory. Supporting shared
>>> memory implies that memory can be mapped when shared with the
>>> host.
>>>
>>> Signed-off-by: Fuad Tabba <tabba@google.com>
>>> ---
>>>   include/uapi/linux/kvm.h | 1 +
>>>   virt/kvm/kvm_main.c      | 4 ++++
>>>   2 files changed, 5 insertions(+)
>>>
>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>> index 45e6d8fca9b9..117937a895da 100644
>>> --- a/include/uapi/linux/kvm.h
>>> +++ b/include/uapi/linux/kvm.h
>>> @@ -929,6 +929,7 @@ struct kvm_enable_cap {
>>>   #define KVM_CAP_PRE_FAULT_MEMORY 236
>>>   #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
>>>   #define KVM_CAP_X86_GUEST_MODE 238
>>> +#define KVM_CAP_GMEM_SHARED_MEM 239
>>
>> I think SHARED_MEM is ok.  That said, to me the use case in this series is
>> more about "in-place" rather than "shared".
>>
>> In comparison, what I'm recently looking at is a "more" shared mode of
>> guest-memfd where it works almost like memfd.  So all pages will be shared
>> there.
>>
>> That helps me e.g. for the N:1 kvm binding issue I mentioned in another
>> email (in one of my replies in the previous version), in which case I want to
>> enable gmemfd folios to be mapped more than once in a process.
>>
>> That'll work there as long as it's fully shared, because all things can be
>> registered in the old VA way, then there's no need to have N:1 restriction.
>> IOW, gmemfd will still rely on mmu notifier for teardowns, and the
>> gmem->bindings will always be empty.
>>
>> So if this one would be called "in-place", then I'll have my use case as
>> "shared".
> 
> I understand what you mean. The naming here is to be consistent with
> the rest of the series. I don't really have a strong opinion. It means
> SHARED_IN_PLACE, but then that would be a mouthful. :)

I'll note that Patrick is also driving it in "all shared" mode for his 
direct-map removal series IIRC.

So we would have

a) All private
b) Mixing of private and shared (incl conversion)
c) All shared

"IN_PLACE" might be the wrong angle to look at it.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 04/10] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  2025-02-28 17:33       ` David Hildenbrand
@ 2025-03-06 15:48         ` Ackerley Tng
  2025-03-06 15:57           ` David Hildenbrand
  0 siblings, 1 reply; 24+ messages in thread
From: Ackerley Tng @ 2025-03-06 15:48 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: tabba, peterx, kvm, linux-arm-msm, linux-mm, pbonzini,
	chenhuacai, mpe, anup, paul.walmsley, palmer, aou, seanjc, viro,
	brauner, willy, akpm, xiaoyao.li, yilun.xu, chao.p.peng, jarkko,
	amoorthy, dmatlack, isaku.yamahata, mic, vbabka, vannapurve,
	mail, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

David Hildenbrand <david@redhat.com> writes:

> On 28.02.25 18:22, Fuad Tabba wrote:
>> Hi Peter,
>> 
>> On Fri, 28 Feb 2025 at 08:24, Peter Xu <peterx@redhat.com> wrote:
>>>
>>> On Tue, Feb 18, 2025 at 05:24:54PM +0000, Fuad Tabba wrote:
>>>> Add the KVM capability KVM_CAP_GMEM_SHARED_MEM, which indicates
>>>> that the VM supports shared memory in guest_memfd, or that the
>>>> host can create VMs that support shared memory. Supporting shared
>>>> memory implies that memory can be mapped when shared with the
>>>> host.
>>>>
>>>> Signed-off-by: Fuad Tabba <tabba@google.com>
>>>> ---
>>>>   include/uapi/linux/kvm.h | 1 +
>>>>   virt/kvm/kvm_main.c      | 4 ++++
>>>>   2 files changed, 5 insertions(+)
>>>>
>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>> index 45e6d8fca9b9..117937a895da 100644
>>>> --- a/include/uapi/linux/kvm.h
>>>> +++ b/include/uapi/linux/kvm.h
>>>> @@ -929,6 +929,7 @@ struct kvm_enable_cap {
>>>>   #define KVM_CAP_PRE_FAULT_MEMORY 236
>>>>   #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
>>>>   #define KVM_CAP_X86_GUEST_MODE 238
>>>> +#define KVM_CAP_GMEM_SHARED_MEM 239
>>>
>>> I think SHARED_MEM is ok.  That said, to me the use case in this series is
>>> more about "in-place" rather than "shared".
>>>
>>> In comparison, what I'm recently looking at is a "more" shared mode of
>>> guest-memfd where it works almost like memfd.  So all pages will be shared
>>> there.
>>>
>>> That helps me e.g. for the N:1 kvm binding issue I mentioned in another
>>> email (in one of my replies in the previous version), in which case I want to
>>> enable gmemfd folios to be mapped more than once in a process.
>>>
>>> That'll work there as long as it's fully shared, because all things can be
>>> registered in the old VA way, then there's no need to have N:1 restriction.
>>> IOW, gmemfd will still rely on mmu notifier for teardowns, and the
>>> gmem->bindings will always be empty.
>>>
>>> So if this one would be called "in-place", then I'll have my use case as
>>> "shared".
>> 
>> I understand what you mean. The naming here is to be consistent with
>> the rest of the series. I don't really have a strong opinion. It means
>> SHARED_IN_PLACE, but then that would be a mouthful. :)
>
> I'll note that Patrick is also driving it in "all shared" mode for his 
> direct-map removal series IIRC.
>
> So we would have
>
> a) All private
> b) Mixing of private and shared (incl conversion)
> c) All shared
>
> "IN_PLACE" might be the wrong angle to look at it.

How about something like "supports_mmap" or "mmap_capable"?

So like

+ KVM_CAP_GMEM_MMAP
+ CONFIG_KVM_GMEM_MMAP_CAPABLE
+ kvm_arch_gmem_mmap_capable()

I'm just trying to avoid the use of shared, which could already mean 

+ shared between processes
+ shared between guest and host



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4 04/10] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  2025-03-06 15:48         ` Ackerley Tng
@ 2025-03-06 15:57           ` David Hildenbrand
  0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2025-03-06 15:57 UTC (permalink / raw)
  To: Ackerley Tng
  Cc: tabba, peterx, kvm, linux-arm-msm, linux-mm, pbonzini,
	chenhuacai, mpe, anup, paul.walmsley, palmer, aou, seanjc, viro,
	brauner, willy, akpm, xiaoyao.li, yilun.xu, chao.p.peng, jarkko,
	amoorthy, dmatlack, isaku.yamahata, mic, vbabka, vannapurve,
	mail, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On 06.03.25 16:48, Ackerley Tng wrote:
> David Hildenbrand <david@redhat.com> writes:
> 
>> On 28.02.25 18:22, Fuad Tabba wrote:
>>> Hi Peter,
>>>
>>> On Fri, 28 Feb 2025 at 08:24, Peter Xu <peterx@redhat.com> wrote:
>>>>
>>>> On Tue, Feb 18, 2025 at 05:24:54PM +0000, Fuad Tabba wrote:
>>>>> Add the KVM capability KVM_CAP_GMEM_SHARED_MEM, which indicates
>>>>> that the VM supports shared memory in guest_memfd, or that the
>>>>> host can create VMs that support shared memory. Supporting shared
>>>>> memory implies that memory can be mapped when shared with the
>>>>> host.
>>>>>
>>>>> Signed-off-by: Fuad Tabba <tabba@google.com>
>>>>> ---
>>>>>    include/uapi/linux/kvm.h | 1 +
>>>>>    virt/kvm/kvm_main.c      | 4 ++++
>>>>>    2 files changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>>>> index 45e6d8fca9b9..117937a895da 100644
>>>>> --- a/include/uapi/linux/kvm.h
>>>>> +++ b/include/uapi/linux/kvm.h
>>>>> @@ -929,6 +929,7 @@ struct kvm_enable_cap {
>>>>>    #define KVM_CAP_PRE_FAULT_MEMORY 236
>>>>>    #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
>>>>>    #define KVM_CAP_X86_GUEST_MODE 238
>>>>> +#define KVM_CAP_GMEM_SHARED_MEM 239
>>>>
>>>> I think SHARED_MEM is ok.  That said, to me the use case in this series is
>>>> more about "in-place" rather than "shared".
>>>>
>>>> In comparison, what I'm recently looking at is a "more" shared mode of
>>>> guest-memfd where it works almost like memfd.  So all pages will be shared
>>>> there.
>>>>
>>>> That helps me e.g. for the N:1 kvm binding issue I mentioned in another
>>>> email (in one of my replies in the previous version), in which case I want to
>>>> enable gmemfd folios to be mapped more than once in a process.
>>>>
>>>> That'll work there as long as it's fully shared, because all things can be
>>>> registered in the old VA way, then there's no need to have N:1 restriction.
>>>> IOW, gmemfd will still rely on mmu notifier for teardowns, and the
>>>> gmem->bindings will always be empty.
>>>>
>>>> So if this one would be called "in-place", then I'll have my use case as
>>>> "shared".
>>>
>>> I understand what you mean. The naming here is to be consistent with
>>> the rest of the series. I don't really have a strong opinion. It means
>>> SHARED_IN_PLACE, but then that would be a mouthful. :)
>>
>> I'll note that Patrick is also driving it in "all shared" mode for his
>> direct-map removal series IIRC.
>>
>> So we would have
>>
>> a) All private
>> b) Mixing of private and shared (incl conversion)
>> c) All shared
>>
>> "IN_PLACE" might be the wrong angle to look at it.
> 
> How about something like "supports_mmap" or "mmap_capable"?
> 
> So like
> 
> + KVM_CAP_GMEM_MMAP
> + CONFIG_KVM_GMEM_MMAP_CAPABLE
> + kvm_arch_gmem_mmap_capable()
> 
> I'm just trying to avoid the use of shared, which could already mean
> 
> + shared between processes
> + shared between guest and host

The reason I tried to avoid "MMAP" is that once we support read/write of 
non-private memory, the "mmap" is a bit too specific. Similarly 
"faultable". Hmmm

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2025-03-06 15:57 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-18 17:24 [PATCH v4 00/10] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
2025-02-18 17:24 ` [PATCH v4 01/10] mm: Consolidate freeing of typed folios on final folio_put() Fuad Tabba
2025-02-20 11:53   ` David Hildenbrand
2025-02-18 17:24 ` [PATCH v4 02/10] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages Fuad Tabba
2025-02-20 11:54   ` David Hildenbrand
2025-02-18 17:24 ` [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages Fuad Tabba
2025-02-20 11:58   ` David Hildenbrand
2025-02-20 12:04     ` Fuad Tabba
2025-02-20 15:45       ` Fuad Tabba
2025-02-20 15:58         ` David Hildenbrand
2025-02-20 17:10           ` Fuad Tabba
2025-02-20 17:12             ` David Hildenbrand
2025-02-18 17:24 ` [PATCH v4 04/10] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared Fuad Tabba
2025-02-28 16:23   ` Peter Xu
2025-02-28 17:22     ` Fuad Tabba
2025-02-28 17:33       ` David Hildenbrand
2025-03-06 15:48         ` Ackerley Tng
2025-03-06 15:57           ` David Hildenbrand
2025-02-18 17:24 ` [PATCH v4 05/10] KVM: guest_memfd: Handle in-place shared memory as guest_memfd backed memory Fuad Tabba
2025-02-18 17:24 ` [PATCH v4 06/10] KVM: x86: Mark KVM_X86_SW_PROTECTED_VM as supporting guest_memfd shared memory Fuad Tabba
2025-02-18 17:24 ` [PATCH v4 07/10] KVM: arm64: Refactor user_mem_abort() calculation of force_pte Fuad Tabba
2025-02-18 17:24 ` [PATCH v4 08/10] KVM: arm64: Handle guest_memfd()-backed guest page faults Fuad Tabba
2025-02-18 17:24 ` [PATCH v4 09/10] KVM: arm64: Enable mapping guest_memfd in arm64 Fuad Tabba
2025-02-18 17:25 ` [PATCH v4 10/10] KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed Fuad Tabba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox