linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs
@ 2025-01-29 17:23 Fuad Tabba
  2025-01-29 17:23 ` [RFC PATCH v2 01/11] mm: Consolidate freeing of typed folios on final folio_put() Fuad Tabba
                   ` (11 more replies)
  0 siblings, 12 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

Main changes since v1 [1]:
- Added x86 support for mapping guest_memfd at the host, enabled
 only for the KVM_X86_SW_PROTECTED_VM type.
- Require setting memslot userspace_addr for guest_memfd slots
 even if shared, and remove patches that worked around that.
- Brought in more of the infrastructure from the patch series
 that allows restricted mapping of guest_memfd backed memory.
- Renamed references to "mappable" -> "shared".
- Expanded the selftests.
- Added instructions to test on x86 and arm64 (below).
- Rebased on Linux 6.13.

The purpose of this series is to serve as a base for _restricted_
mmap() support for guest_memfd backed memory at the host [2]. It
would allow experimentation with what that support would be like
in the safe environment of the software VM types, which are meant
for testing and experimentation.

This series adds a new VM type for arm64,
KVM_VM_TYPE_ARM_SW_PROTECTED, analogous to the x86
KVM_X86_SW_PROTECTED_VM. This type is to serve as a development
and testing vehicle for Confidential (CoCo) VMs.

Similar to its x86 counterpart, SW_PROTECTED is meant only for
development and testing. It's not meant to be used for "real"
VMs, and especially not in production. The behavior and effective
ABI for software-protected VMs is unstable.

This series enables mmap() and fault() support for guest_memfd
backed memory specifically for the software-protected VM types
(in x86 and arm64), only when explicitly enabled in the config.

The series is based on Linux 6.13 and much of the code within
is a subset of the latest series I sent [2], with the addition of
the new software protected vm type.

To test this series, I've pushed a kvmtool branch with support
for guest_memfd for x86 and arm64 and the new runtime options of
--guest_memfd and --sw_protected, which marks the VM as software
protected [3]. I plan on upstreaming this branch once I've tested
it more and tidied it up a bit (or a lot).

To test this patch series on x86 (I use a standard Debian image):

Build:

- Build the kernel with the following config options enabled:
defconfigs:
	x86_64_defconfig
	kvm_guest.config
config options:
	KVM
	KVM_INTEL
	KVM_PRIVATE_MEM
	KVM_SW_PROTECTED_VM
	KVM_GMEM_SHARED_MEM

- Build the kernel kvm selftest tools/testing/selftests/kvm, you
only need guest_memfd_test, e.g.:
	make EXTRA_CFLAGS="-static -DDEBUG" -C tools/testing/selftests/kvm

- Build kvmtool [3] lkvm-static (I build it on a different machine).
	make lkvm-static

Run:
Boot your Linux image with the kernel you built above.

The selftest you can run as it is:
	./guest_memfd_test

For kvmtool, where bzImage is the same as the host's:
	./lkvm-static run -c 2 -m 512 -p "break=mount" --kernel bzImage --debug --guest_memfd --sw_protected

To test this patch series on arm64 (I use a standard Debian image):

Build:

- Build the kernel with defconfig

- Build the kernel kvm selftest tools/testing/selftests/kvm, you
only need guest_memfd_test.

- Build kvmtool [3] lkvm-static (I cross compile it on a different machine).
You are likely to need libfdt as well.

For libfdt (in the same directory as kvmtool):
	git clone git://git.kernel.org/pub/scm/utils/dtc/dtc.git
	cd dtc
	export CC=aarch64-linux-gnu-gcc
	make
	cd ..

Then for kvmtool:
	make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- LIBFDT_DIR=./dtc/libfdt/ lkvm-static

Run:
Boot your Linux image with the kernel you built above.

The selftest you can run as it is:
	./guest_memfd_test

For kvmtool, where Image is the same as the host's, and rootfs is
your rootfs image (in case kvmtool can't figure it out):
	./lkvm-static run -c 2 -m 512 -d rootfs --kernel Image --force-pci --irqchip gicv3 --debug --guest_memfd --sw_protected

You can find (potentially slightly outdated) instructions on how
to a full arm64 system stack under QEMU here [4].

Cheers,
/fuad

[1] https://lore.kernel.org/all/20250122152738.1173160-1-tabba@google.com/
[2] https://lore.kernel.org/all/20250117163001.2326672-1-tabba@google.com/
[3] https://android-kvm.googlesource.com/kvmtool/+/refs/heads/tabba/guestmem-6.13
[4] https://mirrors.edge.kernel.org/pub/linux/kernel/people/will/docs/qemu/qemu-arm64-howto.html

Fuad Tabba (11):
  mm: Consolidate freeing of typed folios on final folio_put()
  KVM: guest_memfd: Handle final folio_put() of guest_memfd pages
  KVM: guest_memfd: Allow host to map guest_memfd() pages
  KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  KVM: guest_memfd: Handle in-place shared memory as guest_memfd backed
    memory
  KVM: x86: Mark KVM_X86_SW_PROTECTED_VM as supporting guest_memfd
    shared memory
  KVM: arm64: Refactor user_mem_abort() calculation of force_pte
  KVM: arm64: Handle guest_memfd()-backed guest page faults
  KVM: arm64: Introduce KVM_VM_TYPE_ARM_SW_PROTECTED machine type
  KVM: arm64: Enable mapping guest_memfd in arm64
  KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is
    allowed

 Documentation/virt/kvm/api.rst                |  5 +
 arch/arm64/include/asm/kvm_host.h             | 10 ++
 arch/arm64/kvm/Kconfig                        |  1 +
 arch/arm64/kvm/arm.c                          |  5 +
 arch/arm64/kvm/mmu.c                          | 91 ++++++++++++-------
 arch/x86/include/asm/kvm_host.h               |  5 +
 arch/x86/kvm/Kconfig                          |  3 +-
 include/linux/kvm_host.h                      | 19 +++-
 include/linux/page-flags.h                    | 22 +++++
 include/uapi/linux/kvm.h                      |  7 ++
 mm/debug.c                                    |  1 +
 mm/swap.c                                     | 27 +++++-
 tools/testing/selftests/kvm/Makefile          |  1 +
 .../testing/selftests/kvm/guest_memfd_test.c  | 75 +++++++++++++--
 tools/testing/selftests/kvm/lib/kvm_util.c    |  3 +-
 virt/kvm/Kconfig                              |  4 +
 virt/kvm/guest_memfd.c                        | 90 ++++++++++++++++++
 virt/kvm/kvm_main.c                           |  9 +-
 18 files changed, 326 insertions(+), 52 deletions(-)


base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2 01/11] mm: Consolidate freeing of typed folios on final folio_put()
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
@ 2025-01-29 17:23 ` Fuad Tabba
  2025-01-29 17:23 ` [RFC PATCH v2 02/11] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages Fuad Tabba
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

Some folio types, such as hugetlb, handle freeing their own
folios. Moreover, guest_memfd will require being notified once a
folio's reference count reaches 0 to facilitate shared to private
folio conversion, without the folio actually being freed at that
point.

As a first step towards that, this patch consolidates freeing
folios that have a type. The first user is hugetlb folios. Later
in this patch series, guest_memfd will become the second user of
this.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/linux/page-flags.h | 15 +++++++++++++++
 mm/swap.c                  | 22 +++++++++++++++++-----
 2 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 691506bdf2c5..6615f2f59144 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -962,6 +962,21 @@ static inline bool page_has_type(const struct page *page)
 	return page_mapcount_is_type(data_race(page->page_type));
 }
 
+static inline int page_get_type(const struct page *page)
+{
+	return page->page_type >> 24;
+}
+
+static inline bool folio_has_type(const struct folio *folio)
+{
+	return page_has_type(&folio->page);
+}
+
+static inline int folio_get_type(const struct folio *folio)
+{
+	return page_get_type(&folio->page);
+}
+
 #define FOLIO_TYPE_OPS(lname, fname)					\
 static __always_inline bool folio_test_##fname(const struct folio *folio) \
 {									\
diff --git a/mm/swap.c b/mm/swap.c
index 10decd9dffa1..8a66cd9cb9da 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -94,6 +94,18 @@ static void page_cache_release(struct folio *folio)
 		unlock_page_lruvec_irqrestore(lruvec, flags);
 }
 
+static void free_typed_folio(struct folio *folio)
+{
+	switch (folio_get_type(folio)) {
+	case PGTY_hugetlb:
+		if (IS_ENABLED(CONFIG_HUGETLBFS))
+			free_huge_folio(folio);
+		return;
+	default:
+		WARN_ON_ONCE(1);
+	}
+}
+
 void __folio_put(struct folio *folio)
 {
 	if (unlikely(folio_is_zone_device(folio))) {
@@ -101,8 +113,8 @@ void __folio_put(struct folio *folio)
 		return;
 	}
 
-	if (folio_test_hugetlb(folio)) {
-		free_huge_folio(folio);
+	if (unlikely(folio_has_type(folio))) {
+		free_typed_folio(folio);
 		return;
 	}
 
@@ -934,13 +946,13 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
 		if (!folio_ref_sub_and_test(folio, nr_refs))
 			continue;
 
-		/* hugetlb has its own memcg */
-		if (folio_test_hugetlb(folio)) {
+		if (unlikely(folio_has_type(folio))) {
+			/* typed folios have their own memcg, if any */
 			if (lruvec) {
 				unlock_page_lruvec_irqrestore(lruvec, flags);
 				lruvec = NULL;
 			}
-			free_huge_folio(folio);
+			free_typed_folio(folio);
 			continue;
 		}
 		folio_unqueue_deferred_split(folio);
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2 02/11] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
  2025-01-29 17:23 ` [RFC PATCH v2 01/11] mm: Consolidate freeing of typed folios on final folio_put() Fuad Tabba
@ 2025-01-29 17:23 ` Fuad Tabba
  2025-01-30 17:16   ` David Hildenbrand
  2025-01-29 17:23 ` [RFC PATCH v2 03/11] KVM: guest_memfd: Allow host to map guest_memfd() pages Fuad Tabba
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

Before transitioning a guest_memfd folio to unshared, thereby
disallowing access by the host and allowing the hypervisor to
transition its view of the guest page as private, we need to be
sure that the host doesn't have any references to the folio.

This patch introduces a new type for guest_memfd folios, which
isn't activated in this series but is here as a placeholder and
to facilitate the code in the next patch. This will be used in
the future to register a callback that informs the guest_memfd
subsystem when the last reference is dropped, therefore knowing
that the host doesn't have any remaining references.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/linux/page-flags.h | 7 +++++++
 mm/debug.c                 | 1 +
 mm/swap.c                  | 5 +++++
 3 files changed, 13 insertions(+)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 6615f2f59144..bab3cac1f93b 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -942,6 +942,7 @@ enum pagetype {
 	PGTY_slab	= 0xf5,
 	PGTY_zsmalloc	= 0xf6,
 	PGTY_unaccepted	= 0xf7,
+	PGTY_guestmem	= 0xf8,
 
 	PGTY_mapcount_underflow = 0xff
 };
@@ -1091,6 +1092,12 @@ FOLIO_TYPE_OPS(hugetlb, hugetlb)
 FOLIO_TEST_FLAG_FALSE(hugetlb)
 #endif
 
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+FOLIO_TYPE_OPS(guestmem, guestmem)
+#else
+FOLIO_TEST_FLAG_FALSE(guestmem)
+#endif
+
 PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
 
 /*
diff --git a/mm/debug.c b/mm/debug.c
index 95b6ab809c0e..db93be385ed9 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -56,6 +56,7 @@ static const char *page_type_names[] = {
 	DEF_PAGETYPE_NAME(table),
 	DEF_PAGETYPE_NAME(buddy),
 	DEF_PAGETYPE_NAME(unaccepted),
+	DEF_PAGETYPE_NAME(guestmem),
 };
 
 static const char *page_type_name(unsigned int page_type)
diff --git a/mm/swap.c b/mm/swap.c
index 8a66cd9cb9da..73d61c7f8edd 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -37,6 +37,7 @@
 #include <linux/page_idle.h>
 #include <linux/local_lock.h>
 #include <linux/buffer_head.h>
+#include <linux/kvm_host.h>
 
 #include "internal.h"
 
@@ -101,6 +102,10 @@ static void free_typed_folio(struct folio *folio)
 		if (IS_ENABLED(CONFIG_HUGETLBFS))
 			free_huge_folio(folio);
 		return;
+	case PGTY_guestmem:
+		if (IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM))
+			WARN_ONCE(1, "A placeholder that shouldn't trigger.");
+		return;
 	default:
 		WARN_ON_ONCE(1);
 	}
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2 03/11] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
  2025-01-29 17:23 ` [RFC PATCH v2 01/11] mm: Consolidate freeing of typed folios on final folio_put() Fuad Tabba
  2025-01-29 17:23 ` [RFC PATCH v2 02/11] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages Fuad Tabba
@ 2025-01-29 17:23 ` Fuad Tabba
  2025-01-30 17:20   ` David Hildenbrand
  2025-02-07 16:45   ` Patrick Roy
  2025-01-29 17:23 ` [RFC PATCH v2 04/11] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared Fuad Tabba
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

Add support for mmap() and fault() for guest_memfd backed memory
in the host for VMs that support in-place conversion between
shared and private (shared memory). To that end, this patch adds
the ability to check whether the VM type has that support, and
only allows mapping its memory if that's the case.

Additionally, this behavior is gated with a new configuration
option, CONFIG_KVM_GMEM_SHARED_MEM.

Signed-off-by: Fuad Tabba <tabba@google.com>

---

This patch series will allow shared memory support for software
VMs in x86. It will also introduce a similar VM type for arm64
and allow shared memory support for that. In the future, pKVM
will also support shared memory.
---
 include/linux/kvm_host.h | 11 ++++++
 virt/kvm/Kconfig         |  4 +++
 virt/kvm/guest_memfd.c   | 77 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 92 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 401439bb21e3..408429f13bf4 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -717,6 +717,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
 }
 #endif
 
+/*
+ * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
+ * private memory is enabled and it supports in-place shared/private conversion.
+ */
+#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
+static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
+{
+	return false;
+}
+#endif
+
 #ifndef kvm_arch_has_readonly_mem
 static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
 {
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 54e959e7d68f..4e759e8020c5 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -124,3 +124,7 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
 config HAVE_KVM_ARCH_GMEM_INVALIDATE
        bool
        depends on KVM_PRIVATE_MEM
+
+config KVM_GMEM_SHARED_MEM
+       select KVM_PRIVATE_MEM
+       bool
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 47a9f68f7b24..86441581c9ae 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -307,7 +307,84 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
 	return gfn - slot->base_gfn + slot->gmem.pgoff;
 }
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
+{
+	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct folio *folio;
+	vm_fault_t ret = VM_FAULT_LOCKED;
+
+	filemap_invalidate_lock_shared(inode->i_mapping);
+
+	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
+	if (IS_ERR(folio)) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_filemap;
+	}
+
+	if (folio_test_hwpoison(folio)) {
+		ret = VM_FAULT_HWPOISON;
+		goto out_folio;
+	}
+
+	if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	/* No support for huge pages. */
+	if (WARN_ON_ONCE(folio_nr_pages(folio) > 1)) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	if (!folio_test_uptodate(folio)) {
+		clear_highpage(folio_page(folio, 0));
+		folio_mark_uptodate(folio);
+	}
+
+	vmf->page = folio_file_page(folio, vmf->pgoff);
+
+out_folio:
+	if (ret != VM_FAULT_LOCKED) {
+		folio_unlock(folio);
+		folio_put(folio);
+	}
+
+out_filemap:
+	filemap_invalidate_unlock_shared(inode->i_mapping);
+
+	return ret;
+}
+
+static const struct vm_operations_struct kvm_gmem_vm_ops = {
+	.fault = kvm_gmem_fault,
+};
+
+static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct kvm_gmem *gmem = file->private_data;
+
+	if (!kvm_arch_gmem_supports_shared_mem(gmem->kvm))
+		return -ENODEV;
+
+	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) !=
+	    (VM_SHARED | VM_MAYSHARE)) {
+		return -EINVAL;
+	}
+
+	file_accessed(file);
+	vm_flags_set(vma, VM_DONTDUMP);
+	vma->vm_ops = &kvm_gmem_vm_ops;
+
+	return 0;
+}
+#else
+#define kvm_gmem_mmap NULL
+#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
+
 static struct file_operations kvm_gmem_fops = {
+	.mmap		= kvm_gmem_mmap,
 	.open		= generic_file_open,
 	.release	= kvm_gmem_release,
 	.fallocate	= kvm_gmem_fallocate,
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2 04/11] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (2 preceding siblings ...)
  2025-01-29 17:23 ` [RFC PATCH v2 03/11] KVM: guest_memfd: Allow host to map guest_memfd() pages Fuad Tabba
@ 2025-01-29 17:23 ` Fuad Tabba
  2025-01-31  9:11   ` David Hildenbrand
  2025-01-29 17:23 ` [RFC PATCH v2 05/11] KVM: guest_memfd: Handle in-place shared memory as guest_memfd backed memory Fuad Tabba
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

Add the KVM capability KVM_CAP_GMEM_SHARED_MEM, which indicates
that the VM supports shared memory in guest_memfd, or that the
host can create VMs that support shared memory. Supporting shared
memory implies that memory can be mapped when shared with the
host.

For now, this checks only whether the VM type supports sharing
guest_memfd backed memory. In the future, it will be expanded to
check whether the specific memory address is shared with the
host.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/uapi/linux/kvm.h |  1 +
 virt/kvm/guest_memfd.c   | 13 +++++++++++++
 virt/kvm/kvm_main.c      |  4 ++++
 3 files changed, 18 insertions(+)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 502ea63b5d2e..3ac805c5abf1 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -933,6 +933,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_PRE_FAULT_MEMORY 236
 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
 #define KVM_CAP_X86_GUEST_MODE 238
+#define KVM_CAP_GMEM_SHARED_MEM 239
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 86441581c9ae..4e1144ed3446 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -308,6 +308,13 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
 }
 
 #ifdef CONFIG_KVM_GMEM_SHARED_MEM
+static bool kvm_gmem_is_shared(struct file *file, pgoff_t pgoff)
+{
+	struct kvm_gmem *gmem = file->private_data;
+
+	return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
+}
+
 static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
 {
 	struct inode *inode = file_inode(vmf->vma->vm_file);
@@ -327,6 +334,12 @@ static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
 		goto out_folio;
 	}
 
+	/* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
+	if (!kvm_gmem_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
 	if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
 		ret = VM_FAULT_SIGBUS;
 		goto out_folio;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index de2c11dae231..40e4ed512923 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4792,6 +4792,10 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 #ifdef CONFIG_KVM_PRIVATE_MEM
 	case KVM_CAP_GUEST_MEMFD:
 		return !kvm || kvm_arch_has_private_mem(kvm);
+#endif
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+	case KVM_CAP_GMEM_SHARED_MEM:
+		return !kvm || kvm_arch_gmem_supports_shared_mem(kvm);
 #endif
 	default:
 		break;
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2 05/11] KVM: guest_memfd: Handle in-place shared memory as guest_memfd backed memory
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (3 preceding siblings ...)
  2025-01-29 17:23 ` [RFC PATCH v2 04/11] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared Fuad Tabba
@ 2025-01-29 17:23 ` Fuad Tabba
  2025-01-29 17:23 ` [RFC PATCH v2 06/11] KVM: x86: Mark KVM_X86_SW_PROTECTED_VM as supporting guest_memfd shared memory Fuad Tabba
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

For VMs that allow sharing guest_memfd backed memory in-place,
handle that memory the same as "private" guest_memfd memory. This
means that faulting that memory in the host or in the guest will
go through the guest_memfd subsystem.

Note that the word "private" in the name of the function
kvm_mem_is_private() doesn't necessarily indicate that the memory
isn't shared, but is due to the history and evolution of
guest_memfd and the various names it has received. In effect,
this function is used to multiplex between the path of a normal
page fault and the path of a guest_memfd backed page fault.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/linux/kvm_host.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 408429f13bf4..e57cdf4e3f3f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2503,7 +2503,8 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
 #else
 static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
 {
-	return false;
+	return kvm_arch_gmem_supports_shared_mem(kvm) &&
+	       kvm_slot_can_be_private(gfn_to_memslot(kvm, gfn));
 }
 #endif /* CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES */
 
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2 06/11] KVM: x86: Mark KVM_X86_SW_PROTECTED_VM as supporting guest_memfd shared memory
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (4 preceding siblings ...)
  2025-01-29 17:23 ` [RFC PATCH v2 05/11] KVM: guest_memfd: Handle in-place shared memory as guest_memfd backed memory Fuad Tabba
@ 2025-01-29 17:23 ` Fuad Tabba
  2025-01-29 17:23 ` [RFC PATCH v2 07/11] KVM: arm64: Refactor user_mem_abort() calculation of force_pte Fuad Tabba
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

The KVM_X86_SW_PROTECTED_VM type is meant for experimentation and
does not have any underlying support for protected guests.  This
makes it a good candidate to use for testing this patch series.
Therefore, when the kconfig option is enabled, mark
KVM_X86_SW_PROTECTED_VM as supporting shared memory.

This means that this memory is considered by guest_memfd to be
shared with the host, with in-place conversion. This allows the
host to map and fault in guest_memfd memory belonging to this VM
type.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/x86/include/asm/kvm_host.h | 5 +++++
 arch/x86/kvm/Kconfig            | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e159e44a6a1b..35d5995350da 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2202,8 +2202,13 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
 
 #ifdef CONFIG_KVM_PRIVATE_MEM
 #define kvm_arch_has_private_mem(kvm) ((kvm)->arch.has_private_mem)
+
+#define kvm_arch_gmem_supports_shared_mem(kvm)         \
+	(IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM) &&      \
+	 ((kvm)->arch.vm_type == KVM_X86_SW_PROTECTED_VM))
 #else
 #define kvm_arch_has_private_mem(kvm) false
+#define kvm_arch_gmem_supports_shared_mem(kvm) false
 #endif
 
 #define kvm_arch_has_readonly_mem(kvm) (!(kvm)->arch.has_protected_state)
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index ea2c4f21c1ca..22d1bcdaad58 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -45,7 +45,8 @@ config KVM_X86
 	select HAVE_KVM_PM_NOTIFIER if PM
 	select KVM_GENERIC_HARDWARE_ENABLING
 	select KVM_GENERIC_PRE_FAULT_MEMORY
-	select KVM_GENERIC_PRIVATE_MEM if KVM_SW_PROTECTED_VM
+	select KVM_PRIVATE_MEM if KVM_SW_PROTECTED_VM
+	select KVM_GMEM_SHARED_MEM if KVM_SW_PROTECTED_VM
 	select KVM_WERROR if WERROR
 
 config KVM
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2 07/11] KVM: arm64: Refactor user_mem_abort() calculation of force_pte
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (5 preceding siblings ...)
  2025-01-29 17:23 ` [RFC PATCH v2 06/11] KVM: x86: Mark KVM_X86_SW_PROTECTED_VM as supporting guest_memfd shared memory Fuad Tabba
@ 2025-01-29 17:23 ` Fuad Tabba
  2025-01-29 17:23 ` [RFC PATCH v2 08/11] KVM: arm64: Handle guest_memfd()-backed guest page faults Fuad Tabba
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

To simplify the code and to make the assumptions clearer,
refactor user_mem_abort() by immediately setting force_pte to
true if logging_active is true. Also, add a check to ensure that
the assumption that logging_active is guaranteed to never be true
for VM_PFNMAP memslot is true.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index c9d46ad57e52..1ec362d0d093 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1436,7 +1436,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  bool fault_is_perm)
 {
 	int ret = 0;
-	bool write_fault, writable, force_pte = false;
+	bool write_fault, writable;
 	bool exec_fault, mte_allowed;
 	bool device = false, vfio_allow_any_uc = false;
 	unsigned long mmu_seq;
@@ -1448,6 +1448,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	gfn_t gfn;
 	kvm_pfn_t pfn;
 	bool logging_active = memslot_is_logging(memslot);
+	bool force_pte = logging_active;
 	long vma_pagesize, fault_granule;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
@@ -1493,12 +1494,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * logging_active is guaranteed to never be true for VM_PFNMAP
 	 * memslots.
 	 */
-	if (logging_active) {
-		force_pte = true;
+	if (WARN_ON_ONCE(logging_active && (vma->vm_flags & VM_PFNMAP)))
+		return -EFAULT;
+
+	if (force_pte)
 		vma_shift = PAGE_SHIFT;
-	} else {
+	else
 		vma_shift = get_vma_page_shift(vma, hva);
-	}
 
 	switch (vma_shift) {
 #ifndef __PAGETABLE_PMD_FOLDED
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2 08/11] KVM: arm64: Handle guest_memfd()-backed guest page faults
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (6 preceding siblings ...)
  2025-01-29 17:23 ` [RFC PATCH v2 07/11] KVM: arm64: Refactor user_mem_abort() calculation of force_pte Fuad Tabba
@ 2025-01-29 17:23 ` Fuad Tabba
  2025-01-29 17:23 ` [RFC PATCH v2 09/11] KVM: arm64: Introduce KVM_VM_TYPE_ARM_SW_PROTECTED machine type Fuad Tabba
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

Add arm64 support for handling guest page faults on guest_memfd
backed memslots.

For now, the fault granule is restricted to PAGE_SIZE.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c     | 84 ++++++++++++++++++++++++++--------------
 include/linux/kvm_host.h |  5 +++
 virt/kvm/kvm_main.c      |  5 ---
 3 files changed, 61 insertions(+), 33 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 1ec362d0d093..c1f3ddb88cb9 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1430,6 +1430,33 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_MTE_ALLOWED;
 }
 
+static kvm_pfn_t faultin_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+			     gfn_t gfn, bool write_fault, bool *writable,
+			     struct page **page, bool is_private)
+{
+	kvm_pfn_t pfn;
+	int ret;
+
+	if (!is_private)
+		return __kvm_faultin_pfn(slot, gfn, write_fault ? FOLL_WRITE : 0, writable, page);
+
+	*writable = false;
+
+	if (WARN_ON_ONCE(write_fault && memslot_is_readonly(slot)))
+		return KVM_PFN_ERR_NOSLOT_MASK;
+
+	ret = kvm_gmem_get_pfn(kvm, slot, gfn, &pfn, page, NULL);
+	if (!ret) {
+		*writable = write_fault;
+		return pfn;
+	}
+
+	if (ret == -EHWPOISON)
+		return KVM_PFN_ERR_HWPOISON;
+
+	return KVM_PFN_ERR_NOSLOT_MASK;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_s2_trans *nested,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1437,24 +1464,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 {
 	int ret = 0;
 	bool write_fault, writable;
-	bool exec_fault, mte_allowed;
+	bool exec_fault, mte_allowed = false;
 	bool device = false, vfio_allow_any_uc = false;
 	unsigned long mmu_seq;
 	phys_addr_t ipa = fault_ipa;
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
-	struct vm_area_struct *vma;
+	struct vm_area_struct *vma = NULL;
 	short vma_shift;
-	gfn_t gfn;
+	gfn_t gfn = ipa >> PAGE_SHIFT;
 	kvm_pfn_t pfn;
 	bool logging_active = memslot_is_logging(memslot);
-	bool force_pte = logging_active;
-	long vma_pagesize, fault_granule;
+	bool is_private = kvm_mem_is_private(kvm, gfn);
+	bool force_pte = logging_active || is_private;
+	long vma_pagesize, fault_granule = PAGE_SIZE;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
 	struct page *page;
 
-	if (fault_is_perm)
+	if (fault_is_perm && !is_private)
 		fault_granule = kvm_vcpu_trap_get_perm_fault_granule(vcpu);
 	write_fault = kvm_is_write_fault(vcpu);
 	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
@@ -1478,24 +1506,30 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			return ret;
 	}
 
+	mmap_read_lock(current->mm);
+
 	/*
 	 * Let's check if we will get back a huge page backed by hugetlbfs, or
 	 * get block mapping for device MMIO region.
 	 */
-	mmap_read_lock(current->mm);
-	vma = vma_lookup(current->mm, hva);
-	if (unlikely(!vma)) {
-		kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
-		mmap_read_unlock(current->mm);
-		return -EFAULT;
-	}
+	if (!is_private) {
+		vma = vma_lookup(current->mm, hva);
+		if (unlikely(!vma)) {
+			kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
+			mmap_read_unlock(current->mm);
+			return -EFAULT;
+		}
 
-	/*
-	 * logging_active is guaranteed to never be true for VM_PFNMAP
-	 * memslots.
-	 */
-	if (WARN_ON_ONCE(logging_active && (vma->vm_flags & VM_PFNMAP)))
-		return -EFAULT;
+		/*
+		 * logging_active is guaranteed to never be true for VM_PFNMAP
+		 * memslots.
+		 */
+		if (WARN_ON_ONCE(logging_active && (vma->vm_flags & VM_PFNMAP)))
+			return -EFAULT;
+
+		vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
+		mte_allowed = kvm_vma_mte_allowed(vma);
+	}
 
 	if (force_pte)
 		vma_shift = PAGE_SHIFT;
@@ -1565,18 +1599,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ipa &= ~(vma_pagesize - 1);
 	}
 
-	gfn = ipa >> PAGE_SHIFT;
-	mte_allowed = kvm_vma_mte_allowed(vma);
-
-	vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
-
 	/* Don't use the VMA after the unlock -- it may have vanished */
 	vma = NULL;
 
 	/*
 	 * Read mmu_invalidate_seq so that KVM can detect if the results of
-	 * vma_lookup() or __kvm_faultin_pfn() become stale prior to
-	 * acquiring kvm->mmu_lock.
+	 * vma_lookup() or faultin_pfn() become stale prior to acquiring
+	 * kvm->mmu_lock.
 	 *
 	 * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
 	 * with the smp_wmb() in kvm_mmu_invalidate_end().
@@ -1584,8 +1613,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
 	mmap_read_unlock(current->mm);
 
-	pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
-				&writable, &page);
+	pfn = faultin_pfn(kvm, memslot, gfn, write_fault, &writable, &page, is_private);
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(hva, vma_shift);
 		return 0;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e57cdf4e3f3f..42397fc4a1fb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1864,6 +1864,11 @@ static inline int memslot_id(struct kvm *kvm, gfn_t gfn)
 	return gfn_to_memslot(kvm, gfn)->id;
 }
 
+static inline bool memslot_is_readonly(const struct kvm_memory_slot *slot)
+{
+	return slot->flags & KVM_MEM_READONLY;
+}
+
 static inline gfn_t
 hva_to_gfn_memslot(unsigned long hva, struct kvm_memory_slot *slot)
 {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 40e4ed512923..9f913a7fc44b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2622,11 +2622,6 @@ unsigned long kvm_host_page_size(struct kvm_vcpu *vcpu, gfn_t gfn)
 	return size;
 }
 
-static bool memslot_is_readonly(const struct kvm_memory_slot *slot)
-{
-	return slot->flags & KVM_MEM_READONLY;
-}
-
 static unsigned long __gfn_to_hva_many(const struct kvm_memory_slot *slot, gfn_t gfn,
 				       gfn_t *nr_pages, bool write)
 {
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2 09/11] KVM: arm64: Introduce KVM_VM_TYPE_ARM_SW_PROTECTED machine type
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (7 preceding siblings ...)
  2025-01-29 17:23 ` [RFC PATCH v2 08/11] KVM: arm64: Handle guest_memfd()-backed guest page faults Fuad Tabba
@ 2025-01-29 17:23 ` Fuad Tabba
  2025-01-29 17:23 ` [RFC PATCH v2 10/11] KVM: arm64: Enable mapping guest_memfd in arm64 Fuad Tabba
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

Introduce a new virtual machine type,
KVM_VM_TYPE_ARM_SW_PROTECTED, to serve as a development and
testing vehicle for Confidential (CoCo) VMs, similar to the x86
KVM_X86_SW_PROTECTED_VM type.

Initially, this is used to test guest_memfd without needing any
underlying protection.

Similar to the x86 type, this is currently only for development
and testing.  Do not use KVM_VM_TYPE_ARM_SW_PROTECTED for "real"
VMs, and especially not in production. The behavior and effective
ABI for software-protected VMs is unstable.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 Documentation/virt/kvm/api.rst    |  5 +++++
 arch/arm64/include/asm/kvm_host.h | 10 ++++++++++
 arch/arm64/kvm/arm.c              |  5 +++++
 arch/arm64/kvm/mmu.c              |  3 ---
 include/uapi/linux/kvm.h          |  6 ++++++
 5 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index f15b61317aad..7953b07c8c2b 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -214,6 +214,11 @@ exposed by the guest CPUs in ID_AA64MMFR0_EL1[PARange]. It only affects
 size of the address translated by the stage2 level (guest physical to
 host physical address translations).
 
+KVM_VM_TYPE_ARM_SW_PROTECTED is currently only for development and testing of
+confidential VMs without having underlying support. Do not use
+KVM_VM_TYPE_ARM_SW_PROTECTED for "real" VMs, and especially not in production.
+The behavior and effective ABI for software-protected VMs is unstable.
+
 
 4.3 KVM_GET_MSR_INDEX_LIST, KVM_GET_MSR_FEATURE_INDEX_LIST
 ----------------------------------------------------------
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e18e9244d17a..e8a0db2ac4fa 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -380,6 +380,8 @@ struct kvm_arch {
 	 * the associated pKVM instance in the hypervisor.
 	 */
 	struct kvm_protected_vm pkvm;
+
+	unsigned long vm_type;
 };
 
 struct kvm_vcpu_fault_info {
@@ -1529,4 +1531,12 @@ void kvm_set_vm_id_reg(struct kvm *kvm, u32 reg, u64 val);
 #define kvm_has_s1poe(k)				\
 	(kvm_has_feat((k), ID_AA64MMFR3_EL1, S1POE, IMP))
 
+#define kvm_arch_has_private_mem(kvm)			\
+	(IS_ENABLED(CONFIG_KVM_PRIVATE_MEM) &&		\
+	 ((kvm)->arch.vm_type & KVM_VM_TYPE_ARM_SW_PROTECTED))
+
+#define kvm_arch_gmem_supports_shared_mem(kvm)		\
+	(IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM) &&	\
+	 ((kvm)->arch.vm_type & KVM_VM_TYPE_ARM_SW_PROTECTED))
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index a102c3aebdbc..ecdb8db619d8 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -171,6 +171,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 {
 	int ret;
 
+	if (type & ~KVM_VM_TYPE_MASK)
+		return -EINVAL;
+
 	mutex_init(&kvm->arch.config_lock);
 
 #ifdef CONFIG_LOCKDEP
@@ -212,6 +215,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
 	bitmap_zero(kvm->arch.vcpu_features, KVM_VCPU_MAX_FEATURES);
 
+	kvm->arch.vm_type = type;
+
 	return 0;
 
 err_free_cpumask:
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index c1f3ddb88cb9..8e19248533f1 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -869,9 +869,6 @@ static int kvm_init_ipa_range(struct kvm_s2_mmu *mmu, unsigned long type)
 	u64 mmfr0, mmfr1;
 	u32 phys_shift;
 
-	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
-		return -EINVAL;
-
 	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
 	if (is_protected_kvm_enabled()) {
 		phys_shift = kvm_ipa_limit;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 3ac805c5abf1..a3973d2b1a69 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -656,6 +656,12 @@ struct kvm_enable_cap {
 #define KVM_VM_TYPE_ARM_IPA_SIZE_MASK	0xffULL
 #define KVM_VM_TYPE_ARM_IPA_SIZE(x)		\
 	((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
+
+#define KVM_VM_TYPE_ARM_SW_PROTECTED	(1UL << 9)
+
+#define KVM_VM_TYPE_MASK	(KVM_VM_TYPE_ARM_IPA_SIZE_MASK | \
+				 KVM_VM_TYPE_ARM_SW_PROTECTED)
+
 /*
  * ioctls for /dev/kvm fds:
  */
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2 10/11] KVM: arm64: Enable mapping guest_memfd in arm64
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (8 preceding siblings ...)
  2025-01-29 17:23 ` [RFC PATCH v2 09/11] KVM: arm64: Introduce KVM_VM_TYPE_ARM_SW_PROTECTED machine type Fuad Tabba
@ 2025-01-29 17:23 ` Fuad Tabba
  2025-01-29 17:23 ` [RFC PATCH v2 11/11] KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed Fuad Tabba
  2025-01-30 16:50 ` [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs David Hildenbrand
  11 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

Enable mapping guest_memfd in arm64, which would only apply to
VMs with the type KVM_VM_TYPE_ARM_SW_PROTECTED.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index ead632ad01b4..4830d8805bed 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -38,6 +38,7 @@ menuconfig KVM
 	select HAVE_KVM_VCPU_RUN_PID_CHANGE
 	select SCHED_INFO
 	select GUEST_PERF_EVENTS if PERF_EVENTS
+	select KVM_GMEM_SHARED_MEM
 	help
 	  Support hosting virtualized guest machines.
 
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2 11/11] KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (9 preceding siblings ...)
  2025-01-29 17:23 ` [RFC PATCH v2 10/11] KVM: arm64: Enable mapping guest_memfd in arm64 Fuad Tabba
@ 2025-01-29 17:23 ` Fuad Tabba
  2025-01-30 16:50 ` [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs David Hildenbrand
  11 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-29 17:23 UTC (permalink / raw)
  To: kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton,
	tabba

Expand the guest_memfd selftests to include testing mapping guest
memory for VM types that support it.

Also, build the guest_memfd selftest for aarch64.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 tools/testing/selftests/kvm/Makefile          |  1 +
 .../testing/selftests/kvm/guest_memfd_test.c  | 75 +++++++++++++++++--
 tools/testing/selftests/kvm/lib/kvm_util.c    |  3 +-
 3 files changed, 71 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 41593d2e7de9..c998eb3c3b77 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -174,6 +174,7 @@ TEST_GEN_PROGS_aarch64 += coalesced_io_test
 TEST_GEN_PROGS_aarch64 += demand_paging_test
 TEST_GEN_PROGS_aarch64 += dirty_log_test
 TEST_GEN_PROGS_aarch64 += dirty_log_perf_test
+TEST_GEN_PROGS_aarch64 += guest_memfd_test
 TEST_GEN_PROGS_aarch64 += guest_print_test
 TEST_GEN_PROGS_aarch64 += get-reg-list
 TEST_GEN_PROGS_aarch64 += kvm_create_max_vcpus
diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index ce687f8d248f..f1e89f72b89f 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -34,12 +34,48 @@ static void test_file_read_write(int fd)
 		    "pwrite on a guest_mem fd should fail");
 }
 
-static void test_mmap(int fd, size_t page_size)
+static void test_mmap_allowed(int fd, size_t total_size)
 {
+	size_t page_size = getpagesize();
+	const char val = 0xaa;
+	char *mem;
+	int ret;
+	int i;
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT(mem != MAP_FAILED, "mmaping() guest memory should pass.");
+
+	memset(mem, val, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, 0,
+			page_size);
+	TEST_ASSERT(!ret, "fallocate the first page should succeed");
+
+	for (i = 0; i < page_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0x00);
+	for (; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	memset(mem, val, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	ret = munmap(mem, total_size);
+	TEST_ASSERT(!ret, "munmap should succeed");
+}
+
+static void test_mmap_denied(int fd, size_t total_size)
+{
+	size_t page_size = getpagesize();
 	char *mem;
 
 	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
 	TEST_ASSERT_EQ(mem, MAP_FAILED);
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT_EQ(mem, MAP_FAILED);
 }
 
 static void test_file_size(int fd, size_t page_size, size_t total_size)
@@ -170,19 +206,30 @@ static void test_create_guest_memfd_multiple(struct kvm_vm *vm)
 	close(fd1);
 }
 
-int main(int argc, char *argv[])
+unsigned long get_shared_type(void)
 {
-	size_t page_size;
+#ifdef __x86_64__
+	return KVM_X86_SW_PROTECTED_VM;
+#endif
+#ifdef __aarch64__
+	return KVM_VM_TYPE_ARM_SW_PROTECTED;
+#endif
+	return 0;
+}
+
+void test_vm_type(unsigned long type, bool is_shared)
+{
+	struct kvm_vm *vm;
 	size_t total_size;
+	size_t page_size;
 	int fd;
-	struct kvm_vm *vm;
 
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD));
 
 	page_size = getpagesize();
 	total_size = page_size * 4;
 
-	vm = vm_create_barebones();
+	vm = vm_create_barebones_type(type);
 
 	test_create_guest_memfd_invalid(vm);
 	test_create_guest_memfd_multiple(vm);
@@ -190,10 +237,26 @@ int main(int argc, char *argv[])
 	fd = vm_create_guest_memfd(vm, total_size, 0);
 
 	test_file_read_write(fd);
-	test_mmap(fd, page_size);
+
+	if (is_shared)
+		test_mmap_allowed(fd, total_size);
+	else
+		test_mmap_denied(fd, total_size);
+
 	test_file_size(fd, page_size, total_size);
 	test_fallocate(fd, page_size, total_size);
 	test_invalid_punch_hole(fd, page_size, total_size);
 
 	close(fd);
+	kvm_vm_release(vm);
+}
+
+int main(int argc, char *argv[])
+{
+	test_vm_type(VM_TYPE_DEFAULT, false);
+
+	if (kvm_has_cap(KVM_CAP_GMEM_SHARED_MEM))
+		test_vm_type(get_shared_type(), true);
+
+	return 0;
 }
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 480e3a40d197..098ea04ec099 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -347,9 +347,8 @@ struct kvm_vm *____vm_create(struct vm_shape shape)
 	}
 
 #ifdef __aarch64__
-	TEST_ASSERT(!vm->type, "ARM doesn't support test-provided types");
 	if (vm->pa_bits != 40)
-		vm->type = KVM_VM_TYPE_ARM_IPA_SIZE(vm->pa_bits);
+		vm->type |= KVM_VM_TYPE_ARM_IPA_SIZE(vm->pa_bits);
 #endif
 
 	vm_open(vm);
-- 
2.48.1.262.g85cc9f2d1e-goog



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs
  2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
                   ` (10 preceding siblings ...)
  2025-01-29 17:23 ` [RFC PATCH v2 11/11] KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed Fuad Tabba
@ 2025-01-30 16:50 ` David Hildenbrand
  2025-01-30 16:57   ` Fuad Tabba
  11 siblings, 1 reply; 23+ messages in thread
From: David Hildenbrand @ 2025-01-30 16:50 UTC (permalink / raw)
  To: Fuad Tabba, kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On 29.01.25 18:23, Fuad Tabba wrote:

Thanks for the new version

> Main changes since v1 [1]:
> - Added x86 support for mapping guest_memfd at the host, enabled
>   only for the KVM_X86_SW_PROTECTED_VM type.

Nice!

> - Require setting memslot userspace_addr for guest_memfd slots
>   even if shared, and remove patches that worked around that.
> - Brought in more of the infrastructure from the patch series
>   that allows restricted mapping of guest_memfd backed memory.

Ah, that explains why we see the page_type stuff in here now :)

> - Renamed references to "mappable" -> "shared".
> - Expanded the selftests.
> - Added instructions to test on x86 and arm64 (below).

Very nice!


I assume there is still no page conversion happening -- or is there now 
that the page_stuff thing is in here?

Would be good to spell out what's supported and what's still TBD 
regarding mmap support.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs
  2025-01-30 16:50 ` [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs David Hildenbrand
@ 2025-01-30 16:57   ` Fuad Tabba
  0 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-30 16:57 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	yu.c.zhang, isaku.yamahata, mic, vbabka, vannapurve, ackerleytng,
	mail, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

Hi David,

On Thu, 30 Jan 2025 at 16:50, David Hildenbrand <david@redhat.com> wrote:
>
> On 29.01.25 18:23, Fuad Tabba wrote:
>
> Thanks for the new version
>
> > Main changes since v1 [1]:
> > - Added x86 support for mapping guest_memfd at the host, enabled
> >   only for the KVM_X86_SW_PROTECTED_VM type.
>
> Nice!
>
> > - Require setting memslot userspace_addr for guest_memfd slots
> >   even if shared, and remove patches that worked around that.
> > - Brought in more of the infrastructure from the patch series
> >   that allows restricted mapping of guest_memfd backed memory.
>
> Ah, that explains why we see the page_type stuff in here now :)
>
> > - Renamed references to "mappable" -> "shared".
> > - Expanded the selftests.
> > - Added instructions to test on x86 and arm64 (below).
>
> Very nice!
>
>
> I assume there is still no page conversion happening -- or is there now
> that the page_stuff thing is in here?
>
> Would be good to spell out what's supported and what's still TBD
> regarding mmap support.

Thanks! No page conversion happening yet. I'm rebasing the other
series, the one with the conversions, on top of this one, as well as
fixing it based on the feedback that I got.

What this is missing is the infrastructure that tracks the
mappability/shareability at the host and the guest, as well as the
implementation of the callbacks themselves. I thought I'd send this
one out now, while I work on the larger one, since this one is easier
to test, and serves as a base for the coming part.

Cheers,
/fuad

> --
> Cheers,
>
> David / dhildenb
>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2 02/11] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages
  2025-01-29 17:23 ` [RFC PATCH v2 02/11] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages Fuad Tabba
@ 2025-01-30 17:16   ` David Hildenbrand
  2025-01-31  9:02     ` Fuad Tabba
  0 siblings, 1 reply; 23+ messages in thread
From: David Hildenbrand @ 2025-01-30 17:16 UTC (permalink / raw)
  To: Fuad Tabba, kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On 29.01.25 18:23, Fuad Tabba wrote:
> Before transitioning a guest_memfd folio to unshared, thereby
> disallowing access by the host and allowing the hypervisor to
> transition its view of the guest page as private, we need to be
> sure that the host doesn't have any references to the folio.
> 
> This patch introduces a new type for guest_memfd folios, which
> isn't activated in this series but is here as a placeholder and
> to facilitate the code in the next patch. This will be used in
> the future to register a callback that informs the guest_memfd
> subsystem when the last reference is dropped, therefore knowing
> that the host doesn't have any remaining references.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>   include/linux/page-flags.h | 7 +++++++
>   mm/debug.c                 | 1 +
>   mm/swap.c                  | 5 +++++
>   3 files changed, 13 insertions(+)
> 
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 6615f2f59144..bab3cac1f93b 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -942,6 +942,7 @@ enum pagetype {
>   	PGTY_slab	= 0xf5,
>   	PGTY_zsmalloc	= 0xf6,
>   	PGTY_unaccepted	= 0xf7,
> +	PGTY_guestmem	= 0xf8,
>   
>   	PGTY_mapcount_underflow = 0xff
>   };
> @@ -1091,6 +1092,12 @@ FOLIO_TYPE_OPS(hugetlb, hugetlb)
>   FOLIO_TEST_FLAG_FALSE(hugetlb)
>   #endif
>   

Some short doc would be nice, to at least hint that this is related to 
guest_memfd, and that these are otherwise folios.


/*
  * guestmem folios are folios that are used to back VM memory as managed
  * guest_memfd. Once the last reference is put, instead of freeing these
  * folios back to the page allocator, they are returned to guest_memfd.
  *
  * For now, guestmem will only be set on these folios as long as they
  * cannot be mapped to user space ("private state"), with the plan of
  * always setting that type once typed folios can be mapped to user
  * space cleanly.
  */

> +#ifdef CONFIG_KVM_GMEM_MAPPABLE
> +FOLIO_TYPE_OPS(guestmem, guestmem)
> +#else
> +FOLIO_TEST_FLAG_FALSE(guestmem)
> +#endif
> +
>   PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
>   
>   /*
> diff --git a/mm/debug.c b/mm/debug.c
> index 95b6ab809c0e..db93be385ed9 100644
> --- a/mm/debug.c
> +++ b/mm/debug.c
> @@ -56,6 +56,7 @@ static const char *page_type_names[] = {
>   	DEF_PAGETYPE_NAME(table),
>   	DEF_PAGETYPE_NAME(buddy),
>   	DEF_PAGETYPE_NAME(unaccepted),
> +	DEF_PAGETYPE_NAME(guestmem),
 >   };>
>   static const char *page_type_name(unsigned int page_type)
> diff --git a/mm/swap.c b/mm/swap.c
> index 8a66cd9cb9da..73d61c7f8edd 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -37,6 +37,7 @@
>   #include <linux/page_idle.h>
>   #include <linux/local_lock.h>
>   #include <linux/buffer_head.h>
> +#include <linux/kvm_host.h>
>   
>   #include "internal.h"
>   
> @@ -101,6 +102,10 @@ static void free_typed_folio(struct folio *folio)
>   		if (IS_ENABLED(CONFIG_HUGETLBFS))
>   			free_huge_folio(folio);
>   		return;
> +	case PGTY_guestmem:
> +		if (IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM))
> +			WARN_ONCE(1, "A placeholder that shouldn't trigger.");


Does it make sense to directly introduce the callback into guest_memfd 
and handle the WARN_ONCE() in there? Then, we don't have tot ouch this 
core code later again.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2 03/11] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-01-29 17:23 ` [RFC PATCH v2 03/11] KVM: guest_memfd: Allow host to map guest_memfd() pages Fuad Tabba
@ 2025-01-30 17:20   ` David Hildenbrand
  2025-01-31  9:11     ` Fuad Tabba
  2025-02-07 16:45   ` Patrick Roy
  1 sibling, 1 reply; 23+ messages in thread
From: David Hildenbrand @ 2025-01-30 17:20 UTC (permalink / raw)
  To: Fuad Tabba, kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On 29.01.25 18:23, Fuad Tabba wrote:
> Add support for mmap() and fault() for guest_memfd backed memory
> in the host for VMs that support in-place conversion between
> shared and private (shared memory). To that end, this patch adds
> the ability to check whether the VM type has that support, and
> only allows mapping its memory if that's the case.
> 
> Additionally, this behavior is gated with a new configuration
> option, CONFIG_KVM_GMEM_SHARED_MEM.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> 
> ---
> 
> This patch series will allow shared memory support for software
> VMs in x86. It will also introduce a similar VM type for arm64
> and allow shared memory support for that. In the future, pKVM
> will also support shared memory.
> ---
>   include/linux/kvm_host.h | 11 ++++++
>   virt/kvm/Kconfig         |  4 +++
>   virt/kvm/guest_memfd.c   | 77 ++++++++++++++++++++++++++++++++++++++++
>   3 files changed, 92 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 401439bb21e3..408429f13bf4 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -717,6 +717,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
>   }
>   #endif
>   
> +/*
> + * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
> + * private memory is enabled and it supports in-place shared/private conversion.
> + */
> +#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
> +static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
> +{
> +	return false;
> +}
> +#endif
> +
>   #ifndef kvm_arch_has_readonly_mem
>   static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
>   {
> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index 54e959e7d68f..4e759e8020c5 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -124,3 +124,7 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
>   config HAVE_KVM_ARCH_GMEM_INVALIDATE
>          bool
>          depends on KVM_PRIVATE_MEM
> +
> +config KVM_GMEM_SHARED_MEM
> +       select KVM_PRIVATE_MEM
> +       bool
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 47a9f68f7b24..86441581c9ae 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -307,7 +307,84 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
>   	return gfn - slot->base_gfn + slot->gmem.pgoff;
>   }
>   
> +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
> +{
> +	struct inode *inode = file_inode(vmf->vma->vm_file);
> +	struct folio *folio;
> +	vm_fault_t ret = VM_FAULT_LOCKED;
> +
> +	filemap_invalidate_lock_shared(inode->i_mapping);
> +
> +	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> +	if (IS_ERR(folio)) {
> +		ret = VM_FAULT_SIGBUS;
> +		goto out_filemap;
> +	}
> +
> +	if (folio_test_hwpoison(folio)) {
> +		ret = VM_FAULT_HWPOISON;
> +		goto out_folio;
> +	}
> +

Worth adding a comment, something like

/*
  * Only private folios are marked as "guestmem" so far, and we never
  * expect private folios at this point.
  */
> +	if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
> +		ret = VM_FAULT_SIGBUS;
> +		goto out_folio;
> +	}
> +
> +	/* No support for huge pages. */
> +	if (WARN_ON_ONCE(folio_nr_pages(folio) > 1)) {
> +		ret = VM_FAULT_SIGBUS;
> +		goto out_folio;
> +	}
> +

/* We only support mmap of small folios. */
VM_WARN_ON_ONCE(folio_test_large(folio));

> +	if (!folio_test_uptodate(folio)) {
> +		clear_highpage(folio_page(folio, 0));
> +		folio_mark_uptodate(folio);
> +	}
> +
> +	vmf->page = folio_file_page(folio, vmf->pgoff);
> +

Apart from that LGTM.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2 02/11] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages
  2025-01-30 17:16   ` David Hildenbrand
@ 2025-01-31  9:02     ` Fuad Tabba
  0 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-31  9:02 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	yu.c.zhang, isaku.yamahata, mic, vbabka, vannapurve, ackerleytng,
	mail, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

Hi David,

On Thu, 30 Jan 2025 at 17:16, David Hildenbrand <david@redhat.com> wrote:
>
> On 29.01.25 18:23, Fuad Tabba wrote:
> > Before transitioning a guest_memfd folio to unshared, thereby
> > disallowing access by the host and allowing the hypervisor to
> > transition its view of the guest page as private, we need to be
> > sure that the host doesn't have any references to the folio.
> >
> > This patch introduces a new type for guest_memfd folios, which
> > isn't activated in this series but is here as a placeholder and
> > to facilitate the code in the next patch. This will be used in
> > the future to register a callback that informs the guest_memfd
> > subsystem when the last reference is dropped, therefore knowing
> > that the host doesn't have any remaining references.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >   include/linux/page-flags.h | 7 +++++++
> >   mm/debug.c                 | 1 +
> >   mm/swap.c                  | 5 +++++
> >   3 files changed, 13 insertions(+)
> >
> > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> > index 6615f2f59144..bab3cac1f93b 100644
> > --- a/include/linux/page-flags.h
> > +++ b/include/linux/page-flags.h
> > @@ -942,6 +942,7 @@ enum pagetype {
> >       PGTY_slab       = 0xf5,
> >       PGTY_zsmalloc   = 0xf6,
> >       PGTY_unaccepted = 0xf7,
> > +     PGTY_guestmem   = 0xf8,
> >
> >       PGTY_mapcount_underflow = 0xff
> >   };
> > @@ -1091,6 +1092,12 @@ FOLIO_TYPE_OPS(hugetlb, hugetlb)
> >   FOLIO_TEST_FLAG_FALSE(hugetlb)
> >   #endif
> >
>
> Some short doc would be nice, to at least hint that this is related to
> guest_memfd, and that these are otherwise folios.
>
>
> /*
>   * guestmem folios are folios that are used to back VM memory as managed
>   * guest_memfd. Once the last reference is put, instead of freeing these
>   * folios back to the page allocator, they are returned to guest_memfd.
>   *
>   * For now, guestmem will only be set on these folios as long as they
>   * cannot be mapped to user space ("private state"), with the plan of
>   * always setting that type once typed folios can be mapped to user
>   * space cleanly.
>   */

Will add this.

> > +#ifdef CONFIG_KVM_GMEM_MAPPABLE
> > +FOLIO_TYPE_OPS(guestmem, guestmem)
> > +#else
> > +FOLIO_TEST_FLAG_FALSE(guestmem)
> > +#endif
> > +
> >   PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
> >
> >   /*
> > diff --git a/mm/debug.c b/mm/debug.c
> > index 95b6ab809c0e..db93be385ed9 100644
> > --- a/mm/debug.c
> > +++ b/mm/debug.c
> > @@ -56,6 +56,7 @@ static const char *page_type_names[] = {
> >       DEF_PAGETYPE_NAME(table),
> >       DEF_PAGETYPE_NAME(buddy),
> >       DEF_PAGETYPE_NAME(unaccepted),
> > +     DEF_PAGETYPE_NAME(guestmem),
>  >   };>
> >   static const char *page_type_name(unsigned int page_type)
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 8a66cd9cb9da..73d61c7f8edd 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -37,6 +37,7 @@
> >   #include <linux/page_idle.h>
> >   #include <linux/local_lock.h>
> >   #include <linux/buffer_head.h>
> > +#include <linux/kvm_host.h>
> >
> >   #include "internal.h"
> >
> > @@ -101,6 +102,10 @@ static void free_typed_folio(struct folio *folio)
> >               if (IS_ENABLED(CONFIG_HUGETLBFS))
> >                       free_huge_folio(folio);
> >               return;
> > +     case PGTY_guestmem:
> > +             if (IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM))
> > +                     WARN_ONCE(1, "A placeholder that shouldn't trigger.");
>
>
> Does it make sense to directly introduce the callback into guest_memfd
> and handle the WARN_ONCE() in there? Then, we don't have tot ouch this
> core code later again.

Yes. That would simplify things.

Thanks,
/fuad

> --
> Cheers,
>
> David / dhildenb
>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2 03/11] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-01-30 17:20   ` David Hildenbrand
@ 2025-01-31  9:11     ` Fuad Tabba
  0 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-01-31  9:11 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	yu.c.zhang, isaku.yamahata, mic, vbabka, vannapurve, ackerleytng,
	mail, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

Hi David,

On Thu, 30 Jan 2025 at 17:20, David Hildenbrand <david@redhat.com> wrote:
>
> On 29.01.25 18:23, Fuad Tabba wrote:
> > Add support for mmap() and fault() for guest_memfd backed memory
> > in the host for VMs that support in-place conversion between
> > shared and private (shared memory). To that end, this patch adds
> > the ability to check whether the VM type has that support, and
> > only allows mapping its memory if that's the case.
> >
> > Additionally, this behavior is gated with a new configuration
> > option, CONFIG_KVM_GMEM_SHARED_MEM.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> >
> > ---
> >
> > This patch series will allow shared memory support for software
> > VMs in x86. It will also introduce a similar VM type for arm64
> > and allow shared memory support for that. In the future, pKVM
> > will also support shared memory.
> > ---
> >   include/linux/kvm_host.h | 11 ++++++
> >   virt/kvm/Kconfig         |  4 +++
> >   virt/kvm/guest_memfd.c   | 77 ++++++++++++++++++++++++++++++++++++++++
> >   3 files changed, 92 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 401439bb21e3..408429f13bf4 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -717,6 +717,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
> >   }
> >   #endif
> >
> > +/*
> > + * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
> > + * private memory is enabled and it supports in-place shared/private conversion.
> > + */
> > +#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
> > +static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
> > +{
> > +     return false;
> > +}
> > +#endif
> > +
> >   #ifndef kvm_arch_has_readonly_mem
> >   static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
> >   {
> > diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> > index 54e959e7d68f..4e759e8020c5 100644
> > --- a/virt/kvm/Kconfig
> > +++ b/virt/kvm/Kconfig
> > @@ -124,3 +124,7 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
> >   config HAVE_KVM_ARCH_GMEM_INVALIDATE
> >          bool
> >          depends on KVM_PRIVATE_MEM
> > +
> > +config KVM_GMEM_SHARED_MEM
> > +       select KVM_PRIVATE_MEM
> > +       bool
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index 47a9f68f7b24..86441581c9ae 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -307,7 +307,84 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
> >       return gfn - slot->base_gfn + slot->gmem.pgoff;
> >   }
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
> > +{
> > +     struct inode *inode = file_inode(vmf->vma->vm_file);
> > +     struct folio *folio;
> > +     vm_fault_t ret = VM_FAULT_LOCKED;
> > +
> > +     filemap_invalidate_lock_shared(inode->i_mapping);
> > +
> > +     folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> > +     if (IS_ERR(folio)) {
> > +             ret = VM_FAULT_SIGBUS;
> > +             goto out_filemap;
> > +     }
> > +
> > +     if (folio_test_hwpoison(folio)) {
> > +             ret = VM_FAULT_HWPOISON;
> > +             goto out_folio;
> > +     }
> > +
>
> Worth adding a comment, something like
>
> /*
>   * Only private folios are marked as "guestmem" so far, and we never
>   * expect private folios at this point.
>   */
> > +     if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
> > +             ret = VM_FAULT_SIGBUS;
> > +             goto out_folio;
> > +     }
> > +
> > +     /* No support for huge pages. */
> > +     if (WARN_ON_ONCE(folio_nr_pages(folio) > 1)) {
> > +             ret = VM_FAULT_SIGBUS;
> > +             goto out_folio;
> > +     }
> > +
>
> /* We only support mmap of small folios. */
> VM_WARN_ON_ONCE(folio_test_large(folio));

Will do. Thanks.

/fuad

>
> > +     if (!folio_test_uptodate(folio)) {
> > +             clear_highpage(folio_page(folio, 0));
> > +             folio_mark_uptodate(folio);
> > +     }
> > +
> > +     vmf->page = folio_file_page(folio, vmf->pgoff);
> > +
>
> Apart from that LGTM.
>
> --
> Cheers,
>
> David / dhildenb
>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2 04/11] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  2025-01-29 17:23 ` [RFC PATCH v2 04/11] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared Fuad Tabba
@ 2025-01-31  9:11   ` David Hildenbrand
  2025-01-31  9:52     ` Fuad Tabba
  0 siblings, 1 reply; 23+ messages in thread
From: David Hildenbrand @ 2025-01-31  9:11 UTC (permalink / raw)
  To: Fuad Tabba, kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On 29.01.25 18:23, Fuad Tabba wrote:
> Add the KVM capability KVM_CAP_GMEM_SHARED_MEM, which indicates
> that the VM supports shared memory in guest_memfd, or that the
> host can create VMs that support shared memory. Supporting shared
> memory implies that memory can be mapped when shared with the
> host.
> 
> For now, this checks only whether the VM type supports sharing
> guest_memfd backed memory. In the future, it will be expanded to
> check whether the specific memory address is shared with the
> host.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>   include/uapi/linux/kvm.h |  1 +
>   virt/kvm/guest_memfd.c   | 13 +++++++++++++
>   virt/kvm/kvm_main.c      |  4 ++++
>   3 files changed, 18 insertions(+)
> 
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 502ea63b5d2e..3ac805c5abf1 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -933,6 +933,7 @@ struct kvm_enable_cap {
>   #define KVM_CAP_PRE_FAULT_MEMORY 236
>   #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
>   #define KVM_CAP_X86_GUEST_MODE 238
> +#define KVM_CAP_GMEM_SHARED_MEM 239
>   
>   struct kvm_irq_routing_irqchip {
>   	__u32 irqchip;
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 86441581c9ae..4e1144ed3446 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -308,6 +308,13 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
>   }
>   
>   #ifdef CONFIG_KVM_GMEM_SHARED_MEM
> +static bool kvm_gmem_is_shared(struct file *file, pgoff_t pgoff)

I assume you want to call this something like:

kvm_gmem_folio_is_shared
kvm_gmem_offset_is_shared
kvm_gmem_range_is_shared
...

To make it clearer that you are only checking one piece and not the 
whole thing.

But then, I wonder if that could be handled in kvm_gmem_get_folio(),
and e.g., specified via a flag?

kvm_gmem_get_folio(inode, vmf->pgoff, KVM_GMEM_GF_SHARED);

Maybe existing callers would want to pass KVM_GMEM_GF_PRIVATE, and the 
ones that "don't care" don't pas anything?

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2 04/11] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  2025-01-31  9:11   ` David Hildenbrand
@ 2025-01-31  9:52     ` Fuad Tabba
  2025-01-31 10:52       ` David Hildenbrand
  0 siblings, 1 reply; 23+ messages in thread
From: Fuad Tabba @ 2025-01-31  9:52 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	yu.c.zhang, isaku.yamahata, mic, vbabka, vannapurve, ackerleytng,
	mail, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

Hi David,

On Fri, 31 Jan 2025 at 09:11, David Hildenbrand <david@redhat.com> wrote:
>
> On 29.01.25 18:23, Fuad Tabba wrote:
> > Add the KVM capability KVM_CAP_GMEM_SHARED_MEM, which indicates
> > that the VM supports shared memory in guest_memfd, or that the
> > host can create VMs that support shared memory. Supporting shared
> > memory implies that memory can be mapped when shared with the
> > host.
> >
> > For now, this checks only whether the VM type supports sharing
> > guest_memfd backed memory. In the future, it will be expanded to
> > check whether the specific memory address is shared with the
> > host.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >   include/uapi/linux/kvm.h |  1 +
> >   virt/kvm/guest_memfd.c   | 13 +++++++++++++
> >   virt/kvm/kvm_main.c      |  4 ++++
> >   3 files changed, 18 insertions(+)
> >
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index 502ea63b5d2e..3ac805c5abf1 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -933,6 +933,7 @@ struct kvm_enable_cap {
> >   #define KVM_CAP_PRE_FAULT_MEMORY 236
> >   #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
> >   #define KVM_CAP_X86_GUEST_MODE 238
> > +#define KVM_CAP_GMEM_SHARED_MEM 239
> >
> >   struct kvm_irq_routing_irqchip {
> >       __u32 irqchip;
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index 86441581c9ae..4e1144ed3446 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -308,6 +308,13 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
> >   }
> >
> >   #ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +static bool kvm_gmem_is_shared(struct file *file, pgoff_t pgoff)
>
> I assume you want to call this something like:
>
> kvm_gmem_folio_is_shared
> kvm_gmem_offset_is_shared
> kvm_gmem_range_is_shared
> ...
>
> To make it clearer that you are only checking one piece and not the
> whole thing.
>
> But then, I wonder if that could be handled in kvm_gmem_get_folio(),
> and e.g., specified via a flag?
>
> kvm_gmem_get_folio(inode, vmf->pgoff, KVM_GMEM_GF_SHARED);
>
> Maybe existing callers would want to pass KVM_GMEM_GF_PRIVATE, and the
> ones that "don't care" don't pas anything?

I agree that naming this function to make it clearer would be a good
idea. That said, I think with the patches I removed, it doesn't even
need to be exposed at all, even looking at future patches --- it has
no callers other than kvm_gmem_fault(). Therefore, I don't think we
need to add a flag to kvm_gmem_get_folio().

I'll have a closer look while preparing the respin and fix it either way.

Thanks,
/fuad



> --
> Cheers,
>
> David / dhildenb
>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2 04/11] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared
  2025-01-31  9:52     ` Fuad Tabba
@ 2025-01-31 10:52       ` David Hildenbrand
  0 siblings, 0 replies; 23+ messages in thread
From: David Hildenbrand @ 2025-01-31 10:52 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	yu.c.zhang, isaku.yamahata, mic, vbabka, vannapurve, ackerleytng,
	mail, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, roypat,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On 31.01.25 10:52, Fuad Tabba wrote:
> Hi David,
> 
> On Fri, 31 Jan 2025 at 09:11, David Hildenbrand <david@redhat.com> wrote:
>>
>> On 29.01.25 18:23, Fuad Tabba wrote:
>>> Add the KVM capability KVM_CAP_GMEM_SHARED_MEM, which indicates
>>> that the VM supports shared memory in guest_memfd, or that the
>>> host can create VMs that support shared memory. Supporting shared
>>> memory implies that memory can be mapped when shared with the
>>> host.
>>>
>>> For now, this checks only whether the VM type supports sharing
>>> guest_memfd backed memory. In the future, it will be expanded to
>>> check whether the specific memory address is shared with the
>>> host.
>>>
>>> Signed-off-by: Fuad Tabba <tabba@google.com>
>>> ---
>>>    include/uapi/linux/kvm.h |  1 +
>>>    virt/kvm/guest_memfd.c   | 13 +++++++++++++
>>>    virt/kvm/kvm_main.c      |  4 ++++
>>>    3 files changed, 18 insertions(+)
>>>
>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>> index 502ea63b5d2e..3ac805c5abf1 100644
>>> --- a/include/uapi/linux/kvm.h
>>> +++ b/include/uapi/linux/kvm.h
>>> @@ -933,6 +933,7 @@ struct kvm_enable_cap {
>>>    #define KVM_CAP_PRE_FAULT_MEMORY 236
>>>    #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
>>>    #define KVM_CAP_X86_GUEST_MODE 238
>>> +#define KVM_CAP_GMEM_SHARED_MEM 239
>>>
>>>    struct kvm_irq_routing_irqchip {
>>>        __u32 irqchip;
>>> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
>>> index 86441581c9ae..4e1144ed3446 100644
>>> --- a/virt/kvm/guest_memfd.c
>>> +++ b/virt/kvm/guest_memfd.c
>>> @@ -308,6 +308,13 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
>>>    }
>>>
>>>    #ifdef CONFIG_KVM_GMEM_SHARED_MEM
>>> +static bool kvm_gmem_is_shared(struct file *file, pgoff_t pgoff)
>>
>> I assume you want to call this something like:
>>
>> kvm_gmem_folio_is_shared
>> kvm_gmem_offset_is_shared
>> kvm_gmem_range_is_shared
>> ...
>>
>> To make it clearer that you are only checking one piece and not the
>> whole thing.
>>
>> But then, I wonder if that could be handled in kvm_gmem_get_folio(),
>> and e.g., specified via a flag?
>>
>> kvm_gmem_get_folio(inode, vmf->pgoff, KVM_GMEM_GF_SHARED);
>>
>> Maybe existing callers would want to pass KVM_GMEM_GF_PRIVATE, and the
>> ones that "don't care" don't pas anything?
> 
> I agree that naming this function to make it clearer would be a good
> idea. That said, I think with the patches I removed, it doesn't even
> need to be exposed at all, even looking at future patches --- it has
> no callers other than kvm_gmem_fault(). Therefore, I don't think we
> need to add a flag to kvm_gmem_get_folio().
> 
> I'll have a closer look while preparing the respin and fix it either way.

Okay, having some indication in the name that we only want a shared 
folio etc. might make sense.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2 03/11] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-01-29 17:23 ` [RFC PATCH v2 03/11] KVM: guest_memfd: Allow host to map guest_memfd() pages Fuad Tabba
  2025-01-30 17:20   ` David Hildenbrand
@ 2025-02-07 16:45   ` Patrick Roy
  2025-02-10  8:33     ` Fuad Tabba
  1 sibling, 1 reply; 23+ messages in thread
From: Patrick Roy @ 2025-02-07 16:45 UTC (permalink / raw)
  To: Fuad Tabba, kvm, linux-arm-msm, linux-mm
  Cc: pbonzini, chenhuacai, mpe, anup, paul.walmsley, palmer, aou,
	seanjc, viro, brauner, willy, akpm, xiaoyao.li, yilun.xu,
	chao.p.peng, jarkko, amoorthy, dmatlack, yu.c.zhang,
	isaku.yamahata, mic, vbabka, vannapurve, ackerleytng, mail,
	david, michael.roth, wei.w.wang, liam.merwick, isaku.yamahata,
	kirill.shutemov, suzuki.poulose, steven.price, quic_eberman,
	quic_mnalajal, quic_tsoni, quic_svaddagi, quic_cvanscha,
	quic_pderrin, quic_pheragu, catalin.marinas, james.morse,
	yuzenghui, oliver.upton, maz, will, qperret, keirf, shuah, hch,
	jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

Hi Fuad!

On Wed, 2025-01-29 at 17:23 +0000, Fuad Tabba wrote:
> Add support for mmap() and fault() for guest_memfd backed memory
> in the host for VMs that support in-place conversion between
> shared and private (shared memory). To that end, this patch adds
> the ability to check whether the VM type has that support, and
> only allows mapping its memory if that's the case.
> 
> Additionally, this behavior is gated with a new configuration
> option, CONFIG_KVM_GMEM_SHARED_MEM.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> 
> ---
> 
> This patch series will allow shared memory support for software
> VMs in x86. It will also introduce a similar VM type for arm64
> and allow shared memory support for that. In the future, pKVM
> will also support shared memory.
> ---
>  include/linux/kvm_host.h | 11 ++++++
>  virt/kvm/Kconfig         |  4 +++
>  virt/kvm/guest_memfd.c   | 77 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 92 insertions(+)
> 
> -snip-
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 47a9f68f7b24..86441581c9ae 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -307,7 +307,84 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
>         return gfn - slot->base_gfn + slot->gmem.pgoff;
>  }
> 
> +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
> +{
> +       struct inode *inode = file_inode(vmf->vma->vm_file);
> +       struct folio *folio;
> +       vm_fault_t ret = VM_FAULT_LOCKED;
> +
> +       filemap_invalidate_lock_shared(inode->i_mapping);
> +
> +       folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> +       if (IS_ERR(folio)) {
> +               ret = VM_FAULT_SIGBUS;
> +               goto out_filemap;
> +       }
> +
> +       if (folio_test_hwpoison(folio)) {
> +               ret = VM_FAULT_HWPOISON;
> +               goto out_folio;
> +       }
> +
> +       if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
> +               ret = VM_FAULT_SIGBUS;
> +               goto out_folio;
> +       }
> +
> +       /* No support for huge pages. */
> +       if (WARN_ON_ONCE(folio_nr_pages(folio) > 1)) {
> +               ret = VM_FAULT_SIGBUS;
> +               goto out_folio;
> +       }
> +
> +       if (!folio_test_uptodate(folio)) {
> +               clear_highpage(folio_page(folio, 0));
> +               folio_mark_uptodate(folio);

kvm_gmem_mark_prepared() instead of direct folio_mark_uptodate() here, I
think (in preparation of things like [1])? Noticed this while rebasing
my direct map removal series on top of this and wondering why mmap'd
folios sometimes didn't get removed (since it hooks mark_prepared()).

Best,
Patrick

[1]: https://lore.kernel.org/kvm/20241108155056.332412-1-pbonzini@redhat.com/


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2 03/11] KVM: guest_memfd: Allow host to map guest_memfd() pages
  2025-02-07 16:45   ` Patrick Roy
@ 2025-02-10  8:33     ` Fuad Tabba
  0 siblings, 0 replies; 23+ messages in thread
From: Fuad Tabba @ 2025-02-10  8:33 UTC (permalink / raw)
  To: Patrick Roy
  Cc: kvm, linux-arm-msm, linux-mm, pbonzini, chenhuacai, mpe, anup,
	paul.walmsley, palmer, aou, seanjc, viro, brauner, willy, akpm,
	xiaoyao.li, yilun.xu, chao.p.peng, jarkko, amoorthy, dmatlack,
	yu.c.zhang, isaku.yamahata, mic, vbabka, vannapurve, ackerleytng,
	mail, david, michael.roth, wei.w.wang, liam.merwick,
	isaku.yamahata, kirill.shutemov, suzuki.poulose, steven.price,
	quic_eberman, quic_mnalajal, quic_tsoni, quic_svaddagi,
	quic_cvanscha, quic_pderrin, quic_pheragu, catalin.marinas,
	james.morse, yuzenghui, oliver.upton, maz, will, qperret, keirf,
	shuah, hch, jgg, rientjes, jhubbard, fvdl, hughd, jthoughton

On Fri, 7 Feb 2025 at 16:45, Patrick Roy <roypat@amazon.co.uk> wrote:
>
> Hi Fuad!
>
> On Wed, 2025-01-29 at 17:23 +0000, Fuad Tabba wrote:
> > Add support for mmap() and fault() for guest_memfd backed memory
> > in the host for VMs that support in-place conversion between
> > shared and private (shared memory). To that end, this patch adds
> > the ability to check whether the VM type has that support, and
> > only allows mapping its memory if that's the case.
> >
> > Additionally, this behavior is gated with a new configuration
> > option, CONFIG_KVM_GMEM_SHARED_MEM.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> >
> > ---
> >
> > This patch series will allow shared memory support for software
> > VMs in x86. It will also introduce a similar VM type for arm64
> > and allow shared memory support for that. In the future, pKVM
> > will also support shared memory.
> > ---
> >  include/linux/kvm_host.h | 11 ++++++
> >  virt/kvm/Kconfig         |  4 +++
> >  virt/kvm/guest_memfd.c   | 77 ++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 92 insertions(+)
> >
> > -snip-
> >
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index 47a9f68f7b24..86441581c9ae 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -307,7 +307,84 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
> >         return gfn - slot->base_gfn + slot->gmem.pgoff;
> >  }
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
> > +{
> > +       struct inode *inode = file_inode(vmf->vma->vm_file);
> > +       struct folio *folio;
> > +       vm_fault_t ret = VM_FAULT_LOCKED;
> > +
> > +       filemap_invalidate_lock_shared(inode->i_mapping);
> > +
> > +       folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> > +       if (IS_ERR(folio)) {
> > +               ret = VM_FAULT_SIGBUS;
> > +               goto out_filemap;
> > +       }
> > +
> > +       if (folio_test_hwpoison(folio)) {
> > +               ret = VM_FAULT_HWPOISON;
> > +               goto out_folio;
> > +       }
> > +
> > +       if (WARN_ON_ONCE(folio_test_guestmem(folio)))  {
> > +               ret = VM_FAULT_SIGBUS;
> > +               goto out_folio;
> > +       }
> > +
> > +       /* No support for huge pages. */
> > +       if (WARN_ON_ONCE(folio_nr_pages(folio) > 1)) {
> > +               ret = VM_FAULT_SIGBUS;
> > +               goto out_folio;
> > +       }
> > +
> > +       if (!folio_test_uptodate(folio)) {
> > +               clear_highpage(folio_page(folio, 0));
> > +               folio_mark_uptodate(folio);
>
> kvm_gmem_mark_prepared() instead of direct folio_mark_uptodate() here, I
> think (in preparation of things like [1])? Noticed this while rebasing
> my direct map removal series on top of this and wondering why mmap'd
> folios sometimes didn't get removed (since it hooks mark_prepared()).

Thanks for pointing that out. Will fix.
/fuad

> Best,
> Patrick
>
> [1]: https://lore.kernel.org/kvm/20241108155056.332412-1-pbonzini@redhat.com/


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2025-02-10  8:33 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-01-29 17:23 [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
2025-01-29 17:23 ` [RFC PATCH v2 01/11] mm: Consolidate freeing of typed folios on final folio_put() Fuad Tabba
2025-01-29 17:23 ` [RFC PATCH v2 02/11] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages Fuad Tabba
2025-01-30 17:16   ` David Hildenbrand
2025-01-31  9:02     ` Fuad Tabba
2025-01-29 17:23 ` [RFC PATCH v2 03/11] KVM: guest_memfd: Allow host to map guest_memfd() pages Fuad Tabba
2025-01-30 17:20   ` David Hildenbrand
2025-01-31  9:11     ` Fuad Tabba
2025-02-07 16:45   ` Patrick Roy
2025-02-10  8:33     ` Fuad Tabba
2025-01-29 17:23 ` [RFC PATCH v2 04/11] KVM: guest_memfd: Add KVM capability to check if guest_memfd is shared Fuad Tabba
2025-01-31  9:11   ` David Hildenbrand
2025-01-31  9:52     ` Fuad Tabba
2025-01-31 10:52       ` David Hildenbrand
2025-01-29 17:23 ` [RFC PATCH v2 05/11] KVM: guest_memfd: Handle in-place shared memory as guest_memfd backed memory Fuad Tabba
2025-01-29 17:23 ` [RFC PATCH v2 06/11] KVM: x86: Mark KVM_X86_SW_PROTECTED_VM as supporting guest_memfd shared memory Fuad Tabba
2025-01-29 17:23 ` [RFC PATCH v2 07/11] KVM: arm64: Refactor user_mem_abort() calculation of force_pte Fuad Tabba
2025-01-29 17:23 ` [RFC PATCH v2 08/11] KVM: arm64: Handle guest_memfd()-backed guest page faults Fuad Tabba
2025-01-29 17:23 ` [RFC PATCH v2 09/11] KVM: arm64: Introduce KVM_VM_TYPE_ARM_SW_PROTECTED machine type Fuad Tabba
2025-01-29 17:23 ` [RFC PATCH v2 10/11] KVM: arm64: Enable mapping guest_memfd in arm64 Fuad Tabba
2025-01-29 17:23 ` [RFC PATCH v2 11/11] KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed Fuad Tabba
2025-01-30 16:50 ` [RFC PATCH v2 00/11] KVM: Mapping guest_memfd backed memory at the host for software protected VMs David Hildenbrand
2025-01-30 16:57   ` Fuad Tabba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox