From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fuad Tabba <tabba@google.com>
Date: Wed, 4 Jun 2025 09:37:22 +0100
Subject: Re: [PATCH v10 08/16] KVM: guest_memfd: Allow host to map guest_memfd pages
References: <20250527180245.1413463-1-tabba@google.com> <20250527180245.1413463-9-tabba@google.com>
To: Gavin Shan
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
    pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
    anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
    aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
    brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
    xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
    jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
    isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
    vannapurve@google.com, ackerleytng@google.com, mail@maciej.szmigiero.name,
    david@redhat.com, michael.roth@amd.com, wei.w.wang@intel.com,
    liam.merwick@oracle.com, isaku.yamahata@gmail.com,
    kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
    steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
    quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
    quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, catalin.marinas@arm.com,
    james.morse@arm.com, yuzenghui@huawei.com, oliver.upton@linux.dev,
    maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com,
    roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
    rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
    jthoughton@google.com, peterx@redhat.com, pankaj.gupta@amd.com,
    ira.weiny@intel.com
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0

Hi Gavin,

On Wed, 4 Jun 2025 at 07:02, Gavin Shan wrote:
>
> Hi Fuad,
>
> On 5/28/25 4:02 AM, Fuad Tabba wrote:
> > This patch enables support for shared memory in guest_memfd, including
> > mapping that memory at the host userspace. This support is gated by the
> > configuration option KVM_GMEM_SHARED_MEM, and toggled by the guest_memfd
> > flag GUEST_MEMFD_FLAG_SUPPORT_SHARED, which can be set when creating a
> > guest_memfd instance.
> >
> > Co-developed-by: Ackerley Tng
> > Signed-off-by: Ackerley Tng
> > Signed-off-by: Fuad Tabba
> > ---
> >   arch/x86/include/asm/kvm_host.h | 10 ++++
> >   arch/x86/kvm/x86.c              |  3 +-
> >   include/linux/kvm_host.h        | 13 ++++++
> >   include/uapi/linux/kvm.h        |  1 +
> >   virt/kvm/Kconfig                |  5 ++
> >   virt/kvm/guest_memfd.c          | 81 +++++++++++++++++++++++++++++++++
> >   6 files changed, 112 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 709cc2a7ba66..ce9ad4cd93c5 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -2255,8 +2255,18 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
> >
> >   #ifdef CONFIG_KVM_GMEM
> >   #define kvm_arch_supports_gmem(kvm) ((kvm)->arch.supports_gmem)
> > +
> > +/*
> > + * CoCo VMs with hardware support that use guest_memfd only for backing private
> > + * memory, e.g., TDX, cannot use guest_memfd with userspace mapping enabled.
> > + */
> > +#define kvm_arch_supports_gmem_shared_mem(kvm)			\
> > +	(IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM) &&		\
> > +	 ((kvm)->arch.vm_type == KVM_X86_SW_PROTECTED_VM ||	\
> > +	  (kvm)->arch.vm_type == KVM_X86_DEFAULT_VM))
> >   #else
> >   #define kvm_arch_supports_gmem(kvm) false
> > +#define kvm_arch_supports_gmem_shared_mem(kvm) false
> >   #endif
> >
> >   #define kvm_arch_has_readonly_mem(kvm) (!(kvm)->arch.has_protected_state)
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 035ced06b2dd..2a02f2457c42 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -12718,7 +12718,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> >   		return -EINVAL;
> >
> >   	kvm->arch.vm_type = type;
> > -	kvm->arch.supports_gmem = (type == KVM_X86_SW_PROTECTED_VM);
> > +	kvm->arch.supports_gmem =
> > +		type == KVM_X86_DEFAULT_VM || type == KVM_X86_SW_PROTECTED_VM;
> >   	/* Decided by the vendor code for other VM types. */
> >   	kvm->arch.pre_fault_allowed =
> >   		type == KVM_X86_DEFAULT_VM || type == KVM_X86_SW_PROTECTED_VM;
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 80371475818f..ba83547e62b0 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -729,6 +729,19 @@ static inline bool kvm_arch_supports_gmem(struct kvm *kvm)
> >   }
> >   #endif
> >
> > +/*
> > + * Returns true if this VM supports shared mem in guest_memfd.
> > + *
> > + * Arch code must define kvm_arch_supports_gmem_shared_mem if support for
> > + * guest_memfd is enabled.
> > + */
> > +#if !defined(kvm_arch_supports_gmem_shared_mem) && !IS_ENABLED(CONFIG_KVM_GMEM)
> > +static inline bool kvm_arch_supports_gmem_shared_mem(struct kvm *kvm)
> > +{
> > +	return false;
> > +}
> > +#endif
> > +
> >   #ifndef kvm_arch_has_readonly_mem
> >   static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
> >   {
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index b6ae8ad8934b..c2714c9d1a0e 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -1566,6 +1566,7 @@ struct kvm_memory_attributes {
> >   #define KVM_MEMORY_ATTRIBUTE_PRIVATE	(1ULL << 3)
> >
> >   #define KVM_CREATE_GUEST_MEMFD	_IOWR(KVMIO,  0xd4, struct kvm_create_guest_memfd)
> > +#define GUEST_MEMFD_FLAG_SUPPORT_SHARED	(1ULL << 0)
> >
> >   struct kvm_create_guest_memfd {
> >   	__u64 size;
> > diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> > index 559c93ad90be..df225298ab10 100644
> > --- a/virt/kvm/Kconfig
> > +++ b/virt/kvm/Kconfig
> > @@ -128,3 +128,8 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
> >   config HAVE_KVM_ARCH_GMEM_INVALIDATE
> >          bool
> >          depends on KVM_GMEM
> > +
> > +config KVM_GMEM_SHARED_MEM
> > +       select KVM_GMEM
> > +       bool
> > +       prompt "Enable support for non-private (shared) memory in guest_memfd"
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index 6db515833f61..5d34712f64fc 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -312,7 +312,81 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
> >   	return gfn - slot->base_gfn + slot->gmem.pgoff;
> >   }
> >
> > +static bool kvm_gmem_supports_shared(struct inode *inode)
> > +{
> > +	u64 flags;
> > +
> > +	if (!IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM))
> > +		return false;
> > +
> > +	flags = (u64)inode->i_private;
> > +
> > +	return flags & GUEST_MEMFD_FLAG_SUPPORT_SHARED;
> > +}
> > +
> > +
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +static vm_fault_t kvm_gmem_fault_shared(struct vm_fault *vmf)
> > +{
> > +	struct inode *inode = file_inode(vmf->vma->vm_file);
> > +	struct folio *folio;
> > +	vm_fault_t ret = VM_FAULT_LOCKED;
> > +
> > +	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> > +	if (IS_ERR(folio)) {
> > +		int err = PTR_ERR(folio);
> > +
> > +		if (err == -EAGAIN)
> > +			return VM_FAULT_RETRY;
> > +
> > +		return vmf_error(err);
> > +	}
> > +
> > +	if (WARN_ON_ONCE(folio_test_large(folio))) {
> > +		ret = VM_FAULT_SIGBUS;
> > +		goto out_folio;
> > +	}
> > +
> > +	if (!folio_test_uptodate(folio)) {
> > +		clear_highpage(folio_page(folio, 0));
> > +		kvm_gmem_mark_prepared(folio);
> > +	}
> > +
> > +	vmf->page = folio_file_page(folio, vmf->pgoff);
> > +
> > +out_folio:
> > +	if (ret != VM_FAULT_LOCKED) {
> > +		folio_unlock(folio);
> > +		folio_put(folio);
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +static const struct vm_operations_struct kvm_gmem_vm_ops = {
> > +	.fault = kvm_gmem_fault_shared,
> > +};
> > +
> > +static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
> > +{
> > +	if (!kvm_gmem_supports_shared(file_inode(file)))
> > +		return -ENODEV;
> > +
> > +	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) !=
> > +	    (VM_SHARED | VM_MAYSHARE)) {
> > +		return -EINVAL;
> > +	}
> > +
> > +	vma->vm_ops = &kvm_gmem_vm_ops;
> > +
> > +	return 0;
> > +}
> > +#else
> > +#define kvm_gmem_mmap NULL
> > +#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
> > +
>
> nit: This hunk of code doesn't have to be guarded by CONFIG_KVM_GMEM_SHARED_MEM.
> With the guard removed, we run into the error (-ENODEV) returned by kvm_gmem_mmap()
> for a non-shareable (or non-mappable) file, which has the same effect as
> "kvm_gmem_fops.mmap = NULL".
>
> I may have missed other intentions for having this guard here.

You're right. This guard was needed in an earlier version, but not anymore.
I'll remove it.

> >   static struct file_operations kvm_gmem_fops = {
> > +	.mmap		= kvm_gmem_mmap,
> >   	.open		= generic_file_open,
> >   	.release	= kvm_gmem_release,
> >   	.fallocate	= kvm_gmem_fallocate,
> > @@ -463,6 +537,9 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
> >   	u64 flags = args->flags;
> >   	u64 valid_flags = 0;
> >
> > +	if (kvm_arch_supports_gmem_shared_mem(kvm))
> > +		valid_flags |= GUEST_MEMFD_FLAG_SUPPORT_SHARED;
> > +
> >   	if (flags & ~valid_flags)
> >   		return -EINVAL;
> >
> > @@ -501,6 +578,10 @@ int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
> >   	    offset + size > i_size_read(inode))
> >   		goto err;
> >
> > +	if (kvm_gmem_supports_shared(inode) &&
> > +	    !kvm_arch_supports_gmem_shared_mem(kvm))
> > +		goto err;
> > +
>
> This check looks unnecessary if I'm not missing anything. The file (inode) can't
> be created by kvm_gmem_create(GUEST_MEMFD_FLAG_SUPPORT_SHARED) on
> !kvm_arch_supports_gmem_shared_mem(). That means "kvm_gmem_supports_shared(inode) == true"
> implies "kvm_arch_supports_gmem_shared_mem(kvm) == true". In that case, we can
> never trip this check? :-)

You're right here as well. This check predates the flag, and I should have
removed it once the flag was added.

Consider it gone!

Thanks!
/fuad

> >   	filemap_invalidate_lock(inode->i_mapping);
> >
> >   	start = offset >> PAGE_SHIFT;
>
> Thanks,
> Gavin
>
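P.S. For anyone following along, here is a minimal, illustrative userspace sketch
(not part of the patch) of how a VMM might exercise the new flag: create a
guest_memfd with GUEST_MEMFD_FLAG_SUPPORT_SHARED on the VM fd and mmap() it with
MAP_SHARED, which is what kvm_gmem_mmap() above requires. It assumes a kernel
built with KVM_GMEM_SHARED_MEM and uapi headers carrying this series; the flag
is redefined locally only in case the installed headers predate it.

/*
 * Illustrative only: map a guest_memfd that allows host (shared) mappings.
 */
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

#ifndef GUEST_MEMFD_FLAG_SUPPORT_SHARED
#define GUEST_MEMFD_FLAG_SUPPORT_SHARED (1ULL << 0)
#endif

static void *map_guest_memfd_shared(int vm_fd, uint64_t size)
{
	struct kvm_create_guest_memfd gmem = {
		.size	= size,
		.flags	= GUEST_MEMFD_FLAG_SUPPORT_SHARED,
	};
	int gmem_fd;

	/* Fails with -EINVAL if the VM type doesn't allow shared guest_memfd. */
	gmem_fd = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);
	if (gmem_fd < 0)
		return MAP_FAILED;

	/* kvm_gmem_mmap() insists on a shared mapping (VM_SHARED | VM_MAYSHARE). */
	return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, gmem_fd, 0);
}

Faults on that mapping are then served by kvm_gmem_fault_shared(), which backs
each page with the corresponding guest_memfd folio.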