From: Fuad Tabba <tabba@google.com>
Date: Thu, 20 Feb 2025 17:10:20 +0000
Subject: Re: [PATCH v4 03/10] KVM: guest_memfd: Allow host to map guest_memfd() pages
To: David Hildenbrand
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
        pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
        anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
        aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
        brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
        xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
        jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
        isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
        vannapurve@google.com, ackerleytng@google.com,
        mail@maciej.szmigiero.name, michael.roth@amd.com, wei.w.wang@intel.com,
        liam.merwick@oracle.com, isaku.yamahata@gmail.com,
        kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
        steven.price@arm.com, quic_eberman@quicinc.com,
        quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
        quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
        quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
        catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
        oliver.upton@linux.dev, maz@kernel.org, will@kernel.org,
        qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
        shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
        rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
        hughd@google.com, jthoughton@google.com
In-Reply-To: <69467908-17a5-4700-b5da-efc0446b8663@redhat.com>
References: <20250218172500.807733-1-tabba@google.com>
        <20250218172500.807733-4-tabba@google.com>
        <2cdc5414-a280-4c47-86d5-4261a12deab6@redhat.com>
        <69467908-17a5-4700-b5da-efc0446b8663@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
On Thu, 20 Feb 2025 at 15:58, David Hildenbrand wrote:
>
> On 20.02.25 16:45, Fuad Tabba wrote:
> > Hi David,
> >
> > On Thu, 20 Feb 2025 at 12:04, Fuad Tabba wrote:
> >>
> >> On Thu, 20 Feb 2025 at 11:58, David Hildenbrand wrote:
> >>>
> >>> On 18.02.25 18:24, Fuad Tabba wrote:
> >>>> Add support for mmap() and fault() for guest_memfd backed memory
> >>>> in the host for VMs that support in-place conversion between
> >>>> shared and private. To that end, this patch adds the ability to
> >>>> check whether the VM type supports in-place conversion, and only
> >>>> allows mapping its memory if that's the case.
> >>>>
> >>>> This behavior is also gated by the configuration option
> >>>> KVM_GMEM_SHARED_MEM.
> >>>>
> >>>> Signed-off-by: Fuad Tabba
> >>>> ---
> >>>>   include/linux/kvm_host.h |  11 +++++
> >>>>   virt/kvm/guest_memfd.c   | 103 +++++++++++++++++++++++++++++++++++++++
> >>>>   2 files changed, 114 insertions(+)
> >>>>
> >>>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> >>>> index 3ad0719bfc4f..f9e8b10a4b09 100644
> >>>> --- a/include/linux/kvm_host.h
> >>>> +++ b/include/linux/kvm_host.h
> >>>> @@ -728,6 +728,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
> >>>>   }
> >>>>   #endif
> >>>>
> >>>> +/*
> >>>> + * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
> >>>> + * private memory is enabled and it supports in-place shared/private conversion.
> >>>> + */
> >>>> +#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
> >>>> +static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
> >>>> +{
> >>>> +	return false;
> >>>> +}
> >>>> +#endif
> >>>> +
> >>>>   #ifndef kvm_arch_has_readonly_mem
> >>>>   static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
> >>>>   {
> >>>> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> >>>> index c6f6792bec2a..30b47ff0e6d2 100644
> >>>> --- a/virt/kvm/guest_memfd.c
> >>>> +++ b/virt/kvm/guest_memfd.c
> >>>> @@ -317,9 +317,112 @@ void kvm_gmem_handle_folio_put(struct folio *folio)
> >>>>   {
> >>>>   	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
> >>>>   }
> >>>> +
> >>>> +static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
> >>>> +{
> >>>> +	struct kvm_gmem *gmem = file->private_data;
> >>>> +
> >>>> +	/* For now, VMs that support shared memory share all their memory. */
> >>>> +	return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
> >>>> +}
> >>>> +
> >>>> +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
> >>>> +{
> >>>> +	struct inode *inode = file_inode(vmf->vma->vm_file);
> >>>> +	struct folio *folio;
> >>>> +	vm_fault_t ret = VM_FAULT_LOCKED;
> >>>> +
> >>>> +	filemap_invalidate_lock_shared(inode->i_mapping);
> >>>> +
> >>>> +	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> >>>> +	if (IS_ERR(folio)) {
> >>>> +		switch (PTR_ERR(folio)) {
> >>>> +		case -EAGAIN:
> >>>> +			ret = VM_FAULT_RETRY;
> >>>> +			break;
> >>>> +		case -ENOMEM:
> >>>> +			ret = VM_FAULT_OOM;
> >>>> +			break;
> >>>> +		default:
> >>>> +			ret = VM_FAULT_SIGBUS;
> >>>> +			break;
> >>>> +		}
> >>>> +		goto out_filemap;
> >>>> +	}
> >>>> +
> >>>> +	if (folio_test_hwpoison(folio)) {
> >>>> +		ret = VM_FAULT_HWPOISON;
> >>>> +		goto out_folio;
> >>>> +	}
> >>>> +
> >>>> +	/* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
> >>>> +	if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
> >>>> +		ret = VM_FAULT_SIGBUS;
> >>>> +		goto out_folio;
> >>>> +	}
> >>>> +
> >>>> +	/*
> >>>> +	 * Only private folios are marked as "guestmem" so far, and we never
> >>>> +	 * expect private folios at this point.
> >>>> +	 */
> >>>> +	if (WARN_ON_ONCE(folio_test_guestmem(folio))) {
> >>>> +		ret = VM_FAULT_SIGBUS;
> >>>> +		goto out_folio;
> >>>> +	}
> >>>> +
> >>>> +	/* No support for huge pages. */
> >>>> +	if (WARN_ON_ONCE(folio_test_large(folio))) {
> >>>> +		ret = VM_FAULT_SIGBUS;
> >>>> +		goto out_folio;
> >>>> +	}
> >>>> +
> >>>> +	if (!folio_test_uptodate(folio)) {
> >>>> +		clear_highpage(folio_page(folio, 0));
> >>>> +		kvm_gmem_mark_prepared(folio);
> >>>> +	}
> >>>
> >>> kvm_gmem_get_pfn()->__kvm_gmem_get_pfn() seems to call
> >>> kvm_gmem_prepare_folio() instead.
> >>>
> >>> Could we do the same here?
> >>
> >> Will do.
> >
> > I realized it's not that straightforward. __kvm_gmem_prepare_folio()
> > requires the kvm_memory_slot, which is used to calculate the gfn. At
> > that point we have neither, and it's not just an issue of access:
> > there might not be a slot associated with that memory yet.
>
> Hmm, right ... I wonder if that might be problematic. I assume no
> memslot == no memory attribute telling us if it is private or shared,
> at least for now?
>
> Once guest_memfd maintains that state, it might be "cleaner"? What's
> your thought?

The idea is that whether memory is shared or private is determined not
by the guest_memfd's memory attributes, but by the new state added in
the other patch series. That state is independent of memslots and guest
addresses altogether.

One scenario you can imagine is the host wanting to fault in memory to
initialize it before associating it with a memslot. We could make it a
requirement that pages cannot be faulted in unless they are associated
with a memslot, but that might be too restrictive.

Cheers,
/fuad

> --
> Cheers,
>
> David / dhildenb
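[Editor's note: the decoupling described above — shareability tracked per
guest_memfd offset, with no reference to any memslot or gfn — can be
sketched in miniature as a userspace model. This is purely illustrative,
not kernel code; the names (`gmem_state`, `gmem_set_shared`,
`gmem_offset_is_shared`) and the single-word bitmap are hypothetical
simplifications of the state the other patch series would maintain.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model: one shareability bit per page offset within a guest_memfd
 * file, capped at 64 pages so a single bitmap word suffices.
 */
#define GMEM_NR_PAGES 64ULL

struct gmem_state {
	unsigned long long shared_bitmap; /* bit i set => offset i is shared */
};

/* In-place conversion: flip an offset between shared and private. */
static void gmem_set_shared(struct gmem_state *s, size_t index, bool shared)
{
	assert(index < GMEM_NR_PAGES);
	if (shared)
		s->shared_bitmap |= 1ULL << index;
	else
		s->shared_bitmap &= ~(1ULL << index);
}

/*
 * The fault-time check: it consults only file-local state, so the host
 * can fault a page in before any memslot binds this range — the scenario
 * Fuad describes (initializing memory prior to memslot association).
 */
static bool gmem_offset_is_shared(const struct gmem_state *s, size_t index)
{
	assert(index < GMEM_NR_PAGES);
	return (s->shared_bitmap >> index) & 1;
}
```

The key design point the sketch captures is that `gmem_offset_is_shared()`
takes no memslot and no gfn: a fault handler built on it never needs the
lookup that makes `__kvm_gmem_prepare_folio()` unusable in this path.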