References: <20250303171013.3548775-3-tabba@google.com>
From: Fuad Tabba
Date: Mon, 10 Mar 2025 10:50:37 +0000
Subject: Re: [PATCH v5 2/9] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages
To: Ackerley Tng
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
 pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
 vannapurve@google.com, mail@maciej.szmigiero.name, david@redhat.com,
 michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com,
 isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com,
 suzuki.poulose@arm.com, steven.price@arm.com, quic_eberman@quicinc.com,
 quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
 quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
 catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
 oliver.upton@linux.dev, maz@kernel.org, will@kernel.org,
 qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
 shuah@kernel.org, hch@infradead.org, jgg@nvidia.com, rientjes@google.com,
 jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, peterx@redhat.com

Hi Ackerley,

On Fri, 7 Mar 2025 at 17:04, Ackerley Tng wrote:
>
> Fuad Tabba writes:
>
> > Before transitioning a guest_memfd folio to unshared, thereby
> > disallowing access by the host and allowing the hypervisor to
> > transition its view of the guest page as private, we need to be
> > sure that the host doesn't have any references to the folio.
> >
> > This patch introduces a new type for guest_memfd folios, which
> > isn't activated in this series but is here as a placeholder and
> > to facilitate the code in the subsequent patch series. This will
> > be used in the future to register a callback that informs the
> > guest_memfd subsystem when the last reference is dropped,
> > therefore knowing that the host doesn't have any remaining
> > references.
> >
> > This patch also introduces the configuration option,
> > KVM_GMEM_SHARED_MEM, which toggles support for mapping
> > guest_memfd shared memory at the host.
> >
> > Signed-off-by: Fuad Tabba
> > Acked-by: Vlastimil Babka
> > Acked-by: David Hildenbrand
> > ---
> >  include/linux/kvm_host.h   |  7 +++++++
> >  include/linux/page-flags.h | 16 ++++++++++++++++
> >  mm/debug.c                 |  1 +
> >  mm/swap.c                  |  9 +++++++++
> >  virt/kvm/Kconfig           |  5 +++++
> >  5 files changed, 38 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index f34f4cfaa513..7788e3625f6d 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -2571,4 +2571,11 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
> >                                      struct kvm_pre_fault_memory *range);
> >  #endif
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +static inline void kvm_gmem_handle_folio_put(struct folio *folio)
> > +{
> > +        WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
> > +}
> > +#endif
> > +
> >  #endif
>
> Following up with the discussion at the guest_memfd biweekly call on the
> guestmem library, I think this folio_put() handler for guest_memfd could
> be the first function that's refactored out into (placeholder name)
> mm/guestmem.c.
>
> This folio_put() handler has to stay in memory even after KVM (as a
> module) is unloaded from memory, and so it is a good candidate for the
> first function in the guestmem library.
>
> Along those lines, CONFIG_KVM_GMEM_SHARED_MEM in this patch can be
> renamed CONFIG_GUESTMEM, and CONFIG_GUESTMEM will guard the existence of
> PGTY_guestmem.
>
> CONFIG_KVM_GMEM_SHARED_MEM can be introduced in the next patch of this
> series, which could, in Kconfig, select CONFIG_GUESTMEM.
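
For illustration only, the final-put path discussed above can be modelled
in plain userspace C. This is a sketch under assumptions, not kernel code:
every model_* name is hypothetical, and the default branch simply marks the
folio as freed, whereas the quoted free_typed_folio() warns on unknown
types. What it tries to show is that the handler is reached from the core
put logic, which is why it would need to live in always-built-in code
rather than in a module.

```c
#include <assert.h>

/* Hypothetical model types; none of these are kernel APIs. */
enum model_pagetype {
	MODEL_PGTY_NONE = 0,
	MODEL_PGTY_GUESTMEM = 0xf8,
};

struct model_folio {
	int refcount;
	enum model_pagetype type;
	int returned_to_owner;	/* set when the guestmem handler ran */
	int freed;		/* set when the folio went back to the allocator */
};

/* Stand-in for a handler that lives in always-built-in code. */
static void model_guestmem_handle_folio_put(struct model_folio *folio)
{
	folio->returned_to_owner = 1;
}

/* Dispatch on the folio type once the last reference is gone. */
static void model_final_put(struct model_folio *folio)
{
	switch (folio->type) {
	case MODEL_PGTY_GUESTMEM:
		model_guestmem_handle_folio_put(folio);
		return;
	default:
		folio->freed = 1;
		return;
	}
}

/* Drop one reference; only the final put reaches the dispatcher. */
static void model_folio_put(struct model_folio *folio)
{
	assert(folio->refcount > 0);
	if (--folio->refcount == 0)
		model_final_put(folio);
}
```

In this model a guestmem-typed folio with two references is handed back to
its owner only on the second model_folio_put() call, and is never marked as
freed.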
>
> > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> > index 6dc2494bd002..daeee9a38e4c 100644
> > --- a/include/linux/page-flags.h
> > +++ b/include/linux/page-flags.h
> > @@ -933,6 +933,7 @@ enum pagetype {
> >          PGTY_slab = 0xf5,
> >          PGTY_zsmalloc = 0xf6,
> >          PGTY_unaccepted = 0xf7,
> > +        PGTY_guestmem = 0xf8,
> >
> >          PGTY_mapcount_underflow = 0xff
> >  };
> > @@ -1082,6 +1083,21 @@ FOLIO_TYPE_OPS(hugetlb, hugetlb)
> >  FOLIO_TEST_FLAG_FALSE(hugetlb)
> >  #endif
> >
> > +/*
> > + * guestmem folios are used to back VM memory as managed by guest_memfd. Once
> > + * the last reference is put, instead of freeing these folios back to the page
> > + * allocator, they are returned to guest_memfd.
> > + *
> > + * For now, guestmem will only be set on these folios as long as they cannot be
> > + * mapped to user space ("private state"), with the plan of always setting that
> > + * type once typed folios can be mapped to user space cleanly.
> > + */
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +FOLIO_TYPE_OPS(guestmem, guestmem)
> > +#else
> > +FOLIO_TEST_FLAG_FALSE(guestmem)
> > +#endif
> > +
> >  PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
> >
> >  /*
> > diff --git a/mm/debug.c b/mm/debug.c
> > index 8d2acf432385..08bc42c6cba8 100644
> > --- a/mm/debug.c
> > +++ b/mm/debug.c
> > @@ -56,6 +56,7 @@ static const char *page_type_names[] = {
> >          DEF_PAGETYPE_NAME(table),
> >          DEF_PAGETYPE_NAME(buddy),
> >          DEF_PAGETYPE_NAME(unaccepted),
> > +        DEF_PAGETYPE_NAME(guestmem),
> >  };
> >
> >  static const char *page_type_name(unsigned int page_type)
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 47bc1bb919cc..241880a46358 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -38,6 +38,10 @@
> >  #include
> >  #include
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +#include
> > +#endif
> > +
> >  #include "internal.h"
> >
> >  #define CREATE_TRACE_POINTS
> > @@ -101,6 +105,11 @@ static void free_typed_folio(struct folio *folio)
> >          case PGTY_hugetlb:
> >                  free_huge_folio(folio);
> >                  return;
> > +#endif
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +        case PGTY_guestmem:
> > +                kvm_gmem_handle_folio_put(folio);
> > +                return;
> >  #endif
> >          default:
> >                  WARN_ON_ONCE(1);
> > diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> > index 54e959e7d68f..37f7734cb10f 100644
> > --- a/virt/kvm/Kconfig
> > +++ b/virt/kvm/Kconfig
> > @@ -124,3 +124,8 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
> >  config HAVE_KVM_ARCH_GMEM_INVALIDATE
> >         bool
> >         depends on KVM_PRIVATE_MEM
> > +
> > +config KVM_GMEM_SHARED_MEM
> > +       select KVM_PRIVATE_MEM
> > +       depends on !KVM_GENERIC_MEMORY_ATTRIBUTES
>
> Enforcing that KVM_GENERIC_MEMORY_ATTRIBUTES is not selected should not
> be a strict requirement. Fuad explained in an offline chat that this is
> just temporary.
>
> If we have CONFIG_GUESTMEM, then this question is moot, I think
> CONFIG_GUESTMEM would just be independent of everything else; other
> configs would depend on CONFIG_GUESTMEM.

There are two things here. First of all, the unfortunate naming
situation where PRIVATE could mean GUESTMEM, or private could mean
not shared. I plan to tackle this aspect (i.e., the naming) in a
separate patch series, since that will surely generate a lot of
debate :)

The other part is that, with shared memory in place, the memory
attributes are an orthogonal matter. The attributes are the
userspace's view of what it expects the state of the memory to be,
and are used to multiplex whether the memory being accessed is
guest_memfd or the regular (i.e., most likely anonymous) memory used
normally by KVM. This behavior, however, would be
architecture-specific, or even VM-type-specific.

Cheers,
/fuad

> > +       bool
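
To make the multiplexing point concrete, here is a small userspace C
sketch, again under assumptions: the model_* names and the
attrs_select_backing flag are hypothetical and stand in for whatever an
architecture or VM type would actually decide. It is not KVM code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of where an access to guest memory is served from. */
enum model_backing {
	MODEL_BACKING_ANON,		/* regular (most likely anonymous) memory */
	MODEL_BACKING_GUEST_MEMFD,	/* memory provided by guest_memfd */
};

struct model_vm {
	/*
	 * Assumed per-VM-type policy: true if the userspace-set attribute
	 * chooses the backing, false if guest_memfd serves every access.
	 */
	bool attrs_select_backing;
};

/* Pick the backing for one access, given the userspace "private" attribute. */
static enum model_backing model_pick_backing(const struct model_vm *vm,
					     bool attr_private)
{
	assert(vm);
	if (!vm->attrs_select_backing)
		return MODEL_BACKING_GUEST_MEMFD;
	return attr_private ? MODEL_BACKING_GUEST_MEMFD : MODEL_BACKING_ANON;
}
```

With the flag set, the attribute alone decides between the two backings;
with it clear, the attribute has no effect on where the access is served
from, which is the sense in which the two are orthogonal in this model.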