From: Fuad Tabba
Date: Thu, 13 Feb 2025 08:29:20 +0000
Subject: Re: [PATCH v3 02/11] KVM: guest_memfd: Handle final folio_put() of guest_memfd pages
To: Peter Xu
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
 pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net,
 vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
 mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
 wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com,
 quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com,
 quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com,
 yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
 will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
 hughd@google.com, jthoughton@google.com
References: <20250211121128.703390-1-tabba@google.com> <20250211121128.703390-3-tabba@google.com>

Hi Peter,

On Wed, 12 Feb 2025 at 18:19, Peter Xu wrote:
>
> On Tue, Feb 11, 2025 at 12:11:18PM +0000, Fuad Tabba wrote:
> > Before transitioning a guest_memfd folio to unshared, thereby
> > disallowing access by the host and allowing the hypervisor to
> > transition its view of the guest page as private, we need to be
> > sure that the host doesn't have any references to the folio.
> >
> > This patch introduces a new type for guest_memfd folios, which
> > isn't activated in this series but is here as a placeholder and
> > to facilitate the code in the next patch. This will be used in
> > the future to register a callback that informs the guest_memfd
> > subsystem when the last reference is dropped, therefore knowing
> > that the host doesn't have any remaining references.
> >
> > Signed-off-by: Fuad Tabba
> > ---
> >  include/linux/kvm_host.h   |  9 +++++++++
> >  include/linux/page-flags.h | 17 +++++++++++++++++
> >  mm/debug.c                 |  1 +
> >  mm/swap.c                  |  9 +++++++++
> >  virt/kvm/guest_memfd.c     |  7 +++++++
> >  5 files changed, 43 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index f34f4cfaa513..8b5f28f6efff 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -2571,4 +2571,13 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
> >  				    struct kvm_pre_fault_memory *range);
> >  #endif
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +void kvm_gmem_handle_folio_put(struct folio *folio);
> > +#else
> > +static inline void kvm_gmem_handle_folio_put(struct folio *folio)
> > +{
> > +	WARN_ON_ONCE(1);
> > +}
> > +#endif
> > +
> >  #endif
> > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> > index 6dc2494bd002..734afda268ab 100644
> > --- a/include/linux/page-flags.h
> > +++ b/include/linux/page-flags.h
> > @@ -933,6 +933,17 @@ enum pagetype {
> >  	PGTY_slab		= 0xf5,
> >  	PGTY_zsmalloc		= 0xf6,
> >  	PGTY_unaccepted		= 0xf7,
> > +	/*
> > +	 * guestmem folios are used to back VM memory as managed by guest_memfd.
> > +	 * Once the last reference is put, instead of freeing these folios back
> > +	 * to the page allocator, they are returned to guest_memfd.
> > +	 *
> > +	 * For now, guestmem will only be set on these folios as long as they
> > +	 * cannot be mapped to user space ("private state"), with the plan of
> > +	 * always setting that type once typed folios can be mapped to user
> > +	 * space cleanly.
>
> Does it imply that gmem folios can be mapped to userspace at some point?
> It'll be great if you can share more about it, since as of now it looks
> like anything has a page type cannot use the per-page mapcount.

This is the goal of this series. By the end of this series, you can map
gmem folios, as long as they belong to a VM type that allows it.
My other series, which will be rebased on this one, adds the distinction
of memory shared with the host vs memory private to the guest:

https://lore.kernel.org/all/20250117163001.2326672-1-tabba@google.com/

That series deals with the mapcount issue by only applying the type
once the mapcount is 0. We talked about this in the guest_memfd mm
sync; David Hildenbrand mentioned ongoing work to remove the overlaying
of the type with the mapcount. That should solve the problem
completely.

> When looking at this, I also found that __folio_rmap_sanity_checks() has
> some folio_test_hugetlb() tests, not sure whether they're prone to be
> changed too e.g. to cover all pages that have a type, so as to cover gmem.
>
> For the longer term, it'll be definitely nice if gmem folios can be
> mapcounted just like normal file folios. It can enable gmem as a backstore
> just like what normal memfd would do, with gmem managing the folios.

That's the plan, I agree.

> > +	 */
> > +	PGTY_guestmem		= 0xf8,
> >
> >  	PGTY_mapcount_underflow = 0xff
> >  };
> > @@ -1082,6 +1093,12 @@ FOLIO_TYPE_OPS(hugetlb, hugetlb)
> >  FOLIO_TEST_FLAG_FALSE(hugetlb)
> >  #endif
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
>
> This seems to only be defined in follow up patches.. so may need some
> adjustments.

It's a configuration option. If you like, I could bring forward the
patch that adds it to the Kconfig file.
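
For anyone following along, a rough userspace sketch of the two ideas
discussed above (dispatching on the folio type at the final put, and
only applying the type while the mapcount is 0) might look like the
following. It is a toy model, not code from either series: every toy_*
name is made up for illustration, and the real kernel paths are more
involved.

```c
/* Userspace toy model. None of the toy_* names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>

enum toy_type { TOY_NONE, TOY_GUESTMEM };

struct toy_folio {
	int refcount;		/* stands in for the folio reference count */
	int mapcount;		/* stands in for the userspace mapping count */
	enum toy_type type;	/* stands in for the page type */
	bool freed;		/* handed back to the "page allocator" */
	bool returned;		/* handed back to the "guest_memfd" owner */
};

/* Apply the type only while nothing maps the folio (mapcount == 0). */
static bool toy_set_guestmem(struct toy_folio *f)
{
	if (f->mapcount != 0)
		return false;
	f->type = TOY_GUESTMEM;
	return true;
}

/* Hypothetical owner callback, run once the last reference is gone. */
static void toy_gmem_handle_put(struct toy_folio *f)
{
	f->returned = true;
}

static void toy_get(struct toy_folio *f)
{
	f->refcount++;
}

/* Drop a reference; on the final put, dispatch on the type. */
static void toy_put(struct toy_folio *f)
{
	assert(f->refcount > 0);
	if (--f->refcount > 0)
		return;
	if (f->type == TOY_GUESTMEM)
		toy_gmem_handle_put(f);	/* owner keeps the folio */
	else
		f->freed = true;	/* ordinary free path */
}
```

The point of the model is that the owner callback only runs once no
other reference exists, which is the condition needed before a folio
can safely be transitioned to private.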

Thank you,
/fuad

> > +FOLIO_TYPE_OPS(guestmem, guestmem)
> > +#else
> > +FOLIO_TEST_FLAG_FALSE(guestmem)
> > +#endif
> > +
> >  PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
> >
> >  /*
> > diff --git a/mm/debug.c b/mm/debug.c
> > index 8d2acf432385..08bc42c6cba8 100644
> > --- a/mm/debug.c
> > +++ b/mm/debug.c
> > @@ -56,6 +56,7 @@ static const char *page_type_names[] = {
> >  	DEF_PAGETYPE_NAME(table),
> >  	DEF_PAGETYPE_NAME(buddy),
> >  	DEF_PAGETYPE_NAME(unaccepted),
> > +	DEF_PAGETYPE_NAME(guestmem),
> >  };
> >
> >  static const char *page_type_name(unsigned int page_type)
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 47bc1bb919cc..241880a46358 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -38,6 +38,10 @@
> >  #include
> >  #include
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +#include
> > +#endif
> > +
> >  #include "internal.h"
> >
> >  #define CREATE_TRACE_POINTS
> > @@ -101,6 +105,11 @@ static void free_typed_folio(struct folio *folio)
> >  	case PGTY_hugetlb:
> >  		free_huge_folio(folio);
> >  		return;
> > +#endif
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +	case PGTY_guestmem:
> > +		kvm_gmem_handle_folio_put(folio);
> > +		return;
> >  #endif
> >  	default:
> >  		WARN_ON_ONCE(1);
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index b2aa6bf24d3a..c6f6792bec2a 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -312,6 +312,13 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
> >  	return gfn - slot->base_gfn + slot->gmem.pgoff;
> >  }
> >
> > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > +void kvm_gmem_handle_folio_put(struct folio *folio)
> > +{
> > +	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
> > +}
> > +#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
> > +
> >  static struct file_operations kvm_gmem_fops = {
> >  	.open		= generic_file_open,
> >  	.release	= kvm_gmem_release,
> > --
> > 2.48.1.502.g6dc24dfdaf-goog
> >
> >
>
> --
> Peter Xu
>