Date: Mon, 17 Mar 2025 14:43:28 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH v6 03/10] KVM: guest_memfd: Handle kvm_gmem_handle_folio_put() for KVM as a module
Content-Language: en-US
To: Ackerley Tng, Fuad Tabba
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org, pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au, anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk, brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org, xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com, jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com, isaku.yamahata@intel.com, mic@digikod.net, vannapurve@google.com, mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com, steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com, rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com, hughd@google.com, jthoughton@google.com, peterx@redhat.com
From: Vlastimil Babka
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 3/13/25 14:49, Ackerley Tng wrote:
> Fuad Tabba writes:
>
>> In some architectures, KVM could be defined as a module. If there is a
>> pending folio_put() while KVM is unloaded, the system could crash. By
>> having a helper check for that and call the function only if it's
>> available, we are able to handle that case more gracefully.
>>
>> Signed-off-by: Fuad Tabba
>>
>> ---
>>
>> This patch could be squashed with the previous one if the maintainers
>> think it would be better.
>> ---
>>  include/linux/kvm_host.h |  5 +----
>>  mm/swap.c                | 20 +++++++++++++++++++-
>>  virt/kvm/guest_memfd.c   |  8 ++++++++
>>  3 files changed, 28 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 7788e3625f6d..3ad0719bfc4f 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -2572,10 +2572,7 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
>>  #endif
>>
>>  #ifdef CONFIG_KVM_GMEM_SHARED_MEM
>> -static inline void kvm_gmem_handle_folio_put(struct folio *folio)
>> -{
>> -	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
>> -}
>> +void kvm_gmem_handle_folio_put(struct folio *folio);
>>  #endif
>>
>>  #endif
>> diff --git a/mm/swap.c b/mm/swap.c
>> index 241880a46358..27dfd75536c8 100644
>> --- a/mm/swap.c
>> +++ b/mm/swap.c
>> @@ -98,6 +98,24 @@ static void page_cache_release(struct folio *folio)
>>  		unlock_page_lruvec_irqrestore(lruvec, flags);
>>  }
>>
>> +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
>> +static void gmem_folio_put(struct folio *folio)
>> +{
>> +#if IS_MODULE(CONFIG_KVM)
>> +	void (*fn)(struct folio *folio);
>> +
>> +	fn = symbol_get(kvm_gmem_handle_folio_put);
>> +	if (WARN_ON_ONCE(!fn))
>> +		return;
>> +
>> +	fn(folio);
>> +	symbol_put(kvm_gmem_handle_folio_put);
>> +#else
>> +	kvm_gmem_handle_folio_put(folio);
>> +#endif
>> +}
>> +#endif

Yeah, this is not great. The vfio code isn't setting a good example to follow :(

> Sorry about the premature sending earlier!
>
> I was thinking about having a static function pointer in mm/swap.c that
> will be filled in when KVM is loaded and cleared when KVM is unloaded.
>
> One benefit I see is that it'll avoid the lookup that symbol_get() does
> on every folio_put(), but some other pinning on KVM would have to be
> done to prevent KVM from being unloaded in the middle of
> kvm_gmem_handle_folio_put() call.

Isn't there some "natural" dependency between things such that at the point
the KVM module is able to unload itself, no guest_memfd areas should exist
anymore, and thus no pages that would use this callback should exist either?
In that case a late call would mean there's a memory leak, so while we might
be trying to avoid calling a function that was unloaded, we don't need to try
as hard as symbol_get()/put() on every invocation, and a racy check would be
good enough?

Or would such a late folio_put() be legitimate because some short-lived
folio_get() from e.g. a pfn scanner could prolong the page's lifetime beyond
the KVM module? I'd hope that since you want to make pages PGTY_guestmem only
at certain points of their lifetime, this should not be possible?

> Do you/anyone else see pros/cons either way?
>
>> +
>>  static void free_typed_folio(struct folio *folio)
>>  {
>>  	switch (folio_get_type(folio)) {
>> @@ -108,7 +126,7 @@ static void free_typed_folio(struct folio *folio)
>>  #endif
>>  #ifdef CONFIG_KVM_GMEM_SHARED_MEM
>>  	case PGTY_guestmem:
>> -		kvm_gmem_handle_folio_put(folio);
>> +		gmem_folio_put(folio);
>>  		return;
>>  #endif
>>  	default:
>> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
>> index b2aa6bf24d3a..5fc414becae5 100644
>> --- a/virt/kvm/guest_memfd.c
>> +++ b/virt/kvm/guest_memfd.c
>> @@ -13,6 +13,14 @@ struct kvm_gmem {
>>  	struct list_head entry;
>>  };
>>
>> +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
>> +void kvm_gmem_handle_folio_put(struct folio *folio)
>> +{
>> +	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_gmem_handle_folio_put);
>> +#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
>> +
>>  /**
>>   * folio_file_pfn - like folio_file_page, but return a pfn.
>>   * @folio: The folio which contains this index.
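
For illustration only, the function-pointer alternative discussed above
(a slot that is filled in when the module loads, cleared when it unloads,
and checked on each put) could be modeled in plain userspace C roughly as
below. This is a sketch under assumptions, not kernel code and not anything
proposed in the series: gmem_put_hook, gmem_register_put_hook(),
gmem_unregister_put_hook(), gmem_folio_put_model() and
demo_handle_folio_put() are hypothetical names, struct folio is a stand-in,
and C11 atomics stand in for whatever primitives the kernel side would
actually use.

```c
/* Userspace model only. All names are hypothetical. */
#include <stdatomic.h>
#include <stddef.h>

struct folio { int id; };	/* stand-in for the kernel's struct folio */

typedef void (*gmem_put_fn)(struct folio *);

/* Slot owned by the "core" side; zero-initialized, i.e. no handler. */
static _Atomic(gmem_put_fn) gmem_put_hook;

/* Called by the "module" when it loads. */
static void gmem_register_put_hook(gmem_put_fn fn)
{
	atomic_store_explicit(&gmem_put_hook, fn, memory_order_release);
}

/* Called by the "module" when it unloads. */
static void gmem_unregister_put_hook(void)
{
	atomic_store_explicit(&gmem_put_hook, NULL, memory_order_release);
}

/*
 * Core-side dispatch: one pointer load instead of a symbol lookup.
 * Returns 1 if a handler ran, 0 if none was registered.
 * The check is racy on its own: nothing here prevents the handler's
 * owner from going away between the load and the call.
 */
static int gmem_folio_put_model(struct folio *folio)
{
	gmem_put_fn fn = atomic_load_explicit(&gmem_put_hook,
					      memory_order_acquire);

	if (!fn)
		return 0;
	fn(folio);
	return 1;
}

/* Example "module" handler that records the last folio it was given. */
static int last_put_id = -1;

static void demo_handle_folio_put(struct folio *folio)
{
	last_put_id = folio->id;
}
```

With no handler registered, gmem_folio_put_model() returns 0 and does
nothing; after gmem_register_put_hook(demo_handle_folio_put) it calls the
handler and returns 1; after gmem_unregister_put_hook() it returns 0 again.
The NULL check is only the "racy check" variant. If a late folio_put() can
legitimately happen, some form of pinning around the call would still be
needed, which this model deliberately leaves out.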