From: Binbin Wu <binbin.wu@linux.intel.com>
Date: Wed, 28 May 2025 15:01:31 +0800
Subject: Re: [RFC PATCH v2 05/51] KVM: guest_memfd: Skip LRU for guest_memfd folios
To: Ackerley Tng
Cc: kvm@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	x86@kernel.org, linux-fsdevel@vger.kernel.org, aik@amd.com,
	ajones@ventanamicro.com, akpm@linux-foundation.org, amoorthy@google.com,
	anthony.yznaga@oracle.com, anup@brainfault.org, aou@eecs.berkeley.edu,
	bfoster@redhat.com, brauner@kernel.org, catalin.marinas@arm.com,
	chao.p.peng@intel.com, chenhuacai@kernel.org, dave.hansen@intel.com,
	david@redhat.com, dmatlack@google.com, dwmw@amazon.co.uk,
	erdemaktas@google.com, fan.du@intel.com, fvdl@google.com, graf@amazon.com,
	haibo1.xu@intel.com, hch@infradead.org, hughd@google.com,
	ira.weiny@intel.com, isaku.yamahata@intel.com, jack@suse.cz,
	james.morse@arm.com, jarkko@kernel.org, jgg@ziepe.ca, jgowans@amazon.com,
	jhubbard@nvidia.com, jroedel@suse.de, jthoughton@google.com,
	jun.miao@intel.com, kai.huang@intel.com, keirf@google.com,
	kent.overstreet@linux.dev, kirill.shutemov@intel.com,
	liam.merwick@oracle.com, maciej.wieczor-retman@intel.com,
	mail@maciej.szmigiero.name, maz@kernel.org, mic@digikod.net,
	michael.roth@amd.com, mpe@ellerman.id.au, muchun.song@linux.dev,
	nikunj@amd.com, nsaenz@amazon.es, oliver.upton@linux.dev,
	palmer@dabbelt.com, pankaj.gupta@amd.com, paul.walmsley@sifive.com,
	pbonzini@redhat.com, pdurrant@amazon.co.uk, peterx@redhat.com,
	pgonda@google.com, pvorel@suse.cz, qperret@google.com,
	quic_cvanscha@quicinc.com, quic_eberman@quicinc.com,
	quic_mnalajal@quicinc.com, quic_pderrin@quicinc.com,
	quic_pheragu@quicinc.com, quic_svaddagi@quicinc.com,
	quic_tsoni@quicinc.com, richard.weiyang@gmail.com,
	rick.p.edgecombe@intel.com, rientjes@google.com, roypat@amazon.co.uk,
	rppt@kernel.org, seanjc@google.com, shuah@kernel.org,
	steven.price@arm.com, steven.sistare@oracle.com, suzuki.poulose@arm.com,
	tabba@google.com, thomas.lendacky@amd.com, usama.arif@bytedance.com,
	vannapurve@google.com, vbabka@suse.cz, viro@zeniv.linux.org.uk,
	vkuznets@redhat.com, wei.w.wang@intel.com, will@kernel.org,
	willy@infradead.org, xiaoyao.li@intel.com, yan.y.zhao@intel.com,
	yilun.xu@intel.com, yuzenghui@huawei.com, zhiquan1.li@intel.com
Message-ID: <21b9b151-6e4f-47b8-9c6b-73eeb0c20165@linux.intel.com>
In-Reply-To: <37f60bbd7d408cf6d421d0582462488262c720ab.1747264138.git.ackerleytng@google.com>
References: <37f60bbd7d408cf6d421d0582462488262c720ab.1747264138.git.ackerleytng@google.com>

On 5/15/2025 7:41 AM, Ackerley Tng wrote:
> filemap_add_folio(), called from filemap_grab_folio(), adds the folio
> onto some LRU list, which is not necessary for guest_memfd since
> guest_memfd folios don't participate in any swapping.
>
> This patch reimplements part of filemap_add_folio() to avoid adding
> allocated guest_memfd folios to the filemap.

filemap -> LRU list? (The folios are still added to the filemap; it is
the LRU list that is skipped.)

>
> With shared to private conversions dependent on refcounts, avoiding
> usage of LRU ensures that LRU lists no longer take any refcounts on
> guest_memfd folios and significantly reduces the chance of elevated
> refcounts during conversion.
>
> Signed-off-by: Ackerley Tng
> Change-Id: Ia2540d9fc132d46219e6e714fd42bc82a62a27fa
> ---
>  mm/filemap.c           |  1 +
>  mm/memcontrol.c        |  2 +
>  virt/kvm/guest_memfd.c | 91 ++++++++++++++++++++++++++++++++++++++----
>  3 files changed, 86 insertions(+), 8 deletions(-)
>
[...]
>  /*
>   * Returns a locked folio on success. The caller is responsible for
>   * setting the up-to-date flag before the memory is mapped into the guest.
> @@ -477,8 +509,46 @@ static int kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slot,
>   */
>  static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
>  {
> +	struct folio *folio;
> +	gfp_t gfp;
> +	int ret;
> +
> +repeat:
> +	folio = filemap_lock_folio(inode->i_mapping, index);
> +	if (!IS_ERR(folio))
> +		return folio;
> +
> +	gfp = mapping_gfp_mask(inode->i_mapping);
> +
>  	/* TODO: Support huge pages. */
> -	return filemap_grab_folio(inode->i_mapping, index);
> +	folio = filemap_alloc_folio(gfp, 0);
> +	if (!folio)
> +		return ERR_PTR(-ENOMEM);
> +
> +	ret = mem_cgroup_charge(folio, NULL, gfp);
> +	if (ret) {
> +		folio_put(folio);
> +		return ERR_PTR(ret);
> +	}
> +
> +	ret = kvm_gmem_filemap_add_folio(inode->i_mapping, folio, index);
> +	if (ret) {
> +		folio_put(folio);
> +
> +		/*
> +		 * There was a race, two threads tried to get a folio indexing
> +		 * to the same location in the filemap. The losing thread should
> +		 * free the allocated folio, then lock the folio added to the
> +		 * filemap by the winning thread.

How about changing "then lock the folio added to the filemap by the
winning thread" to "the winning thread locks the folio added to the
filemap"?

> +		 */
> +		if (ret == -EEXIST)
> +			goto repeat;
> +
> +		return ERR_PTR(ret);
> +	}
> +
> +	__folio_set_locked(folio);
> +	return folio;
>  }
>
>  static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
> @@ -956,23 +1026,28 @@ static int kvm_gmem_error_folio(struct address_space *mapping, struct folio *fol
>  }
>
>  #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE
> +static void kvm_gmem_invalidate(struct folio *folio)
> +{
> +	kvm_pfn_t pfn = folio_pfn(folio);
> +
> +	kvm_arch_gmem_invalidate(pfn, pfn + folio_nr_pages(folio));
> +}
> +#else
> +static inline void kvm_gmem_invalidate(struct folio *folio) {}

No need to tag a local static function with "inline". The compiler will
inline a trivial static function on its own, and kernel coding style
reserves "inline" for header files.

> +#endif
> +
[...]
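To illustrate the last comment, the #else stub would then be just:

	#ifdef CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE
	...
	#else
	static void kvm_gmem_invalidate(struct folio *folio) {}
	#endif

i.e. identical to the patch, minus the "inline" keyword.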
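And on the LRU point in the commit message, for readers who don't have
the full patch in front of them: the elided hunks introduce
kvm_gmem_filemap_add_folio(). A rough sketch of the idea follows. This
is illustrative only, not the patch's actual body; the locking around
the nrpages update and the NR_FILE_PAGES statistics accounting are
simplified away. The point is that it performs the xarray insertion
that filemap_add_folio() would do, but never calls folio_add_lru(), so
the LRU never holds a reference on a guest_memfd folio:

	/* Illustrative sketch only -- the patch's real helper is elided above. */
	static int kvm_gmem_filemap_add_folio(struct address_space *mapping,
					      struct folio *folio, pgoff_t index)
	{
		void *old;

		folio_get(folio);		/* reference held by the filemap */
		folio->mapping = mapping;
		folio->index = index;

		old = xa_cmpxchg_irq(&mapping->i_pages, index, NULL, folio,
				     GFP_KERNEL);
		if (xa_is_err(old) || old) {
			/* Lost the race: another thread installed a folio here. */
			folio->mapping = NULL;
			folio_put(folio);
			return xa_is_err(old) ? xa_err(old) : -EEXIST;
		}

		/*
		 * Simplified: real code updates nrpages under the xarray lock
		 * and adjusts NR_FILE_PAGES.
		 */
		mapping->nrpages++;

		/* Unlike filemap_add_folio(), deliberately no folio_add_lru(). */
		return 0;
	}

The -EEXIST return is what kvm_gmem_get_folio() above keys its
goto-repeat retry on: the losing thread drops its freshly allocated
folio and goes back to filemap_lock_folio() to pick up the winner's.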