From: Binbin Wu
Date: Wed, 28 May 2025 11:16:15 +0800
Subject: Re: [RFC PATCH v2 04/51] KVM: guest_memfd: Introduce KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls
To: Ackerley Tng
Cc: kvm@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 x86@kernel.org, linux-fsdevel@vger.kernel.org, aik@amd.com,
 ajones@ventanamicro.com, akpm@linux-foundation.org, amoorthy@google.com,
 anthony.yznaga@oracle.com, anup@brainfault.org, aou@eecs.berkeley.edu,
 bfoster@redhat.com, brauner@kernel.org, catalin.marinas@arm.com,
 chao.p.peng@intel.com, chenhuacai@kernel.org, dave.hansen@intel.com,
 david@redhat.com, dmatlack@google.com, dwmw@amazon.co.uk,
 erdemaktas@google.com, fan.du@intel.com, fvdl@google.com, graf@amazon.com,
 haibo1.xu@intel.com, hch@infradead.org, hughd@google.com,
 ira.weiny@intel.com, isaku.yamahata@intel.com, jack@suse.cz,
 james.morse@arm.com, jarkko@kernel.org, jgg@ziepe.ca, jgowans@amazon.com,
 jhubbard@nvidia.com, jroedel@suse.de, jthoughton@google.com,
 jun.miao@intel.com, kai.huang@intel.com, keirf@google.com,
 kent.overstreet@linux.dev, kirill.shutemov@intel.com,
 liam.merwick@oracle.com, maciej.wieczor-retman@intel.com,
 mail@maciej.szmigiero.name, maz@kernel.org, mic@digikod.net,
 michael.roth@amd.com, mpe@ellerman.id.au, muchun.song@linux.dev,
 nikunj@amd.com, nsaenz@amazon.es, oliver.upton@linux.dev,
 palmer@dabbelt.com, pankaj.gupta@amd.com, paul.walmsley@sifive.com,
 pbonzini@redhat.com, pdurrant@amazon.co.uk, peterx@redhat.com,
 pgonda@google.com, pvorel@suse.cz, qperret@google.com,
 quic_cvanscha@quicinc.com, quic_eberman@quicinc.com,
 quic_mnalajal@quicinc.com, quic_pderrin@quicinc.com,
 quic_pheragu@quicinc.com, quic_svaddagi@quicinc.com,
 quic_tsoni@quicinc.com, richard.weiyang@gmail.com,
 rick.p.edgecombe@intel.com, rientjes@google.com, roypat@amazon.co.uk,
 rppt@kernel.org, seanjc@google.com, shuah@kernel.org, steven.price@arm.com,
 steven.sistare@oracle.com, suzuki.poulose@arm.com, tabba@google.com,
 thomas.lendacky@amd.com, usama.arif@bytedance.com, vannapurve@google.com,
 vbabka@suse.cz, viro@zeniv.linux.org.uk, vkuznets@redhat.com,
 wei.w.wang@intel.com, will@kernel.org, willy@infradead.org,
 xiaoyao.li@intel.com, yan.y.zhao@intel.com, yilun.xu@intel.com,
 yuzenghui@huawei.com, zhiquan1.li@intel.com

On 5/15/2025 7:41 AM, Ackerley Tng wrote:
[...]
> +
> +static int kvm_gmem_convert_range(struct file *file, pgoff_t start,
> +				  size_t nr_pages, bool shared,
> +				  pgoff_t *error_index)
> +{
> +	struct conversion_work *work, *tmp, *rollback_stop_item;
> +	LIST_HEAD(work_list);
> +	struct inode *inode;
> +	enum shareability m;
> +	int ret;
> +
> +	inode = file_inode(file);
> +
> +	filemap_invalidate_lock(inode->i_mapping);
> +
> +	m = shared ? SHAREABILITY_ALL : SHAREABILITY_GUEST;
> +	ret = kvm_gmem_convert_compute_work(inode, start, nr_pages, m, &work_list);
> +	if (ret || list_empty(&work_list))
> +		goto out;
> +
> +	list_for_each_entry(work, &work_list, list)
> +		kvm_gmem_convert_invalidate_begin(inode, work);
> +
> +	list_for_each_entry(work, &work_list, list) {
> +		ret = kvm_gmem_convert_should_proceed(inode, work, shared,
> +						      error_index);

Since kvm_gmem_invalidate_begin() now handles shared memory as well,
kvm_gmem_convert_invalidate_begin() will zap the page table.
The shared mapping can therefore be zapped in
kvm_gmem_convert_invalidate_begin() even when
kvm_gmem_convert_should_proceed() returns an error.
The sequence is a bit confusing to me, at least in this patch so far.

> +		if (ret)
> +			goto invalidate_end;
> +	}
> +
> +	list_for_each_entry(work, &work_list, list) {
> +		rollback_stop_item = work;
> +		ret = kvm_gmem_shareability_apply(inode, work, m);
> +		if (ret)
> +			break;
> +	}
> +
> +	if (ret) {
> +		m = shared ? SHAREABILITY_GUEST : SHAREABILITY_ALL;
> +		list_for_each_entry(work, &work_list, list) {
> +			if (work == rollback_stop_item)
> +				break;
> +
> +			WARN_ON(kvm_gmem_shareability_apply(inode, work, m));
> +		}
> +	}
> +
> +invalidate_end:
> +	list_for_each_entry(work, &work_list, list)
> +		kvm_gmem_convert_invalidate_end(inode, work);
> +out:
> +	filemap_invalidate_unlock(inode->i_mapping);
> +
> +	list_for_each_entry_safe(work, tmp, &work_list, list) {
> +		list_del(&work->list);
> +		kfree(work);
> +	}
> +
> +	return ret;
> +}
> +
[...]
> @@ -186,15 +490,26 @@ static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
>  	unsigned long index;
>
>  	xa_for_each_range(&gmem->bindings, index, slot, start, end - 1) {
> +		enum kvm_gfn_range_filter filter;
>  		pgoff_t pgoff = slot->gmem.pgoff;
>
> +		filter = KVM_FILTER_PRIVATE;
> +		if (kvm_gmem_memslot_supports_shared(slot)) {
> +			/*
> +			 * Unmapping would also cause invalidation, but cannot
> +			 * rely on mmu_notifiers to do invalidation via
> +			 * unmapping, since memory may not be mapped to
> +			 * userspace.
> +			 */
> +			filter |= KVM_FILTER_SHARED;
> +		}
> +
>  		struct kvm_gfn_range gfn_range = {
>  			.start = slot->base_gfn + max(pgoff, start) - pgoff,
>  			.end = slot->base_gfn + min(pgoff + slot->npages, end) - pgoff,
>  			.slot = slot,
>  			.may_block = true,
> -			/* guest memfd is relevant to only private mappings. */
> -			.attr_filter = KVM_FILTER_PRIVATE,
> +			.attr_filter = filter,
>  		};
>
>  		if (!found_memslot) {
> @@ -484,11 +799,49 @@ EXPORT_SYMBOL_GPL(kvm_gmem_memslot_supports_shared);
>  #define kvm_gmem_mmap NULL
>  #endif /* CONFIG_KVM_GMEM_SHARED_MEM */
>
[...]
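
To make the ordering concern raised above (at the
kvm_gmem_convert_should_proceed() call) concrete, here is a minimal
user-space model of the zap-then-check sequence. It is only a sketch, not
kernel code: struct range_model and every model_*() helper are hypothetical
names invented for illustration, and the "extra references" condition is an
assumed stand-in for whatever makes kvm_gmem_convert_should_proceed() fail.
It does not model locking, or any reason the real code may have for zapping
before checking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one work item; NOT struct conversion_work. */
struct range_model {
	bool mapped;     /* models a shared mapping present in the page table */
	int extra_refs;  /* assumed condition that makes the check fail */
};

/* Models kvm_gmem_convert_invalidate_begin(): zaps unconditionally. */
static void model_invalidate_begin(struct range_model *r)
{
	r->mapped = false;
}

/* Models kvm_gmem_convert_should_proceed(): fails if references remain. */
static int model_should_proceed(const struct range_model *r, size_t idx,
				size_t *error_index)
{
	assert(error_index != NULL);

	if (r->extra_refs > 0) {
		*error_index = idx;
		return -1;	/* stands in for a negative errno */
	}
	return 0;
}

/* Same order as the quoted patch: zap every range, then check every range. */
static int model_convert_zap_then_check(struct range_model *ranges, size_t n,
					size_t *error_index)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++)
		model_invalidate_begin(&ranges[i]);

	for (i = 0; i < n; i++) {
		ret = model_should_proceed(&ranges[i], i, error_index);
		if (ret)
			return ret;	/* mappings are already gone here */
	}
	return 0;
}

/* Counts the ranges whose mapping survived the conversion attempt. */
static size_t model_count_mapped(const struct range_model *ranges, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (ranges[i].mapped)
			count++;
	return count;
}
```

With two mapped ranges where only the second has extra_refs set,
model_convert_zap_then_check() fails and reports index 1, yet
model_count_mapped() afterwards returns 0: both mappings were zapped even
though the conversion did not happen. That is the behaviour the comment
above finds confusing.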