Date: Mon, 20 Jan 2025 12:30:03 +0200
From: "Kirill A. Shutemov"
To: Fuad Tabba
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
	pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
	anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
	brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
	xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
	jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
	yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net,
	vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
	mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
	wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
	kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
	steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
	quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com,
	quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com,
	quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com,
	yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
	will@kernel.org, qperret@google.com, keirf@google.com,
	roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
	rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
	hughd@google.com, jthoughton@google.com
Subject: Re: [RFC PATCH v5 05/15] KVM: guest_memfd: Folio mappability states
	and functions that manage their transition
References: <20250117163001.2326672-1-tabba@google.com>
	<20250117163001.2326672-6-tabba@google.com>
In-Reply-To: <20250117163001.2326672-6-tabba@google.com>

On Fri, Jan 17, 2025 at 04:29:51PM +0000, Fuad Tabba wrote:
> +/*
> + * Marks the range [start, end) as not mappable by the host. If the host doesn't
> + * have any references to a particular folio, then that folio is marked as
> + * mappable by the guest.
> + *
> + * However, if the host still has references to the folio, then the folio is
> + * marked and not mappable by anyone. Marking it is not mappable allows it to
> + * drain all references from the host, and to ensure that the hypervisor does
> + * not transition the folio to private, since the host still might access it.
> + *
> + * Usually called when guest unshares memory with the host.
> + */
> +static int gmem_clear_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
> +{
> +	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
> +	void *xval_guest = xa_mk_value(KVM_GMEM_GUEST_MAPPABLE);
> +	void *xval_none = xa_mk_value(KVM_GMEM_NONE_MAPPABLE);
> +	pgoff_t i;
> +	int r = 0;
> +
> +	filemap_invalidate_lock(inode->i_mapping);
> +	for (i = start; i < end; i++) {
> +		struct folio *folio;
> +		int refcount = 0;
> +
> +		folio = filemap_lock_folio(inode->i_mapping, i);
> +		if (!IS_ERR(folio)) {
> +			refcount = folio_ref_count(folio);
> +		} else {
> +			r = PTR_ERR(folio);
> +			if (WARN_ON_ONCE(r != -ENOENT))
> +				break;
> +
> +			folio = NULL;
> +		}
> +
> +		/* +1 references are expected because of filemap_lock_folio(). */
> +		if (folio && refcount > folio_nr_pages(folio) + 1) {

Looks racy. What prevents anybody from obtaining a reference just after
the check? The folio lock doesn't stop a random filemap_get_entry() from
elevating the refcount.

folio_ref_freeze() might be required.

> +			/*
> +			 * Outstanding references, the folio cannot be faulted
> +			 * in by anyone until they're dropped.
> +			 */
> +			r = xa_err(xa_store(mappable_offsets, i, xval_none, GFP_KERNEL));
> +		} else {
> +			/*
> +			 * No outstanding references. Transition the folio to
> +			 * guest mappable immediately.
> +			 */
> +			r = xa_err(xa_store(mappable_offsets, i, xval_guest, GFP_KERNEL));
> +		}
> +
> +		if (folio) {
> +			folio_unlock(folio);
> +			folio_put(folio);
> +		}
> +
> +		if (WARN_ON_ONCE(r))
> +			break;
> +	}
> +	filemap_invalidate_unlock(inode->i_mapping);
> +
> +	return r;
> +}

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
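
For illustration only, the race pointed out in the review can be modelled
in userspace with C11 atomics. This is a rough sketch, not kernel code:
struct toy_ref, toy_try_get(), toy_freeze() and toy_unfreeze() are
hypothetical names invented here. They only mimic the general idea behind
folio_ref_freeze(), which is to swap the refcount from the expected value
to zero in one atomic step so that speculative getters fail until the
count is restored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio refcount; not a kernel type. */
struct toy_ref {
	atomic_int count;
};

/* Speculative get: takes a reference only while the count is non-zero. */
static bool toy_try_get(struct toy_ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;	/* frozen (or already freed): no new references */
}

/*
 * Freeze: succeeds only if the count is exactly 'expected', and in the
 * same atomic step sets it to 0. There is no window between the check
 * and the update in which toy_try_get() could slip in.
 */
static bool toy_freeze(struct toy_ref *r, int expected)
{
	return atomic_compare_exchange_strong(&r->count, &expected, 0);
}

/* Undo a successful freeze by restoring the saved count. */
static void toy_unfreeze(struct toy_ref *r, int count)
{
	atomic_store(&r->count, count);
}

/*
 * Contrast with the check-then-act pattern: a plain read of the count
 * says nothing about its value one instruction later.
 */
static bool toy_racy_check(struct toy_ref *r, int expected)
{
	return atomic_load(&r->count) <= expected;
}
```

With a plain read such as toy_racy_check(), another thread may call
toy_try_get() right after the check passes. With toy_freeze(), either the
count matched and is now zero, so later toy_try_get() calls fail, or the
freeze itself fails and the caller knows extra references exist.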