Date: Thu, 10 Oct 2024 17:36:34 +0300
From: "Kirill A. Shutemov"
To: Fuad Tabba
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
	pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
	anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
	brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
	xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
	jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
	yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net,
	vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
	mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
	wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
	kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
	steven.price@arm.com, quic_eberman@quicinc.com,
	quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
	quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
	quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
	catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
	oliver.upton@linux.dev, maz@kernel.org, will@kernel.org,
	qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
	shuah@kernel.org, hch@infradead.org, jgg@nvidia.com, rientjes@google.com,
	jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
	jthoughton@google.com
Subject: Re: [PATCH v3 04/11] KVM: guest_memfd: Allow host to mmap guest_memfd() pages when shared
References: <20241010085930.1546800-1-tabba@google.com> <20241010085930.1546800-5-tabba@google.com>

On Thu, Oct 10, 2024 at 03:28:38PM +0100, Fuad Tabba wrote:
> On Thu, 10 Oct 2024 at 13:21, Kirill A. Shutemov wrote:
> >
> > On Thu, Oct 10, 2024 at 11:23:55AM +0100, Fuad Tabba wrote:
> > > Hi Kirill,
> > >
> > > On Thu, 10 Oct 2024 at 11:14, Kirill A. Shutemov wrote:
> > > >
> > > > On Thu, Oct 10, 2024 at 09:59:23AM +0100, Fuad Tabba wrote:
> > > > > +out:
> > > > > +	if (ret != VM_FAULT_LOCKED) {
> > > > > +		folio_put(folio);
> > > > > +		folio_unlock(folio);
> > > >
> > > > Hm. Here and in a few other places you return the reference before
> > > > unlocking.
> > > >
> > > > I think it is safe because nobody can (or can they?) remove the page from
> > > > pagecache while the page is locked so we have at least one refcount on the
> > > > folio, but it *looks* like a use-after-free bug.
> > > >
> > > > Please follow the usual pattern: _unlock() then _put().
> > >
> > > That is deliberate, since these patches rely on the refcount to check
> > > whether the host has any mappings, and the folio lock in order not to
> > > race. It's not that it's not safe to decrement the refcount after
> > > unlocking, but by doing that I cannot rely on the folio lock to ensure
> > > that there aren't any races between the code added to check whether a
> > > folio is mappable, and the code that checks whether the refcount is
> > > safe. It's a tiny window, but it's there.
> > >
> > > What do you think?
> >
> > I don't think your scheme is race-free either. gmem_clear_mappable() is
> > going to fail with -EPERM if there's any transient pin on the page. For
> > instance from any physical memory scanner.
>
> I remember discussing this before. One question that I have is, is it
> possible to get a transient pin while the folio lock is held, or would
> that have happened before taking the lock?

Yes, it is possible. The normal pattern is to get the pin on the page
before attempting to lock it, so holding the folio lock does not keep new
pins away.

In case of physical scanners it happens like this:

1. pfn_to_page()/pfn_folio()
2. get_page_unless_zero()/folio_get_nontail_page()
3. lock_page()/folio_lock() if needed

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
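
The scanner sequence above can be replayed in a small userspace model. This
is only a sketch under stated assumptions: struct toy_folio and every toy_*
helper are hypothetical stand-ins built on C11 atomics, not kernel API, and
the interleaving is replayed in a single thread rather than raced. It shows
that taking a reference never consults the lock, so a refcount check done
under the lock can still observe a transient pin.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a folio: a refcount plus a lock bit. */
struct toy_folio {
	atomic_int refcount;
	atomic_bool locked;
};

/* Scanner step 2: take a reference unless the count is already zero. */
static bool toy_get_unless_zero(struct toy_folio *f)
{
	int old = atomic_load(&f->refcount);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken with toy_get_unless_zero(). */
static void toy_put(struct toy_folio *f)
{
	atomic_fetch_sub(&f->refcount, 1);
}

/* Scanner step 3: try to take the lock; succeeds only if it was free. */
static bool toy_trylock(struct toy_folio *f)
{
	return !atomic_exchange(&f->locked, true);
}

static void toy_unlock(struct toy_folio *f)
{
	atomic_store(&f->locked, false);
}

/* Models a lock holder checking that only the expected references exist. */
static bool toy_refcount_is(struct toy_folio *f, int expected)
{
	return atomic_load(&f->refcount) == expected;
}

/*
 * Single-threaded replay: the owner takes the lock first, then a scanner
 * pins the folio anyway. Returns true if the pin succeeded under the lock
 * and was visible to the lock holder's refcount check.
 */
bool toy_replay_transient_pin(void)
{
	struct toy_folio f;

	atomic_init(&f.refcount, 1);	/* the page cache's reference */
	atomic_init(&f.locked, false);

	bool owner_locked = toy_trylock(&f);
	assert(owner_locked);
	bool before = toy_refcount_is(&f, 1);

	bool pinned = toy_get_unless_zero(&f);	/* scanner, step 2 */
	bool scanner_locked = toy_trylock(&f);	/* scanner, step 3 */
	bool during = toy_refcount_is(&f, 1);	/* owner's check under lock */

	if (pinned)
		toy_put(&f);			/* scanner backs off */
	bool after = toy_refcount_is(&f, 1);
	toy_unlock(&f);

	return before && pinned && !scanner_locked && !during && after;
}
```

Calling toy_replay_transient_pin() from a test or a main() exercises the
whole sequence. In this model the pin is short-lived, so a check that fails
on it would pass again once the scanner drops its reference.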