From: Fuad Tabba
Date: Mon, 20 Jan 2025 10:40:49 +0000
Subject: Re: [RFC PATCH v5 05/15] KVM: guest_memfd: Folio mappability states and functions that manage their transition
To: "Kirill A. Shutemov"
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
 pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net,
 vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
 mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
 wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com,
 quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com,
 quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com,
 yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
 will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
 hughd@google.com, jthoughton@google.com
References: <20250117163001.2326672-1-tabba@google.com>
 <20250117163001.2326672-6-tabba@google.com>

On Mon, 20 Jan 2025 at 10:30, Kirill A. Shutemov wrote:
>
> On Fri, Jan 17, 2025 at 04:29:51PM +0000, Fuad Tabba wrote:
> > +/*
> > + * Marks the range [start, end) as not mappable by the host. If the host doesn't
> > + * have any references to a particular folio, then that folio is marked as
> > + * mappable by the guest.
> > + *
> > + * However, if the host still has references to the folio, then the folio is
> > + * marked and not mappable by anyone. Marking it is not mappable allows it to
> > + * drain all references from the host, and to ensure that the hypervisor does
> > + * not transition the folio to private, since the host still might access it.
> > + *
> > + * Usually called when guest unshares memory with the host.
> > + */
> > +static int gmem_clear_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
> > +{
> > +	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
> > +	void *xval_guest = xa_mk_value(KVM_GMEM_GUEST_MAPPABLE);
> > +	void *xval_none = xa_mk_value(KVM_GMEM_NONE_MAPPABLE);
> > +	pgoff_t i;
> > +	int r = 0;
> > +
> > +	filemap_invalidate_lock(inode->i_mapping);
> > +	for (i = start; i < end; i++) {
> > +		struct folio *folio;
> > +		int refcount = 0;
> > +
> > +		folio = filemap_lock_folio(inode->i_mapping, i);
> > +		if (!IS_ERR(folio)) {
> > +			refcount = folio_ref_count(folio);
> > +		} else {
> > +			r = PTR_ERR(folio);
> > +			if (WARN_ON_ONCE(r != -ENOENT))
> > +				break;
> > +
> > +			folio = NULL;
> > +		}
> > +
> > +		/* +1 references are expected because of filemap_lock_folio(). */
> > +		if (folio && refcount > folio_nr_pages(folio) + 1) {
>
> Looks racy.
>
> What prevent anybody from obtaining a reference just after check?
>
> Lock on folio doesn't stop random filemap_get_entry() from elevating the
> refcount.
>
> folio_ref_freeze() might be required.

I thought the folio lock would be sufficient, but you're right,
nothing prevents getting a reference after the check. I'll use a
folio_ref_freeze() when I respin.

Thanks,
/fuad

> > +			/*
> > +			 * Outstanding references, the folio cannot be faulted
> > +			 * in by anyone until they're dropped.
> > +			 */
> > +			r = xa_err(xa_store(mappable_offsets, i, xval_none, GFP_KERNEL));
> > +		} else {
> > +			/*
> > +			 * No outstanding references. Transition the folio to
> > +			 * guest mappable immediately.
> > +			 */
> > +			r = xa_err(xa_store(mappable_offsets, i, xval_guest, GFP_KERNEL));
> > +		}
> > +
> > +		if (folio) {
> > +			folio_unlock(folio);
> > +			folio_put(folio);
> > +		}
> > +
> > +		if (WARN_ON_ONCE(r))
> > +			break;
> > +	}
> > +	filemap_invalidate_unlock(inode->i_mapping);
> > +
> > +	return r;
> > +}
>
> --
>   Kiryl Shutsemau / Kirill A. Shutemov
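
For reference, here is a rough userspace model of why a freeze closes the
race discussed above. It is only a sketch and not the respin: every model_*
name is hypothetical, C11 atomics stand in for the kernel's page refcount
helpers, and the expected count is passed in by the caller (in the quoted
patch it would correspond to folio_nr_pages(folio) + 1). The point it tries
to show is that the check and the "no new references" guarantee become one
atomic step, because a speculative get fails while the count is frozen at
zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio: only the refcount is modelled. */
struct model_folio {
	atomic_int refcount;
};

enum model_state { MODEL_NONE_MAPPABLE, MODEL_GUEST_MAPPABLE };

/*
 * Models the idea behind folio_ref_freeze(): succeed only if the count
 * equals 'expected', and in that case atomically drop it to zero.
 */
static bool model_ref_freeze(struct model_folio *f, int expected)
{
	int old = expected;

	return atomic_compare_exchange_strong(&f->refcount, &old, 0);
}

/* Models folio_ref_unfreeze(): restore the count of a frozen folio. */
static void model_ref_unfreeze(struct model_folio *f, int count)
{
	assert(atomic_load(&f->refcount) == 0);
	atomic_store(&f->refcount, count);
}

/* Models a speculative get: it must fail while the count is zero. */
static bool model_try_get(struct model_folio *f)
{
	int old = atomic_load(&f->refcount);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Decide the new state. The state is chosen while the count is frozen,
 * so nobody can take a reference between the check and the decision.
 */
static enum model_state model_clear_mappable(struct model_folio *f,
					     int expected)
{
	enum model_state state = MODEL_NONE_MAPPABLE;

	if (model_ref_freeze(f, expected)) {
		state = MODEL_GUEST_MAPPABLE;
		model_ref_unfreeze(f, expected);
	}
	return state;
}
```

With a plain read of the count followed by a comparison, a concurrent
model_try_get() could succeed between the two steps. With the freeze, either
the count already differs from the expected value and the folio stays
unmappable by anyone, or the count is zero for the duration of the decision
and concurrent gets fail.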