Date: Thu, 14 Sep 2023 12:12:14 -0700
In-Reply-To: <253965df-6d80-bbfd-ab01-f9e69b274bf3@quicinc.com>
Mime-Version: 1.0
References: <253965df-6d80-bbfd-ab01-f9e69b274bf3@quicinc.com>
Subject: Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for
	guest-specific backing memory
From: Sean Christopherson
To: Elliot Berman
Cc: Ackerley Tng, pbonzini@redhat.com, maz@kernel.org,
	oliver.upton@linux.dev, chenhuacai@kernel.org, mpe@ellerman.id.au,
	anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, willy@infradead.org, akpm@linux-foundation.org,
	paul@paul-moore.com, jmorris@namei.org, serge@hallyn.com,
	kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
	linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-security-module@vger.kernel.org,
	linux-kernel@vger.kernel.org, chao.p.peng@linux.intel.com,
	tabba@google.com, jarkko@kernel.org, yu.c.zhang@linux.intel.com,
	vannapurve@google.com, mail@maciej.szmigiero.name, vbabka@suse.cz,
	david@redhat.com, qperret@google.com, michael.roth@amd.com,
	wei.w.wang@intel.com, liam.merwick@oracle.com,
	isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com
Content-Type: text/plain; charset="us-ascii"
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Mon, Aug 28, 2023, Elliot Berman wrote:
> I had a 3rd question that's related to how to wire the gmem up to a virtual
> machine:
>
> I learned of a usecase to implement copy-on-write for gmem. The premise
> would be to have a "golden copy" of the memory that multiple virtual
> machines can map in as RO. If a virtual machine tries to write to those
> pages, they get copied to a virtual machine-specific page that isn't shared
> with other VMs. How do we track those pages?

The answer is going to be gunyah-specific, because gmem itself isn't designed
to provide a virtualization layer ("virtual" in the virtual memory sense, not
in the virtual machine sense). Like any other CoW implementation, the RO page
would need to be copied to a different physical page, and whatever layer
translates gfns to physical pages would need to be updated. E.g. in gmem
terms, allocate a new gmem page/instance and update the gfn=>gmem[offset]
translation in KVM/gunyah.

For VMA-based memory, that translation happens in the primary MMU, and is
largely transparent to KVM (or any other secondary MMU). E.g. the primary MMU
works with the backing store (if necessary) to allocate a new page and do the
copy, notifies secondary MMUs, zaps the old PTE(s), and then installs the new
PTE(s). KVM/gunyah just needs to react to the mmu_notifier event, e.g. zap
secondary MMU PTEs, and then KVM/gunyah naturally gets the new, writable
page/PTE when following the host virtual address, e.g. via gup().

The downside of eliminating the middle-man (primary MMU) from gmem is that the
"owner" (KVM or gunyah) is now responsible for these types of operations. For
some things, e.g. page migration, it's actually easier in some ways, but for
CoW it's quite a bit more work for KVM/gunyah because KVM/gunyah now needs to
do things that were previously handled by the primary MMU.

In KVM, assuming no additional support in KVM, doing CoW would mean modifying
memslots to redirect the gfn from the RO page to the writable page. For a
variety of reasons, that would be _extremely_ expensive in KVM, but still
possible. If there were a strong use case for supporting CoW with KVM+gmem,
then I suspect that we'd probably implement new KVM uAPI of some form to
provide reasonable performance.

But I highly doubt we'll ever do that, because one of the core tenets of
KVM+gmem is to isolate guest memory from the rest of the world, and especially
from host userspace, and that just doesn't mesh well with CoW'd memory being
shared across multiple VMs.
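
To make the "owner does the CoW" flow concrete, here is a toy userspace model
of the steps described above: copy the RO page to a new page, zap the stale
secondary MMU mapping, and update the gfn=>gmem[offset] translation. This is
only a sketch and not kernel code. Every name in it (Backing, Mapping, ToyVM,
gfn_map, stage2) is hypothetical and does not correspond to any KVM, gunyah,
or gmem API.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096


class Backing:
    """Hypothetical stand-in for one gmem instance: a list of pages."""

    def __init__(self, name):
        self.name = name
        self.pages = []

    def alloc_page(self, data=None):
        # bytearray(data) always makes a fresh copy of the source bytes.
        page = bytearray(data) if data is not None else bytearray(PAGE_SIZE)
        self.pages.append(page)
        return len(self.pages) - 1  # page offset within this instance


@dataclass
class Mapping:
    backing: Backing
    offset: int
    writable: bool


class ToyVM:
    """Hypothetical owner (the role KVM or gunyah plays in the text)."""

    def __init__(self, name):
        self.private = Backing(name + "-private")
        self.gfn_map = {}  # gfn -> Mapping: the owner's translation table
        self.stage2 = {}   # gfn -> Mapping: stands in for secondary MMU PTEs

    def map_shared_ro(self, gfn, golden, offset):
        self.gfn_map[gfn] = Mapping(golden, offset, writable=False)

    def _fault_in(self, gfn):
        # Install a "PTE" from the translation table if none is present.
        if gfn not in self.stage2:
            self.stage2[gfn] = self.gfn_map[gfn]
        return self.stage2[gfn]

    def read(self, gfn, index):
        m = self._fault_in(gfn)
        return m.backing.pages[m.offset][index]

    def write(self, gfn, index, value):
        m = self._fault_in(gfn)
        if not m.writable:
            m = self._cow(gfn, m)  # write fault on an RO page
        m.backing.pages[m.offset][index] = value

    def _cow(self, gfn, old):
        # 1. Copy the RO page into a new page owned by this VM.
        new_off = self.private.alloc_page(old.backing.pages[old.offset])
        # 2. Zap the stale secondary MMU mapping.
        self.stage2.pop(gfn, None)
        # 3. Redirect the gfn to the new, writable page.
        self.gfn_map[gfn] = Mapping(self.private, new_off, writable=True)
        return self._fault_in(gfn)
```

In this model, two ToyVM objects that map the same golden page at a gfn both
read the golden contents until one of them writes. After the write, only the
writer's gfn_map entry points at its private backing, and the golden page is
unchanged. The model deliberately ignores everything that makes this hard in
practice, e.g. locking, racing vCPUs, and the cost of changing the
translation.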