From: Vishal Annapurve
Date: Wed, 21 May 2025 07:42:01 -0700
Subject: Re: [RFC PATCH v2 04/51] KVM: guest_memfd: Introduce KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls
To: Fuad Tabba
Cc: Ackerley Tng, kvm@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, x86@kernel.org,
 linux-fsdevel@vger.kernel.org, aik@amd.com, ajones@ventanamicro.com,
 akpm@linux-foundation.org, amoorthy@google.com,
 anthony.yznaga@oracle.com, anup@brainfault.org, aou@eecs.berkeley.edu,
 bfoster@redhat.com, binbin.wu@linux.intel.com, brauner@kernel.org,
 catalin.marinas@arm.com, chao.p.peng@intel.com, chenhuacai@kernel.org,
 dave.hansen@intel.com, david@redhat.com, dmatlack@google.com,
 dwmw@amazon.co.uk, erdemaktas@google.com, fan.du@intel.com,
 fvdl@google.com, graf@amazon.com, haibo1.xu@intel.com,
 hch@infradead.org, hughd@google.com, ira.weiny@intel.com,
 isaku.yamahata@intel.com, jack@suse.cz, james.morse@arm.com,
 jarkko@kernel.org, jgg@ziepe.ca, jgowans@amazon.com,
 jhubbard@nvidia.com, jroedel@suse.de, jthoughton@google.com,
 jun.miao@intel.com, kai.huang@intel.com, keirf@google.com,
 kent.overstreet@linux.dev, kirill.shutemov@intel.com,
 liam.merwick@oracle.com, maciej.wieczor-retman@intel.com,
 mail@maciej.szmigiero.name, maz@kernel.org, mic@digikod.net,
 michael.roth@amd.com, mpe@ellerman.id.au, muchun.song@linux.dev,
 nikunj@amd.com, nsaenz@amazon.es, oliver.upton@linux.dev,
 palmer@dabbelt.com, pankaj.gupta@amd.com, paul.walmsley@sifive.com,
 pbonzini@redhat.com, pdurrant@amazon.co.uk, peterx@redhat.com,
 pgonda@google.com, pvorel@suse.cz, qperret@google.com,
 quic_cvanscha@quicinc.com, quic_eberman@quicinc.com,
 quic_mnalajal@quicinc.com, quic_pderrin@quicinc.com,
 quic_pheragu@quicinc.com, quic_svaddagi@quicinc.com,
 quic_tsoni@quicinc.com, richard.weiyang@gmail.com,
 rick.p.edgecombe@intel.com, rientjes@google.com, roypat@amazon.co.uk,
 rppt@kernel.org, seanjc@google.com, shuah@kernel.org,
 steven.price@arm.com, steven.sistare@oracle.com,
 suzuki.poulose@arm.com, thomas.lendacky@amd.com,
 usama.arif@bytedance.com, vbabka@suse.cz, viro@zeniv.linux.org.uk,
 vkuznets@redhat.com, wei.w.wang@intel.com, will@kernel.org,
 willy@infradead.org, xiaoyao.li@intel.com, yan.y.zhao@intel.com,
 yilun.xu@intel.com, yuzenghui@huawei.com, zhiquan1.li@intel.com

On Wed, May 21, 2025 at 5:36 AM Fuad Tabba wrote:
> ....
> > When rebooting, the memslots may not yet be bound to the guest_memfd,
> > but we want to reset the guest_memfd's to private. If we use
> > KVM_SET_MEMORY_ATTRIBUTES to convert, we'd be forced to first bind, then
> > convert. If we had a direct ioctl, we don't have this restriction.
> >
> > If we do the conversion via vcpu_run() we would be forced to handle
> > conversions only with a vcpu_run() and only the guest can initiate a
> > conversion.
> >
> > On a guest boot for TDX, the memory is assumed to be private. If we
> > gave it memory set as shared, we'd just have a bunch of
> > KVM_EXIT_MEMORY_FAULTs that slow down boot. Hence on a guest reboot, we
> > will want to reset the guest memory to private.
> >
> > We could say the firmware should reset memory to private on guest
> > reboot, but we can't force all guests to update firmware.
>
> Here is where I disagree. I do think that this is the CoCo guest's
> responsibility (and by guest I include its firmware) to fix its own
> state after a reboot. How would the host even know that a guest is
> rebooting if it's a CoCo guest?

There are a bunch of complexities here. The reboot sequence on x86 can
be triggered in multiple ways that I don't fully understand, but a few
of them involve reading/writing the "reset register" in MMIO/PCI config
space, which is emulated directly by host userspace. The host has to
know when the guest is shutting down in order to manage its lifecycle.

x86 CoCo VM firmwares don't support warm/soft reboot, and even if they
do in the future, the guest kernel can choose a different reboot
mechanism. So guest reboot needs to be emulated by always starting from
scratch. This sequence needs the initial guest firmware payload to be
installed into private ranges of guest_memfd.

>
> Either the host doesn't (or cannot even) know that the guest is
> rebooting, in which case I don't see how having an IOCTL would help.

The host does know that the guest is rebooting.

> Or somehow the host does know that, i.e., via a hypercall that
> indicates that. In which case, we could have it so that for that type
> of VM, we would reconvert its pages to private on a reboot.

This could possibly be solved by resetting the ranges to private when
binding with a memslot of a certain VM type.
But Google also has a use case for intrahost migration, where a live VM
and its associated guest_memfd files are bound to a new KVM VM and new
memslots, and an unconditional reset on bind would wipe the existing
shared/private state there. Otherwise, we need an additional contract
between userspace and KVM to intercept/handle guest_memfd range reset.
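
To illustrate the bind-time reset concern, here is a toy userspace model
in C. It is not kernel code and does not use any KVM or guest_memfd API:
every name in it (toy_gmem, toy_convert, toy_bind) is made up for
illustration, and the per-page array is only an assumed stand-in for
whatever state guest_memfd actually tracks.

```c
/* Toy userspace model only: none of these names are KVM or guest_memfd APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GMEM_PAGES 8

enum page_attr { PAGE_PRIVATE, PAGE_SHARED };

/* Hypothetical stand-in for a guest_memfd's per-page shared/private state. */
struct toy_gmem {
    enum page_attr attr[GMEM_PAGES];
};

/* Models a direct conversion request on the file; no memslot is involved. */
static void toy_convert(struct toy_gmem *g, size_t start, size_t npages,
                        enum page_attr to)
{
    /* Reject ranges that fall outside the modeled file. */
    assert(start <= GMEM_PAGES && npages <= GMEM_PAGES - start);
    for (size_t i = start; i < start + npages; i++)
        g->attr[i] = to;
}

/* Models binding to a memslot, optionally resetting everything to private. */
static void toy_bind(struct toy_gmem *g, bool reset_on_bind)
{
    if (reset_on_bind)
        toy_convert(g, 0, GMEM_PAGES, PAGE_PRIVATE);
}

/* Counts pages currently marked shared. */
static size_t toy_count_shared(const struct toy_gmem *g)
{
    size_t n = 0;
    for (size_t i = 0; i < GMEM_PAGES; i++)
        n += (g->attr[i] == PAGE_SHARED);
    return n;
}
```

In this model, a reboot is userspace calling toy_convert() over the
whole file with PAGE_PRIVATE before reinstalling the firmware payload,
with no bind required first. For intrahost migration, toy_bind(g, false)
keeps the pages a live guest has already shared, while toy_bind(g, true)
silently drops them, which is why a reset tied to bind would need some
extra way to tell the two cases apart.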