Date: Tue, 8 Jul 2025 08:38:55 -0700
References: <006899ccedf93f45082390460620753090c01914.camel@intel.com>
Subject: Re: [RFC PATCH v2 00/51] 1G page support for guest_memfd
From: Sean Christopherson
To: Vishal Annapurve
Cc: Rick P Edgecombe, "pvorel@suse.cz", "kvm@vger.kernel.org",
 "catalin.marinas@arm.com", Jun Miao, Kirill Shutemov,
 "pdurrant@amazon.co.uk", "vbabka@suse.cz", "peterx@redhat.com",
 "x86@kernel.org", "amoorthy@google.com", "jack@suse.cz",
 "quic_svaddagi@quicinc.com", "keirf@google.com", "palmer@dabbelt.com",
 "vkuznets@redhat.com", "mail@maciej.szmigiero.name",
 "anthony.yznaga@oracle.com", Wei W Wang, "tabba@google.com",
 "Wieczor-Retman, Maciej", Yan Y Zhao, "ajones@ventanamicro.com",
 "willy@infradead.org", "rppt@kernel.org", "quic_mnalajal@quicinc.com",
 "aik@amd.com", "usama.arif@bytedance.com", Dave Hansen, "fvdl@google.com",
 "paul.walmsley@sifive.com", "bfoster@redhat.com", "nsaenz@amazon.es",
 "anup@brainfault.org", "quic_eberman@quicinc.com",
 "linux-kernel@vger.kernel.org", "thomas.lendacky@amd.com",
 "mic@digikod.net", "oliver.upton@linux.dev", "akpm@linux-foundation.org",
 "quic_cvanscha@quicinc.com", "steven.price@arm.com",
 "binbin.wu@linux.intel.com", "hughd@google.com", Zhiquan1 Li,
 "rientjes@google.com", "mpe@ellerman.id.au", Erdem Aktas,
 "david@redhat.com", "jgg@ziepe.ca", "jhubbard@nvidia.com", Haibo1 Xu,
 Fan Du, "maz@kernel.org", "muchun.song@linux.dev", Isaku Yamahata,
 "jthoughton@google.com", "steven.sistare@oracle.com",
 "quic_pheragu@quicinc.com", "jarkko@kernel.org", "chenhuacai@kernel.org",
 Kai Huang, "shuah@kernel.org", "dwmw@amazon.co.uk", Chao P Peng,
 "pankaj.gupta@amd.com", Alexander Graf, "nikunj@amd.com",
 "viro@zeniv.linux.org.uk", "pbonzini@redhat.com", "yuzenghui@huawei.com",
 "jroedel@suse.de", "suzuki.poulose@arm.com", "jgowans@amazon.com",
 Yilun Xu, "liam.merwick@oracle.com", "michael.roth@amd.com",
 "quic_tsoni@quicinc.com", Xiaoyao Li, "aou@eecs.berkeley.edu", Ira Weiny,
 "richard.weiyang@gmail.com", "kent.overstreet@linux.dev",
 "qperret@google.com", "dmatlack@google.com", "james.morse@arm.com",
 "brauner@kernel.org", "linux-fsdevel@vger.kernel.org",
 "ackerleytng@google.com", "pgonda@google.com", "quic_pderrin@quicinc.com",
 "hch@infradead.org", "linux-mm@kvack.org", "will@kernel.org",
 "roypat@amazon.co.uk"

On Tue, Jul 08, 2025, Vishal Annapurve wrote:
> On Tue, Jul 8, 2025 at 7:52 AM Edgecombe, Rick P wrote:
> >
> > On Tue, 2025-07-08 at 07:20 -0700, Sean Christopherson wrote:
> > > > For TDX if we don't zero on conversion from private->shared we will be dependent
> > > > on behavior of the CPU when reading memory with keyid 0, which was previously
> > > > encrypted and has some protection bits set. I don't *think* the behavior is
> > > > architectural. So it might be prudent to either make it so, or zero it in the
> > > > kernel in order to not make non-architectural behavior into userspace ABI.
> > >
> > > Ya, by "vendor specific", I was also lumping in cases where the kernel would
> > > need to zero memory in order to not end up with effectively undefined
> > > behavior.
> >
> > Yea, more of an answer to Vishal's question about if CC VMs need zeroing. And
> > the answer is sort of yes, even though TDX doesn't require it. But we actually
> > don't want to zero memory when reclaiming memory. So TDX KVM code needs to know
> > that the operation is a to-shared conversion and not another type of private
> > zap. Like a callback from gmem, or maybe more simply a kernel internal flag to
> > set in gmem such that it knows it should zero it.
>
> If the answer is that "always zero on private to shared conversions"
> for all CC VMs,

pKVM VMs *are* CoCo VMs.  Just because pKVM doesn't rely on third party firmware
to provide confidentiality and integrity doesn't make it any less of a CoCo VM.

> > > : And maybe a new flag for KVM_GMEM_CONVERT_PRIVATE for user space to
> > > : explicitly request that the page range is converted to private and the
> > > : content needs to be retained. So that TDX can identify which case needs
> > > : to call in-place TDH.PAGE.ADD.
> > >
> > > If so, I agree with that idea, e.g. add a PRESERVE flag or whatever.  That way
> > > userspace has explicit control over what happens to the data during conversion,
> > > and KVM can reject unsupported conversions, e.g. PRESERVE is only allowed for
> > > shared => private and only for select VM types.
> >
> > Ok, we should POC how it works with TDX.
>
> I don't think we need a flag to preserve memory as I mentioned in [2]. IIUC,
> 1) Conversions are always content-preserving for pKVM.

No?
Preserving contents on private => shared is a security vulnerability waiting
to happen.

> 2) Shared to private conversions are always content-preserving for all
> VMs as far as guest_memfd is concerned.

There is no "as far as guest_memfd is concerned".  Userspace doesn't care whether
code lives in guest_memfd.c versus arch/xxx/kvm; the only thing that matters is
the behavior that userspace sees.  I don't want to end up with userspace ABI that
is vendor/VM specific.

> 3) Private to shared conversions are not content-preserving for CC VMs
> as far as guest_memfd is concerned, subject to more discussions.
>
> [2] https://lore.kernel.org/lkml/CAGtprH-Kzn2kOGZ4JuNtUT53Hugw64M-_XMmhz_gCiDS6BAFtQ@mail.gmail.com/