From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B683C4707B for ; Wed, 10 Jan 2024 15:45:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E20F56B009F; Wed, 10 Jan 2024 10:45:42 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id DD0EF6B00A0; Wed, 10 Jan 2024 10:45:42 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C71656B00A1; Wed, 10 Jan 2024 10:45:42 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id B32C16B009F for ; Wed, 10 Jan 2024 10:45:42 -0500 (EST) Received: from smtpin08.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 676FF40B71 for ; Wed, 10 Jan 2024 15:45:42 +0000 (UTC) X-FDA: 81663826524.08.C00CE5D Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) by imf08.hostedemail.com (Postfix) with ESMTP id 317FF1600B1 for ; Wed, 10 Jan 2024 15:45:38 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=UFfYaxVY; spf=pass (imf08.hostedemail.com: domain of 3orueZQYKCMc5rn0wpt11tyr.p1zyv07A-zzx8npx.14t@flex--seanjc.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3orueZQYKCMc5rn0wpt11tyr.p1zyv07A-zzx8npx.14t@flex--seanjc.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1704901539; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Lq0U/JA8ruxxMjOd91X5TeH+1BTCqwEXpz7z0V/bils=; b=tw6QLgtq4mCddG26TGfmozFoocSCN+PxCS5um+aHYni7n95GpXtiwStUf10VHEX0LHkEWs W6Y9B2thpVgZ2cR3cpshvobQoAXDKTF43R0GO3civbdF2HyfXQ9yY0H1++AT09SnO9nTgJ GeZmKtWFOC5VyTmmxBmVJ9YPkJp8yrg= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1704901539; a=rsa-sha256; cv=none; b=KYOZ6auEw5+vLlCjQCFjzrF4Gn3iPg0VazBLwOulqpE9M1lXf91wfuunZjk31IvrOkwc8N XlxaFykJkSY4FOKa+QVFBUXodu8qF0r4pjeuV9Cg5M8FQ4JehNB+D7sQHqJDv7mzF5/duy SUNjo6FLC9GeXeiPTV9EfUT4Mm7SfKQ= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=UFfYaxVY; spf=pass (imf08.hostedemail.com: domain of 3orueZQYKCMc5rn0wpt11tyr.p1zyv07A-zzx8npx.14t@flex--seanjc.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3orueZQYKCMc5rn0wpt11tyr.p1zyv07A-zzx8npx.14t@flex--seanjc.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-yw1-f201.google.com with SMTP id 00721157ae682-5eb6dba1796so78481117b3.1 for ; Wed, 10 Jan 2024 07:45:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1704901538; x=1705506338; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Lq0U/JA8ruxxMjOd91X5TeH+1BTCqwEXpz7z0V/bils=; b=UFfYaxVY+z1A5ZPr1ASyqCv7urXXhgURXKJQJ/ClJIZaZTg3lFUKcRRqcAZsJ8mLYK wViqrLRK0JqAW1YczA94aImIXNVXzrH8l/4JCh07qSAISbeWi5jEETqUmyUGU8g/d45m 9hmZy+NHdxyGkyqQnu5pY/JSYui+obHgA3pUmmnR+1jKxyvgWwJSUSEYhdje3qgEh5n3 ER09itRGHQJbbVv9E6YXujvr2mru9qC254+vcqKojoAPXbuZGSKzwhdez1z01kpVScEG pZW4evnArV229mAAS/iVzyMVeclesIBWvHRugxse0pBKvV9yw1lRECAf4UV0Wi9teofT 80mg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1704901538; x=1705506338; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Lq0U/JA8ruxxMjOd91X5TeH+1BTCqwEXpz7z0V/bils=; b=BqSg2dJkT2pxGYRTdsKOek/sGDmC5xYY1iJb3TW9mNYvIEndtd9+s+gFNm9F66blkW WdEYq1ckXpO+qbwXh1H8HMAVufpo58Dq6HDavMCOCvIQgU0MFTDMH10TBgBCaP8sq9TT 1ufCRDvx3jAh5XedMZASRclndT1AehYGhqtMEO33QS7V+fAfTheolGaR0Zgmkl9z0Tpi d/9j7Upqc9G6bHEqexukj7XPPt7nSMz2m8cO+s3kxkt72YACsphRi2O6Alpiw4Jnc+Vo xA8xGXqokxsOSKR263IXNK6WpI7tMjh8CcT+ROxqdbmgCnBeU6rTUYKBeV1QzUESUOkn rNIA== X-Gm-Message-State: AOJu0Yyves4WUHpfg/uVi6/F2FoCYsbf4SAajCsBN8IQJPMMmpydlbfD YW2e0SvkDrDS42WOFus+2vc5vXvbmifpwogIjw== X-Google-Smtp-Source: AGHT+IGTdQzTgcbH03msREAtwwMGQwWLrPJpgE5yk/oSK2EIjbPojsGBLiAo3crBf9e87rN8qint7m6Y/wM= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a81:be12:0:b0:5f9:88f5:e4e7 with SMTP id i18-20020a81be12000000b005f988f5e4e7mr581962ywn.10.1704901538193; Wed, 10 Jan 2024 07:45:38 -0800 (PST) Date: Wed, 10 Jan 2024 07:45:36 -0800 In-Reply-To: <20231230172351.574091-19-michael.roth@amd.com> Mime-Version: 1.0 References: <20231230172351.574091-1-michael.roth@amd.com> <20231230172351.574091-19-michael.roth@amd.com> Message-ID: Subject: Re: [PATCH v11 18/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command From: Sean Christopherson To: Michael Roth Cc: kvm@vger.kernel.org, linux-coco@lists.linux.dev, linux-mm@kvack.org, linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, jroedel@suse.de, thomas.lendacky@amd.com, hpa@zytor.com, ardb@kernel.org, pbonzini@redhat.com, vkuznets@redhat.com, jmattson@google.com, luto@kernel.org, dave.hansen@linux.intel.com, slp@redhat.com, pgonda@google.com, peterz@infradead.org, srinivas.pandruvada@linux.intel.com, rientjes@google.com, dovmurik@linux.ibm.com, tobin@ibm.com, bp@alien8.de, vbabka@suse.cz, kirill@shutemov.name, ak@linux.intel.com, tony.luck@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com, alpergun@google.com, jarkko@kernel.org, ashish.kalra@amd.com, nikunj.dadhania@amd.com, pankaj.gupta@amd.com, liam.merwick@oracle.com, zhi.a.wang@intel.com, Brijesh Singh Content-Type: text/plain; charset="us-ascii" X-Rspamd-Queue-Id: 317FF1600B1 X-Rspam-User: X-Rspamd-Server: rspam11 X-Stat-Signature: fp5yn8ezhdbi3fg3um5d8eesd4bf67oh X-HE-Tag: 1704901538-509296 X-HE-Meta: U2FsdGVkX1/2j0uIO/mwpwob8djBnodTHjwbeGzRuwu6keW7tg277/j6vASOQAyxuj2aiF3jXAohN9osiyptn7ATDTzJXYEAnufOOR2t2nGKz5Hbs5vqPAENYKgCJrfQ1BDOp3/qgLMZTntcEmz/7SOcPBW6ULkChUfcBKbodhzIMDF9E2xsbYOrI3HIJFBhuqJH0HkkOGUUsjZOrZ/bueww0cFdbfCE1S7A4hmAdCHAjtd6/C2NhhpHBUmAAXZZcW4m1J2HI3lTpp0J241cL0ogaNltqCF1E/2RnEwHD/UTIyM8dVuDoc4vxQUzUWmn2VD6iwCfwxI2sq2gTvRQt7p5kdtdgFubkctocYLHzOnZ87RjAfO4anss2X+iBv+r1FsCSd7bmcBeQEpY2dZzf/kkxXCdnGwdF18uhgtUu0KR+vmc0Q6CL9lPxALcQ421AkYSpyvWejUT/pU8q1LIAONLYNR132eFERu6jWT/f49OO5Zg9l5DeKyHQaLhn8LIE/s+dyAhW4iOnIgPMsbknAkQvAvduaaoxWYtXmm7j63X4Ht5uRzqWAwWNbSeyLFNzkHMBMIEWXN38QgzgRfeVVG91/k24SBBpOjd98Od/bYuI78hiz9Hb0GHJSHdxHwszUcQwvzlTSh0bowoxrB/hNOQH7xtrhh1SgyF4Wwom0wMEnU4V6dR6QaFYj05gRr6XPMMPN/TDS3lQeZSc2ysfTj0Y+DPsWL6Wzm0nwabTt3v/jypYfI+dRjW5R/vr+xpPrkh4xyBjLHrlJvtnqYCWA807cFNzJKM6T3QQQt0LHGFEtlfEyJC3NTQUe+ON5C7OBsh1I0ZrleWkMlmYg7wvH5sl1LJt3Oit5ui8AzskfgbxbcrdGPGWY8U1nLaaiMQ0L1AG5ABag97kyxehYhcpgn8GlXMOFY7JI3hTYGYqcKzbntvomI8DYHfjZEd6RqyYDR6DEC2Jdrgwf62jVU As6WTArw nxaY7t5C04L8plkbRG6GDybmhKXdENn395TTCrBINDP1r8e452Qh2dEB+h9A3tPcuJvKAOclg4BdEvs6fSU0W2LjDWVfcbG2libpH2YirhPWHFi2d7WJVdIJb3y6mn86B1MGTL+eCTh0DeVd593RkbQYfTZh9ItNeVPWMhE32h1eOwjOfKAKeGQgpnO0TChaRzATojwAvk9MVGekeqFLFFGIuQfNNNIlX4abMu3oXWbOR2mIBhzk7hhSOs7nr/m7AkaYFrH/6WulJ7Fhpc82dPCGFbGkfwO08d04+Xq1x4sLR8BQv4c9tzwRtKmWJTlIIoxYdwDc4KrtD3vC6rQ2G1D6eb50L5OfxNNKIMz6nLSNmXA9EC/oQxlkowahuUwtO/Ox6plSYMMA6zu6p4HOaX0lDlCeE89qy6dE66ZUpJV6dEhV0T4G2hUUP8KZsMVwjBlB16ccQJLrsvFXtftdlYS5KxiUWzNFUQUwFvgUUKpK/bt9lnCuEbeBiWpRt5S4ZbywVkQ3Odl3BAQVHWqzPpTROXTYEObEUvr4Sb2zVu3ZbiW4ONVR7MT5WoLll47BVUU9X9alNhJsK/fIQH6+sjuZ2A3KL4ab3xmb6CxNDUQzeO4LVmcQBzUMk5wvid8qdGHVweGiHj7xDSjhiO0aceHoUiw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Sat, Dec 30, 2023, Michael Roth wrote: > From: Brijesh Singh > > The KVM_SEV_SNP_LAUNCH_UPDATE command can be used to insert data into > the guest's memory. The data is encrypted with the cryptographic context > created with the KVM_SEV_SNP_LAUNCH_START. > > In addition to the inserting data, it can insert a two special pages > into the guests memory: the secrets page and the CPUID page. > > While terminating the guest, reclaim the guest pages added in the RMP > table. If the reclaim fails, then the page is no longer safe to be > released back to the system and leak them. > > For more information see the SEV-SNP specification. Please rewrite all changelogs to explain what *KVM* support is being added, why the proposed uAPI looks like it does, and how the new uAPI is intended be used. Porividing a crash course on the relevant hardware behavior is definitely helpful, but the changelog absolutely needs to explain/justify the patch. > Co-developed-by: Michael Roth > Signed-off-by: Michael Roth > Signed-off-by: Brijesh Singh > Signed-off-by: Ashish Kalra > --- > .../virt/kvm/x86/amd-memory-encryption.rst | 28 +++ > arch/x86/kvm/svm/sev.c | 181 ++++++++++++++++++ > include/uapi/linux/kvm.h | 19 ++ > 3 files changed, 228 insertions(+) > > diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst > index b1beb2fe8766..d4325b26724c 100644 > --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst > +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst > @@ -485,6 +485,34 @@ Returns: 0 on success, -negative on error > > See the SEV-SNP specification for further detail on the launch input. > > +20. KVM_SNP_LAUNCH_UPDATE > +------------------------- > + > +The KVM_SNP_LAUNCH_UPDATE is used for encrypting a memory region. It also > +calculates a measurement of the memory contents. The measurement is a signature > +of the memory contents that can be sent to the guest owner as an attestation > +that the memory was encrypted correctly by the firmware. > + > +Parameters (in): struct kvm_snp_launch_update > + > +Returns: 0 on success, -negative on error > + > +:: > + > + struct kvm_sev_snp_launch_update { > + __u64 start_gfn; /* Guest page number to start from. */ > + __u64 uaddr; /* userspace address need to be encrypted */ Huh? Why is KVM taking a userspace address? IIUC, the address unconditionally gets translated into a gfn, so why not pass a gfn? And speaking of gfns, AFAICT start_gfn is never used. Oof, reading more of the code, this *requires* an effective in-place copy-and-convert of guest memory. > + __u32 len; /* length of memory region */ Bytes? Pages? One field above operates on frame numbers, one apparently operates on a byte-granularity address. > + __u8 imi_page; /* 1 if memory is part of the IMI */ What's "the IMI"? Initial Measurement Image? I assume this is essentially just a flag that communicates whether or not the page should be measured? > + __u8 page_type; /* page type */ > + __u8 vmpl3_perms; /* VMPL3 permission mask */ > + __u8 vmpl2_perms; /* VMPL2 permission mask */ > + __u8 vmpl1_perms; /* VMPL1 permission mask */ Why? KVM doesn't support VMPLs. > +static int snp_page_reclaim(u64 pfn) > +{ > + struct sev_data_snp_page_reclaim data = {0}; > + int err, rc; > + > + data.paddr = __sme_set(pfn << PAGE_SHIFT); > + rc = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err); > + if (rc) { > + /* > + * If the reclaim failed, then page is no longer safe > + * to use. Uh, why can reclaim fail, and why does the kernel apparently not care about leaking pages? AFAICT, nothing ever complains beyond a pr_debug. That sounds bonkers to me, i.e. at the very minimum, why doesn't this warrant a WARN_ON_ONCE? > + */ > + snp_leak_pages(pfn, 1); > + } > + > + return rc; > +} > + > +static int host_rmp_make_shared(u64 pfn, enum pg_level level, bool leak) > +{ > + int rc; > + > + rc = rmp_make_shared(pfn, level); > + if (rc && leak) > + snp_leak_pages(pfn, > + page_level_size(level) >> PAGE_SHIFT); Completely unnecessary wrap. > + > + return rc; > +} > + > static void sev_unbind_asid(struct kvm *kvm, unsigned int handle) > { > struct sev_data_deactivate deactivate; > @@ -1990,6 +2020,154 @@ static int snp_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp) > return rc; > } > > +static int snp_launch_update_gfn_handler(struct kvm *kvm, > + struct kvm_gfn_range *range, > + void *opaque) > +{ > + struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info; > + struct kvm_memory_slot *memslot = range->slot; > + struct sev_data_snp_launch_update data = {0}; > + struct kvm_sev_snp_launch_update params; > + struct kvm_sev_cmd *argp = opaque; > + int *error = &argp->error; > + int i, n = 0, ret = 0; > + unsigned long npages; > + kvm_pfn_t *pfns; > + gfn_t gfn; > + > + if (!kvm_slot_can_be_private(memslot)) { > + pr_err("SEV-SNP requires private memory support via guest_memfd.\n"); Yeah, no. Sprinkling pr_err() all over the place in user-triggerable error paths is not acceptable. I get that it's often hard to extract "what went wrong" out of the kernel, but adding pr_err() everywhere is not a viable solution. > + return -EINVAL; > + } > + > + if (copy_from_user(¶ms, (void __user *)(uintptr_t)argp->data, sizeof(params))) { > + pr_err("Failed to copy user parameters for SEV-SNP launch.\n"); > + return -EFAULT; > + } > + > + data.gctx_paddr = __psp_pa(sev->snp_context); > + > + npages = range->end - range->start; > + pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL_ACCOUNT); > + if (!pfns) > + return -ENOMEM; > + > + pr_debug("%s: GFN range 0x%llx-0x%llx, type %d\n", __func__, > + range->start, range->end, params.page_type); > + > + for (gfn = range->start, i = 0; gfn < range->end; gfn++, i++) { > + int order, level; > + bool assigned; > + void *kvaddr; > + > + ret = __kvm_gmem_get_pfn(kvm, memslot, gfn, &pfns[i], &order, false); > + if (ret) > + goto e_release; > + > + n++; > + ret = snp_lookup_rmpentry((u64)pfns[i], &assigned, &level); > + if (ret || assigned) { > + pr_err("Failed to ensure GFN 0x%llx is in initial shared state, ret: %d, assigned: %d\n", > + gfn, ret, assigned); > + return -EFAULT; > + } > + > + kvaddr = pfn_to_kaddr(pfns[i]); > + if (!virt_addr_valid(kvaddr)) { I really, really don't like that this assume guest_memfd is backed by struct page. > + pr_err("Invalid HVA 0x%llx for GFN 0x%llx\n", (uint64_t)kvaddr, gfn); > + ret = -EINVAL; > + goto e_release; > + } > + > + ret = kvm_read_guest_page(kvm, gfn, kvaddr, 0, PAGE_SIZE); Good gravy. If I'm reading this correctly, KVM: 1. Translates an HVA into a GFN. 2. Gets the PFN for that GFN from guest_memfd 3. Verifies the PFN is not assigned to the guest 4. Copies memory from the shared memslot page to the guest_memfd page 5. Converts the page to private and asks the PSP to encrypt it (a) As above, why is #1 a thing? (b) Why are KVM's memory attributes never consulted? (c) What prevents TOCTOU issues with respect to the RMP? (d) Why is *KVM* copying memory into guest_memfd? (e) What guarantees the direct map is valid for guest_memfd? (f) Why does KVM's uAPI *require* the source page to come from the same memslot? > + if (ret) { > + pr_err("Guest read failed, ret: 0x%x\n", ret); > + goto e_release; > + } > + > + ret = rmp_make_private(pfns[i], gfn << PAGE_SHIFT, PG_LEVEL_4K, > + sev_get_asid(kvm), true); > + if (ret) { > + ret = -EFAULT; > + goto e_release; > + } > + > + data.address = __sme_set(pfns[i] << PAGE_SHIFT); > + data.page_size = PG_LEVEL_TO_RMP(PG_LEVEL_4K); > + data.page_type = params.page_type; > + data.vmpl3_perms = params.vmpl3_perms; > + data.vmpl2_perms = params.vmpl2_perms; > + data.vmpl1_perms = params.vmpl1_perms; > + ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_SNP_LAUNCH_UPDATE, > + &data, error); > + if (ret) { > + pr_err("SEV-SNP launch update failed, ret: 0x%x, fw_error: 0x%x\n", > + ret, *error); > + snp_page_reclaim(pfns[i]); > + > + /* > + * When invalid CPUID function entries are detected, the firmware > + * corrects these entries for debugging purpose and leaves the > + * page unencrypted so it can be provided users for debugging > + * and error-reporting. > + * > + * Copy the corrected CPUID page back to shared memory so > + * userpsace can retrieve this information. Why? IIUC, this is basically backdooring reads/writes into guest_memfd to avoid having to add proper mmap() support. > + */ > + if (params.page_type == SNP_PAGE_TYPE_CPUID && > + *error == SEV_RET_INVALID_PARAM) { > + int ret; Ugh, do not shadow variables. > + > + host_rmp_make_shared(pfns[i], PG_LEVEL_4K, true); > + > + ret = kvm_write_guest_page(kvm, gfn, kvaddr, 0, PAGE_SIZE); > + if (ret) > + pr_err("Failed to write CPUID page back to userspace, ret: 0x%x\n", > + ret); > + } > + > + goto e_release; > + } > + } > + > +e_release: > + /* Content of memory is updated, mark pages dirty */ > + for (i = 0; i < n; i++) { > + set_page_dirty(pfn_to_page(pfns[i])); > + mark_page_accessed(pfn_to_page(pfns[i])); > + > + /* > + * If its an error, then update RMP entry to change page ownership > + * to the hypervisor. > + */ > + if (ret) > + host_rmp_make_shared(pfns[i], PG_LEVEL_4K, true); > + > + put_page(pfn_to_page(pfns[i])); > + } > + > + kvfree(pfns); Saving PFNs from guest_memfd, which is fully owned by KVM, is so unnecessarily complex. Add a guest_memfd API (or three) to do this safely, e.g. to lock the pages, do (and track) the RMP conversion, etc...