linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Michael Roth <michael.roth@amd.com>
Cc: kvm@vger.kernel.org, linux-coco@lists.linux.dev,
	linux-mm@kvack.org,  linux-crypto@vger.kernel.org,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	 tglx@linutronix.de, mingo@redhat.com, jroedel@suse.de,
	 thomas.lendacky@amd.com, hpa@zytor.com, ardb@kernel.org,
	pbonzini@redhat.com,  vkuznets@redhat.com, jmattson@google.com,
	luto@kernel.org,  dave.hansen@linux.intel.com, slp@redhat.com,
	pgonda@google.com,  peterz@infradead.org,
	srinivas.pandruvada@linux.intel.com,  rientjes@google.com,
	dovmurik@linux.ibm.com, tobin@ibm.com, bp@alien8.de,
	 vbabka@suse.cz, kirill@shutemov.name, ak@linux.intel.com,
	tony.luck@intel.com,  sathyanarayanan.kuppuswamy@linux.intel.com,
	alpergun@google.com,  jarkko@kernel.org, ashish.kalra@amd.com,
	nikunj.dadhania@amd.com,  pankaj.gupta@amd.com,
	liam.merwick@oracle.com, zhi.a.wang@intel.com,
	 Brijesh Singh <brijesh.singh@amd.com>
Subject: Re: [PATCH v11 18/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
Date: Wed, 10 Jan 2024 07:45:36 -0800	[thread overview]
Message-ID: <ZZ67oJwzAsSvui5U@google.com> (raw)
In-Reply-To: <20231230172351.574091-19-michael.roth@amd.com>

On Sat, Dec 30, 2023, Michael Roth wrote:
> From: Brijesh Singh <brijesh.singh@amd.com>
> 
> The KVM_SEV_SNP_LAUNCH_UPDATE command can be used to insert data into
> the guest's memory. The data is encrypted with the cryptographic context
> created with the KVM_SEV_SNP_LAUNCH_START.
> 
> In addition to the inserting data, it can insert a two special pages
> into the guests memory: the secrets page and the CPUID page.
> 
> While terminating the guest, reclaim the guest pages added in the RMP
> table. If the reclaim fails, then the page is no longer safe to be
> released back to the system and leak them.
> 
> For more information see the SEV-SNP specification.

Please rewrite all changelogs to explain what *KVM* support is being added, why
the proposed uAPI looks like it does, and how the new uAPI is intended be used.

Porividing a crash course on the relevant hardware behavior is definitely helpful,
but the changelog absolutely needs to explain/justify the patch.

> Co-developed-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  .../virt/kvm/x86/amd-memory-encryption.rst    |  28 +++
>  arch/x86/kvm/svm/sev.c                        | 181 ++++++++++++++++++
>  include/uapi/linux/kvm.h                      |  19 ++
>  3 files changed, 228 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> index b1beb2fe8766..d4325b26724c 100644
> --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> @@ -485,6 +485,34 @@ Returns: 0 on success, -negative on error
>  
>  See the SEV-SNP specification for further detail on the launch input.
>  
> +20. KVM_SNP_LAUNCH_UPDATE
> +-------------------------
> +
> +The KVM_SNP_LAUNCH_UPDATE is used for encrypting a memory region. It also
> +calculates a measurement of the memory contents. The measurement is a signature
> +of the memory contents that can be sent to the guest owner as an attestation
> +that the memory was encrypted correctly by the firmware.
> +
> +Parameters (in): struct  kvm_snp_launch_update
> +
> +Returns: 0 on success, -negative on error
> +
> +::
> +
> +        struct kvm_sev_snp_launch_update {
> +                __u64 start_gfn;        /* Guest page number to start from. */
> +                __u64 uaddr;            /* userspace address need to be encrypted */

Huh?  Why is KVM taking a userspace address?  IIUC, the address unconditionally
gets translated into a gfn, so why not pass a gfn?

And speaking of gfns, AFAICT start_gfn is never used.

Oof, reading more of the code, this *requires* an effective in-place copy-and-convert
of guest memory.

> +                __u32 len;              /* length of memory region */

Bytes?  Pages?  One field above operates on frame numbers, one apparently operates
on a byte-granularity address.

> +                __u8 imi_page;          /* 1 if memory is part of the IMI */

What's "the IMI"?  Initial Measurement Image?  I assume this is essentially just
a flag that communicates whether or not the page should be measured?

> +                __u8 page_type;         /* page type */
> +                __u8 vmpl3_perms;       /* VMPL3 permission mask */
> +                __u8 vmpl2_perms;       /* VMPL2 permission mask */
> +                __u8 vmpl1_perms;       /* VMPL1 permission mask */

Why?  KVM doesn't support VMPLs.

> +static int snp_page_reclaim(u64 pfn)
> +{
> +	struct sev_data_snp_page_reclaim data = {0};
> +	int err, rc;
> +
> +	data.paddr = __sme_set(pfn << PAGE_SHIFT);
> +	rc = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
> +	if (rc) {
> +		/*
> +		 * If the reclaim failed, then page is no longer safe
> +		 * to use.

Uh, why can reclaim fail, and why does the kernel apparently not care about
leaking pages?  AFAICT, nothing ever complains beyond a pr_debug.  That sounds
bonkers to me, i.e. at the very minimum, why doesn't this warrant a WARN_ON_ONCE?

> +		 */
> +		snp_leak_pages(pfn, 1);
> +	}
> +
> +	return rc;
> +}
> +
> +static int host_rmp_make_shared(u64 pfn, enum pg_level level, bool leak)
> +{
> +	int rc;
> +
> +	rc = rmp_make_shared(pfn, level);
> +	if (rc && leak)
> +		snp_leak_pages(pfn,
> +			       page_level_size(level) >> PAGE_SHIFT);

Completely unnecessary wrap.

> +
> +	return rc;
> +}
> +
>  static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
>  {
>  	struct sev_data_deactivate deactivate;
> @@ -1990,6 +2020,154 @@ static int snp_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  	return rc;
>  }
>  
> +static int snp_launch_update_gfn_handler(struct kvm *kvm,
> +					 struct kvm_gfn_range *range,
> +					 void *opaque)
> +{
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +	struct kvm_memory_slot *memslot = range->slot;
> +	struct sev_data_snp_launch_update data = {0};
> +	struct kvm_sev_snp_launch_update params;
> +	struct kvm_sev_cmd *argp = opaque;
> +	int *error = &argp->error;
> +	int i, n = 0, ret = 0;
> +	unsigned long npages;
> +	kvm_pfn_t *pfns;
> +	gfn_t gfn;
> +
> +	if (!kvm_slot_can_be_private(memslot)) {
> +		pr_err("SEV-SNP requires private memory support via guest_memfd.\n");

Yeah, no.  Sprinkling pr_err() all over the place in user-triggerable error paths
is not acceptable.  I get that it's often hard to extract "what went wrong" out
of the kernel, but adding pr_err() everywhere is not a viable solution.

> +		return -EINVAL;
> +	}
> +
> +	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data, sizeof(params))) {
> +		pr_err("Failed to copy user parameters for SEV-SNP launch.\n");
> +		return -EFAULT;
> +	}
> +
> +	data.gctx_paddr = __psp_pa(sev->snp_context);
> +
> +	npages = range->end - range->start;
> +	pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL_ACCOUNT);
> +	if (!pfns)
> +		return -ENOMEM;
> +
> +	pr_debug("%s: GFN range 0x%llx-0x%llx, type %d\n", __func__,
> +		 range->start, range->end, params.page_type);
> +
> +	for (gfn = range->start, i = 0; gfn < range->end; gfn++, i++) {
> +		int order, level;
> +		bool assigned;
> +		void *kvaddr;
> +
> +		ret = __kvm_gmem_get_pfn(kvm, memslot, gfn, &pfns[i], &order, false);
> +		if (ret)
> +			goto e_release;
> +
> +		n++;
> +		ret = snp_lookup_rmpentry((u64)pfns[i], &assigned, &level);
> +		if (ret || assigned) {
> +			pr_err("Failed to ensure GFN 0x%llx is in initial shared state, ret: %d, assigned: %d\n",
> +			       gfn, ret, assigned);
> +			return -EFAULT;
> +		}
> +
> +		kvaddr = pfn_to_kaddr(pfns[i]);
> +		if (!virt_addr_valid(kvaddr)) {

I really, really don't like that this assume guest_memfd is backed by struct page.

> +			pr_err("Invalid HVA 0x%llx for GFN 0x%llx\n", (uint64_t)kvaddr, gfn);
> +			ret = -EINVAL;
> +			goto e_release;
> +		}
> +
> +		ret = kvm_read_guest_page(kvm, gfn, kvaddr, 0, PAGE_SIZE);

Good gravy.  If I'm reading this correctly, KVM:

  1. Translates an HVA into a GFN.
  2. Gets the PFN for that GFN from guest_memfd
  3. Verifies the PFN is not assigned to the guest
  4. Copies memory from the shared memslot page to the guest_memfd page
  5. Converts the page to private and asks the PSP to encrypt it

(a) As above, why is #1 a thing?
(b) Why are KVM's memory attributes never consulted?
(c) What prevents TOCTOU issues with respect to the RMP?
(d) Why is *KVM* copying memory into guest_memfd?
(e) What guarantees the direct map is valid for guest_memfd?
(f) Why does KVM's uAPI *require* the source page to come from the same memslot?

> +		if (ret) {
> +			pr_err("Guest read failed, ret: 0x%x\n", ret);
> +			goto e_release;
> +		}
> +
> +		ret = rmp_make_private(pfns[i], gfn << PAGE_SHIFT, PG_LEVEL_4K,
> +				       sev_get_asid(kvm), true);
> +		if (ret) {
> +			ret = -EFAULT;
> +			goto e_release;
> +		}
> +
> +		data.address = __sme_set(pfns[i] << PAGE_SHIFT);
> +		data.page_size = PG_LEVEL_TO_RMP(PG_LEVEL_4K);
> +		data.page_type = params.page_type;
> +		data.vmpl3_perms = params.vmpl3_perms;
> +		data.vmpl2_perms = params.vmpl2_perms;
> +		data.vmpl1_perms = params.vmpl1_perms;
> +		ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_SNP_LAUNCH_UPDATE,
> +				      &data, error);
> +		if (ret) {
> +			pr_err("SEV-SNP launch update failed, ret: 0x%x, fw_error: 0x%x\n",
> +			       ret, *error);
> +			snp_page_reclaim(pfns[i]);
> +
> +			/*
> +			 * When invalid CPUID function entries are detected, the firmware
> +			 * corrects these entries for debugging purpose and leaves the
> +			 * page unencrypted so it can be provided users for debugging
> +			 * and error-reporting.
> +			 *
> +			 * Copy the corrected CPUID page back to shared memory so
> +			 * userpsace can retrieve this information.

Why?  IIUC, this is basically backdooring reads/writes into guest_memfd to avoid
having to add proper mmap() support.
 
> +			 */
> +			if (params.page_type == SNP_PAGE_TYPE_CPUID &&
> +			    *error == SEV_RET_INVALID_PARAM) {
> +				int ret;

Ugh, do not shadow variables.

> +
> +				host_rmp_make_shared(pfns[i], PG_LEVEL_4K, true);
> +
> +				ret = kvm_write_guest_page(kvm, gfn, kvaddr, 0, PAGE_SIZE);
> +				if (ret)
> +					pr_err("Failed to write CPUID page back to userspace, ret: 0x%x\n",
> +					       ret);
> +			}
> +
> +			goto e_release;
> +		}
> +	}
> +
> +e_release:
> +	/* Content of memory is updated, mark pages dirty */
> +	for (i = 0; i < n; i++) {
> +		set_page_dirty(pfn_to_page(pfns[i]));
> +		mark_page_accessed(pfn_to_page(pfns[i]));
> +
> +		/*
> +		 * If its an error, then update RMP entry to change page ownership
> +		 * to the hypervisor.
> +		 */
> +		if (ret)
> +			host_rmp_make_shared(pfns[i], PG_LEVEL_4K, true);
> +
> +		put_page(pfn_to_page(pfns[i]));
> +	}
> +
> +	kvfree(pfns);

Saving PFNs from guest_memfd, which is fully owned by KVM, is so unnecessarily
complex.  Add a guest_memfd API (or three) to do this safely, e.g. to lock the
pages, do (and track) the RMP conversion, etc...


  reply	other threads:[~2024-01-10 15:45 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-30 17:23 [PATCH v11 00/35] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
2023-12-30 17:23 ` [PATCH v11 01/35] KVM: Add hugepage support for dedicated guest memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 02/35] mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 03/35] KVM: Use AS_INACCESSIBLE when creating guest_memfd inode Michael Roth
2023-12-30 17:23 ` [PATCH v11 04/35] KVM: x86: Add gmem hook for initializing memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 05/35] KVM: x86: Add gmem hook for invalidating memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 06/35] KVM: x86/mmu: Pass around full 64-bit error code for KVM page faults Michael Roth
2024-02-06 20:51   ` Sean Christopherson
2024-02-12 10:00     ` Paolo Bonzini
2024-02-12 16:42       ` Michael Roth
2023-12-30 17:23 ` [PATCH v11 07/35] KVM: x86: Add KVM_X86_SNP_VM vm_type Michael Roth
2023-12-30 17:23 ` [PATCH v11 08/35] KVM: x86: Define RMP page fault error bits for #NPF Michael Roth
2023-12-30 17:23 ` [PATCH v11 09/35] KVM: x86: Determine shared/private faults based on vm_type Michael Roth
     [not found]   ` <CABgObfanrHTL429Cr8tcMGqs-Ov+6LWeQbzghvjQiGu9tz0EUA@mail.gmail.com>
2024-02-12 16:27     ` Sean Christopherson
2024-02-12 16:47       ` Michael Roth
2023-12-30 17:23 ` [PATCH v11 10/35] KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests Michael Roth
2023-12-30 17:23 ` [PATCH v11 11/35] KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y Michael Roth
2023-12-30 17:23 ` [PATCH v11 12/35] KVM: SEV: Add support to handle AP reset MSR protocol Michael Roth
2023-12-30 17:23 ` [PATCH v11 13/35] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests Michael Roth
2023-12-30 17:23 ` [PATCH v11 14/35] KVM: SEV: Add initial SEV-SNP support Michael Roth
2023-12-30 17:23 ` [PATCH v11 15/35] KVM: SEV: Add KVM_SNP_INIT command Michael Roth
2024-02-06 23:51   ` Paolo Bonzini
2024-03-20 17:28   ` Paolo Bonzini
2023-12-30 17:23 ` [PATCH v11 16/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command Michael Roth
2023-12-30 17:23 ` [PATCH v11 17/35] KVM: Add HVA range operator Michael Roth
2023-12-30 17:23 ` [PATCH v11 18/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command Michael Roth
2024-01-10 15:45   ` Sean Christopherson [this message]
2024-01-16  4:14     ` Michael Roth
2024-02-02 22:54       ` Sean Christopherson
2024-02-06 23:43         ` Paolo Bonzini
2024-02-07  2:43           ` Sean Christopherson
2024-02-07  8:03             ` Paolo Bonzini
2024-02-09  1:52           ` Michael Roth
2024-02-09 14:34             ` Sean Christopherson
2024-03-18 21:02   ` Peter Gonda
2023-12-30 17:23 ` [PATCH v11 19/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command Michael Roth
2023-12-30 17:23 ` [PATCH v11 20/35] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
2023-12-30 17:23 ` [PATCH v11 21/35] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
2023-12-30 17:23 ` [PATCH v11 22/35] KVM: SEV: Add support to handle " Michael Roth
2023-12-30 17:23 ` [PATCH v11 23/35] KVM: x86: Export the kvm_zap_gfn_range() for the SNP use Michael Roth
2023-12-30 17:23 ` [PATCH v11 24/35] KVM: SEV: Add support to handle RMP nested page faults Michael Roth
2023-12-30 17:23 ` [PATCH v11 25/35] KVM: SEV: Use a VMSA physical address variable for populating VMCB Michael Roth
2023-12-30 17:23 ` [PATCH v11 26/35] KVM: SEV: Support SEV-SNP AP Creation NAE event Michael Roth
2024-01-05 22:08   ` Jacob Xu
2024-01-08 15:53     ` Sean Christopherson
2023-12-30 17:23 ` [PATCH v11 27/35] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
2023-12-30 17:23 ` [PATCH v11 28/35] KVM: SEV: Implement gmem hook for initializing private pages Michael Roth
2024-03-11  5:50   ` Binbin Wu
2023-12-30 17:23 ` [PATCH v11 29/35] KVM: SEV: Implement gmem hook for invalidating " Michael Roth
2023-12-30 17:23 ` [PATCH v11 30/35] KVM: x86: Add gmem hook for determining max NPT mapping level Michael Roth
2024-02-12 10:50   ` Paolo Bonzini
2024-02-12 17:03     ` Michael Roth
2023-12-30 17:23 ` [PATCH v11 31/35] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP Michael Roth
2023-12-30 17:23 ` [PATCH v11 32/35] KVM: SVM: Add module parameter to enable the SEV-SNP Michael Roth
2023-12-30 17:23 ` [PATCH v11 33/35] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
2023-12-30 17:23 ` [PATCH v11 34/35] crypto: ccp: Add the SNP_SET_CONFIG_{START,END} commands Michael Roth
2023-12-30 17:23 ` [PATCH v11 35/35] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event Michael Roth

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZZ67oJwzAsSvui5U@google.com \
    --to=seanjc@google.com \
    --cc=ak@linux.intel.com \
    --cc=alpergun@google.com \
    --cc=ardb@kernel.org \
    --cc=ashish.kalra@amd.com \
    --cc=bp@alien8.de \
    --cc=brijesh.singh@amd.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dovmurik@linux.ibm.com \
    --cc=hpa@zytor.com \
    --cc=jarkko@kernel.org \
    --cc=jmattson@google.com \
    --cc=jroedel@suse.de \
    --cc=kirill@shutemov.name \
    --cc=kvm@vger.kernel.org \
    --cc=liam.merwick@oracle.com \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=michael.roth@amd.com \
    --cc=mingo@redhat.com \
    --cc=nikunj.dadhania@amd.com \
    --cc=pankaj.gupta@amd.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pgonda@google.com \
    --cc=rientjes@google.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=slp@redhat.com \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=tobin@ibm.com \
    --cc=tony.luck@intel.com \
    --cc=vbabka@suse.cz \
    --cc=vkuznets@redhat.com \
    --cc=x86@kernel.org \
    --cc=zhi.a.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox