From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95280CDB47E for ; Wed, 18 Oct 2023 13:49:29 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BBE318D015A; Wed, 18 Oct 2023 09:49:28 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B6D858D0016; Wed, 18 Oct 2023 09:49:28 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A352F8D015A; Wed, 18 Oct 2023 09:49:28 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 948798D0016 for ; Wed, 18 Oct 2023 09:49:28 -0400 (EDT) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 54574B5067 for ; Wed, 18 Oct 2023 13:49:28 +0000 (UTC) X-FDA: 81358714416.15.9DDBFE5 Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) by imf18.hostedemail.com (Postfix) with ESMTP id 880B71C0017 for ; Wed, 18 Oct 2023 13:49:26 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=YPfr17K0; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf18.hostedemail.com: domain of 3ZeIvZQYKCBoI40D926EE6B4.2ECB8DKN-CCAL02A.EH6@flex--seanjc.bounces.google.com designates 209.85.128.202 as permitted sender) smtp.mailfrom=3ZeIvZQYKCBoI40D926EE6B4.2ECB8DKN-CCAL02A.EH6@flex--seanjc.bounces.google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1697636966; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=oAKQqv24W/9hoLdu1u7LFvXDRMO4zQJ7ELL3SC0djfQ=; b=4BdQBr9HlOaJNfNCaKsltr4n8hzGNVu7WSd9zsDuzpdgoQ/HPuA0/od4JKXYHGijdt5lcQ fMO4JztTSFIQJI9pynEKPq17IWE7ad1iWE68yL3L36/Xr4NdmgPjsnotp/vjz9WibEH4kh aFm4XiucsjgD1C9I8xaYkRAWZ9SmfM0= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=YPfr17K0; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf18.hostedemail.com: domain of 3ZeIvZQYKCBoI40D926EE6B4.2ECB8DKN-CCAL02A.EH6@flex--seanjc.bounces.google.com designates 209.85.128.202 as permitted sender) smtp.mailfrom=3ZeIvZQYKCBoI40D926EE6B4.2ECB8DKN-CCAL02A.EH6@flex--seanjc.bounces.google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1697636966; a=rsa-sha256; cv=none; b=wILrGfEoAxYx2+HrhWhHZ768DrJ9V+jBQ2sR/7NrUcMqO9JNg86bXIm2p+g0dZUkiV5JBb efzXsFnreZQyLzGPo2rSqQ+4JkYiY8iX2BkcsctHPGLMBuQUYTjuxdmD7ywsZYpr42khpt TvXi9zPPkZBWGiN+TPMsLtaHd7/j7iw= Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-5a7dd655566so106173337b3.2 for ; Wed, 18 Oct 2023 06:49:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1697636965; x=1698241765; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=oAKQqv24W/9hoLdu1u7LFvXDRMO4zQJ7ELL3SC0djfQ=; b=YPfr17K0kw0CP6XaDZ0fNpAZrrxRgxNzl9VIWy+gNVLzKifuzLat97fLAUMMRyfM7D xohICoaP9S2obsHxBCkNNfNLQNjyLunKFKjwfrzqqXfJP82hbjdKTPUv55KiHzoquPHi Gap3+3+L/w/xCEs+qbhfGlHGjQ2SIu8shdy2ieO4HKahugprQgYXYs7madpdpoTI6N8+ GBJI51g+BFv3qH9ZHZOeNA9fpcby+asn80j4KyLnRHJDFx9/eRT6CRcOojMor2g3KcAO my2fZKs2smdjGgwPCVVAMZ4IatzRVQPBep4rkxXFOt0qdV54KsSC2kXqHero0zXMRhwa fHfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697636965; x=1698241765; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=oAKQqv24W/9hoLdu1u7LFvXDRMO4zQJ7ELL3SC0djfQ=; b=GN3Uko8DdqQPbFIK/pBICsQQJ6fVDz5ZfAdRTdXscF6mXKXuXgT9iQrAfdioY48Mco JNibrEwVrh+9oXqVf30Ex6aoPwT6Gq8/jW3Kv7tr+pbXGF8rrakmIiqar1hWPW9yo8jb EF9LR2TjW08QwAvg5gf53bxiQDOI+UNicMMwaEyWyDWXP7RsZ5DQQos58KB5SofqlCiV 4mm1QKKeWvwZ7qMjHX8/3rtkNfmcU0O/rl6+V1o7sq0lP6jQFpZZvW5XUyMHFDyfoWKx gSkEI2wFAIIWXEK0fNrGBxDgsy1hsfK9W9pGhwKKwU5bv4W7jkyZ78GLuQqtw+veLG5v YqUQ== X-Gm-Message-State: AOJu0YyjLrwJr/eRMfQ7mqFauw9KTIK0GPuvgyJ3XBogSWhDbO4pMSnA 03DF/kM+ffZgkUGIhX6y5b3wWVFPWTQ= X-Google-Smtp-Source: AGHT+IHnFQDi3IwMQOX78sjRJgZJ8FaG1+pts8CizuoqnC8OKrYfZy+4dJST840bISoqEf+kFNtJYopCXPU= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a0d:dbd2:0:b0:583:4f82:b9d9 with SMTP id d201-20020a0ddbd2000000b005834f82b9d9mr132011ywe.5.1697636965615; Wed, 18 Oct 2023 06:49:25 -0700 (PDT) Date: Wed, 18 Oct 2023 06:48:59 -0700 In-Reply-To: Mime-Version: 1.0 References: <20231016132819.1002933-1-michael.roth@amd.com> <20231016132819.1002933-49-michael.roth@amd.com> Message-ID: Subject: Re: [PATCH v10 48/50] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event From: Sean Christopherson To: Alexey Kardashevskiy Cc: Dionna Amalie Glaze , Michael Roth , kvm@vger.kernel.org, linux-coco@lists.linux.dev, linux-mm@kvack.org, linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, jroedel@suse.de, thomas.lendacky@amd.com, hpa@zytor.com, ardb@kernel.org, pbonzini@redhat.com, vkuznets@redhat.com, jmattson@google.com, luto@kernel.org, dave.hansen@linux.intel.com, slp@redhat.com, pgonda@google.com, peterz@infradead.org, srinivas.pandruvada@linux.intel.com, rientjes@google.com, dovmurik@linux.ibm.com, tobin@ibm.com, bp@alien8.de, vbabka@suse.cz, kirill@shutemov.name, ak@linux.intel.com, tony.luck@intel.com, marcorr@google.com, sathyanarayanan.kuppuswamy@linux.intel.com, alpergun@google.com, jarkko@kernel.org, ashish.kalra@amd.com, nikunj.dadhania@amd.com, pankaj.gupta@amd.com, liam.merwick@oracle.com, zhi.a.wang@intel.com, Brijesh Singh Content-Type: text/plain; charset="us-ascii" X-Rspamd-Queue-Id: 880B71C0017 X-Rspam-User: X-Rspamd-Server: rspam02 X-Stat-Signature: obzd8yaice6438bnebs7mfks9fremyiw X-HE-Tag: 1697636966-552766 X-HE-Meta: U2FsdGVkX19kglFbI613l7aoKrvVhZhwZBZ+oD1uamXqxW5pLS507WnvWZSJF1kmgX4nvNUUF+MXtzo0gN9lvUDrwFQxoTvXAYzxvcRocC+8WRZSey4/FB8p74QmISj/o4JLjvsa1eyvghvNe/LcAAENOaS7Cp17s9nOFDm95Ro+rkdhq5lXCGTsvM9I2yJgIMNEaUqxt6zbuKkwfcXkHNdY1Cb/WSuSyXo6EDVRHcTGiJLxWYJyPQ1Y/I3++odUpgHhNxGYUslw4C554hxAeiWAvMgWZ1Yo8q3bH69EbTv9zjg+bgHAAkF/4X9Pek5OkVTLixpBWukLe/hqv84p/1PcF3zJNJXGUI+ZIHe89PQdT4yPFN52R6nnfrDnOVVcnTQXHK26lUJTpQEicQAi8tY5vPU/pGib/QOcm/bah/rC2jc1EG0anQzZe90cWIPMN3EJWwxNm63ut3J4wkMRJ+55Qu0D/omZaXo87YjTTMownz7MRRChSlNimEzIIFagWhoGXLO8iCKCRnJ7kUWTYxbBgi/1ymwvTSkPI7e7RTng5ih4i9YFZeBZJnIYQzYhvuJ/e0v1bMyxeQNTXuqqlos3sM39a3d5y1mOFUtKxwnw3ohVINIk9VJIP/+N+qyRsVwhr1iUMDlv2xW5eZE2DK+VX/0wxrl/14kLKQhz+82Iik5KdiAUrc1T9/cHDiz2m39ZH4blZAQE5g/mwP6rBHNVuSnuG7Kt+mMtcwsZnSzDouN/KRWpVjb8Mfzv2nZqbM2flTZmEfbjBp4ARejgOWCWwVjBsF4ALRlGBZar6vpe5P72Br3HSOckgbdPe7Khb6/l8ErNYHBpm3ZmP48l9pc5c3DfHyEF8Z7j1l8r5SDgE/XImix5GMnK7ulurO6Q5bXLFTxXRS8LgXuyNZbrji/7oV9nTPA3eXcbmuGuuDTcDKujrCZXXCYgiqwOvCxa7P+Z1BezAKlhSKqzuE+ QHmvQBef Y/0CCu+vONqWvhDi1IdYHfj7AWuoRRoDTp7P76+TIUNNYnu5S0kN6rPJSX7fQa6Sin0lMo27HquO2lyxg6KInRZq/ZwJIUS/bs+VxSnFTN604ijzOXfTdjsWqUpf0n5GorfImIvj09cOK7JstByw/pHH4y9P7MvS8GIh+8cL1tKfm6WYjZMfBfoZ8L3GR8ktg7kG76pbspmjmC3He2oU7D84aFEpbXEeL1Pwi71ubWCAgWDKFhLuk1JDQa24+jzCnlqFsz00h1p8mEOXzRnvhBWk8jdtOZJsg4XPhP6s5nx6eYxR1bZcgm7XakHbV4y0nTAWnE+RD+ykTPLyykwLmyK5An0i8hdn9pTdQw+tcICIKpN469BRuxdfnvArdicXyxJ3SjgOPkIbpTNjfm0ap/97UiBfzIHRA2GQ0bLODMoNTL4BQ94BJS2U3IenQ1hg+ColtzGfzWBYrv/2c/dsCa5dTnB8Pzv5AwAMF01Ob3fjK927gJvZbnLLa2vso/aDyKMOpnrFRyM2GmsPI94W8s25N5U/1+o8F9TLfMHq3cBmsNfzp6ATVAxUj2oy2moAQHK70Z9jvTupY2Jl3XiFcu7SVjCptmFAi5o8rsrhjrsvY0TGw10jc28Ie7JO4k+VAlY04 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000059, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Oct 18, 2023, Alexey Kardashevskiy wrote: > > On 18/10/23 03:27, Sean Christopherson wrote: > > On Mon, Oct 16, 2023, Dionna Amalie Glaze wrote: > > > > + > > > > + /* > > > > + * If a VMM-specific certificate blob hasn't been provided, grab the > > > > + * host-wide one. > > > > + */ > > > > + snp_certs = sev_snp_certs_get(sev->snp_certs); > > > > + if (!snp_certs) > > > > + snp_certs = sev_snp_global_certs_get(); > > > > + > > > > > > This is where the generation I suggested adding would get checked. If > > > the instance certs' generation is not the global generation, then I > > > think we need a way to return to the VMM to make that right before > > > continuing to provide outdated certificates. > > > This might be an unreasonable request, but the fact that the certs and > > > reported_tcb can be set while a VM is running makes this an issue. > > > > Before we get that far, the changelogs need to explain why the kernel is storing > > userspace blobs in the first place. The whole thing is a bit of a mess. > > > > sev_snp_global_certs_get() has data races that could lead to variations of TOCTOU > > bugs: sev_ioctl_snp_set_config() can overwrite psp_master->sev_data->snp_certs > > while sev_snp_global_certs_get() is running. If the compiler reloads snp_certs > > between bumping the refcount and grabbing the pointer, KVM will end up leaking a > > refcount and consuming a pointer without a refcount. > > > > if (!kref_get_unless_zero(&certs->kref)) > > return NULL; > > > > return certs; > > I'm missing something here. The @certs pointer is on the stack, No, nothing guarantees that @certs is on the stack and will never be reloaded. sev_snp_certs_get() is in full view of sev_snp_global_certs_get(), so it's entirely possible that it can be inlined. Then you end up with: struct sev_device *sev; if (!psp_master || !psp_master->sev_data) return NULL; sev = psp_master->sev_data; if (!sev->snp_initialized) return NULL; if (!sev->snp_certs) return NULL; if (!kref_get_unless_zero(&sev->snp_certs->kref)) return NULL; return sev->snp_certs; At which point the compiler could choose to omit a local variable entirely, it could store @certs in a register and reload after kref_get_unless_zero(), etc. If psp_master->sev_data->snp_certs is changed at any point, odd thing can happen. That atomic operation in kref_get_unless_zero() might prevent a reload between getting the kref and the return, but it wouldn't prevent a reload between the !NULL check and kref_get_unless_zero(). > > If userspace wants to provide garbage to the guest, so be it, not KVM's problem. > > That way, whether the VM gets the global cert or a per-VM cert is purely a userspace > > concern. > > The global cert lives in CCP (/dev/sev), the per VM cert lives in kvmvm_fd. > "A la vcpu->run" is fine for the latter but for the former we need something > else. Why? The cert ultimately comes from userspace, no? Make userspace deal with it. > And there is scenario when one global certs blob is what is needed and > copying it over multiple VMs seems suboptimal. That's a solvable problem. I'm not sure I like the most obvious solution, but it is a solution: let userspace define a KVM-wide blob pointer, either via .mmap() or via an ioctl(). FWIW, there's no need to do .mmap() shenanigans, e.g. an ioctl() to set the userspace pointer would suffice. The benefit of a kernel controlled pointer is that it doesn't require copying to a kernel buffer (or special code to copy from userspace into guest). Actually, looking at the flow again, AFAICT there's nothing special about the target DATA_PAGE. It must be SHARED *before* SVM_VMGEXIT_EXT_GUEST_REQUEST, i.e. KVM doesn't need to do conversions, there's no kernel priveleges required, etc. And the GHCB doesn't dictate ordering between storing the certificates and doing the request. That means the certificate stuff can be punted entirely to usersepace. Heh, typing up the below, there's another bug: KVM will incorrectly "return" '0' for non-SNP guests: unsigned long exitcode = 0; u64 data_gpa; int err, rc; if (!sev_snp_guest(vcpu->kvm)) { rc = SEV_RET_INVALID_GUEST; <= sets "rc", not "exitcode" goto e_fail; } e_fail: ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, exitcode); Which really highlights that we need to get test infrastructure up and running for SEV-ES, SNP, and TDX. Anyways, back to punting to userspace. Here's a rough sketch. The only new uAPI is the definition of KVM_HC_SNP_GET_CERTS and its arguments. static void snp_handle_guest_request(struct vcpu_svm *svm) { struct vmcb_control_area *control = &svm->vmcb->control; struct sev_data_snp_guest_request data = {0}; struct kvm_vcpu *vcpu = &svm->vcpu; struct kvm *kvm = vcpu->kvm; struct kvm_sev_info *sev; gpa_t req_gpa = control->exit_info_1; gpa_t resp_gpa = control->exit_info_2; unsigned long rc; int err; if (!sev_snp_guest(vcpu->kvm)) { rc = SEV_RET_INVALID_GUEST; goto e_fail; } sev = &to_kvm_svm(kvm)->sev_info; mutex_lock(&sev->guest_req_lock); rc = snp_setup_guest_buf(svm, &data, req_gpa, resp_gpa); if (rc) goto unlock; rc = sev_issue_cmd(kvm, SEV_CMD_SNP_GUEST_REQUEST, &data, &err); if (rc) /* Ensure an error value is returned to guest. */ rc = err ? err : SEV_RET_INVALID_ADDRESS; snp_cleanup_guest_buf(&data, &rc); unlock: mutex_unlock(&sev->guest_req_lock); e_fail: ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, rc); } static int snp_complete_ext_guest_request(struct kvm_vcpu *vcpu) { u64 certs_exitcode = vcpu->run->hypercall.args[2]; struct vcpu_svm *svm = to_svm(vcpu); if (certs_exitcode) ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, certs_exitcode); else snp_handle_guest_request(svm); return 1; } static int snp_handle_ext_guest_request(struct vcpu_svm *svm) { struct kvm_vcpu *vcpu = &svm->vcpu; struct kvm *kvm = vcpu->kvm; struct kvm_sev_info *sev; unsigned long exitcode; u64 data_gpa; if (!sev_snp_guest(vcpu->kvm)) { ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SEV_RET_INVALID_GUEST); return 1; } data_gpa = vcpu->arch.regs[VCPU_REGS_RAX]; if (!IS_ALIGNED(data_gpa, PAGE_SIZE)) { ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SEV_RET_INVALID_ADDRESS); return 1; } vcpu->run->hypercall.nr = KVM_HC_SNP_GET_CERTS; vcpu->run->hypercall.args[0] = data_gpa; vcpu->run->hypercall.args[1] = vcpu->arch.regs[VCPU_REGS_RBX]; vcpu->run->hypercall.flags = KVM_EXIT_HYPERCALL_LONG_MODE; vcpu->arch.complete_userspace_io = snp_complete_ext_guest_request; return 0; }