From: Paolo Bonzini
Date: Fri, 26 Jan 2024 12:00:43 +0100
Subject: Re: [PATCH v2 21/25] KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe
To: Michael Roth
Cc: x86@kernel.org, kvm@vger.kernel.org, linux-coco@lists.linux.dev,
    linux-mm@kvack.org, linux-crypto@vger.kernel.org,
    linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com,
    jroedel@suse.de, thomas.lendacky@amd.com, hpa@zytor.com, ardb@kernel.org,
    seanjc@google.com, vkuznets@redhat.com, jmattson@google.com,
    luto@kernel.org, dave.hansen@linux.intel.com, slp@redhat.com,
    pgonda@google.com, peterz@infradead.org,
    srinivas.pandruvada@linux.intel.com, rientjes@google.com, tobin@ibm.com,
    bp@alien8.de, vbabka@suse.cz, kirill@shutemov.name, ak@linux.intel.com,
    tony.luck@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com,
    alpergun@google.com, jarkko@kernel.org, ashish.kalra@amd.com,
    nikunj.dadhania@amd.com, pankaj.gupta@amd.com, liam.merwick@oracle.com,
    Brijesh Singh, Marc Orr
References: <20240126041126.1927228-1-michael.roth@amd.com> <20240126041126.1927228-22-michael.roth@amd.com>
In-Reply-To: <20240126041126.1927228-22-michael.roth@amd.com>

On Fri, Jan 26, 2024 at 5:45 AM Michael Roth wrote:
>
> From: Brijesh Singh
>
> Implement a workaround for an SNP erratum where the CPU will incorrectly
> signal an RMP violation #PF if a hugepage (2MB or 1GB) collides with the
> RMP entry of a VMCB, VMSA or AVIC backing page.
>
> When SEV-SNP is globally enabled, the CPU marks the VMCB, VMSA, and AVIC
> backing pages as "in-use" via a reserved bit in the corresponding RMP
> entry after a successful VMRUN. This is done for _all_ VMs, not just
> SNP-Active VMs.
>
> If the hypervisor accesses an in-use page through a writable
> translation, the CPU will throw an RMP violation #PF. On early SNP
> hardware, if an in-use page is 2MB-aligned and software accesses any
> part of the associated 2MB region with a hugepage, the CPU will
> incorrectly treat the entire 2MB region as in-use and signal an RMP
> violation #PF.
>
> To avoid this, the recommendation is to not use a 2MB-aligned page for
> the VMCB, VMSA or AVIC pages. Add a generic allocator that will ensure
> that the page returned is not 2MB-aligned and is safe to be used when
> SEV-SNP is enabled. Also implement similar handling for the VMCB/VMSA
> pages of nested guests.
>
> Signed-off-by: Brijesh Singh
> Co-developed-by: Marc Orr
> Signed-off-by: Marc Orr
> Reported-by: Alper Gun # for nested VMSA case
> Co-developed-by: Ashish Kalra
> Signed-off-by: Ashish Kalra
> Acked-by: Vlastimil Babka
> [mdr: squash in nested guest handling from Ashish, commit msg fixups]
> Signed-off-by: Michael Roth

Acked-by: Paolo Bonzini

> ---
>  arch/x86/include/asm/kvm-x86-ops.h |  1 +
>  arch/x86/include/asm/kvm_host.h    |  1 +
>  arch/x86/kvm/lapic.c               |  5 ++++-
>  arch/x86/kvm/svm/nested.c          |  2 +-
>  arch/x86/kvm/svm/sev.c             | 32 ++++++++++++++++++++++++++++++
>  arch/x86/kvm/svm/svm.c             | 17 +++++++++++++---
>  arch/x86/kvm/svm/svm.h             |  1 +
>  7 files changed, 54 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index 378ed944b849..ab24ce207988 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -138,6 +138,7 @@ KVM_X86_OP(complete_emulated_msr)
>  KVM_X86_OP(vcpu_deliver_sipi_vector)
>  KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
>  KVM_X86_OP_OPTIONAL(get_untagged_addr)
> +KVM_X86_OP_OPTIONAL(alloc_apic_backing_page)
>
>  #undef KVM_X86_OP
>  #undef KVM_X86_OP_OPTIONAL
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index b5b2d0fde579..5c12af29fd9b 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1794,6 +1794,7 @@ struct kvm_x86_ops {
>          unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
>
>          gva_t (*get_untagged_addr)(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags);
> +        void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu);
>  };
>
>  struct kvm_x86_nested_ops {
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 3242f3da2457..1edf93ee3395 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2815,7 +2815,10 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
>
>          vcpu->arch.apic = apic;
>
> -        apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
> +        if (kvm_x86_ops.alloc_apic_backing_page)
> +                apic->regs = static_call(kvm_x86_alloc_apic_backing_page)(vcpu);
> +        else
> +                apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
>          if (!apic->regs) {
>                  printk(KERN_ERR "malloc apic regs error for vcpu %x\n",
>                         vcpu->vcpu_id);
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index dee62362a360..55b9a6d96bcf 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1181,7 +1181,7 @@ int svm_allocate_nested(struct vcpu_svm *svm)
>          if (svm->nested.initialized)
>                  return 0;
>
> -        vmcb02_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +        vmcb02_page = snp_safe_alloc_page(&svm->vcpu);
>          if (!vmcb02_page)
>                  return -ENOMEM;
>          svm->nested.vmcb02.ptr = page_address(vmcb02_page);
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 564091f386f7..f99435b6648f 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3163,3 +3163,35 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
>
>          ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
>  }
> +
> +struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> +{
> +        unsigned long pfn;
> +        struct page *p;
> +
> +        if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
> +                return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +
> +        /*
> +         * Allocate an SNP-safe page to workaround the SNP erratum where
> +         * the CPU will incorrectly signal an RMP violation #PF if a
> +         * hugepage (2MB or 1GB) collides with the RMP entry of a
> +         * 2MB-aligned VMCB, VMSA, or AVIC backing page.
> +         *
> +         * Allocate one extra page, choose a page which is not
> +         * 2MB-aligned, and free the other.
> +         */
> +        p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> +        if (!p)
> +                return NULL;
> +
> +        split_page(p, 1);
> +
> +        pfn = page_to_pfn(p);
> +        if (IS_ALIGNED(pfn, PTRS_PER_PMD))
> +                __free_page(p++);
> +        else
> +                __free_page(p + 1);
> +
> +        return p;
> +}
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 61f2bdc9f4f8..272d5ed37ce7 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -703,7 +703,7 @@ static int svm_cpu_init(int cpu)
>          int ret = -ENOMEM;
>
>          memset(sd, 0, sizeof(struct svm_cpu_data));
> -        sd->save_area = alloc_page(GFP_KERNEL | __GFP_ZERO);
> +        sd->save_area = snp_safe_alloc_page(NULL);
>          if (!sd->save_area)
>                  return ret;
>
> @@ -1421,7 +1421,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
>          svm = to_svm(vcpu);
>
>          err = -ENOMEM;
> -        vmcb01_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +        vmcb01_page = snp_safe_alloc_page(vcpu);
>          if (!vmcb01_page)
>                  goto out;
>
> @@ -1430,7 +1430,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
>                   * SEV-ES guests require a separate VMSA page used to contain
>                   * the encrypted register state of the guest.
>                   */
> -                vmsa_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +                vmsa_page = snp_safe_alloc_page(vcpu);
>                  if (!vmsa_page)
>                          goto error_free_vmcb_page;
>
> @@ -4900,6 +4900,16 @@ static int svm_vm_init(struct kvm *kvm)
>          return 0;
>  }
>
> +static void *svm_alloc_apic_backing_page(struct kvm_vcpu *vcpu)
> +{
> +        struct page *page = snp_safe_alloc_page(vcpu);
> +
> +        if (!page)
> +                return NULL;
> +
> +        return page_address(page);
> +}
> +
>  static struct kvm_x86_ops svm_x86_ops __initdata = {
>          .name = KBUILD_MODNAME,
>
> @@ -5031,6 +5041,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
>
>          .vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
>          .vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons,
> +        .alloc_apic_backing_page = svm_alloc_apic_backing_page,
>  };
>
>  /*
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 8ef95139cd24..7f1fbd874c45 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -694,6 +694,7 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm);
>  void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
>  void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
>  void sev_es_unmap_ghcb(struct vcpu_svm *svm);
> +struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu);
>
>  /* vmenter.S */
>
> --
> 2.25.1
>
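
For readers following the page-selection step in snp_safe_alloc_page(), here is a minimal userspace sketch of just the arithmetic. It is not kernel code and not part of the patch. The names PFNS_PER_2MB, pfn_is_2mb_aligned() and snp_safe_pick_pfn() are hypothetical, and the sketch assumes 4 KiB base pages, so that a 2MB region spans 512 page frames (the role PTRS_PER_PMD plays in the patch).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumption: 4 KiB base pages, so one 2MB region covers 512 page
 * frames. The patch uses PTRS_PER_PMD for this; the constant below is
 * a hypothetical stand-in for illustration only.
 */
#define PFNS_PER_2MB 512UL

/* Hypothetical helper: true if this page frame starts a 2MB region. */
static bool pfn_is_2mb_aligned(unsigned long pfn)
{
	return (pfn % PFNS_PER_2MB) == 0;
}

/*
 * Hypothetical model of the selection step. Given the first PFN of an
 * order-1 (two-page) allocation, return the PFN that would be kept;
 * the other page of the pair is the one that would be freed.
 */
static unsigned long snp_safe_pick_pfn(unsigned long first_pfn)
{
	/* An order-1 block is naturally aligned, so its first PFN is even. */
	assert(first_pfn % 2 == 0);

	/*
	 * If the first page is 2MB-aligned, keep the second page instead.
	 * The second PFN is odd, so it can never be a multiple of 512.
	 */
	return pfn_is_2mb_aligned(first_pfn) ? first_pfn + 1 : first_pfn;
}
```

Under these assumptions the kept page is never 2MB-aligned: either the first PFN was already unaligned, or the odd second PFN is returned in its place.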