From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id DE7EAC4167D for ; Sun, 5 Nov 2023 16:33:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7F0BA440166; Sun, 5 Nov 2023 11:33:45 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 779B9440150; Sun, 5 Nov 2023 11:33:45 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5F3BF440166; Sun, 5 Nov 2023 11:33:45 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 4CE12440150 for ; Sun, 5 Nov 2023 11:33:45 -0500 (EST) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 28D96A0494 for ; Sun, 5 Nov 2023 16:33:45 +0000 (UTC) X-FDA: 81424446810.28.FD9FAE4 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf21.hostedemail.com (Postfix) with ESMTP id 6BD121C0008 for ; Sun, 5 Nov 2023 16:33:43 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=aNzFeKgF; spf=pass (imf21.hostedemail.com: domain of pbonzini@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=pbonzini@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1699202023; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=z6jLn+Cf+ELqic+NRdH1pwNqi5BNqb8m7PLldT/zVQQ=; b=jeDID4Xg5Ng5xGRnXaSgOrEIM5/v7Z+zQN9R1SaKiY1xg8QHKYn6fIQ+UdVkxllH+qyhRF XqUHq+1Wzl50rp3e4+QxYqr49rrrgagb0P9aXNZrtF8pRiO1w8sg5xfpsYUSLcSfclCV7H LuG9A0oVR5WVjWwiT8ytWwn5s7ih0gg= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1699202023; a=rsa-sha256; cv=none; b=YuWZP5Kl4bHDVyMcNyT27Zb5AXkGNAHH+jViTbERmmLwxPYyS3IHDW0lw49j37e7tZrk18 sBU6lAccRYGc8Cy0BZyMQnos20ZQyCWJvoIdFRUg2epFrx/YH8mPfIDMa3dfctvk9ZrGzt 4KOf87NUZa2bp24gNTUV+ULvy5xTIhk= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=aNzFeKgF; spf=pass (imf21.hostedemail.com: domain of pbonzini@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=pbonzini@redhat.com; dmarc=pass (policy=none) header.from=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1699202022; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=z6jLn+Cf+ELqic+NRdH1pwNqi5BNqb8m7PLldT/zVQQ=; b=aNzFeKgFVugBM+Mu4hpHsVfzzPsRCBgAS5CqbI1iKLDx3mfGOmSaDGFXIJ7t5e0Xlm/gVM 9UFnlCphahVKYS37n4IJ+vSFf4+3tB+s03EsBQNIPeZ/gsNqkN1b3LB0bqmq29Tj3nZxpz Zv6kvk5e5XdbDXHkJ4AJVI7Hyn4k+PE= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-436-ZVTWeWvRMHGJrKSqBypOSw-1; Sun, 05 Nov 2023 11:33:38 -0500 X-MC-Unique: ZVTWeWvRMHGJrKSqBypOSw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id ECB6C3C0FC8C; Sun, 5 Nov 2023 16:33:35 +0000 (UTC) Received: from avogadro.redhat.com (unknown [10.39.192.93]) by smtp.corp.redhat.com (Postfix) with ESMTP id B1B8D2166B26; Sun, 5 Nov 2023 16:33:28 +0000 (UTC) From: Paolo Bonzini To: Paolo Bonzini , Marc Zyngier , Oliver Upton , Huacai Chen , Michael Ellerman , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Alexander Viro , Christian Brauner , "Matthew Wilcox (Oracle)" , Andrew Morton Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Xiaoyao Li , Xu Yilun , Chao Peng , Fuad Tabba , Jarkko Sakkinen , Anish Moorthy , David Matlack , Yu Zhang , Isaku Yamahata , =?UTF-8?q?Micka=C3=ABl=20Sala=C3=BCn?= , Vlastimil Babka , Vishal Annapurve , Ackerley Tng , Maciej Szmigiero , David Hildenbrand , Quentin Perret , Michael Roth , Wang , Liam Merwick , Isaku Yamahata , "Kirill A. Shutemov" Subject: [PATCH 21/34] KVM: x86: Add support for "protected VMs" that can utilize private memory Date: Sun, 5 Nov 2023 17:30:24 +0100 Message-ID: <20231105163040.14904-22-pbonzini@redhat.com> In-Reply-To: <20231105163040.14904-1-pbonzini@redhat.com> References: <20231105163040.14904-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 X-Rspamd-Queue-Id: 6BD121C0008 X-Rspam-User: X-Stat-Signature: ra8ss4uo7de131fi61bfkatoiaczffn8 X-Rspamd-Server: rspam03 X-HE-Tag: 1699202023-795643 X-HE-Meta: U2FsdGVkX1/0BW7Sj0bFYK/ZoeBARB1jX/9wClBGR0uzlKz3sPjV67yap8iryCQ2RiovBeEgEGLwMLC3jioVQbtN6Bz0U6vnsISC0cZDnznut5Yq1SES8u4430+0CYCagflvj2McAKpqtRhDUGrhUlTeG93w6XeW5hiMwEpzMJJ2cX8KMKVFs1FMSKpXl5uQX8Bc10uIRH207leGTaMwOiAdk0sIkwLHhsNnVjyiWk1gI6XWLGqkHErOSelcJ4iqnLULiu9PoZt0ycO8WvWXdsgrveHpealBNNLfbvdl9DSapEg+w4UuLagGbgzc4Pq648QicYU17UP2zIx1a2fEoCkQ5x24J2ugZE8KfUvrQMtVWnMGXNrJQg64SjqtgCQOti6YrG5++oQllLpEk29uyNTZ/2otFU6DkmxBqS/MeEUkTi+bWv72CWg0XBEEw+m6c037AHpt8qznB80vOGIHa6zR3N7ywaMv3SW91ACYi/W2MHZ1tdArLvk6168dEW12iiKqOzxpmMtIhXKBG48+Wzi4Tk4WWo6iQ+dh6sLCLhVuRPvZLhBUCOAvTZhea56CoUAwsrC3yqQSk1HtpzHdVYBcVOc0jsxnlVXcgN0VymqwEfDY893AhCFFQxvfoJBmQBG9Sqv+8JgYeYKnplmP1tAjABLA79aXnckEyuA9r75arrm0SjjyQ8VU5D8ctcvEVfbEw8UBHlmTk4tasBLl9M+T3hhaxnl6sart7YIu2bbDQqIom7pAuWP5xoM4gzsU+IDG35+RPnFBH5HhZlunJYdeuaSUQwFiSyCRQxl7RwAtdJkgNVv9kXxSznLIs1jSz/qWQs8pJkXjzvHbOk5Ifcu0eA/Lfq0LaZB+OGoY4ive1cUomboVp2EMyLzIUYuem71i0NNx/pKhpzRtKBLpgi8r5uoG6ZZpRv8yPVehg0R8dEDA5wH1rZPkSZ1Ni9F43QQiFN0H73rFaEg/9bz PePdKLF9 cKVMFRJLClBbDRbY8Ggao4WSfceb4+PkvwDAO115imVnje+ibuBerRnmIR6YvE9b+WgUJVCoY9RO1np2NeSWUFHeDmpiH73xQ3LB2GUbJ7Z+P+aHMTg82bVX5n1imRj+5WZahbvTZZkHLCLNrWJDXwGPR461dbbvre00O4Taze+JNVLYMyKyBX1YHiT+rvlMXSZoI9SRpIuio7/USwUbVlr7j3RDrOfu/qd59xLd3fXAjKuTHP8Jx5uzA+BxSS+bIgYDQo036C6gdIds= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Sean Christopherson Add a new x86 VM type, KVM_X86_SW_PROTECTED_VM, to serve as a development and testing vehicle for Confidential (CoCo) VMs, and potentially to even become a "real" product in the distant future, e.g. a la pKVM. The private memory support in KVM x86 is aimed at AMD's SEV-SNP and Intel's TDX, but those technologies are extremely complex (understatement), difficult to debug, don't support running as nested guests, and require hardware that's isn't universally accessible. I.e. relying SEV-SNP or TDX for maintaining guest private memory isn't a realistic option. At the very least, KVM_X86_SW_PROTECTED_VM will enable a variety of selftests for guest_memfd and private memory support without requiring unique hardware. Signed-off-by: Sean Christopherson Reviewed-by: Paolo Bonzini Message-Id: <20231027182217.3615211-24-seanjc@google.com> Signed-off-by: Paolo Bonzini --- Documentation/virt/kvm/api.rst | 32 ++++++++++++++++++++++++++++++++ arch/x86/include/asm/kvm_host.h | 15 +++++++++------ arch/x86/include/uapi/asm/kvm.h | 3 +++ arch/x86/kvm/Kconfig | 12 ++++++++++++ arch/x86/kvm/mmu/mmu_internal.h | 1 + arch/x86/kvm/x86.c | 16 +++++++++++++++- include/uapi/linux/kvm.h | 1 + virt/kvm/Kconfig | 5 +++++ 8 files changed, 78 insertions(+), 7 deletions(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 4a9a291380ad..38882263278d 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -147,10 +147,29 @@ described as 'basic' will be available. The new VM has no virtual cpus and no memory. You probably want to use 0 as machine type. +X86: +^^^^ + +Supported X86 VM types can be queried via KVM_CAP_VM_TYPES. + +S390: +^^^^^ + In order to create user controlled virtual machines on S390, check KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as privileged user (CAP_SYS_ADMIN). +MIPS: +^^^^^ + +To use hardware assisted virtualization on MIPS (VZ ASE) rather than +the default trap & emulate implementation (which changes the virtual +memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the +flag KVM_VM_MIPS_VZ. + +ARM64: +^^^^^^ + On arm64, the physical address size for a VM (IPA Size limit) is limited to 40bits by default. The limit can be configured if the host supports the extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use @@ -8766,6 +8785,19 @@ block sizes is exposed in KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES as a 64-bit bitmap (each bit describing a block size). The default value is 0, to disable the eager page splitting. +8.41 KVM_CAP_VM_TYPES +--------------------- + +:Capability: KVM_CAP_MEMORY_ATTRIBUTES +:Architectures: x86 +:Type: system ioctl + +This capability returns a bitmap of support VM types. The 1-setting of bit @n +means the VM type with value @n is supported. Possible values of @n are:: + + #define KVM_X86_DEFAULT_VM 0 + #define KVM_X86_SW_PROTECTED_VM 1 + 9. Known KVM API problems ========================= diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 75ab0da06e64..a565a2e70f30 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1255,6 +1255,7 @@ enum kvm_apicv_inhibit { }; struct kvm_arch { + unsigned long vm_type; unsigned long n_used_mmu_pages; unsigned long n_requested_mmu_pages; unsigned long n_max_mmu_pages; @@ -2089,6 +2090,12 @@ void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd); void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level, int tdp_max_root_level, int tdp_huge_page_level); +#ifdef CONFIG_KVM_PRIVATE_MEM +#define kvm_arch_has_private_mem(kvm) ((kvm)->arch.vm_type != KVM_X86_DEFAULT_VM) +#else +#define kvm_arch_has_private_mem(kvm) false +#endif + static inline u16 kvm_read_ldt(void) { u16 ldt; @@ -2137,14 +2144,10 @@ enum { #define HF_SMM_INSIDE_NMI_MASK (1 << 2) # define KVM_MAX_NR_ADDRESS_SPACES 2 +/* SMM is currently unsupported for guests with private memory. */ +# define kvm_arch_nr_memslot_as_ids(kvm) (kvm_arch_has_private_mem(kvm) ? 1 : 2) # define kvm_arch_vcpu_memslots_id(vcpu) ((vcpu)->arch.hflags & HF_SMM_MASK ? 1 : 0) # define kvm_memslots_for_spte_role(kvm, role) __kvm_memslots(kvm, (role).smm) - -static inline int kvm_arch_nr_memslot_as_ids(struct kvm *kvm) -{ - return KVM_MAX_NR_ADDRESS_SPACES; -} - #else # define kvm_memslots_for_spte_role(kvm, role) __kvm_memslots(kvm, 0) #endif diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 1a6a1f987949..a448d0964fc0 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -562,4 +562,7 @@ struct kvm_pmu_event_filter { /* x86-specific KVM_EXIT_HYPERCALL flags. */ #define KVM_EXIT_HYPERCALL_LONG_MODE BIT(0) +#define KVM_X86_DEFAULT_VM 0 +#define KVM_X86_SW_PROTECTED_VM 1 + #endif /* _ASM_X86_KVM_H */ diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index e61383674c75..c1716e83d176 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -77,6 +77,18 @@ config KVM_WERROR If in doubt, say "N". +config KVM_SW_PROTECTED_VM + bool "Enable support for KVM software-protected VMs" + depends on EXPERT + depends on X86_64 + select KVM_GENERIC_PRIVATE_MEM + help + Enable support for KVM software-protected VMs. Currently "protected" + means the VM can be backed with memory provided by + KVM_CREATE_GUEST_MEMFD. + + If unsure, say "N". + config KVM_INTEL tristate "KVM for Intel (and compatible) processors support" depends on KVM && IA32_FEAT_CTL diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index 86c7cb692786..b66a7d47e0e4 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -297,6 +297,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, .max_level = KVM_MAX_HUGEPAGE_LEVEL, .req_level = PG_LEVEL_4K, .goal_level = PG_LEVEL_4K, + .is_private = kvm_mem_is_private(vcpu->kvm, cr2_or_gpa >> PAGE_SHIFT), }; int r; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index f521c97f5c64..6d0772b47041 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4548,6 +4548,13 @@ static int kvm_ioctl_get_supported_hv_cpuid(struct kvm_vcpu *vcpu, return 0; } +static bool kvm_is_vm_type_supported(unsigned long type) +{ + return type == KVM_X86_DEFAULT_VM || + (type == KVM_X86_SW_PROTECTED_VM && + IS_ENABLED(CONFIG_KVM_SW_PROTECTED_VM) && tdp_enabled); +} + int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) { int r = 0; @@ -4739,6 +4746,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_X86_NOTIFY_VMEXIT: r = kvm_caps.has_notify_vmexit; break; + case KVM_CAP_VM_TYPES: + r = BIT(KVM_X86_DEFAULT_VM); + if (kvm_is_vm_type_supported(KVM_X86_SW_PROTECTED_VM)) + r |= BIT(KVM_X86_SW_PROTECTED_VM); + break; default: break; } @@ -12436,9 +12448,11 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) int ret; unsigned long flags; - if (type) + if (!kvm_is_vm_type_supported(type)) return -EINVAL; + kvm->arch.vm_type = type; + ret = kvm_page_track_init(kvm); if (ret) goto out; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 8eb10f560c69..e9cb2df67a1d 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1227,6 +1227,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_MEMORY_FAULT_INFO 232 #define KVM_CAP_MEMORY_ATTRIBUTES 233 #define KVM_CAP_GUEST_MEMFD 234 +#define KVM_CAP_VM_TYPES 235 #ifdef KVM_CAP_IRQ_ROUTING diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig index 08afef022db9..2c964586aa14 100644 --- a/virt/kvm/Kconfig +++ b/virt/kvm/Kconfig @@ -104,3 +104,8 @@ config KVM_GENERIC_MEMORY_ATTRIBUTES config KVM_PRIVATE_MEM select XARRAY_MULTI bool + +config KVM_GENERIC_PRIVATE_MEM + select KVM_GENERIC_MEMORY_ATTRIBUTES + select KVM_PRIVATE_MEM + bool -- 2.39.1