From mboxrd@z Thu Jan 1 00:00:00 1970
Reply-To: Sean Christopherson
Date: Fri, 27 Oct 2023 11:22:05 -0700
In-Reply-To: <20231027182217.3615211-1-seanjc@google.com>
Mime-Version: 1.0
References: <20231027182217.3615211-1-seanjc@google.com>
X-Mailer: git-send-email 2.42.0.820.g83a721a137-goog
Message-ID: <20231027182217.3615211-24-seanjc@google.com>
Subject: [PATCH v13 23/35] KVM: x86: Add support for "protected VMs" that can utilize private memory
From: Sean Christopherson
To: Paolo Bonzini, Marc Zyngier, Oliver Upton, Huacai Chen,
	Michael Ellerman, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Sean Christopherson, Alexander Viro, Christian Brauner,
	"Matthew Wilcox (Oracle)", Andrew Morton
Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
	linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Xiaoyao Li, Xu Yilun, Chao Peng, Fuad Tabba, Jarkko Sakkinen,
	Anish Moorthy, David Matlack, Yu Zhang, Isaku Yamahata,
	Mickaël Salaün, Vlastimil Babka, Vishal Annapurve, Ackerley Tng,
	Maciej Szmigiero, David Hildenbrand, Quentin Perret, Michael Roth,
	Wang, Liam Merwick, Isaku Yamahata, "Kirill A . Shutemov"
Content-Type: text/plain; charset="UTF-8"

Add a new x86 VM type, KVM_X86_SW_PROTECTED_VM, to serve as a development
and testing vehicle for Confidential (CoCo) VMs, and potentially to even
become a "real" product in the distant future, e.g. a la pKVM.

The private memory support in KVM x86 is aimed at AMD's SEV-SNP and
Intel's TDX, but those technologies are extremely complex (understatement),
difficult to debug, don't support running as nested guests, and require
hardware that isn't universally accessible. I.e. relying on SEV-SNP or TDX
for maintaining guest private memory isn't a realistic option.

At the very least, KVM_X86_SW_PROTECTED_VM will enable a variety of
selftests for guest_memfd and private memory support without requiring
unique hardware.

Signed-off-by: Sean Christopherson
---
 Documentation/virt/kvm/api.rst  | 32 ++++++++++++++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h | 15 +++++++++------
 arch/x86/include/uapi/asm/kvm.h |  3 +++
 arch/x86/kvm/Kconfig            | 12 ++++++++++++
 arch/x86/kvm/mmu/mmu_internal.h |  1 +
 arch/x86/kvm/x86.c              | 16 +++++++++++++++-
 include/uapi/linux/kvm.h        |  1 +
 virt/kvm/Kconfig                |  5 +++++
 8 files changed, 78 insertions(+), 7 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 38dc1fda4f45..00029436ac5b 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -147,10 +147,29 @@ described as 'basic' will be available.
 The new VM has no virtual cpus and no memory.
 You probably want to use 0 as machine type.
 
+X86:
+^^^^
+
+Supported X86 VM types can be queried via KVM_CAP_VM_TYPES.
+
+S390:
+^^^^^
+
 In order to create user controlled virtual machines on S390, check
 KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as
 privileged user (CAP_SYS_ADMIN).
 
+MIPS:
+^^^^^
+
+To use hardware assisted virtualization on MIPS (VZ ASE) rather than
+the default trap & emulate implementation (which changes the virtual
+memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
+flag KVM_VM_MIPS_VZ.
+
+ARM64:
+^^^^^^
+
 On arm64, the physical address size for a VM (IPA Size limit) is limited
 to 40bits by default. The limit can be configured if the host supports the
 extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
@@ -8650,6 +8669,19 @@ block sizes is exposed in KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES as a
 64-bit bitmap (each bit describing a block size). The default value is
 0, to disable the eager page splitting.
 
+8.41 KVM_CAP_VM_TYPES
+---------------------
+
+:Capability: KVM_CAP_VM_TYPES
+:Architectures: x86
+:Type: system ioctl
+
+This capability returns a bitmap of supported VM types. The 1-setting of bit @n
+means the VM type with value @n is supported. Possible values of @n are::
+
+  #define KVM_X86_DEFAULT_VM 0
+  #define KVM_X86_SW_PROTECTED_VM 1
+
 9. Known KVM API problems
 =========================
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f9e8d5642069..dff10051e9b6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1244,6 +1244,7 @@ enum kvm_apicv_inhibit {
 };
 
 struct kvm_arch {
+	unsigned long vm_type;
 	unsigned long n_used_mmu_pages;
 	unsigned long n_requested_mmu_pages;
 	unsigned long n_max_mmu_pages;
@@ -2077,6 +2078,12 @@ void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd);
 void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
 		       int tdp_max_root_level, int tdp_huge_page_level);
 
+#ifdef CONFIG_KVM_PRIVATE_MEM
+#define kvm_arch_has_private_mem(kvm) ((kvm)->arch.vm_type != KVM_X86_DEFAULT_VM)
+#else
+#define kvm_arch_has_private_mem(kvm) false
+#endif
+
 static inline u16 kvm_read_ldt(void)
 {
 	u16 ldt;
@@ -2125,14 +2132,10 @@ enum {
 #define HF_SMM_INSIDE_NMI_MASK (1 << 2)
 
 # define KVM_MAX_NR_ADDRESS_SPACES 2
+/* SMM is currently unsupported for guests with private memory. */
+# define kvm_arch_nr_memslot_as_ids(kvm) (kvm_arch_has_private_mem(kvm) ? 1 : 2)
 # define kvm_arch_vcpu_memslots_id(vcpu) ((vcpu)->arch.hflags & HF_SMM_MASK ? 1 : 0)
 # define kvm_memslots_for_spte_role(kvm, role) __kvm_memslots(kvm, (role).smm)
-
-static inline int kvm_arch_nr_memslot_as_ids(struct kvm *kvm)
-{
-	return KVM_MAX_NR_ADDRESS_SPACES;
-}
-
 #else
 # define kvm_memslots_for_spte_role(kvm, role) __kvm_memslots(kvm, 0)
 #endif
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 1a6a1f987949..a448d0964fc0 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -562,4 +562,7 @@ struct kvm_pmu_event_filter {
 /* x86-specific KVM_EXIT_HYPERCALL flags. */
 #define KVM_EXIT_HYPERCALL_LONG_MODE BIT(0)
 
+#define KVM_X86_DEFAULT_VM 0
+#define KVM_X86_SW_PROTECTED_VM 1
+
 #endif /* _ASM_X86_KVM_H */
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 091b74599c22..8452ed0228cb 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -77,6 +77,18 @@ config KVM_WERROR
 
 	  If in doubt, say "N".
 
+config KVM_SW_PROTECTED_VM
+	bool "Enable support for KVM software-protected VMs"
+	depends on EXPERT
+	depends on X86_64
+	select KVM_GENERIC_PRIVATE_MEM
+	help
+	  Enable support for KVM software-protected VMs. Currently "protected"
+	  means the VM can be backed with memory provided by
+	  KVM_CREATE_GUEST_MEMFD.
+
+	  If unsure, say "N".
+
 config KVM_INTEL
 	tristate "KVM for Intel (and compatible) processors support"
 	depends on KVM && IA32_FEAT_CTL
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 86c7cb692786..b66a7d47e0e4 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -297,6 +297,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		.max_level = KVM_MAX_HUGEPAGE_LEVEL,
 		.req_level = PG_LEVEL_4K,
 		.goal_level = PG_LEVEL_4K,
+		.is_private = kvm_mem_is_private(vcpu->kvm, cr2_or_gpa >> PAGE_SHIFT),
 	};
 	int r;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c4d17727b199..e3eb608b6692 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4441,6 +4441,13 @@ static int kvm_ioctl_get_supported_hv_cpuid(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
+static bool kvm_is_vm_type_supported(unsigned long type)
+{
+	return type == KVM_X86_DEFAULT_VM ||
+	       (type == KVM_X86_SW_PROTECTED_VM &&
+		IS_ENABLED(CONFIG_KVM_SW_PROTECTED_VM) && tdp_enabled);
+}
+
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 	int r = 0;
@@ -4632,6 +4639,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_X86_NOTIFY_VMEXIT:
 		r = kvm_caps.has_notify_vmexit;
 		break;
+	case KVM_CAP_VM_TYPES:
+		r = BIT(KVM_X86_DEFAULT_VM);
+		if (kvm_is_vm_type_supported(KVM_X86_SW_PROTECTED_VM))
+			r |= BIT(KVM_X86_SW_PROTECTED_VM);
+		break;
 	default:
 		break;
 	}
@@ -12314,9 +12326,11 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	int ret;
 	unsigned long flags;
 
-	if (type)
+	if (!kvm_is_vm_type_supported(type))
 		return -EINVAL;
 
+	kvm->arch.vm_type = type;
+
 	ret = kvm_page_track_init(kvm);
 	if (ret)
 		goto out;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 29e9eb51dec9..5b5820d19e71 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1218,6 +1218,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_MEMORY_FAULT_INFO 231
 #define KVM_CAP_MEMORY_ATTRIBUTES 232
 #define KVM_CAP_GUEST_MEMFD 233
+#define KVM_CAP_VM_TYPES 234
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 08afef022db9..2c964586aa14 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -104,3 +104,8 @@ config KVM_GENERIC_MEMORY_ATTRIBUTES
 config KVM_PRIVATE_MEM
        select XARRAY_MULTI
        bool
+
+config KVM_GENERIC_PRIVATE_MEM
+       select KVM_GENERIC_MEMORY_ATTRIBUTES
+       select KVM_PRIVATE_MEM
+       bool
-- 
2.42.0.820.g83a721a137-goog
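
For illustration only, here is a minimal userspace-side sketch of how the
KVM_CAP_VM_TYPES bitmap described in the api.rst hunk might be decoded.
The helper vm_type_in_bitmap() is hypothetical and not part of this patch;
the two VM type values are copied locally from the uapi additions above so
that the snippet builds without updated kernel headers.

```c
#include <stdbool.h>

/* Values copied from the arch/x86/include/uapi/asm/kvm.h hunk in this patch. */
#define KVM_X86_DEFAULT_VM      0
#define KVM_X86_SW_PROTECTED_VM 1

/*
 * Hypothetical helper (an assumption for illustration, not in the patch):
 * return true if bit @type is set in @bitmap, i.e. if the VM type with
 * value @type is advertised as supported.  Out-of-range types are treated
 * as unsupported instead of shifting past the width of the bitmap.
 */
static inline bool vm_type_in_bitmap(unsigned long bitmap, unsigned int type)
{
	if (type >= sizeof(bitmap) * 8)
		return false;

	return (bitmap >> type) & 1UL;
}
```

In real use, the bitmap would presumably be the return value of
ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_VM_TYPES), and a supported type
would then be passed as the machine type argument of KVM_CREATE_VM, which
is where kvm_arch_init_vm() applies kvm_is_vm_type_supported().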