References: <20231105163040.14904-1-pbonzini@redhat.com> <20231105163040.14904-22-pbonzini@redhat.com>
In-Reply-To: <20231105163040.14904-22-pbonzini@redhat.com>
From: Fuad Tabba
Date: Mon, 6 Nov 2023 11:01:56 +0000
Subject: Re: [PATCH 21/34] KVM: x86: Add support for "protected VMs" that can utilize private memory
To: Paolo Bonzini
Cc: Marc Zyngier, Oliver Upton, Huacai Chen, Michael Ellerman, Anup Patel,
 Paul Walmsley, Palmer Dabbelt, Albert Ou, Sean Christopherson,
 Alexander Viro, Christian Brauner, "Matthew Wilcox (Oracle)",
 Andrew Morton, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Xiaoyao Li, Xu Yilun,
 Chao Peng, Jarkko Sakkinen, Anish Moorthy, David Matlack, Yu Zhang,
 Isaku Yamahata, Mickaël Salaün, Vlastimil Babka, Vishal Annapurve,
 Ackerley Tng, Maciej Szmigiero, David Hildenbrand, Quentin Perret,
 Michael Roth, Wang, Liam Merwick, Isaku Yamahata, "Kirill A. Shutemov"

Hi,

On Sun, Nov 5, 2023 at 4:33 PM Paolo Bonzini wrote:
>
> From: Sean Christopherson
>
> Add a new x86 VM type, KVM_X86_SW_PROTECTED_VM, to serve as a development
> and testing vehicle for Confidential (CoCo) VMs, and potentially to even
> become a "real" product in the distant future, e.g. a la pKVM.
>
> The private memory support in KVM x86 is aimed at AMD's SEV-SNP and
> Intel's TDX, but those technologies are extremely complex (understatement),
> difficult to debug, don't support running as nested guests, and require
> hardware that's isn't universally accessible. I.e. relying SEV-SNP or TDX

(replied to v13 earlier, sorry)

nit: "that isn't"

Reviewed-by: Fuad Tabba
Tested-by: Fuad Tabba

Cheers,
/fuad

> for maintaining guest private memory isn't a realistic option.
>
> At the very least, KVM_X86_SW_PROTECTED_VM will enable a variety of
> selftests for guest_memfd and private memory support without requiring
> unique hardware.
>
> Signed-off-by: Sean Christopherson
> Reviewed-by: Paolo Bonzini
> Message-Id: <20231027182217.3615211-24-seanjc@google.com>
> Signed-off-by: Paolo Bonzini
> ---
>  Documentation/virt/kvm/api.rst  | 32 ++++++++++++++++++++++++++++++++
>  arch/x86/include/asm/kvm_host.h | 15 +++++++++------
>  arch/x86/include/uapi/asm/kvm.h |  3 +++
>  arch/x86/kvm/Kconfig            | 12 ++++++++++++
>  arch/x86/kvm/mmu/mmu_internal.h |  1 +
>  arch/x86/kvm/x86.c              | 16 +++++++++++++++-
>  include/uapi/linux/kvm.h        |  1 +
>  virt/kvm/Kconfig                |  5 +++++
>  8 files changed, 78 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 4a9a291380ad..38882263278d 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -147,10 +147,29 @@ described as 'basic' will be available.
>  The new VM has no virtual cpus and no memory.
>  You probably want to use 0 as machine type.
>
> +X86:
> +^^^^
> +
> +Supported X86 VM types can be queried via KVM_CAP_VM_TYPES.
> +
> +S390:
> +^^^^^
> +
>  In order to create user controlled virtual machines on S390, check
>  KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as
>  privileged user (CAP_SYS_ADMIN).
>
> +MIPS:
> +^^^^^
> +
> +To use hardware assisted virtualization on MIPS (VZ ASE) rather than
> +the default trap & emulate implementation (which changes the virtual
> +memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
> +flag KVM_VM_MIPS_VZ.
> +
> +ARM64:
> +^^^^^^
> +
>  On arm64, the physical address size for a VM (IPA Size limit) is limited
>  to 40bits by default. The limit can be configured if the host supports the
>  extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
> @@ -8766,6 +8785,19 @@ block sizes is exposed in KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES as a
>  64-bit bitmap (each bit describing a block size). The default value is
>  0, to disable the eager page splitting.
>
> +8.41 KVM_CAP_VM_TYPES
> +---------------------
> +
> +:Capability: KVM_CAP_MEMORY_ATTRIBUTES
> +:Architectures: x86
> +:Type: system ioctl
> +
> +This capability returns a bitmap of support VM types. The 1-setting of bit @n
> +means the VM type with value @n is supported. Possible values of @n are::
> +
> +  #define KVM_X86_DEFAULT_VM       0
> +  #define KVM_X86_SW_PROTECTED_VM  1
> +
>  9. Known KVM API problems
>  =========================
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 75ab0da06e64..a565a2e70f30 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1255,6 +1255,7 @@ enum kvm_apicv_inhibit {
>  };
>
>  struct kvm_arch {
> +       unsigned long vm_type;
>         unsigned long n_used_mmu_pages;
>         unsigned long n_requested_mmu_pages;
>         unsigned long n_max_mmu_pages;
> @@ -2089,6 +2090,12 @@ void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd);
>  void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
>                        int tdp_max_root_level, int tdp_huge_page_level);
>
> +#ifdef CONFIG_KVM_PRIVATE_MEM
> +#define kvm_arch_has_private_mem(kvm) ((kvm)->arch.vm_type != KVM_X86_DEFAULT_VM)
> +#else
> +#define kvm_arch_has_private_mem(kvm) false
> +#endif
> +
>  static inline u16 kvm_read_ldt(void)
>  {
>         u16 ldt;
> @@ -2137,14 +2144,10 @@ enum {
>  #define HF_SMM_INSIDE_NMI_MASK (1 << 2)
>
>  # define KVM_MAX_NR_ADDRESS_SPACES 2
> +/* SMM is currently unsupported for guests with private memory. */
> +# define kvm_arch_nr_memslot_as_ids(kvm) (kvm_arch_has_private_mem(kvm) ? 1 : 2)
>  # define kvm_arch_vcpu_memslots_id(vcpu) ((vcpu)->arch.hflags & HF_SMM_MASK ? 1 : 0)
>  # define kvm_memslots_for_spte_role(kvm, role) __kvm_memslots(kvm, (role).smm)
> -
> -static inline int kvm_arch_nr_memslot_as_ids(struct kvm *kvm)
> -{
> -       return KVM_MAX_NR_ADDRESS_SPACES;
> -}
> -
>  #else
>  # define kvm_memslots_for_spte_role(kvm, role) __kvm_memslots(kvm, 0)
>  #endif
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 1a6a1f987949..a448d0964fc0 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -562,4 +562,7 @@ struct kvm_pmu_event_filter {
>  /* x86-specific KVM_EXIT_HYPERCALL flags. */
>  #define KVM_EXIT_HYPERCALL_LONG_MODE BIT(0)
>
> +#define KVM_X86_DEFAULT_VM       0
> +#define KVM_X86_SW_PROTECTED_VM  1
> +
>  #endif /* _ASM_X86_KVM_H */
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index e61383674c75..c1716e83d176 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -77,6 +77,18 @@ config KVM_WERROR
>
>           If in doubt, say "N".
>
> +config KVM_SW_PROTECTED_VM
> +       bool "Enable support for KVM software-protected VMs"
> +       depends on EXPERT
> +       depends on X86_64
> +       select KVM_GENERIC_PRIVATE_MEM
> +       help
> +         Enable support for KVM software-protected VMs. Currently "protected"
> +         means the VM can be backed with memory provided by
> +         KVM_CREATE_GUEST_MEMFD.
> +
> +         If unsure, say "N".
> +
>  config KVM_INTEL
>         tristate "KVM for Intel (and compatible) processors support"
>         depends on KVM && IA32_FEAT_CTL
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index 86c7cb692786..b66a7d47e0e4 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -297,6 +297,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>                 .max_level = KVM_MAX_HUGEPAGE_LEVEL,
>                 .req_level = PG_LEVEL_4K,
>                 .goal_level = PG_LEVEL_4K,
> +               .is_private = kvm_mem_is_private(vcpu->kvm, cr2_or_gpa >> PAGE_SHIFT),
>         };
>         int r;
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index f521c97f5c64..6d0772b47041 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4548,6 +4548,13 @@ static int kvm_ioctl_get_supported_hv_cpuid(struct kvm_vcpu *vcpu,
>         return 0;
>  }
>
> +static bool kvm_is_vm_type_supported(unsigned long type)
> +{
> +       return type == KVM_X86_DEFAULT_VM ||
> +              (type == KVM_X86_SW_PROTECTED_VM &&
> +               IS_ENABLED(CONFIG_KVM_SW_PROTECTED_VM) && tdp_enabled);
> +}
> +
>  int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  {
>         int r = 0;
> @@ -4739,6 +4746,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>         case KVM_CAP_X86_NOTIFY_VMEXIT:
>                 r = kvm_caps.has_notify_vmexit;
>                 break;
> +       case KVM_CAP_VM_TYPES:
> +               r = BIT(KVM_X86_DEFAULT_VM);
> +               if (kvm_is_vm_type_supported(KVM_X86_SW_PROTECTED_VM))
> +                       r |= BIT(KVM_X86_SW_PROTECTED_VM);
> +               break;
>         default:
>                 break;
>         }
> @@ -12436,9 +12448,11 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>         int ret;
>         unsigned long flags;
>
> -       if (type)
> +       if (!kvm_is_vm_type_supported(type))
>                 return -EINVAL;
>
> +       kvm->arch.vm_type = type;
> +
>         ret = kvm_page_track_init(kvm);
>         if (ret)
>                 goto out;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 8eb10f560c69..e9cb2df67a1d 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1227,6 +1227,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_MEMORY_FAULT_INFO 232
>  #define KVM_CAP_MEMORY_ATTRIBUTES 233
>  #define KVM_CAP_GUEST_MEMFD 234
> +#define KVM_CAP_VM_TYPES 235
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index 08afef022db9..2c964586aa14 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -104,3 +104,8 @@ config KVM_GENERIC_MEMORY_ATTRIBUTES
>  config KVM_PRIVATE_MEM
>         select XARRAY_MULTI
>         bool
> +
> +config KVM_GENERIC_PRIVATE_MEM
> +       select KVM_GENERIC_MEMORY_ATTRIBUTES
> +       select KVM_PRIVATE_MEM
> +       bool
> --
> 2.39.1
>
>
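
For anyone wanting to exercise this from userspace, here is a minimal sketch of decoding the KVM_CAP_VM_TYPES bitmap the way the api.rst hunk describes it (bit @n set means VM type @n is supported). The helper names vm_type_supported() and pick_vm_type() are hypothetical and not part of the patch, and the two type constants are redefined locally on the assumption that the installed uapi headers may predate this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the quoted patch; only defined here if the
 * installed headers do not already provide them (assumption). */
#ifndef KVM_X86_DEFAULT_VM
#define KVM_X86_DEFAULT_VM 0
#endif
#ifndef KVM_X86_SW_PROTECTED_VM
#define KVM_X86_SW_PROTECTED_VM 1
#endif

/* The capability value is an int, so only bits 0..30 are usable here. */
static_assert(KVM_X86_SW_PROTECTED_VM < 31, "VM type must fit in the bitmap");

/* Hypothetical helper: test bit @type in the bitmap that
 * KVM_CHECK_EXTENSION(KVM_CAP_VM_TYPES) would return.  A zero or
 * negative value (capability unknown, or an error) reports nothing. */
static bool vm_type_supported(int types_bitmap, unsigned int type)
{
	if (types_bitmap <= 0 || type >= 31)
		return false;
	return (types_bitmap >> type) & 1;
}

/* Hypothetical helper: prefer the software-protected type when the
 * kernel advertises it, otherwise fall back to the default VM type. */
static unsigned long pick_vm_type(int types_bitmap)
{
	if (vm_type_supported(types_bitmap, KVM_X86_SW_PROTECTED_VM))
		return KVM_X86_SW_PROTECTED_VM;
	return KVM_X86_DEFAULT_VM;
}
```

The bitmap would come from ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_VM_TYPES) on /dev/kvm, and the value chosen by pick_vm_type() would be passed as the type argument of KVM_CREATE_VM. Per the x86.c hunk, kvm_arch_init_vm() rejects an unsupported type with -EINVAL, so falling back to the default type on older kernels keeps VM creation working.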