From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE60BEB64DA for ; Wed, 5 Jul 2023 08:13:38 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 146616B0074; Wed, 5 Jul 2023 04:13:38 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0F6526B0075; Wed, 5 Jul 2023 04:13:38 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F008D8D0001; Wed, 5 Jul 2023 04:13:37 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id E0EDB6B0074 for ; Wed, 5 Jul 2023 04:13:37 -0400 (EDT) Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id A1AF51A06A0 for ; Wed, 5 Jul 2023 08:13:37 +0000 (UTC) X-FDA: 80976844074.12.066A396 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf22.hostedemail.com (Postfix) with ESMTP id CAB25C0028 for ; Wed, 5 Jul 2023 08:13:34 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=IRe4bMTg; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf22.hostedemail.com: domain of yuan.yao@linux.intel.com has no SPF policy when checking 192.55.52.151) smtp.mailfrom=yuan.yao@linux.intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1688544815; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=GyFSyDlf2yyzDcdnCAdxVJEXDGTJX8oLwq4TxBGnYHY=; b=cGuyK+Ww5bfqUlqYLQo5YB6YUI2UViPFTI/Qh5jm2vRJg6GIHYCLX6/3KVPsdcgmlNx6gn x82br0lsSJSu8qS24husVzSayVPRd5JsgRZIX84vbwkCCgLj++ewTziANyphjFmcseiDte y1mi0+aaZmmyY7fwzmi+VcP3dIA3B9g= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=IRe4bMTg; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf22.hostedemail.com: domain of yuan.yao@linux.intel.com has no SPF policy when checking 192.55.52.151) smtp.mailfrom=yuan.yao@linux.intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1688544815; a=rsa-sha256; cv=none; b=Jy8k2XlC69A0Yo+1YEK/eh02Y9qsPBroehGnk3hDWEd+IurraQZN2WgsO58Rl1SoiLoZ1+ Qv9xhuh7YilrCImw6pFSDAP/9xm7i5f/IxQgLfW+FnJXG7QBZeI55ZUcBDtf+1P/ML8SMa TdXkxHMb7oh+KJPPIeTuyxRX3EIZO8A= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1688544814; x=1720080814; h=date:from:to:cc:subject:message-id:references: mime-version:in-reply-to; bh=e/iEbxlMpbFZkhrHZbFCGIHogRTV2vFKjs4hj9pLW+k=; b=IRe4bMTgvLUeiko8pffEz4s4nD2zU1h1HJWweHdBytZGdQyLjg0WKlSH yAM5ohfS5RlXd3uc+DlyMlbBC1l10aFovHRw0oMUafZAod2b0++hz+vYn DC32roi6jax6mzyEiMfnnuEp0l814eLGKvu39649SwQGwyRlRhvi3E62I BJWoDN5PcWLdP4bXchviuJYX0G1+HbztRqPI4HzJp5WnehP0QLlrchzG8 30VCk+d30FBg1wJieq/YwlSZvg2sDmYBXFMV5d7Tw9KXZjZbZAB81JwDP CDcNTDRau2t4cGbazm1ir49Qt3kFGn0hMssPx1tg1ZwY041XwsvotkdPE A==; X-IronPort-AV: E=McAfee;i="6600,9927,10761"; a="343611832" X-IronPort-AV: E=Sophos;i="6.01,182,1684825200"; d="scan'208";a="343611832" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Jul 2023 01:13:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10761"; a="669324964" X-IronPort-AV: E=Sophos;i="6.01,182,1684825200"; d="scan'208";a="669324964" Received: from yy-desk-7060.sh.intel.com (HELO localhost) ([10.239.159.76]) by orsmga003.jf.intel.com with ESMTP; 05 Jul 2023 01:13:24 -0700 Date: Wed, 5 Jul 2023 16:13:23 +0800 From: Yuan Yao To: Kai Huang Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org, dave.hansen@intel.com, kirill.shutemov@linux.intel.com, tony.luck@intel.com, peterz@infradead.org, tglx@linutronix.de, bp@alien8.de, mingo@redhat.com, hpa@zytor.com, seanjc@google.com, pbonzini@redhat.com, david@redhat.com, dan.j.williams@intel.com, rafael.j.wysocki@intel.com, ashok.raj@intel.com, reinette.chatre@intel.com, len.brown@intel.com, ak@linux.intel.com, isaku.yamahata@intel.com, ying.huang@intel.com, chao.gao@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com, nik.borisov@suse.com, bagasdotme@gmail.com, sagis@google.com, imammedo@redhat.com Subject: Re: [PATCH v12 15/22] x86/virt/tdx: Configure global KeyID on all packages Message-ID: <20230705081323.24lm5fa5vqrcidvz@yy-desk-7060> References: <0c5a74c4c3aeac57b623c98d456aa059f833ebdd.1687784645.git.kai.huang@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <0c5a74c4c3aeac57b623c98d456aa059f833ebdd.1687784645.git.kai.huang@intel.com> User-Agent: NeoMutt/20171215 X-Rspamd-Queue-Id: CAB25C0028 X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: dutm7h1ika8w5pbfhp5i4p1q4nxidmq5 X-HE-Tag: 1688544814-287160 X-HE-Meta: U2FsdGVkX18Yz4woh1tFE0lUnIijdqJrJp/7Qe6J5wIA3mljVrNLuL79VNeMDWwSMsHiYu1YEuKbQoSl1QcXLZPVtbSKZ0mr1Sr5MWL1Gh0xoSu7a/3ovurwS/bp9NYk5cvBsA0plnwqPN0RDtERicNCRaa10iHBqITDaMnqr37FnLop5DriLW82euB0pficHChubaRAOJxCX3M9TX9tp9UuwBcZjXtyggJFvj3HxFFHJ9F+hqzXqeDT17+kJhvtvu7f+Jtmh1jRqUaRyP3TZmdRjCsdjYY1qF7Ns7D9z6welJvpdsEfofTmuVXI8uxavoumfPrFBcMIptjiQlrBxbxAMXfOxaVow6lTHby23pSaCZPggRXb/+BXa60bpUEIlBjn4WalsTwbLhnjR8CrylJezzNxSjubd9BKbxQpzfPDTIaBw6kdezWVCPtnl9hZdlU8zo/MzbL048o1c1g3Gss7igKjJVXzQ5EBnmrGelwY+qXU/GO7N//BaBvb7aEkAjGHN6Wr6TAyzjV+5cXyn6/vpfpbnOzkI2MukqeAGd5QdM8e2NjrIk4kDzSLq+4V7U8dNrDdHvyUW72xF0xrBzvp5btRTI8/SOxtWskIdp9VDrj1ya02ZKKf1CUhmTaVzb+NWiTZ3dBBzzHe02Sera8yqIED8ODo8f174ABPJ//LYmjNzolLC0bBaseUt1f7YYt+WWCA9fFNEl5Y7uqeqdS0wu0wE1gUq7a7JOiMgUomYuJ/EYC/vo+v4WR54a3JdEV/1JvhoHXMHa7zcJrhjILp9DGoxcWBabEp6uo85CKJvcTdb8Eo+36sq7nZFspOptEkTGfVu4jvMUJ0v49nQZgO9jKZqbEA+ivOJ1qYYhZYCgDvi5nbpbquxIOf4LleuKzoTGkzG2CaNl/EeyNRT/kSo6RhiyjkSO4sW1HKZ3baxrdx58fWd5TECu+IQ8t7rj8Kz5PAa7rqdyRUJAQ w5D757Dr y5a79bl7uEc2urrdxpjvQGjak4w+2lwP5aq951YOdutvXeGTTKFZaK+083B5vQ4LZny/mXd5Jvp7KMlXef0OY9U5Xca61ecclpT4DgSJEJWG5S3dlUedb3w5YaSv8ZEqZjGD5Y0hvuHJ2BE/LWzOh6nvt+w== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Jun 27, 2023 at 02:12:45AM +1200, Kai Huang wrote: > After the list of TDMRs and the global KeyID are configured to the TDX > module, the kernel needs to configure the key of the global KeyID on all > packages using TDH.SYS.KEY.CONFIG. > > This SEAMCALL cannot run parallel on different cpus. Loop all online > cpus and use smp_call_on_cpu() to call this SEAMCALL on the first cpu of > each package. > > To keep things simple, this implementation takes no affirmative steps to > online cpus to make sure there's at least one cpu for each package. The > callers (aka. KVM) can ensure success by ensuring sufficient CPUs are > online for this to succeed. > > Intel hardware doesn't guarantee cache coherency across different > KeyIDs. The PAMTs are transitioning from being used by the kernel > mapping (KeyId 0) to the TDX module's "global KeyID" mapping. > > This means that the kernel must flush any dirty KeyID-0 PAMT cachelines > before the TDX module uses the global KeyID to access the PAMTs. > Otherwise, if those dirty cachelines were written back, they would > corrupt the TDX module's metadata. Aside: This corruption would be > detected by the memory integrity hardware on the next read of the memory > with the global KeyID. The result would likely be fatal to the system > but would not impact TDX security. > > Following the TDX module specification, flush cache before configuring > the global KeyID on all packages. Given the PAMT size can be large > (~1/256th of system RAM), just use WBINVD on all CPUs to flush. > > If TDH.SYS.KEY.CONFIG fails, the TDX module may already have used the > global KeyID to write the PAMTs. Therefore, use WBINVD to flush cache > before returning the PAMTs back to the kernel. Also convert all PAMTs > back to normal by using MOVDIR64B as suggested by the TDX module spec, > although on the platform without the "partial write machine check" > erratum it's OK to leave PAMTs as is. > > Signed-off-by: Kai Huang > Reviewed-by: Isaku Yamahata > Reviewed-by: Kirill A. Shutemov > --- > > v11 -> v12: > - Added Kirill's tag > - Improved changelog (Nikolay) > > v10 -> v11: > - Convert PAMTs back to normal when module initialization fails. > - Fixed an error in changelog > > v9 -> v10: > - Changed to use 'smp_call_on_cpu()' directly to do key configuration. > > v8 -> v9: > - Improved changelog (Dave). > - Improved comments to explain the function to configure global KeyID > "takes no affirmative action to online any cpu". (Dave). > - Improved other comments suggested by Dave. > > v7 -> v8: (Dave) > - Changelog changes: > - Point out this is the step of "multi-steps" of init_tdx_module(). > - Removed MOVDIR64B part. > - Other changes due to removing TDH.SYS.SHUTDOWN and TDH.SYS.LP.INIT. > - Changed to loop over online cpus and use smp_call_function_single() > directly as the patch to shut down TDX module has been removed. > - Removed MOVDIR64B part in comment. > > v6 -> v7: > - Improved changelong and comment to explain why MOVDIR64B isn't used > when returning PAMTs back to the kernel. > > > --- > arch/x86/virt/vmx/tdx/tdx.c | 135 +++++++++++++++++++++++++++++++++++- > arch/x86/virt/vmx/tdx/tdx.h | 1 + > 2 files changed, 134 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c > index 1992245290de..f5d4dbc11aee 100644 > --- a/arch/x86/virt/vmx/tdx/tdx.c > +++ b/arch/x86/virt/vmx/tdx/tdx.c > @@ -31,6 +31,7 @@ > #include > #include > #include > +#include > #include > #include "tdx.h" > > @@ -577,7 +578,8 @@ static void tdmr_get_pamt(struct tdmr_info *tdmr, unsigned long *pamt_base, > *pamt_size = pamt_sz; > } > > -static void tdmr_free_pamt(struct tdmr_info *tdmr) > +static void tdmr_do_pamt_func(struct tdmr_info *tdmr, > + void (*pamt_func)(unsigned long base, unsigned long size)) > { > unsigned long pamt_base, pamt_size; > > @@ -590,9 +592,19 @@ static void tdmr_free_pamt(struct tdmr_info *tdmr) > if (WARN_ON_ONCE(!pamt_base)) > return; > > + (*pamt_func)(pamt_base, pamt_size); > +} > + > +static void free_pamt(unsigned long pamt_base, unsigned long pamt_size) > +{ > free_contig_range(pamt_base >> PAGE_SHIFT, pamt_size >> PAGE_SHIFT); > } > > +static void tdmr_free_pamt(struct tdmr_info *tdmr) > +{ > + tdmr_do_pamt_func(tdmr, free_pamt); > +} > + > static void tdmrs_free_pamt_all(struct tdmr_info_list *tdmr_list) > { > int i; > @@ -621,6 +633,41 @@ static int tdmrs_set_up_pamt_all(struct tdmr_info_list *tdmr_list, > return ret; > } > > +/* > + * Convert TDX private pages back to normal by using MOVDIR64B to > + * clear these pages. Note this function doesn't flush cache of > + * these TDX private pages. The caller should make sure of that. > + */ > +static void reset_tdx_pages(unsigned long base, unsigned long size) > +{ > + const void *zero_page = (const void *)page_address(ZERO_PAGE(0)); > + unsigned long phys, end; > + > + end = base + size; > + for (phys = base; phys < end; phys += 64) > + movdir64b(__va(phys), zero_page); Worried write overflow at beginning but then I recalled that PAMT size is 4KB aligned for 1G/2M/4K entries, thus: Reviewed-by: Yuan Yao > + > + /* > + * MOVDIR64B uses WC protocol. Use memory barrier to > + * make sure any later user of these pages sees the > + * updated data. > + */ > + mb(); > +} > + > +static void tdmr_reset_pamt(struct tdmr_info *tdmr) > +{ > + tdmr_do_pamt_func(tdmr, reset_tdx_pages); > +} > + > +static void tdmrs_reset_pamt_all(struct tdmr_info_list *tdmr_list) > +{ > + int i; > + > + for (i = 0; i < tdmr_list->nr_consumed_tdmrs; i++) > + tdmr_reset_pamt(tdmr_entry(tdmr_list, i)); > +} > + > static unsigned long tdmrs_count_pamt_kb(struct tdmr_info_list *tdmr_list) > { > unsigned long pamt_size = 0; > @@ -898,6 +945,55 @@ static int config_tdx_module(struct tdmr_info_list *tdmr_list, u64 global_keyid) > return ret; > } > > +static int do_global_key_config(void *data) > +{ > + /* > + * TDH.SYS.KEY.CONFIG may fail with entropy error (which is a > + * recoverable error). Assume this is exceedingly rare and > + * just return error if encountered instead of retrying. > + * > + * All '0's are just unused parameters. > + */ > + return seamcall(TDH_SYS_KEY_CONFIG, 0, 0, 0, 0, NULL, NULL); > +} > + > +/* > + * Attempt to configure the global KeyID on all physical packages. > + * > + * This requires running code on at least one CPU in each package. If a > + * package has no online CPUs, that code will not run and TDX module > + * initialization (TDMR initialization) will fail. > + * > + * This code takes no affirmative steps to online CPUs. Callers (aka. > + * KVM) can ensure success by ensuring sufficient CPUs are online for > + * this to succeed. > + */ > +static int config_global_keyid(void) > +{ > + cpumask_var_t packages; > + int cpu, ret = -EINVAL; > + > + if (!zalloc_cpumask_var(&packages, GFP_KERNEL)) > + return -ENOMEM; > + > + for_each_online_cpu(cpu) { > + if (cpumask_test_and_set_cpu(topology_physical_package_id(cpu), > + packages)) > + continue; > + > + /* > + * TDH.SYS.KEY.CONFIG cannot run concurrently on > + * different cpus, so just do it one by one. > + */ > + ret = smp_call_on_cpu(cpu, do_global_key_config, NULL, true); > + if (ret) > + break; > + } > + > + free_cpumask_var(packages); > + return ret; > +} > + > static int init_tdx_module(void) > { > struct tdsysinfo_struct *sysinfo; > @@ -956,15 +1052,47 @@ static int init_tdx_module(void) > if (ret) > goto out_free_pamts; > > + /* > + * Hardware doesn't guarantee cache coherency across different > + * KeyIDs. The kernel needs to flush PAMT's dirty cachelines > + * (associated with KeyID 0) before the TDX module can use the > + * global KeyID to access the PAMT. Given PAMTs are potentially > + * large (~1/256th of system RAM), just use WBINVD on all cpus > + * to flush the cache. > + */ > + wbinvd_on_all_cpus(); > + > + /* Config the key of global KeyID on all packages */ > + ret = config_global_keyid(); > + if (ret) > + goto out_reset_pamts; > + > /* > * TODO: > * > - * - Configure the global KeyID on all packages. > * - Initialize all TDMRs. > * > * Return error before all steps are done. > */ > ret = -EINVAL; > +out_reset_pamts: > + if (ret) { > + /* > + * Part of PAMTs may already have been initialized by the > + * TDX module. Flush cache before returning PAMTs back > + * to the kernel. > + */ > + wbinvd_on_all_cpus(); > + /* > + * According to the TDX hardware spec, if the platform > + * doesn't have the "partial write machine check" > + * erratum, any kernel read/write will never cause #MC > + * in kernel space, thus it's OK to not convert PAMTs > + * back to normal. But do the conversion anyway here > + * as suggested by the TDX spec. > + */ > + tdmrs_reset_pamt_all(&tdmr_list); > + } > out_free_pamts: > if (ret) > tdmrs_free_pamt_all(&tdmr_list); > @@ -1019,6 +1147,9 @@ static int __tdx_enable(void) > * lock to prevent any new cpu from becoming online; 2) done both VMXON > * and tdx_cpu_enable() on all online cpus. > * > + * This function requires there's at least one online cpu for each CPU > + * package to succeed. > + * > * This function can be called in parallel by multiple callers. > * > * Return 0 if TDX is enabled successfully, otherwise error. > diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h > index c386aa3afe2a..a0438513bec0 100644 > --- a/arch/x86/virt/vmx/tdx/tdx.h > +++ b/arch/x86/virt/vmx/tdx/tdx.h > @@ -21,6 +21,7 @@ > /* > * TDX module SEAMCALL leaf functions > */ > +#define TDH_SYS_KEY_CONFIG 31 > #define TDH_SYS_INFO 32 > #define TDH_SYS_INIT 33 > #define TDH_SYS_LP_INIT 35 > -- > 2.40.1 >