Date: Wed, 8 Mar 2023 14:27:38 -0800
From: Isaku Yamahata
To: Kai Huang
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 dave.hansen@intel.com, peterz@infradead.org, tglx@linutronix.de,
 seanjc@google.com, pbonzini@redhat.com, dan.j.williams@intel.com,
 rafael.j.wysocki@intel.com, kirill.shutemov@linux.intel.com,
 ying.huang@intel.com, reinette.chatre@intel.com, len.brown@intel.com,
 tony.luck@intel.com, ak@linux.intel.com, isaku.yamahata@intel.com,
 chao.gao@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com,
 david@redhat.com, bagasdotme@gmail.com, sagis@google.com,
 imammedo@redhat.com, isaku.yamahata@gmail.com
Subject: Re: [PATCH v10 05/16] x86/virt/tdx: Add skeleton to enable TDX on demand
Message-ID: <20230308222738.GA3419702@ls.amr.corp.intel.com>

On Tue, Mar 07, 2023 at 03:13:50AM +1300, Kai Huang wrote:
> To enable TDX the kernel needs to initialize TDX from two perspectives:
> 1) Do a set of SEAMCALLs to initialize the TDX module to make it ready
> to create and run TDX guests; 2) Do the per-cpu initialization SEAMCALL
> on one logical cpu before the kernel wants to make any other SEAMCALLs
> on that cpu (including those involved during module initialization and
> running TDX guests).
>
> The TDX module can be initialized only once in its lifetime. Instead
> of always initializing it at boot time, this implementation chooses an
> "on demand" approach to initialize TDX until there is a real need (e.g
> when requested by KVM). This approach has below pros:
>
> 1) It avoids consuming the memory that must be allocated by kernel and
> given to the TDX module as metadata (~1/256th of the TDX-usable memory),
> and also saves the CPU cycles of initializing the TDX module (and the
> metadata) when TDX is not used at all.
>
> 2) The TDX module design allows it to be updated while the system is
> running. The update procedure shares quite a few steps with this "on
> demand" initialization mechanism. The hope is that much of "on demand"
> mechanism can be shared with a future "update" mechanism. A boot-time
> TDX module implementation would not be able to share much code with the
> update mechanism.
>
> 3) Making SEAMCALL requires VMX to be enabled. Currently, only the KVM
> code mucks with VMX enabling. If the TDX module were to be initialized
> separately from KVM (like at boot), the boot code would need to be
> taught how to muck with VMX enabling and KVM would need to be taught how
> to cope with that. Making KVM itself responsible for TDX initialization
> lets the rest of the kernel stay blissfully unaware of VMX.
>
> Similar to module initialization, also make the per-cpu initialization
> "on demand" as it also depends on VMX to be enabled.
>
> Add two functions, tdx_enable() and tdx_cpu_enable(), to enable the TDX
> module and enable TDX on local cpu respectively. For now tdx_enable()
> is a placeholder. The TODO list will be pared down as functionality is
> added.
>
> In tdx_enable() use a state machine protected by mutex to make sure the
> initialization will only be done once, as tdx_enable() can be called
> multiple times (i.e. KVM module can be reloaded) and may be called
> concurrently by other kernel components in the future.
>
> The per-cpu initialization on each cpu can only be done once during the
> module's life time. Use a per-cpu variable to track its status to make
> sure it is only done once in tdx_cpu_enable().
>
> Also, a SEAMCALL to do TDX module global initialization must be done
> once on any logical cpu before any per-cpu initialization SEAMCALL. Do
> it inside tdx_cpu_enable() too (if hasn't been done).
>
> tdx_enable() can potentially invoke SEAMCALLs on any online cpus. The
> per-cpu initialization must be done before those SEAMCALLs are invoked
> on some cpu. To keep things simple, in tdx_cpu_enable(), always do the
> per-cpu initialization regardless of whether the TDX module has been
> initialized or not. And in tdx_enable(), don't call tdx_cpu_enable()
> but assume the caller has disabled CPU hotplug and done VMXON and
> tdx_cpu_enable() on all online cpus before calling tdx_enable().
>
> Signed-off-by: Kai Huang
> ---
>
> v9 -> v10:
> - Merged the patch to handle per-cpu initialization to this patch to
>   tell the story better.
> - Changed how to handle the per-cpu initialization to only provide a
>   tdx_cpu_enable() function to let the user of TDX to do it when the
>   user wants to run TDX code on a certain cpu.
> - Changed tdx_enable() to not call cpus_read_lock() explicitly, but
>   call lockdep_assert_cpus_held() to assume the caller has done that.
> - Improved comments around tdx_enable() and tdx_cpu_enable().
> - Improved changelog to tell the story better accordingly.
>
> v8 -> v9:
> - Removed detailed TODO list in the changelog (Dave).
> - Added back steps to do module global initialization and per-cpu
>   initialization in the TODO list comment.
> - Moved the 'enum tdx_module_status_t' from tdx.c to local tdx.h
>
> v7 -> v8:
> - Refined changelog (Dave).
> - Removed "all BIOS-enabled cpus" related code (Peter/Thomas/Dave).
> - Add a "TODO list" comment in init_tdx_module() to list all steps of
>   initializing the TDX Module to tell the story (Dave).
> - Made tdx_enable() unverisally return -EINVAL, and removed nonsense
>   comments (Dave).
> - Simplified __tdx_enable() to only handle success or failure.
> - TDX_MODULE_SHUTDOWN -> TDX_MODULE_ERROR
> - Removed TDX_MODULE_NONE (not loaded) as it is not necessary.
> - Improved comments (Dave).
> - Pointed out 'tdx_module_status' is software thing (Dave).
>
> v6 -> v7:
> - No change.
>
> v5 -> v6:
> - Added code to set status to TDX_MODULE_NONE if TDX module is not
>   loaded (Chao)
> - Added Chao's Reviewed-by.
> - Improved comments around cpus_read_lock().
>
> - v3->v5 (no feedback on v4):
> - Removed the check that SEAMRR and TDX KeyID have been detected on
>   all present cpus.
> - Removed tdx_detect().
> - Added num_online_cpus() to MADT-enabled CPUs check within the CPU
>   hotplug lock and return early with error message.
> - Improved dmesg printing for TDX module detection and initialization.
>
> ---
>  arch/x86/include/asm/tdx.h  |   4 +
>  arch/x86/virt/vmx/tdx/tdx.c | 182 ++++++++++++++++++++++++++++++++++++
>  arch/x86/virt/vmx/tdx/tdx.h |  25 +++++
>  3 files changed, 211 insertions(+)
>
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index b489b5b9de5d..112a5b9bd5cd 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -102,8 +102,12 @@ static inline long tdx_kvm_hypercall(unsigned int nr, unsigned long p1,
>
>  #ifdef CONFIG_INTEL_TDX_HOST
>  bool platform_tdx_enabled(void);
> +int tdx_cpu_enable(void);
> +int tdx_enable(void);
>  #else /* !CONFIG_INTEL_TDX_HOST */
>  static inline bool platform_tdx_enabled(void) { return false; }
> +static inline int tdx_cpu_enable(void) { return -EINVAL; }
> +static inline int tdx_enable(void) { return -EINVAL; }
>  #endif /* CONFIG_INTEL_TDX_HOST */
>
>  #endif /* !__ASSEMBLY__ */
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index b65b838f3b5d..29127cb70f51 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -13,6 +13,10 @@
>  #include
>  #include
>  #include
> +#include
> +#include
> +#include
> +#include
>  #include
>  #include
>  #include
> @@ -22,6 +26,18 @@ static u32 tdx_global_keyid __ro_after_init;
>  static u32 tdx_guest_keyid_start __ro_after_init;
>  static u32 tdx_nr_guest_keyids __ro_after_init;
>
> +static unsigned int tdx_global_init_status;
> +static DEFINE_SPINLOCK(tdx_global_init_lock);
> +#define TDX_GLOBAL_INIT_DONE _BITUL(0)
> +#define TDX_GLOBAL_INIT_FAILED _BITUL(1)
> +
> +static DEFINE_PER_CPU(unsigned int, tdx_lp_init_status);
> +#define TDX_LP_INIT_DONE _BITUL(0)
> +#define TDX_LP_INIT_FAILED _BITUL(1)
> +
> +static enum tdx_module_status_t tdx_module_status;
> +static DEFINE_MUTEX(tdx_module_lock);
> +
>  /*
>   * Use tdx_global_keyid to indicate that TDX is uninitialized.
>   * This is used in TDX initialization error paths to take it from
> @@ -159,3 +175,169 @@ static int __always_unused seamcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
>  	put_cpu();
>  	return ret;
>  }
> +
> +static int try_init_module_global(void)
> +{
> +	int ret;
> +
> +	/*
> +	 * The TDX module global initialization only needs to be done
> +	 * once on any cpu.
> +	 */
> +	spin_lock(&tdx_global_init_lock);
> +
> +	if (tdx_global_init_status & TDX_GLOBAL_INIT_DONE) {
> +		ret = tdx_global_init_status & TDX_GLOBAL_INIT_FAILED ?
> +			-EINVAL : 0;
> +		goto out;
> +	}
> +
> +	/* All '0's are just unused parameters. */
> +	ret = seamcall(TDH_SYS_INIT, 0, 0, 0, 0, NULL, NULL);
> +
> +	tdx_global_init_status = TDX_GLOBAL_INIT_DONE;
> +	if (ret)
> +		tdx_global_init_status |= TDX_GLOBAL_INIT_FAILED;

If entropy is lacking (RDRAND failure), TDH_SYS_INIT can return
TDX_SYS_BUSY. In that case the failure is transient, so we should either
let the caller retry or retry inside this function, instead of recording
a sticky TDX_GLOBAL_INIT_FAILED that makes every later call return
-EINVAL.

Other than that,
Reviewed-by: Isaku Yamahata

Thanks,
-- 
Isaku Yamahata
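
A rough sketch of the retry idea in the review comment above, written as
standalone userspace C so it compiles outside the kernel. Everything
here is an assumption made for illustration: fake_sys_init() stands in
for seamcall(TDH_SYS_INIT, ...), the SKETCH_* status values are
placeholders rather than the TDX module's real encoding, MAX_RETRIES is
an arbitrary bound, and the spinlock from the patch is left out. The one
point it tries to show is that a "busy" result is returned as -EAGAIN
without setting the DONE/FAILED bits, so a later call can still succeed.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder status codes for the sketch; NOT the TDX module's real encoding. */
#define SKETCH_SUCCESS	0ULL
#define SKETCH_SYS_BUSY	1ULL	/* stands in for TDX_SYS_BUSY */

#define INIT_DONE	(1U << 0)
#define INIT_FAILED	(1U << 1)
#define MAX_RETRIES	3	/* arbitrary bound chosen for this sketch */

static unsigned int init_status;
static int busy_left;	/* how many more times the stub reports "busy" */
static int calls;	/* how many times the stub has been invoked */

/* Hypothetical stand-in for seamcall(TDH_SYS_INIT, ...). */
static unsigned long long fake_sys_init(void)
{
	calls++;
	if (busy_left > 0) {
		busy_left--;
		return SKETCH_SYS_BUSY;
	}
	return SKETCH_SUCCESS;
}

/* Reset the stub and the recorded status (helper for experiments). */
static void reset_stub(int busy)
{
	assert(busy >= 0);
	busy_left = busy;
	calls = 0;
	init_status = 0;
}

/*
 * Returns 0 on success, -EINVAL on a hard (sticky) failure, and -EAGAIN
 * if the stub is still busy after MAX_RETRIES attempts. Locking is
 * omitted; the patch holds tdx_global_init_lock around this logic.
 */
static int try_init_global(void)
{
	unsigned long long err = SKETCH_SYS_BUSY;
	int i;

	if (init_status & INIT_DONE)
		return (init_status & INIT_FAILED) ? -EINVAL : 0;

	/* Retry only while the result is the transient "busy" status. */
	for (i = 0; i < MAX_RETRIES && err == SKETCH_SYS_BUSY; i++)
		err = fake_sys_init();

	/* Still busy: leave init_status untouched so the caller may retry. */
	if (err == SKETCH_SYS_BUSY)
		return -EAGAIN;

	init_status = INIT_DONE;
	if (err != SKETCH_SUCCESS) {
		init_status |= INIT_FAILED;
		return -EINVAL;
	}
	return 0;
}
```

With reset_stub(5), the first try_init_global() call uses up three
attempts and returns -EAGAIN with init_status still 0; a second call
then succeeds on its third attempt. Whether a bounded in-function retry
or returning the transient error to the caller fits the real code better
is a design choice for the patch author.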