From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 09E41C4167B for ; Mon, 27 Nov 2023 10:00:35 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9ADA96B028F; Mon, 27 Nov 2023 05:00:34 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 95C7A6B0290; Mon, 27 Nov 2023 05:00:34 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 825016B0291; Mon, 27 Nov 2023 05:00:34 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 6EC286B028F for ; Mon, 27 Nov 2023 05:00:34 -0500 (EST) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 3C4A11A0197 for ; Mon, 27 Nov 2023 10:00:34 +0000 (UTC) X-FDA: 81503289588.07.D56504D Received: from mail.alien8.de (mail.alien8.de [65.109.113.108]) by imf08.hostedemail.com (Postfix) with ESMTP id 6E43E160031 for ; Mon, 27 Nov 2023 10:00:30 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=alien8.de header.s=alien8 header.b=M6Rt7dYW; dmarc=pass (policy=none) header.from=alien8.de; spf=pass (imf08.hostedemail.com: domain of bp@alien8.de designates 65.109.113.108 as permitted sender) smtp.mailfrom=bp@alien8.de ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1701079231; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=TtlE36AUctgokR2LY9a4CvYQHcd567Ym1wq8sq+2ihM=; b=W03suYSkt5gywBdTkDOtWjlsEo7ECbawGvMdhzXTwydSK8EGJlPP0LgANBwQ9gj3i4kz+z vtXerdIkaCyWLAU0GuTItIOB3+F6qS/cIvJkWBw8gO7lLh+e00ThVzwP2e5+cD6L6Otyuk OxSOerGBdftgSlhh9xHxOTAZ0NpW184= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=alien8.de header.s=alien8 header.b=M6Rt7dYW; dmarc=pass (policy=none) header.from=alien8.de; spf=pass (imf08.hostedemail.com: domain of bp@alien8.de designates 65.109.113.108 as permitted sender) smtp.mailfrom=bp@alien8.de ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1701079231; a=rsa-sha256; cv=none; b=0p8i+0VEXjBgeE/oTQPucDl93kNdz3XujY+669HSlWH8bZZtPj+hz8cWHMvynpRTPm0s4C cCbC4FuGCkvfJx/8J02DlzIeHYNx36Zz6Ee/3D8cuWcoiynfS83mKaX9Wra+aKVN3pwj1B l+s69n5Gi2oSyAnxXLxkgkgh0MVwplw= Received: from localhost (localhost.localdomain [127.0.0.1]) by mail.alien8.de (SuperMail on ZX Spectrum 128k) with ESMTP id 7ECCC40E0257; Mon, 27 Nov 2023 10:00:27 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at mail.alien8.de Received: from mail.alien8.de ([127.0.0.1]) by localhost (mail.alien8.de [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id BU3zEmMa-BjB; Mon, 27 Nov 2023 10:00:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=alien8.de; s=alien8; t=1701079225; bh=TtlE36AUctgokR2LY9a4CvYQHcd567Ym1wq8sq+2ihM=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=M6Rt7dYWHGuYYMKKnLHdE+bsGHF4qtcXP+fynrgyym9GNI2DisXZC/Im1xT+/o7ky vPEWJVl69HWMWMtsPt4j7/Maw8VNJUX8xA6vTrjjh0hu6NjyPY74CNk2vOpPKAaRyr eKUaIRfaJ671zR1EZ7+EIHom+tYw3QpojOHy0Tvbnf9OWdz6YrRppAv4btNQ24JKAG ys+2POPXC2GIhcNbDyLmCPlW6bnhZhwmOOnOpKQpYBLmD56/gdjCoiQTQwJzW3rLW3 BRjzUP2icnTuEN5XYI8+sryVJjw7KRTSqb+xDKhQBEOoJGxr/8ICUdFYK44slWI85j N3EYYhf8P/f2EDo95DHJ6iqXaRp2fyo0hoWARYy3s4mWnma41/nz2SFsWL+A5jZUN9 kyIn1B88ZjyJH2l/W+Zt7XuJSDsWLa/k1N0P6zaD5b5BL/sH6lknLw63swIpXLtpXQ Sz+5D1rUWLIt7lNSiTx7krZmSV1RpOoP+4mguMQpVuHkXfcZEXI2TlxQFnrkIvz3nB rxu5dSBq25HD2LP5DOJdprHsyRqw3/r/B7X90yhqMyBRamWw7ZI4piTMVxxuOSsgPp Y5ZJgsTZuvZe4I2q3oZTtxYmyPqgHgvKlyVaDctG5MbOoTTQlisZcOge/khZj1o1Jp F4+/JjWMsZQKD/FINdCPRVsQ= Received: from zn.tnic (pd95304da.dip0.t-ipconnect.de [217.83.4.218]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature ECDSA (P-256) server-digest SHA256) (No client certificate requested) by mail.alien8.de (SuperMail on ZX Spectrum 128k) with ESMTPSA id 00F5940E014B; Mon, 27 Nov 2023 09:59:43 +0000 (UTC) Date: Mon, 27 Nov 2023 10:59:37 +0100 From: Borislav Petkov To: Michael Roth Cc: kvm@vger.kernel.org, linux-coco@lists.linux.dev, linux-mm@kvack.org, linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, jroedel@suse.de, thomas.lendacky@amd.com, hpa@zytor.com, ardb@kernel.org, pbonzini@redhat.com, seanjc@google.com, vkuznets@redhat.com, jmattson@google.com, luto@kernel.org, dave.hansen@linux.intel.com, slp@redhat.com, pgonda@google.com, peterz@infradead.org, srinivas.pandruvada@linux.intel.com, rientjes@google.com, dovmurik@linux.ibm.com, tobin@ibm.com, vbabka@suse.cz, kirill@shutemov.name, ak@linux.intel.com, tony.luck@intel.com, marcorr@google.com, sathyanarayanan.kuppuswamy@linux.intel.com, alpergun@google.com, jarkko@kernel.org, ashish.kalra@amd.com, nikunj.dadhania@amd.com, pankaj.gupta@amd.com, liam.merwick@oracle.com, zhi.a.wang@intel.com, Brijesh Singh , Jarkko Sakkinen Subject: Re: [PATCH v10 14/50] crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP Message-ID: <20231127095937.GLZWRoiaqGlJMX54Xb@fat_crate.local> References: <20231016132819.1002933-1-michael.roth@amd.com> <20231016132819.1002933-15-michael.roth@amd.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20231016132819.1002933-15-michael.roth@amd.com> X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 6E43E160031 X-Stat-Signature: s3tm8w3d9hg1ic4euugrm3ny7wri88yf X-Rspam-User: X-HE-Tag: 1701079230-913943 X-HE-Meta: U2FsdGVkX1/io/gFehroixjOE6n428E8rLI1X12WnJ1ObNloZb22RiWsuCiJilPtz5pszASOmeF/98eEEoRSHQpCMOwwsUjf4N7mLdaL7Vml1fbOq3Lf98di9jZHdCCKnuztwZttUXhmB+8G0usQCX7zsa8KkRiSOHp4ly9SfKNu95kxjcTd9eFmZWRaIX9zopfY++fgpRWzVETOsJPg9ic3EJS1zqR9HhzN0nF+o+ieG4uLVo/1ZkOnoa9SWW5wZbboCSDGeMxH+xJRri81mjvtQuoLxsRe4Xe1oKguADWZbu/OrWeGQM9p0S94pNnRa52XLN3W7/CFzMgf/i5SWAR6lFfkw8q0z2gkbmreKEhe38setX3abgY2pDXJclQ+jKFI1Bs9bE/jBgVtI26zIUBjgywVtElMF2Vuz4zllt7iVq+5HUyUF9pRRCphC8mqK0brHYUFOEs97sPGmyHwKDd2VbR0QFarDPEyS/3pU65nST24qMPhAodEIyZpTUBH+1Xj1S5j+Wc44PgMWdS+nPh4ZAiR095c9QDuAW+hbuXYDq61kGbaF2zcnJc5npEOBDXPy2QF+EP19uZ9GlymJvd+cSy6xgKOfcG3s0zFra6j5mfMAn2D13XqDqgVOxudQS8drkz8jV+SVCYoyEky5cxUiKzf4dFMdHfu66FobQGsuCHXJUBpdZeDn25wSq1XExVw4CiarYEH/gJtOB+mZ9Lb7cCRQmsopMj5IVc8eQDW9IGXohBvlAqcF9QReHdaH704C0re8/OGg3d5WCM3avBp5XfB5oiwjR+tn9DKX6n/YRDHBd6ISxzLk8UP7WyvBei7dKFKhZ0r8BAfAoVWizCw8AX8OCgRDe9Edrr+9tyBEqUS1kE3vY+ePt4nIMrxM69CA5bjcoxnIdwp8zntS6pnJfu/COq2zJtxY/N8lf7Pz3NyOSXDeV49sfFvp5Ce/WX/xTdj4KZhWv0fP7X vCKvxr8J haxT2iIkhH6RUXpfe35Cti+ubAP9Ogt6CtzNQ/RrnBK0nyHc4QUxYlG4vz+mllAc09oejbS6+6gsfThGSzIV6ZGh+VaY0wRqEwTf0q3pI0LK7anuibBph6e9OYBO3FfP8fH16yreD23jqaHsvZL4SUDf6T16qf6/HfdtHHAljBmlRm4SS/2mCflt9nQ7n8jtrB7SAPg9m4gM4C3A= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Oct 16, 2023 at 08:27:43AM -0500, Michael Roth wrote: > +/* > + * SEV_DATA_RANGE_LIST: > + * Array containing range of pages that firmware transitions to HV-fixed > + * page state. > + */ > +struct sev_data_range_list *snp_range_list; > +static int __sev_snp_init_locked(int *error); Put the function above the caller instead of doing a forward declaration. > static inline bool sev_version_greater_or_equal(u8 maj, u8 min) > { > struct sev_device *sev = psp_master->sev_data; > @@ -466,9 +479,9 @@ static inline int __sev_do_init_locked(int *psp_ret) > return __sev_init_locked(psp_ret); > } > > -static int __sev_platform_init_locked(int *error) > +static int ___sev_platform_init_locked(int *error, bool probe) > { > - int rc = 0, psp_ret = SEV_RET_NO_FW_CALL; > + int rc, psp_ret = SEV_RET_NO_FW_CALL; > struct psp_device *psp = psp_master; > struct sev_device *sev; > > @@ -480,6 +493,34 @@ static int __sev_platform_init_locked(int *error) > if (sev->state == SEV_STATE_INIT) > return 0; > > + /* > + * Legacy guests cannot be running while SNP_INIT(_EX) is executing, > + * so perform SEV-SNP initialization at probe time. > + */ > + rc = __sev_snp_init_locked(error); > + if (rc && rc != -ENODEV) { > + /* > + * Don't abort the probe if SNP INIT failed, > + * continue to initialize the legacy SEV firmware. > + */ > + dev_err(sev->dev, "SEV-SNP: failed to INIT rc %d, error %#x\n", rc, *error); > + } > + > + /* Delay SEV/SEV-ES support initialization */ > + if (probe && !psp_init_on_probe) > + return 0; > + > + if (!sev_es_tmr) { > + /* Obtain the TMR memory area for SEV-ES use */ > + sev_es_tmr = sev_fw_alloc(SEV_ES_TMR_SIZE); > + if (sev_es_tmr) > + /* Must flush the cache before giving it to the firmware */ > + clflush_cache_range(sev_es_tmr, SEV_ES_TMR_SIZE); > + else > + dev_warn(sev->dev, > + "SEV: TMR allocation failed, SEV-ES support unavailable\n"); > + } > + > if (sev_init_ex_buffer) { > rc = sev_read_init_ex_file(); > if (rc) > @@ -522,6 +563,11 @@ static int __sev_platform_init_locked(int *error) > return 0; > } > > +static int __sev_platform_init_locked(int *error) > +{ > + return ___sev_platform_init_locked(error, false); > +} Uff, this is silly. And it makes the code hard to follow and that meat of the platform init functionality in the ___-prefixed function a mess. And the problem is that that "probe" functionality is replicated from the one place where it is actually needed - sev_pci_init() which calls that new sev_platform_init_on_probe() function - to everything that calls __sev_platform_init_locked() for which you've added a wrapper. What you should do, instead, is split the code around __sev_snp_init_locked() in a separate function which does only that and is called something like __sev_platform_init_snp_locked() or so which does that unconditional work. And then you define: _sev_platform_init_locked(int *error, bool probe) note the *one* '_' - i.e., first layer: _sev_platform_init_locked(int *error, bool probe): { __sev_platform_init_snp_locked(error); if (!probe) return 0; if (psp_init_on_probe) __sev_platform_init_locked(error); ... } and you do the probing in that function only so that it doesn't get lost in the bunch of things __sev_platform_init_locked() does. And then you call _sev_platform_init_locked() everywhere and no need for a second sev_platform_init_on_probe(). > + > int sev_platform_init(int *error) > { > int rc; > @@ -534,6 +580,17 @@ int sev_platform_init(int *error) > } > EXPORT_SYMBOL_GPL(sev_platform_init); > > +static int sev_platform_init_on_probe(int *error) > +{ > + int rc; > + > + mutex_lock(&sev_cmd_mutex); > + rc = ___sev_platform_init_locked(error, true); > + mutex_unlock(&sev_cmd_mutex); > + > + return rc; > +} > + > static int __sev_platform_shutdown_locked(int *error) > { > struct sev_device *sev = psp_master->sev_data; > @@ -838,6 +895,191 @@ static int sev_update_firmware(struct device *dev) > return ret; > } > > +static void snp_set_hsave_pa(void *arg) > +{ > + wrmsrl(MSR_VM_HSAVE_PA, 0); > +} > + > +static int snp_filter_reserved_mem_regions(struct resource *rs, void *arg) > +{ > + struct sev_data_range_list *range_list = arg; > + struct sev_data_range *range = &range_list->ranges[range_list->num_elements]; > + size_t size; > + > + if ((range_list->num_elements * sizeof(struct sev_data_range) + > + sizeof(struct sev_data_range_list)) > PAGE_SIZE) > + return -E2BIG; Why? A comment would be helpful like with the rest this patch adds. > + switch (rs->desc) { > + case E820_TYPE_RESERVED: > + case E820_TYPE_PMEM: > + case E820_TYPE_ACPI: > + range->base = rs->start & PAGE_MASK; > + size = (rs->end + 1) - rs->start; > + range->page_count = size >> PAGE_SHIFT; > + range_list->num_elements++; > + break; > + default: > + break; > + } > + > + return 0; > +} > + > +static int __sev_snp_init_locked(int *error) > +{ > + struct psp_device *psp = psp_master; > + struct sev_data_snp_init_ex data; > + struct sev_device *sev; > + int rc = 0; > + > + if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP)) > + return -ENODEV; > + > + if (!psp || !psp->sev_data) > + return -ENODEV; Only caller checks this already. > + sev = psp->sev_data; > + > + if (sev->snp_initialized) Do we really need this silly boolean or is there a way to query the platform whether SNP has been initialized? > + return 0; > + > + if (!sev_version_greater_or_equal(SNP_MIN_API_MAJOR, SNP_MIN_API_MINOR)) { > + dev_dbg(sev->dev, "SEV-SNP support requires firmware version >= %d:%d\n", > + SNP_MIN_API_MAJOR, SNP_MIN_API_MINOR); > + return 0; > + } > + > + /* > + * The SNP_INIT requires the MSR_VM_HSAVE_PA must be set to 0h > + * across all cores. > + */ > + on_each_cpu(snp_set_hsave_pa, NULL, 1); > + > + /* > + * Starting in SNP firmware v1.52, the SNP_INIT_EX command takes a list of > + * system physical address ranges to convert into the HV-fixed page states > + * during the RMP initialization. For instance, the memory that UEFI > + * reserves should be included in the range list. This allows system > + * components that occasionally write to memory (e.g. logging to UEFI > + * reserved regions) to not fail due to RMP initialization and SNP enablement. > + */ > + if (sev_version_greater_or_equal(SNP_MIN_API_MAJOR, 52)) { Is there a generic way to probe SNP_INIT_EX presence in the firmware or are FW version numbers the only way? > + /* > + * Firmware checks that the pages containing the ranges enumerated > + * in the RANGES structure are either in the Default page state or in the "default" > + * firmware page state. > + */ > + snp_range_list = kzalloc(PAGE_SIZE, GFP_KERNEL); > + if (!snp_range_list) { > + dev_err(sev->dev, > + "SEV: SNP_INIT_EX range list memory allocation failed\n"); > + return -ENOMEM; > + } > + > + /* > + * Retrieve all reserved memory regions setup by UEFI from the e820 memory map > + * to be setup as HV-fixed pages. > + */ > + ^ Superfluous newline. > + rc = walk_iomem_res_desc(IORES_DESC_NONE, IORESOURCE_MEM, 0, ~0, > + snp_range_list, snp_filter_reserved_mem_regions); > + if (rc) { > + dev_err(sev->dev, > + "SEV: SNP_INIT_EX walk_iomem_res_desc failed rc = %d\n", rc); > + return rc; > + } > + > + memset(&data, 0, sizeof(data)); > + data.init_rmp = 1; > + data.list_paddr_en = 1; > + data.list_paddr = __psp_pa(snp_range_list); > + > + /* > + * Before invoking SNP_INIT_EX with INIT_RMP=1, make sure that > + * all dirty cache lines containing the RMP are flushed. > + * > + * NOTE: that includes writes via RMPUPDATE instructions, which > + * are also cacheable writes. > + */ > + wbinvd_on_all_cpus(); > + > + rc = __sev_do_cmd_locked(SEV_CMD_SNP_INIT_EX, &data, error); > + if (rc) > + return rc; > + } else { > + /* > + * SNP_INIT is equivalent to SNP_INIT_EX with INIT_RMP=1, so > + * just as with that case, make sure all dirty cache lines > + * containing the RMP are flushed. > + */ > + wbinvd_on_all_cpus(); > + > + rc = __sev_do_cmd_locked(SEV_CMD_SNP_INIT, NULL, error); > + if (rc) > + return rc; > + } So instead of duplicating the code here at the end of the if-else branching, you can do: void *arg = &data; if () { ... cmd = SEV_CMD_SNP_INIT_EX; } else { cmd = SEV_CMD_SNP_INIT; arg = NULL; } wbinvd_on_all_cpus(); rc = __sev_do_cmd_locked(cmd, arg, error); if (rc) return rc; > + /* Prepare for first SNP guest launch after INIT */ > + wbinvd_on_all_cpus(); Why is that WBINVD needed? > + rc = __sev_do_cmd_locked(SEV_CMD_SNP_DF_FLUSH, NULL, error); > + if (rc) > + return rc; > + > + sev->snp_initialized = true; > + dev_dbg(sev->dev, "SEV-SNP firmware initialized\n"); > + > + return rc; > +} > + > +static int __sev_snp_shutdown_locked(int *error) > +{ > + struct sev_device *sev = psp_master->sev_data; > + struct sev_data_snp_shutdown_ex data; > + int ret; > + > + if (!sev->snp_initialized) > + return 0; > + > + memset(&data, 0, sizeof(data)); > + data.length = sizeof(data); > + data.iommu_snp_shutdown = 1; > + > + wbinvd_on_all_cpus(); > + > +retry: > + ret = __sev_do_cmd_locked(SEV_CMD_SNP_SHUTDOWN_EX, &data, error); > + /* SHUTDOWN may require DF_FLUSH */ > + if (*error == SEV_RET_DFFLUSH_REQUIRED) { > + ret = __sev_do_cmd_locked(SEV_CMD_SNP_DF_FLUSH, NULL, NULL); > + if (ret) { > + dev_err(sev->dev, "SEV-SNP DF_FLUSH failed\n"); > + return ret; When you return here, sev->snp_initialized is still true but, in reality, it probably is in some half-broken state after issuing those commands you it is not really initialized anymore. > + } > + goto retry; This needs an upper limit from which to break out and not potentially endless-loop. > + } > + if (ret) { > + dev_err(sev->dev, "SEV-SNP firmware shutdown failed\n"); > + return ret; > + } > + > + sev->snp_initialized = false; > + dev_dbg(sev->dev, "SEV-SNP firmware shutdown\n"); > + > + return ret; > +} > + > +static int sev_snp_shutdown(int *error) > +{ > + int rc; > + > + mutex_lock(&sev_cmd_mutex); > + rc = __sev_snp_shutdown_locked(error); Why is this "locked" version even there if it is called only here? IOW, put all the logic in here - no need for __sev_snp_shutdown_locked(). > + mutex_unlock(&sev_cmd_mutex); > + > + return rc; > +} ... -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette