Message-ID: <944ffd4b-3090-e068-a649-b9a84add8395@intel.com>
Date: Tue, 10 Jan 2023 07:27:29 -0800
Subject: Re: [PATCH v8 15/16] x86/virt/tdx: Flush cache in kexec() when TDX is enabled
To: "Huang, Kai", "kvm@vger.kernel.org", "linux-kernel@vger.kernel.org"
Cc: "Luck, Tony", "bagasdotme@gmail.com", "ak@linux.intel.com", "Wysocki, Rafael J", "kirill.shutemov@linux.intel.com", "Christopherson,, Sean", "Chatre, Reinette", "pbonzini@redhat.com", "tglx@linutronix.de", "linux-mm@kvack.org", "Yamahata, Isaku", "peterz@infradead.org", "Shahar, Sagi", "imammedo@redhat.com", "Gao, Chao", "Brown, Len", "sathyanarayanan.kuppuswamy@linux.intel.com", "Huang, Ying", "Williams, Dan J"
References: <6f959f494f0fb3dedfa963c3d6a0ce7f395b745d.camel@intel.com>
From: Dave Hansen
In-Reply-To: <6f959f494f0fb3dedfa963c3d6a0ce7f395b745d.camel@intel.com>

On 1/10/23 03:29, Huang, Kai wrote:
> On Fri, 2023-01-06 at 16:35 -0800, Dave Hansen wrote:
>> On 12/8/22 22:52, Kai Huang wrote:
...
>>> However, this implementation doesn't convert TDX private pages back to
>>> normal in kexec() because of below considerations:
>>>
>>> 1) Neither the kernel nor the TDX module has existing infrastructure to
>>>    track which pages are TDX private pages.
>>> 2) The number of TDX private pages can be large, and converting all of
>>>    them (cache flush + using MOVDIR64B to clear the page) in kexec() can
>>>    be time consuming.
>>> 3) The new kernel will almost only use KeyID 0 to access memory. KeyID
>>>    0 doesn't support integrity-check, so it's OK.
>>> 4) The kernel doesn't (and may never) support MKTME. If any 3rd party
>>>    kernel ever supports MKTME, it can/should do MOVDIR64B to clear the
>>>    page with the new MKTME KeyID (just like TDX does) before using it.
>>
>> Yeah, why are we getting all worked up about MKTME when there is no
>> support?
>
> I am not sure whether we need to consider 3rd party kernel case?

No, we don't.

>> The only thing that matters here is dirty cacheline writeback. There
>> are two things the kernel needs to do to mitigate that:
>>
>>  1. Stop accessing TDX private memory mappings
>>   1a. Stop making TDX module calls (uses global private KeyID)
>>   1b. Stop TDX guests from running (uses per-guest KeyID)
>>  2. Flush any cachelines from previous private KeyID writes
>>
>> There are a couple of ways we can do #2. We do *NOT* need to convert
>> *ANYTHING* back to KeyID 0. Page conversion doesn't even come into play
>> in any way as far as I can tell.
>
> May I ask why? When I was writing this patch I was not sure whether kexec()
> should give the new kernel a clean slate. SGX driver doesn't EREMOVE all EPC
> during kexec() but depends on the new kernel to do that too, but I don't know
> what's the general guide of supporting kexec().

Think about it this way: kexec() is modifying persistent (across kexec)
state to get the system ready for the new kernel. The caches are
persistent state. Devices have persistent state. Memory state persists
across kexec(). The memory integrity metadata persists.

What persistent state does a conversion to KeyID-0 affect? It resets
the integrity metadata and the memory contents.

Kexec leaves memory contents in place and doesn't zero them, so memory
contents don't matter. The integrity metadata also doesn't matter
because the memory will be used as KeyID-0 and that KeyID doesn't read
the integrity metadata.

What practical purpose does a conversion back to KeyID-0 serve? What
persistent state does it affect that matters?

>> I think you're also saying that since all CPUs go through this path and
>> there is no TDX activity between the WBINVD and the native_halt() that
>> 1a and 1b basically happen for "free" without needing to do them
>> explicitly.
>
> Yes. Should we mention this part in changelog?

That would be nice.