From: "Huang, Ying"
To: "Huang, Kai"
Cc: "kvm@vger.kernel.org", "Hansen, Dave", "Luck, Tony",
 "bagasdotme@gmail.com", "ak@linux.intel.com", "Wysocki, Rafael J",
 "linux-kernel@vger.kernel.org", "Christopherson,, Sean",
 "Chatre, Reinette", "pbonzini@redhat.com", "linux-mm@kvack.org",
 "Yamahata, Isaku", "kirill.shutemov@linux.intel.com", "Shahar, Sagi",
 "imammedo@redhat.com", "peterz@infradead.org", "Gao, Chao", "Brown, Len",
 "sathyanarayanan.kuppuswamy@linux.intel.com", "Williams, Dan J"
Subject: Re: [PATCH v7 10/20] x86/virt/tdx: Use all system memory when
 initializing TDX module as TDX memory
References: <9b545148275b14a8c7edef1157f8ec44dc8116ee.1668988357.git.kai.huang@intel.com>
 <87cz9gvpej.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87sfibpxda.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <973ca04b3323d28a31dbc1cfeb52bd10bd9d9bf3.camel@intel.com>
Date: Thu, 24 Nov 2022 08:47:07 +0800
In-Reply-To: <973ca04b3323d28a31dbc1cfeb52bd10bd9d9bf3.camel@intel.com>
 (Kai Huang's message of "Tue, 22 Nov 2022 17:16:11 +0800")
Message-ID: <87o7sxp4ac.fsf@yhuang6-desk2.ccr.corp.intel.com>

"Huang, Kai" writes:

>> > > > +/*
>> > > > + * Add all memblock memory regions to the @tdx_memlist as TDX memory.
>> > > > + * Must be called when get_online_mems() is called by the caller.
>> > > > + */
>> > > > +static int build_tdx_memory(void)
>> > > > +{
>> > > > +	unsigned long start_pfn, end_pfn;
>> > > > +	int i, nid, ret;
>> > > > +
>> > > > +	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
>> > > > +		/*
>> > > > +		 * The first 1MB may not be reported as TDX convertible
>> > > > +		 * memory.  Manually exclude them as TDX memory.
>> > > > +		 *
>> > > > +		 * This is fine as the first 1MB is already reserved in
>> > > > +		 * reserve_real_mode() and won't end up to ZONE_DMA as
>> > > > +		 * free page anyway.
>> > > > +		 */
>> > > > +		start_pfn = max(start_pfn, (unsigned long)SZ_1M >> PAGE_SHIFT);
>> > > > +		if (start_pfn >= end_pfn)
>> > > > +			continue;
>> > >
>> > > How about check whether first 1MB is reserved instead of depending on
>> > > the corresponding code isn't changed?  Via for_each_reserved_mem_range()?
>> >
>> > IIUC, some reserved memory can be freed to page allocator directly, i.e. kernel
>> > init code/data.  I feel it's not safe to just treat reserved memory will never
>> > be in page allocator.  Otherwise we have for_each_free_mem_range() can use.
>>
>> Yes.  memblock reserved information isn't perfect.  But I still think
>> that it is still better than just assumption to check whether the first
>> 1MB is reserved in memblock.  Or, we can check whether the pages of the
>> first 1MB is reserved via checking struct page directly?
>>
>
> Sorry I am a little bit confused what you want to achieve here.  Do you want to
> make some sanity check to make sure the first 1MB is indeed not in the page
> allocator?
>
> IIUC, it is indeed true.  Please see the comment of calling reserve_real_mode()
> in setup_arch().  Also please see efi_free_boot_services(), which doesn't free
> the boot service if it is below 1MB.
>
> Also, my understanding is kernel's intention is to always reserve the first 1MB:
>
> 	/*
> 	 * Don't free memory under 1M for two reasons:
> 	 * - BIOS might clobber it
> 	 * - Crash kernel needs it to be reserved
> 	 */
>
> So if any page in first 1MB ended up to the page allocator, it should be the
> kernel bug which is not related to TDX, correct?

I suggest adding some code to verify this.  The code that reserves the
first 1MB may be changed in the future (although the possibility is low),
and the TDX code may not be updated at the same time.  The verification
code here would then catch that, so that we can make changes accordingly.

Best Regards,
Huang, Ying
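
To illustrate the kind of verification suggested above, here is a minimal
user-space sketch, not kernel code.  It assumes the reserved regions are
available as a list sorted by start address, which is roughly what
walking them with for_each_reserved_mem_range() would yield.  The names
struct resv_range and low_mem_is_reserved() are hypothetical and do not
exist in the kernel or in the patch.  As noted earlier in the thread,
memblock reserved information is not perfect, so a check like this could
only serve as a sanity check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_1M 0x100000ULL

/* Hypothetical stand-in for one reserved region: [start, end). */
struct resv_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return true if [0, limit) is fully covered by the reserved ranges.
 * The ranges must be sorted by start address; they may touch or overlap.
 */
bool low_mem_is_reserved(const struct resv_range *r, size_t n,
			 uint64_t limit)
{
	uint64_t covered = 0;	/* [0, covered) is known to be reserved */
	size_t i;

	for (i = 0; i < n && covered < limit; i++) {
		assert(r[i].start <= r[i].end);
		if (r[i].start > covered)
			return false;	/* hole at [covered, r[i].start) */
		if (r[i].end > covered)
			covered = r[i].end;
	}
	return covered >= limit;
}
```

A caller would pass SZ_1M as the limit and could warn, or refuse to treat
the low range as excluded, when the function returns false.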