From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 23 Jun 2022 10:19:15 -0700
From: Dave Hansen
Subject: Re: [PATCHv7 10/14] x86/mm: Avoid load_unaligned_zeropad() stepping
 into unaccepted memory
To: "Kirill A. Shutemov", Borislav Petkov, Andy Lutomirski,
 Sean Christopherson, Andrew Morton, Joerg Roedel, Ard Biesheuvel
Shutemov" , Borislav Petkov , Andy Lutomirski , Sean Christopherson , Andrew Morton , Joerg Roedel , Ard Biesheuvel Cc: Andi Kleen , Kuppuswamy Sathyanarayanan , David Rientjes , Vlastimil Babka , Tom Lendacky , Thomas Gleixner , Peter Zijlstra , Paolo Bonzini , Ingo Molnar , Varad Gautam , Dario Faggioli , Mike Rapoport , David Hildenbrand , marcelo.cerri@canonical.com, tim.gardner@canonical.com, khalid.elmously@canonical.com, philip.cox@canonical.com, x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org References: <20220614120231.48165-1-kirill.shutemov@linux.intel.com> <20220614120231.48165-11-kirill.shutemov@linux.intel.com> From: Dave Hansen In-Reply-To: <20220614120231.48165-11-kirill.shutemov@linux.intel.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1656004786; a=rsa-sha256; cv=none; b=ydlCl69sf0Qo/fh26fD6PaXtInkcLnG6qctACm0UMVE61ADo1fQHJQRNhcOcN3a+hsVszP l4IMHY1QUP+FIm3fb6KPFAtIvZQmJZTrS6MZXXqycfDaRIJSPAF52lgbGesfsoNZnUCAqB P4ixSSfMSGmA49jjzH3Ge3wsB+gzhms= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1656004786; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=mzzNy0adUcf8fr35H3z/fwarh0PFobY9AM8JpI1jWj4=; b=JW6YPHxbPVsVX9/NUKAuZEzUOdnAwNUcVn4Ql/Dxn3yeLP4Yl3es4caQxMYG7AUmc6ffCg ISOViu/MuVUcpTSdN2T7iWzQBtJK2Q33Ny/kXHuwB1VJIBTFFNowsVxihj3pAm9XNsWxDv J5Bcm9xespMmdXvMIjRQticeoS1LoEQ= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=dYi3ESlk; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf29.hostedemail.com: domain of dave.hansen@intel.com has no SPF policy when checking 134.134.136.65) smtp.mailfrom=dave.hansen@intel.com X-Stat-Signature: ts5hqpaa5ribfm8ypgo3ser1fexx4hwh X-Rspamd-Queue-Id: D376712001F Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=dYi3ESlk; dmarc=pass (policy=none) header.from=intel.com; spf=none (imf29.hostedemail.com: domain of dave.hansen@intel.com has no SPF policy when checking 134.134.136.65) smtp.mailfrom=dave.hansen@intel.com X-Rspam-User: X-Rspamd-Server: rspam04 X-HE-Tag: 1656004785-584617 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 6/14/22 05:02, Kirill A. Shutemov wrote: > load_unaligned_zeropad() can lead to unwanted loads across page boundaries. > The unwanted loads are typically harmless. But, they might be made to > totally unrelated or even unmapped memory. load_unaligned_zeropad() > relies on exception fixup (#PF, #GP and now #VE) to recover from these > unwanted loads. > > But, this approach does not work for unaccepted memory. For TDX, a load > from unaccepted memory will not lead to a recoverable exception within > the guest. The guest will exit to the VMM where the only recourse is to > terminate the guest. > > There are three parts to fix this issue and comprehensively avoid access > to unaccepted memory. Together these ensure that an extra “guard” page > is accepted in addition to the memory that needs to be used. > > 1. 
> diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c
> index 1df918b21469..bcd56fe82b9e 100644
> --- a/arch/x86/mm/unaccepted_memory.c
> +++ b/arch/x86/mm/unaccepted_memory.c
> @@ -23,6 +23,38 @@ void accept_memory(phys_addr_t start, phys_addr_t end)
>  	bitmap = __va(boot_params.unaccepted_memory);
>  	range_start = start / PMD_SIZE;
>
> +	/*
> +	 * load_unaligned_zeropad() can lead to unwanted loads across page
> +	 * boundaries. The unwanted loads are typically harmless. But, they
> +	 * might be made to totally unrelated or even unmapped memory.
> +	 * load_unaligned_zeropad() relies on exception fixup (#PF, #GP and now
> +	 * #VE) to recover from these unwanted loads.
> +	 *
> +	 * But, this approach does not work for unaccepted memory. For TDX, a
> +	 * load from unaccepted memory will not lead to a recoverable exception
> +	 * within the guest. The guest will exit to the VMM where the only
> +	 * recourse is to terminate the guest.
> +	 *
> +	 * There are three parts to fix this issue and comprehensively avoid
> +	 * access to unaccepted memory. Together these ensure that an extra
> +	 * “guard” page is accepted in addition to the memory that needs to be
> +	 * used:
> +	 *
> +	 * 1. Implicitly extend the range_contains_unaccepted_memory(start, end)
> +	 *    checks up to end+2M if ‘end’ is aligned on a 2M boundary.
> +	 *
> +	 * 2. Implicitly extend accept_memory(start, end) to end+2M if ‘end’ is
> +	 *    aligned on a 2M boundary.
> +	 *
> +	 * 3. Set PageUnaccepted() on both memory that itself needs to be
> +	 *    accepted *and* memory where the next page needs to be accepted.
> +	 *    Essentially, make PageUnaccepted(page) a marker for whether work
> +	 *    needs to be done to make ‘page’ usable. That work might include
> +	 *    accepting pages in addition to ‘page’ itself.
> +	 */

One nit with this: I'd much rather add one sentence to each of these to
help tie the code implementing it back to this comment. Maybe something
like:

 * 2. Implicitly extend accept_memory(start, end) to end+2M if ‘end’ is
 *    aligned on a 2M boundary. (immediately following this comment)

> +	if (!(end % PMD_SIZE))
> +		end += PMD_SIZE;
> +
>  	spin_lock_irqsave(&unaccepted_memory_lock, flags);
>  	for_each_set_bitrange_from(range_start, range_end, bitmap,
>  				   DIV_ROUND_UP(end, PMD_SIZE)) {
> @@ -46,6 +78,10 @@ bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
>
>  	bitmap = __va(boot_params.unaccepted_memory);
>
> +	/* See comment on load_unaligned_zeropad() in accept_memory() */
> +	if (!(end % PMD_SIZE))
> +		end += PMD_SIZE;

It's a wee bit hard to follow this back to the comment that it
references, even with them sitting next to each other in this diff. How
about adding:

	/*
	 * Also consider the unaccepted state of the *next* page. See
	 * fix #1 in the comment on load_unaligned_zeropad() in
	 * accept_memory().
	 */

>  	spin_lock_irqsave(&unaccepted_memory_lock, flags);
>  	while (start < end) {
>  		if (test_bit(start / PMD_SIZE, bitmap)) {
> diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
> index b91c89100b2d..bc1110509de4 100644
> --- a/drivers/firmware/efi/libstub/x86-stub.c
> +++ b/drivers/firmware/efi/libstub/x86-stub.c
> @@ -709,6 +709,13 @@ static efi_status_t allocate_unaccepted_memory(struct boot_params *params,
>  		return EFI_SUCCESS;
>  	}
>
> +	/*
> +	 * range_contains_unaccepted_memory() may need to check one 2M chunk
> +	 * beyond the end of RAM to deal with load_unaligned_zeropad(). Make
> +	 * sure that the bitmap is large enough to handle it.
> +	 */
> +	max_addr += PMD_SIZE;

I guess the alternative to this would have been to record ‘max_addr’,
then special-case ‘max_addr’+2M in the bitmap checks. I agree this is
probably nicer.
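To be concrete about what I meant by that special case, the sketch below is
roughly the shape I had in mind. It is untested, it is not what the patch
does, and saved_max_addr is a made-up value the stub would have had to
record somewhere:

/* Untested sketch of the alternative; saved_max_addr is hypothetical. */
static phys_addr_t clamped_guarded_end(phys_addr_t end,
				       phys_addr_t saved_max_addr)
{
	if (!(end % PMD_SIZE))
		end += PMD_SIZE;
	/* Never walk past the last bitmap bit that covers real RAM. */
	if (end > saved_max_addr)
		end = saved_max_addr;
	return end;
}

Just spelling it out; the extra 2M of bitmap is simpler than carrying that
state around.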
Also, the changelog needs to at least *mention* this little tidbit. It
was a bit of a surprise when I got here.

With those fixed:

Reviewed-by: Dave Hansen