From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <48567ee3-b482-bafd-bd25-cbb8bf3403b2@suse.cz>
Date: Mon, 3 Apr 2023 15:28:36 +0200
From: Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCHv9 11/14] x86/mm: Avoid load_unaligned_zeropad() stepping into unaccepted memory
To: "Kirill A. Shutemov", Borislav Petkov, Andy Lutomirski, Sean Christopherson,
 Andrew Morton, Joerg Roedel, Ard Biesheuvel
Cc: Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Tom Lendacky,
 Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar, Dario Faggioli,
 Dave Hansen, Mike Rapoport, David Hildenbrand, Mel Gorman,
 marcelo.cerri@canonical.com, tim.gardner@canonical.com,
 khalid.elmously@canonical.com, philip.cox@canonical.com, aarcange@redhat.com,
 peterx@redhat.com, x86@kernel.org, linux-mm@kvack.org,
 linux-coco@lists.linux.dev, linux-efi@vger.kernel.org,
 linux-kernel@vger.kernel.org, Dave Hansen
References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com>
 <20230330114956.20342-12-kirill.shutemov@linux.intel.com>
In-Reply-To: <20230330114956.20342-12-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8

On 3/30/23 13:49, Kirill A. Shutemov wrote:
> load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
> The unwanted loads are typically harmless. But, they might be made to
> totally unrelated or even unmapped memory. load_unaligned_zeropad()
> relies on exception fixup (#PF, #GP and now #VE) to recover from these
> unwanted loads.
>
> But, this approach does not work for unaccepted memory. For TDX, a load
> from unaccepted memory will not lead to a recoverable exception within
> the guest. The guest will exit to the VMM where the only recourse is to
> terminate the guest.
>
> There are three parts to fix this issue and comprehensively avoid access
> to unaccepted memory. Together these ensure that an extra "guard" page
> is accepted in addition to the memory that needs to be used.
>
> 1. Implicitly extend the range_contains_unaccepted_memory(start, end)
>    checks up to end+2M if 'end' is aligned on a 2M boundary. It may
>    require checking a 2M chunk beyond the end of RAM. The bitmap
>    allocation is modified to accommodate this.
> 2. Implicitly extend accept_memory(start, end) to end+2M if 'end' is
>    aligned on a 2M boundary.
> 3. Set PageUnaccepted() on both memory that itself needs to be accepted
>    *and* memory where the next page needs to be accepted. Essentially,
>    make PageUnaccepted(page) a marker for whether work needs to be done
>    to make 'page' usable. That work might include accepting pages in
>    addition to 'page' itself.
>
> Side note: This leads to something strange. Pages which were accepted
>            at boot, marked by the firmware as accepted and will never
>            _need_ to be accepted might have PageUnaccepted() set on
>            them. PageUnaccepted(page) is a cue to ensure that the next
>            page is accepted before 'page' can be used.

At least the part about PageUnaccepted() is obsolete in v9, no?

> This is an actual, real-world problem which was discovered during TDX
> testing.
>
> Signed-off-by: Kirill A. Shutemov
> Reviewed-by: Dave Hansen
> ---
>  arch/x86/mm/unaccepted_memory.c         | 39 +++++++++++++++++++++++++
>  drivers/firmware/efi/libstub/x86-stub.c |  7 +++++
>  2 files changed, 46 insertions(+)
>
> diff --git a/arch/x86/mm/unaccepted_memory.c b/arch/x86/mm/unaccepted_memory.c
> index 1df918b21469..a0a58486eb74 100644
> --- a/arch/x86/mm/unaccepted_memory.c
> +++ b/arch/x86/mm/unaccepted_memory.c
> @@ -23,6 +23,38 @@ void accept_memory(phys_addr_t start, phys_addr_t end)
>  	bitmap = __va(boot_params.unaccepted_memory);
>  	range_start = start / PMD_SIZE;
>
> +	/*
> +	 * load_unaligned_zeropad() can lead to unwanted loads across page
> +	 * boundaries. The unwanted loads are typically harmless. But, they
> +	 * might be made to totally unrelated or even unmapped memory.
> +	 * load_unaligned_zeropad() relies on exception fixup (#PF, #GP and now
> +	 * #VE) to recover from these unwanted loads.
> +	 *
> +	 * But, this approach does not work for unaccepted memory. For TDX, a
> +	 * load from unaccepted memory will not lead to a recoverable exception
> +	 * within the guest. The guest will exit to the VMM where the only
> +	 * recourse is to terminate the guest.
> +	 *
> +	 * There are three parts to fix this issue and comprehensively avoid
> +	 * access to unaccepted memory. Together these ensure that an extra
> +	 * "guard" page is accepted in addition to the memory that needs to be
> +	 * used:
> +	 *
> +	 * 1. Implicitly extend the range_contains_unaccepted_memory(start, end)
> +	 *    checks up to end+2M if 'end' is aligned on a 2M boundary.
> +	 *
> +	 * 2. Implicitly extend accept_memory(start, end) to end+2M if 'end' is
> +	 *    aligned on a 2M boundary.

(immediately following this comment)

> +	 *
> +	 * 3. Set PageUnaccepted() on both memory that itself needs to be
> +	 *    accepted *and* memory where the next page needs to be accepted.
> +	 *    Essentially, make PageUnaccepted(page) a marker for whether work
> +	 *    needs to be done to make 'page' usable. That work might include
> +	 *    accepting pages in addition to 'page' itself.
> +	 */

And here.

> +	if (!(end % PMD_SIZE))
> +		end += PMD_SIZE;
> +
>  	spin_lock_irqsave(&unaccepted_memory_lock, flags);
>  	for_each_set_bitrange_from(range_start, range_end, bitmap,
>  				   DIV_ROUND_UP(end, PMD_SIZE)) {
> @@ -46,6 +78,13 @@ bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
>
>  	bitmap = __va(boot_params.unaccepted_memory);
>
> +	/*
> +	 * Also consider the unaccepted state of the *next* page. See fix #1 in
> +	 * the comment on load_unaligned_zeropad() in accept_memory().
> +	 */
> +	if (!(end % PMD_SIZE))
> +		end += PMD_SIZE;
> +
>  	spin_lock_irqsave(&unaccepted_memory_lock, flags);
>  	while (start < end) {
>  		if (test_bit(start / PMD_SIZE, bitmap)) {
> diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
> index 1643ddbde249..1afe7b5b02e1 100644
> --- a/drivers/firmware/efi/libstub/x86-stub.c
> +++ b/drivers/firmware/efi/libstub/x86-stub.c
> @@ -715,6 +715,13 @@ static efi_status_t allocate_unaccepted_bitmap(struct boot_params *params,
>  		return EFI_SUCCESS;
>  	}
>
> +	/*
> +	 * range_contains_unaccepted_memory() may need to check one 2M chunk
> +	 * beyond the end of RAM to deal with load_unaligned_zeropad(). Make
> +	 * sure that the bitmap is large enough to handle it.
> +	 */
> +	max_addr += PMD_SIZE;
> +
> 	/*
> 	 * If unaccepted memory is present, allocate a bitmap to track what
> 	 * memory has to be accepted before access.