Date: Wed, 17 May 2023 00:32:45 +0300
From: "Kirill A. Shutemov"
To: Tom Lendacky
Cc: "Kirill A. Shutemov", Borislav Petkov, Andy Lutomirski, Dave Hansen,
 Sean Christopherson, Andrew Morton, Joerg Roedel, Ard Biesheuvel,
 Andi Kleen, Kuppuswamy Sathyanarayanan, David Rientjes, Vlastimil Babka,
 Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Ingo Molnar,
 Dario Faggioli, Mike Rapoport, David Hildenbrand, Mel Gorman,
 marcelo.cerri@canonical.com, tim.gardner@canonical.com,
 khalid.elmously@canonical.com, philip.cox@canonical.com,
 aarcange@redhat.com, peterx@redhat.com, x86@kernel.org,
 linux-mm@kvack.org, linux-coco@lists.linux.dev,
 linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, Mike Rapoport
Subject: Re: [PATCHv11 1/9] mm: Add support for unaccepted memory
Message-ID: <20230516213245.oruzw2kinbfqcwwl@box.shutemov.name>
References: <20230513220418.19357-1-kirill.shutemov@linux.intel.com>
 <20230513220418.19357-2-kirill.shutemov@linux.intel.com>

On Tue, May 16, 2023 at 02:44:00PM -0500, Tom Lendacky wrote:
> On 5/13/23 17:04, Kirill A. Shutemov wrote:
> > UEFI Specification version 2.9 introduces the concept of memory
> > acceptance. Some Virtual Machine platforms, such as Intel TDX or AMD
> > SEV-SNP, require memory to be accepted before it can be used by the
> > guest. Accepting happens via a protocol specific to the Virtual Machine
> > platform.
> >
> > There are several ways kernel can deal with unaccepted memory:
> >
> >  1. Accept all the memory during the boot. It is easy to implement and
> >     it doesn't have runtime cost once the system is booted. The downside
> >     is very long boot time.
> >
> >     Accept can be parallelized to multiple CPUs to keep it manageable
> >     (i.e. via DEFERRED_STRUCT_PAGE_INIT), but it tends to saturate
> >     memory bandwidth and does not scale beyond the point.
> >
> >  2. Accept a block of memory on the first use. It requires more
> >     infrastructure and changes in page allocator to make it work, but
> >     it provides good boot time.
> >
> >     On-demand memory accept means latency spikes every time kernel steps
> >     onto a new memory block. The spikes will go away once workload data
> >     set size gets stabilized or all memory gets accepted.
> >
> >  3. Accept all memory in background. Introduce a thread (or multiple)
> >     that gets memory accepted proactively. It will minimize time the
> >     system experience latency spikes on memory allocation while keeping
> >     low boot time.
> >
> >     This approach cannot function on its own. It is an extension of #2:
> >     background memory acceptance requires functional scheduler, but the
> >     page allocator may need to tap into unaccepted memory before that.
> >
> >     The downside of the approach is that these threads also steal CPU
> >     cycles and memory bandwidth from the user's workload and may hurt
> >     user experience.
> >
> > The patch implements #1 and #2 for now. #2 is the default. Some
> > workloads may want to use #1 with accept_memory=eager in kernel
> > command line. #3 can be implemented later based on user's demands.
> >
> > Support of unaccepted memory requires a few changes in core-mm code:
> >
> >   - memblock has to accept memory on allocation;
> >
> >   - page allocator has to accept memory on the first allocation of the
> >     page;
> >
> > Memblock change is trivial.
> >
> > The page allocator is modified to accept pages.
> > New memory gets accepted
> > before putting pages on free lists. It is done lazily: only accept new
> > pages when we run out of already accepted memory. The memory gets
> > accepted until the high watermark is reached.
> >
> > EFI code will provide two helpers if the platform supports unaccepted
> > memory:
> >
> >  - accept_memory() makes a range of physical addresses accepted.
> >
> >  - range_contains_unaccepted_memory() checks anything within the range
> >    of physical addresses requires acceptance.
> >
> > Signed-off-by: Kirill A. Shutemov
> > Acked-by: Mike Rapoport # memblock
> > Reviewed-by: Vlastimil Babka
> > ---
> >  drivers/base/node.c    |   7 ++
> >  fs/proc/meminfo.c      |   5 ++
> >  include/linux/mm.h     |  19 +++++
> >  include/linux/mmzone.h |   8 ++
> >  mm/internal.h          |   1 +
> >  mm/memblock.c          |   9 +++
> >  mm/mm_init.c           |   7 ++
> >  mm/page_alloc.c        | 173 +++++++++++++++++++++++++++++++++++++++++
> >  mm/vmstat.c            |   3 +
> >  9 files changed, 232 insertions(+)
> >
>
> > diff --git a/mm/internal.h b/mm/internal.h
> > index 68410c6d97ac..b1db7ba5f57d 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -1099,4 +1099,5 @@ struct vma_prepare {
> >          struct vm_area_struct *remove;
> >          struct vm_area_struct *remove2;
> >   };
> > +
>
> Looks like an unintentional change.

Yep, will fix.

> >   #endif /* __MM_INTERNAL_H */
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index 3feafea06ab2..50b921119600 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -1436,6 +1436,15 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
> >           */
> >          kmemleak_alloc_phys(found, size, 0);
> > +        /*
> > +         * Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP,
> > +         * require memory to be accepted before it can be used by the
> > +         * guest.
> > +         *
> > +         * Accept the memory of the allocated buffer.
> > +         */
> > +        accept_memory(found, found + size);
>
> I'm not an mm or memblock expert, but do we need to worry about freed memory
> from memblock_phys_free() being possibly doubly accepted? A double
> acceptance will trigger a guest termination on SNP.

There will be no double acceptance. accept_memory() will consult the bitmap
before accepting any memory. For already accepted memory it is a no-op.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
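
To illustrate why a second accept_memory() call on the same range is harmless, here is a minimal userspace sketch of the bitmap check. It is not the code from the patch: the 2 MiB unit size, the single 64-bit bitmap, and the names sketch_accept_memory(), sketch_range_contains_unaccepted() and platform_accept() are all assumptions made up for the example. platform_accept() stands in for the platform-specific operation and only counts how often it is called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical parameters: one bitmap bit per 2 MiB unit, 64 units total. */
#define UNIT_SHIFT 21
#define UNIT_SIZE  (1ULL << UNIT_SHIFT)
#define NR_UNITS   64

/* A set bit means the unit has not been accepted yet. */
static uint64_t unaccepted_bitmap = ~0ULL;

/* Counts calls to the stand-in platform operation. */
static unsigned int platform_accept_calls;

/* Stand-in for the platform-specific accept (TDX, SEV-SNP, ...). */
static void platform_accept(uint64_t start, uint64_t end)
{
	(void)start;
	(void)end;
	platform_accept_calls++;
}

/* Accept [start, end). Units whose bit is already clear are skipped. */
static void sketch_accept_memory(uint64_t start, uint64_t end)
{
	assert(start <= end && end <= NR_UNITS * UNIT_SIZE);
	if (start == end)
		return;

	uint64_t first = start >> UNIT_SHIFT;
	uint64_t last = (end + UNIT_SIZE - 1) >> UNIT_SHIFT;

	for (uint64_t unit = first; unit < last; unit++) {
		uint64_t bit = 1ULL << unit;

		if (!(unaccepted_bitmap & bit))
			continue;	/* already accepted: no-op */

		platform_accept(unit << UNIT_SHIFT, (unit + 1) << UNIT_SHIFT);
		unaccepted_bitmap &= ~bit;
	}
}

/* Return true if any unit overlapping [start, end) is still unaccepted. */
static bool sketch_range_contains_unaccepted(uint64_t start, uint64_t end)
{
	assert(start <= end && end <= NR_UNITS * UNIT_SIZE);
	if (start == end)
		return false;

	uint64_t first = start >> UNIT_SHIFT;
	uint64_t last = (end + UNIT_SIZE - 1) >> UNIT_SHIFT;

	for (uint64_t unit = first; unit < last; unit++) {
		if (unaccepted_bitmap & (1ULL << unit))
			return true;
	}
	return false;
}
```

In this sketch, accepting the same range twice reaches platform_accept() only on the first pass, because the second pass finds every bit in the range already clear. That is the property that matters for the memblock_phys_free() case: memory that is freed and later allocated again goes through the same check.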