Date: Sun, 23 Feb 2025 20:51:27 +0200
From: Mike Rapoport
To: Jason Gunthorpe
Cc: Pasha Tatashin, linux-kernel@vger.kernel.org, Alexander Graf,
 Andrew Morton, Andy Lutomirski, Anthony Yznaga, Arnd Bergmann,
 Ashish Kalra, Benjamin Herrenschmidt, Borislav Petkov, Catalin Marinas,
 Dave Hansen, David Woodhouse, Eric Biederman, Ingo Molnar, James Gowans,
 Jonathan Corbet, Krzysztof Kozlowski, Mark Rutland, Paolo Bonzini,
 "H. Peter Anvin", Peter Zijlstra, Pratyush Yadav, Rob Herring,
 Saravana Kannan, Stanislav Kinsburskii, Steven Rostedt, Thomas Gleixner,
 Tom Lendacky, Usama Arif, Will Deacon, devicetree@vger.kernel.org,
 kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Subject: Re: [PATCH v4 05/14] kexec: Add Kexec HandOver (KHO) generation helpers
References: <20250206132754.2596694-1-rppt@kernel.org>
 <20250206132754.2596694-6-rppt@kernel.org>
 <20250210202220.GC3765641@nvidia.com>
 <20250211124943.GC3754072@nvidia.com>
 <20250211163720.GH3754072@nvidia.com>
 <20250212152336.GA3848889@nvidia.com>
 <20250212174303.GU3754072@nvidia.com>
In-Reply-To: <20250212174303.GU3754072@nvidia.com>

On Wed, Feb 12, 2025 at 01:43:03PM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 12, 2025 at 06:39:06PM +0200, Mike Rapoport wrote:
>
> > As I've mentioned off-list earlier, KHO in its current form is the lowest
> > level of abstraction for state preservation and it is by no means is
> > intended to provide complex drivers with all the tools necessary.
>
> My point, is I think it is the wrong level of abstraction and the
> wrong FDT schema.
> It does not and cannot solve the problems we know we
> will have, so why invest anything into that schema?

Preserving a lot of random pages spread all over the place will be a
problem no matter what. With kho_preserve_folio() the users will still need
to save the physical address of that folio somewhere, be it the FDT or some
binary structure that the FDT will point to. So instead of "mem" properties
we'll have either an "addresses" property or a pointer to yet another page
that should be preserved and, by the way, "mem" may come in handy in this
case :)

I don't see how the "mem" property contradicts future extensions, and for
simple use cases it is already enough. The simple reserve_mem use case in
this patchset indeed does not represent the complexity of a driver, but it's
still useful, at least for the ftrace folks. And reserve_mem is just fine
with the "mem" property.

> I think the scratch system is great, and an amazing improvement over
> past version. Upgrade the memory preservation to match and it will be
> really good.
>
> > What you propose is a great optimization for memory preservation mechanism,
> > and additional and very useful abstraction layer on top of "basic KHO"!
>
> I do not see this as a layer on top, I see it as fundamentally
> replacing the memory preservation mechanism with something more
> scalable.

There are two parts to the memory preservation: making sure the preserved
pages don't make it to the free lists, and then restoring struct
page/folio/memdesc so that the pages will look the same way as when they
were allocated.

For the first part we must memblock_reserve(addr, size) for every preserved
range before memblock releases memory to the buddy.
I did an experiment and preserved 1GiB of random order-0 pages and measured
the time required to reserve everything in memblock.
The kho_deserialize() you suggested slightly outperformed
kho_init_reserved_pages(), which parsed a single "mem" property containing
an array of (addr, size) pairs.
For a more random distribution of orders and a deeper FDT the difference
would of course be higher, but still both options sucked relative to a
maple tree serialized similarly to your tracker xarray.

For the restoration of the struct folio for multiorder folios the tracker
xarray is a really great fit, but again, it does not contradict having
"mem" properties. And the restoration of struct folio does not have to
happen very early, so we'd probably want to run it in parallel, somewhat
like deferred initialization of struct page.

> > But I think it will be easier to start with something *very simple* and
> > probably suboptimal and then extend it rather than to try to build complex
> > comprehensive solution from day one.
>
> But why? Just do it right from the start? I spent like a hour
> sketching that, the existing preservation code is also very simple,
> why not just fix it right now?

As I see it, we can have both: the "mem" property for simple use cases, or
as a partial solution for complex use cases, and the tracker you proposed
for preserving the order of the folios.

And as another optimization we may want a maple tree for coalescing as much
as possible to reduce the number of memblock_reserve() calls.

> Jason

-- 
Sincerely yours,
Mike.
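
An illustrative sketch of the coalescing idea mentioned above (not code from
the patch set): a small userspace C routine that sorts (addr, size) ranges
and merges the ones that touch or overlap, so that a reservation call would
be needed only once per merged range. The names `struct range`,
`range_cmp()` and `coalesce_ranges()` are hypothetical, and a kernel
implementation would presumably use a maple tree and memblock_reserve()
rather than qsort() on a flat array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical (addr, size) pair, similar in spirit to one "mem" entry. */
struct range {
	uint64_t addr;
	uint64_t size;
};

/* qsort() comparator: order ranges by start address. */
static int range_cmp(const void *a, const void *b)
{
	const struct range *ra = a, *rb = b;

	if (ra->addr < rb->addr)
		return -1;
	return ra->addr > rb->addr;
}

/*
 * Sort ranges by address and merge the ones that touch or overlap.
 * The merged ranges are left at the start of r[]; returns their count.
 */
static size_t coalesce_ranges(struct range *r, size_t n)
{
	size_t i, out = 0;

	assert(n == 0 || r != NULL);
	if (n == 0)
		return 0;

	qsort(r, n, sizeof(*r), range_cmp);

	for (i = 1; i < n; i++) {
		uint64_t end = r[out].addr + r[out].size;

		if (r[i].addr <= end) {
			/* Adjacent or overlapping: extend the current range. */
			uint64_t new_end = r[i].addr + r[i].size;

			if (new_end > end)
				r[out].size = new_end - r[out].addr;
		} else {
			/* Gap: start a new output range. */
			r[++out] = r[i];
		}
	}
	return out + 1;
}
```

For example, three 4KiB ranges at 0x1000, 0x2000 and 0x3000 plus an 8KiB
range at 0x8000 collapse into two ranges, so two reservation calls would
suffice instead of four.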