From: Cong Wang
Date: Sat, 8 Feb 2025 15:39:14 -0800
Subject: Re: [PATCH v4 00/14] kexec: introduce Kexec HandOver (KHO)
To: Mike Rapoport
Cc: linux-kernel@vger.kernel.org, Alexander Graf, Andrew Morton,
    Andy Lutomirski, Anthony Yznaga, Arnd Bergmann, Ashish Kalra,
    Benjamin Herrenschmidt, Borislav Petkov, Catalin Marinas, Dave Hansen,
    David Woodhouse, Eric Biederman, Ingo Molnar, James Gowans,
    Jonathan Corbet, Krzysztof Kozlowski, Mark Rutland, Paolo Bonzini,
    Pasha Tatashin, "H. Peter Anvin", Peter Zijlstra, Pratyush Yadav,
    Rob Herring, Rob Herring, Saravana Kannan, Stanislav Kinsburskii,
    Steven Rostedt, Thomas Gleixner, Tom Lendacky, Usama Arif, Will Deacon,
    devicetree@vger.kernel.org, kexec@lists.infradead.org,
    linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, x86@kernel.org
In-Reply-To: <20250206132754.2596694-1-rppt@kernel.org>
Hi Mike,

On Thu, Feb 6, 2025 at 5:28 AM Mike Rapoport wrote:
>
> From: "Mike Rapoport (Microsoft)"
>
> Hi,
>
> This a next version of Alex's "kexec: Allow preservation of ftrace buffers"
> series (https://lore.kernel.org/all/20240117144704.602-1-graf@amazon.com),
> just to make things simpler instead of ftrace we decided to preserve
> "reserve_mem" regions.
>
> The patches are also available in git:
> https://git.kernel.org/rppt/h/kho/v4
>
>
> Kexec today considers itself purely a boot loader: When we enter the new
> kernel, any state the previous kernel left behind is irrelevant and the
> new kernel reinitializes the system.
>
> However, there are use cases where this mode of operation is not what we
> actually want. In virtualization hosts for example, we want to use kexec
> to update the host kernel while virtual machine memory stays untouched.
> When we add device assignment to the mix, we also need to ensure that
> IOMMU and VFIO states are untouched. If we add PCIe peer to peer DMA, we
> need to do the same for the PCI subsystem. If we want to kexec while an
> SEV-SNP enabled virtual machine is running, we need to preserve the VM
> context pages and physical memory.
> See "pkernfs: Persisting guest memory
> and kernel/device state safely across kexec" Linux Plumbers
> Conference 2023 presentation for details:
>
>   https://lpc.events/event/17/contributions/1485/
>
> To start us on the journey to support all the use cases above, this patch
> implements basic infrastructure to allow hand over of kernel state across
> kexec (Kexec HandOver, aka KHO). As a really simple example target, we use
> memblock's reserve_mem.
> With this patch set applied, memory that was reserved using "reserve_mem"
> command line options remains intact after kexec and it is guaranteed to
> reside at the same physical address.

Nice work!

One concern is that using memblock to reserve memory, the way
crashkernel= does, is not flexible. I worked on kdump years ago, and one
of its biggest pains is deciding how much memory to reserve with
crashkernel=. It is still a pain today. If we reserve more, more memory
is wasted for the 1st kernel. If we reserve less, the 2nd kernel is more
likely to hit OOM.

I'd suggest considering CMA, where the "reserved" memory remains usable
for other purposes and pages are migrated out of the reserved region on
demand, that is, when loading a kexec kernel. Of course, we need to make
sure those pages are not reused by what you want to preserve here, e.g.,
IOMMU. So it might need additional work, but I still believe this is the
right direction.

Just my two cents.

Thanks!