Date: Fri, 24 Jan 2025 13:30:52 +0200
From: Mike Rapoport
To: Jason Gunthorpe
Cc: lsf-pc@lists.linux-foundation.org, Alexander Graf, "Gowans, James", linux-mm@kvack.org, David Rientjes, Pasha Tatashin
Subject: Re: [LSF/MM/BPF TOPIC] memory persistence over kexec
In-Reply-To: <20250120141427.GK674319@ziepe.ca>

Hi Jason,

On Mon, Jan 20, 2025 at 10:14:27AM -0400, Jason Gunthorpe wrote:
> On Mon, Jan 20, 2025 at 09:54:15AM +0200, Mike Rapoport wrote:
> > Hi,
> >
> > I'd like to discuss memory persistence across kexec.
> >
> > Currently there is ongoing work on Kexec HandOver (KHO) [1] that allows
> > serialization and deserialization of kernel data as well as preserving
> > arbitrary memory ranges across kexec.
> >
> > In addition, KHO keeps physically contiguous memory regions that are
> > guaranteed to not have any memory that KHO would preserve, but still can be
> > used by the system. The kexeced kernel bootstraps itself using those
> > regions and sets all handed over memory as in use. KHO users then can
> > recover their state from the preserved data. This includes memory
> > reservations, where the user can either discard or claim reservations.
> >
> > KHO can be used as the base layer for implementation of persistence-aware
> > memory allocator and persistent in-memory filesystem.
> >
> > Aside from status update on KHO progress there are a few topics that I would
> > like to discuss:
> > * Is it feasible and desirable to enable KHO support in tmpfs and hugetlbfs?
> > * Or is it better to implement yet another in-memory filesystem dedicated
> >   for persistence?
> > * What is the best way to ensure that the memory we want to persist is not
> >   scattered all over the place?
>
> There is a lot of talk about taking *drivers* and having them survive
> kexec, meaning the driver has to put a lot of its state into KHO and
> then get it back out again.
>
> I've been hoping for a model where a driver can be told to "go to KHO"
> and the KHO code can be largely contained in the driver and regulated
> to recording the driver state. This implies the state may be
> fragmented all over memory.

I'm not sure I follow what you mean by "go to KHO" here. I believe the
ftrace example in Alex's v3 of KHO
(https://lore.kernel.org/all/20240117144704.602-1-graf@amazon.com) has
enough meat to demonstrate the basic model. The driver has to pass the
state it wishes to preserve, and then during initialization after kexec
the driver can restore its state from the preserved one.

> The other direction is that the driver has to start up in some special
> KHO mode and KHO becomes invasive on all driver paths to use special
> KHO allocations. This seems like a PITA.
>
> You can see this difference just in the discussion around the iommu
> serialization where one idea was to have KHO be an integral (and
> invasive!) part of the page table operations from time zero vs some
> later serialization at kexec time.

I didn't follow that discussion closely, but there still should be a step
where the iommu driver tries to deserialize the data and uses it if
deserialization succeeds. My understanding is that a major part of the
complexity in iommu is the userspace-facing bits that need to be somehow
connected to the restored in-kernel structures after kexec.

> Regardless, I'm interested in this discussion to bring some
> concreteness about how drivers work..
>
> Jason

-- 
Sincerely yours,
Mike.