linux-mm.kvack.org archive mirror
From: Changyuan Lyu <changyuanl@google.com>
To: jgg@nvidia.com
Cc: akpm@linux-foundation.org, anthony.yznaga@oracle.com,
	arnd@arndb.de,  ashish.kalra@amd.com, benh@kernel.crashing.org,
	bp@alien8.de,  catalin.marinas@arm.com, changyuanl@google.com,
	corbet@lwn.net,  dave.hansen@linux.intel.com,
	devicetree@vger.kernel.org, dwmw2@infradead.org,
	 ebiederm@xmission.com, graf@amazon.com, hpa@zytor.com,
	jgowans@amazon.com,  kexec@lists.infradead.org, krzk@kernel.org,
	 linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
	 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	luto@kernel.org,  mark.rutland@arm.com, mingo@redhat.com,
	pasha.tatashin@soleen.com,  pbonzini@redhat.com,
	peterz@infradead.org, ptyadav@amazon.de,  robh+dt@kernel.org,
	robh@kernel.org, rostedt@goodmis.org, rppt@kernel.org,
	 saravanak@google.com, skinsburskii@linux.microsoft.com,
	tglx@linutronix.de,  thomas.lendacky@amd.com, will@kernel.org,
	x86@kernel.org
Subject: Re: [PATCH v5 07/16] kexec: add Kexec HandOver (KHO) generation helpers
Date: Sun, 23 Mar 2025 12:02:04 -0700
Message-ID: <20250323190204.742672-1-changyuanl@google.com>
In-Reply-To: <20250321133447.GA251739@nvidia.com>

Hi Jason, thanks for reviewing the patchset!

On Fri, Mar 21, 2025 at 10:34:47 -0300, Jason Gunthorpe <jgg@nvidia.com> wrote:
> On Wed, Mar 19, 2025 at 06:55:42PM -0700, Changyuan Lyu wrote:
> > From: Alexander Graf <graf@amazon.com>
> >
> > Add the core infrastructure to generate Kexec HandOver metadata. Kexec
> > HandOver is a mechanism that allows Linux to preserve state - arbitrary
> > properties as well as memory locations - across kexec.
> >
> > It does so using 2 concepts:
> >
> >   1) State Tree - Every KHO kexec carries a state tree that describes the
> >      state of the system. The state tree is represented as hash-tables.
> >      Device drivers can add/remove their data into/from the state tree at
> >      system runtime. On kexec, the tree is converted to FDT (flattened
> >      device tree).
>
> Why are we changing this? I much prefered the idea of having recursive
> FDTs than this notion copying eveything into tables then out into FDT?
> Now that we have the preserved pages mechanism there is a pretty
> direct path to doing recursive FDT.

We are not copying data into the hashtables. The hashtables only
record the address and size of the data to be serialized into the FDT.
The idea is similar to recording preserved folios in an xarray and
then serializing them to linked pages.
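
To illustrate the point, here is a minimal user-space sketch (not code
from the patchset). The names prop_desc, state_node, node_add_prop and
node_serialize are hypothetical, and a fixed array stands in for the
hashtable. Adding a property only records a pointer and a length; the
bytes are copied once, at serialization time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: records where the data lives, not the data. */
struct prop_desc {
	const char *name;
	const void *data;	/* borrowed pointer, owned by the KHO user */
	size_t size;
};

#define MAX_PROPS 8

/* A fixed array stands in for the per-node hashtable. */
struct state_node {
	struct prop_desc props[MAX_PROPS];
	size_t nr;
};

/* Record address and size only; no copy of the payload happens here. */
static int node_add_prop(struct state_node *n, const char *name,
			 const void *data, size_t size)
{
	if (n->nr == MAX_PROPS)
		return -1;
	n->props[n->nr++] = (struct prop_desc){ name, data, size };
	return 0;
}

/* Serialization is the only point where the bytes are copied out. */
static int node_serialize(const struct state_node *n, void *out,
			  size_t cap, size_t *written)
{
	size_t off = 0;

	for (size_t i = 0; i < n->nr; i++) {
		const struct prop_desc *p = &n->props[i];

		if (p->size > cap - off)
			return -1;
		memcpy((char *)out + off, p->data, p->size);
		off += p->size;
	}
	*written = off;
	return 0;
}
```

Because only the address is recorded, a change made to the data after
node_add_prop() is still picked up by node_serialize().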

> I feel like this patch is premature, it should come later in the
> project along with a stronger justification for this approach.
>
> IHMO keep things simple for this series, just the very basics.

The main purpose of using hashtables is to enable KHO users to save
data to KHO at any time, not just when KHO is activated/finalized
through sysfs/debugfs. For example, FDBox can save the data into the
KHO tree as soon as a new fd is saved to KHO. Also, using hashtables
allows KHO users to add data to KHO concurrently, whereas with
notifiers, KHO users' callbacks are executed serially.

Regarding the suggestion of recursive FDT, I feel like it is already
doable with this patchset, or even with Mike's V4 patch. A KHO user can
just allocate a buffer, serialize all its state to the buffer using
libfdt (or even using other binary formats), save the address of the
buffer to KHO's tree, and finally register the buffer's underlying
pages/folios with kho_preserve_folio().

> > +int register_kho_notifier(struct notifier_block *nb)
> > +{
> > +	return blocking_notifier_chain_register(&kho_out.chain_head, nb);
> > +}
> > +EXPORT_SYMBOL_GPL(register_kho_notifier);
>
> And another different set of notifiers? :(

I changed the semantics of the notifiers. In Mike's V4, the KHO
notifier passes the fdt pointer to KHO users so that they can push data
into the blob. In this patchset, it notifies KHO users that this is
their last chance to save data to KHO.

It is not necessary for every KHO user to register a notifier, as they
can use the helper functions to save data to the KHO tree at any time
(but before the KHO tree is converted and frozen). For example, FDBox
would not need a notifier if it saves data to the KHO tree immediately
once an FD is registered to it.

However, some KHO users may still want to add data just before kexec,
so I kept the notifiers to let KHO users be notified when the state
tree hashtables are about to be frozen and converted to FDT.

> > +static int kho_finalize(void)
> > +{
> > +	int err = 0;
> > +	void *fdt;
> > +
> > +	fdt = kvmalloc(kho_out.fdt_max, GFP_KERNEL);
> > +	if (!fdt)
> > +		return -ENOMEM;
>
> We go to all the trouble of keeping track of stuff in dynamic hashes
> but still can't automatically size the fdt and keep the dumb uapi to
> have the user say? :( :(

The reason for keeping fdt_max in this patchset is to simplify the
support of kexec_file_load().

We want to be able to do kexec_file_load() first and then do KHO
activation/finalization, to move kexec_file_load() out of the blackout
window. At the time of kexec_file_load(), we need to pass the KHO FDT
address to the new kernel's setup data (x86) or devicetree (arm), but
the KHO FDT is not generated yet. The simple solution used in this
patchset is to reserve a ksegment of size fdt_max and pass the address
of that ksegment to the new kernel. The final FDT is copied to that
ksegment in kernel_kexec(). An extra benefit of this solution is that
the reserved ksegment is physically contiguous.
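
The ordering constraint can be sketched in user-space C as follows.
This is only an illustration, not the patchset code: reserved_seg,
seg_reserve and seg_fill are hypothetical names, and calloc() stands in
for reserving a ksegment. The address is fixed at load time, so the
later copy can only succeed if the final FDT fits in fdt_max.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the ksegment reserved at load time. */
struct reserved_seg {
	void *addr;	/* address handed to the next kernel at load time */
	size_t size;	/* fixed at load time: fdt_max */
};

/* Load time: the FDT does not exist yet, so reserve the worst case. */
static int seg_reserve(struct reserved_seg *seg, size_t fdt_max)
{
	seg->addr = calloc(1, fdt_max);
	if (!seg->addr)
		return -ENOMEM;
	seg->size = fdt_max;
	return 0;
}

/* Kexec time: copy the finished FDT into the slot reserved earlier. */
static int seg_fill(struct reserved_seg *seg, const void *fdt,
		    size_t fdt_size)
{
	if (fdt_size > seg->size)
		return -ENOSPC;	/* the address is already published */
	memcpy(seg->addr, fdt, fdt_size);
	return 0;
}
```

The -ENOSPC path is the cost of this approach: the user has to pick an
fdt_max that is large enough before the FDT size is known.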

To completely remove fdt_max, I am considering the idea in [1]. At the
time of kexec_file_load(), we pass the address of an anchor page to
the new kernel, and the anchor page will later be filled in with the
physical addresses of the pages containing the FDT blob. Multiple
anchor pages can be linked together. The FDT blob pages can be
physically noncontiguous.

[1] https://lore.kernel.org/all/CA+CK2bBBX+HgD0HLj-AyTScM59F2wXq11BEPgejPMHoEwqj+_Q@mail.gmail.com/
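
A rough user-space sketch of what such an anchor page could look like
is below. The layout and the names (anchor_page, anchor_add,
anchor_count, PAGE_SZ) are my assumptions for illustration, not a
settled format, and pointers cast to integers stand in for physical
addresses.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SZ 4096
/* Two slots are taken by the link to the next anchor and the count. */
#define ANCHOR_SLOTS (PAGE_SZ / sizeof(uint64_t) - 2)

/* Hypothetical anchor page layout, exactly one page in size. */
struct anchor_page {
	uint64_t next;			/* address of next anchor, 0 = end */
	uint64_t nr;			/* number of used entries */
	uint64_t pages[ANCHOR_SLOTS];	/* addresses of FDT blob pages */
};

/* Append one FDT page address, chaining a new anchor page when full. */
static int anchor_add(struct anchor_page *head, uint64_t page_addr)
{
	struct anchor_page *a = head;

	while (a->next)
		a = (struct anchor_page *)(uintptr_t)a->next;
	if (a->nr == ANCHOR_SLOTS) {
		struct anchor_page *n = calloc(1, sizeof(*n));

		if (!n)
			return -1;
		a->next = (uint64_t)(uintptr_t)n;
		a = n;
	}
	a->pages[a->nr++] = page_addr;
	return 0;
}

/* Walk the chain the way the new kernel would to find all FDT pages. */
static size_t anchor_count(const struct anchor_page *head)
{
	const struct anchor_page *a = head;
	size_t total = 0;

	while (a) {
		total += a->nr;
		a = (const struct anchor_page *)(uintptr_t)a->next;
	}
	return total;
}
```

Only the address of the first anchor page has to be known at
kexec_file_load() time; the entries and any further anchor pages can
be filled in later.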

Best,
Changyuan

