From: Mike Rapoport <rppt@kernel.org>
To: Jason Miu <jasonmiu@google.com>
Cc: Alexander Graf <graf@amazon.com>,
Andrew Morton <akpm@linux-foundation.org>,
Baoquan He <bhe@redhat.com>,
Changyuan Lyu <changyuanl@google.com>,
David Matlack <dmatlack@google.com>,
David Rientjes <rientjes@google.com>,
Jason Gunthorpe <jgg@nvidia.com>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Pratyush Yadav <pratyush@kernel.org>,
kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH v7 1/2] kho: Adopt radix tree for preserved memory tracking
Date: Tue, 20 Jan 2026 19:57:37 +0200
Message-ID: <aW_CEV-Qqrj2dvEb@kernel.org>
In-Reply-To: <20260116034432.1520731-2-jasonmiu@google.com>

Hi Jason,

On Thu, Jan 15, 2026 at 07:44:31PM -0800, Jason Miu wrote:
> Introduce a radix tree implementation for tracking preserved memory
> pages and switch the KHO memory tracking mechanism to use it. This
> lays the groundwork for a stateless KHO implementation that eliminates
> the need for serialization and the associated "finalize" state.
>
> This patch introduces the core radix tree data structures and
> constants to the KHO ABI. It adds the radix tree node and leaf
> structures, along with documentation for the radix tree key encoding
> scheme that combines a page's physical address and order.
>
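
For readers following along, here is a minimal userspace sketch of how a single key can carry both the physical address and the order. The layout is an assumption made for illustration and is not taken from the quoted text: a marker bit whose position depends on the order, OR-ed with the address shifted right by the page shift plus the order. All model_* names and the 4 KiB page size are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT	12			/* assumption: 4 KiB pages */
#define MODEL_ORDER_0_LOG2	(64 - MODEL_PAGE_SHIFT)	/* marker bit position for order 0 */

/* Combine an order-aligned physical address and its order into one key */
static uint64_t model_encode_key(uint64_t phys, unsigned int order)
{
	assert(order < MODEL_ORDER_0_LOG2);

	/* The marker bit moves down by one position per order */
	uint64_t marker = UINT64_C(1) << (MODEL_ORDER_0_LOG2 - order);
	/* Index of the block in units of (PAGE_SIZE << order) */
	uint64_t index = phys >> (MODEL_PAGE_SHIFT + order);

	return marker | index;
}

/* Recover the physical address and the order from a key */
static uint64_t model_decode_key(uint64_t key, unsigned int *order)
{
	unsigned int top;

	assert(key != 0);

	/* Find the highest set bit, which is the order marker */
	for (top = 63; !(key >> top); top--)
		;
	assert(top <= MODEL_ORDER_0_LOG2 && top > 0);

	*order = MODEL_ORDER_0_LOG2 - top;
	return (key & ~(UINT64_C(1) << top)) << (MODEL_PAGE_SHIFT + *order);
}
```

With this layout, the same physical address preserved at two different orders yields two distinct keys, so a single tree can track all orders.
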
> To support broader use by other kernel subsystems, such as hugetlb
> preservation, the core radix tree manipulation functions are exported
> as a public API.
>
> The xarray-based memory tracking is replaced with this new radix tree
> implementation. The core KHO preservation and unpreservation functions
> are wired up to use the radix tree helpers. On boot, the second kernel
> restores the preserved memory map by walking the radix tree whose root
> physical address is passed via the FDT.
>
> The ABI `compatible` version is bumped to "kho-v2" to reflect the
> structural changes in the preserved memory map and sub-FDT property
> names. This includes renaming "fdt" to "preserved-data" to better
> reflect that preserved state may use formats other than FDT.
>
> Signed-off-by: Jason Miu <jasonmiu@google.com>
...
> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index 49bf2cecab12..06adaf56cd69 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -5,6 +5,7 @@
> * Copyright (C) 2025 Microsoft Corporation, Mike Rapoport <rppt@kernel.org>
> * Copyright (C) 2025 Google LLC, Changyuan Lyu <changyuanl@google.com>
> * Copyright (C) 2025 Pasha Tatashin <pasha.tatashin@soleen.com>
> + * Copyright (C) 2025 Google LLC, Jason Miu <jasonmiu@google.com>

It's already 2026 ;-)

> */
>
> #define pr_fmt(fmt) "KHO: " fmt
...
> +int kho_radix_add_page(struct kho_radix_tree *tree,
> + unsigned long pfn, unsigned int order)
> +{
> + /* Newly allocated nodes for error cleanup */
> + struct kho_radix_node *intermediate_nodes[KHO_TREE_MAX_DEPTH] = { 0 };
> + unsigned long key = kho_radix_encode_key(PFN_PHYS(pfn), order);
> + struct kho_radix_node *new_node, *anchor_node;
> + struct kho_radix_node *node = tree->root;
> + unsigned int i, idx, anchor_idx;
> + struct kho_radix_leaf *leaf;
> + int err = 0;
> +
> + if (WARN_ON_ONCE(!tree->root))
> + return -EINVAL;
> +
> + might_sleep();
> +
> + guard(mutex)(&tree->lock);
> +
> + /* Go from high levels to low levels */
> + for (i = KHO_TREE_MAX_DEPTH - 1; i > 0; i--) {
> + idx = kho_radix_get_table_index(key, i);
> +
> + if (node->table[idx]) {
> + node = phys_to_virt(node->table[idx]);
> + continue;
> + }
> +
> + /* Next node is empty, create a new node for it */
> + new_node = (struct kho_radix_node *)get_zeroed_page(GFP_KERNEL);
> + if (!new_node) {
> + err = -ENOMEM;
> + goto err_free_nodes;
> + }
> +
> + node->table[idx] = virt_to_phys(new_node);
> +
> + /*
> + * Capture the node where the new branch starts for cleanup
> + * if allocation fails.
> + */
> + if (!anchor_node) {

I think anchor_node should be initialized to NULL for this to work.

> + anchor_node = node;
> + anchor_idx = idx;
> + }
> + intermediate_nodes[i] = new_node;
> +
> + node = new_node;
> + }
> +
> + /* Handle the leaf level bitmap (level 0) */
> + idx = kho_radix_get_bitmap_index(key);
> + leaf = (struct kho_radix_leaf *)node;
> + __set_bit(idx, leaf->bitmap);
> +
> + return 0;
> +
> +err_free_nodes:
> + for (i = KHO_TREE_MAX_DEPTH - 1; i > 0; i--) {
> + if (intermediate_nodes[i])
> + free_page((unsigned long)intermediate_nodes[i]);
> + }
> + if (anchor_node)
> + anchor_node->table[anchor_idx] = 0;
> +
> + return err;
> +}
> +EXPORT_SYMBOL_GPL(kho_radix_add_page);
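
To illustrate the anchor_node point, below is a minimal userspace model of the same add-with-rollback pattern. It is only a sketch under stated assumptions: the model_* names, the depth of 3, the fanout of 4 and the allocation budget hook are all hypothetical, and calloc()/free() stand in for get_zeroed_page()/free_page(). The part that matters is that the anchor pointer starts out as NULL, so that both the "first new node" check in the loop and the check in the error path are well defined.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_DEPTH	3	/* hypothetical: levels 2 and 1 are tables, level 0 is a leaf */
#define MODEL_FANOUT	4	/* hypothetical: 2 key bits per level */

struct model_node {
	union {
		struct model_node *table[MODEL_FANOUT];	/* levels > 0 */
		unsigned long bitmap;			/* level 0 */
	};
};

/* Test hook: allocations left before failure, negative means unlimited */
static int model_alloc_budget = -1;

static struct model_node *model_alloc(void)
{
	if (model_alloc_budget == 0)
		return NULL;
	if (model_alloc_budget > 0)
		model_alloc_budget--;
	return calloc(1, sizeof(struct model_node));
}

static unsigned int model_index(unsigned long key, unsigned int level)
{
	return (key >> (2 * level)) & (MODEL_FANOUT - 1);
}

static int model_add(struct model_node *root, unsigned long key)
{
	struct model_node *created[MODEL_DEPTH] = { NULL };
	struct model_node *anchor = NULL;	/* explicit init, the checks below rely on it */
	struct model_node *node = root, *new_node;
	unsigned int i, idx, anchor_idx = 0;

	if (!root)
		return -EINVAL;

	/* Go from high levels to low levels */
	for (i = MODEL_DEPTH - 1; i > 0; i--) {
		idx = model_index(key, i);
		if (node->table[idx]) {
			node = node->table[idx];
			continue;
		}

		new_node = model_alloc();
		if (!new_node)
			goto err_free;

		node->table[idx] = new_node;
		/* Remember where the newly created branch hangs off the old tree */
		if (!anchor) {
			anchor = node;
			anchor_idx = idx;
		}
		created[i] = new_node;
		node = new_node;
	}

	node->bitmap |= 1UL << model_index(key, 0);
	return 0;

err_free:
	for (i = MODEL_DEPTH - 1; i > 0; i--)
		free(created[i]);	/* free(NULL) is a no-op */
	if (anchor) {
		assert(anchor_idx < MODEL_FANOUT);
		anchor->table[anchor_idx] = NULL;	/* unlink the freed branch */
	}
	return -ENOMEM;
}
```

If the anchor pointer were left uninitialized, the first "if (!anchor)" test would read an indeterminate value, the real anchor might never be recorded, and the error path could write through a garbage pointer.
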
...
> + if (WARN_ON(!node->table[idx]))
> + return;
> +
> + node = phys_to_virt((phys_addr_t)node->table[idx]);

No need for casting.

> + shift = ((level - 1) * KHO_TABLE_SIZE_LOG2) +
> + KHO_BITMAP_SIZE_LOG2;
> + key = start | (i << shift);
> +
> + node = phys_to_virt((phys_addr_t)root->table[i]);

Ditto.

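
A small userspace illustration of why the cast is redundant: an integer argument is converted implicitly to the parameter type at the call. The sketch assumes the table entries are 64-bit integers (their declared type is not visible in the quoted hunks), and model_phys_addr_t, model_ram and model_phys_to_virt() are hypothetical stand-ins rather than the kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 64-bit physical address type, as on 64-bit configurations */
typedef uint64_t model_phys_addr_t;

/* Stand-in for the direct map: "physical" addresses are offsets into this array */
static unsigned char model_ram[4096];

static void *model_phys_to_virt(model_phys_addr_t pa)
{
	assert(pa < sizeof(model_ram));
	return &model_ram[pa];
}

static void *model_next_node(const uint64_t *table, unsigned int idx)
{
	/* table[idx] converts implicitly to the parameter type, no cast needed */
	return model_phys_to_virt(table[idx]);
}
```
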
> @@ -1466,12 +1489,6 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
> goto out;
> }
>
> - mem_map_phys = kho_get_mem_map_phys(fdt);
> - if (!mem_map_phys) {
> - err = -ENOENT;
> - goto out;
> - }

I think we should keep the logic that skips scratch initialization if there
were no memory preservations, like Pasha implemented here:

https://lkml.kernel.org/r/20251223140140.2090337-1-pasha.tatashin@soleen.com

(commit e1c3bfd091f3 ("kho: validate preserved memory map during
population") in today's mm tree)

We should just update the validation to work with the radix tree.

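
Roughly what I have in mind, as a userspace sketch rather than a proposed implementation: treat a missing root or a root table with no entries as "nothing preserved" and bail out before touching scratch. The model_* names and the 512-entry table are assumptions made for illustration. In kho_populate() the root would first have to be looked up from the FDT and mapped, which the sketch leaves out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_TABLE_ENTRIES	512	/* assumption: one 4 KiB page of 64-bit entries */

struct model_radix_node {
	uint64_t table[MODEL_TABLE_ENTRIES];
};

/* A root with no child entries means no memory was preserved */
static bool model_root_has_entries(const struct model_radix_node *root)
{
	size_t i;

	if (!root)
		return false;

	for (i = 0; i < MODEL_TABLE_ENTRIES; i++) {
		if (root->table[i])
			return true;
	}
	return false;
}

/* Mirrors the removed check: fail early so scratch setup is skipped */
static int model_validate_preserved_map(const struct model_radix_node *root)
{
	if (!model_root_has_entries(root))
		return -ENOENT;
	return 0;
}
```

The caller would then take the same "goto out" path as the removed mem_map_phys check when this returns an error.
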
> scratch = early_memremap(scratch_phys, scratch_len);
> if (!scratch) {
> pr_warn("setup: failed to memremap scratch (phys=0x%llx, len=%lld)\n",
> @@ -1512,7 +1529,6 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
>
> kho_in.fdt_phys = fdt_phys;
> kho_in.scratch_phys = scratch_phys;
> - kho_in.mem_map_phys = mem_map_phys;
> kho_scratch_cnt = scratch_cnt;
> pr_info("found kexec handover data.\n");
>
> --
> 2.52.0.457.g6b5491de43-goog
>
--
Sincerely yours,
Mike.