From: Pratyush Yadav <pratyush@kernel.org>
To: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Alexander Graf <graf@amazon.com>, Baoquan He <bhe@redhat.com>,
Changyuan Lyu <changyuanl@google.com>,
Chris Li <chrisl@kernel.org>, Jason Gunthorpe <jgg@nvidia.com>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Pratyush Yadav <pratyush@kernel.org>,
kexec@lists.infradead.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 2/4] kho: replace kho_preserve_phys() with kho_preserve_pages()
Date: Thu, 18 Sep 2025 12:32:08 +0200 [thread overview]
Message-ID: <mafs05xdggifr.fsf@kernel.org> (raw)
In-Reply-To: <20250917174033.3810435-3-rppt@kernel.org>
Hi Mike,
On Wed, Sep 17 2025, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>
> to make it clear that KHO operates on pages rather than on a random
> physical address.
>
> The kho_preserve_pages() will be also used in upcoming support for
> vmalloc preservation.
>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---
> include/linux/kexec_handover.h | 5 +++--
> kernel/kexec_handover.c | 25 +++++++++++--------------
> mm/memblock.c | 4 +++-
> 3 files changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h
> index 348844cffb13..cc5c49b0612b 100644
> --- a/include/linux/kexec_handover.h
> +++ b/include/linux/kexec_handover.h
> @@ -18,6 +18,7 @@ enum kho_event {
>
> struct folio;
> struct notifier_block;
> +struct page;
>
> #define DECLARE_KHOSER_PTR(name, type) \
> union { \
> @@ -42,7 +43,7 @@ struct kho_serialization;
> bool kho_is_enabled(void);
>
> int kho_preserve_folio(struct folio *folio);
> -int kho_preserve_phys(phys_addr_t phys, size_t size);
> +int kho_preserve_pages(struct page *page, unsigned int nr_pages);
> struct folio *kho_restore_folio(phys_addr_t phys);
> int kho_add_subtree(struct kho_serialization *ser, const char *name, void *fdt);
> int kho_retrieve_subtree(const char *name, phys_addr_t *phys);
> @@ -65,7 +66,7 @@ static inline int kho_preserve_folio(struct folio *folio)
> return -EOPNOTSUPP;
> }
>
> -static inline int kho_preserve_phys(phys_addr_t phys, size_t size)
> +static inline int kho_preserve_pages(struct page *page, unsigned int nr_pages)
> {
> return -EOPNOTSUPP;
> }
> diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
> index f421acc58c1f..3ad59c5f9eaa 100644
> --- a/kernel/kexec_handover.c
> +++ b/kernel/kexec_handover.c
> @@ -698,26 +698,23 @@ int kho_preserve_folio(struct folio *folio)
> EXPORT_SYMBOL_GPL(kho_preserve_folio);
>
> /**
> - * kho_preserve_phys - preserve a physically contiguous range across kexec.
> - * @phys: physical address of the range.
> - * @size: size of the range.
> + * kho_preserve_pages - preserve contiguous pages across kexec
> + * @page: first page in the list.
> + * @nr_pages: number of pages.
> *
> - * Instructs KHO to preserve the memory range from @phys to @phys + @size
> - * across kexec.
> + * Preserve a contiguous list of order 0 pages. Must be restored using
> + * kho_restore_page() on each order 0 page.
This is not true. The pages are preserved with the maximum order
possible.
	while (pfn < end_pfn) {
		const unsigned int order =
			min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn));

		err = __kho_preserve_order(track, pfn, order);
	[...]
So four contiguous, aligned order-0 pages will be preserved as a single
order-2 page. Restoring them as four order-0 pages is wrong, and my
proposed patch that checks the magic [0] will uncover exactly this bug.

I think you should either change the logic to always preserve at order
0, or add a kho_restore_pages() that replicates the same order
calculation.
[0] https://lore.kernel.org/lkml/20250917125725.665-2-pratyush@kernel.org/
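For illustration, here is a quick user-space model of the order
calculation above (my own sketch, not kernel code: min(),
count_trailing_zeros() and ilog2() stand in as GCC builtins, and
chunk_orders() is a hypothetical name). It shows why any restore path
has to walk the range the same way the preserve path did:

```c
/*
 * User-space model of how kho_preserve_pages() splits a pfn range into
 * chunks of the highest order that both the pfn alignment and the
 * remaining length allow. A kho_restore_pages() would have to produce
 * the same sequence of (pfn, order) pairs to find the per-chunk
 * metadata. Assumes start_pfn != 0 (ctz is undefined for 0).
 */
static unsigned int chunk_orders(unsigned long start_pfn,
				 unsigned long nr_pages,
				 unsigned long *pfns, unsigned int *orders)
{
	unsigned long pfn = start_pfn;
	unsigned long end_pfn = start_pfn + nr_pages;
	unsigned int n = 0;

	while (pfn < end_pfn) {
		/* count_trailing_zeros(pfn): alignment limit */
		unsigned int align = __builtin_ctzl(pfn);
		/* ilog2(end_pfn - pfn): remaining-length limit */
		unsigned int len = 63 - __builtin_clzl(end_pfn - pfn);
		unsigned int order = align < len ? align : len;

		pfns[n] = pfn;
		orders[n] = order;
		n++;
		pfn += 1UL << order;
	}
	return n;
}
```

With start_pfn 0x100 and nr_pages 4 this produces one order-2 chunk,
so restoring pfns 0x100..0x103 as four order-0 pages looks at metadata
that was never written for the three tail pfns. An unaligned start such
as 0x101 instead splits into order 0, 1, 0 chunks.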
> *
> * Return: 0 on success, error code on failure
> */
> -int kho_preserve_phys(phys_addr_t phys, size_t size)
> +int kho_preserve_pages(struct page *page, unsigned int nr_pages)
> {
> - unsigned long pfn = PHYS_PFN(phys);
> + struct kho_mem_track *track = &kho_out.ser.track;
> + const unsigned long start_pfn = page_to_pfn(page);
> + const unsigned long end_pfn = start_pfn + nr_pages;
> + unsigned long pfn = start_pfn;
> unsigned long failed_pfn = 0;
> - const unsigned long start_pfn = pfn;
> - const unsigned long end_pfn = PHYS_PFN(phys + size);
> int err = 0;
> - struct kho_mem_track *track = &kho_out.ser.track;
> -
> - if (!PAGE_ALIGNED(phys) || !PAGE_ALIGNED(size))
> - return -EINVAL;
>
> while (pfn < end_pfn) {
> const unsigned int order =
> @@ -737,7 +734,7 @@ int kho_preserve_phys(phys_addr_t phys, size_t size)
>
> return err;
> }
> -EXPORT_SYMBOL_GPL(kho_preserve_phys);
> +EXPORT_SYMBOL_GPL(kho_preserve_pages);
>
> /* Handling for debug/kho/out */
>
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 117d963e677c..6ec3eaa4e8d1 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -2516,8 +2516,10 @@ static int reserve_mem_kho_finalize(struct kho_serialization *ser)
>
> for (i = 0; i < reserved_mem_count; i++) {
> struct reserve_mem_table *map = &reserved_mem_table[i];
> + struct page *page = phys_to_page(map->start);
> + unsigned int nr_pages = map->size >> PAGE_SHIFT;
>
> - err |= kho_preserve_phys(map->start, map->size);
> + err |= kho_preserve_pages(page, nr_pages);
Unrelated to this patch, but since there is no
kho_restore_{phys,pages}(), won't the reserve_mem memory end up with
uninitialized struct pages, since preserved pages are
memblock_reserved_mark_noinit()?
That would also be a case for kho_restore_pages() I suppose?
--
Regards,
Pratyush Yadav