linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Pratyush Yadav <ptyadav@amazon.de>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Changyuan Lyu <changyuanl@google.com>,
	<linux-kernel@vger.kernel.org>, <graf@amazon.com>,
	<akpm@linux-foundation.org>, <luto@kernel.org>,
	<anthony.yznaga@oracle.com>, <arnd@arndb.de>,
	<ashish.kalra@amd.com>, <benh@kernel.crashing.org>,
	<bp@alien8.de>, <catalin.marinas@arm.com>,
	<dave.hansen@linux.intel.com>, <dwmw2@infradead.org>,
	<ebiederm@xmission.com>, <mingo@redhat.com>, <jgowans@amazon.com>,
	<corbet@lwn.net>, <krzk@kernel.org>, <rppt@kernel.org>,
	<mark.rutland@arm.com>, <pbonzini@redhat.com>,
	<pasha.tatashin@soleen.com>, <hpa@zytor.com>,
	<peterz@infradead.org>, <robh+dt@kernel.org>, <robh@kernel.org>,
	<saravanak@google.com>, <skinsburskii@linux.microsoft.com>,
	<rostedt@goodmis.org>, <tglx@linutronix.de>,
	<thomas.lendacky@amd.com>, <usama.arif@bytedance.com>,
	<will@kernel.org>, <devicetree@vger.kernel.org>,
	<kexec@lists.infradead.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-doc@vger.kernel.org>, <linux-mm@kvack.org>,
	<x86@kernel.org>
Subject: Re: [PATCH v5 09/16] kexec: enable KHO support for memory preservation
Date: Fri, 4 Apr 2025 15:39:15 +0000	[thread overview]
Message-ID: <mafs0h634os0s.fsf@amazon.de> (raw)
In-Reply-To: <20250404125421.GI342109@nvidia.com>

On Fri, Apr 04 2025, Jason Gunthorpe wrote:

> On Thu, Apr 03, 2025 at 05:37:06PM +0000, Pratyush Yadav wrote:
>
[...]
>> > This should be more like:
>> >
>> > union {
>> >       void *table;
>> >       phys_addr_t table_phys;
>> > };
>> >
>> > Since we are not using the low bits right now and it is alot cheaper
>> > to convert from va to phys only once during the final step. __va is
>> > not exactly fast.
>> 
>> The descriptor is used on _every_ level of the table, not just the
>> top.
>
> Yes
>
>> So if we use virtual addresses, at serialize time we would have to walk
>> the whole table and covert all addresses to physical.
>
> Yes
>
>> And __va() does
>> not seem to be doing too much. On x86, it expands to:
>> 
>>     #define __va(x)			((void *)((unsigned long)(x)+PAGE_OFFSET))
>> 
>> and on ARM64 to:
>> 
>>     #define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
>>     #define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET) | PAGE_OFFSET)
>
> Hmm, I was sure sparsemem added a bunch of stuff to this path, maybe
> I'm thinking of page_to_phys

Yep, page_to_phys for sparsemem is somewhat expensive, but __va() seems
to be fine.

    #define __page_to_pfn(pg)					\
    ({	const struct page *__pg = (pg);				\
            int __sec = page_to_section(__pg);			\
            (unsigned long)(__pg - __section_mem_map_addr(__nr_to_section(__sec)));	\
    })

>
>> >> +struct kho_mem_track {
>> >> +	/* Points to L4 KHOMEM descriptor, each order gets its own table. */
>> >> +	struct xarray orders;
>> >> +};
>> >
>> > I think it would be easy to add a 5th level and just use bits 63:57 as
>> > a 6 bit order. Then you don't need all this stuff either.
>> 
>> I am guessing you mean to store the order in the table descriptor
>> itself, instead of having a different table for each order.
>
> Not quite, I mean to index the per-order sub trees by using the high
> order bits. You still end up with N seperate bitmap trees, but instead
> of using an xarray to hold their top pointers you hold them in a 5th
> level.
>
>> Though now that I think of it, it is probably much simpler to just use
>> khomem_desc_t orders[NR_PAGE_ORDERS] instead of the xarray.
>
> Which is basically this, but encoding the index to orders in the address

I think this is way easier to wrap your head around compared to trying
to encode orders in address. That doesn't even work that well since
address has pretty much no connection to order. So a page at address say
0x100000 might be any order. So we would have to encode the order into
the address, and then decode it when doing table operations. Just having
a separate table indexed separately by orders is way simpler.

-- 
Regards,
Pratyush Yadav


  reply	other threads:[~2025-04-04 15:39 UTC|newest]

Thread overview: 103+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-20  1:55 [PATCH v5 00/16] kexec: introduce Kexec HandOver (KHO) Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 01/16] kexec: define functions to map and unmap segments Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 02/16] mm/mm_init: rename init_reserved_page to init_deferred_page Changyuan Lyu
2025-03-20  7:10   ` Krzysztof Kozlowski
2025-03-20 17:15     ` Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 03/16] memblock: add MEMBLOCK_RSRV_KERN flag Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 04/16] memblock: Add support for scratch memory Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 05/16] memblock: introduce memmap_init_kho_scratch() Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 06/16] hashtable: add macro HASHTABLE_INIT Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 07/16] kexec: add Kexec HandOver (KHO) generation helpers Changyuan Lyu
2025-03-21 13:34   ` Jason Gunthorpe
2025-03-23 19:02     ` Changyuan Lyu
2025-03-24 16:28       ` Jason Gunthorpe
2025-03-25  0:21         ` Changyuan Lyu
2025-03-25  2:20           ` Jason Gunthorpe
2025-03-24 18:40   ` Frank van der Linden
2025-03-25 19:19     ` Mike Rapoport
2025-03-25 21:56       ` Frank van der Linden
2025-03-26 11:59         ` Mike Rapoport
2025-03-26 16:25           ` Frank van der Linden
2025-03-20  1:55 ` [PATCH v5 08/16] kexec: add KHO parsing support Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 09/16] kexec: enable KHO support for memory preservation Changyuan Lyu
2025-03-21 13:46   ` Jason Gunthorpe
2025-03-22 19:12     ` Mike Rapoport
2025-03-23 18:55       ` Jason Gunthorpe
2025-03-24 18:18         ` Mike Rapoport
2025-03-24 20:07           ` Jason Gunthorpe
2025-03-26 12:07             ` Mike Rapoport
2025-03-23 19:07     ` Changyuan Lyu
2025-03-25  2:04       ` Jason Gunthorpe
2025-03-27 10:03   ` Pratyush Yadav
2025-03-27 13:31     ` Jason Gunthorpe
2025-03-27 17:28       ` Pratyush Yadav
2025-03-28 12:53         ` Jason Gunthorpe
2025-04-02 16:44         ` Changyuan Lyu
2025-04-02 16:47           ` Pratyush Yadav
2025-04-02 18:37             ` Pasha Tatashin
2025-04-02 18:49               ` Pratyush Yadav
2025-04-02 19:16   ` Pratyush Yadav
2025-04-03 11:42     ` Jason Gunthorpe
2025-04-03 13:58       ` Mike Rapoport
2025-04-03 14:24         ` Jason Gunthorpe
2025-04-04  9:54           ` Mike Rapoport
2025-04-04 12:47             ` Jason Gunthorpe
2025-04-04 13:53               ` Mike Rapoport
2025-04-04 14:30                 ` Jason Gunthorpe
2025-04-04 16:24                   ` Pratyush Yadav
2025-04-04 17:31                     ` Jason Gunthorpe
2025-04-06 16:13                     ` Mike Rapoport
2025-04-06 16:11                   ` Mike Rapoport
2025-04-07 14:16                     ` Jason Gunthorpe
2025-04-07 16:31                       ` Mike Rapoport
2025-04-07 17:03                         ` Jason Gunthorpe
2025-04-09  9:06                           ` Mike Rapoport
2025-04-09 12:56                             ` Jason Gunthorpe
2025-04-09 13:58                               ` Mike Rapoport
2025-04-09 15:37                                 ` Jason Gunthorpe
2025-04-09 16:19                                   ` Mike Rapoport
2025-04-09 16:28                                     ` Jason Gunthorpe
2025-04-10 16:51                                       ` Matthew Wilcox
2025-04-10 17:31                                         ` Jason Gunthorpe
2025-04-09 16:28                       ` Mike Rapoport
2025-04-09 18:32                         ` Jason Gunthorpe
2025-04-04 16:15                 ` Pratyush Yadav
2025-04-06 16:34                   ` Mike Rapoport
2025-04-07 14:23                     ` Jason Gunthorpe
2025-04-03 13:57     ` Mike Rapoport
2025-04-11  4:02     ` Changyuan Lyu
2025-04-03 15:50   ` Pratyush Yadav
2025-04-03 16:10     ` Jason Gunthorpe
2025-04-03 17:37       ` Pratyush Yadav
2025-04-04 12:54         ` Jason Gunthorpe
2025-04-04 15:39           ` Pratyush Yadav [this message]
2025-04-09  8:35       ` Mike Rapoport
2025-03-20  1:55 ` [PATCH v5 10/16] kexec: add KHO support to kexec file loads Changyuan Lyu
2025-03-21 13:48   ` Jason Gunthorpe
2025-03-20  1:55 ` [PATCH v5 11/16] kexec: add config option for KHO Changyuan Lyu
2025-03-20  7:10   ` Krzysztof Kozlowski
2025-03-20 17:18     ` Changyuan Lyu
2025-03-24  4:18   ` Dave Young
2025-03-24 19:26     ` Pasha Tatashin
2025-03-25  1:24       ` Dave Young
2025-03-25  3:07         ` Dave Young
2025-03-25  6:57     ` Baoquan He
2025-03-25  8:36       ` Dave Young
2025-03-26  9:17         ` Dave Young
2025-03-26 11:28           ` Mike Rapoport
2025-03-26 12:09             ` Dave Young
2025-03-25 14:04       ` Pasha Tatashin
2025-03-20  1:55 ` [PATCH v5 12/16] arm64: add KHO support Changyuan Lyu
2025-03-20  7:13   ` Krzysztof Kozlowski
2025-03-20  8:30     ` Krzysztof Kozlowski
2025-03-20 23:29     ` Changyuan Lyu
2025-04-11  3:47   ` Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 13/16] x86/setup: use memblock_reserve_kern for memory used by kernel Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 14/16] x86: add KHO support Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 15/16] memblock: add KHO support for reserve_mem Changyuan Lyu
2025-03-20  1:55 ` [PATCH v5 16/16] Documentation: add documentation for KHO Changyuan Lyu
2025-03-20 14:45   ` Jonathan Corbet
2025-03-21  6:33     ` Changyuan Lyu
2025-03-21 13:46       ` Jonathan Corbet
2025-03-25 14:19 ` [PATCH v5 00/16] kexec: introduce Kexec HandOver (KHO) Pasha Tatashin
2025-03-25 15:03   ` Mike Rapoport

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mafs0h634os0s.fsf@amazon.de \
    --to=ptyadav@amazon.de \
    --cc=akpm@linux-foundation.org \
    --cc=anthony.yznaga@oracle.com \
    --cc=arnd@arndb.de \
    --cc=ashish.kalra@amd.com \
    --cc=benh@kernel.crashing.org \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=changyuanl@google.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=devicetree@vger.kernel.org \
    --cc=dwmw2@infradead.org \
    --cc=ebiederm@xmission.com \
    --cc=graf@amazon.com \
    --cc=hpa@zytor.com \
    --cc=jgg@nvidia.com \
    --cc=jgowans@amazon.com \
    --cc=kexec@lists.infradead.org \
    --cc=krzk@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=robh+dt@kernel.org \
    --cc=robh@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=rppt@kernel.org \
    --cc=saravanak@google.com \
    --cc=skinsburskii@linux.microsoft.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=usama.arif@bytedance.com \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox