From: Vivian Wang <wangruikang@iscas.ac.cn>
To: "Thomas Weißschuh" <thomas.weissschuh@linutronix.de>,
	"Paul Walmsley" <pjw@kernel.org>,
	"Palmer Dabbelt" <palmer@dabbelt.com>,
	"Albert Ou" <aou@eecs.berkeley.edu>,
	"Mike Rapoport" <rppt@kernel.org>
Cc: Alexandre Ghiti <alex@ghiti.fr>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	linux-mm@kvack.org
Subject: Re: [BUG] SPARSEMEM broken on RISC-V; was: [PATCH] arch, mm: consolidate initialization of SPARSE memory model
Date: Tue, 10 Mar 2026 12:04:22 +0800
Message-ID: <482adca2-4755-4e86-8488-fa2b1a02f0b5@iscas.ac.cn>
In-Reply-To: <20260309082841-9c542a85-073d-4d08-8b8e-a56621a13c91@linutronix.de>

On 3/9/26 15:34, Thomas Weißschuh wrote:
> Hi RISC-V maintainers,
>
> SPARSEMEM on RISC-V is currently broken in mainline.
> Could you take a look at my report and the suggestions from Mike below?

Not a riscv maintainer, just happened to be taking a look at 32-bit
RISC-V lately...

Fortunately this is a 32-bit only thing. On 64-bit we enable vmemmap,
which means that phys_to_page is also just arithmetic. Almost all (or
just all?) commercially available riscv w/ MMU stuff are 64-bit.
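
To make the difference concrete, here is a rough userspace sketch (not
kernel code). Every name and constant in it is a simplified stand-in
that I made up for illustration. It models why a classic SPARSEMEM
lookup depends on per-section state that sparse_init() fills in, while a
vmemmap lookup is plain pointer arithmetic:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the kernel's real definitions. */
#define MODEL_PAGE_SHIFT        12
#define MODEL_SECTION_SHIFT     27      /* 128 MiB sections, assumed */
#define MODEL_PFN_SECTION_SHIFT (MODEL_SECTION_SHIFT - MODEL_PAGE_SHIFT)
#define MODEL_NR_SECTIONS       32      /* covers 4 GiB of physical space */

struct page { unsigned long flags; };

/* Classic SPARSEMEM model: one mem_map per section, set up at init time. */
struct model_section { struct page *mem_map; };
static struct model_section model_sections[MODEL_NR_SECTIONS];

/* Returns NULL while the section is not initialized yet. */
static struct page *model_sparse_pfn_to_page(unsigned long pfn)
{
	unsigned long idx = pfn >> MODEL_PFN_SECTION_SHIFT;
	unsigned long off = pfn & ((1UL << MODEL_PFN_SECTION_SHIFT) - 1);

	assert(idx < MODEL_NR_SECTIONS);
	if (!model_sections[idx].mem_map)
		return NULL;	/* the real lookup has no such safety net */
	return model_sections[idx].mem_map + off;
}

/* vmemmap model: a fixed base pointer, so the lookup is pure arithmetic. */
static struct page *model_vmemmap_pfn_to_page(struct page *vmemmap_base,
					      unsigned long pfn)
{
	return vmemmap_base + pfn;
}
```

In this model, model_sparse_pfn_to_page() only returns something usable
after the section's mem_map has been filled in, which is the ordering
dependency that bites on rv32.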

I guess RV32 needs some more love?

> On Mon, Feb 23, 2026 at 09:40:59PM +0200, Mike Rapoport wrote:
>> On Mon, Feb 23, 2026 at 02:52:45PM +0100, Thomas Weißschuh wrote:
>>> On Sun, Jan 11, 2026 at 10:20:58AM +0200, Mike Rapoport wrote:
>>>> Every architecture calls sparse_init() during setup_arch() although the
>>>> data structures created by sparse_init() are not used until the
>>>> initialization of the core MM.
>>>>
>>>> Beside the code duplication, calling sparse_init() from architecture
>>>> specific code causes ordering differences of vmemmap and HVO initialization
>>>> on different architectures.
>>>>
>>>> Move the call to sparse_init() from architecture specific code to
>>>> free_area_init() to ensure that vmemmap and HVO initialization order is
>>>> always the same.
>>> This broke the boot on RISC-V 32-bit (rv32_defconfig) for me.
>>>
>>> Specifically if sparse_init() is *not* called before the following callchain,
>>> the kernel dies at that point.
>>>
>>> start_kernel()
>>>   setup_arch()
>>>     apply_boot_alternatives()
>>>       _apply_alternatives()
>>>         riscv_cpufeature_patch_func()
>>>           patch_text_nosync()
>>>           riscv_alternative_fix_offsets()
>> Hm, most architectures do alternatives patching much later in the boot,
>> when many more subsystems (including mm) are already initialized.
>>
>> Any particular reason riscv does it that early? 

Theoretically, it could be moved by making sure more things don't use
alternatives on boot.

In practice, a bunch of random things call the alternatives-using
versions of the macros that check for extensions, so I'd say (again,
not a maintainer, just taking a look) I'd rather have the patching be
as early as possible, to minimize the minefield where random functions
give the wrong answer. (Since those are all supposed to generate only a
single nop or jump, WARN is not a good option.) Just looking at callees
of setup_arch() in arch/riscv/kernel/setup.c, I see:

  * init_rt_signal_env() -> get_rt_frame_size() -> has_vector() ->
    riscv_has_extension_unlikely()
  * init_rt_signal_env() -> get_rt_frame_size() -> has_xtheadvector() ->
    riscv_has_vendor_extension_unlikely()
  * riscv_user_isa_enable() -> riscv_has_extension_unlikely()

And it's probably only going to get worse as more extensions come that
have architectural state.
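
As an illustration of the "wrong answer" problem, here is a small
userspace model (not kernel code; all names and sizes are hypothetical).
It assumes, as I understand the riscv macros, that an unpatched
"unlikely" check site is a nop that falls through to the "not present"
path until the alternatives have been applied:

```c
#include <stdbool.h>

/* Hypothetical userspace model; names are illustrative, not kernel APIs. */
static bool model_hw_has_vector;	/* what the ISA probe found */
static bool model_site_patched;	/* models the nop -> jump rewrite */

/* Models an "unlikely" check: the unpatched nop falls through to "no". */
static bool model_has_extension_unlikely(void)
{
	return model_site_patched;
}

/* Models boot-time patching: rewrite the check site from probe results. */
static void model_apply_boot_alternatives(void)
{
	model_site_patched = model_hw_has_vector;
}

/* Models a caller that sizes per-extension state, like a signal frame. */
static unsigned int model_frame_size(void)
{
	unsigned int size = 128;	/* arbitrary base size */

	if (model_has_extension_unlikely())
		size += 512;		/* arbitrary vector state size */
	return size;
}
```

With model_hw_has_vector set, model_frame_size() gives the smaller
value if it runs before model_apply_boot_alternatives() and the larger
one afterwards, with no warning either way.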

>>> Simple reproducer, using kunit:
>>>
>>> ./tools/testing/kunit/kunit.py run --raw_output=all --make_options LLVM=1 --arch riscv32 --kconfig_add CONFIG_SPARSEMEM_MANUAL=y --kconfig_add CONFIG_SPARSEMEM=y
>> Looking at patch_map it's quite clear why movement of sparse_init() caused a
>> crash:
>>
>> 	if (core_kernel_text(uintaddr) || is_kernel_exittext(uintaddr))
>> 		page = phys_to_page(__pa_symbol(addr));
>>
>> phys_to_page() with CONFIG_SPARSEMEM=y will try to access memory sections
>> that are initialized in sparse_init().
>>
>> What I don't understand is why patch_map() needs a struct page for kernel
>> text patching at all, __pa_symbol() should work just fine.
>> And the BUG_ON(!page) is completely bogus for phys_to_page() conversion,
>> because that one is pure arithmetic.

"Copied from arm64" was the reason, I believe. See code pre/post commit
8d09e2d569f6 ("arm64: patching: avoid early page_to_phys()") where the
exact same idea as yours was applied to arch/arm64/kernel/patching.c
patch_map(). Evidently riscv needs the same fix.

>> If moving apply_boot_alternatives() is not an option for riscv, something
>> like the patch below should fix the issue with access to nonexistent
>> memory sections. But I think moving apply_boot_alternatives() later in boot
>> would make things less fragile.
>>
>> diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
>> index db13c9ddf9e3..89b3c13f2865 100644
>> --- a/arch/riscv/kernel/patch.c
>> +++ b/arch/riscv/kernel/patch.c
>> @@ -43,18 +43,19 @@ static __always_inline void *patch_map(void *addr, const unsigned int fixmap)
>>  {
>>  	uintptr_t uintaddr = (uintptr_t) addr;
>>  	struct page *page;
>> +	phys_addr_t phys;
>>  
>> -	if (core_kernel_text(uintaddr) || is_kernel_exittext(uintaddr))
>> -		page = phys_to_page(__pa_symbol(addr));
>> -	else if (IS_ENABLED(CONFIG_STRICT_MODULE_RWX))
>> +	if (core_kernel_text(uintaddr) || is_kernel_exittext(uintaddr)) {
>> +		phys = __pa_symbol(addr);
>> +	} else if (IS_ENABLED(CONFIG_STRICT_MODULE_RWX)) {
>>  		page = vmalloc_to_page(addr);
>> -	else
>> +		BUG_ON(!page);
>> +		phys = page_to_phys(page);
>> +	} else {
>>  		return addr;
>> +	}
>>  
>> -	BUG_ON(!page);
>> -
>> -	return (void *)set_fixmap_offset(fixmap, page_to_phys(page) +
>> -					 offset_in_page(addr));
>> +	return (void *)set_fixmap_offset(fixmap, phys + offset_in_page(addr));
>>  }
>>  
>>  static void patch_unmap(int fixmap)

The __pa_symbol(addr) case looks wrong: __pa_symbol(addr) already
includes the offset within the page, and the patch then adds
offset_in_page(addr) on top of it. The version below matches what arm64
did and should look pretty similar to commit 8d09e2d569f6 ("arm64:
patching: avoid early page_to_phys()"):

diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
--- a/arch/riscv/kernel/patch.c
+++ b/arch/riscv/kernel/patch.c
@@ -42,19 +42,19 @@ static inline bool is_kernel_exittext(uintptr_t addr)
 static __always_inline void *patch_map(void *addr, const unsigned int fixmap)
 {
 	uintptr_t uintaddr = (uintptr_t) addr;
-	struct page *page;
-
-	if (core_kernel_text(uintaddr) || is_kernel_exittext(uintaddr))
-		page = phys_to_page(__pa_symbol(addr));
-	else if (IS_ENABLED(CONFIG_STRICT_MODULE_RWX))
-		page = vmalloc_to_page(addr);
-	else
+	phys_addr_t phys;
+
+	if (core_kernel_text(uintaddr) || is_kernel_exittext(uintaddr)) {
+		phys = __pa_symbol(addr);
+	} else if (IS_ENABLED(CONFIG_STRICT_MODULE_RWX)) {
+		struct page *page = vmalloc_to_page(addr);
+		BUG_ON(!page);
+		phys = page_to_phys(page) + offset_in_page(addr);
+	} else {
 		return addr;
+	}
 
-	BUG_ON(!page);
-
-	return (void *)set_fixmap_offset(fixmap, page_to_phys(page) +
-					 offset_in_page(addr));
+	return (void *)set_fixmap_offset(fixmap, phys);
 }
 
 static void patch_unmap(int fixmap)

I can confirm this fixes the riscv32 boot oops on QEMU. I can send it
out as a proper patch later, pending making checkpatch.pl happy/happier
and testing.
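
To spell out the double-offset concern with numbers, here is a
standalone arithmetic sketch (userspace, not kernel code). The
model_* helpers and the VA/PA offset are hypothetical stand-ins for
__pa_symbol(), offset_in_page() and the page round trip. It only
compares the physical address that each variant would hand to
set_fixmap_offset():

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE	4096UL
#define MODEL_PAGE_MASK	(~(MODEL_PAGE_SIZE - 1))
/* Hypothetical constant VA-to-PA offset standing in for __pa_symbol(). */
#define MODEL_VA_PA_OFFSET	0x40000000UL

static uintptr_t model_pa_symbol(uintptr_t va)
{
	return va - MODEL_VA_PA_OFFSET;	/* keeps the in-page offset */
}

static uintptr_t model_offset_in_page(uintptr_t va)
{
	return va & ~MODEL_PAGE_MASK;
}

/* Going through a struct page and back drops the in-page offset. */
static uintptr_t model_page_phys(uintptr_t pa)
{
	return pa & MODEL_PAGE_MASK;
}

/* Current code: page-aligned address plus offset. */
static uintptr_t model_phys_via_page(uintptr_t va)
{
	return model_page_phys(model_pa_symbol(va)) + model_offset_in_page(va);
}

/* First proposal: full address plus offset, so the offset counts twice. */
static uintptr_t model_phys_double_offset(uintptr_t va)
{
	return model_pa_symbol(va) + model_offset_in_page(va);
}

/* arm64-style: the full address already is the answer. */
static uintptr_t model_phys_direct(uintptr_t va)
{
	return model_pa_symbol(va);
}
```

For va = 0xC0001234, the first and third variants both give 0x80001234,
while the double-offset variant gives 0x80001468.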

For reference this is arm64:

static void __kprobes *patch_map(void *addr, int fixmap)
{
	phys_addr_t phys;

	if (is_image_text((unsigned long)addr)) {
		phys = __pa_symbol(addr);
	} else {
		struct page *page = vmalloc_to_page(addr);
		BUG_ON(!page);
		phys = page_to_phys(page) + offset_in_page(addr);
	}

	return (void *)set_fixmap_offset(fixmap, phys);
}




