From: Mike Rapoport <rppt@kernel.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Alex Shi <alexs@kernel.org>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Andreas Larsson <andreas@gaisler.com>,
Borislav Petkov <bp@alien8.de>, Brian Cain <bcain@kernel.org>,
"Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
Catalin Marinas <catalin.marinas@arm.com>,
"David S. Miller" <davem@davemloft.net>,
Dave Hansen <dave.hansen@linux.intel.com>,
David Hildenbrand <david@kernel.org>,
Dinh Nguyen <dinguyen@kernel.org>,
Geert Uytterhoeven <geert@linux-m68k.org>,
Guo Ren <guoren@kernel.org>, Heiko Carstens <hca@linux.ibm.com>,
Helge Deller <deller@gmx.de>, Huacai Chen <chenhuacai@kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Johannes Berg <johannes@sipsolutions.net>,
John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>,
Jonathan Corbet <corbet@lwn.net>,
Klara Modin <klarasmodin@gmail.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Magnus Lindholm <linmag7@gmail.com>,
Matt Turner <mattst88@gmail.com>,
Max Filippov <jcmvbkbc@gmail.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Michal Hocko <mhocko@suse.com>, Michal Simek <monstr@monstr.eu>,
Muchun Song <muchun.song@linux.dev>,
Oscar Salvador <osalvador@suse.de>,
Palmer Dabbelt <palmer@dabbelt.com>,
Pratyush Yadav <pratyush@kernel.org>,
Richard Weinberger <richard@nod.at>,
Ritesh Harjani <ritesh.list@gmail.com>,
Russell King <linux@armlinux.org.uk>,
Stafford Horne <shorne@gmail.com>,
Suren Baghdasaryan <surenb@google.com>,
Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
Thomas Gleixner <tglx@linutronix.de>,
Vasily Gorbik <gor@linux.ibm.com>,
Vineet Gupta <vgupta@kernel.org>, Will Deacon <will@kernel.org>,
x86@kernel.org, linux-alpha@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org,
linux-cxl@vger.kernel.org, linux-doc@vger.kernel.org,
linux-hexagon@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org,
linux-mm@kvack.org, linux-openrisc@vger.kernel.org,
linux-parisc@vger.kernel.org, linux-riscv@lists.infradead.org,
linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
linux-snps-arc@lists.infradead.org, linux-um@lists.infradead.org,
linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev,
sparclinux@vger.kernel.org
Subject: Re: [PATCH v3 23/29] arch, mm: consolidate initialization of nodes, zones and memory map
Date: Fri, 27 Feb 2026 22:31:41 +0200
Message-ID: <aaH_LVnl8FlERA_r@kernel.org>
In-Reply-To: <b9527ed4-7a5c-42e9-8814-b276b3741f63@suse.cz>
Hi Vlastimil,
On Fri, Feb 27, 2026 at 04:14:42PM +0100, Vlastimil Babka wrote:
> On 1/11/26 09:20, Mike Rapoport wrote:
> > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> >
> > To initialize node, zone and memory map data structures every architecture
> > calls free_area_init() during setup_arch() and passes it an array of zone
> > limits.
> >
> > Besides code duplication, it creates "interesting" ordering cases between
> > allocation and initialization of hugetlb and the memory map. Some
> > architectures allocate hugetlb pages very early in setup_arch() in certain
> > cases, some only create hugetlb CMA areas in setup_arch() and sometimes
> > hugetlb allocations happen in mm_core_init().
> >
> > With arch_zone_limits_init() helper available now on all architectures it
> > is no longer necessary to call free_area_init() from architecture setup
> > code. Rather core MM initialization can call arch_zone_limits_init() in a
> > single place.
> >
> > This allows unifying the ordering of hugetlb vs memory map allocation and
> > initialization.
> >
> > Remove the call to free_area_init() from architecture specific code and
> > place it in a new mm_core_init_early() function that is called immediately
> > after setup_arch().
> >
> > After this refactoring it is possible to consolidate hugetlb allocations
> > and eliminate differences in ordering of hugetlb and memory map
> > initialization among different architectures.
> >
> > As the first step of this consolidation move hugetlb_bootmem_alloc() to
> > mm_core_init_early().
> >
> > Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> I've bisected a problem with virtme-ng testing a NUMA memoryless
> node setup (on x86_64) to this patch (commit d49004c5f0c1).
>
> It's executed like this, where node 0 has memory and node 1 only cpus:
>
> vng -vr . -p 8 -m 4G --numa 4G,cpus=0-3 --numa 0,cpus=4-7
>
> This fails to boot due to:
>
> [ 0.095894] BUG: unable to handle page fault for address: 0000000000004620
> [ 0.097196] #PF: supervisor read access in kernel mode
> [ 0.098180] #PF: error_code(0x0000) - not-present page
> [ 0.099155] PGD 0 P4D 0
> [ 0.099641] Oops: Oops: 0000 [#1] SMP NOPTI
> [ 0.100437] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.19.0-rc6-00152-gf206359553c9 #53 PREEMPT
> [ 0.102201] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-2-g4f253b9b-prebuilt.qemu.org 04/01/2014
> [ 0.104313] RIP: 0010:mm_core_init_early+0x263/0x900
> [ 0.105271] Code: 93 ff 72 09 8b 7c 24 30 e8 da 82 00 00 48 63 44 24 30 45 31 db 4c 8b 24 c5 a0 7b 1d 9a 48 89 c3 4c 89 5c 24 50 4c 89 5c 24 58 <41> 83 bc 24 20 46 00 00 00 75 0b 41 83 bc 24 14 47 00 00 00 74 04
> [ 0.108863] RSP: 0000:ffffffff99403e38 EFLAGS: 00010046
> [ 0.109861] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000001
> [ 0.111223] RDX: 0000000000000040 RSI: 0000000000100000 RDI: ffff89597fffae00
> [ 0.112577] RBP: 0000000000000005 R08: 0000000000000000 R09: ffff89597fffa200
> [ 0.113924] R10: 80000000ffffe000 R11: 0000000000000000 R12: 0000000000000000
> [ 0.115294] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [ 0.116656] FS: 0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
> [ 0.118193] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.119283] CR2: 0000000000004620 CR3: 0000000060048000 CR4: 00000000000000b0
> [ 0.120645] Call Trace:
> [ 0.121122] <TASK>
> [ 0.121521] start_kernel+0x5d/0x780
> [ 0.122206] x86_64_start_reservations+0x24/0x30
> [ 0.123079] x86_64_start_kernel+0xd1/0xe0
> [ 0.123860] common_startup_64+0x12c/0x138
> [ 0.124641] </TASK>
> [ 0.125061] Modules linked in:
> [ 0.125646] CR2: 0000000000004620
> [ 0.126279] ---[ end trace 0000000000000000 ]---
> [ 0.127162] RIP: 0010:mm_core_init_early+0x263/0x900
> [ 0.128106] Code: 93 ff 72 09 8b 7c 24 30 e8 da 82 00 00 48 63 44 24 30 45 31 db 4c 8b 24 c5 a0 7b 1d 9a 48 89 c3 4c 89 5c 24 50 4c 89 5c 24 58 <41> 83 bc 24 20 46 00 00 00 75 0b 41 83 bc 24 14 47 00 00 00 74 04
> [ 0.131676] RSP: 0000:ffffffff99403e38 EFLAGS: 00010046
> [ 0.132684] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000001
> [ 0.134033] RDX: 0000000000000040 RSI: 0000000000100000 RDI: ffff89597fffae00
> [ 0.135412] RBP: 0000000000000005 R08: 0000000000000000 R09: ffff89597fffa200
> [ 0.136763] R10: 80000000ffffe000 R11: 0000000000000000 R12: 0000000000000000
> [ 0.138112] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [ 0.139487] FS: 0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
> [ 0.141014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.142094] CR2: 0000000000004620 CR3: 0000000060048000 CR4: 00000000000000b0
> [ 0.143448] Kernel panic - not syncing: Attempted to kill the idle task!
> [ 0.144833] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
>
> > ./scripts/faddr2line vmlinux mm_core_init_early+0x263/0x900
> mm_core_init_early+0x263/0x900:
> free_area_init_node at mm/mm_init.c:1721
> (inlined by) free_area_init at mm/mm_init.c:1902
> (inlined by) mm_core_init_early at mm/mm_init.c:2681
>
> It crashes at WARN_ON(pgdat->nr_zones || pgdat->kswapd_highest_zoneidx);
> because pgdat is NULL.
>
> With some debug printk's I've figured out that in free_area_init()
> we have:
>
> 	if (!node_online(nid))
> 		alloc_offline_node_data(nid);
>
> 	pgdat = NODE_DATA(nid);
> 	free_area_init_node(nid);
>
>
> But node_online() is true so this allocation doesn't happen, and
> pgdat remains NULL.
>
> And node_online() becomes true in init_cpu_to_node():
>
> 	if (!node_online(node))
> 		node_set_online(node);
>
> But without having a pgdat allocated.
>
> I was able to work around this by changing the code in free_area_init() to
>
> 	if (!node_online(nid) || !NODE_DATA(nid))
> 		alloc_offline_node_data(nid);
if (!NODE_DATA(nid)) is enough ...
> But I don't have the bigger picture, and also didn't check yet what exactly
> about this patch results in the failure. Probably ordering of various related
> actions. Thoughts?
... and there's a fix already in the mm-hotfixes-stable:
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-hotfixes-unstable&id=a4ab97e34bb687a2ca63fc70b47e8762e689797f
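To illustrate why checking NODE_DATA(nid) alone covers the memoryless-node case, here is a minimal userspace sketch. It is only a model of the logic discussed above, not the kernel code and not necessarily what the linked fix does: struct pgdat_model, the two arrays and all model_*/init_node_* helpers are hypothetical names standing in for pg_data_t, the node online mask, NODE_DATA(), init_cpu_to_node() and alloc_offline_node_data().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_NODES 4

/* Hypothetical stand-in for the kernel's pg_data_t. */
struct pgdat_model {
	int nr_zones;
};

/* Hypothetical stand-ins for the node online mask and NODE_DATA(). */
static bool node_online_map[MAX_NODES];
static struct pgdat_model *node_data[MAX_NODES];

/* Models init_cpu_to_node(): the node goes online, nothing is allocated. */
static void model_cpu_brings_node_online(int nid)
{
	node_online_map[nid] = true;
}

/* Models alloc_offline_node_data(): allocate zeroed node data. */
static void model_alloc_node_data(int nid)
{
	node_data[nid] = calloc(1, sizeof(*node_data[nid]));
}

/* Check from the report: allocate only if the node is offline. */
static struct pgdat_model *init_node_check_online(int nid)
{
	if (!node_online_map[nid])
		model_alloc_node_data(nid);
	return node_data[nid];	/* NULL for an online node without data */
}

/* Check from the reply: allocate whenever node data is missing. */
static struct pgdat_model *init_node_check_data(int nid)
{
	if (!node_data[nid])
		model_alloc_node_data(nid);
	return node_data[nid];
}
```

In this model, after model_cpu_brings_node_online(1) the first helper returns NULL for node 1, which corresponds to the NULL pgdat dereference in the oops, while the second helper allocates the data regardless of the online state.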
--
Sincerely yours,
Mike.