linux-mm.kvack.org archive mirror
* [PATCH] mm: fix NULL NODE_DATA dereference for memoryless nodes on boot
@ 2026-02-22  5:44 Ming Lei
  2026-02-22 11:21 ` Mike Rapoport
  0 siblings, 1 reply; 3+ messages in thread
From: Ming Lei @ 2026-02-22  5:44 UTC (permalink / raw)
  To: Andrew Morton, linux-mm; +Cc: linux-kernel, Ming Lei, Mike Rapoport (Microsoft)

Commit d49004c5f0c1 ("arch, mm: consolidate initialization of nodes,
zones and memory map") moved free_area_init() from setup_arch() to
mm_core_init_early(), which runs after setup_arch() returns.

This changed the ordering relative to init_cpu_to_node() on x86. Before
the commit, free_area_init() ran during paging_init() (called from
setup_arch()) *before* init_cpu_to_node(). After the commit, it runs
*after* init_cpu_to_node().

On machines with memoryless NUMA nodes (e.g., node 0 has CPUs but no
memory), this causes a NULL pointer dereference:

 1. numa_register_nodes() skips memoryless nodes: no alloc_node_data()
    and no node_set_online() for them.
 2. init_cpu_to_node() sets memoryless nodes online (they have CPUs)
    but does not allocate NODE_DATA.
 3. free_area_init() checks "if (!node_online(nid))" to decide whether
    to call alloc_offline_node_data(). Since the memoryless node is now
    online, the allocation is skipped, leaving NODE_DATA(nid) == NULL.
 4. The immediate "pgdat = NODE_DATA(nid)" dereferences NULL.
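
The sequence above can be reproduced in a small user-space model. This is
only a sketch, not kernel code: the model_* functions, the arrays and
struct pgdat_model are hypothetical stand-ins for numa_register_nodes(),
init_cpu_to_node(), free_area_init() and pg_data_t, and allocation is
replaced by a static array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 4

/* Hypothetical stand-in for pg_data_t. */
struct pgdat_model { int node_id; };

static struct pgdat_model storage[MAX_NODES];
static struct pgdat_model *node_data[MAX_NODES];
static bool node_is_online[MAX_NODES];

/* Models NODE_DATA(nid): may return NULL if nothing was allocated. */
static struct pgdat_model *model_node_data(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return node_data[nid];
}

/* Step 1: only nodes with memory get node data and are set online. */
static void model_register_nodes(const bool has_memory[MAX_NODES])
{
	for (int nid = 0; nid < MAX_NODES; nid++) {
		node_is_online[nid] = false;
		node_data[nid] = NULL;
		if (has_memory[nid]) {
			storage[nid].node_id = nid;
			node_data[nid] = &storage[nid];
			node_is_online[nid] = true;
		}
	}
}

/* Step 2: nodes that own CPUs go online, but no node data is allocated. */
static void model_cpu_to_node(const bool has_cpus[MAX_NODES])
{
	for (int nid = 0; nid < MAX_NODES; nid++)
		if (has_cpus[nid])
			node_is_online[nid] = true;
}

/*
 * Steps 3 and 4 with the old check. Returns how many nodes still have
 * NULL node data at the point where the kernel would dereference it.
 */
static int model_old_free_area_init(void)
{
	int null_nodes = 0;

	for (int nid = 0; nid < MAX_NODES; nid++) {
		if (!node_is_online[nid]) {
			storage[nid].node_id = nid;
			node_data[nid] = &storage[nid];
		}
		if (!model_node_data(nid))
			null_nodes++;
	}
	return null_nodes;
}
```

With node 0 having CPUs but no memory and node 1 having both, calling
model_register_nodes(), model_cpu_to_node() and then
model_old_free_area_init() reports exactly one node, node 0, whose node
data is still NULL.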

The crash happens before console_init(), so no output is visible without
earlyprintk. With earlyprintk enabled, the following panic is observed:

 BUG: unable to handle page fault for address: 000000000002a1e0
 Oops: Oops: 0000 [#1] SMP NOPTI
 RIP: 0010:free_area_init_node+0x3a/0x540
 Call Trace:
  <TASK>
  free_area_init+0x331/0x4e0
  start_kernel+0x69/0x4a0
  x86_64_start_reservations+0x24/0x30
  x86_64_start_kernel+0x125/0x130
  common_startup_64+0x13e/0x148
  </TASK>
 Kernel panic - not syncing: Attempted to kill the idle task!

Fix this by checking "if (!NODE_DATA(nid))" instead of
"if (!node_online(nid))". This directly tests whether the per-node data
structure needs to be allocated, regardless of the node's online status.
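
The difference between the two checks can be illustrated with a
self-contained sketch. Everything here is a hypothetical model: struct
node_state, needs_alloc_old(), needs_alloc_new() and prepare_node() do
not exist in the kernel, and a static array stands in for
alloc_offline_node_data().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 3

/* Hypothetical stand-in for pg_data_t. */
struct pgdat_model { int node_id; };

/* Per-node state as seen on entry to the free_area_init() loop. */
struct node_state {
	bool online;
	struct pgdat_model *data;
};

/* Backing storage that replaces a real allocation in this model. */
static struct pgdat_model backing[MAX_NODES];

/* Old predicate: allocate only when the node is offline. */
static bool needs_alloc_old(const struct node_state *n)
{
	return !n->online;
}

/* New predicate: allocate whenever node data is missing. */
static bool needs_alloc_new(const struct node_state *n)
{
	return n->data == NULL;
}

/*
 * Models one loop iteration: allocate if the predicate asks for it, then
 * report whether node data is safe to dereference afterwards.
 */
static bool prepare_node(struct node_state *nodes, int nid,
			 bool (*needs_alloc)(const struct node_state *))
{
	assert(nid >= 0 && nid < MAX_NODES);

	if (needs_alloc(&nodes[nid])) {
		backing[nid].node_id = nid;
		nodes[nid].data = &backing[nid];
	}
	return nodes[nid].data != NULL;
}
```

For a node that is online without node data, prepare_node() returns
false with needs_alloc_old and true with needs_alloc_new. A node that
already has node data is left untouched by the new predicate, and an
offline node without data is allocated by both.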

Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Fixes: d49004c5f0c1 ("arch, mm: consolidate initialization of nodes, zones and memory map")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 mm/mm_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/mm_init.c b/mm/mm_init.c
index 61d983d23f55..9d63cab36204 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1896,7 +1896,7 @@ static void __init free_area_init(void)
 	for_each_node(nid) {
 		pg_data_t *pgdat;
 
-		if (!node_online(nid))
+		if (!NODE_DATA(nid))
 			alloc_offline_node_data(nid);
 
 		pgdat = NODE_DATA(nid);
-- 
2.52.0




* Re: [PATCH] mm: fix NULL NODE_DATA dereference for memoryless nodes on boot
  2026-02-22  5:44 [PATCH] mm: fix NULL NODE_DATA dereference for memoryless nodes on boot Ming Lei
@ 2026-02-22 11:21 ` Mike Rapoport
  2026-02-22 12:01   ` Ming Lei
  0 siblings, 1 reply; 3+ messages in thread
From: Mike Rapoport @ 2026-02-22 11:21 UTC (permalink / raw)
  To: Ming Lei; +Cc: Andrew Morton, linux-mm, linux-kernel

Hi,

On Sun, Feb 22, 2026 at 01:44:51PM +0800, Ming Lei wrote:
> Commit d49004c5f0c1 ("arch, mm: consolidate initialization of nodes,
> zones and memory map") moved free_area_init() from setup_arch() to
> mm_core_init_early(), which runs after setup_arch() returns.
> 
> This changed the ordering relative to init_cpu_to_node() on x86. Before
> the commit, free_area_init() ran during paging_init() (called from
> setup_arch()) *before* init_cpu_to_node(). After the commit, it runs
> *after* init_cpu_to_node().
> 
> On machines with memoryless NUMA nodes (e.g., node 0 has CPUs but no
> memory), this causes a NULL pointer dereference:
> 
>  1. numa_register_nodes() skips memoryless nodes: no alloc_node_data()
>     and no node_set_online() for them.
>  2. init_cpu_to_node() sets memoryless nodes online (they have CPUs)
>     but does not allocate NODE_DATA.
>  3. free_area_init() checks "if (!node_online(nid))" to decide whether
>     to call alloc_offline_node_data(). Since the memoryless node is now
>     online, the allocation is skipped, leaving NODE_DATA(nid) == NULL.
>  4. The immediate "pgdat = NODE_DATA(nid)" dereferences NULL.
> 
> The crash happens before console_init(), so no output is visible without
> earlyprintk. With earlyprintk enabled, the following panic is observed:
> 
>  BUG: unable to handle page fault for address: 000000000002a1e0
>  Oops: Oops: 0000 [#1] SMP NOPTI
>  RIP: 0010:free_area_init_node+0x3a/0x540
>  Call Trace:
>   <TASK>
>   free_area_init+0x331/0x4e0
>   start_kernel+0x69/0x4a0
>   x86_64_start_reservations+0x24/0x30
>   x86_64_start_kernel+0x125/0x130
>   common_startup_64+0x13e/0x148
>   </TASK>
>  Kernel panic - not syncing: Attempted to kill the idle task!
> 
> Fix this by checking "if (!NODE_DATA(nid))" instead of
> "if (!node_online(nid))". This directly tests whether the per-node data
> structure needs to be allocated, regardless of the node's online status.
 
I believe that this change is fine for !x86 as well, but it deserves a
sentence in the commit log.

> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Fixes: d49004c5f0c1 ("arch, mm: consolidate initialization of nodes, zones and memory map")
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  mm/mm_init.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 61d983d23f55..9d63cab36204 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1896,7 +1896,7 @@ static void __init free_area_init(void)
>  	for_each_node(nid) {
>  		pg_data_t *pgdat;
>  
> -		if (!node_online(nid))
> +		if (!NODE_DATA(nid))
>  			alloc_offline_node_data(nid);

A comment that says that if an architecture didn't allocate node data, we
presume that the node is memoryless and offline would be nice here.

>  
>  		pgdat = NODE_DATA(nid);
> -- 
> 2.52.0
> 

-- 
Sincerely yours,
Mike.



* Re: [PATCH] mm: fix NULL NODE_DATA dereference for memoryless nodes on boot
  2026-02-22 11:21 ` Mike Rapoport
@ 2026-02-22 12:01   ` Ming Lei
  0 siblings, 0 replies; 3+ messages in thread
From: Ming Lei @ 2026-02-22 12:01 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: Andrew Morton, linux-mm, linux-kernel

On Sun, Feb 22, 2026 at 01:21:42PM +0200, Mike Rapoport wrote:
> Hi,
> 
> On Sun, Feb 22, 2026 at 01:44:51PM +0800, Ming Lei wrote:
> > Commit d49004c5f0c1 ("arch, mm: consolidate initialization of nodes,
> > zones and memory map") moved free_area_init() from setup_arch() to
> > mm_core_init_early(), which runs after setup_arch() returns.
> > 
> > This changed the ordering relative to init_cpu_to_node() on x86. Before
> > the commit, free_area_init() ran during paging_init() (called from
> > setup_arch()) *before* init_cpu_to_node(). After the commit, it runs
> > *after* init_cpu_to_node().
> > 
> > On machines with memoryless NUMA nodes (e.g., node 0 has CPUs but no
> > memory), this causes a NULL pointer dereference:
> > 
> >  1. numa_register_nodes() skips memoryless nodes: no alloc_node_data()
> >     and no node_set_online() for them.
> >  2. init_cpu_to_node() sets memoryless nodes online (they have CPUs)
> >     but does not allocate NODE_DATA.
> >  3. free_area_init() checks "if (!node_online(nid))" to decide whether
> >     to call alloc_offline_node_data(). Since the memoryless node is now
> >     online, the allocation is skipped, leaving NODE_DATA(nid) == NULL.
> >  4. The immediate "pgdat = NODE_DATA(nid)" dereferences NULL.
> > 
> > The crash happens before console_init(), so no output is visible without
> > earlyprintk. With earlyprintk enabled, the following panic is observed:
> > 
> >  BUG: unable to handle page fault for address: 000000000002a1e0
> >  Oops: Oops: 0000 [#1] SMP NOPTI
> >  RIP: 0010:free_area_init_node+0x3a/0x540
> >  Call Trace:
> >   <TASK>
> >   free_area_init+0x331/0x4e0
> >   start_kernel+0x69/0x4a0
> >   x86_64_start_reservations+0x24/0x30
> >   x86_64_start_kernel+0x125/0x130
> >   common_startup_64+0x13e/0x148
> >   </TASK>
> >  Kernel panic - not syncing: Attempted to kill the idle task!
> > 
> > Fix this by checking "if (!NODE_DATA(nid))" instead of
> > "if (!node_online(nid))". This directly tests whether the per-node data
> > structure needs to be allocated, regardless of the node's online status.
>  
> I believe that this change is fine for !x86 as well, but it deserves a
> sentence in the commit log.
> 
> > Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
> > Fixes: d49004c5f0c1 ("arch, mm: consolidate initialization of nodes, zones and memory map")
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  mm/mm_init.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > index 61d983d23f55..9d63cab36204 100644
> > --- a/mm/mm_init.c
> > +++ b/mm/mm_init.c
> > @@ -1896,7 +1896,7 @@ static void __init free_area_init(void)
> >  	for_each_node(nid) {
> >  		pg_data_t *pgdat;
> >  
> > -		if (!node_online(nid))
> > +		if (!NODE_DATA(nid))
> >  			alloc_offline_node_data(nid);
> 
> A comment that says that if an architecture didn't allocate node data, we
> presume that the node is memoryless and offline would be nice here.

Hi Mike,

All of these are addressed in V2:

https://lore.kernel.org/linux-mm/20260222115702.3659-1-ming.lei@redhat.com/

But I forgot to Cc you, sorry...

Thanks,
Ming



