* Re: [PATCH v4] mm: pass nid to reserve_bootmem_region()
@ 2025-04-25 10:02 Gutierrez Cantu, Bernardo
2025-04-25 10:20 ` [PATCH] mm: memblock: Fix arguments passed to memblock_set_node() Bernardo C. Gutierrez Cantu
0 siblings, 1 reply; 8+ messages in thread
From: Gutierrez Cantu, Bernardo @ 2025-04-25 10:02 UTC (permalink / raw)
To: yajun.deng; +Cc: akpm, linux-kernel, linux-mm, lkp, rppt, dwmw2
> + if (memblock_is_nomap(region))
> + reserve_bootmem_region(start, end, nid);
> +
> + memblock_set_node(start, end, &memblock.reserved, nid);
memblock_set_node() receives a `base` and a `size` argument. Passing `end` would
cause us to set the node id on an incorrect range. Will send a fixup patch
shortly...
Best regards
Bernardo
Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] mm: memblock: Fix arguments passed to memblock_set_node()
2025-04-25 10:02 [PATCH v4] mm: pass nid to reserve_bootmem_region() Gutierrez Cantu, Bernardo
@ 2025-04-25 10:20 ` Bernardo C. Gutierrez Cantu
2025-04-25 14:18 ` David Woodhouse
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Bernardo C. Gutierrez Cantu @ 2025-04-25 10:20 UTC (permalink / raw)
To: bercantu; +Cc: akpm, dwmw2, linux-kernel, linux-mm, lkp, rppt, yajun.deng
memblock_set_node() receives a `base` and a `size` arguments, but we are
passing the `start` and `end` of the memory regions when iterating over
them in memmap_init_reserved_pages() to set their node ids.
This results in the function setting the node ids for the reserved memory
regions in `[base, base + base + size)` instead of `[base, base + size)`.
Pass `start` and `size`, so that we iterate over the correct range.
Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
Signed-off-by: Bernardo C. Gutierrez Cantu <bercantu@amazon.de>
---
mm/memblock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/memblock.c b/mm/memblock.c
index 0a53db4d9f7b..9639f04b4fdf 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -2196,7 +2196,7 @@ static void __init memmap_init_reserved_pages(void)
if (memblock_is_nomap(region))
reserve_bootmem_region(start, end, nid);
- memblock_set_node(start, end, &memblock.reserved, nid);
+ memblock_set_node(start, region->size, &memblock.reserved, nid);
}
/*
--
2.47.1
Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm: memblock: Fix arguments passed to memblock_set_node()
2025-04-25 10:20 ` [PATCH] mm: memblock: Fix arguments passed to memblock_set_node() Bernardo C. Gutierrez Cantu
@ 2025-04-25 14:18 ` David Woodhouse
2025-04-25 14:37 ` Jiaxun Yang
2025-04-26 1:05 ` Andrew Morton
2025-04-26 8:22 ` Mike Rapoport
2 siblings, 1 reply; 8+ messages in thread
From: David Woodhouse @ 2025-04-25 14:18 UTC (permalink / raw)
To: Bernardo C. Gutierrez Cantu
Cc: akpm, linux-kernel, linux-mm, lkp, rppt, yajun.deng, Huang Pei,
Thomas Bogendoerfer, Huacai Chen, Jiaxun Yang, linux-mips
[-- Attachment #1: Type: text/plain, Size: 1815 bytes --]
On Fri, 2025-04-25 at 10:20 +0000, Bernardo C. Gutierrez Cantu wrote:
> memblock_set_node() receives a `base` and a `size` arguments, but we are
> passing the `start` and `end` of the memory regions when iterating over
> them in memmap_init_reserved_pages() to set their node ids.
>
> This results in the function setting the node ids for the reserved memory
> regions in `[base, base + base + size)` instead of `[base, base + size)`.
>
> Pass `start` and `size`, so that we iterate over the correct range.
>
> Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
>
> Signed-off-by: Bernardo C. Gutierrez Cantu <bercantu@amazon.de>
Thanks, Bernardo. Good catch!
Acked-by: David Woodhouse <dwmw2@infradead.org>
That function taking (start, size) arguments seems a little surprising
to me; I instinctively expect physical memory management functions to
use (start, end). Clearly the author of that commit did too :)
As you pointed out in our private chat though, the rest of the
memblock_* functions are all (start, size) too, so they are at least
consistent. I don't think it makes a lot of sense to talk about
changing them *all* at this point?
I did a quick grep for memblock_set_node() callers, and the one in
szmem() in arch/mips/loongson64/init.c looks odd.
/* set nid for reserved memory */
memblock_set_node((u64)node << 44, (u64)(node + 1) << 44,
&memblock.reserved, node);
At first glance I suspect the 'size' should just be (1<<44) or maybe it
should be inside the loop over the memmap, and called with mem_start,
mem_size each time?
And why are we calling memblock_reserve() for what appears to be a
single system-wide vgabios_addr, repeatedly each time szmem() is called
for a different NUMA node?
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm: memblock: Fix arguments passed to memblock_set_node()
2025-04-25 14:18 ` David Woodhouse
@ 2025-04-25 14:37 ` Jiaxun Yang
0 siblings, 0 replies; 8+ messages in thread
From: Jiaxun Yang @ 2025-04-25 14:37 UTC (permalink / raw)
To: David Woodhouse, Bernardo C. Gutierrez Cantu
Cc: Andrew Morton, linux-kernel, linux-mm, kernel test robot, rppt,
yajun.deng, Huang Pei, Thomas Bogendoerfer, Huacai Chen,
linux-mips
在2025年4月25日周五 下午3:18,David Woodhouse写道:
[...]
Hi David & co,
>
>
> I did a quick grep for memblock_set_node() callers, and the one in
> szmem() in arch/mips/loongson64/init.c looks odd.
>
> /* set nid for reserved memory */
> memblock_set_node((u64)node << 44, (u64)(node + 1) << 44,
> &memblock.reserved, node);
>
> At first glance I suspect the 'size' should just be (1<<44) or maybe it
> should be inside the loop over the memmap, and called with mem_start,
> mem_size each time?
You are right, it should be (1 << 44), it is an oversight when I was
converting MIPS private boot allocator to memblock.
>
> And why are we calling memblock_reserve() for what appears to be a
> single system-wide vgabios_addr, repeatedly each time szmem() is called
> for a different NUMA node?
Alas this should be done only for node 0.
Thanks
>
>
> 附件:
> * smime.p7s
--
- Jiaxun
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm: memblock: Fix arguments passed to memblock_set_node()
2025-04-25 10:20 ` [PATCH] mm: memblock: Fix arguments passed to memblock_set_node() Bernardo C. Gutierrez Cantu
2025-04-25 14:18 ` David Woodhouse
@ 2025-04-26 1:05 ` Andrew Morton
2025-04-28 9:13 ` Bernardo C. Gutierrez Cantu
2025-04-26 8:22 ` Mike Rapoport
2 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2025-04-26 1:05 UTC (permalink / raw)
To: Bernardo C. Gutierrez Cantu
Cc: dwmw2, linux-kernel, linux-mm, lkp, rppt, yajun.deng
On Fri, 25 Apr 2025 10:20:03 +0000 "Bernardo C. Gutierrez Cantu" <bercantu@amazon.de> wrote:
> memblock_set_node() receives a `base` and a `size` arguments, but we are
> passing the `start` and `end` of the memory regions when iterating over
> them in memmap_init_reserved_pages() to set their node ids.
>
> This results in the function setting the node ids for the reserved memory
> regions in `[base, base + base + size)` instead of `[base, base + size)`.
>
> Pass `start` and `size`, so that we iterate over the correct range.
>
> Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
>
> ...
>
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -2196,7 +2196,7 @@ static void __init memmap_init_reserved_pages(void)
> if (memblock_is_nomap(region))
> reserve_bootmem_region(start, end, nid);
>
> - memblock_set_node(start, end, &memblock.reserved, nid);
> + memblock_set_node(start, region->size, &memblock.reserved, nid);
> }
>
> /*
What were the runtime effects of this bug?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm: memblock: Fix arguments passed to memblock_set_node()
2025-04-25 10:20 ` [PATCH] mm: memblock: Fix arguments passed to memblock_set_node() Bernardo C. Gutierrez Cantu
2025-04-25 14:18 ` David Woodhouse
2025-04-26 1:05 ` Andrew Morton
@ 2025-04-26 8:22 ` Mike Rapoport
2025-04-28 9:53 ` Bernardo C. Gutierrez Cantu
2 siblings, 1 reply; 8+ messages in thread
From: Mike Rapoport @ 2025-04-26 8:22 UTC (permalink / raw)
To: Bernardo C. Gutierrez Cantu
Cc: akpm, dwmw2, linux-kernel, linux-mm, lkp, yajun.deng
Hi Bernardo,
On Fri, Apr 25, 2025 at 10:20:03AM +0000, Bernardo C. Gutierrez Cantu wrote:
> memblock_set_node() receives a `base` and a `size` arguments, but we are
> passing the `start` and `end` of the memory regions when iterating over
> them in memmap_init_reserved_pages() to set their node ids.
>
> This results in the function setting the node ids for the reserved memory
> regions in `[base, base + base + size)` instead of `[base, base + size)`.
>
> Pass `start` and `size`, so that we iterate over the correct range.
>
> Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
>
> Signed-off-by: Bernardo C. Gutierrez Cantu <bercantu@amazon.de>
There's already a fix in memblock tree:
https://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock.git/commit/?h=for-next&id=06eaa824fd239edd1eab2754f29b2d03da313003
Will send PR to Linus soon.
> ---
> mm/memblock.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 0a53db4d9f7b..9639f04b4fdf 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -2196,7 +2196,7 @@ static void __init memmap_init_reserved_pages(void)
> if (memblock_is_nomap(region))
> reserve_bootmem_region(start, end, nid);
>
> - memblock_set_node(start, end, &memblock.reserved, nid);
> + memblock_set_node(start, region->size, &memblock.reserved, nid);
> }
>
> /*
> --
> 2.47.1
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm: memblock: Fix arguments passed to memblock_set_node()
2025-04-26 1:05 ` Andrew Morton
@ 2025-04-28 9:13 ` Bernardo C. Gutierrez Cantu
0 siblings, 0 replies; 8+ messages in thread
From: Bernardo C. Gutierrez Cantu @ 2025-04-28 9:13 UTC (permalink / raw)
To: akpm; +Cc: bercantu, dwmw2, linux-kernel, linux-mm, lkp, rppt, yajun.deng
On Sat, 26 Apr 2025 03:05:39 +0200 Andrew Morton <akpm@linux-foundation.org> wrote:
> On Fri, 25 Apr 2025 10:20:03 +0000 "Bernardo C. Gutierrez Cantu" <bercantu@amazon.de> wrote:
>
> > memblock_set_node() receives a `base` and a `size` arguments, but we are
> > passing the `start` and `end` of the memory regions when iterating over
> > them in memmap_init_reserved_pages() to set their node ids.
> >
> > This results in the function setting the node ids for the reserved memory
> > regions in `[base, base + base + size)` instead of `[base, base + size)`.
> >
> > Pass `start` and `size`, so that we iterate over the correct range.
> >
> > Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
> >
> > ...
> >
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -2196,7 +2196,7 @@ static void __init memmap_init_reserved_pages(void)
> > if (memblock_is_nomap(region))
> > reserve_bootmem_region(start, end, nid);
> >
> > - memblock_set_node(start, end, &memblock.reserved, nid);
> > + memblock_set_node(start, region->size, &memblock.reserved, nid);
> > }
> >
> > /*
>
> What were the runtime effects of this bug?
I found the bug via code introspection while trying to learn how the boot time
memory allocation of Linux worked. I was not actively pursuing to fix a runtime
issue that I could then prove was fixed by this.
But I see that in most cases this bug should be mostly benign (which could
explain why it was not caught before). In kernels compiled with CONFIG_NUMA, the
memblock_set_node() function would iterate over more memblock regions and
potentially set incorrect node ids, which would then be overwritten with the
correct node ids once the caller function memmap_init_reserved_pages() iterates
over the next regions. So the resultant node ids after initialization should
still be correct.
It could still impact the length of the mm init process on kernel boot, but I
did not measure the improvement, as it is highly dependent on the memory
configuration system, and in my specific system under test, it was neglible.
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm: memblock: Fix arguments passed to memblock_set_node()
2025-04-26 8:22 ` Mike Rapoport
@ 2025-04-28 9:53 ` Bernardo C. Gutierrez Cantu
0 siblings, 0 replies; 8+ messages in thread
From: Bernardo C. Gutierrez Cantu @ 2025-04-28 9:53 UTC (permalink / raw)
To: rppt; +Cc: akpm, bercantu, dwmw2, linux-kernel, linux-mm, lkp, yajun.deng
On Sat, Apr 26, 2025 at 10:22:39AM +0200, Mike Rapoport wrote:
> Hi Bernardo,
>
> On Fri, Apr 25, 2025 at 10:20:03AM +0000, Bernardo C. Gutierrez Cantu wrote:
> > memblock_set_node() receives a `base` and a `size` arguments, but we are
> > passing the `start` and `end` of the memory regions when iterating over
> > them in memmap_init_reserved_pages() to set their node ids.
> >
> > This results in the function setting the node ids for the reserved memory
> > regions in `[base, base + base + size)` instead of `[base, base + size)`.
> >
> > Pass `start` and `size`, so that we iterate over the correct range.
> >
> > Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
> >
> > Signed-off-by: Bernardo C. Gutierrez Cantu <bercantu@amazon.de>
>
> There's already a fix in memblock tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock.git/commit/?h=for-next&id=06eaa824fd239edd1eab2754f29b2d03da313003
>
> Will send PR to Linus soon.
>
Hi Mike
Oh ok, I see. Thank you, that patch should fix the issue.
Will it also be backported to LTS kernels affected by the bug, which was
introduced in v6.5?
> > ---
> > mm/memblock.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index 0a53db4d9f7b..9639f04b4fdf 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -2196,7 +2196,7 @@ static void __init memmap_init_reserved_pages(void)
> > if (memblock_is_nomap(region))
> > reserve_bootmem_region(start, end, nid);
> >
> > - memblock_set_node(start, end, &memblock.reserved, nid);
> > + memblock_set_node(start, region->size, &memblock.reserved, nid);
> > }
> >
> > /*
> > --
> > 2.47.1
>
> --
> Sincerely yours,
> Mike.
Best regards
Bernardo
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-04-28 9:53 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-25 10:02 [PATCH v4] mm: pass nid to reserve_bootmem_region() Gutierrez Cantu, Bernardo
2025-04-25 10:20 ` [PATCH] mm: memblock: Fix arguments passed to memblock_set_node() Bernardo C. Gutierrez Cantu
2025-04-25 14:18 ` David Woodhouse
2025-04-25 14:37 ` Jiaxun Yang
2025-04-26 1:05 ` Andrew Morton
2025-04-28 9:13 ` Bernardo C. Gutierrez Cantu
2025-04-26 8:22 ` Mike Rapoport
2025-04-28 9:53 ` Bernardo C. Gutierrez Cantu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox