* [PATCH] mm: Fix up memory allocation tracing
@ 2025-05-06 14:45 Guenter Roeck
2025-05-06 15:13 ` Vlastimil Babka
0 siblings, 1 reply; 3+ messages in thread
From: Guenter Roeck @ 2025-05-06 14:45 UTC (permalink / raw)
To: Christoph Lameter
Cc: David Rientjes, Andrew Morton, Vlastimil Babka, Roman Gushchin,
Harry Yoo, linux-mm, linux-kernel, Guenter Roeck
intcp_init_early() calls syscon_regmap_lookup_by_compatible() which in
turn calls of_syscon_register(). This function allocates memory.
intcp_init_early() is called well before kmalloc caches are initialized.
As consequence, kmalloc_caches[] entries are NULL, and NULL is passed as
kmem_cache argument to __kmalloc_cache_noprof(). While slab_alloc_node()
handles this just fine, the trace code unconditionally dereferences it.
This results in crashes such as
Unable to handle kernel NULL pointer dereference at virtual address 0000000c when read
[0000000c] *pgd=00000000
Internal error: Oops: 5 [#1] ARM
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc5-00026-g5fcc9bf84ee5 #1 PREEMPT
Hardware name: ARM Integrator/CP (Device Tree)
PC is at __kmalloc_cache_noprof+0xec/0x39c
LR is at __kmalloc_cache_noprof+0x34/0x39c
...
Call trace:
__kmalloc_cache_noprof from of_syscon_register+0x7c/0x310
of_syscon_register from device_node_get_regmap+0xa4/0xb0
device_node_get_regmap from intcp_init_early+0xc/0x40
intcp_init_early from start_kernel+0x60/0x688
start_kernel from 0x0
The problem is not seen with all versions of gcc. Some versions such as
gcc 9.x apparently do not dereference the pointer, presumably if tracing
is disabled. The problem has been reproduced with gcc 10.x, 11.x, and 13.x.
Fix the problem by only dereferencing the kmem_cache pointer if it is
not NULL, and pass a dummy parameter otherwise. Only add the check to
__kmalloc_cache_noprof() since it is the only function known to be
affected.
The problem affects all supported branches of Linux. The crashing function
depends on the kernel version, and some versions are only affected if
CONFIG_TRACING is enabled.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
I only changed a single call of trace_kmalloc() because it is the only one
known to be affected. I'll be happy to change the remaining callers if that
is preferred.
I have seen this problem for a long time. I always thought it is a compiler
problem because it is not seen with gcc 9.x. However, it turns out that the
problem is real.
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/slub.c b/mm/slub.c
index be8b09e09d30..627aa8d2b9fd 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4353,7 +4353,7 @@ void *__kmalloc_cache_noprof(struct kmem_cache *s, gfp_t gfpflags, size_t size)
void *ret = slab_alloc_node(s, NULL, gfpflags, NUMA_NO_NODE,
_RET_IP_, size);
- trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, NUMA_NO_NODE);
+ trace_kmalloc(_RET_IP_, ret, size, s ? s->size : -1, gfpflags, NUMA_NO_NODE);
ret = kasan_kmalloc(s, ret, size, gfpflags);
return ret;
--
2.45.2
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] mm: Fix up memory allocation tracing
2025-05-06 14:45 [PATCH] mm: Fix up memory allocation tracing Guenter Roeck
@ 2025-05-06 15:13 ` Vlastimil Babka
2025-05-06 16:07 ` Guenter Roeck
0 siblings, 1 reply; 3+ messages in thread
From: Vlastimil Babka @ 2025-05-06 15:13 UTC (permalink / raw)
To: Guenter Roeck, Christoph Lameter
Cc: David Rientjes, Andrew Morton, Roman Gushchin, Harry Yoo,
linux-mm, linux-kernel, Liviu Dudau, Sudeep Holla,
Lorenzo Pieralisi, Linus Walleij, Russell King, linux-arm-kernel
On 5/6/25 16:45, Guenter Roeck wrote:
> intcp_init_early() calls syscon_regmap_lookup_by_compatible() which in
> turn calls of_syscon_register(). This function allocates memory.
CCing people for intcp_init_early()
> intcp_init_early() is called well before kmalloc caches are initialized.
> As consequence, kmalloc_caches[] entries are NULL, and NULL is passed as
> kmem_cache argument to __kmalloc_cache_noprof(). While slab_alloc_node()
> handles this just fine, the trace code unconditionally dereferences it.
> This results in crashes such as
OK, so we have crashes that are not deterministic. But also
intcp_init_early() deterministically fails, right? This means it's called
before mm_core_init(), and given the "_early" part of the name that's
probably expected (i.e. I don't think it's due to some random initcall
ordering) but maybe then it's wrong to rely on kmalloc() in DT's init_early
hook?
> Unable to handle kernel NULL pointer dereference at virtual address 0000000c when read
> [0000000c] *pgd=00000000
> Internal error: Oops: 5 [#1] ARM
> Modules linked in:
> CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc5-00026-g5fcc9bf84ee5 #1 PREEMPT
> Hardware name: ARM Integrator/CP (Device Tree)
> PC is at __kmalloc_cache_noprof+0xec/0x39c
> LR is at __kmalloc_cache_noprof+0x34/0x39c
> ...
> Call trace:
> __kmalloc_cache_noprof from of_syscon_register+0x7c/0x310
> of_syscon_register from device_node_get_regmap+0xa4/0xb0
> device_node_get_regmap from intcp_init_early+0xc/0x40
> intcp_init_early from start_kernel+0x60/0x688
> start_kernel from 0x0
>
> The problem is not seen with all versions of gcc. Some versions such as
> gcc 9.x apparently do not dereference the pointer, presumably if tracing
> is disabled. The problem has been reproduced with gcc 10.x, 11.x, and 13.x.
>
> Fix the problem by only dereferencing the kmem_cache pointer if it is
> not NULL, and pass a dummy parameter otherwise. Only add the check to
> __kmalloc_cache_noprof() since it is the only function known to be
> affected.
>
> The problem affects all supported branches of Linux. The crashing function
> depends on the kernel version, and some versions are only affected if
> CONFIG_TRACING is enabled.
>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Let's see if making slab silently handle those unexpectedly early calls as
NULL is the right way or we should warn in a debug config or something.
> ---
> I only changed a single call of trace_kmalloc() because it is the only one
> known to be affected. I'll be happy to change the remaining callers if that
> is preferred.
>
> I have seen this problem for a long time. I always thought it is a compiler
> problem because it is not seen with gcc 9.x. However, it turns out that the
> problem is real.
>
> mm/slub.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index be8b09e09d30..627aa8d2b9fd 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4353,7 +4353,7 @@ void *__kmalloc_cache_noprof(struct kmem_cache *s, gfp_t gfpflags, size_t size)
> void *ret = slab_alloc_node(s, NULL, gfpflags, NUMA_NO_NODE,
> _RET_IP_, size);
>
> - trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, NUMA_NO_NODE);
> + trace_kmalloc(_RET_IP_, ret, size, s ? s->size : -1, gfpflags, NUMA_NO_NODE);
>
> ret = kasan_kmalloc(s, ret, size, gfpflags);
> return ret;
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] mm: Fix up memory allocation tracing
2025-05-06 15:13 ` Vlastimil Babka
@ 2025-05-06 16:07 ` Guenter Roeck
0 siblings, 0 replies; 3+ messages in thread
From: Guenter Roeck @ 2025-05-06 16:07 UTC (permalink / raw)
To: Vlastimil Babka, Christoph Lameter
Cc: David Rientjes, Andrew Morton, Roman Gushchin, Harry Yoo,
linux-mm, linux-kernel, Liviu Dudau, Sudeep Holla,
Lorenzo Pieralisi, Linus Walleij, Russell King, linux-arm-kernel
On 5/6/25 08:13, Vlastimil Babka wrote:
> On 5/6/25 16:45, Guenter Roeck wrote:
>> intcp_init_early() calls syscon_regmap_lookup_by_compatible() which in
>> turn calls of_syscon_register(). This function allocates memory.
>
> CCing people for intcp_init_early()
>
>> intcp_init_early() is called well before kmalloc caches are initialized.
>> As consequence, kmalloc_caches[] entries are NULL, and NULL is passed as
>> kmem_cache argument to __kmalloc_cache_noprof(). While slab_alloc_node()
>> handles this just fine, the trace code unconditionally dereferences it.
>> This results in crashes such as
>
> OK, so we have crashes that are not deterministic. But also
> intcp_init_early() deterministically fails, right? This means it's called
> before mm_core_init(), and given the "_early" part of the name that's
> probably expected (i.e. I don't think it's due to some random initcall
Yes, it is called before mm_core_init(), from setup_arch(). And, yes,
you are correct, the call to syscon_regmap_lookup_by_compatible() from
intcp_init_early() does indeed fail with -ENOMEM. That is interesting;
I did not realize that. Thanks a lot for noticing.
Please forget this patch; I'll submit an alternative specifically th address
the integratorcp problem. Sorry for the noise.
Guenter
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-05-06 16:07 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-05-06 14:45 [PATCH] mm: Fix up memory allocation tracing Guenter Roeck
2025-05-06 15:13 ` Vlastimil Babka
2025-05-06 16:07 ` Guenter Roeck
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox