* [BUG] WARNING in __alloc_frozen_pages_noprof
From: Xianying Wang @ 2025-11-19 9:07 UTC
To: akpm
Cc: vbabka, surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel
Hi,
I hit a warning in the page allocator (full report linked below) when
opening a perf event with callchain sampling after increasing
kernel.perf_event_max_stack. It can be triggered by first writing a
large value into kernel.perf_event_max_stack and then opening a perf
event with callchain sampling enabled.
The reproducer does two things:
1) It writes a large (but still accepted) value to the sysctl:

    echo 0x40132 > /proc/sys/kernel/perf_event_max_stack

   (0x40132 = 262450 in decimal. This is below the current upper bound
   enforced by perf_event_max_stack_handler(), which uses 640 * 1024
   as extra2.)

2) It calls perf_event_open() with callchain sampling:

    struct perf_event_attr attr = {
        .type = PERF_TYPE_HARDWARE,
        .size = sizeof(attr),
        .config = PERF_COUNT_HW_CPU_CYCLES,
        .sample_type = PERF_SAMPLE_CALLCHAIN,
        .sample_period = 1,
        .disabled = 1,
    };
    fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
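For anyone who wants to try step 2 without fetching the pastebin, here is a minimal sketch of it as two standalone C helpers. It is not the linked reproducer. The function names init_callchain_attr() and open_callchain_event() are made up for illustration, and the sketch assumes a Linux system with <linux/perf_event.h> and enough privileges to open a CPU-wide event (pid = -1, cpu = 0).

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill in the attributes quoted in the report.
 * Every other field of the struct is left at zero.
 */
void init_callchain_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_HARDWARE;
	attr->size = sizeof(*attr);
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_type = PERF_SAMPLE_CALLCHAIN;
	attr->sample_period = 1;
	attr->disabled = 1;
}

/*
 * Hypothetical helper: open the event with the same arguments as the
 * report (pid = -1, cpu = 0, group_fd = -1, flags = 0).
 * Returns the new fd, or -1 with errno set.
 */
long open_callchain_event(void)
{
	struct perf_event_attr attr;

	init_callchain_attr(&attr);
	return syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
}
```

Calling open_callchain_event() after the sysctl write from step 1 should exercise the same allocation path. On failure it returns -1, so the caller can check errno.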
The same warning is reproducible on both v6.17.0 and v6.18-rc2
(6.18.0-rc2-00120-g6fab32bb6508); only the line numbers in
__alloc_frozen_pages_noprof() differ slightly.
The suspected cause is that alloc_callchain_buffers() uses
sysctl_perf_event_max_stack directly when computing the size of the
per-CPU callchain buffers. For large but valid values of
kernel.perf_event_max_stack, perf_callchain_entry__sizeof() grows to
several megabytes, and alloc_callchain_buffers() ends up doing a very
large contiguous kmalloc_node() per CPU. This high-order allocation
then triggers the warning in __alloc_frozen_pages_noprof() in the page
allocator.
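As a rough check of the sizes involved, the sketch below models the arithmetic in user space. It assumes that perf_callchain_entry__sizeof() is an 8-byte header plus one u64 per slot for (max_stack + max_contexts_per_stack) slots, that kernel.perf_event_max_contexts_per_stack is at its default of 8, and that alloc_callchain_buffers() multiplies the result by PERF_NR_CONTEXTS (4). These constants and both function names are assumptions for illustration and may not match every tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values; check them against the tree being tested. */
#define PERF_NR_CONTEXTS_ASSUMED       4 /* recursion contexts per CPU */
#define MAX_CONTEXTS_PER_STACK_ASSUMED 8 /* default sysctl value */

/* Models one callchain entry: a u64 count followed by u64 slots. */
size_t callchain_entry_bytes(size_t max_stack)
{
	return sizeof(uint64_t) +
	       sizeof(uint64_t) * (max_stack + MAX_CONTEXTS_PER_STACK_ASSUMED);
}

/* Models the size of the single per-CPU buffer request. */
size_t callchain_percpu_bytes(size_t max_stack)
{
	return callchain_entry_bytes(max_stack) * PERF_NR_CONTEXTS_ASSUMED;
}
```

Under these assumptions, max_stack = 0x40132 gives 2099672 bytes per entry and 8398688 bytes (about 8 MB) per CPU, while the default of 127 gives 4352 bytes per CPU.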
This can be reproduced on the following HEAD commits:
e5f0a698b34ed76002dc5cff3804a61c80233a7a
6fab32bb6508abbb8b7b1c5498e44f0c32320ed5
report: https://pastebin.com/raw/bCq3d4KR
console output: https://pastebin.com/raw/5hfk57Vd
kernel config: https://pastebin.com/raw/1grwrT16
C reproducer: https://pastebin.com/raw/GADWbwKN
Let me know if you need more details or testing.
Best regards,
Xianying
* Re: [BUG] WARNING in __alloc_frozen_pages_noprof
From: Vlastimil Babka @ 2025-11-26 9:46 UTC
To: Xianying Wang, akpm
Cc: surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, linux-perf-users
+CC perf people as AFAIU the problem originates there. Should the limit
be lowered, or the allocations e.g. switched to kvmalloc, to avoid
requesting impossibly high order allocations?
	/*
	 * There are several places where we assume that the order value is sane
	 * so bail out early if the request is out of bound.
	 */
	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
		return NULL;
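To see why a request of this size trips the check, here is a small user-space model of the order computation. It assumes 4 KiB pages and MAX_PAGE_ORDER = 10 (common on x86-64, but config dependent), and takes roughly 8 MB as an assumed per-CPU request size for the reported sysctl value. The names order_for_size() and order_is_too_high() are made up for this sketch.

```c
#include <stddef.h>

/* Assumed values; both depend on architecture and kernel config. */
#define PAGE_SIZE_ASSUMED      4096u
#define MAX_PAGE_ORDER_ASSUMED 10u	/* largest block: 4 MiB */

/* Smallest order such that (PAGE_SIZE << order) covers size bytes. */
unsigned int order_for_size(size_t size)
{
	unsigned int order = 0;
	size_t span = PAGE_SIZE_ASSUMED;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Non-zero when the modelled request would exceed the assumed limit. */
int order_is_too_high(size_t size)
{
	return order_for_size(size) > MAX_PAGE_ORDER_ASSUMED;
}
```

With these assumptions, a request of 8398688 bytes needs order 12 (16 MiB), which is above the limit, while a 4 MiB request (order 10) is still within it.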
On 11/19/25 10:07 AM, Xianying Wang wrote:
> Hi,
>
> I hit the following warning in the page allocator when opening a perf
> event with callchain sampling after increasing
> kernel.perf_event_max_stack.This warning can be triggered by first
> writing a large value into kernel.perf_event_max_stack and then
> opening a perf event with callchain sampling enabled.
>
> The reproducer does two things:
>
> 1) It writes a large (but still accepted) value to the sysctl:
>
>     echo 0x40132 > /proc/sys/kernel/perf_event_max_stack
>
>    (0x40132 = 262450 in decimal. This is below the current upper bound
>    enforced by perf_event_max_stack_handler(), which uses 640 * 1024
>    as extra2.)
>
> 2) It calls perf_event_open() with callchain sampling:
>
>     struct perf_event_attr attr = {
>         .type = PERF_TYPE_HARDWARE,
>         .size = sizeof(attr),
>         .config = PERF_COUNT_HW_CPU_CYCLES,
>         .sample_type = PERF_SAMPLE_CALLCHAIN,
>         .sample_period = 1,
>         .disabled = 1,
>     };
>     fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
>
> The same warning is reproducible on both v6.17.0 and v6.18-rc2
> (6.18.0-rc2-00120 g6fab32bb6508), only the line numbers in
> __alloc_frozen_pages_noprof() differ slightly.
>
> The suspected cause is that alloc_callchain_buffers() uses
> sysctl_perf_event_max_stack directly when computing the size of the
> per-CPU callchain buffers. For large but valid values of
> kernel.perf_event_max_stack, perf_callchain_entry__sizeof() grows to
> several megabytes, and alloc_callchain_buffers() ends up doing a very
> large contiguous kmalloc_node() per CPU. This high-order allocation
> then triggers the warning in __alloc_frozen_pages_noprof() in the page
> allocator.
>
> This can be reproduced on:
>
> HEAD commit:
> e5f0a698b34ed76002dc5cff3804a61c80233a7a
> 6fab32bb6508abbb8b7b1c5498e44f0c32320ed5
>
> report: https://pastebin.com/raw/bCq3d4KR
> console output : https://pastebin.com/raw/5hfk57Vd
> kernel config : https://pastebin.com/raw/1grwrT16
> C reproducer :https://pastebin.com/raw/GADWbwKN
>
> Let me know if you need more details or testing.
>
> Best regards,
> Xianying
* Re: [BUG] WARNING in __alloc_frozen_pages_noprof
From: Peter Zijlstra @ 2025-11-26 11:19 UTC
To: Vlastimil Babka
Cc: Xianying Wang, akpm, surenb, mhocko, jackmanb, hannes, ziy,
linux-mm, linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Liang, Kan, linux-perf-users
On Wed, Nov 26, 2025 at 10:46:38AM +0100, Vlastimil Babka wrote:
> +CC perf people as AFAIU the problem originates there. Should the limit
> be lowered, or the allocations e.g. switched to kvmalloc, to avoid
> requesting impossibly high order allocations?
>
> /*
> * There are several places where we assume that the order value is sane
> * so bail out early if the request is out of bound.
> */
> if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
> return NULL;
>
>
>
> On 11/19/25 10:07 AM, Xianying Wang wrote:
> > Hi,
> >
> > I hit the following warning in the page allocator when opening a perf
> > event with callchain sampling after increasing
> > kernel.perf_event_max_stack.This warning can be triggered by first
> > writing a large value into kernel.perf_event_max_stack and then
> > opening a perf event with callchain sampling enabled.
> >
> > The reproducer does two things:
> >
> > 1) It writes a large (but still accepted) value to the sysctl:
> >
> > echo 0x40132 > /proc/sys/kernel/perf_event_max_stack
> >
Yeah, that is far too large. I suppose the actual max is somewhere near
8k, which would give 64k data for just the callchain -- given that a
single perf buffer entry is limited to 64k (IIRC) and all that.
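The arithmetic behind the "8k entries, 64k data" estimate can be sketched as follows, assuming each callchain entry is one u64 instruction pointer. The function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes of callchain payload for nr_entries u64 instruction pointers. */
size_t callchain_payload_bytes(size_t nr_entries)
{
	return nr_entries * sizeof(uint64_t);
}

/* Largest entry count whose payload fits within limit_bytes. */
size_t max_entries_for_limit(size_t limit_bytes)
{
	return limit_bytes / sizeof(uint64_t);
}
```

With 8-byte entries, 8192 entries already take 65536 bytes, and a 65535-byte limit leaves room for at most 8191 entries before any other sample fields are counted. The reported value of 0x40132 entries would need 2099600 bytes.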
* Re: [BUG] WARNING in __alloc_frozen_pages_noprof
From: Namhyung Kim @ 2025-11-26 19:00 UTC
To: Peter Zijlstra
Cc: Vlastimil Babka, Xianying Wang, akpm, surenb, mhocko, jackmanb,
hannes, ziy, linux-mm, linux-kernel, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
linux-perf-users
Hello,
On Wed, Nov 26, 2025 at 12:19:21PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 26, 2025 at 10:46:38AM +0100, Vlastimil Babka wrote:
> > +CC perf people as AFAIU the problem originates there. Should the limit
> > be lowered, or the allocations e.g. switched to kvmalloc, to avoid
> > requesting impossibly high order allocations?
> >
> > /*
> > * There are several places where we assume that the order value is sane
> > * so bail out early if the request is out of bound.
> > */
> > if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
> > return NULL;
> >
> >
> >
> > On 11/19/25 10:07 AM, Xianying Wang wrote:
> > > Hi,
> > >
> > > I hit the following warning in the page allocator when opening a perf
> > > event with callchain sampling after increasing
> > > kernel.perf_event_max_stack.This warning can be triggered by first
> > > writing a large value into kernel.perf_event_max_stack and then
> > > opening a perf event with callchain sampling enabled.
> > >
> > > The reproducer does two things:
> > >
> > > 1) It writes a large (but still accepted) value to the sysctl:
> > >
> > > echo 0x40132 > /proc/sys/kernel/perf_event_max_stack
> > >
>
> Yeah, that is far too large. I suppose the actual max is somewhere near
> 8k, which would give 64k data for just the callchain -- given that a
> single perf buffer entry is limited to 64k (IIRC) and all that.
Right, we have u16 size in struct perf_event_header.
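For illustration, a user-space model of that header and the limit it implies. The struct below mirrors the type/misc/size layout of the UAPI struct perf_event_header but is renamed so it does not clash with the real one. The helper record_fits_header() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the UAPI record header: u32 type, u16 misc, u16 size. */
struct perf_event_header_model {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size in bytes, header included */
};

static_assert(sizeof(struct perf_event_header_model) == 8,
	      "header model should be 8 bytes");

/* Non-zero when a record of record_bytes can be described by u16 size. */
int record_fits_header(uint64_t record_bytes)
{
	return record_bytes <= UINT16_MAX;
}
```

A record that carries 8192 u64 callchain entries plus the 8-byte header is already larger than 65535 bytes, so it cannot be represented by the 16-bit size field.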
Thanks,
Namhyung