* [BUG] WARNING in __alloc_frozen_pages_noprof
@ 2025-11-19 9:07 Xianying Wang
2025-11-26 9:46 ` Vlastimil Babka
0 siblings, 1 reply; 4+ messages in thread
From: Xianying Wang @ 2025-11-19 9:07 UTC (permalink / raw)
To: akpm
Cc: vbabka, surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel
Hi,
I hit the following warning in the page allocator when opening a perf
event with callchain sampling after increasing
kernel.perf_event_max_stack. This warning can be triggered by first
writing a large value into kernel.perf_event_max_stack and then
opening a perf event with callchain sampling enabled.
The reproducer does two things:
1) It writes a large (but still accepted) value to the sysctl:
    echo 0x40132 > /proc/sys/kernel/perf_event_max_stack
(0x40132 = 262450 in decimal. This is below the current upper bound
enforced by perf_event_max_stack_handler(), which uses 640 * 1024
as extra2.)
2) It calls perf_event_open() with callchain sampling:
    struct perf_event_attr attr = {
        .type = PERF_TYPE_HARDWARE,
        .size = sizeof(attr),
        .config = PERF_COUNT_HW_CPU_CYCLES,
        .sample_type = PERF_SAMPLE_CALLCHAIN,
        .sample_period = 1,
        .disabled = 1,
    };
    fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
The same warning is reproducible on both v6.17.0 and v6.18-rc2
(6.18.0-rc2-00120-g6fab32bb6508); only the line numbers in
__alloc_frozen_pages_noprof() differ slightly.
The suspected cause is that alloc_callchain_buffers() uses
sysctl_perf_event_max_stack directly when computing the size of the
per-CPU callchain buffers. For large but valid values of
kernel.perf_event_max_stack, perf_callchain_entry__sizeof() grows to
several megabytes, and alloc_callchain_buffers() ends up doing a very
large contiguous kmalloc_node() per CPU. This high-order allocation
then triggers the warning in __alloc_frozen_pages_noprof() in the page
allocator.
This can be reproduced on:
HEAD commits:
e5f0a698b34ed76002dc5cff3804a61c80233a7a
6fab32bb6508abbb8b7b1c5498e44f0c32320ed5
report: https://pastebin.com/raw/bCq3d4KR
console output: https://pastebin.com/raw/5hfk57Vd
kernel config: https://pastebin.com/raw/1grwrT16
C reproducer: https://pastebin.com/raw/GADWbwKN
Let me know if you need more details or testing.
Best regards,
Xianying
* Re: [BUG] WARNING in __alloc_frozen_pages_noprof
2025-11-19 9:07 [BUG] WARNING in __alloc_frozen_pages_noprof Xianying Wang
@ 2025-11-26 9:46 ` Vlastimil Babka
2025-11-26 11:19 ` Peter Zijlstra
0 siblings, 1 reply; 4+ messages in thread
From: Vlastimil Babka @ 2025-11-26 9:46 UTC (permalink / raw)
To: Xianying Wang, akpm
Cc: surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, linux-perf-users

+CC perf people as AFAIU the problem originates there. Should the limit
be lowered, or the allocations e.g. switched to kvmalloc, to avoid
requesting impossibly high order allocations?

	/*
	 * There are several places where we assume that the order value is sane
	 * so bail out early if the request is out of bound.
	 */
	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
		return NULL;

On 11/19/25 10:07 AM, Xianying Wang wrote:
> Hi,
>
> I hit the following warning in the page allocator when opening a perf
> event with callchain sampling after increasing
> kernel.perf_event_max_stack. This warning can be triggered by first
> writing a large value into kernel.perf_event_max_stack and then
> opening a perf event with callchain sampling enabled.
>
> The reproducer does two things:
>
> 1) It writes a large (but still accepted) value to the sysctl:
>
>     echo 0x40132 > /proc/sys/kernel/perf_event_max_stack
>
> (0x40132 = 262450 in decimal. This is below the current upper bound
> enforced by perf_event_max_stack_handler(), which uses 640 * 1024
> as extra2.)
>
> 2) It calls perf_event_open() with callchain sampling:
>
>     struct perf_event_attr attr = {
>         .type = PERF_TYPE_HARDWARE,
>         .size = sizeof(attr),
>         .config = PERF_COUNT_HW_CPU_CYCLES,
>         .sample_type = PERF_SAMPLE_CALLCHAIN,
>         .sample_period = 1,
>         .disabled = 1,
>     };
>     fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
>
> The same warning is reproducible on both v6.17.0 and v6.18-rc2
> (6.18.0-rc2-00120-g6fab32bb6508); only the line numbers in
> __alloc_frozen_pages_noprof() differ slightly.
>
> The suspected cause is that alloc_callchain_buffers() uses
> sysctl_perf_event_max_stack directly when computing the size of the
> per-CPU callchain buffers. For large but valid values of
> kernel.perf_event_max_stack, perf_callchain_entry__sizeof() grows to
> several megabytes, and alloc_callchain_buffers() ends up doing a very
> large contiguous kmalloc_node() per CPU. This high-order allocation
> then triggers the warning in __alloc_frozen_pages_noprof() in the page
> allocator.
>
> This can be reproduced on:
>
> HEAD commits:
> e5f0a698b34ed76002dc5cff3804a61c80233a7a
> 6fab32bb6508abbb8b7b1c5498e44f0c32320ed5
>
> report: https://pastebin.com/raw/bCq3d4KR
> console output: https://pastebin.com/raw/5hfk57Vd
> kernel config: https://pastebin.com/raw/1grwrT16
> C reproducer: https://pastebin.com/raw/GADWbwKN
>
> Let me know if you need more details or testing.
>
> Best regards,
> Xianying
* Re: [BUG] WARNING in __alloc_frozen_pages_noprof
2025-11-26 9:46 ` Vlastimil Babka
@ 2025-11-26 11:19 ` Peter Zijlstra
2025-11-26 19:00 ` Namhyung Kim
0 siblings, 1 reply; 4+ messages in thread
From: Peter Zijlstra @ 2025-11-26 11:19 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Xianying Wang, akpm, surenb, mhocko, jackmanb, hannes, ziy, linux-mm,
linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, linux-perf-users

On Wed, Nov 26, 2025 at 10:46:38AM +0100, Vlastimil Babka wrote:
> +CC perf people as AFAIU the problem originates there. Should the limit
> be lowered, or the allocations e.g. switched to kvmalloc, to avoid
> requesting impossibly high order allocations?
>
> 	/*
> 	 * There are several places where we assume that the order value is sane
> 	 * so bail out early if the request is out of bound.
> 	 */
> 	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
> 		return NULL;
>
> On 11/19/25 10:07 AM, Xianying Wang wrote:
> > Hi,
> >
> > I hit the following warning in the page allocator when opening a perf
> > event with callchain sampling after increasing
> > kernel.perf_event_max_stack. This warning can be triggered by first
> > writing a large value into kernel.perf_event_max_stack and then
> > opening a perf event with callchain sampling enabled.
> >
> > The reproducer does two things:
> >
> > 1) It writes a large (but still accepted) value to the sysctl:
> >
> >     echo 0x40132 > /proc/sys/kernel/perf_event_max_stack

Yeah, that is far too large. I suppose the actual max is somewhere near
8k, which would give 64k data for just the callchain -- given that a
single perf buffer entry is limited to 64k (IIRC) and all that.
* Re: [BUG] WARNING in __alloc_frozen_pages_noprof
2025-11-26 11:19 ` Peter Zijlstra
@ 2025-11-26 19:00 ` Namhyung Kim
0 siblings, 0 replies; 4+ messages in thread
From: Namhyung Kim @ 2025-11-26 19:00 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Vlastimil Babka, Xianying Wang, akpm, surenb, mhocko, jackmanb, hannes,
ziy, linux-mm, linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang, Kan, linux-perf-users

Hello,

On Wed, Nov 26, 2025 at 12:19:21PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 26, 2025 at 10:46:38AM +0100, Vlastimil Babka wrote:
> > +CC perf people as AFAIU the problem originates there. Should the limit
> > be lowered, or the allocations e.g. switched to kvmalloc, to avoid
> > requesting impossibly high order allocations?
> >
> > 	/*
> > 	 * There are several places where we assume that the order value is sane
> > 	 * so bail out early if the request is out of bound.
> > 	 */
> > 	if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
> > 		return NULL;
> >
> > On 11/19/25 10:07 AM, Xianying Wang wrote:
> > > Hi,
> > >
> > > I hit the following warning in the page allocator when opening a perf
> > > event with callchain sampling after increasing
> > > kernel.perf_event_max_stack. This warning can be triggered by first
> > > writing a large value into kernel.perf_event_max_stack and then
> > > opening a perf event with callchain sampling enabled.
> > >
> > > The reproducer does two things:
> > >
> > > 1) It writes a large (but still accepted) value to the sysctl:
> > >
> > >     echo 0x40132 > /proc/sys/kernel/perf_event_max_stack
>
> Yeah, that is far too large. I suppose the actual max is somewhere near
> 8k, which would give 64k data for just the callchain -- given that a
> single perf buffer entry is limited to 64k (IIRC) and all that.

Right, we have u16 size in struct perf_event_header.

Thanks,
Namhyung