linux-mm.kvack.org archive mirror
* [BUG] WARNING in __alloc_frozen_pages_noprof
@ 2025-11-19  9:07 Xianying Wang
  2025-11-26  9:46 ` Vlastimil Babka
  0 siblings, 1 reply; 4+ messages in thread
From: Xianying Wang @ 2025-11-19  9:07 UTC (permalink / raw)
  To: akpm
  Cc: vbabka, surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel

Hi,

I hit the following warning in the page allocator when opening a perf
event with callchain sampling after increasing
kernel.perf_event_max_stack. This warning can be triggered by first
writing a large value into kernel.perf_event_max_stack and then
opening a perf event with callchain sampling enabled.

The reproducer does two things:

1) It writes a large (but still accepted) value to the sysctl:

echo 0x40132 > /proc/sys/kernel/perf_event_max_stack

(0x40132 = 262450 in decimal. This is below the current upper bound
enforced by perf_event_max_stack_handler(), which uses 640 * 1024
as extra2.)

2) It calls perf_event_open() with callchain sampling:

    struct perf_event_attr attr = {
            .type          = PERF_TYPE_HARDWARE,
            .size          = sizeof(attr),
            .config        = PERF_COUNT_HW_CPU_CYCLES,
            .sample_type   = PERF_SAMPLE_CALLCHAIN,
            .sample_period = 1,
            .disabled      = 1,
    };

    fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
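
The two steps can be combined into a small user-space program. What follows is only a minimal sketch, not the C reproducer linked in this report. The helper names write_max_stack(), fill_callchain_attr(), open_callchain_event() and run_reproducer() are hypothetical. It assumes Linux with the kernel headers installed and enough privileges to write the sysctl and to open a CPU-wide event. The value is written in decimal, which is the same number as the hex form passed to echo.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill attr with the same fields as the snippet in the report. */
void fill_callchain_attr(struct perf_event_attr *attr)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = PERF_COUNT_HW_CPU_CYCLES;
    attr->sample_type = PERF_SAMPLE_CALLCHAIN;
    attr->sample_period = 1;
    attr->disabled = 1;
}

/* Step 1: write value to the sysctl file. Returns 0 on success, -1 on failure. */
int write_max_stack(const char *path, unsigned int value)
{
    FILE *f = fopen(path, "w");

    if (!f)
        return -1;
    if (fprintf(f, "%u\n", value) < 0) {
        fclose(f);
        return -1;
    }
    /* The kernel may reject the write only when the buffer is flushed. */
    return fclose(f) == 0 ? 0 : -1;
}

/* Step 2: open the event with the same syscall arguments as the report. */
long open_callchain_event(void)
{
    struct perf_event_attr attr;

    fill_callchain_attr(&attr);
    return syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
}

/* Hypothetical driver: returns 0 if both steps succeeded, -1 otherwise. */
int run_reproducer(void)
{
    long fd;

    if (write_max_stack("/proc/sys/kernel/perf_event_max_stack", 0x40132) != 0)
        return -1;
    fd = open_callchain_event();
    if (fd < 0)
        return -1;
    close((int)fd);
    return 0;
}
```

Calling run_reproducer() from main() as root should exercise the same path. A return value of -1 only says that one of the two steps was refused, for example because of missing privileges.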

The same warning is reproducible on both v6.17.0 and v6.18-rc2
(6.18.0-rc2-00120-g6fab32bb6508); only the line numbers in
__alloc_frozen_pages_noprof() differ slightly.

The suspected cause is that alloc_callchain_buffers() uses
sysctl_perf_event_max_stack directly when computing the size of the
per-CPU callchain buffers. For large but valid values of
kernel.perf_event_max_stack, perf_callchain_entry__sizeof() grows to
several megabytes, and alloc_callchain_buffers() ends up doing a very
large contiguous kmalloc_node() per CPU. This high-order allocation
then triggers the warning in __alloc_frozen_pages_noprof() in the page
allocator.
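
The size estimate can be checked with a little arithmetic. The sketch below is a user-space approximation and not the kernel code itself. It assumes that one callchain entry is a u64 count followed by one u64 slot per stack frame and per context marker, that each per-CPU buffer holds 4 such entries, that pages are 4 KiB, and that the largest allowed page order is 10. All function and macro names are hypothetical. Under these assumptions a max_stack of 262450 with 8 context slots gives about 2 MiB per entry and about 8 MiB per CPU, which needs an order-12 allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, not taken from the report. */
#define ASSUMED_PAGE_SHIFT     12 /* 4 KiB pages */
#define ASSUMED_MAX_PAGE_ORDER 10 /* largest order the allocator accepts */
#define ASSUMED_NR_CONTEXTS    4  /* callchain entries per CPU buffer */

/* Bytes for one callchain entry: a u64 count plus one u64 per slot. */
size_t callchain_entry_bytes(size_t max_stack, size_t max_contexts)
{
    return sizeof(uint64_t) + sizeof(uint64_t) * (max_stack + max_contexts);
}

/* Smallest order such that (page size << order) covers size bytes. */
unsigned int order_for_size(size_t size)
{
    unsigned int order = 0;

    assert(size > 0);
    while (((size_t)1 << (ASSUMED_PAGE_SHIFT + order)) < size)
        order++;
    return order;
}

/* Returns 1 if the per-CPU buffer would need more than the assumed max order. */
int request_exceeds_max_order(size_t max_stack, size_t max_contexts)
{
    size_t size = callchain_entry_bytes(max_stack, max_contexts) *
                  ASSUMED_NR_CONTEXTS;

    return order_for_size(size) > ASSUMED_MAX_PAGE_ORDER;
}
```

With the same assumptions, a small value such as 127 stays within a couple of pages, so only very large sysctl values reach the out-of-bound check.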

This can be reproduced on the following HEAD commits:

  e5f0a698b34ed76002dc5cff3804a61c80233a7a
  6fab32bb6508abbb8b7b1c5498e44f0c32320ed5

report:         https://pastebin.com/raw/bCq3d4KR
console output: https://pastebin.com/raw/5hfk57Vd
kernel config:  https://pastebin.com/raw/1grwrT16
C reproducer:   https://pastebin.com/raw/GADWbwKN

Let me know if you need more details or testing.

Best regards,

Xianying



* Re: [BUG] WARNING in __alloc_frozen_pages_noprof
  2025-11-19  9:07 [BUG] WARNING in __alloc_frozen_pages_noprof Xianying Wang
@ 2025-11-26  9:46 ` Vlastimil Babka
  2025-11-26 11:19   ` Peter Zijlstra
  0 siblings, 1 reply; 4+ messages in thread
From: Vlastimil Babka @ 2025-11-26  9:46 UTC (permalink / raw)
  To: Xianying Wang, akpm
  Cc: surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Liang, Kan, linux-perf-users

+CC perf people as AFAIU the problem originates there. Should the limit
be lowered, or the allocations e.g. switched to kvmalloc, to avoid
requesting impossibly high order allocations?

        /*
         * There are several places where we assume that the order value is sane
         * so bail out early if the request is out of bound.
         */
        if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
                return NULL;







* Re: [BUG] WARNING in __alloc_frozen_pages_noprof
  2025-11-26  9:46 ` Vlastimil Babka
@ 2025-11-26 11:19   ` Peter Zijlstra
  2025-11-26 19:00     ` Namhyung Kim
  0 siblings, 1 reply; 4+ messages in thread
From: Peter Zijlstra @ 2025-11-26 11:19 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Xianying Wang, akpm, surenb, mhocko, jackmanb, hannes, ziy,
	linux-mm, linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Liang, Kan, linux-perf-users

On Wed, Nov 26, 2025 at 10:46:38AM +0100, Vlastimil Babka wrote:
> +CC perf people as AFAIU the problem originates there. Should the limit
> be lowered, or the allocations e.g. switched to kvmalloc, to avoid
> requesting impossibly high order allocations?
> 
>         /*
>          * There are several places where we assume that the order value is sane
>          * so bail out early if the request is out of bound.
>          */
>         if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
>                 return NULL;
> 
> 
> 
> On 11/19/25 10:07 AM, Xianying Wang wrote:
> > Hi,
> > 
> > I hit the following warning in the page allocator when opening a perf
> > event with callchain sampling after increasing
> > kernel.perf_event_max_stack. This warning can be triggered by first
> > writing a large value into kernel.perf_event_max_stack and then
> > opening a perf event with callchain sampling enabled.
> > 
> > The reproducer does two things:
> > 
> > 1) It writes a large (but still accepted) value to the sysctl:
> > 
> > echo 0x40132 > /proc/sys/kernel/perf_event_max_stack
> > 

Yeah, that is far too large. I suppose the actual max is somewhere near
8k, which would give 64k data for just the callchain -- given that a
single perf buffer entry is limited to 64k (IIRC) and all that.
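
A rough check of that estimate, as a sketch only: it assumes the 64k limit comes from a 16-bit size field in the record header, so a record can be at most 65535 bytes, and that each callchain slot is a u64. The macro and function names are hypothetical, and the overhead figure passed in (header, count field, other sample data) is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the record size is stored in 16 bits. */
#define ASSUMED_MAX_RECORD_BYTES ((size_t)UINT16_MAX)

/*
 * Number of u64 callchain slots that fit in one record once
 * overhead_bytes are reserved for everything else in the record.
 */
size_t max_callchain_slots(size_t overhead_bytes)
{
    assert(overhead_bytes <= ASSUMED_MAX_RECORD_BYTES);
    return (ASSUMED_MAX_RECORD_BYTES - overhead_bytes) / sizeof(uint64_t);
}
```

For example, max_callchain_slots(16) reserves 16 bytes and leaves room for just under 8192 slots, which matches the "somewhere near 8k" figure and is far below 262450.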



* Re: [BUG] WARNING in __alloc_frozen_pages_noprof
  2025-11-26 11:19   ` Peter Zijlstra
@ 2025-11-26 19:00     ` Namhyung Kim
  0 siblings, 0 replies; 4+ messages in thread
From: Namhyung Kim @ 2025-11-26 19:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vlastimil Babka, Xianying Wang, akpm, surenb, mhocko, jackmanb,
	hannes, ziy, linux-mm, linux-kernel, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan,
	linux-perf-users

Hello,

On Wed, Nov 26, 2025 at 12:19:21PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 26, 2025 at 10:46:38AM +0100, Vlastimil Babka wrote:
> > +CC perf people as AFAIU the problem originates there. Should the limit
> > be lowered, or the allocations e.g. switched to kvmalloc, to avoid
> > requesting impossibly high order allocations?
> > 
> >         /*
> >          * There are several places where we assume that the order value is sane
> >          * so bail out early if the request is out of bound.
> >          */
> >         if (WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp))
> >                 return NULL;
> > 
> > 
> > 
> > On 11/19/25 10:07 AM, Xianying Wang wrote:
> > > Hi,
> > > 
> > > I hit the following warning in the page allocator when opening a perf
> > > event with callchain sampling after increasing
> > > kernel.perf_event_max_stack. This warning can be triggered by first
> > > writing a large value into kernel.perf_event_max_stack and then
> > > opening a perf event with callchain sampling enabled.
> > > 
> > > The reproducer does two things:
> > > 
> > > 1) It writes a large (but still accepted) value to the sysctl:
> > > 
> > > echo 0x40132 > /proc/sys/kernel/perf_event_max_stack
> > > 
> 
> Yeah, that is far too large. I suppose the actual max is somewhere near
> 8k, which would give 64k data for just the callchain -- given that a
> single perf buffer entry is limited to 64k (IIRC) and all that.

Right, we have u16 size in struct perf_event_header.

Thanks,
Namhyung


