* [REGRESSION] kswapd0: page allocation failure (bisected to "slab: add sheaves to most caches")
@ 2026-02-22 21:36 Chris Bainbridge
2026-02-23 8:41 ` Harry Yoo
0 siblings, 1 reply; 5+ messages in thread
From: Chris Bainbridge @ 2026-02-22 21:36 UTC
To: vbabka
Cc: surenb, harry.yoo, hao.li, leitao, Liam.Howlett, zhao1.liu,
linux-kernel, linux-mm, linux-btrfs, regressions
Hi,
The latest mainline kernel (v6.19-11831-ga95f71ad3e2e) has page
allocation failures when doing things like compiling a kernel. I can
also reproduce this with a stress test like
`stress-ng --vm 2 --vm-bytes 110% --verify -v`
[ 104.032925] kswapd0: page allocation failure: order:0, mode:0xc0c40(GFP_NOFS|__GFP_COMP|__GFP_NOMEMALLOC), nodemask=(null),cpuset=/,mems_allowed=0
[ 104.033307] CPU: 4 UID: 0 PID: 156 Comm: kswapd0 Not tainted 6.19.0-rc5-00027-g40fd0acc45d0 #435 PREEMPT(voluntary)
[ 104.033312] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024
[ 104.033314] Call Trace:
[ 104.033316] <TASK>
[ 104.033319] dump_stack_lvl+0x6a/0x90
[ 104.033328] warn_alloc.cold+0x95/0x1af
[ 104.033334] ? zone_watermark_ok+0x80/0x80
[ 104.033350] __alloc_frozen_pages_noprof+0xec3/0x2470
[ 104.033353] ? __lock_acquire+0x489/0x2600
[ 104.033359] ? stack_access_ok+0x1c0/0x1c0
[ 104.033367] ? warn_alloc+0x1d0/0x1d0
[ 104.033371] ? __lock_acquire+0x489/0x2600
[ 104.033375] ? _raw_spin_unlock_irqrestore+0x48/0x60
[ 104.033379] ? _raw_spin_unlock_irqrestore+0x48/0x60
[ 104.033382] ? lockdep_hardirqs_on+0x78/0x100
[ 104.033394] allocate_slab+0x2b7/0x510
[ 104.033399] refill_objects+0x25d/0x380
[ 104.033407] __pcs_replace_empty_main+0x193/0x5f0
[ 104.033412] kmem_cache_alloc_noprof+0x5b6/0x6f0
[ 104.033415] ? alloc_extent_state+0x1b/0x210 [btrfs]
[ 104.033479] alloc_extent_state+0x1b/0x210 [btrfs]
[ 104.033527] btrfs_clear_extent_bit_changeset+0x2be/0x9c0 [btrfs]
[ 104.033575] btrfs_clear_record_extent_bits+0x10/0x20 [btrfs]
[ 104.033615] btrfs_qgroup_check_reserved_leak+0xbd/0x2b0 [btrfs]
[ 104.033659] ? lock_release+0x17b/0x2a0
[ 104.033663] ? btrfs_qgroup_convert_reserved_meta+0xe90/0xe90 [btrfs]
[ 104.033703] ? do_raw_spin_unlock+0x54/0x1e0
[ 104.033707] ? _raw_spin_unlock+0x29/0x40
[ 104.033710] ? btrfs_lookup_first_ordered_extent+0x1d4/0x370 [btrfs]
[ 104.033762] ? preempt_count_add+0x73/0x140
[ 104.033768] btrfs_destroy_inode+0x301/0x6a0 [btrfs]
[ 104.033820] ? __destroy_inode+0x194/0x570
[ 104.033826] destroy_inode+0xb9/0x190
[ 104.033830] evict+0x4d8/0x900
[ 104.033832] ? lock_release+0x17b/0x2a0
[ 104.033835] ? find_held_lock+0x2b/0x80
[ 104.033839] ? destroy_inode+0x190/0x190
[ 104.033842] ? __list_lru_walk_one+0x30d/0x440
[ 104.033849] ? _raw_spin_unlock+0x29/0x40
[ 104.033851] ? __list_lru_walk_one+0x30d/0x440
[ 104.033854] ? __wait_on_freeing_inode+0x2a0/0x2a0
[ 104.033860] dispose_list+0xf0/0x1b0
[ 104.033866] prune_icache_sb+0xde/0x150
[ 104.033869] ? list_lru_count_one+0x13f/0x270
[ 104.033873] ? dump_mapping+0x250/0x250
[ 104.033875] ? lock_release+0x17b/0x2a0
[ 104.033882] super_cache_scan+0x302/0x4d0
[ 104.033889] do_shrink_slab+0x32e/0xd30
[ 104.033898] shrink_slab+0x7b6/0xda0
[ 104.033902] ? shrink_slab+0x4b1/0xda0
[ 104.033908] ? reparent_shrinker_deferred+0x330/0x330
[ 104.033914] ? trace_event_raw_event_sched_switch+0x410/0x410
[ 104.033921] shrink_node+0xac4/0x36e0
[ 104.033933] ? lru_gen_release_memcg+0x3c0/0x3c0
[ 104.033940] ? pgdat_balanced+0x15f/0x4b0
[ 104.033943] ? __cond_resched+0x23/0x30
[ 104.033950] ? balance_pgdat+0x739/0x1530
[ 104.033952] balance_pgdat+0x739/0x1530
[ 104.033960] ? shrink_node+0x36e0/0x36e0
[ 104.033962] ? __timer_delete_sync+0x177/0x240
[ 104.033966] ? __timer_delete_sync+0x177/0x240
[ 104.033970] ? _raw_spin_unlock_irqrestore+0x48/0x60
[ 104.033975] ? __lock_acquire+0x489/0x2600
[ 104.033979] ? call_timer_fn+0x3b0/0x3b0
[ 104.033981] ? schedule+0x2ba/0x390
[ 104.033990] ? lock_is_held_type+0xd5/0x130
[ 104.033997] ? kswapd+0x364/0x7f0
[ 104.034004] kswapd+0x445/0x7f0
[ 104.034010] ? balance_pgdat+0x1530/0x1530
[ 104.034013] ? _raw_spin_unlock_irqrestore+0x48/0x60
[ 104.034016] ? finish_wait+0x280/0x280
[ 104.034022] ? __kthread_parkme+0xb4/0x200
[ 104.034027] ? balance_pgdat+0x1530/0x1530
[ 104.034029] kthread+0x3ad/0x760
[ 104.034033] ? kthread_is_per_cpu+0xb0/0xb0
[ 104.034035] ? ret_from_fork+0x70/0x850
[ 104.034039] ? ret_from_fork+0x70/0x850
[ 104.034042] ? _raw_spin_unlock_irq+0x24/0x50
[ 104.034045] ? kthread_is_per_cpu+0xb0/0xb0
[ 104.034049] ret_from_fork+0x6dc/0x850
[ 104.034053] ? exit_thread+0x70/0x70
[ 104.034057] ? __switch_to+0x36f/0xd80
[ 104.034061] ? kthread_is_per_cpu+0xb0/0xb0
[ 104.034065] ret_from_fork_asm+0x11/0x20
[ 104.034077] </TASK>
[ 104.034078] Mem-Info:
[ 104.034111] active_anon:511 inactive_anon:2355672 isolated_anon:0
active_file:77595 inactive_file:204731 isolated_file:0
unevictable:7150 dirty:925 writeback:57
slab_reclaimable:20227 slab_unreclaimable:201840
mapped:121227 shmem:10197 pagetables:9634
sec_pagetables:733 bounce:0
kernel_misc_reclaimable:0
free:36223 free_pcp:529 free_cma:0
[ 104.034119] Node 0 active_anon:2044kB inactive_anon:9422688kB active_file:310380kB inactive_file:818924kB unevictable:28600kB isolated(anon):0kB isolated(file):0kB mapped:484908kB dirty:3700kB writeback:228kB shmem:40788kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:8534016kB kernel_stack:31616kB pagetables:38536kB sec_pagetables:2932kB all_unreclaimable? no Balloon:0kB
[ 104.034126] Node 0 DMA free:13316kB boost:0kB min:84kB low:104kB high:124kB reserved_highatomic:0KB free_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB zspages:0kB present:15996kB managed:15364kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 104.034135] lowmem_reserve[]: 0 2862 11990 11990 11990
[ 104.034147] Node 0 DMA32 free:52184kB boost:0kB min:15860kB low:19824kB high:23788kB reserved_highatomic:0KB free_highatomic:0KB active_anon:0kB inactive_anon:2871780kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB zspages:0kB present:2997084kB managed:2931416kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 104.034155] lowmem_reserve[]: 0 0 9127 9127 9127
[ 104.034166] Node 0 Normal free:79392kB boost:28672kB min:80308kB low:93216kB high:106124kB reserved_highatomic:2048KB free_highatomic:32KB active_anon:2044kB inactive_anon:6550896kB active_file:310380kB inactive_file:818924kB unevictable:28600kB writepending:4252kB zspages:0kB present:13077504kB managed:9346788kB mlocked:28600kB bounce:0kB free_pcp:2116kB local_pcp:0kB free_cma:0kB
[ 104.034174] lowmem_reserve[]: 0 0 0 0 0
[ 104.034185] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 13316kB
[ 104.034308] Node 0 DMA32: 3*4kB (U) 5*8kB (UM) 5*16kB (UM) 8*32kB (UM) 11*64kB (UM) 15*128kB (UM) 11*256kB (UM) 9*512kB (UM) 9*1024kB (UM) 0*2048kB 8*4096kB (UM) = 52420kB
[ 104.034348] Node 0 Normal: 1024*4kB (UMEH) 534*8kB (UEH) 409*16kB (UME) 1301*32kB (UME) 154*64kB (UME) 39*128kB (UME) 14*256kB (UM) 1*512kB (U) 2*1024kB (UM) 1*2048kB (U) 0*4096kB = 79584kB
[ 104.034390] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 104.034393] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 104.034396] 299766 total pagecache pages
[ 104.034398] 0 pages in swap cache
[ 104.034401] Free swap = 8387580kB
[ 104.034403] Total swap = 8387580kB
[ 104.034405] 4022646 pages RAM
[ 104.034407] 0 pages HighMem/MovableOnly
[ 104.034410] 949254 pages reserved
[ 104.034412] 0 pages hwpoisoned
The page allocation failures bisect to:
e47c897a29491ade20b27612fdd3107c39a07357 slab: add sheaves to most caches
#regzbot introduced: e47c897a29491ade20b27612fdd3107c39a07357
#regzbot title: kswapd0: page allocation failure
* Re: [REGRESSION] kswapd0: page allocation failure (bisected to "slab: add sheaves to most caches")
2026-02-22 21:36 [REGRESSION] kswapd0: page allocation failure (bisected to "slab: add sheaves to most caches") Chris Bainbridge
@ 2026-02-23 8:41 ` Harry Yoo
2026-02-23 11:12 ` Chris Bainbridge
0 siblings, 1 reply; 5+ messages in thread
From: Harry Yoo @ 2026-02-23 8:41 UTC
To: Chris Bainbridge
Cc: vbabka, surenb, hao.li, leitao, Liam.Howlett, zhao1.liu,
linux-kernel, linux-mm, linux-btrfs, regressions
On Sun, Feb 22, 2026 at 09:36:58PM +0000, Chris Bainbridge wrote:
> Hi,
>
> The latest mainline kernel (v6.19-11831-ga95f71ad3e2e) has page
> allocation failures when doing things like compiling a kernel. I can
> also reproduce this with a stress test like
> `stress-ng --vm 2 --vm-bytes 110% --verify -v`
Hi, thanks for the report!
> [ 104.032925] kswapd0: page allocation failure: order:0, mode:0xc0c40(GFP_NOFS|__GFP_COMP|__GFP_NOMEMALLOC), nodemask=(null),cpuset=/,mems_allowed=0
> [ 104.033307] CPU: 4 UID: 0 PID: 156 Comm: kswapd0 Not tainted 6.19.0-rc5-00027-g40fd0acc45d0 #435 PREEMPT(voluntary)
> [ 104.033312] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024
> [ 104.033314] Call Trace:
> [ 104.033316] <TASK>
> [ 104.033319] dump_stack_lvl+0x6a/0x90
> [ 104.033328] warn_alloc.cold+0x95/0x1af
> [ 104.033334] ? zone_watermark_ok+0x80/0x80
> [ 104.033350] __alloc_frozen_pages_noprof+0xec3/0x2470
> [ 104.033353] ? __lock_acquire+0x489/0x2600
> [ 104.033359] ? stack_access_ok+0x1c0/0x1c0
> [ 104.033367] ? warn_alloc+0x1d0/0x1d0
> [ 104.033371] ? __lock_acquire+0x489/0x2600
> [ 104.033375] ? _raw_spin_unlock_irqrestore+0x48/0x60
> [ 104.033379] ? _raw_spin_unlock_irqrestore+0x48/0x60
> [ 104.033382] ? lockdep_hardirqs_on+0x78/0x100
> [ 104.033394] allocate_slab+0x2b7/0x510
> [ 104.033399] refill_objects+0x25d/0x380
> [ 104.033407] __pcs_replace_empty_main+0x193/0x5f0
> [ 104.033412] kmem_cache_alloc_noprof+0x5b6/0x6f0
> [ 104.033415] ? alloc_extent_state+0x1b/0x210 [btrfs]
> [ 104.033479] alloc_extent_state+0x1b/0x210 [btrfs]
> [ 104.033527] btrfs_clear_extent_bit_changeset+0x2be/0x9c0 [btrfs]
Hmm, while the bisect points to
commit e47c897a2949 ("slab: add sheaves to most caches") as the first bad commit,
I think the caller is supposed to specify __GFP_NOWARN if it doesn't
care about allocation failure?
btrfs_clear_extent_bit_changeset() says:
> 	if (!prealloc) {
> 		/*
> 		 * Don't care for allocation failure here because we might end
> 		 * up not needing the pre-allocated extent state at all, which
> 		 * is the case if we only have in the tree extent states that
> 		 * cover our input range and don't cover too any other range.
> 		 * If we end up needing a new extent state we allocate it later.
> 		 */
> 		prealloc = alloc_extent_state(mask);
> 	}
Oh wait, I see what's going on. The bisection pointed to that commit
because slab tries to refill sheaves with __GFP_NOMEMALLOC (and then
falls back to the slowpath if that fails).
Since failing to refill sheaves doesn't mean the allocation will fail,
it should specify __GFP_NOWARN with __GFP_NOMEMALLOC as long as there's
a fallback method.
But for __prefill_sheaf_pfmemalloc(), it should specify __GFP_NOWARN on
the first attempt only when gfp_pfmemalloc_allowed() returns true.
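As a rough illustration of the policy described above (not the actual slab change), the flag handling could be sketched as below. The DEMO_* bit values and demo_refill_gfp() are hypothetical stand-ins invented for this sketch; the real flags come from the kernel's GFP headers. The has_fallback argument stands for whatever the refill path knows about its fallback, for example the result of gfp_pfmemalloc_allowed() in the __prefill_sheaf_pfmemalloc() case.

```c
#include <stdbool.h>

/* Illustrative bit values only; these are NOT the kernel's definitions. */
typedef unsigned int gfp_t;
#define DEMO_GFP_NOWARN     (1u << 0)
#define DEMO_GFP_NOMEMALLOC (1u << 1)

/*
 * Hypothetical helper: compute the flags for an opportunistic sheaf refill.
 * The refill must not dip into memory reserves, and it stays quiet only
 * when a failed refill can still be retried through a fallback path.
 */
static gfp_t demo_refill_gfp(gfp_t gfp, bool has_fallback)
{
	gfp |= DEMO_GFP_NOMEMALLOC;
	if (has_fallback)
		gfp |= DEMO_GFP_NOWARN;
	return gfp;
}
```

With no fallback available, the warning is kept so that a failure which really does propagate to the caller still gets logged.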
--
Cheers,
Harry / Hyeonggon
* Re: [REGRESSION] kswapd0: page allocation failure (bisected to "slab: add sheaves to most caches")
2026-02-23 8:41 ` Harry Yoo
@ 2026-02-23 11:12 ` Chris Bainbridge
2026-02-23 11:59 ` Harry Yoo
0 siblings, 1 reply; 5+ messages in thread
From: Chris Bainbridge @ 2026-02-23 11:12 UTC
To: Harry Yoo
Cc: vbabka, surenb, hao.li, leitao, Liam.Howlett, zhao1.liu,
linux-kernel, linux-mm, linux-btrfs, regressions
On Mon, Feb 23, 2026 at 05:41:17PM +0900, Harry Yoo wrote:
> On Sun, Feb 22, 2026 at 09:36:58PM +0000, Chris Bainbridge wrote:
> > Hi,
> >
> > The latest mainline kernel (v6.19-11831-ga95f71ad3e2e) has page
> > allocation failures when doing things like compiling a kernel. I can
> > also reproduce this with a stress test like
> > `stress-ng --vm 2 --vm-bytes 110% --verify -v`
>
> Hi, thanks for the report!
>
> > [ 104.032925] kswapd0: page allocation failure: order:0, mode:0xc0c40(GFP_NOFS|__GFP_COMP|__GFP_NOMEMALLOC), nodemask=(null),cpuset=/,mems_allowed=0
> > [ 104.033307] CPU: 4 UID: 0 PID: 156 Comm: kswapd0 Not tainted 6.19.0-rc5-00027-g40fd0acc45d0 #435 PREEMPT(voluntary)
> > [ 104.033312] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024
> > [ 104.033314] Call Trace:
> > [ 104.033316] <TASK>
> > [ 104.033319] dump_stack_lvl+0x6a/0x90
> > [ 104.033328] warn_alloc.cold+0x95/0x1af
> > [ 104.033334] ? zone_watermark_ok+0x80/0x80
> > [ 104.033350] __alloc_frozen_pages_noprof+0xec3/0x2470
> > [ 104.033353] ? __lock_acquire+0x489/0x2600
> > [ 104.033359] ? stack_access_ok+0x1c0/0x1c0
> > [ 104.033367] ? warn_alloc+0x1d0/0x1d0
> > [ 104.033371] ? __lock_acquire+0x489/0x2600
> > [ 104.033375] ? _raw_spin_unlock_irqrestore+0x48/0x60
> > [ 104.033379] ? _raw_spin_unlock_irqrestore+0x48/0x60
> > [ 104.033382] ? lockdep_hardirqs_on+0x78/0x100
> > [ 104.033394] allocate_slab+0x2b7/0x510
> > [ 104.033399] refill_objects+0x25d/0x380
> > [ 104.033407] __pcs_replace_empty_main+0x193/0x5f0
> > [ 104.033412] kmem_cache_alloc_noprof+0x5b6/0x6f0
> > [ 104.033415] ? alloc_extent_state+0x1b/0x210 [btrfs]
> > [ 104.033479] alloc_extent_state+0x1b/0x210 [btrfs]
> > [ 104.033527] btrfs_clear_extent_bit_changeset+0x2be/0x9c0 [btrfs]
>
> Hmm while bisect points out the first bad commit is
> commit e47c897a2949 ("slab: add sheaves to most caches"),
>
> I think the caller is supposed to specify __GFP_NOWARN if it doesn't
> care about allocation failure?
>
> btrfs_clear_extent_bit_changeset() says:
> > if (!prealloc) {
> > /*
> > * Don't care for allocation failure here because we might end
> > * up not needing the pre-allocated extent state at all, which
> > * is the case if we only have in the tree extent states that
> > * cover our input range and don't cover too any other range.
> > * If we end up needing a new extent state we allocate it later.
> > */
> > prealloc = alloc_extent_state(mask);
> > }
>
> Oh wait, I see what's going on. bisection pointed out the commit
> because slab tries to refill sheaves with __GFP_NOMEMALLOC (and then
> falls back to slowpath if it fails).
>
> Since failing to refill sheaves doesn't mean the allocation will fail,
> it should specify __GFP_NOWARN with __GFP_NOMEMALLOC as long as there's
> a fallback method.
>
> But for __prefill_sheaf_pfmemalloc(), it should specify __GFP_NOWARN on
> the first attempt only when gfp_pfmemalloc_allowed() returns true.
Is this fix sufficient to do the right thing? I tested it, and it does
appear to prevent logging of the allocation failures for my test case.
diff --git a/fs/btrfs/extent-io-tree.c b/fs/btrfs/extent-io-tree.c
index d0dd50f7d279..d2e1083848e8 100644
--- a/fs/btrfs/extent-io-tree.c
+++ b/fs/btrfs/extent-io-tree.c
@@ -641,7 +641,7 @@ int btrfs_clear_extent_bit_changeset(struct extent_io_tree *tree, u64 start, u64
 		 * cover our input range and don't cover too any other range.
 		 * If we end up needing a new extent state we allocate it later.
 		 */
-		prealloc = alloc_extent_state(mask);
+		prealloc = alloc_extent_state(mask | __GFP_NOWARN);
 	}
 
 	spin_lock(&tree->lock);
* Re: [REGRESSION] kswapd0: page allocation failure (bisected to "slab: add sheaves to most caches")
2026-02-23 11:12 ` Chris Bainbridge
@ 2026-02-23 11:59 ` Harry Yoo
2026-02-23 20:30 ` David Sterba
0 siblings, 1 reply; 5+ messages in thread
From: Harry Yoo @ 2026-02-23 11:59 UTC
To: Chris Bainbridge
Cc: vbabka, surenb, hao.li, leitao, Liam.Howlett, zhao1.liu,
linux-kernel, linux-mm, linux-btrfs, regressions
On Mon, Feb 23, 2026 at 11:12:47AM +0000, Chris Bainbridge wrote:
> On Mon, Feb 23, 2026 at 05:41:17PM +0900, Harry Yoo wrote:
> > On Sun, Feb 22, 2026 at 09:36:58PM +0000, Chris Bainbridge wrote:
> > > Hi,
> > >
> > > The latest mainline kernel (v6.19-11831-ga95f71ad3e2e) has page
> > > allocation failures when doing things like compiling a kernel. I can
> > > also reproduce this with a stress test like
> > > `stress-ng --vm 2 --vm-bytes 110% --verify -v`
> >
> > Hi, thanks for the report!
> >
> > > [ 104.032925] kswapd0: page allocation failure: order:0, mode:0xc0c40(GFP_NOFS|__GFP_COMP|__GFP_NOMEMALLOC), nodemask=(null),cpuset=/,mems_allowed=0
> > > [ 104.033307] CPU: 4 UID: 0 PID: 156 Comm: kswapd0 Not tainted 6.19.0-rc5-00027-g40fd0acc45d0 #435 PREEMPT(voluntary)
> > > [ 104.033312] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024
> > > [ 104.033314] Call Trace:
> > > [ 104.033316] <TASK>
> > > [ 104.033319] dump_stack_lvl+0x6a/0x90
> > > [ 104.033328] warn_alloc.cold+0x95/0x1af
> > > [ 104.033334] ? zone_watermark_ok+0x80/0x80
> > > [ 104.033350] __alloc_frozen_pages_noprof+0xec3/0x2470
> > > [ 104.033353] ? __lock_acquire+0x489/0x2600
> > > [ 104.033359] ? stack_access_ok+0x1c0/0x1c0
> > > [ 104.033367] ? warn_alloc+0x1d0/0x1d0
> > > [ 104.033371] ? __lock_acquire+0x489/0x2600
> > > [ 104.033375] ? _raw_spin_unlock_irqrestore+0x48/0x60
> > > [ 104.033379] ? _raw_spin_unlock_irqrestore+0x48/0x60
> > > [ 104.033382] ? lockdep_hardirqs_on+0x78/0x100
> > > [ 104.033394] allocate_slab+0x2b7/0x510
> > > [ 104.033399] refill_objects+0x25d/0x380
> > > [ 104.033407] __pcs_replace_empty_main+0x193/0x5f0
> > > [ 104.033412] kmem_cache_alloc_noprof+0x5b6/0x6f0
> > > [ 104.033415] ? alloc_extent_state+0x1b/0x210 [btrfs]
> > > [ 104.033479] alloc_extent_state+0x1b/0x210 [btrfs]
> > > [ 104.033527] btrfs_clear_extent_bit_changeset+0x2be/0x9c0 [btrfs]
> >
> > Hmm while bisect points out the first bad commit is
> > commit e47c897a2949 ("slab: add sheaves to most caches"),
> >
> > I think the caller is supposed to specify __GFP_NOWARN if it doesn't
> > care about allocation failure?
> >
> > btrfs_clear_extent_bit_changeset() says:
> > > if (!prealloc) {
> > > /*
> > > * Don't care for allocation failure here because we might end
> > > * up not needing the pre-allocated extent state at all, which
> > > * is the case if we only have in the tree extent states that
> > > * cover our input range and don't cover too any other range.
> > > * If we end up needing a new extent state we allocate it later.
> > > */
> > > prealloc = alloc_extent_state(mask);
> > > }
> >
> > Oh wait, I see what's going on. bisection pointed out the commit
> > because slab tries to refill sheaves with __GFP_NOMEMALLOC (and then
> > falls back to slowpath if it fails).
> >
> > Since failing to refill sheaves doesn't mean the allocation will fail,
> > it should specify __GFP_NOWARN with __GFP_NOMEMALLOC as long as there's
> > a fallback method.
> >
> > But for __prefill_sheaf_pfmemalloc(), it should specify __GFP_NOWARN on
> > the first attempt only when gfp_pfmemalloc_allowed() returns true.
>
> Is this fix sufficient to do the right thing? I tested it, and it does
> appear to prevent logging of the allocation failures for my test case.
I think we should do both: 1) set __GFP_NOWARN on the btrfs side,
and 2) make slab try to refill sheaves with __GFP_NOWARN when
there's a fallback path.
I'm writing a fix for 2) and I'll send it soon.
> diff --git a/fs/btrfs/extent-io-tree.c b/fs/btrfs/extent-io-tree.c
> index d0dd50f7d279..d2e1083848e8 100644
> --- a/fs/btrfs/extent-io-tree.c
> +++ b/fs/btrfs/extent-io-tree.c
> @@ -641,7 +641,7 @@ int btrfs_clear_extent_bit_changeset(struct extent_io_tree *tree, u64 start, u64
> * cover our input range and don't cover too any other range.
> * If we end up needing a new extent state we allocate it later.
> */
> - prealloc = alloc_extent_state(mask);
> + prealloc = alloc_extent_state(mask | __GFP_NOWARN);
This seems like the right thing to do to me, but as I'm not familiar
with btrfs, I'll let the btrfs folks comment on it :)
> }
>
> spin_lock(&tree->lock);
--
Cheers,
Harry / Hyeonggon
* Re: [REGRESSION] kswapd0: page allocation failure (bisected to "slab: add sheaves to most caches")
2026-02-23 11:59 ` Harry Yoo
@ 2026-02-23 20:30 ` David Sterba
0 siblings, 0 replies; 5+ messages in thread
From: David Sterba @ 2026-02-23 20:30 UTC
To: Harry Yoo
Cc: Chris Bainbridge, vbabka, surenb, hao.li, leitao, Liam.Howlett,
zhao1.liu, linux-kernel, linux-mm, linux-btrfs, regressions
On Mon, Feb 23, 2026 at 08:59:30PM +0900, Harry Yoo wrote:
> On Mon, Feb 23, 2026 at 11:12:47AM +0000, Chris Bainbridge wrote:
> > On Mon, Feb 23, 2026 at 05:41:17PM +0900, Harry Yoo wrote:
> > > On Sun, Feb 22, 2026 at 09:36:58PM +0000, Chris Bainbridge wrote:
> > > > Hi,
> > > >
> > > > The latest mainline kernel (v6.19-11831-ga95f71ad3e2e) has page
> > > > allocation failures when doing things like compiling a kernel. I can
> > > > also reproduce this with a stress test like
> > > > `stress-ng --vm 2 --vm-bytes 110% --verify -v`
> > >
> > > Hi, thanks for the report!
> > >
> > > > [ 104.032925] kswapd0: page allocation failure: order:0, mode:0xc0c40(GFP_NOFS|__GFP_COMP|__GFP_NOMEMALLOC), nodemask=(null),cpuset=/,mems_allowed=0
> > > > [ 104.033307] CPU: 4 UID: 0 PID: 156 Comm: kswapd0 Not tainted 6.19.0-rc5-00027-g40fd0acc45d0 #435 PREEMPT(voluntary)
> > > > [ 104.033312] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024
> > > > [ 104.033314] Call Trace:
> > > > [ 104.033316] <TASK>
> > > > [ 104.033319] dump_stack_lvl+0x6a/0x90
> > > > [ 104.033328] warn_alloc.cold+0x95/0x1af
> > > > [ 104.033334] ? zone_watermark_ok+0x80/0x80
> > > > [ 104.033350] __alloc_frozen_pages_noprof+0xec3/0x2470
> > > > [ 104.033353] ? __lock_acquire+0x489/0x2600
> > > > [ 104.033359] ? stack_access_ok+0x1c0/0x1c0
> > > > [ 104.033367] ? warn_alloc+0x1d0/0x1d0
> > > > [ 104.033371] ? __lock_acquire+0x489/0x2600
> > > > [ 104.033375] ? _raw_spin_unlock_irqrestore+0x48/0x60
> > > > [ 104.033379] ? _raw_spin_unlock_irqrestore+0x48/0x60
> > > > [ 104.033382] ? lockdep_hardirqs_on+0x78/0x100
> > > > [ 104.033394] allocate_slab+0x2b7/0x510
> > > > [ 104.033399] refill_objects+0x25d/0x380
> > > > [ 104.033407] __pcs_replace_empty_main+0x193/0x5f0
> > > > [ 104.033412] kmem_cache_alloc_noprof+0x5b6/0x6f0
> > > > [ 104.033415] ? alloc_extent_state+0x1b/0x210 [btrfs]
> > > > [ 104.033479] alloc_extent_state+0x1b/0x210 [btrfs]
> > > > [ 104.033527] btrfs_clear_extent_bit_changeset+0x2be/0x9c0 [btrfs]
> > >
> > > Hmm while bisect points out the first bad commit is
> > > commit e47c897a2949 ("slab: add sheaves to most caches"),
> > >
> > > I think the caller is supposed to specify __GFP_NOWARN if it doesn't
> > > care about allocation failure?
> > >
> > > btrfs_clear_extent_bit_changeset() says:
> > > > if (!prealloc) {
> > > > /*
> > > > * Don't care for allocation failure here because we might end
> > > > * up not needing the pre-allocated extent state at all, which
> > > > * is the case if we only have in the tree extent states that
> > > > * cover our input range and don't cover too any other range.
> > > > * If we end up needing a new extent state we allocate it later.
> > > > */
> > > > prealloc = alloc_extent_state(mask);
> > > > }
> > >
> > > Oh wait, I see what's going on. bisection pointed out the commit
> > > because slab tries to refill sheaves with __GFP_NOMEMALLOC (and then
> > > falls back to slowpath if it fails).
> > >
> > > Since failing to refill sheaves doesn't mean the allocation will fail,
> > > it should specify __GFP_NOWARN with __GFP_NOMEMALLOC as long as there's
> > > a fallback method.
> > >
> > > But for __prefill_sheaf_pfmemalloc(), it should specify __GFP_NOWARN on
> > > the first attempt only when gfp_pfmemalloc_allowed() returns true.
> >
> > Is this fix sufficient to do the right thing? I tested it, and it does
> > appear to prevent logging of the allocation failures for my test case.
>
> I think we should do both: 1) set __GFP_NOWARN on the btrfs side,
> and 2) make slab try to refill sheaves with __GFP_NOWARN when
> there's a fallback path.
>
> I'm writing a fix for 2) and I'll send it soon.
>
> > diff --git a/fs/btrfs/extent-io-tree.c b/fs/btrfs/extent-io-tree.c
> > index d0dd50f7d279..d2e1083848e8 100644
> > --- a/fs/btrfs/extent-io-tree.c
> > +++ b/fs/btrfs/extent-io-tree.c
> > @@ -641,7 +641,7 @@ int btrfs_clear_extent_bit_changeset(struct extent_io_tree *tree, u64 start, u64
> > * cover our input range and don't cover too any other range.
> > * If we end up needing a new extent state we allocate it later.
> > */
> > - prealloc = alloc_extent_state(mask);
> > + prealloc = alloc_extent_state(mask | __GFP_NOWARN);
>
> This seems like the right thing to do to me, but as I'm not familiar
> with btrfs, I'll let the btrfs folks comment on it :)
I agree the flag should be added; as the comment explains, allocation
failures are not fatal at this place. There's another call to
alloc_extent_state() with GFP_ATOMIC, so we cannot simply sink
__GFP_NOWARN into the function itself.
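To make the distinction concrete, here is a minimal userspace sketch of keeping the flag at the call site rather than inside the allocator wrapper. Everything in it is hypothetical: the DEMO_* bit values are invented, demo_alloc_mask() only stands in for alloc_extent_state() by returning the mask it would allocate with, and demo_prealloc_mask() stands in for the opportunistic preallocation in btrfs_clear_extent_bit_changeset().

```c
/* Illustrative bit values only; these are NOT the kernel's definitions. */
typedef unsigned int gfp_t;
#define DEMO_GFP_NOWARN (1u << 0)
#define DEMO_GFP_NOFS   (1u << 1)
#define DEMO_GFP_ATOMIC (1u << 2)

/* Stand-in for alloc_extent_state(): uses the caller's mask unchanged. */
static gfp_t demo_alloc_mask(gfp_t mask)
{
	return mask;
}

/*
 * Stand-in for the opportunistic preallocation: a failure here is
 * tolerated, so only this call site silences the warning.
 */
static gfp_t demo_prealloc_mask(gfp_t mask)
{
	return demo_alloc_mask(mask | DEMO_GFP_NOWARN);
}
```

A caller passing DEMO_GFP_ATOMIC straight to demo_alloc_mask() keeps its warning, which is the behavior that would be lost if the flag were added inside the wrapper.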