* scheduling while atomic on rc3 - migration + buffer heads
From: Kent Overstreet @ 2025-04-21 15:14 UTC (permalink / raw)
To: linux-mm, linux-ext4, linux-fsdevel
This just popped up in one of my test runs.
Given that it's buffer heads, it has to be the ext4 root filesystem, not
bcachefs.
00465 ========= TEST lz4_buffered
00465
00465 WATCHDOG 360
00466 bcachefs (vdb): starting version 1.25: extent_flags opts=errors=panic,compression=lz4
00466 bcachefs (vdb): initializing new filesystem
00466 bcachefs (vdb): going read-write
00466 bcachefs (vdb): marking superblocks
00466 bcachefs (vdb): initializing freespace
00466 bcachefs (vdb): done initializing freespace
00466 bcachefs (vdb): reading snapshots table
00466 bcachefs (vdb): reading snapshots done
00466 bcachefs (vdb): done starting filesystem
00466 starting copy
00515 BUG: sleeping function called from invalid context at mm/util.c:743
00515 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 120, name: kcompactd0
00515 preempt_count: 1, expected: 0
00515 RCU nest depth: 0, expected: 0
00515 1 lock held by kcompactd0/120:
00515 #0: ffffff80c0c558f0 (&mapping->i_private_lock){+.+.}-{3:3}, at: __buffer_migrate_folio+0x114/0x298
00515 Preemption disabled at:
00515 [<ffffffc08025fa84>] __buffer_migrate_folio+0x114/0x298
00515 CPU: 11 UID: 0 PID: 120 Comm: kcompactd0 Not tainted 6.15.0-rc3-ktest-gb2a78fdf7d2f #20530 PREEMPT
00515 Hardware name: linux,dummy-virt (DT)
00515 Call trace:
00515 show_stack+0x1c/0x30 (C)
00515 dump_stack_lvl+0xb0/0xc0
00515 dump_stack+0x14/0x20
00515 __might_resched+0x180/0x288
00515 folio_mc_copy+0x54/0x98
00515 __migrate_folio.isra.0+0x68/0x168
00515 __buffer_migrate_folio+0x280/0x298
00515 buffer_migrate_folio_norefs+0x18/0x28
00515 migrate_pages_batch+0x94c/0xeb8
00515 migrate_pages_sync+0x84/0x240
00515 migrate_pages+0x284/0x698
00515 compact_zone+0xa40/0x10f8
00515 kcompactd_do_work+0x204/0x498
00515 kcompactd+0x3c4/0x400
00515 kthread+0x13c/0x208
00515 ret_from_fork+0x10/0x20
00518 starting sync
00519 starting rm
00520 ========= FAILED TIMEOUT lz4_buffered in 360s
* Re: scheduling while atomic on rc3 - migration + buffer heads
From: Raghavendra K T @ 2025-04-21 15:47 UTC (permalink / raw)
To: Kent Overstreet, linux-mm, linux-ext4, linux-fsdevel; +Cc: wqu
On 4/21/2025 8:44 PM, Kent Overstreet wrote:
+Qu, as I saw a similar report from him
> This just popped up in one of my test runs.
>
> Given that it's buffer heads, it has to be the ext4 root filesystem, not
> bcachefs.
>
I have also seen a similar stack with folio_mc_copy() while testing
the PTE A bit patches.
IIUC, it has something to do with cond_resched() being called from
folio_mc_copy().
Thomas (tglx) mentioned a long time ago that cond_resched() has no
scope awareness. I am not sure where the fix should go in these
cases.
(I mean, should the caller of migrate_folio call it with a mutex held
rather than a spinlock?)
Regards
- Raghu
* Re: scheduling while atomic on rc3 - migration + buffer heads
From: Kent Overstreet @ 2025-04-21 15:55 UTC (permalink / raw)
To: Raghavendra K T; +Cc: linux-mm, linux-ext4, linux-fsdevel, wqu
On Mon, Apr 21, 2025 at 09:17:18PM +0530, Raghavendra K T wrote:
> On 4/21/2025 8:44 PM, Kent Overstreet wrote:
>
> +Qu, as I saw a similar report from him
>
> > This just popped up in one of my test runs.
> >
> > Given that it's buffer heads, it has to be the ext4 root filesystem, not
> > bcachefs.
> >
>
> I have also seen a similar stack with folio_mc_copy() while testing
> the PTE A bit patches.
>
> IIUC, it has something to do with cond_resched() being called from
> folio_mc_copy().
>
> Thomas (tglx) mentioned a long time ago that cond_resched() has no
> scope awareness. I am not sure where the fix should go in these
> cases.
That's true, calling cond_resched() while a spinlock is held is a bug.
> (I mean, should the caller of migrate_folio call it with a mutex held
> rather than a spinlock?)
Yes. migrate_folio() does large data copies, so we don't want all that
running in atomic context.
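
For readers following along, here is a toy userspace model of the check
that fires in the splat above. It is only a sketch, not kernel code: the
names Task, spin_lock, mutex_lock, copy_folio and the two migrate_*
helpers are invented for illustration, and the per-page cond_resched()
call is an assumption based on the call trace (folio_mc_copy ->
__might_resched). The model treats a spinlock as raising preempt_count
(as on non-RT kernels) and a mutex as leaving it alone, which is enough
to reproduce "preempt_count: 1, expected: 0".

```python
from contextlib import contextmanager


class SleepInAtomicError(RuntimeError):
    """Stands in for 'BUG: sleeping function called from invalid context'."""


class Task:
    """Toy stand-in for a kernel task; it only tracks preempt_count."""

    def __init__(self, name):
        self.name = name
        self.preempt_count = 0


@contextmanager
def spin_lock(task):
    # Model only: holding a spinlock disables preemption (non-RT kernels).
    task.preempt_count += 1
    try:
        yield
    finally:
        task.preempt_count -= 1


@contextmanager
def mutex_lock(task):
    # Model only: a sleeping lock leaves preempt_count untouched.
    yield


def cond_resched(task):
    # Mirrors the condition reported in the splat: preempt_count must be 0.
    if task.preempt_count != 0:
        raise SleepInAtomicError(
            f"{task.name}: preempt_count: {task.preempt_count}, expected: 0"
        )


def copy_folio(task, src_pages):
    # Copy page by page and offer to reschedule after each page
    # (assumed shape of folio_mc_copy(), inferred from the trace).
    dst_pages = []
    for page in src_pages:
        dst_pages.append(bytes(page))
        cond_resched(task)
    return dst_pages


def migrate_under_spinlock(task, src_pages):
    # The pattern from the report: copy while the spinlock is held.
    with spin_lock(task):
        return copy_folio(task, src_pages)


def migrate_under_mutex(task, src_pages):
    # The alternative raised in the thread: hold a sleeping lock instead.
    with mutex_lock(task):
        return copy_folio(task, src_pages)
```

In this model migrate_under_mutex() completes, while
migrate_under_spinlock() raises on the first cond_resched(). Whether the
real fix is a different lock, or moving the copy outside the locked
region, is not settled in this thread.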
* Re: scheduling while atomic on rc3 - migration + buffer heads
From: Darrick J. Wong @ 2025-04-21 17:27 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-mm, linux-ext4, linux-fsdevel
On Mon, Apr 21, 2025 at 11:14:44AM -0400, Kent Overstreet wrote:
> This just popped up in one of my test runs.
>
> Given that it's buffer heads, it has to be the ext4 root filesystem, not
> bcachefs.
Wrong. udev calling libblkid reading the (mounted) bdev to figure out
there's a bcachefs filesystem will still create bufferheads, and
possibly very large ones.
willy's temporary workaround in
https://lore.kernel.org/linux-fsdevel/Z_VwF1MA-R7MgDVG@casper.infradead.org/
shuts all that up enough to move on to triaging the rest of the
bleeding.
--D