* [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
@ 2024-11-24 13:45 syzbot
  2024-11-24 21:26 ` Matthew Wilcox
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: syzbot @ 2024-11-24 13:45 UTC
  To: akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, syzkaller-bugs, willy

Hello,

syzbot found the following issue on:

HEAD commit:    228a1157fb9f Merge tag '6.13-rc-part1-SMB3-client-fixes' o..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13820530580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=402159daa216c89d
dashboard link: https://syzkaller.appspot.com/bug?extid=aac7bff85be224de5156
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13840778580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17840778580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/d32a8e8c5aae/disk-228a1157.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/28d5c070092e/vmlinux-228a1157.xz
kernel image: https://storage.googleapis.com/syzbot-assets/45af4bfd9e8e/bzImage-228a1157.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/69603aa12e8f/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+aac7bff85be224de5156@syzkaller.appspotmail.com

 __fput+0x5ba/0xa50 fs/file_table.c:458
 task_work_run+0x24f/0x310 kernel/task_work.c:239
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
------------[ cut here ]------------
kernel BUG at mm/page-writeback.c:3119!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 0 UID: 0 PID: 12 Comm: kworker/u8:1 Not tainted 6.12.0-syzkaller-08446-g228a1157fb9f #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: btrfs-delalloc btrfs_work_helper
RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
Code: 25 ff 0f 00 00 0f 84 d3 00 00 00 e8 14 ae c3 ff e9 ba f5 ff ff e8 0a ae c3 ff 4c 89 f7 48 c7 c6 00 2e 14 8c e8 8b 4f 0d 00 90 <0f> 0b e8 f3 ad c3 ff 4c 89 f7 48 c7 c6 60 34 14 8c e8 74 4f 0d 00
RSP: 0018:ffffc90000117500 EFLAGS: 00010246
RAX: ed413247a2060f00 RBX: 0000000000000002 RCX: 0000000000000001
RDX: dffffc0000000000 RSI: ffffffff8c0ad620 RDI: 0000000000000001
RBP: ffffc90000117670 R08: ffffffff942b2967 R09: 1ffffffff285652c
R10: dffffc0000000000 R11: fffffbfff285652d R12: 0000000000000000
R13: 1ffff92000022eac R14: ffffea0001cab940 R15: ffff888077139710
FS:  0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6661870000 CR3: 00000000792b2000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 process_one_folio fs/btrfs/extent_io.c:187 [inline]
 __process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
 submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
 submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
 run_ordered_work fs/btrfs/async-thread.c:245 [inline]
 btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310
 worker_thread+0x870/0xd30 kernel/workqueue.c:3391
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
Code: 25 ff 0f 00 00 0f 84 d3 00 00 00 e8 14 ae c3 ff e9 ba f5 ff ff e8 0a ae c3 ff 4c 89 f7 48 c7 c6 00 2e 14 8c e8 8b 4f 0d 00 90 <0f> 0b e8 f3 ad c3 ff 4c 89 f7 48 c7 c6 60 34 14 8c e8 74 4f 0d 00
RSP: 0018:ffffc90000117500 EFLAGS: 00010246
RAX: ed413247a2060f00 RBX: 0000000000000002 RCX: 0000000000000001
RDX: dffffc0000000000 RSI: ffffffff8c0ad620 RDI: 0000000000000001
RBP: ffffc90000117670 R08: ffffffff942b2967 R09: 1ffffffff285652c
R10: dffffc0000000000 R11: fffffbfff285652d R12: 0000000000000000
R13: 1ffff92000022eac R14: ffffea0001cab940 R15: ffff888077139710
FS:  0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055ec8463e668 CR3: 000000007ed5e000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup



* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-24 13:45 [syzbot] [btrfs?] kernel BUG in __folio_start_writeback syzbot
@ 2024-11-24 21:26 ` Matthew Wilcox
  2024-11-25  0:30   ` Qu Wenruo
  2024-11-26  6:42 ` Qu Wenruo
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Matthew Wilcox @ 2024-11-24 21:26 UTC
  To: syzbot
  Cc: akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, syzkaller-bugs

On Sun, Nov 24, 2024 at 05:45:18AM -0800, syzbot wrote:
> 
>  __fput+0x5ba/0xa50 fs/file_table.c:458
>  task_work_run+0x24f/0x310 kernel/task_work.c:239
>  resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
>  exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
>  exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
>  syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
>  do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f

This is:

        VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);

ie we've called __folio_start_writeback() on a folio which is already
under writeback.

Higher up in the trace, we have the useful information:

 page: refcount:6 mapcount:0 mapping:ffff888077139710 index:0x3 pfn:0x72ae5
 memcg:ffff888140adc000
 aops:btrfs_aops ino:105 dentry name(?):"file2"
 flags: 0xfff000000040ab(locked|waiters|uptodate|lru|private|writeback|node=0|zone=1|lastcpupid=0x7ff)
 raw: 00fff000000040ab ffffea0001c8f408 ffffea0000939708 ffff888077139710
 raw: 0000000000000003 0000000000000001 00000006ffffffff ffff888140adc000
 page dumped because: VM_BUG_ON_FOLIO(folio_test_writeback(folio))
 page_owner tracks the page as allocated
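
As a rough illustration of how the flags word in the dump above maps to the
printed names, here is a small userspace sketch (not kernel code). The bit
positions in the table are an assumption inferred from this dump itself
(0x40ab printed as locked|waiters|uptodate|lru|private|writeback); the real
layout depends on kernel version and config, and decode_folio_flags() is a
hypothetical helper, not a kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Assumed bit positions for this build, inferred from the dump:
 * only the six bits the kernel named are listed here.
 */
struct flag_name {
	unsigned int bit;
	const char *name;
};

static const struct flag_name flag_names[] = {
	{ 0, "locked" }, { 1, "writeback" }, { 3, "uptodate" },
	{ 5, "lru" }, { 7, "waiters" }, { 14, "private" },
};

/* Write the names of the known bits set in 'flags' into 'buf', '|'-separated. */
static void decode_folio_flags(unsigned long long flags, char *buf, size_t len)
{
	size_t i, used = 0;

	assert(buf && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
		if (!(flags & (1ULL << flag_names[i].bit)))
			continue;
		int n = snprintf(buf + used, len - used, "%s%s",
				 used ? "|" : "", flag_names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated: stop appending */
		used += (size_t)n;
	}
}
```

Passing 0xfff000000040ab reports both "locked" and "writeback" as set. The
high bits (node, zone, lastcpupid) are not in the table and are ignored.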

The interesting part of the page_owner stacktrace is:

  filemap_alloc_folio_noprof+0xdf/0x500
  __filemap_get_folio+0x446/0xbd0
  prepare_one_folio+0xb6/0xa20
  btrfs_buffered_write+0x6bd/0x1150
  btrfs_direct_write+0x52d/0xa30
  btrfs_do_write_iter+0x2a0/0x760
  do_iter_readv_writev+0x600/0x880
  vfs_writev+0x376/0xba0

(ie not very interesting)

> Workqueue: btrfs-delalloc btrfs_work_helper
> RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
> Call Trace:
>  <TASK>
>  process_one_folio fs/btrfs/extent_io.c:187 [inline]
>  __process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
>  submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
>  submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
>  run_ordered_work fs/btrfs/async-thread.c:245 [inline]
>  btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
>  process_one_work kernel/workqueue.c:3229 [inline]

This looks like a race?

process_one_folio() calls
btrfs_folio_clamp_set_writeback calls
btrfs_subpage_set_writeback:

        spin_lock_irqsave(&subpage->lock, flags);
        bitmap_set(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
        if (!folio_test_writeback(folio))
                folio_start_writeback(folio);
        spin_unlock_irqrestore(&subpage->lock, flags);

so somebody else set writeback after we tested for writeback here.
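
To make the suspected interleaving concrete, here is a minimal userspace
model (a sketch, not kernel code). It replays by hand one ordering in which
a second path sets writeback between the first path's test and its call.
struct model_folio, model_start_writeback() and replay_stale_check() are
hypothetical names, and whether such a second path exists in btrfs is
exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for a folio: only the writeback flag is modelled. */
struct model_folio {
	bool writeback;
};

/*
 * Model of __folio_start_writeback(): returns false where the kernel's
 * VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio) would fire.
 */
static bool model_start_writeback(struct model_folio *f)
{
	if (f->writeback)
		return false;	/* already under writeback */
	f->writeback = true;
	return true;
}

/*
 * Replay one interleaving: path A tests the flag, path B starts
 * writeback, then path A acts on its stale test result.
 */
static bool replay_stale_check(void)
{
	struct model_folio f = { .writeback = false };
	bool a_saw_clear = !f.writeback;		/* A: test */
	bool b_ok = model_start_writeback(&f);		/* B: sets the flag */
	bool a_ok = a_saw_clear ? model_start_writeback(&f) : true; /* A: act */

	return b_ok && a_ok;	/* false means the modelled BUG fired */
}
```

In this model the test in path A only helps if every path that can set
writeback is serialized against it by the same lock.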

One thing that comes to mind is that _usually_ we take folio_lock()
first, then start writeback, then call folio_unlock() and btrfs isn't
doing that here (afaict).  Maybe that's not the source of the bug?

If it is, should we have a VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio)
in __folio_start_writeback()?  Or is there somewhere that can't lock the
folio before starting writeback?
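
A userspace sketch of what the proposed extra check would look like next to
the existing one (not kernel code; struct lk_folio, enum wb_result and
model_start_writeback_checked() are hypothetical names for illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for a folio: lock and writeback state only. */
struct lk_folio {
	bool locked;
	bool writeback;
};

enum wb_result { WB_OK, WB_NOT_LOCKED, WB_ALREADY_WRITEBACK };

/* Model of __folio_start_writeback() with the proposed locked check added. */
static enum wb_result model_start_writeback_checked(struct lk_folio *f)
{
	if (!f->locked)
		return WB_NOT_LOCKED;		/* proposed: !folio_test_locked() */
	if (f->writeback)
		return WB_ALREADY_WRITEBACK;	/* existing: folio_test_writeback() */
	f->writeback = true;
	return WB_OK;
}
```

Note that the flags in the dump show the folio as both locked and under
writeback, so in this model the reported state maps to
WB_ALREADY_WRITEBACK rather than WB_NOT_LOCKED.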



* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-24 21:26 ` Matthew Wilcox
@ 2024-11-25  0:30   ` Qu Wenruo
  2024-11-25 10:44     ` Aleksandr Nogikh
  0 siblings, 1 reply; 15+ messages in thread
From: Qu Wenruo @ 2024-11-25  0:30 UTC
  To: Matthew Wilcox, syzbot
  Cc: akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, syzkaller-bugs



On 2024/11/25 07:56, Matthew Wilcox wrote:
> On Sun, Nov 24, 2024 at 05:45:18AM -0800, syzbot wrote:
>>
>>   __fput+0x5ba/0xa50 fs/file_table.c:458
>>   task_work_run+0x24f/0x310 kernel/task_work.c:239
>>   resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
>>   exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
>>   exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
>>   __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
>>   syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
>>   do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
>>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> This is:
> 
>          VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
> 
> ie we've called __folio_start_writeback() on a folio which is already
> under writeback.
> 
> Higher up in the trace, we have the useful information:
> 
>   page: refcount:6 mapcount:0 mapping:ffff888077139710 index:0x3 pfn:0x72ae5
>   memcg:ffff888140adc000
>   aops:btrfs_aops ino:105 dentry name(?):"file2"
>   flags: 0xfff000000040ab(locked|waiters|uptodate|lru|private|writeback|node=0|zone=1|lastcpupid=0x7ff)
>   raw: 00fff000000040ab ffffea0001c8f408 ffffea0000939708 ffff888077139710
>   raw: 0000000000000003 0000000000000001 00000006ffffffff ffff888140adc000
>   page dumped because: VM_BUG_ON_FOLIO(folio_test_writeback(folio))
>   page_owner tracks the page as allocated
> 
> The interesting part of the page_owner stacktrace is:
> 
>    filemap_alloc_folio_noprof+0xdf/0x500
>    __filemap_get_folio+0x446/0xbd0
>    prepare_one_folio+0xb6/0xa20
>    btrfs_buffered_write+0x6bd/0x1150
>    btrfs_direct_write+0x52d/0xa30
>    btrfs_do_write_iter+0x2a0/0x760
>    do_iter_readv_writev+0x600/0x880
>    vfs_writev+0x376/0xba0
> 
> (ie not very interesting)
> 
>> Workqueue: btrfs-delalloc btrfs_work_helper
>> RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
>> Call Trace:
>>   <TASK>
>>   process_one_folio fs/btrfs/extent_io.c:187 [inline]
>>   __process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
>>   submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
>>   submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
>>   run_ordered_work fs/btrfs/async-thread.c:245 [inline]
>>   btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
>>   process_one_work kernel/workqueue.c:3229 [inline]
> 
> This looks like a race?
> 
> process_one_folio() calls
> btrfs_folio_clamp_set_writeback calls
> btrfs_subpage_set_writeback:
> 
>          spin_lock_irqsave(&subpage->lock, flags);
>          bitmap_set(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
>          if (!folio_test_writeback(folio))
>                  folio_start_writeback(folio);
>          spin_unlock_irqrestore(&subpage->lock, flags);
> 
> so somebody else set writeback after we tested for writeback here.

The test VM is x86_64, so we never go into the subpage routine; we call
folio_start_writeback() directly.

> 
> One thing that comes to mind is that _usually_ we take folio_lock()
> first, then start writeback, then call folio_unlock() and btrfs isn't
> doing that here (afaict).  Maybe that's not the source of the bug?

The folio is still locked here: we hold the lock, do the submission, and
then unlock.

You can check extent_writepage(): on entry we check that the folio is
still locked.
Then extent_writepage_io() does the submission, setting folio writeback
inside submit_one_sector().
Finally the folio is unlocked at the end of extent_writepage(). That is
the path for uncompressed writes.

Async submission (compression) has a lot of special handling, but it
still keeps the folio locked, does the compression and submission, and
then unlocks; it just does all of that in another thread (this case).

So it looks like something goes wrong when ownership of the page cache
folios is transferred to the compression path, or some error path is
not handled properly.

Unfortunately I have not been able to reproduce this with the
reproducer...

Thanks,
Qu



> 
> If it is, should we have a VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio)
> in __folio_start_writeback()?  Or is there somewhere that can't lock the
> folio before starting writeback?
> 




* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-25  0:30   ` Qu Wenruo
@ 2024-11-25 10:44     ` Aleksandr Nogikh
  2024-11-26  8:43       ` Qu Wenruo
  0 siblings, 1 reply; 15+ messages in thread
From: Aleksandr Nogikh @ 2024-11-25 10:44 UTC
  To: Qu Wenruo
  Cc: Matthew Wilcox, syzbot, akpm, clm, dsterba, josef, linux-btrfs,
	linux-fsdevel, linux-kernel, linux-mm, syzkaller-bugs

On Mon, Nov 25, 2024 at 1:30 AM 'Qu Wenruo' via syzkaller-bugs
<syzkaller-bugs@googlegroups.com> wrote:
>
>
>
> On 2024/11/25 07:56, Matthew Wilcox wrote:
> > On Sun, Nov 24, 2024 at 05:45:18AM -0800, syzbot wrote:
> >>
> >>   __fput+0x5ba/0xa50 fs/file_table.c:458
> >>   task_work_run+0x24f/0x310 kernel/task_work.c:239
> >>   resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
> >>   exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
> >>   exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
> >>   __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
> >>   syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
> >>   do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
> >>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >
> > This is:
> >
> >          VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
> >
> > ie we've called __folio_start_writeback() on a folio which is already
> > under writeback.
> >
> > Higher up in the trace, we have the useful information:
> >
> >   page: refcount:6 mapcount:0 mapping:ffff888077139710 index:0x3 pfn:0x72ae5
> >   memcg:ffff888140adc000
> >   aops:btrfs_aops ino:105 dentry name(?):"file2"
> >   flags: 0xfff000000040ab(locked|waiters|uptodate|lru|private|writeback|node=0|zone=1|lastcpupid=0x7ff)
> >   raw: 00fff000000040ab ffffea0001c8f408 ffffea0000939708 ffff888077139710
> >   raw: 0000000000000003 0000000000000001 00000006ffffffff ffff888140adc000
> >   page dumped because: VM_BUG_ON_FOLIO(folio_test_writeback(folio))
> >   page_owner tracks the page as allocated
> >
> > The interesting part of the page_owner stacktrace is:
> >
> >    filemap_alloc_folio_noprof+0xdf/0x500
> >    __filemap_get_folio+0x446/0xbd0
> >    prepare_one_folio+0xb6/0xa20
> >    btrfs_buffered_write+0x6bd/0x1150
> >    btrfs_direct_write+0x52d/0xa30
> >    btrfs_do_write_iter+0x2a0/0x760
> >    do_iter_readv_writev+0x600/0x880
> >    vfs_writev+0x376/0xba0
> >
> > (ie not very interesting)
> >
> >> Workqueue: btrfs-delalloc btrfs_work_helper
> >> RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
> >> Call Trace:
> >>   <TASK>
> >>   process_one_folio fs/btrfs/extent_io.c:187 [inline]
> >>   __process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
> >>   submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
> >>   submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
> >>   run_ordered_work fs/btrfs/async-thread.c:245 [inline]
> >>   btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
> >>   process_one_work kernel/workqueue.c:3229 [inline]
> >
> > This looks like a race?
> >
> > process_one_folio() calls
> > btrfs_folio_clamp_set_writeback calls
> > btrfs_subpage_set_writeback:
> >
> >          spin_lock_irqsave(&subpage->lock, flags);
> >          bitmap_set(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
> >          if (!folio_test_writeback(folio))
> >                  folio_start_writeback(folio);
> >          spin_unlock_irqrestore(&subpage->lock, flags);
> >
> > so somebody else set writeback after we tested for writeback here.
>
> The test VM is x86_64, so we never go into the subpage routine; we call
> folio_start_writeback() directly.
>
> >
> > One thing that comes to mind is that _usually_ we take folio_lock()
> > first, then start writeback, then call folio_unlock() and btrfs isn't
> > doing that here (afaict).  Maybe that's not the source of the bug?
>
> The folio is still locked here: we hold the lock, do the submission, and
> then unlock.
>
> You can check extent_writepage(): on entry we check that the folio is
> still locked.
> Then extent_writepage_io() does the submission, setting folio writeback
> inside submit_one_sector().
> Finally the folio is unlocked at the end of extent_writepage(). That is
> the path for uncompressed writes.
>
> Async submission (compression) has a lot of special handling, but it
> still keeps the folio locked, does the compression and submission, and
> then unlocks; it just does all of that in another thread (this case).
>
> So it looks like something goes wrong when ownership of the page cache
> folios is transferred to the compression path, or some error path is
> not handled properly.
>
> Unfortunately I have not been able to reproduce this with the
> reproducer...

I've just tried to reproduce locally using the downloadable assets and
the kernel crashed after about 1 minute of running the attached C repro.

[   87.616440][ T9044] ------------[ cut here ]------------
[   87.617126][ T9044] kernel BUG at mm/page-writeback.c:3119!
[   87.619308][ T9044] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[   87.620174][ T9044] CPU: 1 UID: 0 PID: 9044 Comm: kworker/u10:6 Not tainted 6.12.0-syzkaller-08446-g228a1157fb9f #0

Here are the instructions I followed:
https://github.com/google/syzkaller/blob/master/docs/syzbot_assets.md#run-a-c-reproducer

-- 
Aleksandr

>
> Thanks,
> Qu
>
>
>
> >
> > If it is, should we have a VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio)
> > in __folio_start_writeback()?  Or is there somewhere that can't lock the
> > folio before starting writeback?
> >
>



* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-24 13:45 [syzbot] [btrfs?] kernel BUG in __folio_start_writeback syzbot
  2024-11-24 21:26 ` Matthew Wilcox
@ 2024-11-26  6:42 ` Qu Wenruo
  2024-11-26  7:35   ` syzbot
  2024-11-28 18:56 ` syzbot
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Qu Wenruo @ 2024-11-26  6:42 UTC
  To: syzbot, akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, syzkaller-bugs, willy

#syz test: https://github.com/btrfs/linux.git for-next





* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-26  6:42 ` Qu Wenruo
@ 2024-11-26  7:35   ` syzbot
  0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2024-11-26  7:35 UTC
  To: akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, syzkaller-bugs, willy, wqu

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
kernel BUG in __folio_start_writeback

 do_group_exit+0x207/0x2c0 kernel/exit.c:1088
 get_signal+0x16a3/0x1740 kernel/signal.c:2918
 arch_do_signal_or_restart+0x96/0x860 arch/x86/kernel/signal.c:337
 exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0xc9/0x370 kernel/entry/common.c:218
 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
------------[ cut here ]------------
kernel BUG at mm/page-writeback.c:3119!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 1 UID: 0 PID: 3538 Comm: kworker/u8:10 Not tainted 6.12.0-rc7-syzkaller-00132-g21865e0dd679 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: btrfs-delalloc btrfs_work_helper
RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
Code: 25 ff 0f 00 00 0f 84 d3 00 00 00 e8 14 79 c4 ff e9 ba f5 ff ff e8 0a 79 c4 ff 4c 89 f7 48 c7 c6 c0 0e f4 8b e8 6b 46 0d 00 90 <0f> 0b e8 f3 78 c4 ff 4c 89 f7 48 c7 c6 20 15 f4 8b e8 54 46 0d 00
RSP: 0018:ffffc9000ca9f500 EFLAGS: 00010246
RAX: 258fc5bd6608dc00 RBX: 0000000000000002 RCX: 0000000000000001
RDX: dffffc0000000000 RSI: ffffffff8beacb20 RDI: 0000000000000001
RBP: ffffc9000ca9f670 R08: ffffffff94059917 R09: 1ffffffff280b322
R10: dffffc0000000000 R11: fffffbfff280b323 R12: 0000000000000000
R13: 1ffff92001953eac R14: ffffea0001c40500 R15: ffff888073b564f8
FS:  0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000c0002adb80 CR3: 0000000027072000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 process_one_folio fs/btrfs/extent_io.c:187 [inline]
 __process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
 submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
 submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
 run_ordered_work fs/btrfs/async-thread.c:245 [inline]
 btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310
 worker_thread+0x870/0xd30 kernel/workqueue.c:3391
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
Code: 25 ff 0f 00 00 0f 84 d3 00 00 00 e8 14 79 c4 ff e9 ba f5 ff ff e8 0a 79 c4 ff 4c 89 f7 48 c7 c6 c0 0e f4 8b e8 6b 46 0d 00 90 <0f> 0b e8 f3 78 c4 ff 4c 89 f7 48 c7 c6 20 15 f4 8b e8 54 46 0d 00
RSP: 0018:ffffc9000ca9f500 EFLAGS: 00010246
RAX: 258fc5bd6608dc00 RBX: 0000000000000002 RCX: 0000000000000001
RDX: dffffc0000000000 RSI: ffffffff8beacb20 RDI: 0000000000000001
RBP: ffffc9000ca9f670 R08: ffffffff94059917 R09: 1ffffffff280b322
R10: dffffc0000000000 R11: fffffbfff280b323 R12: 0000000000000000
R13: 1ffff92001953eac R14: ffffea0001c40500 R15: ffff888073b564f8
FS:  0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fabe0e31440 CR3: 0000000032718000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


Tested on:

commit:         21865e0d btrfs: use PTR_ERR() instead of PTR_ERR_OR_ZE..
git tree:       https://github.com/btrfs/linux.git for-next
console output: https://syzkaller.appspot.com/x/log.txt?x=10835778580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=fa4954ad2c62b915
dashboard link: https://syzkaller.appspot.com/bug?extid=aac7bff85be224de5156
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.



* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-25 10:44     ` Aleksandr Nogikh
@ 2024-11-26  8:43       ` Qu Wenruo
  0 siblings, 0 replies; 15+ messages in thread
From: Qu Wenruo @ 2024-11-26  8:43 UTC
  To: Aleksandr Nogikh
  Cc: Matthew Wilcox, syzbot, akpm, clm, dsterba, josef, linux-btrfs,
	linux-fsdevel, linux-kernel, linux-mm, syzkaller-bugs



On 2024/11/25 21:14, Aleksandr Nogikh wrote:
> On Mon, Nov 25, 2024 at 1:30 AM 'Qu Wenruo' via syzkaller-bugs
> <syzkaller-bugs@googlegroups.com> wrote:
>>
>>
>>
>> On 2024/11/25 07:56, Matthew Wilcox wrote:
>>> On Sun, Nov 24, 2024 at 05:45:18AM -0800, syzbot wrote:
>>>>
>>>>    __fput+0x5ba/0xa50 fs/file_table.c:458
>>>>    task_work_run+0x24f/0x310 kernel/task_work.c:239
>>>>    resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
>>>>    exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
>>>>    exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
>>>>    __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
>>>>    syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
>>>>    do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
>>>>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>
>>> This is:
>>>
>>>           VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
>>>
>>> ie we've called __folio_start_writeback() on a folio which is already
>>> under writeback.
>>>
>>> Higher up in the trace, we have the useful information:
>>>
>>>    page: refcount:6 mapcount:0 mapping:ffff888077139710 index:0x3 pfn:0x72ae5
>>>    memcg:ffff888140adc000
>>>    aops:btrfs_aops ino:105 dentry name(?):"file2"
>>>    flags: 0xfff000000040ab(locked|waiters|uptodate|lru|private|writeback|node=0|zone=1|lastcpupid=0x7ff)
>>>    raw: 00fff000000040ab ffffea0001c8f408 ffffea0000939708 ffff888077139710
>>>    raw: 0000000000000003 0000000000000001 00000006ffffffff ffff888140adc000
>>>    page dumped because: VM_BUG_ON_FOLIO(folio_test_writeback(folio))
>>>    page_owner tracks the page as allocated
>>>
>>> The interesting part of the page_owner stacktrace is:
>>>
>>>     filemap_alloc_folio_noprof+0xdf/0x500
>>>     __filemap_get_folio+0x446/0xbd0
>>>     prepare_one_folio+0xb6/0xa20
>>>     btrfs_buffered_write+0x6bd/0x1150
>>>     btrfs_direct_write+0x52d/0xa30
>>>     btrfs_do_write_iter+0x2a0/0x760
>>>     do_iter_readv_writev+0x600/0x880
>>>     vfs_writev+0x376/0xba0
>>>
>>> (ie not very interesting)
>>>
>>>> Workqueue: btrfs-delalloc btrfs_work_helper
>>>> RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
>>>> Call Trace:
>>>>    <TASK>
>>>>    process_one_folio fs/btrfs/extent_io.c:187 [inline]
>>>>    __process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
>>>>    submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
>>>>    submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
>>>>    run_ordered_work fs/btrfs/async-thread.c:245 [inline]
>>>>    btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
>>>>    process_one_work kernel/workqueue.c:3229 [inline]
>>>
>>> This looks like a race?
>>>
>>> process_one_folio() calls
>>> btrfs_folio_clamp_set_writeback calls
>>> btrfs_subpage_set_writeback:
>>>
>>>           spin_lock_irqsave(&subpage->lock, flags);
>>>           bitmap_set(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
>>>           if (!folio_test_writeback(folio))
>>>                   folio_start_writeback(folio);
>>>           spin_unlock_irqrestore(&subpage->lock, flags);
>>>
>>> so somebody else set writeback after we tested for writeback here.
>>
>> The test VM is using X86_64, thus we won't go into the subpage routine,
>> but directly call folio_start_writeback().
>>
>>>
>>> One thing that comes to mind is that _usually_ we take folio_lock()
>>> first, then start writeback, then call folio_unlock() and btrfs isn't
>>> doing that here (afaict).  Maybe that's not the source of the bug?
>>
>> We still hold the folio locked, do submission then unlock.
>>
>> You can check extent_writepage(), where at the entrance we check if the
>> folio is still locked.
>> Then inside extent_writepage_io() we do the submission, setting the
>> folio writeback inside submit_one_sector().
>> Eventually unlock the folio at the end of extent_writepage(), that's for
>> the uncompressed writes.
>>
>> There is a lot of special handling for async submission (compression),
>> but it still holds the folio locked, does compression and submission, and
>> unlocks, just all in another thread (this case).
>>
>> So it looks like something is wrong when transferring the ownership of
>> the page cache folios to the compression path, or there is some error
>> path that is not properly handled.
>>
>> Unfortunately I'm not really able to reproduce the case using the
>> reproducer...
> 
> I've just tried to reproduce locally using the downloadable assets and
> the kernel crashed ~ after 1 minute of running the attached C repro.
> 
> [   87.616440][ T9044] ------------[ cut here ]------------
> [   87.617126][ T9044] kernel BUG at mm/page-writeback.c:3119!
> [   87.619308][ T9044] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> [   87.620174][ T9044] CPU: 1 UID: 0 PID: 9044 Comm: kworker/u10:6 Not tainted 6.12.0-syzkaller-08446-g228a1157fb9f #0
> 
> Here are the instructions I followed:
> https://github.com/google/syzkaller/blob/master/docs/syzbot_assets.md#run-a-c-reproducer

Thanks for the confirmation.

I can reproduce it using the exact disk image (in around 1 min), but not
inside my usual development VM (even after more than 5 min).

So it will be a lot trickier to debug now...
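
For anyone following along, the invariant that the BUG enforces can be
modeled in userspace. This is only a sketch under stated assumptions, not
kernel code: struct model_folio, the MODEL_* flags and the model_* helpers
are hypothetical names, and the real check is the
VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio) quoted above.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the kernel's real page flags differ. */
#define MODEL_LOCKED    (1u << 0)
#define MODEL_WRITEBACK (1u << 1)

/* Hypothetical stand-in for struct folio, holding only a flags word. */
struct model_folio {
	unsigned int flags;
};

/*
 * Start writeback on a folio. Returns false in the situation where the
 * kernel would hit the VM_BUG_ON_FOLIO(): writeback is already set.
 */
static bool model_start_writeback(struct model_folio *f)
{
	if (f->flags & MODEL_WRITEBACK)
		return false;
	f->flags |= MODEL_WRITEBACK;
	return true;
}

/* Clear writeback, as would happen once the I/O has completed. */
static void model_end_writeback(struct model_folio *f)
{
	f->flags &= ~MODEL_WRITEBACK;
}
```

In this model a second model_start_writeback() on the same folio fails
until model_end_writeback() has run, which is the state the page dump
shows (locked and writeback both set when writeback is started again).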

Thanks,
Qu


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-24 13:45 [syzbot] [btrfs?] kernel BUG in __folio_start_writeback syzbot
  2024-11-24 21:26 ` Matthew Wilcox
  2024-11-26  6:42 ` Qu Wenruo
@ 2024-11-28 18:56 ` syzbot
  2024-11-28 21:26   ` Qu Wenruo
  2024-11-29 21:17 ` Qu Wenruo
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: syzbot @ 2024-11-28 18:56 UTC (permalink / raw)
  To: akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, nogikh, syzkaller-bugs, willy, wqu

syzbot has bisected this issue to:

commit c87c299776e4d75bcc5559203ae2c37dc0396d80
Author: Qu Wenruo <wqu@suse.com>
Date:   Thu Oct 10 04:46:12 2024 +0000

    btrfs: make buffered write to copy one page a time

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=165dd3c0580000
start commit:   228a1157fb9f Merge tag '6.13-rc-part1-SMB3-client-fixes' o..
git tree:       upstream
final oops:     https://syzkaller.appspot.com/x/report.txt?x=155dd3c0580000
console output: https://syzkaller.appspot.com/x/log.txt?x=115dd3c0580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=402159daa216c89d
dashboard link: https://syzkaller.appspot.com/bug?extid=aac7bff85be224de5156
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13840778580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17840778580000

Reported-by: syzbot+aac7bff85be224de5156@syzkaller.appspotmail.com
Fixes: c87c299776e4 ("btrfs: make buffered write to copy one page a time")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-28 18:56 ` syzbot
@ 2024-11-28 21:26   ` Qu Wenruo
  0 siblings, 0 replies; 15+ messages in thread
From: Qu Wenruo @ 2024-11-28 21:26 UTC (permalink / raw)
  To: syzbot, akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, nogikh, syzkaller-bugs, willy, wqu

# syz test: https://github.com/adam900710/linux.git writeback_fix

On 2024/11/29 05:26, syzbot wrote:
> syzbot has bisected this issue to:
>
> commit c87c299776e4d75bcc5559203ae2c37dc0396d80
> Author: Qu Wenruo <wqu@suse.com>
> Date:   Thu Oct 10 04:46:12 2024 +0000
>
>      btrfs: make buffered write to copy one page a time
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=165dd3c0580000
> start commit:   228a1157fb9f Merge tag '6.13-rc-part1-SMB3-client-fixes' o..
> git tree:       upstream
> final oops:     https://syzkaller.appspot.com/x/report.txt?x=155dd3c0580000
> console output: https://syzkaller.appspot.com/x/log.txt?x=115dd3c0580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=402159daa216c89d
> dashboard link: https://syzkaller.appspot.com/bug?extid=aac7bff85be224de5156
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13840778580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17840778580000
>
> Reported-by: syzbot+aac7bff85be224de5156@syzkaller.appspotmail.com
> Fixes: c87c299776e4 ("btrfs: make buffered write to copy one page a time")
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
>



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-24 13:45 [syzbot] [btrfs?] kernel BUG in __folio_start_writeback syzbot
                   ` (2 preceding siblings ...)
  2024-11-28 18:56 ` syzbot
@ 2024-11-29 21:17 ` Qu Wenruo
  2024-11-30  1:51   ` syzbot
  2024-11-30  6:36 ` Qu Wenruo
  2025-01-23  5:06 ` syzbot
  5 siblings, 1 reply; 15+ messages in thread
From: Qu Wenruo @ 2024-11-29 21:17 UTC (permalink / raw)
  To: syzbot, akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, syzkaller-bugs, willy

#syz test: https://github.com/adam900710/linux.git writeback_fix

On 2024/11/25 00:15, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    228a1157fb9f Merge tag '6.13-rc-part1-SMB3-client-fixes' o..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13820530580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=402159daa216c89d
> dashboard link: https://syzkaller.appspot.com/bug?extid=aac7bff85be224de5156
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13840778580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17840778580000
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/d32a8e8c5aae/disk-228a1157.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/28d5c070092e/vmlinux-228a1157.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/45af4bfd9e8e/bzImage-228a1157.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/69603aa12e8f/mount_0.gz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+aac7bff85be224de5156@syzkaller.appspotmail.com
> 
>   __fput+0x5ba/0xa50 fs/file_table.c:458
>   task_work_run+0x24f/0x310 kernel/task_work.c:239
>   resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
>   exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
>   exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
>   __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
>   syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
>   do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
> ------------[ cut here ]------------
> kernel BUG at mm/page-writeback.c:3119!
> Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> CPU: 0 UID: 0 PID: 12 Comm: kworker/u8:1 Not tainted 6.12.0-syzkaller-08446-g228a1157fb9f #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
> Workqueue: btrfs-delalloc btrfs_work_helper
> RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
> Code: 25 ff 0f 00 00 0f 84 d3 00 00 00 e8 14 ae c3 ff e9 ba f5 ff ff e8 0a ae c3 ff 4c 89 f7 48 c7 c6 00 2e 14 8c e8 8b 4f 0d 00 90 <0f> 0b e8 f3 ad c3 ff 4c 89 f7 48 c7 c6 60 34 14 8c e8 74 4f 0d 00
> RSP: 0018:ffffc90000117500 EFLAGS: 00010246
> RAX: ed413247a2060f00 RBX: 0000000000000002 RCX: 0000000000000001
> RDX: dffffc0000000000 RSI: ffffffff8c0ad620 RDI: 0000000000000001
> RBP: ffffc90000117670 R08: ffffffff942b2967 R09: 1ffffffff285652c
> R10: dffffc0000000000 R11: fffffbfff285652d R12: 0000000000000000
> R13: 1ffff92000022eac R14: ffffea0001cab940 R15: ffff888077139710
> FS:  0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f6661870000 CR3: 00000000792b2000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>   <TASK>
>   process_one_folio fs/btrfs/extent_io.c:187 [inline]
>   __process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
>   submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
>   submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
>   run_ordered_work fs/btrfs/async-thread.c:245 [inline]
>   btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
>   process_one_work kernel/workqueue.c:3229 [inline]
>   process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310
>   worker_thread+0x870/0xd30 kernel/workqueue.c:3391
>   kthread+0x2f0/0x390 kernel/kthread.c:389
>   ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
>   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>   </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
> Code: 25 ff 0f 00 00 0f 84 d3 00 00 00 e8 14 ae c3 ff e9 ba f5 ff ff e8 0a ae c3 ff 4c 89 f7 48 c7 c6 00 2e 14 8c e8 8b 4f 0d 00 90 <0f> 0b e8 f3 ad c3 ff 4c 89 f7 48 c7 c6 60 34 14 8c e8 74 4f 0d 00
> RSP: 0018:ffffc90000117500 EFLAGS: 00010246
> RAX: ed413247a2060f00 RBX: 0000000000000002 RCX: 0000000000000001
> RDX: dffffc0000000000 RSI: ffffffff8c0ad620 RDI: 0000000000000001
> RBP: ffffc90000117670 R08: ffffffff942b2967 R09: 1ffffffff285652c
> R10: dffffc0000000000 R11: fffffbfff285652d R12: 0000000000000000
> R13: 1ffff92000022eac R14: ffffea0001cab940 R15: ffff888077139710
> FS:  0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055ec8463e668 CR3: 000000007ed5e000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> 
> 
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> 
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
> 
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
> 
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
> 
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
> 
> If you want to undo deduplication, reply with:
> #syz undup
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-29 21:17 ` Qu Wenruo
@ 2024-11-30  1:51   ` syzbot
  2024-11-30  4:27     ` Qu Wenruo
  0 siblings, 1 reply; 15+ messages in thread
From: syzbot @ 2024-11-30  1:51 UTC (permalink / raw)
  To: akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, syzkaller-bugs, willy, wqu

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: MAX_LOCKDEP_KEYS too low!

BUG: MAX_LOCKDEP_KEYS too low!
turning off the locking correctness validator.
CPU: 1 UID: 0 PID: 11728 Comm: kworker/u8:10 Not tainted 6.12.0-rc7-syzkaller-00133-g17a4e91a431b #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: btrfs-cache btrfs_work_helper
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 register_lock_class+0x827/0x980 kernel/locking/lockdep.c:1328
 __lock_acquire+0xf3/0x2100 kernel/locking/lockdep.c:5077
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
 process_one_work kernel/workqueue.c:3204 [inline]
 process_scheduled_works+0x950/0x1850 kernel/workqueue.c:3310
 worker_thread+0x870/0xd30 kernel/workqueue.c:3391
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>

Tested on:

commit:         17a4e91a btrfs: test if we need to wait the writeback ..
git tree:       https://github.com/adam900710/linux.git writeback_fix
console output: https://syzkaller.appspot.com/x/log.txt?x=12c5ad30580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=fa4954ad2c62b915
dashboard link: https://syzkaller.appspot.com/bug?extid=aac7bff85be224de5156
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-30  1:51   ` syzbot
@ 2024-11-30  4:27     ` Qu Wenruo
  0 siblings, 0 replies; 15+ messages in thread
From: Qu Wenruo @ 2024-11-30  4:27 UTC (permalink / raw)
  To: syzbot, akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, syzkaller-bugs, willy



On 2024/11/30 12:21, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> BUG: MAX_LOCKDEP_KEYS too low!
> 
> BUG: MAX_LOCKDEP_KEYS too low!

Hi Syzbot guys,

Syzbot is great, but I'm wondering if it's possible to disable lockdep
for this particular test?
Or just re-run the test?

If the test doesn't crash with my fix, and only triggers lockdep warnings
about some limit being too low, I'd call it fixed.

BTW, I'm not seeing where I can change the MAX_LOCKDEP_KEYS value in
the kernel...
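
For reference, the limit appears to be a compile-time constant rather than
a Kconfig option. The two defines below are recalled from
include/linux/lockdep_types.h and kernel/locking/lockdep_internals.h, so
treat the exact value and file locations as assumptions and check them
against the tree being tested:

```c
/* Assumed to match include/linux/lockdep_types.h; verify in your tree. */
#define MAX_LOCKDEP_KEYS_BITS	13

/* Assumed to match kernel/locking/lockdep_internals.h; verify in your tree. */
#define MAX_LOCKDEP_KEYS	(1UL << MAX_LOCKDEP_KEYS_BITS)
```

If that holds, the cap is 8192 lock classes, and raising it would mean
rebuilding the kernel with a larger MAX_LOCKDEP_KEYS_BITS.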

Thanks,
Qu

> turning off the locking correctness validator.
> CPU: 1 UID: 0 PID: 11728 Comm: kworker/u8:10 Not tainted 6.12.0-rc7-syzkaller-00133-g17a4e91a431b #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
> Workqueue: btrfs-cache btrfs_work_helper
> Call Trace:
>   <TASK>
>   __dump_stack lib/dump_stack.c:94 [inline]
>   dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
>   register_lock_class+0x827/0x980 kernel/locking/lockdep.c:1328
>   __lock_acquire+0xf3/0x2100 kernel/locking/lockdep.c:5077
>   lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
>   process_one_work kernel/workqueue.c:3204 [inline]
>   process_scheduled_works+0x950/0x1850 kernel/workqueue.c:3310
>   worker_thread+0x870/0xd30 kernel/workqueue.c:3391
>   kthread+0x2f0/0x390 kernel/kthread.c:389
>   ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
>   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>   </TASK>
> 
> 
> Tested on:
> 
> commit:         17a4e91a btrfs: test if we need to wait the writeback ..
> git tree:       https://github.com/adam900710/linux.git writeback_fix
> console output: https://syzkaller.appspot.com/x/log.txt?x=12c5ad30580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=fa4954ad2c62b915
> dashboard link: https://syzkaller.appspot.com/bug?extid=aac7bff85be224de5156
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> 
> Note: no patches were applied.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-24 13:45 [syzbot] [btrfs?] kernel BUG in __folio_start_writeback syzbot
                   ` (3 preceding siblings ...)
  2024-11-29 21:17 ` Qu Wenruo
@ 2024-11-30  6:36 ` Qu Wenruo
  2024-11-30  7:01   ` syzbot
  2025-01-23  5:06 ` syzbot
  5 siblings, 1 reply; 15+ messages in thread
From: Qu Wenruo @ 2024-11-30  6:36 UTC (permalink / raw)
  To: syzbot, akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, syzkaller-bugs, willy

#syz test: https://github.com/adam900710/linux.git writeback_fix

On 2024/11/25 00:15, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    228a1157fb9f Merge tag '6.13-rc-part1-SMB3-client-fixes' o..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13820530580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=402159daa216c89d
> dashboard link: https://syzkaller.appspot.com/bug?extid=aac7bff85be224de5156
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13840778580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17840778580000
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/d32a8e8c5aae/disk-228a1157.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/28d5c070092e/vmlinux-228a1157.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/45af4bfd9e8e/bzImage-228a1157.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/69603aa12e8f/mount_0.gz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+aac7bff85be224de5156@syzkaller.appspotmail.com
> 
>   __fput+0x5ba/0xa50 fs/file_table.c:458
>   task_work_run+0x24f/0x310 kernel/task_work.c:239
>   resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
>   exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
>   exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
>   __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
>   syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
>   do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
> ------------[ cut here ]------------
> kernel BUG at mm/page-writeback.c:3119!
> Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> CPU: 0 UID: 0 PID: 12 Comm: kworker/u8:1 Not tainted 6.12.0-syzkaller-08446-g228a1157fb9f #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
> Workqueue: btrfs-delalloc btrfs_work_helper
> RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
> Code: 25 ff 0f 00 00 0f 84 d3 00 00 00 e8 14 ae c3 ff e9 ba f5 ff ff e8 0a ae c3 ff 4c 89 f7 48 c7 c6 00 2e 14 8c e8 8b 4f 0d 00 90 <0f> 0b e8 f3 ad c3 ff 4c 89 f7 48 c7 c6 60 34 14 8c e8 74 4f 0d 00
> RSP: 0018:ffffc90000117500 EFLAGS: 00010246
> RAX: ed413247a2060f00 RBX: 0000000000000002 RCX: 0000000000000001
> RDX: dffffc0000000000 RSI: ffffffff8c0ad620 RDI: 0000000000000001
> RBP: ffffc90000117670 R08: ffffffff942b2967 R09: 1ffffffff285652c
> R10: dffffc0000000000 R11: fffffbfff285652d R12: 0000000000000000
> R13: 1ffff92000022eac R14: ffffea0001cab940 R15: ffff888077139710
> FS:  0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f6661870000 CR3: 00000000792b2000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>   <TASK>
>   process_one_folio fs/btrfs/extent_io.c:187 [inline]
>   __process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
>   submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
>   submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
>   run_ordered_work fs/btrfs/async-thread.c:245 [inline]
>   btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
>   process_one_work kernel/workqueue.c:3229 [inline]
>   process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310
>   worker_thread+0x870/0xd30 kernel/workqueue.c:3391
>   kthread+0x2f0/0x390 kernel/kthread.c:389
>   ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
>   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>   </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
> Code: 25 ff 0f 00 00 0f 84 d3 00 00 00 e8 14 ae c3 ff e9 ba f5 ff ff e8 0a ae c3 ff 4c 89 f7 48 c7 c6 00 2e 14 8c e8 8b 4f 0d 00 90 <0f> 0b e8 f3 ad c3 ff 4c 89 f7 48 c7 c6 60 34 14 8c e8 74 4f 0d 00
> RSP: 0018:ffffc90000117500 EFLAGS: 00010246
> RAX: ed413247a2060f00 RBX: 0000000000000002 RCX: 0000000000000001
> RDX: dffffc0000000000 RSI: ffffffff8c0ad620 RDI: 0000000000000001
> RBP: ffffc90000117670 R08: ffffffff942b2967 R09: 1ffffffff285652c
> R10: dffffc0000000000 R11: fffffbfff285652d R12: 0000000000000000
> R13: 1ffff92000022eac R14: ffffea0001cab940 R15: ffff888077139710
> FS:  0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055ec8463e668 CR3: 000000007ed5e000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> 
> 
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> 
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
> 
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
> 
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
> 
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
> 
> If you want to undo deduplication, reply with:
> #syz undup
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-30  6:36 ` Qu Wenruo
@ 2024-11-30  7:01   ` syzbot
  0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2024-11-30  7:01 UTC (permalink / raw)
  To: akpm, clm, dsterba, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, syzkaller-bugs, willy, wqu

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
BUG: MAX_LOCKDEP_KEYS too low!

BUG: MAX_LOCKDEP_KEYS too low!
turning off the locking correctness validator.
CPU: 1 UID: 0 PID: 18394 Comm: syz-executor388 Not tainted 6.12.0-rc7-syzkaller-00133-g17a4e91a431b #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 register_lock_class+0x827/0x980 kernel/locking/lockdep.c:1328
 __lock_acquire+0xf3/0x2100 kernel/locking/lockdep.c:5077
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
 touch_wq_lockdep_map+0xc7/0x170 kernel/workqueue.c:3880
 __flush_workqueue+0x14f/0x1600 kernel/workqueue.c:3922
 drain_workqueue+0xc9/0x3a0 kernel/workqueue.c:4086
 destroy_workqueue+0xba/0xc40 kernel/workqueue.c:5830
 btrfs_stop_all_workers+0xbb/0x2a0 fs/btrfs/disk-io.c:1782
 close_ctree+0x6bb/0xd60 fs/btrfs/disk-io.c:4360
 generic_shutdown_super+0x139/0x2d0 fs/super.c:642
 kill_anon_super+0x3b/0x70 fs/super.c:1237
 btrfs_kill_super+0x41/0x50 fs/btrfs/super.c:2112
 deactivate_locked_super+0xc4/0x130 fs/super.c:473
 cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1373
 task_work_run+0x24f/0x310 kernel/task_work.c:239
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0x168/0x370 kernel/entry/common.c:218
 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f378bf8c357
Code: 08 00 48 83 c4 08 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 c7 c2 b0 ff ff ff f7 d8 64 89 02 b8
RSP: 002b:00007ffd4c441108 EFLAGS: 00000202 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f378bf8c357
RDX: 0000000000000000 RSI: 0000000000000009 RDI: 00007ffd4c4411c0
RBP: 00007ffd4c4411c0 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000202 R12: 00007ffd4c442280
R13: 00005555591197d0 R14: 431bde82d7b634db R15: 00007ffd4c442224
 </TASK>
BTRFS info (device loop2): last unmount of filesystem bf719321-eb1f-43c1-9145-be0044cdbc04
BTRFS info (device loop2): last unmount of filesystem 454c899b-20f1-4098-b6bd-9b424eb38c60
BTRFS info (device loop2): last unmount of filesystem e789dab4-7b2e-44bb-bb97-19a8dd7be099
BTRFS info (device loop2): last unmount of filesystem a11fd0de-3a92-4478-af85-4e70dfb2fb44
BTRFS info (device loop2): last unmount of filesystem 85ccfa0b-566f-4eb9-b1a6-ea2fe97ca044
BTRFS info (device loop2): last unmount of filesystem 5a8c012e-dba3-4ff5-a22f-46e4b5bb2f55
BTRFS info (device loop2): last unmount of filesystem 2fe685f2-8834-419b-bd91-466d40ccece7
BTRFS info (device loop2): last unmount of filesystem fc366aaa-c1c0-4d55-9034-d39fce006f22
BTRFS info (device loop2): last unmount of filesystem 364312bb-b5a2-487f-aaa2-e36f3a1b701f
BTRFS info (device loop2): last unmount of filesystem 14d642db-7b15-43e4-81e6-4b8fac6a25f8


Tested on:

commit:         17a4e91a btrfs: test if we need to wait the writeback ..
git tree:       https://github.com/adam900710/linux.git writeback_fix
console output: https://syzkaller.appspot.com/x/log.txt?x=1501b9e8580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=fa4954ad2c62b915
dashboard link: https://syzkaller.appspot.com/bug?extid=aac7bff85be224de5156
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
  2024-11-24 13:45 [syzbot] [btrfs?] kernel BUG in __folio_start_writeback syzbot
                   ` (4 preceding siblings ...)
  2024-11-30  6:36 ` Qu Wenruo
@ 2025-01-23  5:06 ` syzbot
  5 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2025-01-23  5:06 UTC (permalink / raw)
  To: akpm, clm, dsterba, fdmanana, josef, linux-btrfs, linux-fsdevel,
	linux-kernel, linux-mm, nogikh, peterz, quwenruo.btrfs,
	syzkaller-bugs, willy, wqu

syzbot suspects this issue was fixed by commit:

commit 66951e4860d3c688bfa550ea4a19635b57e00eca
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Mon Jan 13 12:50:11 2025 +0000

    sched/fair: Fix update_cfs_group() vs DELAY_DEQUEUE

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11c129f8580000
start commit:   228a1157fb9f Merge tag '6.13-rc-part1-SMB3-client-fixes' o..
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=402159daa216c89d
dashboard link: https://syzkaller.appspot.com/bug?extid=aac7bff85be224de5156
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13840778580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17840778580000

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: sched/fair: Fix update_cfs_group() vs DELAY_DEQUEUE

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-01-23  5:06 UTC | newest]

Thread overview: 15+ messages
2024-11-24 13:45 [syzbot] [btrfs?] kernel BUG in __folio_start_writeback syzbot
2024-11-24 21:26 ` Matthew Wilcox
2024-11-25  0:30   ` Qu Wenruo
2024-11-25 10:44     ` Aleksandr Nogikh
2024-11-26  8:43       ` Qu Wenruo
2024-11-26  6:42 ` Qu Wenruo
2024-11-26  7:35   ` syzbot
2024-11-28 18:56 ` syzbot
2024-11-28 21:26   ` Qu Wenruo
2024-11-29 21:17 ` Qu Wenruo
2024-11-30  1:51   ` syzbot
2024-11-30  4:27     ` Qu Wenruo
2024-11-30  6:36 ` Qu Wenruo
2024-11-30  7:01   ` syzbot
2025-01-23  5:06 ` syzbot
