From: Matthew Wilcox <willy@infradead.org>
To: syzbot <syzbot+aac7bff85be224de5156@syzkaller.appspotmail.com>
Cc: akpm@linux-foundation.org, clm@fb.com, dsterba@suse.com,
	josef@toxicpanda.com, linux-btrfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
Date: Sun, 24 Nov 2024 21:26:53 +0000
Message-ID: <Z0OaHcMWcRtohZfz@casper.infradead.org>
In-Reply-To: <67432dee.050a0220.1cc393.0041.GAE@google.com>

On Sun, Nov 24, 2024 at 05:45:18AM -0800, syzbot wrote:
> 
>  __fput+0x5ba/0xa50 fs/file_table.c:458
>  task_work_run+0x24f/0x310 kernel/task_work.c:239
>  resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
>  exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
>  exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
>  syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
>  do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f

This is:

        VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);

i.e., we've called __folio_start_writeback() on a folio which is
already under writeback.
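
For reference, that assertion sits at the top of __folio_start_writeback()
in mm/page-writeback.c; trimmed to the relevant part, it looks roughly
like:

        void __folio_start_writeback(struct folio *folio, bool keep_write)
        {
                ...
                /* writeback must not already be set on entry */
                VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
                ...
        }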

Higher up in the trace, we have the useful information:

 page: refcount:6 mapcount:0 mapping:ffff888077139710 index:0x3 pfn:0x72ae5
 memcg:ffff888140adc000
 aops:btrfs_aops ino:105 dentry name(?):"file2"
 flags: 0xfff000000040ab(locked|waiters|uptodate|lru|private|writeback|node=0|zone=1|lastcpupid=0x7ff)
 raw: 00fff000000040ab ffffea0001c8f408 ffffea0000939708 ffff888077139710
 raw: 0000000000000003 0000000000000001 00000006ffffffff ffff888140adc000
 page dumped because: VM_BUG_ON_FOLIO(folio_test_writeback(folio))
 page_owner tracks the page as allocated

The interesting part of the page_owner stacktrace is:

  filemap_alloc_folio_noprof+0xdf/0x500
  __filemap_get_folio+0x446/0xbd0
  prepare_one_folio+0xb6/0xa20
  btrfs_buffered_write+0x6bd/0x1150
  btrfs_direct_write+0x52d/0xa30
  btrfs_do_write_iter+0x2a0/0x760
  do_iter_readv_writev+0x600/0x880
  vfs_writev+0x376/0xba0

(i.e., not very interesting)

> Workqueue: btrfs-delalloc btrfs_work_helper
> RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
> Call Trace:
>  <TASK>
>  process_one_folio fs/btrfs/extent_io.c:187 [inline]
>  __process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
>  submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
>  submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
>  run_ordered_work fs/btrfs/async-thread.c:245 [inline]
>  btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
>  process_one_work kernel/workqueue.c:3229 [inline]

This looks like a race?

process_one_folio() calls btrfs_folio_clamp_set_writeback(), which
calls btrfs_subpage_set_writeback():

        spin_lock_irqsave(&subpage->lock, flags);
        bitmap_set(subpage->bitmaps, start_bit,
                   len >> fs_info->sectorsize_bits);
        if (!folio_test_writeback(folio))
                folio_start_writeback(folio);
        spin_unlock_irqrestore(&subpage->lock, flags);

so somebody else set the writeback flag after we tested for it here.
Since both the test and the set run under subpage->lock, whoever set it
can't have been holding that lock.
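
To spell out the suspected interleaving as a (hypothetical) timeline:

        CPU 0 (this worker)                    CPU 1 (racing path)
        spin_lock_irqsave(&subpage->lock)
        folio_test_writeback() -> false
                                               sets PG_writeback on the folio
        folio_start_writeback()
          -> VM_BUG_ON_FOLIO(folio_test_writeback(folio)) fires
        spin_unlock_irqrestore(&subpage->lock)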

One thing that comes to mind is that _usually_ we take folio_lock()
first, then start writeback, then call folio_unlock(), and btrfs isn't
doing that here (afaict).  Maybe that's not the source of the bug?

If it is, should we have a VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio)
in __folio_start_writeback()?  Or is there somewhere that can't lock the
folio before starting writeback?
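
Concretely, the check I have in mind (untested sketch; it assumes no
caller legitimately starts writeback on an unlocked folio, which is
exactly the question above):

        /* at the top of __folio_start_writeback(): */
        VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
        VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);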

