From: Yeoreum Yun <yeoreum.yun@arm.com>
To: David Hildenbrand <david@redhat.com>
Cc: Yunseong Kim <ysk@kzalloc.com>, Byungchul Park <byungchul@sk.com>,
Hillf Danton <hdanton@sina.com>,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel_team@skhynix.com
Subject: Re: [RFC] mm/migrate: make sure folio_unlock() before folio_wait_writeback()
Date: Tue, 7 Oct 2025 08:53:17 +0100 [thread overview]
Message-ID: <aOTG7VTk4s9WfrMN@e129823.arm.com> (raw)
In-Reply-To: <deb6c0a2-e166-4c91-9736-276c9f1741c9@redhat.com>
Hi David,
> On 07.10.25 08:32, Yunseong Kim wrote:
> > Hi Hillf,
> >
> > Here are the syzlang and kernel log, and you can also find the gist snippet
> > in the body of the first RFC mail:
> >
> > https://gist.github.com/kzall0c/a6091bb2fd536865ca9aabfd017a1fc5
> >
> > I am reviewing this issue again on the v6.17, The issue is always reproducible,
> > usually occurring within about 10k attempts with the 8 procs.
>
> I can see a DEPT splat and I wonder what happens if DEPT is disabled.
>
> Will the machine actually deadlock or is this just DEPT complaining (and
> probably getting something wrong)?
>
As Pedro mentioned[0], I believe this DEPT splat is a false positive.
The folio targeted by __find_get_block_slow() belongs to bd_mapping,
which is not the same folio whose writeback flag gets cleared
in ext4_end_io_end().
Since DEPT currently does not distinguish regular-file data folios from
the corresponding block-device folios, such false positives are a known
issue, and we plan to fix them.
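To illustrate the bd_mapping point, a minimal sketch (not the exact
upstream code; "data_folio" is just a placeholder name here):

	/* __find_get_block_slow() looks up its folio in the block
	 * device's own page cache, bdev->bd_mapping: */
	folio = __filemap_get_folio(bdev->bd_mapping, index, FGP_ACCESSED, 0);

	/* ext4's end_io path clears writeback on regular-file data
	 * folios, which live in inode->i_mapping instead: */
	folio_end_writeback(data_folio);	/* data_folio != folio above */

So the wait and the event that DEPT pairs up involve two different folios.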
Also, looking at the log Yunseong shared (hung.log), I can see that
migration is stuck waiting on the buffer_head lock:
...
[ 3123.713542][ T89] INFO: task syz.4.2628:42733 blocked for more than 143 seconds.
[ 3123.713550][ T89] Not tainted 6.15.11-00046-g2c223fa7bd9a-dirty #13
[ 3123.713557][ T89] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3123.713562][ T89] task:syz.4.2628 state:D stack:0 pid:42733 tgid:42732 ppid:41804 task_flags:0x400040 flags:0x00000009
[ 3123.713577][ T89] Call trace:
[ 3123.713582][ T89] __switch_to+0x19c/0x2c0 (T)
[ 3123.713598][ T89] __schedule+0x514/0x1208
[ 3123.713614][ T89] schedule+0x40/0x164
[ 3123.713629][ T89] io_schedule+0x3c/0x5c
[ 3123.713644][ T89] bit_wait_io+0x14/0x70
[ 3123.713662][ T89] __wait_on_bit_lock+0xa0/0x120
[ 3123.713678][ T89] out_of_line_wait_on_bit_lock+0x8c/0xc0
[ 3123.713695][ T89] __lock_buffer+0x74/0xb8
[ 3123.713720][ T89] __buffer_migrate_folio+0x190/0x504
[ 3123.713747][ T89] buffer_migrate_folio_norefs+0x30/0x3c
[ 3123.713764][ T89] move_to_new_folio+0xe4/0x528
[ 3123.713779][ T89] migrate_pages_batch+0xee0/0x1788
[ 3123.713795][ T89] migrate_pages+0x15c4/0x1840
[ 3123.713810][ T89] compact_zone+0x9c8/0x1d20
[ 3123.713822][ T89] compact_node+0xd4/0x27c
[ 3123.713832][ T89] sysctl_compaction_handler+0x104/0x194
[ 3123.713843][ T89] proc_sys_call_handler+0x25c/0x3f8
[ 3123.713865][ T89] proc_sys_write+0x20/0x2c
[ 3123.713878][ T89] do_iter_readv_writev+0x350/0x448
[ 3123.713897][ T89] vfs_writev+0x1ac/0x44c
[ 3123.713913][ T89] do_pwritev+0x100/0x15c
[ 3123.713929][ T89] __arm64_sys_pwritev2+0x6c/0xcc
[ 3123.713945][ T89] invoke_syscall.constprop.0+0x64/0x18c
[ 3123.713961][ T89] el0_svc_common.constprop.0+0x80/0x198
[ 3123.713978][ T89] do_el0_svc+0x28/0x3c
[ 3123.713993][ T89] el0_svc+0x50/0x220
[ 3123.714004][ T89] el0t_64_sync_handler+0x10c/0x140
[ 3123.714017][ T89] el0t_64_sync+0x1b8/0x1bc
...
which is different from the description "stuck on writeback".
Unfortunately, I couldn't analyse this further, since the log he shared
was truncated.
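For reference, the sleep in the trace matches the synchronous
buffer-locking loop on the migration path, roughly like the sketch
below (simplified from mm/migrate.c; the exact code differs by version):

	/* For sync compaction, migration sleeps in lock_buffer() on each
	 * buffer head of the source folio -- the __lock_buffer() frame in
	 * the trace above. */
	static bool buffer_migrate_lock_buffers(struct buffer_head *head,
						enum migrate_mode mode)
	{
		struct buffer_head *bh = head;

		if (mode != MIGRATE_ASYNC) {
			do {
				lock_buffer(bh);
				bh = bh->b_this_page;
			} while (bh != head);
			return true;
		}

		/* MIGRATE_ASYNC only trylocks; contention handling elided */
		return false;
	}

i.e. the task is blocked on a buffer_head lock, not in
folio_wait_writeback().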
@Yunseong, could you reproduce this without DEPT and share the full log
for further analysis?
Thanks.
[0] https://lore.kernel.org/all/dglxbwe2i5ubofefdxwo5jvyhdfjov37z5jzc5guedhe4dl6ia@pmkjkec3isb4/
--
Sincerely,
Yeoreum Yun