linux-mm.kvack.org archive mirror
From: Yeoreum Yun <yeoreum.yun@arm.com>
To: David Hildenbrand <david@redhat.com>
Cc: Yunseong Kim <ysk@kzalloc.com>, Byungchul Park <byungchul@sk.com>,
	Hillf Danton <hdanton@sina.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel_team@skhynix.com
Subject: Re: [RFC] mm/migrate: make sure folio_unlock() before folio_wait_writeback()
Date: Tue, 7 Oct 2025 08:53:17 +0100	[thread overview]
Message-ID: <aOTG7VTk4s9WfrMN@e129823.arm.com> (raw)
In-Reply-To: <deb6c0a2-e166-4c91-9736-276c9f1741c9@redhat.com>

Hi David,

> On 07.10.25 08:32, Yunseong Kim wrote:
> > Hi Hillf,
> >
> > Here are the syzlang and kernel log, and you can also find the gist snippet
> > in the body of the first RFC mail:
> >
> >   https://gist.github.com/kzall0c/a6091bb2fd536865ca9aabfd017a1fc5
> >
> > I am reviewing this issue again on v6.17. The issue is always reproducible,
> > usually occurring within about 10k attempts with 8 procs.
>
> I can see a DEPT splat and I wonder what happens if DEPT is disabled.
>
> Will the machine actually deadlock or is this just DEPT complaining (and
> probably getting something wrong)?
>

As Pedro mentioned [0], I believe this DEPT splat is a false positive.
The folio targeted by __find_get_block_slow() belongs to bd_mapping,
which is not the same folio whose writeback flag gets cleared
in ext4_end_io_end().

Since DEPT currently does not distinguish regular-file data folios from
the corresponding block-device folios, such false positives are a known
issue, and we plan to fix them.
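
To illustrate why merging the two kinds of folio gives a false report,
here is a toy model in Python. It is not how DEPT is implemented: the
class names, the classify() helpers and the two dependency edges are
simplifying assumptions made only for this sketch. A tracker that puts
every folio into one class sees a cycle, while one that separates
block-device folios from regular-file folios does not.

```python
# Toy model only: this is NOT DEPT's implementation. The class names and
# the two edges below are illustrative assumptions.

def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())

    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True  # back edge: a cycle exists
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)


def dependencies(classify):
    """Build 'A must wait for B' edges, naming each resource via classify()."""
    return [
        # Migration: holds a regular-file folio lock, waits for its writeback.
        (classify("lock", "file"), classify("writeback", "file")),
        # I/O completion: takes a block-device folio lock before the
        # regular-file folio's writeback flag is cleared.
        (classify("writeback", "file"), classify("lock", "bdev")),
    ]


def merged(kind, owner):
    """One class per wait type, regardless of which mapping owns the folio."""
    return kind


def split(kind, owner):
    """Separate classes for regular-file and block-device folios."""
    return f"{kind}:{owner}"


print(has_cycle(dependencies(merged)))  # True
print(has_cycle(dependencies(split)))   # False
```

With merged classes the graph is lock -> writeback -> lock, which looks
like a deadlock; with split classes the chain ends at the block-device
folio lock and no cycle is reported.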

Also, looking at the log Yunseong shared (hung.log), I can see that the
migration is stuck waiting for a buffer_head lock:
...
[ 3123.713542][   T89] INFO: task syz.4.2628:42733 blocked for more than 143 seconds.
[ 3123.713550][   T89]       Not tainted 6.15.11-00046-g2c223fa7bd9a-dirty #13
[ 3123.713557][   T89] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3123.713562][   T89] task:syz.4.2628      state:D stack:0     pid:42733 tgid:42732 ppid:41804  task_flags:0x400040 flags:0x00000009
[ 3123.713577][   T89] Call trace:
[ 3123.713582][   T89]  __switch_to+0x19c/0x2c0 (T)
[ 3123.713598][   T89]  __schedule+0x514/0x1208
[ 3123.713614][   T89]  schedule+0x40/0x164
[ 3123.713629][   T89]  io_schedule+0x3c/0x5c
[ 3123.713644][   T89]  bit_wait_io+0x14/0x70
[ 3123.713662][   T89]  __wait_on_bit_lock+0xa0/0x120
[ 3123.713678][   T89]  out_of_line_wait_on_bit_lock+0x8c/0xc0
[ 3123.713695][   T89]  __lock_buffer+0x74/0xb8
[ 3123.713720][   T89]  __buffer_migrate_folio+0x190/0x504
[ 3123.713747][   T89]  buffer_migrate_folio_norefs+0x30/0x3c
[ 3123.713764][   T89]  move_to_new_folio+0xe4/0x528
[ 3123.713779][   T89]  migrate_pages_batch+0xee0/0x1788
[ 3123.713795][   T89]  migrate_pages+0x15c4/0x1840
[ 3123.713810][   T89]  compact_zone+0x9c8/0x1d20
[ 3123.713822][   T89]  compact_node+0xd4/0x27c
[ 3123.713832][   T89]  sysctl_compaction_handler+0x104/0x194
[ 3123.713843][   T89]  proc_sys_call_handler+0x25c/0x3f8
[ 3123.713865][   T89]  proc_sys_write+0x20/0x2c
[ 3123.713878][   T89]  do_iter_readv_writev+0x350/0x448
[ 3123.713897][   T89]  vfs_writev+0x1ac/0x44c
[ 3123.713913][   T89]  do_pwritev+0x100/0x15c
[ 3123.713929][   T89]  __arm64_sys_pwritev2+0x6c/0xcc
[ 3123.713945][   T89]  invoke_syscall.constprop.0+0x64/0x18c
[ 3123.713961][   T89]  el0_svc_common.constprop.0+0x80/0x198
[ 3123.713978][   T89]  do_el0_svc+0x28/0x3c
[ 3123.713993][   T89]  el0_svc+0x50/0x220
[ 3123.714004][   T89]  el0t_64_sync_handler+0x10c/0x140
[ 3123.714017][   T89]  el0t_64_sync+0x1b8/0x1bc
...

which is different from the description, "stuck on writeback".
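
For checking other hung-task reports the same way, a small script can
name the wait a blocked task is sitting in. This is only a sketch: the
blocked_on() helper and the WAIT_FRAMES table are hypothetical names made
up here, not part of any kernel tooling, and the table covers only the
two frames relevant to this thread.

```python
import re

# Hypothetical table: call-trace frame -> what the task is waiting for.
WAIT_FRAMES = {
    "__lock_buffer": "buffer_head lock",
    "folio_wait_writeback": "folio writeback",
}

# Matches "symbol+0xoff/0xsize" after the "[ time][ Tnn]" log prefix.
FRAME_RE = re.compile(r"\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def blocked_on(trace_lines):
    """Return the first known wait type found in a call trace, or None."""
    for line in trace_lines:
        match = FRAME_RE.search(line)
        if match and match.group(1) in WAIT_FRAMES:
            return WAIT_FRAMES[match.group(1)]
    return None


# A few frames copied from the trace above.
sample = [
    "[ 3123.713577][   T89] Call trace:",
    "[ 3123.713644][   T89]  bit_wait_io+0x14/0x70",
    "[ 3123.713695][   T89]  __lock_buffer+0x74/0xb8",
    "[ 3123.713720][   T89]  __buffer_migrate_folio+0x190/0x504",
]

print(blocked_on(sample))  # buffer_head lock
```

The frames are scanned from the innermost outwards, so the first match
is the wait closest to where the task actually sleeps.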

Unfortunately, I couldn't analyse this any further with the log he
shared, since it was truncated.

@Yunseong, could you reproduce this without DEPT and share the
full log for further analysis?

Thanks.

[0] https://lore.kernel.org/all/dglxbwe2i5ubofefdxwo5jvyhdfjov37z5jzc5guedhe4dl6ia@pmkjkec3isb4/

--
Sincerely,
Yeoreum Yun



Thread overview: 16+ messages
2025-10-02  8:16 Byungchul Park
2025-10-02 11:38 ` David Hildenbrand
2025-10-02 22:02   ` Hillf Danton
2025-10-03  0:48     ` Byungchul Park
2025-10-03  0:52       ` Byungchul Park
2025-10-07  6:32         ` Yunseong Kim
2025-10-07  7:04           ` David Hildenbrand
2025-10-07  7:53             ` Yeoreum Yun [this message]
2025-10-13  4:36             ` Byungchul Park
2025-10-13  8:08               ` David Hildenbrand
2025-10-03  1:02   ` Byungchul Park
2025-10-03  2:31   ` Byungchul Park
2025-10-03 14:04   ` Pedro Falcato
2025-10-02 11:42 ` Yeoreum Yun
2025-10-02 11:49   ` Yeoreum Yun
2025-10-03  2:08     ` Byungchul Park
