From: Dave Chinner <david@fromorbit.com>
To: mikhail <mikhail.v.gavrilov@gmail.com>
Cc: "linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: freezing system for several second on high I/O [kernel 4.15]
Date: Wed, 7 Feb 2018 17:55:20 +1100
Message-ID: <20180207065520.66f6gocvxlnxmkyv@destitution>
In-Reply-To: <1517974845.4352.8.camel@gmail.com>
On Wed, Feb 07, 2018 at 08:40:45AM +0500, mikhail wrote:
> On Tue, 2018-02-06 at 12:12 +0500, Mikhail Gavrilov wrote:
> > On 6 February 2018 at 11:08, Dave Chinner <david@fromorbit.com> wrote:
> Yet another hung:
> Trace report: https://dumps.sy24.ru/1/trace_report.txt.bz2 (9.4 MB)
> dmesg:
> [ 369.374381] INFO: task TaskSchedulerFo:5624 blocked for more than 120 seconds.
> [ 369.374391] Not tainted 4.15.0-rc4-amd-vega+ #9
> [ 369.374393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 369.374395] TaskSchedulerFo D11688 5624 3825 0x00000000
> [ 369.374400] Call Trace:
> [ 369.374407] __schedule+0x2dc/0xba0
> [ 369.374410] ? __lock_acquire+0x2d4/0x1350
> [ 369.374415] ? __down+0x84/0x110
> [ 369.374417] schedule+0x33/0x90
> [ 369.374419] schedule_timeout+0x25a/0x5b0
> [ 369.374423] ? mark_held_locks+0x5f/0x90
> [ 369.374425] ? _raw_spin_unlock_irq+0x2c/0x40
> [ 369.374426] ? __down+0x84/0x110
> [ 369.374429] ? trace_hardirqs_on_caller+0xf4/0x190
> [ 369.374431] ? __down+0x84/0x110
> [ 369.374433] __down+0xac/0x110
> [ 369.374466] ? _xfs_buf_find+0x263/0xac0 [xfs]
> [ 369.374470] down+0x41/0x50
> [ 369.374472] ? down+0x41/0x50
> [ 369.374490] xfs_buf_lock+0x4e/0x270 [xfs]
> [ 369.374507] _xfs_buf_find+0x263/0xac0 [xfs]
> [ 369.374528] xfs_buf_get_map+0x29/0x490 [xfs]
> [ 369.374545] xfs_buf_read_map+0x2b/0x300 [xfs]
> [ 369.374567] xfs_trans_read_buf_map+0xc4/0x5d0 [xfs]
> [ 369.374585] xfs_read_agi+0xaa/0x200 [xfs]
> [ 369.374605] xfs_iunlink+0x4d/0x150 [xfs]
> [ 369.374609] ? current_time+0x32/0x70
> [ 369.374629] xfs_droplink+0x54/0x60 [xfs]
> [ 369.374654] xfs_rename+0xb15/0xd10 [xfs]
> [ 369.374680] xfs_vn_rename+0xd3/0x140 [xfs]
> [ 369.374687] vfs_rename+0x476/0x960
> [ 369.374695] SyS_rename+0x33f/0x390
> [ 369.374704] entry_SYSCALL_64_fastpath+0x1f/0x96
Again, this is waiting on a lock....
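For context: xfs_buf_lock() is essentially a semaphore down on the
buffer, which is why the blocked task surfaces in down()/__down()
underneath _xfs_buf_find(). A simplified sketch (from memory of that
era's fs/xfs/xfs_buf.c, not verbatim source):

/* Simplified sketch of the buffer lock, not verbatim kernel source. */
void xfs_buf_lock(struct xfs_buf *bp)
{
    /*
     * Sleep in __down() until whoever currently owns the buffer
     * releases it via up(&bp->b_sema) in xfs_buf_unlock().
     */
    down(&bp->b_sema);
}

So this task isn't stuck in XFS as such - it's queued behind whoever
currently holds that buffer lock.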
> [ 369.374707] RIP: 0033:0x7f01cf705137
> [ 369.374708] RSP: 002b:00007f01873e5608 EFLAGS: 00000202 ORIG_RAX: 0000000000000052
> [ 369.374710] RAX: ffffffffffffffda RBX: 0000000000000119 RCX: 00007f01cf705137
> [ 369.374711] RDX: 00007f01873e56dc RSI: 00003a5cd3540850 RDI: 00003a5cd7ea8000
> [ 369.374713] RBP: 00007f01873e6340 R08: 0000000000000000 R09: 00007f01873e54e0
> [ 369.374714] R10: 00007f01873e55f0 R11: 0000000000000202 R12: 00007f01873e6218
> [ 369.374715] R13: 00007f01873e6358 R14: 0000000000000000 R15: 00003a5cd8416000
> [ 369.374725] INFO: task disk_cache:0:3971 blocked for more than 120 seconds.
> [ 369.374727] Not tainted 4.15.0-rc4-amd-vega+ #9
> [ 369.374729] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 369.374731] disk_cache:0 D12432 3971 3903 0x00000000
> [ 369.374735] Call Trace:
> [ 369.374738] __schedule+0x2dc/0xba0
> [ 369.374743] ? wait_for_completion+0x10e/0x1a0
> [ 369.374745] schedule+0x33/0x90
> [ 369.374747] schedule_timeout+0x25a/0x5b0
> [ 369.374751] ? mark_held_locks+0x5f/0x90
> [ 369.374753] ? _raw_spin_unlock_irq+0x2c/0x40
> [ 369.374755] ? wait_for_completion+0x10e/0x1a0
> [ 369.374757] ? trace_hardirqs_on_caller+0xf4/0x190
> [ 369.374760] ? wait_for_completion+0x10e/0x1a0
> [ 369.374762] wait_for_completion+0x136/0x1a0
> [ 369.374765] ? wake_up_q+0x80/0x80
> [ 369.374782] ? _xfs_buf_read+0x23/0x30 [xfs]
> [ 369.374798] xfs_buf_submit_wait+0xb2/0x530 [xfs]
> [ 369.374814] _xfs_buf_read+0x23/0x30 [xfs]
> [ 369.374828] xfs_buf_read_map+0x14b/0x300 [xfs]
> [ 369.374847] ? xfs_trans_read_buf_map+0xc4/0x5d0 [xfs]
> [ 369.374867] xfs_trans_read_buf_map+0xc4/0x5d0 [xfs]
> [ 369.374883] xfs_da_read_buf+0xca/0x110 [xfs]
> [ 369.374901] xfs_dir3_data_read+0x23/0x60 [xfs]
> [ 369.374916] xfs_dir2_leaf_addname+0x335/0x8b0 [xfs]
> [ 369.374936] xfs_dir_createname+0x17e/0x1d0 [xfs]
> [ 369.374956] xfs_create+0x6ad/0x840 [xfs]
> [ 369.374981] xfs_generic_create+0x1fa/0x2d0 [xfs]
> [ 369.375000] xfs_vn_mknod+0x14/0x20 [xfs]
> [ 369.375016] xfs_vn_create+0x13/0x20 [xfs]
That lock is held by this process, which is waiting for IO completion.
There's nothing in the traces relating to this IO, because the trace
only starts at 270s after boot, and this process has been waiting
since submitting its IO at 250s after boot. The traces tell me that
IO is still running, but it only takes one IO to go missing for
everything to have problems.
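To make the dependency explicit, here's a rough sketch of the
synchronous buffer read it is stuck in (hedged, from memory, not
verbatim source): the waiter sleeps on a completion that only the IO
completion handler ever fires, so a completion that never arrives
from below means the task sleeps forever while still holding the
buffer lock.

/* Rough sketch of the synchronous buffer IO wait, not verbatim source. */
int xfs_buf_submit_wait(struct xfs_buf *bp)
{
    /* build and submit the bio(s) backing this buffer */
    _xfs_buf_ioapply(bp);

    /*
     * Block until the block layer's end_io path runs xfs_buf_ioend(),
     * which does complete(&bp->b_iowait). If the device never reports
     * the IO as done, this never returns.
     */
    wait_for_completion(&bp->b_iowait);
    return bp->b_error;
}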
> [ 369.378073] =============================================
>
> Again false positive?
The lockdep false positive in the previous trace has nothing to do
with the IO completion/hung task issue here.
> If it not fs problem why process blocked for such time?
Lots of reasons, but the typical reason for IO completion not
occurring is broken hardware.
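For reference, the "blocked for more than 120 seconds" reports come
from the hung task watchdog: all it notices is that a task has sat in
uninterruptible (D state) sleep without a single context switch for
the whole timeout - it has no idea why. A paraphrase of the check in
kernel/hung_task.c (not exact code):

/* Paraphrase of the hung task check (kernel/hung_task.c), not exact code. */
static void check_hung_task(struct task_struct *t, unsigned long timeout)
{
    unsigned long switch_count = t->nvcsw + t->nivcsw;

    /* the caller only passes tasks sitting in TASK_UNINTERRUPTIBLE */
    if (switch_count != t->last_switch_count) {
        t->last_switch_count = switch_count;
        return;    /* it has run since the last scan; not hung */
    }

    /* no context switch for 'timeout' seconds: report it */
    pr_err("INFO: task %s:%d blocked for more than %lu seconds.\n",
           t->comm, t->pid, timeout);
    sched_show_task(t);    /* prints the call traces quoted above */
}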
And it looks like there's another, different trace appended, and this
one is quite instructive:
> [ 237.360627] task PC stack pid father
> [ 237.360882] tracker-store D12296 2481 1846 0x00000000
> [ 237.360894] Call Trace:
> [ 237.360901] __schedule+0x2dc/0xba0
> [ 237.360905] ? _raw_spin_unlock_irq+0x2c/0x40
> [ 237.360913] schedule+0x33/0x90
> [ 237.360918] io_schedule+0x16/0x40
> [ 237.360923] generic_file_read_iter+0x3b8/0xe10
> [ 237.360937] ? page_cache_tree_insert+0x140/0x140
> [ 237.360985] xfs_file_buffered_aio_read+0x6e/0x1a0 [xfs]
> [ 237.361018] xfs_file_read_iter+0x68/0xc0 [xfs]
> [ 237.361023] __vfs_read+0xf1/0x160
> [ 237.361035] vfs_read+0xa3/0x150
> [ 237.361042] SyS_pread64+0x98/0xc0
> [ 237.361048] entry_SYSCALL_64_fastpath+0x1f/0x96
There are multiple processes stuck waiting for data read IO
completion.
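That is the normal buffered read path: the page isn't uptodate, so the
reader kicks off (or finds) a read IO for it and then sleeps in
io_schedule() until the read completion unlocks the page. A simplified
sketch of the pattern (the helper name is made up for illustration,
not the actual mm/filemap.c code):

/* Illustrative sketch of why a buffered read sleeps in io_schedule(). */
static int read_page_and_wait(struct file *filp, struct page *page)
{
    struct address_space *mapping = filp->f_mapping;
    int error;

    /* start the read IO for this page */
    error = mapping->a_ops->readpage(filp, page);
    if (error)
        return error;

    /*
     * Sleep until the read completion unlocks the page and marks it
     * uptodate. A completion that never arrives leaves the reader in
     * D state indefinitely.
     */
    wait_on_page_locked(page);
    if (!PageUptodate(page))
        return -EIO;
    return 0;
}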
> [ 237.361546] TaskSchedulerFo D12184 4005 3825 0x00000000
> [ 237.361554] Call Trace:
> [ 237.361560] __schedule+0x2dc/0xba0
> [ 237.361566] ? _raw_spin_unlock_irq+0x2c/0x40
> [ 237.361572] schedule+0x33/0x90
> [ 237.361577] io_schedule+0x16/0x40
> [ 237.361581] wait_on_page_bit+0xd7/0x170
> [ 237.361588] ? page_cache_tree_insert+0x140/0x140
> [ 237.361596] truncate_inode_pages_range+0x702/0x9d0
> [ 237.361606] ? generic_write_end+0x98/0x100
> [ 237.361617] ? sched_clock+0x9/0x10
> [ 237.361623] ? unmap_mapping_range+0x76/0x130
> [ 237.361632] ? up_write+0x1f/0x40
> [ 237.361636] ? unmap_mapping_range+0x76/0x130
> [ 237.361643] truncate_pagecache+0x48/0x70
> [ 237.361648] truncate_setsize+0x32/0x40
> [ 237.361677] xfs_setattr_size+0xe3/0x340 [xfs]
And there's a truncate blocked waiting for data IO completion.
> [ 237.361842] wait_for_completion+0x136/0x1a0
> [ 237.361846] ? wake_up_q+0x80/0x80
> [ 237.361872] ? _xfs_buf_read+0x23/0x30 [xfs]
> [ 237.361898] xfs_buf_submit_wait+0xb2/0x530 [xfs]
> [ 237.361935] _xfs_buf_read+0x23/0x30 [xfs]
> [ 237.361958] xfs_buf_read_map+0x14b/0x300 [xfs]
> [ 237.361981] ? xfs_trans_read_buf_map+0xc4/0x5d0 [xfs]
> [ 237.362003] xfs_trans_read_buf_map+0xc4/0x5d0 [xfs]
> [ 237.362005] ? rcu_read_lock_sched_held+0x79/0x80
> [ 237.362027] xfs_imap_to_bp+0x67/0xe0 [xfs]
> [ 237.362056] xfs_iread+0x86/0x220 [xfs]
> [ 237.362090] xfs_iget+0x4c5/0x1070 [xfs]
> [ 237.362094] ? kfree+0xfe/0x2e0
> [ 237.362132] xfs_lookup+0x149/0x1e0 [xfs]
> [ 237.362164] xfs_vn_lookup+0x70/0xb0 [xfs]
> [ 237.362172] lookup_slow+0x132/0x220
> [ 237.362192] walk_component+0x1bd/0x340
> [ 237.362202] path_lookupat+0x84/0x1f0
> [ 237.362212] filename_lookup+0xb6/0x190
> [ 237.362226] ? __check_object_size+0xaf/0x1b0
> [ 237.362233] ? strncpy_from_user+0x4d/0x170
> [ 237.362242] user_path_at_empty+0x36/0x40
> [ 237.362245] ? user_path_at_empty+0x36/0x40
> [ 237.362250] vfs_statx+0x76/0xe0
And there's a stat() call blocked waiting for inode cluster read IO
completion.
> [ 237.362321] TaskSchedulerFo D11576 5625 3825 0x00000000
> [ 237.362328] Call Trace:
> [ 237.362334] __schedule+0x2dc/0xba0
> [ 237.362339] ? _raw_spin_unlock_irq+0x2c/0x40
> [ 237.362346] schedule+0x33/0x90
> [ 237.362350] io_schedule+0x16/0x40
> [ 237.362355] wait_on_page_bit_common+0x10a/0x1a0
> [ 237.362362] ? page_cache_tree_insert+0x140/0x140
> [ 237.362371] __filemap_fdatawait_range+0xfd/0x190
> [ 237.362389] filemap_write_and_wait_range+0x4b/0x90
> [ 237.362412] xfs_setattr_size+0x10b/0x340 [xfs]
> [ 237.362433] ? setattr_prepare+0x69/0x190
> [ 237.362464] xfs_vn_setattr_size+0x57/0x150 [xfs]
> [ 237.362493] xfs_vn_setattr+0x87/0xb0 [xfs]
> [ 237.362502] notify_change+0x300/0x420
> [ 237.362512] do_truncate+0x73/0xc0
Truncate explicitly waiting for data writeback completion.
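That's the size-change path flushing dirty data and waiting for all of
the outstanding data IO to complete before the truncate proceeds;
sketched roughly below (illustrative only - the "_sketch" name and the
whole-file writeback range are mine, not the real xfs_setattr_size
body):

/* Rough, illustrative sketch of the size-change path seen above. */
int xfs_setattr_size_sketch(struct inode *inode, loff_t newsize)
{
    int error;

    /*
     * Write back and wait on dirty pages before the size changes;
     * this blocks until every outstanding data IO completion has come
     * back from the device. (Whole file here for simplicity; the real
     * code restricts the range.)
     */
    error = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX);
    if (error)
        return error;

    /* then update i_size and tear down the page cache beyond it */
    truncate_setsize(inode, newsize);
    return 0;
}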
> [ 237.362586] Call Trace:
> [ 237.362593] __schedule+0x2dc/0xba0
> [ 237.362622] ? _xfs_log_force_lsn+0x2d4/0x360 [xfs]
> [ 237.362634] schedule+0x33/0x90
> [ 237.362664] _xfs_log_force_lsn+0x2d9/0x360 [xfs]
> [ 237.362671] ? wake_up_q+0x80/0x80
> [ 237.362705] xfs_file_fsync+0x10f/0x2b0 [xfs]
> [ 237.362718] vfs_fsync_range+0x4e/0xb0
> [ 237.362726] do_fsync+0x3d/0x70
fsync() waiting on log write IO completion.
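xfs_file_fsync() writes back the dirty data, then forces the log up to
the LSN covering the inode's last logged change and waits for that log
buffer write to finish - which is the wait shown above. A rough sketch
(the "_sketch" name is mine; locking, cache flushes and other details
are omitted):

/* Rough sketch of the XFS fsync path, heavily simplified. */
int xfs_file_fsync_sketch(struct file *file, loff_t start, loff_t end,
                          int datasync)
{
    struct inode *inode = file->f_mapping->host;
    struct xfs_inode *ip = XFS_I(inode);
    xfs_lsn_t lsn;
    int error;

    /* flush and wait for the file's dirty data first */
    error = filemap_write_and_wait_range(inode->i_mapping, start, end);
    if (error)
        return error;

    /*
     * Force the log up to the LSN of the inode's last logged change
     * and wait for that log write IO to complete.
     */
    lsn = ip->i_itemp ? ip->i_itemp->ili_last_lsn : 0;
    if (lsn)
        error = _xfs_log_force_lsn(ip->i_mount, lsn, XFS_LOG_SYNC, NULL);
    return error;
}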
> [ 237.362804] schedule_timeout+0x25a/0x5b0
> [ 237.362811] ? mark_held_locks+0x5f/0x90
> [ 237.362815] ? _raw_spin_unlock_irq+0x2c/0x40
> [ 237.362818] ? wait_for_completion+0x10e/0x1a0
> [ 237.362823] ? trace_hardirqs_on_caller+0xf4/0x190
> [ 237.362829] ? wait_for_completion+0x10e/0x1a0
> [ 237.362833] wait_for_completion+0x136/0x1a0
> [ 237.362838] ? wake_up_q+0x80/0x80
> [ 237.362868] ? _xfs_buf_read+0x23/0x30 [xfs]
> [ 237.362897] xfs_buf_submit_wait+0xb2/0x530 [xfs]
> [ 237.362926] _xfs_buf_read+0x23/0x30 [xfs]
> [ 237.362952] xfs_buf_read_map+0x14b/0x300 [xfs]
> [ 237.362981] ? xfs_trans_read_buf_map+0xc4/0x5d0 [xfs]
> [ 237.363014] xfs_trans_read_buf_map+0xc4/0x5d0 [xfs]
> [ 237.363019] ? rcu_read_lock_sched_held+0x79/0x80
> [ 237.363048] xfs_imap_to_bp+0x67/0xe0 [xfs]
> [ 237.363081] xfs_iread+0x86/0x220 [xfs]
> [ 237.363116] xfs_iget+0x4c5/0x1070 [xfs]
> [ 237.363121] ? kfree+0xfe/0x2e0
> [ 237.363162] xfs_lookup+0x149/0x1e0 [xfs]
> [ 237.363195] xfs_vn_lookup+0x70/0xb0 [xfs]
> [ 237.363204] lookup_open+0x2dc/0x7c0
open() waiting on inode read IO completion.
.....
And there are a lot more threads all waiting on IO completion, for
both data and metadata, so I'm not going to bother commenting further
because filesystems don't hang like this by themselves.
i.e. This has all the hallmarks of something below the filesystem
dropping IO completions, such as the hardware being broken. The
filesystem is just the messenger....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com