On Tue, 2018-02-06 at 12:12 +0500, Mikhail Gavrilov wrote:
> On 6 February 2018 at 11:08, Dave Chinner wrote:
> > You collected a trace of something, but didn't supply any of the
> > other storage and fs config stuff that was mentioned in that link.
>
> Sorry.
> Anyway, this information was in the attached dmesg.
>
> > > Can't you now look into it?
> >
> > I don't see a filesystem problem from the log you've posted, I see
> > some slow IO a minute after boot, then a lockdep false positive
> > about 40mins in, but the system then reports GPU memory allocation
> > problems and hardware MCEs for the next 4-5 hours before the GPU
> > appears to stop working.
>
> Is a lockdep warning about 40 minutes in normal? I don't think so.
>
> > [....]
> >
> > > [ 4.687255] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
> >
> > I'm guessing that you have an ssd with ext4 and two 4TB drives.
> > And ext4 is on the SSD?
>
> Yes, right.
>
> > > [ 5.778628] EXT4-fs (sda1): re-mounted. Opts: (null)
> > .....
> > > [ 7.918812] XFS (sdb): Mounting V5 Filesystem
> > > [ 8.123854] XFS (sdb): Starting recovery (logdev: internal)
> >
> > And there's an XFS filesystem on one drive...
>
> Yep.
>
> > > [ 77.459679] sysrq: SysRq : Show Blocked State
> > > [ 77.459693] task PC stack pid father
> > > [ 77.459947] tracker-store D12296 2469 1847 0x00000000
> > > [ 77.459957] Call Trace:
> > > [ 77.459963] __schedule+0x2dc/0xba0
> > > [ 77.459966] ? _raw_spin_unlock_irq+0x2c/0x40
> > > [ 77.459970] schedule+0x33/0x90
> > > [ 77.459974] io_schedule+0x16/0x40
> > > [ 77.459978] generic_file_read_iter+0x3b8/0xe10
> > > [ 77.459986] ? page_cache_tree_insert+0x140/0x140
> > > [ 77.460026] xfs_file_buffered_aio_read+0x6e/0x1a0 [xfs]
> > > [ 77.460054] xfs_file_read_iter+0x68/0xc0 [xfs]
> > > [ 77.460058] __vfs_read+0xf1/0x160
> > > [ 77.460065] vfs_read+0xa3/0x150
> > > [ 77.460069] SyS_pread64+0x98/0xc0
> > > [ 77.460074] entry_SYSCALL_64_fastpath+0x1f/0x96
> >
> > That's waiting on a read IO - no indication of anything being wrong
> > here....
> >
> > > [ 2095.241660] TaskSchedulerFo (16168) used greatest stack depth: 10232 bytes left
> > >
> > > [ 2173.204790] ======================================================
> > > [ 2173.204791] WARNING: possible circular locking dependency detected
> > > [ 2173.204793] 4.15.0-rc4-amd-vega+ #8 Not tainted
> > > [ 2173.204794] ------------------------------------------------------
> > > [ 2173.204795] gnome-shell/1971 is trying to acquire lock:
> > > [ 2173.204796] (sb_internal){.+.+}, at: [<00000000221fd49d>] xfs_trans_alloc+0xec/0x130 [xfs]
> > > [ 2173.204832]
> > > but task is already holding lock:
> > > [ 2173.204833] (fs_reclaim){+.+.}, at: [<00000000bdc32871>] fs_reclaim_acquire.part.74+0x5/0x30
> > > [ 2173.204837]
> > > which lock already depends on the new lock.
> >
> > And here we go again on another lockdep memory-reclaim false positive
> > whack-a-mole game.
>
> This is where the interface starts lagging.
>
> >
> > > [ 2173.204838]
> > > the existing dependency chain (in reverse order) is:
> > > [ 2173.204839]
> > > -> #1 (fs_reclaim){+.+.}:
> > > [ 2173.204843] fs_reclaim_acquire.part.74+0x29/0x30
> > > [ 2173.204844] fs_reclaim_acquire+0x19/0x20
> > > [ 2173.204846] kmem_cache_alloc+0x33/0x300
> > > [ 2173.204870] kmem_zone_alloc+0x6c/0xf0 [xfs]
> > > [ 2173.204891] xfs_trans_alloc+0x6b/0x130 [xfs]
> > > [ 2173.204912] xfs_efi_recover+0x11c/0x1c0 [xfs]
> > > [ 2173.204932] xlog_recover_process_efi+0x41/0x60 [xfs]
> > > [ 2173.204951] xlog_recover_process_intents.isra.40+0x138/0x270 [xfs]
> > > [ 2173.204969] xlog_recover_finish+0x23/0xb0 [xfs]
> > > [ 2173.204987] xfs_log_mount_finish+0x61/0xe0 [xfs]
> > > [ 2173.205005] xfs_mountfs+0x657/0xa60 [xfs]
> > > [ 2173.205022] xfs_fs_fill_super+0x4aa/0x630 [xfs]
> > > [ 2173.205024] mount_bdev+0x184/0x1c0
> > > [ 2173.205042] xfs_fs_mount+0x15/0x20 [xfs]
> > > [ 2173.205043] mount_fs+0x32/0x150
> > > [ 2173.205045] vfs_kern_mount.part.25+0x5d/0x160
> > > [ 2173.205046] do_mount+0x65d/0xde0
> > > [ 2173.205047] SyS_mount+0x98/0xe0
> > > [ 2173.205049] do_syscall_64+0x6c/0x220
> > > [ 2173.205052] return_from_SYSCALL_64+0x0/0x75
> > > [ 2173.205053]
> > > -> #0 (sb_internal){.+.+}:
> > > [ 2173.205056] lock_acquire+0xa3/0x1f0
> > > [ 2173.205058] __sb_start_write+0x11c/0x190
> > > [ 2173.205075] xfs_trans_alloc+0xec/0x130 [xfs]
> > > [ 2173.205091] xfs_free_eofblocks+0x12a/0x1e0 [xfs]
> > > [ 2173.205108] xfs_inactive+0xf0/0x110 [xfs]
> > > [ 2173.205125] xfs_fs_destroy_inode+0xbb/0x2d0 [xfs]
> > > [ 2173.205127] destroy_inode+0x3b/0x60
> > > [ 2173.205128] evict+0x13e/0x1a0
> > > [ 2173.205129] dispose_list+0x56/0x80
> > > [ 2173.205131] prune_icache_sb+0x5a/0x80
> > > [ 2173.205132] super_cache_scan+0x137/0x1b0
> > > [ 2173.205134] shrink_slab.part.47+0x1fb/0x590
> > > [ 2173.205135] shrink_slab+0x29/0x30
> > > [ 2173.205136] shrink_node+0x11e/0x2f0
> > > [ 2173.205137] do_try_to_free_pages+0xd0/0x350
> > > [ 2173.205138] try_to_free_pages+0x136/0x340
> > > [ 2173.205140] __alloc_pages_slowpath+0x487/0x1150
> > > [ 2173.205141] __alloc_pages_nodemask+0x3a8/0x430
> > > [ 2173.205143] dma_generic_alloc_coherent+0x91/0x160
> > > [ 2173.205146] x86_swiotlb_alloc_coherent+0x25/0x50
> > > [ 2173.205150] ttm_dma_pool_get_pages+0x230/0x630 [ttm]
> >
> > OK, new symptom of the ages old problem with using lockdep for
> > annotating things that are not locks. In this case, it's both
> > memory reclaim and filesystem freeze annotations that are colliding
> > with an XFS function that can be called above and below memory
> > allocation and producing a false positive.
> >
> > i.e. it's perfectly safe for us to call xfs_trans_alloc() in the
> > manner we are from memory reclaim because we're not in a GFP_NOFS or
> > PF_MEMALLOC_NOFS context.
> >
> > And it's also perfectly safe for us to call xfs_trans_alloc from log
> > recovery at mount time like we are because the filesystem cannot be
> > frozen before a mount is complete and hence sb_internal ordering is
> > completely irrelevant at that point.
> >
> > So it's a false positive, and I don't think there's anything we can
> > do to prevent it because using __GFP_NOLOCKDEP in xfs_trans_alloc()
> > will mean lockdep will not warn when we have a real deadlock due to
> > transaction nesting in memory reclaim contexts.....
> >
> > From here, there's nothing filesystem related in the logs:
>
> But this is where I feel the system hang.
>
> > [.....]
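
Side note, to make sure I understand Dave's point: the PF_MEMALLOC_NOFS context he
mentions is what memalloc_nofs_save()/memalloc_nofs_restore() set up, and every
allocation made inside such a region is implicitly treated as GFP_NOFS, so reclaim
cannot recurse back into the filesystem. A minimal sketch of that, with a
hypothetical helper name, not the actual XFS code:

  #include <linux/sched/mm.h>
  #include <linux/slab.h>

  /*
   * Hypothetical helper: mark the current task PF_MEMALLOC_NOFS so any
   * allocation in this region is implicitly GFP_NOFS and cannot recurse
   * into filesystem reclaim.
   */
  static void *alloc_without_fs_reclaim(size_t size)
  {
          unsigned int nofs_flags;
          void *p;

          nofs_flags = memalloc_nofs_save();      /* sets PF_MEMALLOC_NOFS */
          p = kmalloc(size, GFP_KERNEL);          /* treated as GFP_NOFS here */
          memalloc_nofs_restore(nofs_flags);
          return p;
  }

  /*
   * The alternative Dave rejects: tagging the allocation with
   * __GFP_NOLOCKDEP opts it out of lockdep's fs_reclaim tracking
   * entirely, which would also hide real reclaim-recursion deadlocks:
   *
   *      p = kmalloc(size, GFP_KERNEL | __GFP_NOLOCKDEP);
   */

If that reading is right, the report above is lockdep connecting two contexts that
can never actually nest, which is why Dave calls it a false positive.
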
> >
> > > [ 2229.274826] swiotlb: coherent allocation failed for device 0000:07:00.0 size=2097152
> >
> > You are getting gpu memory allocation failures....
> >
> > > [ 2234.832320] amdgpu 0000:07:00.0: swiotlb buffer is full (sz: 2097152 bytes)
> > > [ 2234.832325] swiotlb: coherent allocation failed for device 0000:07:00.0 size=2097152
> >
> > repeatedly, until ....
> >
> > > [ 2938.815747] mce: [Hardware Error]: Machine check events logged
> >
> > your hardware starts throwing errors at the CPU.
>
> These are false-positive events due to a Haswell CPU under virtualization:
> https://bugs.launchpad.net/qemu/+bug/1307225
> I really don't know why Intel still hasn't fixed it.
>
> >
> > > [ 2999.259697] kworker/dying (16220) used greatest stack depth: 9808 bytes left
> > > [ 3151.714448] perf: interrupt took too long (2521 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
> > > [ 5331.990934] TCP: request_sock_TCP: Possible SYN flooding on port 8201. Sending cookies. Check SNMP counters.
> > > [ 5331.991837] TCP: request_sock_TCP: Possible SYN flooding on port 9208. Sending cookies. Check SNMP counters.
> > > [ 5334.781978] TCP: request_sock_TCP: Possible SYN flooding on port 7171. Sending cookies. Check SNMP counters.
> >
> > other bad things are happening to your machine....
>
> What does that mean?
>
> >
> > > [ 5354.636542] sbis3plugin[29294]: segfault at 8 ip 000000001c84acaf sp 0000000038851b8c error 4 in
> > > libQt5Core.so[7f87ee665000+5a3000]
> > > [ 5794.612947] perf: interrupt took too long (3152 > 3151), lowering kernel.perf_event_max_sample_rate to 63000
> > > [ 6242.114852] amdgpu 0000:07:00.0: swiotlb buffer is full (sz: 2097152 bytes)
> > > [ 6242.114857] swiotlb: coherent allocation failed for device 0000:07:00.0 size=2097152
> >
> > Random userspace segfaults and more gpu memory allocation failures
>
> The sbis3plugin segfault has already been reported, so no need to pay attention to it.
>
> >
> > > [ 9663.267767] mce: [Hardware Error]: Machine check events logged
> > > [10322.649619] mce: [Hardware Error]: Machine check events logged
> > > [10557.294312] amdgpu 0000:07:00.0: swiotlb buffer is full (sz: 2097152 bytes)
> > > [10557.294317] swiotlb: coherent allocation failed for device 0000:07:00.0 size=2097152
> >
> > more hardware and gpu memory allocation failures
> >
> > [more gpu memalloc failures]
> >
> > > [13609.734065] mce: [Hardware Error]: Machine check events logged
> > > [13920.399283] mce: [Hardware Error]: Machine check events logged
>
> I would not pay attention to "mce: [Hardware Error]: Machine check
> events logged" because this is a well-known problem with a virtualized
> Haswell CPU.
> >
> > [14116.872461] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=1653418, last
> > emitted seq=1653420
> > [14116.872466] [drm] No hardware hang detected. Did some blocks stall?
>
> This is also a well-known problem with the latest AMD Vega GPU:
> https://bugs.freedesktop.org/show_bug.cgi?id=104001
>
> > And finally after 4+ hours hardware errors and the GPU times out and
> > drm is confused....
> >
> > So there doesn't appear to be any filesystem problem here, just a
> > heavily loaded system under memory pressure....
>
> This is very strange because the machine has plenty of physical
> memory (32GB) and only half of it was used.
>
> --
> Best Regards,
> Mike Gavrilov.
>

Yet another hang:

Trace report: https://dumps.sy24.ru/1/trace_report.txt.bz2 (9.4 MB)

dmesg:

[ 369.374381] INFO: task TaskSchedulerFo:5624 blocked for more than 120 seconds.
[ 369.374391] Not tainted 4.15.0-rc4-amd-vega+ #9
[ 369.374393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 369.374395] TaskSchedulerFo D11688 5624 3825 0x00000000
[ 369.374400] Call Trace:
[ 369.374407] __schedule+0x2dc/0xba0
[ 369.374410] ? __lock_acquire+0x2d4/0x1350
[ 369.374415] ? __down+0x84/0x110
[ 369.374417] schedule+0x33/0x90
[ 369.374419] schedule_timeout+0x25a/0x5b0
[ 369.374423] ? mark_held_locks+0x5f/0x90
[ 369.374425] ? _raw_spin_unlock_irq+0x2c/0x40
[ 369.374426] ? __down+0x84/0x110
[ 369.374429] ? trace_hardirqs_on_caller+0xf4/0x190
[ 369.374431] ? __down+0x84/0x110
[ 369.374433] __down+0xac/0x110
[ 369.374466] ? _xfs_buf_find+0x263/0xac0 [xfs]
[ 369.374470] down+0x41/0x50
[ 369.374472] ? down+0x41/0x50
[ 369.374490] xfs_buf_lock+0x4e/0x270 [xfs]
[ 369.374507] _xfs_buf_find+0x263/0xac0 [xfs]
[ 369.374528] xfs_buf_get_map+0x29/0x490 [xfs]
[ 369.374545] xfs_buf_read_map+0x2b/0x300 [xfs]
[ 369.374567] xfs_trans_read_buf_map+0xc4/0x5d0 [xfs]
[ 369.374585] xfs_read_agi+0xaa/0x200 [xfs]
[ 369.374605] xfs_iunlink+0x4d/0x150 [xfs]
[ 369.374609] ? current_time+0x32/0x70
[ 369.374629] xfs_droplink+0x54/0x60 [xfs]
[ 369.374654] xfs_rename+0xb15/0xd10 [xfs]
[ 369.374680] xfs_vn_rename+0xd3/0x140 [xfs]
[ 369.374687] vfs_rename+0x476/0x960
[ 369.374695] SyS_rename+0x33f/0x390
[ 369.374704] entry_SYSCALL_64_fastpath+0x1f/0x96
[ 369.374707] RIP: 0033:0x7f01cf705137
[ 369.374708] RSP: 002b:00007f01873e5608 EFLAGS: 00000202 ORIG_RAX: 0000000000000052
[ 369.374710] RAX: ffffffffffffffda RBX: 0000000000000119 RCX: 00007f01cf705137
[ 369.374711] RDX: 00007f01873e56dc RSI: 00003a5cd3540850 RDI: 00003a5cd7ea8000
[ 369.374713] RBP: 00007f01873e6340 R08: 0000000000000000 R09: 00007f01873e54e0
[ 369.374714] R10: 00007f01873e55f0 R11: 0000000000000202 R12: 00007f01873e6218
[ 369.374715] R13: 00007f01873e6358 R14: 0000000000000000 R15: 00003a5cd8416000
[ 369.374725] INFO: task disk_cache:0:3971 blocked for more than 120 seconds.
[ 369.374727] Not tainted 4.15.0-rc4-amd-vega+ #9
[ 369.374729] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 369.374731] disk_cache:0 D12432 3971 3903 0x00000000
[ 369.374735] Call Trace:
[ 369.374738] __schedule+0x2dc/0xba0
[ 369.374743] ? wait_for_completion+0x10e/0x1a0
[ 369.374745] schedule+0x33/0x90
[ 369.374747] schedule_timeout+0x25a/0x5b0
[ 369.374751] ? mark_held_locks+0x5f/0x90
[ 369.374753] ? _raw_spin_unlock_irq+0x2c/0x40
[ 369.374755] ? wait_for_completion+0x10e/0x1a0
[ 369.374757] ? trace_hardirqs_on_caller+0xf4/0x190
[ 369.374760] ? wait_for_completion+0x10e/0x1a0
[ 369.374762] wait_for_completion+0x136/0x1a0
[ 369.374765] ? wake_up_q+0x80/0x80
[ 369.374782] ? _xfs_buf_read+0x23/0x30 [xfs]
[ 369.374798] xfs_buf_submit_wait+0xb2/0x530 [xfs]
[ 369.374814] _xfs_buf_read+0x23/0x30 [xfs]
[ 369.374828] xfs_buf_read_map+0x14b/0x300 [xfs]
[ 369.374847] ? xfs_trans_read_buf_map+0xc4/0x5d0 [xfs]
[ 369.374867] xfs_trans_read_buf_map+0xc4/0x5d0 [xfs]
[ 369.374883] xfs_da_read_buf+0xca/0x110 [xfs]
[ 369.374901] xfs_dir3_data_read+0x23/0x60 [xfs]
[ 369.374916] xfs_dir2_leaf_addname+0x335/0x8b0 [xfs]
[ 369.374936] xfs_dir_createname+0x17e/0x1d0 [xfs]
[ 369.374956] xfs_create+0x6ad/0x840 [xfs]
[ 369.374981] xfs_generic_create+0x1fa/0x2d0 [xfs]
[ 369.375000] xfs_vn_mknod+0x14/0x20 [xfs]
[ 369.375016] xfs_vn_create+0x13/0x20 [xfs]
[ 369.375018] lookup_open+0x5ea/0x7c0
[ 369.375025] ? __wake_up_common_lock+0x65/0xc0
[ 369.375032] path_openat+0x318/0xc80
[ 369.375039] do_filp_open+0x9b/0x110
[ 369.375047] ? _raw_spin_unlock+0x27/0x40
[ 369.375053] do_sys_open+0x1ba/0x250
[ 369.375055] ? do_sys_open+0x1ba/0x250
[ 369.375059] SyS_openat+0x14/0x20
[ 369.375062] entry_SYSCALL_64_fastpath+0x1f/0x96
[ 369.375063] RIP: 0033:0x7f616bf1b080
[ 369.375064] RSP: 002b:00007f614bd56930 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[ 369.375067] RAX: ffffffffffffffda RBX: 00003d8825112800 RCX: 00007f616bf1b080
[ 369.375068] RDX: 0000000000080041 RSI: 00003d8824da6070 RDI: ffffffffffffff9c
[ 369.375069] RBP: 0000000000000022 R08: 0000000000000000 R09: 0000000000000050
[ 369.375070] R10: 00000000000001a4 R11: 0000000000000293 R12: 00007f614bd569c8
[ 369.375071] R13: 0000000000000008 R14: 00003d8824da6150 R15: 0000000000000008
[ 369.375206] Showing all locks held in the system:
[ 369.375215] 5 locks held by kworker/2:1/60:
[ 369.375221] #0: ((wq_completion)"xfs-eofblocks/%s"mp->m_fsname){+.+.}, at: [<00000000731c4c52>] process_one_work+0x1b9/0x680
[ 369.375230] #1: ((work_completion)(&(&mp->m_eofblocks_work)->work)){+.+.}, at: [<00000000731c4c52>] process_one_work+0x1b9/0x680
[ 369.375236] #2: (&sb->s_type->i_mutex_key#20){++++}, at: [<00000000d49e2308>] xfs_ilock_nowait+0x12d/0x270 [xfs]
[ 369.375258] #3: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.375281] #4: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.375301] 1 lock held by khungtaskd/67:
[ 369.375302] #0: (tasklist_lock){.+.+}, at: [<000000006840dd64>] debug_show_all_locks+0x3d/0x1a0
[ 369.375314] 3 locks held by kworker/u16:5/148:
[ 369.375315] #0: ((wq_completion)"writeback"){+.+.}, at: [<00000000731c4c52>] process_one_work+0x1b9/0x680
[ 369.375321] #1: ((work_completion)(&(&wb->dwork)->work)){+.+.}, at: [<00000000731c4c52>] process_one_work+0x1b9/0x680
[ 369.375327] #2: (&type->s_umount_key#63){++++}, at: [<0000000022e51a82>] trylock_super+0x1b/0x50
[ 369.375392] 4 locks held by gnome-shell/1970:
[ 369.375393] #0: (&mm->mmap_sem){++++}, at: [<00000000642ae303>] vm_mmap_pgoff+0xa1/0x120
[ 369.375401] #1: (sb_writers#17){.+.+}, at: [<00000000626e98dc>] touch_atime+0x64/0xd0
[ 369.375408] #2: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.375430] #3: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.375453] 4 locks held by pool/6879:
[ 369.375454] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.375462] #1: (&type->i_mutex_dir_key#7/1){+.+.}, at: [<00000000d0bc23a2>] lock_rename+0xda/0x100
[ 369.375470] #2: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.375491] #3: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.375520] 8 locks held by dconf-service/2129:
[ 369.375521] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.375538] #1: (&type->i_mutex_dir_key#7/1){+.+.}, at: [<00000000d0bc23a2>] lock_rename+0xda/0x100
[ 369.375546] #2: (&sb->s_type->i_mutex_key#20){++++}, at: [<00000000926eb288>] lock_two_nondirectories+0x6d/0x80
[ 369.375553] #3: (&sb->s_type->i_mutex_key#20/4){+.+.}, at: [<0000000032f8e229>] lock_two_nondirectories+0x56/0x80
[ 369.375571] #4: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.375607] #5: (&xfs_dir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.375639] #6: (&xfs_nondir_ilock_class){++++}, at: [<000000005543d627>] xfs_ilock_nowait+0x194/0x270 [xfs]
[ 369.375671] #7: (&xfs_nondir_ilock_class){++++}, at: [<000000005543d627>] xfs_ilock_nowait+0x194/0x270 [xfs]
[ 369.375741] 1 lock held by tracker-store/2481:
[ 369.375743] #0: (&sb->s_type->i_mutex_key#20){++++}, at: [<000000009a06b5ff>] xfs_ilock+0x1a6/0x210 [xfs]
[ 369.375822] 8 locks held by TaskSchedulerBa/3894:
[ 369.375824] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.375835] #1: (&type->i_mutex_dir_key#7/1){+.+.}, at: [<00000000d0bc23a2>] lock_rename+0xda/0x100
[ 369.375848] #2: (&inode->i_rwsem){++++}, at: [<00000000926eb288>] lock_two_nondirectories+0x6d/0x80
[ 369.375858] #3: (&inode->i_rwsem/4){+.+.}, at: [<0000000032f8e229>] lock_two_nondirectories+0x56/0x80
[ 369.375869] #4: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.375905] #5: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.375936] #6: (&xfs_dir_ilock_class){++++}, at: [<000000005543d627>] xfs_ilock_nowait+0x194/0x270 [xfs]
[ 369.375967] #7: (&xfs_nondir_ilock_class){++++}, at: [<000000005543d627>] xfs_ilock_nowait+0x194/0x270 [xfs]
[ 369.375997] 6 locks held by TaskSchedulerFo/3896:
[ 369.375999] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.376010] #1: (&type->i_mutex_dir_key#7/1){+.+.}, at: [<000000001fe370fd>] do_unlinkat+0x129/0x300
[ 369.376023] #2: (&inode->i_rwsem){++++}, at: [<00000000d6a8d3d3>] vfs_unlink+0x50/0x1c0
[ 369.376033] #3: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.376068] #4: (&xfs_dir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.376098] #5: (&xfs_nondir_ilock_class){++++}, at: [<000000005543d627>] xfs_ilock_nowait+0x194/0x270 [xfs]
[ 369.376130] 2 locks held by TaskSchedulerFo/3897:
[ 369.376132] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.376143] #1: (&type->i_mutex_dir_key#7){++++}, at: [<000000000a1a7597>] path_openat+0x2fe/0xc80
[ 369.376155] 4 locks held by TaskSchedulerFo/3898:
[ 369.376157] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.376179] #1: (&type->i_mutex_dir_key#7){++++}, at: [<000000000a1a7597>] path_openat+0x2fe/0xc80
[ 369.376191] #2: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.376226] #3: (&xfs_dir_ilock_class/5){+.+.}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.376261] 3 locks held by TaskSchedulerFo/4004:
[ 369.376263] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.376274] #1: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.376310] #2: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.376341] 3 locks held by TaskSchedulerFo/4214:
[ 369.376343] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.376353] #1: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.376388] #2: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.376420] 8 locks held by TaskSchedulerFo/5624:
[ 369.376421] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.376433] #1: (&type->i_mutex_dir_key#7/1){+.+.}, at: [<00000000d0bc23a2>] lock_rename+0xda/0x100
[ 369.376446] #2: (&sb->s_type->i_mutex_key#20){++++}, at: [<00000000926eb288>] lock_two_nondirectories+0x6d/0x80
[ 369.376457] #3:
(&sb->s_type->i_mutex_key#20/4){+.+.}, at: [<0000000032f8e229>] lock_two_nondirectories+0x56/0x80
[ 369.376470] #4: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.376504] #5: (&xfs_dir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.376535] #6: (&xfs_nondir_ilock_class/2){+.+.}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.376568] #7: (&xfs_nondir_ilock_class/3){+.+.}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.376600] 2 locks held by TaskSchedulerFo/5625:
[ 369.376602] #0: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.376636] #1: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.376667] 2 locks held by TaskSchedulerFo/5627:
[ 369.376669] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.376680] #1: (&type->i_mutex_dir_key#7/1){+.+.}, at: [<000000001fe370fd>] do_unlinkat+0x129/0x300
[ 369.376695] 5 locks held by disk_cache:0/3971:
[ 369.376697] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.376708] #1: (&type->i_mutex_dir_key#7){++++}, at: [<000000000a1a7597>] path_openat+0x2fe/0xc80
[ 369.376719] #2: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.376754] #3: (&xfs_dir_ilock_class/5){+.+.}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.376786] #4: (&(&ip->i_lock)->mr_lock){+.+.}, at: [<000000005543d627>] xfs_ilock_nowait+0x194/0x270 [xfs]
[ 369.376824] 1 lock held by firefox/4007:
[ 369.376826] #0: (&type->i_mutex_dir_key#7){++++}, at: [<00000000487923d9>] path_openat+0x6d6/0xc80
[ 369.376840] 5 locks held by Cache2 I/O/4896:
[ 369.376842] #0: (sb_writers#17){.+.+}, at: [<0000000090328571>] do_sys_ftruncate.constprop.17+0xdf/0x110
[ 369.376855] #1: (&sb->s_type->i_mutex_key#20){++++}, at: [<000000001bfdce57>] do_truncate+0x65/0xc0
[ 369.376866] #2: (&(&ip->i_mmaplock)->mr_lock){++++}, at: [<00000000493cf182>] xfs_ilock+0x156/0x210 [xfs]
[ 369.376896] #3: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.376931] #4: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.376964] 4 locks held by Classif~ Update/5798:
[ 369.376966] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.376977] #1: (&type->i_mutex_dir_key#7/1){+.+.}, at: [<0000000083a49cad>] filename_create+0x83/0x160
[ 369.376990] #2: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.377025] #3: (&xfs_dir_ilock_class/5){+.+.}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.377059] 4 locks held by StreamTrans #29/6033:
[ 369.377060] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.377071] #1: (&type->i_mutex_dir_key#7/1){+.+.}, at: [<00000000d0bc23a2>] lock_rename+0xda/0x100
[ 369.377084] #2: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.377119] #3: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.377150] 3 locks held by QuotaManager IO/6194:
[ 369.377152] #0: (&f->f_pos_lock){+.+.}, at: [<00000000a655448c>] __fdget_pos+0x4c/0x60
[ 369.377179] #1: (&type->i_mutex_dir_key#7){++++}, at: [<000000009c036bbe>] iterate_dir+0x53/0x1a0
[ 369.377199] #2: (&xfs_dir_ilock_class){++++}, at: [<00000000276bf747>] xfs_ilock+0xe6/0x210 [xfs]
[ 369.377242] 2 locks held by StreamTrans #35/6237:
[ 369.377244] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.377255] #1: (&type->i_mutex_dir_key#7/1){+.+.}, at: [<00000000d0bc23a2>] lock_rename+0xda/0x100
[ 369.377268] 3 locks held by DOM Worker/6246:
[ 369.377270] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.377281] #1: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.377317] #2: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.377349] 2 locks held by StreamTrans #42/6259:
[ 369.377350] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.377361] #1: (&type->i_mutex_dir_key#7/1){+.+.}, at: [<00000000d0bc23a2>] lock_rename+0xda/0x100
[ 369.377377] 3 locks held by StreamTrans #48/6956:
[ 369.377378] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.377389] #1: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.377424] #2: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.377472] 1 lock held by pool/4595:
[ 369.377474] #0: (&type->i_mutex_dir_key#7){++++}, at: [<00000000534237e8>] lookup_slow+0xe5/0x220
[ 369.377509] 1 lock held by worker/5171:
[ 369.377511] #0: (&sb->s_type->i_mutex_key#20){++++}, at: [<000000009a06b5ff>] xfs_ilock+0x1a6/0x210 [xfs]
[ 369.377558] 1 lock held by CPU 0/KVM/5172:
[ 369.377561] #0: (&vcpu->mutex){+.+.}, at: [<00000000109f3ea1>] vcpu_load+0x1c/0x60 [kvm]
[ 369.377601] 1 lock held by gitkraken/5407:
[ 369.377604] #0: (&type->i_mutex_dir_key#7){++++}, at: [<00000000534237e8>] lookup_slow+0xe5/0x220
[ 369.377624] 3 locks held by gitkraken/5495:
[ 369.377626] #0: (&f->f_pos_lock){+.+.}, at: [<00000000a655448c>] __fdget_pos+0x4c/0x60
[ 369.377642] #1: (&type->i_mutex_dir_key#7){++++}, at: [<000000009c036bbe>] iterate_dir+0x53/0x1a0
[ 369.377660] #2: (&xfs_dir_ilock_class){++++}, at: [<00000000276bf747>] xfs_ilock+0xe6/0x210 [xfs]
[ 369.377742] 1 lock held by trace-cmd/6122:
[ 369.377743] #0: (&pipe->mutex/1){+.+.}, at: [<0000000005d368c0>] pipe_lock+0x1f/0x30
[ 369.377756] 1 lock held by trace-cmd/6123:
[ 369.377757] #0: (&pipe->mutex/1){+.+.}, at: [<0000000005d368c0>] pipe_lock+0x1f/0x30
[ 369.377768] 1 lock held by trace-cmd/6124:
[ 369.377770] #0: (&pipe->mutex/1){+.+.}, at: [<0000000005d368c0>] pipe_lock+0x1f/0x30
[ 369.377781] 1 lock held by trace-cmd/6125:
[ 369.377783] #0: (&pipe->mutex/1){+.+.}, at: [<0000000005d368c0>] pipe_lock+0x1f/0x30
[ 369.377795] 1 lock held by trace-cmd/6126:
[ 369.377797] #0: (&pipe->mutex/1){+.+.}, at: [<0000000005d368c0>] pipe_lock+0x1f/0x30
[ 369.377808] 1 lock held by trace-cmd/6127:
[ 369.377809] #0: (&pipe->mutex/1){+.+.}, at: [<0000000005d368c0>] pipe_lock+0x1f/0x30
[ 369.377821] 1 lock held by trace-cmd/6128:
[ 369.377822] #0: (&pipe->mutex/1){+.+.}, at: [<0000000005d368c0>] pipe_lock+0x1f/0x30
[ 369.377833] 1 lock held by trace-cmd/6129:
[ 369.377835] #0: (&pipe->mutex/1){+.+.}, at: [<0000000005d368c0>] pipe_lock+0x1f/0x30
[ 369.377851] 2 locks held by nautilus/6272:
[ 369.377853] #0: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.377890] #1: (&xfs_nondir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.377933] 6 locks held by rm/6958:
[ 369.377935] #0: (sb_writers#17){.+.+}, at: [<00000000e08ea99d>] mnt_want_write+0x24/0x50
[ 369.377946] #1: (&type->i_mutex_dir_key#7/1){+.+.}, at: [<000000001fe370fd>] do_unlinkat+0x129/0x300
[ 369.377960] #2: (&sb->s_type->i_mutex_key#20){++++}, at: [<00000000d6a8d3d3>] vfs_unlink+0x50/0x1c0
[ 369.377971] #3: (sb_internal#2){.+.+}, at: [<000000009149be51>] xfs_trans_alloc+0xec/0x130 [xfs]
[ 369.378006] #4: (&xfs_dir_ilock_class){++++}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.378038] #5: (&xfs_nondir_ilock_class/1){+.+.}, at: [<000000009f144141>] xfs_ilock+0x16e/0x210 [xfs]
[ 369.378073] =============================================

Is this again a false positive? If it is not a filesystem problem, why are processes blocked for so long?

Expected behaviour: processes should not block even under high I/O load. I agree that they will run more slowly than usual, but they should not get stuck in one place as we see here. And if it is not caused by the filesystem, I would like to know what limits are being hit here.
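
A note on my own question above, as far as I understand the mechanism: "blocked for
more than 120 seconds" means the task has been sleeping uninterruptibly (D state)
without being scheduled for hung_task_timeout_secs, and khungtaskd then prints its
stack. The traces show xfs_buf_lock() going through down(), i.e. an ordinary kernel
semaphore, so a toy illustration of that kind of sleep would look like this
(hypothetical names, illustrative only, not the XFS code):

  #include <linux/semaphore.h>

  /* Hypothetical stand-in for the buffer lock seen in the traces above. */
  static DEFINE_SEMAPHORE(demo_buf_lock);

  static void demo_wait_like_xfs_buf_lock(void)
  {
          /*
           * down() sleeps in TASK_UNINTERRUPTIBLE until the current holder
           * calls up().  A task stuck here cannot be interrupted by signals,
           * and if it does not get to run for hung_task_timeout_secs (120s
           * by default), khungtaskd emits the "blocked for more than 120
           * seconds" report quoted above.
           */
          down(&demo_buf_lock);

          /* ... use the buffer ... */

          up(&demo_buf_lock);
  }

So if that is correct, the real question is what is holding the buffer that
xfs_buf_lock() is queued on for two minutes, not whether the wait itself is allowed.
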