linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: david@fromorbit.com
Cc: tytso@mit.edu, rientjes@google.com, hannes@cmpxchg.org,
	mhocko@suse.cz, dchinner@redhat.com, linux-mm@kvack.org,
	oleg@redhat.com, akpm@linux-foundation.org, mgorman@suse.de,
	torvalds@linux-foundation.org, fernando_b1@lab.ntt.co.jp
Subject: Re: How to handle TIF_MEMDIE stalls?
Date: Fri, 27 Feb 2015 21:42:55 +0900	[thread overview]
Message-ID: <201502272142.BFJ09388.OLOMFFFVSQJOtH@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20150227073949.GJ4251@dastard>

Dave Chinner wrote:
> On Wed, Feb 25, 2015 at 11:31:17PM +0900, Tetsuo Handa wrote:
> > I got two problems (one is stall at io_schedule()
> 
> This is a typical "blame the messenger" bug report. XFS is stuck in
> inode reclaim waiting for log IO completion to occur, along with all
> the other processes iin xfs_log_force also stuck waiting for the
> same Io completion.

I wanted to know whether transaction based reservations can solve these
problems. Inside filesystem layer, I guess you can calculate how much
memory is needed for your filesystem transaction. But I'm wondering
whether we can calculate how much memory is needed inside block layer.
If block layer failed to reserve memory, won't file I/O fail under
extreme memory pressure? And if __GFP_NOFAIL were used inside block
layer, won't the OOM killer deadlock problem arise?

> 
> You need to find where that IO completion that everything is waiting
> on has got stuck or show that it's not a lost IO and actually an
> XFS problem. e.g has the IO stack got stuck on a mempool somewhere?
> 

I didn't get a vmcore for this stall. But it seemed to me that

kworker/3:0H is doing

  xfs_fs_free_cached_objects()
  => xfs_reclaim_inodes_nr()
    => xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT, &nr_to_scan)
      => xfs_reclaim_inode() because mutex_trylock(&pag->pag_ici_reclaim_lock)
         was succeessful
         => xfs_iunpin_wait(ip) because xfs_ipincount(ip) was non 0
           => __xfs_iunpin_wait()
             => waiting inside io_schedule() for somebody to unpin

kswapd0 is doing

  xfs_fs_free_cached_objects()
  => xfs_reclaim_inodes_nr()
    => xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT, &nr_to_scan)
      => not calling xfs_reclaim_inode() because
         mutex_trylock(&pag->pag_ici_reclaim_lock) failed due to kworker/3:0H
      => SYNC_TRYLOCK is dropped for retry loop due to

            if (skipped && (flags & SYNC_WAIT) && *nr_to_scan > 0) {
                    trylock = 0;
                    goto restart;
            }

      => calling mutex_lock(&pag->pag_ici_reclaim_lock) and gets blocked
         due to kworker/3:0H

kworker/3:0H is trying to free memory but somebody needs memory to make
forward progress. kswapd0 is also trying to free memory but is blocked by
kworker/3:0H already holding the lock. Since kswapd0 cannot make forward
progress, somebody can't allocate memory. Finally the system started
stalling. Is this decoding correct?

----------
[ 1225.773411] kworker/3:0H    D ffff88007cadb4f8 11632    27      2 0x00000000
[ 1225.776911]  ffff88007cadb4f8 ffff88007cadb508 ffff88007cac6740 0000000000014080
[ 1225.780670]  ffffffff8101cd19 ffff88007cadbfd8 0000000000014080 ffff88007c28b740
[ 1225.784431]  ffff88007cac6740 ffff88007cadb540 ffff88007f8d4998 ffff88007cadb540
[ 1225.788766] Call Trace:
[ 1225.789988]  [<ffffffff8101cd19>] ? read_tsc+0x9/0x10
[ 1225.792444]  [<ffffffff812acbd9>] ? xfs_iunpin_wait+0x19/0x20
[ 1225.795228]  [<ffffffff816b2590>] io_schedule+0xa0/0x130
[ 1225.797802]  [<ffffffff812a9569>] __xfs_iunpin_wait+0xe9/0x140
arch/x86/include/asm/atomic.h:27
fs/xfs/xfs_inode.c:2433
[ 1225.800621]  [<ffffffff810af3b0>] ? autoremove_wake_function+0x40/0x40
[ 1225.803770]  [<ffffffff812acbd9>] xfs_iunpin_wait+0x19/0x20
fs/xfs/xfs_inode.c:2443
[ 1225.806471]  [<ffffffff812a209c>] xfs_reclaim_inode+0x7c/0x360
include/linux/spinlock.h:309
fs/xfs/xfs_inode.h:144
fs/xfs/xfs_icache.c:920
[ 1225.809283]  [<ffffffff812a25d7>] xfs_reclaim_inodes_ag+0x257/0x370
fs/xfs/xfs_icache.c:1105
[ 1225.812308]  [<ffffffff81340839>] ? radix_tree_gang_lookup_tag+0x89/0xd0
[ 1225.815532]  [<ffffffff8116fe58>] ? list_lru_walk_node+0x148/0x190
[ 1225.817951]  [<ffffffff812a2783>] xfs_reclaim_inodes_nr+0x33/0x40
fs/xfs/xfs_icache.c:1166
[ 1225.819373]  [<ffffffff812b3545>] xfs_fs_free_cached_objects+0x15/0x20
[ 1225.820898]  [<ffffffff811c29e9>] super_cache_scan+0x169/0x170
[ 1225.822245]  [<ffffffff8115aed6>] shrink_node_slabs+0x1d6/0x370
[ 1225.823588]  [<ffffffff8115dd2a>] shrink_zone+0x20a/0x240
[ 1225.824830]  [<ffffffff8115e0dc>] do_try_to_free_pages+0x16c/0x460
[ 1225.826230]  [<ffffffff8115e48a>] try_to_free_pages+0xba/0x150
[ 1225.827570]  [<ffffffff81151542>] __alloc_pages_nodemask+0x5b2/0x9d0
[ 1225.829030]  [<ffffffff8119ecbc>] kmem_getpages+0x8c/0x200
[ 1225.830277]  [<ffffffff811a122b>] fallback_alloc+0x17b/0x230
[ 1225.831561]  [<ffffffff811a107b>] ____cache_alloc_node+0x18b/0x1c0
[ 1225.833061]  [<ffffffff811a3b00>] kmem_cache_alloc+0x330/0x5c0
[ 1225.834435]  [<ffffffff8133c9d9>] ? ida_pre_get+0x69/0x100
[ 1225.835719]  [<ffffffff8133c9d9>] ida_pre_get+0x69/0x100
[ 1225.836963]  [<ffffffff8133d312>] ida_simple_get+0x42/0xf0
[ 1225.838248]  [<ffffffff81086211>] create_worker+0x31/0x1c0
[ 1225.839519]  [<ffffffff81087831>] worker_thread+0x3d1/0x4d0
[ 1225.840800]  [<ffffffff81087460>] ? rescuer_thread+0x3a0/0x3a0
[ 1225.842123]  [<ffffffff8108c5e2>] kthread+0xd2/0xf0
[ 1225.843234]  [<ffffffff81010000>] ? perf_trace_xen_mmu_ptep_modify_prot+0x90/0xf0
[ 1225.844978]  [<ffffffff8108c510>] ? kthread_create_on_node+0x180/0x180
[ 1225.846481]  [<ffffffff816b63fc>] ret_from_fork+0x7c/0xb0
[ 1225.847718]  [<ffffffff8108c510>] ? kthread_create_on_node+0x180/0x180
[ 1225.849279] kswapd0         D ffff88007708f998 11552    45      2 0x00000000
[ 1225.850977]  ffff88007708f998 0000000000000000 ffff88007c28b740 0000000000014080
[ 1225.852798]  0000000000000003 ffff88007708ffd8 0000000000014080 ffff880077ff2740
[ 1225.854575]  ffff88007c28b740 0000000000000000 ffff88007948e3a8 ffff88007948e3ac
[ 1225.856358] Call Trace:
[ 1225.856928]  [<ffffffff816b2799>] schedule_preempt_disabled+0x29/0x70
[ 1225.858384]  [<ffffffff816b43d5>] __mutex_lock_slowpath+0x95/0x100
[ 1225.859799]  [<ffffffff816b4463>] mutex_lock+0x23/0x37
arch/x86/include/asm/current.h:14
kernel/locking/mutex.h:22
kernel/locking/mutex.c:103
[ 1225.860983]  [<ffffffff812a264c>] xfs_reclaim_inodes_ag+0x2cc/0x370
fs/xfs/xfs_icache.c:1034
[ 1225.862403]  [<ffffffff8109eb48>] ? __enqueue_entity+0x78/0x80
[ 1225.863742]  [<ffffffff810a5f37>] ? enqueue_entity+0x237/0x8f0
[ 1225.865100]  [<ffffffff81340839>] ? radix_tree_gang_lookup_tag+0x89/0xd0
[ 1225.866659]  [<ffffffff8116fe58>] ? list_lru_walk_node+0x148/0x190
[ 1225.868106]  [<ffffffff812a2783>] xfs_reclaim_inodes_nr+0x33/0x40
fs/xfs/xfs_icache.c:1166
[ 1225.869522]  [<ffffffff812b3545>] xfs_fs_free_cached_objects+0x15/0x20
[ 1225.871015]  [<ffffffff811c29e9>] super_cache_scan+0x169/0x170
[ 1225.872338]  [<ffffffff8115aed6>] shrink_node_slabs+0x1d6/0x370
[ 1225.873679]  [<ffffffff8115dd2a>] shrink_zone+0x20a/0x240
[ 1225.874920]  [<ffffffff8115ed2d>] kswapd+0x4fd/0x9c0
[ 1225.876049]  [<ffffffff8115e830>] ? mem_cgroup_shrink_node_zone+0x140/0x140
[ 1225.877654]  [<ffffffff8108c5e2>] kthread+0xd2/0xf0
[ 1225.878762]  [<ffffffff81010000>] ? perf_trace_xen_mmu_ptep_modify_prot+0x90/0xf0
[ 1225.880495]  [<ffffffff8108c510>] ? kthread_create_on_node+0x180/0x180
[ 1225.881996]  [<ffffffff816b63fc>] ret_from_fork+0x7c/0xb0
[ 1225.883336]  [<ffffffff8108c510>] ? kthread_create_on_node+0x180/0x180
----------

I killed mutex_lock() and memory allocation from shrinker functions
in drivers/gpu/drm/ttm/ttm_page_alloc[_dma].c because I observed that
kswapd0 was blocked for so long at mutex_lock().

If kswapd0 is blocked forever at e.g. mutex_lock() inside shrinker
functions, who else can make forward progress?

Shouldn't we avoid calling functions which could potentially block for
unpredictable duration (e.g. unkillable locks and/or completion) from
shrinker functions?



> IOWs, when you run CONFIG_XFS_DEBUG=y, you'll often get failures
> that are valuable to XFS developers but have no runtime effect on
> production systems.

Oh, I didn't know this failure is specific to CONFIG_XFS_DEBUG=y ...

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2015-02-27 12:43 UTC|newest]

Thread overview: 177+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-12 13:54 [RFC PATCH] oom: Don't count on mm-less current process Tetsuo Handa
2014-12-16 12:47 ` Michal Hocko
2014-12-17 11:54   ` Tetsuo Handa
2014-12-17 13:08     ` Michal Hocko
2014-12-18 12:11       ` Tetsuo Handa
2014-12-18 15:33         ` Michal Hocko
2014-12-19 12:07           ` Tetsuo Handa
2014-12-19 12:49             ` Michal Hocko
2014-12-20  9:13               ` Tetsuo Handa
2014-12-20 11:42                 ` Tetsuo Handa
2014-12-22 20:25                   ` Michal Hocko
2014-12-23  1:00                     ` Tetsuo Handa
2014-12-23  9:51                       ` Michal Hocko
2014-12-23 11:46                         ` Tetsuo Handa
2014-12-23 11:57                           ` Tetsuo Handa
2014-12-23 12:12                             ` Tetsuo Handa
2014-12-23 12:27                             ` Michal Hocko
2014-12-23 12:24                           ` Michal Hocko
2014-12-23 13:00                             ` Tetsuo Handa
2014-12-23 13:09                               ` Michal Hocko
2014-12-23 13:20                                 ` Tetsuo Handa
2014-12-23 13:43                                   ` Michal Hocko
2014-12-23 14:11                                     ` Tetsuo Handa
2014-12-23 14:57                                       ` Michal Hocko
2014-12-19 12:22           ` How to handle TIF_MEMDIE stalls? Tetsuo Handa
2014-12-20  2:03             ` Dave Chinner
2014-12-20 12:41               ` Tetsuo Handa
2014-12-20 22:35                 ` Dave Chinner
2014-12-21  8:45                   ` Tetsuo Handa
2014-12-21 20:42                     ` Dave Chinner
2014-12-22 16:57                       ` Michal Hocko
2014-12-22 21:30                         ` Dave Chinner
2014-12-23  9:41                           ` Johannes Weiner
2014-12-24  1:06                             ` Dave Chinner
2014-12-24  2:40                               ` Linus Torvalds
2014-12-29 18:19                     ` Michal Hocko
2014-12-30  6:42                       ` Tetsuo Handa
2014-12-30 11:21                         ` Michal Hocko
2014-12-30 13:33                           ` Tetsuo Handa
2014-12-31 10:24                             ` Tetsuo Handa
2015-02-09 11:44                           ` Tetsuo Handa
2015-02-10 13:58                             ` Tetsuo Handa
2015-02-10 15:19                               ` Johannes Weiner
2015-02-11  2:23                                 ` Tetsuo Handa
2015-02-11 13:37                                   ` Tetsuo Handa
2015-02-11 18:50                                     ` Oleg Nesterov
2015-02-11 18:59                                       ` Oleg Nesterov
2015-03-14 13:03                                         ` Tetsuo Handa
2015-02-17 12:23                                   ` Tetsuo Handa
2015-02-17 12:53                                     ` Johannes Weiner
2015-02-17 15:38                                       ` Michal Hocko
2015-02-17 22:54                                       ` Dave Chinner
2015-02-17 23:32                                         ` Dave Chinner
2015-02-18  8:25                                         ` Michal Hocko
2015-02-18 10:48                                           ` Dave Chinner
2015-02-18 12:16                                             ` Michal Hocko
2015-02-18 21:31                                               ` Dave Chinner
2015-02-19  9:40                                                 ` Michal Hocko
2015-02-19 22:03                                                   ` Dave Chinner
2015-02-20  9:27                                                     ` Michal Hocko
2015-02-19 11:01                                               ` Johannes Weiner
2015-02-19 12:29                                                 ` Michal Hocko
2015-02-19 12:58                                                   ` Michal Hocko
2015-02-19 15:29                                                     ` Tetsuo Handa
2015-02-19 21:53                                                       ` Tetsuo Handa
2015-02-20  9:13                                                       ` Michal Hocko
2015-02-20 13:37                                                         ` Stefan Ring
2015-02-19 13:29                                                   ` Tetsuo Handa
2015-02-20  9:10                                                     ` Michal Hocko
2015-02-20 12:20                                                       ` Tetsuo Handa
2015-02-20 12:38                                                         ` Michal Hocko
2015-02-19 21:43                                                   ` Dave Chinner
2015-02-20 12:48                                                     ` Michal Hocko
2015-02-20 23:09                                                       ` Dave Chinner
2015-02-19 10:24                                         ` Johannes Weiner
2015-02-19 22:52                                           ` Dave Chinner
2015-02-20 10:36                                             ` Tetsuo Handa
2015-02-20 23:15                                               ` Dave Chinner
2015-02-21  3:20                                                 ` Theodore Ts'o
2015-02-21  9:19                                                   ` Andrew Morton
2015-02-21 13:48                                                     ` Tetsuo Handa
2015-02-21 21:38                                                     ` Dave Chinner
2015-02-22  0:20                                                     ` Johannes Weiner
2015-02-23 10:48                                                       ` Michal Hocko
2015-02-23 11:23                                                         ` Tetsuo Handa
2015-02-23 21:33                                                       ` David Rientjes
2015-02-22 14:48                                                     ` __GFP_NOFAIL and oom_killer_disabled? Tetsuo Handa
2015-02-23 10:21                                                       ` Michal Hocko
2015-02-23 13:03                                                         ` Tetsuo Handa
2015-02-24 18:14                                                           ` Michal Hocko
2015-02-25 11:22                                                             ` Tetsuo Handa
2015-02-25 16:02                                                               ` Michal Hocko
2015-02-25 21:48                                                                 ` Tetsuo Handa
2015-02-25 21:51                                                                   ` Andrew Morton
2015-02-21 12:00                                                   ` How to handle TIF_MEMDIE stalls? Tetsuo Handa
2015-02-23 10:26                                                   ` Michal Hocko
2015-02-21 11:12                                                 ` Tetsuo Handa
2015-02-21 21:48                                                   ` Dave Chinner
2015-02-21 23:52                                             ` Johannes Weiner
2015-02-23  0:45                                               ` Dave Chinner
2015-02-23  1:29                                                 ` Andrew Morton
2015-02-23  7:32                                                   ` Dave Chinner
2015-02-27 18:24                                                     ` Vlastimil Babka
2015-02-28  0:03                                                       ` Dave Chinner
2015-02-28 15:17                                                         ` Theodore Ts'o
2015-03-02  9:39                                                     ` Vlastimil Babka
2015-03-02 22:31                                                       ` Dave Chinner
2015-03-03  9:13                                                         ` Vlastimil Babka
2015-03-04  1:33                                                           ` Dave Chinner
2015-03-04  8:50                                                             ` Vlastimil Babka
2015-03-04 11:03                                                               ` Dave Chinner
2015-03-07  0:20                                                         ` Johannes Weiner
2015-03-07  3:43                                                           ` Dave Chinner
2015-03-07 15:08                                                             ` Johannes Weiner
2015-03-02 20:22                                                     ` Johannes Weiner
2015-03-02 23:12                                                       ` Dave Chinner
2015-03-03  2:50                                                         ` Johannes Weiner
2015-03-04  6:52                                                           ` Dave Chinner
2015-03-04 15:04                                                             ` Johannes Weiner
2015-03-04 17:38                                                               ` Theodore Ts'o
2015-03-04 23:17                                                                 ` Dave Chinner
2015-02-28 16:29                                                 ` Johannes Weiner
2015-02-28 16:41                                                   ` Theodore Ts'o
2015-02-28 22:15                                                     ` Johannes Weiner
2015-03-01 11:17                                                       ` Tetsuo Handa
2015-03-06 11:53                                                         ` Tetsuo Handa
2015-03-01 13:43                                                       ` Theodore Ts'o
2015-03-01 16:15                                                         ` Johannes Weiner
2015-03-01 19:36                                                           ` Theodore Ts'o
2015-03-01 20:44                                                             ` Johannes Weiner
2015-03-01 20:17                                                         ` Johannes Weiner
2015-03-01 21:48                                                       ` Dave Chinner
2015-03-02  0:17                                                         ` Dave Chinner
2015-03-02 12:46                                                           ` Brian Foster
2015-02-28 18:36                                                 ` Vlastimil Babka
2015-03-02 15:18                                                 ` Michal Hocko
2015-03-02 16:05                                                   ` Johannes Weiner
2015-03-02 17:10                                                     ` Michal Hocko
2015-03-02 17:27                                                       ` Johannes Weiner
2015-03-02 16:39                                                   ` Theodore Ts'o
2015-03-02 16:58                                                     ` Michal Hocko
2015-03-04 12:52                                                       ` Dave Chinner
2015-02-17 14:59                                     ` Michal Hocko
2015-02-17 14:50                                 ` Michal Hocko
2015-02-17 14:37                             ` Michal Hocko
2015-02-17 14:44                               ` Michal Hocko
2015-02-16 11:23                           ` Tetsuo Handa
2015-02-16 15:42                             ` Johannes Weiner
2015-02-17 11:57                               ` Tetsuo Handa
2015-02-17 13:16                                 ` Johannes Weiner
2015-02-17 16:50                                   ` Michal Hocko
2015-02-17 23:25                                     ` Dave Chinner
2015-02-18  8:48                                       ` Michal Hocko
2015-02-18 11:23                                         ` Tetsuo Handa
2015-02-18 12:29                                           ` Michal Hocko
2015-02-18 14:06                                             ` Tetsuo Handa
2015-02-18 14:25                                               ` Michal Hocko
2015-02-19 10:48                                                 ` Tetsuo Handa
2015-02-20  8:26                                                   ` Michal Hocko
2015-02-23 22:08                                 ` David Rientjes
2015-02-24 11:20                                   ` Tetsuo Handa
2015-02-24 15:20                                     ` Theodore Ts'o
2015-02-24 21:02                                       ` Dave Chinner
2015-02-25 14:31                                         ` Tetsuo Handa
2015-02-27  7:39                                           ` Dave Chinner
2015-02-27 12:42                                             ` Tetsuo Handa [this message]
2015-02-27 13:12                                               ` Dave Chinner
2015-03-04 12:41                                                 ` Tetsuo Handa
2015-03-04 13:25                                                   ` Dave Chinner
2015-03-04 14:11                                                     ` Tetsuo Handa
2015-03-05  1:36                                                       ` Dave Chinner
2015-02-17 16:33                             ` Michal Hocko
2014-12-29 17:40                   ` [PATCH] mm: get rid of radix tree gfp mask for pagecache_get_page (was: Re: How to handle TIF_MEMDIE stalls?) Michal Hocko
2014-12-29 18:45                     ` Linus Torvalds
2014-12-29 19:33                       ` Michal Hocko
2014-12-30 13:42                         ` Michal Hocko
2014-12-30 21:45                           ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201502272142.BFJ09388.OLOMFFFVSQJOtH@I-love.SAKURA.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=dchinner@redhat.com \
    --cc=fernando_b1@lab.ntt.co.jp \
    --cc=hannes@cmpxchg.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.cz \
    --cc=oleg@redhat.com \
    --cc=rientjes@google.com \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox