From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, hannes@cmpxchg.org,
mgorman@techsingularity.net, david@fromorbit.com,
apolyakov@beget.ru
Subject: Re: [PATCH v7] mm: Add memory allocation watchdog kernel thread.
Date: Fri, 10 Mar 2017 16:26:11 +0100
Message-ID: <20170310152611.GM3753@dhcp22.suse.cz>
In-Reply-To: <201703102019.JHJ58283.MQHtVFOOFOLFJS@I-love.SAKURA.ne.jp>

On Fri 10-03-17 20:19:58, Tetsuo Handa wrote:
[...]
> Or did you mean "some additional examples of how this new code was used
> to understand and improve real-world kernel problems", as in how this
> patch helped in [2] and [3]?
>
> Regarding [3] (now continued as
> http://lkml.kernel.org/r/201703031948.CHJ81278.VOHSFFFOOLJQMt@I-love.SAKURA.ne.jp ),
> I used this patch for confirming the following things.
>
> (1) kswapd cannot make progress waiting for lock.
>
> ----------
> [ 518.900038] MemAlloc: kswapd0(69) flags=0xa40840 switches=23883 uninterruptible
> [ 518.902095] kswapd0 D10776 69 2 0x00000000
> [ 518.903784] Call Trace:
[...]
>
> [ 1095.632984] MemAlloc: kswapd0(69) flags=0xa40840 switches=23883 uninterruptible
> [ 1095.632985] kswapd0 D10776 69 2 0x00000000
> [ 1095.632988] Call Trace:
[...]
> ----------
>
> (2) All WQ_MEM_RECLAIM threads shown by show_workqueue_state()
> cannot make progress waiting for memory allocation.
>
> ----------
> [ 1095.633625] MemAlloc: xfs-data/sda1(451) flags=0x4228060 switches=45509 seq=1 gfp=0x1604240(GFP_NOFS|__GFP_NOWARN|__GFP_COMP|__GFP_NOTRACK) order=0 delay=652073
> [ 1095.633626] xfs-data/sda1 R running task 12696 451 2 0x00000000
> [ 1095.633663] Workqueue: xfs-data/sda1 xfs_end_io [xfs]
> [ 1095.633665] Call Trace:
[...]

We can get all this from the sysrq+t output, right? Well, except for the
gfp mask, order and delay, which would give us a hint that something is
going on with the allocation; that all those tasks are in the page
allocator is obvious from the call traces anyway.
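
(Just to make explicit what that extra information costs: the bookkeeping
needed to print the gfp mask, order and stall time per task is roughly the
following. This is only an illustrative sketch, not your patch, and the
memalloc_* field and helper names are made up for the example.)

#include <linux/gfp.h>
#include <linux/jiffies.h>
#include <linux/sched.h>

/*
 * Illustrative sketch only: remember the allocation context on the task
 * when entering the slowpath so that a watchdog (or a sysrq handler) can
 * later report the gfp mask, the order and how long the request has been
 * stalling. The memalloc_* members are hypothetical, not real
 * task_struct fields.
 */
static inline void memalloc_track_start(gfp_t gfp_mask, unsigned int order)
{
	current->memalloc_gfp = gfp_mask;
	current->memalloc_order = order;
	current->memalloc_start = jiffies;
	current->memalloc_in_flight = true;
}

static inline void memalloc_track_end(void)
{
	current->memalloc_in_flight = false;
}
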
> ----------
>
> (3) order-0 GFP_NOIO allocation request cannot make progress
> waiting for memory allocation.

which is completely irrelevant to the issue...

> (4) Number of stalling threads does not decrease over time while
> number of out_of_memory() calls increases over time.

You would see the same from the allocation stall warning. Even though it
is not really ideal yet (we are getting there), it would show you that
some allocations simply do not make forward progress.
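
(The stall warning I have in mind boils down to a time check in the
allocation slowpath along these lines; this is paraphrased rather than
verbatim, and the exact warn_alloc() arguments differ between kernel
versions:)

	/*
	 * Paraphrased from __alloc_pages_slowpath(): complain when a
	 * single request has been retrying for too long, then back the
	 * timeout off so we do not spam the log.
	 */
	if (time_after(jiffies, alloc_start + stall_timeout)) {
		warn_alloc(gfp_mask, ac->nodemask,
			   "page allocation stalls for %ums, order:%u",
			   jiffies_to_msecs(jiffies - alloc_start), order);
		stall_timeout += 10 * HZ;
	}
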
> ----------
> [ 518.090012] MemAlloc-Info: stalling=184 dying=1 exiting=0 victim=1 oom_count=8441307
> [ 553.070829] MemAlloc-Info: stalling=184 dying=1 exiting=0 victim=1 oom_count=10318507
> [ 616.394649] MemAlloc-Info: stalling=186 dying=1 exiting=0 victim=1 oom_count=13908219
> [ 642.266252] MemAlloc-Info: stalling=186 dying=1 exiting=0 victim=1 oom_count=15180673
> [ 702.412189] MemAlloc-Info: stalling=187 dying=1 exiting=0 victim=1 oom_count=18732529
> [ 736.787879] MemAlloc-Info: stalling=187 dying=1 exiting=0 victim=1 oom_count=20565244
> [ 800.715759] MemAlloc-Info: stalling=188 dying=1 exiting=0 victim=1 oom_count=24411576
> [ 837.571405] MemAlloc-Info: stalling=188 dying=1 exiting=0 victim=1 oom_count=26463562
> [ 899.021495] MemAlloc-Info: stalling=189 dying=1 exiting=0 victim=1 oom_count=30144879
> [ 936.282709] MemAlloc-Info: stalling=189 dying=1 exiting=0 victim=1 oom_count=32129234
> [ 997.328119] MemAlloc-Info: stalling=190 dying=1 exiting=0 victim=1 oom_count=35657983
> [ 1033.977265] MemAlloc-Info: stalling=190 dying=1 exiting=0 victim=1 oom_count=37659912
> [ 1095.630961] MemAlloc-Info: stalling=190 dying=1 exiting=0 victim=1 oom_count=40639677
> [ 1095.821248] MemAlloc-Info: stalling=190 dying=1 exiting=0 victim=1 oom_count=40646791

Yeah, this is more than you can find out right now, but I am not really
sure how much more it tells us beyond what we already know from the above.
Do not take me wrong, this is all nice and dandy, but the existing
debugging tools would give us a good clue about what is going on even
without it.
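
(For comparison, the whole MemAlloc-Info summary is something that could
be derived simply by walking the task list; a rough sketch, reusing the
made-up memalloc_* fields from the example above rather than anything
from your patch:)

#include <linux/jiffies.h>
#include <linux/printk.h>
#include <linux/rcupdate.h>
#include <linux/sched.h>

/*
 * Rough sketch only: count tasks which have been stuck in the allocator
 * for more than 10 seconds, plus dying/exiting tasks, and print a one
 * line summary similar to the MemAlloc-Info output quoted above.
 */
static void memalloc_watchdog_summary(void)
{
	struct task_struct *g, *p;
	unsigned int stalling = 0, dying = 0, exiting = 0;

	rcu_read_lock();
	for_each_process_thread(g, p) {
		if (p->memalloc_in_flight &&
		    time_after(jiffies, p->memalloc_start + 10 * HZ))
			stalling++;
		if (fatal_signal_pending(p))
			dying++;
		if (p->flags & PF_EXITING)
			exiting++;
	}
	rcu_read_unlock();

	pr_warn("MemAlloc-Info: stalling=%u dying=%u exiting=%u\n",
		stalling, dying, exiting);
}
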
[...]
> Regarding "infinite too_many_isolated() loop" case
> ( http://lkml.kernel.org/r/201702212335.DJB30777.JOFMHSFtVLQOOF@I-love.SAKURA.ne.jp ),
> I used this patch for confirming the following things.
>
> (1) kswapd cannot make progress waiting for lock.
[...]
>
> (2) Both GFP_IO and GFP_KERNEL allocations are stuck at
> too_many_isolated() loop.
[...]
> ----------
>
> (3) Number of stalling threads does not decrease over time and
> number of out_of_memory() calls does not increase over time.
>
> ----------
> [ 1209.781787] MemAlloc-Info: stalling=32 dying=1 exiting=0 victim=1 oom_count=45896
> [ 1212.195351] MemAlloc-Info: stalling=32 dying=1 exiting=0 victim=1 oom_count=45896
> [ 1242.551629] MemAlloc-Info: stalling=36 dying=1 exiting=0 victim=1 oom_count=45896
> [ 1245.149165] MemAlloc-Info: stalling=36 dying=1 exiting=0 victim=1 oom_count=45896
> [ 1275.319189] MemAlloc-Info: stalling=40 dying=1 exiting=0 victim=1 oom_count=45896
> [ 1278.241813] MemAlloc-Info: stalling=40 dying=1 exiting=0 victim=1 oom_count=45896
> ----------

OK, this is a better example, but I would argue that multiple snapshots
showing the CPUs basically idle while all memory allocations sit in the
congestion wait would signal this pretty clearly as well, and we can see
that from sysrq+t too.
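
(For reference, the loop in question is essentially the following,
simplified from shrink_inactive_list() in mm/vmscan.c of that era:)

	/*
	 * Simplified from shrink_inactive_list(): direct reclaimers spin
	 * here, sleeping on the congestion wait, for as long as too many
	 * pages are isolated - which is exactly where the stalled
	 * allocations above are sitting.
	 */
	while (unlikely(too_many_isolated(pgdat, file, sc))) {
		congestion_wait(BLK_RW_ASYNC, HZ / 10);

		/* We are about to die and free our memory. Return now. */
		if (fatal_signal_pending(current))
			return SWAP_CLUSTER_MAX;
	}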

So we do have means to debug these issues. Some of them are rather
coarse, and your watchdog can collect much more and maybe give us a clue
much quicker, but we still have to judge whether all of this is really
needed, because it does not come for free. Have you considered this
aspect?
--
Michal Hocko
SUSE Labs