From: Johannes Weiner <hannes@cmpxchg.org>
To: wang Yu <yuwang668899@gmail.com>
Cc: mhocko@suse.com, penguin-kernel@i-love.sakura.ne.jp,
	linux-mm@kvack.org, chenggang.qcg@alibaba-inc.com,
	yuwang.yuwang@alibaba-inc.com,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] mm,page_alloc: softlockup on warn_alloc on
Date: Fri, 15 Sep 2017 07:37:32 -0700	[thread overview]
Message-ID: <20170915143732.GA8397@cmpxchg.org> (raw)
In-Reply-To: <20170915095849.9927-1-yuwang668899@gmail.com>

On Fri, Sep 15, 2017 at 05:58:49PM +0800, wang Yu wrote:
> From: "yuwang.yuwang" <yuwang.yuwang@alibaba-inc.com>
> 
> I found a softlockup when running some stress testcases on 4.9.x,
> but I think mainline has the same problem.
> 
> call trace:
> [365724.502896] NMI watchdog: BUG: soft lockup - CPU#31 stuck for 22s!
> [jbd2/sda3-8:1164]

We've started seeing the same thing on 4.11: tons and tons of
allocation stall warnings followed by soft lockups.

These allocation stalls happen when the allocating task reclaims
successfully yet still fails to allocate: concurrent threads are
stealing the pages it just freed.
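
For context, here is how I read the retry cycle in
__alloc_pages_slowpath(); this is a paraphrased, heavily simplified
sketch of mm/page_alloc.c, not the literal code:

retry:
	/* Direct reclaim may free plenty of pages... */
	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags,
					    ac, &did_some_progress);
	if (page)
		goto got_pg;

	/*
	 * ...but freed pages land on the shared free lists, where
	 * concurrent allocators can grab them before this task's next
	 * get_page_from_freelist() attempt. Since did_some_progress is
	 * set, no_progress_loops resets to 0 and we retry indefinitely,
	 * emitting a stall warning every time stall_timeout expires.
	 */
	if (did_some_progress) {
		no_progress_loops = 0;
		goto retry;
	}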

Now, it *looks* like something changed recently to make this race
window wider, and there may well be a bug there. But regardless, we
have a real livelock, or at least a starvation window, where
reclaimers have their bounty continuously stolen by concurrent
allocations; and instead of recognizing and handling the situation,
we flood the console, which in many cases only adds fuel to the fire.

When threads cannibalize each other to the point where one of them can
reclaim but not allocate for 10s, it's safe to say we are out of
memory. I think we need something like the patch below regardless of
any deeper root-cause investigation and fixes.

But Michal, this needs an answer. We don't want to paper over bugs,
but we also cannot keep shipping a kernel with a known issue for which
mitigations exist, root-caused or not.

How can we figure out if there is a bug here? Can we time the calls to
__alloc_pages_direct_reclaim() and __alloc_pages_direct_compact() and
drill down from there? Print out the number of times we have retried?
We're counting no_progress_loops, but we are also very much interested
in progress_loops that didn't result in a successful allocation. Too
many of those and I think we want to OOM kill, as argued above.
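
For instance, counting those could look something like this (an
untested sketch; progress_loops and MAX_PROGRESS_LOOPS are names I
made up for illustration, they don't exist in the kernel):

	/*
	 * Hypothetical instrumentation: track retries where reclaim
	 * made progress but the allocation still failed.
	 */
	if (did_some_progress) {
		no_progress_loops = 0;
		progress_loops++;
	} else {
		no_progress_loops++;
	}

	/* Reclaim keeps working yet we never get the pages: give up and OOM. */
	if (progress_loops > MAX_PROGRESS_LOOPS)
		goto oom;

That would give us a concrete number to look at before pulling the
OOM trigger.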

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bec5e96f3b88..01736596389a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3830,6 +3830,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 			"page allocation stalls for %ums, order:%u",
 			jiffies_to_msecs(jiffies-alloc_start), order);
 		stall_timeout += 10 * HZ;
+		goto oom;
 	}
 
 	/* Avoid recursion of direct reclaim */
@@ -3882,6 +3883,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (read_mems_allowed_retry(cpuset_mems_cookie))
 		goto retry_cpuset;
 
+oom:
 	/* Reclaim has failed us, start killing things */
 	page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress);
 	if (page)


Thread overview (20+ messages):
2017-09-15  9:58 wang Yu
2017-09-15 10:39 ` Michal Hocko
2017-09-15 11:38   ` Tetsuo Handa
2017-09-15 12:00     ` Michal Hocko
2017-09-15 12:09       ` Tetsuo Handa
2017-09-15 12:14         ` Michal Hocko
2017-09-15 14:12           ` Tetsuo Handa
2017-09-15 14:23             ` Michal Hocko
2017-09-24  1:56             ` Tetsuo Handa
2017-09-15 14:37 ` Johannes Weiner [this message]
2017-09-15 15:23   ` Tetsuo Handa
2017-09-15 18:44     ` Johannes Weiner
2017-09-16  0:25       ` Tetsuo Handa
2017-09-18  6:05         ` Michal Hocko
2017-09-18  6:31           ` Tetsuo Handa
2017-09-18  6:43             ` Michal Hocko
2017-09-16  4:12   ` Tetsuo Handa
2017-10-11 11:14     ` Tetsuo Handa
2017-10-18 10:54       ` Tetsuo Handa
2017-09-18  6:03   ` Michal Hocko
