[PATCH] mm,page_alloc: Serialize out_of_memory() and allocation stall messages.

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>, Petr Mladek <pmladek@suse.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	"yuwang.yuwang" <yuwang.yuwang@alibaba-inc.com>
Subject: [PATCH] mm,page_alloc: Serialize out_of_memory() and allocation stall messages.
Date: Thu, 19 Oct 2017 19:51:02 +0900	[thread overview]
Message-ID: <1508410262-4797-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp> (raw)

The printk() flooding problem caused by concurrent warn_alloc() calls was
already pointed out by me, and there are reports of soft lockups caused by
warn_alloc(). But this problem is left unhandled because Michal does not
like serialization from allocation path because he is worrying about
unexpected side effects and is asking to identify the root cause of soft
lockups and fix it. But at least consuming CPU resource by not serializing
concurrent printk() plays some role in the soft lockups, for currently
printk() can consume CPU resource forever as long as somebody is appending
to printk() buffer, and writing to consoles also needs CPU resource. That
is, needlessly consuming CPU resource when calling printk() has unexpected
side effects.

Although a proposal for offloading writing to consoles to a dedicated
kernel thread is in progress, it is not yet accepted. And, even after
the proposal is accepted, writing to printk() buffer faster than the
kernel thread can write to consoles will result in loss of messages.
We should refrain from "appending to printk() buffer" and "consuming CPU
resource" at the same time if possible. We should try to (and we can)
avoid appending to printk() buffer when printk() is concurrently called
for reporting the OOM killer and allocation stalls, in order to reduce
possibility of hitting soft lockups and getting unreadably-jumbled
messages.

Although avoid mixing both memory allocation stall/failure messages and
the OOM killer messages would be nice, oom_lock mutex should not be used
for this purpose, for waiting for oom_lock mutex at warn_alloc() can
prevent anybody from calling out_of_memory() from __alloc_pages_may_oom()
because currently __alloc_pages_may_oom() does not wait for oom_lock
(i.e. causes OOM lockups after all). Therefore, this patch adds a mutex
named "oom_printk_lock". Although using mutex_lock() in order to allow
printk() to use CPU resource for writing to consoles is better from the
point of view of flushing printk(), this patch uses mutex_trylock() for
allocation stall messages because Michal does not like serialization.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Reported-by: yuwang.yuwang <yuwang.yuwang@alibaba-inc.com>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Michal Hocko <mhocko@kernel.org>
---
 include/linux/oom.h | 1 +
 mm/oom_kill.c       | 5 +++++
 mm/page_alloc.c     | 4 +++-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/oom.h b/include/linux/oom.h
index 76aac4c..1425767 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -44,6 +44,7 @@ struct oom_control {
 };

 extern struct mutex oom_lock;
+extern struct mutex oom_printk_lock;

 static inline void set_current_oom_origin(void)
 {
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 26add8a..5aef9a6 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -54,6 +54,7 @@
 int sysctl_oom_dump_tasks = 1;

 DEFINE_MUTEX(oom_lock);
+DEFINE_MUTEX(oom_printk_lock);

 #ifdef CONFIG_NUMA
 /**
@@ -1074,7 +1075,9 @@ bool out_of_memory(struct oom_control *oc)
 	    current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) {
 		get_task_struct(current);
 		oc->chosen = current;
+		mutex_lock(&oom_printk_lock);
 		oom_kill_process(oc, "Out of memory (oom_kill_allocating_task)");
+		mutex_unlock(&oom_printk_lock);
 		return true;
 	}

@@ -1085,8 +1088,10 @@ bool out_of_memory(struct oom_control *oc)
 		panic("Out of memory and no killable processes...\n");
 	}
 	if (oc->chosen && oc->chosen != (void *)-1UL) {
+		mutex_lock(&oom_printk_lock);
 		oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
 				 "Memory cgroup out of memory");
+		mutex_unlock(&oom_printk_lock);
 		/*
 		 * Give the killed process a good chance to exit before trying
 		 * to allocate memory again.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 97687b3..3766687 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3990,11 +3990,13 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
 		goto nopage;

 	/* Make sure we know about allocations which stall for too long */
-	if (time_after(jiffies, alloc_start + stall_timeout)) {
+	if (time_after(jiffies, alloc_start + stall_timeout) &&
+	    mutex_trylock(&oom_printk_lock)) {
 		warn_alloc(gfp_mask & ~__GFP_NOWARN, ac->nodemask,
 			"page allocation stalls for %ums, order:%u",
 			jiffies_to_msecs(jiffies-alloc_start), order);
 		stall_timeout += 10 * HZ;
+		mutex_unlock(&oom_printk_lock);
 	}

 	/* Avoid recursion of direct reclaim */
-- 
1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next             reply	other threads:[~2017-10-19 10:51 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-19 10:51 Tetsuo Handa [this message]
2017-10-19 11:44 ` Michal Hocko
2017-10-20 10:20   ` Tetsuo Handa
2017-10-24 11:23     ` Tetsuo Handa
2017-10-25 10:47       ` Tetsuo Handa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1508410262-4797-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=pmladek@suse.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=xiyou.wangcong@gmail.com \
    --cc=yuwang.yuwang@alibaba-inc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox