From: David Rientjes <rientjes@google.com>
To: Andrew Morton <akpm@linux-foundation.org>,
Oleg Nesterov <oleg@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Hugh Dickins <hughd@google.com>,
linux-mm@kvack.org, Andrey Vagin <avagin@openvz.org>
Subject: [patch -mm] oom: avoid deferring oom killer if exiting task is being traced
Date: Sat, 12 Mar 2011 17:15:25 -0800 (PST)
Message-ID: <alpine.DEB.2.00.1103121715030.10317@chino.kir.corp.google.com>
In-Reply-To: <alpine.DEB.2.00.1103121709230.10317@chino.kir.corp.google.com>

The oom killer naturally defers killing anything if it finds an eligible
task that is already exiting and has yet to detach its ->mm.  This avoids
unnecessarily killing tasks when one is already in the exit path and may
free enough memory that the oom killer is no longer needed.  This is
detected by PF_EXITING, since threads that have already detached their
->mm are no longer considered at all.

The problem with always deferring when a thread is PF_EXITING, however,
is that it may never actually exit when being traced, specifically if
another task is tracing it with PTRACE_O_TRACEEXIT.  The oom killer does
not want to defer in this case since there is no guarantee that the
thread will ever exit without intervention.
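
For readers unfamiliar with PTRACE_O_TRACEEXIT, the stall can be observed
from userspace.  The following is only a sketch: it does not create an oom
condition, and the helper name child_stops_at_exit() is made up for this
illustration.  It shows a traced child being stopped at PTRACE_EVENT_EXIT
and staying there until its tracer chooses to resume it.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the traced child was observed stopped
 * at PTRACE_EVENT_EXIT, 0 if it was not, and -1 on a setup error.
 */
static int child_stops_at_exit(void)
{
	int status, seen;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: ask to be traced, then stop so the parent can set options. */
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
			_exit(1);
		raise(SIGSTOP);
		_exit(0);
	}

	/* Parent: wait for the child's initial SIGSTOP. */
	if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
		goto fail;
	if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
		   (void *)(long)PTRACE_O_TRACEEXIT) < 0)
		goto fail;
	if (ptrace(PTRACE_CONT, pid, NULL, NULL) < 0)
		goto fail;

	/* The child calls _exit() and is stopped before it finishes exiting. */
	if (waitpid(pid, &status, 0) < 0)
		goto fail;
	seen = WIFSTOPPED(status) &&
	       (status >> 8) == (SIGTRAP | (PTRACE_EVENT_EXIT << 8));

	/* It makes no further progress until the tracer resumes it. */
	ptrace(PTRACE_CONT, pid, NULL, NULL);
	waitpid(pid, &status, 0);
	return seen;

fail:
	kill(pid, SIGKILL);
	waitpid(pid, &status, 0);
	return -1;
}
```

If the tracer never issued the final PTRACE_CONT, the child would remain
stopped in its exit path, which is the situation the oom killer must not
wait on.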

This patch defers the oom killer only when a thread is PF_EXITING and no
ptracer has stopped its progress in the exit path.  It also ensures that
a child is sacrificed for the chosen parent only if it has a different
->mm, as the comment implies: this ensures that the thread group leader
is always targeted appropriately.

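
The resulting PF_EXITING decision can be summarized with a small userspace
model.  This is a sketch for illustration, not kernel code: struct
model_task, the MODEL_* names, and the flag values are all invented here
and do not match the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative flag values for this model only; not the kernel's. */
#define MODEL_PF_EXITING	0x1u
#define MODEL_PT_TRACE_EXIT	0x2u

enum model_verdict {
	MODEL_SCORE_NORMALLY,	/* fall through to badness scoring */
	MODEL_CHOOSE_CURRENT,	/* pick p with the maximum score, keep scanning */
	MODEL_DEFER,		/* abort the scan and wait for p to exit */
};

/* Hypothetical stand-in for the few task fields the check looks at. */
struct model_task {
	unsigned int flags;		/* models p->flags */
	unsigned int leader_ptrace;	/* models task_ptrace(p->group_leader) */
	bool is_current;		/* models p == current */
};

static enum model_verdict model_exiting_check(const struct model_task *p)
{
	if (!(p->flags & MODEL_PF_EXITING))
		return MODEL_SCORE_NORMALLY;
	if (p->is_current)
		return MODEL_CHOOSE_CURRENT;
	/* Only defer when no tracer may be holding the task at exit. */
	if (!(p->leader_ptrace & MODEL_PT_TRACE_EXIT))
		return MODEL_DEFER;
	return MODEL_SCORE_NORMALLY;
}
```

In this model, an exiting task that is not current and whose group leader
has the trace-on-exit bit set is scored like any other candidate instead
of causing the scan to be abandoned.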
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
mm/oom_kill.c | 40 +++++++++++++++++++++++++---------------
1 files changed, 25 insertions(+), 15 deletions(-)
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -31,6 +31,7 @@
 #include <linux/memcontrol.h>
 #include <linux/mempolicy.h>
 #include <linux/security.h>
+#include <linux/ptrace.h>
 
 int sysctl_panic_on_oom;
 int sysctl_oom_kill_allocating_task;
@@ -316,22 +317,29 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 		if (test_tsk_thread_flag(p, TIF_MEMDIE))
 			return ERR_PTR(-1UL);
 
-		/*
-		 * This is in the process of releasing memory so wait for it
-		 * to finish before killing some other task by mistake.
-		 *
-		 * However, if p is the current task, we allow the 'kill' to
-		 * go ahead if it is exiting: this will simply set TIF_MEMDIE,
-		 * which will allow it to gain access to memory reserves in
-		 * the process of exiting and releasing its resources.
-		 * Otherwise we could get an easy OOM deadlock.
-		 */
 		if (p->flags & PF_EXITING) {
-			if (p != current)
-				return ERR_PTR(-1UL);
-
-			chosen = p;
-			*ppoints = 1000;
+			/*
+			 * If p is the current task and is in the process of
+			 * releasing memory, we allow the "kill" to set
+			 * TIF_MEMDIE, which will allow it to gain access to
+			 * memory reserves.  Otherwise, it may stall forever.
+			 *
+			 * The loop isn't broken here, however, in case other
+			 * threads are found to have already been oom killed.
+			 */
+			if (p == current) {
+				chosen = p;
+				*ppoints = 1000;
+			} else {
+				/*
+				 * If this task is not being ptraced on exit,
+				 * then wait for it to finish before killing
+				 * some other task unnecessarily.
+				 */
+				if (!(task_ptrace(p->group_leader) &
+							PT_TRACE_EXIT))
+					return ERR_PTR(-1UL);
+			}
 		}
 
 		points = oom_badness(p, mem, nodemask, totalpages);
@@ -493,6 +501,8 @@ static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 			list_for_each_entry(child, &t->children, sibling) {
 				unsigned int child_points;
 
+				if (child->mm == p->mm)
+					continue;
 				/*
 				 * oom_badness() returns 0 if the thread is unkillable
 				 */