linux-mm.kvack.org archive mirror
* [patch -mm 0/6] oom: various tiny cleanups and fixes
@ 2010-06-09  3:59 David Rientjes
  2010-06-09  3:59 ` [patch -mm 1/6] oom: dump_tasks use find_lock_task_mm too fix David Rientjes
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: David Rientjes @ 2010-06-09  3:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Oleg Nesterov, KOSAKI Motohiro, KAMEZAWA Hiroyuki, linux-mm

This patchset contains various tiny cleanups and fixes, all identified
by akpm during his review of the oom killer rewrite patches that he has
merged thus far.

A few of them are fixes intended to be folded into the patch that 
introduced the code (those that are of the same name and suffixed with 
"fix" :) and a few of them are standalone improvements.

Based on mmotm-2010-06-03-16-36 with the oom patches merged on June 8.  
Since these patches only touch mm/oom_kill.c, no conflicts are expected in 
the actual "mm of the moment".

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [patch -mm 1/6] oom: dump_tasks use find_lock_task_mm too fix
  2010-06-09  3:59 [patch -mm 0/6] oom: various tiny cleanups and fixes David Rientjes
@ 2010-06-09  3:59 ` David Rientjes
  2010-06-09  3:59 ` [patch -mm 2/6] oom: protect dereferencing of task's comm David Rientjes
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: David Rientjes @ 2010-06-09  3:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Oleg Nesterov, KOSAKI Motohiro, KAMEZAWA Hiroyuki, linux-mm


When find_lock_task_mm() returns a thread other than p in dump_tasks(),
its name should be displayed instead.  This is the thread that will be
targeted by the oom killer, not its mm-less parent.

This also allows us to safely dereference task->comm without needing
get_task_comm().

While we're here, remove the cast on task_cpu(task) as Andrew suggested.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -376,10 +376,10 @@ static void dump_tasks(const struct mem_cgroup *mem)
 			continue;
 		}
 
-		printk(KERN_INFO "[%5d] %5d %5d %8lu %8lu %3d     %3d %s\n",
+		printk(KERN_INFO "[%5d] %5d %5d %8lu %8lu %3u     %3d %s\n",
 		       task->pid, __task_cred(task)->uid, task->tgid,
 		       task->mm->total_vm, get_mm_rss(task->mm),
-		       (int)task_cpu(task), task->signal->oom_adj, p->comm);
+		       task_cpu(task), task->signal->oom_adj, task->comm);
 		task_unlock(task);
 	}
 }
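
For readers following along, here is a minimal userspace sketch of the
format-string half of this change: an unsigned CPU number is printed with
%3u and no cast. fake_format_row() and its parameters are hypothetical
names for illustration only; the kernel code calls printk() directly.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format one row of a task dump.  The cpu value is
 * unsigned, so it is printed with %3u instead of being cast to int.
 */
static int fake_format_row(char *out, size_t len, int pid,
			   unsigned int cpu, int oom_adj, const char *comm)
{
	return snprintf(out, len, "[%5d] %3u     %3d %s",
			pid, cpu, oom_adj, comm);
}
```

Matching the specifier to the argument type keeps the compiler's format
checking meaningful and makes the cast unnecessary.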


* [patch -mm 2/6] oom: protect dereferencing of task's comm
  2010-06-09  3:59 [patch -mm 0/6] oom: various tiny cleanups and fixes David Rientjes
  2010-06-09  3:59 ` [patch -mm 1/6] oom: dump_tasks use find_lock_task_mm too fix David Rientjes
@ 2010-06-09  3:59 ` David Rientjes
  2010-06-09  3:59 ` [patch -mm 3/6] oom: add has_intersects_mems_allowed UMA variant David Rientjes
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: David Rientjes @ 2010-06-09  3:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Oleg Nesterov, KOSAKI Motohiro, KAMEZAWA Hiroyuki, linux-mm

Andrew notes that dereferencing task->comm is unsafe without holding
task_lock(task).  That's true even when dealing with current, so all
existing dereferences within the oom killer need to ensure they are
holding task_lock() before doing so.

This avoids get_task_comm() because it would require a buffer of
TASK_COMM_LEN on the stack (or a global string with added
synchronization).  Page allocations, and thus the oom killer, can happen
particularly deep in the stack, so the extra stack usage is undesirable.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -387,10 +387,10 @@ static void dump_tasks(const struct mem_cgroup *mem)
 static void dump_header(struct task_struct *p, gfp_t gfp_mask, int order,
 							struct mem_cgroup *mem)
 {
+	task_lock(current);
 	pr_warning("%s invoked oom-killer: gfp_mask=0x%x, order=%d, "
 		"oom_adj=%d\n",
 		current->comm, gfp_mask, order, current->signal->oom_adj);
-	task_lock(current);
 	cpuset_print_task_mems_allowed(current);
 	task_unlock(current);
 	dump_stack();
@@ -443,8 +443,10 @@ static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 		return 0;
 	}
 
+	task_lock(p);
 	pr_err("%s: Kill process %d (%s) score %lu or sacrifice child\n",
 		message, task_pid_nr(p), p->comm, points);
+	task_unlock(p);
 
 	/* Try to sacrifice the worst child first */
 	do_posix_clock_monotonic_gettime(&uptime);
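
A rough userspace sketch of the pattern used here, assuming a pthread
mutex as a stand-in for task_lock()/task_unlock(): the name is read in
place while the lock is held, so no TASK_COMM_LEN-sized copy is needed.
struct fake_task, fake_set_comm() and fake_report_kill() are hypothetical
names, and the output buffer merely stands in for the kernel log.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define COMM_LEN 16 /* stands in for TASK_COMM_LEN */

/* Hypothetical userspace stand-in for task_struct; not the kernel type. */
struct fake_task {
	pthread_mutex_t lock;	/* plays the role of task_lock() */
	char comm[COMM_LEN];	/* name that another thread may rewrite */
	int pid;
};

/* Rename the task under the lock, as a concurrent writer would. */
static void fake_set_comm(struct fake_task *t, const char *name)
{
	pthread_mutex_lock(&t->lock);
	snprintf(t->comm, sizeof(t->comm), "%s", name);
	pthread_mutex_unlock(&t->lock);
}

/*
 * Format the kill message while holding the lock.  comm is read in
 * place, so it cannot change halfway through the formatting and no
 * private copy of it is made.
 */
static int fake_report_kill(struct fake_task *t, char *out, size_t len)
{
	int n;

	pthread_mutex_lock(&t->lock);
	n = snprintf(out, len, "Kill process %d (%s)", t->pid, t->comm);
	pthread_mutex_unlock(&t->lock);
	return n;
}
```

The lock is held only for the duration of the formatting call, which
mirrors how the patch brackets pr_err() with task_lock()/task_unlock().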


* [patch -mm 3/6] oom: add has_intersects_mems_allowed UMA variant
  2010-06-09  3:59 [patch -mm 0/6] oom: various tiny cleanups and fixes David Rientjes
  2010-06-09  3:59 ` [patch -mm 1/6] oom: dump_tasks use find_lock_task_mm too fix David Rientjes
  2010-06-09  3:59 ` [patch -mm 2/6] oom: protect dereferencing of task's comm David Rientjes
@ 2010-06-09  3:59 ` David Rientjes
  2010-06-09  3:59 ` [patch -mm 4/6] oom: introduce find_lock_task_mm to fix mm false positives fix David Rientjes
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: David Rientjes @ 2010-06-09  3:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Oleg Nesterov, KOSAKI Motohiro, KAMEZAWA Hiroyuki, linux-mm

has_intersects_mems_allowed() shall always return true for machines
without CONFIG_NUMA since filtering tasks by either cpuset mems or
mempolicy nodes is unnecessary on such machines.

While we're here, fix the comment to make it conform to kerneldoc style.

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -36,10 +36,15 @@ int sysctl_oom_dump_tasks = 1;
 static DEFINE_SPINLOCK(zone_scan_lock);
 /* #define DEBUG */
 
-/*
- * Do all threads of the target process overlap our allowed nodes?
+#ifdef CONFIG_NUMA
+/**
+ * has_intersects_mems_allowed() - check task eligibility for kill
  * @tsk: task struct of which task to consider
  * @mask: nodemask passed to page allocator for mempolicy ooms
+ *
+ * Task eligibility is determined by whether or not a candidate task, @tsk,
+ * shares the same mempolicy nodes as current if it is bound by such a policy
+ * and whether or not it has the same set of allowed cpuset nodes.
  */
 static bool has_intersects_mems_allowed(struct task_struct *tsk,
 					const nodemask_t *mask)
@@ -68,6 +73,13 @@ static bool has_intersects_mems_allowed(struct task_struct *tsk,
 	} while (tsk != start);
 	return false;
 }
+#else
+static bool has_intersects_mems_allowed(struct task_struct *tsk,
+					const nodemask_t *mask)
+{
+	return true;
+}
+#endif /* CONFIG_NUMA */
 
 static struct task_struct *find_lock_task_mm(struct task_struct *p)
 {
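
The compile-time split can be sketched in userspace as follows.  This is
only an illustration: FAKE_CONFIG_NUMA, fake_nodemask_t, struct fake_task
and fake_intersects() are hypothetical, and the NUMA branch is reduced to
a single bitmask test, whereas the real function walks every thread and
distinguishes mempolicy nodes from cpuset nodes.

```c
#include <stdbool.h>

/* Hypothetical stand-in: one bit per memory node. */
typedef unsigned long fake_nodemask_t;

struct fake_task {
	fake_nodemask_t mems_allowed;	/* nodes this task may allocate from */
};

#ifdef FAKE_CONFIG_NUMA
/* NUMA build: eligible only if the task shares a node with mask. */
static bool fake_intersects(const struct fake_task *tsk, fake_nodemask_t mask)
{
	return (tsk->mems_allowed & mask) != 0;
}
#else
/* UMA build: there is nothing to filter on, so every task is eligible. */
static bool fake_intersects(const struct fake_task *tsk, fake_nodemask_t mask)
{
	(void)tsk;
	(void)mask;
	return true;
}
#endif /* FAKE_CONFIG_NUMA */
```

Callers use the same name in both configurations, so no #ifdef is needed
at the call sites and the compiler can drop the check on UMA builds.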


* [patch -mm 4/6] oom: introduce find_lock_task_mm to fix mm false positives fix
  2010-06-09  3:59 [patch -mm 0/6] oom: various tiny cleanups and fixes David Rientjes
                   ` (2 preceding siblings ...)
  2010-06-09  3:59 ` [patch -mm 3/6] oom: add has_intersects_mems_allowed UMA variant David Rientjes
@ 2010-06-09  3:59 ` David Rientjes
  2010-06-09  3:59 ` [patch -mm 5/6] oom: sacrifice child with highest badness score for parent fix David Rientjes
  2010-06-09  3:59 ` [patch -mm 6/6] oom: improve commentary in dump_tasks() David Rientjes
  5 siblings, 0 replies; 7+ messages in thread
From: David Rientjes @ 2010-06-09  3:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Oleg Nesterov, KOSAKI Motohiro, KAMEZAWA Hiroyuki, linux-mm

find_lock_task_mm() should be documented so that we clearly understand
what it does and why we need it.

At the same time, remove a stale comment about dereferencing the local
variable "mm" in badness(); that variable no longer exists, having been
removed when find_lock_task_mm() was added.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -81,6 +81,12 @@ static bool has_intersects_mems_allowed(struct task_struct *tsk,
 }
 #endif /* CONFIG_NUMA */
 
+/*
+ * The process p may have detached its own ->mm while exiting or through
+ * use_mm(), but one or more of its subthreads may still have a valid
+ * pointer.  Return p, or any of its subthreads with a valid ->mm, with
+ * task_lock() held.
+ */
 static struct task_struct *find_lock_task_mm(struct task_struct *p)
 {
 	struct task_struct *t = p;
@@ -135,10 +141,6 @@ unsigned long badness(struct task_struct *p, unsigned long uptime)
 	 * The memory size of the process is the basis for the badness.
 	 */
 	points = p->mm->total_vm;
-
-	/*
-	 * After this unlock we can no longer dereference local variable `mm'
-	 */
 	task_unlock(p);
 
 	/*
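
The behaviour the new comment documents can be sketched in userspace,
assuming a circular list of threads and a pthread mutex in place of
task_lock().  struct fake_mm, struct fake_thread and fake_find_lock_mm()
are hypothetical names; this is not the kernel implementation.

```c
#include <pthread.h>
#include <stddef.h>

struct fake_mm {
	unsigned long total_vm;
};

struct fake_thread {
	pthread_mutex_t lock;		/* stands in for task_lock() */
	struct fake_mm *mm;		/* NULL once detached */
	struct fake_thread *next;	/* circular list of the thread group */
};

/*
 * Walk the thread group starting at p.  Return the first thread whose
 * mm is still attached, with its lock held, or NULL (with no lock held)
 * if every thread has detached its mm.
 */
static struct fake_thread *fake_find_lock_mm(struct fake_thread *p)
{
	struct fake_thread *t = p;

	do {
		pthread_mutex_lock(&t->lock);
		if (t->mm)
			return t;	/* caller must unlock t */
		pthread_mutex_unlock(&t->lock);
		t = t->next;
	} while (t != p);

	return NULL;
}
```

Returning with the lock held lets the caller dereference the mm safely;
the caller is then responsible for dropping the lock on the returned
thread, which may not be the one it passed in.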


* [patch -mm 5/6] oom: sacrifice child with highest badness score for parent fix
  2010-06-09  3:59 [patch -mm 0/6] oom: various tiny cleanups and fixes David Rientjes
                   ` (3 preceding siblings ...)
  2010-06-09  3:59 ` [patch -mm 4/6] oom: introduce find_lock_task_mm to fix mm false positives fix David Rientjes
@ 2010-06-09  3:59 ` David Rientjes
  2010-06-09  3:59 ` [patch -mm 6/6] oom: improve commentary in dump_tasks() David Rientjes
  5 siblings, 0 replies; 7+ messages in thread
From: David Rientjes @ 2010-06-09  3:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Oleg Nesterov, KOSAKI Motohiro, KAMEZAWA Hiroyuki, linux-mm

Elaborate on the comment in oom_kill_process() so it's clear why a
killable child with a different mm is sacrificed for its parent.

At the same time, rename auto variable `c' to "child" and move "cpoints"
inside the list_for_each_entry() loop with a more descriptive name as
akpm suggests.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |   25 +++++++++++++++----------
 1 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -440,7 +440,7 @@ static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 			    const char *message)
 {
 	struct task_struct *victim = p;
-	struct task_struct *c;
+	struct task_struct *child;
 	struct task_struct *t = p;
 	unsigned long victim_points = 0;
 	struct timespec uptime;
@@ -462,22 +462,27 @@ static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 		message, task_pid_nr(p), p->comm, points);
 	task_unlock(p);
 
-	/* Try to sacrifice the worst child first */
+	/*
+	 * If any of p's children has a different mm and is eligible for kill,
+	 * the one with the highest badness() score is sacrificed for its
+	 * parent.  This attempts to lose the minimal amount of work done while
+	 * still freeing memory.
+	 */
 	do_posix_clock_monotonic_gettime(&uptime);
 	do {
-		unsigned long cpoints;
+		list_for_each_entry(child, &t->children, sibling) {
+			unsigned long child_points;
 
-		list_for_each_entry(c, &t->children, sibling) {
-			if (c->mm == p->mm)
+			if (child->mm == p->mm)
 				continue;
-			if (mem && !task_in_mem_cgroup(c, mem))
+			if (mem && !task_in_mem_cgroup(child, mem))
 				continue;
 
 			/* badness() returns 0 if the thread is unkillable */
-			cpoints = badness(c, uptime.tv_sec);
-			if (cpoints > victim_points) {
-				victim = c;
-				victim_points = cpoints;
+			child_points = badness(child, uptime.tv_sec);
+			if (child_points > victim_points) {
+				victim = child;
+				victim_points = child_points;
 			}
 		}
 	} while_each_thread(p, t);
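
The selection rule described in the new comment can be sketched in
userspace as below.  The sketch assumes the badness score of each child
is already computed and stored, uses a flat array instead of the kernel's
children lists, and omits the memcg filter; struct fake_task and
fake_pick_victim() are hypothetical names.

```c
#include <stddef.h>

struct fake_task {
	const void *mm;		/* identity of the address space */
	unsigned long points;	/* precomputed score; 0 means unkillable */
};

/*
 * Return the child with the highest nonzero score whose mm differs from
 * the parent's.  Fall back to the parent if no child qualifies.
 */
static struct fake_task *fake_pick_victim(struct fake_task *parent,
					  struct fake_task *children,
					  size_t nr_children)
{
	struct fake_task *victim = parent;
	unsigned long victim_points = 0;
	size_t i;

	for (i = 0; i < nr_children; i++) {
		struct fake_task *child = &children[i];
		unsigned long child_points;

		/* Skip children that share the parent's address space. */
		if (child->mm == parent->mm)
			continue;

		child_points = child->points;
		if (child_points > victim_points) {
			victim = child;
			victim_points = child_points;
		}
	}
	return victim;
}
```

Because victim_points starts at zero and the comparison is strict, a
child with a score of zero is never chosen and the parent remains the
victim.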


* [patch -mm 6/6] oom: improve commentary in dump_tasks()
  2010-06-09  3:59 [patch -mm 0/6] oom: various tiny cleanups and fixes David Rientjes
                   ` (4 preceding siblings ...)
  2010-06-09  3:59 ` [patch -mm 5/6] oom: sacrifice child with highest badness score for parent fix David Rientjes
@ 2010-06-09  3:59 ` David Rientjes
  5 siblings, 0 replies; 7+ messages in thread
From: David Rientjes @ 2010-06-09  3:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Oleg Nesterov, KOSAKI Motohiro, KAMEZAWA Hiroyuki, linux-mm

The comments in dump_tasks() should state more clearly why tasks are
filtered and how its argument filters them.

An unnecessary comment concerning a check for is_global_init() is removed
since it isn't of importance.

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |   11 +++--------
 1 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -351,7 +351,7 @@ static struct task_struct *select_bad_process(unsigned long *ppoints,
 
 /**
  * dump_tasks - dump current memory state of all system tasks
- * @mem: target memory controller
+ * @mem: current's memory controller, if constrained
  *
  * Dumps the current memory state of all system tasks, excluding kernel threads.
  * State information includes task's pid, uid, tgid, vm size, rss, cpu, oom_adj
@@ -370,11 +370,6 @@ static void dump_tasks(const struct mem_cgroup *mem)
 	printk(KERN_INFO "[ pid ]   uid  tgid total_vm      rss cpu oom_adj "
 	       "name\n");
 	for_each_process(p) {
-		/*
-		 * We don't have is_global_init() check here, because the old
-		 * code do that. printing init process is not big matter. But
-		 * we don't hope to make unnecessary compatibility breaking.
-		 */
 		if (p->flags & PF_KTHREAD)
 			continue;
 		if (mem && !task_in_mem_cgroup(p, mem))
@@ -383,8 +378,8 @@ static void dump_tasks(const struct mem_cgroup *mem)
 		task = find_lock_task_mm(p);
 		if (!task) {
 			/*
-			 * Probably oom vs task-exiting race was happen and ->mm
-			 * have been detached. thus there's no need to report
+			 * This is a kthread or all of p's threads have already
+			 * detached their mm's.  There's no need to report
 			 * them; they can't be oom killed anyway.
 			 */
 			continue;

