[PATCH 2/4] oom: make oom_score to per-process value

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: LKML <linux-kernel@vger.kernel.org>
Cc: kosaki.motohiro@jp.fujitsu.com, Paul Menage <menage@google.com>,
	David Rientjes <rientjes@google.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>, linux-mm <linux-mm@kvack.org>
Subject: [PATCH 2/4] oom: make oom_score to per-process value
Date: Tue,  4 Aug 2009 19:26:37 +0900 (JST)
Message-ID: <20090804192557.6A43.A69D9226@jp.fujitsu.com>
In-Reply-To: <20090804191031.6A3D.A69D9226@jp.fujitsu.com>

Subject: [PATCH] oom: make oom_score to per-process value

The oom-killer kills a process, not a task, so oom_score should be
calculated per-process too. This makes the score consistent across all
threads of a process and speeds up select_bad_process(), which now
iterates over processes instead of over every thread.


Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 Documentation/filesystems/proc.txt |    4 ++--
 fs/proc/base.c                     |    2 +-
 mm/oom_kill.c                      |   36 +++++++++++++++++++++++++++++-------
 3 files changed, 32 insertions(+), 10 deletions(-)
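
As a rough illustration of the per-process idea described in the changelog,
here is a user-space sketch, not kernel code. The struct toy_thread type and
both toy_* helpers are hypothetical stand-ins invented for this note: the
first sums CPU time over a circular thread list, loosely analogous to what
thread_group_cputime() provides, and the second mirrors the any-thread check
done by has_intersects_mems_allowed() in the patch. Locking is deliberately
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a task; not the kernel's task_struct. */
struct toy_thread {
	unsigned long utime;        /* user CPU time, in ticks */
	unsigned long stime;        /* system CPU time, in ticks */
	unsigned long mems_allowed; /* bitmask of permitted memory nodes */
	struct toy_thread *next;    /* circular list of threads in the group */
};

/* Sum CPU time over every thread in the group, starting at any member. */
static unsigned long toy_group_cputime(const struct toy_thread *leader)
{
	const struct toy_thread *t = leader;
	unsigned long total = 0;

	assert(leader != NULL);
	do {
		total += t->utime + t->stime;
		t = t->next;
	} while (t != leader);

	return total;
}

/* True if any thread in the group may allocate from one of the given nodes. */
static bool toy_group_intersects(const struct toy_thread *leader,
				 unsigned long nodes)
{
	const struct toy_thread *t = leader;

	assert(leader != NULL);
	do {
		if (t->mems_allowed & nodes)
			return true;
		t = t->next;
	} while (t != leader);

	return false;
}
```

Because both helpers walk the whole ring, they return the same answer no
matter which thread of the group they start from, which is the consistency
property the changelog is after.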

Index: b/mm/oom_kill.c
===================================================================
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -58,6 +58,19 @@ void set_oom_adj(struct task_struct *tsk
 }
 
 
+static int has_intersects_mems_allowed(struct task_struct *tsk)
+{
+	struct task_struct *t;
+
+	t = tsk;
+	do {
+		if (cpuset_mems_allowed_intersects(current, t))
+			return 1;
+		t = next_thread(t);
+	} while (t != tsk);
+
+	return 0;
+}
 
 /**
  * badness - calculate a numeric value for how bad this task has been
@@ -77,18 +90,26 @@ void set_oom_adj(struct task_struct *tsk
  *    algorithm has been meticulously tuned to meet the principle
  *    of least surprise ... (be careful when you change it)
  */
-
 unsigned long badness(struct task_struct *p, unsigned long uptime)
 {
 	unsigned long points, cpu_time, run_time;
 	struct mm_struct *mm;
 	struct task_struct *child;
 	int oom_adj;
+	struct task_cputime task_time;
+	unsigned long flags;
+	unsigned long utime;
+	unsigned long stime;
 
 	oom_adj = get_oom_adj(p);
 	if (oom_adj == OOM_DISABLE)
 		return 0;
 
+	if (!lock_task_sighand(p, &flags))
+		return 0;
+	thread_group_cputime(p, &task_time);
+	unlock_task_sighand(p, &flags);
+
 	task_lock(p);
 	mm = p->mm;
 	if (!mm) {
@@ -132,8 +153,9 @@ unsigned long badness(struct task_struct
          * of seconds. There is no particular reason for this other than
          * that it turned out to work very well in practice.
 	 */
-	cpu_time = (cputime_to_jiffies(p->utime) + cputime_to_jiffies(p->stime))
-		>> (SHIFT_HZ + 3);
+	utime = cputime_to_jiffies(task_time.utime);
+	stime = cputime_to_jiffies(task_time.stime);
+	cpu_time = (utime + stime) >> (SHIFT_HZ + 3);
 
 	if (uptime >= p->start_time.tv_sec)
 		run_time = (uptime - p->start_time.tv_sec) >> 10;
@@ -174,7 +196,7 @@ unsigned long badness(struct task_struct
 	 * because p may have allocated or otherwise mapped memory on
 	 * this node before. However it will be less likely.
 	 */
-	if (!cpuset_mems_allowed_intersects(current, p))
+	if (!has_intersects_mems_allowed(p))
 		points /= 8;
 
 	/*
@@ -230,13 +252,13 @@ static inline enum oom_constraint constr
 static struct task_struct *select_bad_process(unsigned long *ppoints,
 						struct mem_cgroup *mem)
 {
-	struct task_struct *g, *p;
+	struct task_struct *p;
 	struct task_struct *chosen = NULL;
 	struct timespec uptime;
 	*ppoints = 0;
 
 	do_posix_clock_monotonic_gettime(&uptime);
-	do_each_thread(g, p) {
+	for_each_process(p) {
 		unsigned long points;
 
 		/*
@@ -286,7 +308,7 @@ static struct task_struct *select_bad_pr
 			chosen = p;
 			*ppoints = points;
 		}
-	} while_each_thread(g, p);
+	}
 
 	return chosen;
 }
Index: b/fs/proc/base.c
===================================================================
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -450,7 +450,7 @@ static int proc_oom_score(struct task_st
 
 	do_posix_clock_monotonic_gettime(&uptime);
 	read_lock(&tasklist_lock);
-	points = badness(task, uptime.tv_sec);
+	points = badness(task->group_leader, uptime.tv_sec);
 	read_unlock(&tasklist_lock);
 	return sprintf(buffer, "%lu\n", points);
 }
Index: b/Documentation/filesystems/proc.txt
===================================================================
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1195,13 +1195,13 @@ The following heuristics are then applie
  * if the task was reniced, its score doubles
  * superuser or direct hardware access tasks (CAP_SYS_ADMIN, CAP_SYS_RESOURCE
  	or CAP_SYS_RAWIO) have their score divided by 4
- * if oom condition happened in one cpuset and checked task does not belong
+ * if oom condition happened in one cpuset and checked process does not belong
  	to it, its score is divided by 8
  * the resulting score is multiplied by two to the power of oom_adj, i.e.
 	points <<= oom_adj when it is positive and
 	points >>= -(oom_adj) otherwise
 
-The task with the highest badness score is then selected and its children
+The process with the highest badness score is then selected and its children
 are killed, process itself will be killed in an OOM situation when it does
 not have children or some of them disabled oom like described above.
 

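The last two heuristics in the proc.txt hunk above (divide by 8 outside the
OOM cpuset, then shift by oom_adj) can be sketched in user space as follows.
This is only an illustration of the documented arithmetic, not the kernel's
badness() code: toy_adjust_points() is a hypothetical name, and the accepted
oom_adj range of -16..15 is an assumption of this sketch (OOM_DISABLE is
handled earlier in badness() and never reaches the shift).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Apply the cpuset and oom_adj adjustments described in proc.txt.
 * Hypothetical helper; assumes oom_adj is within -16..15.
 */
static unsigned long toy_adjust_points(unsigned long points, int oom_adj,
				       bool in_oom_cpuset)
{
	assert(oom_adj >= -16 && oom_adj <= 15);

	/* Processes outside the cpuset that hit OOM are less likely victims. */
	if (!in_oom_cpuset)
		points /= 8;

	/* points <<= oom_adj when positive, points >>= -(oom_adj) otherwise. */
	if (oom_adj > 0)
		points <<= oom_adj;
	else
		points >>= -oom_adj;

	return points;
}
```

For example, a process with 800 points, oom_adj of 1, and no overlap with
the OOM cpuset ends up with 800 / 8 = 100, then 100 << 1 = 200.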

