Re: [RFC] oom-kill: give the dying task a higher priority

From: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>,
	balbir@linux.vnet.ibm.com, Oleg Nesterov <oleg@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	David Rientjes <rientjes@google.com>, Mel Gorman <mel@csn.ul.ie>,
	williams@redhat.com
Subject: Re: [RFC] oom-kill: give the dying task a higher priority
Date: Sun, 30 May 2010 23:15:59 -0300
Message-ID: <20100531021559.GA19784@uudg.org>
In-Reply-To: <20100529125136.62CA.A69D9226@jp.fujitsu.com>

On Sat, May 29, 2010 at 12:59:09PM +0900, KOSAKI Motohiro wrote:
| Hi
| 
| > oom-killer: give the dying task rt priority (v3)
| > 
| > Give the dying task RT priority so that it can be scheduled quickly and die,
| > freeing needed memory.
| > 
| > Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
| 
| Almostly acceptable to me. but I have two requests, 
| 
| - need 1) force_sig() 2)sched_setscheduler() order as Oleg mentioned
| - don't boost priority if it's in mem_cgroup_out_of_memory()
| 
| Can you accept this? if not, can you please explain the reason?
| 
| Thanks.

The last patch I posted was the wrong one from my queue. Sorry for the
confusion. Here is the latest version of the patch, including the suggestions
from Oleg, Peter and Kosaki Motohiro:


oom-kill: give the dying task a higher priority (v4)

In a system under heavy load it was observed that even after the
oom-killer selects a task to die, the task may take a long time to die.

Right before sending a SIGKILL to the task selected by the oom-killer,
this task has its priority increased so that it can exit() soon,
freeing memory. That is accomplished by:

        /*
         * We give our sacrificial lamb high priority and access to
         * all the memory it needs. That way it should be able to
         * exit() and clear out its resources quickly...
         */
 	p->rt.time_slice = HZ;
 	set_tsk_thread_flag(p, TIF_MEMDIE);

It sounds plausible to give the dying task an even higher priority, so
that it is scheduled sooner and frees the desired memory. It was suggested
on LKML to use SCHED_FIFO:1, the lowest RT priority, so that this task
won't interfere with any running RT task.
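
For readers less familiar with the RT scheduling classes, here is a minimal
userspace sketch of the same request, assuming Linux and the POSIX
sched_setscheduler() interface. It is only an analogue of the idea, not the
kernel code in this patch; the helper names lowest_fifo_priority() and
boost_to_lowest_fifo() are made up for illustration.

```c
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/* Lowest valid SCHED_FIFO priority on this system (1 on Linux). */
static int lowest_fifo_priority(void)
{
	return sched_get_priority_min(SCHED_FIFO);
}

/*
 * Hypothetical userspace helper, not part of the patch: request
 * SCHED_FIFO at the lowest RT priority for `pid` (0 means the caller).
 * Returns 0 on success or a negative errno value. Callers without
 * the needed privilege typically get -EPERM.
 */
static int boost_to_lowest_fifo(pid_t pid)
{
	struct sched_param param = {
		.sched_priority = lowest_fifo_priority(),
	};

	if (sched_setscheduler(pid, SCHED_FIFO, &param) == -1)
		return -errno;
	return 0;
}
```

A task at SCHED_FIFO:1 runs ahead of every SCHED_OTHER task but behind any
RT task at priority 2 or above, which is why the lowest RT priority was the
one suggested.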

Another good suggestion, implemented here, was to avoid boosting the dying
task's priority in the case of a mem_cgroup OOM.
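
The resulting order of operations can be modelled in plain C as a rough
sketch. Everything here is a stand-in invented for illustration (the
oom_step enum, the oom_trace struct and model_oom_kill() do not exist in
the kernel); the only points taken from this version of the patch are that
the signal comes first and that the boost is skipped when a memcg is
involved.

```c
#include <stddef.h>

/* Steps recorded by the model, in the order they would run. */
enum oom_step { OOM_STEP_SIGKILL, OOM_STEP_BOOST };

struct oom_trace {
	enum oom_step steps[2];
	int count;
};

/*
 * Toy model of the flow (all names hypothetical): `memcg` stands in
 * for the `mem` argument, where NULL means a system-wide OOM.
 */
static void model_oom_kill(const void *memcg, struct oom_trace *trace)
{
	trace->count = 0;

	/* The signal is always sent first. */
	trace->steps[trace->count++] = OOM_STEP_SIGKILL;

	/* The priority boost applies to a global OOM only. */
	if (memcg == NULL)
		trace->steps[trace->count++] = OOM_STEP_BOOST;
}
```

Calling model_oom_kill(NULL, &trace) records the signal followed by the
boost, while passing any non-NULL pointer records the signal alone.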

Signed-off-by: Luis Claudio R. Goncalves <lclaudio@uudg.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 709aedf..6a25293 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -380,7 +380,8 @@ static void dump_header(struct task_struct *p, gfp_t gfp_mask, int order,
  * flag though it's unlikely that  we select a process with CAP_SYS_RAW_IO
  * set.
  */
-static void __oom_kill_task(struct task_struct *p, int verbose)
+static void __oom_kill_task(struct task_struct *p, struct mem_cgroup *mem,
+								int verbose)
 {
 	if (is_global_init(p)) {
 		WARN_ON(1);
@@ -413,11 +414,20 @@ static void __oom_kill_task(struct task_struct *p, int verbose)
 	 */
 	p->rt.time_slice = HZ;
 	set_tsk_thread_flag(p, TIF_MEMDIE);
-
 	force_sig(SIGKILL, p);
+	/*
+	 * If this is a system OOM (not a memcg OOM), speed up the recovery
+	 * by boosting the dying task priority to the lowest FIFO priority.
+	 * That helps with the recovery and avoids interfering with RT tasks.
+	 */
+	if (mem == NULL) {
+		struct sched_param param;
+		param.sched_priority = 1;
+		sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
+	}
 }
 
-static int oom_kill_task(struct task_struct *p)
+static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem)
 {
 	/* WARNING: mm may not be dereferenced since we did not obtain its
 	 * value from get_task_mm(p).  This is OK since all we need to do is
@@ -430,7 +440,7 @@ static int oom_kill_task(struct task_struct *p)
 	if (!p->mm || p->signal->oom_adj == OOM_DISABLE)
 		return 1;
 
-	__oom_kill_task(p, 1);
+	__oom_kill_task(p, mem, 1);
 
 	return 0;
 }
@@ -449,7 +459,7 @@ static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 	 * its children or threads, just set TIF_MEMDIE so it can die quickly
 	 */
 	if (p->flags & PF_EXITING) {
-		__oom_kill_task(p, 0);
+		__oom_kill_task(p, mem, 0);
 		return 0;
 	}
 
@@ -462,10 +472,10 @@ static int oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 			continue;
 		if (mem && !task_in_mem_cgroup(c, mem))
 			continue;
-		if (!oom_kill_task(c))
+		if (!oom_kill_task(c, mem))
 			return 0;
 	}
-	return oom_kill_task(p);
+	return oom_kill_task(p, mem);
 }
 
 #ifdef CONFIG_CGROUP_MEM_RES_CTLR

-- 
[ Luis Claudio R. Goncalves                    Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9  2696 7203 D980 A448 C8F8 ]
