From: David Rientjes <rientjes@google.com>
To: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Minchan Kim <minchan.kim@gmail.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
balbir@linux.vnet.ibm.com, Oleg Nesterov <oleg@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>, Mel Gorman <mel@csn.ul.ie>,
williams@redhat.com
Subject: Re: [RFC] oom-kill: give the dying task a higher priority
Date: Tue, 1 Jun 2010 13:49:58 -0700 (PDT)
Message-ID: <alpine.DEB.2.00.1006011347060.13136@chino.kir.corp.google.com>
In-Reply-To: <20100601173535.GD23428@uudg.org>
On Tue, 1 Jun 2010, Luis Claudio R. Goncalves wrote:
> oom-kill: give the dying task a higher priority (v5)
>
> In a system under heavy load it was observed that even after the
> oom-killer selects a task to die, the task may take a long time to die.
>
> Right before sending a SIGKILL to the task selected by the oom-killer
> this task has its priority increased so that it can exit() soon,
> freeing memory. That is accomplished by:
>
> /*
> * We give our sacrificial lamb high priority and access to
> * all the memory it needs. That way it should be able to
> * exit() and clear out its resources quickly...
> */
> p->rt.time_slice = HZ;
> set_tsk_thread_flag(p, TIF_MEMDIE);
>
> It sounds plausible giving the dying task an even higher priority to be
> sure it will be scheduled sooner and free the desired memory. It was
> suggested on LKML using SCHED_FIFO:1, the lowest RT priority so that
> this task won't interfere with any running RT task.
>
> If the dying task is already an RT task, leave it untouched.
>
> Another good suggestion, implemented here, was to avoid boosting the
> dying task priority in case of mem_cgroup OOM.
>
> Signed-off-by: Luis Claudio R. Goncalves <lclaudio@uudg.org>
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 709aedf..67e18ca 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -52,6 +52,22 @@ static int has_intersects_mems_allowed(struct task_struct *tsk)
> return 0;
> }
>
> +/*
> + * If this is a system OOM (not a memcg OOM) and the task selected to be
> + * killed is not already running at high (RT) priorities, speed up the
> + * recovery by boosting the dying task to the lowest FIFO priority.
> + * That helps with the recovery and avoids interfering with RT tasks.
> + */
> +static void boost_dying_task_prio(struct task_struct *p,
> + struct mem_cgroup *mem)
> +{
> + if ((mem == NULL) && !rt_task(p)) {
> + struct sched_param param;
> + param.sched_priority = 1;
> + sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
> + }
> +}
> +
> /**
> * badness - calculate a numeric value for how bad this task has been
> * @p: task struct of which task we should calculate
> @@ -277,8 +293,10 @@ static struct task_struct *select_bad_process(unsigned long *ppoints,
> * blocked waiting for another task which itself is waiting
> * for memory. Is there a better alternative?
> */
> - if (test_tsk_thread_flag(p, TIF_MEMDIE))
> + if (test_tsk_thread_flag(p, TIF_MEMDIE)) {
> + boost_dying_task_prio(p, mem);
> return ERR_PTR(-1UL);
> + }
>
> /*
> * This is in the process of releasing memory so wait for it
That's unnecessary: if p already has TIF_MEMDIE set, then
boost_dying_task_prio(p) has already been called.
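To illustrate the redundancy, here is a minimal user-space model, not kernel code. It assumes a system (non-memcg) OOM and that the kill path applies the boost at the same point where it sets TIF_MEMDIE, which the hunks quoted here don't show. The struct and the *_model() helpers are hypothetical stand-ins for task_struct, boost_dying_task_prio(), the kill path, and the TIF_MEMDIE check in select_bad_process().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few task_struct bits that matter here. */
struct task_model {
	bool memdie;	/* models TIF_MEMDIE */
	bool rt;	/* models rt_task(p) */
	int boosts;	/* counts how often the priority was actually changed */
};

/* Models boost_dying_task_prio(): skip memcg OOMs and tasks already RT. */
static void boost_model(struct task_model *p, bool memcg_oom)
{
	if (!memcg_oom && !p->rt) {
		p->rt = true;	/* models SCHED_FIFO, priority 1 */
		p->boosts++;
	}
}

/* Models the kill path (assumed): mark the victim and boost it once. */
static void kill_model(struct task_model *p, bool memcg_oom)
{
	p->memdie = true;
	boost_model(p, memcg_oom);
}

/* Models the extra call added under the TIF_MEMDIE test in the scan. */
static void scan_model(struct task_model *p, bool memcg_oom)
{
	if (p->memdie)
		boost_model(p, memcg_oom);
}
```

In this model, once kill_model() has run for a system OOM, every later scan_model() call finds the task already RT and changes nothing, so the call in the scan path only adds work.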
> @@ -291,9 +309,10 @@ static struct task_struct *select_bad_process(unsigned long *ppoints,
> * Otherwise we could get an easy OOM deadlock.
> */
> if (p->flags & PF_EXITING) {
> - if (p != current)
> + if (p != current) {
> + boost_dying_task_prio(p, mem);
> return ERR_PTR(-1UL);
> -
> + }
> chosen = p;
> *ppoints = ULONG_MAX;
> }
This has the potential to actually make it harder to free memory if p is
waiting to acquire a writelock on mm->mmap_sem in the exit path while the
thread holding mm->mmap_sem is trying to run.