From: David Rientjes <rientjes@google.com>
To: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>, Lubos Lunak <l.lunak@suse.cz>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Nick Piggin <npiggin@suse.de>, Jiri Kosina <jkosina@suse.cz>
Subject: Re: Improving OOM killer
Date: Wed, 3 Feb 2010 10:58:01 -0800 (PST)
Message-ID: <alpine.DEB.2.00.1002031021190.14088@chino.kir.corp.google.com>
In-Reply-To: <20100203170127.GH19641@balbir.in.ibm.com>
On Wed, 3 Feb 2010, Balbir Singh wrote:
> > IIRC the child accumulating code was introduced to deal with
> > malicious code (fork bombs), but it makes things worse for the
> > (much more common) situation of a system without malicious
> > code simply running out of memory due to being very busy.
> >
>
> For fork bombs, we could do a number of children number test and have
> a threshold before we consider a process and its children for
> badness().
>
Yes, we could look for the number of children with separate mm's and then
penalize those threads that have forked an egregious number of tasks, say,
500. I think we should check for this threshold within the badness()
heuristic to identify such forkbombs and not limit it only to certain
applications.

My rewrite for the badness() heuristic is centered on the idea that scores
should range from 0 to 1000, 0 meaning "never kill this task" and 1000
meaning "kill this task first." The baseline for a thread, p, may be
something like this:

	unsigned int badness(struct task_struct *p,
					unsigned long totalram)
	{
		struct task_struct *child;
		struct mm_struct *mm;
		int forkcount = 0;
		long points;

		task_lock(p);
		mm = p->mm;
		if (!mm) {
			task_unlock(p);
			return 0;
		}
		points = (get_mm_rss(mm) +
				get_mm_counter(mm, MM_SWAPENTS)) * 1000 /
				totalram;
		task_unlock(p);

		list_for_each_entry(child, &p->children, sibling)
			/* No lock, child->mm won't be dereferenced */
			if (child->mm && child->mm != mm)
				forkcount++;

		/* Forkbombs get penalized 10% of available RAM */
		if (forkcount > 500)
			points += 100;

		...

		/*
		 * /proc/pid/oom_adj ranges from -1000 to +1000 to either
		 * completely disable oom killing or always prefer it.
		 */
		points += p->signal->oom_adj;

		if (points < 0)
			return 0;
		return (points <= 1000) ? points : 1000;
	}

	static struct task_struct *select_bad_process(...,
						nodemask_t *nodemask)
	{
		struct task_struct *p;
		unsigned long totalram = 0;
		int nid;

		/* Only count RAM on the nodes this allocation may use */
		for_each_node_mask(nid, *nodemask)
			totalram += NODE_DATA(nid)->node_present_pages;

		for_each_process(p) {
			unsigned int points;

			...
			/* Skip tasks that cannot free memory on these nodes */
			if (!nodes_intersects(p->mems_allowed, *nodemask))
				continue;
			...
			points = badness(p, totalram);
			...
		}
		...
	}

In this example, /proc/pid/oom_adj now ranges from -1000 to +1000, with
OOM_DISABLE being -1000, to polarize tasks for oom killing or determine
when a task is leaking memory because it is using far more memory than it
should. The nodemask passed from the page allocator should be intersected
with current->mems_allowed within the oom killer; userspace is then fully
aware of what value is an egregious amount of RAM for a task to consume,
including information it knows about the task's cpuset or mempolicy. For
example, it would be very simple for a user to set an oom_adj of -500,
which means "we discount 50% of the task's allowed memory from being
considered in the heuristic" or +500, which means "we always allow all
other system/cpuset/mempolicy tasks to use at least 50% more allowed
memory than this one."
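
To make the arithmetic concrete, here is a minimal userspace sketch of the
proposed score. It is a model only, not kernel code: model_badness() and its
parameters are hypothetical names, memory is given in pages, and the part of
the heuristic elided by "..." above is left out.

```c
#include <assert.h>

/*
 * Hypothetical userspace model of the proposed badness() score.
 * All sizes are in pages; totalram_pages stands in for the RAM
 * present on the nodes the allocation may use.
 */
static long model_badness(unsigned long rss_pages, unsigned long swap_pages,
			  unsigned long totalram_pages, int forkcount,
			  int oom_adj)
{
	long points;

	assert(totalram_pages > 0);

	/* Share of allowed RAM in use, scaled to 0..1000 */
	points = (long)((rss_pages + swap_pages) * 1000 / totalram_pages);

	/* More than 500 children with their own mm: add 10% of RAM */
	if (forkcount > 500)
		points += 100;

	/* oom_adj in [-1000, +1000] shifts the score directly */
	points += oom_adj;

	/* Clamp to the 0..1000 range */
	if (points < 0)
		return 0;
	return points <= 1000 ? points : 1000;
}
```

With 1,000,000 pages of allowed RAM, a task using 600,000 pages scores 600.
The same task with an oom_adj of -500 scores 100, because half of the allowed
memory is discounted. A task using 200,000 pages with an oom_adj of +500
scores 700 and so outranks any unadjusted task using less than 70% of RAM.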