linux-mm.kvack.org archive mirror
From: David Rientjes <rientjes@google.com>
To: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Nick Piggin <npiggin@suse.de>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Lubos Lunak <l.lunak@suse.cz>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch 4/7 -mm] oom: badness heuristic rewrite
Date: Thu, 11 Feb 2010 01:14:43 -0800 (PST)
Message-ID: <alpine.DEB.2.00.1002102332200.22152@chino.kir.corp.google.com>
In-Reply-To: <4B73833D.5070008@redhat.com>

On Wed, 10 Feb 2010, Rik van Riel wrote:

> > OOM_ADJUST_MIN and OOM_ADJUST_MAX have been exported to userspace since
> > 2006 via include/linux/oom.h.  This alters their values from -16 to -1000
> > and from +15 to +1000, respectively.
> 
> That seems like a bad idea.  Google may have the luxury of
> being able to recompile all its in-house applications, but
> this will not be true for many other users of /proc/<pid>/oom_adj
> 

Changing any value that may have a tendency to be hardcoded elsewhere is 
always controversial, but I think the nature of /proc/pid/oom_adj allows 
us to do so for two specific reasons:

 - hardcoded values tend not to fall within a range; they tend to either
   always prefer a certain task for oom kill first or disable oom killing
   entirely.  The current implementation uses this value as a bitshift on
   a seemingly unscientific heuristic whose result is very difficult to
   predict at runtime.  This means that few applications would hardcode a
   value of '8', for example, because its semantics depend entirely on
   the RAM capacity of the system to begin with, since badness() scores
   are only useful when used in comparison with other tasks.

 - the badness() heuristic is radically changed from what it is currently,
   so this gives applications that hardcoded /proc/pid/oom_adj values into
   their software a reason to notice the change and adjust to the new
   semantics of the badness score.  Using /proc/pid/oom_adj as a bitshift
   has no real application to any sane heuristic that represents scores in
   meaningful units, so users should end up with a net benefit from the
   change: they can better tune oom killing behavior with a much more
   powerful and easier to understand heuristic, which requires them to
   recalculate exactly what oom_adj should be for any given application
   in terms of real units and business goals.
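
To make the bitshift point concrete, here is a minimal sketch in C. It is
not the kernel's code: shift_adjust() only mimics the shift step described
above, and linear_adjust() is a hypothetical additive alternative that
assumes a score normalised to 0..1000 and the proposed -1000..+1000 range.

```c
/*
 * Sketch of a bitshift-style adjustment: a positive oom_adj shifts the
 * score left, a negative one shifts it right.  Bumping a zero score to 1
 * before shifting is an assumption about the legacy behaviour.
 */
static unsigned long shift_adjust(unsigned long points, int oom_adj)
{
	if (oom_adj > 0) {
		if (points == 0)
			points = 1;
		return points << oom_adj;
	}
	if (oom_adj < 0)
		return points >> -oom_adj;
	return points;
}

/*
 * Hypothetical linear alternative: the score is assumed to be already
 * normalised to 0..1000, the adjustment is simply added, and the result
 * is clamped back into that range.
 */
static long linear_adjust(long score, long adj)
{
	long sum = score + adj;

	if (sum < 0)
		return 0;
	if (sum > 1000)
		return 1000;
	return sum;
}
```

With the shift, an oom_adj of 8 multiplies whatever the base score happens
to be by 256, so its effect depends on how large the unadjusted scores are
on a given machine.  With the additive form, the same adjustment moves the
score by a fixed amount on a fixed scale.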

As mentioned in the changelog, we've exported these minimum and maximum 
values via a kernel header file since at least 2006.  At what point do we 
assume they are going to be used and not hardcoded into applications?  
That was certainly the intention when making them user visible.
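
As an illustration of using the exported limits instead of literal
numbers, the following C sketch derives every value from OOM_ADJUST_MIN and
OOM_ADJUST_MAX.  The fallback definitions (taken from the -16 and +15
mentioned in the changelog) and both helper functions are assumptions made
for this example, not part of the patch.

```c
/* Pull in the exported limits when the kernel header is available. */
#if defined(__has_include)
#  if __has_include(<linux/oom.h>)
#    include <linux/oom.h>
#  endif
#endif

/* Fallbacks for builds without the header (assumed pre-patch values). */
#ifndef OOM_ADJUST_MIN
#define OOM_ADJUST_MIN (-16)
#endif
#ifndef OOM_ADJUST_MAX
#define OOM_ADJUST_MAX 15
#endif

/* Hypothetical helper: clamp a requested value into the exported range. */
static int clamp_oom_adj(int requested)
{
	if (requested < OOM_ADJUST_MIN)
		return OOM_ADJUST_MIN;
	if (requested > OOM_ADJUST_MAX)
		return OOM_ADJUST_MAX;
	return requested;
}

/*
 * Hypothetical helper: map a preference of 0..100 percent linearly onto
 * [OOM_ADJUST_MIN, OOM_ADJUST_MAX], so the caller never names a literal.
 */
static int scale_oom_adj(int percent)
{
	if (percent < 0)
		percent = 0;
	if (percent > 100)
		percent = 100;
	return OOM_ADJUST_MIN +
	       (OOM_ADJUST_MAX - OOM_ADJUST_MIN) * percent / 100;
}
```

An application written this way follows the header when the range changes
and is rebuilt, whereas one that writes a literal '15' does not.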

> > +/*
> > + * Tasks that fork a very large number of children with separate address spaces
> > + * may be the result of a bug, user error, or a malicious application.  The oom
> > + * killer assesses a penalty equaling
> 
> It could also be the result of the system getting many client
> connections - think of overloaded mail, web or database servers.
> 

True, that's a great example of why child tasks should be sacrificed for 
the parent: if the oom killer is being called then we are truly overloaded 
and there's no shame in killing excessive client connections to recover, 
otherwise we might find the entire server becoming unresponsive.  The user
can easily tune /proc/sys/vm/oom_forkbomb_thres to define what
"excessive" is when assessing the penalty, if any.  I'll add that to the
comment if we require a second revision.

Thanks for your speedy review of this patchset so far, Rik!


