From: David Rientjes <rientjes@google.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Nick Piggin <npiggin@suse.de>, Oleg Nesterov <oleg@redhat.com>,
Balbir Singh <balbir@in.ibm.com>,
linux-mm@kvack.org
Subject: Re: [patch -mm 1/2] oom: badness heuristic rewrite
Date: Tue, 3 Aug 2010 00:23:32 -0700 (PDT)
Message-ID: <alpine.DEB.2.00.1008030016590.20849@chino.kir.corp.google.com>
In-Reply-To: <20100803133255.deb5c208.kamezawa.hiroyu@jp.fujitsu.com>
On Tue, 3 Aug 2010, KAMEZAWA Hiroyuki wrote:
> In old behavior, oom_score order is synchronous both in the system and
> container. High-score one will be killed.
> IOW, oom_score have worked as oom_score.
>
This isn't necessarily true, as I've already pointed out: the task with
the highest score exported by /proc/pid/oom_score is not killed if it
isn't a candidate task; it may be in a disjoint memcg, for example. The
highest-scoring _candidate_ task is killed, and that's unchanged with my
rewrite.
The current /proc/pid/oom_score is also not synchronous between the system
and container at least in the cpuset case since we currently divide a
task's score by 8 if it doesn't intersect current's mems_allowed, so
that's not true either.
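
To illustrate the cpuset point, here is a minimal sketch of how that
division by 8 can reorder two tasks. It only models the penalty described
above; the function name and the raw scores are made up for illustration
and are not taken from the kernel source.

```python
def legacy_cpuset_adjusted(points, intersects_mems_allowed):
    """Model the pre-rewrite cpuset penalty: a task whose memory nodes do
    not intersect current's mems_allowed has its score divided by 8.
    (Hypothetical helper; integer division is assumed.)"""
    return points if intersects_mems_allowed else points // 8


# Hypothetical raw scores: task A is larger but sits on disjoint nodes.
raw = {"A": (800, False), "B": (200, True)}

# Ranking by raw score puts A first; ranking with the penalty puts B first.
by_raw = sorted(raw, key=lambda t: raw[t][0], reverse=True)
by_adjusted = sorted(raw, key=lambda t: legacy_cpuset_adjusted(*raw[t]),
                     reverse=True)
```

With these numbers, A drops from 800 to 100 and falls behind B, so the
same two tasks rank differently depending on the reader's cpuset.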
> But, after the patch, the user (of LXC et al.) can't trust oom_score.
Yes, they can, but they need to know the context in which the oom occurs.
/proc/pid/oom_score cannot export multiple values although its kill
ranking actually depends on whether it's a system oom, memcg oom, cpuset
oom, etc. It needs to export a single value as a function of the
heuristic. The user must then take those values at the time of
collection and find how the various tasks rank relative to one another
depending on MPOL_BIND, cpuset hierarchy, etc. That's actually not that
difficult: admins who don't use any cgroups typically only have
system-wide ooms, where oom_score is always accurate, and admins who use
cpusets, memcg, or mempolicies on large NUMA systems already know the set
of tasks attached to them and want to prioritize the kill list
specifically for those entities.
> Especially with memcg, it just shows a _broken_ value.
>
Not at all, the user knows what tasks are attached to the memcg and can
easily determine which task is going to be killed when it ooms: simply
iterate through the memcg tasklist, check /proc/pid/oom_score, and sort.
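
A minimal sketch of that iterate, check, and sort procedure follows. It
assumes the memcg exposes its members through a file with one id per line
(the location depends on where the cgroup hierarchy is mounted, so the
path is passed in); rank_by_oom_score, tasks_file, and proc_root are
hypothetical names introduced here. The result is only a snapshot: scores
can change before an oom actually occurs.

```python
import os


def rank_by_oom_score(tasks_file, proc_root="/proc"):
    """Return (pid, oom_score) pairs for the tasks listed in tasks_file,
    sorted so the highest score (the likeliest victim) comes first."""
    with open(tasks_file) as f:
        pids = [line.strip() for line in f if line.strip()]

    ranked = []
    for pid in pids:
        try:
            with open(os.path.join(proc_root, pid, "oom_score")) as f:
                score = int(f.read().strip())
        except (OSError, ValueError):
            # The task exited between listing and reading, or the file
            # was unreadable; skip it rather than fail the whole scan.
            continue
        ranked.append((int(pid), score))

    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

Calling rank_by_oom_score() with the path to the memcg's task list returns
the attached tasks in descending oom_score order; the first entry is the
one this snapshot predicts would be killed on a memcg oom.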