From: David Rientjes <rientjes@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: andrea@suse.de, clameter@sgi.com, linux-mm@kvack.org
Subject: Re: [patch -mm 5/5] oom: add sysctl to dump tasks memory state
Date: Wed, 26 Sep 2007 13:46:49 -0700 (PDT)
Message-ID: <alpine.DEB.0.9999.0709261337080.23401@chino.kir.corp.google.com>
In-Reply-To: <20070926130616.f16446fd.akpm@linux-foundation.org>

On Wed, 26 Sep 2007, Andrew Morton wrote:
> > Adds a new sysctl, 'oom_dump_tasks', that dumps a list of all system tasks
> > (excluding kernel threads) and their pid, uid, tgid, vm size, rss, cpu,
> > oom_adj score, and name.
> >
> > Helpful for determining why an OOM condition occurred and what rogue task
> > caused it.
> >
> > It is configurable so that large systems, such as those with several
> > thousand tasks, do not incur a performance penalty associated with data
> > they may not desire.
> >
> > There currently do not appear to be any other generic kernel callers that
> > dump all this information. Perhaps in the future it will be worthwhile
> > to construct a generic task dump interface based on passing a set of
> > flags that specify what per-task information shall be shown.
>
> It isn't obvious to me why this has "oom" in its name. It is just a
> general display-stuff-about-task-memory handler, isn't it?
>
Yes. When other subsystems have been converted to using it, probably with
a callback filter function and flags to specify what traits to show for
each task, it will be feasible to move it out of the OOM killer. Until
that happens, however, it can remain static and in oom_kill.c.

There are several places in the kernel where a tasklist is dumped, but the
information they dump is very different. Any generic tasklist dumping
interface will become complex just based on the number of possible traits
to display.

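To make the "flags plus callback filter" idea concrete, here is a minimal
userspace sketch in Python. It is not the patch and not kernel code: the
names Trait, Task, and dump_tasks are hypothetical, and the fields are just
the traits listed in the patch description. It only illustrates how a set of
flags could select the per-task columns while a caller-supplied filter picks
the tasks.

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import Callable, Iterable, List, Optional


class Trait(Flag):
    """Hypothetical flags naming the per-task fields a caller wants shown."""
    PID = auto()
    UID = auto()
    TGID = auto()
    VM_SIZE = auto()
    RSS = auto()
    CPU = auto()
    OOM_ADJ = auto()
    NAME = auto()


@dataclass
class Task:
    """Stand-in for a task; attribute names match the lowercase flag names."""
    pid: int
    uid: int
    tgid: int
    vm_size: int
    rss: int
    cpu: int
    oom_adj: int
    name: str
    kernel_thread: bool = False


def dump_tasks(
    tasks: Iterable[Task],
    traits: Trait,
    task_filter: Optional[Callable[[Task], bool]] = None,
) -> List[str]:
    """Return one line per accepted task, showing only the requested traits."""
    # Keep the flag definition order so columns are stable across callers.
    wanted = [t for t in Trait if t in traits]
    lines = []
    for task in tasks:
        # The callback decides which tasks appear (e.g. skip kernel threads).
        if task_filter is not None and not task_filter(task):
            continue
        fields = [f"{t.name.lower()}={getattr(task, t.name.lower())}" for t in wanted]
        lines.append(" ".join(fields))
    return lines
```

A caller wanting only pid, rss, and name for user tasks would pass
`Trait.PID | Trait.RSS | Trait.NAME` and a filter such as
`lambda t: not t.kernel_thread`. Every additional trait grows both the flag
set and the formatting logic, which is the complexity mentioned above.
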
> Nor is it obvious why we need it at all. This sort of information can
> already be gathered from /proc/pid/whatever. If the system is all wedged
> and you can't get console control then this info dump doesn't provide you
> with info which you're interested in anyway - you want to see the global
> (or per-cgroup) info, not the per-task info.
>
It can be gathered by other means, yes, but not at the time of OOM nor
immediately before a task is killed. This tasklist dump is done very
close to the OOM kill and it represents the per-task memory state, whether
system or cgroup, that triggered that event. This could be done other
ways, for instance with an OOM userspace notifier, but that would delay
the SIGKILL being sent. So in the interest of a fast OOM killer, it's
best to dump the information ourselves, if the user chose to enable that
functionality.

The information should be displayed in a per-task manner because the
global memory state doesn't really matter: we know we're OOM, because
we're in the OOM killer. Showing how little free memory we have isn't
immediately helpful on a system-wide basis. But oom_dump_tasks, the way
I've written it, allows you to identify the "rogue" task that is using way
more memory than expected. It also allows you to alter oom_adj scores when
the task you've identified, the one you want dead, isn't the one that ends
up being killed.
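
As a rough illustration of that workflow, the Python sketch below picks the
task with the largest rss out of a set of already-parsed dump entries and
names the per-process oom_adj file an administrator could then adjust. The
DumpEntry type and both helper names are hypothetical, the entries in the
usage note are made-up sample values rather than real kernel output, and
parsing the actual dump format is deliberately left out.

```python
from typing import NamedTuple, Sequence


class DumpEntry(NamedTuple):
    """Hypothetical parsed record for one task from the dump."""
    pid: int
    rss: int
    oom_adj: int
    name: str


def largest_rss(entries: Sequence[DumpEntry]) -> DumpEntry:
    """Return the entry with the highest rss, i.e. the likely rogue task."""
    if not entries:
        raise ValueError("no dump entries to inspect")
    return max(entries, key=lambda e: e.rss)


def oom_adj_path(pid: int) -> str:
    """Path of the per-process oom_adj file for the given pid."""
    return f"/proc/{pid}/oom_adj"
```

For example, given the invented entries `DumpEntry(101, 1200, 0, "sshd")` and
`DumpEntry(4242, 900000, 0, "leaky")`, `largest_rss` returns the "leaky"
entry, and `oom_adj_path(4242)` yields `/proc/4242/oom_adj`, the file whose
value could be raised so that this task is preferred the next time the OOM
killer runs.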