From: David Rientjes <rientjes@google.com>
To: Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>,
Pekka Enberg <penberg@cs.helsinki.fi>,
Minchan Kim <minchan.kim@gmail.com>, Greg KH <greg@kroah.com>,
Linux Driver Project <devel@driverdev.osuosl.org>,
linux-mm <linux-mm@kvack.org>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: [patch -rc] oom: always return a badness score of non-zero for eligible tasks
Date: Thu, 9 Sep 2010 12:07:43 -0700 (PDT)
Message-ID: <alpine.DEB.2.00.1009091152090.5556@chino.kir.corp.google.com>
In-Reply-To: <1284053081.7586.7910.camel@nimitz>
On Thu, 9 Sep 2010, Dave Hansen wrote:
> Hi Nitin,
>
> I've been playing with using zram (from -staging) to back some qemu
> guest memory directly. Basically mmap()'ing the device in instead of
> using anonymous memory. The old code with the backing swap devices
> seemed to work pretty well, but I'm running into a problem with the new
> code.
>
> I have plenty of swap on the system, and I'd been running with compcache
> nicely for a while. But, I went to go tar up (and gzip) a pretty large
> directory in my qemu guest. It panic'd the qemu host system:
>
> [703826.003126] Kernel panic - not syncing: Out of memory and no killable processes...
> [703826.003127]
> [703826.012350] Pid: 25508, comm: cat Not tainted 2.6.36-rc3-00114-g9b9913d #29
I'm curious why there are no killable processes on the system; it seems
like the triggering task here, cat, would at least be killable itself.
Could you post the tasklist dump that precedes this (or, if you've
disabled it, try echo 1 > /proc/sys/vm/oom_dump_tasks first)?
It's possible that, if you have enough swap, none of the eligible tasks
actually has a non-zero badness score, either because they are being run as
root or because the amount of RAM or swap is sufficiently high that
(task's rss + swap) / (total rss + swap) truncates to zero on the score
scale. And, since root tasks have a 3% bonus, it's possible these are all
root tasks and no single task uses more than 3% of rss and swap.
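To make the arithmetic concrete, here is a minimal userspace sketch of the
scoring as described above. It is not the kernel's oom_badness(): the function
name model_badness(), its parameters, and the page counts used below are
assumptions for illustration only.

```c
/*
 * Hypothetical userspace model of the pre-patch scoring described in this
 * mail; not the kernel implementation. All memory quantities are in pages.
 */
static int model_badness(unsigned long rss, unsigned long swap,
			 unsigned long totalpages, int is_admin,
			 int oom_score_adj)
{
	/* Share of RAM + swap in units of 0.1%; integer division truncates. */
	int points = (int)((rss + swap) * 1000 / totalpages);

	if (is_admin)
		points -= 30;	/* the 3% bonus for admin (root) tasks */

	points += oom_score_adj;

	if (points < 0)
		return 0;
	return (points < 1000) ? points : 1000;
}
```

With an assumed capacity of 4000000 pages, model_badness(2000, 0, 4000000, 0, 0)
is 0 because 0.05% truncates to zero, and model_badness(100000, 0, 4000000, 1, 0)
is also 0 because 25 - 30 is negative and gets clamped.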
While this may not be the issue in your case, and can be confirmed with
the tasklist dump if you can get it, we need to protect against these
situations where eligible tasks may not be killed.
Andrew, I'd like to propose this patch for the 2.6.36-rc series since,
without it, the worst case is that the machine will panic if there is an
exceptionally large number of tasks, each with little memory usage at the
time of oom.
oom: always return a badness score of non-zero for eligible tasks
A task's badness score is roughly a proportion of its rss and swap
compared to the system's capacity. The scale ranges from 0 to 1000 with
the highest score chosen for kill. Thus, this scale operates on a
resolution of 0.1% of RAM + swap. Admin tasks are also given a 3% bonus,
so the badness score of an admin task using 3% of memory, for example,
would still be 0.
It's possible that an exceptionally large number of tasks will combine to
exhaust all resources but never have a single task that uses more than
0.1% of RAM and swap (or 3.0% for admin tasks).
This patch ensures that the badness score of any eligible task is never 0
so the machine doesn't unnecessarily panic because it cannot find a task
to kill.
Signed-off-by: David Rientjes <rientjes@google.com>
---
mm/oom_kill.c | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -208,8 +208,13 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem,
 	 */
 	points += p->signal->oom_score_adj;
-	if (points < 0)
-		return 0;
+	/*
+	 * Never return 0 for an eligible task that may be killed since it's
+	 * possible that no single user task uses more than 0.1% of memory and
+	 * no single admin task uses more than 3.0%.
+	 */
+	if (points <= 0)
+		return 1;
 	return (points < 1000) ? points : 1000;
 }
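The effect of the change can be sketched in userspace as follows. This is a
simplified model under stated assumptions, not the kernel's selection code:
clamp_old(), clamp_new(), and select_victim() are hypothetical names, and the
model assumes a victim is chosen only when its score is strictly greater than
zero, with no victim leading to the panic described above.

```c
#include <stddef.h>

/* Hypothetical model of the final check before the patch. */
static unsigned int clamp_old(int points)
{
	if (points < 0)
		return 0;
	return (points < 1000) ? points : 1000;
}

/* Hypothetical model of the final check after the patch. */
static unsigned int clamp_new(int points)
{
	if (points <= 0)
		return 1;
	return (points < 1000) ? points : 1000;
}

/*
 * Simplified victim selection (an assumption of this sketch): return the
 * index of the highest-scoring task, or -1 if no task scores above zero,
 * in which case the caller would have nothing to kill.
 */
static int select_victim(const int *raw, size_t n,
			 unsigned int (*clamp)(int))
{
	unsigned int best = 0;
	int chosen = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int score = clamp(raw[i]);

		if (score > best) {
			best = score;
			chosen = (int)i;
		}
	}
	return chosen;
}
```

For raw scores of 0, -5, and 0, select_victim() with clamp_old finds no victim
and returns -1, while with clamp_new every task scores 1 and the first one is
returned.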