From: David Rientjes <rientjes@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	linux-mm@kvack.org
Subject: Re: [patch] memcg: add oom killer delay
Date: Wed, 23 Feb 2011 16:51:24 -0800 (PST)
Message-ID: <alpine.DEB.2.00.1102231636260.21906@chino.kir.corp.google.com>
In-Reply-To: <20110223150850.8b52f244.akpm@linux-foundation.org>

On Wed, 23 Feb 2011, Andrew Morton wrote:

> Your patch still stinks!
> 
> If userspace can't handle a disabled oom-killer then userspace
> shouldn't have disabled the oom-killer.
> 

I agree, but userspace may not always be perfect, especially at large
scale.  We, in kernel land, can easily choose to ignore that, but it's only
a problem because we're providing an interface where the memcg will
livelock without userspace intervention.  The global oom killer doesn't
have this problem, and for years it has even radically panicked the machine
instead of livelocking EVEN THOUGH other threads, those that are
OOM_DISABLE, may be getting work done.

This is a memcg-specific issue because memory.oom_control has opened up
the possibility of a livelock that userspace may have no way of correcting
on its own, especially when it may be oom itself.  The natural conclusion
is that you should never set memory.oom_control unless you can guarantee a
perfect userspace implementation that will never be unresponsive.  At our
scale, we can't make that guarantee, so memory.oom_control is not helpful
at all.

If that's the case, then what do we have at our disposal besides
memory.oom_delay_millisecs that allows us to increase a hard limit or kill
a job of lower priority, other than setting memory thresholds and hoping
userspace will schedule and respond before the memcg is completely oom?

> How do we fix this properly?
> 
> A little birdie tells me that the offending userspace oom handler is
> running in a separate memcg and is not itself running out of memory. 

It depends on how you configure your memory controllers, but even if it is
running in a separate memcg, how can you conclude that it isn't oom at the
same time?

> The problem is that the userspace oom handler is also taking peeks into
> processes which are in the stressed memcg and is getting stuck on
> mmap_sem in the procfs reads.  Correct?
> 

That's outside the scope of this feature and is a separate discussion; 
this patch specifically addresses an issue where a userspace job scheduler 
wants to take action when a memcg is oom before deferring to the kernel 
and happens to become unresponsive for whatever reason.
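
As a rough illustration of that scenario, here is a toy model in Python
(not the kernel patch itself).  The function name resolve_memcg_oom and
the "userspace"/"kernel" return values are invented for this sketch; the
only thing taken from the thread is the idea that a userspace handler gets
memory.oom_delay_millisecs to act before the kernel oom killer takes over.

```python
import threading


def resolve_memcg_oom(userspace_handler, oom_delay_millisecs):
    """Toy model: give a userspace handler a bounded window to resolve an oom.

    Returns "userspace" if the handler finished within the delay, otherwise
    "kernel" to indicate that the decision is deferred to the kernel.
    """
    done = threading.Event()

    def run():
        # Hypothetical handler: e.g. raise a hard limit or kill a
        # lower-priority job.
        userspace_handler()
        done.set()

    # Daemon thread, so a handler that never returns cannot keep the
    # process alive after the caller has moved on.
    threading.Thread(target=run, daemon=True).start()

    # Event.wait() takes seconds and returns True only if the handler
    # completed before the timeout expired.
    if done.wait(oom_delay_millisecs / 1000.0):
        return "userspace"
    return "kernel"


# A responsive handler resolves the condition itself.
print(resolve_memcg_oom(lambda: None, 500))

# An unresponsive handler (blocked forever on an event that is never set)
# is bypassed once the delay expires.
print(resolve_memcg_oom(threading.Event().wait, 50))
```

With the oom killer simply disabled and no delay, the second case has no
fallback path at all, which is the livelock being described.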

> It seems to me that such a userspace oom handler is correctly designed,
> and that we should be looking into the reasons why it is unreliable,
> and fixing them.  Please tell us about this?
> 

The problem isn't specific to any one cause or implementation: we know
that userspace programs have bugs, they can stall forever in D-state, they
can be oom themselves, they can get stuck waiting on a lock, and so on.
