linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Rientjes <rientjes@google.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	minchan.kim@gmail.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH v3] oom-kill: add lowmem usage aware oom kill handling
Date: Wed, 27 Jan 2010 15:40:41 -0800 (PST)	[thread overview]
Message-ID: <alpine.DEB.2.00.1001271511120.4663@chino.kir.corp.google.com> (raw)
In-Reply-To: <20100125151503.49060e74.kamezawa.hiroyu@jp.fujitsu.com>

On Mon, 25 Jan 2010, KAMEZAWA Hiroyuki wrote:

> Default oom-killer uses badness calculation based on process's vm_size
> and some amounts of heuristics. Some users see proc->oom_score and
> proc->oom_adj to control oom-killed tendency under their server.
> 
> Now, we know oom-killer don't work ideally in some situaion, in PCs. Some
> enhancements are demanded. But such enhancements for oom-killer makes
> incomaptibility to oom-controls in enterprise world. So, this patch
> adds sysctl for extensions for oom-killer. Main purpose is for
> making a chance for wider test for new scheme.
> 

That's insufficient for inclusion in mainline, we don't add new sysctls so 
that new heuristics can be tried out.  It's fine to propose a new sysctl 
to define how the oom killer behaves, but the main purpose would not be 
for testing; rather, it would be to enable options that users within the 
minority would want to use.

I disagree that we should be doing this as a bitmask that defines certain 
oom killer options; we already have three seperate sysctls which also 
enable options: panic_on_oom, oom_kill_allocating_task, and 
oom_dump_tasks.  Either these existing sysctls need to be converted to the 
bitmask, breaking the long-standing legacy support, or you simply need to 
clutter procfs a little more.  I'm slightly biased toward the latter since 
it doesn't require any userspace change and tunables such as panic_on_oom 
have been around for a long time.

 [ Note: it may be possible to consolidate two of these existing sysctls
   down into one: oom_dump_tasks can be enabled by default if the tasklist
   is sufficiently short and the only use-case for oom_kill_allocating_task
   is for machines with enormously long tasklists to prevent unnecessary
   delays in selecting a bad process to kill.  Thus, we could probably
   consolidate these into one sysctl: oom_kill_quick, which would disable
   the tasklist dump and always kill current when invoked. ]

> One cause of OOM-Killer is memory shortage in lower zones.
> (If memory is enough, lowmem_reserve_ratio works well. but..)

I don't understand the reference to lowmem_reserve_ratio here, it may 
reserve lowmem from ~GFP_DMA requests but it does nothing to prevent oom 
conditions from excessive DMA page allocations.

> I saw lowmem-oom frequently on x86-32 and sometimes on ia64 in
> my cusotmer support jobs. If we just see process's vm_size at oom,
> we can never kill a process which has lowmem.

That's not always true, it may end up killing a large consumer of DMA 
memory by chance simply because the heuristics work out that way.  In 
other words, we can't say it will "never" work correctly as it is 
currently implemented.  I agree we can make it smarter, however.

> At last, there will be an oom-serial-killer.
> 

Heh.

> Now, we have per-mm lowmem usage counter. We can make use of it
> to select a good victim.
> 
> This patch does
>   - add sysctl for new bahavior.
>   - add CONSTRAINT_LOWMEM to oom's constraint type.
>   - pass constraint to __badness()

You mean badness()?  Passing the constraint works well for my 
CONSTRAINT_MEMPOLICY patch as well.

>   - change calculation based on constraint. If CONSTRAINT_LOWMEM,
>     use low_rss instead of vmsize.
> 

Nack, we can't simply use the lowmem rss as a baseline because 
/proc/pid/oom_adj, the single most powerful heuristic in badness(), is not 
defined for these dual scenarios.  There may only be a single baseline to 
define for oom_adj, otherwise it will have erradic results depending on 
the context in which the oom killer is called.  It can be used to polarize 
the heuristic depending on the total VM size which may be disadvantageous 
when using lowmem rss as the baseline.

I think the best alternative would be to strongly penalize the badness() 
points for tasks that do not have a lowmem rss when we are constrained by 
CONSTRAINT_LOWMEM, similar to how we penalize tasks not sharing current's 
mems_allowed since it (usually) doesn't help.  We do not necessarily 
always want to kill the task that is consuming the most lowmem for a 
single page allocation; we need to decide how valuable lowmem is in 
relation to overall VM size, however.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2010-01-27 23:40 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-21  5:59 [PATCH] " KAMEZAWA Hiroyuki
2010-01-21 15:18 ` Minchan Kim
2010-01-21 23:48   ` KAMEZAWA Hiroyuki
2010-01-22  0:40     ` Minchan Kim
2010-01-22  1:06       ` KAMEZAWA Hiroyuki
2010-01-21 15:29 ` Balbir Singh
2010-01-21 23:54   ` KAMEZAWA Hiroyuki
2010-01-22  6:23 ` [PATCH v2] " KAMEZAWA Hiroyuki
2010-01-22 14:00   ` Minchan Kim
2010-01-22 15:16     ` KAMEZAWA Hiroyuki
2010-01-22 15:41       ` Minchan Kim
2010-01-25  6:15   ` [PATCH v3] " KAMEZAWA Hiroyuki
2010-01-26 23:12     ` Andrew Morton
2010-01-26 23:53       ` KAMEZAWA Hiroyuki
2010-01-27  0:19         ` Andrew Morton
2010-01-27  0:58           ` KAMEZAWA Hiroyuki
2010-01-27  6:30             ` [PATCH v4 0/2] " KAMEZAWA Hiroyuki
2010-01-27  6:32               ` [PATCH v4 1/2] sysctl clean up vm related variable declarations KAMEZAWA Hiroyuki
2010-01-28  8:54                 ` David Rientjes
2010-01-28 10:30                   ` KAMEZAWA Hiroyuki
2010-01-27  6:33               ` [PATCH v4 2/2] oom-kill: add lowmem usage aware oom kill handling v4 KAMEZAWA Hiroyuki
2010-01-28  0:12               ` [PATCH v4 0/2] oom-kill: add lowmem usage aware oom kill handling KAMEZAWA Hiroyuki
2010-01-27 23:56             ` [PATCH v3] " David Rientjes
2010-01-28  0:16             ` Alan Cox
2010-01-28  0:26               ` KAMEZAWA Hiroyuki
2010-01-28  0:59               ` David Rientjes
2010-01-29  0:25               ` Vedran Furač
2010-01-29  0:35                 ` Alan Cox
2010-01-29  0:57                   ` Vedran Furač
2010-01-29 11:03                     ` Alan Cox
2010-01-30 12:33                       ` Vedran Furač
2010-01-30 12:59                         ` Alan Cox
2010-01-30 17:30                           ` Vedran Furač
2010-01-30 17:45                             ` Alan Cox
2010-01-30 18:17                               ` Vedran Furač
2010-01-27 23:46         ` David Rientjes
2010-01-26 23:16     ` Andrew Morton
2010-01-26 23:44       ` KAMEZAWA Hiroyuki
2010-01-27 23:40     ` David Rientjes [this message]
2010-01-29 16:11 KAMEZAWA Hiroyuki
2010-01-29 16:21 ` Alan Cox
2010-01-29 16:25   ` KAMEZAWA Hiroyuki
2010-01-29 16:30     ` Alan Cox
2010-01-29 16:41       ` KAMEZAWA Hiroyuki
2010-01-29 21:07         ` David Rientjes
2010-01-30 12:46           ` Vedran Furač
2010-01-30 22:53             ` David Rientjes
2010-01-31 20:29               ` Vedran Furač
2010-02-01 10:33                 ` David Rientjes
2010-02-01  0:01           ` KAMEZAWA Hiroyuki
2010-02-01 10:28             ` David Rientjes
2010-01-29 21:11     ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.00.1001271511120.4663@chino.kir.corp.google.com \
    --to=rientjes@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan.kim@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox