From: David Rientjes <rientjes@google.com>
To: leonid.moiseichuk@nokia.com
Cc: gregkh@suse.de, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
cesarb@cesarb.net, kamezawa.hiroyu@jp.fujitsu.com,
emunson@mgebm.net, penberg@kernel.org, aarcange@redhat.com,
riel@redhat.com, mel@csn.ul.ie, dima@android.com,
rebecca@android.com, san@google.com, akpm@linux-foundation.org,
vesa.jaaskelainen@nokia.com
Subject: RE: [PATCH 3.2.0-rc1 3/3] Used Memory Meter pseudo-device module
Date: Wed, 11 Jan 2012 13:44:42 -0800 (PST)
Message-ID: <alpine.DEB.2.00.1201111338320.21755@chino.kir.corp.google.com>
In-Reply-To: <84FF21A720B0874AA94B46D76DB98269045568A1@008-AM1MPN1-003.mgdnok.nokia.com>
On Wed, 11 Jan 2012, leonid.moiseichuk@nokia.com wrote:
> > So if the page allocator can make no progress in freeing memory, we would
> > introduce a delay in out_of_memory() if it were configured via a sysctl from
> > userspace. When this delay is started, applications waiting on this event can
> > be notified with eventfd(2) that the delay has started and they have
> > however many milliseconds to address the situation. When they rewrite the
> > sysctl, the delay is cleared. If they don't rewrite the sysctl and the delay
> > expires, the oom killer proceeds with killing.
> >
> > What's missing for your use case with this proposal?
>
> Timed delays for multi-process handling in the OOM case look to me like a
> fragile construction, because the delays are not predictable.
Not sure what you mean by predictable; the oom conditions themselves
certainly aren't predictable, otherwise you wouldn't need notification at
all. The delay is predictable since you configure it as a number of
milliseconds via a global sysctl. Userspace can either handle the oom
itself and rewrite that sysctl to reset the delay, or write 0 to make the
kernel invoke the oom killer immediately. If the delay expires, it is
assumed that userspace is dead and the kernel proceeds with oom killing to
avoid livelock.
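
To make the semantics above concrete, here is a small userspace simulation of
the proposed delay. It is only a sketch: the class name OomDelay, its methods,
and the returned strings are hypothetical and do not correspond to any existing
kernel interface, and "notify" merely stands in for the eventfd(2) wakeup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OomDelay:
    """Toy model of the proposed oom delay; hypothetical, not kernel code."""

    delay_ms: int                      # value of the (hypothetical) global sysctl
    deadline_ms: Optional[int] = None  # set only while a delay is running

    def enter_oom(self, now_ms: int) -> str:
        """Called when the page allocator can make no progress at now_ms."""
        if self.delay_ms == 0:
            return "kill"              # no delay configured: oom kill at once
        if self.deadline_ms is None:
            self.deadline_ms = now_ms + self.delay_ms
            return "notify"            # delay started: wake eventfd waiters
        if now_ms >= self.deadline_ms:
            self.deadline_ms = None
            return "kill"              # delay expired: userspace assumed dead
        return "wait"                  # delay still running: allocators stall

    def write_sysctl(self, value_ms: int) -> None:
        """Userspace rewrites the sysctl, which clears any running delay."""
        self.delay_ms = value_ms
        self.deadline_ms = None
```

With delay_ms=100, a first enter_oom(0) starts the delay, a later call before
the deadline keeps waiting, and a call at or after 100 ms falls through to the
kill. Calling write_sysctl(0) at any point makes the next enter_oom() kill
immediately.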
> Memcg supports [1] a better approach: freeze the whole group and kick a
> designated user-space application to handle it. We planned
> to use it as:
> - enlarge cgroup
> - send SIGTERM to selected "bad" application e.g. based on oom_score
> - wait a bit
> - send SIGKILL to "bad" application
> - reduce group size
>
> But in the end the default OOM killer works fine.
>
I think you're misunderstanding the proposal; in the case of a global oom
(that is, without memcg), all threads that are allocating memory would, by
definition, be frozen and incur the delay at the point where they would
currently call into the oom killer. If your userspace is alive, i.e. the
application responsible for managing oom killing, then it can wait on
eventfd(2), wake up, and then send SIGTERM and SIGKILL to the appropriate
threads based on priority.
So, again, why wouldn't this work for you?
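
The SIGTERM-then-SIGKILL step such a userspace handler would perform can be
sketched as follows. This is a minimal illustration, assuming the handler has
already been woken up and has already chosen its victim; the function name
terminate_then_kill and the grace_s parameter are hypothetical, and the wait on
eventfd(2) is left out because the notification interface is only a proposal.

```python
import signal
import subprocess


def terminate_then_kill(proc: subprocess.Popen, grace_s: float) -> int:
    """Send SIGTERM, wait up to grace_s seconds, then SIGKILL if still alive.

    Returns the exit status of the process; on POSIX a negative value -N
    means the process was terminated by signal N.
    """
    proc.send_signal(signal.SIGTERM)   # polite request to exit
    try:
        return proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        proc.kill()                    # SIGKILL on POSIX; cannot be ignored
        return proc.wait()
```

The grace period would have to be shorter than the configured oom delay,
otherwise the kernel would proceed with its own kill before the handler is
done.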