From: Roman Gushchin <guro@fb.com>
To: David Rientjes <rientjes@google.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@kernel.org,
hannes@cmpxchg.org, tj@kernel.org, gthelen@google.com
Subject: Re: cgroup-aware OOM killer, how to move forward
Date: Tue, 17 Jul 2018 10:38:45 -0700
Message-ID: <20180717173844.GB14909@castle.DHCP.thefacebook.com>
In-Reply-To: <alpine.DEB.2.21.1807162115180.157949@chino.kir.corp.google.com>
On Mon, Jul 16, 2018 at 09:19:18PM -0700, David Rientjes wrote:
> On Fri, 13 Jul 2018, Roman Gushchin wrote:
>
> > > > > All cgroup v2 files do not need to be boolean and the only way you can add
> > > > > a subtree oom kill is to introduce yet another file later. Please make it
> > > > > tristate so that you can define a mechanism of default (process only),
> > > > > local cgroup, or subtree, and so we can avoid adding another option later
> > > > > that conflicts with the proposed one. This should be easy.
> > > >
> > > > David, we're adding a cgroup v2 knob, and in cgroup v2 a memory cgroup
> > > > either has a sub-tree, either attached processes. So, there is no difference
> > > > between local cgroup and subtree.
> > > >
> > >
> > > Uhm, what? We're talking about a common ancestor reaching its limit, so
> > > it's oom, and it has multiple immediate children with their own processes
> > > attached. The difference is killing all processes attached to the
> > > victim's cgroup or all processes under the oom mem cgroup's subtree.
> > >
> >
> > But it's a binary decision, no?
> > If memory.group_oom set, the whole sub-tree will be killed. Otherwise not.
> >
>
> No, if memory.max is reached and memory.group_oom is set, my understanding
> of your proposal is that a process is chosen and all eligible processes
> attached to its mem cgroup are oom killed. My desire for a tristate is so
> that it can be specified that all processes attached to the *subtree* are
> oom killed. With single unified hierarchy mandated by cgroup v2, we can
> separate descendant cgroups for use with other controllers and enforce
> memory.max by an ancestor.
>
> Making this a boolean value is only preventing it from becoming
> extensible. If memory.group_oom only is effective for the victim's mem
> cgroup, it becomes impossible to specify that all processes in the subtree
> should be oom killed as a result of the ancestor limit without adding yet
> another tunable.
Let me illustrate my proposal with examples. Let's say we have the following hierarchy,
and the biggest process (or the process with the highest oom_score_adj) is in D.
      /
      |
      A
      |
      B
     / \
    C   D
Let's look at different examples and intended behavior:
1) system-wide OOM
  - default settings: the biggest process is killed
  - D/memory.group_oom=1: all processes in D are killed
  - A/memory.group_oom=1: all processes in A are killed
2) memcg oom in B
  - default settings: the biggest process is killed
  - A/memory.group_oom=1: the biggest process is killed
  - B/memory.group_oom=1: all processes in B are killed
  - D/memory.group_oom=1: all processes in D are killed
Please note that processes can't be attached directly to A and B,
so "all processes in A are killed" means all processes in the sub-tree
are killed. Immortal processes (oom_score_adj=-1000) are excluded.
I believe this model is complete and doesn't require any further
extension.
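The cases listed above can be sketched as a small model. This is only an
illustration under stated assumptions, not the kernel implementation: the
names Proc, Cgroup, oom_kill and build are hypothetical, "size" stands in
for the real badness score, and when more than one cgroup on the path has
memory.group_oom set the sketch assumes the topmost one inside the OOM
domain wins (the examples above only set one knob at a time).

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

OOM_SCORE_ADJ_MIN = -1000  # "immortal" processes are never killed


@dataclass(eq=False)
class Proc:
    name: str
    size: int                # assumption: stand-in for the badness score
    oom_score_adj: int = 0


@dataclass(eq=False)
class Cgroup:
    name: str
    parent: Optional["Cgroup"] = None
    group_oom: bool = False  # models the proposed memory.group_oom knob
    procs: List[Proc] = field(default_factory=list)
    children: List["Cgroup"] = field(default_factory=list)

    def add_child(self, name: str) -> "Cgroup":
        child = Cgroup(name, parent=self)
        self.children.append(child)
        return child


def walk(cg: Cgroup) -> Iterator[Tuple[Cgroup, Proc]]:
    """Yield (cgroup, process) pairs for the whole sub-tree of cg."""
    for p in cg.procs:
        yield cg, p
    for child in cg.children:
        yield from walk(child)


def killable(p: Proc) -> bool:
    return p.oom_score_adj != OOM_SCORE_ADJ_MIN


def oom_kill(oom_domain: Cgroup) -> List[str]:
    """Return the sorted names of processes killed for an OOM in oom_domain.

    Pass the root cgroup for a system-wide OOM, or a memcg for a memcg OOM.
    """
    candidates = [(cg, p) for cg, p in walk(oom_domain) if killable(p)]
    if not candidates:
        return []
    victim_cg, victim = max(candidates, key=lambda pair: pair[1].size)

    # Walk from the victim's cgroup up to the OOM domain and remember the
    # topmost cgroup with group_oom set (assumption, see the text above).
    # Cgroups above the OOM domain are never considered.
    group = None
    cg = victim_cg
    while cg is not None:
        if cg.group_oom:
            group = cg
        if cg is oom_domain:
            break
        cg = cg.parent

    if group is None:
        return [victim.name]  # default: only the biggest process
    return sorted(p.name for _, p in walk(group) if killable(p))


def build() -> dict:
    """Build the / -> A -> B -> {C, D} hierarchy from the example."""
    root = Cgroup("/")
    a = root.add_child("A")
    b = a.add_child("B")
    c = b.add_child("C")
    d = b.add_child("D")
    c.procs = [Proc("c1", 10), Proc("c2", 20)]
    d.procs = [Proc("d1", 50), Proc("d2", 5),
               Proc("d_immortal", 99, OOM_SCORE_ADJ_MIN)]
    return {"/": root, "A": a, "B": b, "C": c, "D": d}
```

In this model, setting group_oom on A makes a system-wide OOM kill every
killable process under A, while an OOM in B still kills only the biggest
process, because A lies outside B's OOM domain.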
Thanks!