From: Roman Gushchin <guro@fb.com>
To: David Rientjes <rientjes@google.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@kernel.org,
	hannes@cmpxchg.org, tj@kernel.org, gthelen@google.com
Subject: Re: cgroup-aware OOM killer, how to move forward
Date: Tue, 17 Jul 2018 10:38:45 -0700
Message-ID: <20180717173844.GB14909@castle.DHCP.thefacebook.com>
In-Reply-To: <alpine.DEB.2.21.1807162115180.157949@chino.kir.corp.google.com>

On Mon, Jul 16, 2018 at 09:19:18PM -0700, David Rientjes wrote:
> On Fri, 13 Jul 2018, Roman Gushchin wrote:
> 
> > > > > All cgroup v2 files do not need to be boolean and the only way you can add 
> > > > > a subtree oom kill is to introduce yet another file later.  Please make it 
> > > > > tristate so that you can define a mechanism of default (process only), 
> > > > > local cgroup, or subtree, and so we can avoid adding another option later 
> > > > > that conflicts with the proposed one.  This should be easy.
> > > > 
> > > > David, we're adding a cgroup v2 knob, and in cgroup v2 a memory cgroup
> > > > either has a sub-tree, either attached processes. So, there is no difference
> > > > between local cgroup and subtree.
> > > > 
> > > 
> > > Uhm, what?  We're talking about a common ancestor reaching its limit, so 
> > > it's oom, and it has multiple immediate children with their own processes 
> > > attached.  The difference is killing all processes attached to the 
> > > victim's cgroup or all processes under the oom mem cgroup's subtree.
> > > 
> > 
> > But it's a binary decision, no?
> > If memory.group_oom set, the whole sub-tree will be killed. Otherwise not.
> > 
> 
> No, if memory.max is reached and memory.group_oom is set, my understanding 
> of your proposal is that a process is chosen and all eligible processes 
> attached to its mem cgroup are oom killed.  My desire for a tristate is so 
> that it can be specified that all processes attached to the *subtree* are 
> oom killed.  With single unified hierarchy mandated by cgroup v2, we can 
> separate descendant cgroups for use with other controllers and enforce 
> memory.max by an ancestor.
> 
> Making this a boolean value is only preventing it from becoming 
> extensible.  If memory.group_oom only is effective for the victim's mem 
> cgroup, it becomes impossible to specify that all processes in the subtree 
> should be oom killed as a result of the ancestor limit without adding yet 
> another tunable.

Let me illustrate my proposal with examples. Say we have the following
hierarchy, and the biggest process (or the process with the highest
oom_score_adj) is in D.

  /
  |
  A
  |
  B
 / \
C   D

Here are the different cases and the intended behavior in each:
1) system-wide OOM
  - default settings: the biggest process is killed
  - D/memory.group_oom=1: all processes in D are killed
  - A/memory.group_oom=1: all processes in A are killed
2) memcg OOM in B
  - default settings: the biggest process is killed
  - A/memory.group_oom=1: the biggest process is killed
  - B/memory.group_oom=1: all processes in B are killed
  - D/memory.group_oom=1: all processes in D are killed

Please note that processes can't be attached directly to A and B, because
they are non-leaf cgroups, so "all processes in A are killed" means all
processes in A's sub-tree are killed. Immortal processes
(oom_score_adj=-1000) are excluded.
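
The cases above can be sketched as a small Python model. This is only an
illustration, not kernel code: Cgroup, Process, kill_set and build_example
are hypothetical names, and the rule used when several cgroups on the path
have group_oom set (the one closest to the OOM domain wins) is an assumption
that the examples above do not cover.

```python
OOM_SCORE_ADJ_MIN = -1000  # "immortal" processes are never killed


class Process:
    def __init__(self, name, oom_score_adj=0):
        self.name = name
        self.oom_score_adj = oom_score_adj


class Cgroup:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.group_oom = False  # models memory.group_oom
        self.procs = []
        self.children = []

    def add_child(self, name):
        child = Cgroup(name, parent=self)
        self.children.append(child)
        return child

    def subtree_procs(self):
        """All processes attached to this cgroup or any descendant."""
        out = list(self.procs)
        for child in self.children:
            out.extend(child.subtree_procs())
        return out


def kill_set(victim, victim_cg, oom_domain):
    """Names of the processes killed when `victim` is selected.

    `oom_domain` is the cgroup that hit its limit, or the root cgroup
    for a system-wide OOM. Cgroups above the domain are ignored.
    """
    group = None
    cg = victim_cg
    while cg is not None:
        if cg.group_oom:
            group = cg  # assumption: the highest cgroup in the domain wins
        if cg is oom_domain:
            break
        cg = cg.parent
    if group is None:
        return [victim.name]  # default: only the selected process dies
    return [p.name for p in group.subtree_procs()
            if p.oom_score_adj != OOM_SCORE_ADJ_MIN]


def build_example():
    """Build the / -> A -> B -> {C, D} hierarchy from the diagram."""
    root = Cgroup("/")
    a = root.add_child("A")
    b = a.add_child("B")
    c = b.add_child("C")
    d = b.add_child("D")
    c.procs = [Process("c1")]
    d.procs = [Process("d_big"), Process("d_small"),
               Process("d_immortal", OOM_SCORE_ADJ_MIN)]
    return root, a, b, c, d
```

For example, with A.group_oom set, kill_set(d_big, D, root) covers the whole
sub-tree of A, while kill_set(d_big, D, B) returns only d_big, because A lies
outside the OOM domain.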

I believe this model is complete and doesn't require any further
extension.

Thanks!
