From: David Rientjes <rientjes@google.com>
To: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>, Michal Hocko <mhocko@kernel.org>,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	hannes@cmpxchg.org, gthelen@google.com
Subject: Re: cgroup-aware OOM killer, how to move forward
Date: Mon, 23 Jul 2018 16:06:43 -0700 (PDT)
Message-ID: <alpine.DEB.2.21.1807231555550.196032@chino.kir.corp.google.com>
In-Reply-To: <20180720204746.GA23478@castle.DHCP.thefacebook.com>

On Fri, 20 Jul 2018, Roman Gushchin wrote:

> > > > process chosen for oom kill.  I know that you care about the latter.  My 
> > > > *only* suggestion was for the tunable to take a string instead of a 
> > > > boolean so it is extensible for future use.  This seems like something so 
> > > > trivial.
> > > 
> > > So, I'd much prefer it as boolean.  It's a fundamentally binary
> > > property, either handle the cgroup as a unit when chosen as oom victim
> > > or not, nothing more.
> > 
> > With the single hierarchy mandate of cgroup v2, the need arises to 
> > separate processes from a single job into subcontainers for use with 
> > controllers other than mem cgroup.  In that case, we have no functionality 
> > to oom kill all processes in the subtree.
> > 
> > A boolean can kill all processes attached to the victim's mem cgroup, but 
> > cannot kill all processes in a subtree if the limit of a common ancestor 
> > is reached.
> 
> Why so?
> 
> Once again my proposal:
> as soon as the OOM killer selected a victim task,
> we'll look at the victim task's memory cgroup.
> If memory.oom.group is not set, we're done.
> Otherwise let's traverse the memory cgroup tree up to
> the OOMing cgroup (or root) as long as memory.oom.group is set.
> Kill the last cgroup entirely (including all children).
> 
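
A minimal sketch of the quoted proposal, taking its wording at face value: start at the victim task's memory cgroup and walk toward the OOMing cgroup (or the root) for as long as memory.oom.group is set, then kill the last cgroup reached. The `Cgroup` class and `group_to_kill` function are hypothetical names for illustration only, not kernel code, and how far the walk should go is the very point discussed in the rest of this message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cgroup:
    """Toy model of a memory cgroup (hypothetical, for illustration)."""
    name: str
    parent: Optional["Cgroup"] = None
    oom_group: bool = False  # models memory.oom.group


def group_to_kill(victim: Cgroup,
                  oom_cgroup: Optional[Cgroup]) -> Optional[Cgroup]:
    """Return the cgroup to kill entirely, or None to kill only the task.

    oom_cgroup is the cgroup whose limit was hit; None models a
    system-wide OOM, in which case the walk may continue to the root.
    """
    last = None
    cg = victim
    while cg is not None and cg.oom_group:
        last = cg                # deepest ancestor seen so far with the flag
        if cg is oom_cgroup:     # do not walk above the OOMing cgroup
            break
        cg = cg.parent
    return last


# Example tree: /A with children /A/b and /A/c, flag set on all three.
root = Cgroup("/")
a = Cgroup("/A", root, True)
b = Cgroup("/A/b", a, True)
c = Cgroup("/A/c", a, True)
```

With /A as the OOMing cgroup and the victim task in /A/b, `group_to_kill(b, a)` returns the /A cgroup, so /A and all of its children would be killed.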

I know this is your proposal; I'm suggesting a context-based extension
that depends on which mem cgroup is oom: the common ancestor or the leaf.

Consider /A, /A/b, and /A/c, with memory.oom.group set to 1 for all of
them.  When /A, /A/b, or /A/c is oom, all processes attached to /A and
its subtree are oom killed per your semantics.  That occurs when any of
the three mem cgroups is oom.

I'm suggesting that it may become useful to kill an entire subtree when
the common ancestor, /A, is oom, but not when /A/b or /A/c is oom.  The
proposal has no way to specify this, and trees where the limits of
/A/b + /A/c exceed the limit of /A do exist.  If /A/b or /A/c reaches
its individual limit, we want all processes in that cgroup killed.  If
/A reaches its limit, we want all processes in /A's subtree killed.
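
One possible encoding of that context-based behavior, sketched under loudly stated assumptions: the tunable holds a string, and the values "none", "cgroup", and "tree" are hypothetical names invented for this example, not an existing or proposed kernel interface. The `Cgroup` class, `subtree_procs`, and `pids_to_kill` are likewise illustrative only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cgroup:
    """Toy model of a memory cgroup (hypothetical, for illustration)."""
    name: str
    mode: str = "none"  # hypothetical string value of the tunable
    procs: List[int] = field(default_factory=list)
    children: List["Cgroup"] = field(default_factory=list)


def subtree_procs(cg: Cgroup) -> List[int]:
    """Collect pids attached to cg and to every descendant of cg."""
    pids = list(cg.procs)
    for child in cg.children:
        pids.extend(subtree_procs(child))
    return pids


def pids_to_kill(victim_pid: int, victim_cg: Cgroup,
                 oom_cg: Cgroup) -> List[int]:
    """Pick the kill set based on which cgroup is OOM.

    "tree" on the OOMing cgroup kills its whole subtree; "cgroup" on
    the victim's cgroup kills only the processes attached to it;
    otherwise only the selected victim task is killed.
    """
    if oom_cg.mode == "tree":
        return sorted(subtree_procs(oom_cg))
    if victim_cg.mode == "cgroup":
        return sorted(victim_cg.procs)
    return [victim_pid]


# /A is the common ancestor; /A/b and /A/c hold the processes.
b = Cgroup("/A/b", "cgroup", [10, 11])
c = Cgroup("/A/c", "cgroup", [20])
a = Cgroup("/A", "tree", children=[b, c])
```

With this setup, an OOM in /A/b with victim pid 10 kills only pids 10 and 11, while an OOM in /A kills pids 10, 11, and 20. Because cgroup v2 does not allow processes in a cgroup that also has children under controller control, only one of the two non-default modes is ever meaningful for a given cgroup, which is why a single string-valued tunable is enough in this sketch.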

I am not asking for that support to be implemented immediately if you do
not have a need for it.  But I am asking that the interface be
extensible so that we can implement it later.  Given the
no-internal-process constraint of cgroup v2, defining this as two
separate tunables would always leave one effective and the other
irrelevant, so I suggest overloading a single tunable.

