linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Rientjes <rientjes@google.com>
To: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Tejun Heo <tj@kernel.org>,
	kernel-team@fb.com, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [patch -mm 0/4] mm, memcg: introduce oom policies
Date: Wed, 17 Jan 2018 14:31:57 -0800 (PST)	[thread overview]
Message-ID: <alpine.DEB.2.10.1801171420330.86895@chino.kir.corp.google.com> (raw)
In-Reply-To: <20180117114554.GA10523@castle.DHCP.thefacebook.com>

On Wed, 17 Jan 2018, Roman Gushchin wrote:

> You're introducing a new oom_policy knob, which has two separate sets
> of possible values for the root and non-root cgroups. I don't think
> it aligns with the existing cgroup v2 design.
> 

The root mem cgroup can use "none" or "cgroup" to either disable or 
enable, respectively, the cgroup aware oom killer.  These are available to 
non root mem cgroups as well, with the addition of hierarchical usage 
comparison to prevent the ability to completely evade the oom killer by 
the user by creating sub cgroups.  Just as we have files that are made 
available on the root cgroup and not others, I think it's fine to allow 
only certain policies on the root mem cgroup.  As I wrote to Tejun, I'm 
fine with the policy being separated from the mechanism.  That can be 
resolved in that email thread, but the overall point of this patchset is 
to allow hierarchical comparison on some subtrees and not on others.  We 
can avoid the mount option in the same manner.

> For the root cgroup it works exactly as mount option, and both "none"
> and "cgroup" values have no meaning outside of the root cgroup. We can
> discuss if a knob on root cgroup is better than a mount option, or not
> (I don't think so), but it has nothing to do with oom policy as you
> define it for non-root cgroups.
> 

It certainly does have value outside of the root cgroup: for system oom 
conditions it is possible to choose the largest process, largest single 
cgroup, or largest subtree to kill from.  For memcg oom conditions it's 
possible for a consumer in a subtree to define whether it wants the 
largest memory hogging process to be oom killed or the largest of its 
child sub cgroups.  This would be needed for a job scheduler in its own 
subtree, for example, that creates its own subtrees to limit jobs 
scheduled by individual users who have the power over their subtree but 
not their limit.  This is a real-world example.  Michal also provided his 
own example concerning top-level /admins and /students.  They want to use 
"cgroup" themselves and then /students/roman would be forced to "tree".

Keep in mind that this patchset alone makes the interface extensible and 
addresses one of my big three concerns, but the comparison of the root mem 
cgroup compared to other individual cgroups and cgroups maintaining a 
subtree needs to be fixed separately so that we don't completely evade all 
of these oom policies by using /proc/pid/oom_score_adj in the root mem 
cgroup.  That's a separate issue that needs to be addressed.  My last 
concern, about userspace influence, can probably be addressed after this 
is merged by yet another tunable since it's vital that important cgroups 
can be protected.  It does make sense for an important cgroup to be able 
to use 50% of memory without being oom killed, and that's impossible if 
cgroup v2 has been mounted with your mount option.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2018-01-17 22:32 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-17  2:14 David Rientjes
2018-01-17  2:15 ` [patch -mm 1/4] mm, memcg: introduce per-memcg oom policy tunable David Rientjes
2018-01-17  2:15 ` [patch -mm 2/4] mm, memcg: replace cgroup aware oom killer mount option with tunable David Rientjes
2018-01-17  2:15 ` [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable David Rientjes
2018-01-17 15:41   ` Tejun Heo
2018-01-17 16:00     ` Michal Hocko
2018-01-17 22:18       ` David Rientjes
2018-01-23 15:13         ` Michal Hocko
2018-01-17 22:14     ` David Rientjes
2018-01-19 20:53       ` David Rientjes
2018-01-20 12:32         ` Tejun Heo
2018-01-22 22:34           ` David Rientjes
2018-01-23 15:53             ` Michal Hocko
2018-01-23 22:22               ` David Rientjes
2018-01-24  8:20                 ` Michal Hocko
2018-01-24 21:44                   ` David Rientjes
2018-01-24 22:08                     ` Andrew Morton
2018-01-24 22:18                       ` Tejun Heo
2018-01-25  8:11                       ` Michal Hocko
2018-01-25  8:05                     ` Michal Hocko
2018-01-25 23:27                       ` David Rientjes
2018-01-26 10:07                         ` Michal Hocko
2018-01-26 22:33                           ` David Rientjes
2018-01-17  2:15 ` [patch -mm 4/4] mm, memcg: add hierarchical usage oom policy David Rientjes
2018-01-17 11:46 ` [patch -mm 0/4] mm, memcg: introduce oom policies Roman Gushchin
2018-01-17 22:31   ` David Rientjes [this message]
2018-01-25 23:53 ` [patch -mm v2 0/3] " David Rientjes
2018-01-25 23:53   ` [patch -mm v2 1/3] mm, memcg: introduce per-memcg oom policy tunable David Rientjes
2018-01-26 17:15     ` Michal Hocko
2018-01-29 22:38       ` David Rientjes
2018-01-30  8:50         ` Michal Hocko
2018-01-30 22:38           ` David Rientjes
2018-01-31  9:47             ` Michal Hocko
2018-02-01 10:11               ` David Rientjes
2018-01-25 23:53   ` [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable David Rientjes
2018-01-26  0:00     ` Andrew Morton
2018-01-26 22:20       ` David Rientjes
2018-01-26 22:39         ` Andrew Morton
2018-01-26 22:52           ` David Rientjes
2018-01-27  0:17             ` Andrew Morton
2018-01-29 10:46               ` Michal Hocko
2018-01-29 19:11                 ` Tejun Heo
2018-01-30  8:54                   ` Michal Hocko
2018-01-30 11:58                     ` Roman Gushchin
2018-01-30 12:08                       ` Michal Hocko
2018-01-30 12:13                         ` Roman Gushchin
2018-01-30 12:20                           ` Michal Hocko
2018-01-30 15:15                             ` Tejun Heo
2018-01-30 17:30                             ` Johannes Weiner
2018-01-30 19:39                             ` Andrew Morton
2018-01-29 22:16               ` David Rientjes
2018-01-25 23:53   ` [patch -mm v2 3/3] mm, memcg: add hierarchical usage oom policy David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.10.1801171420330.86895@chino.kir.corp.google.com \
    --to=rientjes@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=tj@kernel.org \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox