From: David Rientjes <rientjes@google.com>
To: Roman Gushchin <guro@fb.com>
Cc: linux-mm@kvack.org, Michal Hocko <mhocko@kernel.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
Tejun Heo <tj@kernel.org>,
kernel-team@fb.com, cgroups@vger.kernel.org,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [v6 2/4] mm, oom: cgroup-aware OOM killer
Date: Wed, 23 Aug 2017 16:19:11 -0700 (PDT)
Message-ID: <alpine.DEB.2.10.1708231614310.68096@chino.kir.corp.google.com>
In-Reply-To: <20170823165201.24086-3-guro@fb.com>
On Wed, 23 Aug 2017, Roman Gushchin wrote:
> Traditionally, the OOM killer operates at the process level.
> Under oom conditions, it finds the process with the highest oom score
> and kills it.
>
> This behavior doesn't suit systems with many running containers
> well:
>
> 1) There is no fairness between containers. A small container with
> a few large processes will be chosen over a large one with a huge
> number of small processes.
>
> 2) Containers often do not expect that some random process inside
> will be killed. In many cases a much safer behavior is to kill
> all tasks in the container. Traditionally, this was implemented
> in userspace, but doing it in the kernel has some advantages,
> especially in the case of a system-wide OOM.
>
> 3) Per-process oom_score_adj affects global OOM, so it's a breach
> of isolation.
>
> To address these issues, a cgroup-aware OOM killer is introduced.
>
> Under OOM conditions, it tries to find the biggest memory consumer,
> and free memory by killing the corresponding task(s). The difference
> from the "traditional" OOM killer is that it can treat memory cgroups
> as memory consumers as well as single processes.
>
> By default, it will look for the biggest leaf cgroup, and kill
> the largest task inside.
>
> But a user can change this behavior by enabling the per-cgroup
> oom_kill_all_tasks option. If set, it causes the OOM killer to treat
> the whole cgroup as an indivisible memory consumer. If it's
> selected as an OOM victim, all tasks belonging to it will be killed.
>
I'm very happy with the rest of the patchset, but I feel that I must renew
my objection to memory.oom_kill_all_tasks being able to override an admin's
decision to make a process oom disabled. From my perspective, setting
memory.oom_kill_all_tasks on a cgroup with an oom disabled process
attached, so that the process becomes killable, either (1) overrides the
CAP_SYS_RESOURCE oom disabled setting or (2) is lazy and doesn't modify
/proc/pid/oom_score_adj itself.
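
For reference, "oom disabled" here means that the task's
/proc/pid/oom_score_adj holds OOM_SCORE_ADJ_MIN (-1000). A minimal
userspace sketch of that check follows. task_is_oom_disabled() is a
hypothetical helper name, not a kernel or libc API, and the path is passed
in by the caller so that it can point at any file.

```c
#include <stdbool.h>
#include <stdio.h>

/* Same value as OOM_SCORE_ADJ_MIN in include/uapi/linux/oom.h. */
#define OOM_SCORE_ADJ_MIN (-1000)

/*
 * Hypothetical helper: return true if the file at `path`
 * (for example "/proc/1234/oom_score_adj") holds OOM_SCORE_ADJ_MIN.
 * Returns false if the file cannot be opened or parsed.
 */
static bool task_is_oom_disabled(const char *path)
{
	FILE *f = fopen(path, "r");
	bool disabled = false;
	int adj;

	if (!f)
		return false;
	if (fscanf(f, "%d", &adj) == 1)
		disabled = (adj == OOM_SCORE_ADJ_MIN);
	fclose(f);
	return disabled;
}
```

A caller would pass a path such as "/proc/self/oom_score_adj"; a task that
has exited simply reads as not oom disabled.
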
I'm not sure what is objectionable about allowing
memory.oom_kill_all_tasks to coexist with oom disabled processes. Just
kill everything else so that the oom disabled process can report the oom
condition after notification, restart the task, etc. If it's problematic,
then whoever declares that everything must be killed should also modify
/proc/pid/oom_score_adj of the oom disabled processes. If they don't have
permission to change that, then I think there's a much larger concern.
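
The coexistence suggested above could look roughly like the following
userspace model: a kill-all pass that skips any task whose oom_score_adj
is OOM_SCORE_ADJ_MIN. struct model_task and kill_all_except_oom_disabled()
are hypothetical names used for illustration only; this is not code from
the patchset.

```c
#include <stddef.h>

/* Same value as OOM_SCORE_ADJ_MIN in include/uapi/linux/oom.h. */
#define OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical userspace model of a task; not struct task_struct. */
struct model_task {
	int pid;
	int oom_score_adj;
	int killed;	/* set to 1 once the model "kills" the task */
};

/*
 * Model of a kill-all pass over the tasks of one cgroup that leaves
 * oom disabled tasks alone. Returns the number of tasks marked killed.
 */
static size_t kill_all_except_oom_disabled(struct model_task *tasks, size_t n)
{
	size_t killed = 0;

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].oom_score_adj == OOM_SCORE_ADJ_MIN)
			continue;	/* survivor can report the oom condition */
		tasks[i].killed = 1;
		killed++;
	}
	return killed;
}
```

In this model the surviving task is the one that would handle the
notification and restart the workload.
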
> Tasks in the root cgroup are treated as independent memory consumers,
> and are compared with other memory consumers (e.g. leaf cgroups).
> The root cgroup doesn't support the oom_kill_all_tasks feature.
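
The comparison described in the quoted text can be modeled as below: each
leaf cgroup and each task in the root cgroup is one memory consumer, and
the biggest one is picked. All names are hypothetical and the sizes are
plain page counts; the scoring used by the actual patch is not reproduced
here.

```c
#include <stddef.h>

/* Hypothetical model of one memory consumer. */
struct model_consumer {
	const char *name;
	unsigned long pages;	/* memory footprint in pages */
	int is_root_task;	/* 1: single task in the root cgroup,
				 * 0: a leaf cgroup as a whole */
};

/*
 * Return the consumer with the largest footprint, or NULL if n == 0.
 * Root-cgroup tasks and leaf cgroups are compared on equal terms;
 * on a tie the earlier entry wins.
 */
static const struct model_consumer *
pick_biggest_consumer(const struct model_consumer *c, size_t n)
{
	const struct model_consumer *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!best || c[i].pages > best->pages)
			best = &c[i];
	}
	return best;
}
```

If the winner has is_root_task set, only that task would be a candidate,
since the root cgroup doesn't support oom_kill_all_tasks.
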