linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: Roman Gushchin <guro@fb.com>
Cc: linux-mm@kvack.org
Subject: Re: [RFC PATCH v2 0/7] cgroup-aware OOM killer
Date: Mon, 26 Jun 2017 13:55:31 +0200	[thread overview]
Message-ID: <20170626115531.GI11534@dhcp22.suse.cz> (raw)
In-Reply-To: <20170623183946.GA24014@castle>

On Fri 23-06-17 19:39:46, Roman Gushchin wrote:
> On Fri, Jun 23, 2017 at 03:43:24PM +0200, Michal Hocko wrote:
> > On Thu 22-06-17 18:10:03, Roman Gushchin wrote:
> > > Hi, Michal!
> > > 
> > > Thank you very much for the review. I've tried to address your
> > > comments in v3 (sent yesterday), which is why it took some time to reply.
> > 
> > I will try to look at it sometime next week, hopefully.
> 
> Thanks!
> 
> > > > - You seem to completely ignore per task oom_score_adj and override it
> > > >   by the memcg value. This makes some sense but it can lead to
> > > >   unexpected behavior when somebody relies on the original behavior.
> > > >   E.g. a workload that would corrupt data when killed unexpectedly and
> > > >   so it is protected by OOM_SCORE_ADJ_MIN. Now this assumption will
> > > >   break when running inside a container. I do not have a good answer
> > > >   what is the desirable behavior and maybe there is no universal answer.
> > > >   Maybe you just do not kill those tasks? But then you have to be
> > > >   careful when selecting a memcg victim. Hairy...
> > > 
> > > I do not ignore it completely, but it matters only for root cgroup tasks
> > > and inside a cgroup when oom_kill_all_tasks is off.
> > > 
> > > I believe that the cgroup v2 requirement is good enough. I mean, you can't
> > > move from v1 to v2 without changing cgroup settings, and if we provide
> > > per-cgroup oom_score_adj, that will be enough to reproduce the old behavior.
> > > 
> > > Also, if you think it's necessary, I can add a sysctl to turn the cgroup-aware
> > > oom killer off completely and provide a compatibility mode.
> > > We can't really preserve the old system-wide behavior of per-process
> > > oom_score_adj; it makes no sense in a containerized environment.
> > 
> > So what are you going to do with those applications that simply cannot
> > be killed and which set OOM_SCORE_ADJ_MIN explicitly? Are they
> > unsupported? How does a user find out? One way around this could be
> > simply to not kill tasks with OOM_SCORE_ADJ_MIN.
> 
> They won't be killed by the cgroup OOM killer, but under some circumstances
> they can be killed by the global OOM killer (e.g. there are no other tasks
> in the selected cgroup, cgroup v2 is used, and the per-cgroup oom score
> adjustment is not set).
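
For context, the kind of task we are talking about opts itself out of
OOM killing roughly like the minimal userspace sketch below (this is
not from the patch set, and lowering the value needs CAP_SYS_RESOURCE
or root):

#include <stdio.h>
#include <stdlib.h>

/* Opt the current task out of OOM killing by writing
 * OOM_SCORE_ADJ_MIN (-1000) to /proc/self/oom_score_adj.
 */
int main(void)
{
        FILE *f = fopen("/proc/self/oom_score_adj", "w");

        if (!f) {
                perror("oom_score_adj");
                return EXIT_FAILURE;
        }
        if (fprintf(f, "-1000\n") < 0 || fclose(f) != 0) {
                perror("oom_score_adj");
                return EXIT_FAILURE;
        }
        /* ... from here on, do the work that must not be killed unexpectedly ... */
        return EXIT_SUCCESS;
}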

Hmm, mem_cgroup_select_oom_victim will happily select a memcg which
contains OOM_SCORE_ADJ_MIN tasks because it ignores per-task score adj.
So the memcg OOM killer can kill those tasks AFAICS. But that is not all
that important. Because...

> I believe that per-process oom_score_adj should not play any role outside
> of the containing cgroup; that would be a violation of isolation.
> 
> Right now, if tasks with oom_score_adj=-1000 are eating all the memory in
> a cgroup, they will loop forever; the OOM killer can't fix this.

... Yes, and that is the price we have to pay for the hard requirement
that the oom killer never kills an OOM_SCORE_ADJ_MIN task. It is hard to
change that without breaking existing userspace which relies on this
configuration to protect against an unexpected SIGKILL.
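
To make that tradeoff concrete, the livelock described above can be
reproduced with something like the sketch below. The cgroup path and
sizes are made up for illustration; it assumes cgroup v2 mounted at
/sys/fs/cgroup, a pre-created group, and enough privileges to write
the files:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Reproduction sketch (hypothetical path and sizes): move this task
 * into a memory-limited cgroup v2 group, mark it OOM-unkillable, then
 * charge memory past the limit.  With no killable task in the group,
 * the allocation loop keeps hitting reclaim instead of being killed.
 */
static void write_file(const char *path, const char *val)
{
        FILE *f = fopen(path, "w");

        if (!f || fprintf(f, "%s", val) < 0 || fclose(f) != 0) {
                perror(path);
                exit(EXIT_FAILURE);
        }
}

int main(void)
{
        char pid[32];

        /* assumes "mkdir /sys/fs/cgroup/oomtest" was done beforehand */
        write_file("/sys/fs/cgroup/oomtest/memory.max", "64M");
        snprintf(pid, sizeof(pid), "%d", getpid());
        write_file("/sys/fs/cgroup/oomtest/cgroup.procs", pid);
        write_file("/proc/self/oom_score_adj", "-1000");

        for (;;) {
                char *p = malloc(1 << 20);      /* 1 MiB per iteration */

                if (p)
                        memset(p, 0xaa, 1 << 20); /* force the pages to be charged */
        }
}

The behavior matches what Roman describes above: the task is never
killed, it just keeps retrying the charge.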

-- 
Michal Hocko
SUSE Labs


Thread overview: 22+ messages
2017-06-01 18:35 Roman Gushchin
2017-06-01 18:35 ` [RFC PATCH v2 1/7] mm, oom: refactor select_bad_process() to take memcg as an argument Roman Gushchin
2017-06-04 19:25   ` Vladimir Davydov
2017-06-04 22:50   ` David Rientjes
2017-06-06 16:20     ` Roman Gushchin
2017-06-06 20:42       ` David Rientjes
2017-06-08 15:59         ` Roman Gushchin
2017-06-01 18:35 ` [RFC PATCH v2 2/7] mm, oom: split oom_kill_process() and export __oom_kill_process() Roman Gushchin
2017-06-01 18:35 ` [RFC PATCH v2 3/7] mm, oom: export oom_evaluate_task() and select_bad_process() Roman Gushchin
2017-06-01 18:35 ` [RFC PATCH v2 4/7] mm, oom: introduce oom_kill_all_tasks option for memory cgroups Roman Gushchin
2017-06-04 19:30   ` Vladimir Davydov
2017-06-01 18:35 ` [RFC PATCH v2 5/7] mm, oom: introduce oom_score_adj " Roman Gushchin
2017-06-04 19:39   ` Vladimir Davydov
2017-06-01 18:35 ` [RFC PATCH v2 6/7] mm, oom: cgroup-aware OOM killer Roman Gushchin
2017-06-04 20:43   ` Vladimir Davydov
2017-06-06 15:59     ` Roman Gushchin
2017-06-01 18:35 ` [RFC PATCH v2 7/7] mm,oom,docs: describe the " Roman Gushchin
2017-06-09 16:30 ` [RFC PATCH v2 0/7] " Michal Hocko
2017-06-22 17:10   ` Roman Gushchin
2017-06-23 13:43     ` Michal Hocko
2017-06-23 18:39       ` Roman Gushchin
2017-06-26 11:55         ` Michal Hocko [this message]
