From: David Rientjes <rientjes@google.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>,
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Linux MM <linux-mm@kvack.org>
Subject: Re: [PATCH v2] memcg, oom: check memcg margin for parallel oom
Date: Thu, 16 Jul 2020 12:53:14 -0700 (PDT) [thread overview]
Message-ID: <alpine.DEB.2.23.453.2007161246480.3086260@chino.kir.corp.google.com> (raw)
In-Reply-To: <CALOAHbBKLCoaPMneDnk+SiRG5zKyDNoBXSJCXE3=efp4Bkpreg@mail.gmail.com>
On Thu, 16 Jul 2020, Yafang Shao wrote:
> > It's likely a misunderstanding: I wasn't necessarily concerned about
> > showing 60MB in a memcg limited to 100MB, that part we can deal with, the
> > concern was after dumping all of that great information that instead of
> > getting a "Killed process..." we get a "Oh, there's memory now, just
> > kidding about everything we just dumped" ;)
> >
>
> Actually the kernel is doing it now, see bellow,
>
> dump_header() <<<< dump lots of information
> __oom_kill_process
> p = find_lock_task_mm(victim);
> if (!p)
> return; <<<< without killing any process.
>
Ah, this is catching an instance where the chosen process has already done
exit_mm(), good catch -- I can find examples of this by scraping kernel
logs from our fleet.
So it appears there is precedence for dumping all the oom info but not
actually performing any action for it and I made the earlier point that
diagnostic information in the kernel log here is still useful. I think it
is still preferable that the kernel at least tell us why it didn't do
anything, but as you mention that already happens today.
Would you like to send a patch that checks for mem_cgroup_margin() here as
well? A second patch could make the possible inaction more visibile,
something like "Process ${pid} (${comm}) is already exiting" for the above
check or "Memcg ${memcg} is no longer out of memory".
Another thing that these messages indicate, beyond telling us why the oom
killer didn't actually SIGKILL anything, is that we can expect some skew
in the memory stats that shows an availability of memory.
>
> > We could likely enlighten userspace about that so that we don't consider
> > that to be an actual oom kill. But I also very much agree that after
> > dump_header() would be appropriate as well since the goal is to prevent
> > unnecessary oom killing.
> >
> > Would you mind sending a patch to check mem_cgroup_margin() on
> > is_memcg_oom() prior to sending the SIGKILL to the victim and printing the
> > "Killed process..." line? We'd need a line that says "xKB of memory now
> > available -- suppressing oom kill" or something along those lines so
> > userspace understands what happened. But the memory info that it emits
> > both for the state of the memcg and system RAM may also be helpful to
> > understand why we got to the oom kill in the first place, which is also a
> > good thing.
> >
> > I'd happy ack that patch since it would be a comprehensive solution that
> > avoids oom kill of user processes at all costs, which is a goal I think we
> > can all rally behind.
>
> I'd prefer to put dump_header() behind do_send_sig_info(), for example,
>
> __oom_kill_process()
> do_send_sig_info()
> dump_header() <<<< may better put it behind wake_oom_reaper(), but
> it may loses some information to dump...
> pr_err("%s: Killed process %d (%s)....")
>
I agree with Michal here that dump_header() after the actual kill would no
longer represent the state of the system (or cpuset or memcg, depending on
context) at the time of the oom kill so it's best to dump relevant
information before the actual kill.
next prev parent reply other threads:[~2020-07-16 19:53 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-14 13:57 Yafang Shao
2020-07-14 14:05 ` Michal Hocko
2020-07-14 14:30 ` Chris Down
2020-07-14 18:46 ` David Rientjes
2020-07-15 1:44 ` Yafang Shao
2020-07-15 2:44 ` David Rientjes
2020-07-15 3:10 ` Yafang Shao
2020-07-15 3:18 ` David Rientjes
2020-07-15 3:31 ` Yafang Shao
2020-07-15 17:30 ` David Rientjes
2020-07-16 2:38 ` Yafang Shao
2020-07-16 7:04 ` David Rientjes
2020-07-16 11:53 ` Yafang Shao
2020-07-16 12:21 ` Michal Hocko
2020-07-16 13:09 ` Tetsuo Handa
2020-07-16 19:53 ` David Rientjes [this message]
2020-07-17 1:35 ` Yafang Shao
2020-07-17 19:26 ` David Rientjes
2020-07-18 2:15 ` Yafang Shao
2020-07-16 5:54 ` Tetsuo Handa
2020-07-16 6:11 ` Michal Hocko
2020-07-16 7:06 ` David Rientjes
2020-07-16 6:08 ` Michal Hocko
2020-07-16 6:56 ` David Rientjes
2020-07-16 7:12 ` Michal Hocko
2020-07-16 20:04 ` David Rientjes
2020-07-28 18:04 ` Johannes Weiner
2020-07-15 6:56 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.DEB.2.23.453.2007161246480.3086260@chino.kir.corp.google.com \
--to=rientjes@google.com \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=laoar.shao@gmail.com \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=penguin-kernel@i-love.sakura.ne.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox