From: David Rientjes <rientjes@google.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
cgroups@vger.kernel.org
Subject: Re: [patch 1/2] mm, memcg: avoid oom notification when current needs access to memory reserves
Date: Mon, 9 Dec 2013 13:46:16 -0800 (PST)
Message-ID: <alpine.DEB.2.02.1312091328550.11026@chino.kir.corp.google.com>
In-Reply-To: <20131209124840.GC3597@dhcp22.suse.cz>
On Mon, 9 Dec 2013, Michal Hocko wrote:
> > Google depends on getting memory.oom_control notifications only when they
> > are actionable, which is exactly how Documentation/cgroups/memory.txt
> > describes how userspace should respond to such a notification.
> >
> > "Actionable" here means that the kernel has exhausted its capabilities of
> > allowing for future memory freeing, which is the entire premise of any oom
> > killer.
> >
> > Giving a dying process or a process that is going to subsequently die
> > access to memory reserves is a capability the kernel uses to ensure
> > progress is made in oom conditions. It is not an exhaustion of
> > capabilities.
> >
> > Yes, we all know that after the userspace notification memory may be
> > freed and the kill may no longer be required. There is nothing
> > that can be done about that, and it has never been implied that a memcg is
> > guaranteed to still be oom when the process wakes up.
> >
> > I'm referring to a situation that can manifest in a number of ways:
> > coincidental process exit, coincidental process being killed,
> > VMPRESSURE_CRITICAL notification that results in a process being killed,
> > or memory threshold notification that results in a process being killed.
> > Regardless, we're talking about a situation where something is already
> > in the exit path or has been killed and is simply attempting to free its
> > memory.
>
> You have already mentioned that. Several times in fact. And I do
> understand what you are saying. You are just not backing your claims
> with anything that would convince us that what you are trying to solve
> is an issue in real life. So show us it is real, please.
>
What exactly would you like to see? It's obvious that the kernel has not
exhausted its capabilities of allowing for future memory freeing if the
notification happens before the check for current->flags & PF_EXITING or
fatal_signal_pending(current). Does that conditional get triggered? ALL
THE TIME. We know it happens because I had to introduce it into both the
system oom killer and the memcg oom killer to fix mm->mmap_sem issues for
threads that were killed as part of the oom killer SIGKILL but weren't the
thread lucky enough to get TIF_MEMDIE set and they were in the allocation
path.
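To make the ordering concrete, here is a minimal userspace sketch of the
decision I'm describing. It is not the actual kernel code: struct
task_model, MODEL_PF_EXITING, the enum, and both function names are
hypothetical stand-ins for current->flags & PF_EXITING and
fatal_signal_pending(current). It only illustrates that the dying-task
check should come before any notification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in value; this is not the kernel's PF_EXITING definition. */
#define MODEL_PF_EXITING 0x4u

/* Hypothetical model of the two pieces of task state the check looks at. */
struct task_model {
	unsigned int flags;        /* models current->flags */
	bool fatal_signal_pending; /* models fatal_signal_pending(current) */
};

enum oom_action {
	OOM_GRANT_RESERVES,   /* task is dying: let it allocate so it can exit */
	OOM_NOTIFY_USERSPACE, /* nothing left to try: wake oom_control waiters */
};

/* True if the task is already exiting or has a fatal signal pending. */
static bool task_is_dying(const struct task_model *t)
{
	assert(t != NULL);
	return (t->flags & MODEL_PF_EXITING) || t->fatal_signal_pending;
}

/*
 * The dying-task check runs first. Only when it fails has the kernel
 * run out of ways to make progress, so only then is userspace notified.
 */
static enum oom_action oom_decide(const struct task_model *current_task)
{
	if (task_is_dying(current_task))
		return OOM_GRANT_RESERVES;
	return OOM_NOTIFY_USERSPACE;
}
```

With this ordering, a task that is exiting or has been killed never
produces a notification; reversing the two steps is what produces the
unactionable ones.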
Are you asking me to patch our kernel, get it rolled out, and plot a graph
to show how often it gets triggered over time in our datacenters and that
it causes us to get unnecessary oom kill notifications?

I'm trying to support you in any way I can by giving you the information
you need, but in all honesty this seems pretty trivial and obvious to
understand. I'm really quite stunned at this thread. What exactly are
you arguing for in the other direction? What does sending an oom
notification accomplish when it arrives before exiting processes have been
allowed to free their memory, after which the memcg or system is no longer
oom? Why can't you use memory thresholds or vmpressure for such a
situation?
> > Such a process simply needs access to memory reserves to make progress and
> > free its memory as part of the exit path. The process waiting on
> > memory.oom_control does _not_ need to do any of the actions mentioned in
> > Documentation/cgroups/memory.txt: reduce usage, enlarge the limit, kill a
> > process, or move a process with charge migration.
> >
> > It would be ridiculous to require anybody implementing such a process to
> > check if the oom condition still exists after a period of time before
> > taking such an action.
>
> Why would you consider that ridiculous? If your memcg is oom already
> then waiting a few seconds to let racing tasks finish doesn't sound that
> bad to me.
>
A few seconds? Is that just handwaving or are you making a guarantee that
all processes that need access to memory reserves will wake up, retry
their allocations, get the memcg's oom lock, get access to memory reserves,
allocate, return to handle their pending SIGKILL, proceed down the exit()
path, and free their memory by then?

Meanwhile, the userspace oom handler does the little sleep(3) that you
suggest and then checks the status of the memcg. It finds the memcg is
still oom but, because it didn't do a second blocking read(), it doesn't
realize that this is a second oom condition for a different process
attached to the memcg, and that process simply needs memory reserves to
exit.
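For reference, a handler that follows the sleep-and-recheck suggestion
might look roughly like the sketch below. It uses the registration
procedure from Documentation/cgroups/memory.txt (an eventfd plus a
"<event_fd> <fd of memory.oom_control>" line written to
cgroup.event_control). The function names, buffer sizes, and the delay
parameter are my own assumptions for illustration, and this is not any
production handler.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Build the "<event_fd> <control_fd>" line for cgroup.event_control. */
static int format_oom_registration(char *buf, size_t len, int event_fd, int control_fd)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "%d %d", event_fd, control_fd);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Return the under_oom value from memory.oom_control text, or -1 if absent. */
static int parse_under_oom(const char *contents)
{
	const char *p;
	int value;

	assert(contents != NULL);
	p = strstr(contents, "under_oom");
	if (p == NULL || sscanf(p, "under_oom %d", &value) != 1)
		return -1;
	return value;
}

/*
 * Block for one oom event, sleep, then re-read under_oom.
 * Returns 0 or 1 (the under_oom value), or -1 on error.
 */
int wait_for_oom_then_recheck(const char *cgroup_dir, unsigned int delay_sec)
{
	char path[4096], line[64], contents[256];
	uint64_t count;
	ssize_t len;
	int efd, ofd, cfd, n, result = -1;

	efd = eventfd(0, 0);
	snprintf(path, sizeof(path), "%s/memory.oom_control", cgroup_dir);
	ofd = open(path, O_RDONLY);
	snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup_dir);
	cfd = open(path, O_WRONLY);
	if (efd < 0 || ofd < 0 || cfd < 0)
		goto out;

	n = format_oom_registration(line, sizeof(line), efd, ofd);
	if (n < 0 || write(cfd, line, (size_t)n) != n)
		goto out;

	/* First blocking read: returns once an oom event has been signalled. */
	if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		goto out;

	sleep(delay_sec);

	/*
	 * under_oom == 1 here says only that the memcg is oom now. It does
	 * not say whether this is the original condition or a new one.
	 */
	len = pread(ofd, contents, sizeof(contents) - 1, 0);
	if (len < 0)
		goto out;
	contents[len] = '\0';
	result = parse_under_oom(contents);
out:
	if (cfd >= 0)
		close(cfd);
	if (ofd >= 0)
		close(ofd);
	if (efd >= 0)
		close(efd);
	return result;
}
```

Nothing in the value this returns tells the handler which process
triggered the condition it sees after the sleep, which is exactly the
problem.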