From: Johannes Weiner <hannes@cmpxchg.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Greg Thelen <gthelen@google.com>,
Shakeel Butt <shakeelb@google.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux MM <linux-mm@kvack.org>,
linux-fsdevel@vger.kernel.org,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] fs, mm: account filp and names caches to kmemcg
Date: Wed, 25 Oct 2017 17:13:59 -0400 [thread overview]
Message-ID: <20171025211359.GA17899@cmpxchg.org> (raw)
In-Reply-To: <20171025190057.mqmnprhce7kvsfz7@dhcp22.suse.cz>
On Wed, Oct 25, 2017 at 09:00:57PM +0200, Michal Hocko wrote:
> On Wed 25-10-17 14:11:06, Johannes Weiner wrote:
> > "Safe" is a vague term, and it doesn't make much sense to me in this
> > situation. The OOM behavior should be predictable and consistent.
> >
> > Yes, global might in the rarest cases also return -ENOMEM. Maybe. We
> > don't have to do that in memcg because we're not physically limited.
>
> OK, so here seems to be the biggest disconnect. Being physically or
> artificially constrained shouldn't make much difference IMHO. In both
> cases the resource is simply limited for the consumer. And once all the
> attempts to fit within the limit fail then the request for the resource
> has to fail.
It's a huge difference. In the global case, we have to make trade-offs
to not deadlock the kernel. In the memcg case, we have to make a trade
off between desirable OOM behavior and desirable meaning of memory.max.
If we can borrow a resource temporarily from the ether to resolve the
OOM situation, I don't see why we shouldn't. We're only briefly
ignoring the limit to make sure the allocating task isn't preventing
the OOM victim from exiting or the OOM reaper from reaping. It's more
of an implementation detail than interface.
The only scenario you brought up where this might be the permanent
overrun is the single, oom-disabled task. And I explained why that is
a silly argument, why that's the least problematic consequence of
oom-disabling, and why it probably shouldn't even be configurable.
The idea that memory.max must never be breached is an extreme and
narrow view. As Greg points out, there are allocations we do not even
track. There are other scenarios that force allocations. They may
violate the limit on paper, but they're not notably weakening the goal
of memory.max - isolating workloads from each other.
Let's look at it this way.
There are two deadlock relationships the OOM killer needs to solve
between the triggerer and the potential OOM victim:
#1 Memory. The triggerer needs memory that the victim has,
but the victim needs some additional memory to release it.
#2 Locks. The triggerer needs memory that the victim has, but
the victim needs a lock the triggerer holds to release it.
We have no qualms letting the victim temporarily (until the victim's
exit) ignore memory.max to resolve the memory deadlock #1.
I don't understand why it's such a stretch to let the triggerer
temporarily (until the victim's exit) ignore memory.max to resolve the
locks deadlock #2. [1]
We need both for the OOM killer to function correctly.
We've solved #1 both for memcg and globally. But we haven't solved #2.
Global can still deadlock, and memcg copped out and returns -ENOMEM.
Adding speculative OOM killing before the -ENOMEM makes things more
muddy and unpredictable. It doesn't actually solve deadlock #2.
[1] And arguably that's what we should be doing in the global case
too: give the triggerer access to reserves. If you recall this
thread here: https://patchwork.kernel.org/patch/6088511/
> > > So the only change I am really proposing is to keep retrying as long
> > > as the oom killer makes a forward progress and ENOMEM otherwise.
> >
> > That's the behavior change I'm against.
>
> So just to make it clear you would be OK with the retry on successful
> OOM killer invocation and force charge on oom failure, right?
Yeah, that sounds reasonable to me.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-10-25 21:14 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-05 22:21 Shakeel Butt
2017-10-06 7:59 ` Michal Hocko
2017-10-06 19:33 ` Shakeel Butt
2017-10-09 6:24 ` Michal Hocko
2017-10-09 17:52 ` Greg Thelen
2017-10-09 18:04 ` Michal Hocko
2017-10-09 18:17 ` Michal Hocko
2017-10-10 9:10 ` Michal Hocko
2017-10-10 22:21 ` Shakeel Butt
2017-10-11 9:09 ` Michal Hocko
2017-10-09 20:26 ` Johannes Weiner
2017-10-10 9:14 ` Michal Hocko
2017-10-10 14:17 ` Johannes Weiner
2017-10-10 14:24 ` Michal Hocko
2017-10-12 19:03 ` Johannes Weiner
2017-10-12 23:57 ` Greg Thelen
2017-10-13 6:51 ` Michal Hocko
2017-10-13 6:35 ` Michal Hocko
2017-10-13 7:00 ` Michal Hocko
2017-10-13 15:24 ` Michal Hocko
2017-10-24 12:18 ` Michal Hocko
2017-10-24 17:54 ` Johannes Weiner
2017-10-24 16:06 ` Johannes Weiner
2017-10-24 16:22 ` Michal Hocko
2017-10-24 17:23 ` Johannes Weiner
2017-10-24 17:55 ` Michal Hocko
2017-10-24 18:58 ` Johannes Weiner
2017-10-24 20:15 ` Michal Hocko
2017-10-25 6:51 ` Greg Thelen
2017-10-25 7:15 ` Michal Hocko
2017-10-25 13:11 ` Johannes Weiner
2017-10-25 14:12 ` Michal Hocko
2017-10-25 16:44 ` Johannes Weiner
2017-10-25 17:29 ` Michal Hocko
2017-10-25 18:11 ` Johannes Weiner
2017-10-25 19:00 ` Michal Hocko
2017-10-25 21:13 ` Johannes Weiner [this message]
2017-10-25 22:49 ` Greg Thelen
2017-10-26 7:49 ` Michal Hocko
2017-10-26 12:45 ` Tetsuo Handa
2017-10-26 14:31 ` Johannes Weiner
2017-10-26 19:56 ` Greg Thelen
2017-10-27 8:20 ` Michal Hocko
2017-10-27 20:50 ` Shakeel Butt
2017-10-30 8:29 ` Michal Hocko
2017-10-30 19:28 ` Shakeel Butt
2017-10-31 8:00 ` Michal Hocko
2017-10-31 16:49 ` Johannes Weiner
2017-10-31 18:50 ` Michal Hocko
2017-10-24 15:45 ` Johannes Weiner
2017-10-24 16:30 ` Michal Hocko
2017-10-10 23:32 ` Al Viro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171025211359.GA17899@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=gthelen@google.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=shakeelb@google.com \
--cc=vdavydov.dev@gmail.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox