From: David Rientjes <rientjes@google.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Michel Lespinasse <walken@google.com>,
linux-mm <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Divyesh Shah <dpshah@google.com>, Ingo Molnar <mingo@elte.hu>
Subject: Re: FYI: mmap_sem OOM patch
Date: Tue, 13 Jul 2010 14:08:28 -0700 (PDT)
Message-ID: <alpine.DEB.2.00.1007131349250.1821@chino.kir.corp.google.com>
In-Reply-To: <20100713091333.EA3E.A69D9226@jp.fujitsu.com>
On Tue, 13 Jul 2010, KOSAKI Motohiro wrote:
> > > I disagree. __GFP_NOFAIL means this allocation failure can have a really
> > > dangerous result. Instead, OOM-Killer should try to kill next process.
> > > I think.
> > >
> >
> > That's not what happens, __alloc_pages_high_priority() will loop forever
> > for __GFP_NOFAIL, the oom killer is never recalled.
>
> Yup, please reread the discussion.
>
There's nothing in the discussion that addresses the fact that
__alloc_pages_high_priority() loops infinitely for __GFP_NOFAIL without
again calling the oom killer. You may be proposing a change to that when
you said "OOM-Killer should try to kill next process. I think," but I
can't speculate on that. If you are, please propose a patch.
Whether we can allocate without watermarks depends on where we set them
and what types of exclusions we allow. This isn't local to the page
allocator but applies to the entire kernel, since GFP_ATOMIC allocations,
for example, can deplete the min watermark by ~60%. The remaining memory
is what we allow access to for ALLOC_NO_WATERMARKS: allocations in the
direct reclaim path and allocations by tasks that have been oom killed.
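The ~60% figure can be checked with a little arithmetic. The sketch below is a
userspace model, not kernel code. It assumes that the watermark check halves
min for high-priority (ALLOC_HIGH) allocations and then takes a further quarter
off the result for ALLOC_HARDER, and that GFP_ATOMIC ends up with both. The
model_ and MODEL_ names are made up for illustration.

```c
/* Hypothetical flag values for this model only; not the kernel's. */
#define MODEL_ALLOC_HIGH   0x1u
#define MODEL_ALLOC_HARDER 0x2u

/*
 * Return the min watermark that actually applies to an allocation,
 * assuming HIGH halves it and HARDER removes a quarter of what is left.
 */
static long model_effective_min(long min, unsigned int alloc_flags)
{
	if (alloc_flags & MODEL_ALLOC_HIGH)
		min -= min / 2;
	if (alloc_flags & MODEL_ALLOC_HARDER)
		min -= min / 4;
	return min;
}

/* Percentage of the min watermark that such an allocation may consume. */
static long model_depletion_percent(long min, unsigned int alloc_flags)
{
	long left = model_effective_min(min, alloc_flags);

	return (min - left) * 100 / min;
}
```

Under those assumptions, a min watermark of 1000 pages drops to 375 for an
allocation carrying both flags, so such allocations can eat roughly 62% of the
reserve before they start failing.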
Thus, it's important that oom killed tasks do not completely deplete
memory reserves, otherwise __GFP_NOFAIL allocations will loop forever
without killing additional tasks, as you say. That's sane since oom
killing additional tasks wouldn't allow them access to any additional
memory, anyway, so the allocation would still fail (we only call the oom
killer for those allocations that are retried) and the victim could not
exit.
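To make the looping behavior concrete, here is a simplified userspace model of
the high-priority path described above. It is a sketch under stated
assumptions, not the kernel's implementation: the model_ names, the single
free-page counter, and the max_attempts bound are all hypothetical (the bound
exists only so the model terminates).

```c
#include <stdbool.h>

/* Hypothetical flag for this model only. */
#define MODEL_GFP_NOFAIL 0x1u

/* Hypothetical stand-in for allocator state; not a kernel structure. */
struct model_zone {
	long free_pages;	/* pages left, reserves included */
};

/* Take one page while ignoring watermarks entirely. */
static bool model_get_page_no_watermarks(struct model_zone *z)
{
	if (z->free_pages <= 0)
		return false;
	z->free_pages--;
	return true;
}

/*
 * Retry for as long as the NOFAIL flag is set. Note what is absent from
 * the loop body: nothing here frees memory or calls an oom killer, so
 * once the reserves are empty every iteration fails the same way.
 */
static bool model_alloc_high_priority(struct model_zone *z,
				      unsigned int flags,
				      long max_attempts, long *attempts)
{
	bool ok;

	*attempts = 0;
	do {
		ok = model_get_page_no_watermarks(z);
		(*attempts)++;
	} while (!ok && (flags & MODEL_GFP_NOFAIL) &&
		 *attempts < max_attempts);

	return ok;
}
```

With an empty zone and MODEL_GFP_NOFAIL set, the model spins until it hits
max_attempts; without the flag it fails after a single attempt.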
With that said, I'm not exactly sure what change you're proposing when you
say the oom killer should try to kill another process because it won't
lead to any future memory freeing unless the victim can exit without
allocating memory. If that's not possible, you've proliferated a ton of
innocent kills that were triggered because of one __GFP_NOFAIL attempt.
I agree with Peter that it would be ideal to remove __GFP_NOFAIL, but I
think we require changes in the retry logic first before that is possible.
Right now, we insist on retrying all blockable !__GFP_NORETRY allocations
under PAGE_ALLOC_COSTLY_ORDER indefinitely. That, in a sense, is already
__GFP_NOFAIL behavior that is implicit: that's why we typically see
__GFP_NOFAIL with GFP_NOFS instead. With GFP_NOFS, we never call the oom
killer in the first place, so a subsequent attempt is only more likely to
succeed because of direct reclaim or memory compaction.
There's nothing preventing users from doing
	do {
		page = alloc_page(GFP_KERNEL);
	} while (!page);
if the retry logic were reworked to start failing allocations or we
removed __GFP_NOFAIL. Thus, I think __GFP_NOFAIL should be replaced by a
different gfp flag that acts similarly but can fail: use compaction,
reclaim, and the oom killer where possible, in that order, for as long as
there is no success, and then retry against the high watermark one final
time. If the allocation is still unsuccessful, return NULL. This allows users to do
what getblk() does, for instance, by implementing their own memory freeing
functions. It also allows them to use all of the page allocator's powers
(compaction, reclaim, oom killer) without infinitely looping by either
using an order under PAGE_ALLOC_COSTLY_ORDER or insisting on __GFP_NOFAIL.
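The proposed ordering can be sketched as a small userspace model. This is an
illustration under stated assumptions, not a patch: the model_ names and the
"gain" fields (how many pages each step frees) are hypothetical, and a real
implementation would live in the page allocator's slow path.

```c
#include <stdbool.h>

/* Hypothetical memory state; the gains say how many pages each step frees. */
struct model_mem {
	long free_pages;
	long high_wmark;
	long compact_gain;
	long reclaim_gain;
	long oom_gain;
};

/* Succeed only if free memory is strictly above the given watermark. */
static bool model_try(struct model_mem *m, long wmark)
{
	if (m->free_pages <= wmark)
		return false;
	m->free_pages--;
	return true;
}

/*
 * Failable allocation: compaction, then reclaim, then the oom killer,
 * then one final attempt against the high watermark. Returns false
 * (the model's NULL) instead of looping.
 */
static bool model_alloc_failable(struct model_mem *m, long wmark)
{
	if (model_try(m, wmark))
		return true;

	m->free_pages += m->compact_gain;	/* step 1: compaction */
	if (model_try(m, wmark))
		return true;

	m->free_pages += m->reclaim_gain;	/* step 2: direct reclaim */
	if (model_try(m, wmark))
		return true;

	m->free_pages += m->oom_gain;		/* step 3: oom killer */
	return model_try(m, m->high_wmark);	/* final, stricter attempt */
}

/*
 * Caller in the style of getblk(): on failure, free memory the caller
 * itself owns (own_cache pages) and retry once, rather than spinning.
 */
static bool model_caller(struct model_mem *m, long wmark, long own_cache)
{
	if (model_alloc_failable(m, wmark))
		return true;

	m->free_pages += own_cache;
	return model_alloc_failable(m, wmark);
}
```

The point of the design is that the decision to keep trying moves to the
caller, which knows whether it has memory of its own to give back.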