Re: [RFC PATCH 1/2] mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_HARD with more useful semantic

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@kernel.org
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, mgorman@suse.de,
	vbabka@suse.cz, hannes@cmpxchg.org, riel@redhat.com,
	david@fromorbit.com, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/2] mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_HARD with more useful semantic
Date: Sat, 11 Jun 2016 23:35:49 +0900	[thread overview]
Message-ID: <201606112335.HBG09891.OLFJOFtVMOQHSF@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20160607123149.GK12305@dhcp22.suse.cz>

Michal Hocko wrote:
> On Tue 07-06-16 21:11:03, Tetsuo Handa wrote:
> > Remaining __GFP_REPEAT users are not always doing costly allocations.
> 
> Yes but...
> 
> > Sometimes they pass __GFP_REPEAT because the size is given from userspace.
> > Thus, unconditional s/__GFP_REPEAT/__GFP_RETRY_HARD/g is not good.
> 
> Would that be a regression though? Strictly speaking the __GFP_REPEAT
> documentation was explicit to not loop for ever. So nobody should have
> expected nofail semantic pretty much by definition. The fact that our
> previous implementation was not fully conforming to the documentation is
> just an implementation detail.  All the remaining users of __GFP_REPEAT
> _have_ to be prepared for the allocation failure. So what exactly is the
> problem with them?

A !costly allocation becomes weaker than now if __GFP_RETRY_HARD is passed.

> > >  	/* Reclaim has failed us, start killing things */
> > >  	page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress);
> > >  	if (page)
> > > @@ -3719,6 +3731,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> > >  	/* Retry as long as the OOM killer is making progress */
> > >  	if (did_some_progress) {
> > >  		no_progress_loops = 0;
> > > +		passed_oom = true;
> > 
> > This is too premature. did_some_progress != 0 after returning from
> > __alloc_pages_may_oom() does not mean the OOM killer was invoked. It only means
> > that mutex_trylock(&oom_lock) was attempted.
> 
> which means that we have reached the OOM condition and _somebody_ is
> actaully handling the OOM on our behalf.

That _somebody_ might release oom_lock without invoking the OOM killer (e.g.
doing !__GFP_FS allocation), which means that we have reached the OOM condition
and nobody is actually handling the OOM on our behalf. __GFP_RETRY_HARD becomes
as weak as __GFP_NORETRY. I think this is a regression.



> > What I think more important is hearing from __GFP_REPEAT users how hard they
> > want to retry. It is possible that they want to retry unless SIGKILL is
> > delivered, but passing __GFP_NOFAIL is too hard, and therefore __GFP_REPEAT
> > is used instead. It is possible that they use __GFP_NOFAIL || __GFP_KILLABLE
> > if __GFP_KILLABLE were available. In my module (though I'm not using
> > __GFP_REPEAT), I want to retry unless SIGKILL is delivered.
> 
> To be honest killability for a particular allocation request sounds
> like a hack to me. Just consider the expected semantic. How do you
> handle when one path uses explicit __GFP_KILLABLE while other path (from
> the same syscall) is not... If anything this would have to be process
> context wise.

I didn't catch your question. But making code killable should be considered
good unless it complicates error handling paths.

Since we are not setting TIF_MEMDIE to all OOM-killed threads, OOM-killed
threads will have to loop until mutex_trylock(&oom_lock) succeeds in order
to get TIF_MEMDIE by calling out_of_memory(), which is a needless delay.

Many allocations from syscall context can give up upon SIGKILL. We don't
need to allow OOM-killed threads to use memory reserves if that allocation
is killable.

Converting down_write(&mm->mmap_sem) to down_write_killable(&mm->mmap_sem)
is considered good. But converting kmalloc(GFP_KERNEL) to
kmalloc(GFP_KERNEL | __GFP_KILLABLE) is considered hack. Why?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2016-06-11 15:26 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-06 11:32 [RFC PATCH 0/2] mm: give GFP_REPEAT a better semantic Michal Hocko
2016-06-06 11:32 ` [RFC PATCH 1/2] mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_HARD with more useful semantic Michal Hocko
2016-06-07 12:11   ` Tetsuo Handa
2016-06-07 12:31     ` Michal Hocko
2016-06-11 14:35       ` Tetsuo Handa [this message]
2016-06-13 11:37         ` Michal Hocko
2016-06-13 14:54           ` Tetsuo Handa
2016-06-13 15:17             ` Michal Hocko
2016-06-14 11:12               ` Tetsuo Handa
2016-06-14 18:54                 ` Michal Hocko
2016-06-06 11:32 ` [RFC PATCH 2/2] xfs: map KM_MAYFAIL to __GFP_RETRY_HARD Michal Hocko
2016-06-16  0:23   ` Dave Chinner
2016-06-16  8:03     ` Michal Hocko
2016-06-16 11:26       ` Michal Hocko
2016-06-17 18:22         ` Johannes Weiner
2016-06-17 20:30           ` Vlastimil Babka
2016-06-17 21:39             ` Johannes Weiner
2016-06-20  8:08               ` Michal Hocko
2016-06-21  4:22                 ` Johannes Weiner
2016-06-21  9:29                   ` Vlastimil Babka
2016-06-21 17:00                   ` Michal Hocko
2016-10-06 11:14 ` [RFC PATCH 0/2] mm: give GFP_REPEAT a better semantic Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201606112335.HBG09891.OLFJOFtVMOQHSF@I-love.SAKURA.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@kernel.org \
    --cc=riel@redhat.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox