From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@kernel.org
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, mgorman@suse.de,
vbabka@suse.cz, hannes@cmpxchg.org, riel@redhat.com,
david@fromorbit.com, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/2] mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_HARD with more useful semantic
Date: Sat, 11 Jun 2016 23:35:49 +0900 [thread overview]
Message-ID: <201606112335.HBG09891.OLFJOFtVMOQHSF@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20160607123149.GK12305@dhcp22.suse.cz>
Michal Hocko wrote:
> On Tue 07-06-16 21:11:03, Tetsuo Handa wrote:
> > Remaining __GFP_REPEAT users are not always doing costly allocations.
>
> Yes but...
>
> > Sometimes they pass __GFP_REPEAT because the size is given from userspace.
> > Thus, unconditional s/__GFP_REPEAT/__GFP_RETRY_HARD/g is not good.
>
> Would that be a regression though? Strictly speaking the __GFP_REPEAT
> documentation was explicit to not loop for ever. So nobody should have
> expected nofail semantic pretty much by definition. The fact that our
> previous implementation was not fully conforming to the documentation is
> just an implementation detail. All the remaining users of __GFP_REPEAT
> _have_ to be prepared for the allocation failure. So what exactly is the
> problem with them?
A !costly allocation becomes weaker than now if __GFP_RETRY_HARD is passed.
> > > /* Reclaim has failed us, start killing things */
> > > page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress);
> > > if (page)
> > > @@ -3719,6 +3731,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> > > /* Retry as long as the OOM killer is making progress */
> > > if (did_some_progress) {
> > > no_progress_loops = 0;
> > > + passed_oom = true;
> >
> > This is too premature. did_some_progress != 0 after returning from
> > __alloc_pages_may_oom() does not mean the OOM killer was invoked. It only means
> > that mutex_trylock(&oom_lock) was attempted.
>
> which means that we have reached the OOM condition and _somebody_ is
> actaully handling the OOM on our behalf.
That _somebody_ might release oom_lock without invoking the OOM killer (e.g.
doing !__GFP_FS allocation), which means that we have reached the OOM condition
and nobody is actually handling the OOM on our behalf. __GFP_RETRY_HARD becomes
as weak as __GFP_NORETRY. I think this is a regression.
> > What I think more important is hearing from __GFP_REPEAT users how hard they
> > want to retry. It is possible that they want to retry unless SIGKILL is
> > delivered, but passing __GFP_NOFAIL is too hard, and therefore __GFP_REPEAT
> > is used instead. It is possible that they use __GFP_NOFAIL || __GFP_KILLABLE
> > if __GFP_KILLABLE were available. In my module (though I'm not using
> > __GFP_REPEAT), I want to retry unless SIGKILL is delivered.
>
> To be honest killability for a particular allocation request sounds
> like a hack to me. Just consider the expected semantic. How do you
> handle when one path uses explicit __GFP_KILLABLE while other path (from
> the same syscall) is not... If anything this would have to be process
> context wise.
I didn't catch your question. But making code killable should be considered
good unless it complicates error handling paths.
Since we are not setting TIF_MEMDIE to all OOM-killed threads, OOM-killed
threads will have to loop until mutex_trylock(&oom_lock) succeeds in order
to get TIF_MEMDIE by calling out_of_memory(), which is a needless delay.
Many allocations from syscall context can give up upon SIGKILL. We don't
need to allow OOM-killed threads to use memory reserves if that allocation
is killable.
Converting down_write(&mm->mmap_sem) to down_write_killable(&mm->mmap_sem)
is considered good. But converting kmalloc(GFP_KERNEL) to
kmalloc(GFP_KERNEL | __GFP_KILLABLE) is considered hack. Why?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2016-06-11 15:26 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-06 11:32 [RFC PATCH 0/2] mm: give GFP_REPEAT a better semantic Michal Hocko
2016-06-06 11:32 ` [RFC PATCH 1/2] mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_HARD with more useful semantic Michal Hocko
2016-06-07 12:11 ` Tetsuo Handa
2016-06-07 12:31 ` Michal Hocko
2016-06-11 14:35 ` Tetsuo Handa [this message]
2016-06-13 11:37 ` Michal Hocko
2016-06-13 14:54 ` Tetsuo Handa
2016-06-13 15:17 ` Michal Hocko
2016-06-14 11:12 ` Tetsuo Handa
2016-06-14 18:54 ` Michal Hocko
2016-06-06 11:32 ` [RFC PATCH 2/2] xfs: map KM_MAYFAIL to __GFP_RETRY_HARD Michal Hocko
2016-06-16 0:23 ` Dave Chinner
2016-06-16 8:03 ` Michal Hocko
2016-06-16 11:26 ` Michal Hocko
2016-06-17 18:22 ` Johannes Weiner
2016-06-17 20:30 ` Vlastimil Babka
2016-06-17 21:39 ` Johannes Weiner
2016-06-20 8:08 ` Michal Hocko
2016-06-21 4:22 ` Johannes Weiner
2016-06-21 9:29 ` Vlastimil Babka
2016-06-21 17:00 ` Michal Hocko
2016-10-06 11:14 ` [RFC PATCH 0/2] mm: give GFP_REPEAT a better semantic Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201606112335.HBG09891.OLFJOFtVMOQHSF@I-love.SAKURA.ne.jp \
--to=penguin-kernel@i-love.sakura.ne.jp \
--cc=akpm@linux-foundation.org \
--cc=david@fromorbit.com \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@kernel.org \
--cc=riel@redhat.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox