From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: aarcange@redhat.com, akpm@linux-foundation.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
rientjes@google.com, hannes@cmpxchg.org,
mjaggi@caviumnetworks.com, mgorman@suse.de, oleg@redhat.com,
vdavydov.dev@gmail.com, vbabka@suse.cz
Subject: Re: [PATCH] mm,oom: Try last second allocation before and after selecting an OOM victim.
Date: Wed, 1 Nov 2017 15:48:45 +0100
Message-ID: <20171101144845.tey4ozou44tfpp3g@dhcp22.suse.cz>
In-Reply-To: <201711012338.AGB30781.JHOMFQFVSFtOLO@I-love.SAKURA.ne.jp>

On Wed 01-11-17 23:38:49, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Wed 01-11-17 20:58:50, Tetsuo Handa wrote:
> > > > > But doing ALLOC_OOM for the last second allocation attempt from out_of_memory() involves
> > > > > duplicating code (e.g. rebuilding the zonelist).
> > > >
> > > > Why would you do it? Do not blindly copy and paste code without
> > > > a good reason. What kind of problem does this actually solve?
> > >
> > > prepare_alloc_pages()/finalise_ac() initializes as
> > >
> > > ac->high_zoneidx = gfp_zone(gfp_mask);
> > > ac->zonelist = node_zonelist(preferred_nid, gfp_mask);
> > > ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
> > > ac->high_zoneidx, ac->nodemask);
> > >
> > > and selecting as an OOM victim reinitializes as
> > >
> > > ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
> > > ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
> > > ac->high_zoneidx, ac->nodemask);
> > >
> > > and I assume that this reinitialization might affect which memory reserve
> > > the OOM victim allocates from.
> > >
> > > You mean such difference is too trivial to care about?
> >
> > You keep repeating what the _current_ code does without explaining _why_
> > we need the same thing in the oom path. Could you finally answer my
> > question please?
>
> Because I consider that following what the current code does is reasonable
> unless there is an explicit reason not to.
Following this pattern turns the code into a mess over time, because nobody
remembers why something is done a specific way anymore. Everybody just
keeps the ball rolling because they are afraid to change code they
don't understand. Don't do that!
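
Just to illustrate what I mean (this is only a sketch, not actual kernel
code; both the oc->ac member and the helper name are made up for the
example): a last second ALLOC_OOM attempt done from the oom path could
simply reuse the alloc_context which the page allocator has already set
up, so there would be nothing to rebuild:

	/*
	 * Hypothetical sketch only: reuse the caller's alloc_context rather
	 * than rebuilding the zonelist inside out_of_memory().
	 */
	static struct page *alloc_pages_before_oomkill(const struct oom_control *oc)
	{
		/* never sleep here, this is only a quick peek at the reserves */
		gfp_t gfp_mask = oc->gfp_mask & ~__GFP_DIRECT_RECLAIM;

		/* some callers (e.g. the pagefault path) have no allocation context */
		if (!oc->ac)
			return NULL;

		return get_page_from_freelist(gfp_mask, oc->order, ALLOC_OOM, oc->ac);
	}
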
[...]
> Does "that comment" refer to
>
> Elaborating the comment: the reason for the high wmark is to reduce
> the likelihood of livelocks and be sure to invoke the OOM killer, if
> we're still under pressure and reclaim just failed. The high wmark is
> used to be sure the failure of reclaim isn't going to be ignored. If
> using the min wmark like you propose there's risk of livelock or
> anyway of delayed OOM killer invocation.
>
> part? Then, I know it is not about gfp flags.
>
> But how can an OOM livelock happen when the last second allocation does not
> wait for memory reclaim (because __GFP_DIRECT_RECLAIM is masked)?
> The last second allocation will return immediately, and we will call
> out_of_memory() if it fails.
I think Andrea just wanted to say that we do want to invoke the OOM killer
and resolve the memory pressure rather than keep looping in the
reclaim/oom path just because a few pages get allocated and freed in
the meantime.
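
For reference, the current last second attempt done just before we invoke
the OOM killer looks roughly like this (simplified, quoting
__alloc_pages_may_oom() from memory):

	/*
	 * Go for the high watermark so that a failure of reclaim is not
	 * papered over by a handful of pages which happened to be freed
	 * in the meantime; if this doesn't succeed we go on and invoke
	 * the OOM killer.
	 */
	page = get_page_from_freelist(gfp_mask | __GFP_HARDWALL, order,
				      ALLOC_WMARK_HIGH | ALLOC_CPUSET, ac);
	if (page)
		goto out;
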
[...]
> > I am not sure such a scenario matters all that much because it assumes
> > that the oom victim doesn't really free much memory [1] (basically less than
> > HIGH-MIN). Most OOM situations simply have a memory hog consuming a
> > significant amount of memory.
>
> The OOM killer does not always kill a memory hog consuming a significant
> amount of memory. The OOM killer kills the process with the highest OOM score
> (or one of its children, if any, instead). I don't think it is appropriate to
> assume that an OOM victim will free enough memory for an ALLOC_WMARK_HIGH
> allocation to succeed.
OK, so let's agree to disagree. I claim that we shouldn't care all that
much. If any of the current heuristics turns out to lead to killing too
many tasks, then we should simply remove it rather than keep bloating
already complex code with more and more kludges.
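
Just to put a rough number on the HIGH-MIN gap we keep going back and
forth about (the watermark values below are made up purely for
illustration):

	min  wmark = 4096 pages	(16MB with 4kB pages)
	high wmark = 6144 pages	(24MB)
	high - min = 2048 pages	( 8MB)

So a zone sitting at its min watermark needs roughly 8MB freed before an
ALLOC_WMARK_HIGH attempt can succeed there. A typical memory hog victim
frees that easily; a carefully picked tiny victim might not, which is the
disagreement in a nutshell.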
--
Michal Hocko
SUSE Labs