From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@suse.com
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
rientjes@google.com, mjaggi@caviumnetworks.com, mgorman@suse.de,
oleg@redhat.com, vdavydov.dev@gmail.com, vbabka@suse.cz
Subject: Re: [RFC PATCH 2/2] mm,oom: Try last second allocation after selecting an OOM victim.
Date: Sat, 9 Sep 2017 09:55:00 +0900 [thread overview]
Message-ID: <201709090955.HFA57316.QFOSVMtFOJLFOH@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20170825080020.GE25498@dhcp22.suse.cz>
There has been no response to your suggestion. Can we agree on going in this direction?
If there is still no response, I will for now push the "ignore MMF_OOM_SKIP once" approach.
Michal Hocko wrote:
> On Thu 24-08-17 23:40:36, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > On Thu 24-08-17 21:18:26, Tetsuo Handa wrote:
> > > > Manish Jaggi noticed that running the LTP oom01/oom02 tests on a
> > > > system with a high core count causes random kernel panics when the
> > > > OOM killer selects an OOM victim that consumed memory in a way the
> > > > OOM reaper cannot help with [1].
> > > >
> > > > Since commit 696453e66630ad45 ("mm, oom: task_will_free_mem should skip
> > > > oom_reaped tasks") changed task_will_free_mem(current) in out_of_memory()
> > > > to return false as soon as MMF_OOM_SKIP is set, many threads sharing the
> > > > victim's mm were not able to try allocation from memory reserves after the
> > > > OOM reaper gave up reclaiming memory.
> > > >
> > > > I proposed a patch which allows task_will_free_mem(current) in
> > > > out_of_memory() to ignore MMF_OOM_SKIP once, so that all OOM victim
> > > > threads are guaranteed to have tried an ALLOC_OOM allocation before
> > > > we start selecting the next OOM victim [2], because Michal Hocko did
> > > > not like calling get_page_from_freelist() from the OOM killer, which
> > > > is a layering violation [3]. But now Michal thinks that calling
> > > > get_page_from_freelist() after the task_will_free_mem(current) test
> > > > is better than allowing task_will_free_mem(current) to ignore
> > > > MMF_OOM_SKIP once [4], because it would also help other cases where
> > > > we race with an exiting task or somebody managed to free memory
> > > > while we were selecting an OOM victim, which can take quite some
> > > > time.
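
For those joining the thread here, the control flow under discussion looks
roughly like the sketch below. This is only an illustration, not the RFC patch
itself; out_of_memory(), task_will_free_mem(), select_bad_process() and
oom_kill_process() are the real mm/oom_kill.c functions, but
try_last_second_alloc() is a hypothetical placeholder for the
get_page_from_freelist() call being debated, and the function is heavily
abridged.

/* Abridged sketch of mm/oom_kill.c::out_of_memory(), for illustration only. */
bool out_of_memory(struct oom_control *oc)
{
        /* ... notifiers, !__GFP_FS bail-out, oom_kill_allocating_task, etc. omitted ... */

        /*
         * If current is already exiting or killed and its memory is still
         * reclaimable, let it use memory reserves instead of killing more
         * tasks.  Since commit 696453e66630ad45 this shortcut is refused as
         * soon as the OOM reaper sets MMF_OOM_SKIP on the victim's mm.
         */
        if (task_will_free_mem(current)) {
                mark_oom_victim(current);
                wake_oom_reaper(current);
                return true;
        }

        select_bad_process(oc);         /* can take a long time on big systems */

        /*
         * Proposed last-second attempt (hypothetical helper): memory may have
         * been freed while we were selecting a victim, so try the freelists
         * once more before killing.  Reaching into the page allocator's
         * get_page_from_freelist() from here is the layering concern above.
         */
        if (oc->chosen && try_last_second_alloc(oc))
                return true;

        if (oc->chosen)
                oom_kill_process(oc, "Out of memory");
        return !!oc->chosen;
}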
> > >
> > > This is a lot of text which can be more confusing than helpful. Could you
> > > state the problem clearly without detours? Yes, the oom killer selection
> > > can race with those freeing memory. And it has been like that since
> > > basically ever.
> >
> > The problem which Manish Jaggi reported (and which I can still reproduce)
> > is that the OOM killer gives up on an mm with MMF_OOM_SKIP set too early.
> > The problem became real in 4.8 due to commit 696453e66630ad45 ("mm, oom:
> > task_will_free_mem should skip oom_reaped tasks"). Thus, it has _not_ been
> > like that since basically ever.
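
The check in question, added by that commit, looks roughly like this
(abridged sketch of task_will_free_mem(), for illustration only):

/* Abridged from mm/oom_kill.c::task_will_free_mem() after 696453e66630ad45. */
static bool task_will_free_mem(struct task_struct *task)
{
        struct mm_struct *mm = task->mm;

        /* ... other early-return checks omitted ... */

        /*
         * This task has already been drained by the oom reaper, so there are
         * only small chances it will free some more.  For the shortcut in
         * out_of_memory() this means a dying victim thread is no longer
         * treated as "about to free memory" once the reaper gives up, and
         * the OOM killer moves on to selecting the next victim.
         */
        if (test_bit(MMF_OOM_SKIP, &mm->flags))
                return false;

        /* ... walk other tasks sharing this mm, omitted ... */
        return true;
}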
>
> Again, you are mixing more things together. Manish's usecase triggers a
> pathological case where the oom reaper is not able to reclaim basically
> any memory, and so we unnecessarily kill another victim if the original
> one doesn't finish quickly enough.
>
> This patch and your former attempts will only help (for that particular
> case) if the victim itself wanted to allocate and didn't manage to pass
> through the ALLOC_OOM attempt since it was killed. This is yet again a
> corner case and something this patch won't plug in general (it only
> takes another task going down that path). That's why I consider it
> confusing to mention in the changelog.
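
For context, only tasks that have been marked as OOM victims are allowed to
dip into the ALLOC_OOM part of the memory reserves, roughly as sketched
below. This is an abridged illustration based on the "do not rely on
TIF_MEMDIE for memory reserves access" work; details differ between trees
since ALLOC_OOM is only being introduced around this time.

/*
 * Abridged sketch of how the page allocator slowpath decides whether the
 * current task may use the OOM part of the memory reserves (ALLOC_OOM).
 * Simplified; the !MMU and softirq cases are omitted.
 */
static int __gfp_pfmemalloc_flags(gfp_t gfp_mask)
{
        if (unlikely(gfp_mask & __GFP_NOMEMALLOC))
                return 0;
        if (gfp_mask & __GFP_MEMALLOC)
                return ALLOC_NO_WATERMARKS;
        if (!in_interrupt() && (current->flags & PF_MEMALLOC))
                return ALLOC_NO_WATERMARKS;
        /* OOM victims may use the OOM reserves to get out of the way quickly. */
        if (!in_interrupt() && tsk_is_oom_victim(current))
                return ALLOC_OOM;
        return 0;
}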
>
> What I am trying to say is that time-to-check vs. time-to-kill has
> been a race window since basically forever, and a large amount of memory
> can be released during that time. This patch definitely reduces that time
> window _considerably_. There is still a race window left, but this is
> inherently racy, so you could argue that the remaining window is too
> small to lose sleep over. After all, this is a corner case again. From my
> years of experience with OOM reports I haven't met many (if any) cases
> like that. So the primary question is whether we do care about this race
> window enough to even try to fix it. Considering the absolute lack of
> reports I would tend to say we don't, but if the fix can be made
> non-intrusive, which seems likely, then we can at least try it out.
>
> > > I wanted to remove this some time
> > > ago, but it has been pointed out that this was really needed:
> > > https://patchwork.kernel.org/patch/8153841/
> > > Maybe things have changed; if so, please explain.
> >
> > The get_page_from_freelist() call in __alloc_pages_may_oom() will remain
> > needed because it helps allocations which never reach oom_kill_process()
> > to succeed; that is, allocations which do "goto out;" in
> > __alloc_pages_may_oom() without calling out_of_memory(), and allocations
> > which do "return;" in out_of_memory() without calling oom_kill_process()
> > (e.g. !__GFP_FS allocations).
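
In other words, the existing call sits at the top of __alloc_pages_may_oom(),
before any of the bail-out paths. Abridged excerpt for illustration only;
oom_lock handling and several "goto out;" conditions are omitted.

/* Abridged from mm/page_alloc.c::__alloc_pages_may_oom(). */
static inline struct page *
__alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
                      const struct alloc_context *ac,
                      unsigned long *did_some_progress)
{
        struct oom_control oc = {
                .zonelist = ac->zonelist,
                .nodemask = ac->nodemask,
                .gfp_mask = gfp_mask,
                .order = order,
        };
        struct page *page;

        *did_some_progress = 0;

        /*
         * Go through the zonelist once more with a very high watermark.
         * This catches a parallel OOM kill that already freed memory, and
         * it is the last attempt that allocations which later "goto out;"
         * here (or bail out of out_of_memory(), e.g. !__GFP_FS) get before
         * failing or retrying.
         */
        page = get_page_from_freelist(gfp_mask | __GFP_HARDWALL, order,
                                      ALLOC_WMARK_HIGH | ALLOC_CPUSET, ac);
        if (page)
                goto out;

        /* The OOM killer will not help higher-order allocations. */
        if (order > PAGE_ALLOC_COSTLY_ORDER)
                goto out;
        /* ... other "goto out;" conditions omitted ... */

        if (out_of_memory(&oc))
                *did_some_progress = 1;
out:
        return page;
}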
>
> I do not understand. Those requests will simply back off and retry the
> allocation, or bail out and fail the allocation. My primary question was
>
> : that the above link contains an explanation from Andrea that the reason
> : for the high wmark is to reduce the likelihood of livelocks and be sure
> : to invoke the OOM killer,
>
> I am not sure how much that reason still applies to the current code, but
> if it does then we should do the same for the later last-minute
> allocation as well. Having both of them disagree is just a mess.
> --
> Michal Hocko
> SUSE Labs
>