From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@suse.com
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
rientjes@google.com, mjaggi@caviumnetworks.com, mgorman@suse.de,
oleg@redhat.com, vdavydov.dev@gmail.com, vbabka@suse.cz
Subject: Re: [RFC PATCH 2/2] mm,oom: Try last second allocation after selecting an OOM victim.
Date: Thu, 24 Aug 2017 23:40:36 +0900
Message-ID: <201708242340.ICG00066.JtFOFVSMOHOLFQ@I-love.SAKURA.ne.jp>
In-Reply-To: <20170824131836.GN5943@dhcp22.suse.cz>
Michal Hocko wrote:
> On Thu 24-08-17 21:18:26, Tetsuo Handa wrote:
> > Manish Jaggi noticed that running the LTP oom01/oom02 tests on a system
> > with a high core count causes random kernel panics when the OOM killer
> > selects a victim which consumed memory in a way the OOM reaper cannot
> > reclaim [1].
> >
> > Since commit 696453e66630ad45 ("mm, oom: task_will_free_mem should skip
> > oom_reaped tasks") changed task_will_free_mem(current) in out_of_memory()
> > to return false as soon as MMF_OOM_SKIP is set, many threads sharing the
> > victim's mm are no longer able to try allocating from memory reserves
> > after the OOM reaper has given up reclaiming memory.
> >
> > I proposed a patch which allows task_will_free_mem(current) in
> > out_of_memory() to ignore MMF_OOM_SKIP once, so that all OOM victim
> > threads are guaranteed to have tried an ALLOC_OOM allocation attempt
> > before we start selecting the next OOM victim [2], because Michal Hocko
> > did not like calling get_page_from_freelist() from the OOM killer,
> > which is a layer violation [3]. But now Michal thinks that calling
> > get_page_from_freelist() after the task_will_free_mem(current) test is
> > better than allowing task_will_free_mem(current) to ignore MMF_OOM_SKIP
> > once [4], because it would also help other cases where we race with an
> > exiting task or somebody managed to free memory while we were selecting
> > an OOM victim, which can take quite some time.
>
> This is a lot of text which can be more confusing than helpful. Could you
> state the problem clearly, without detours? Yes, the OOM killer selection
> can race with tasks freeing memory. And it has been like that since
> basically ever.
The problem which Manish Jaggi reported (and I can still reproduce) is that
the OOM killer starts ignoring an mm with MMF_OOM_SKIP set too early. And
the problem became real in 4.8 due to commit 696453e66630ad45 ("mm, oom:
task_will_free_mem should skip oom_reaped tasks"). Thus, it has _not_ been
like that since basically ever.
> Doing a last-minute allocation attempt might help. Now
> there are more important questions. How likely is that? Do people have
> to care? __alloc_pages_may_oom already does an almost-the-last-moment
> allocation. Do we still need it?
get_page_from_freelist() in __alloc_pages_may_oom() would help only if some
memory has been reclaimed by the time MMF_OOM_SKIP is set. But the problem
here is that MMF_OOM_SKIP is set without reclaiming any memory.
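
For context, that almost-the-last-moment attempt is the ALLOC_WMARK_HIGH
retry done right after taking oom_lock (a simplified sketch of
__alloc_pages_may_oom() in mm/page_alloc.c, not the verbatim source):

	if (!mutex_trylock(&oom_lock)) {
		*did_some_progress = 1;
		schedule_timeout_uninterruptible(1);
		return NULL;
	}

	/*
	 * Go through the zonelist one more time, keeping a very high
	 * watermark, only to catch a parallel OOM kill.  This succeeds only
	 * if somebody actually freed memory while we were waiting for
	 * oom_lock; it does not help when MMF_OOM_SKIP was set without
	 * reclaiming anything.
	 */
	page = get_page_from_freelist(gfp_mask | __GFP_HARDWALL, order,
				      ALLOC_WMARK_HIGH | ALLOC_CPUSET, ac);
	if (page)
		goto out;
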
> It also does an ALLOC_WMARK_HIGH
> allocation which your path doesn't do.
The intent of this patch is to replace "[PATCH v2] mm, oom:
task_will_free_mem(current) should ignore MMF_OOM_SKIP for once.",
which you nacked 3 days ago.
> I wanted to remove this some time
> ago, but it has been pointed out that this was really needed:
> https://patchwork.kernel.org/patch/8153841/
> Maybe things have changed; if so, please explain.
get_page_from_freelist() in __alloc_pages_may_oom() will remain needed,
because it can help allocations which never reach oom_kill_process() to
succeed: that is, allocations which do "goto out;" in __alloc_pages_may_oom()
without calling out_of_memory(), and allocations which do "return;" in
out_of_memory() without calling oom_kill_process() (e.g. !__GFP_FS).
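
Concretely, these are paths like the following (simplified sketches from
mm/page_alloc.c and mm/oom_kill.c of that era, not the verbatim sources):

	/* __alloc_pages_may_oom(): bail out without calling out_of_memory() */
	if (order > PAGE_ALLOC_COSTLY_ORDER)
		goto out;	/* the OOM killer will not help costly orders */
	if (ac->high_zoneidx < ZONE_NORMAL)
		goto out;	/* do not kill tasks for lowmem allocations */

	/* out_of_memory(): bail out without calling oom_kill_process() */
	if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS))
		return true;	/* IO-less reclaim cannot be compensated */

For such allocations, the ALLOC_WMARK_HIGH retry under oom_lock is the only
chance to benefit from memory freed by a parallel OOM kill.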