From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: linux-mm@kvack.org, rientjes@google.com, oleg@redhat.com,
vdavydov@parallels.com, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 07/10] mm, oom: fortify task_will_free_mem
Date: Fri, 17 Jun 2016 15:29:04 +0200
Message-ID: <20160617132903.GJ21670@dhcp22.suse.cz>
In-Reply-To: <201606172212.FHJ78143.FJSVFLQOOMtFHO@I-love.SAKURA.ne.jp>

On Fri 17-06-16 22:12:22, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Fri 17-06-16 20:38:01, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > > > Anyway, would you be OK with the patch if I added the current->mm check
> > > > > > and resolved its necessity in a separate patch?
> > > > >
> > > > > Please correct task_will_free_mem() in oom_kill_process() as well.
> > > >
> > > > We cannot hold task_lock over all of task_will_free_mem. I am not even sure
> > > > we have to develop an elaborate way to make it raceless just for the nommu
> > > > case. The current case is simple as we cannot race here. Is that
> > > > sufficient for you?
> > >
> > > We can use find_lock_task_mm() inside mark_oom_victim().
> > > That is, call wake_oom_reaper() from mark_oom_victim() like
> > >
> > > void mark_oom_victim(struct task_struct *tsk, bool can_use_oom_reaper)
> > > {
> > > 	WARN_ON(oom_killer_disabled);
> > > 	/* OOM killer might race with memcg OOM */
> > > 	tsk = find_lock_task_mm(tsk);
> > > 	if (!tsk)
> > > 		return;
> > > 	if (test_and_set_tsk_thread_flag(tsk, TIF_MEMDIE)) {
> > > 		task_unlock(tsk);
> > > 		return;
> > > 	}
> > > 	task_unlock(tsk);
> > > 	atomic_inc(&tsk->signal->oom_victims);
> > > 	/*
> > > 	 * Make sure that the task is woken up from uninterruptible sleep
> > > 	 * if it is frozen because OOM killer wouldn't be able to free
> > > 	 * any memory and livelock. freezing_slow_path will tell the freezer
> > > 	 * that TIF_MEMDIE tasks should be ignored.
> > > 	 */
> > > 	__thaw_task(tsk);
> > > 	atomic_inc(&oom_victims);
> > > 	if (can_use_oom_reaper)
> > > 		wake_oom_reaper(tsk);
> > > }
> > >
> > > and move the mark_oom_victim() call in the normal path to after task_unlock(victim).
> > >
> > >  	do_send_sig_info(SIGKILL, SEND_SIG_FORCED, victim, true);
> > > -	mark_oom_victim(victim);
> > >
> > > -	if (can_oom_reap)
> > > -		wake_oom_reaper(victim);
> > > +	wake_oom_reaper(victim, can_oom_reap);
> >
> > I do not like this because then we would have to check the reapability
> > from inside the oom_reaper again.
>
> I didn't understand why you think so. But strictly speaking, can_oom_reap calculation
> in oom_kill_process() is always racy, and [PATCH 10/10] is not safe.
>
> CPU0 (memory allocating task)     CPU1 (kthread)                    CPU2 (OOM victim)
>
>                                   Calls use_mm(victim->mm).
>                                   Starts some worker.
> Enters out_of_memory().
> Enters oom_kill_process().
>                                   Finishes some worker.
> Calls rcu_read_lock().
> Sets can_oom_reap = false due to process_shares_mm() && !same_thread_group() && (p->flags & PF_KTHREAD).
>                                   Calls unuse_mm(victim->mm).
> Continues scanning other processes.
>                                   Calls mmput(victim->mm).
> Sends SIGKILL to victim.
> Calls rcu_read_unlock().
> Leaves oom_kill_process().
>                                                                     Calls do_exit().
> Leaves out_of_memory().
>                                                                     Sets victim->mm = NULL from exit_mm().
>                                                                     Calls mmput() from exit_mm().
>                                                                     __mmput() is called because victim was the last user.
> Enters out_of_memory().
> oom_scan_process_thread() returns OOM_SCAN_ABORT.
> Leaves out_of_memory().
>                                                                     __mmput() stalls but the oom_reaper is not called.
>
> For correctness, can_oom_reap needs to be calculated inside the oom_reaper.

Why would it be any less racy than the above? It doesn't employ any
serialization with use_mm users, nor does it serialize with the exit path.
The timing would be different, but not in a way that would let us talk
about correctness.

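To make the point concrete, here is a minimal user-space sketch, not
kernel code. struct model_mm and all model_* helpers are hypothetical
names invented for this example, and the model assumes only that nothing
orders the kthread against the check. Under that assumption the decision
goes stale whether the snapshot is taken while the kthread still borrows
the mm or after it has dropped it.

```c
#include <stdbool.h>

/* Hypothetical model of an address space; not the kernel's mm_struct. */
struct model_mm {
	int  mm_users;         /* references held on the address space */
	bool used_by_kthread;  /* a kthread currently borrows it (use_mm analogue) */
};

/* kthread starts borrowing the mm (use_mm analogue). */
void model_kthread_borrow(struct model_mm *mm)
{
	mm->mm_users++;
	mm->used_by_kthread = true;
}

/* kthread stops borrowing the mm (unuse_mm + mmput analogue). */
void model_kthread_drop(struct model_mm *mm)
{
	mm->used_by_kthread = false;
	mm->mm_users--;
}

/* Unserialized snapshot: may be stale as soon as it is returned. */
bool model_can_reap(const struct model_mm *mm)
{
	return !mm->used_by_kthread;
}

/*
 * Replays one interleaving: take the snapshot, let the kthread change
 * state, then compare the decision with the state at the time the
 * decision would be acted upon. Returns true if the decision is stale.
 */
bool model_decision_is_stale(bool kthread_active_at_check)
{
	struct model_mm mm = { .mm_users = 1, .used_by_kthread = false };
	bool decision;

	if (kthread_active_at_check)
		model_kthread_borrow(&mm);

	decision = model_can_reap(&mm);

	/* Nothing serializes the kthread with the check, so it can flip now. */
	if (kthread_active_at_check)
		model_kthread_drop(&mm);
	else
		model_kthread_borrow(&mm);

	return decision != model_can_reap(&mm);
}
```

Calling model_decision_is_stale() with either true or false reports a
stale decision in this model, which is the sense in which moving the
check only shifts the timing.
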
> > But let me ask again. Does this really matter so much just because of
> > nommu where we can fall in different traps? Can we simply focus on mmu
> > (aka vast majority of cases) make it work reliably and see what we can
> > do with nommu later?
>
> To me, timeout based one is sufficient for handling any traps that hit
> nommu kernels after the OOM killer is invoked.
>
> Anyway, I don't like this series because this series ignores
> theoretical cases.

I am pretty sure you would end up in the land of surprises and new
classes of races with any timeout-based solution as well. But we've
been through that discussion already.

> I can't make progress as long as you repeat "does it really matter/occur".
> Please go ahead without Reviewed-by: or Acked-by: from me.

Fair enough. I appreciate your review, which has caught many real bugs
and subtle issues!

I will repost the series on Monday.
--
Michal Hocko
SUSE Labs