From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@suse.cz, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, rientjes@google.com, oleg@redhat.com
Subject: Re: [RFC PATCH] oom: Don't count on mm-less current process.
Date: Sat, 20 Dec 2014 20:42:08 +0900 [thread overview]
Message-ID: <201412202042.ECJ64551.FHOOJOQLFFtVMS@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <201412201813.JJF95860.VSLOQOFHFJOFtM@I-love.SAKURA.ne.jp>
Tetsuo Handa wrote:
> By the way, Michal, I think there is still an unlikely race window at
> set_tsk_thread_flag(p, TIF_MEMDIE) in oom_kill_process(). For example,
> task1 calls out_of_memory() and select_bad_process() is called from
> out_of_memory(). oom_scan_process_thread(task2) is called from
> select_bad_process(). oom_scan_process_thread() returns OOM_SCAN_OK
> because task2->mm != NULL and task_will_free_mem(task2) == false.
> select_bad_process() calls get_task_struct(task2) and returns task2.
> Task1 goes to sleep and task2 is woken up. Task2 enters into do_exit()
> and gets PF_EXITING at exit_signals() and releases mm at exit_mm().
> Task2 goes to sleep and task1 is woken up. Task1 calls
> oom_kill_process(task2). oom_kill_process() sets TIF_MEMDIE on task2
> because task_will_free_mem(task2) == true due to PF_EXITING already set...
> Should we do like
>
> if (task_will_free_mem(p)) {
> if (p->mm)
> set_tsk_thread_flag(p, TIF_MEMDIE);
> put_task_struct(p);
> return;
> }
>
> at oom_kill_process() ? Or even if we do so, how to check if task1 went
> to sleep between task2->mm and set_tsk_thread_flag(task2, TIF_MEMDIE) ?
> This race window is very very unlikely because releasing task2->mm is
> expected to release some memory. But if somebody else consumed memory
> released by exit_mm(task2), I think there is nothing to protect.
Well, this could happen if task2 is one of threads in a multi-threaded
process like Java where exit_mm(task2) decrements refcount than releases
memory. Below is a patch. Michal, please check.
----------------------------------------
>From a2ebb5b873ec5af45e0bea9ea6da2a93c0f06c35 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Sat, 20 Dec 2014 20:05:14 +0900
Subject: [PATCH] oom: Close race of setting TIF_MEMDIE to mm-less process.
exit_mm() and oom_kill_process() could race with regard to handling of
TIF_MEMDIE flag if sequence described below occurred.
P1 calls out_of_memory(). out_of_memory() calls select_bad_process().
select_bad_process() calls oom_scan_process_thread(P2). If P2->mm != NULL
and task_will_free_mem(P2) == false, oom_scan_process_thread(P2) returns
OOM_SCAN_OK. And if P2 is chosen as a victim task, select_bad_process()
returns P2 after calling get_task_struct(P2). Then, P1 goes to sleep and
P2 is woken up. P2 enters into do_exit() and gets PF_EXITING at exit_signals()
and releases mm at exit_mm(). Then, P2 goes to sleep and P1 is woken up.
P1 calls oom_kill_process(P2). oom_kill_process() sets TIF_MEMDIE on P2
because task_will_free_mem(P2) == true due to PF_EXITING already set.
Afterward, oom_scan_process_thread(P2) will return OOM_SCAN_ABORT because
test_tsk_thread_flag(P2, TIF_MEMDIE) is checked before P2->mm is checked.
If TIF_MEMDIE was again set to P2, the OOM killer will be blocked by P2
sitting in the final schedule() waiting for P2's parent to reap P2.
It will trigger an OOM livelock if P2's parent is unable to reap P2 due to
doing an allocation and waiting for the OOM killer to kill P2.
To close this race window, clear TIF_MEMDIE if P2->mm == NULL after
set_tsk_thread_flag(P2, TIF_MEMDIE) is done.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
kernel/exit.c | 1 +
mm/oom_kill.c | 3 +++
2 files changed, 4 insertions(+)
diff --git a/kernel/exit.c b/kernel/exit.c
index 1ea4369..46d72e6 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -435,6 +435,7 @@ static void exit_mm(struct task_struct *tsk)
task_unlock(tsk);
mm_update_next_owner(mm);
mmput(mm);
+ smp_wmb(); /* Avoid race with oom_kill_process(). */
clear_thread_flag(TIF_MEMDIE);
}
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f82dd13..c8ae445 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -440,6 +440,9 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
*/
if (task_will_free_mem(p)) {
set_tsk_thread_flag(p, TIF_MEMDIE);
+ smp_rmb(); /* Avoid race with exit_mm(). */
+ if (unlikely(!p->mm))
+ clear_tsk_thread_flag(p, TIF_MEMDIE);
put_task_struct(p);
return;
}
--
1.8.3.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-12-20 12:22 UTC|newest]
Thread overview: 177+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-12 13:54 Tetsuo Handa
2014-12-16 12:47 ` Michal Hocko
2014-12-17 11:54 ` Tetsuo Handa
2014-12-17 13:08 ` Michal Hocko
2014-12-18 12:11 ` Tetsuo Handa
2014-12-18 15:33 ` Michal Hocko
2014-12-19 12:07 ` Tetsuo Handa
2014-12-19 12:49 ` Michal Hocko
2014-12-20 9:13 ` Tetsuo Handa
2014-12-20 11:42 ` Tetsuo Handa [this message]
2014-12-22 20:25 ` Michal Hocko
2014-12-23 1:00 ` Tetsuo Handa
2014-12-23 9:51 ` Michal Hocko
2014-12-23 11:46 ` Tetsuo Handa
2014-12-23 11:57 ` Tetsuo Handa
2014-12-23 12:12 ` Tetsuo Handa
2014-12-23 12:27 ` Michal Hocko
2014-12-23 12:24 ` Michal Hocko
2014-12-23 13:00 ` Tetsuo Handa
2014-12-23 13:09 ` Michal Hocko
2014-12-23 13:20 ` Tetsuo Handa
2014-12-23 13:43 ` Michal Hocko
2014-12-23 14:11 ` Tetsuo Handa
2014-12-23 14:57 ` Michal Hocko
2014-12-19 12:22 ` How to handle TIF_MEMDIE stalls? Tetsuo Handa
2014-12-20 2:03 ` Dave Chinner
2014-12-20 12:41 ` Tetsuo Handa
2014-12-20 22:35 ` Dave Chinner
2014-12-21 8:45 ` Tetsuo Handa
2014-12-21 20:42 ` Dave Chinner
2014-12-22 16:57 ` Michal Hocko
2014-12-22 21:30 ` Dave Chinner
2014-12-23 9:41 ` Johannes Weiner
2014-12-24 1:06 ` Dave Chinner
2014-12-24 2:40 ` Linus Torvalds
2014-12-29 18:19 ` Michal Hocko
2014-12-30 6:42 ` Tetsuo Handa
2014-12-30 11:21 ` Michal Hocko
2014-12-30 13:33 ` Tetsuo Handa
2014-12-31 10:24 ` Tetsuo Handa
2015-02-09 11:44 ` Tetsuo Handa
2015-02-10 13:58 ` Tetsuo Handa
2015-02-10 15:19 ` Johannes Weiner
2015-02-11 2:23 ` Tetsuo Handa
2015-02-11 13:37 ` Tetsuo Handa
2015-02-11 18:50 ` Oleg Nesterov
2015-02-11 18:59 ` Oleg Nesterov
2015-03-14 13:03 ` Tetsuo Handa
2015-02-17 12:23 ` Tetsuo Handa
2015-02-17 12:53 ` Johannes Weiner
2015-02-17 15:38 ` Michal Hocko
2015-02-17 22:54 ` Dave Chinner
2015-02-17 23:32 ` Dave Chinner
2015-02-18 8:25 ` Michal Hocko
2015-02-18 10:48 ` Dave Chinner
2015-02-18 12:16 ` Michal Hocko
2015-02-18 21:31 ` Dave Chinner
2015-02-19 9:40 ` Michal Hocko
2015-02-19 22:03 ` Dave Chinner
2015-02-20 9:27 ` Michal Hocko
2015-02-19 11:01 ` Johannes Weiner
2015-02-19 12:29 ` Michal Hocko
2015-02-19 12:58 ` Michal Hocko
2015-02-19 15:29 ` Tetsuo Handa
2015-02-19 21:53 ` Tetsuo Handa
2015-02-20 9:13 ` Michal Hocko
2015-02-20 13:37 ` Stefan Ring
2015-02-19 13:29 ` Tetsuo Handa
2015-02-20 9:10 ` Michal Hocko
2015-02-20 12:20 ` Tetsuo Handa
2015-02-20 12:38 ` Michal Hocko
2015-02-19 21:43 ` Dave Chinner
2015-02-20 12:48 ` Michal Hocko
2015-02-20 23:09 ` Dave Chinner
2015-02-19 10:24 ` Johannes Weiner
2015-02-19 22:52 ` Dave Chinner
2015-02-20 10:36 ` Tetsuo Handa
2015-02-20 23:15 ` Dave Chinner
2015-02-21 3:20 ` Theodore Ts'o
2015-02-21 9:19 ` Andrew Morton
2015-02-21 13:48 ` Tetsuo Handa
2015-02-21 21:38 ` Dave Chinner
2015-02-22 0:20 ` Johannes Weiner
2015-02-23 10:48 ` Michal Hocko
2015-02-23 11:23 ` Tetsuo Handa
2015-02-23 21:33 ` David Rientjes
2015-02-22 14:48 ` __GFP_NOFAIL and oom_killer_disabled? Tetsuo Handa
2015-02-23 10:21 ` Michal Hocko
2015-02-23 13:03 ` Tetsuo Handa
2015-02-24 18:14 ` Michal Hocko
2015-02-25 11:22 ` Tetsuo Handa
2015-02-25 16:02 ` Michal Hocko
2015-02-25 21:48 ` Tetsuo Handa
2015-02-25 21:51 ` Andrew Morton
2015-02-21 12:00 ` How to handle TIF_MEMDIE stalls? Tetsuo Handa
2015-02-23 10:26 ` Michal Hocko
2015-02-21 11:12 ` Tetsuo Handa
2015-02-21 21:48 ` Dave Chinner
2015-02-21 23:52 ` Johannes Weiner
2015-02-23 0:45 ` Dave Chinner
2015-02-23 1:29 ` Andrew Morton
2015-02-23 7:32 ` Dave Chinner
2015-02-27 18:24 ` Vlastimil Babka
2015-02-28 0:03 ` Dave Chinner
2015-02-28 15:17 ` Theodore Ts'o
2015-03-02 9:39 ` Vlastimil Babka
2015-03-02 22:31 ` Dave Chinner
2015-03-03 9:13 ` Vlastimil Babka
2015-03-04 1:33 ` Dave Chinner
2015-03-04 8:50 ` Vlastimil Babka
2015-03-04 11:03 ` Dave Chinner
2015-03-07 0:20 ` Johannes Weiner
2015-03-07 3:43 ` Dave Chinner
2015-03-07 15:08 ` Johannes Weiner
2015-03-02 20:22 ` Johannes Weiner
2015-03-02 23:12 ` Dave Chinner
2015-03-03 2:50 ` Johannes Weiner
2015-03-04 6:52 ` Dave Chinner
2015-03-04 15:04 ` Johannes Weiner
2015-03-04 17:38 ` Theodore Ts'o
2015-03-04 23:17 ` Dave Chinner
2015-02-28 16:29 ` Johannes Weiner
2015-02-28 16:41 ` Theodore Ts'o
2015-02-28 22:15 ` Johannes Weiner
2015-03-01 11:17 ` Tetsuo Handa
2015-03-06 11:53 ` Tetsuo Handa
2015-03-01 13:43 ` Theodore Ts'o
2015-03-01 16:15 ` Johannes Weiner
2015-03-01 19:36 ` Theodore Ts'o
2015-03-01 20:44 ` Johannes Weiner
2015-03-01 20:17 ` Johannes Weiner
2015-03-01 21:48 ` Dave Chinner
2015-03-02 0:17 ` Dave Chinner
2015-03-02 12:46 ` Brian Foster
2015-02-28 18:36 ` Vlastimil Babka
2015-03-02 15:18 ` Michal Hocko
2015-03-02 16:05 ` Johannes Weiner
2015-03-02 17:10 ` Michal Hocko
2015-03-02 17:27 ` Johannes Weiner
2015-03-02 16:39 ` Theodore Ts'o
2015-03-02 16:58 ` Michal Hocko
2015-03-04 12:52 ` Dave Chinner
2015-02-17 14:59 ` Michal Hocko
2015-02-17 14:50 ` Michal Hocko
2015-02-17 14:37 ` Michal Hocko
2015-02-17 14:44 ` Michal Hocko
2015-02-16 11:23 ` Tetsuo Handa
2015-02-16 15:42 ` Johannes Weiner
2015-02-17 11:57 ` Tetsuo Handa
2015-02-17 13:16 ` Johannes Weiner
2015-02-17 16:50 ` Michal Hocko
2015-02-17 23:25 ` Dave Chinner
2015-02-18 8:48 ` Michal Hocko
2015-02-18 11:23 ` Tetsuo Handa
2015-02-18 12:29 ` Michal Hocko
2015-02-18 14:06 ` Tetsuo Handa
2015-02-18 14:25 ` Michal Hocko
2015-02-19 10:48 ` Tetsuo Handa
2015-02-20 8:26 ` Michal Hocko
2015-02-23 22:08 ` David Rientjes
2015-02-24 11:20 ` Tetsuo Handa
2015-02-24 15:20 ` Theodore Ts'o
2015-02-24 21:02 ` Dave Chinner
2015-02-25 14:31 ` Tetsuo Handa
2015-02-27 7:39 ` Dave Chinner
2015-02-27 12:42 ` Tetsuo Handa
2015-02-27 13:12 ` Dave Chinner
2015-03-04 12:41 ` Tetsuo Handa
2015-03-04 13:25 ` Dave Chinner
2015-03-04 14:11 ` Tetsuo Handa
2015-03-05 1:36 ` Dave Chinner
2015-02-17 16:33 ` Michal Hocko
2014-12-29 17:40 ` [PATCH] mm: get rid of radix tree gfp mask for pagecache_get_page (was: Re: How to handle TIF_MEMDIE stalls?) Michal Hocko
2014-12-29 18:45 ` Linus Torvalds
2014-12-29 19:33 ` Michal Hocko
2014-12-30 13:42 ` Michal Hocko
2014-12-30 21:45 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201412202042.ECJ64551.FHOOJOQLFFtVMS@I-love.SAKURA.ne.jp \
--to=penguin-kernel@i-love.sakura.ne.jp \
--cc=akpm@linux-foundation.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.cz \
--cc=oleg@redhat.com \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox