Re: Possible race condition in oom-killer

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@kernel.org
Cc: mjaggi@caviumnetworks.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: Possible race condition in oom-killer
Date: Sat, 29 Jul 2017 13:31:44 +0900	[thread overview]
Message-ID: <201707291331.JGI18780.OtJVLFMHFOFSOQ@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20170728140706.GT2274@dhcp22.suse.cz>

Michal Hocko wrote:
> On Fri 28-07-17 22:55:51, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > On Fri 28-07-17 22:15:01, Tetsuo Handa wrote:
> > > > task_will_free_mem(current) in out_of_memory() returning false due to
> > > > MMF_OOM_SKIP already set allowed each thread sharing that mm to select a new
> > > > OOM victim. If task_will_free_mem(current) in out_of_memory() did not return
> > > > false, threads sharing MMF_OOM_SKIP mm would not have selected new victims
> > > > to the level where all OOM killable processes are killed and calls panic().
> > > 
> > > I am not sure I understand. Do you mean this?
> > 
> > Yes.
> > 
> > > ---
> > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > > index 9e8b4f030c1c..671e4a4107d0 100644
> > > --- a/mm/oom_kill.c
> > > +++ b/mm/oom_kill.c
> > > @@ -779,13 +779,6 @@ static bool task_will_free_mem(struct task_struct *task)
> > >  	if (!__task_will_free_mem(task))
> > >  		return false;
> > >  
> > > -	/*
> > > -	 * This task has already been drained by the oom reaper so there are
> > > -	 * only small chances it will free some more
> > > -	 */
> > > -	if (test_bit(MMF_OOM_SKIP, &mm->flags))
> > > -		return false;
> > > -
> > >  	if (atomic_read(&mm->mm_users) <= 1)
> > >  		return true;
> > >  
> > > If yes I would have to think about this some more because that might
> > > have weird side effects (e.g. oom_victims counting after threads passed
> > > exit_oom_victim).
> > 
> > But this check should not be removed unconditionally. We should still return
> > false if returning true was not sufficient to solve the OOM situation, for
> > we need to select next OOM victim in that case.
> > 

I think that below one can manage this race condition.

---
 include/linux/sched.h |  1 +
 mm/oom_kill.c         | 21 ++++++++++++++-------
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0db4870..3fccf72 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -652,6 +652,7 @@ struct task_struct {
 	/* disallow userland-initiated cgroup migration */
 	unsigned			no_cgroup_migration:1;
 #endif
+	unsigned			oom_kill_free_check_raced:1;
 
 	unsigned long			atomic_flags; /* Flags requiring atomic access. */
 
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 9e8b4f0..a093193 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -779,13 +779,6 @@ static bool task_will_free_mem(struct task_struct *task)
 	if (!__task_will_free_mem(task))
 		return false;
 
-	/*
-	 * This task has already been drained by the oom reaper so there are
-	 * only small chances it will free some more
-	 */
-	if (test_bit(MMF_OOM_SKIP, &mm->flags))
-		return false;
-
 	if (atomic_read(&mm->mm_users) <= 1)
 		return true;
 
@@ -806,6 +799,20 @@ static bool task_will_free_mem(struct task_struct *task)
 	}
 	rcu_read_unlock();
 
+	/*
+	 * It is possible that current thread fails to try allocation from
+	 * memory reserves if the OOM reaper set MMF_OOM_SKIP on this mm before
+	 * current thread calls out_of_memory() in order to get TIF_MEMDIE.
+	 * In that case, allow current thread to try TIF_MEMDIE allocation
+	 * before start selecting next OOM victims.
+	 */
+	if (ret && test_bit(MMF_OOM_SKIP, &mm->flags)) {
+		if (task == current && !task->oom_kill_free_check_raced)
+			task->oom_kill_free_check_raced = true;
+		else
+			ret = false;
+	}
+
 	return ret;
 }
 
-- 
1.8.3.1

What is "oom_victims counting after threads passed exit_oom_victim" ?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2017-07-29  4:31 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <e6c83a26-1d59-4afd-55cf-04e58bdde188@caviumnetworks.com>
2017-07-28 12:32 ` Michal Hocko
2017-07-28 12:59   ` Tetsuo Handa
2017-07-28 13:07     ` Michal Hocko
2017-07-28 13:15       ` Tetsuo Handa
2017-07-28 13:29         ` Michal Hocko
2017-07-28 13:55           ` Tetsuo Handa
2017-07-28 14:07             ` Michal Hocko
2017-07-29  4:31               ` Tetsuo Handa [this message]
2017-08-01 12:14                 ` Michal Hocko
2017-08-01 14:16                   ` Tetsuo Handa
2017-08-01 14:47                     ` Michal Hocko
2017-08-01 10:46               ` Tetsuo Handa
2017-08-01 11:30                 ` Michal Hocko
2017-07-28 13:15     ` Manish Jaggi
2017-07-28 13:50   ` Manish Jaggi
2017-07-28 14:12     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201707291331.JGI18780.OtJVLFMHFOFSOQ@I-love.SAKURA.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mjaggi@caviumnetworks.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox