linux-mm.kvack.org archive mirror
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@kernel.org
Cc: rientjes@google.com, akpm@linux-foundation.org,
	aarcange@redhat.com, guro@fb.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [patch v2] mm, oom: fix concurrent munlock and oom reaper unmap
Date: Thu, 19 Apr 2018 20:51:45 +0900
Message-ID: <201804192051.JDE35992.OLFOQFMOtJHFSV@I-love.SAKURA.ne.jp>
In-Reply-To: <20180419110419.GQ17484@dhcp22.suse.cz>

Michal Hocko wrote:
> > We need to teach the OOM reaper to stop reaping as soon as exit_mmap() is
> > entered. Maybe let the OOM reaper poll for progress (e.g. none of
> > get_mm_counter(mm, *) decreased for the last 1 second)?
> 
> Can we start simple and build more elaborate heuristics on top _please_?
> In other words holding the mmap_sem for write for oom victims in
> exit_mmap should handle the problem. We can then enhance this to probe
> for progress or any other clever tricks if we find out that the race
> happens too often and we kill more than necessary.
> 
> Let's not repeat the error of trying to be too clever from the beginning
> as we did previously. This area is just too subtle and obviously error
> prone.
> 
Something like this?

---
 mm/mmap.c     | 41 +++++++++++++++++++++++------------------
 mm/oom_kill.c | 29 +++++++++++------------------
 2 files changed, 34 insertions(+), 36 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 188f195..3edb7da 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3015,6 +3015,28 @@ void exit_mmap(struct mm_struct *mm)
 	/* mm's last user has gone, and its about to be pulled down */
 	mmu_notifier_release(mm);
 
+	if (unlikely(mm_is_oom_victim(mm))) {
+		/*
+		 * Tell oom_reap_task() not to start reaping this mm.
+		 *
+		 * oom_reap_task() depends on a stable VM_LOCKED flag to
+		 * indicate it should not unmap during munlock_vma_pages_all().
+		 *
+		 * Since MMF_UNSTABLE is set before calling down_write(),
+		 * oom_reap_task() which calls down_read() before testing
+		 * MMF_UNSTABLE will not run on this mm after up_write().
+		 *
+		 * mm_is_oom_victim() cannot be set from under us because
+		 * victim->mm is already set to NULL under task_lock before
+		 * calling mmput() and victim->signal->oom_mm is set by the oom
+		 * killer only if victim->mm is non-NULL while holding
+		 * task_lock().
+		 */
+		set_bit(MMF_UNSTABLE, &mm->flags);
+		down_write(&mm->mmap_sem);
+		up_write(&mm->mmap_sem);
+	}
+
 	if (mm->locked_vm) {
 		vma = mm->mmap;
 		while (vma) {
@@ -3036,26 +3058,9 @@ void exit_mmap(struct mm_struct *mm)
 	/* update_hiwater_rss(mm) here? but nobody should be looking */
 	/* Use -1 here to ensure all VMAs in the mm are unmapped */
 	unmap_vmas(&tlb, vma, 0, -1);
-
-	if (unlikely(mm_is_oom_victim(mm))) {
-		/*
-		 * Wait for oom_reap_task() to stop working on this
-		 * mm. Because MMF_OOM_SKIP is already set before
-		 * calling down_read(), oom_reap_task() will not run
-		 * on this "mm" post up_write().
-		 *
-		 * mm_is_oom_victim() cannot be set from under us
-		 * either because victim->mm is already set to NULL
-		 * under task_lock before calling mmput and oom_mm is
-		 * set not NULL by the OOM killer only if victim->mm
-		 * is found not NULL while holding the task_lock.
-		 */
-		set_bit(MMF_OOM_SKIP, &mm->flags);
-		down_write(&mm->mmap_sem);
-		up_write(&mm->mmap_sem);
-	}
 	free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
 	tlb_finish_mmu(&tlb, 0, -1);
+	set_bit(MMF_OOM_SKIP, &mm->flags);
 
 	/*
 	 * Walk the list again, actually closing and freeing it,
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index ff992fa..1fef1b6 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -510,25 +510,16 @@ static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
 
 	/*
 	 * If the mm has invalidate_{start,end}() notifiers that could block,
+	 * or if the mm is in exit_mmap() which has unpredictable dependencies,
 	 * sleep to give the oom victim some more time.
 	 * TODO: we really want to get rid of this ugly hack and make sure that
 	 * notifiers cannot block for unbounded amount of time
 	 */
-	if (mm_has_blockable_invalidate_notifiers(mm)) {
-		up_read(&mm->mmap_sem);
-		schedule_timeout_idle(HZ);
-		goto unlock_oom;
-	}
-
-	/*
-	 * MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
-	 * work on the mm anymore. The check for MMF_OOM_SKIP must run
-	 * under mmap_sem for reading because it serializes against the
-	 * down_write();up_write() cycle in exit_mmap().
-	 */
-	if (test_bit(MMF_OOM_SKIP, &mm->flags)) {
+	if (mm_has_blockable_invalidate_notifiers(mm) ||
+	    test_bit(MMF_UNSTABLE, &mm->flags)) {
 		up_read(&mm->mmap_sem);
 		trace_skip_task_reaping(tsk->pid);
+		schedule_timeout_idle(HZ);
 		goto unlock_oom;
 	}
 
@@ -590,11 +581,9 @@ static void oom_reap_task(struct task_struct *tsk)
 	while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm))
 		schedule_timeout_idle(HZ/10);
 
-	if (attempts <= MAX_OOM_REAP_RETRIES ||
-	    test_bit(MMF_OOM_SKIP, &mm->flags))
+	if (test_bit(MMF_UNSTABLE, &mm->flags))
 		goto done;
 
-
 	pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
 		task_pid_nr(tsk), tsk->comm);
 	debug_show_all_locks();
@@ -603,8 +592,12 @@ static void oom_reap_task(struct task_struct *tsk)
 	tsk->oom_reaper_list = NULL;
 
 	/*
-	 * Hide this mm from OOM killer because it has been either reaped or
-	 * somebody can't call up_write(mmap_sem).
+	 * Hide this mm from the OOM killer because:
+	 *   the OOM reaper completed reaping
+	 * or
+	 *   exit_mmap() told the OOM reaper not to start reaping
+	 * or
+	 *   neither exit_mmap() nor the OOM reaper started reaping
 	 */
 	set_bit(MMF_OOM_SKIP, &mm->flags);
 
-- 
1.8.3.1
