linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: hannes@cmpxchg.org, akpm@linux-foundation.org
Cc: mhocko@kernel.org, linux-mm@kvack.org, rientjes@google.com,
	linux-kernel@vger.kernel.org, mhocko@suse.com
Subject: Re: [PATCH 2/2]oom-clear-tif_memdie-after-oom_reaper-managed-to-unmap-the-address-space-fix
Date: Thu, 10 Mar 2016 20:17:33 +0900	[thread overview]
Message-ID: <201603102017.ECB12953.HLFJQFVOtMFSOO@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20160310004500.GA7374@cmpxchg.org>

Johannes Weiner wrote:
> On Wed, Mar 09, 2016 at 03:08:53PM -0800, Andrew Morton wrote:
> > On Wed, 9 Mar 2016 17:48:29 -0500 Johannes Weiner <hannes@cmpxchg.org> wrote:
> > 
> > > However, I disagree with your changelog.
> > 
> > What text would you prefer?
> 
> I'd just keep the one you had initially. Or better, this modified
> version:
> 
> When the OOM killer scans tasks and encounters a PF_EXITING one, it
> force-selects that task regardless of the score. The problem is that
> if that task got stuck waiting for some state the allocation site is
> holding, the OOM reaper can not move on to the next best victim.
> 

There is no guarantee that the OOM reaper is waken up.
There are shortcuts which I don't like.

> Frankly, I don't even know why we check for exiting tasks in the OOM
> killer. We've tried direct reclaim at least 15 times by the time we
> decide the system is OOM, there was plenty of time to exit and free
> memory; and a task might exit voluntarily right after we issue a kill.
> This is testing pure noise. Remove it.
> 

My concern is what an optimistic idea it is to wait for task_will_free_mem() or
TIF_MEMDIE task forever blindly
( http://lkml.kernel.org/r/201602232224.FEJ69269.LMVJOFFOQSHtFO@I-love.SAKURA.ne.jp ).
We have

  do_exit() {
    exit_signals(); /* sets PF_EXITING */
    /* (1) start */
    exit_mm() {
      mm_release() {
        exit_robust_list() {
          get_user() {
            __do_page_fault() {
              /* (1) end */
              down_read(&current->mm->mmap_sem);
              handle_mm_fault() {
                kmalloc(GFP_KERNEL) {
                  out_of_memory() {
                    if (current->mm &&
                        (fatal_signal_pending(current) || task_will_free_mem(current))) {
                      mark_oom_victim(current); /* sets TIF_MEMDIE */
                      return true;
                    }
                  }
                }
              }
              up_read(&current->mm->mmap_sem);
              /* (2) start */
            }
          }
        }
      }
      /* (2) end */
      down_read(&current->mm->mmap_sem);
      up_read(&current->mm->mmap_sem);
      current->mm = NULL;
      exit_oom_victim();
    }
  }

sequence. We will hit silent OOM livelock if somebody sharing the mm does
down_write_killable(&current->mm->mmap_sem) and kmalloc(GFP_KERNEL) for mmap() etc. at (1) or (2)
due to failing to send SIGKILL to somebody doing/done down_write_killable(&current->mm->mmap_sem)
and returning OOM_SCAN_ABORT without testing whether down_read(&victim->mm->mmap_sem) will succeed.
Since the OOM reaper is not invoked when shortcut is used, nobody can unlock.

Doing

-	if (task_will_free_mem(task) && !is_sysrq_oom(oc))
+	if (task_will_free_mem(task) && !is_sysrq_oom(oc) && can_lock_mm_for_read(task))
		return OOM_SCAN_ABORT;

and

	if (test_tsk_thread_flag(task, TIF_MEMDIE)) {
-		if (!is_sysrq_oom(oc))
+		if (!is_sysrq_oom(oc) && can_lock_mm_for_read(task))
			return OOM_SCAN_ABORT;
	}

is a too fast decision because can_lock_mm_for_read(task) might become true
if if we waited for a moment. Doing

-	if (task_will_free_mem(task) && !is_sysrq_oom(oc))
+	if (task_will_free_mem(task) && !is_sysrq_oom(oc) && we_havent_waited_enough_period(task))
		return OOM_SCAN_ABORT;

and

	if (test_tsk_thread_flag(task, TIF_MEMDIE)) {
-		if (!is_sysrq_oom(oc))
+		if (!is_sysrq_oom(oc) && we_havent_waited_enough_period(task))
			return OOM_SCAN_ABORT;
	}

is a timeout based unlocking which Michal does not like. Doing

-	if (task_will_free_mem(task) && !is_sysrq_oom(oc))
+	if (task_will_free_mem(task) && !is_sysrq_oom(oc) && should_oom_scan_abort(task))
		return OOM_SCAN_ABORT;

and

	if (test_tsk_thread_flag(task, TIF_MEMDIE)) {
-		if (!is_sysrq_oom(oc))
+		if (!is_sysrq_oom(oc) && should_oom_scan_abort(task))
			return OOM_SCAN_ABORT;
	}

is a counter based unlocking which I don't know what Michal thinks.

This situation is similar to when to declare OOM in OOM detection rework.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-03-10 11:17 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-08 13:12 [PATCH -mm 0/2] oom_reaper: missing parts Michal Hocko
2016-03-08 13:12 ` [PATCH 1/2] mm-oom_reaper-report-success-failure-fix-fix Michal Hocko
2016-03-08 13:12 ` [PATCH 2/2] oom-clear-tif_memdie-after-oom_reaper-managed-to-unmap-the-address-space-fix Michal Hocko
2016-03-09 21:21   ` Andrew Morton
2016-03-09 22:21     ` Tetsuo Handa
2016-03-09 22:48       ` Johannes Weiner
2016-03-09 23:08         ` Andrew Morton
2016-03-10  0:45           ` Johannes Weiner
2016-03-10 11:17             ` Tetsuo Handa [this message]
2016-03-09 22:30     ` Johannes Weiner
2016-03-08 13:18 ` [PATCH -mm 0/2] oom_reaper: missing parts Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201603102017.ECB12953.HLFJQFVOtMFSOO@I-love.SAKURA.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mhocko@suse.com \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox