From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@kernel.org
Cc: akpm@linux-foundation.org, rientjes@google.com, linux-mm@kvack.org
Subject: Re: [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context
Date: Wed, 25 May 2016 23:30:26 +0900	[thread overview]
Message-ID: <201605252330.IAC82384.OOSQHVtFFFLOMJ@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20160525135002.GI20132@dhcp22.suse.cz>

Michal Hocko wrote:
> On Wed 25-05-16 19:52:18, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > > Just a random thought, but after this patch is applied, do we still need
> > > > a dedicated kernel thread for the OOM-reap operation? If I recall correctly,
> > > > the reason we decided to use a dedicated kernel thread was that calling
> > > > down_read(&mm->mmap_sem) / mmput() from the OOM killer context is unsafe
> > > > due to locking dependencies. Now that mmput() is replaced with mmput_async()
> > > > and __oom_reap_task() no longer does operations that might block, can't we
> > > > try the OOM-reap operation from the current thread which called
> > > > mark_oom_victim() or oom_scan_process_thread()?
> > > 
> > > I was already thinking about that. It is true that the main blocker
> > > was the mmput, as you say, but the dedicated kernel thread seems to be
> > > more robust locking- and stack-wise. So I would prefer staying with the
> > > current approach until we see that it is somehow limiting. One pid and
> > > kernel stack doesn't seem to be a terrible price to me. But as I've said
> > > I am not bound to the kernel thread approach...
> > > 
> > 
> > It seems to me that async OOM reaping widens the race window for needlessly
> > selecting the next OOM victim, because the OOM reaper, by holding a reference
> > to a TIF_MEMDIE thread's mm, expedites clearing TIF_MEMDIE from that thread:
> > it makes atomic_dec_and_test() in mmput() called from exit_mm() return false.
>  
> AFAIU you mean
> __oom_reap_task			exit_mm
>   atomic_inc_not_zero
> 				  tsk->mm = NULL
> 				  mmput
>   				    atomic_dec_and_test # > 0
> 				  exit_oom_victim # New victim will be
> 				  		  # selected
> 				<OOM killer invoked>
> 				  # no TIF_MEMDIE task so we can select a new one
>   unmap_page_range # to release the memory
> 

Yes.
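
For reference, the mm_users refcounting involved above looks roughly like
this (a simplified sketch from memory, not copied from kernel/fork.c nor
from your patch):

	void mmput(struct mm_struct *mm)
	{
		might_sleep();

		/*
		 * The reference which __oom_reap_task() took via
		 * atomic_inc_not_zero() keeps this test false on the
		 * exit_mm() path, so the exiting task proceeds straight
		 * to exit_oom_victim().
		 */
		if (atomic_dec_and_test(&mm->mm_users))
			__mmput(mm);	/* the part that can block */
	}

	void mmput_async(struct mm_struct *mm)
	{
		/*
		 * With this patch the reaper drops its reference here; if
		 * it happens to be the last one, __mmput() runs from a
		 * workqueue instead of the oom_reaper context.
		 */
		if (atomic_dec_and_test(&mm->mm_users)) {
			INIT_WORK(&mm->async_put_work, mmput_async_fn);
			schedule_work(&mm->async_put_work);
		}
	}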

> Previously we were kind of protected by the PF_EXITING check in
> oom_scan_process_thread which is not there anymore. The race is possible
> even without the oom reaper because many other call sites might pin
> the address space and be preempted for an unbounded amount of time. We

It is true that there has been a race window even without the OOM reaper
(and I tried to mitigate it using oomkiller_holdoff_timer).
But until the OOM reaper kernel thread was introduced, the sequence

 				  mmput
   				    atomic_dec_and_test # > 0
 				  exit_oom_victim # New victim will be
 				  		  # selected

was able to select another thread sharing that mm (with noisy dump_header()
messages, which I think should be suppressed once that thread group has
received SIGKILL from oom_kill_process()). Since the OOM reaper is a kernel
thread, this sequence will now simply select a different thread group which
does not share that mm. In this regard, I think that async OOM reaping has
increased the possibility of needlessly selecting the next OOM victim.
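
By the way, the PF_EXITING check you refer to above was, if I remember
correctly, roughly the following in oom_scan_process_thread() (written from
memory, so it may not match the removed code exactly):

	/*
	 * Give an already exiting task time to release its memory instead
	 * of selecting a new victim.
	 */
	if (task_will_free_mem(task))	/* tests PF_EXITING */
		return OOM_SCAN_ABORT;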

> could narrow the race window by reintroducing the check or moving
> exit_oom_victim later in do_exit after exit_notify which then removes
> the task from the task_list (in __unhash_process) so the OOM killer
> wouldn't see it anyway. Sounds ugly to me though.
> 
> > Maybe we should wait for the first OOM reap attempt from the OOM killer
> > context before releasing the oom_lock mutex (sync OOM reaping)?
> 
> I do not think we want to wait inside the oom_lock as it is a global
> lock shared by all OOM killer contexts. Another option would be to use
> the oom_lock inside __oom_reap_task. It is not super cool either because
> now we have a dependency on the lock, but it looks like a reasonably easy
> solution.

It would be nice if we could wait until the memory reclaimed from the OOM
victim's mm has been queued to the freelists for allocation. But I don't
have any idea other than oomkiller_holdoff_timer.
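
If we go with the oom_lock inside __oom_reap_task() route, I assume the
change would look something like the following (an untested sketch of the
idea only, not a patch):

	static bool __oom_reap_task(struct task_struct *tsk)
	{
		...
		/*
		 * Serialize against out_of_memory() so that a concurrent
		 * OOM invocation waits for this reap attempt before it
		 * can select a new victim.
		 */
		mutex_lock(&oom_lock);
		... pin tsk->mm and unmap_page_range() as before ...
		mutex_unlock(&oom_lock);
		return true;
	}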

I think this problem should be discussed another day in a new thread.



Thread overview: 13+ messages
2016-04-26 14:04 [PATCH 0/2] last pile of oom_reaper patches for now Michal Hocko
2016-04-26 14:04 ` [PATCH 1/2] mm, oom_reaper: hide oom reaped tasks from OOM killer more carefully Michal Hocko
2016-04-26 14:04 ` [PATCH 2/2] mm, oom_reaper: do not mmput synchronously from the oom reaper context Michal Hocko
2016-04-26 14:18   ` kbuild test robot
2016-04-26 14:58     ` Michal Hocko
2016-05-19 14:29   ` Tetsuo Handa
2016-05-19 17:20     ` Michal Hocko
2016-05-25 10:52       ` Tetsuo Handa
2016-05-25 13:50         ` Michal Hocko
2016-05-25 14:30           ` Tetsuo Handa [this message]
2016-05-20  1:30 Minchan Kim
2016-05-20  6:16 ` Michal Hocko
2016-05-20  7:12   ` Minchan Kim
