From: Tetsuo Handa <penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
To: guro-b10kYP2dOMg@public.gmane.org
Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
	vdavydov.dev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org,
	tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
	kernel-team-b10kYP2dOMg@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	penguin-kernel-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org
Subject: Re: [v3 1/6] mm, oom: use oom_victims counter to synchronize oom victim selection
Date: Fri, 23 Jun 2017 06:52:20 +0900	[thread overview]
Message-ID: <201706230652.FDH69263.OtOLFSFMHFOQJV@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <201706230537.IDB21366.SQHJVFOOFOMFLt-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>

Tetsuo Handa wrote:
> Roman Gushchin wrote:
> > On Thu, Jun 22, 2017 at 09:40:28AM +0900, Tetsuo Handa wrote:
> > > Roman Gushchin wrote:
> > > > --- a/mm/oom_kill.c
> > > > +++ b/mm/oom_kill.c
> > > > @@ -992,6 +992,13 @@ bool out_of_memory(struct oom_control *oc)
> > > >  	if (oom_killer_disabled)
> > > >  		return false;
> > > >  
> > > > +	/*
> > > > +	 * If there are oom victims in flight, we don't need to select
> > > > +	 * a new victim.
> > > > +	 */
> > > > +	if (atomic_read(&oom_victims) > 0)
> > > > +		return true;
> > > > +
> > > >  	if (!is_memcg_oom(oc)) {
> > > >  		blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
> > > >  		if (freed > 0)
> > > 
> > > The OOM reaper is not available on CONFIG_MMU=n kernels, and timeout-based
> > > giveup is not permitted there, yet a multithreaded process might still be
> > > selected as an OOM victim. Not setting TIF_MEMDIE on all threads sharing an
> > > OOM victim's mm increases the chance that some OOM victim thread never
> > > terminates (e.g. a thread holding mmap_sem for write cannot leave
> > > __alloc_pages_slowpath() because it is waiting for the TIF_MEMDIE thread to
> > > call exit_oom_victim(), while the TIF_MEMDIE thread is in turn waiting for
> > > the thread holding mmap_sem for write).
> > 
> > I agree that CONFIG_MMU=n is a special case, and the proposed approach can't
> > be used there directly. But can you please explain why you find the first chunk wrong?
> 
> Since you are checking oom_victims before checking task_will_free_mem(current),
> only one thread can get TIF_MEMDIE. This is where a multithreaded OOM victim without
> the OOM reaper can get stuck forever.

Oops, I misinterpreted. This is where a multithreaded OOM victim can get stuck
forever, with or without the OOM reaper. Consider a case where a process with two
threads is selected by the OOM killer and only one of those two threads can get
TIF_MEMDIE, as in the timeline below.

  Thread-1                 Thread-2                 The OOM killer           The OOM reaper

                           Calls down_write(&current->mm->mmap_sem).
  Enters __alloc_pages_slowpath().
                           Enters __alloc_pages_slowpath().
  Takes oom_lock.
  Calls out_of_memory().
                                                    Selects Thread-1 as an OOM victim.
  Gets SIGKILL.            Gets SIGKILL.
  Gets TIF_MEMDIE.
  Releases oom_lock.
  Leaves __alloc_pages_slowpath() because Thread-1 has TIF_MEMDIE.
                                                                             Takes oom_lock.
                                                                             Will do nothing because down_read_trylock() fails.
                                                                             Releases oom_lock.
                                                                             Gives up and sets MMF_OOM_SKIP after one second.
                           Takes oom_lock.
                           Calls out_of_memory().
                           Will not check MMF_OOM_SKIP because Thread-1 still has TIF_MEMDIE. // <= gets stuck waiting for Thread-1.
                           Releases oom_lock.
                           Will not leave __alloc_pages_slowpath() because Thread-2 does not have TIF_MEMDIE.
                           Will not call up_write(&current->mm->mmap_sem).
  Reaches do_exit().
  Calls down_read(&current->mm->mmap_sem) in exit_mm() in do_exit(). // <= gets stuck waiting for Thread-2.
  Will not call up_read(&current->mm->mmap_sem) in exit_mm() in do_exit().
  Will not clear TIF_MEMDIE in exit_oom_victim() in exit_mm() in do_exit().
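
To make the ordering point concrete, here is a minimal sketch (mine, not the
posted patch) of out_of_memory() with the task_will_free_mem(current) fast
path checked before the new oom_victims shortcut. It assumes the 4.12-era
structure of mm/oom_kill.c (task_will_free_mem(), mark_oom_victim(),
wake_oom_reaper()). With this ordering, Thread-2 above, which already has
SIGKILL pending, could mark itself a victim, get TIF_MEMDIE, and leave
__alloc_pages_slowpath() instead of spinning on the shortcut:

  bool out_of_memory(struct oom_control *oc)
  {
          ...
          if (oom_killer_disabled)
                  return false;

          /*
           * If current is already dying (e.g. it received SIGKILL from a
           * previous OOM kill, like Thread-2 in the timeline), select it
           * immediately so that it gets TIF_MEMDIE and can make progress
           * towards exit_mm().
           */
          if (task_will_free_mem(current)) {
                  mark_oom_victim(current);
                  wake_oom_reaper(current);
                  return true;
          }

          /*
           * Only after the fast path above, short-circuit victim selection
           * while other OOM victims are still in flight.
           */
          if (atomic_read(&oom_victims) > 0)
                  return true;
          ...
  }

Whether such an ordering is acceptable for the memcg-aware case is a separate
question; the point is only that a thread which already has SIGKILL pending
must be able to get TIF_MEMDIE, or it can keep mmap_sem held for write forever.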
