From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: guro@fb.com
Cc: linux-mm@kvack.org, mhocko@kernel.org, vdavydov.dev@gmail.com,
hannes@cmpxchg.org, tj@kernel.org, kernel-team@fb.com,
cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [v3 1/6] mm, oom: use oom_victims counter to synchronize oom victim selection
Date: Fri, 23 Jun 2017 05:37:50 +0900 [thread overview]
Message-ID: <201706230537.IDB21366.SQHJVFOOFOMFLt@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20170622165858.GA30035@castle>
Roman Gushchin wrote:
> On Thu, Jun 22, 2017 at 09:40:28AM +0900, Tetsuo Handa wrote:
> > Roman Gushchin wrote:
> > > --- a/mm/oom_kill.c
> > > +++ b/mm/oom_kill.c
> > > @@ -992,6 +992,13 @@ bool out_of_memory(struct oom_control *oc)
> > > if (oom_killer_disabled)
> > > return false;
> > >
> > > + /*
> > > + * If there are oom victims in flight, we don't need to select
> > > + * a new victim.
> > > + */
> > > + if (atomic_read(&oom_victims) > 0)
> > > + return true;
> > > +
> > > if (!is_memcg_oom(oc)) {
> > > blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
> > > if (freed > 0)
> >
> > Above in this patch and below in patch 5 are wrong.
> >
> > > --- a/mm/oom_kill.c
> > > +++ b/mm/oom_kill.c
> > > @@ -665,7 +672,13 @@ static void mark_oom_victim(struct task_struct *tsk)
> > > * that TIF_MEMDIE tasks should be ignored.
> > > */
> > > __thaw_task(tsk);
> > > - atomic_inc(&oom_victims);
> > > +
> > > + /*
> > > + * If there are no oom victims in flight,
> > > + * give the task an access to the memory reserves.
> > > + */
> > > + if (atomic_inc_return(&oom_victims) == 1)
> > > + set_tsk_thread_flag(tsk, TIF_MEMDIE);
> > > }
> > >
> > > /**
> >
> > The OOM reaper is not available for CONFIG_MMU=n kernels, and timeout based
> > giveup is not permitted, but a multithreaded process might be selected as
> > an OOM victim. Not setting TIF_MEMDIE to all threads sharing an OOM victim's
> > mm increases possibility of preventing some OOM victim thread from terminating
> > (e.g. one of them cannot leave __alloc_pages_slowpath() with mmap_sem held for
> > write due to waiting for the TIF_MEMDIE thread to call exit_oom_victim() when
> > the TIF_MEMDIE thread is waiting for the thread with mmap_sem held for write).
>
> I agree, that CONFIG_MMU=n is a special case, and the proposed approach can't
> be used directly. But can you, please, why do you find the first chunk wrong?
Since you are checking oom_victims before checking task_will_free_mem(current),
only one thread can get TIF_MEMDIE. This is where a multithreaded OOM victim without
the OOM reaper can get stuck forever.
> Basically, it's exactly what we do now: we increment oom_victims for every oom
> victim, and every process decrements this counter during it's exit path.
> If there is at least one existing victim, we will select it again, so it's just
> an optimization. Am I missing something? Why should we start new victim selection
> if there processes that will likely quit and release memory soon?
If you check oom_victims like
http://lkml.kernel.org/r/201706220053.v5M0rmOU078764@www262.sakura.ne.jp does,
all threads in a multithreaded OOM victim will have a chance to get TIF_MEMDIE.
By the way, below in patch 5 is also wrong.
You are not permitted to set TIF_MEMDIE if oom_killer_disabled == true.
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -556,8 +556,18 @@ static void oom_reap_task(struct task_struct *tsk)
struct mm_struct *mm = tsk->signal->oom_mm;
/* Retry the down_read_trylock(mmap_sem) a few times */
- while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm))
+ while (attempts++ < MAX_OOM_REAP_RETRIES &&
+ !__oom_reap_task_mm(tsk, mm)) {
+
+ /*
+ * If the task has no access to the memory reserves,
+ * grant it to help the task to exit.
+ */
+ if (!test_tsk_thread_flag(tsk, TIF_MEMDIE))
+ set_tsk_thread_flag(tsk, TIF_MEMDIE);
+
schedule_timeout_idle(HZ/10);
+ }
if (attempts <= MAX_OOM_REAP_RETRIES)
goto done;
next prev parent reply other threads:[~2017-06-22 20:37 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-21 21:19 [v3 0/6] cgroup-aware OOM killer Roman Gushchin
2017-06-21 21:19 ` [v3 1/6] mm, oom: use oom_victims counter to synchronize oom victim selection Roman Gushchin
[not found] ` <201706220040.v5M0eSnK074332@www262.sakura.ne.jp>
2017-06-22 16:58 ` Roman Gushchin
2017-06-22 20:37 ` Tetsuo Handa [this message]
[not found] ` <201706230537.IDB21366.SQHJVFOOFOMFLt-JPay3/Yim36HaxMnTkn67Xf5DAMn2ifp@public.gmane.org>
2017-06-22 21:52 ` Tetsuo Handa
2017-06-29 18:47 ` Roman Gushchin
2017-06-29 20:13 ` Tetsuo Handa
2017-06-29 9:04 ` Michal Hocko
2017-06-21 21:19 ` [v3 2/6] mm, oom: cgroup-aware OOM killer Roman Gushchin
2017-07-10 23:05 ` David Rientjes
2017-07-11 12:51 ` Roman Gushchin
2017-07-11 20:56 ` David Rientjes
2017-07-12 12:11 ` Roman Gushchin
2017-07-12 20:26 ` David Rientjes
2017-06-21 21:19 ` [v3 3/6] mm, oom: cgroup-aware OOM killer debug info Roman Gushchin
2017-06-21 21:19 ` [v3 4/6] mm, oom: introduce oom_score_adj for memory cgroups Roman Gushchin
2017-06-21 21:19 ` [v3 5/6] mm, oom: don't mark all oom victims tasks with TIF_MEMDIE Roman Gushchin
2017-06-29 8:53 ` Michal Hocko
2017-06-29 18:45 ` Roman Gushchin
2017-06-30 8:25 ` Michal Hocko
2017-06-21 21:19 ` [v3 6/6] mm,oom,docs: describe the cgroup-aware OOM killer Roman Gushchin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201706230537.IDB21366.SQHJVFOOFOMFLt@I-love.SAKURA.ne.jp \
--to=penguin-kernel@i-love.sakura.ne.jp \
--cc=cgroups@vger.kernel.org \
--cc=guro@fb.com \
--cc=hannes@cmpxchg.org \
--cc=kernel-team@fb.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=tj@kernel.org \
--cc=vdavydov.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox