From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: linux-mm@kvack.org, rientjes@google.com, oleg@redhat.com,
vdavydov@parallels.com, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 10/10] mm, oom: hide mm which is shared with kthread or global init
Date: Mon, 6 Jun 2016 10:15:26 +0200
Message-ID: <20160606081526.GC11895@dhcp22.suse.cz>
In-Reply-To: <201606040016.BFG17115.OFMLSJFOtHQOFV@I-love.SAKURA.ne.jp>
On Sat 04-06-16 00:16:32, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > The only case where the oom_reaper is not triggered for the oom victim
> > is when it shares the memory with a kernel thread (aka use_mm) or with
> > the global init. After "mm, oom: skip vforked tasks from being selected"
> > the victim cannot be a vforked task of the global init so we are left
> > with clone(CLONE_VM) (without CLONE_THREAD or CLONE_SIGHAND).
>
> According to clone(2) manpage
>
> Since Linux 2.5.35, flags must also include CLONE_SIGHAND if
> CLONE_THREAD is specified (and note that, since Linux
> 2.6.0-test6, CLONE_SIGHAND also requires CLONE_VM to be
> included).
>
> clone(CLONE_VM | CLONE_SIGHAND) and clone(CLONE_VM | CLONE_SIGHAND | CLONE_THREAD)
> are allowed but clone(CLONE_VM | CLONE_THREAD) is not allowed. Therefore,
> I think "clone(CLONE_VM) (without CLONE_THREAD or CLONE_SIGHAND)" should be
> written like "clone(CLONE_VM without CLONE_SIGHAND)".
Sure, I can change the wording.
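To make the flag dependency concrete, here is a minimal userspace sketch of the two rules quoted from clone(2). The helper names clone_flags_valid() and shares_mm_only() are hypothetical and exist only for illustration; the flag values are the ones from the Linux UAPI headers, guarded in case <sched.h> has already defined them. This is not the kernel's own validation code.

```c
#include <stdbool.h>

/* Flag values as found in the Linux UAPI <linux/sched.h>; the guards
 * avoid a redefinition if <sched.h> was already included. */
#ifndef CLONE_VM
#define CLONE_VM      0x00000100
#endif
#ifndef CLONE_SIGHAND
#define CLONE_SIGHAND 0x00000800
#endif
#ifndef CLONE_THREAD
#define CLONE_THREAD  0x00010000
#endif

/* Hypothetical helper: checks only the two dependency rules quoted
 * from clone(2), nothing else the kernel may verify. */
static bool clone_flags_valid(unsigned long flags)
{
	/* CLONE_THREAD requires CLONE_SIGHAND. */
	if ((flags & CLONE_THREAD) && !(flags & CLONE_SIGHAND))
		return false;
	/* CLONE_SIGHAND requires CLONE_VM. */
	if ((flags & CLONE_SIGHAND) && !(flags & CLONE_VM))
		return false;
	return true;
}

/* Hypothetical helper: true for the remaining case discussed here,
 * a task that shares the mm but not the signal handlers. */
static bool shares_mm_only(unsigned long flags)
{
	return clone_flags_valid(flags) &&
	       (flags & CLONE_VM) && !(flags & CLONE_SIGHAND);
}
```

With these rules CLONE_VM | CLONE_THREAD is rejected, so "CLONE_VM without CLONE_SIGHAND" already implies that CLONE_THREAD is absent.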
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 9a5cc12a479a..3a3b136ee9db 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -283,10 +283,19 @@ enum oom_scan_t oom_scan_process_thread(struct oom_control *oc,
> >
> > /*
> > * This task already has access to memory reserves and is being killed.
> > - * Don't allow any other task to have access to the reserves.
> > + * Don't allow any other task to have access to the reserves unless
> > + * this is a current task which is clearly in the allocation path and
> > + * the access to memory reserves didn't help so we should rather try
> > + * to kill somebody else or panic on no oom victim than loop with no way
> > + * forward. Go with OOM_SCAN_OK rather than OOM_SCAN_CONTINUE to double
> > + * check MMF_OOM_REAPED in oom_badness() to make sure we've done
> > + * everything to reclaim memory.
> > */
> > - if (!is_sysrq_oom(oc) && atomic_read(&task->signal->oom_victims))
> > - return OOM_SCAN_ABORT;
> > + if (!is_sysrq_oom(oc) && atomic_read(&task->signal->oom_victims)) {
> > + if (task != current)
> > + return OOM_SCAN_ABORT;
> > + return OOM_SCAN_OK;
> > + }
>
> I don't think above change is needed. Instead, making sure that TIF_MEMDIE is
> cleared (or ignored) some time later is needed.
This is the counterpart to the oom_kill_process() change, which doesn't
clear TIF_MEMDIE for the current task when its mm is not reapable.
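As a rough illustration of the check in the quoted hunk, the decision can be modelled in plain C as below. This is only a sketch: the enum, the function name scan_existing_victim() and its boolean parameters are hypothetical stand-ins for is_sysrq_oom(oc), task->signal->oom_victims and task == current, and SCAN_NEXT_CHECKS just means "fall through to the rest of oom_scan_process_thread()".

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the outcomes relevant to this hunk. */
enum scan_result {
	SCAN_OK,          /* keep evaluating the task, oom_badness() decides */
	SCAN_ABORT,       /* an existing victim is pending, do not pick another */
	SCAN_NEXT_CHECKS  /* hunk does not apply, continue with later checks */
};

/* Models the patched condition; not the kernel function itself. */
static enum scan_result scan_existing_victim(bool sysrq_oom, int oom_victims,
					     bool task_is_current)
{
	if (!sysrq_oom && oom_victims > 0) {
		/* Another task is already dying: wait for it. */
		if (!task_is_current)
			return SCAN_ABORT;
		/* current is the victim and still allocating: reserves did
		 * not help, so let oom_badness() re-check MMF_OOM_REAPED. */
		return SCAN_OK;
	}
	return SCAN_NEXT_CHECKS;
}
```

The only behavioural difference from the old code is the task == current case, which no longer aborts the scan.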
> If an allocating task leaves out_of_memory() with a TIF_MEMDIE thread, it is
> guaranteed (provided that CONFIG_MMU=y && oom_reaper_th != NULL) that the OOM
> reaper is woken up and clear TIF_MEMDIE and sets MMF_OOM_REAPED regardless of
> reaping result.
>
> Leaving current thread from out_of_memory() without clearing TIF_MEMDIE might
> cause OOM lockup, for there is no guarantee that current thread will not wait
> for locks in unkillable state after current memory allocation request completes
> (e.g. getname() followed by mutex_lock() shown at
> http://lkml.kernel.org/r/201509290118.BCJ43256.tSFFFMOLHVOJOQ@I-love.SAKURA.ne.jp ).
>
> > @@ -922,8 +936,17 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> > }
> > rcu_read_unlock();
> >
> > - if (can_oom_reap)
> > + if (can_oom_reap) {
> > wake_oom_reaper(victim);
> > + } else if (victim != current) {
> > + /*
> > + * If we want to guarantee a forward progress we cannot keep
> > + * the oom victim TIF_MEMDIE here. Sleep for a while and then
> > + * drop the flag to make sure another victim can be selected.
> > + */
> > + schedule_timeout_killable(HZ);
>
> Sending SIGKILL to victim makes this sleep a no-op if
> same_thread_group(victim, current) == true.
Yes, I just wanted to skip exit_oom_victim() for current here because
otherwise the current task wouldn't have any means to use memory
reserves. This might not be sufficient, as you write above. I will think
about this some more.
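For reference, the three outcomes of the quoted oom_kill_process() hunk can be summarized with the sketch below. The enum and after_kill() are hypothetical names used only to model the branch; the real code calls wake_oom_reaper(), schedule_timeout_killable(HZ) and exit_oom_victim() as shown in the diff.

```c
#include <stdbool.h>

/* Hypothetical labels for what the hunk does with the victim. */
enum victim_action {
	WAKE_REAPER,            /* can_oom_reap: hand the mm to the oom_reaper */
	SLEEP_THEN_DROP_MEMDIE, /* not reapable, victim != current */
	KEEP_MEMDIE             /* not reapable, victim == current */
};

/* Models the branch in the patch; not the kernel function itself. */
static enum victim_action after_kill(bool can_oom_reap, bool victim_is_current)
{
	if (can_oom_reap)
		return WAKE_REAPER;
	if (!victim_is_current)
		return SLEEP_THEN_DROP_MEMDIE;
	/* current keeps TIF_MEMDIE so it can still use memory reserves. */
	return KEEP_MEMDIE;
}
```

The KEEP_MEMDIE case is the one Tetsuo is worried about, and the SLEEP_THEN_DROP_MEMDIE sleep is the one that becomes a no-op when current has a fatal signal pending.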
> > + exit_oom_victim(victim);
> > + }
> >
> > mmdrop(mm);
> > put_task_struct(victim);
> > --
> > 2.8.1
--
Michal Hocko
SUSE Labs