From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@kernel.org, linux-mm@kvack.org
Cc: rientjes@google.com, oleg@redhat.com, vdavydov@parallels.com,
akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
mhocko@suse.com
Subject: Re: [RFC PATCH 10/10] mm, oom: hide mm which is shared with kthread or global init
Date: Sat, 4 Jun 2016 00:16:32 +0900
Message-ID: <201606040016.BFG17115.OFMLSJFOtHQOFV@I-love.SAKURA.ne.jp>
In-Reply-To: <1464945404-30157-11-git-send-email-mhocko@kernel.org>
Michal Hocko wrote:
> The only case where the oom_reaper is not triggered for the oom victim
> is when it shares the memory with a kernel thread (aka use_mm) or with
> the global init. After "mm, oom: skip vforked tasks from being selected"
> the victim cannot be a vforked task of the global init so we are left
> with clone(CLONE_VM) (without CLONE_THREAD or CLONE_SIGHAND).

According to the clone(2) man page:

    Since Linux 2.5.35, flags must also include CLONE_SIGHAND if
    CLONE_THREAD is specified (and note that, since Linux
    2.6.0-test6, CLONE_SIGHAND also requires CLONE_VM to be
    included).

clone(CLONE_VM | CLONE_SIGHAND) and clone(CLONE_VM | CLONE_SIGHAND | CLONE_THREAD)
are allowed, but clone(CLONE_VM | CLONE_THREAD) is not. Therefore, I think
"clone(CLONE_VM) (without CLONE_THREAD or CLONE_SIGHAND)" should be written as
"clone(CLONE_VM without CLONE_SIGHAND)".

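The two flag dependencies quoted from clone(2) can be restated as a small check. This is only a user-space sketch, not kernel code: clone_flags_valid() and check_clone_examples() are hypothetical helpers that encode nothing beyond the two rules quoted above, and the fallback values are assumed to match the kernel's uapi definitions of these flags.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

/* Fallbacks in case <sched.h> did not expose the flags. */
#ifndef CLONE_VM
#define CLONE_VM      0x00000100
#endif
#ifndef CLONE_SIGHAND
#define CLONE_SIGHAND 0x00000800
#endif
#ifndef CLONE_THREAD
#define CLONE_THREAD  0x00010000
#endif

/* Hypothetical helper: true if flags satisfy the two quoted rules. */
static bool clone_flags_valid(unsigned long flags)
{
	/* CLONE_THREAD requires CLONE_SIGHAND. */
	if ((flags & CLONE_THREAD) && !(flags & CLONE_SIGHAND))
		return false;
	/* CLONE_SIGHAND requires CLONE_VM. */
	if ((flags & CLONE_SIGHAND) && !(flags & CLONE_VM))
		return false;
	return true;
}

/* Walks through the combinations discussed above. */
static void check_clone_examples(void)
{
	assert(clone_flags_valid(CLONE_VM));
	assert(clone_flags_valid(CLONE_VM | CLONE_SIGHAND));
	assert(clone_flags_valid(CLONE_VM | CLONE_SIGHAND | CLONE_THREAD));
	assert(!clone_flags_valid(CLONE_VM | CLONE_THREAD));
}
```

Since CLONE_THREAD implies CLONE_SIGHAND, "without CLONE_SIGHAND" already excludes every CLONE_THREAD case, which is why the shorter wording is sufficient.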
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 9a5cc12a479a..3a3b136ee9db 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -283,10 +283,19 @@ enum oom_scan_t oom_scan_process_thread(struct oom_control *oc,
>
> /*
> * This task already has access to memory reserves and is being killed.
> - * Don't allow any other task to have access to the reserves.
> + * Don't allow any other task to have access to the reserves unless
> + * this is a current task which is clearly in the allocation path and
> + * the access to memory reserves didn't help so we should rather try
> + * to kill somebody else or panic on no oom victim than loop with no way
> + * forward. Go with OOM_SCAN_OK rather than OOM_SCAN_CONTINUE to double
> + * check MMF_OOM_REAPED in oom_badness() to make sure we've done
> + * everything to reclaim memory.
> */
> - if (!is_sysrq_oom(oc) && atomic_read(&task->signal->oom_victims))
> - return OOM_SCAN_ABORT;
> + if (!is_sysrq_oom(oc) && atomic_read(&task->signal->oom_victims)) {
> + if (task != current)
> + return OOM_SCAN_ABORT;
> + return OOM_SCAN_OK;
> + }

I don't think the above change is needed. Instead, we need to make sure that
TIF_MEMDIE is cleared (or ignored) some time later.

If an allocating task leaves out_of_memory() with a TIF_MEMDIE thread, it is
guaranteed (provided that CONFIG_MMU=y && oom_reaper_th != NULL) that the OOM
reaper is woken up, clears TIF_MEMDIE, and sets MMF_OOM_REAPED regardless of
the reaping result.

Letting the current thread leave out_of_memory() without clearing TIF_MEMDIE
might cause an OOM lockup, because there is no guarantee that the current thread
will not wait for locks in an unkillable state after the current memory
allocation request completes (e.g. getname() followed by mutex_lock(), as shown at
http://lkml.kernel.org/r/201509290118.BCJ43256.tSFFFMOLHVOJOQ@I-love.SAKURA.ne.jp ).
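
To make the hunk under discussion concrete, its decision can be modeled in user space. This is a simplified sketch under stated assumptions: struct toy_task, enum toy_scan, and both functions are hypothetical stand-ins, the oom_victims field stands in for task->signal->oom_victims, and the checks that follow this test in the real oom_scan_process_thread() are collapsed into a plain "OK" result.

```c
#include <assert.h>
#include <stdbool.h>

/* Names echo the kernel's oom_scan_t results; the values are arbitrary. */
enum toy_scan { TOY_SCAN_OK, TOY_SCAN_CONTINUE, TOY_SCAN_ABORT };

struct toy_task {
	int oom_victims;	/* stands in for task->signal->oom_victims */
};

/* Before the patch: any existing victim aborts the scan. */
static enum toy_scan toy_scan_before(const struct toy_task *task, bool sysrq)
{
	if (!sysrq && task->oom_victims)
		return TOY_SCAN_ABORT;
	return TOY_SCAN_OK;	/* later checks omitted in this model */
}

/* After the patch: an existing victim aborts the scan unless it is the caller. */
static enum toy_scan toy_scan_after(const struct toy_task *task,
				    const struct toy_task *current_task,
				    bool sysrq)
{
	if (!sysrq && task->oom_victims) {
		if (task != current_task)
			return TOY_SCAN_ABORT;
		return TOY_SCAN_OK;
	}
	return TOY_SCAN_OK;	/* later checks omitted in this model */
}
```

In this model the only case that changes is a victim scanning itself. The model says nothing about whether TIF_MEMDIE is ever cleared for that thread, which is the concern raised above.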
> @@ -922,8 +936,17 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> }
> rcu_read_unlock();
>
> - if (can_oom_reap)
> + if (can_oom_reap) {
> wake_oom_reaper(victim);
> + } else if (victim != current) {
> + /*
> + * If we want to guarantee a forward progress we cannot keep
> + * the oom victim TIF_MEMDIE here. Sleep for a while and then
> + * drop the flag to make sure another victim can be selected.
> + */
> + schedule_timeout_killable(HZ);
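
Since SIGKILL has already been sent to the victim, this sleep is a no-op if
same_thread_group(victim, current) == true: the fatal signal is then pending
for current as well, so the killable sleep returns immediately.
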
> + exit_oom_victim(victim);
> + }
>
> mmdrop(mm);
> put_task_struct(victim);
> --
> 2.8.1
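
The remark above about schedule_timeout_killable(HZ) can be illustrated with a toy model. It is a sketch only, assuming that a killable sleep returns without sleeping when a fatal signal is already pending and that SIGKILL sent to a process becomes pending for every thread in its thread group. struct toy_thread, toy_kill_group(), and toy_sleep_killable() are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_thread {
	int tgid;		/* thread group id */
	bool sigkill_pending;	/* stands in for fatal_signal_pending() */
};

/* Toy SIGKILL delivery: marks every thread that belongs to the group. */
static void toy_kill_group(struct toy_thread *threads, size_t n, int tgid)
{
	for (size_t i = 0; i < n; i++)
		if (threads[i].tgid == tgid)
			threads[i].sigkill_pending = true;
}

/* Toy killable sleep: returns the number of ticks actually slept. */
static long toy_sleep_killable(const struct toy_thread *self, long ticks)
{
	return self->sigkill_pending ? 0 : ticks;
}
```

In this model, a caller that shares the victim's thread group sleeps for 0 ticks after toy_kill_group(), while a caller from another group sleeps for the full interval.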