From: Shayan Pooya <shayan@liveve.org>
To: Michal Hocko <mhocko@kernel.org>,
Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
koct9i@gmail.com
Cc: cgroups mailinglist <cgroups@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-mm@kvack.org
Subject: Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup
Date: Tue, 12 Jul 2016 08:35:06 -0700
Message-ID: <CABAubTg91qrUd4DO7T2SiJQBK9ypuhP0+F-091ZxtmonjaaYWg@mail.gmail.com>
In-Reply-To: <20160712071927.GD14586@dhcp22.suse.cz>

>> With strace, when running 500 concurrent mem-hog tasks on the same
>> kernel, 33 of them failed with:
>>
>> strace: ../sysdeps/nptl/fork.c:136: __libc_fork: Assertion
>> `THREAD_GETMEM (self, tid) != ppid' failed.
>>
>> Which is: https://sourceware.org/bugzilla/show_bug.cgi?id=15392
>> And discussed before at: https://lkml.org/lkml/2015/2/6/470 but that
>> patch was not accepted.
>
> OK, so the problem is that the oom killed task doesn't report the futex
> release properly? If yes then I fail to see how that is memcg specific.
> Could you try to clarify what you consider a bug again, please? I am not
> really sure I understand this report.

My test case appears to be just a very easy way to reproduce the problem
that Konstantin described in that LKML thread. His patch was not accepted,
and I see no other fix for the issue upstream. Here is a copy of his
root-cause analysis from that thread:

The whole sequence looks like this: the task calls fork(), and glibc calls
the clone syscall with CLONE_CHILD_SETTID, passing a pointer to the TLS
field THREAD_SELF->tid as the argument. The child task gets a read-only
copy of the VM, including the TLS. The child calls put_user() from
schedule_tail() to handle CLONE_CHILD_SETTID. put_user() triggers a page
fault, which fails because do_wp_page() hits the memcg limit without
invoking the OOM-killer, since this is a page fault from kernel space.
put_user() returns -EFAULT, which is ignored. The child returns to user
space and trips the assert (THREAD_GETMEM (self, tid) != ppid). glibc then
tries to print something but deadlocks on its internal locks. Halt and
catch fire.
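
To make the sequence above concrete, here is a small user-space model of it
in C. It is only a sketch and not kernel or glibc code: struct task_model,
model_put_user(), model_fork() and child_tid_check_passes() are hypothetical
names made up for illustration, and the memcg limit is reduced to a boolean
flag. It shows why the child still sees the parent's TID when the
CLONE_CHILD_SETTID store fails and the error is dropped.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a task; only the TID cached in TLS matters here. */
struct task_model {
    int pid;        /* PID assigned by the (modelled) kernel */
    int cached_tid; /* stand-in for THREAD_SELF->tid in the task's TLS */
};

/*
 * Stand-in for put_user(): the store fails with -EFAULT when the
 * copy-on-write fault cannot be charged because the memcg limit is hit.
 */
static int model_put_user(int value, int *dst, bool memcg_limit_hit)
{
    if (memcg_limit_hit)
        return -EFAULT;
    *dst = value;
    return 0;
}

/*
 * Model of fork(): the child starts with a copy of the parent's TLS, then
 * the CLONE_CHILD_SETTID store is attempted. As in the analysis above, the
 * return value of the store is ignored.
 */
static struct task_model model_fork(const struct task_model *parent,
                                    int child_pid, bool memcg_limit_hit)
{
    struct task_model child = *parent; /* inherits the parent's cached TID */

    child.pid = child_pid;
    (void)model_put_user(child_pid, &child.cached_tid, memcg_limit_hit);
    return child;
}

/* Mirrors the glibc check: assert (THREAD_GETMEM (self, tid) != ppid). */
static bool child_tid_check_passes(const struct task_model *child, int ppid)
{
    return child->cached_tid != ppid;
}
```

In this model, forking from a parent with pid 100 and memcg_limit_hit set to
false gives the child its own cached TID and the check passes. With
memcg_limit_hit set to true, the child keeps cached_tid == 100, so the check
fails in the same way the glibc assertion does.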
Regards