From: Michal Hocko <mhocko@kernel.org>
To: Shakeel Butt <shakeelb@google.com>
Cc: syzbot <syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Eric W. Biederman" <ebiederm@xmission.com>,
"Roman Gushchin" <guro@fb.com>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Jérôme Glisse" <jglisse@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
"Linux MM" <linux-mm@kvack.org>,
"Tetsuo Handa" <penguin-kernel@i-love.sakura.ne.jp>,
syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
yuzhoujian@didichuxing.com
Subject: Re: general protection fault in oom_unkillable_task
Date: Sat, 15 Jun 2019 15:49:55 +0200
Message-ID: <20190615134955.GA28441@dhcp22.suse.cz>
In-Reply-To: <CALvZod72=KuBZkSd0ey5orJFGFpwx462XY=cZvO3NOXC0MogFw@mail.gmail.com>
On Fri 14-06-19 20:15:31, Shakeel Butt wrote:
> On Fri, Jun 14, 2019 at 6:08 PM syzbot
> <syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit: 3f310e51 Add linux-next specific files for 20190607
> > git tree: linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=15ab8771a00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=5d176e1849bbc45
> > dashboard link: https://syzkaller.appspot.com/bug?extid=d0fc9d3c166bc5e4a94b
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com
> >
> > kasan: CONFIG_KASAN_INLINE enabled
> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > general protection fault: 0000 [#1] PREEMPT SMP KASAN
> > CPU: 0 PID: 28426 Comm: syz-executor.5 Not tainted 5.2.0-rc3-next-20190607
> > #11
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > RIP: 0010:__read_once_size include/linux/compiler.h:194 [inline]
> > RIP: 0010:has_intersects_mems_allowed mm/oom_kill.c:84 [inline]
>
> It seems like oom_unkillable_task() is broken for memcg OOMs. It
> should not be calling has_intersects_mems_allowed() for memcg OOMs.
You are right. It doesn't really make much sense to check the NUMA
policy/cpusets when the memcg OOM is NUMA agnostic. Now that I am
looking at the code, I am really wondering why we even call
oom_unkillable_task from oom_badness. proc_oom_score shouldn't care
about NUMA either.

In other words, the following should fix this unless I am missing
something (task_in_mem_cgroup seems to be a relic from before the group
OOM handling). But please note that I am still not fully operational and
am lying in bed.
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 5a58778c91d4..43eb479a5dc7 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -161,8 +161,8 @@ static bool oom_unkillable_task(struct task_struct *p,
 		return true;

 	/* When mem_cgroup_out_of_memory() and p is not member of the group */
-	if (memcg && !task_in_mem_cgroup(p, memcg))
-		return true;
+	if (memcg)
+		return false;

 	/* p may not have freeable memory in nodemask */
 	if (!has_intersects_mems_allowed(p, nodemask))
@@ -318,7 +318,7 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
 	struct oom_control *oc = arg;
 	unsigned long points;

-	if (oom_unkillable_task(task, NULL, oc->nodemask))
+	if (oom_unkillable_task(task, oc->memcg, oc->nodemask))
 		goto next;
--
Michal Hocko
SUSE Labs
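
For readers following the control flow, here is a minimal userspace sketch of what the proposed hunk does to oom_unkillable_task(). It is not kernel code: struct task_model, struct memcg_model, and the never_kill and mems_intersect fields are hypothetical stand-ins. never_kill summarizes whatever checks precede the hunk, and mems_intersect stands in for the result of has_intersects_mems_allowed().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct memcg_model {
	int id;
};

struct task_model {
	bool never_kill;     /* assumed: summarizes the checks before the hunk */
	bool mems_intersect; /* assumed: result of has_intersects_mems_allowed() */
};

/* Models oom_unkillable_task() with the proposed hunk applied. */
bool oom_unkillable_model(const struct task_model *p,
			  const struct memcg_model *memcg)
{
	if (p->never_kill)
		return true;

	/* memcg OOM: NUMA agnostic, so stop before the nodemask check */
	if (memcg)
		return false;

	/* global OOM: p may not have freeable memory in nodemask */
	if (!p->mems_intersect)
		return true;

	return false;
}

/* Small self-check of the three paths through the model. */
void oom_unkillable_model_selfcheck(void)
{
	struct memcg_model cg = { .id = 1 };
	struct task_model no_overlap = { .never_kill = false, .mems_intersect = false };
	struct task_model protected_task = { .never_kill = true, .mems_intersect = true };

	assert(!oom_unkillable_model(&no_overlap, &cg));    /* memcg OOM: eligible */
	assert(oom_unkillable_model(&no_overlap, NULL));    /* global OOM: skipped */
	assert(oom_unkillable_model(&protected_task, &cg)); /* early check wins */
}
```

In this model, a task whose allowed nodes do not overlap the nodemask is still eligible during a memcg OOM and is skipped only during a global OOM, which is the behaviour the message above argues for.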
From: Hillf Danton <hdanton@sina.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>,
syzbot <syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Roman Gushchin <guro@fb.com>,
Johannes Weiner <hannes@cmpxchg.org>,
jglisse@redhat.com, LKML <linux-kernel@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>,
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
yuzhoujian@didichuxing.com
Subject: Re: general protection fault in oom_unkillable_task
Date: Sun, 16 Jun 2019 13:48:51 +0800
Message-ID: <20190615134955.GA28441@dhcp22.suse.cz>
Message-ID: <20190616054851.XS-MCkU6KtmEMDze8SQKKfnRjNXDGpLc1YJ_xWpWTbI@z>
In-Reply-To: <CALvZod72=KuBZkSd0ey5orJFGFpwx462XY=cZvO3NOXC0MogFw@mail.gmail.com>
Hello Michal
On Sat, 15 Jun 2019 13:49:57 +0000 (UTC) Michal Hocko wrote:
> On Fri 14-06-19 20:15:31, Shakeel Butt wrote:
> > On Fri, Jun 14, 2019 at 6:08 PM syzbot
> > <syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit: 3f310e51 Add linux-next specific files for 20190607
> > > git tree: linux-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=15ab8771a00000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=5d176e1849bbc45
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=d0fc9d3c166bc5e4a94b
> > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com
> > >
> > > kasan: CONFIG_KASAN_INLINE enabled
> > > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > > general protection fault: 0000 [#1] PREEMPT SMP KASAN
> > > CPU: 0 PID: 28426 Comm: syz-executor.5 Not tainted 5.2.0-rc3-next-20190607
> > > #11
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > RIP: 0010:__read_once_size include/linux/compiler.h:194 [inline]
> > > RIP: 0010:has_intersects_mems_allowed mm/oom_kill.c:84 [inline]
> >
> > It seems like oom_unkillable_task() is broken for memcg OOMs. It
> > should not be calling has_intersects_mems_allowed() for memcg OOMs.
>
> You are right. It doesn't really make much sense to check the NUMA
> policy/cpusets when the memcg OOM is NUMA agnostic. Now that I am
> looking at the code, I am really wondering why we even call
> oom_unkillable_task from oom_badness. proc_oom_score shouldn't care
> about NUMA either.
>
> In other words, the following should fix this unless I am missing
> something (task_in_mem_cgroup seems to be a relic from before the group
> OOM handling). But please note that I am still not fully operational and
> am lying in bed.
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 5a58778c91d4..43eb479a5dc7 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -161,8 +161,8 @@ static bool oom_unkillable_task(struct task_struct *p,
>  		return true;
>
>  	/* When mem_cgroup_out_of_memory() and p is not member of the group */
> -	if (memcg && !task_in_mem_cgroup(p, memcg))
> -		return true;
> +	if (memcg)
> +		return false;
>
Given these members of the memcg:
1> tasks with PF_EXITING set in their flags,
2> tasks without a memory footprint on NUMA nodes A and B,
3> tasks with a memory footprint on NUMA nodes A, B and C,
we should IMO try hard to avoid killing tasks 1> and 2> in order to
satisfy the current memory allocation, which only wants pages from node A.
--
Hillf
>  	/* p may not have freeable memory in nodemask */
>  	if (!has_intersects_mems_allowed(p, nodemask))
> @@ -318,7 +318,7 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
>  	struct oom_control *oc = arg;
>  	unsigned long points;
>
> -	if (oom_unkillable_task(task, NULL, oc->nodemask))
> +	if (oom_unkillable_task(task, oc->memcg, oc->nodemask))
>  		goto next;
>
> --
> Michal Hocko
> SUSE Labs
>
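
The three kinds of memcg member listed in this reply can be written down as a small userspace sketch. It only encodes the preference stated above and is not kernel behaviour: struct member_model, the footprint bitmask and worth_killing() are hypothetical names, and the sketch assumes a per-task footprint is known, whereas the check discussed in the thread looks at the NUMA policy/cpusets instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical node bits for the nodes named A, B and C in the message. */
enum { NODE_A = 1u << 0, NODE_B = 1u << 1, NODE_C = 1u << 2 };

struct member_model {
	bool exiting;           /* stands in for PF_EXITING */
	unsigned int footprint; /* assumed known: nodes holding the task's memory */
};

/* True if killing the member could free pages on a node the allocation wants. */
bool worth_killing(const struct member_model *m, unsigned int wanted)
{
	if (m->exiting)
		return false; /* case 1>: already on its way out */

	/* case 2> has no overlap with the wanted nodes, case 3> does */
	return (m->footprint & wanted) != 0;
}

/* Self-check against the three cases, for an allocation that wants node A. */
void worth_killing_selfcheck(void)
{
	struct member_model exiting = { .exiting = true, .footprint = NODE_A };
	struct member_model off_node = { .exiting = false, .footprint = NODE_C };
	struct member_model on_node = { .exiting = false,
					.footprint = NODE_A | NODE_B | NODE_C };

	assert(!worth_killing(&exiting, NODE_A));
	assert(!worth_killing(&off_node, NODE_A));
	assert(worth_killing(&on_node, NODE_A));
}
```

Under these assumptions only the third kind of task is a useful victim for an allocation restricted to node A, which is the concern raised about returning false unconditionally for memcg OOMs.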