linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: Shakeel Butt <shakeelb@google.com>
Cc: syzbot <syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"Roman Gushchin" <guro@fb.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"Linux MM" <linux-mm@kvack.org>,
	"Tetsuo Handa" <penguin-kernel@i-love.sakura.ne.jp>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	yuzhoujian@didichuxing.com
Subject: Re: general protection fault in oom_unkillable_task
Date: Sat, 15 Jun 2019 15:49:55 +0200
Message-ID: <20190615134955.GA28441@dhcp22.suse.cz>
In-Reply-To: <CALvZod72=KuBZkSd0ey5orJFGFpwx462XY=cZvO3NOXC0MogFw@mail.gmail.com>

On Fri 14-06-19 20:15:31, Shakeel Butt wrote:
> On Fri, Jun 14, 2019 at 6:08 PM syzbot
> <syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    3f310e51 Add linux-next specific files for 20190607
> > git tree:       linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=15ab8771a00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=5d176e1849bbc45
> > dashboard link: https://syzkaller.appspot.com/bug?extid=d0fc9d3c166bc5e4a94b
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com
> >
> > kasan: CONFIG_KASAN_INLINE enabled
> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > general protection fault: 0000 [#1] PREEMPT SMP KASAN
> > CPU: 0 PID: 28426 Comm: syz-executor.5 Not tainted 5.2.0-rc3-next-20190607
> > #11
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > RIP: 0010:__read_once_size include/linux/compiler.h:194 [inline]
> > RIP: 0010:has_intersects_mems_allowed mm/oom_kill.c:84 [inline]
> 
> It seems like oom_unkillable_task() is broken for memcg OOMs. It
> should not be calling has_intersects_mems_allowed() for memcg OOMs.

You are right. It doesn't make much sense to check the NUMA
policy/cpusets when the memcg OOM is NUMA agnostic. Now that I am
looking at the code, I really wonder why we even call
oom_unkillable_task from oom_badness. proc_oom_score shouldn't care
about NUMA either.

In other words, the following should fix this unless I am missing
something (task_in_mem_cgroup seems to be a relic from before the group
oom handling). But please note that I am still not fully operational and
am lying in bed.

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 5a58778c91d4..43eb479a5dc7 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -161,8 +161,8 @@ static bool oom_unkillable_task(struct task_struct *p,
 		return true;
 
 	/* When mem_cgroup_out_of_memory() and p is not member of the group */
-	if (memcg && !task_in_mem_cgroup(p, memcg))
-		return true;
+	if (memcg)
+		return false;
 
 	/* p may not have freeable memory in nodemask */
 	if (!has_intersects_mems_allowed(p, nodemask))
@@ -318,7 +318,7 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
 	struct oom_control *oc = arg;
 	unsigned long points;
 
-	if (oom_unkillable_task(task, NULL, oc->nodemask))
+	if (oom_unkillable_task(task, oc->memcg, oc->nodemask))
 		goto next;
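
To make the effect of the diff easier to see, here is a minimal userspace
sketch of the check before and after the change. It is not kernel code: the
model_* types, the bitmask nodemask, the in_memcg flag (standing in for
task_in_mem_cgroup()) and the numa_checks counter are all hypothetical, and
the early "return true" is assumed to cover init and kernel threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct model_memcg { int id; };
struct model_task {
	bool is_init_or_kthread;    /* assumed meaning of the early "return true" */
	bool in_memcg;              /* stands in for task_in_mem_cgroup() */
	unsigned long mems_allowed; /* NUMA nodes as a bitmask */
};

static int numa_checks; /* how often the NUMA helper ran */

/* Simplified stand-in for has_intersects_mems_allowed(). */
static bool model_intersects(const struct model_task *p,
			     const unsigned long *nodemask)
{
	numa_checks++;
	return nodemask ? (p->mems_allowed & *nodemask) != 0 : true;
}

/* Models oom_unkillable_task() as it reads before the diff. */
static bool unkillable_before(const struct model_task *p,
			      const struct model_memcg *memcg,
			      const unsigned long *nodemask)
{
	assert(p != NULL);
	if (p->is_init_or_kthread)
		return true;
	if (memcg && !p->in_memcg)
		return true;
	return !model_intersects(p, nodemask);
}

/* Models the same function with the diff applied. */
static bool unkillable_after(const struct model_task *p,
			     const struct model_memcg *memcg,
			     const unsigned long *nodemask)
{
	assert(p != NULL);
	if (p->is_init_or_kthread)
		return true;
	if (memcg)
		return false; /* memcg OOM: the NUMA check is never reached */
	return !model_intersects(p, nodemask);
}
```

In this model, calling unkillable_after() with a non-NULL memcg never
touches model_intersects(), while a NULL memcg still goes through the NUMA
check. Note that the patched version also returns false for a task outside
the group, so it presumably relies on the memcg OOM path only walking tasks
that belong to the group.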
 
-- 
Michal Hocko
SUSE Labs


WARNING: multiple messages have this Message-ID
From: Hillf Danton <hdanton@sina.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>,
	syzbot <syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Roman Gushchin <guro@fb.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	jglisse@redhat.com, LKML <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	yuzhoujian@didichuxing.com
Subject: Re: general protection fault in oom_unkillable_task
Date: Sun, 16 Jun 2019 13:48:51 +0800
Message-ID: <20190615134955.GA28441@dhcp22.suse.cz>
Message-ID: <20190616054851.XS-MCkU6KtmEMDze8SQKKfnRjNXDGpLc1YJ_xWpWTbI@z>
In-Reply-To: <CALvZod72=KuBZkSd0ey5orJFGFpwx462XY=cZvO3NOXC0MogFw@mail.gmail.com>


Hello Michal

On Sat, 15 Jun 2019 13:49:57 +0000 (UTC) Michal Hocko wrote:
> On Fri 14-06-19 20:15:31, Shakeel Butt wrote:
> > On Fri, Jun 14, 2019 at 6:08 PM syzbot
> > <syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:    3f310e51 Add linux-next specific files for 20190607
> > > git tree:       linux-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=15ab8771a00000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=5d176e1849bbc45
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=d0fc9d3c166bc5e4a94b
> > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com
> > >
> > > kasan: CONFIG_KASAN_INLINE enabled
> > > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > > general protection fault: 0000 [#1] PREEMPT SMP KASAN
> > > CPU: 0 PID: 28426 Comm: syz-executor.5 Not tainted 5.2.0-rc3-next-20190607
> > > #11
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > RIP: 0010:__read_once_size include/linux/compiler.h:194 [inline]
> > > RIP: 0010:has_intersects_mems_allowed mm/oom_kill.c:84 [inline]
> > 
> > It seems like oom_unkillable_task() is broken for memcg OOMs. It
> > should not be calling has_intersects_mems_allowed() for memcg OOMs.
> 
> You are right. It doesn't make much sense to check the NUMA
> policy/cpusets when the memcg OOM is NUMA agnostic. Now that I am
> looking at the code, I really wonder why we even call
> oom_unkillable_task from oom_badness. proc_oom_score shouldn't care
> about NUMA either.
> 
> In other words, the following should fix this unless I am missing
> something (task_in_mem_cgroup seems to be a relic from before the group
> oom handling). But please note that I am still not fully operational and
> am lying in bed.
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 5a58778c91d4..43eb479a5dc7 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -161,8 +161,8 @@ static bool oom_unkillable_task(struct task_struct *p,
>  		return true;
>  
>  	/* When mem_cgroup_out_of_memory() and p is not member of the group */
> -	if (memcg && !task_in_mem_cgroup(p, memcg))
> -		return true;
> +	if (memcg)
> +		return false;
>
Consider the following members of the memcg:
1> tasks with PF_EXITING set in their flags.
2> tasks without a memory footprint on NUMA node-A-B.
3> tasks with a memory footprint on NUMA node-A-B-C.

IMO we should try hard to avoid killing the 1> and 2> tasks when the current
memory allocation only wants pages from node-A.
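
The concern can be illustrated with a small userspace sketch. Everything in
it is hypothetical (the sketch_task type, the per-task node bitmask, and both
filter functions), and skipped_as_reply_suggests() is only one reading of the
point above, not existing kernel behaviour. In particular, the real
has_intersects_mems_allowed() looks at mempolicy/cpuset allowances rather
than at where a task's pages actually sit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { NODE_A = 1 << 0, NODE_B = 1 << 1, NODE_C = 1 << 2 };

/* Hypothetical task model; "nodes" approximates where the task has memory. */
struct sketch_task {
	const char *name;
	bool exiting;       /* models PF_EXITING */
	unsigned int nodes; /* bitmask of NODE_* values */
};

/* With the proposed diff, a memcg OOM skips the node check entirely. */
static bool skipped_with_patch(const struct sketch_task *t, unsigned int wanted)
{
	assert(t != NULL);
	(void)wanted;
	return false;
}

/* One reading of the reply: spare exiting tasks and tasks off the wanted node. */
static bool skipped_as_reply_suggests(const struct sketch_task *t,
				      unsigned int wanted)
{
	assert(t != NULL);
	return t->exiting || (t->nodes & wanted) == 0;
}

/* Counts the tasks that remain OOM candidates under a given filter. */
static size_t count_candidates(const struct sketch_task *tasks, size_t n,
			       unsigned int wanted,
			       bool (*skipped)(const struct sketch_task *,
					       unsigned int))
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++)
		if (!skipped(&tasks[i], wanted))
			kept++;
	return kept;
}
```

With one task of each kind and an allocation that wants NODE_A, the patched
filter leaves all three tasks as candidates, whereas the node-aware filter
leaves only the 3> task.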

--
Hillf
>  	/* p may not have freeable memory in nodemask */
>  	if (!has_intersects_mems_allowed(p, nodemask))
> @@ -318,7 +318,7 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
>  	struct oom_control *oc = arg;
>  	unsigned long points;
>  
> -	if (oom_unkillable_task(task, NULL, oc->nodemask))
> +	if (oom_unkillable_task(task, oc->memcg, oc->nodemask))
>  		goto next;
>  
> -- 
> Michal Hocko
> SUSE Labs
> 

