From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f71.google.com (mail-wm0-f71.google.com [74.125.82.71]) by kanga.kvack.org (Postfix) with ESMTP id 712886B0069 for ; Sat, 22 Oct 2016 02:27:27 -0400 (EDT) Received: by mail-wm0-f71.google.com with SMTP id d199so5902787wmd.0 for ; Fri, 21 Oct 2016 23:27:27 -0700 (PDT) Received: from mail-wm0-x231.google.com (mail-wm0-x231.google.com. [2a00:1450:400c:c09::231]) by mx.google.com with ESMTPS id w206si1762659wmd.79.2016.10.21.23.27.24 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 21 Oct 2016 23:27:24 -0700 (PDT) Received: by mail-wm0-x231.google.com with SMTP id c78so19488917wme.0 for ; Fri, 21 Oct 2016 23:27:24 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: References: <57CF4D3D.50603@iogearbox.net> From: Kees Cook Date: Fri, 21 Oct 2016 23:27:23 -0700 Message-ID: Subject: Re: mm: GPF in __insert_vmap_area Content-Type: text/plain; charset=UTF-8 Sender: owner-linux-mm@kvack.org List-ID: To: Paul McKenney Cc: Dmitry Vyukov , Daniel Borkmann , Andrew Morton , David Rientjes , zijun_hu , Joonsoo Kim , "linux-mm@kvack.org" , LKML , "Kirill A. Shutemov" , Andrey Ryabinin , Konstantin Khlebnikov , syzkaller , Alexei Starovoitov On Thu, Sep 8, 2016 at 8:05 AM, Dmitry Vyukov wrote: > On Wed, Sep 7, 2016 at 1:11 AM, Daniel Borkmann wrote: >> On 09/06/2016 11:03 PM, Kees Cook wrote: >>> >>> On Sat, Sep 3, 2016 at 8:15 AM, Dmitry Vyukov wrote: >>>> >>>> Hello, >>>> >>>> While running syzkaller fuzzer I've got the following GPF: >>>> >>>> general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN >>>> Dumping ftrace buffer: >>>> (ftrace buffer empty) >>>> Modules linked in: >>>> CPU: 2 PID: 4268 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ >>>> #8 >>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs >>>> 01/01/2011 >>>> task: ffff88006a6527c0 task.stack: ffff880052630000 >>>> RIP: 0010:[] [] >>>> __list_add_valid+0x26/0xd0 lib/list_debug.c:23 >>>> RSP: 0018:ffff880052637a18 EFLAGS: 00010202 >>>> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc90001c87000 >>>> RDX: 0000000000000001 RSI: ffff88001344cdb0 RDI: 0000000000000008 >>>> RBP: ffff880052637a30 R08: 0000000000000001 R09: 0000000000000000 >>>> R10: 0000000000000000 R11: ffffffff8a5deee0 R12: ffff88006cc47230 >>>> R13: ffff88001344cdb0 R14: ffff88006cc47230 R15: 0000000000000000 >>>> FS: 00007fbacc97e700(0000) GS:ffff88006d200000(0000) >>>> knlGS:0000000000000000 >>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>> CR2: 0000000020de7000 CR3: 000000003c4d2000 CR4: 00000000000006e0 >>>> DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000 >>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 >>>> Stack: >>>> ffff88006cc47200 ffff88001344cd98 ffff88006cc47200 ffff880052637a78 >>>> ffffffff817bc6d1 ffff88006cc47208 ffffed000d988e41 ffff88006cc47208 >>>> ffff88006cc3e680 ffffc900035b7000 ffffc900035a7000 ffff88006cc47200 >>>> Call Trace: >>>> [< inline >] __list_add_rcu include/linux/rculist.h:51 >>>> [< inline >] list_add_rcu include/linux/rculist.h:78 >>>> [] __insert_vmap_area+0x1c1/0x3c0 mm/vmalloc.c:340 >>>> [] alloc_vmap_area+0x614/0x890 mm/vmalloc.c:458 >>>> [] __get_vm_area_node+0xe8/0x340 mm/vmalloc.c:1377 >>>> [] __vmalloc_node_range+0xaa/0x6d0 mm/vmalloc.c:1687 >>>> [< inline >] __vmalloc_node mm/vmalloc.c:1736 >>>> [] __vmalloc+0x5b/0x70 mm/vmalloc.c:1742 >>>> [] bpf_prog_alloc+0x3c/0x190 kernel/bpf/core.c:82 >>>> [] bpf_prog_create_from_user+0xa9/0x2c0 >>>> net/core/filter.c:1132 >>>> [< inline >] seccomp_prepare_filter kernel/seccomp.c:373 >>>> [< inline >] seccomp_prepare_user_filter kernel/seccomp.c:408 >>>> [< inline >] seccomp_set_mode_filter kernel/seccomp.c:737 >>>> [] do_seccomp+0x317/0x1800 kernel/seccomp.c:787 >>>> [] prctl_set_seccomp+0x34/0x60 kernel/seccomp.c:830 >>>> [< inline >] SYSC_prctl kernel/sys.c:2157 >>>> [] SyS_prctl+0x82f/0xc80 kernel/sys.c:2075 >>>> [] entry_SYSCALL_64_fastpath+0x23/0xc1 >>>> Code: 00 00 00 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 >>>> 49 89 fc 48 8d 7a 08 53 48 89 d3 48 89 fa 48 83 ec 08 48 c1 ea 03 <80> >>>> 3c 02 00 75 7c 48 8b 53 08 48 39 f2 75 37 48 89 f2 48 b8 00 >>>> RIP [] __list_add_valid+0x26/0xd0 lib/list_debug.c:23 >>>> RSP >>>> ---[ end trace 983e625f02f00d9f ]--- >>>> Kernel panic - not syncing: Fatal exception >>>> >>>> On commit 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next. >>>> Unfortunately it is not reproducible. I've spent some more time looking at this, and I'm satisfied that this isn't a problem introduced by my CONFIG_DEBUG_LIST changes. Thoughts below... >> >> >> Can you elaborate? You hit this only once and then never again >> in this or some other, similar call-trace form, right? > > Correct. > >>>> The crashing line is: >>>> CHECK_DATA_CORRUPTION(next->prev != prev, >>>> >>>> It crashed on KASAN check at (%rax, %rdx), this address corresponds to >>>> next address = 0x8. So next was ~NULL. The main issue is that the argument for "next" coming in to the check is NULL to begin with. That indicates serious problems somewhere else. If CONFIG_DEBUG_LIST wasn't set, the crash would still have happened during "next->prev = new;" (since next is NULL): static inline void __list_add_rcu(struct list_head *new, struct list_head *prev, struct list_head *next) { if (!__list_add_valid(new, prev, next)) return; new->next = next; new->prev = prev; rcu_assign_pointer(list_next_rcu(prev), new); next->prev = new; } >>> Paul, the RCU torture tests passed with the CONFIG_DEBUG_LIST changes, >>> IIRC, yes? I'd love to rule out some kind of race condition between >>> the removal and add code for the checking. >>> >>> Daniel, IIRC there was some talk about RCU and BPF? Am I remembering >> >> >> --verbose, what talk specifically? There were some fixes longer >> time ago, but related to eBPF, not cBPF, but even there I don't >> see currently how it could be related to a va->list corruption >> triggered in __insert_vmap_area(). >> >>> that correctly? I'm having a hard time imagining how a list add could >>> fail (maybe a race between two adds)? >> >> >> Or some use after free that would have corrupted that memory? I >> would think right now that the path via bpf_prog_alloc() could >> have triggered, but not necessarily caused the issue, hmm. > > That's highly likely. So, this appears to be a bug somewhere else that happens to manifest as a crash in the check (and would have crashed in the same place under CONFIG_DEBUG_LIST before my changes too). -Kees -- Kees Cook Nexus Security -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org