From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f50.google.com (mail-wm0-f50.google.com [74.125.82.50])
	by kanga.kvack.org (Postfix) with ESMTP id 5886E6B0005
	for ; Tue, 2 Feb 2016 16:08:35 -0500 (EST)
Received: by mail-wm0-f50.google.com with SMTP id l66so135889705wml.0
	for ; Tue, 02 Feb 2016 13:08:35 -0800 (PST)
Received: from mx2.suse.de (mx2.suse.de. [195.135.220.15])
	by mx.google.com with ESMTPS id n1si4726509wjf.177.2016.02.02.13.08.33
	for (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128);
	Tue, 02 Feb 2016 13:08:34 -0800 (PST)
Date: Tue, 2 Feb 2016 22:08:32 +0100 (CET)
From: Jiri Kosina
Subject: Re: mm: uninterruptable tasks hanged on mmap_sem
In-Reply-To:
Message-ID:
References:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
List-ID:
To: Dmitry Vyukov
Cc: Andrew Morton, "Kirill A. Shutemov", Oleg Nesterov,
	Konstantin Khlebnikov, "linux-mm@kvack.org", LKML, Takashi Iwai,
	syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin

On Tue, 2 Feb 2016, Dmitry Vyukov wrote:

> Hello,
>
> If the following program run in a parallel loop, eventually it leaves
> hanged uninterruptable tasks on mmap_sem.
>
> [ 4074.740298] sysrq: SysRq : Show Locks Held
> [ 4074.740780] Showing all locks held in the system:
> ...
> [ 4074.762133] 1 lock held by a.out/1276:
> [ 4074.762427] #0: (&mm->mmap_sem){++++++}, at: []
> __mm_populate+0x25c/0x350
> [ 4074.763149] 1 lock held by a.out/1147:
> [ 4074.763438] #0: (&mm->mmap_sem){++++++}, at: []
> vm_mmap_pgoff+0x12c/0x1b0
> [ 4074.764164] 1 lock held by a.out/1284:
> [ 4074.764447] #0: (&mm->mmap_sem){++++++}, at: []
> __mm_populate+0x25c/0x350
> [ 4074.765287]
>
> They all look as follows:
>
> # cat /proc/1284/task/**/stack
> [] call_rwsem_down_write_failed+0x13/0x20
> [] vm_mmap_pgoff+0x12c/0x1b0
> [] SyS_mmap_pgoff+0x208/0x580
> [] SyS_mmap+0x16/0x20
> [] entry_SYSCALL_64_fastpath+0x16/0x7a
> [] 0xffffffffffffffff
> [] wait_on_page_bit+0x1de/0x210
> [] filemap_fault+0xfeb/0x14d0
> [] __do_fault+0x1b2/0x3e0
> [] handle_mm_fault+0x1b4e/0x49a0
> [] __get_user_pages+0x2c0/0x11a0
> [] populate_vma_page_range+0x198/0x230
> [] __mm_populate+0x1fb/0x350
> [] do_mlock+0x291/0x360
> [] SyS_mlock2+0x4b/0x70
> [] entry_SYSCALL_64_fastpath+0x16/0x7a
> [] 0xffffffffffffffff

This stacktrace is odd.

>
> # cat /proc/1284/status
> Name: a.out
> State: D (disk sleep)
> Tgid: 1147
> Ngid: 0
> Pid: 1284
> PPid: 28436
> TracerPid: 0
> Uid: 0 0 0 0
> Gid: 0 0 0 0
> FDSize: 64
> Groups: 0
> NStgid: 1147
> NSpid: 1284
> NSpgid: 28436
> NSsid: 6529
> VmPeak: 50356 kB
> VmSize: 50356 kB
> VmLck: 16 kB
> VmPin: 0 kB
> VmHWM: 8 kB
> VmRSS: 8 kB
> RssAnon: 8 kB
> RssFile: 0 kB
> RssShmem: 0 kB
> VmData: 49348 kB
> VmStk: 136 kB
> VmExe: 828 kB
> VmLib: 8 kB
> VmPTE: 44 kB
> VmPMD: 12 kB
> VmSwap: 0 kB
> HugetlbPages: 0 kB
> Threads: 2
> SigQ: 1/3189
> SigPnd: 0000000000000100
> ShdPnd: 0000000000000100
> SigBlk: 0000000000000000
> SigIgn: 0000000000000000
> SigCgt: 0000000180000000
> CapInh: 0000000000000000
> CapPrm: 0000003fffffffff
> CapEff: 0000003fffffffff
> CapBnd: 0000003fffffffff
> CapAmb: 0000000000000000
> Seccomp: 0
> Cpus_allowed: f
> Cpus_allowed_list: 0-3
> Mems_allowed: 00000000,00000003
> Mems_allowed_list: 0-1
> voluntary_ctxt_switches: 3
> nonvoluntary_ctxt_switches: 1
>
>
> There are no BUGs, WARNINGs, stalls on console.
>
> Not sure if its mm or floppy fault.

I am pretty sure that it's a floppy fault, even before I looked at the
reproducer.

>
>
> // autogenerated by syzkaller (http://github.com/google/syzkaller)
> #include
> #include
> #include
> #include
> #include
>
> #ifndef SYS_mlock2
> #define SYS_mlock2 325
> #endif
>
> long r[7];
>
> void* thr(void* arg)
> {
>   switch ((long)arg) {
>   case 0:
>     r[0] = syscall(SYS_mmap, 0x20000000ul, 0x1000ul, 0x3ul, 0x32ul,
>                    0xfffffffffffffffful, 0x0ul);
>     break;
>   case 1:
>     memcpy((void*)0x20000000, "\x2f\x64\x65\x76\x2f\x66\x64\x23", 8);
>     r[2] = syscall(SYS_open, "/dev/fd0", 0x800ul, 0, 0, 0);

Just to make sure -- I guess that this is a minimal testcase already,
right? IOW, if you eliminate the open(/dev/fd0) call, the bug will vanish?

I'll try to reproduce this later tonight or tomorrow.

-- 
Jiri Kosina
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.
For more info on Linux MM, see: http://www.linux-mm.org/ .
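
In the quoted /proc/1284/status output, the "State: D (disk sleep)" line is what identifies the task as stuck in uninterruptible sleep. Below is a minimal sketch, not taken from the thread, of how that check could be scripted when watching for hung tasks while the reproducer runs in a loop. The function names task_state_from_status(), task_state_of_pid() and is_uninterruptible() are hypothetical helpers introduced only for illustration; the sketch assumes the usual "State:" line of /proc/<pid>/status appears within the first 4 KiB of the file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): return the one-letter task
 * state found in the text of a /proc/<pid>/status file, or '\0' if the
 * text has no "State:" line. */
char task_state_from_status(const char *status_text)
{
    assert(status_text != NULL);

    const char *line = status_text;
    while (line && *line) {
        /* Only match "State:" at the start of a line. */
        if (strncmp(line, "State:", 6) == 0) {
            const char *p = line + 6;
            while (*p == ' ' || *p == '\t')
                p++;
            return (*p == '\n') ? '\0' : *p;
        }
        /* Advance to the character after the next newline, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return '\0';
}

/* Hypothetical helper: read /proc/<pid>/status and return the state
 * letter, or '\0' if the file cannot be read. Assumes the "State:" line
 * lies within the first 4 KiB. */
char task_state_of_pid(long pid)
{
    char path[64];
    char buf[4096];

    snprintf(path, sizeof(path), "/proc/%ld/status", pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return '\0';

    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    return task_state_from_status(buf);
}

/* 'D' is the letter the kernel reports for uninterruptible sleep. */
int is_uninterruptible(char state)
{
    return state == 'D';
}
```

Calling is_uninterruptible(task_state_of_pid(1284)) on the machine from the report would be expected to return nonzero for the hung thread; the parsing is kept separate from the file read so it can be exercised on captured status text as well.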