* [BUG] mmotm-081130: null pointer deref in sigkill_pending() @ 2008-12-02 13:18 Lee Schermerhorn 2008-12-02 18:46 ` Ying Han 0 siblings, 1 reply; 3+ messages in thread From: Lee Schermerhorn @ 2008-12-02 13:18 UTC (permalink / raw) To: linux-kernel, Andrew Morton; +Cc: Ying Han, linux-mm, Hugh Dickins While testing mmotm [11/29, 11/30, likely in 12/01], I came across the following bug after ~2 hours of heavy stress on x86_64 [same bug/trace hit on ia64]: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 IP: [<ffffffff8024e679>] sigkill_pending+0x19/0x30 PGD 63be12067 PUD 7fd0cc067 PMD 0 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC last sysfs file: /sys/block/hda/removable CPU 7 Modules linked in: sunrpc ipv6 dm_mirror dm_region_hash dm_log dm_mod rfkill input_polldev pci_slot parport_pc lp parport ide_cd_mod cdrom serio_raw hpilo hpwdt button amd_rng pata_acpi i2c_amd756 libata i2c_core pcspkr mptspi mptscsih sym53c8xx scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd uhci_hcd Pid: 19530, comm: ps Not tainted 2.6.28-rc6-mmotm-081130-2235 #10 RIP: 0010:[<ffffffff8024e679>] [<ffffffff8024e679>] sigkill_pending+0x19/0x30 RSP: 0018:ffff8806a64efd38 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 000000000000000e RCX: ffff8805fc96fb00 RDX: ffff8806a64efe38 RSI: 00007fff62423000 RDI: ffff88020c87abe0 RBP: ffff8806a64efd38 R08: 0000000000000000 R09: ffff8806a64efe30 R10: 0000000000000002 R11: 0000000000000000 R12: ffff8805fc96fb00 R13: 0000000000000000 R14: 0000000000000010 R15: ffff8806a64efe30 FS: 00007f32d4ac46f0(0000) GS:ffff8807fed46d80(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000038 CR3: 0000000784571000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process ps (pid: 19530, threadinfo ffff8806a64ee000, task ffff88062cf6d7c0) Stack: ffff8806a64efdb8 ffffffff80298986 0000000000000001 ffff88062cf6d7c0 ffff8803ad100680 ffff88020c87abe0 0000000000000000 ffff8806a64efe38 ffff8806a64efe30 00000001804f0765 00007fff624236be 0000000000000000 Call Trace: [<ffffffff80298986>] __get_user_pages+0x166/0x440 [<ffffffff80298c92>] get_user_pages+0x32/0x40 [<ffffffff80298dab>] access_process_vm+0x10b/0x1e0 [<ffffffff8030bd60>] proc_pid_cmdline+0x90/0x120 [<ffffffff802ae568>] ? alloc_pages_current+0xa8/0x100 [<ffffffff8030de5b>] proc_info_read+0x9b/0xe0 [<ffffffff802c0ba4>] vfs_read+0xc4/0x160 [<ffffffff802c1000>] sys_read+0x50/0x90 [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b Code: e8 08 83 e0 01 c3 66 0f 1f 44 00 00 66 0f 1f 44 00 00 f6 87 49 05 00 00 01 55 b8 01 00 00 00 48 89 e5 75 12 48 8b 87 10 05 00 00 <48> 8b 40 38 48 c1 e8 08 83 e0 01 c9 c3 66 2e 0f 1f 84 00 00 00 RIP [<ffffffff8024e679>] sigkill_pending+0x19/0x30 RSP <ffff8806a64efd38> CR2: 0000000000000038 ---[ end trace 80efa2c8bcce4fdc ]--- Ad hoc instrumentation showed that this is likely the result of the "make-get_user_pages-interruptible" patch in mmotm. The test workload includes a number of tasks that repeatedly run a list of miscellaneous programs, including ps(1). Occasionally, ps will catch a task with a NULL "tsk->signal" [exiting?], resulting in a null pointer deref in sigkill_pending() called from __get_user_pages(). Before the aforementioned patch, sigkill_pending() was only called from ptrace_stop() for tsk == current. The comment block on ptrace_stop() says: "This must be called with current->sighand->siglock held." So, it doesn't look safe to call sigkill_pending() from __get_user_pages()--especially on a task != current--without similar locking. I'm not sure what the correct fix should be here. Related: the patch and it's update add the following to __get_user_pages(): if (unlikely(sigkill_pending(current) || sigkill_pending(tsk))) return i ? i : -ERESTARTSYS; As a minimum, I don't think we want to call the second sigkill_pending() unless tsk != current. Lee -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [BUG] mmotm-081130: null pointer deref in sigkill_pending() 2008-12-02 13:18 [BUG] mmotm-081130: null pointer deref in sigkill_pending() Lee Schermerhorn @ 2008-12-02 18:46 ` Ying Han 2008-12-02 19:03 ` Lee Schermerhorn 0 siblings, 1 reply; 3+ messages in thread From: Ying Han @ 2008-12-02 18:46 UTC (permalink / raw) To: Lee Schermerhorn; +Cc: linux-kernel, Andrew Morton, linux-mm, Hugh Dickins On Tue, Dec 2, 2008 at 5:18 AM, Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > While testing mmotm [11/29, 11/30, likely in 12/01], I came across the > following bug after ~2 hours of heavy stress on x86_64 [same bug/trace > hit on ia64]: > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 > IP: [<ffffffff8024e679>] sigkill_pending+0x19/0x30 > PGD 63be12067 PUD 7fd0cc067 PMD 0 > Oops: 0000 [#1] SMP DEBUG_PAGEALLOC > last sysfs file: /sys/block/hda/removable > CPU 7 > Modules linked in: sunrpc ipv6 dm_mirror dm_region_hash dm_log dm_mod rfkill input_polldev pci_slot parport_pc lp parport ide_cd_mod cdrom serio_raw hpilo hpwdt button amd_rng pata_acpi i2c_amd756 libata i2c_core pcspkr mptspi mptscsih sym53c8xx scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd uhci_hcd > Pid: 19530, comm: ps Not tainted 2.6.28-rc6-mmotm-081130-2235 #10 > RIP: 0010:[<ffffffff8024e679>] [<ffffffff8024e679>] sigkill_pending+0x19/0x30 > RSP: 0018:ffff8806a64efd38 EFLAGS: 00010246 > RAX: 0000000000000000 RBX: 000000000000000e RCX: ffff8805fc96fb00 > RDX: ffff8806a64efe38 RSI: 00007fff62423000 RDI: ffff88020c87abe0 > RBP: ffff8806a64efd38 R08: 0000000000000000 R09: ffff8806a64efe30 > R10: 0000000000000002 R11: 0000000000000000 R12: ffff8805fc96fb00 > R13: 0000000000000000 R14: 0000000000000010 R15: ffff8806a64efe30 > FS: 00007f32d4ac46f0(0000) GS:ffff8807fed46d80(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 0000000000000038 CR3: 0000000784571000 CR4: 00000000000006e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process ps (pid: 19530, threadinfo ffff8806a64ee000, task ffff88062cf6d7c0) > Stack: > ffff8806a64efdb8 ffffffff80298986 0000000000000001 ffff88062cf6d7c0 > ffff8803ad100680 ffff88020c87abe0 0000000000000000 ffff8806a64efe38 > ffff8806a64efe30 00000001804f0765 00007fff624236be 0000000000000000 > Call Trace: > [<ffffffff80298986>] __get_user_pages+0x166/0x440 > [<ffffffff80298c92>] get_user_pages+0x32/0x40 > [<ffffffff80298dab>] access_process_vm+0x10b/0x1e0 > [<ffffffff8030bd60>] proc_pid_cmdline+0x90/0x120 > [<ffffffff802ae568>] ? alloc_pages_current+0xa8/0x100 > [<ffffffff8030de5b>] proc_info_read+0x9b/0xe0 > [<ffffffff802c0ba4>] vfs_read+0xc4/0x160 > [<ffffffff802c1000>] sys_read+0x50/0x90 > [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b > Code: e8 08 83 e0 01 c3 66 0f 1f 44 00 00 66 0f 1f 44 00 00 f6 87 49 05 00 00 01 55 b8 01 00 00 00 48 89 e5 75 12 48 8b 87 10 05 00 00 <48> 8b 40 38 48 c1 e8 08 83 e0 01 c9 c3 66 2e 0f 1f 84 00 00 00 > RIP [<ffffffff8024e679>] sigkill_pending+0x19/0x30 > RSP <ffff8806a64efd38> > CR2: 0000000000000038 > ---[ end trace 80efa2c8bcce4fdc ]--- > > Ad hoc instrumentation showed that this is likely the result of the > "make-get_user_pages-interruptible" patch in mmotm. The test workload > includes a number of tasks that repeatedly run a list of miscellaneous > programs, including ps(1). Occasionally, ps will catch a task with a > NULL "tsk->signal" [exiting?], resulting in a null pointer deref in > sigkill_pending() called from __get_user_pages(). > > Before the aforementioned patch, sigkill_pending() was only called from > ptrace_stop() for tsk == current. The comment block on ptrace_stop() > says: "This must be called with current->sighand->siglock held." So, > it doesn't look safe to call sigkill_pending() from > __get_user_pages()--especially on a task != current--without similar > locking. > > I'm not sure what the correct fix should be here. I made a fix for the patch as Oleg suggested which replace the sigkill_pending() as fatal_signal_pending(). Andrew deleted the original patch from mm tree this morning and i am about the post the new patch with the fix. > > Related: the patch and it's update add the following to > __get_user_pages(): > > if (unlikely(sigkill_pending(current) || > sigkill_pending(tsk))) > return i ? i : -ERESTARTSYS; > > As a minimum, I don't think we want to call the second sigkill_pending() > unless tsk != current. In most cases, (current == tsk) in get_user_pages. However in some situation current is calling get_user_pages on behalf of tsk and we want to interrupt the allocation in this case as well. I might change it something like this: > if (unlikely(sigkill_pending(current) || > ((current != tsk) && sigkill_pending(tsk)))) > return i ? i : -ERESTARTSYS; > > Lee > > > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [BUG] mmotm-081130: null pointer deref in sigkill_pending() 2008-12-02 18:46 ` Ying Han @ 2008-12-02 19:03 ` Lee Schermerhorn 0 siblings, 0 replies; 3+ messages in thread From: Lee Schermerhorn @ 2008-12-02 19:03 UTC (permalink / raw) To: Ying Han; +Cc: linux-kernel, Andrew Morton, linux-mm, Hugh Dickins On Tue, 2008-12-02 at 10:46 -0800, Ying Han wrote: > On Tue, Dec 2, 2008 at 5:18 AM, Lee Schermerhorn > <Lee.Schermerhorn@hp.com> wrote: > > While testing mmotm [11/29, 11/30, likely in 12/01], I came across the > > following bug after ~2 hours of heavy stress on x86_64 [same bug/trace > > hit on ia64]: > > > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 > > IP: [<ffffffff8024e679>] sigkill_pending+0x19/0x30 > > PGD 63be12067 PUD 7fd0cc067 PMD 0 > > Oops: 0000 [#1] SMP DEBUG_PAGEALLOC > > last sysfs file: /sys/block/hda/removable > > CPU 7 > > Modules linked in: sunrpc ipv6 dm_mirror dm_region_hash dm_log dm_mod rfkill input_polldev pci_slot parport_pc lp parport ide_cd_mod cdrom serio_raw hpilo hpwdt button amd_rng pata_acpi i2c_amd756 libata i2c_core pcspkr mptspi mptscsih sym53c8xx scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd uhci_hcd > > Pid: 19530, comm: ps Not tainted 2.6.28-rc6-mmotm-081130-2235 #10 > > RIP: 0010:[<ffffffff8024e679>] [<ffffffff8024e679>] sigkill_pending+0x19/0x30 > > RSP: 0018:ffff8806a64efd38 EFLAGS: 00010246 > > RAX: 0000000000000000 RBX: 000000000000000e RCX: ffff8805fc96fb00 > > RDX: ffff8806a64efe38 RSI: 00007fff62423000 RDI: ffff88020c87abe0 > > RBP: ffff8806a64efd38 R08: 0000000000000000 R09: ffff8806a64efe30 > > R10: 0000000000000002 R11: 0000000000000000 R12: ffff8805fc96fb00 > > R13: 0000000000000000 R14: 0000000000000010 R15: ffff8806a64efe30 > > FS: 00007f32d4ac46f0(0000) GS:ffff8807fed46d80(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > CR2: 0000000000000038 CR3: 0000000784571000 CR4: 00000000000006e0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > > Process ps (pid: 19530, threadinfo ffff8806a64ee000, task ffff88062cf6d7c0) > > Stack: > > ffff8806a64efdb8 ffffffff80298986 0000000000000001 ffff88062cf6d7c0 > > ffff8803ad100680 ffff88020c87abe0 0000000000000000 ffff8806a64efe38 > > ffff8806a64efe30 00000001804f0765 00007fff624236be 0000000000000000 > > Call Trace: > > [<ffffffff80298986>] __get_user_pages+0x166/0x440 > > [<ffffffff80298c92>] get_user_pages+0x32/0x40 > > [<ffffffff80298dab>] access_process_vm+0x10b/0x1e0 > > [<ffffffff8030bd60>] proc_pid_cmdline+0x90/0x120 > > [<ffffffff802ae568>] ? alloc_pages_current+0xa8/0x100 > > [<ffffffff8030de5b>] proc_info_read+0x9b/0xe0 > > [<ffffffff802c0ba4>] vfs_read+0xc4/0x160 > > [<ffffffff802c1000>] sys_read+0x50/0x90 > > [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b > > Code: e8 08 83 e0 01 c3 66 0f 1f 44 00 00 66 0f 1f 44 00 00 f6 87 49 05 00 00 01 55 b8 01 00 00 00 48 89 e5 75 12 48 8b 87 10 05 00 00 <48> 8b 40 38 48 c1 e8 08 83 e0 01 c9 c3 66 2e 0f 1f 84 00 00 00 > > RIP [<ffffffff8024e679>] sigkill_pending+0x19/0x30 > > RSP <ffff8806a64efd38> > > CR2: 0000000000000038 > > ---[ end trace 80efa2c8bcce4fdc ]--- > > > > Ad hoc instrumentation showed that this is likely the result of the > > "make-get_user_pages-interruptible" patch in mmotm. The test workload > > includes a number of tasks that repeatedly run a list of miscellaneous > > programs, including ps(1). Occasionally, ps will catch a task with a > > NULL "tsk->signal" [exiting?], resulting in a null pointer deref in > > sigkill_pending() called from __get_user_pages(). > > > > Before the aforementioned patch, sigkill_pending() was only called from > > ptrace_stop() for tsk == current. The comment block on ptrace_stop() > > says: "This must be called with current->sighand->siglock held." So, > > it doesn't look safe to call sigkill_pending() from > > __get_user_pages()--especially on a task != current--without similar > > locking. > > > > > I'm not sure what the correct fix should be here. > I made a fix for the patch as Oleg suggested which replace the > sigkill_pending() as fatal_signal_pending(). > Andrew deleted the original patch from mm tree this morning and i am > about the post the new patch with > the fix. OK. I've been working on a fix, but I'll await yours. I have found another interaction of this patch with munlock of mlocked vmas on exit if the task was killed with SIGKILL. Get_user_pages() fails and we don't munlock the pages. I have a fix for that [another internal GUPS_FLAG], as well, but I'll wait until yours shows up in mmotm and rebase atop that. Meanwhile, my fix lets me continue to test on mmotm-081201. > > > > Related: the patch and it's update add the following to > > __get_user_pages(): > > > > if (unlikely(sigkill_pending(current) || > > sigkill_pending(tsk))) > > return i ? i : -ERESTARTSYS; > > > > As a minimum, I don't think we want to call the second sigkill_pending() > > unless tsk != current. > In most cases, (current == tsk) in get_user_pages. However in some > situation current is calling get_user_pages on behalf of tsk and we > want to interrupt the allocation in this case as well. Agreed. It was, in fact, the case where ps(1) was calling get_user_pages() to access another [exiting] task's command line via /proc that I hit the problem. > I might change it something like this: > > > if (unlikely(sigkill_pending(current) || > > ((current != tsk) && sigkill_pending(tsk)))) > > return i ? i : -ERESTARTSYS; Yes, that's what I was thinking. Thanks, Lee -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2008-12-02 19:03 UTC | newest] Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2008-12-02 13:18 [BUG] mmotm-081130: null pointer deref in sigkill_pending() Lee Schermerhorn 2008-12-02 18:46 ` Ying Han 2008-12-02 19:03 ` Lee Schermerhorn
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox