* mm/mmu_notifier: inconsistent lock state in mmu_notifier_register()
From: Andrea Righi @ 2012-10-17 21:53 UTC
To: Andrea Arcangeli; +Cc: Christoph Lameter, Andrew Morton, linux-mm, linux-kernel
Just got this on 3.7.0-rc1 (last git commit 1867353):
[49048.262912] =================================
[49048.262913] [ INFO: inconsistent lock state ]
[49048.262916] 3.7.0-rc1+ #518 Not tainted
[49048.262918] ---------------------------------
[49048.262919] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[49048.262922] kswapd0/35 [HC0[0]:SC0[0]:HE1:SE1] takes:
[49048.262924] (&mapping->i_mmap_mutex){+.+.?.}, at: [<ffffffff81192fbc>] page_referenced+0x9c/0x2e0
[49048.262933] {RECLAIM_FS-ON-W} state was registered at:
[49048.262935] [<ffffffff810ed5d6>] mark_held_locks+0x86/0x150
[49048.262938] [<ffffffff810edce7>] lockdep_trace_alloc+0x67/0xc0
[49048.262942] [<ffffffff811a9323>] kmem_cache_alloc_trace+0x33/0x230
[49048.262945] [<ffffffff811a1a27>] do_mmu_notifier_register+0x87/0x180
[49048.262948] [<ffffffff811a1b53>] mmu_notifier_register+0x13/0x20
[49048.262951] [<ffffffff81006738>] kvm_dev_ioctl+0x428/0x510
[49048.262955] [<ffffffff811c7ce8>] do_vfs_ioctl+0x98/0x570
[49048.262959] [<ffffffff811c8251>] sys_ioctl+0x91/0xb0
[49048.262962] [<ffffffff815df302>] system_call_fastpath+0x16/0x1b
[49048.262966] irq event stamp: 825
[49048.262968] hardirqs last enabled at (825): [<ffffffff815d6fa0>] _raw_spin_unlock_irq+0x30/0x60
[49048.262971] hardirqs last disabled at (824): [<ffffffff815d6659>] _raw_spin_lock_irq+0x19/0x80
[49048.262975] softirqs last enabled at (0): [<ffffffff81082170>] copy_process+0x630/0x17c0
[49048.262979] softirqs last disabled at (0): [< (null)>] (null)
[49048.262981]
[49048.262981] other info that might help us debug this:
[49048.262983] Possible unsafe locking scenario:
[49048.262983]
[49048.262984] CPU0
[49048.262986] ----
[49048.262987] lock(&mapping->i_mmap_mutex);
[49048.262989] <Interrupt>
[49048.262991] lock(&mapping->i_mmap_mutex);
[49048.262993]
[49048.262993] *** DEADLOCK ***
[49048.262993]
[49048.262995] no locks held by kswapd0/35.
[49048.262996]
[49048.262996] stack backtrace:
[49048.262999] Pid: 35, comm: kswapd0 Not tainted 3.7.0-rc1+ #518
[49048.263000] Call Trace:
[49048.263005] [<ffffffff815cd988>] print_usage_bug+0x1f5/0x206
[49048.263008] [<ffffffff8105a21f>] ? save_stack_trace+0x2f/0x50
[49048.263011] [<ffffffff810ea865>] mark_lock+0x295/0x2f0
[49048.263014] [<ffffffff810e9c70>] ? print_irq_inversion_bug.part.42+0x1f0/0x1f0
[49048.263017] [<ffffffff810eae5d>] __lock_acquire+0x59d/0x1c20
[49048.263020] [<ffffffff815cf163>] ? put_cpu_partial+0x65/0xbd
[49048.263024] [<ffffffff81052d06>] ? native_sched_clock+0x26/0x90
[49048.263028] [<ffffffff810c5555>] ? sched_clock_cpu+0xc5/0x120
[49048.263031] [<ffffffff810ecbe0>] lock_acquire+0x90/0x210
[49048.263034] [<ffffffff81192fbc>] ? page_referenced+0x9c/0x2e0
[49048.263038] [<ffffffff815d2ea3>] mutex_lock_nested+0x73/0x3d0
[49048.263041] [<ffffffff81192fbc>] ? page_referenced+0x9c/0x2e0
[49048.263044] [<ffffffff81192fbc>] ? page_referenced+0x9c/0x2e0
[49048.263047] [<ffffffff810e764e>] ? put_lock_stats.isra.26+0xe/0x40
[49048.263051] [<ffffffff810e7a84>] ? lock_release_holdtime.part.27+0xd4/0x150
[49048.263055] [<ffffffff8116edab>] ? __remove_mapping+0xab/0x120
[49048.263058] [<ffffffff81192fbc>] page_referenced+0x9c/0x2e0
[49048.263061] [<ffffffff81171b94>] shrink_page_list+0x3e4/0xa20
[49048.263064] [<ffffffff81052d06>] ? native_sched_clock+0x26/0x90
[49048.263068] [<ffffffff811726f5>] ? shrink_inactive_list+0x165/0x4b0
[49048.263071] [<ffffffff815d6fa0>] ? _raw_spin_unlock_irq+0x30/0x60
[49048.263075] [<ffffffff81172787>] shrink_inactive_list+0x1f7/0x4b0
[49048.263079] [<ffffffff81172e8d>] shrink_lruvec+0x44d/0x550
[49048.263082] [<ffffffff81173693>] kswapd+0x703/0xdf0
[49048.263086] [<ffffffff810af470>] ? __init_waitqueue_head+0x60/0x60
[49048.263090] [<ffffffff81172f90>] ? shrink_lruvec+0x550/0x550
[49048.263093] [<ffffffff810ae98d>] kthread+0xed/0x100
[49048.263097] [<ffffffff810ae8a0>] ? flush_kthread_worker+0x190/0x190
[49048.263100] [<ffffffff815df25c>] ret_from_fork+0x7c/0xb0
[49048.263103] [<ffffffff810ae8a0>] ? flush_kthread_worker+0x190/0x190
Should we use a GFP_NOFS allocation in mmu_notifier_register() or is
there a better way to fix/avoid this?
Thanks,
-Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
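
For readers following the report: the trace says that i_mmap_mutex was once held across an allocation that is allowed to enter filesystem reclaim (the kmalloc in do_mmu_notifier_register()), and is now being taken by kswapd from reclaim context. The toy model below is only a sketch of that consistency rule, not lockdep's implementation. All names and flag values in it are hypothetical, chosen for illustration. The one fact it relies on is that GFP_KERNEL permits filesystem reclaim while GFP_NOFS does not, which is why GFP_NOFS was raised as a possible fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; not the kernel's real values. */
#define MODEL_GFP_WAIT   0x1u
#define MODEL_GFP_IO     0x2u
#define MODEL_GFP_FS     0x4u
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_WAIT | MODEL_GFP_IO)

/* Two usage states tracked per lock class in this toy model. */
enum usage_bit {
	HELD_OVER_FS_ALLOC = 1u << 0, /* compare {RECLAIM_FS-ON-W} in the report */
	USED_IN_FS_RECLAIM = 1u << 1, /* compare {IN-RECLAIM_FS-W} in the report */
};

struct lock_class_model {
	unsigned int usage;
};

/* A class is consistent as long as it has not collected both states. */
static bool usage_consistent(const struct lock_class_model *c)
{
	const unsigned int both = HELD_OVER_FS_ALLOC | USED_IN_FS_RECLAIM;

	assert(c != NULL);
	return (c->usage & both) != both;
}

/* Called when an allocation happens while the lock is held. */
static bool alloc_while_held(struct lock_class_model *c, unsigned int gfp)
{
	if (gfp & MODEL_GFP_FS)
		c->usage |= HELD_OVER_FS_ALLOC;
	return usage_consistent(c);
}

/* Called when the lock is acquired from reclaim context (e.g. kswapd). */
static bool acquire_in_reclaim(struct lock_class_model *c)
{
	c->usage |= USED_IN_FS_RECLAIM;
	return usage_consistent(c);
}
```

In this model a GFP_KERNEL-style allocation under the lock followed by an acquisition in reclaim reports an inconsistency, while a GFP_NOFS-style allocation does not.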
* Re: mm/mmu_notifier: inconsistent lock state in mmu_notifier_register()
From: Gavin Shan @ 2012-10-18 12:24 UTC
To: Andrea Righi
Cc: Andrea Arcangeli, Christoph Lameter, Andrew Morton, linux-mm, linux-kernel

Hi Andrea,

Do you have a chance to try the attached patch?

Thanks,
Gavin

On Wed, Oct 17, 2012 at 11:53:38PM +0200, Andrea Righi wrote:
>Just got this on 3.7.0-rc1 (last git commit 1867353):
>
>[...]
>
>Should we use a GFP_NOFS allocation in mmu_notifier_register() or is
>there a better way to fix/avoid this?
>
>Thanks,
>-Andrea

[-- Attachment #2: 0001-mm-mmu_notifier-allocate-mmu_notifier-in-advance.patch --]
[-- Type: text/x-diff, Size: 2296 bytes --]

From 8b7dcc6afd617e8b52ed1b10221195cce0c8f442 Mon Sep 17 00:00:00 2001
From: Gavin Shan <shangw@linux.vnet.ibm.com>
Date: Thu, 18 Oct 2012 20:14:06 +0800
Subject: [PATCH] mm/mmu_notifier: allocate mmu_notifier in advance

While allocating mmu_notifier with parameter GFP_KERNEL, swap would
start to work in case of tight available memory. Eventually, that
would lead to deadlock while the swap daemon is swapping anonymous
pages. It was caused by commit e0f3c3f78da29b114e7c1c68019036559f715948
("mm/mmu_notifier: init notifier if necessary").

The patch simply backs out the above commit.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 mm/mmu_notifier.c |   26 +++++++++++++-------------
 1 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 479a1e7..8a5ac8c 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -196,28 +196,28 @@ static int do_mmu_notifier_register(struct mmu_notifier *mn,
 	BUG_ON(atomic_read(&mm->mm_users) <= 0);
 
 	/*
-	 * Verify that mmu_notifier_init() already run and the global srcu is
-	 * initialized.
-	 */
+	 * Verify that mmu_notifier_init() already run and the global srcu is
+	 * initialized.
+	 */
 	BUG_ON(!srcu.per_cpu_ref);
 
+	ret = -ENOMEM;
+	mmu_notifier_mm = kmalloc(sizeof(struct mmu_notifier_mm), GFP_KERNEL);
+	if (unlikely(!mmu_notifier_mm))
+		goto out;
+
 	if (take_mmap_sem)
 		down_write(&mm->mmap_sem);
 	ret = mm_take_all_locks(mm);
 	if (unlikely(ret))
-		goto out;
+		goto out_clean;
 
 	if (!mm_has_notifiers(mm)) {
-		mmu_notifier_mm = kmalloc(sizeof(struct mmu_notifier_mm),
-					GFP_KERNEL);
-		if (unlikely(!mmu_notifier_mm)) {
-			ret = -ENOMEM;
-			goto out_of_mem;
-		}
 		INIT_HLIST_HEAD(&mmu_notifier_mm->list);
 		spin_lock_init(&mmu_notifier_mm->lock);
 
 		mm->mmu_notifier_mm = mmu_notifier_mm;
+		mmu_notifier_mm = NULL;
 	}
 	atomic_inc(&mm->mm_count);
 
@@ -233,12 +233,12 @@ static int do_mmu_notifier_register(struct mmu_notifier *mn,
 	hlist_add_head(&mn->hlist, &mm->mmu_notifier_mm->list);
 	spin_unlock(&mm->mmu_notifier_mm->lock);
 
-out_of_mem:
 	mm_drop_all_locks(mm);
-out:
+out_clean:
 	if (take_mmap_sem)
 		up_write(&mm->mmap_sem);
-
+	kfree(mmu_notifier_mm);
+out:
 	BUG_ON(atomic_read(&mm->mm_users) <= 0);
 	return ret;
 }
-- 
1.7.5.4
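
The shape of the patch above (allocate before any lock is taken, hand the object over only if it is needed, free the unused spare on the way out) can be tried in userspace with a small model. This is a hedged sketch, not the kernel code: struct mm_model, struct notifier_list, take_all_locks(), drop_all_locks() and register_notifier() are hypothetical stand-ins for mm_struct, struct mmu_notifier_mm, mm_take_all_locks(), mm_drop_all_locks() and do_mmu_notifier_register(), and malloc() stands in for kmalloc(..., GFP_KERNEL).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for struct mmu_notifier_mm. */
struct notifier_list {
	int count;
};

/* Stand-in for the parts of mm_struct that matter here. */
struct mm_model {
	bool locks_held;                 /* models mm_take_all_locks() state */
	struct notifier_list *notifiers; /* NULL until the first registration */
};

static void take_all_locks(struct mm_model *mm)
{
	assert(!mm->locks_held);
	mm->locks_held = true;
}

static void drop_all_locks(struct mm_model *mm)
{
	assert(mm->locks_held);
	mm->locks_held = false;
}

static int register_notifier(struct mm_model *mm)
{
	struct notifier_list *spare;

	/* Allocate while no lock is held, so the allocator may block or reclaim. */
	assert(!mm->locks_held);
	spare = malloc(sizeof(*spare));
	if (!spare)
		return -ENOMEM;

	take_all_locks(mm);
	if (!mm->notifiers) {
		spare->count = 0;
		mm->notifiers = spare;
		spare = NULL; /* ownership moved to mm; nothing left to free */
	}
	mm->notifiers->count++;
	drop_all_locks(mm);

	free(spare); /* no-op when the spare was consumed above */
	return 0;
}
```

The cost of this ordering is one allocation that is thrown away whenever the mm already has a notifier list; in exchange, no allocation ever runs inside the locked region.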
* Re: mm/mmu_notifier: inconsistent lock state in mmu_notifier_register()
From: Andrea Righi @ 2012-10-18 12:48 UTC
To: Gavin Shan
Cc: Andrea Arcangeli, Christoph Lameter, Andrew Morton, linux-mm, linux-kernel

On Thu, Oct 18, 2012 at 08:24:17PM +0800, Gavin Shan wrote:
> Hi Andrea,
>
> Do you have a chance to try the attached patch?
>
> Thanks,
> Gavin

Oh I see, this is surely better than using GFP_NOFS. Trying your patch
right now.

Thanks!
-Andrea
* Re: mm/mmu_notifier: inconsistent lock state in mmu_notifier_register()
From: Andrea Righi @ 2012-10-19 7:05 UTC
To: Gavin Shan
Cc: Andrea Arcangeli, Christoph Lameter, Andrew Morton, linux-mm, linux-kernel

On Thu, Oct 18, 2012 at 08:24:17PM +0800, Gavin Shan wrote:
> Hi Andrea,
>
> Do you have a chance to try the attached patch?
>
> Thanks,
> Gavin

Gavin, the patch looks good to me and I confirm that the lockdep splat
disappeared with it. Feel free to add my:

Tested-by: Andrea Righi <andrea@betterlinux.com>

Thanks,
-Andrea