> On Apr 10, 2023, at 8:36 PM, kernel test robot <yujie.liu@intel.com> wrote:
>
> Hello,
>
> kernel test robot noticed "WARNING:inconsistent_lock_state" on:
>
> commit: 5fd403eb6c181c63a3aacd55d92b80256a0670cf ("shmem: stable directory cookies")
> git://git.kernel.org/cgit/linux/kernel/git/cel/linux topic-shmem-stable-dir-cookies
>
> in testcase: boot
>
> compiler: gcc-11
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
> If you fix the issue, kindly add following tag
> | Reported-by: kernel test robot <yujie.liu@intel.com>
> | Link:
https://lore.kernel.org/oe-lkp/202304101606.79aea62f-yujie.liu@intel.com
>
>
> [ 21.279213][ C0] WARNING: inconsistent lock state
> [ 21.279668][ C0] 6.3.0-rc5-00001-g5fd403eb6c18 #1 Not tainted
> [ 21.280199][ C0] --------------------------------
> [ 21.280657][ C0] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> [ 21.281238][ C0] swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
> [ 21.281773][ C0] ffff8881102e9b50 (&xa->xa_lock#3){+.?.}-{2:2}, at: xa_destroy (lib/xarray.c:2214)
> [ 21.283140][ C0] {SOFTIRQ-ON-W} state was registered at:
> [ 21.283640][ C0] __lock_acquire (kernel/locking/lockdep.c:5010)
> [ 21.284089][ C0] lock_acquire (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5671 kernel/locking/lockdep.c:5634)
> [ 21.284513][ C0] _raw_spin_lock (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
> [ 21.284937][ C0] shmem_doff_add (include/linux/xarray.h:965 mm/shmem.c:2943)
> [ 21.285375][ C0] shmem_mknod (mm/shmem.c:3014)
> [ 21.285791][ C0] vfs_mknod (fs/namei.c:3916)
> [ 21.286195][ C0] devtmpfs_work_loop (drivers/base/devtmpfs.c:228 drivers/base/devtmpfs.c:393 drivers/base/devtmpfs.c:408)
> [ 21.286653][ C0] devtmpfsd (devtmpfs.c:?)
> [ 21.287046][ C0] kthread (kernel/kthread.c:376)
> [ 21.287441][ C0] ret_from_fork (arch/x86/entry/entry_64.S:314)
> [ 21.287864][ C0] irq event stamp: 167451
> [ 21.288264][ C0] hardirqs last enabled at (167450): kasan_quarantine_put (arch/x86/include/asm/irqflags.h:42 (discriminator 1) arch/x86/include/asm/irqflags.h:77 (discriminator 1) arch/x86/include/asm/irqflags.h:135 (discriminator 1) mm/kasan/quarantine.c:242
(discriminator 1))
> [ 21.289095][ C0] hardirqs last disabled at (167451): _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:108 kernel/locking/spinlock.c:162)
> [ 21.289969][ C0] softirqs last enabled at (167330): __do_softirq (kernel/softirq.c:415 kernel/softirq.c:600)
> [ 21.290755][ C0] softirqs last disabled at (167355): irq_exit_rcu (kernel/softirq.c:445 kernel/softirq.c:650 kernel/softirq.c:640 kernel/softirq.c:662)
> [ 21.291540][ C0]
> [ 21.291540][ C0] other info that might help us debug this:
> [ 21.292230][ C0] Possible unsafe locking scenario:
> [ 21.292230][ C0]
> [ 21.292905][ C0] CPU0
> [ 21.293235][ C0] ----
> [ 21.293575][ C0] lock(&xa->xa_lock#3);
> [ 21.293987][ C0] <Interrupt>
> [ 21.294327][ C0] lock(&xa->xa_lock#3);
> [ 21.294753][ C0]
> [ 21.294753][ C0] *** DEADLOCK ***
> [ 21.294753][ C0]
> [ 21.295483][ C0] 1 lock held by swapper/0/0:
> [ 21.295914][ C0] #0: ffffffff8597a260 (rcu_callback){....}-{0:0}, at: rcu_do_batch (kernel/rcu/tree.c:2104)
It appears that RCU is trying to evict a tmpfs directory inode prematurely.
lockdep catches this because someone else is trying to add an entry to it
while RCU is trying to free it. Classic use-after-free.
Jeff, the only new iput() in this patch is the one you suggested in
shmem_symlink(). Are you sure it is needed (and also correct)?
> [ 21.296705][ C0]
> [ 21.296705][ C0] stack backtrace:
> [ 21.297254][ C0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.3.0-rc5-00001-g5fd403eb6c18 #1
> [ 21.298007][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
> [ 21.298867][ C0] Call Trace:
> [ 21.299199][ C0] <IRQ>
> [ 21.299503][ C0] dump_stack_lvl (lib/dump_stack.c:107)
> [ 21.299926][ C0] mark_lock_irq (kernel/locking/lockdep.c:232 kernel/locking/lockdep.c:3976 kernel/locking/lockdep.c:4179)
> [ 21.300362][ C0] ? secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:358)
> [ 21.300888][ C0] ? __pfx_mark_lock_irq (kernel/locking/lockdep.c:4170)
> [ 21.301353][ C0] ? __pfx_stack_trace_save (kernel/stacktrace.c:114)
> [ 21.302659][ C0] ? orc_find+0x23b/0x4d0
> [ 21.303106][ C0] ? save_trace (kernel/locking/lockdep.c:586)
> [ 21.303519][ C0] ? secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:358)
> [ 21.304046][ C0] ? kernel_text_address (kernel/extable.c:99)
> [ 21.304507][ C0] mark_lock+0x4b8/0x13a0
> [ 21.304959][ C0] ? __pfx_mark_lock+0x10/0x10
> [ 21.305445][ C0] mark_usage (kernel/locking/lockdep.c:4530)
> [ 21.305844][ C0] __lock_acquire (kernel/locking/lockdep.c:5010)
> [ 21.306285][ C0] ? mark_lock+0xbc/0x13a0
> [ 21.306741][ C0] ? kasan_save_stack (mm/kasan/common.c:46)
> [ 21.307182][ C0] ? __pfx___lock_acquire (kernel/locking/lockdep.c:4914)
> [ 21.307647][ C0] ? rcu_do_batch (include/linux/rcupdate.h:332 kernel/rcu/tree.c:2114)
> [ 21.308082][ C0] ? rcu_core (kernel/rcu/tree.c:2374)
> [ 21.308489][ C0] ? __do_softirq (arch/x86/include/asm/atomic.h:29 include/linux/jump_label.h:260 include/linux/jump_label.h:270 include/trace/events/irq.h:142 kernel/softirq.c:572)
> [ 21.308922][ C0] ? irq_exit_rcu (kernel/softirq.c:445 kernel/softirq.c:650 kernel/softirq.c:640 kernel/softirq.c:662)
> [ 21.309353][ C0] lock_acquire (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5671 kernel/locking/lockdep.c:5634)
> [ 21.309771][ C0] ? xa_destroy (lib/xarray.c:2214)
> [ 21.310185][ C0] ? __pfx_lock_acquire (kernel/locking/lockdep.c:5637)
> [ 21.310639][ C0] ? __pfx___lock_acquire (kernel/locking/lockdep.c:4914)
> [ 21.311104][ C0] ? ftrace_likely_update (arch/x86/include/asm/smap.h:56 kernel/trace/trace_branch.c:229)
> [ 21.311571][ C0] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
> [ 21.312040][ C0] ? xa_destroy (lib/xarray.c:2214)
> [ 21.312457][ C0] xa_destroy (lib/xarray.c:2214)
> [ 21.312859][ C0] ? lock_acquire (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5671 kernel/locking/lockdep.c:5634)
> [ 21.313290][ C0] ? __pfx_xa_destroy (lib/xarray.c:2208)
> [ 21.313746][ C0] shmem_free_in_core_inode (mm/shmem.c:2989 mm/shmem.c:4061)
> [ 21.314231][ C0] ? __pfx_shmem_free_in_core_inode (mm/shmem.c:4058)
> [ 21.314763][ C0] i_callback (fs/inode.c:252)
> [ 21.315154][ C0] rcu_do_batch (include/linux/rcupdate.h:332 kernel/rcu/tree.c:2114)
> [ 21.315580][ C0] ? __pfx_rcu_do_batch (kernel/rcu/tree.c:2051)
> [ 21.316037][ C0] ? _raw_spin_unlock_irqrestore (arch/x86/include/asm/irqflags.h:42 arch/x86/include/asm/irqflags.h:77 arch/x86/include/asm/irqflags.h:135 include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
> [ 21.316550][ C0] ? rcu_report_qs_rdp (kernel/rcu/tree.c:2002)
> [ 21.317012][ C0] rcu_core (kernel/rcu/tree.c:2374)
> [ 21.317414][ C0] __do_softirq (arch/x86/include/asm/atomic.h:29 include/linux/jump_label.h:260 include/linux/jump_label.h:270 include/trace/events/irq.h:142 kernel/softirq.c:572)
> [ 21.317836][ C0] irq_exit_rcu (kernel/softirq.c:445 kernel/softirq.c:650 kernel/softirq.c:640 kernel/softirq.c:662)
> [ 21.318251][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1107 (discriminator 14))
> [ 21.318749][ C0] </IRQ>
> [ 21.319052][ C0] <TASK>
> [ 21.319355][ C0] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:645)
> [ 21.319880][ C0] RIP: 0010:default_idle (arch/x86/include/asm/irqflags.h:37 arch/x86/include/asm/irqflags.h:72 arch/x86/kernel/process.c:704)
> [ 21.320344][ C0] Code: ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 8b 05 f6 b4 a8 03 85 c0 7e 07 0f 00 2d 8b 96 3d 00 fb f4 <fa> c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90
> All code
> ========
> 0: ff (bad)
> 1: ff (bad)
> 2: ff 90 90 90 90 90 call *-0x6f6f6f70(%rax)
> 8: 90 nop
> 9: 90 nop
> a: 90 nop
> b: 90 nop
> c: 90 nop
> d: 90 nop
> e: 90 nop
> f: 90 nop
> 10: 90 nop
> 11: 90 nop
> 12: 90 nop
> 13: f3 0f 1e fa endbr64
> 17: 8b 05 f6 b4 a8 03 mov 0x3a8b4f6(%rip),%eax # 0x3a8b513
> 1d: 85 c0 test %eax,%eax
> 1f: 7e 07 jle 0x28
> 21: 0f 00 2d 8b 96 3d 00 verw 0x3d968b(%rip) # 0x3d96b3
> 28: fb sti
> 29: f4 hlt
> 2a:* fa cli <-- trapping instruction
> 2b: c3 ret
> 2c: cc int3
> 2d: cc int3
> 2e: cc int3
> 2f: cc int3
> 30: 0f 1f 00 nopl (%rax)
> 33: 90 nop
> 34: 90 nop
> 35: 90 nop
> 36: 90 nop
> 37: 90 nop
> 38: 90 nop
> 39: 90 nop
> 3a: 90 nop
> 3b: 90 nop
> 3c: 90 nop
> 3d: 90 nop
> 3e: 90 nop
> 3f: 90 nop
>
> Code starting with the faulting instruction
> ===========================================
> 0: fa cli
> 1: c3 ret
> 2: cc int3
> 3: cc int3
> 4: cc int3
> 5: cc int3
> 6: 0f 1f 00 nopl (%rax)
> 9: 90 nop
> a: 90 nop
> b: 90 nop
> c: 90 nop
> d: 90 nop
> e: 90 nop
> f: 90 nop
> 10: 90 nop
> 11: 90 nop
> 12: 90 nop
> 13: 90 nop
> 14: 90 nop
> 15: 90 nop
>
>
> --
> 0-DAY CI Kernel Test Service
>
https://github.com/intel/lkp-tests