Greeting, FYI, we noticed the following commit (built with gcc-11): commit: 73fc16ad40634badd5f51083e94792bb267f2919 ("mm/page_owner.c: use get_task_comm() to record task command name with the protection of task_lock()") https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master in testcase: boot on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): If you fix the issue, kindly add following tag Reported-by: kernel test robot [ 42.475348][ C1] WARNING: inconsistent lock state [ 42.476513][ C1] 5.18.0-rc3-00045-g73fc16ad4063 #1 Not tainted [ 42.477915][ C1] -------------------------------- [ 42.479133][ C1] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 42.480717][ C1] swapper/1/0 [HC0[0]:SC1[3]:HE1:SE0] takes: [ 42.482139][ C1] ffff888100808cd8 (&p->alloc_lock){+.?.}-{2:2}, at: __get_task_comm (fs/exec.c:1217) [ 42.484200][ C1] {SOFTIRQ-ON-W} state was registered at: [ 42.485512][ C1] __lock_acquire (kernel/locking/lockdep.c:4983) [ 42.490900][ C1] lock_acquire (kernel/locking/lockdep.c:436 kernel/locking/lockdep.c:5643 kernel/locking/lockdep.c:5606) [ 42.492097][ C1] _raw_spin_lock (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) [ 42.493338][ C1] __set_task_comm (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/task.h:34 fs/exec.c:1231) [ 42.494644][ C1] kthreadd (kernel/kthread.c:723) [ 42.495684][ C1] ret_from_fork (arch/x86/entry/entry_64.S:304) [ 42.496803][ C1] irq event stamp: 172198 [ 42.497798][ C1] hardirqs last enabled at (172198): _raw_spin_unlock_irqrestore (arch/x86/include/asm/irqflags.h:45 arch/x86/include/asm/irqflags.h:80 arch/x86/include/asm/irqflags.h:138 include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194) [ 42.500372][ C1] hardirqs last disabled at (172197): _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:108 kernel/locking/spinlock.c:162) [ 42.502627][ C1] softirqs last enabled at (172176): __do_softirq (arch/x86/include/asm/preempt.h:27 kernel/softirq.c:402 kernel/softirq.c:587) [ 42.504810][ C1] softirqs last disabled at (172183): __irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:637) [ 42.507080][ C1] [ 42.507080][ C1] other info that might help us debug this: [ 42.509073][ C1] Possible unsafe locking scenario: [ 42.509073][ C1] [ 42.510935][ C1] CPU0 [ 42.511819][ C1] ---- [ 42.512771][ C1] lock(&p->alloc_lock); [ 42.513881][ C1] [ 42.514897][ C1] lock(&p->alloc_lock); [ 42.516025][ C1] [ 42.516025][ C1] *** DEADLOCK *** [ 42.516025][ C1] [ 42.518075][ C1] no locks held by swapper/1/0. [ 42.519348][ C1] [ 42.519348][ C1] stack backtrace: [ 42.520964][ C1] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.18.0-rc3-00045-g73fc16ad4063 #1 [ 42.523160][ C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 42.525319][ C1] Call Trace: [ 42.526317][ C1] [ 42.527101][ C1] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) [ 42.528143][ C1] mark_lock_irq.cold (kernel/locking/lockdep.c:4178 kernel/locking/lockdep.c:3947 kernel/locking/lockdep.c:4150) [ 42.529457][ C1] ? secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:303) [ 42.530942][ C1] ? lock_chain_count (kernel/locking/lockdep.c:4141) [ 42.532165][ C1] ? stack_trace_save (kernel/stacktrace.c:123) [ 42.533480][ C1] ? filter_irq_stacks (kernel/stacktrace.c:114) [ 42.534827][ C1] ? save_trace (kernel/locking/lockdep.c:555) [ 42.535869][ C1] mark_lock+0x4c9/0xac0 [ 42.537066][ C1] ? mark_lock_irq (kernel/locking/lockdep.c:4566) [ 42.538311][ C1] ? lock_downgrade (kernel/locking/lockdep.c:5293) [ 42.539540][ C1] mark_usage (kernel/locking/lockdep.c:4502) [ 42.540657][ C1] __lock_acquire (kernel/locking/lockdep.c:4983) [ 42.541858][ C1] ? _raw_spin_unlock_irqrestore (arch/x86/include/asm/irqflags.h:45 arch/x86/include/asm/irqflags.h:80 arch/x86/include/asm/irqflags.h:138 include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194) [ 42.543367][ C1] lock_acquire (kernel/locking/lockdep.c:436 kernel/locking/lockdep.c:5643 kernel/locking/lockdep.c:5606) [ 42.544612][ C1] ? __get_task_comm (fs/exec.c:1217) [ 42.545826][ C1] ? rcu_read_unlock (include/linux/rcupdate.h:723 (discriminator 5)) [ 42.547145][ C1] ? memory_failure_queue_kick (mm/page_owner.c:107) [ 42.548624][ C1] ? e1000_alloc_rx_buffers (include/linux/skbuff.h:3071 drivers/net/ethernet/intel/e1000/e1000_main.c:2072 drivers/net/ethernet/intel/e1000/e1000_main.c:4583) [ 42.549998][ C1] ? e1000_clean_rx_irq (drivers/net/ethernet/intel/e1000/e1000_main.c:4485) [ 42.551339][ C1] ? e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3805) [ 42.552510][ C1] ? __napi_poll+0xa1/0x380 [ 42.553784][ C1] ? net_rx_action (net/core/dev.c:6482 net/core/dev.c:6567) [ 42.555053][ C1] ? __do_softirq (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:559) [ 42.556298][ C1] _raw_spin_lock (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) [ 42.557427][ C1] ? __get_task_comm (fs/exec.c:1217) [ 42.558732][ C1] __get_task_comm (fs/exec.c:1217) [ 42.559817][ C1] __set_page_owner (include/linux/instrumented.h:41 include/asm-generic/bitops/instrumented-non-atomic.h:27 mm/page_owner.c:172 mm/page_owner.c:189) [ 42.561029][ C1] get_page_from_freelist (mm/page_alloc.c:2452 mm/page_alloc.c:4182) [ 42.562325][ C1] __alloc_pages (mm/page_alloc.c:5408) [ 42.563467][ C1] ? __alloc_pages_slowpath+0x1900/0x1900 [ 42.565180][ C1] ? __skb_gro_checksum_complete (net/core/gro.c:774) [ 42.566702][ C1] ? __lock_release (kernel/locking/lockdep.c:5317) [ 42.567796][ C1] page_frag_alloc_align (mm/page_alloc.c:5532 mm/page_alloc.c:5561) [ 42.569066][ C1] __netdev_alloc_frag_align (include/linux/bottom_half.h:33 net/core/skbuff.c:164) [ 42.570427][ C1] e1000_alloc_rx_buffers (include/linux/skbuff.h:3071 drivers/net/ethernet/intel/e1000/e1000_main.c:2072 drivers/net/ethernet/intel/e1000/e1000_main.c:4583) [ 42.571776][ C1] ? napi_gro_receive (include/trace/events/net.h:287 net/core/gro.c:631) [ 42.573118][ C1] e1000_clean_rx_irq (drivers/net/ethernet/intel/e1000/e1000_main.c:4485) [ 42.574128][ C1] ? e1000_82547_tx_fifo_stall_task (drivers/net/ethernet/intel/e1000/e1000_main.c:4353) [ 42.575704][ C1] e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3805) [ 42.576928][ C1] ? e1000_clean_tx_irq (drivers/net/ethernet/intel/e1000/e1000_main.c:3796) [ 42.578339][ C1] __napi_poll+0xa1/0x380 [ 42.579552][ C1] net_rx_action (net/core/dev.c:6482 net/core/dev.c:6567) [ 42.580746][ C1] ? __napi_poll+0x380/0x380 [ 42.582031][ C1] ? sched_clock_cpu (kernel/sched/clock.c:369) [ 42.583354][ C1] ? lock_downgrade (kernel/locking/lockdep.c:5293) [ 42.584624][ C1] __do_softirq (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:559) [ 42.585748][ C1] __irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:637) [ 42.586968][ C1] irq_exit_rcu (kernel/softirq.c:651) [ 42.588004][ C1] common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) [ 42.589199][ C1] [ 42.589956][ C1] [ 42.590819][ C1] asm_common_interrupt (arch/x86/include/asm/idtentry.h:636) [ 42.592069][ C1] RIP: 0010:default_idle (arch/x86/kernel/process.c:734) [ 42.593441][ C1] Code: ff ff ff 48 89 df e8 8f 12 f7 fd e9 55 ff ff ff 4c 89 e7 e8 82 12 f7 fd eb 96 0f 1f 44 00 00 eb 07 0f 00 2d b2 61 48 00 fb f4 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc All code ======== 0: ff (bad) 1: ff (bad) 2: ff 48 89 decl -0x77(%rax) 5: df e8 fucomip %st(0),%st 7: 8f (bad) 8: 12 f7 adc %bh,%dh a: fd std b: e9 55 ff ff ff jmpq 0xffffffffffffff65 10: 4c 89 e7 mov %r12,%rdi 13: e8 82 12 f7 fd callq 0xfffffffffdf7129a 18: eb 96 jmp 0xffffffffffffffb0 1a: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 1f: eb 07 jmp 0x28 21: 0f 00 2d b2 61 48 00 verw 0x4861b2(%rip) # 0x4861da 28: fb sti 29: f4 hlt 2a:* c3 retq <-- trapping instruction 2b: cc int3 2c: cc int3 2d: cc int3 2e: cc int3 2f: cc int3 30: cc int3 31: cc int3 32: cc int3 33: cc int3 34: cc int3 35: cc int3 36: cc int3 37: cc int3 38: cc int3 39: cc int3 3a: cc int3 3b: cc int3 3c: cc int3 3d: cc int3 3e: cc int3 3f: cc int3 Code starting with the faulting instruction =========================================== 0: c3 retq 1: cc int3 2: cc int3 3: cc int3 4: cc int3 5: cc int3 6: cc int3 7: cc int3 8: cc int3 9: cc int3 a: cc int3 b: cc int3 c: cc int3 d: cc int3 e: cc int3 f: cc int3 10: cc int3 11: cc int3 12: cc int3 13: cc int3 14: cc int3 15: cc int3 To reproduce: # build kernel cd linux cp config-5.18.0-rc3-00045-g73fc16ad4063 .config make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH= modules_install cd find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz git clone https://github.com/intel/lkp-tests.git cd lkp-tests bin/lkp qemu -k -m modules.cgz job-script # job-script is attached in this email # if come across any failure that blocks the test, # please remove ~/.lkp and /lkp dir to run from a clean state. -- 0-DAY CI Kernel Test Service https://01.org/lkp