FYI, we noticed the following commit (built with gcc-7):

commit: f66871fb4ce1e3784559ed297cfe868615c93102 ("Synchronize task mm counters on demand")
url: https://github.com/0day-ci/linux/commits/Daniel-Colascione/Synchronize-task-mm-counters-on-demand/20180222-231321

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 2G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):

+-------------------------------------------------+------------+------------+
|                                                 | af3e79d295 | f66871fb4c |
+-------------------------------------------------+------------+------------+
| boot_successes                                  | 1093       | 0          |
| boot_failures                                   | 59         | 14         |
| RIP:arch_local_irq_restore                      | 3          | 4          |
| BUG:kernel_hang_in_boot_stage                   | 4          | 9          |
| RIP:arch_local_irq_enable                       | 2          |            |
| INFO:rcu_sched_detected_stalls_on_CPUs/tasks    | 2          | 1          |
| BUG:kernel_hang_in_test_stage                   | 53         | 4          |
| INFO:rcu_sched_self-detected_stall_on_CPU       | 1          |            |
| WARNING:inconsistent_lock_state                 | 0          | 14         |
| inconsistent{HARDIRQ-ON-W}->{IN-HARDIRQ-W}usage | 0          | 14         |
| RIP:__clear_user                                | 0          | 1          |
| RIP:queued_spin_lock_slowpath                   | 0          | 1          |
| RIP:__down_read_trylock                         | 0          | 1          |
| RIP:smp_call_function_single                    | 0          | 1          |
| Kernel_panic-not_syncing:softlockup:hung_tasks  | 0          | 1          |
| RIP:lock_acquire                                | 0          | 2          |
| RIP:__d_lookup_rcu                              | 0          | 1          |
| RIP:check_poison_obj                            | 0          | 1          |
| RIP:_copy_from_iter_full                        | 0          | 1          |
+-------------------------------------------------+------------+------------+

[   80.120252] WARNING: inconsistent lock state
[   80.120252] 4.16.0-rc2-00065-gf66871f #151 Not tainted
[   80.120252] --------------------------------
[   80.120252] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[   80.120252] modprobe/141 [HC1[1]:SC0[0]:HE0:SE1] takes:
[   80.120252]  (&(&p->alloc_lock)->rlock){?.+.}, at: [<0000000034f1cfd0>] sync_mm_rss_all_users+0xea/0x16d
[   80.120252] {HARDIRQ-ON-W} state was registered at:
[   80.120252]   _raw_spin_lock+0x30/0x61
[   80.120252]   __set_task_comm+0x25/0x156
[   80.120252]   kthreadd+0x28/0x21d
[   80.120252]   ret_from_fork+0x3a/0x50
[   80.120252] irq event stamp: 312
[   80.120252] hardirqs last enabled at (311): [<00000000f42cde47>] _raw_read_unlock_irqrestore+0x42/0x54
[   80.120252] hardirqs last disabled at (312): [<000000003fa8ce06>] apic_timer_interrupt+0x82/0x90
[   80.120252] softirqs last enabled at (104): [<0000000043d2b201>] __do_softirq+0x3ad/0x3e9
[   80.120252] softirqs last disabled at (93): [<0000000014e94b82>] irq_exit+0x57/0xa6
[   80.120252]
[   80.120252] other info that might help us debug this:
[   80.120252]  Possible unsafe locking scenario:
[   80.120252]
[   80.120252]        CPU0
[   80.120252]        ----
[   80.120252]   lock(&(&p->alloc_lock)->rlock);
[   80.120252]
[   80.120252]     lock(&(&p->alloc_lock)->rlock);
[   80.120252]
[   80.120252]  *** DEADLOCK ***
[   80.120252]
[   80.120252] 1 lock held by modprobe/141:
[   80.120252]  #0:  (rcu_read_lock){....}, at: [<000000002ed86a2c>] sync_mm_rss_all_users+0x5/0x16d
[   80.120252]
[   80.120252] stack backtrace:
[   80.120252] CPU: 0 PID: 141 Comm: modprobe Not tainted 4.16.0-rc2-00065-gf66871f #151
[   80.120252] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   80.120252] Call Trace:
[   80.120252]
[   80.120252]  dump_stack+0x81/0xb3
[   80.120252]  print_usage_bug+0x1b6/0x1c5
[   80.120252]  ? print_shortest_lock_dependencies+0x177/0x177
[   80.120252]  mark_lock+0x10d/0x1fa
[   80.120252]  __lock_acquire+0x3a5/0xe9a
[   80.120252]  ? sync_mm_rss_all_users+0xea/0x16d
[   80.120252]  ? __lock_acquire+0x32c/0xe9a
[   80.120252]  ? rcu_read_unlock+0x59/0x59
[   80.120252]  ? sync_mm_rss_all_users+0xea/0x16d
[   80.120252]  ? lock_acquire+0x183/0x1bd
[   80.120252]  lock_acquire+0x183/0x1bd
[   80.120252]  ? sync_mm_rss_all_users+0xea/0x16d
[   80.120252]  _raw_spin_lock+0x30/0x61
[   80.120252]  ? sync_mm_rss_all_users+0xea/0x16d
[   80.120252]  sync_mm_rss_all_users+0xea/0x16d
[   80.120252]  get_mm_counter+0x19/0x33
[   80.120252]  get_mm_rss+0xc/0x32
[   80.120252]  __acct_update_integrals+0x38/0x64
[   80.120252]  update_process_times+0x1c/0x4a
[   80.120252]  tick_sched_handle+0x45/0x51
[   80.120252]  tick_sched_timer+0x34/0x62
[   80.120252]  __hrtimer_run_queues+0x1e7/0x342
[   80.120252]  ? tick_sched_do_timer+0x29/0x29
[   80.120252]  hrtimer_interrupt+0x92/0x165
[   80.120252]  smp_apic_timer_interrupt+0x155/0x255
[   80.120252]  apic_timer_interrupt+0x87/0x90
[   80.120252]
[   80.120252] RIP: 0010:lock_acquire+0x2/0x1bd
[   80.120252] RSP: 0000:ffffadc381527d18 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12
[   80.120252] RAX: ffff9aa430cd53c0 RBX: ffff9aa43258d0c0 RCX: 0000000000000001
[   80.120252] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9aa430cd54e8
[   80.120252] RBP: ffffadc381527d38 R08: 0000000000000001 R09: 0000000000000000
[   80.120252] R10: ffffadc381527c78 R11: ffffffffb5805e98 R12: ffff9aa431ade900
[   80.120252] R13: 00007ffcc8d48c40 R14: ffff9aa431165b00 R15: 00007ffcc8d48c39
[   80.120252]  __might_fault+0x61/0x89
[   80.120252]  ? __might_fault+0x37/0x89
[   80.120252]  create_elf_tables+0x7e/0x515
[   80.120252]  ? map_vdso+0x102/0x110
[   80.120252]  load_elf_binary+0xc56/0xe9c
[   80.120252]  search_binary_handler+0x86/0x209
[   80.120252]  do_execveat_common+0x495/0x748
[   80.120252]  ? rcu_read_lock_sched_held+0x38/0x5a
[   80.120252]  do_execve+0x1f/0x21
[   80.120252]  call_usermodehelper_exec_async+0xfa/0x122
[   80.120252]  ? call_usermodehelper+0x3e/0x3e
[   80.120252]  ret_from_fork+0x3a/0x50
[   81.094814] modprobe (141) used greatest stack depth: 14024 bytes left
[   81.100631] lp: driver loaded but no devices found
[   81.127691] Applicom driver: $Id: ac.c,v 1.30 2000/03/22 16:03:57 dwmw2 Exp $
[   81.138256] ac.o: No PCI boards found.
[   81.145095] ac.o: For an ISA board you must supply memory and irq parameters.
[   81.163664] Non-volatile memory driver v1.3
BUG: kernel hang in boot stage

Elapsed time: 730

#!/bin/bash

# To reproduce,
# 1) save job-script and this script (both are attached in 0day report email)
# 2) run this script with your compiled kernel and optional env $INSTALL_MOD_PATH

kernel=$1

initrds=(


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k job-script # job-script is attached in this email


Thanks,
lkp