Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: 040b83fcecfb86f3225d3a5de7fd9b3fbccf83b4 ("sbitmap: fix possible io hung due to lost wakeup")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: ltp
version: ltp-x86_64-14c1f76-1_20220929
with following parameters:

        disk: 1HDD
        fs: f2fs
        test: dio-03

test-description: The LTP testsuite contains a collection of tools for testing the Linux kernel and related features.
test-url: http://linux-test-project.github.io/

on test machine: 4 threads 1 sockets Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz (Ivy Bridge) with 8G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):

If you fix the issue, kindly add following tag
| Reported-by: kernel test robot
| Link: https://lore.kernel.org/r/202210052046.709b0594-oliver.sang@intel.com

[ 388.493914][ C2] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [diotest3:4818]
[ 388.494088][ C2] Modules linked in: dm_mod f2fs crc32_generic netconsole btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c i915 sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 sg drm_buddy intel_rapl_msr intel_rapl_common intel_gtt x86_pkg_temp_thermal intel_powerclamp drm_display_helper ttm usb_storage coretemp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm_kms_helper syscopyarea ahci rapl libahci sysfillrect intel_cstate intel_uncore sysimgblt ipmi_devintf mei_me libata mei fb_sys_fops ipmi_msghandler video drm fuse ip_tables
[ 388.494989][ C2] CPU: 2 PID: 4818 Comm: diotest3 Not tainted 6.0.0-rc2-00011-g040b83fcecfb #1
[ 388.495150][ C2] Hardware name: Hewlett-Packard p6-1451cx/2ADA, BIOS 8.15 02/05/2013
[ 388.495296][ C2] RIP: 0010:__sbq_wake_up (include/linux/list.h:292 include/linux/wait.h:129 lib/sbitmap.c:591 lib/sbitmap.c:609)
[ 388.495404][ C2] Code: 28 41 bd 08 00 00 00 48 ba 00 00 00 00 00 fc ff df 48 63 dd 48 c1 e3 06 48 01 cb 4c 8d 63 10 4c 89 e0 48 c1 e8 03 80 3c 10 00 <0f> 85 0f 02 00 00 48 8b 43 10 49 39 c4 75 21 83 c5 01 83 e5 07 41
All code
========
   0:   28 41 bd                sub    %al,-0x43(%rcx)
   3:   08 00                   or     %al,(%rax)
   5:   00 00                   add    %al,(%rax)
   7:   48 ba 00 00 00 00 00    movabs $0xdffffc0000000000,%rdx
   e:   fc ff df
  11:   48 63 dd                movslq %ebp,%rbx
  14:   48 c1 e3 06             shl    $0x6,%rbx
  18:   48 01 cb                add    %rcx,%rbx
  1b:   4c 8d 63 10             lea    0x10(%rbx),%r12
  1f:   4c 89 e0                mov    %r12,%rax
  22:   48 c1 e8 03             shr    $0x3,%rax
  26:   80 3c 10 00             cmpb   $0x0,(%rax,%rdx,1)
  2a:*  0f 85 0f 02 00 00       jne    0x23f            <-- trapping instruction
  30:   48 8b 43 10             mov    0x10(%rbx),%rax
  34:   49 39 c4                cmp    %rax,%r12
  37:   75 21                   jne    0x5a
  39:   83 c5 01                add    $0x1,%ebp
  3c:   83 e5 07                and    $0x7,%ebp
  3f:   41                      rex.B

Code starting with the faulting instruction
===========================================
   0:   0f 85 0f 02 00 00       jne    0x215
   6:   48 8b 43 10             mov    0x10(%rbx),%rax
   a:   49 39 c4                cmp    %rax,%r12
   d:   75 21                   jne    0x30
   f:   83 c5 01                add    $0x1,%ebp
  12:   83 e5 07                and    $0x7,%ebp
  15:   41                      rex.B
[ 388.495727][ C2] RSP: 0018:ffffc90000220da8 EFLAGS: 00000246
[ 388.495854][ C2] RAX: 1ffff11043ebe112 RBX: ffff88821f5f0880 RCX: ffff88821f5f0800
[ 388.496001][ C2] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff888218207938
[ 388.496144][ C2] RBP: 0000000000000002 R08: 0000000000000000 R09: ffff888218207937
[ 388.496287][ C2] R10: ffffed1043040f26 R11: 0000000000000001 R12: ffff88821f5f0890
[ 388.496428][ C2] R13: 0000000000000008 R14: ffff888218207910 R15: ffff888218207934
[ 388.496571][ C2] FS: 00007f01ba9ee740(0000) GS:ffff888180700000(0000) knlGS:0000000000000000
[ 388.496741][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 388.496873][ C2] CR2: 00007feb7870f500 CR3: 0000000215fec006 CR4: 00000000001706e0
[ 388.497018][ C2] Call Trace:
[ 388.497087][ C2]
[ 388.497150][ C2]  sbitmap_queue_clear (lib/sbitmap.c:657 lib/sbitmap.c:725)
[ 388.497248][ C2]  __blk_mq_free_request (block/blk-mq.c:623)
[ 388.497351][ C2]  ? blk_mq_free_request (arch/x86/include/asm/atomic.h:123 include/linux/atomic/atomic-instrumented.h:576 block/blk.h:483 block/blk-mq.c:646)
[ 388.497452][ C2]  scsi_end_request (drivers/scsi/scsi_lib.c:573)
[ 388.497561][ C2]  scsi_io_completion (drivers/scsi/scsi_lib.c:965)
[ 388.497671][ C2]  ? scsi_run_host_queues (drivers/scsi/scsi_lib.c:943)
[ 388.497783][ C2]  ? scsi_device_unbusy (arch/x86/include/asm/bitops.h:60 include/asm-generic/bitops/instrumented-atomic.h:29 include/linux/sbitmap.h:327 include/linux/sbitmap.h:336 drivers/scsi/scsi_lib.c:296)
[ 388.497899][ C2]  blk_complete_reqs (block/blk-mq.c:1021 (discriminator 3))
[ 388.497999][ C2]  __do_softirq (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:572)
[ 388.498091][ C2]  __irq_exit_rcu (kernel/softirq.c:445 kernel/softirq.c:650)
[ 388.498185][ C2]  common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14))
[ 388.498278][ C2]
[ 388.498341][ C2]
[ 388.498404][ C2]  asm_common_interrupt (arch/x86/include/asm/idtentry.h:640)
[ 388.498512][ C2] RIP: 0010:_raw_spin_unlock_irqrestore (kernel/locking/spinlock.c:195)
[ 388.498635][ C2] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 c6 07 00 0f 1f 00 f7 c6 00 02 00 00 74 01 fb 65 ff 0d a5 20 b6 7c <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84
All code
========
   0:   cc                      int3
   1:   cc                      int3
   2:   cc                      int3
   3:   cc                      int3
   4:   cc                      int3
   5:   cc                      int3
   6:   cc                      int3
   7:   cc                      int3
   8:   cc                      int3
   9:   cc                      int3
   a:   cc                      int3
   b:   cc                      int3
   c:   cc                      int3
   d:   cc                      int3
   e:   cc                      int3
   f:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  14:   c6 07 00                movb   $0x0,(%rdi)
  17:   0f 1f 00                nopl   (%rax)
  1a:   f7 c6 00 02 00 00       test   $0x200,%esi
  20:   74 01                   je     0x23
  22:   fb                      sti
  23:   65 ff 0d a5 20 b6 7c    decl   %gs:0x7cb620a5(%rip)        # 0x7cb620cf
  2a:*  c3                      retq            <-- trapping instruction
  2b:   cc                      int3
  2c:   cc                      int3
  2d:   cc                      int3
  2e:   cc                      int3
  2f:   66 66 2e 0f 1f 84 00    data16 nopw %cs:0x0(%rax,%rax,1)
  36:   00 00 00 00
  3a:   66                      data16
  3b:   66                      data16
  3c:   2e                      cs
  3d:   0f                      .byte 0xf
  3e:   1f                      (bad)
  3f:   84                      .byte 0x84

Code starting with the faulting instruction
===========================================
   0:   c3                      retq
   1:   cc                      int3
   2:   cc                      int3
   3:   cc                      int3
   4:   cc                      int3
   5:   66 66 2e 0f 1f 84 00    data16 nopw %cs:0x0(%rax,%rax,1)
   c:   00 00 00 00
  10:   66                      data16
  11:   66                      data16
  12:   2e                      cs
  13:   0f                      .byte 0xf
  14:   1f                      (bad)
  15:   84                      .byte 0x84

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp
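
Reader's note on the first register dump: the following is a minimal Python sketch, not part of the robot's output, that replays the address arithmetic visible in the __sbq_wake_up disassembly using the logged register values. The names entry, field, shadow and next_index are illustrative only. Reading 0xdffffc0000000000 as the x86-64 KASAN shadow offset, and the 64-byte stride as the size of one wait-queue entry, is an interpretation of the listing rather than something the report states.

```python
# Register values copied from the soft-lockup dump for __sbq_wake_up.
RAX = 0x1FFFF11043EBE112
RBX = 0xFFFF88821F5F0880
RCX = 0xFFFF88821F5F0800
RDX = 0xDFFFFC0000000000
RBP = 2
R12 = 0xFFFF88821F5F0890

MASK64 = (1 << 64) - 1  # emulate 64-bit register wrap-around

# movslq %ebp,%rbx ; shl $0x6,%rbx ; add %rcx,%rbx
# -> address of element RBP in an array of 64-byte entries based at RCX
entry = (RCX + (RBP << 6)) & MASK64
assert entry == RBX

# lea 0x10(%rbx),%r12 -> address of the field at offset 0x10 in that entry
field = (entry + 0x10) & MASK64
assert field == R12

# mov %r12,%rax ; shr $0x3,%rax -> one shadow byte per 8 bytes of memory
assert field >> 3 == RAX

# cmpb $0x0,(%rax,%rdx,1) -> byte inspected before the real load
# (assumed here to be a KASAN shadow byte)
shadow = ((field >> 3) + RDX) & MASK64
print(hex(shadow))


def next_index(i: int) -> int:
    """add $0x1,%ebp ; and $0x7,%ebp -> advance the index modulo 8."""
    return (i + 1) & 0x7
```

All of the asserts hold, so the dump is consistent with the CPU examining entry 2 of an 8-entry ring when the watchdog fired. This does not by itself explain why the loop failed to terminate.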