Greetings,

FYI, we noticed WARNING:suspicious_RCU_usage due to commit (built with gcc-11):

commit: 8b7e3b7ca3897ebc4cb7b23c65a4618d64056e3b ("[PATCH RFC 05/10] mm/hugetlb: Make walk_hugetlb_range() RCU-safe")
url: https://github.com/intel-lab-lkp/linux/commits/Peter-Xu/mm-hugetlb-Make-huge_pte_offset-thread-safe-for-pmd-unshare/20221031-053221
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/lkml/20221030212929.335473-6-peterx@redhat.com
patch subject: [PATCH RFC 05/10] mm/hugetlb: Make walk_hugetlb_range() RCU-safe

in testcase: kernel-selftests
version: kernel-selftests-x86_64-9313ba54-1_20221017
with the following parameters:

	sc_nr_hugepages: 2
	group: vm

test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt

on test machine: 12 threads 1 socket Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 16G memory

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):

If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot
| Link: https://lore.kernel.org/oe-lkp/202211061521.28931f7-oliver.sang@intel.com

kern :warn : [ 181.942648] WARNING: suspicious RCU usage
kern :warn : [ 181.943175] 6.1.0-rc1-00309-g8b7e3b7ca389 #1 Tainted: G S
kern :warn : [ 181.943972] -----------------------------
kern :warn : [ 181.944526] include/linux/rcupdate.h:364 Illegal context switch in RCU read-side critical section!
kern :warn : [ 181.945559] other info that might help us debug this:
kern :warn : [ 181.946625] rcu_scheduler_active = 2, debug_locks = 1
kern :warn : [ 181.947473] 2 locks held by hmm-tests/9934:
kern :warn : [ 181.948016] #0: ffff8884325b2d18 (&mm->mmap_lock#2){++++}-{3:3}, at: dmirror_fault (test_hmm.c:?) test_hmm
kern :warn : [ 181.949129] #1: ffffffff858a7860 (rcu_read_lock){....}-{1:2}, at: walk_hugetlb_range (pagewalk.c:?)
kern :warn : [ 181.950161] stack backtrace:
kern :warn : [ 181.950780] CPU: 9 PID: 9934 Comm: hmm-tests Tainted: G S 6.1.0-rc1-00309-g8b7e3b7ca389 #1
kern :warn : [ 181.951863] Hardware name: Dell Inc. Vostro 3670/0HVPDY, BIOS 1.5.11 12/24/2018
kern :warn : [ 181.952709] Call Trace:
kern :warn : [ 181.953070]
kern :warn : [ 181.953403]
kern :warn : [ 181.953890] dump_stack_lvl (??:?)
kern :warn : [ 181.954403] __might_resched (??:?)
kern :warn : [ 181.954886] __mutex_lock (mutex.c:?)
kern :warn : [ 181.955405] ? validate_chain (lockdep.c:?)
kern :warn : [ 181.955926] ? hugetlb_fault (??:?)
kern :warn : [ 181.956450] ? mark_lock+0xca/0xac0
kern :warn : [ 181.957039] ? mutex_lock_io_nested (mutex.c:?)
kern :warn : [ 181.957580] ? check_prev_add (lockdep.c:?)
kern :warn : [ 181.958177] ? hugetlb_vm_op_pagesize (hugetlb.c:?)
kern :warn : [ 181.958690] ? hugetlb_fault (??:?)
kern :warn : [ 181.959199] hugetlb_fault (??:?)
kern :warn : [ 181.959709] ? find_held_lock (lockdep.c:?)
kern :warn : [ 181.960255] ? hugetlb_no_page (??:?)
kern :warn : [ 181.960772] ? __lock_release (lockdep.c:?)
kern :warn : [ 181.961292] ? lock_downgrade (lockdep.c:?)
kern :warn : [ 181.961830] ? lock_is_held_type (??:?)
kern :warn : [ 181.962363] ? handle_mm_fault (??:?)
kern :warn : [ 181.962870] handle_mm_fault (??:?)
kern :warn : [ 181.963501] ? hmm_vma_walk_hugetlb_entry (hmm.c:?)
kern :warn : [ 181.964096] hmm_vma_fault (hmm.c:?)
kern :warn : [ 181.964639] walk_hugetlb_range (pagewalk.c:?)
kern :warn : [ 181.965160] __walk_page_range (pagewalk.c:?)
kern :warn : [ 181.965670] walk_page_range (??:?)
kern :warn : [ 181.966213] ? __walk_page_range (??:?)
kern :warn : [ 181.966718] ? rcu_read_unlock (main.c:?)
kern :warn : [ 181.967259] ? lock_is_held_type (??:?)
kern :warn : [ 181.967855] ? mmu_interval_read_begin (??:?)
kern :warn : [ 181.968400] ? lock_is_held_type (??:?)
kern :warn : [ 181.968911] hmm_range_fault (??:?)
kern :warn : [ 181.969383] ? down_read (??:?)
kern :warn : [ 181.969891] ? hmm_vma_fault (??:?)
kern :warn : [ 181.970416] ? __lock_release (lockdep.c:?)
kern :warn : [ 181.971012] dmirror_fault (test_hmm.c:?) test_hmm
kern :warn : [ 181.971847] ? dmirror_migrate_to_system+0x590/0x590 test_hmm
kern :warn : [ 181.972355] ? find_held_lock (lockdep.c:?)
kern :warn : [ 181.973069] ? dmirror_write+0x202/0x310 test_hmm
kern :warn : [ 181.973586] ? __lock_release (lockdep.c:?)
kern :warn : [ 181.974107] ? lock_downgrade (lockdep.c:?)
kern :warn : [ 181.974641] ? lock_is_held_type (??:?)
kern :warn : [ 181.975355] ? dmirror_write+0x202/0x310 test_hmm
kern :warn : [ 181.975845] ? lock_release (??:?)
kern :warn : [ 181.976444] ? __mutex_unlock_slowpath (mutex.c:?)
kern :warn : [ 181.977008] ? bit_wait_io_timeout (mutex.c:?)
kern :warn : [ 181.977547] ? lock_is_held_type (??:?)
kern :warn : [ 181.978185] ? dmirror_do_write (test_hmm.c:?) test_hmm
kern :warn : [ 181.978881] dmirror_write+0x1bf/0x310 test_hmm
kern :warn : [ 181.979484] ? dmirror_fault (test_hmm.c:?) test_hmm
kern :warn : [ 181.980021] ? lock_is_held_type (??:?)
kern :warn : [ 181.980523] ? __might_fault (??:?)
kern :warn : [ 181.981019] ? lock_release (??:?)
kern :warn : [ 181.981732] dmirror_fops_unlocked_ioctl (test_hmm.c:?) test_hmm
kern :warn : [ 181.982485] ? dmirror_exclusive+0x780/0x780 test_hmm
kern :warn : [ 181.983042] ? do_user_addr_fault (fault.c:?)
kern :warn : [ 181.983562] ? __lock_release (lockdep.c:?)
kern :warn : [ 181.984074] __x64_sys_ioctl (??:?)
kern :warn : [ 181.984545] do_syscall_64 (??:?)
kern :warn : [ 181.985103] ? do_user_addr_fault (fault.c:?)
kern :warn : [ 181.985654] ? do_user_addr_fault (fault.c:?)
kern :warn : [ 181.986256] ? irqentry_exit_to_user_mode (??:?)
kern :warn : [ 181.986945] ? lockdep_hardirqs_on_prepare (lockdep.c:?)
kern :warn : [ 181.987569] entry_SYSCALL_64_after_hwframe (??:?)
kern :warn : [ 181.988047] RIP: 0033:0x7fac2f598e9b
kern :warn : [ 181.988047] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1b 48 8b 44 24 18 64 48 2b 04 25 28 00

All code
========
   0:	00 48 89             	add    %cl,-0x77(%rax)
   3:	44 24 18             	rex.R and $0x18,%al
   6:	31 c0                	xor    %eax,%eax
   8:	48 8d 44 24 60       	lea    0x60(%rsp),%rax
   d:	c7 04 24 10 00 00 00 	movl   $0x10,(%rsp)
  14:	48 89 44 24 08       	mov    %rax,0x8(%rsp)
  19:	48 8d 44 24 20       	lea    0x20(%rsp),%rax
  1e:	48 89 44 24 10       	mov    %rax,0x10(%rsp)
  23:	b8 10 00 00 00       	mov    $0x10,%eax
  28:	0f 05                	syscall
  2a:*	41 89 c0             	mov    %eax,%r8d		<-- trapping instruction
  2d:	3d 00 f0 ff ff       	cmp    $0xfffff000,%eax
  32:	77 1b                	ja     0x4f
  34:	48 8b 44 24 18       	mov    0x18(%rsp),%rax
  39:	64                   	fs
  3a:	48                   	rex.W
  3b:	2b                   	.byte 0x2b
  3c:	04 25                	add    $0x25,%al
  3e:	28 00                	sub    %al,(%rax)

Code starting with the faulting instruction
===========================================
   0:	41 89 c0             	mov    %eax,%r8d
   3:	3d 00 f0 ff ff       	cmp    $0xfffff000,%eax
   8:	77 1b                	ja     0x25
   a:	48 8b 44 24 18       	mov    0x18(%rsp),%rax
   f:	64                   	fs
  10:	48                   	rex.W
  11:	2b                   	.byte 0x2b
  12:	04 25                	add    $0x25,%al
  14:	28 00                	sub    %al,(%rax)

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.

--
0-DAY CI Kernel Test Service
https://01.org/lkp