Greetings,

FYI, we noticed WARNING:possible_recursive_locking_detected due to commit (built with gcc-11):

commit: 7a7256d5f512b6c17957df7f59cf5e281b3ddba3 ("shmem: convert shmem_mfill_atomic_pte() to use a folio")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: kernel-selftests
version: kernel-selftests-x86_64-9313ba54-1_20221017
with the following parameters:

	sc_nr_hugepages: 2
	group: vm

test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt

on test machine: 12 threads 1 socket Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 16G memory

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):

[ 86.886825][ T5512] WARNING: possible recursive locking detected
[ 86.886826][ T5512] 6.0.0-rc3-00323-g7a7256d5f512 #1 Tainted: G S
[ 86.886843][ T501]
[ 86.887428][ T5512] --------------------------------------------
[ 86.887429][ T5512] userfaultfd/5512 is trying to acquire lock:
[ 86.887431][ T5512] ffff888436345f98 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault (arch/x86/include/asm/current.h:15 mm/memory.c:5630 mm/memory.c:5623)
[ 86.887457][ T5512]
[ 86.887457][ T5512] but task is already holding lock:
[ 86.887458][ T5512] ffff888436345f98 (&mm->mmap_lock#2){++++}-{3:3}, at: mcopy_atomic (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/linux/mmap_lock.h:35 include/linux/mmap_lock.h:118 mm/userfaultfd.c:543 mm/userfaultfd.c:688)
[ 86.887486][ T5512]
[ 86.887486][ T5512] other info that might help us debug this:
[ 86.887487][ T5512]  Possible unsafe locking scenario:
[ 86.887487][ T5512]
[ 86.887488][ T5512]        CPU0
[ 86.887488][ T5512]        ----
[ 86.887489][ T5512]   lock(&mm->mmap_lock#2);
[ 86.896241][ T5512]   lock(&mm->mmap_lock#2);
[ 86.896691][ T5512]
[ 86.896691][ T5512]  *** DEADLOCK ***
[ 86.896691][ T5512]
[ 86.897494][ T5512]  May be due to missing lock nesting notation
[ 86.897494][ T5512]
[ 86.898311][ T5512] 1 lock held by userfaultfd/5512:
[ 86.898815][ T5512]  #0: ffff888436345f98 (&mm->mmap_lock#2){++++}-{3:3}, at: mcopy_atomic (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/linux/mmap_lock.h:35 include/linux/mmap_lock.h:118 mm/userfaultfd.c:543 mm/userfaultfd.c:688)
[ 86.899759][ T5512]
[ 86.899759][ T5512] stack backtrace:
[ 86.900343][ T5512] CPU: 5 PID: 5512 Comm: userfaultfd Tainted: G S 6.0.0-rc3-00323-g7a7256d5f512 #1
[ 86.901389][ T5512] Hardware name: Dell Inc. Vostro 3670/0HVPDY, BIOS 1.5.11 12/24/2018
[ 86.902193][ T5512] Call Trace:
[ 86.902523][ T5512]
[ 86.902815][ T5512] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
[ 86.903270][ T5512] validate_chain.cold (kernel/locking/lockdep.c:2988 kernel/locking/lockdep.c:3031 kernel/locking/lockdep.c:3816)
[ 86.903777][ T5512] ? check_prev_add (kernel/locking/lockdep.c:3785)
[ 86.904276][ T5512] ? check_prev_add (kernel/locking/lockdep.c:3785)
[ 86.904775][ T5512] ? pte_alloc_one (include/linux/mm.h:2336 include/linux/mm.h:2363 include/asm-generic/pgalloc.h:66 arch/x86/mm/pgtable.c:33)
[ 86.905243][ T5512] ? __alloc_pages_slowpath+0x1a80/0x1a80
[ 86.905910][ T5512] ? __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856)
[ 86.906406][ T5512] __lock_acquire (kernel/locking/lockdep.c:5053)
[ 86.906879][ T5512] lock_acquire (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5668 kernel/locking/lockdep.c:5631)
[ 86.907330][ T5512] ? __might_fault (arch/x86/include/asm/current.h:15 mm/memory.c:5630 mm/memory.c:5623)
[ 86.907797][ T5512] ? rcu_read_unlock (include/linux/rcupdate.h:735 (discriminator 5))
[ 86.908276][ T5512] ? lock_is_held_type (kernel/locking/lockdep.c:5407 kernel/locking/lockdep.c:5709)
[ 86.908777][ T5512] __might_fault (mm/memory.c:5630 mm/memory.c:5623)
[ 86.909231][ T5512] ? __might_fault (arch/x86/include/asm/current.h:15 mm/memory.c:5630 mm/memory.c:5623)
[ 86.909695][ T5512] _copy_from_user (arch/x86/include/asm/preempt.h:27 lib/usercopy.c:14)
[ 86.910165][ T5512] shmem_mfill_atomic_pte (mm/shmem.c:2422)
[ 86.910705][ T5512] mcopy_atomic (mm/userfaultfd.c:503 mm/userfaultfd.c:637 mm/userfaultfd.c:688)
[ 86.911158][ T5512] ? mcopy_atomic_pte (mm/userfaultfd.c:687)
[ 86.911657][ T5512] ? lock_is_held_type (kernel/locking/lockdep.c:5407 kernel/locking/lockdep.c:5709)
[ 86.912160][ T5512] ? __might_fault (mm/memory.c:5630 mm/memory.c:5623)
[ 86.912625][ T5512] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688)
[ 86.913086][ T5512] userfaultfd_copy (fs/userfaultfd.c:1739)
[ 86.913604][ T5512] ? __wake_userfault (fs/userfaultfd.c:1704)
[ 86.914088][ T5512] ? kernel_read (fs/read_write.c:451)
[ 86.914581][ T5512] ? vfs_fileattr_set (fs/ioctl.c:774)
[ 86.915086][ T5512] ? __fget_files (include/linux/rcupdate.h:285 include/linux/rcupdate.h:739 fs/file.c:914)
[ 86.915606][ T5512] ? __lock_release (kernel/locking/lockdep.c:5342)
[ 86.916106][ T5512] userfaultfd_ioctl (fs/userfaultfd.c:2023)
[ 86.916608][ T5512] ? lock_is_held_type (kernel/locking/lockdep.c:5407 kernel/locking/lockdep.c:5709)
[ 86.917121][ T5512] ? userfaultfd_continue (fs/userfaultfd.c:1990)
[ 86.917669][ T5512] ? __fget_files (include/linux/rcupdate.h:285 include/linux/rcupdate.h:739 fs/file.c:914)
[ 86.918146][ T5512] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688)
[ 86.918615][ T5512] ? __fget_files (fs/file.c:917)
[ 86.919097][ T5512] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856)
[ 86.919585][ T5512] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[ 86.920038][ T5512] ? do_syscall_64 (arch/x86/entry/common.c:87)
[ 86.920508][ T5512] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4526)
[ 86.921174][ T5512] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[ 86.921771][ T5512] RIP: 0033:0x7fe912f00e9b
[ 86.922223][ T5512] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1b 48 8b 44 24 18 64 48 2b 04 25 28 00
All code
========
   0:   00 48 89                add    %cl,-0x77(%rax)
   3:   44 24 18                rex.R and $0x18,%al
   6:   31 c0                   xor    %eax,%eax
   8:   48 8d 44 24 60          lea    0x60(%rsp),%rax
   d:   c7 04 24 10 00 00 00    movl   $0x10,(%rsp)
  14:   48 89 44 24 08          mov    %rax,0x8(%rsp)
  19:   48 8d 44 24 20          lea    0x20(%rsp),%rax
  1e:   48 89 44 24 10          mov    %rax,0x10(%rsp)
  23:   b8 10 00 00 00          mov    $0x10,%eax
  28:   0f 05                   syscall
  2a:*  41 89 c0                mov    %eax,%r8d        <-- trapping instruction
  2d:   3d 00 f0 ff ff          cmp    $0xfffff000,%eax
  32:   77 1b                   ja     0x4f
  34:   48 8b 44 24 18          mov    0x18(%rsp),%rax
  39:   64                      fs
  3a:   48                      rex.W
  3b:   2b                      .byte 0x2b
  3c:   04 25                   add    $0x25,%al
  3e:   28 00                   sub    %al,(%rax)

Code starting with the faulting instruction
===========================================
   0:   41 89 c0                mov    %eax,%r8d
   3:   3d 00 f0 ff ff          cmp    $0xfffff000,%eax
   8:   77 1b                   ja     0x25
   a:   48 8b 44 24 18          mov    0x18(%rsp),%rax
   f:   64                      fs
  10:   48                      rex.W
  11:   2b                      .byte 0x2b
  12:   04 25                   add    $0x25,%al
  14:   28 00                   sub    %al,(%rax)
[ 86.924185][ T5512] RSP: 002b:00007fe90cbffc80 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 86.925036][ T5512] RAX: ffffffffffffffda RBX: 00007fe90cc00640 RCX: 00007fe912f00e9b
[ 86.925842][ T5512] RDX: 00007fe90cbffcf0 RSI: 00000000c028aa03 RDI: 0000000000000006
[ 86.926650][ T5512] RBP: 00007fe90cbffd30 R08: 0000000000000000 R09: 00007ffdc1acdd8f
[ 86.927457][ T5512] R10: 00007fe912e075d8 R11: 0000000000000246 R12: 00007fe90cc00640
[ 86.928260][ T5512] R13: 0000000000000000 R14: 00007fe912e88580 R15: 0000000000000000
[ 86.929070][ T5512]

If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot
| Link: https://lore.kernel.org/r/202210211215.9dc6efb5-yujie.liu@intel.com

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp dirs to run from a clean state.

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp