Greetings,

FYI, we noticed the following commit (built with gcc-11):

commit: bec0ae12106e0cf12dd4e0e21eb0754b99be0ba2 ("[PATCH v4 09/11] mm: memcontrol: use obj_cgroup APIs to charge the LRU pages")
url: https://github.com/intel-lab-lkp/linux/commits/Muchun-Song/Use-obj_cgroup-APIs-to-charge-the-LRU-pages/20220524-143056
patch link: https://lore.kernel.org/linux-mm/20220524060551.80037-10-songmuchun@bytedance.com

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


If you fix the issue, kindly add the following tag
Reported-by: kernel test robot


[ 41.024908][ T135] WARNING: possible recursive locking detected
[ 41.025923][ T135] 5.18.0-00009-gbec0ae12106e #1 Not tainted
[ 41.026805][ T135] --------------------------------------------
[ 41.027780][ T135] kworker/1:2/135 is trying to acquire lock:
[ 41.028743][ T135] ffff88815b545068 (&lruvec->lru_lock){....}-{2:2}, at: lruvec_reparent_lock (include/linux/nodemask.h:271 mm/memcontrol.c:376)
[ 41.030324][ T135]
[ 41.030324][ T135] but task is already holding lock:
[ 41.031629][ T135] ffff8881a1c43068 (&lruvec->lru_lock){....}-{2:2}, at: lruvec_reparent_lock (mm/memcontrol.c:378)
[ 41.033231][ T135]
[ 41.033231][ T135] other info that might help us debug this:
[ 41.034551][ T135]  Possible unsafe locking scenario:
[ 41.034551][ T135]
[ 41.035818][ T135]        CPU0
[ 41.036409][ T135]        ----
[ 41.037045][ T135]   lock(&lruvec->lru_lock);
[ 41.037866][ T135]   lock(&lruvec->lru_lock);
[ 41.039123][ T135]
[ 41.039123][ T135]  *** DEADLOCK ***
[ 41.039123][ T135]
[ 41.040984][ T135]  May be due to missing lock nesting notation
[ 41.040984][ T135]
[ 41.042567][ T135] 5 locks held by kworker/1:2/135:
[ 41.043472][ T135]  #0: ffff88839d54b538 ((wq_completion)cgroup_destroy){+.+.}-{0:0}, at: process_one_work (arch/x86/include/asm/atomic64_64.h:34 include/linux/atomic/atomic-long.h:41 include/linux/atomic/atomic-instrumented.h:1280 kernel/workqueue.c:636 kernel/workqueue.c:663 kernel/workqueue.c:2260)
[ 41.045556][ T135]  #1: ffffc90000e9fdb8 ((work_completion)(&css->destroy_work)){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:2264)
[ 41.047649][ T135]  #2: ffffffffa46931c8 (cgroup_mutex){+.+.}-{3:3}, at: css_killed_work_fn (kernel/cgroup/cgroup.c:5271 kernel/cgroup/cgroup.c:5554)
[ 41.049171][ T135]  #3: ffffffffa47fe2d8 (objcg_lock){....}-{2:2}, at: mem_cgroup_css_offline (mm/memcontrol.c:453 mm/memcontrol.c:463 mm/memcontrol.c:5382)
[ 41.050617][ T135]  #4: ffff8881a1c43068 (&lruvec->lru_lock){....}-{2:2}, at: lruvec_reparent_lock (mm/memcontrol.c:378)
[ 41.052031][ T135]
[ 41.052031][ T135] stack backtrace:
[ 41.052926][ T135] CPU: 1 PID: 135 Comm: kworker/1:2 Not tainted 5.18.0-00009-gbec0ae12106e #1
[ 41.054190][ T135] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[ 41.055742][ T135] Workqueue: cgroup_destroy css_killed_work_fn
[ 41.056645][ T135] Call Trace:
[ 41.057138][ T135]
[ 41.057628][ T135] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
[ 41.058392][ T135] validate_chain.cold (kernel/locking/lockdep.c:2958 kernel/locking/lockdep.c:3001 kernel/locking/lockdep.c:3790)
[ 41.059117][ T135] ? check_prev_add (kernel/locking/lockdep.c:3759)
[ 41.059888][ T135] __lock_acquire (kernel/locking/lockdep.c:5029)
[ 41.060579][ T135] lock_acquire (kernel/locking/lockdep.c:436 kernel/locking/lockdep.c:5643 kernel/locking/lockdep.c:5606)
[ 41.061280][ T135] ? lruvec_reparent_lock (include/linux/nodemask.h:271 mm/memcontrol.c:376)
[ 41.062081][ T135] ? rcu_read_unlock (include/linux/rcupdate.h:723 (discriminator 5))
[ 41.062915][ T135] ? lock_acquire (kernel/locking/lockdep.c:436 kernel/locking/lockdep.c:5643 kernel/locking/lockdep.c:5606)
[ 41.063653][ T135] ? mem_cgroup_css_offline (mm/memcontrol.c:453 mm/memcontrol.c:463 mm/memcontrol.c:5382)
[ 41.064504][ T135] ? do_raw_spin_lock (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 kernel/locking/spinlock_debug.c:115)
[ 41.065190][ T135] ? rwlock_bug+0xc0/0xc0
[ 41.065923][ T135] _raw_spin_lock (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 41.066676][ T135] ? lruvec_reparent_lock (include/linux/nodemask.h:271 mm/memcontrol.c:376)
[ 41.067455][ T135] lruvec_reparent_lock (include/linux/nodemask.h:271 mm/memcontrol.c:376)
[ 41.068227][ T135] mem_cgroup_css_offline (mm/memcontrol.c:453 mm/memcontrol.c:463 mm/memcontrol.c:5382)
[ 41.069103][ T135] ? lock_is_held_type (kernel/locking/lockdep.c:5382 kernel/locking/lockdep.c:5684)
[ 41.069858][ T135] css_killed_work_fn (kernel/cgroup/cgroup.c:5279 kernel/cgroup/cgroup.c:5554)
[ 41.070637][ T135] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2294)
[ 41.071459][ T135] ? rcu_read_unlock (include/linux/rcupdate.h:723 (discriminator 5))
[ 41.072308][ T135] ? pwq_dec_nr_in_flight (kernel/workqueue.c:2184)
[ 41.073231][ T135] ? rwlock_bug+0xc0/0xc0
[ 41.073922][ T135] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2437)
[ 41.074572][ T135] ? __kthread_parkme (arch/x86/include/asm/bitops.h:207 (discriminator 4) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 4) kernel/kthread.c:270 (discriminator 4))
[ 41.075220][ T135] ? schedule (arch/x86/include/asm/bitops.h:207 (discriminator 1) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 1) include/linux/thread_info.h:118 (discriminator 1) include/linux/sched.h:2154 (discriminator 1) kernel/sched/core.c:6462 (discriminator 1))
[ 41.075942][ T135] ? process_one_work (kernel/workqueue.c:2379)
[ 41.076755][ T135] ? process_one_work (kernel/workqueue.c:2379)
[ 41.077600][ T135] kthread (kernel/kthread.c:376)
[ 41.078174][ T135] ? kthread_complete_and_exit (kernel/kthread.c:331)
[ 41.078951][ T135] ret_from_fork (arch/x86/entry/entry_64.S:304)
[ 41.079668][ T135]
[ OK ] Started Load Kernel Modules.
[ OK ] Mounted RPC Pipe File System.
[ OK ] Started Remount Root and Kernel File Systems.
[ OK ] Mounted Kernel Debug File System.
[ OK ] Mounted Huge Pages File System.
         Starting Load/Save Random Seed...
         Starting Create System Users...
         Starting Apply Kernel Variables...
         Mounting Kernel Configuration File System...
[ OK ] Started Load/Save Random Seed.
[ OK ] Started Create System Users.
[ OK ] Started Apply Kernel Variables.
[ OK ] Mounted Kernel Configuration File System.
         Starting Create Static Device Nodes in /dev...
[ OK ] Started Create Static Device Nodes in /dev.
[ OK ] Reached target Local File Systems (Pre).
[ OK ] Reached target Local File Systems.
         Starting Preprocess NFS configuration...
         Starting udev Kernel Device Manager...
[ OK ] Started Journal Service.
[ OK ] Started Preprocess NFS configuration.
[ OK ] Reached target NFS client services.
         Starting Flush Journal to Persistent Storage...
[ OK ] Started udev Kernel Device Manager.
[ OK ] Started Flush Journal to Persistent Storage.
         Starting Create Volatile Files and Directories...
[ OK ] Started Create Volatile Files and Directories.
         Starting Network Time Synchronization...
         Starting RPC bind portmap service...
         Starting Update UTMP about System Boot/Shutdown...
[ OK ] Started RPC bind portmap service.
[ OK ] Reached target RPC Port Mapper.
[ OK ] Reached target Remote File Systems (Pre).
[ OK ] Reached target Remote File Systems.
[ OK ] Started Update UTMP about System Boot/Shutdown.
[ OK ] Started Network Time Synchronization.
[ OK ] Reached target System Time Synchronized.


To reproduce:

        # build kernel
        cd linux
        cp config-5.18.0-00009-gbec0ae12106e .config
        make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
        make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
        cd <mod-install-dir>
        find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp dirs to run from a clean state.



-- 
0-DAY CI Kernel Test Service
https://01.org/lkp