On 3 Oct 2025, at 12:53, Qi Zheng wrote:

> From: Qi Zheng
>
> Changes in v4:
> - add split_queue_lock*() and let folio_split_queue_lock*() use them.
>   (I have kept everyone's Acked-bys and Reviewed-bys. If you need to discard
>   them, please let me know.)
> - let deferred_split_scan() use split_queue_lock_irqsave(), which will fix
>   the race problem in [PATCH v3 4/4].
>   (Muchun Song)
> - collect Reviewed-bys
> - rebase onto next-20251002
>
> Changes in v3:
> - use css_is_dying() in folio_split_queue_lock*() to check if the memcg is
>   dying
>   (David Hildenbrand, Shakeel Butt and Zi Yan)
> - modify the commit message in [PATCH v2 4/4]
>   (Roman Gushchin)
> - fix the build error in [PATCH v2 4/4]
> - collect Acked-bys and Reviewed-bys
> - rebase onto next-20250926
>
> Changes in v2:
> - fix build errors in [PATCH 2/4] and [PATCH 4/4]
> - some cleanups for [PATCH 3/4] (suggested by David Hildenbrand)
> - collect Acked-bys and Reviewed-bys
> - rebase onto next-20250922
>
> Hi all,
>
> In the future, we will reparent LRU folios during memcg offline to
> eliminate dying memory cgroups, which requires reparenting the THP split
> queue to its parent memcg.
>
> Similar to list_lru, the split queue is relatively independent and does
> not need to be reparented along with objcg and LRU folios (holding the
> objcg lock and lru lock). Therefore, we can apply the same mechanism as
> list_lru and reparent the split queue first when the memcg is offlined.
>
> The first three patches in this series are separated from the series
> "Eliminate Dying Memory Cgroup" [1], mainly to do some cleanup and
> preparatory work.
>
> The last patch reparents the THP split queue to its parent memcg during
> memcg offline.
>
> Comments and suggestions are welcome!
>
> Thanks,
> Qi
>
> [1].
> https://lore.kernel.org/all/20250415024532.26632-1-songmuchun@bytedance.com/
>
> Muchun Song (3):
>   mm: thp: replace folio_memcg() with folio_memcg_charged()
>   mm: thp: introduce folio_split_queue_lock and its variants
>   mm: thp: use folio_batch to handle THP splitting in
>     deferred_split_scan()
>
> Qi Zheng (1):
>   mm: thp: reparent the split queue during memcg offline
>
>  include/linux/huge_mm.h    |   4 +
>  include/linux/memcontrol.h |  10 ++
>  mm/huge_memory.c           | 258 +++++++++++++++++++++++++------------
>  mm/memcontrol.c            |   1 +
>  4 files changed, 192 insertions(+), 81 deletions(-)

Hi Qi,

I got CPU soft lockups when running "echo 3 | sudo tee /proc/sys/vm/drop_caches" with today's mm-new on a freshly booted system. Reverting Patch 3 (and Patch 4) of your patchset resolves the issue. My config file is attached. The relevant kernel parameters are: "cgroup_no_v1=all transparent_hugepage=always thp_shmem=2M:always". The machine is an 8GB 8-core x86_64 VM.

The kernel log:

[ 36.441539] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [tee:810]
[ 36.441549] Modules linked in:
[ 36.441566] CPU: 0 UID: 0 PID: 810 Comm: tee Not tainted 6.17.0-mm-everything-2024-01-29-07-19-no-mglru+ #526 PREEMPT(voluntary)
[ 36.441570] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
[ 36.441574] RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40
[ 36.441592] Code: 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 53 48 89 f3 e8 92 68 fd fe 80 e7 02 74 06 fb 0f 1f 44 00 00 <65> ff 0d d0 5f 7e 01 74 06 5b c3 cc cc cc cc 0f 1f 44 00 00 5b c3
[ 36.441594] RSP: 0018:ffffc900029afb60 EFLAGS: 00000202
[ 36.441598] RAX: 0000000000000001 RBX: 0000000000000286 RCX: ffff888101168670
[ 36.441601] RDX: 0000000000000001 RSI: 0000000000000286 RDI: ffff888101168658
[ 36.441602] RBP: 0000000000000001 R08: ffff88813ba44ec0 R09: 0000000000000000
[ 36.441603] R10: 00000000000001a8 R11: 0000000000000000 R12: ffff8881011685e0
[ 36.441604] R13: 0000000000000000 R14: ffff888101168000 R15: ffffc900029afd60
[ 36.441606] FS: 00007f7fe3655740(0000) GS:ffff8881b7e5d000(0000) knlGS:0000000000000000
[ 36.441607] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 36.441608] CR2: 0000563d4d439bf0 CR3: 000000010873c006 CR4: 0000000000370ef0
[ 36.441614] Call Trace:
[ 36.441616] <TASK>
[ 36.441619] deferred_split_scan+0x1e0/0x480
[ 36.441627] ? _raw_spin_unlock_irqrestore+0xe/0x40
[ 36.441630] ? kvfree_rcu_queue_batch+0x96/0x1c0
[ 36.441634] ? do_raw_spin_unlock+0x46/0xd0
[ 36.441639] ? kfree_rcu_monitor+0x1da/0x2c0
[ 36.441641] ? list_lru_count_one+0x47/0x90
[ 36.441644] do_shrink_slab+0x153/0x360
[ 36.441649] shrink_slab+0xd3/0x390
[ 36.441652] drop_slab+0x7d/0x130
[ 36.441655] drop_caches_sysctl_handler+0x98/0xb0
[ 36.441660] proc_sys_call_handler+0x1c7/0x2c0
[ 36.441664] vfs_write+0x221/0x450
[ 36.441669] ksys_write+0x6c/0xe0
[ 36.441672] do_syscall_64+0x50/0x200
[ 36.441675] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 36.441678] RIP: 0033:0x7f7fe36e7687
[ 36.441685] Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
[ 36.441686] RSP: 002b:00007ffdffcbba10 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 36.441688] RAX: ffffffffffffffda RBX: 00007f7fe3655740 RCX: 00007f7fe36e7687
[ 36.441689] RDX: 0000000000000002 RSI: 00007ffdffcbbbb0 RDI: 0000000000000003
[ 36.441690] RBP: 00007ffdffcbbbb0 R08: 0000000000000000 R09: 0000000000000000
[ 36.441691] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
[ 36.441692] R13: 0000558d40be64c0 R14: 00007f7fe383de80 R15: 0000000000000002
[ 36.441694] </TASK>
[ 64.441531] watchdog: BUG: soft lockup - CPU#0 stuck for 53s! [tee:810]
[ 64.441537] Modules linked in:
[ 64.441545] CPU: 0 UID: 0 PID: 810 Comm: tee Tainted: G L 6.17.0-mm-everything-2024-01-29-07-19-no-mglru+ #526 PREEMPT(voluntary)
[ 64.441548] Tainted: [L]=SOFTLOCKUP
[ 64.441552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
[ 64.441555] RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40
[ 64.441565] Code: 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 53 48 89 f3 e8 92 68 fd fe 80 e7 02 74 06 fb 0f 1f 44 00 00 <65> ff 0d d0 5f 7e 01 74 06 5b c3 cc cc cc cc 0f 1f 44 00 00 5b c3
[ 64.441566] RSP: 0018:ffffc900029afb60 EFLAGS: 00000202
[ 64.441568] RAX: 0000000000000001 RBX: 0000000000000286 RCX: ffff888101168670
[ 64.441570] RDX: 0000000000000001 RSI: 0000000000000286 RDI: ffff888101168658
[ 64.441571] RBP: 0000000000000001 R08: ffff88813ba44ec0 R09: 0000000000000000
[ 64.441572] R10: 00000000000001a8 R11: 0000000000000000 R12: ffff8881011685e0
[ 64.441573] R13: 0000000000000000 R14: ffff888101168000 R15: ffffc900029afd60
[ 64.441574] FS: 00007f7fe3655740(0000) GS:ffff8881b7e5d000(0000) knlGS:0000000000000000
[ 64.441576] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 64.441577] CR2: 0000563d4d439bf0 CR3: 000000010873c006 CR4: 0000000000370ef0
[ 64.441581] Call Trace:
[ 64.441583] <TASK>
[ 64.441591] deferred_split_scan+0x1e0/0x480
[ 64.441598] ? _raw_spin_unlock_irqrestore+0xe/0x40
[ 64.441599] ? kvfree_rcu_queue_batch+0x96/0x1c0
[ 64.441603] ? do_raw_spin_unlock+0x46/0xd0
[ 64.441607] ? kfree_rcu_monitor+0x1da/0x2c0
[ 64.441610] ? list_lru_count_one+0x47/0x90
[ 64.441613] do_shrink_slab+0x153/0x360
[ 64.441618] shrink_slab+0xd3/0x390
[ 64.441621] drop_slab+0x7d/0x130
[ 64.441624] drop_caches_sysctl_handler+0x98/0xb0
[ 64.441629] proc_sys_call_handler+0x1c7/0x2c0
[ 64.441632] vfs_write+0x221/0x450
[ 64.441638] ksys_write+0x6c/0xe0
[ 64.441641] do_syscall_64+0x50/0x200
[ 64.441645] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 64.441648] RIP: 0033:0x7f7fe36e7687
[ 64.441654] Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
[ 64.441656] RSP: 002b:00007ffdffcbba10 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 64.441658] RAX: ffffffffffffffda RBX: 00007f7fe3655740 RCX: 00007f7fe36e7687
[ 64.441659] RDX: 0000000000000002 RSI: 00007ffdffcbbbb0 RDI: 0000000000000003
[ 64.441660] RBP: 00007ffdffcbbbb0 R08: 0000000000000000 R09: 0000000000000000
[ 64.441661] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
[ 64.441662] R13: 0000558d40be64c0 R14: 00007f7fe383de80 R15: 0000000000000002
[ 64.441663] </TASK>

--
Best Regards,
Yan, Zi
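[Editorial note for readers outside the thread: the locking scheme the quoted cover letter describes — once a memcg is dying, its deferred-split queue has been reparented, so a folio must fall back to locking an ancestor's queue — can be sketched in plain userspace C. Everything below is a simplified, hypothetical stand-in: mock_memcg is not the kernel's struct, the dying flag stands in for css_is_dying(), and a pthread mutex stands in for the split-queue spinlock. It models only the shape of folio_split_queue_lock(), not the actual patch.]

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct mock_memcg {
	struct mock_memcg *parent;        /* NULL for the root memcg */
	bool dying;                       /* stands in for css_is_dying() */
	pthread_mutex_t split_queue_lock; /* guards the deferred-split queue */
};

struct mock_folio {
	struct mock_memcg *memcg;         /* stands in for folio_memcg() */
};

/*
 * Lock the split queue the folio currently belongs to.  If the folio's
 * memcg is dying, its queue has been (or is being) reparented, so walk
 * up and lock an ancestor's queue instead.  Re-check after acquiring
 * the lock, since the memcg may start dying while we wait for it.
 */
static struct mock_memcg *folio_split_queue_lock(struct mock_folio *folio)
{
	struct mock_memcg *memcg = folio->memcg;

	for (;;) {
		while (memcg->dying && memcg->parent)
			memcg = memcg->parent;
		pthread_mutex_lock(&memcg->split_queue_lock);
		if (!memcg->dying || !memcg->parent)
			return memcg;     /* caller unlocks this queue */
		/* Raced with offline: the queue moved up; retry on the parent. */
		pthread_mutex_unlock(&memcg->split_queue_lock);
		memcg = memcg->parent;
	}
}

static void split_queue_unlock(struct mock_memcg *memcg)
{
	pthread_mutex_unlock(&memcg->split_queue_lock);
}
```

The re-check after pthread_mutex_lock() mirrors the race the v4 changelog mentions fixing in deferred_split_scan(): the dying test before the lock is only a hint, so the decision has to be revalidated once the lock is held.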