Greetings!

Booting a clang-16 built v6.6-rc7 kernel on my ThinkPad T60 crashes the machine. I first reported the issue at https://github.com/ClangBuiltLinux/linux/issues/1959 but got the hint there to report it upstream as the issue may not be entirely clang related.

My T60 crashes at boot with:

[...]
BUG: kernel NULL pointer dereference, address: 00000007
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
*pdpt = 0000000002398001 *pde = 0000000000000000
Oops: 0000 [#1] SMP PTI
CPU: 1 PID: 1 Comm: systemd Not tainted 6.7.0-rc1-P3 #1
Hardware name: LENOVO 2007F2G/2007F2G, BIOS 79ETE7WW (2.27 ) 03/21/2011
EIP: obj_cgroup_charge_pages+0xc/0xa8
Code: 75 ee eb cf 31 db 4b eb a0 e8 34 fe ff ff 89 c3 eb 93 8b 43 04 f0 83 00 01 eb b0 90 90 90 55 89 e5 53 57 56 83 ec 08 8b 7d 08 <8b> 71 08 f6 46 2c 01 75 38 8b 46 08 a8 03 74 2e 8b 46 0c 89 45 ec
EAX: 00000001 EBX: 00000000 ECX: ffffffff EDX: 00400cc0
ESI: ffffffff EDI: 00000001 EBP: c1155ce8 ESP: c1155cd4
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210286
CR0: 80050033 CR2: 00000007 CR3: 0204e000 CR4: 000006f0
Call Trace:
 ? show_regs+0x4e/0x5c
 ? __die_body+0x11/0x4c
 ? __die+0x21/0x30
 ? page_fault_oops+0x20f/0x238
 ? mt_find+0x94/0x15c
 ? kernelmode_fixup_or_oops+0x92/0xa8
 ? __bad_area_nosemaphore+0x40/0x168
 ? bad_area_nosemaphore+0xd/0x14
 ? exc_page_fault+0x277/0x32c
 ? doublefault_shim+0x100/0x100
 ? handle_exception+0x101/0x101
 ? add_swap_count_continuation+0x1af/0x204
 ? doublefault_shim+0x100/0x100
 ? obj_cgroup_charge_pages+0xc/0xa8
 ? doublefault_shim+0x100/0x100
 ? obj_cgroup_charge_pages+0xc/0xa8
 obj_cgroup_charge+0x8d/0xcc
 pcpu_alloc+0x107/0x5c0
 ? cgroup_apply_control_enable+0xb1/0x250
 __alloc_percpu_gfp+0x10/0x18
 mem_cgroup_css_alloc+0xea/0x498
 cgroup_apply_control_enable+0xb1/0x250
 ? css_populate_dir+0xb5/0xd0
 cgroup_mkdir+0x1a2/0x2f4
 ? css_task_iter_end+0xbc/0xbc
 kernfs_iop_mkdir+0x52/0x68
 ? kernfs_iop_lookup+0xc0/0xc0
 vfs_mkdir+0x149/0x198
 do_mkdirat+0x72/0xb4
 __ia32_sys_mkdir+0x23/0x2c
 __do_fast_syscall_32+0x86/0xb0
 ? kmem_cache_free+0x2c3/0x2f0
 ? putname+0x3c/0x48
 ? putname+0x3c/0x48
 ? putname+0x3c/0x48
 ? syscall_exit_to_user_mode+0x1d/0x90
 ? __do_fast_syscall_32+0x92/0xb0
 ? syscall_exit_to_user_mode+0x1d/0x90
 ? __do_fast_syscall_32+0x92/0xb0
 ? __ia32_sys_clock_gettime+0x86/0xa0
 ? syscall_exit_to_user_mode+0x1d/0x90
 ? __do_fast_syscall_32+0x92/0xb0
 ? irqentry_exit_to_user_mode+0xa/0x1c
 ? irqentry_exit+0x12/0x2c
 ? exc_page_fault+0x112/0x32c
 do_fast_syscall_32+0x29/0x54
 do_SYSENTER_32+0x12/0x18
 entry_SYSENTER_32+0x98/0xf1
EIP: 0xb7fc8539
Code: 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 0f 1f 00 58 b8 77 00 00 00 cd 80 90 0f 1f
EAX: ffffffda EBX: 00a89d50 ECX: 000001ed EDX: b79f9e4c
ESI: b7ab3614 EDI: 00ad7dc0 EBP: bfea7578 ESP: bfea7508
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00200292
 ? asm_exc_nmi+0xb0/0x10d
Modules linked in: dmi_sysfs
CR2: 0000000000000007
---[ end trace 0000000000000000 ]---
EIP: obj_cgroup_charge_pages+0xc/0xa8
Code: 75 ee eb cf 31 db 4b eb a0 e8 34 fe ff ff 89 c3 eb 93 8b 43 04 f0 83 00 01 eb b0 90 90 90 55 89 e5 53 57 56 83 ec 08 8b 7d 08 <8b> 71 08 f6 46 2c 01 75 38 8b 46 08 a8 03 74 2e 8b 46 0c 89 45 ec
EAX: 00000001 EBX: 00000000 ECX: ffffffff EDX: 00400cc0
ESI: ffffffff EDI: 00000001 EBP: c1155ce8 ESP: c1155cd4
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210286
CR0: 80050033 CR2: 00000007 CR3: 0204e000 CR4: 000006f0
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
Kernel Offset: disabled
Rebooting in 40 seconds..
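
The fault address in CR2 can be cross-checked against the register dump. The sketch below is only an illustration: it assumes the bytes marked <8b> 71 08 in the Code: line are a 32-bit "mov r32, r/m32" load with an 8-bit displacement, and the helper name decode_mov_load is made up for this example. It shows that the load is taken at offset 8 from ECX, which held ffffffff, so the address wraps around to 00000007. This says nothing about why ECX held that value.

```python
# Illustration only: decode the faulting bytes "<8b> 71 08" from the Code:
# line and recompute the effective address from the register dump above.
# decode_mov_load is a hypothetical helper, not part of any kernel tooling.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_load(code, regs):
    """Decode opcode 8b /r (mov r32, r/m32) in its [reg + disp8] form only."""
    opcode, modrm, disp8 = code
    assert opcode == 0x8B, "only mov r32, r/m32 is handled here"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 0b01 and rm != 0b100, "only [reg + disp8] without SIB"
    disp = disp8 - 0x100 if disp8 >= 0x80 else disp8  # sign-extend disp8
    addr = (regs[REG32[rm]] + disp) & 0xFFFFFFFF      # 32-bit wraparound
    return REG32[reg], REG32[rm], disp, addr


# Register values copied from the oops.
regs = {
    "eax": 0x00000001, "ebx": 0x00000000, "ecx": 0xFFFFFFFF, "edx": 0x00400CC0,
    "esi": 0xFFFFFFFF, "edi": 0x00000001, "ebp": 0xC1155CE8, "esp": 0xC1155CD4,
}

dst, base, disp, addr = decode_mov_load([0x8B, 0x71, 0x08], regs)
print(f"mov {disp:#x}(%{base}),%{dst} -> fault address {addr:#010x}")
```

The computed address matches the CR2 value reported in the oops.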

I bisected the crash to this commit:

 # git bisect bad
e86828e5446d95676835679837d995dec188d2be is the first bad commit
commit e86828e5446d95676835679837d995dec188d2be
Author: Roman Gushchin
Date:   Thu Oct 19 15:53:44 2023 -0700

    mm: kmem: scoped objcg protection

    Switch to a scope-based protection of the objcg pointer on slab/kmem
    allocation paths. Instead of using the get_() semantics in the
    pre-allocation hook and put the reference afterwards, let's rely on the
    fact that objcg is pinned by the scope.

    It's possible because:
    1) if the objcg is received from the current task struct, the task is
       keeping a reference to the objcg.
    2) if the objcg is received from an active memcg (remote charging),
       the memcg is pinned by the scope and has a reference to the
       corresponding objcg.

    Link: https://lkml.kernel.org/r/20231019225346.1822282-5-roman.gushchin@linux.dev
    Signed-off-by: Roman Gushchin (Cruise)
    Tested-by: Naresh Kamboju
    Acked-by: Shakeel Butt
    Reviewed-by: Vlastimil Babka
    Cc: David Rientjes
    Cc: Dennis Zhou
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Muchun Song
    Signed-off-by: Andrew Morton

 include/linux/memcontrol.h |  9 +++++++++
 include/linux/sched/mm.h   |  4 ++++
 mm/memcontrol.c            | 47 ++++++++++++++++++++++++++++++++++++++++++++--
 mm/slab.h                  | 15 ++++++++-------

Reverting the patch series with "git diff 7d0715d0d6b28a831b6fdfefb29c5a7a4929fa49^..e56808fef8f71a192b2740c0b6ea8be7ab865d54 | git apply -3 -R" on top of v6.7-rc1 fixes the crash and the ThinkPad T60 successfully boots up.

When building the kernel with gcc-13 the issue does not show up either.

Kernel .config and dmesg attached.

Regards,
Erhard