From: Harry Yoo <harry.yoo@oracle.com>
To: kernel test robot <oliver.sang@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	oe-lkp@lists.linux.dev, lkp@intel.com, linux-mm@kvack.org
Subject: Re: [vbabka:slub-percpu-sheaves-v3] [slab]  c19bb08297: BUG:kernel_NULL_pointer_dereference,address
Date: Wed, 2 Apr 2025 23:23:15 +0900
Message-ID: <Z-1IU2OXez-I5H0t@harry>
In-Reply-To: <202503241413.afff5aa1-lkp@intel.com>

On Mon, Mar 24, 2025 at 02:18:53PM +0800, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:
> 
> commit: c19bb0829736a5c7abe1d1b70d013489d720bb54 ("slab: add opt-in caching layer of percpu sheaves")
> https://git.kernel.org/cgit/linux/kernel/git/vbabka/linux.git slub-percpu-sheaves-v3

With HEAD at commit c19bb0829736, no cache has opted in to sheaves yet,
so s->cpu_sheaves is not set up for any cache.
That means slub_cpu_dead() is trying to flush sheaves on caches that
don't have any.

#syz test: https://git.kernel.org/cgit/linux/kernel/git/vbabka/linux.git slub-percpu-sheaves-v3

diff --git a/mm/slub.c b/mm/slub.c
index 2c7b2a85c628..dfd301ce4c76 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3853,7 +3853,8 @@ static int slub_cpu_dead(unsigned int cpu)
 	mutex_lock(&slab_mutex);
 	list_for_each_entry(s, &slab_caches, list) {
 		__flush_cpu_slab(s, cpu);
-		__pcs_flush_all_cpu(s, cpu);
+		if (s->cpu_sheaves)
+			__pcs_flush_all_cpu(s, cpu);
 	}
 	mutex_unlock(&slab_mutex);
 	return 0;
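
For illustration only, here is a minimal user-space sketch of the same
guard. It is not kernel code: the model_* names and the struct layouts
are hypothetical stand-ins, and a plain array indexed by CPU replaces the
real per-CPU allocation. It only shows the shape of the fix, where the
flush helper assumes sheaves exist and the caller skips caches that
never enabled them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU sheaf; not the kernel's layout. */
struct model_sheaf {
	unsigned int size;	/* number of objects currently cached */
};

/* Hypothetical stand-in for struct kmem_cache. */
struct model_cache {
	struct model_sheaf *cpu_sheaves;	/* NULL if sheaves were never enabled */
	struct model_cache *next;		/* simple singly linked cache list */
};

/*
 * Drop everything cached for @cpu and return how many objects were dropped.
 * The caller must have checked that c->cpu_sheaves is non-NULL.
 */
static unsigned int model_flush_cpu(struct model_cache *c, unsigned int cpu)
{
	unsigned int flushed;

	assert(c->cpu_sheaves != NULL);
	flushed = c->cpu_sheaves[cpu].size;
	c->cpu_sheaves[cpu].size = 0;
	return flushed;
}

/* Same shape as the patched loop: skip caches that have no sheaves. */
static unsigned int model_cpu_dead(struct model_cache *head, unsigned int cpu)
{
	unsigned int total = 0;
	struct model_cache *c;

	for (c = head; c != NULL; c = c->next) {
		if (c->cpu_sheaves)
			total += model_flush_cpu(c, cpu);
	}
	return total;
}
```

Without the check in model_cpu_dead(), the first cache with a NULL
cpu_sheaves would trip the assertion, which is the user-space analogue
of the oops reported below.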

> 
> in testcase: rcutorture
> version: 
> with following parameters:
> 
> 	runtime: 300s
> 	test: cpuhotplug
> 	torture_type: srcud
> 
> 
> 
> config: i386-randconfig-005-20250321
> compiler: gcc-12
> test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
> 
> (please refer to attached dmesg/kmsg for entire log/backtrace)
> 
> 
> +---------------------------------------------+------------+------------+
> |                                             | d52c71b1f1 | c19bb08297 |
> +---------------------------------------------+------------+------------+
> | boot_successes                              | 6          | 0          |
> | boot_failures                               | 0          | 6          |
> | BUG:kernel_NULL_pointer_dereference,address | 0          | 6          |
> | Oops                                        | 0          | 6          |
> | EIP:slub_cpu_dead                           | 0          | 6          |
> | Kernel_panic-not_syncing:Fatal_exception    | 0          | 6          |
> +---------------------------------------------+------------+------------+
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202503241413.afff5aa1-lkp@intel.com
> 
> 
> [  100.813833][    T6] BUG: kernel NULL pointer dereference, address: 00000008
> [  100.814405][    T6] #PF: supervisor read access in kernel mode
> [  100.814830][    T6] #PF: error_code(0x0000) - not-present page
> [  100.815260][    T6] *pde = 00000000
> [  100.815526][    T6] Oops: Oops: 0000 [#1] SMP
> [  100.815856][    T6] CPU: 0 UID: 0 PID: 6 Comm: kworker/0:0 Not tainted 6.14.0-rc1-00007-gc19bb0829736 #1
> [  100.816542][    T6] Workqueue: events work_for_cpu_fn
> [ 100.816933][ T6] EIP: slub_cpu_dead (mm/slub.c:2578 mm/slub.c:2625 mm/slub.c:3783) 
> [ 100.817301][ T6] Code: 01 a1 2c 56 a2 43 3d 2c 56 a2 43 74 72 8d 70 b8 8d 76 00 8b 1e 83 7d f0 07 77 7a 8b 45 f0 8b 0c 85 e0 78 5f 43 01 cb 8b 7b 1c <8b> 57 08 85 d2 74 11 8d 4f 0c 89 f0 e8 c0 e3 ff ff c7 47 08 00 00
> All code
> ========
>    0:	01 a1 2c 56 a2 43    	add    %esp,0x43a2562c(%rcx)
>    6:	3d 2c 56 a2 43       	cmp    $0x43a2562c,%eax
>    b:	74 72                	je     0x7f
>    d:	8d 70 b8             	lea    -0x48(%rax),%esi
>   10:	8d 76 00             	lea    0x0(%rsi),%esi
>   13:	8b 1e                	mov    (%rsi),%ebx
>   15:	83 7d f0 07          	cmpl   $0x7,-0x10(%rbp)
>   19:	77 7a                	ja     0x95
>   1b:	8b 45 f0             	mov    -0x10(%rbp),%eax
>   1e:	8b 0c 85 e0 78 5f 43 	mov    0x435f78e0(,%rax,4),%ecx
>   25:	01 cb                	add    %ecx,%ebx
>   27:	8b 7b 1c             	mov    0x1c(%rbx),%edi
>   2a:*	8b 57 08             	mov    0x8(%rdi),%edx		<-- trapping instruction
>   2d:	85 d2                	test   %edx,%edx
>   2f:	74 11                	je     0x42
>   31:	8d 4f 0c             	lea    0xc(%rdi),%ecx
>   34:	89 f0                	mov    %esi,%eax
>   36:	e8 c0 e3 ff ff       	call   0xffffffffffffe3fb
>   3b:	c7                   	.byte 0xc7
>   3c:	47 08 00             	rex.RXB or %r8b,(%r8)
> 	...
> 
> Code starting with the faulting instruction
> ===========================================
>    0:	8b 57 08             	mov    0x8(%rdi),%edx
>    3:	85 d2                	test   %edx,%edx
>    5:	74 11                	je     0x18
>    7:	8d 4f 0c             	lea    0xc(%rdi),%ecx
>    a:	89 f0                	mov    %esi,%eax
>    c:	e8 c0 e3 ff ff       	call   0xffffffffffffe3d1
>   11:	c7                   	.byte 0xc7
>   12:	47 08 00             	rex.RXB or %r8b,(%r8)
> 	...
> [  100.819393][    T6] EAX: 00000001 EBX: a9148000 ECX: a9148000 EDX: 00000000
> [  100.819900][    T6] ESI: 40392e80 EDI: 00000000 EBP: 401cfe78 ESP: 401cfe68
> [  100.820414][    T6] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010286
> [  100.820969][    T6] CR0: 80050033 CR2: 00000008 CR3: 7c8e9000 CR4: 00040690
> [  100.821493][    T6] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [  100.822008][    T6] DR6: fffe0ff0 DR7: 00000400
> [  100.822356][    T6] Call Trace:
> [ 100.822609][ T6] ? show_regs (arch/x86/kernel/dumpstack.c:478) 
> [ 100.822935][ T6] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) 
> [ 100.823237][ T6] ? page_fault_oops (arch/x86/mm/fault.c:709) 
> [ 100.823596][ T6] ? kernelmode_fixup_or_oops+0x58/0x70 
> [ 100.824081][ T6] ? __bad_area_nosemaphore+0x10f/0x1f0 
> [ 100.824560][ T6] ? hrtimer_interrupt (kernel/time/hrtimer.c:1877) 
> [ 100.824934][ T6] ? bad_area_nosemaphore (arch/x86/mm/fault.c:834) 
> [ 100.825309][ T6] ? do_user_addr_fault (arch/x86/mm/fault.c:1451) 
> [ 100.825686][ T6] ? sysvec_call_function_single (arch/x86/kernel/apic/apic.c:1049) 
> [ 100.826123][ T6] ? exc_page_fault (arch/x86/include/asm/irqflags.h:26 arch/x86/include/asm/irqflags.h:87 arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1488 arch/x86/mm/fault.c:1538) 
> [ 100.826474][ T6] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1493) 
> [ 100.826923][ T6] ? handle_exception (arch/x86/entry/entry_32.S:1055) 
> [ 100.827294][ T6] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1493) 
> [ 100.827737][ T6] ? slub_cpu_dead (mm/slub.c:2578 mm/slub.c:2625 mm/slub.c:3783) 
> [ 100.828071][ T6] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1493) 
> [ 100.828520][ T6] ? slub_cpu_dead (mm/slub.c:2578 mm/slub.c:2625 mm/slub.c:3783) 
> [ 100.828861][ T6] ? cpuhp_invoke_callback (kernel/cpu.c:194) 
> [ 100.829258][ T6] ? __wait_for_common (kernel/sched/completion.c:122) 
> [ 100.829631][ T6] ? hrtimer_nanosleep_restart (kernel/time/sleep_timeout.c:62) 
> [ 100.830069][ T6] ? kfree (mm/slub.c:3777) 
> [ 100.830367][ T6] ? __cpuhp_invoke_callback_range (kernel/cpu.c:967) 
> [ 100.830805][ T6] ? _cpu_down+0xf9/0x390 
> [ 100.831205][ T6] ? __cpu_down_maps_locked (kernel/cpu.c:1475) 
> [ 100.831611][ T6] ? work_for_cpu_fn (kernel/workqueue.c:6731) 
> [ 100.831966][ T6] ? process_one_work (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:457 include/linux/jump_label.h:262 include/trace/events/workqueue.h:110 kernel/workqueue.c:3241) 
> [ 100.832338][ T6] ? worker_thread (kernel/workqueue.c:3311 kernel/workqueue.c:3398) 
> [ 100.832694][ T6] ? kthread (kernel/kthread.c:464) 
> [ 100.833001][ T6] ? rescuer_thread (kernel/workqueue.c:3344) 
> [ 100.833356][ T6] ? kthreads_online_cpu (kernel/kthread.c:413) 
> [ 100.833733][ T6] ? ret_from_fork (arch/x86/kernel/process.c:154) 
> [ 100.838377][ T6] ? kthreads_online_cpu (kernel/kthread.c:413) 
> [ 100.838752][ T6] ? ret_from_fork_asm (arch/x86/entry/entry_32.S:737) 
> [ 100.839109][ T6] ? entry_INT80_32 (arch/x86/entry/entry_32.S:945) 
> [  100.839472][    T6] Modules linked in: rcutorture torture
> [  100.839877][    T6] CR2: 0000000000000008
> [  100.840173][    T6] ---[ end trace 0000000000000000 ]---
> [ 100.840547][ T6] EIP: slub_cpu_dead (mm/slub.c:2578 mm/slub.c:2625 mm/slub.c:3783) 
> [ 100.840886][ T6] Code: 01 a1 2c 56 a2 43 3d 2c 56 a2 43 74 72 8d 70 b8 8d 76 00 8b 1e 83 7d f0 07 77 7a 8b 45 f0 8b 0c 85 e0 78 5f 43 01 cb 8b 7b 1c <8b> 57 08 85 d2 74 11 8d 4f 0c 89 f0 e8 c0 e3 ff ff c7 47 08 00 00
> All code
> ========
>    0:	01 a1 2c 56 a2 43    	add    %esp,0x43a2562c(%rcx)
>    6:	3d 2c 56 a2 43       	cmp    $0x43a2562c,%eax
>    b:	74 72                	je     0x7f
>    d:	8d 70 b8             	lea    -0x48(%rax),%esi
>   10:	8d 76 00             	lea    0x0(%rsi),%esi
>   13:	8b 1e                	mov    (%rsi),%ebx
>   15:	83 7d f0 07          	cmpl   $0x7,-0x10(%rbp)
>   19:	77 7a                	ja     0x95
>   1b:	8b 45 f0             	mov    -0x10(%rbp),%eax
>   1e:	8b 0c 85 e0 78 5f 43 	mov    0x435f78e0(,%rax,4),%ecx
>   25:	01 cb                	add    %ecx,%ebx
>   27:	8b 7b 1c             	mov    0x1c(%rbx),%edi
>   2a:*	8b 57 08             	mov    0x8(%rdi),%edx		<-- trapping instruction
>   2d:	85 d2                	test   %edx,%edx
>   2f:	74 11                	je     0x42
>   31:	8d 4f 0c             	lea    0xc(%rdi),%ecx
>   34:	89 f0                	mov    %esi,%eax
>   36:	e8 c0 e3 ff ff       	call   0xffffffffffffe3fb
>   3b:	c7                   	.byte 0xc7
>   3c:	47 08 00             	rex.RXB or %r8b,(%r8)
> 	...
> 
> Code starting with the faulting instruction
> ===========================================
>    0:	8b 57 08             	mov    0x8(%rdi),%edx
>    3:	85 d2                	test   %edx,%edx
>    5:	74 11                	je     0x18
>    7:	8d 4f 0c             	lea    0xc(%rdi),%ecx
>    a:	89 f0                	mov    %esi,%eax
>    c:	e8 c0 e3 ff ff       	call   0xffffffffffffe3d1
>   11:	c7                   	.byte 0xc7
>   12:	47 08 00             	rex.RXB or %r8b,(%r8)
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20250324/202503241413.afff5aa1-lkp@intel.com
> 
> 
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
> 

-- 
Cheers,
Harry (formerly known as Hyeonggon)


