* Re: [syzbot] [mm?] WARNING: bad unlock balance in copy_process
In reply to: <683adb33.a70a0220.1a6ae.000b.GAE@google.com> (not found)
From: syzbot @ 2025-09-17 20:40 UTC
To: Liam.Howlett, akpm, bsegall, david, dietmar.eggemann, juri.lelli, kees, linux-kernel, linux-mm, lorenzo.stoakes, mgorman, mhocko, mingo, peterz, rostedt, rppt, surenb, syzkaller-bugs, vbabka, vincent.guittot, vschneid

syzbot has found a reproducer for the following issue on:

HEAD commit:    6edf2885ebeb Merge branch 'for-next/core' into for-kernelci
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=16d14c7c580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=b8b6789b42526d72
dashboard link: https://syzkaller.appspot.com/bug?extid=80cb3cc5c14fad191a10
compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
userspace arch: arm64
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=179d9f62580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11d14c7c580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/c72239eb6d76/disk-6edf2885.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/b67e9820b2be/vmlinux-6edf2885.xz
kernel image: https://storage.googleapis.com/syzbot-assets/0c4ab7e562f6/Image-6edf2885.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+80cb3cc5c14fad191a10@syzkaller.appspotmail.com

=====================================
WARNING: bad unlock balance detected!
syzkaller #0 Not tainted
-------------------------------------
syz.1.48/6865 is trying to release lock (&sighand->siglock) at:
[<ffff8000803b8634>] spin_unlock include/linux/spinlock.h:391 [inline]
[<ffff8000803b8634>] copy_process+0x22d4/0x31ec kernel/fork.c:2432
but there are no more locks to release!

other info that might help us debug this:
1 lock held by syz.1.48/6865:
 #0: ffff80008fa00450 (cgroup_threadgroup_rwsem){++++}-{0:0}, at: copy_process+0x2228/0x31ec kernel/fork.c:2274

stack backtrace:
CPU: 0 UID: 0 PID: 6865 Comm: syz.1.48 Not tainted syzkaller #0 PREEMPT
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/30/2025
Call trace:
 show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:499 (C)
 __dump_stack+0x30/0x40 lib/dump_stack.c:94
 dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120
 dump_stack+0x1c/0x28 lib/dump_stack.c:129
 print_unlock_imbalance_bug+0xf4/0xfc kernel/locking/lockdep.c:5298
 __lock_release kernel/locking/lockdep.c:-1 [inline]
 lock_release+0x244/0x39c kernel/locking/lockdep.c:5889
 __raw_spin_unlock include/linux/spinlock_api_smp.h:141 [inline]
 _raw_spin_unlock+0x24/0x78 kernel/locking/spinlock.c:186
 spin_unlock include/linux/spinlock.h:391 [inline]
 copy_process+0x22d4/0x31ec kernel/fork.c:2432
 kernel_clone+0x1d8/0x84c kernel/fork.c:2605
 __do_sys_clone kernel/fork.c:2748 [inline]
 __se_sys_clone kernel/fork.c:2716 [inline]
 __arm64_sys_clone+0x144/0x1a0 kernel/fork.c:2716
 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
 invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
 el0_svc+0x5c/0x254 arch/arm64/kernel/entry-common.c:744
 el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:763
 el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:596

---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
* Re: [syzbot] [mm?] WARNING: bad unlock balance in copy_process
From: Vlastimil Babka @ 2025-09-18 8:35 UTC
To: syzbot, Liam.Howlett, akpm, bsegall, david, dietmar.eggemann, juri.lelli, kees, linux-kernel, linux-mm, lorenzo.stoakes, mgorman, mhocko, mingo, peterz, rostedt, rppt, surenb, syzkaller-bugs, vincent.guittot, vschneid, Sebastian Andrzej Siewior

On 9/17/25 22:40, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit:    6edf2885ebeb Merge branch 'for-next/core' into for-kernelci
> git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> console output: https://syzkaller.appspot.com/x/log.txt?x=16d14c7c580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=b8b6789b42526d72
> dashboard link: https://syzkaller.appspot.com/bug?extid=80cb3cc5c14fad191a10
> compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
> userspace arch: arm64
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=179d9f62580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11d14c7c580000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/c72239eb6d76/disk-6edf2885.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/b67e9820b2be/vmlinux-6edf2885.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/0c4ab7e562f6/Image-6edf2885.gz.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+80cb3cc5c14fad191a10@syzkaller.appspotmail.com
>
> =====================================
> WARNING: bad unlock balance detected!
> syzkaller #0 Not tainted
> -------------------------------------
> syz.1.48/6865 is trying to release lock (&sighand->siglock) at:
> [<ffff8000803b8634>] spin_unlock include/linux/spinlock.h:391 [inline]
> [<ffff8000803b8634>] copy_process+0x22d4/0x31ec kernel/fork.c:2432

bad_fork_core_free:
	sched_core_free(p);
	spin_unlock(&current->sighand->siglock); <- here

Sebastian, I think it's your 7c4f75a21f63 ("futex: Allow automatic
allocation of process wide futex hash") adding a "goto bad_fork_core_free;"
from a place that doesn't yet have current->sighand->siglock locked?

> but there are no more locks to release!
>
> other info that might help us debug this:
> 1 lock held by syz.1.48/6865:
>  #0: ffff80008fa00450 (cgroup_threadgroup_rwsem){++++}-{0:0}, at: copy_process+0x2228/0x31ec kernel/fork.c:2274
>
> stack backtrace:
> CPU: 0 UID: 0 PID: 6865 Comm: syz.1.48 Not tainted syzkaller #0 PREEMPT
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/30/2025
> Call trace:
>  show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:499 (C)
>  __dump_stack+0x30/0x40 lib/dump_stack.c:94
>  dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120
>  dump_stack+0x1c/0x28 lib/dump_stack.c:129
>  print_unlock_imbalance_bug+0xf4/0xfc kernel/locking/lockdep.c:5298
>  __lock_release kernel/locking/lockdep.c:-1 [inline]
>  lock_release+0x244/0x39c kernel/locking/lockdep.c:5889
>  __raw_spin_unlock include/linux/spinlock_api_smp.h:141 [inline]
>  _raw_spin_unlock+0x24/0x78 kernel/locking/spinlock.c:186
>  spin_unlock include/linux/spinlock.h:391 [inline]
>  copy_process+0x22d4/0x31ec kernel/fork.c:2432
>  kernel_clone+0x1d8/0x84c kernel/fork.c:2605
>  __do_sys_clone kernel/fork.c:2748 [inline]
>  __se_sys_clone kernel/fork.c:2716 [inline]
>  __arm64_sys_clone+0x144/0x1a0 kernel/fork.c:2716
>  __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
>  invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
>  el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
>  do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
>  el0_svc+0x5c/0x254 arch/arm64/kernel/entry-common.c:744
>  el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:763
>  el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:596
>
>
> ---
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
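Editorial illustration (not part of the original message): the warning quoted above fires when a lock is released by a task that does not hold it. The sketch below is a user-space toy model of that check, assuming a simple per-lock hold counter. The names toy_lock, toy_lock_acquire(), toy_lock_release() and imbalance_warnings are hypothetical, and this is not how lockdep is implemented; it only shows the condition being reported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical toy lock: it only counts outstanding acquisitions. */
struct toy_lock {
	const char *name;
	int held;	/* acquisitions not yet released */
};

/* Number of unbalanced releases seen so far (toy bookkeeping). */
static int imbalance_warnings;

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(l != NULL);
	l->held++;
}

/*
 * Release the lock. Returns true if the release was balanced and
 * false if the lock was not held, which is the situation the
 * "bad unlock balance" report describes.
 */
static bool toy_lock_release(struct toy_lock *l)
{
	assert(l != NULL);
	if (l->held == 0) {
		imbalance_warnings++;
		fprintf(stderr,
			"bad unlock balance: releasing (%s), but it is not held\n",
			l->name);
		return false;
	}
	l->held--;
	return true;
}
```

In this model, calling toy_lock_release() on a freshly initialised lock returns false and bumps imbalance_warnings, which corresponds to the spin_unlock() at bad_fork_core_free being reached without a prior spin_lock().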
* Re: [syzbot] [mm?] WARNING: bad unlock balance in copy_process
From: Sebastian Andrzej Siewior @ 2025-09-18 8:48 UTC
To: Vlastimil Babka
Cc: syzbot, Liam.Howlett, akpm, bsegall, david, dietmar.eggemann, juri.lelli, kees, linux-kernel, linux-mm, lorenzo.stoakes, mgorman, mhocko, mingo, peterz, rostedt, rppt, surenb, syzkaller-bugs, vincent.guittot, vschneid

On 2025-09-18 10:35:24 [+0200], Vlastimil Babka wrote:
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+80cb3cc5c14fad191a10@syzkaller.appspotmail.com
> >
> > =====================================
> > WARNING: bad unlock balance detected!
> > syzkaller #0 Not tainted
> > -------------------------------------
> > syz.1.48/6865 is trying to release lock (&sighand->siglock) at:
> > [<ffff8000803b8634>] spin_unlock include/linux/spinlock.h:391 [inline]
> > [<ffff8000803b8634>] copy_process+0x22d4/0x31ec kernel/fork.c:2432
>
> bad_fork_core_free:
> 	sched_core_free(p);
> 	spin_unlock(&current->sighand->siglock); <- here
>
> Sebastian, I think it's your 7c4f75a21f63 ("futex: Allow automatic
> allocation of process wide futex hash") adding a "goto bad_fork_core_free;"
> from a place that doesn't yet have current->sighand->siglock locked?

Yes. Judging from -rc6, if futex_hash_allocate_default() fails we hold
neither siglock nor tasklist_lock. sched_core_free() looks also bad as
the cookie was allocated later in sched_core_fork().
sched_cgroup_fork() does nothing special. So it should be

diff --git a/kernel/fork.c b/kernel/fork.c
index c4ada32598bd5..6ca8689a83b5b 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2295,7 +2295,7 @@ __latent_entropy struct task_struct *copy_process(
 	if (need_futex_hash_allocate_default(clone_flags)) {
 		retval = futex_hash_allocate_default();
 		if (retval)
-			goto bad_fork_core_free;
+			goto bad_fork_cancel_cgroup;
 		/*
 		 * If we fail beyond this point we don't free the allocated
 		 * futex hash map. We assume that another thread will be created

Sebastian
* [PATCH] futex: Use correct exit on failure from futex_hash_allocate_default()
From: Sebastian Andrzej Siewior @ 2025-09-18 13:09 UTC
To: Vlastimil Babka, Thomas Gleixner, Peter Zijlstra
Cc: syzbot, Liam.Howlett, akpm, bsegall, david, dietmar.eggemann, juri.lelli, kees, linux-kernel, linux-mm, lorenzo.stoakes, mgorman, mhocko, mingo, peterz, rostedt, rppt, surenb, syzkaller-bugs, vincent.guittot, vschneid

copy_process() uses the wrong error exit path from
futex_hash_allocate_default().
After exiting from futex_hash_allocate_default(), neither tasklist_lock
nor siglock has been acquired. The exit label bad_fork_core_free unlocks
both of these locks which is wrong.

The previous label, bad_fork_cancel_cgroup, is the correct exit.
sched_cgroup_fork() did not allocate any resources that need to be freed.

Use bad_fork_cancel_cgroup on error exit from
futex_hash_allocate_default().

Fixes: 7c4f75a21f636 ("futex: Allow automatic allocation of process wide futex hash")
Reported-by: syzbot+80cb3cc5c14fad191a10@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68cb1cbd.050a0220.2ff435.0599.GAE@google.com
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---

That private-futex code was marked BROKEN in v6.16 and re-enabled in
v6.17. It could use
  56180dd20c19e ("futex: Use RCU-based per-CPU reference counting instead of rcuref_t")

as Fixes: instead to avoid backporting to v6.16.

 kernel/fork.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index c4ada32598bd5..6ca8689a83b5b 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2295,7 +2295,7 @@ __latent_entropy struct task_struct *copy_process(
 	if (need_futex_hash_allocate_default(clone_flags)) {
 		retval = futex_hash_allocate_default();
 		if (retval)
-			goto bad_fork_core_free;
+			goto bad_fork_cancel_cgroup;
 		/*
 		 * If we fail beyond this point we don't free the allocated
 		 * futex hash map. We assume that another thread will be created
-- 
2.51.0
* Re: [PATCH] futex: Use correct exit on failure from futex_hash_allocate_default()
From: Steven Rostedt @ 2025-09-18 15:30 UTC
To: Sebastian Andrzej Siewior
Cc: Vlastimil Babka, Thomas Gleixner, Peter Zijlstra, syzbot, Liam.Howlett, akpm, bsegall, david, dietmar.eggemann, juri.lelli, kees, linux-kernel, linux-mm, lorenzo.stoakes, mgorman, mhocko, mingo, rppt, surenb, syzkaller-bugs, vincent.guittot, vschneid

On Thu, 18 Sep 2025 15:09:45 +0200
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> copy_process() uses the wrong error exit path from
> futex_hash_allocate_default().
> After exiting from futex_hash_allocate_default(), neither tasklist_lock
> nor siglock has been acquired. The exit label bad_fork_core_free unlocks
> both of these locks which is wrong.
>
> The previous label, bad_fork_cancel_cgroup, is the correct exit.
> sched_cgroup_fork() did not allocate any resources that need to be freed.
>
> Use bad_fork_cancel_cgroup on error exit from
> futex_hash_allocate_default().

	if (need_futex_hash_allocate_default(clone_flags)) {
		retval = futex_hash_allocate_default();
		if (retval)
			goto bad_fork_core_free;
[..]
	}
[..]
	write_lock_irq(&tasklist_lock);
[..]
	klp_copy_process(p);

	sched_core_fork(p);

	spin_lock(&current->sighand->siglock);
[..]
bad_fork_core_free:
	sched_core_free(p);
	spin_unlock(&current->sighand->siglock);
	write_unlock_irq(&tasklist_lock);
bad_fork_cancel_cgroup:
	cgroup_cancel_fork(p, args);

Yep, looks bad to me!

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve

>
> Fixes: 7c4f75a21f636 ("futex: Allow automatic allocation of process wide futex hash")
> Reported-by: syzbot+80cb3cc5c14fad191a10@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/68cb1cbd.050a0220.2ff435.0599.GAE@google.com
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>
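Editorial illustration (not part of the original thread): the control flow walked through in the review above can be modelled in user space to compare the two exit labels. This is a minimal sketch, assuming plain integer depth counters in place of the real locks. struct model_state, enum exit_label and model_futex_failure() are hypothetical names, and only the futex_hash_allocate_default() failure path is modelled; sched_core_free() is left out.

```c
#include <stdbool.h>

/* Toy stand-ins for the state touched by the exit labels. */
struct model_state {
	int tasklist_lock;	/* models write_lock_irq(&tasklist_lock) depth */
	int siglock;		/* models spin_lock(&current->sighand->siglock) depth */
	bool cgroup_pending;	/* models what cgroup_cancel_fork() undoes */
};

/* Which label the failing futex_hash_allocate_default() call jumps to. */
enum exit_label { LABEL_CORE_FREE, LABEL_CANCEL_CGROUP };

/*
 * Model the failure path. Returns true if no lock was released
 * without being held, false if the unwind was unbalanced.
 */
static bool model_futex_failure(enum exit_label label, struct model_state *s)
{
	bool balanced = true;

	s->tasklist_lock = 0;
	s->siglock = 0;
	s->cgroup_pending = true;	/* the cgroup fork step already ran */

	/* The allocation fails here, before either lock is taken. */
	if (label == LABEL_CORE_FREE)
		goto bad_fork_core_free;
	goto bad_fork_cancel_cgroup;

bad_fork_core_free:
	/* Unlocks in reverse order of acquisition, as in the excerpt. */
	if (s->siglock-- == 0)
		balanced = false;	/* released siglock without holding it */
	if (s->tasklist_lock-- == 0)
		balanced = false;	/* released tasklist_lock without holding it */
bad_fork_cancel_cgroup:
	s->cgroup_pending = false;
	return balanced;
}
```

With LABEL_CORE_FREE the model reports an unbalanced unwind and both counters end at -1; with LABEL_CANCEL_CGROUP the counters stay at 0 and only the cgroup state is undone, which is the behaviour the patch selects.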