* Re: [syzbot] [mm?] WARNING: bad unlock balance in copy_process
[not found] <683adb33.a70a0220.1a6ae.000b.GAE@google.com>
@ 2025-09-17 20:40 ` syzbot
2025-09-18 8:35 ` Vlastimil Babka
0 siblings, 1 reply; 5+ messages in thread
From: syzbot @ 2025-09-17 20:40 UTC (permalink / raw)
To: Liam.Howlett, akpm, bsegall, david, dietmar.eggemann, juri.lelli,
kees, linux-kernel, linux-mm, lorenzo.stoakes, mgorman, mhocko,
mingo, peterz, rostedt, rppt, surenb, syzkaller-bugs, vbabka,
vincent.guittot, vschneid
syzbot has found a reproducer for the following issue on:
HEAD commit: 6edf2885ebeb Merge branch 'for-next/core' into for-kernelci
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=16d14c7c580000
kernel config: https://syzkaller.appspot.com/x/.config?x=b8b6789b42526d72
dashboard link: https://syzkaller.appspot.com/bug?extid=80cb3cc5c14fad191a10
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
userspace arch: arm64
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=179d9f62580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11d14c7c580000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/c72239eb6d76/disk-6edf2885.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/b67e9820b2be/vmlinux-6edf2885.xz
kernel image: https://storage.googleapis.com/syzbot-assets/0c4ab7e562f6/Image-6edf2885.gz.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+80cb3cc5c14fad191a10@syzkaller.appspotmail.com
=====================================
WARNING: bad unlock balance detected!
syzkaller #0 Not tainted
-------------------------------------
syz.1.48/6865 is trying to release lock (&sighand->siglock) at:
[<ffff8000803b8634>] spin_unlock include/linux/spinlock.h:391 [inline]
[<ffff8000803b8634>] copy_process+0x22d4/0x31ec kernel/fork.c:2432
but there are no more locks to release!
other info that might help us debug this:
1 lock held by syz.1.48/6865:
#0: ffff80008fa00450 (cgroup_threadgroup_rwsem){++++}-{0:0}, at: copy_process+0x2228/0x31ec kernel/fork.c:2274
stack backtrace:
CPU: 0 UID: 0 PID: 6865 Comm: syz.1.48 Not tainted syzkaller #0 PREEMPT
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/30/2025
Call trace:
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:499 (C)
__dump_stack+0x30/0x40 lib/dump_stack.c:94
dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120
dump_stack+0x1c/0x28 lib/dump_stack.c:129
print_unlock_imbalance_bug+0xf4/0xfc kernel/locking/lockdep.c:5298
__lock_release kernel/locking/lockdep.c:-1 [inline]
lock_release+0x244/0x39c kernel/locking/lockdep.c:5889
__raw_spin_unlock include/linux/spinlock_api_smp.h:141 [inline]
_raw_spin_unlock+0x24/0x78 kernel/locking/spinlock.c:186
spin_unlock include/linux/spinlock.h:391 [inline]
copy_process+0x22d4/0x31ec kernel/fork.c:2432
kernel_clone+0x1d8/0x84c kernel/fork.c:2605
__do_sys_clone kernel/fork.c:2748 [inline]
__se_sys_clone kernel/fork.c:2716 [inline]
__arm64_sys_clone+0x144/0x1a0 kernel/fork.c:2716
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x5c/0x254 arch/arm64/kernel/entry-common.c:744
el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:763
el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:596
---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
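For readers unfamiliar with this class of report: the warning fires when a task releases a lock that is not in its set of currently held locks. The following is only a minimal userspace sketch of that bookkeeping, not lockdep's actual implementation; `struct held_locks`, `track_acquire()`, `track_release()` and `MAX_HELD` are hypothetical names made up for illustration.

```c
#include <stddef.h>

#define MAX_HELD 8	/* hypothetical limit, chosen for this sketch only */

/* Per-task list of held locks, identified by address. */
struct held_locks {
	const void *lock[MAX_HELD];
	size_t depth;
};

/* Record an acquisition. Returns 0 on success, -1 if the table is full. */
int track_acquire(struct held_locks *h, const void *lock)
{
	if (h->depth == MAX_HELD)
		return -1;
	h->lock[h->depth++] = lock;
	return 0;
}

/*
 * Record a release. Returns 0 if the lock was held, -1 if it was not,
 * which corresponds to the "bad unlock balance" situation in the report.
 */
int track_release(struct held_locks *h, const void *lock)
{
	/* Search from the most recently acquired lock downwards. */
	for (size_t i = h->depth; i-- > 0; ) {
		if (h->lock[i] != lock)
			continue;
		/* Close the gap left by the released entry. */
		for (size_t j = i; j + 1 < h->depth; j++)
			h->lock[j] = h->lock[j + 1];
		h->depth--;
		return 0;
	}
	return -1;
}
```

In terms of the report above, the task holds one lock (cgroup_threadgroup_rwsem) and then tries to release a different one (siglock), so a lookup like the one in `track_release()` finds nothing to release.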
* Re: [syzbot] [mm?] WARNING: bad unlock balance in copy_process
2025-09-17 20:40 ` [syzbot] [mm?] WARNING: bad unlock balance in copy_process syzbot
@ 2025-09-18 8:35 ` Vlastimil Babka
2025-09-18 8:48 ` Sebastian Andrzej Siewior
2025-09-18 13:09 ` [PATCH] futex: Use correct exit on failure from futex_hash_allocate_default() Sebastian Andrzej Siewior
0 siblings, 2 replies; 5+ messages in thread
From: Vlastimil Babka @ 2025-09-18 8:35 UTC (permalink / raw)
To: syzbot, Liam.Howlett, akpm, bsegall, david, dietmar.eggemann,
juri.lelli, kees, linux-kernel, linux-mm, lorenzo.stoakes,
mgorman, mhocko, mingo, peterz, rostedt, rppt, surenb,
syzkaller-bugs, vincent.guittot, vschneid,
Sebastian Andrzej Siewior
On 9/17/25 22:40, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 6edf2885ebeb Merge branch 'for-next/core' into for-kernelci
> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> console output: https://syzkaller.appspot.com/x/log.txt?x=16d14c7c580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=b8b6789b42526d72
> dashboard link: https://syzkaller.appspot.com/bug?extid=80cb3cc5c14fad191a10
> compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
> userspace arch: arm64
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=179d9f62580000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11d14c7c580000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/c72239eb6d76/disk-6edf2885.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/b67e9820b2be/vmlinux-6edf2885.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/0c4ab7e562f6/Image-6edf2885.gz.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+80cb3cc5c14fad191a10@syzkaller.appspotmail.com
>
> =====================================
> WARNING: bad unlock balance detected!
> syzkaller #0 Not tainted
> -------------------------------------
> syz.1.48/6865 is trying to release lock (&sighand->siglock) at:
> [<ffff8000803b8634>] spin_unlock include/linux/spinlock.h:391 [inline]
> [<ffff8000803b8634>] copy_process+0x22d4/0x31ec kernel/fork.c:2432
bad_fork_core_free:
	sched_core_free(p);
	spin_unlock(&current->sighand->siglock); <- here
Sebastian, I think it's your 7c4f75a21f63 ("futex: Allow automatic
allocation of process wide futex hash") adding a "goto bad_fork_core_free;"
from a place that doesn't yet have current->sighand->siglock locked?
> but there are no more locks to release!
>
> other info that might help us debug this:
> 1 lock held by syz.1.48/6865:
> #0: ffff80008fa00450 (cgroup_threadgroup_rwsem){++++}-{0:0}, at: copy_process+0x2228/0x31ec kernel/fork.c:2274
>
> stack backtrace:
> CPU: 0 UID: 0 PID: 6865 Comm: syz.1.48 Not tainted syzkaller #0 PREEMPT
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/30/2025
> Call trace:
> show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:499 (C)
> __dump_stack+0x30/0x40 lib/dump_stack.c:94
> dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120
> dump_stack+0x1c/0x28 lib/dump_stack.c:129
> print_unlock_imbalance_bug+0xf4/0xfc kernel/locking/lockdep.c:5298
> __lock_release kernel/locking/lockdep.c:-1 [inline]
> lock_release+0x244/0x39c kernel/locking/lockdep.c:5889
> __raw_spin_unlock include/linux/spinlock_api_smp.h:141 [inline]
> _raw_spin_unlock+0x24/0x78 kernel/locking/spinlock.c:186
> spin_unlock include/linux/spinlock.h:391 [inline]
> copy_process+0x22d4/0x31ec kernel/fork.c:2432
> kernel_clone+0x1d8/0x84c kernel/fork.c:2605
> __do_sys_clone kernel/fork.c:2748 [inline]
> __se_sys_clone kernel/fork.c:2716 [inline]
> __arm64_sys_clone+0x144/0x1a0 kernel/fork.c:2716
> __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
> el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
> do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
> el0_svc+0x5c/0x254 arch/arm64/kernel/entry-common.c:744
> el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:763
> el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:596
>
>
> ---
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
* Re: [syzbot] [mm?] WARNING: bad unlock balance in copy_process
2025-09-18 8:35 ` Vlastimil Babka
@ 2025-09-18 8:48 ` Sebastian Andrzej Siewior
2025-09-18 13:09 ` [PATCH] futex: Use correct exit on failure from futex_hash_allocate_default() Sebastian Andrzej Siewior
1 sibling, 0 replies; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-09-18 8:48 UTC (permalink / raw)
To: Vlastimil Babka
Cc: syzbot, Liam.Howlett, akpm, bsegall, david, dietmar.eggemann,
juri.lelli, kees, linux-kernel, linux-mm, lorenzo.stoakes,
mgorman, mhocko, mingo, peterz, rostedt, rppt, surenb,
syzkaller-bugs, vincent.guittot, vschneid
On 2025-09-18 10:35:24 [+0200], Vlastimil Babka wrote:
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+80cb3cc5c14fad191a10@syzkaller.appspotmail.com
> >
> > =====================================
> > WARNING: bad unlock balance detected!
> > syzkaller #0 Not tainted
> > -------------------------------------
> > syz.1.48/6865 is trying to release lock (&sighand->siglock) at:
> > [<ffff8000803b8634>] spin_unlock include/linux/spinlock.h:391 [inline]
> > [<ffff8000803b8634>] copy_process+0x22d4/0x31ec kernel/fork.c:2432
>
> bad_fork_core_free:
> 	sched_core_free(p);
> 	spin_unlock(&current->sighand->siglock); <- here
>
> Sebastian, I think it's your 7c4f75a21f63 ("futex: Allow automatic
> allocation of process wide futex hash") adding a "goto bad_fork_core_free;"
> from a place that doesn't yet have current->sighand->siglock locked?
Yes. Judging from -rc6, if futex_hash_allocate_default() fails we hold
neither siglock nor tasklist_lock. sched_core_free() also looks wrong, as
the cookie is only allocated later, in sched_core_fork(). sched_cgroup_fork()
does nothing special. So it should be:
diff --git a/kernel/fork.c b/kernel/fork.c
index c4ada32598bd5..6ca8689a83b5b 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2295,7 +2295,7 @@ __latent_entropy struct task_struct *copy_process(
if (need_futex_hash_allocate_default(clone_flags)) {
retval = futex_hash_allocate_default();
if (retval)
- goto bad_fork_core_free;
+ goto bad_fork_cancel_cgroup;
/*
* If we fail beyond this point we don't free the allocated
* futex hash map. We assume that another thread will be created
Sebastian
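To make the jump-target problem concrete, here is a minimal userspace sketch of the error-unwind labels discussed above. It is a model under stated assumptions, not kernel code: `struct mock_lock`, `mock_lock()`, `mock_unlock()` and `model_copy_process()` are hypothetical stand-ins, and the "locks" are plain counters. Only the label names and the lock names come from this thread.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counting stand-in for siglock/tasklist_lock; not a real kernel lock. */
struct mock_lock {
	const char *name;
	int depth;
};

static int unlock_imbalances;	/* unlocks seen without a matching lock */

static void mock_lock(struct mock_lock *l)
{
	l->depth++;
}

static void mock_unlock(struct mock_lock *l)
{
	if (l->depth == 0) {
		/* Roughly the condition reported as "bad unlock balance". */
		unlock_imbalances++;
		fprintf(stderr, "bad unlock balance: %s\n", l->name);
		return;
	}
	l->depth--;
}

static struct mock_lock tasklist = { "tasklist_lock", 0 };
static struct mock_lock siglock = { "siglock", 0 };

/*
 * Hypothetical model of the tail of copy_process().
 * early_fail:      futex_hash_allocate_default() fails (no locks held yet)
 * late_fail:       a later step fails with both locks held
 * use_fixed_label: take the patched jump target on early failure
 * Returns the number of unbalanced unlocks observed.
 */
int model_copy_process(bool early_fail, bool late_fail, bool use_fixed_label)
{
	unlock_imbalances = 0;
	tasklist.depth = 0;
	siglock.depth = 0;

	if (early_fail) {
		if (use_fixed_label)
			goto bad_fork_cancel_cgroup;
		goto bad_fork_core_free;	/* the jump being replaced */
	}

	mock_lock(&tasklist);
	mock_lock(&siglock);

	if (late_fail)
		goto bad_fork_core_free;	/* correct: both locks are held */

	mock_unlock(&siglock);
	mock_unlock(&tasklist);
	return unlock_imbalances;

bad_fork_core_free:
	mock_unlock(&siglock);
	mock_unlock(&tasklist);
bad_fork_cancel_cgroup:
	/* cgroup_cancel_fork() would run here; nothing to model. */
	return unlock_imbalances;
}
```

Calling `model_copy_process(true, false, false)` takes the old jump and reports two unbalanced unlocks (one per lock), while `model_copy_process(true, false, true)` takes the patched jump and reports none. The late-failure path is unaffected by the change.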
* [PATCH] futex: Use correct exit on failure from futex_hash_allocate_default()
2025-09-18 8:35 ` Vlastimil Babka
2025-09-18 8:48 ` Sebastian Andrzej Siewior
@ 2025-09-18 13:09 ` Sebastian Andrzej Siewior
2025-09-18 15:30 ` Steven Rostedt
1 sibling, 1 reply; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-09-18 13:09 UTC (permalink / raw)
To: Vlastimil Babka, Thomas Gleixner, Peter Zijlstra
Cc: syzbot, Liam.Howlett, akpm, bsegall, david, dietmar.eggemann,
juri.lelli, kees, linux-kernel, linux-mm, lorenzo.stoakes,
mgorman, mhocko, mingo, peterz, rostedt, rppt, surenb,
syzkaller-bugs, vincent.guittot, vschneid
copy_process() uses the wrong error exit path from
futex_hash_allocate_default().
After exiting from futex_hash_allocate_default(), neither tasklist_lock
nor siglock has been acquired. The exit label bad_fork_core_free unlocks
both of these locks, which is wrong.
The previous label, bad_fork_cancel_cgroup, is the correct exit.
sched_cgroup_fork() did not allocate any resources that need to be freed.
Use bad_fork_cancel_cgroup on error exit from
futex_hash_allocate_default().
Fixes: 7c4f75a21f636 ("futex: Allow automatic allocation of process wide futex hash")
Reported-by: syzbot+80cb3cc5c14fad191a10@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68cb1cbd.050a0220.2ff435.0599.GAE@google.com
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
That private-futex code was marked BROKEN in v6.16 and re-enabled in
v6.17. It could use
56180dd20c19e ("futex: Use RCU-based per-CPU reference counting instead of rcuref_t")
as Fixes: instead to avoid backporting to v6.16.
kernel/fork.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index c4ada32598bd5..6ca8689a83b5b 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2295,7 +2295,7 @@ __latent_entropy struct task_struct *copy_process(
if (need_futex_hash_allocate_default(clone_flags)) {
retval = futex_hash_allocate_default();
if (retval)
- goto bad_fork_core_free;
+ goto bad_fork_cancel_cgroup;
/*
* If we fail beyond this point we don't free the allocated
* futex hash map. We assume that another thread will be created
--
2.51.0
* Re: [PATCH] futex: Use correct exit on failure from futex_hash_allocate_default()
2025-09-18 13:09 ` [PATCH] futex: Use correct exit on failure from futex_hash_allocate_default() Sebastian Andrzej Siewior
@ 2025-09-18 15:30 ` Steven Rostedt
0 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2025-09-18 15:30 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Vlastimil Babka, Thomas Gleixner, Peter Zijlstra, syzbot,
Liam.Howlett, akpm, bsegall, david, dietmar.eggemann, juri.lelli,
kees, linux-kernel, linux-mm, lorenzo.stoakes, mgorman, mhocko,
mingo, rppt, surenb, syzkaller-bugs, vincent.guittot, vschneid
On Thu, 18 Sep 2025 15:09:45 +0200
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> copy_process() uses the wrong error exit path from
> futex_hash_allocate_default().
> After exiting from futex_hash_allocate_default(), neither tasklist_lock
> nor siglock has been acquired. The exit label bad_fork_core_free unlocks
> both of these locks, which is wrong.
>
> The previous label, bad_fork_cancel_cgroup, is the correct exit.
> sched_cgroup_fork() did not allocate any resources that need to be freed.
>
> Use bad_fork_cancel_cgroup on error exit from
> futex_hash_allocate_default().
	if (need_futex_hash_allocate_default(clone_flags)) {
		retval = futex_hash_allocate_default();
		if (retval)
			goto bad_fork_core_free;
[..]
	}
[..]
	write_lock_irq(&tasklist_lock);
[..]
	klp_copy_process(p);
	sched_core_fork(p);
	spin_lock(&current->sighand->siglock);
[..]
bad_fork_core_free:
	sched_core_free(p);
	spin_unlock(&current->sighand->siglock);
	write_unlock_irq(&tasklist_lock);
bad_fork_cancel_cgroup:
	cgroup_cancel_fork(p, args);
Yep, looks bad to me!
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-- Steve
>
> Fixes: 7c4f75a21f636 ("futex: Allow automatic allocation of process wide futex hash")
> Reported-by: syzbot+80cb3cc5c14fad191a10@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/68cb1cbd.050a0220.2ff435.0599.GAE@google.com
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>