Date: Thu, 20 Mar 2025 13:09:38 -0700
From: Kees Cook <kees@kernel.org>
To: Oleg Nesterov, brauner@kernel.org
Cc: jack@suse.cz, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, syzkaller-bugs@googlegroups.com,
	viro@zeniv.linux.org.uk, syzbot
Subject: Re: [syzbot] [fs?] [mm?] KCSAN: data-race in bprm_execve / copy_fs (4)
Message-ID: <202503201225.92C5F5FB1@keescook>
References: <67dc67f0.050a0220.25ae54.001f.GAE@google.com>
In-Reply-To: <67dc67f0.050a0220.25ae54.001f.GAE@google.com>

Hey look another threaded exec bug. :|

On Thu, Mar 20, 2025 at 12:09:36PM -0700, syzbot wrote:
> ==================================================================
> BUG: KCSAN: data-race in bprm_execve / copy_fs
>
> write to 0xffff8881044f8250 of 4 bytes by task 13692 on cpu 0:
>  bprm_execve+0x748/0x9c0 fs/exec.c:1884

This is:

	current->fs->in_exec = 0;

And it is part of the execve failure path:

out:
	...
	if (bprm->point_of_no_return && !fatal_signal_pending(current))
		force_fatal_sig(SIGSEGV);

	sched_mm_cid_after_execve(current);
	current->fs->in_exec = 0;
	current->in_execve = 0;

	return retval;
}
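For contrast, the only place that *sets* this flag is check_unsafe_exec(),
where the store does happen under the lock. Abridged sketch of that hunk
(from fs/exec.c, with the surrounding thread-counting details elided):

	spin_lock(&p->fs->lock);
	...
	/* Only mark the exec as "in progress" when ->fs is not shared
	 * outside this thread group; otherwise flag the unsafe share.
	 */
	if (p->fs->users > n_fs)
		bprm->unsafe |= LSM_UNSAFE_SHARE;
	else
		p->fs->in_exec = 1;
	spin_unlock(&p->fs->lock);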
> do_execveat_common+0x769/0x7e0 fs/exec.c:1966
> do_execveat fs/exec.c:2051 [inline]
> __do_sys_execveat fs/exec.c:2125 [inline]
> __se_sys_execveat fs/exec.c:2119 [inline]
> __x64_sys_execveat+0x75/0x90 fs/exec.c:2119
> x64_sys_call+0x291e/0x2dc0 arch/x86/include/generated/asm/syscalls_64.h:323
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> read to 0xffff8881044f8250 of 4 bytes by task 13686 on cpu 1:
>  copy_fs+0x95/0xf0 kernel/fork.c:1770

This is:

	if (fs->in_exec) {

Which is under lock:

	struct fs_struct *fs = current->fs;
	if (clone_flags & CLONE_FS) {
		/* tsk->fs is already what we want */
		spin_lock(&fs->lock);
		/* "users" and "in_exec" locked for check_unsafe_exec() */
		if (fs->in_exec) {
			spin_unlock(&fs->lock);
			return -EAGAIN;
		}
		fs->users++;
		spin_unlock(&fs->lock);

Does execve need to be taking this lock? The other thing touching it is
check_unsafe_exec(), which takes the lock. It looks like the
bprm_execve() lock was removed in commit 8c652f96d385 ("do_execve() must
not clear fs->in_exec if it was set by another thread"), which used the
return value from check_unsafe_exec():

    When do_execve() succeeds, it is safe to clear ->in_exec
    unconditionally. It can be set only if we don't share ->fs with
    another process, and since we already killed all sub-threads either
    ->in_exec == 0 or we are the only user of this ->fs.

    Also, we do not need fs->lock to clear fs->in_exec.

This logic was updated in commit 9e00cdb091b0 ("exec:check_unsafe_exec:
kill the dead -EAGAIN and clear_in_exec logic"), which includes this
rationale:

    2. "out_unmark:" in do_execve_common() is either called under
       ->cred_guard_mutex, or after de_thread() which kills other
       threads, so we can't race with sub-thread which could set
       ->in_exec. And if ->fs is shared with another process ->in_exec
       should be false anyway.

The de_thread() is part of the "point of no return" in begin_new_exec(),
reached via exec_binprm(). But the bprm_execve() error path is reachable
from many paths prior to the point of no return.

What I can imagine here is two failing execs racing a fork:

	A: start execve
	B: fork with CLONE_FS
	C: start execve, reach check_unsafe_exec(), set fs->in_exec
	A: bprm_execve() failure, clear fs->in_exec
	B: copy_fs() increments fs->users
	C: bprm_execve() failure, clear fs->in_exec

I don't think this is a "real" flaw, though, since the locking is there
to protect a _successful_ execve from a fork (i.e. to get the user count
right). A successful execve will de_thread(), and I don't see any wrong
counting of fs->users with regard to thread lifetime.

Did I miss something in the analysis? Should we perform locking anyway,
or add data race annotations, or something else?
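If we decide the answer is locking, the most mechanical option would be
to mirror check_unsafe_exec() and clear the flag under fs->lock. Rough
sketch (untested) against the error path quoted above:

	sched_mm_cid_after_execve(current);
	/* Sketch only: take fs->lock around the clear so this store
	 * pairs with the locked readers in copy_fs() and
	 * check_unsafe_exec().
	 */
	spin_lock(&current->fs->lock);
	current->fs->in_exec = 0;
	spin_unlock(&current->fs->lock);
	current->in_execve = 0;

Alternatively, if we agree the race is benign, we could just annotate
both sides (WRITE_ONCE() on this store, READ_ONCE() or data_race() on
the copy_fs() read) to document the intent and quiet KCSAN.

-- 
Kees Cook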