From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7FE99C3ABCC for ; Tue, 13 May 2025 15:30:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DACD16B0088; Tue, 13 May 2025 11:30:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D0E266B008A; Tue, 13 May 2025 11:30:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B60EF6B00C1; Tue, 13 May 2025 11:30:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 8F6546B0088 for ; Tue, 13 May 2025 11:30:40 -0400 (EDT) Received: from smtpin06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 36EBF120F19 for ; Tue, 13 May 2025 15:30:42 +0000 (UTC) X-FDA: 83438271924.06.4D3C37F Received: from out01.mta.xmission.com (out01.mta.xmission.com [166.70.13.231]) by imf03.hostedemail.com (Postfix) with ESMTP id 800F920003 for ; Tue, 13 May 2025 15:30:39 +0000 (UTC) Authentication-Results: imf03.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=xmission.com; spf=pass (imf03.hostedemail.com: domain of ebiederm@xmission.com designates 166.70.13.231 as permitted sender) smtp.mailfrom=ebiederm@xmission.com ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=xmission.com; spf=pass (imf03.hostedemail.com: domain of ebiederm@xmission.com designates 166.70.13.231 as permitted sender) smtp.mailfrom=ebiederm@xmission.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1747150239; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3QYId6xKsPwf9SrQMJlF7v6rYC72qOvBhcjVK3XBNvM=; b=yPqCQFx+g4Qde6vydH13oo22tjD97HFxKUQfqa8cwZryoZuPEMxDKOoNM54iCW6VbrjoE2 F3b4ACIdq1kyfpYqSu4K/6Uvb238juJFt9Suj4XmmSsLBSggF92aT7Ptuv8+ix32wkTRjV pPA/4kDz3fJZQqR2VCM9UP3/nwOt6mI= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1747150239; a=rsa-sha256; cv=none; b=KGbaU4uGv+6urPX7I47ZtOOJu4NM7ONKwtLs4pc90pKbA4Aiz0po1K6LZhCWXdxJm5ZkIq 7cHH388UyNLzNtJ1wkX1X54xSpZTBTy/0QRrJJgY2A/oY4czy6n+N6rIlw093JUEJ8Wmp6 7uifuA7/41F/vk6WRUIeQla9dYmnpxE= Received: from in02.mta.xmission.com ([166.70.13.52]:58534) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1uEraV-0057bO-9O; Tue, 13 May 2025 09:30:35 -0600 Received: from ip72-198-198-28.om.om.cox.net ([72.198.198.28]:59502 helo=email.froward.int.ebiederm.org.xmission.com) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1uEraS-00FteB-TR; Tue, 13 May 2025 09:30:34 -0600 From: "Eric W. Biederman" To: Mateusz Guzik Cc: Kees Cook , Jann Horn , Christian Brauner , Jorge Merlino , Alexander Viro , Thomas Gleixner , Andy Lutomirski , Sebastian Andrzej Siewior , Andrew Morton , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, John Johansen , Paul Moore , James Morris , "Serge E. Hallyn" , Stephen Smalley , Eric Paris , Richard Haines , Casey Schaufler , Xin Long , "David S. Miller" , Todd Kjos , Ondrej Mosnacek , Prashanth Prahlad , Micah Morton , Fenghua Yu , Andrei Vagin , linux-kernel@vger.kernel.org, apparmor@lists.ubuntu.com, linux-security-module@vger.kernel.org, selinux@vger.kernel.org, linux-hardening@vger.kernel.org, oleg@redhat.com In-Reply-To: (Mateusz Guzik's message of "Tue, 13 May 2025 15:05:45 +0200") References: <20221006082735.1321612-1-keescook@chromium.org> <20221006082735.1321612-2-keescook@chromium.org> <20221006090506.paqjf537cox7lqrq@wittgenstein> <86CE201B-5632-4BB7-BCF6-7CB2C2895409@chromium.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) Date: Tue, 13 May 2025 10:29:47 -0500 Message-ID: <87o6vw1qc4.fsf@email.froward.int.ebiederm.org> MIME-Version: 1.0 Content-Type: text/plain X-XM-SPF: eid=1uEraS-00FteB-TR;;;mid=<87o6vw1qc4.fsf@email.froward.int.ebiederm.org>;;;hst=in02.mta.xmission.com;;;ip=72.198.198.28;;;frm=ebiederm@xmission.com;;;spf=pass X-XM-AID: U2FsdGVkX1+f5aQcvtD7cJuBfQzEABOE535sdGOGPWo= Subject: Re: [PATCH 1/2] fs/exec: Explicitly unshare fs_struct on exec X-SA-Exim-Connect-IP: 166.70.13.52 X-SA-Exim-Rcpt-To: too long (recipient list exceeded maximum allowed size of 512 bytes) X-SA-Exim-Mail-From: ebiederm@xmission.com X-SA-Exim-Scanned: No (on out01.mta.xmission.com); SAEximRunCond expanded to false X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 800F920003 X-Stat-Signature: kdmo63mr4wrw7wwxw68yuzjrfcdegsh3 X-Rspam-User: X-HE-Tag: 1747150239-528637 X-HE-Meta: U2FsdGVkX197leE7BYFUEq60KbV/pW4BErieoyvsNa7GULF3uZXNA1d9pDVBMhh31COrlIUh0H0Retpz9+r6b1+es4MZBiRc/LH58M5nR7I4L5gWOwY/WN+gyVfFZuKH2ilanB4C6f0Q6KBBnGmTYLjXiCgCevY3LZZqGRTXEms2DIz/M5HqoHv0AJp7x4cHCQUZlpXY4mZod/r2DLCb7rtqX4bBzBWn+AJnURgkQtjAPmo4wMRKhhZTsSq8HSdamFqmurJkklzbt2gg0Lnum9eKKo/azWtgwCGFu9OHINy91aCVW4D0v73eCsjbN5AjlOBewIY7KLKACrHuCPJxD4fPB5lrhfw80yjSa5zsNSSLfh8KdOXNCp4IehAqL5P1021OxsaS8P8EooEV9IiV0bYyMDYZ2WGd30RM6HMOzFajcXy9k3q6CB7OJKrnjhdv5XAftbi8MeQe0YceqIEKNPWQWMY/G/YVw2kUvsHINbCtVnaIy7bc8OIlSCESQG4scYfdZgLW404RcTWtMsstxWkfi4zln8MSNHZh55FDXwGetxx3sycV5M5lrGP1EvBVA9ZnF8iLqYHUrutvJYgJ2A2774ZiRMdsiIWvaqBqG44/biXx+QGLZRuZSe6358mFNnaqj6UzBX4FkSUIMS61HokOXRhjWnHfJKp3TYJlnrKYB9xyEK0+XT0w2/XhReOUpE8iEJNCqXM3jE9bJhQkR0zoPUihGUDWDuxFfNXMf74B2aJHbcmzbkP9jklg8TclXn1DWazu2ztgB6hDDMQUaICqW7j8YhmKQs/WFDNXyV7e5cdjK+InhczGfwwCB7De9MeTr/ifqzOdc18LQYh1QyPXCgHJXt3io/zlKsvNIcKXFMfklKHnNebJjF8004hF/B/fcwFTltArbrvm0DhdS5SLZP8n/TyYAZE4DVHjRKaK2gJBaV10DgaEIjszBISSYZ48jJSGiNI9nxHCf3X 1/QzSQSX 7otuizhRFTU1FZw9os1uAclaCR90ZSTeKTRw1aYGk+ttbp6omizM7c08v3idEwPOEv0rne0gWacfSQE8ypYkjZajlVS7MXbOAOZgYiwEzP3qgFPrLFBNMXA8yrFmoSX2hT5Wlr44bc+ys+UwjzZLYhJJHGJ5x5ikyPjBTjmbo4zOSl6KCStdtKMz+x0uOE2u6O0ImBAKZ8+t/klp8Vp22rkRFmJd95DzL2fwOM7CMWG/2eXU= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Mateusz Guzik writes: > On Thu, Oct 06, 2022 at 08:25:01AM -0700, Kees Cook wrote: >> On October 6, 2022 7:13:37 AM PDT, Jann Horn wrote: >> >On Thu, Oct 6, 2022 at 11:05 AM Christian Brauner wrote: >> >> On Thu, Oct 06, 2022 at 01:27:34AM -0700, Kees Cook wrote: >> >> > The check_unsafe_exec() counting of n_fs would not add up under a heavily >> >> > threaded process trying to perform a suid exec, causing the suid portion >> >> > to fail. This counting error appears to be unneeded, but to catch any >> >> > possible conditions, explicitly unshare fs_struct on exec, if it ends up >> >> >> >> Isn't this a potential uapi break? Afaict, before this change a call to >> >> clone{3}(CLONE_FS) followed by an exec in the child would have the >> >> parent and child share fs information. So if the child e.g., changes the >> >> working directory post exec it would also affect the parent. But after >> >> this change here this would no longer be true. So a child changing a >> >> workding directoro would not affect the parent anymore. IOW, an exec is >> >> accompanied by an unshare(CLONE_FS). Might still be worth trying ofc but >> >> it seems like a non-trivial uapi change but there might be few users >> >> that do clone{3}(CLONE_FS) followed by an exec. >> > >> >I believe the following code in Chromium explicitly relies on this >> >behavior, but I'm not sure whether this code is in active use anymore: >> > >> >https://source.chromium.org/chromium/chromium/src/+/main:sandbox/linux/suid/sandbox.c;l=101?q=CLONE_FS&sq=&ss=chromium >> >> Oh yes. I think I had tried to forget this existed. Ugh. Okay, so back to the drawing board, I guess. The counting will need to be fixed... >> >> It's possible we can move the counting after dethread -- it seems the early count was just to avoid setting flags after the point of no return, but it's not an error condition... >> > > I landed here from git blame. > > I was looking at sanitizing shared fs vs suid handling, but the entire > ordeal is so convoluted I'm confident the best way forward is to whack > the problem to begin with. > > Per the above link, the notion of a shared fs struct across different > processes is depended on so merely unsharing is a no-go. > > However, the shared state is only a problem for suid/sgid. > > Here is my proposal: *deny* exec of suid/sgid binaries if fs_struct is > shared. This will have to be checked for after the execing proc becomes > single-threaded ofc. > > While technically speaking this does introduce a change in behavior, > there is precedent for doing it and seeing if anyone yells. > > With this in place there is no point maintainig ->in_exec or checking > the flag. > > There is the known example of depending on shared fs_struct across exec. > Hopefully there is no example of depending on execing a suid/sgid binary > in such a setting -- it would be quite a weird setup given that for > security reasons the perms must not be changed. > > The upshot of this method is that any breakage will be immediately > visible in the form of a failed exec. > > Another route would be to do the mandatory unshare but only for > suid/sgid, except that would have a hidden failure (if you will). > > Comments? What is the problem that is trying to be fixed? A uapi change to not allow sharing a fs_struct for processes that change their cred on exec seems possible. I said changing cred instead of suid/sgid because there are capabilities and LSM labels that we probably want this to apply to as well. I think such a limitation can be justified based upon having a shared fs_struct is likely to allow confuse suid executables. Earlier in the thread there was talk about the refcount for fs_struct. I don't see that problem at the moment, and I don't see how dealing with suid+sgid exectuables will have any bearing on how the refcount works. Eric