From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A5ABC2D0CD for ; Thu, 15 May 2025 16:48:25 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5D95C6B00A4; Thu, 15 May 2025 12:48:24 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 586ED6B00A5; Thu, 15 May 2025 12:48:24 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4278A6B00A6; Thu, 15 May 2025 12:48:24 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 222046B00A4 for ; Thu, 15 May 2025 12:48:24 -0400 (EDT) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 31CE3801FE for ; Thu, 15 May 2025 16:48:24 +0000 (UTC) X-FDA: 83445725328.22.ACE5AAA Received: from out01.mta.xmission.com (out01.mta.xmission.com [166.70.13.231]) by imf28.hostedemail.com (Postfix) with ESMTP id B74F1C0005 for ; Thu, 15 May 2025 16:48:21 +0000 (UTC) Authentication-Results: imf28.hostedemail.com; dkim=none; spf=pass (imf28.hostedemail.com: domain of ebiederm@xmission.com designates 166.70.13.231 as permitted sender) smtp.mailfrom=ebiederm@xmission.com; dmarc=pass (policy=none) header.from=xmission.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1747327702; a=rsa-sha256; cv=none; b=TX2UKn2EJSpSdru5BEeutsMlOiBF5a/ML9TNntnmPj25yFo21cLJKMDPhz8412+kNE/Snh AiGxLue/0umVh0bxiLSsmKi04MkvE6NjfSe5x6eY4+DIpGXkcRITtrO4nmrdvxJXOy8QpM oR2FTWA2axO76svQQV4nb5EQOxk5U/M= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=none; spf=pass (imf28.hostedemail.com: domain of ebiederm@xmission.com designates 166.70.13.231 as permitted sender) smtp.mailfrom=ebiederm@xmission.com; dmarc=pass (policy=none) header.from=xmission.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1747327702; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=g480AU4wSMBI6z70bwsXHvxNVLc84n8QNgR8Xslkiu8=; b=Y1/8ds/ag+VFM+wGCc2dIH7Pvz5fD6/GqfceY9EiDwz9kaw23ycPYAA1QQ9m7NKEf78PvQ rhjVwy8ygKtE5lS1NRAj/+B6ouDhOcDAJmed79TLlBebISJFvNv2O+17QqGsa4n3JNZsLI 1w0V9dTSU0vwp4VX9g9QswslOal1H+g= Received: from in01.mta.xmission.com ([166.70.13.51]:52516) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1uFbko-008tj4-0n; Thu, 15 May 2025 10:48:18 -0600 Received: from ip72-198-198-28.om.om.cox.net ([72.198.198.28]:60030 helo=email.froward.int.ebiederm.org.xmission.com) by in01.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1uFbkm-008mIa-JT; Thu, 15 May 2025 10:48:17 -0600 From: "Eric W. Biederman" To: Kees Cook Cc: Mateusz Guzik , Jann Horn , Christian Brauner , Jorge Merlino , Alexander Viro , Thomas Gleixner , Andy Lutomirski , Sebastian Andrzej Siewior , Andrew Morton , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, John Johansen , Paul Moore , James Morris , "Serge E. Hallyn" , Stephen Smalley , Eric Paris , Richard Haines , Casey Schaufler , Xin Long , "David S. Miller" , Todd Kjos , Ondrej Mosnacek , Prashanth Prahlad , Micah Morton , Fenghua Yu , Andrei Vagin , linux-kernel@vger.kernel.org, apparmor@lists.ubuntu.com, linux-security-module@vger.kernel.org, selinux@vger.kernel.org, linux-hardening@vger.kernel.org, oleg@redhat.com References: <20221006082735.1321612-1-keescook@chromium.org> <20221006082735.1321612-2-keescook@chromium.org> <20221006090506.paqjf537cox7lqrq@wittgenstein> <86CE201B-5632-4BB7-BCF6-7CB2C2895409@chromium.org> <871pss17hq.fsf@email.froward.int.ebiederm.org> <202505140822.6AB755B6@keescook> Date: Thu, 15 May 2025 11:48:07 -0500 In-Reply-To: <202505140822.6AB755B6@keescook> (Kees Cook's message of "Wed, 14 May 2025 08:42:22 -0700") Message-ID: <87bjrtrfaw.fsf@email.froward.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-XM-SPF: eid=1uFbkm-008mIa-JT;;;mid=<87bjrtrfaw.fsf@email.froward.int.ebiederm.org>;;;hst=in01.mta.xmission.com;;;ip=72.198.198.28;;;frm=ebiederm@xmission.com;;;spf=pass X-XM-AID: U2FsdGVkX1/dsw9mCdXAA3N8gGK7mkVPwWeN8eQaLZ4= Subject: Re: [PATCH 1/2] fs/exec: Explicitly unshare fs_struct on exec X-SA-Exim-Connect-IP: 166.70.13.51 X-SA-Exim-Rcpt-To: too long (recipient list exceeded maximum allowed size of 512 bytes) X-SA-Exim-Mail-From: ebiederm@xmission.com X-SA-Exim-Scanned: No (on out01.mta.xmission.com); SAEximRunCond expanded to false X-Rspam-User: X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: B74F1C0005 X-Stat-Signature: 4szrt583ynnbjdy8huoubb7get3btykb X-HE-Tag: 1747327701-694679 X-HE-Meta: U2FsdGVkX18dnotYgXzWn7gS51V1+0aMgA/1x2ByxqwYqhoEKAe23CsHL0tD7z0I4qj8M97lwkGbdy/JtWEUNIGLV190G4Bc0TvgRXSbjbZ60mX1Z0Gv0aSkyDBTi8/LB5YcNfobw4+AA2/EOqIMeK+/5JCkPowV6Xls1pSbcpOFZNWedVURZzgUbAEFUyyj12FyjPQZdrXiApnES+HrO//FB/6PnDLjM8suN0frJHeg4Sqir/2+uBVi5rdZ+FxHYWNF04jNTepDyk2FC+YAdD70jvleqeoAF6MkPaJYaG0ZsO9wW+Hsyh23hP5WtPSm47R9IUV181+tqcmyQJOjIlmjl3Cfsimjww4U5Arjm1srb9vDLVMKGiBJJEadcGbO3ZhbZ9mHY45GoHNYu8A94R3qdqJFPrsPvDtb9mL6GQKYsJR9nwYG/qEewtz4WfPkGTyEpEptvB90pmUUya2+7RWVcJ/tkgLtBGKf49MKubBop3qTKjqkK6UuHiyrnwQdLy//A3koKH/QI7DNyGrClg8Uz/5qhaMAiiLe4qTKeykB5zgJEg0fWCd+bEpMe9GUs9mXgzxmJL9m3RMjIpBbs+cPKTJn8DfY56hMgkO72vbcMQzoMwZ/4NXsmDMj2dCmk+FTpC90AnVZ0XrWIwoGt7ddtbF6IxJLNUth4dsjghsZLwr9rO1V1VOcFbhsPHz7I3FyaVifEjEabEWNSr27JhJSaWgQGksDKRqI++KRRcRFI18w4Qw13W6vBUYI5KS8uU2ryj7ATOEtJoXkdCkHDpPT7/xAZqiw2fQ5GoTtCieGxJyfcE20ZwE54QFH0glSVSDcXoHaKuaY3RR2LcsrTK0Z/P7qldfiZGdp0SBZ6QawZHa5kTqAVJ7eWNJRYd4rQNUozxuDKyVvKs3RXKG7t8Vysn98jAXvhjfjsMBtfwUCJtIuJ5bg9iCIYwF7vfjy8119DHN6jPnI9Ye34xV 8aHu4dlk sw1KdBYVVHryiyiYkZRDuHFN+y5uDIP5hnBidzpX9IHSYLTBSheCn3THWSiaF/xRKU+hj75jM07ianzOfpgOM4UH9Ard3Zxn+zDFfXHZ+KyaPaIHSXoxM6W4j+UocQP9/YG4dtql68Z0AF920IRRHZ0zs6P9n0d6UwrUzFAlKzUzW92x38QxowNNLUhzoLuN0C4idNO+zwVl5rcsk3YQIZQC1u7+1YMlMWFX+xDJ15GnkA728nuLPcGN6PTY/b9sHOhpFJKmB84Pegw4LZ+OuhReVTP0c7/dL+zVvKCNorYNkbgnq10FE1u4SQRJ5Tfazfn6U X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Kees Cook writes: > On Wed, May 14, 2025 at 02:03:31AM +0200, Mateusz Guzik wrote: >> On Wed, May 14, 2025 at 12:17=E2=80=AFAM Eric W. Biederman >> wrote: >> > >> > Jann Horn writes: >> > >> > > On Tue, May 13, 2025 at 10:57=E2=80=AFPM Kees Cook = wrote: >> > >> On May 13, 2025 6:05:45 AM PDT, Mateusz Guzik w= rote: >> > >> >Here is my proposal: *deny* exec of suid/sgid binaries if fs_struc= t is >> > >> >shared. This will have to be checked for after the execing proc be= comes >> > >> >single-threaded ofc. >> > >> >> > >> Unfortunately the above Chrome helper is setuid and uses CLONE_FS. >> > > >> > > Chrome first launches a setuid helper, and then the setuid helper do= es >> > > CLONE_FS. Mateusz's proposal would not impact this usecase. >> > > >> > > Mateusz is proposing to block the case where a process first does >> > > CLONE_FS, and *then* one of the processes sharing the fs_struct does= a >> > > setuid execve(). Linux already downgrades such an execve() to be >> > > non-setuid, which probably means anyone trying to do this will get >> > > hard-to-understand problems. Mateusz' proposal would just turn this >> > > hard-to-debug edgecase, which already doesn't really work, into a >> > > clean error; I think that is a nice improvement even just from the >> > > UAPI standpoint. >> > > >> > > If this change makes it possible to clean up the kernel code a bit, = even better. >> > >> > What has brought this to everyone's attention just now? This is >> > the second mention of this code path I have seen this week. >> > >>=20 >> There is a syzkaller report concerning ->in_exec handling, for example: >> https://lore.kernel.org/all/67dc67f0.050a0220.25ae54.001f.GAE@google.com= /#t >> >> [...] >> > It looks like most of the lsm's also test bprm->unsafe. >> > >> > So I imagine someone could very carefully separate the non-ptrace case >> > from the ptrace case but *shrug*. >> > >> > Perhaps: >> > >> > if ((is_setid || __cap_gained(permitted, new_old)) && >> > ((bprm->unsafe & ~LSM_UNSAFE_PTRACE) || >> > !ptracer_capable(current, new->user_ns))) { >> > + if (!(bprm->unsafe & LSM_UNSAFE_PTRACE)) { >> > + return -EPERM; >> > + } >> > /* downgrade; they get no more than they had, and mayb= e less */ >> > if (!ns_capable(new->user_ns, CAP_SETUID) || >> > (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS)) { >> > new->euid =3D new->uid; >> > new->egid =3D new->gid; >> > } >> > new->cap_permitted =3D cap_intersect(new->cap_permitte= d, >> > old->cap_permitted); >> > } >> > >> > If that is what you want that doesn't look to scary. I don't think >> > it simplifies anything about fs->in_exec. As fs->in_exec is set when >> > the processing calling exec is the only process that owns the fs_struc= t. >> > With fs->in_exec just being a flag that doesn't allow another thread >> > to call fork and start sharing the fs_struct during exec. >> > >> > *Shrug* >> > >> > I don't see why anyone would care. It is just a very silly corner cas= e. >>=20 >> Well I don't see how ptrace factors into any of this, apart from being >> a different case of ignoring suid/sgid. > > I actually think we might want to expand the above bit of logic to use > an explicit tests of each LSM_UNSAFE case -- the merged > logic is very difficult to read currently. Totally untested expansion, > if I'm reading everything correctly: > > if (bprm->unsafe && > (is_setid || __cap_gained(permitted, new_old))) { > bool limit_caps =3D false; > bool strip_eid =3D false; > unsigned int unsafe =3D bprm->unsafe; > /* Check each bit */ > > if (unsafe & LSM_UNSAFE_PTRACE) { > if (!ptracer_capable(current, new->user_ns)) > limit_caps =3D true; strip_eid =3D true; You missed the euid stripping there. > unsafe &=3D ~LSM_UNSAFE_PTRACE; > } > if (unsafe & LSM_UNSAFE_SHARE) { > limit_caps =3D true; > if (!ns_capable(new->user_ns, CAP_SETUID)) > strip_eid =3D true; > unsafe &=3D ~LSM_UNSAFE_SHARE; > } > if (unsafe & LSM_UNSAFE_NO_NEW_PRIVS) { > limit_caps =3D true; > if (!ns_capable(new->user_ns, CAP_SETUID)) > strip_eid =3D true; > unsafe &=3D ~LSM_UNSAFE_NO_NEW_PRIVS; > } > > if (WARN(unsafe, "Unhandled LSM_UNSAFE flag: %u?!\n", unsafe)) > return -EINVAL; > > if (limit_caps) { > new->cap_permitted =3D cap_intersect(new->cap_permitted, > old->cap_permitted); > } > if (strip_eid) { > new->euid =3D new->uid; > new->egid =3D new->gid; > } > } I think I would simplify this all to: if ((id_changed || __cap_gained(permitted, new, old)) && !ptracer_capable(current->new_user_ns)) { if (!ns_capable(new->user_ns, CAP_SETUID)) { new->euid =3D old->euid; new->egid =3D old->egid; } new->cap_permitted =3D cap_intersect(new->cap_permitted, old->cap_permitted); } if ((id_changed || __cap_gained(permitted, new, old)) && (bprm->unsafe & ~LSM_UNSAFE_PTRACE)) { return -EPERM; } The code of no_new_privs doesn't prevent capset so ignoring the results of ns_capable when NO_NEW_PRIVS is set doesn't make sense. If we are going to do anything please let's find ways to understand what is happening and simplify this code, not add to it's complexity. Eric