Date: Fri, 27 Aug 2021 12:18:52 +0200
From: Christian Brauner
To: David Hildenbrand
Cc: Andy Lutomirski, Linus Torvalds, "Eric W. Biederman", David Laight,
 Linux Kernel Mailing List, Andrew Morton, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, "H. Peter Anvin", Al Viro, Alexey Dobriyan,
 Steven Rostedt, "Peter Zijlstra (Intel)", Arnaldo Carvalho de Melo,
 Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Petr Mladek,
 Sergey Senozhatsky, Andy Shevchenko, Rasmus Villemoes, Kees Cook,
 Greg Ungerer, Geert Uytterhoeven, Mike Rapoport, Vlastimil Babka,
 Vincenzo Frascino, Chinwen Chang, Michel Lespinasse, Catalin Marinas,
 "Matthew Wilcox (Oracle)", Huang Ying, Jann Horn, Feng Tang,
 Kevin Brodsky, Michael Ellerman, Shawn Anastasio, Steven Price,
 Nicholas Piggin, Jens Axboe, Gabriel Krisman Bertazi, Peter Xu,
 Suren Baghdasaryan, Shakeel Butt, Marco Elver, Daniel Jordan,
 Nicolas Viennot, Thomas Cedeno, Collin Fijalkovich, Michal Hocko,
 Miklos Szeredi, Chengguang Xu, Christian König,
 "linux-unionfs@vger.kernel.org", Linux API, the arch/x86 maintainers,
 linux-fsdevel@vger.kernel.org, Linux-MM, Florian Weimer, Michael Kerrisk
Subject: Re: [PATCH v1 0/7] Remove in-tree usage of MAP_DENYWRITE
Message-ID: <20210827101852.7vbb2pqqyixqzd3b@wittgenstein>
In-Reply-To: <0ed69079-9e13-a0f4-776c-1f24faa9daec@redhat.com>

On Thu, Aug 26, 2021 at 11:47:07PM +0200, David Hildenbrand wrote:
> On 26.08.21 19:48, Andy Lutomirski wrote:
> > On Fri, Aug 13, 2021, at 5:54 PM, Linus Torvalds wrote:
> > > On Fri, Aug 13, 2021 at 2:49 PM Andy Lutomirski wrote:
> > > >
> > > > I’ll bite. How about we attack this in the opposite direction: remove the deny write mechanism entirely.
> > >
> > > I think that would be ok, except I can see somebody relying on it.
> > >
> > > It's broken, it's stupid, but we've done that ETXTBUSY for a _loong_ time.
> >
> > Someone off-list just pointed something out to me, and I think we should push harder to remove ETXTBSY. Specifically, we've all been focused on open() failing with ETXTBSY, and it's easy to make fun of anyone opening a running program for write when they should be unlinking and replacing it.
> >
> > Alas, Linux's implementation of deny_write_access() is correct^Wabsurd, and deny_write_access() *also* returns ETXTBSY if the file is open for write. So, in a multithreaded program, one thread does:
> >
> > fd = open("some exefile", O_RDWR | O_CREAT | O_CLOEXEC);
> > write(fd, some stuff);
> >
> > <--- problem is here
> >
> > close(fd);
> > execve("some exefile");
> >
> > Another thread does:
> >
> > fork();
> > execve("something else");
> >
> > In between fork and execve, there's another copy of the open file description, and i_writecount is held, and the execve() fails. Whoops. See, for example:
> >
> > https://github.com/golang/go/issues/22315
> >
> > I propose we get rid of deny_write_access() completely to solve this.
> >
> > Getting rid of i_writecount itself seems a bit harder, since a handful of filesystems use it for clever reasons.
> >
> > (OFD locks seem like they might have the same problem. Maybe we should have a clone() flag to unshare the file table and close close-on-exec things?)
> >
>
> It's not like this issue is new (^2017) or relevant in practice. So no need
> to hurry IMHO. One step at a time: it might make perfect sense to remove
> ETXTBSY, but we have to be careful to not break other user space that
> actually cares about the current behavior in practice.

I agree. As I at least tried to show, removing write-protection can make
some exploits easier. I'm all for trying to remove this if it simplifies
things, but for sure this shouldn't be part of this patchset and we
should be careful about it.

The removal of a (misguided or only partially functioning) protection
mechanism doesn't introduce but removes a failure point. And I don't
think removal and addition of a failure point usually have the same
consequences.

Introducing a new failure point will often mean userspace quickly
detects regressions. Such regressions are pretty common due to security
fixes we introduce. Recent examples include [1]. Right after this was
merged, the regression was reported.

But when allowing behavior that used to fail, like ETXTBSY, it can be
difficult for userspace to detect such regressions. The reason for that
is quite often that userspace applications don't tend to do something
that they know upfront will fail. Attackers, however, might.

[1]: bfb819ea20ce ("proc: Check /proc/$pid/attr/ writes against file opener")

Christian
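
For illustration, a minimal sketch of the write protection discussed above, assuming a Linux kernel that still enforces it: opening the currently running executable for write is expected to fail with ETXTBSY. The helper name open_running_exe_for_write() is hypothetical, and /proc/self/exe is Linux-specific. If the binary is not writable by the caller, the open would likely fail earlier with EACCES instead.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open the currently running executable for
 * writing. Returns 0 if the open succeeded (no write protection in effect),
 * otherwise the errno reported by open().
 */
int open_running_exe_for_write(void)
{
    /* /proc/self/exe resolves to the binary this process is executing. */
    int fd = open("/proc/self/exe", O_WRONLY);

    if (fd >= 0) {
        /* The kernel let us open a running executable for write. */
        close(fd);
        return 0;
    }
    return errno;
}
```

A caller can compare the return value against ETXTBSY. A userspace program that already knows this open will fail has little reason to attempt it, which is why a change that makes it succeed could go unnoticed by ordinary applications.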