Date: Thu, 23 Jan 2025 22:38:32 +0000
From: Matthew Wilcox
To: enh
Cc: Vlastimil Babka, "Liam R. Howlett", Jeff Xu, Pedro Falcato,
    Benjamin Berg, Lorenzo Stoakes, Kees Cook, akpm@linux-foundation.org,
    jannh@google.com, torvalds@linux-foundation.org,
    adhemerval.zanella@linaro.org, oleg@redhat.com,
    linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
    linux-mm@kvack.org, jorgelo@chromium.org, sroettger@google.com,
    ojeda@kernel.org, adobriyan@gmail.com, anna-maria@linutronix.de,
    mark.rutland@arm.com, linus.walleij@linaro.org, Jason@zx2c4.com,
    deller@gmx.de, rdunlap@infradead.org, davem@davemloft.net, hch@lst.de,
    peterx@redhat.com, hca@linux.ibm.com, f.fainelli@gmail.com,
    gerg@kernel.org, dave.hansen@linux.intel.com, mingo@kernel.org,
    ardb@kernel.org, mhocko@suse.com, 42.hyeyoo@gmail.com,
    peterz@infradead.org, ardb@google.com, rientjes@google.com,
    groeck@chromium.org, mpe@ellerman.id.au, Andrei Vagin,
    Dmitry Safonov <0x7f454c46@gmail.com>, Mike Rapoport,
    Alexander Mikhalitsyn, Christopher Ferris
Subject: Re: [PATCH v4 1/1] exec: seal system mappings
References: <2e5de601da34342d8eb0d8319dcf81ff213c7ef0.camel@sipsolutions.net>
    <881c3558-1101-496e-9ef4-5bef13f3f233@suse.cz>

On Thu, Jan 23, 2025 at 04:50:46PM -0500, enh wrote:
> yeah, at this point i should (a) drag in +cferris who may have actual
> experience of this and (b) admit that iirc i've never personally seen
> _evidence_ of this, just claims. most famously in the chrome source...
> if you `grep -r /proc/.*/maps` you'll find lots of examples, but
> something like https://chromium.googlesource.com/chromium/src/+/main/base/debug/proc_maps_linux.h#61
> is quite representative of the "folklore" in this area.

That folklore is 100% based on a true story!  I'm not sure that all of
the details are precisely correct, but it's true enough that I wouldn't
quibble with it.

In fact, we want to make it worse.  Because the mmap_lock is such a huge
point of contention, we want to read /proc/PID/maps protected only by RCU.
That will relax the guarantees to:

a. If a VMA existed and was not modified for the duration of the read,
   it will definitely be returned.
b. If a VMA was added during the call, it might be returned.
c. If a VMA was removed during the call, it might be returned.
d. If an address was covered by a VMA before the call and that VMA was
   modified during the call, you might get the prior or posterior state
   of the VMA.  And you might get both!

What might be confusing:

e. If VMA A is added, then VMA B is added, your call might show you
   VMA B and not VMA A.
f. Similarly for deleted VMAs.
g. If you have, say, a VMA from (4000-9000) and you mprotect the region
   (5000-6000), you might see:

   4000-9000 oldA
or
   4000-5000 newA
   4000-9000 oldA
or
   4000-5000 newA
   5000-6000 newB
   4000-9000 oldA
or
   4000-5000 newA
   5000-6000 newB
   6000-9000 newC

(it's possible other combinations might be visible; I'm not working on
the details of this right now)

We shouldn't be able to _skip_ a VMA.  That seems far worse than
returning duplicates; if your maps parser sees duplicates it can either
try to figure it out itself, or retry the whole read.
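
The "retry the whole read" option can be sketched in userspace roughly as
follows.  This is only an illustration, not anything from the patch or from
an existing library: the names Mapping, parse_maps, has_overlap and
read_maps_consistent are hypothetical, and the sketch assumes that a
duplicate shows up as two entries covering a common address (as in
example g).  It does nothing about cases b, c, e and f, where a snapshot
with no overlaps may still be stale.

```python
import re
from typing import Callable, List, NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    rest: str


# "start-end perms ..." with hexadecimal addresses, as in /proc/PID/maps.
_LINE = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S+)\s*(.*)$")


def parse_maps(text: str) -> List[Mapping]:
    """Parse maps-style text into Mapping tuples, preserving file order."""
    mappings = []
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if m is None:
            continue  # skip blank or unrecognised lines
        mappings.append(
            Mapping(int(m.group(1), 16), int(m.group(2), 16),
                    m.group(3), m.group(4))
        )
    return mappings


def has_overlap(mappings: List[Mapping]) -> bool:
    """Return True if any two entries cover a common address."""
    ordered = sorted(mappings, key=lambda m: (m.start, m.end))
    # After sorting by start, any overlap implies an overlapping adjacent pair.
    return any(prev.end > cur.start for prev, cur in zip(ordered, ordered[1:]))


def read_maps_consistent(read: Callable[[], str],
                         max_tries: int = 5) -> List[Mapping]:
    """Re-read until a snapshot has no overlapping entries, or give up."""
    for _ in range(max_tries):
        mappings = parse_maps(read())
        if not has_overlap(mappings):
            return mappings
    raise RuntimeError("maps kept changing; gave up after %d tries" % max_tries)
```

A caller on Linux might pass `lambda: open("/proc/self/maps").read()` as the
`read` argument; the retry limit of 5 is an arbitrary choice for the sketch.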