Date: Fri, 25 Apr 2025 12:56:21 -0700
From: Kees Cook
To: Catalin Marinas
Cc: Ryan Roberts, Thomas Weißschuh, Linux Kernel Mailing List, Linux-MM,
    "linux-arm-kernel@lists.infradead.org"
Subject: Re: BUG: vdso changes expose elf mapping issue
Message-ID: <202504251158.D3D342410@keescook>

On Fri, Apr 25, 2025 at 07:37:38PM +0100, Catalin Marinas wrote:
> On Fri, Apr 25, 2025 at 01:41:31PM +0100, Ryan Roberts wrote:
> > ldconfig is a statically linked, PIE executable.
> > The kernel treats this as an interpreter and therefore does not map it
> > into low memory but instead maps it into high memory using mmap() (mmap
> > is top-down on arm64). Once it's mapped, vvar/vdso gets mapped and fills
> > the hole right at the top that is left due to ldconfig's alignment
> > requirements. Before the above change, there were 2 pages free between
> > the end of the data segment and vvar; this was enough for ldconfig to
> > get its required memory with brk(). But after the change there is no
> > space:
> >
> > Before:
> > fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426  /home/ubuntu/glibc-2.35/build/elf/ldconfig
> > fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426  /home/ubuntu/glibc-2.35/build/elf/ldconfig
> > fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0
> > fffff7ffc000-fffff7ffe000 r--p 00000000 00:00 0        [vvar]
> > fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0        [vdso]
> > fffffffdf000-1000000000000 rw-p 00000000 00:00 0       [stack]
> >
> > After:
> > fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426  /home/ubuntu/glibc-2.35/build/elf/ldconfig
> > fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426  /home/ubuntu/glibc-2.35/build/elf/ldconfig
> > fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0
> > fffff7ffa000-fffff7ffe000 r--p 00000000 00:00 0        [vvar]
> > fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0        [vdso]
> > fffffffdf000-1000000000000 rw-p 00000000 00:00 0       [stack]
>
> It does look like we've just been lucky so far. An ELF file requiring a
> slightly larger brk (by two pages) could fail.
> FWIW, briefly after commit 9630f0d60fec ("fs/binfmt_elf: use PT_LOAD
> p_align values for static PIE"), we got:
>
>   Start Addr      End Addr         Size     Offset   Perms  objfile
>   0xaaaaaaaa0000  0xaaaaaab5d000   0xbd000  0x0      r-xp   /usr/sbin/ldconfig
>   0xaaaaaab6b000  0xaaaaaab73000   0x8000   0xcb000  rw-p   /usr/sbin/ldconfig
>   0xaaaaaab73000  0xaaaaaab78000   0x5000   0x0      rw-p   [heap]
>   0xfffff7ffd000  0xfffff7fff000   0x2000   0x0      r--p   [vvar]
>   0xfffff7fff000  0xfffff8000000   0x1000   0x0      r-xp   [vdso]
>   0xfffffffdf000  0x1000000000000  0x21000  0x0      rw-p   [stack]
>
> This looks like a better layout to me when you load an ET_DYN file
> without PT_INTERP.

The trouble is that a !PT_INTERP binary must be loaded out of the way of
the binary it may load, so it cannot be loaded low.

> When the commit was reverted by aeb7923733d1 ("revert "fs/binfmt_elf:
> use PT_LOAD p_align values for static PIE""), we went back to:
>
>   Start Addr      End Addr         Size     Offset   Perms  objfile
>   0xfffff7f28000  0xfffff7fe5000   0xbd000  0x0      r-xp   /usr/sbin/ldconfig
>   0xfffff7ff0000  0xfffff7ff2000   0x2000   0x0      r--p   [vvar]
>   0xfffff7ff2000  0xfffff7ff3000   0x1000   0x0      r-xp   [vdso]
>   0xfffff7ff3000  0xfffff7ffb000   0x8000   0xcb000  rw-p   /usr/sbin/ldconfig
>   0xfffff7ffb000  0xfffff8000000   0x5000   0x0      rw-p   [heap]
>   0xfffffffdf000  0x1000000000000  0x21000  0x0      rw-p   [stack]

The revert happened because, among various other problems, this low load
would collide with things. The static PIE alignment was finally fixed with
commit 3545deff0ec7 ("binfmt_elf: Honor PT_LOAD alignment for static PIE").

The ultimate brk location is determined near the end of load_elf_binary()
(see the code surrounding the comment "Otherwise leave a gap").

> With 6.15-rc3 my layout looks like Ryan's but in 5.18 above, the vdso is
> small enough and it's squeezed between the two ldconfig sections.

I think there are two surprises:

- For loaders (ET_DYN without PT_INTERP, which is also "static PIE") the
  brk location is being moved to ELF_ET_DYN_BASE ... *but only when ASLR
  is enabled*.
  I think this exclusion is the primary bug, with its origin in commit
  bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing direct loader
  exec"). I failed to explain at the time why I made it happen only under
  ASLR, but I think I was trying to be conservative and not change things
  too much.

- vdso can get loaded into _gaps_ in the ELF. I think this is asking for
  trouble, but technically should be okay since neither can grow. But I
  never like seeing immediately adjacent unrelated mappings, since we
  always end up with bugs (see things like commit 2a5eb9995528
  ("binfmt_elf: Leave a gap between .bss and brk")).

To fix the former, the below change might work (totally untested yet, I
just wanted to reply with my thoughts as I start testing this). Pardon the
goofy code style, I wanted a minimal diff here:

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 7e2afe3220f7..9290a29ede28 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1284,7 +1284,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
 	mm->end_data = end_data;
 	mm->start_stack = bprm->p;
 
-	if ((current->flags & PF_RANDOMIZE) && (snapshot_randomize_va_space > 1)) {
+	{
 		/*
 		 * For architectures with ELF randomization, when executing
 		 * a loader directly (i.e. no interpreter listed in ELF
@@ -1299,7 +1299,9 @@ static int load_elf_binary(struct linux_binprm *bprm)
 			/* Otherwise leave a gap between .bss and brk. */
 			mm->brk = mm->start_brk = mm->brk + PAGE_SIZE;
 		}
+	}
 
+	if ((current->flags & PF_RANDOMIZE) && (snapshot_randomize_va_space > 1)) {
 		mm->brk = mm->start_brk = arch_randomize_brk(mm);
 #ifdef compat_brk_randomized
 		current->brk_randomized = 1;

> > Note that this issue only occurs with ASLR disabled. When ASLR is
> > enabled, the brk region is set up in the low memory region that would
> > normally be used by the primary executable.

Out of curiosity, why are you running without ASLR?

Thanks for the report! I'll continue testing the above fix.
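
To make the intent of the diff above concrete, here is a minimal userspace
sketch (not kernel code) of where brk would start before any randomization
is applied. The function names, MODEL_PAGE_SIZE, and the sample addresses
are hypothetical. The "before" behavior follows the description in this
mail, the model assumes an architecture with ELF randomization, and
arch_randomize_brk() is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative page size; the kernel uses PAGE_SIZE of the running config. */
#define MODEL_PAGE_SIZE 0x1000ULL

/*
 * Hypothetical model of the current behavior: brk is only moved (or given
 * a one-page gap) when ASLR is enabled. "is_loader" stands for an ET_DYN
 * image without PT_INTERP that is executed directly.
 */
uint64_t model_brk_before_patch(uint64_t end_of_bss, uint64_t et_dyn_base,
                                bool is_loader, bool aslr_enabled)
{
    if (!aslr_enabled)
        return end_of_bss; /* brk stays directly after the ELF image */
    return is_loader ? et_dyn_base : end_of_bss + MODEL_PAGE_SIZE;
}

/*
 * Hypothetical model of the proposed change: the relocation (or the gap)
 * happens unconditionally; only the randomization step, which is not
 * modelled here, would still depend on ASLR.
 */
uint64_t model_brk_after_patch(uint64_t end_of_bss, uint64_t et_dyn_base,
                               bool is_loader, bool aslr_enabled)
{
    (void)aslr_enabled; /* no longer affects the base placement */
    return is_loader ? et_dyn_base : end_of_bss + MODEL_PAGE_SIZE;
}

/* Small self-check using the end of ldconfig's data from the maps above. */
void model_brk_selfcheck(void)
{
    const uint64_t end_of_bss = 0xfffff7ffa000ULL;  /* from Ryan's layout */
    const uint64_t et_dyn_base = 0xaaaaaaaa0000ULL; /* placeholder value */

    /* ASLR off, direct loader exec: brk is left wedged below vvar. */
    assert(model_brk_before_patch(end_of_bss, et_dyn_base, true, false)
           == end_of_bss);
    /* With the change, the same case moves brk to the ET_DYN base. */
    assert(model_brk_after_patch(end_of_bss, et_dyn_base, true, false)
           == et_dyn_base);
}
```

In this model, the ASLR-off loader case is the only one where brk ends up
directly below vvar/vdso, which matches the layout in the report.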
Just to make sure I can reproduce your issue exactly: is this on a regular
arm64 install of Ubuntu 22.04?

-Kees

-- 
Kees Cook