From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2197C3ABAA for ; Fri, 2 May 2025 10:01:18 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B52596B008A; Fri, 2 May 2025 06:01:17 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B03516B008C; Fri, 2 May 2025 06:01:17 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9A2BA6B0092; Fri, 2 May 2025 06:01:17 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 7233B6B008A for ; Fri, 2 May 2025 06:01:17 -0400 (EDT) Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 5337D140F3F for ; Fri, 2 May 2025 10:01:17 +0000 (UTC) X-FDA: 83397524994.03.A850F53 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf07.hostedemail.com (Postfix) with ESMTP id 31F974001B for ; Fri, 2 May 2025 10:01:14 +0000 (UTC) Authentication-Results: imf07.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf07.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1746180075; a=rsa-sha256; cv=none; b=r4/5x62SWmIGAq6kbnC7M8I3zZmgRex//Nwa1tem1dUsqzL1KJdFHOGRG20ZAssEbVKWYU Kap794tefu+m6PCFCPKmq/sWk/5TTKJzqEIt8pcecxDUs3jvphSUjgIgQ1oZOb+FDCMGXj 72HJt+JKbq+kesZAiP8m9Kh/hi+OkXU= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1746180075; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=o3FL9hlGuJUzP0+wbnHlm4pw+Rpsu5m4n7fg9uf18TE=; b=O5AIdVu/vCjl2XdLdPJZeobMSjogdOeANvyEQEn/GAV5D+FmX/qFGj/7m7R3Lr7iag7Pqs EcUPecHff19qH8nqMGx+LsbeIy/ZtfeU3M8/sfxnJZjjqBYonWQLOp6GOvQsfCM6+GPyH1 uYfMr0eeuKpP8rcru4XIXKOQ5kxQdZw= ARC-Authentication-Results: i=1; imf07.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf07.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 69BCD168F; Fri, 2 May 2025 03:01:05 -0700 (PDT) Received: from [10.57.93.118] (unknown [10.57.93.118]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id A4C7B3F66E; Fri, 2 May 2025 03:01:10 -0700 (PDT) Message-ID: <87f80506-eeb3-4848-adc9-8a030b5f4136@arm.com> Date: Fri, 2 May 2025 11:01:07 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v2] binfmt_elf: Move brk for static PIE even if ASLR disabled To: Kees Cook Cc: Al Viro , Christian Brauner , Jan Kara , Ali Saidi , Andrew Morton , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-hardening@vger.kernel.org References: <20250502001820.it.026-kees@kernel.org> Content-Language: en-GB From: Ryan Roberts In-Reply-To: <20250502001820.it.026-kees@kernel.org> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 31F974001B X-Stat-Signature: 4ykw4oskzf574qwxmbsd4jobe4pyu1eg X-Rspam-User: X-HE-Tag: 1746180074-705122 X-HE-Meta: U2FsdGVkX1/VkHdYuujh01GyWF/GPaqP4EBHz5tZNagc+O5MD0aeECFRA9zTtSjhHSV/GFyvywhAqPlFYOFQUSBnNpJPP7cfrBvAfAvg0o6dWND4Za6knXn2QhCNGAqIzvaMir7W0POkqse3huBOsSKrvtiQCPUWSFjvPRLfkP+5M7sX1RSuxVuahniBygPczh3IpxtJ82MBT+/RfWlhRaCgVGCbaSnzL4I5D+WEYyCe//sqXMPNJxkjdIWf2pXtGigPD4Wx/vcXHdHFnWk/zazBgt1zZlH3Zw5vevLlVcIvaNEZQbGzkbQ5H5I+fTbHMer8J2tvgbQB9bxz2eN+pj6KJYcJ/BcjHGhPkLUkNgMtO5EYKalxf8ROCsSbb7Q/VQoKU72g1/HoIUfbD5HZGJvXeoWQF7ixZqoFkm5JZXrUwJtTv91K+4YJCTTLaL1C8zB/W9CKjdnprOkK3utvZYub03mqLEzkNfsze2hJ/qgHN9pVP0h1CADkfkDjs5Ph3LLT7y1D9BZu4dwlRGMNhWE+UovVljSaK1QfF5n7VLlTwcm1HIIQ3XwS/jJ6QRCLkPcTnUP6bjHryjGFF2iGGvaUMgsXTKpCwvjg7+MLn3HhdAKCKGnbJq1in2SkMCCNdMbqfzo4quyeYklYi/Q3gVLtZnwA9L9+18lMvUpKHZXM059WwZovd+PAcQQNuVbzFMiG9Bqstsi1rjbe6VUW+/AgAV3uiJn8Cg/F2LsUGWrhDkuIZNPKou+aIBV/B+0Ynx+egLzRQbga3QOVDIIj2tzltqGDwc1xwKxlQBw0GUXB4qD+hw0yznMLIiV3uC8GK40b6+6/DojpdEMBim9FigXfk9aRCvT8pJ0pw4dKaBpPeqYXaDMxsi6MiWFEP1zdZVrDidHIFGk= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 02/05/2025 01:18, Kees Cook wrote: > In commit bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing > direct loader exec"), the brk was moved out of the mmap region when > loading static PIE binaries (ET_DYN without INTERP). The common case > for these binaries was testing new ELF loaders, so the brk needed to > be away from mmap to avoid colliding with stack, future mmaps (of the > loader-loaded binary), etc. But this was only done when ASLR was enabled, > in an attempt to minimize changes to memory layouts. > > After adding support to respect alignment requirements for static PIE > binaries in commit 3545deff0ec7 ("binfmt_elf: Honor PT_LOAD alignment > for static PIE"), it became possible to have a large gap after the > final PT_LOAD segment and the top of the mmap region. This means that > future mmap allocations might go after the last PT_LOAD segment (where > brk might be if ASLR was disabled) instead of before them (where they > traditionally ended up). > > On arm64, running with ASLR disabled, Ubuntu 22.04's "ldconfig" binary, > a static PIE, has alignment requirements that leaves a gap large enough > after the last PT_LOAD segment to fit the vdso and vvar, but still leave > enough space for the brk (which immediately follows the last PT_LOAD > segment) to be allocated by the binary. > > fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426 /sbin/ldconfig.real > fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426 /sbin/ldconfig.real > fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0 > ***[brk will go here at fffff7ffa000]*** > fffff7ffc000-fffff7ffe000 r--p 00000000 00:00 0 [vvar] > fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0 [vdso] > fffffffdf000-1000000000000 rw-p 00000000 00:00 0 [stack] > > After commit 0b3bc3354eb9 ("arm64: vdso: Switch to generic storage > implementation"), the arm64 vvar grew slightly, and suddenly the brk > collided with the allocation. > > fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426 /sbin/ldconfig.real > fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426 /sbin/ldconfig.real > fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0 > ***[oops, no room any more, vvar is at fffff7ffa000!]*** > fffff7ffa000-fffff7ffe000 r--p 00000000 00:00 0 [vvar] > fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0 [vdso] > fffffffdf000-1000000000000 rw-p 00000000 00:00 0 [stack] > > The solution is to unconditionally move the brk out of the mmap region > for static PIE binaries. Whether ASLR is enabled or not does not change if > there may be future mmap allocation collisions with a growing brk region. > > Update memory layout comments (with kernel-doc headings), consolidate > the setting of mm->brk to later (it isn't needed early), move static PIE > brk out of mmap unconditionally, and make sure brk(2) knows to base brk > position off of mm->start_brk not mm->end_data no matter what the cause of > moving it is (via current->brk_randomized). > > For the CONFIG_COMPAT_BRK case, though, leave the logic unchanged, as we > can never safely move the brk. These systems, however, are not using > specially aligned static PIE binaries. > > Reported-by: Ryan Roberts > Closes: https://lore.kernel.org/lkml/f93db308-4a0e-4806-9faf-98f890f5a5e6@arm.com/ > Fixes: bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing direct loader exec") > Link: https://lore.kernel.org/r/20250425224502.work.520-kees@kernel.org > Signed-off-by: Kees Cook It's a shame we can't figure out a universal solution that would work for CONFIG_COMPAT_BRK too, but I can't think of anything. So as you say, let's worry about that in the unlikely event that an issue is reported. Reviewed-by: Ryan Roberts Tested-by: Ryan Roberts Thanks for sorting this out! Would be great to get it into v6.15. > --- > v2: exclude CONFIG_COMPAT_BRK (ryan) > v1: https://lore.kernel.org/all/20250425224502.work.520-kees@kernel.org/ > --- > fs/binfmt_elf.c | 71 ++++++++++++++++++++++++++++++++----------------- > 1 file changed, 47 insertions(+), 24 deletions(-) > > diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c > index 584fa89bc877..4c1ea6b52a53 100644 > --- a/fs/binfmt_elf.c > +++ b/fs/binfmt_elf.c > @@ -830,6 +830,7 @@ static int load_elf_binary(struct linux_binprm *bprm) > struct elf_phdr *elf_ppnt, *elf_phdata, *interp_elf_phdata = NULL; > struct elf_phdr *elf_property_phdata = NULL; > unsigned long elf_brk; > + bool brk_moved = false; > int retval, i; > unsigned long elf_entry; > unsigned long e_entry; > @@ -1097,15 +1098,19 @@ static int load_elf_binary(struct linux_binprm *bprm) > /* Calculate any requested alignment. */ > alignment = maximum_alignment(elf_phdata, elf_ex->e_phnum); > > - /* > - * There are effectively two types of ET_DYN > - * binaries: programs (i.e. PIE: ET_DYN with PT_INTERP) > - * and loaders (ET_DYN without PT_INTERP, since they > - * _are_ the ELF interpreter). The loaders must > - * be loaded away from programs since the program > - * may otherwise collide with the loader (especially > - * for ET_EXEC which does not have a randomized > - * position). For example to handle invocations of > + /** > + * DOC: PIE handling > + * > + * There are effectively two types of ET_DYN ELF > + * binaries: programs (i.e. PIE: ET_DYN with > + * PT_INTERP) and loaders (i.e. static PIE: ET_DYN > + * without PT_INTERP, usually the ELF interpreter > + * itself). Loaders must be loaded away from programs > + * since the program may otherwise collide with the > + * loader (especially for ET_EXEC which does not have > + * a randomized position). > + * > + * For example, to handle invocations of > * "./ld.so someprog" to test out a new version of > * the loader, the subsequent program that the > * loader loads must avoid the loader itself, so > @@ -1118,6 +1123,9 @@ static int load_elf_binary(struct linux_binprm *bprm) > * ELF_ET_DYN_BASE and loaders are loaded into the > * independently randomized mmap region (0 load_bias > * without MAP_FIXED nor MAP_FIXED_NOREPLACE). > + * > + * See below for "brk" handling details, which is > + * also affected by program vs loader and ASLR. > */ > if (interpreter) { > /* On ET_DYN with PT_INTERP, we do the ASLR. */ > @@ -1234,8 +1242,6 @@ static int load_elf_binary(struct linux_binprm *bprm) > start_data += load_bias; > end_data += load_bias; > > - current->mm->start_brk = current->mm->brk = ELF_PAGEALIGN(elf_brk); > - > if (interpreter) { > elf_entry = load_elf_interp(interp_elf_ex, > interpreter, > @@ -1291,27 +1297,44 @@ static int load_elf_binary(struct linux_binprm *bprm) > mm->end_data = end_data; > mm->start_stack = bprm->p; > > - if ((current->flags & PF_RANDOMIZE) && (snapshot_randomize_va_space > 1)) { > + /** > + * DOC: "brk" handling > + * > + * For architectures with ELF randomization, when executing a > + * loader directly (i.e. static PIE: ET_DYN without PT_INTERP), > + * move the brk area out of the mmap region and into the unused > + * ELF_ET_DYN_BASE region. Since "brk" grows up it may collide > + * early with the stack growing down or other regions being put > + * into the mmap region by the kernel (e.g. vdso). > + * > + * In the CONFIG_COMPAT_BRK case, though, everything is turned > + * off because we're not allowed to move the brk at all. > + */ > + if (!IS_ENABLED(CONFIG_COMPAT_BRK) && > + IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) && > + elf_ex->e_type == ET_DYN && !interpreter) { > + elf_brk = ELF_ET_DYN_BASE; > + /* This counts as moving the brk, so let brk(2) know. */ > + brk_moved = true; > + } > + mm->start_brk = mm->brk = ELF_PAGEALIGN(elf_brk); > + > + if ((current->flags & PF_RANDOMIZE) && snapshot_randomize_va_space > 1) { > /* > - * For architectures with ELF randomization, when executing > - * a loader directly (i.e. no interpreter listed in ELF > - * headers), move the brk area out of the mmap region > - * (since it grows up, and may collide early with the stack > - * growing down), and into the unused ELF_ET_DYN_BASE region. > + * If we didn't move the brk to ELF_ET_DYN_BASE (above), > + * leave a gap between .bss and brk. > */ > - if (IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) && > - elf_ex->e_type == ET_DYN && !interpreter) { > - mm->brk = mm->start_brk = ELF_ET_DYN_BASE; > - } else { > - /* Otherwise leave a gap between .bss and brk. */ > + if (!brk_moved) > mm->brk = mm->start_brk = mm->brk + PAGE_SIZE; > - } > > mm->brk = mm->start_brk = arch_randomize_brk(mm); > + brk_moved = true; > + } > + > #ifdef compat_brk_randomized > + if (brk_moved) > current->brk_randomized = 1; > #endif > - } > > if (current->personality & MMAP_PAGE_ZERO) { > /* Why this, you ask??? Well SVr4 maps page 0 as read-only,