Date: Tue, 17 Dec 2024 14:18:53 -0800
From: Kees Cook
To: jeffxu@chromium.org
Cc: akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com,
    torvalds@linux-foundation.org, adhemerval.zanella@linaro.org,
    oleg@redhat.com, linux-kernel@vger.kernel.org,
    linux-hardening@vger.kernel.org, linux-mm@kvack.org,
    jorgelo@chromium.org, sroettger@google.com, ojeda@kernel.org,
    adobriyan@gmail.com, anna-maria@linutronix.de, mark.rutland@arm.com,
    linus.walleij@linaro.org, Jason@zx2c4.com, deller@gmx.de,
    rdunlap@infradead.org, davem@davemloft.net, hch@lst.de,
    peterx@redhat.com, hca@linux.ibm.com, f.fainelli@gmail.com,
    gerg@kernel.org, dave.hansen@linux.intel.com, mingo@kernel.org,
    ardb@kernel.org, Liam.Howlett@oracle.com, mhocko@suse.com,
    42.hyeyoo@gmail.com, peterz@infradead.org, ardb@google.com,
    enh@google.com, rientjes@google.com, groeck@chromium.org,
    mpe@ellerman.id.au, Vlastimil Babka, Lorenzo Stoakes, Andrei Vagin,
    Dmitry Safonov <0x7f454c46@gmail.com>, Mike Rapoport,
    Alexander Mikhalitsyn
Subject: Re: [PATCH v4 1/1] exec: seal system mappings
Message-ID: <202412171248.409B10D@keescook>
References: <20241125202021.3684919-1-jeffxu@google.com>
    <20241125202021.3684919-2-jeffxu@google.com>
In-Reply-To: <20241125202021.3684919-2-jeffxu@google.com>

On Mon, Nov 25, 2024 at 08:20:21PM +0000, jeffxu@chromium.org wrote:
> Seal vdso, vvar, sigpage, uprobes and vsyscall.
>
> Those mappings are readonly or executable only, sealing can protect
> them from ever changing or unmapped during the life time of the process.
> For complete descriptions of memory sealing, please see mseal.rst [1].
>
> System mappings such as vdso, vvar, and sigpage (for arm) are
> generated by the kernel during program initialization, and are
> sealed after creation.
> [...]
>
> +    exec.seal_system_mappings = [KNL]
> +            Format: { no | yes }
> +            Seal system mappings: vdso, vvar, sigpage, vsyscall,
> +            uprobe.
> +            - 'no': do not seal system mappings.
> +            - 'yes': seal system mappings.
> +            This overrides CONFIG_SEAL_SYSTEM_MAPPINGS=(y/n)
> +            If not specified or invalid, default is the value set by
> +            CONFIG_SEAL_SYSTEM_MAPPINGS.
> +            This option has no effect if CONFIG_64BIT=n

I know there is a v5 coming, but I wanted to give my thoughts to help
shape it based on the current discussion threads.

The callers of _install_special_mapping() cover what is mentioned here.
The vdso is very common (arm, arm64, csky, hexagon, loongarch, mips,
parisc, powerpc, riscv, s390, sh, sparc, x86, um). For those with vdso,
some also have vvar (arm, arm64, loongarch, mips, powerpc, riscv, s390,
sparc, x86).

After that, I see a few extra things, in addition to sigpage and uprobes
as mentioned already in the patch:

    arm     sigpage
    arm64   compat vectors (what is this for arm?)
    arm64   compat sigreturn (what is this for arm?)
    nios2   kuser helpers
    uprobes

As mentioned in the patch, there is also the x86_64 vsyscall mapping
which eludes a regular grep since it's not using
_install_special_mapping() :)

So I guess the question is: can we mseal all of these universally under
a common knob? Do the different uses mean we need finer granularity of
knob, and do different architectures need flexibility here too?

The patch handles the arch question with
CONFIG_ARCH_HAS_SEAL_SYSTEM_MAPPINGS (which I think will be renamed with
s/SEAL/MSEAL/ if I am following the threads). This seems a good solution
to me. My question is about if sigpage, vectors, and sigreturn can also
be included? (It seems like the answer is "yes", but I didn't see
mention of the arm64 compat mappings.)

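As a quick way to see which of these mappings a given process actually
has, the bracketed names in /proc/<pid>/maps can be matched. A minimal
userspace sketch, assuming the names [vdso], [vvar], [vsyscall],
[sigpage], [vectors] and [uprobes]; the helper and its name list are
illustrative only and not part of the patch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Names assumed to appear in /proc/<pid>/maps for the mappings discussed
 * in this thread. This list is an illustration, not something the patch
 * defines.
 */
static const char *const system_mapping_names[] = {
	"[vdso]", "[vvar]", "[vsyscall]", "[sigpage]", "[vectors]", "[uprobes]",
};

/* Return true if a maps line ends with one of the names above. */
static bool is_system_mapping(const char *maps_line)
{
	size_t count = sizeof(system_mapping_names) / sizeof(system_mapping_names[0]);
	size_t len = strlen(maps_line);

	/* Ignore a trailing newline, as left in place by fgets(). */
	if (len > 0 && maps_line[len - 1] == '\n')
		len--;

	for (size_t i = 0; i < count; i++) {
		size_t nlen = strlen(system_mapping_names[i]);

		/* Compare only the tail of the line against the name. */
		if (len >= nlen &&
		    strncmp(maps_line + len - nlen, system_mapping_names[i], nlen) == 0)
			return true;
	}
	return false;
}
```

Feeding it each line read with fgets() from /proc/self/maps would list
the candidates on the running architecture.
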
Linus has expressed the desire that security features be available by
default if they don't break existing userspace and that they be compiled
in if possible (rather than be behind a CONFIG) so that code paths are
being exercised to gain the most exposure to finding bugs. To that end,
it's best to have a kernel command line to control it if it isn't safe
to have always enabled. This is how we've handled _many_ features so
that the code is built into the kernel, but that end users (e.g. distro
users) can enable/disable a feature without rebuilding the entire
kernel.

For a "built into the kernel but default disabled unless enabled at boot
time" example see:

    config RANDOMIZE_KSTACK_OFFSET
            bool "Support for randomizing kernel stack offset on syscall entry" if EXPERT
            default y
            depends on HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
    ...
    config RANDOMIZE_KSTACK_OFFSET_DEFAULT
            bool "Default state of kernel stack offset randomization"
            depends on RANDOMIZE_KSTACK_OFFSET
    ...
    #ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
    DEFINE_STATIC_KEY_MAYBE_RO(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
                               randomize_kstack_offset);
    ...
    early_param("randomize_kstack_offset", early_randomize_kstack_offset);

For an example of the older "not built into the kernel but when built in
can be turned off at boot time" that predated Linus's recommendation
see:

    config HARDENED_USERCOPY
            bool "Harden memory copies between kernel and userspace"
    ...
    static DEFINE_STATIC_KEY_FALSE_RO(bypass_usercopy_checks);
    ...
    __setup("hardened_usercopy=", parse_hardened_usercopy);

(This should arguably be "default y" in the kernel these days, but
whatever.)

So, if we want to have a CONFIG_MSEAL_SYSTEM_MAPPINGS at all, it should
be "default y" since we have the ...ARCH_HAS...
config already, and then add a CONFIG_MSEAL_SYSTEM_MAPPINGS_DEFAULT that
is off by default (since we expect there may be userspace impact) and
tie _that_ to the kernel command-line so that end users can use it, or
system builders can enable CONFIG_MSEAL_SYSTEM_MAPPINGS_DEFAULT.

For the command line name, if a namespace is desired, I'd agree that
naming this "mseal.special_mappings" is reasonable. It does change
process behavior, so I'm also not opposed to
"process.mseal_special_mappings", and it happens at exec, so
"exec.mseal_special_mappings" is fine by me too. I think the main
question would be: will there be other things under the proposed
"mseal", "process", or "exec" namespace? I'd like to encourage things
being logically grouped since we have SO MANY already. :)

Also from discussions it sounds like there may need to be even
finer-grain control, likely via prctl, for dealing with the CRIU case.
The proposal is to provide an opt-out prctl with CAP_CHECKPOINT_RESTORE?
I think this is reasonable and lets this all work without a new CONFIG.
I imagine it would look like:

criu process (which has CAP_CHECKPOINT_RESTORE):
- prctl(GET_MSEAL_SYSTEM_MAPPINGS)
  - if set:
    - remember we need to mseal mappings
    - prctl(SET_MSEAL_SYSTEM_MAPPINGS, 0)
    - re-exec with --mseal-system-mappings (or something)
- perform the "fork a tree to restore" work
  - in each child, move around all the mappings
  - if we need to mseal mappings:
    - prctl(SET_MSEAL_SYSTEM_MAPPINGS, 1)
    - mseal each system mapping
  - eventually drop CAP_CHECKPOINT_RESTORE
  - become the restored process

Does that all sound right? If so I think Jeff has all the details needed
to spin a v5.

-Kees

-- 
Kees Cook
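
To sanity-check the ordering in the CRIU sequence above, here is a small
userspace model of it. Everything in it is hypothetical: the prctl
operations do not exist yet, so they are stood in for by plain functions
on a struct, and "sealed" is just a flag. It only shows that the
mappings can be moved if the policy is cleared before the re-exec, and
that clearing it needs the capability:

```c
#include <stdbool.h>

/* Hypothetical model of one task; none of this is real kernel state. */
struct task_model {
	bool cap_checkpoint_restore;	/* holds CAP_CHECKPOINT_RESTORE */
	bool seal_policy;		/* models the per-process opt-in bit */
	bool mappings_sealed;		/* system mappings currently sealed */
	bool mappings_moved;		/* restore has relocated the mappings */
};

/* Stands in for the proposed prctl(GET_MSEAL_SYSTEM_MAPPINGS). */
static bool model_get_policy(const struct task_model *t)
{
	return t->seal_policy;
}

/* Stands in for prctl(SET_MSEAL_SYSTEM_MAPPINGS, on); needs the capability. */
static int model_set_policy(struct task_model *t, bool on)
{
	if (!t->cap_checkpoint_restore)
		return -1;
	t->seal_policy = on;
	return 0;
}

/* Models exec: system mappings are sealed only if the policy is set. */
static void model_exec(struct task_model *t)
{
	t->mappings_sealed = t->seal_policy;
	t->mappings_moved = false;
}

/* Models relocating the system mappings: impossible once sealed. */
static int model_move_mappings(struct task_model *t)
{
	if (t->mappings_sealed)
		return -1;
	t->mappings_moved = true;
	return 0;
}

/* The restore sequence from the list above, in the same order. */
static int model_restore(struct task_model *t)
{
	bool need_seal = model_get_policy(t);

	if (need_seal) {
		if (model_set_policy(t, false) != 0)
			return -1;
		model_exec(t);	/* re-exec so the new image is unsealed */
	}
	if (model_move_mappings(t) != 0)
		return -1;
	if (need_seal) {
		if (model_set_policy(t, true) != 0)
			return -1;
		t->mappings_sealed = true;	/* "mseal each system mapping" */
	}
	t->cap_checkpoint_restore = false;	/* drop the capability */
	return 0;
}
```

In this model a task without the capability fails at the first
SET step, which matches the intent that only a privileged restorer can
opt out.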