From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 377F0D2F7E9 for ; Thu, 17 Oct 2024 03:43:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A58176B007B; Wed, 16 Oct 2024 23:43:18 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A084F6B0082; Wed, 16 Oct 2024 23:43:18 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8CFA76B0083; Wed, 16 Oct 2024 23:43:18 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 6E07D6B007B for ; Wed, 16 Oct 2024 23:43:18 -0400 (EDT) Received: from smtpin06.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 9BB1AC0D44 for ; Thu, 17 Oct 2024 03:43:06 +0000 (UTC) X-FDA: 82681698540.06.D7E1D17 Received: from mail-oa1-f49.google.com (mail-oa1-f49.google.com [209.85.160.49]) by imf24.hostedemail.com (Postfix) with ESMTP id 2540518000E for ; Thu, 17 Oct 2024 03:43:13 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=chromium.org header.s=google header.b=kPGpGVnb; spf=pass (imf24.hostedemail.com: domain of jeffxu@chromium.org designates 209.85.160.49 as permitted sender) smtp.mailfrom=jeffxu@chromium.org; dmarc=pass (policy=none) header.from=chromium.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1729136403; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=H2Yz5mj+Rbj1K0ofec49O0C3W0tcEFu1ZJqWT+jEV9A=; b=KTpXKKF93oOb27yzngd1bUaT2DdT1lXg1kp2oHV3oY9wABF+f3hR+jp8sCFe7yEyEcA6lD 55SOMIbKd2miW7D0LJ8HPv0XmJfITq29fD1rVfedcQnKQGvvmmwBSe1KzTGY7tX2+HRczU fEQjRix+uvqishLRUHLI478oOvnywtA= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=chromium.org header.s=google header.b=kPGpGVnb; spf=pass (imf24.hostedemail.com: domain of jeffxu@chromium.org designates 209.85.160.49 as permitted sender) smtp.mailfrom=jeffxu@chromium.org; dmarc=pass (policy=none) header.from=chromium.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1729136403; a=rsa-sha256; cv=none; b=yJp6i1FvKqIs3ZDOeXzk3SVXIgQA2WiGBhYZ8NbBJBZbXZYxbHgkuo41AZhP3uGolyFzPD Edfw1J8Bc8MAgXweodHM4l2Ggsn9m8CxJHXbj/W+j+Tsn328mCfm4zvxScOVJjKM5+sYbo p22QkR7gMG14n2Q8XTHb4chxwOg3Zkw= Received: by mail-oa1-f49.google.com with SMTP id 586e51a60fabf-2888bcc0f15so18940fac.0 for ; Wed, 16 Oct 2024 20:43:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1729136595; x=1729741395; darn=kvack.org; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=H2Yz5mj+Rbj1K0ofec49O0C3W0tcEFu1ZJqWT+jEV9A=; b=kPGpGVnbwe8eEhMa/B/MauXNPy1MxVwpwy87O4fVe73+Y8/BZduX3eWeK1q4nzQ1oL BwgU9HVFSnTBXtuuEvsGQnVlWmOL/kgGfWGSSPTAZyaFejD99A9ZcSj2mbJg6Iv988WQ dyPpbQVR3jVF5TYhdPx0akTnaXGbp+YO+uBeQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729136595; x=1729741395; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=H2Yz5mj+Rbj1K0ofec49O0C3W0tcEFu1ZJqWT+jEV9A=; b=q58QNNcbP87FFuk80hYTld0hHMdkD2sgKNNt5i+Ro+pG7BoWEz9DPysjMKtm8dkvqc kuXRD7UGPnbcZHAMu7tszENPYIQfD4ZH4EFvQBgxKFy5Mutc6lTiMpPeAfVM+3H/s+OL 0Q/cIvEEIupvf65VooK1BiQHE9AZn9hhWSy6CRvXOgrIuGpa18CDTGcQyq4SbnecgwZB mEpCG01e6q1/IpSLMMzonHAkwDi02HNuLhxW1PRR53Gw/II5FPwbOVaCOttav/PYT+xD zsxnkOd3fexsbbbHvMrGKUosUp7HlS6dOpKTuDECEeArKCL3VsUBuz5Mhx+KrckLkmLy tcAQ== X-Forwarded-Encrypted: i=1; AJvYcCWxMtSluBsHJGVSIPdrAOrNFyN9oIfA61mkXPTzuFX0kX477FylXIO4XykE032A3RfhsvrIba6JGw==@kvack.org X-Gm-Message-State: AOJu0Yy8Gu+7RJqqfqSuiTIvilVS8m03JAy3z9/7IrGivxs0B3sfeg/a pgfTReTJgU+dPjTrQ58SaJhnhRHll1qwC5Cn0JWw0im/EO6h5FtLh2zuRaYKVuDHqDihY+/n9t0 QBf8G9iohCWTbEpW5C5za7w3TiC07BmiTloGy X-Google-Smtp-Source: AGHT+IFIDwAMziEhtxc5Z3IqIkm/cNDE2Wqi7oMxVyC4pSWiLeauh/2ofoov3ZF4ZqPNUp5m2v24KlhvxjrxavNvb3k= X-Received: by 2002:a05:6870:c386:b0:288:2955:3efb with SMTP id 586e51a60fabf-2890c6ce2f0mr480823fac.8.1729136594968; Wed, 16 Oct 2024 20:43:14 -0700 (PDT) MIME-Version: 1.0 References: <20241014215022.68530-1-jeffxu@google.com> <20241014215022.68530-2-jeffxu@google.com> <6r5sxlhfujr2expiscsfpdjtraqlvy6k3cznmv25lo6usmyw7x@igmuywngc5xi> In-Reply-To: <6r5sxlhfujr2expiscsfpdjtraqlvy6k3cznmv25lo6usmyw7x@igmuywngc5xi> From: Jeff Xu Date: Wed, 16 Oct 2024 20:43:02 -0700 Message-ID: Subject: Re: [RFC PATCH v2 1/1] exec: seal system mappings To: "Liam R. Howlett" , jeffxu@chromium.org, akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com, torvalds@linux-foundation.org, adhemerval.zanella@linaro.org, oleg@redhat.com, linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org, linux-mm@kvack.org, jorgelo@chromium.org, sroettger@google.com, ojeda@kernel.org, adobriyan@gmail.com, anna-maria@linutronix.de, mark.rutland@arm.com, linus.walleij@linaro.org, Jason@zx2c4.com, deller@gmx.de, rdunlap@infradead.org, davem@davemloft.net, hch@lst.de, peterx@redhat.com, hca@linux.ibm.com, f.fainelli@gmail.com, gerg@kernel.org, dave.hansen@linux.intel.com, mingo@kernel.org, ardb@kernel.org, mhocko@suse.com, 42.hyeyoo@gmail.com, peterz@infradead.org, ardb@google.com, enh@google.com, rientjes@google.com, groeck@chromium.org, lorenzo.stoakes@oracle.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 2540518000E X-Stat-Signature: k5g4u6bdmzq37784oae8pyug5eo8xzjf X-Rspam-User: X-HE-Tag: 1729136593-143607 X-HE-Meta: U2FsdGVkX1+1h0NtXPXu4vhCQmOq8ck37VA+UGh0dTHn0B6VUtPvuYH5oTJbQ/L6EDO9gf56nbh+ji70orT/+b8/Xu66WIILwEYKGKNO/zyPSuC4uMquf6yPJXfGajcsa40gYbwp1wNdOst1bZzWIMRZyg2vhRMgbz2Ehcw+0Ioby/q3Vrz7FdB6vKddXixRV9gBx+MmYuPaT5IQUXF9v11AFzPeTlGqm24EpHIPvMk2oJICdpLTJ3BdAobvlIZlRkEe0NrnGWKmw1am/SBmKaz0EmY2yE8FF/Oq5WxuxcqE8L53IQexnnMdkMav0lk47yjKgHzcjn8kgX5YoMOYJDv0QWHnQuQCX/nnL1jj0uyWsoqyOThF4J7PBQL1ACleM5gqI2Ckg00vTB/1885uuZIZYTd+k3otT0PZ/k0nyHkVM3zjHK6wGLOWopcxdK9K9CHZUFtpTUI4CWsnJ/6pOC2GiYR6Gnwt9URO9pxuOavJAOGCQYHZikuzwyblt3aiuWjDsD/hhIDzvNYo3fJS5S/bPTq4ur4BvmOwDssPCa+JP7ukvk2OVRS/IXP41BdgBWRAF57W7hzguM/bE89B2XvoiqW+rcgCxmwHRX3VT4HeaTJljYAkbyvcK9jqDA90Wcqx9F0lSLXRfAylBxZkQ5/zfiRJICGEOO+hI3Yqi1PaN/6aF7T3VnJHB6vhcHI1bQYfh/vohOdw/z9TWp+EjNx3V1SZybDf3ArPSXjZ1pHU3/DTUMAdEHTKYL5qleZfxYRNSimMbfXL/w7Yh9xPc+reHSMDo2ljo9DUtsV7PNKZXa+8CEyWjO9TTfHEiS+UKEYa+xeATcYFWExTRM/KD54qnojAzjWd2NtSykN/DtxpC8tAbPqevqe9VkTUWlSkGqtD1tkebTIddy1EWezbai+Bs94jdQO8WUdkD1drc1/01N4qkFtC51UkHDcr/2vAcWqXRH3w7gSvyyDX7NZ HfmQWH7N r5rCLNIx9qYdPjltSGXAfrwsDs4IgTD02z0jYMukWoIp7vrCcAWm89lVjFXrcm1a9yPnRNuIJIXW5TZxu2mGqcF6mPOoBTWVVZ/rCBLW8RVuVI8SuTuChjKr4bNpt3GSPiUICoM5HxyVWccLfGZuWHTPpWaBIWR3Z5kb0DsRLfN4ozKmvm2w1PvxX871Ixu4DqtmqjJuEd5vRQY8ddkSfIay9xMEnLW32zE+qL4vQILhwShCq2pVRPYdrg0wPetOZcE0Gums9/bl1o4qHoz7K5Qd0awNnHEMrgX7JlVqlr+/dBRo= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Oct 16, 2024 at 6:10=E2=80=AFPM Liam R. Howlett wrote: > > > + exec.seal_system_mappings =3D [KNL] > > + Format: { never | always } > > + Seal system mappings: vdso, vvar, sigpage, uprobe= s, > > + vsyscall. > > + This overwrites KCONFIG CONFIG_SEAL_SYSTEM_MAPPIN= GS_* > > + - 'never': never seal system mappings. > > Not true, uprobes are sealed when 'never' is selected. > Thanks. I forgot to uprobes from the description in Kconfig and kernel-parameters.txt, will update. > > + - 'always': always seal system mappings. > > + If not specified or invalid, default is the KCONF= IG value. > > + This option has no effect if CONFIG_64BIT=3Dn > > + > > early_page_ext [KNL,EARLY] Enforces page_ext initialization to ea= rlier > > stages so cover more early boot allocations. > > Please note that as side effect some optimization= s > > diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsy= scall/vsyscall_64.c > > index 2fb7d53cf333..20a3000550d2 100644 > > --- a/arch/x86/entry/vsyscall/vsyscall_64.c > > +++ b/arch/x86/entry/vsyscall/vsyscall_64.c > > @@ -32,6 +32,7 @@ > > #include > > #include > > #include > > +#include > > > > #include > > #include > > @@ -366,8 +367,12 @@ void __init map_vsyscall(void) > > set_vsyscall_pgtable_user_bits(swapper_pg_dir); > > } > > > > - if (vsyscall_mode =3D=3D XONLY) > > - vm_flags_init(&gate_vma, VM_EXEC); > > + if (vsyscall_mode =3D=3D XONLY) { > > + unsigned long vm_flags =3D VM_EXEC; > > + > > + update_seal_exec_system_mappings(&vm_flags); > > + vm_flags_init(&gate_vma, vm_flags); > > + } > > > > BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_PAGE) !=3D > > (unsigned long)VSYSCALL_ADDR); > > diff --git a/fs/exec.c b/fs/exec.c > > index 77364806b48d..5030879cda47 100644 > > --- a/fs/exec.c > > +++ b/fs/exec.c > > Does it make sense for this to live in exec? Couldn't you put it in the > mm/mseal.c file? It's vma flags for mappings and you've put it in > fs/exec? > If you are referring to utilities related to kernel cmdline, they should be in this file. > > @@ -68,6 +68,7 @@ > > #include > > #include > > #include > > +#include > > > > #include > > #include > > @@ -2159,3 +2160,55 @@ fs_initcall(init_fs_exec_sysctls); > > #ifdef CONFIG_EXEC_KUNIT_TEST > > #include "tests/exec_kunit.c" > > #endif > > + > > +#ifdef CONFIG_64BIT > > +/* > > + * Kernel cmdline overwrite for CONFIG_SEAL_SYSTEM_MAPPINGS_X > > + */ > > +enum seal_system_mappings_type { > > + SEAL_SYSTEM_MAPPINGS_NEVER, > > + SEAL_SYSTEM_MAPPINGS_ALWAYS > > +}; > > + > > +static enum seal_system_mappings_type seal_system_mappings __ro_after_= init =3D > > + IS_ENABLED(CONFIG_SEAL_SYSTEM_MAPPINGS_ALWAYS) ? SEAL_SYSTEM_MAPP= INGS_ALWAYS : > > + SEAL_SYSTEM_MAPPINGS_NEVER; > > + > > +static const struct constant_table value_table_sys_mapping[] __initcon= st =3D { > > + { "never", SEAL_SYSTEM_MAPPINGS_NEVER}, > > + { "always", SEAL_SYSTEM_MAPPINGS_ALWAYS}, > > + { } > > +}; > > + > > +static int __init early_seal_system_mappings_override(char *buf) > > +{ > > + if (!buf) > > + return -EINVAL; > > + > > + seal_system_mappings =3D lookup_constant(value_table_sys_mapping, > > + buf, seal_system_mappings); > > + > > + return 0; > > +} > > + > > +early_param("exec.seal_system_mappings", early_seal_system_mappings_ov= erride); > > + > > +static bool seal_system_mappings_enabled(void) > > +{ > > + if (seal_system_mappings =3D=3D SEAL_SYSTEM_MAPPINGS_ALWAYS) > > + return true; > > + > > + return false; > > +} > > This function seems unnecessary, it is called from another 3-4 line > function only. > It is more readable this way. > > + > > +void update_seal_exec_system_mappings(unsigned long *vm_flags) > > +{ > > + if (!(*vm_flags & VM_SEALED) && seal_system_mappings_enabled()) > > Why !(*vm_flags & VM_SEALED) here? > If vm_flags is already sealed, then there is no need to check seal_system_mappings_enabled. > > + *vm_flags |=3D VM_SEALED; > > + > > +} > > Instead of passing a pointer around and checking enabled, why don't you > have a function that just returns the VM_SEALED or 0 and just or it into > the flags? This seems very heavy for what it does, why did you do it > this way? > Why is that heavy ? passing a pointer for updating variables is natural. > The name is also very long and a bit odd, it could be used for other > reasons, but you have _system_mappings on the end, and you use seal but > it's mseal (or vm_seal)? Would mseal_flag() work? > It could be longer :-) it means update_sealing_flag_for_executable_system_mappings. mseal_flag is too short and not descriptive. > > +#else > > +void update_seal_exec_system_mappings(unsigned long *vm_flags) > > +{ > > +} > > +#endif /* CONFIG_64BIT */ > > diff --git a/include/linux/fs.h b/include/linux/fs.h > > index 42444ec95c9b..6e44aca4b24b 100644 > > --- a/include/linux/fs.h > > +++ b/include/linux/fs.h > > Again, I don't understand why fs.h is the place for mseal definitions? > include/linux/fs.h contains other exec related function signatures. So it is better to keep them at the same header. > > @@ -3079,6 +3079,7 @@ ssize_t __kernel_read(struct file *file, void *bu= f, size_t count, loff_t *pos); > > extern ssize_t kernel_write(struct file *, const void *, size_t, loff_= t *); > > extern ssize_t __kernel_write(struct file *, const void *, size_t, lof= f_t *); > > extern struct file * open_exec(const char *); > > +extern void update_seal_exec_system_mappings(unsigned long *vm_flags); > > We are dropping extern where possible now. > extern can be dropped, it appears not causing link error. > > > > /* fs/dcache.c -- generic fs support functions */ > > extern bool is_subdir(struct dentry *, struct dentry *); > > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c > > index c47a0bf25e58..e9876fae8887 100644 > > --- a/kernel/events/uprobes.c > > +++ b/kernel/events/uprobes.c > > @@ -1506,7 +1506,7 @@ static int xol_add_vma(struct mm_struct *mm, stru= ct xol_area *area) > > } > > > > vma =3D _install_special_mapping(mm, area->vaddr, PAGE_SIZE, > > - VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO, > > + VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO|VM_S= EALED, > > &xol_mapping); > > Changing all uprobes seems like something that should probably be > mentioned more than just the note at the end of the change log, even if > you think it won't have any impact. The note is even hidden at the end > of a paragraph. > > I would go as far as splitting this patch out as its own so that the > subject line specifies that all uprobes will be VM_SEALED now. > > Maybe it's fine but maybe it isn't and you've buried it so that it will > be missed by virtually everyone. > I will add "It is sealed from creation." in the above "uprobe" section. > > > if (IS_ERR(vma)) { > > ret =3D PTR_ERR(vma); > > diff --git a/mm/mmap.c b/mm/mmap.c > > index 57fd5ab2abe7..d4717e34a60d 100644 > > --- a/mm/mmap.c > > +++ b/mm/mmap.c > > @@ -2133,6 +2133,7 @@ struct vm_area_struct *_install_special_mapping( > > unsigned long addr, unsigned long len, > > unsigned long vm_flags, const struct vm_special_mapping *spec) > > { > > + update_seal_exec_system_mappings(&vm_flags); > > return __install_special_mapping(mm, addr, len, vm_flags, (void *= )spec, > > &special_mapping_vmops); > > If you were to return a flag, you could change the vm_flags argument to > vm_flags | mseal_flag() > passing pointer seems to be the most efficient way. Thanks -Jeff