From: Jeff Xu
Date: Fri, 14 Feb 2025 06:39:48 -0800
Subject: Re: [RFC PATCH v5 1/7] mseal, system mappings: kernel config and header change
To: "Liam R. Howlett", Jeff Xu, Kees Cook, akpm@linux-foundation.org,
 jannh@google.com, torvalds@linux-foundation.org, vbabka@suse.cz,
 lorenzo.stoakes@oracle.com, adhemerval.zanella@linaro.org, oleg@redhat.com,
 avagin@gmail.com, benjamin@sipsolutions.net, linux-kernel@vger.kernel.org,
 linux-hardening@vger.kernel.org, linux-mm@kvack.org, jorgelo@chromium.org,
 sroettger@google.com, hch@lst.de, ojeda@kernel.org,
 thomas.weissschuh@linutronix.de, adobriyan@gmail.com,
 johannes@sipsolutions.net, pedro.falcato@gmail.com, hca@linux.ibm.com,
 willy@infradead.org, anna-maria@linutronix.de, mark.rutland@arm.com,
 linus.walleij@linaro.org, Jason@zx2c4.com, deller@gmx.de,
 rdunlap@infradead.org, davem@davemloft.net, peterx@redhat.com,
 f.fainelli@gmail.com, gerg@kernel.org, dave.hansen@linux.intel.com,
 mingo@kernel.org, ardb@kernel.org, mhocko@suse.com, 42.hyeyoo@gmail.com,
 peterz@infradead.org, ardb@google.com, enh@google.com, rientjes@google.com,
 groeck@chromium.org, mpe@ellerman.id.au,
 aleksandr.mikhalitsyn@canonical.com, mike.rapoport@gmail.com
References: <20250212032155.1276806-1-jeffxu@google.com>
 <20250212032155.1276806-2-jeffxu@google.com>
 <66q7feybn3q2vuam72uwmai322jdx3jtv2m5xmh75l6w6etqiv@knpsyh3o627k>
 <202502131142.F5EE115C3A@keescook>

On Thu, Feb 13, 2025 at 5:10 PM Liam R. Howlett wrote:
>
> * Liam R. Howlett [250213 19:14]:
> > * Jeff Xu [250213 17:00]:
> > > On Thu, Feb 13, 2025 at 12:54 PM Liam R. Howlett
> > > wrote:
> > > >
> > > > > >
> > > > > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of
> > > > > > > the build. This is intentional.
> > > > > > > Any 32-bit code trying to use the
> > > > > > > sealing function or the VM_SEALED flag will immediately fail
> > > > > > > compilation. This makes it easier to identify incorrect usage.
> > > > > >
> > > > > > So you are against using the #define because the VM_SYSTEM_SEAL will be
> > > > > > defined, even though it will be VM_NONE? This is no worse than a
> > > > > > function that returns 0, and it aligns better with what we have today in
> > > > > > that it can be put in the list of other flags.
> > > > >
> > > > > When I was reading through all of this and considering the history of
> > > > > its development goals, it strikes me that a function is easier for the
> > > > > future if/when this can be made a boot-time setting.
> > > > >
> > > >
> > > > Reworking this change to function as a boot-time parameter, or whatever,
> > > > would not be a significant amount of work, if/when the time comes.
> > > > Since we don't know what the future holds, it is unnecessary to engineer
> > > > in a potential change for a future version when the change is easy
> > > > enough to make later and keep the code cleaner now.
> > > >
> > > Sure, I will put the function in mm.h for this patch. We can find a
> > > proper place when it is time to implement the boot-time parameter
> > > change.
> > >
> > > The call stack for sealing system mapping is something like below:
> > >
> > > install_special_mapping (mm/mmap.c)
> > > map_vdso (arch/x86/entry/vdso/vma.c)
> > > load_elf_binary (fs/binfmt_elf.c)
> > > load_misc_binary (fs/binfmt_misc.c)
> > > bprm_execve (fs/exec.c)
> > > do_execveat_common
> > > __x64_sys_execve
> > > do_syscall_64
> > >
> > > IMO, there's a clear divide between the API implementer and the API user.
> > > mm and mm.h are the providers, offering the core mm functionality
> > > through APIs/data structures like install_special_mapping().
> > >
> > > The exe layer (bprm_execve, map_vdso, etc) is the consumer of the
> > > install_special_mapping.
> > > The logic related to checking if sealing system mapping is enabled
> > > belongs to the exe layer.
> >
> > Since this is an all or nothing enabling, there is no reason to have
> > each caller check the same thing and do the same action. You should put
> > the logic into the provider - they all end up doing the same thing.
> >
> > Also, this is a compile time option so it doesn't even need to be
> > checked on execution - just apply it in the first place, at the source.
> > Your static inline function was already doing this...?
> >
> > I'm confused as to what you are arguing here because it goes against
> > what you had and what I suggested. The alternative you are suggesting
> > is more code, more instructions, and the best outcome is the same
> > result.
>
> I think I understand what you are saying now: the interface to
> install_special_mapping() needs to take the vma flag, as it does today.
> What I don't understand is that what you proposed and what I proposed
> both do that.
>
> What I'm saying is that, since system mappings are enabled, we can
> already know implicitly by having VM_SYSTEM_SEAL either VM_NONE or
> VM_SEALED.
>
> Turning this:
>
> @@ -264,11 +266,12 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr)
>         /*
>          * MAYWRITE to allow gdb to COW and set breakpoints
>          */
> +       vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC;
> +       vm_flags |= mseal_system_mappings();
>         vma = _install_special_mapping(mm,
>                                        text_start,
>                                        image->size,
> -                                      VM_READ|VM_EXEC|
> -                                      VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
> +                                      vm_flags,
>                                        &vdso_mapping);
>
> to this:
>
>         /*
>          * MAYWRITE to allow gdb to COW and set breakpoints
>          */
>         vma = _install_special_mapping(mm,
>                                        text_start,
>                                        image->size,
>                                        VM_READ|VM_EXEC|
> -                                      VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
> +                                      VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC|
> +                                      VM_SYSTEM_SEAL,
>                                        &vdso_mapping);
>
> No unsigned long vm_flags needed.
> It's easier to read and I don't think
> it's any more hidden than the vm_flags |= function call option.
>
The arch code needs a mseal_system_mappings() function. Otherwise, I'll
have to change this line (in arch) again when I implement the kernel
command line or per-process opt-in/opt-out, which requires a function
call. This isn't overthinking; based on our discussion so far, there is
a clear need for a subsequent patch series. This patch is just the
first step.

For software layering, I'd like to see a clear separation between
layers. mm implements _install_special_mapping, which accepts any
combination of vm_flags as input. Then I'd like the caller (in arch
code) to have all the necessary code to compose the vm_flags in one
place. This helps readability.

In the past, we discussed merging the vdso/vvar code across all
architectures. When that happens, the new code in arch will likely have
its own .c and .h files, but it will still sit above mm. That means mm
won't need to change, and the _install_special_mapping API from mm will
remain unchanged.

The mseal_system_mappings() function can already return VM_SEALED, and
future patches will add more logic to mseal_system_mappings(), e.g. a
check for the kernel cmdline or opt-in/opt-out. We don't need a separate
VM_SYSTEM_SEAL, which is a *** redirect macro ***. I prefer to keep all
the important logic for configuring system mapping sealing in one
place, for easy reading.

mseal_system_mappings() can be placed in mm.h in this patch, as you
suggested. But in the near future, it will be moved out of mm.h into a
more suitable header. The functionality belongs to the exe namespace,
for the reasons I gave in the commit message and in the discussion.

Thanks
-Jeff
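
For reference, the two options under discussion can be compared side by
side in a small userspace sketch. This is not kernel code: the flag
values and the SEAL_SYSTEM_MAPPINGS switch are hypothetical stand-ins
(the real Kconfig symbol and bit positions are not given in this
thread), and vdso_flags_macro()/vdso_flags_function() are helper names
invented here only to show how each option composes the flags passed to
_install_special_mapping().

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel vm_flags bits; values are illustrative only. */
#define VM_NONE      0x0UL
#define VM_READ      0x1UL
#define VM_EXEC      0x4UL
#define VM_MAYREAD   0x10UL
#define VM_MAYWRITE  0x20UL
#define VM_MAYEXEC   0x40UL
#define VM_SEALED    0x8000UL

/* Assumed build-time switch standing in for the kernel config option. */
#ifndef SEAL_SYSTEM_MAPPINGS
#define SEAL_SYSTEM_MAPPINGS 1
#endif

/* Option A (macro): VM_SYSTEM_SEAL is VM_SEALED or VM_NONE at compile time. */
#if SEAL_SYSTEM_MAPPINGS
#define VM_SYSTEM_SEAL VM_SEALED
#else
#define VM_SYSTEM_SEAL VM_NONE
#endif

/* Option B (function): same value today, but the body could later grow
 * run-time checks without touching the callers. */
static inline unsigned long mseal_system_mappings(void)
{
#if SEAL_SYSTEM_MAPPINGS
	return VM_SEALED;
#else
	return VM_NONE;
#endif
}

/* The macro can only ever expand to one of the two constants. */
static_assert(VM_SYSTEM_SEAL == VM_SEALED || VM_SYSTEM_SEAL == VM_NONE,
	      "VM_SYSTEM_SEAL must be VM_SEALED or VM_NONE");

/* Option A caller: the seal bit sits in the flag list itself. */
static unsigned long vdso_flags_macro(void)
{
	return VM_READ | VM_EXEC |
	       VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC |
	       VM_SYSTEM_SEAL;
}

/* Option B caller: flags are composed in a local, then the helper is OR-ed in. */
static unsigned long vdso_flags_function(void)
{
	unsigned long vm_flags;

	vm_flags = VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
	vm_flags |= mseal_system_mappings();
	return vm_flags;
}
```

With a compile-time-only switch both callers produce the same flags, so
the disagreement is about where future run-time logic would live, not
about the result today.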