From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12B32E77198 for ; Tue, 7 Jan 2025 05:21:47 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5C3556B00AA; Tue, 7 Jan 2025 00:21:47 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 54C026B00AC; Tue, 7 Jan 2025 00:21:47 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3C6A86B00AD; Tue, 7 Jan 2025 00:21:47 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 1C3426B00AA for ; Tue, 7 Jan 2025 00:21:47 -0500 (EST) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 786601C6C87 for ; Tue, 7 Jan 2025 05:21:46 +0000 (UTC) X-FDA: 82979508612.24.9E6EDF7 Received: from mail-ed1-f53.google.com (mail-ed1-f53.google.com [209.85.208.53]) by imf18.hostedemail.com (Postfix) with ESMTP id 82EFE1C0003 for ; Tue, 7 Jan 2025 05:21:44 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=chromium.org header.s=google header.b=dqBo7orM; spf=pass (imf18.hostedemail.com: domain of jeffxu@chromium.org designates 209.85.208.53 as permitted sender) smtp.mailfrom=jeffxu@chromium.org; dmarc=pass (policy=none) header.from=chromium.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1736227304; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=LKwyl6OspyjnQcfr2q/vOIDXkwuARjFY2nz+F+nCUfA=; b=B9cFZtHCJovccJIzAX9Nki6RtqWNemereb9ntF8EY2hy2ndvSw/0//exnN0zT2d0ErbBr2 5U9TEVdYVGOV3KMoYO7L8JKYxjFcaZRz9qXDjyZUkIzLrcUran0bu9+LRPH17i8uaeUxJI O00EHuJW5euAVAbMAawTS/O/6jKVYPw= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=chromium.org header.s=google header.b=dqBo7orM; spf=pass (imf18.hostedemail.com: domain of jeffxu@chromium.org designates 209.85.208.53 as permitted sender) smtp.mailfrom=jeffxu@chromium.org; dmarc=pass (policy=none) header.from=chromium.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1736227304; a=rsa-sha256; cv=none; b=z6bxbi1n+mU/8PFAJJMLyCCz3HI004Xi2g8s6uJb0+StPIrQh/gRyVRCce29gmUJzPYUxS d+VMG5xsNQ/eME2/xahJIP4uZZ5odhcxI1BkV4tOsOkonxYw/dRzcaSahxbw7xTSxc1D3L kdtvrhNVyL1DJE39VOMM2RygKm8GPo8= Received: by mail-ed1-f53.google.com with SMTP id 4fb4d7f45d1cf-5d3ce64e7e5so2825958a12.0 for ; Mon, 06 Jan 2025 21:21:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1736227303; x=1736832103; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=LKwyl6OspyjnQcfr2q/vOIDXkwuARjFY2nz+F+nCUfA=; b=dqBo7orMZkUsZqdkNoAoRI+p84vFiYpjaTXuonp4ZOYdl12pC7ltW1ff2uyzSklVZZ Yp7IyXfCaco/UyaEKn8DAaV89MeKNl2PteVfCL9wkjgCL+t9ixpX9f9Ya1TxgEhuMhoB 8cVZApUxzyQIqD4d/+mAqHUuDGCYlRsSPJlG0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736227303; x=1736832103; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=LKwyl6OspyjnQcfr2q/vOIDXkwuARjFY2nz+F+nCUfA=; b=QUvuV1ZoTbvDKg7bPO6ha8j2GBswZ2ZHl+7OTh74eAZvyxLtY0i82qJtecyIohYgHt kRGma9mSYWoRVUfR88IydAmEQo0OmkbSelHxe2IZxecU4b1MCCX9ZwUdVVEgrdQn5N4r /rsKORUZSjow+fk64yEZQgJYtOO4VBXXTTHF8sgh9p6rELVUrx/P7pFJGDLtVZg48T7d CplBgwJ3UEpP0nC3SbCDMitzrwTtz3JfVAYG+zyZj5WKr97mnQuENrh7G5/qPuJqM6Wu FFPZOx/iN8A0jHcJHbfENnvlhBGXNa10WMcKZrygYKsCDdYZ6/MxtZZrta/wszpWm160 Hb9Q== X-Forwarded-Encrypted: i=1; AJvYcCUHOW0Fo79fxRTzPwNgvFYz9c4o1ZJiBlVc4DSOwFsXweZwZ2+l8jPxeJPAsjXPfXwGhcyOI2nwSA==@kvack.org X-Gm-Message-State: AOJu0Yy9PFSFWfU+ZdhSdFdHldHP6794alcafwv8aFgj0li0tjAuCiXM Bs405Z250T/fyBCXhRLemUBLKdCmujuSZdGMtUtAuWKPwIaLe1rlDecjPbNdXPjfHRdfdjtjuJT 2jtgi55yBpI1FpUvt1yPUBQ8b4AMJRh2DF5mN X-Gm-Gg: ASbGncst8hncE4Creh94VG/+uHeEhGjXjJT+gSfvAfAhzoOy/Bc5/SU8mjXIqX136Kl YJBbMtRyTnP8zFC9o7WovTV2XNu8azTdXb2sQ33FexriSgHL6hxnQDvbZ5xxZfkuISyPwdnU= X-Google-Smtp-Source: AGHT+IEOY1MZJnyKNAdXNSNwfdgEXEFTpg5dwbP9keiatuo2YFaXoHoheAn4YeaED3xrB/ozQ7bs7NxX2BNTDbJdnEU= X-Received: by 2002:a05:6402:268c:b0:5d0:ccce:34b7 with SMTP id 4fb4d7f45d1cf-5d81ddf8113mr19983213a12.6.1736227302959; Mon, 06 Jan 2025 21:21:42 -0800 (PST) MIME-Version: 1.0 References: <20250102233255.1180524-1-isaacmanjarres@google.com> <20250102233255.1180524-2-isaacmanjarres@google.com> In-Reply-To: From: Jeff Xu Date: Mon, 6 Jan 2025 21:21:25 -0800 Message-ID: Subject: Re: [RFC PATCH RESEND v2 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd To: Isaac Manjarres Cc: Jann Horn , Kees Cook , lorenzo.stoakes@oracle.com, Jeff Layton , Chuck Lever , Alexander Aring , Andrew Morton , Shuah Khan , surenb@google.com, kaleshsingh@google.com, jstultz@google.com, aliceryhl@google.com, jeffxu@google.com, kees@kernel.org, kernel-team@android.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 82EFE1C0003 X-Stat-Signature: nkmmck9wzmb6z3kfe6yu39zu6aruxc98 X-Rspam-User: X-Rspamd-Server: rspam11 X-HE-Tag: 1736227304-295444 X-HE-Meta: U2FsdGVkX1/bG5OKNg6dS0u4hjRHvStX+b99LHG8NS8tfx0Ad0Ung7Z1ENnWVjHpZku7h3f1Fip/zSZOFFOMWuEpSEPpjdmk5ZBTyFl+88w5s8HY6QPh/PjxBBSh/0S3YtBawzPjzRbdCGf5UBnYxHxtTdg1a8WXZnhvcHrz0MuPPNHBYP7noDD7StcPrhAgI8R3wNM7AJGT7dLcNdZxUjJ6YHL0A/5x2dCw7/+M/IuH8TBQpnsdugMaJllzuFsYIH6gyOgw/Ig0W8vxayzyDpgUyMWh5Dsb+CDrxZLMiLGuttN0sWkEtSqcqq01ImPj68yjHrKXJR+LdXpBIYXVx5+gFOnMGKscln3rC53CbJG7Fi4Cx0juR7h+mC9AoOA5G2GWO3enigIB6zJm+4GqFYvlAAjMJhgHrS9qI76J8CuQLufV72QenMb4w39ObLiezjmwaZrTzcliJB8Sn87yHJkB/XMMCsJdAJefjpr0cn5Ufw52TCYLFaaEMKlkPfsZs9DL+doYEKYNxFpmnyyEw6nDvmaMtgirCU5UFkLf2DRjCEjKVQv2mxPyjtdOxY8wcLPLCf5cmmfscO3JP7Lk8skbbSEFj9sKsj/h8i6sxYbWC5mCu/hQZ/obVPYDr/wTQGvEJbGjsXUdvJl0zs8Lm3bt1v7uVttP7SpLI4AoXEUOucoCbanelu3U+n7p6d7uYtWvanuREC58msuEi1mR8BZyn5WjPV4YR5acjAMm5ar5kJSiFwfznqpsugsawX8sbMsxB3L1QW6kvjECJiscz+iQRljlmITrHpldElcoP+tZG3hGD96hs0TpmZ9xArF2XdT8aqFmqtXwhx6NHC8WXlpUGJPEB4D/vwLt4UaDMSsovVkFmDp4tHHZVhJLG8pr2Q09l+R2Ydn3nRs7LciWVE2iFJcNSo+6Li5FWPTW1MY/dG37DdcdFylgK07VxTIFfAXKG5oTSgIOAmT9FuE DP39NvfU HvVIgq5okZ1tSDC0Evgfzsekx3DQgSYVx3N3CajendVC2qjW06hKUJDXXd/sUZw/2quwG92zlxnv0InEVxXVtqrqog8ZjSO+goZ1eLImM0qDs1dK13sov76qX0GmWKWc5s7kNyxt+XGOayCDPCtxWtzcL610172fKW4f2zTTRJIBFKkAMuZo7RjMHRjxkk25ZLJdxXTVT7S1lPNs0AYv6tSl2g9i8UhyE/XGe8ZT5II90T/LStAmD3LYJuJO5D3R7Y1rLxP6WQNp3qkKcgEFyPZatc/jWFaEtpq+Y+XI/fK5GxaGNHBSUh9KSs4wKFdmIaUgdHwRCQ+08TWZd9MohG1LblGZ3Rs8SMUL6Oefy2O/ijKg= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Jan 6, 2025 at 5:26=E2=80=AFPM Isaac Manjarres wrote: > > On Mon, Jan 06, 2025 at 09:35:09AM -0800, Jeff Xu wrote: > > + Kees because this is related to W^X memfd and security. > > > > On Fri, Jan 3, 2025 at 7:04=E2=80=AFAM Jann Horn wro= te: > > > > > > On Fri, Jan 3, 2025 at 12:32=E2=80=AFAM Isaac J. Manjarres > > > wrote: > > > > Android currently uses the ashmem driver [1] for creating shared me= mory > > > > regions between processes. Ashmem buffers can initially be mapped w= ith > > > > PROT_READ, PROT_WRITE, and PROT_EXEC. Processes can then use the > > > > ASHMEM_SET_PROT_MASK ioctl command to restrict--never add--the > > > > permissions that the buffer can be mapped with. > > > > > > > > Processes can remove the ability to map ashmem buffers as executabl= e to > > > > ensure that those buffers cannot be exploited to run unintended cod= e. > > > > > > Is there really code out there that first maps an ashmem buffer with > > > PROT_EXEC, then uses the ioctl to remove execute permission for futur= e > > > mappings? I don't see why anyone would do that. > > > > > > > For instance, suppose process A allocates a memfd that is meant to = be > > > > read and written by itself and another process, call it B. > > > > > > > > Process A shares the buffer with process B, but process B injects c= ode > > > > into the buffer, and compromises process A, such that it makes A ma= p > > > > the buffer with PROT_EXEC. This provides an opportunity for process= A > > > > to run the code that process B injected into the buffer. > > > > > > > > If process A had the ability to seal the buffer against future > > > > executable mappings before sharing the buffer with process B, this > > > > attack would not be possible. > > > > > > I think if you want to enforce such restrictions in a scenario where > > > the attacker can already make the target process perform > > > semi-arbitrary syscalls, it would probably be more reliable to enforc= e > > > rules on executable mappings with something like SELinux policy and/o= r > > > F_SEAL_EXEC. > > > > > I would like to second on the suggestion of making this as part of F_S= EAL_EXEC. > > Thanks for taking a look at this patch Jeff! Can you please elaborate > some more on how F_SEAL_EXEC should be extended to restricting executable > mappings? > > I understand that if a memfd file is non-executable (either because it > was made non-executable via fchmod() or by being created with > MFD_NOEXEC_SEAL) one could argue that applying F_SEAL_EXEC to that file > would also mean preventing any executable mappings. However, it is not > clear to me if we should tie a file's executable permissions to whether > or not if it can be mapped as executable. For example, shared object > files don't have to have executable permissions, but processes should > be able to map them as executable. > > The case where we apply F_SEAL_EXEC on an executable memfd also seems > awkward to me, since memfds can be mapped as executable by default > so what would happen in that scenario? > > I also shared the same concerns in my response to Jann in [1]. > Apology for not being clear. I meant this below: when 1> memfd is created with MFD_NOEXEC_SEAL or 2> memfd is no-exec (NX) and F_SEAL_EXEC is set. We could also block the memfd from being mapped as executable. MFD_NOEXEC_SEAL/F_SEAL_EXEC is added in 6fd7353829ca, which is about 2 years old, I m not sure any application uses the case of creating a MFD_NOEXEC_SEAL memfd and still wants to mmap it as executable memory, that is a strange user case. It is more logical that applications want to block both execve() and mmap() for a non-executable memfd. Therefore I think we could reuse the F_SEAL_EXEC bit + NX state for this feature, for simplicity. > > > > diff --git a/mm/memfd.c b/mm/memfd.c > > > > index 5f5a23c9051d..cfd62454df5e 100644 > > > > --- a/mm/memfd.c > > > > +++ b/mm/memfd.c > > > > @@ -184,6 +184,7 @@ static unsigned int *memfd_file_seals_ptr(struc= t file *file) > > > > } > > > > > > > > #define F_ALL_SEALS (F_SEAL_SEAL | \ > > > > + F_SEAL_FUTURE_EXEC |\ > > > > F_SEAL_EXEC | \ > > > > F_SEAL_SHRINK | \ > > > > F_SEAL_GROW | \ > > > > @@ -357,14 +358,50 @@ static int check_write_seal(unsigned long *vm= _flags_ptr) > > > > return 0; > > > > } > > > > > > > > +static inline bool is_exec_sealed(unsigned int seals) > > > > +{ > > > > + return seals & F_SEAL_FUTURE_EXEC; > > > > +} > > > > + > > > > +static int check_exec_seal(unsigned long *vm_flags_ptr) > > > > +{ > > > > + unsigned long vm_flags =3D *vm_flags_ptr; > > > > + unsigned long mask =3D vm_flags & (VM_SHARED | VM_EXEC); > > > > + > > > > + /* Executability is not a concern for private mappings. */ > > > > + if (!(mask & VM_SHARED)) > > > > + return 0; > > > > > > Why is it not a concern for private mappings? > > > > > > > + /* > > > > + * New PROT_EXEC and MAP_SHARED mmaps are not allowed when = exec seal > > > > + * is active. > > > > + */ > > > > + if (mask & VM_EXEC) > > > > + return -EPERM; > > > > + > > > > + /* > > > > + * Prevent mprotect() from making an exec-sealed mapping ex= ecutable in > > > > + * the future. > > > > + */ > > > > + *vm_flags_ptr &=3D ~VM_MAYEXEC; > > > > + > > > > + return 0; > > > > +} > > > > + > > > > int memfd_check_seals_mmap(struct file *file, unsigned long *vm_fl= ags_ptr) > > > > { > > > > int err =3D 0; > > > > unsigned int *seals_ptr =3D memfd_file_seals_ptr(file); > > > > unsigned int seals =3D seals_ptr ? *seals_ptr : 0; > > > > > > > > - if (is_write_sealed(seals)) > > > > + if (is_write_sealed(seals)) { > > > > err =3D check_write_seal(vm_flags_ptr); > > > > + if (err) > > > > + return err; > > > > + } > > > > + > > > > + if (is_exec_sealed(seals)) > > > > + err =3D check_exec_seal(vm_flags_ptr); > > > > > > memfd_check_seals_mmap is only for mmap() path, right ? > > > > How about the mprotect() path ? i.e. An attacker can first create a > > RW VMA mapping for the memfd and later mprotect the VMA to be > > executable. > > > > Similar to the check_write_seal call , we might want to block mprotect > > for write seal as well. > > > > So when memfd_check_seals_mmap() is called, if the file is exec_sealed, > check_exec_seal() will not only just check that VM_EXEC is not set, > but it will also clear VM_MAYEXEC, which will prevent the mapping from > being changed to executable via mprotect() later. > Thanks for clarification. The name of check_exec_seal() is misleading , check implies a read operation, but this function actually does update. Maybe renaming to check_and_update_exec_seal or something like that ? Do you know which code checks for VM_MAYEXEC flag in the mprotect code path ? it isn't obvious to me, i.e. when I grep the VM_MAYEXEC inside mm path, it only shows one place in mprotect and that doesn't do the work. ~/mm/mm$ grep VM_MAYEXEC * mmap.c: mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; mmap.c: vm_flags &=3D ~VM_MAYEXEC; mprotect.c: if (rier && (vma->vm_flags & VM_MAYEXEC)) nommu.c: vm_flags |=3D VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; nommu.c: vm_flags |=3D VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; Thanks -Jeff > [1] https://lore.kernel.org/all/Z3x_8uFn2e0EpDqM@google.com/ > > Thanks, > Isaac