* [RFC PATCH v1 0/2] Add file seal to prevent future exec mappings
@ 2024-12-06 1:09 Isaac J. Manjarres
2024-12-06 1:09 ` [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd Isaac J. Manjarres
2024-12-06 1:09 ` [RFC PATCH v1 2/2] selftests/memfd: Add tests for F_SEAL_FUTURE_EXEC Isaac J. Manjarres
0 siblings, 2 replies; 20+ messages in thread
From: Isaac J. Manjarres @ 2024-12-06 1:09 UTC (permalink / raw)
To: Andrew Morton, Jeff Layton, Chuck Lever, Alexander Aring,
Liam R. Howlett, Lorenzo Stoakes, Vlastimil Babka, Jann Horn,
Shuah Khan
Cc: Isaac J. Manjarres, kernel-team, linux-mm, linux-kernel,
linux-fsdevel, linux-kselftest
Android uses the ashmem driver [1] for creating shared memory regions
between processes. The ashmem driver exposes an ioctl command for
processes to restrict the permissions an ashmem buffer can be mapped
with.
Buffers are created with the ability to be mapped as readable, writable,
and executable. Processes remove the ability to map some ashmem buffers
as executable to ensure that those buffers cannot be exploited to run
unintended code. Other buffers retain their ability to be mapped as
executable, as these buffers can be used for just-in-time (JIT)
compilation. So there is a need to be able to remove the ability to
map a buffer as executable on a per-buffer basis.
Android is currently trying to migrate towards replacing its ashmem
driver usage with memfd. Part of the transition involved introducing a
library that serves to abstract away how shared memory regions are
allocated (i.e. ashmem vs memfd). This allows clients to use a single
interface for restricting how a buffer can be mapped without having to
worry about how it is handled for ashmem (through the ioctl
command mentioned earlier) or memfd (through file seals).
While memfd has support for preventing buffers from being mapped as
writable beyond a certain point in time (thanks to
F_SEAL_FUTURE_WRITE), it does not have a similar interface to prevent
buffers from being mapped as executable beyond a certain point.
However, that could be implemented as a file seal (F_SEAL_FUTURE_EXEC)
which works similarly to F_SEAL_FUTURE_WRITE.
F_SEAL_FUTURE_WRITE was chosen as a template for how this new seal
should behave, instead of F_SEAL_WRITE, for the following reasons:
1. Having the new seal behave like F_SEAL_FUTURE_WRITE matches the
behavior that was present with ashmem. This aids in seamlessly
transitioning clients away from ashmem to memfd.
2. Making the new seal behave like F_SEAL_WRITE would mean that no
mappings that could become executable in the future (i.e. via
mprotect()) can exist when the seal is applied. However, there are
known cases (e.g. CursorWindow [2]) where restrictions are applied
on how a buffer can be mapped after a mapping has already been made.
That mapping may have VM_MAYEXEC set, which would not allow the seal
to be applied successfully.
Therefore, the F_SEAL_FUTURE_EXEC seal was designed to have the same
semantics as F_SEAL_FUTURE_WRITE.
Note: this series depends on Lorenzo's work [3] which allows for a
memfd's file seals to be read in do_mmap().
[1] https://cs.android.com/android/kernel/superproject/+/common-android-mainline:common/drivers/staging/android/ashmem.c
[2] https://developer.android.com/reference/android/database/CursorWindow
[3] https://lore.kernel.org/all/cover.1732804776.git.lorenzo.stoakes@oracle.com/
Isaac J. Manjarres (2):
mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
selftests/memfd: Add tests for F_SEAL_FUTURE_EXEC
include/linux/mm.h | 5 ++
include/uapi/linux/fcntl.h | 1 +
mm/memfd.c | 1 +
mm/mmap.c | 11 +++
tools/testing/selftests/memfd/memfd_test.c | 79 ++++++++++++++++++++++
5 files changed, 97 insertions(+)
--
2.47.0.338.g60cca15819-goog
^ permalink raw reply [flat|nested] 20+ messages in thread
* [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2024-12-06 1:09 [RFC PATCH v1 0/2] Add file seal to prevent future exec mappings Isaac J. Manjarres
@ 2024-12-06 1:09 ` Isaac J. Manjarres
2024-12-06 17:49 ` Kalesh Singh
2024-12-06 18:19 ` Lorenzo Stoakes
2024-12-06 1:09 ` [RFC PATCH v1 2/2] selftests/memfd: Add tests for F_SEAL_FUTURE_EXEC Isaac J. Manjarres
1 sibling, 2 replies; 20+ messages in thread
From: Isaac J. Manjarres @ 2024-12-06 1:09 UTC (permalink / raw)
To: Andrew Morton, Jeff Layton, Chuck Lever, Alexander Aring,
Liam R. Howlett, Lorenzo Stoakes, Vlastimil Babka, Jann Horn,
Shuah Khan
Cc: Isaac J. Manjarres, kernel-team, linux-mm, linux-kernel,
linux-fsdevel, linux-kselftest, Suren Baghdasaryan, Kalesh Singh,
John Stultz
Android currently uses the ashmem driver [1] for creating shared memory
regions between processes. Ashmem buffers can initially be mapped with
PROT_READ, PROT_WRITE, and PROT_EXEC. Processes can then use the
ASHMEM_SET_PROT_MASK ioctl command to restrict--never add--the
permissions that the buffer can be mapped with.
Processes can remove the ability to map ashmem buffers as executable to
ensure that those buffers cannot be exploited to run unintended code.
We are currently trying to replace ashmem with memfd. However, memfd
does not have a provision to permanently remove the ability to map a
buffer as executable. Although, this should be something that can be
achieved via a new file seal.
There are known usecases (e.g. CursorWindow [2]) where a process
maps a buffer with read/write permissions before restricting the buffer
to being mapped as read-only for future mappings.
The resulting VMA from the writable mapping has VM_MAYEXEC set, meaning
that mprotect() can change the mapping to be executable. Therefore,
implementing the seal similar to F_SEAL_WRITE would not be appropriate,
since it would not work with the CursorWindow usecase. This is because
the CursorWindow process restricts the mapping permissions to read-only
after the writable mapping is created. So, adding a file seal for
executable mappings that operates like F_SEAL_WRITE would fail.
Therefore, add support for F_SEAL_FUTURE_EXEC, which is handled
similarly to F_SEAL_FUTURE_WRITE. This ensures that CursorWindow can
continue to create a writable mapping initially, and then restrict the
permissions on the buffer to be mappable as read-only by using both
F_SEAL_FUTURE_WRITE and F_SEAL_FUTURE_EXEC. After the seal is
applied, any calls to mmap() with PROT_EXEC will fail.
[1] https://cs.android.com/android/kernel/superproject/+/common-android-mainline:common/drivers/staging/android/ashmem.c
[2] https://developer.android.com/reference/android/database/CursorWindow
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: John Stultz <jstultz@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
---
include/linux/mm.h | 5 +++++
include/uapi/linux/fcntl.h | 1 +
mm/memfd.c | 1 +
mm/mmap.c | 11 +++++++++++
4 files changed, 18 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4eb8e62d5c67..40c03a491e45 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4096,6 +4096,11 @@ static inline bool is_write_sealed(int seals)
return seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE);
}
+static inline bool is_exec_sealed(int seals)
+{
+ return seals & F_SEAL_FUTURE_EXEC;
+}
+
/**
* is_readonly_sealed - Checks whether write-sealed but mapped read-only,
* in which case writes should be disallowing moving
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 6e6907e63bfc..ef066e524777 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -49,6 +49,7 @@
#define F_SEAL_WRITE 0x0008 /* prevent writes */
#define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */
#define F_SEAL_EXEC 0x0020 /* prevent chmod modifying exec bits */
+#define F_SEAL_FUTURE_EXEC 0x0040 /* prevent future executable mappings */
/* (1U << 31) is reserved for signed error codes */
/*
diff --git a/mm/memfd.c b/mm/memfd.c
index 35a370d75c9a..77b49995a044 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -184,6 +184,7 @@ unsigned int *memfd_file_seals_ptr(struct file *file)
}
#define F_ALL_SEALS (F_SEAL_SEAL | \
+ F_SEAL_FUTURE_EXEC |\
F_SEAL_EXEC | \
F_SEAL_SHRINK | \
F_SEAL_GROW | \
diff --git a/mm/mmap.c b/mm/mmap.c
index b1b2a24ef82e..c7b96b057fda 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -375,6 +375,17 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
if (!file_mmap_ok(file, inode, pgoff, len))
return -EOVERFLOW;
+ if (is_exec_sealed(seals)) {
+ /* No new executable mappings if the file is exec sealed. */
+ if (prot & PROT_EXEC)
+ return -EACCES;
+ /*
+ * Prevent an initially non-executable mapping from
+ * later becoming executable via mprotect().
+ */
+ vm_flags &= ~VM_MAYEXEC;
+ }
+
flags_mask = LEGACY_MAP_MASK;
if (file->f_op->fop_flags & FOP_MMAP_SYNC)
flags_mask |= MAP_SYNC;
--
2.47.0.338.g60cca15819-goog
^ permalink raw reply [flat|nested] 20+ messages in thread
* [RFC PATCH v1 2/2] selftests/memfd: Add tests for F_SEAL_FUTURE_EXEC
2024-12-06 1:09 [RFC PATCH v1 0/2] Add file seal to prevent future exec mappings Isaac J. Manjarres
2024-12-06 1:09 ` [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd Isaac J. Manjarres
@ 2024-12-06 1:09 ` Isaac J. Manjarres
1 sibling, 0 replies; 20+ messages in thread
From: Isaac J. Manjarres @ 2024-12-06 1:09 UTC (permalink / raw)
To: Andrew Morton, Jeff Layton, Chuck Lever, Alexander Aring,
Liam R. Howlett, Lorenzo Stoakes, Vlastimil Babka, Jann Horn,
Shuah Khan
Cc: Isaac J. Manjarres, kernel-team, linux-mm, linux-kernel,
linux-fsdevel, linux-kselftest, Suren Baghdasaryan, Kalesh Singh,
John Stultz
Add tests to ensure that F_SEAL_FUTURE_EXEC behaves as expected.
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: John Stultz <jstultz@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
---
tools/testing/selftests/memfd/memfd_test.c | 79 ++++++++++++++++++++++
1 file changed, 79 insertions(+)
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 46027c889e74..12c82af406b3 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -30,6 +30,7 @@
#define STACK_SIZE 65536
#define F_SEAL_EXEC 0x0020
+#define F_SEAL_FUTURE_EXEC 0x0040
#define F_WX_SEALS (F_SEAL_SHRINK | \
F_SEAL_GROW | \
@@ -317,6 +318,37 @@ static void *mfd_assert_mmap_private(int fd)
return p;
}
+static void *mfd_fail_mmap_exec(int fd)
+{
+ void *p;
+
+ p = mmap(NULL,
+ mfd_def_size,
+ PROT_EXEC,
+ MAP_SHARED,
+ fd,
+ 0);
+ if (p != MAP_FAILED) {
+ printf("mmap() didn't fail as expected\n");
+ abort();
+ }
+
+ return p;
+}
+
+static void mfd_fail_mprotect_exec(void *p)
+{
+ int ret;
+
+ ret = mprotect(p,
+ mfd_def_size,
+ PROT_EXEC);
+ if (!ret) {
+ printf("mprotect didn't fail as expected\n");
+ abort();
+ }
+}
+
static int mfd_assert_open(int fd, int flags, mode_t mode)
{
char buf[512];
@@ -997,6 +1029,52 @@ static void test_seal_future_write(void)
close(fd);
}
+/*
+ * Test SEAL_FUTURE_EXEC_MAPPING
+ * Test whether SEAL_FUTURE_EXEC_MAPPING actually prevents executable mappings.
+ */
+static void test_seal_future_exec_mapping(void)
+{
+ int fd;
+ void *p;
+
+
+ printf("%s SEAL-FUTURE-EXEC-MAPPING\n", memfd_str);
+
+ fd = mfd_assert_new("kern_memfd_seal_future_exec_mapping",
+ mfd_def_size,
+ MFD_CLOEXEC | MFD_ALLOW_SEALING);
+
+ /*
+ * PROT_READ | PROT_WRITE mappings create VMAs with VM_MAYEXEC set.
+ * However, F_SEAL_FUTURE_EXEC applies to subsequent mappings,
+ * so it should still succeed even if this mapping is active when the
+ * seal is applied.
+ */
+ p = mfd_assert_mmap_shared(fd);
+
+ mfd_assert_has_seals(fd, 0);
+
+ mfd_assert_add_seals(fd, F_SEAL_FUTURE_EXEC);
+ mfd_assert_has_seals(fd, F_SEAL_FUTURE_EXEC);
+
+ mfd_fail_mmap_exec(fd);
+
+ munmap(p, mfd_def_size);
+
+ /* Ensure that new mappings without PROT_EXEC work. */
+ p = mfd_assert_mmap_shared(fd);
+
+ /*
+ * Ensure that mappings created after the seal was applied cannot be
+ * made executable via mprotect().
+ */
+ mfd_fail_mprotect_exec(p);
+
+ munmap(p, mfd_def_size);
+ close(fd);
+}
+
static void test_seal_write_map_read_shared(void)
{
int fd;
@@ -1633,6 +1711,7 @@ int main(int argc, char **argv)
test_seal_shrink();
test_seal_grow();
test_seal_resize();
+ test_seal_future_exec_mapping();
test_sysctl_simple();
test_sysctl_nested();
--
2.47.0.338.g60cca15819-goog
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2024-12-06 1:09 ` [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd Isaac J. Manjarres
@ 2024-12-06 17:49 ` Kalesh Singh
2024-12-06 20:50 ` Isaac Manjarres
2024-12-06 18:19 ` Lorenzo Stoakes
1 sibling, 1 reply; 20+ messages in thread
From: Kalesh Singh @ 2024-12-06 17:49 UTC (permalink / raw)
To: Isaac J. Manjarres
Cc: Andrew Morton, Jeff Layton, Chuck Lever, Alexander Aring,
Liam R. Howlett, Lorenzo Stoakes, Vlastimil Babka, Jann Horn,
Shuah Khan, kernel-team, linux-mm, linux-kernel, linux-fsdevel,
linux-kselftest, Suren Baghdasaryan, John Stultz
On Thu, Dec 5, 2024 at 5:09 PM Isaac J. Manjarres
<isaacmanjarres@google.com> wrote:
>
> Android currently uses the ashmem driver [1] for creating shared memory
> regions between processes. Ashmem buffers can initially be mapped with
> PROT_READ, PROT_WRITE, and PROT_EXEC. Processes can then use the
> ASHMEM_SET_PROT_MASK ioctl command to restrict--never add--the
> permissions that the buffer can be mapped with.
>
> Processes can remove the ability to map ashmem buffers as executable to
> ensure that those buffers cannot be exploited to run unintended code.
> We are currently trying to replace ashmem with memfd. However, memfd
> does not have a provision to permanently remove the ability to map a
> buffer as executable. Although, this should be something that can be
> achieved via a new file seal.
>
> There are known usecases (e.g. CursorWindow [2]) where a process
> maps a buffer with read/write permissions before restricting the buffer
> to being mapped as read-only for future mappings.
>
> The resulting VMA from the writable mapping has VM_MAYEXEC set, meaning
> that mprotect() can change the mapping to be executable. Therefore,
> implementing the seal similar to F_SEAL_WRITE would not be appropriate,
> since it would not work with the CursorWindow usecase. This is because
> the CursorWindow process restricts the mapping permissions to read-only
> after the writable mapping is created. So, adding a file seal for
> executable mappings that operates like F_SEAL_WRITE would fail.
>
> Therefore, add support for F_SEAL_FUTURE_EXEC, which is handled
> similarly to F_SEAL_FUTURE_WRITE. This ensures that CursorWindow can
> continue to create a writable mapping initially, and then restrict the
> permissions on the buffer to be mappable as read-only by using both
> F_SEAL_FUTURE_WRITE and F_SEAL_FUTURE_EXEC. After the seal is
> applied, any calls to mmap() with PROT_EXEC will fail.
>
> [1] https://cs.android.com/android/kernel/superproject/+/common-android-mainline:common/drivers/staging/android/ashmem.c
> [2] https://developer.android.com/reference/android/database/CursorWindow
>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Kalesh Singh <kaleshsingh@google.com>
> Cc: John Stultz <jstultz@google.com>
> Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
> ---
> include/linux/mm.h | 5 +++++
> include/uapi/linux/fcntl.h | 1 +
> mm/memfd.c | 1 +
> mm/mmap.c | 11 +++++++++++
> 4 files changed, 18 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 4eb8e62d5c67..40c03a491e45 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -4096,6 +4096,11 @@ static inline bool is_write_sealed(int seals)
> return seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE);
> }
>
> +static inline bool is_exec_sealed(int seals)
> +{
> + return seals & F_SEAL_FUTURE_EXEC;
> +}
> +
> /**
> * is_readonly_sealed - Checks whether write-sealed but mapped read-only,
> * in which case writes should be disallowing moving
> diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> index 6e6907e63bfc..ef066e524777 100644
> --- a/include/uapi/linux/fcntl.h
> +++ b/include/uapi/linux/fcntl.h
> @@ -49,6 +49,7 @@
> #define F_SEAL_WRITE 0x0008 /* prevent writes */
> #define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */
> #define F_SEAL_EXEC 0x0020 /* prevent chmod modifying exec bits */
> +#define F_SEAL_FUTURE_EXEC 0x0040 /* prevent future executable mappings */
> /* (1U << 31) is reserved for signed error codes */
>
> /*
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 35a370d75c9a..77b49995a044 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -184,6 +184,7 @@ unsigned int *memfd_file_seals_ptr(struct file *file)
> }
>
> #define F_ALL_SEALS (F_SEAL_SEAL | \
> + F_SEAL_FUTURE_EXEC |\
> F_SEAL_EXEC | \
> F_SEAL_SHRINK | \
> F_SEAL_GROW | \
> diff --git a/mm/mmap.c b/mm/mmap.c
> index b1b2a24ef82e..c7b96b057fda 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -375,6 +375,17 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
> if (!file_mmap_ok(file, inode, pgoff, len))
> return -EOVERFLOW;
>
> + if (is_exec_sealed(seals)) {
> + /* No new executable mappings if the file is exec sealed. */
> + if (prot & PROT_EXEC)
> + return -EACCES;
I think this should be -EPERM to be consistent with seal_check_write()
and mmap(2) man page:
" EPERM The operation was prevented by a file seal; see fcntl(2)."
Thanks,
Kalesh
> + /*
> + * Prevent an initially non-executable mapping from
> + * later becoming executable via mprotect().
> + */
> + vm_flags &= ~VM_MAYEXEC;
> + }
> +
> flags_mask = LEGACY_MAP_MASK;
> if (file->f_op->fop_flags & FOP_MMAP_SYNC)
> flags_mask |= MAP_SYNC;
> --
> 2.47.0.338.g60cca15819-goog
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2024-12-06 1:09 ` [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd Isaac J. Manjarres
2024-12-06 17:49 ` Kalesh Singh
@ 2024-12-06 18:19 ` Lorenzo Stoakes
2024-12-06 20:48 ` Isaac Manjarres
2025-01-03 15:13 ` Jann Horn
1 sibling, 2 replies; 20+ messages in thread
From: Lorenzo Stoakes @ 2024-12-06 18:19 UTC (permalink / raw)
To: Isaac J. Manjarres
Cc: Andrew Morton, Jeff Layton, Chuck Lever, Alexander Aring,
Liam R. Howlett, Vlastimil Babka, Jann Horn, Shuah Khan,
kernel-team, linux-mm, linux-kernel, linux-fsdevel,
linux-kselftest, Suren Baghdasaryan, Kalesh Singh, John Stultz
On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote:
> Android currently uses the ashmem driver [1] for creating shared memory
> regions between processes. Ashmem buffers can initially be mapped with
> PROT_READ, PROT_WRITE, and PROT_EXEC. Processes can then use the
> ASHMEM_SET_PROT_MASK ioctl command to restrict--never add--the
> permissions that the buffer can be mapped with.
>
> Processes can remove the ability to map ashmem buffers as executable to
> ensure that those buffers cannot be exploited to run unintended code.
> We are currently trying to replace ashmem with memfd. However, memfd
> does not have a provision to permanently remove the ability to map a
> buffer as executable. Although, this should be something that can be
> achieved via a new file seal.
>
> There are known usecases (e.g. CursorWindow [2]) where a process
> maps a buffer with read/write permissions before restricting the buffer
> to being mapped as read-only for future mappings.
>
> The resulting VMA from the writable mapping has VM_MAYEXEC set, meaning
> that mprotect() can change the mapping to be executable. Therefore,
> implementing the seal similar to F_SEAL_WRITE would not be appropriate,
> since it would not work with the CursorWindow usecase. This is because
> the CursorWindow process restricts the mapping permissions to read-only
> after the writable mapping is created. So, adding a file seal for
> executable mappings that operates like F_SEAL_WRITE would fail.
>
> Therefore, add support for F_SEAL_FUTURE_EXEC, which is handled
> similarly to F_SEAL_FUTURE_WRITE. This ensures that CursorWindow can
> continue to create a writable mapping initially, and then restrict the
> permissions on the buffer to be mappable as read-only by using both
> F_SEAL_FUTURE_WRITE and F_SEAL_FUTURE_EXEC. After the seal is
> applied, any calls to mmap() with PROT_EXEC will fail.
>
> [1] https://cs.android.com/android/kernel/superproject/+/common-android-mainline:common/drivers/staging/android/ashmem.c
> [2] https://developer.android.com/reference/android/database/CursorWindow
>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Kalesh Singh <kaleshsingh@google.com>
> Cc: John Stultz <jstultz@google.com>
> Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
> ---
> include/linux/mm.h | 5 +++++
> include/uapi/linux/fcntl.h | 1 +
> mm/memfd.c | 1 +
> mm/mmap.c | 11 +++++++++++
> 4 files changed, 18 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 4eb8e62d5c67..40c03a491e45 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -4096,6 +4096,11 @@ static inline bool is_write_sealed(int seals)
> return seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE);
> }
>
> +static inline bool is_exec_sealed(int seals)
> +{
> + return seals & F_SEAL_FUTURE_EXEC;
> +}
> +
> /**
> * is_readonly_sealed - Checks whether write-sealed but mapped read-only,
> * in which case writes should be disallowing moving
> diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> index 6e6907e63bfc..ef066e524777 100644
> --- a/include/uapi/linux/fcntl.h
> +++ b/include/uapi/linux/fcntl.h
> @@ -49,6 +49,7 @@
> #define F_SEAL_WRITE 0x0008 /* prevent writes */
> #define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */
> #define F_SEAL_EXEC 0x0020 /* prevent chmod modifying exec bits */
> +#define F_SEAL_FUTURE_EXEC 0x0040 /* prevent future executable mappings */
> /* (1U << 31) is reserved for signed error codes */
>
> /*
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 35a370d75c9a..77b49995a044 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -184,6 +184,7 @@ unsigned int *memfd_file_seals_ptr(struct file *file)
> }
>
> #define F_ALL_SEALS (F_SEAL_SEAL | \
> + F_SEAL_FUTURE_EXEC |\
> F_SEAL_EXEC | \
> F_SEAL_SHRINK | \
> F_SEAL_GROW | \
> diff --git a/mm/mmap.c b/mm/mmap.c
> index b1b2a24ef82e..c7b96b057fda 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -375,6 +375,17 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
> if (!file_mmap_ok(file, inode, pgoff, len))
> return -EOVERFLOW;
>
Not maybe in favour of _where_ in the logic we check this and definitely
not in expanding this do_mmap() stuff much further.
See comment at bottom though... I have a cunning plan :)
> + if (is_exec_sealed(seals)) {
Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution?
I've not tested this scenario so don't know if we somehow disallow this in
another way but note on write checks we only care about shared mappings.
I mean one could argue that a MAP_PRIVATE situation is the same as copying
the data into an anon buffer and doing what you want with it, here you
could argue the same...
So probably we should only care about VM_SHARED?
> + /* No new executable mappings if the file is exec sealed. */
> + if (prot & PROT_EXEC)
Seems strange to reference a prot flag rather than vma flag, we should have
that set up by now.
> + return -EACCES;
> + /*
> + * Prevent an initially non-executable mapping from
> + * later becoming executable via mprotect().
> + */
> + vm_flags &= ~VM_MAYEXEC;
> + }
> +
You know, I'm in two minds about this... I explicitly moved logic to
do_mmap() in [0] to workaround a chicken-and-egg scenario with having
accidentally undone the ability to mmap() read-only F_WRITE_SEALed
mappings, which meant I 'may as well' move the 'future proofing' clearing
of VM_MAYWRITE for F_SEAL_FUTURE_WRITE too.
But now I feel that the use of shmem_mmap() and hugetlbfs_file_mmap() to do
_any_ of this is pretty odious in general, we may as well do it all
upfront.
[0]:https://lore.kernel.org/all/cover.1732804776.git.lorenzo.stoakes@oracle.com/
> flags_mask = LEGACY_MAP_MASK;
> if (file->f_op->fop_flags & FOP_MMAP_SYNC)
> flags_mask |= MAP_SYNC;
> --
> 2.47.0.338.g60cca15819-goog
>
So actually - overall - could you hold off a bit on this until I've had a
think and can perhaps send a patch that refactors this?
Then your patch can build on top of that one and we can handle this all in
one place and sanely :)
Sorry you've kicked off thought processes here and that's often a dangerous
thing :P
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2024-12-06 18:19 ` Lorenzo Stoakes
@ 2024-12-06 20:48 ` Isaac Manjarres
2024-12-06 21:14 ` Lorenzo Stoakes
2025-01-03 15:13 ` Jann Horn
1 sibling, 1 reply; 20+ messages in thread
From: Isaac Manjarres @ 2024-12-06 20:48 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Jeff Layton, Chuck Lever, Alexander Aring,
Liam R. Howlett, Vlastimil Babka, Jann Horn, Shuah Khan,
kernel-team, linux-mm, linux-kernel, linux-fsdevel,
linux-kselftest, Suren Baghdasaryan, Kalesh Singh, John Stultz
On Fri, Dec 06, 2024 at 06:19:49PM +0000, Lorenzo Stoakes wrote:
> On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote:
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index b1b2a24ef82e..c7b96b057fda 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -375,6 +375,17 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
> > if (!file_mmap_ok(file, inode, pgoff, len))
> > return -EOVERFLOW;
> >
>
> Not maybe in favour of _where_ in the logic we check this and definitely
> not in expanding this do_mmap() stuff much further.
>
> See comment at bottom though... I have a cunning plan :)
>
> > + if (is_exec_sealed(seals)) {
>
> Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution?
> I've not tested this scenario so don't know if we somehow disallow this in
> another way but note on write checks we only care about shared mappings.
>
> I mean one could argue that a MAP_PRIVATE situation is the same as copying
> the data into an anon buffer and doing what you want with it, here you
> could argue the same...
>
> So probably we should only care about VM_SHARED?
Thanks for taking a look at this!
I'd originally implemented it for just the VM_SHARED case, but after
discussing it with Kalesh, I changed it to disallow executable
mappings for both MAP_SHARED and MAP_PRIVATE.
Our thought was that write sealing didn't apply in the MAP_PRIVATE case
to support COW with MAP_PRIVATE. There's nothing similar to COW with
execution, so I decided to prevent it for both cases; it also retains
the same behavior as the ashmem driver.
> > + /* No new executable mappings if the file is exec sealed. */
> > + if (prot & PROT_EXEC)
>
> Seems strange to reference a prot flag rather than vma flag, we should have
> that set up by now.
That makes sense. I can change this to check for VM_EXEC in v2 of this
series.
> > + return -EACCES;
> > + /*
> > + * Prevent an initially non-executable mapping from
> > + * later becoming executable via mprotect().
> > + */
> > + vm_flags &= ~VM_MAYEXEC;
> > + }
> > +
>
> You know, I'm in two minds about this... I explicitly moved logic to
> do_mmap() in [0] to workaround a chicken-and-egg scenario with having
> accidentally undone the ability to mmap() read-only F_WRITE_SEALed
> mappings, which meant I 'may as well' move the 'future proofing' clearing
> of VM_MAYWRITE for F_SEAL_FUTURE_WRITE too.
>
> But now I feel that the use of shmem_mmap() and hugetlbfs_file_mmap() to do
> _any_ of this is pretty odious in general, we may as well do it all
> upfront.
>
> [0]:https://lore.kernel.org/all/cover.1732804776.git.lorenzo.stoakes@oracle.com/
I agree. I really like the idea of handling the future proofing and
error checking in one place. It makes understanding how these seals
work a lot easier, and has the added benefit of only worrying about
the check once rather than having to duplicate it in both shmem_mmap() and
hugetlbfs_file_mmap().
> > flags_mask = LEGACY_MAP_MASK;
> > if (file->f_op->fop_flags & FOP_MMAP_SYNC)
> > flags_mask |= MAP_SYNC;
> > --
> > 2.47.0.338.g60cca15819-goog
> >
>
> So actually - overall - could you hold off a bit on this until I've had a
> think and can perhaps send a patch that refactors this?
>
> Then your patch can build on top of that one and we can handle this all in
> one place and sanely :)
>
> Sorry you've kicked off thought processes here and that's often a dangerous
> thing :P
Thanks again for reviewing these patches! Happy that I was able to get
the gears turning :)
I'm really interested in helping with this, so is there any forum you'd
like to use for collaborating on this or any way I can help?
I'm also more than happy to test out any patches that you'd like!
Thanks,
Isaac
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2024-12-06 17:49 ` Kalesh Singh
@ 2024-12-06 20:50 ` Isaac Manjarres
0 siblings, 0 replies; 20+ messages in thread
From: Isaac Manjarres @ 2024-12-06 20:50 UTC (permalink / raw)
To: Kalesh Singh
Cc: Andrew Morton, Jeff Layton, Chuck Lever, Alexander Aring,
Liam R. Howlett, Lorenzo Stoakes, Vlastimil Babka, Jann Horn,
Shuah Khan, kernel-team, linux-mm, linux-kernel, linux-fsdevel,
linux-kselftest, Suren Baghdasaryan, John Stultz
On Fri, Dec 06, 2024 at 09:49:35AM -0800, Kalesh Singh wrote:
> On Thu, Dec 5, 2024 at 5:09 PM Isaac J. Manjarres
> <isaacmanjarres@google.com> wrote:
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -375,6 +375,17 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
> > if (!file_mmap_ok(file, inode, pgoff, len))
> > return -EOVERFLOW;
> >
> > + if (is_exec_sealed(seals)) {
> > + /* No new executable mappings if the file is exec sealed. */
> > + if (prot & PROT_EXEC)
> > + return -EACCES;
>
> I think this should be -EPERM to be consistent with seal_check_write()
> and mmap(2) man page:
>
> " EPERM The operation was prevented by a file seal; see fcntl(2)."
>
> Thanks,
> Kalesh
>
Thanks for catching that Kalesh! I agree and will fix this in v2 of the
series.
Thanks,
Isaac
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2024-12-06 20:48 ` Isaac Manjarres
@ 2024-12-06 21:14 ` Lorenzo Stoakes
2024-12-11 20:56 ` Isaac Manjarres
0 siblings, 1 reply; 20+ messages in thread
From: Lorenzo Stoakes @ 2024-12-06 21:14 UTC (permalink / raw)
To: Isaac Manjarres
Cc: Andrew Morton, Jeff Layton, Chuck Lever, Alexander Aring,
Liam R. Howlett, Vlastimil Babka, Jann Horn, Shuah Khan,
kernel-team, linux-mm, linux-kernel, linux-fsdevel,
linux-kselftest, Suren Baghdasaryan, Kalesh Singh, John Stultz
On Fri, Dec 06, 2024 at 12:48:09PM -0800, Isaac Manjarres wrote:
> On Fri, Dec 06, 2024 at 06:19:49PM +0000, Lorenzo Stoakes wrote:
> > On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote:
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index b1b2a24ef82e..c7b96b057fda 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -375,6 +375,17 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
> > > if (!file_mmap_ok(file, inode, pgoff, len))
> > > return -EOVERFLOW;
> > >
> >
> > Not maybe in favour of _where_ in the logic we check this and definitely
> > not in expanding this do_mmap() stuff much further.
> >
> > See comment at bottom though... I have a cunning plan :)
> >
> > > + if (is_exec_sealed(seals)) {
> >
> > Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution?
> > I've not tested this scenario so don't know if we somehow disallow this in
> > another way but note on write checks we only care about shared mappings.
> >
> > I mean one could argue that a MAP_PRIVATE situation is the same as copying
> > the data into an anon buffer and doing what you want with it, here you
> > could argue the same...
> >
> > So probably we should only care about VM_SHARED?
>
> Thanks for taking a look at this!
>
> I'd originally implemented it for just the VM_SHARED case, but after
> discussing it with Kalesh, I changed it to disallow executable
> mappings for both MAP_SHARED and MAP_PRIVATE.
>
> Our thought was that write sealing didn't apply in the MAP_PRIVATE case
> to support COW with MAP_PRIVATE. There's nothing similar to COW with
> execution, so I decided to prevent it for both cases; it also retains
> the same behavior as the ashmem driver.
Hm, yeah I'm not sure that's really justified, I mean what's to stop a
caller from just mapping their own memory mapping executable, copying the
data and executing?
There's also further concerns around execution restriction for instance in
memfd_add_seals():
/*
* SEAL_EXEC implys SEAL_WRITE, making W^X from the start.
*/
if (seals & F_SEAL_EXEC && inode->i_mode & 0111)
seals |= F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_FUTURE_WRITE;
So you probably want to change this to include F_SEAL_FUTURE_EXEC, and note
your proposal interacts negatively with the whole
MFD_NOEXEC_SCOPE_NOEXEC_ENFORCED mode set in vm.memfd_noeec - any system
with this set to '2' will literally not allow you to do what you want if
set to 2.
See https://origin.kernel.org/doc/html/latest/userspace-api/mfd_noexec.html
>
> > > + /* No new executable mappings if the file is exec sealed. */
> > > + if (prot & PROT_EXEC)
> >
> > Seems strange to reference a prot flag rather than vma flag, we should have
> > that set up by now.
>
> That makes sense. I can change this to check for VM_EXEC in v2 of this
> series.
> > > + return -EACCES;
> > > + /*
> > > + * Prevent an initially non-executable mapping from
> > > + * later becoming executable via mprotect().
> > > + */
> > > + vm_flags &= ~VM_MAYEXEC;
> > > + }
> > > +
> >
> > You know, I'm in two minds about this... I explicitly moved logic to
> > do_mmap() in [0] to workaround a chicken-and-egg scenario with having
> > accidentally undone the ability to mmap() read-only F_WRITE_SEALed
> > mappings, which meant I 'may as well' move the 'future proofing' clearing
> > of VM_MAYWRITE for F_SEAL_FUTURE_WRITE too.
> >
> > But now I feel that the use of shmem_mmap() and hugetlbfs_file_mmap() to do
> > _any_ of this is pretty odious in general, we may as well do it all
> > upfront.
> >
> > [0]:https://lore.kernel.org/all/cover.1732804776.git.lorenzo.stoakes@oracle.com/
>
> I agree. I really like the idea of handling the future proofing and
> error checking in one place. It makes understanding how these seals
> work a lot easier, and has the added benefit of only worrying about
> the check once rather than having to duplicate it in both shmem_mmap() and
> hugetlbfs_file_mmap().
>
> > > flags_mask = LEGACY_MAP_MASK;
> > > if (file->f_op->fop_flags & FOP_MMAP_SYNC)
> > > flags_mask |= MAP_SYNC;
> > > --
> > > 2.47.0.338.g60cca15819-goog
> > >
> >
> > So actually - overall - could you hold off a bit on this until I've had a
> > think and can perhaps send a patch that refactors this?
> >
> > Then your patch can build on top of that one and we can handle this all in
> > one place and sanely :)
> >
> > Sorry you've kicked off thought processes here and that's often a dangerous
> > thing :P
> Thanks again for reviewing these patches! Happy that I was able to get
> the gears turning :)
>
> I'm really interested in helping with this, so is there any forum you'd
> like to use for collaborating on this or any way I can help?
>
> I'm also more than happy to test out any patches that you'd like!
Thanks, I'm just going to post to the mailing list, this is the discussion
forum I'm making use of for this :)
I will cc- you on my patch and definitely I'd appreciate testing thanks!
But yeah, to be clear I'm not done with reviewing this, I need more time to
digest what you're trying to do here, but you definitely need to think
about the exec limitations.
>
> Thanks,
> Isaac
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2024-12-06 21:14 ` Lorenzo Stoakes
@ 2024-12-11 20:56 ` Isaac Manjarres
0 siblings, 0 replies; 20+ messages in thread
From: Isaac Manjarres @ 2024-12-11 20:56 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Jeff Layton, Chuck Lever, Alexander Aring,
Liam R. Howlett, Vlastimil Babka, Jann Horn, Shuah Khan,
kernel-team, linux-mm, linux-kernel, linux-fsdevel,
linux-kselftest, Suren Baghdasaryan, Kalesh Singh, John Stultz
On Fri, Dec 06, 2024 at 09:14:58PM +0000, Lorenzo Stoakes wrote:
> On Fri, Dec 06, 2024 at 12:48:09PM -0800, Isaac Manjarres wrote:
> > On Fri, Dec 06, 2024 at 06:19:49PM +0000, Lorenzo Stoakes wrote:
> > > On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote:
> > > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > > index b1b2a24ef82e..c7b96b057fda 100644
> > > > --- a/mm/mmap.c
> > > > +++ b/mm/mmap.c
> > > > @@ -375,6 +375,17 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
> > > > if (!file_mmap_ok(file, inode, pgoff, len))
> > > > return -EOVERFLOW;
> > > >
> > >
> > > Not maybe in favour of _where_ in the logic we check this and definitely
> > > not in expanding this do_mmap() stuff much further.
> > >
> > > See comment at bottom though... I have a cunning plan :)
> > >
> > > > + if (is_exec_sealed(seals)) {
> > >
> > > Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution?
> > > I've not tested this scenario so don't know if we somehow disallow this in
> > > another way but note on write checks we only care about shared mappings.
> > >
> > > I mean one could argue that a MAP_PRIVATE situation is the same as copying
> > > the data into an anon buffer and doing what you want with it, here you
> > > could argue the same...
> > >
> > > So probably we should only care about VM_SHARED?
> >
> > Thanks for taking a look at this!
> >
> > I'd originally implemented it for just the VM_SHARED case, but after
> > discussing it with Kalesh, I changed it to disallow executable
> > mappings for both MAP_SHARED and MAP_PRIVATE.
> >
> > Our thought was that write sealing didn't apply in the MAP_PRIVATE case
> > to support COW with MAP_PRIVATE. There's nothing similar to COW with
> > execution, so I decided to prevent it for both cases; it also retains
> > the same behavior as the ashmem driver.
>
> Hm, yeah I'm not sure that's really justified, I mean what's to stop a
> caller from just mapping their own memory mapping executable, copying the
> data and executing?
>
That's a fair point. In that case, I think it makes sense to enforce the
seal only when the mapping is shared.
The case I'm trying to address is when a process (A) allocates a memfd
that is meant to be read and written by itself and another process (B).
A shares the buffer with B, but B injects code into the buffer, and
compromises A such that A maps the buffer with PROT_EXEC and runs the
code that B injected into it.
If A used F_SEAL_FUTURE_EXEC prior to sharing the buffer, then it could
reduce the attack surface on itself in this scenario.
> There's also further concerns around execution restriction for instance in
> memfd_add_seals():
>
> /*
> * SEAL_EXEC implys SEAL_WRITE, making W^X from the start.
> */
> if (seals & F_SEAL_EXEC && inode->i_mode & 0111)
> seals |= F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_FUTURE_WRITE;
>
> So you probably want to change this to include F_SEAL_FUTURE_EXEC, and note
Do you mean adding a case where if F_SEAL_FUTURE_EXEC is in the seals,
then we should clear the X bits of the file and use F_SEAL_EXEC as well?
I don't think the case in the if condition should imply F_SEAL_FUTURE_EXEC,
since the file is still executable in this case?
> your proposal interacts negatively with the whole
> MFD_NOEXEC_SCOPE_NOEXEC_ENFORCED mode set in vm.memfd_noeec - any system
> with this set to '2' will literally not allow you to do what you want if
> set to 2.
>
> See https://origin.kernel.org/doc/html/latest/userspace-api/mfd_noexec.html
Sorry, I didn't follow how this would impact
MFD_NOEXEC_SCOPE_NOEXEC_ENFORCED. Could you please clarify that?
> > Thanks again for reviewing these patches! Happy that I was able to get
> > the gears turning :)
> >
> > I'm really interested in helping with this, so is there any forum you'd
> > like to use for collaborating on this or any way I can help?
> >
> > I'm also more than happy to test out any patches that you'd like!
>
> Thanks, I'm just going to post to the mailing list, this is the discussion
> forum I'm making use of for this :)
>
> I will cc- you on my patch and definitely I'd appreciate testing thanks!
>
> But yeah, to be clear I'm not done with reviewing this, I need more time to
> digest what you're trying to do here, but you definitely need to think
> about the exec limitations.
Thanks for sending out the patch. I took a look and tested it out and it
definitely makes implementing this a lot nicer!
Thanks,
Isaac
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2024-12-06 18:19 ` Lorenzo Stoakes
2024-12-06 20:48 ` Isaac Manjarres
@ 2025-01-03 15:13 ` Jann Horn
2025-01-06 18:26 ` Jeff Xu
1 sibling, 1 reply; 20+ messages in thread
From: Jann Horn @ 2025-01-03 15:13 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Isaac J. Manjarres, Andrew Morton, Jeff Layton, Chuck Lever,
Alexander Aring, Liam R. Howlett, Vlastimil Babka, Shuah Khan,
kernel-team, linux-mm, linux-kernel, linux-fsdevel,
linux-kselftest, Suren Baghdasaryan, Kalesh Singh, John Stultz
On Fri, Dec 6, 2024 at 7:19 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
> On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote:
> > + if (is_exec_sealed(seals)) {
>
> Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution?
> I've not tested this scenario so don't know if we somehow disallow this in
> another way but note on write checks we only care about shared mappings.
>
> I mean one could argue that a MAP_PRIVATE situation is the same as copying
> the data into an anon buffer and doing what you want with it, here you
> could argue the same...
>
> So probably we should only care about VM_SHARED?
FWIW I think it doesn't make sense to distinguish between
shared/private mappings here - in the scenario described in the cover
letter, it wouldn't matter that much to an attacker whether the
mapping is shared or private (as long as the VMA contents haven't been
CoWed already). But you're also right that in the scenario described,
an attacker might also be able to create a writable+executable anon
VMA and copy into that, or map another memfd that hasn't been sealed,
or stuff like that. We can block such things - but not by only
providing sealing operations on individual memfds. I think this
instead requires policy that applies at the process level, either
using system-wide SELinux policy or using process sandboxing APIs.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2025-01-03 15:13 ` Jann Horn
@ 2025-01-06 18:26 ` Jeff Xu
2025-01-07 0:44 ` Kees Cook
0 siblings, 1 reply; 20+ messages in thread
From: Jeff Xu @ 2025-01-06 18:26 UTC (permalink / raw)
To: Jann Horn, Kees Cook
Cc: Lorenzo Stoakes, Isaac J. Manjarres, Andrew Morton, Jeff Layton,
Chuck Lever, Alexander Aring, Liam R. Howlett, Vlastimil Babka,
Shuah Khan, kernel-team, linux-mm, linux-kernel, linux-fsdevel,
linux-kselftest, Suren Baghdasaryan, Kalesh Singh, John Stultz
+ Kees because this is related to W^X memfd and security.
On Fri, Jan 3, 2025 at 7:14 AM Jann Horn <jannh@google.com> wrote:
>
> On Fri, Dec 6, 2024 at 7:19 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> > On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote:
> > > + if (is_exec_sealed(seals)) {
> >
> > Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution?
> > I've not tested this scenario so don't know if we somehow disallow this in
> > another way but note on write checks we only care about shared mappings.
> >
> > I mean one could argue that a MAP_PRIVATE situation is the same as copying
> > the data into an anon buffer and doing what you want with it, here you
> > could argue the same...
> >
> > So probably we should only care about VM_SHARED?
>
> FWIW I think it doesn't make sense to distinguish between
> shared/private mappings here - in the scenario described in the cover
> letter, it wouldn't matter that much to an attacker whether the
> mapping is shared or private (as long as the VMA contents haven't been
> CoWed already).
+1 on this.
The concept of blocking this for only shared mapping is questionable.
> But you're also right that in the scenario described,
> an attacker might also be able to create a writable+executable anon
> VMA and copy into that, or map another memfd that hasn't been sealed,
> or stuff like that. We can block such things - but not by only
> providing sealing operations on individual memfds. I think this
> instead requires policy that applies at the process level, either
> using system-wide SELinux policy or using process sandboxing APIs.
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2025-01-06 18:26 ` Jeff Xu
@ 2025-01-07 0:44 ` Kees Cook
2025-01-08 19:06 ` Lorenzo Stoakes
0 siblings, 1 reply; 20+ messages in thread
From: Kees Cook @ 2025-01-07 0:44 UTC (permalink / raw)
To: Jeff Xu
Cc: Jann Horn, Lorenzo Stoakes, Isaac J. Manjarres, Andrew Morton,
Jeff Layton, Chuck Lever, Alexander Aring, Liam R. Howlett,
Vlastimil Babka, Shuah Khan, kernel-team, linux-mm, linux-kernel,
linux-fsdevel, linux-kselftest, Suren Baghdasaryan, Kalesh Singh,
John Stultz
On Mon, Jan 06, 2025 at 10:26:27AM -0800, Jeff Xu wrote:
> + Kees because this is related to W^X memfd and security.
>
> On Fri, Jan 3, 2025 at 7:14 AM Jann Horn <jannh@google.com> wrote:
> >
> > On Fri, Dec 6, 2024 at 7:19 PM Lorenzo Stoakes
> > <lorenzo.stoakes@oracle.com> wrote:
> > > On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote:
> > > > + if (is_exec_sealed(seals)) {
> > >
> > > Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution?
> > > I've not tested this scenario so don't know if we somehow disallow this in
> > > another way but note on write checks we only care about shared mappings.
> > >
> > > I mean one could argue that a MAP_PRIVATE situation is the same as copying
> > > the data into an anon buffer and doing what you want with it, here you
> > > could argue the same...
> > >
> > > So probably we should only care about VM_SHARED?
> >
> > FWIW I think it doesn't make sense to distinguish between
> > shared/private mappings here - in the scenario described in the cover
> > letter, it wouldn't matter that much to an attacker whether the
> > mapping is shared or private (as long as the VMA contents haven't been
> > CoWed already).
> +1 on this.
> The concept of blocking this for only shared mapping is questionable.
Right -- why does sharedness matter? It seems more robust to me to not
create a corner case but rather apply the flag/behavior universally?
--
Kees Cook
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2025-01-07 0:44 ` Kees Cook
@ 2025-01-08 19:06 ` Lorenzo Stoakes
2025-01-08 22:07 ` Kees Cook
2025-01-09 23:30 ` Jeff Xu
0 siblings, 2 replies; 20+ messages in thread
From: Lorenzo Stoakes @ 2025-01-08 19:06 UTC (permalink / raw)
To: Kees Cook
Cc: Jeff Xu, Jann Horn, Isaac J. Manjarres, Andrew Morton,
Jeff Layton, Chuck Lever, Alexander Aring, Liam R. Howlett,
Vlastimil Babka, Shuah Khan, kernel-team, linux-mm, linux-kernel,
linux-fsdevel, linux-kselftest, Suren Baghdasaryan, Kalesh Singh,
John Stultz
On Mon, Jan 06, 2025 at 04:44:33PM -0800, Kees Cook wrote:
> On Mon, Jan 06, 2025 at 10:26:27AM -0800, Jeff Xu wrote:
> > + Kees because this is related to W^X memfd and security.
> >
> > On Fri, Jan 3, 2025 at 7:14 AM Jann Horn <jannh@google.com> wrote:
> > >
> > > On Fri, Dec 6, 2024 at 7:19 PM Lorenzo Stoakes
> > > <lorenzo.stoakes@oracle.com> wrote:
> > > > On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote:
> > > > > + if (is_exec_sealed(seals)) {
> > > >
> > > > Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution?
> > > > I've not tested this scenario so don't know if we somehow disallow this in
> > > > another way but note on write checks we only care about shared mappings.
> > > >
> > > > I mean one could argue that a MAP_PRIVATE situation is the same as copying
> > > > the data into an anon buffer and doing what you want with it, here you
> > > > could argue the same...
> > > >
> > > > So probably we should only care about VM_SHARED?
> > >
> > > FWIW I think it doesn't make sense to distinguish between
> > > shared/private mappings here - in the scenario described in the cover
> > > letter, it wouldn't matter that much to an attacker whether the
> > > mapping is shared or private (as long as the VMA contents haven't been
> > > CoWed already).
> > +1 on this.
> > The concept of blocking this for only shared mapping is questionable.
>
> Right -- why does sharedness matter? It seems more robust to me to not
> create a corner case but rather apply the flag/behavior universally?
>
I'm struggling to understand what you are protecting against, if I can receive a
buffer '-not executable-'. But then copy it into another buffer I mapped, and
execute it?
I mean am I missing something? It's very possible :)
The cost is complexity. And the difference between mappings which are shared and
those which are private and moreso MAP_PRIVATE of an fd are actually quite a lot
internally (go look at anon_vma code if you have the great benefit of not yet
doing so to see the deepest, darkest, 9th circle of complexity hell :>).
Again, I may be missing the point here or misunderstanding the apparent attack
vector, but this is where I'm coming from.
> --
> Kees Cook
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2025-01-08 19:06 ` Lorenzo Stoakes
@ 2025-01-08 22:07 ` Kees Cook
2025-01-09 23:30 ` Jeff Xu
1 sibling, 0 replies; 20+ messages in thread
From: Kees Cook @ 2025-01-08 22:07 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Jeff Xu, Jann Horn, Isaac J. Manjarres, Andrew Morton,
Jeff Layton, Chuck Lever, Alexander Aring, Liam R. Howlett,
Vlastimil Babka, Shuah Khan, kernel-team, linux-mm, linux-kernel,
linux-fsdevel, linux-kselftest, Suren Baghdasaryan, Kalesh Singh,
John Stultz
On Wed, Jan 08, 2025 at 07:06:13PM +0000, Lorenzo Stoakes wrote:
> On Mon, Jan 06, 2025 at 04:44:33PM -0800, Kees Cook wrote:
> > On Mon, Jan 06, 2025 at 10:26:27AM -0800, Jeff Xu wrote:
> > > + Kees because this is related to W^X memfd and security.
> > >
> > > On Fri, Jan 3, 2025 at 7:14 AM Jann Horn <jannh@google.com> wrote:
> > > >
> > > > On Fri, Dec 6, 2024 at 7:19 PM Lorenzo Stoakes
> > > > <lorenzo.stoakes@oracle.com> wrote:
> > > > > On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote:
> > > > > > + if (is_exec_sealed(seals)) {
> > > > >
> > > > > Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution?
> > > > > I've not tested this scenario so don't know if we somehow disallow this in
> > > > > another way but note on write checks we only care about shared mappings.
> > > > >
> > > > > I mean one could argue that a MAP_PRIVATE situation is the same as copying
> > > > > the data into an anon buffer and doing what you want with it, here you
> > > > > could argue the same...
> > > > >
> > > > > So probably we should only care about VM_SHARED?
> > > >
> > > > FWIW I think it doesn't make sense to distinguish between
> > > > shared/private mappings here - in the scenario described in the cover
> > > > letter, it wouldn't matter that much to an attacker whether the
> > > > mapping is shared or private (as long as the VMA contents haven't been
> > > > CoWed already).
> > > +1 on this.
> > > The concept of blocking this for only shared mapping is questionable.
> >
> > Right -- why does sharedness matter? It seems more robust to me to not
> > create a corner case but rather apply the flag/behavior universally?
> >
>
> I'm struggling to understand what you are protecting against, if I can receive a
> buffer '-not executable-'. But then copy it into another buffer I mapped, and
> execute it?
>
> I mean am I missing something? It's very possible :)
Jann, how do you see a private mapping being exploited this way? My
mental model of the attack depends on a malicious process tricking a
victim process -- i.e. setting it executable and then gaining exec
control of the victim to point into the buffer. An attack on a private
mapping would require a way to trick the process into making the mapping
executable (which seems a high barrier) first.
> The cost is complexity. And the difference between mappings which are shared and
> those which are private and moreso MAP_PRIVATE of an fd are actually quite a lot
> internally (go look at anon_vma code if you have the great benefit of not yet
> doing so to see the deepest, darkest, 9th circle of complexity hell :>).
Ah, okay, I thought it would be pretty "early" in the VMA logic. i.e.
asking the question "Can I make this executable?" I was expecting to be
common across the VMA regardless of private/shared. I will need to go
read the code more carefully.
Still, it seems nicer to me if the "can this be made executable in the
future" question doesn't matter on sharing, just from a perspective of
least surprise.
--
Kees Cook
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2025-01-08 19:06 ` Lorenzo Stoakes
2025-01-08 22:07 ` Kees Cook
@ 2025-01-09 23:30 ` Jeff Xu
2025-01-14 20:02 ` Isaac Manjarres
1 sibling, 1 reply; 20+ messages in thread
From: Jeff Xu @ 2025-01-09 23:30 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Kees Cook, Jann Horn, Isaac J. Manjarres, Andrew Morton,
Jeff Layton, Chuck Lever, Alexander Aring, Liam R. Howlett,
Vlastimil Babka, Shuah Khan, kernel-team, linux-mm, linux-kernel,
linux-fsdevel, linux-kselftest, Suren Baghdasaryan, Kalesh Singh,
John Stultz
On Wed, Jan 8, 2025 at 11:06 AM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Mon, Jan 06, 2025 at 04:44:33PM -0800, Kees Cook wrote:
> > On Mon, Jan 06, 2025 at 10:26:27AM -0800, Jeff Xu wrote:
> > > + Kees because this is related to W^X memfd and security.
> > >
> > > On Fri, Jan 3, 2025 at 7:14 AM Jann Horn <jannh@google.com> wrote:
> > > >
> > > > On Fri, Dec 6, 2024 at 7:19 PM Lorenzo Stoakes
> > > > <lorenzo.stoakes@oracle.com> wrote:
> > > > > On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote:
> > > > > > + if (is_exec_sealed(seals)) {
> > > > >
> > > > > Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution?
> > > > > I've not tested this scenario so don't know if we somehow disallow this in
> > > > > another way but note on write checks we only care about shared mappings.
> > > > >
> > > > > I mean one could argue that a MAP_PRIVATE situation is the same as copying
> > > > > the data into an anon buffer and doing what you want with it, here you
> > > > > could argue the same...
> > > > >
> > > > > So probably we should only care about VM_SHARED?
> > > >
> > > > FWIW I think it doesn't make sense to distinguish between
> > > > shared/private mappings here - in the scenario described in the cover
> > > > letter, it wouldn't matter that much to an attacker whether the
> > > > mapping is shared or private (as long as the VMA contents haven't been
> > > > CoWed already).
> > > +1 on this.
> > > The concept of blocking this for only shared mapping is questionable.
> >
> > Right -- why does sharedness matter? It seems more robust to me to not
> > create a corner case but rather apply the flag/behavior universally?
> >
>
> I'm struggling to understand what you are protecting against, if I can receive a
> buffer '-not executable-'. But then copy it into another buffer I mapped, and
> execute it?
>
preventing mmap() a memfd has the same threat model as preventing
execve() of a memfd, using execve() of a memfd as an example (since
the kernel already supports this): an attacker wanting to execute a
hijacked memfd must already have the ability to call execve() (e.g.,
by modifying a function pointer or using ROP). To prevent this, the
kernel supports making memfds non-executable (rw-) and permanently
preventing them from becoming executable (sealing with F_SEAL_EXEC).
Once the execve() attack path is blocked, the next thing an attacker
could do is mmap() the memfd into the process's memory and jump to it.
That is the reason I think we could tie this feature with
non-executable-bit + F_SEAL_EXEC, and avoid a new flag. This approach
leverages the existing memfd_create(MFD_NOEXEC_SEAL) function and
related sysctl for system level enforcement. This makes it simple for
applications to use the same function and ensures that both execve()
and mmap() are blocked by non-executable memfd. There is a small
backward compatibility risk for this approach, e.g. an application
already uses MFD_NOEXEC_SEAL but chooses to mmap it as executable, but
I think such a user case is strange to support.
If we're okay with using F_SEAL_FUTURE_EXEC, then share/private don't
matter, as the threat model explains. This flag's generic naming also
suggests it could apply to regular files in future. However, if this
flag is intended solely for shared memfd, it should be renamed to
something like F_SEAL_SHARED_MEMFD_FUTURE_EXEC_MAPPING. I don't think
this is the intention, right ?
LSM such as selinux is another option to consider, maybe it is a
better place to implement this? because selinux policy provides
system level control and enforcement, which the current implementation
lacks. If we end up having a selinux policy for this later, wouldn't
that obsolete this feature ?
-Jeff
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2025-01-09 23:30 ` Jeff Xu
@ 2025-01-14 20:02 ` Isaac Manjarres
2025-01-14 21:29 ` Kees Cook
0 siblings, 1 reply; 20+ messages in thread
From: Isaac Manjarres @ 2025-01-14 20:02 UTC (permalink / raw)
To: Jeff Xu
Cc: Lorenzo Stoakes, Kees Cook, Jann Horn, Andrew Morton,
Jeff Layton, Chuck Lever, Alexander Aring, Liam R. Howlett,
Vlastimil Babka, Shuah Khan, kernel-team, linux-mm, linux-kernel,
linux-fsdevel, linux-kselftest, Suren Baghdasaryan, Kalesh Singh,
John Stultz
On Thu, Jan 09, 2025 at 03:30:36PM -0800, Jeff Xu wrote:
> On Wed, Jan 8, 2025 at 11:06 AM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Mon, Jan 06, 2025 at 04:44:33PM -0800, Kees Cook wrote:
> > > On Mon, Jan 06, 2025 at 10:26:27AM -0800, Jeff Xu wrote:
> > > > + Kees because this is related to W^X memfd and security.
> > > >
> > > > On Fri, Jan 3, 2025 at 7:14 AM Jann Horn <jannh@google.com> wrote:
> > > > >
> > > > > On Fri, Dec 6, 2024 at 7:19 PM Lorenzo Stoakes
> > > > > <lorenzo.stoakes@oracle.com> wrote:
> > > > > > On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote:
> > > > > > > + if (is_exec_sealed(seals)) {
> > > > > >
> > > > > > Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution?
> > > > > > I've not tested this scenario so don't know if we somehow disallow this in
> > > > > > another way but note on write checks we only care about shared mappings.
> > > > > >
> > > > > > I mean one could argue that a MAP_PRIVATE situation is the same as copying
> > > > > > the data into an anon buffer and doing what you want with it, here you
> > > > > > could argue the same...
> > > > > >
> > > > > > So probably we should only care about VM_SHARED?
> > > > >
> > > > > FWIW I think it doesn't make sense to distinguish between
> > > > > shared/private mappings here - in the scenario described in the cover
> > > > > letter, it wouldn't matter that much to an attacker whether the
> > > > > mapping is shared or private (as long as the VMA contents haven't been
> > > > > CoWed already).
> > > > +1 on this.
> > > > The concept of blocking this for only shared mapping is questionable.
> > >
> > > Right -- why does sharedness matter? It seems more robust to me to not
> > > create a corner case but rather apply the flag/behavior universally?
> > >
> >
> > I'm struggling to understand what you are protecting against, if I can receive a
> > buffer '-not executable-'. But then copy it into another buffer I mapped, and
> > execute it?
> >
> preventing mmap() a memfd has the same threat model as preventing
> execve() of a memfd, using execve() of a memfd as an example (since
> the kernel already supports this): an attacker wanting to execute a
> hijacked memfd must already have the ability to call execve() (e.g.,
> by modifying a function pointer or using ROP). To prevent this, the
> kernel supports making memfds non-executable (rw-) and permanently
> preventing them from becoming executable (sealing with F_SEAL_EXEC).
> Once the execve() attack path is blocked, the next thing an attacker
> could do is mmap() the memfd into the process's memory and jump to it.
>
I think the main issue in the threat model that I described is that
an attacking process can gain control of a more priveleged process.
Yes, having the buffer sealed against execution would prevent the
attacker from running the injected from *that* buffer, but
if they're already controlling the process, they could have the process
create a memfd that is executable (imagine a system where
MFD_NOEXEC_SEAL is not the default), copy the code, and then execute it
from there.
I spoke about this offline with Jann as well, and we both agree that
given that line of reasoning, this feature that I'm trying to add
doesn't buy us the security that I initially thought it would.
Therefore, we will be dropping this patch.
Thank you everyone for the discussion and reviews!
--Isaac
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2025-01-14 20:02 ` Isaac Manjarres
@ 2025-01-14 21:29 ` Kees Cook
2025-01-14 22:42 ` Isaac Manjarres
0 siblings, 1 reply; 20+ messages in thread
From: Kees Cook @ 2025-01-14 21:29 UTC (permalink / raw)
To: Isaac Manjarres
Cc: Jeff Xu, Lorenzo Stoakes, Jann Horn, Andrew Morton, Jeff Layton,
Chuck Lever, Alexander Aring, Liam R. Howlett, Vlastimil Babka,
Shuah Khan, kernel-team, linux-mm, linux-kernel, linux-fsdevel,
linux-kselftest, Suren Baghdasaryan, Kalesh Singh, John Stultz
On Tue, Jan 14, 2025 at 12:02:28PM -0800, Isaac Manjarres wrote:
> I think the main issue in the threat model that I described is that
> an attacking process can gain control of a more priveleged process.
I understood it to be about an attacker gaining execution control through
a rewritten function pointer, not that they already have arbitrary
execution control. (i.e. taking a "jump anywhere" primitive and
upgrading it to "execute anything".) Is the expectation that existing
ROP/JOP techniques make protecting memfd irrelevant?
--
Kees Cook
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2025-01-14 21:29 ` Kees Cook
@ 2025-01-14 22:42 ` Isaac Manjarres
2025-01-14 23:41 ` Jeff Xu
0 siblings, 1 reply; 20+ messages in thread
From: Isaac Manjarres @ 2025-01-14 22:42 UTC (permalink / raw)
To: Kees Cook
Cc: Jeff Xu, Lorenzo Stoakes, Jann Horn, Andrew Morton, Jeff Layton,
Chuck Lever, Alexander Aring, Liam R. Howlett, Vlastimil Babka,
Shuah Khan, kernel-team, linux-mm, linux-kernel, linux-fsdevel,
linux-kselftest, Suren Baghdasaryan, Kalesh Singh, John Stultz
On Tue, Jan 14, 2025 at 01:29:44PM -0800, Kees Cook wrote:
> On Tue, Jan 14, 2025 at 12:02:28PM -0800, Isaac Manjarres wrote:
> > I think the main issue in the threat model that I described is that
> > an attacking process can gain control of a more priveleged process.
>
> I understood it to be about an attacker gaining execution control through
> a rewritten function pointer, not that they already have arbitrary
> execution control. (i.e. taking a "jump anywhere" primitive and
> upgrading it to "execute anything".) Is the expectation that existing
> ROP/JOP techniques make protecting memfd irrelevant?
>
Is arbitrary execution control necessary? Suppose the attacker
overwrites the function pointer that the victim process is supposed to
return to. The attacker makes it that the victim process ends up
executing code that maps the buffer with PROT_EXEC and then jumps to
the buffer to execute the code that was injected.
I don't think having the ability to seal memfds against execution on a
per-buffer basis entirely addresses that attack. Can't the attacker
craft a different type of attack where they instead copy the code they
wrote to an executable memfd? I think a way to avoid that would be if
the original memfd was write-only to avoid the copy scenario but that
is not reasonable. Alternatively, MFD_NOEXEC_SEAL could be extended
to prevent executable mappings, and MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED
could be enabled, but that type of system would prevent memfd buffers
from being used for execution for legitimate usecases (e.g. JIT), which
may not be desirable.
--Isaac
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2025-01-14 22:42 ` Isaac Manjarres
@ 2025-01-14 23:41 ` Jeff Xu
2025-01-14 23:56 ` Jeff Xu
0 siblings, 1 reply; 20+ messages in thread
From: Jeff Xu @ 2025-01-14 23:41 UTC (permalink / raw)
To: Isaac Manjarres
Cc: Kees Cook, Lorenzo Stoakes, Jann Horn, Andrew Morton,
Jeff Layton, Chuck Lever, Alexander Aring, Liam R. Howlett,
Vlastimil Babka, Shuah Khan, kernel-team, linux-mm, linux-kernel,
linux-fsdevel, linux-kselftest, Suren Baghdasaryan, Kalesh Singh,
John Stultz
On Tue, Jan 14, 2025 at 2:42 PM Isaac Manjarres
<isaacmanjarres@google.com> wrote:
>
> On Tue, Jan 14, 2025 at 01:29:44PM -0800, Kees Cook wrote:
> > On Tue, Jan 14, 2025 at 12:02:28PM -0800, Isaac Manjarres wrote:
> Alternatively, MFD_NOEXEC_SEAL could be extended
> to prevent executable mappings, and MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED
> could be enabled, but that type of system would prevent memfd buffers
> from being used for execution for legitimate usecases (e.g. JIT), which
> may not be desirable.
>
The JIT case doesn't use execve(memfd), right ?
> --Isaac
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd
2025-01-14 23:41 ` Jeff Xu
@ 2025-01-14 23:56 ` Jeff Xu
0 siblings, 0 replies; 20+ messages in thread
From: Jeff Xu @ 2025-01-14 23:56 UTC (permalink / raw)
To: Isaac Manjarres
Cc: Kees Cook, Lorenzo Stoakes, Jann Horn, Andrew Morton,
Jeff Layton, Chuck Lever, Alexander Aring, Liam R. Howlett,
Vlastimil Babka, Shuah Khan, kernel-team, linux-mm, linux-kernel,
linux-fsdevel, linux-kselftest, Suren Baghdasaryan, Kalesh Singh,
John Stultz
On Tue, Jan 14, 2025 at 3:41 PM Jeff Xu <jeffxu@chromium.org> wrote:
>
> On Tue, Jan 14, 2025 at 2:42 PM Isaac Manjarres
> <isaacmanjarres@google.com> wrote:
> >
> > On Tue, Jan 14, 2025 at 01:29:44PM -0800, Kees Cook wrote:
> > > On Tue, Jan 14, 2025 at 12:02:28PM -0800, Isaac Manjarres wrote:
>
> > Alternatively, MFD_NOEXEC_SEAL could be extended
> > to prevent executable mappings, and MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED
> > could be enabled, but that type of system would prevent memfd buffers
> > from being used for execution for legitimate usecases (e.g. JIT), which
> > may not be desirable.
> >
> The JIT case doesn't use execve(memfd), right ?
>
That might not be important.
I also think selinux policy will be a better option for this, There is
a pending work item to restrict/enforce MFD_NOEXEC_SEAL on
memfd_create().
>
>
> > --Isaac
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2025-01-14 23:56 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-12-06 1:09 [RFC PATCH v1 0/2] Add file seal to prevent future exec mappings Isaac J. Manjarres
2024-12-06 1:09 ` [RFC PATCH v1 1/2] mm/memfd: Add support for F_SEAL_FUTURE_EXEC to memfd Isaac J. Manjarres
2024-12-06 17:49 ` Kalesh Singh
2024-12-06 20:50 ` Isaac Manjarres
2024-12-06 18:19 ` Lorenzo Stoakes
2024-12-06 20:48 ` Isaac Manjarres
2024-12-06 21:14 ` Lorenzo Stoakes
2024-12-11 20:56 ` Isaac Manjarres
2025-01-03 15:13 ` Jann Horn
2025-01-06 18:26 ` Jeff Xu
2025-01-07 0:44 ` Kees Cook
2025-01-08 19:06 ` Lorenzo Stoakes
2025-01-08 22:07 ` Kees Cook
2025-01-09 23:30 ` Jeff Xu
2025-01-14 20:02 ` Isaac Manjarres
2025-01-14 21:29 ` Kees Cook
2025-01-14 22:42 ` Isaac Manjarres
2025-01-14 23:41 ` Jeff Xu
2025-01-14 23:56 ` Jeff Xu
2024-12-06 1:09 ` [RFC PATCH v1 2/2] selftests/memfd: Add tests for F_SEAL_FUTURE_EXEC Isaac J. Manjarres
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox