* [PATCH v6 0/7] mseal system mappings
@ 2025-02-24 17:45 jeffxu
2025-02-24 17:45 ` [PATCH v6 1/7] mseal, system mappings: kernel config and header change jeffxu
` (6 more replies)
0 siblings, 7 replies; 43+ messages in thread
From: jeffxu @ 2025-02-24 17:45 UTC (permalink / raw)
To: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin
Cc: linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport, Jeff Xu
From: Jeff Xu <jeffxu@chromium.org>
This is V6 version, addressing comments from V5, without code logic
change.
History:
-------
V6:
- mseal.rst: fix a typo (Randy Dunlap)
- security/Kconfig: add rr into note (Liam R. Howlett)
- remove mseal_system_mappings() and use macro instead (Liam R. Howlett)
- mseal.rst: add incompatible userland software (Lorenzo Stoakes)
- remove RFC from title (Kees Cook)
V5
https://lore.kernel.org/all/20250212032155.1276806-1-jeffxu@google.com/
V4:
https://lore.kernel.org/all/20241125202021.3684919-1-jeffxu@google.com/
V3:
https://lore.kernel.org/all/20241113191602.3541870-1-jeffxu@google.com/
V2:
https://lore.kernel.org/all/20241014215022.68530-1-jeffxu@google.com/
V1:
https://lore.kernel.org/all/20241004163155.3493183-1-jeffxu@google.com/
------------------------------------------------------------------------
Provide infrastructure to mseal system mappings. Establish
two kernel configs (CONFIG_MSEAL_SYSTEM_MAPPINGS,
ARCH_HAS_MSEAL_SYSTEM_MAPPINGS) and MSEAL_SYSTEM_MAPPINGS_VM_FLAG
macro for future patches.
As discussed during mseal() upstream process [1], mseal() protects
the VMAs of a given virtual memory range against modifications, such
as the read/write (RW) and no-execute (NX) bits. For complete
descriptions of memory sealing, please see mseal.rst [2].
The mseal() is useful to mitigate memory corruption issues where a
corrupted pointer is passed to a memory management system. For
example, such an attacker primitive can break control-flow integrity
guarantees since read-only memory that is supposed to be trusted can
become writable or .text pages can get remapped.
The system mappings are readonly only, memory sealing can protect
them from ever changing to writable or unmmap/remapped as different
attributes.
System mappings such as vdso, vvar, and sigpage (arm), vectors (arm)
are created by the kernel during program initialization, and could
be sealed after creation.
Unlike the aforementioned mappings, the uprobe mapping is not
established during program startup. However, its lifetime is the same
as the process's lifetime [3]. It could be sealed from creation.
The vsyscall on x86-64 uses a special address (0xffffffffff600000),
which is outside the mm managed range. This means mprotect, munmap, and
mremap won't work on the vsyscall. Since sealing doesn't enhance
the vsyscall's security, it is skipped in this patch. If we ever seal
the vsyscall, it is probably only for decorative purpose, i.e. showing
the 'sl' flag in the /proc/pid/smaps. For this patch, it is ignored.
It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may
alter the system mappings during restore operations. UML(User Mode Linux)
and gVisor, rr are also known to change the vdso/vvar mappings.
Consequently, this feature cannot be universally enabled across all
systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default.
To support mseal of system mappings, architectures must define
CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS and update their special mappings
calls to pass mseal flag. Additionally, architectures must confirm they
do not unmap/remap system mappings during the process lifetime.
In this version, we've improved the handling of system mapping sealing from
previous versions, instead of modifying the _install_special_mapping
function itself, which would affect all architectures, we now call
_install_special_mapping with a sealing flag only within the specific
architecture that requires it. This targeted approach offers two key
advantages: 1) It limits the code change's impact to the necessary
architectures, and 2) It aligns with the software architecture by keeping
the core memory management within the mm layer, while delegating the
decision of sealing system mappings to the individual architecture, which
is particularly relevant since 32-bit architectures never require sealing.
Prior to this patch series, we explored sealing special mappings from
userspace using glibc's dynamic linker. This approach revealed several
issues:
- The PT_LOAD header may report an incorrect length for vdso, (smaller
than its actual size). The dynamic linker, which relies on PT_LOAD
information to determine mapping size, would then split and partially
seal the vdso mapping. Since each architecture has its own vdso/vvar
code, fixing this in the kernel would require going through each
archiecture. Our initial goal was to enable sealing readonly mappings,
e.g. .text, across all architectures, sealing vdso from kernel since
creation appears to be simpler than sealing vdso at glibc.
- The [vvar] mapping header only contains address information, not length
information. Similar issues might exist for other special mappings.
- Mappings like uprobe are not covered by the dynamic linker,
and there is no effective solution for them.
This feature's security enhancements will benefit ChromeOS, Android,
and other high security systems.
Testing:
This feature was tested on ChromeOS and Android for both x86-64 and ARM64.
- Enable sealing and verify vdso/vvar, sigpage, vector are sealed properly,
i.e. "sl" shown in the smaps for those mappings, and mremap is blocked.
- Passing various automation tests (e.g. pre-checkin) on ChromeOS and
Android to ensure the sealing doesn't affect the functionality of
Chromebook and Android phone.
I also tested the feature on Ubuntu on x86-64:
- With config disabled, vdso/vvar is not sealed,
- with config enabled, vdso/vvar is sealed, and booting up Ubuntu is OK,
normal operations such as browsing the web, open/edit doc are OK.
In addition, Benjamin Berg tested this on UML.
Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1]
Link: Documentation/userspace-api/mseal.rst [2]
Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3]
Jeff Xu (7):
mseal, system mappings: kernel config and header change
selftests: x86: test_mremap_vdso: skip if vdso is msealed
mseal, system mappings: enable x86-64
mseal, system mappings: enable arm64
mseal, system mappings: enable uml architecture
mseal, system mappings: uprobe mapping
mseal, system mappings: update mseal.rst
Documentation/userspace-api/mseal.rst | 7 ++++
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/vdso.c | 22 +++++++----
arch/um/Kconfig | 1 +
arch/x86/Kconfig | 1 +
arch/x86/entry/vdso/vma.c | 16 +++++---
arch/x86/um/vdso/vma.c | 6 ++-
include/linux/mm.h | 10 +++++
init/Kconfig | 18 +++++++++
kernel/events/uprobes.c | 5 ++-
security/Kconfig | 18 +++++++++
.../testing/selftests/x86/test_mremap_vdso.c | 38 +++++++++++++++++++
12 files changed, 127 insertions(+), 16 deletions(-)
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 17:45 [PATCH v6 0/7] mseal system mappings jeffxu
@ 2025-02-24 17:45 ` jeffxu
2025-02-24 18:21 ` Dave Hansen
2025-02-24 17:45 ` [PATCH v6 2/7] selftests: x86: test_mremap_vdso: skip if vdso is msealed jeffxu
` (5 subsequent siblings)
6 siblings, 1 reply; 43+ messages in thread
From: jeffxu @ 2025-02-24 17:45 UTC (permalink / raw)
To: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin
Cc: linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport, Jeff Xu
From: Jeff Xu <jeffxu@chromium.org>
Provide infrastructure to mseal system mappings. Establish
two kernel configs (CONFIG_MSEAL_SYSTEM_MAPPINGS,
ARCH_HAS_MSEAL_SYSTEM_MAPPINGS) and MSEAL_SYSTEM_MAPPINGS_VM_FLAG
macro for future patches.
As discussed during mseal() upstream process [1], mseal() protects
the VMAs of a given virtual memory range against modifications, such
as the read/write (RW) and no-execute (NX) bits. For complete
descriptions of memory sealing, please see mseal.rst [2].
The mseal() is useful to mitigate memory corruption issues where a
corrupted pointer is passed to a memory management system. For
example, such an attacker primitive can break control-flow integrity
guarantees since read-only memory that is supposed to be trusted can
become writable or .text pages can get remapped.
The system mappings are readonly only, memory sealing can protect
them from ever changing to writable or unmmap/remapped as different
attributes.
System mappings such as vdso, vvar, and sigpage (arm), vectors (arm)
are created by the kernel during program initialization, and could
be sealed after creation.
Unlike the aforementioned mappings, the uprobe mapping is not
established during program startup. However, its lifetime is the same
as the process's lifetime [3]. It could be sealed from creation.
The vsyscall on x86-64 uses a special address (0xffffffffff600000),
which is outside the mm managed range. This means mprotect, munmap, and
mremap won't work on the vsyscall. Since sealing doesn't enhance
the vsyscall's security, it is skipped in this patch. If we ever seal
the vsyscall, it is probably only for decorative purpose, i.e. showing
the 'sl' flag in the /proc/pid/smaps. For this patch, it is ignored.
It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may
alter the system mappings during restore operations. UML(User Mode Linux)
and gVisor, rr are also known to change the vdso/vvar mappings.
Consequently, this feature cannot be universally enabled across all
systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default.
To support mseal of system mappings, architectures must define
CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS and update their special mappings
calls to pass mseal flag. Additionally, architectures must confirm they
do not unmap/remap system mappings during the process lifetime.
In this version, we've improved the handling of system mapping sealing from
previous versions, instead of modifying the _install_special_mapping
function itself, which would affect all architectures, we now call
_install_special_mapping with a sealing flag only within the specific
architecture that requires it. This targeted approach offers two key
advantages: 1) It limits the code change's impact to the necessary
architectures, and 2) It aligns with the software architecture by keeping
the core memory management within the mm layer, while delegating the
decision of sealing system mappings to the individual architecture, which
is particularly relevant since 32-bit architectures never require sealing.
Prior to this patch series, we explored sealing special mappings from
userspace using glibc's dynamic linker. This approach revealed several
issues:
- The PT_LOAD header may report an incorrect length for vdso, (smaller
than its actual size). The dynamic linker, which relies on PT_LOAD
information to determine mapping size, would then split and partially
seal the vdso mapping. Since each architecture has its own vdso/vvar
code, fixing this in the kernel would require going through each
archiecture. Our initial goal was to enable sealing readonly mappings,
e.g. .text, across all architectures, sealing vdso from kernel since
creation appears to be simpler than sealing vdso at glibc.
- The [vvar] mapping header only contains address information, not length
information. Similar issues might exist for other special mappings.
- Mappings like uprobe are not covered by the dynamic linker,
and there is no effective solution for them.
This feature's security enhancements will benefit ChromeOS, Android,
and other high security systems.
Testing:
This feature was tested on ChromeOS and Android for both x86-64 and ARM64.
- Enable sealing and verify vdso/vvar, sigpage, vector are sealed properly,
i.e. "sl" shown in the smaps for those mappings, and mremap is blocked.
- Passing various automation tests (e.g. pre-checkin) on ChromeOS and
Android to ensure the sealing doesn't affect the functionality of
Chromebook and Android phone.
I also tested the feature on Ubuntu on x86-64:
- With config disabled, vdso/vvar is not sealed,
- with config enabled, vdso/vvar is sealed, and booting up Ubuntu is OK,
normal operations such as browsing the web, open/edit doc are OK.
In addition, Benjamin Berg tested this on UML.
Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1]
Link: Documentation/userspace-api/mseal.rst [2]
Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3]
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
---
include/linux/mm.h | 10 ++++++++++
init/Kconfig | 18 ++++++++++++++++++
security/Kconfig | 18 ++++++++++++++++++
3 files changed, 46 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7b1068ddcbb7..0e2a4a45d245 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4155,4 +4155,14 @@ int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *st
int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status);
int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
+
+/*
+ * mseal of userspace process's system mappings.
+ */
+#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
+#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED
+#else
+#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE
+#endif
+
#endif /* _LINUX_MM_H */
diff --git a/init/Kconfig b/init/Kconfig
index d0d021b3fa3b..07435e33f965 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1882,6 +1882,24 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
config ARCH_HAS_MEMBARRIER_SYNC_CORE
bool
+config ARCH_HAS_MSEAL_SYSTEM_MAPPINGS
+ bool
+ help
+ Control MSEAL_SYSTEM_MAPPINGS access based on architecture.
+
+ A 64-bit kernel is required for the memory sealing feature.
+ No specific hardware features from the CPU are needed.
+
+ To enable this feature, the architecture needs to update their
+ special mappings calls to include the sealing flag and confirm
+ that it doesn't unmap/remap system mappings during the life
+ time of the process. After the architecture enables this, a
+ distribution can set CONFIG_MSEAL_SYSTEM_MAPPING to manage access
+ to the feature.
+
+ For complete descriptions of memory sealing, please see
+ Documentation/userspace-api/mseal.rst
+
config HAVE_PERF_EVENTS
bool
help
diff --git a/security/Kconfig b/security/Kconfig
index f10dbf15c294..15a86a952910 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -51,6 +51,24 @@ config PROC_MEM_NO_FORCE
endchoice
+config MSEAL_SYSTEM_MAPPINGS
+ bool "mseal system mappings"
+ depends on 64BIT
+ depends on ARCH_HAS_MSEAL_SYSTEM_MAPPINGS
+ depends on !CHECKPOINT_RESTORE
+ help
+ Seal system mappings such as vdso, vvar, sigpage, uprobes, etc.
+
+ A 64-bit kernel is required for the memory sealing feature.
+ No specific hardware features from the CPU are needed.
+
+ Note: CHECKPOINT_RESTORE, UML, gVisor, rr are known to relocate or
+ unmap system mapping, therefore this config can't be enabled
+ universally.
+
+ For complete descriptions of memory sealing, please see
+ Documentation/userspace-api/mseal.rst
+
config SECURITY
bool "Enable different security models"
depends on SYSFS
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v6 2/7] selftests: x86: test_mremap_vdso: skip if vdso is msealed
2025-02-24 17:45 [PATCH v6 0/7] mseal system mappings jeffxu
2025-02-24 17:45 ` [PATCH v6 1/7] mseal, system mappings: kernel config and header change jeffxu
@ 2025-02-24 17:45 ` jeffxu
2025-02-24 19:02 ` Kees Cook
2025-02-24 21:04 ` Liam R. Howlett
2025-02-24 17:45 ` [PATCH v6 3/7] mseal, system mappings: enable x86-64 jeffxu
` (4 subsequent siblings)
6 siblings, 2 replies; 43+ messages in thread
From: jeffxu @ 2025-02-24 17:45 UTC (permalink / raw)
To: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin
Cc: linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport, Jeff Xu
From: Jeff Xu <jeffxu@chromium.org>
Add code to detect if the vdso is memory sealed, skip the test
if it is.
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
---
.../testing/selftests/x86/test_mremap_vdso.c | 38 +++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/tools/testing/selftests/x86/test_mremap_vdso.c b/tools/testing/selftests/x86/test_mremap_vdso.c
index d53959e03593..c68077c56b22 100644
--- a/tools/testing/selftests/x86/test_mremap_vdso.c
+++ b/tools/testing/selftests/x86/test_mremap_vdso.c
@@ -14,6 +14,7 @@
#include <errno.h>
#include <unistd.h>
#include <string.h>
+#include <stdbool.h>
#include <sys/mman.h>
#include <sys/auxv.h>
@@ -55,13 +56,50 @@ static int try_to_remap(void *vdso_addr, unsigned long size)
}
+#define VDSO_NAME "[vdso]"
+#define VMFLAGS "VmFlags:"
+#define MSEAL_FLAGS "sl"
+#define MAX_LINE_LEN 512
+
+bool vdso_sealed(FILE *maps)
+{
+ char line[MAX_LINE_LEN];
+ bool has_vdso = false;
+
+ while (fgets(line, sizeof(line), maps)) {
+ if (strstr(line, VDSO_NAME))
+ has_vdso = true;
+
+ if (has_vdso && !strncmp(line, VMFLAGS, strlen(VMFLAGS))) {
+ if (strstr(line, MSEAL_FLAGS))
+ return true;
+
+ return false;
+ }
+ }
+
+ return false;
+}
+
int main(int argc, char **argv, char **envp)
{
pid_t child;
+ FILE *maps;
ksft_print_header();
ksft_set_plan(1);
+ maps = fopen("/proc/self/smaps", "r");
+ if (!maps) {
+ ksft_test_result_skip("Could not open /proc/self/smaps\n");
+ return 0;
+ }
+
+ if (vdso_sealed(maps)) {
+ ksft_test_result_skip("vdso is sealed\n");
+ return 0;
+ }
+
child = fork();
if (child == -1)
ksft_exit_fail_msg("failed to fork (%d): %m\n", errno);
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v6 3/7] mseal, system mappings: enable x86-64
2025-02-24 17:45 [PATCH v6 0/7] mseal system mappings jeffxu
2025-02-24 17:45 ` [PATCH v6 1/7] mseal, system mappings: kernel config and header change jeffxu
2025-02-24 17:45 ` [PATCH v6 2/7] selftests: x86: test_mremap_vdso: skip if vdso is msealed jeffxu
@ 2025-02-24 17:45 ` jeffxu
2025-02-24 21:06 ` Liam R. Howlett
2025-02-24 17:45 ` [PATCH v6 4/7] mseal, system mappings: enable arm64 jeffxu
` (3 subsequent siblings)
6 siblings, 1 reply; 43+ messages in thread
From: jeffxu @ 2025-02-24 17:45 UTC (permalink / raw)
To: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin
Cc: linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport, Jeff Xu
From: Jeff Xu <jeffxu@chromium.org>
Provide support for CONFIG_MSEAL_SYSTEM_MAPPINGS on x86-64,
covering the vdso, vvar, vvar_vclock.
Production release testing passes on Android and Chrome OS.
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
---
arch/x86/Kconfig | 1 +
arch/x86/entry/vdso/vma.c | 16 ++++++++++------
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 87198d957e2f..8fa17032ca46 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -26,6 +26,7 @@ config X86_64
depends on 64BIT
# Options that are inherently 64-bit kernel only:
select ARCH_HAS_GIGANTIC_PAGE
+ select ARCH_HAS_MSEAL_SYSTEM_MAPPINGS
select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
select ARCH_SUPPORTS_PER_VMA_LOCK
select ARCH_SUPPORTS_HUGE_PFNMAP if TRANSPARENT_HUGEPAGE
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 39e6efc1a9ca..54677964d0b5 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -247,6 +247,7 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr)
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
unsigned long text_start;
+ unsigned long vm_flags;
int ret = 0;
if (mmap_write_lock_killable(mm))
@@ -264,11 +265,12 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr)
/*
* MAYWRITE to allow gdb to COW and set breakpoints
*/
+ vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC;
+ vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
vma = _install_special_mapping(mm,
text_start,
image->size,
- VM_READ|VM_EXEC|
- VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
+ vm_flags,
&vdso_mapping);
if (IS_ERR(vma)) {
@@ -276,11 +278,12 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr)
goto up_fail;
}
+ vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
+ vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
vma = _install_special_mapping(mm,
addr,
(__VVAR_PAGES - VDSO_NR_VCLOCK_PAGES) * PAGE_SIZE,
- VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|
- VM_PFNMAP,
+ vm_flags,
&vvar_mapping);
if (IS_ERR(vma)) {
@@ -289,11 +292,12 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr)
goto up_fail;
}
+ vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
+ vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
vma = _install_special_mapping(mm,
addr + (__VVAR_PAGES - VDSO_NR_VCLOCK_PAGES) * PAGE_SIZE,
VDSO_NR_VCLOCK_PAGES * PAGE_SIZE,
- VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|
- VM_PFNMAP,
+ vm_flags,
&vvar_vclock_mapping);
if (IS_ERR(vma)) {
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v6 4/7] mseal, system mappings: enable arm64
2025-02-24 17:45 [PATCH v6 0/7] mseal system mappings jeffxu
` (2 preceding siblings ...)
2025-02-24 17:45 ` [PATCH v6 3/7] mseal, system mappings: enable x86-64 jeffxu
@ 2025-02-24 17:45 ` jeffxu
2025-02-24 17:45 ` [PATCH v6 5/7] mseal, system mappings: enable uml architecture jeffxu
` (2 subsequent siblings)
6 siblings, 0 replies; 43+ messages in thread
From: jeffxu @ 2025-02-24 17:45 UTC (permalink / raw)
To: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin
Cc: linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport, Jeff Xu
From: Jeff Xu <jeffxu@chromium.org>
Provide support for CONFIG_MSEAL_SYSTEM_MAPPINGS on arm64, covering
the vdso, vvar, and compat-mode vectors and sigpage mappings.
Production release testing passes on Android and Chrome OS.
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
---
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/vdso.c | 22 +++++++++++++++-------
2 files changed, 16 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fcdd0ed3eca8..39202aa9a5af 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -38,6 +38,7 @@ config ARM64
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_MEM_ENCRYPT
+ select ARCH_HAS_MSEAL_SYSTEM_MAPPINGS
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_NONLEAF_PMD_YOUNG if ARM64_HAFT
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index e8ed8e5b713b..fa3b85b7ff01 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -183,6 +183,7 @@ static int __setup_additional_pages(enum vdso_abi abi,
{
unsigned long vdso_base, vdso_text_len, vdso_mapping_len;
unsigned long gp_flags = 0;
+ unsigned long vm_flags;
void *ret;
BUILD_BUG_ON(VVAR_NR_PAGES != __VVAR_PAGES);
@@ -197,8 +198,10 @@ static int __setup_additional_pages(enum vdso_abi abi,
goto up_fail;
}
+ vm_flags = VM_READ|VM_MAYREAD|VM_PFNMAP;
+ vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
ret = _install_special_mapping(mm, vdso_base, VVAR_NR_PAGES * PAGE_SIZE,
- VM_READ|VM_MAYREAD|VM_PFNMAP,
+ vm_flags,
&vvar_map);
if (IS_ERR(ret))
goto up_fail;
@@ -208,9 +211,10 @@ static int __setup_additional_pages(enum vdso_abi abi,
vdso_base += VVAR_NR_PAGES * PAGE_SIZE;
mm->context.vdso = (void *)vdso_base;
+ vm_flags = VM_READ|VM_EXEC|gp_flags|VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC;
+ vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
ret = _install_special_mapping(mm, vdso_base, vdso_text_len,
- VM_READ|VM_EXEC|gp_flags|
- VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
+ vm_flags,
vdso_info[abi].cm);
if (IS_ERR(ret))
goto up_fail;
@@ -326,6 +330,7 @@ arch_initcall(aarch32_alloc_vdso_pages);
static int aarch32_kuser_helpers_setup(struct mm_struct *mm)
{
void *ret;
+ unsigned long vm_flags;
if (!IS_ENABLED(CONFIG_KUSER_HELPERS))
return 0;
@@ -334,9 +339,10 @@ static int aarch32_kuser_helpers_setup(struct mm_struct *mm)
* Avoid VM_MAYWRITE for compatibility with arch/arm/, where it's
* not safe to CoW the page containing the CPU exception vectors.
*/
+ vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYEXEC;
+ vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
ret = _install_special_mapping(mm, AARCH32_VECTORS_BASE, PAGE_SIZE,
- VM_READ | VM_EXEC |
- VM_MAYREAD | VM_MAYEXEC,
+ vm_flags,
&aarch32_vdso_maps[AA32_MAP_VECTORS]);
return PTR_ERR_OR_ZERO(ret);
@@ -345,6 +351,7 @@ static int aarch32_kuser_helpers_setup(struct mm_struct *mm)
static int aarch32_sigreturn_setup(struct mm_struct *mm)
{
unsigned long addr;
+ unsigned long vm_flags;
void *ret;
addr = get_unmapped_area(NULL, 0, PAGE_SIZE, 0, 0);
@@ -357,9 +364,10 @@ static int aarch32_sigreturn_setup(struct mm_struct *mm)
* VM_MAYWRITE is required to allow gdb to Copy-on-Write and
* set breakpoints.
*/
+ vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC;
+ vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
ret = _install_special_mapping(mm, addr, PAGE_SIZE,
- VM_READ | VM_EXEC | VM_MAYREAD |
- VM_MAYWRITE | VM_MAYEXEC,
+ vm_flags,
&aarch32_vdso_maps[AA32_MAP_SIGPAGE]);
if (IS_ERR(ret))
goto out;
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v6 5/7] mseal, system mappings: enable uml architecture
2025-02-24 17:45 [PATCH v6 0/7] mseal system mappings jeffxu
` (3 preceding siblings ...)
2025-02-24 17:45 ` [PATCH v6 4/7] mseal, system mappings: enable arm64 jeffxu
@ 2025-02-24 17:45 ` jeffxu
2025-02-24 17:45 ` [PATCH v6 6/7] mseal, system mappings: uprobe mapping jeffxu
2025-02-24 17:45 ` [PATCH v6 7/7] mseal, system mappings: update mseal.rst jeffxu
6 siblings, 0 replies; 43+ messages in thread
From: jeffxu @ 2025-02-24 17:45 UTC (permalink / raw)
To: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin
Cc: linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport, Jeff Xu,
Benjamin Berg
From: Jeff Xu <jeffxu@chromium.org>
Provide support for CONFIG_MSEAL_SYSTEM_MAPPINGS on UML, covering
the vdso.
Testing passes on UML.
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Tested-by: Benjamin Berg <benjamin.berg@intel.com>
---
arch/um/Kconfig | 1 +
arch/x86/um/vdso/vma.c | 6 ++++--
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index 18051b1cfce0..eb2d439a5334 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -10,6 +10,7 @@ config UML
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_KCOV
+ select ARCH_HAS_MSEAL_SYSTEM_MAPPINGS
select ARCH_HAS_STRNCPY_FROM_USER
select ARCH_HAS_STRNLEN_USER
select HAVE_ARCH_AUDITSYSCALL
diff --git a/arch/x86/um/vdso/vma.c b/arch/x86/um/vdso/vma.c
index f238f7b33cdd..ee6d8a58f9f6 100644
--- a/arch/x86/um/vdso/vma.c
+++ b/arch/x86/um/vdso/vma.c
@@ -54,6 +54,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
{
struct vm_area_struct *vma;
struct mm_struct *mm = current->mm;
+ unsigned long vm_flags;
static struct vm_special_mapping vdso_mapping = {
.name = "[vdso]",
};
@@ -65,9 +66,10 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
return -EINTR;
vdso_mapping.pages = vdsop;
+ vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC;
+ vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
vma = _install_special_mapping(mm, um_vdso_addr, PAGE_SIZE,
- VM_READ|VM_EXEC|
- VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
+ vm_flags,
&vdso_mapping);
mmap_write_unlock(mm);
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v6 6/7] mseal, system mappings: uprobe mapping
2025-02-24 17:45 [PATCH v6 0/7] mseal system mappings jeffxu
` (4 preceding siblings ...)
2025-02-24 17:45 ` [PATCH v6 5/7] mseal, system mappings: enable uml architecture jeffxu
@ 2025-02-24 17:45 ` jeffxu
2025-02-24 17:45 ` [PATCH v6 7/7] mseal, system mappings: update mseal.rst jeffxu
6 siblings, 0 replies; 43+ messages in thread
From: jeffxu @ 2025-02-24 17:45 UTC (permalink / raw)
To: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin
Cc: linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport, Jeff Xu
From: Jeff Xu <jeffxu@chromium.org>
Provide support to mseal the uprobe mapping.
Unlike other system mappings, the uprobe mapping is not
established during program startup. However, its lifetime is the same
as the process's lifetime. It could be sealed from creation.
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
---
kernel/events/uprobes.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 2ca797cbe465..c23ca39b81ac 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1662,6 +1662,7 @@ static const struct vm_special_mapping xol_mapping = {
static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
{
struct vm_area_struct *vma;
+ unsigned long vm_flags;
int ret;
if (mmap_write_lock_killable(mm))
@@ -1682,8 +1683,10 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
}
}
+ vm_flags = VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO;
+ vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
vma = _install_special_mapping(mm, area->vaddr, PAGE_SIZE,
- VM_EXEC|VM_MAYEXEC|VM_DONTCOPY|VM_IO,
+ vm_flags,
&xol_mapping);
if (IS_ERR(vma)) {
ret = PTR_ERR(vma);
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v6 7/7] mseal, system mappings: update mseal.rst
2025-02-24 17:45 [PATCH v6 0/7] mseal system mappings jeffxu
` (5 preceding siblings ...)
2025-02-24 17:45 ` [PATCH v6 6/7] mseal, system mappings: uprobe mapping jeffxu
@ 2025-02-24 17:45 ` jeffxu
2025-02-24 19:04 ` Kees Cook
` (2 more replies)
6 siblings, 3 replies; 43+ messages in thread
From: jeffxu @ 2025-02-24 17:45 UTC (permalink / raw)
To: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin
Cc: linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport, Jeff Xu
From: Jeff Xu <jeffxu@chromium.org>
Update memory sealing documentation to include details about system
mappings.
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
---
Documentation/userspace-api/mseal.rst | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst
index 41102f74c5e2..10147281bf2d 100644
--- a/Documentation/userspace-api/mseal.rst
+++ b/Documentation/userspace-api/mseal.rst
@@ -130,6 +130,13 @@ Use cases
- Chrome browser: protect some security sensitive data structures.
+- System mappings:
+ If supported by an architecture (via CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS),
+ the CONFIG_MSEAL_SYSTEM_MAPPINGS seals system mappings, e.g. vdso, vvar,
+ uprobes, sigpage, vectors, etc. CHECKPOINT_RESTORE, UML, gVisor, rr are
+ known to relocate or unmap system mapping, therefore this config can't be
+ enabled universally.
+
When not to use mseal
=====================
Applications can apply sealing to any virtual memory region from userspace,
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 17:45 ` [PATCH v6 1/7] mseal, system mappings: kernel config and header change jeffxu
@ 2025-02-24 18:21 ` Dave Hansen
2025-02-24 18:44 ` Jeff Xu
0 siblings, 1 reply; 43+ messages in thread
From: Dave Hansen @ 2025-02-24 18:21 UTC (permalink / raw)
To: jeffxu, akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin
Cc: linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On 2/24/25 09:45, jeffxu@chromium.org wrote:
> +/*
> + * mseal of userspace process's system mappings.
> + */
> +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
> +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED
> +#else
> +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE
> +#endif
This ends up looking pretty wonky in practice:
> + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
> + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the
other ones.
Would it really hurt to have
#ifdef CONFIG_64BIT
/* VM is sealed, in vm_flags */
#define VM_SEALED _BITUL(63)
+#else
+#define VM_SEALED VM_NONE
#endif
?
Then all the users could just do:
vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP|
VM_SEALED
That seems to be a the common way of doing things. Take a look at:
# define VM_GROWSUP VM_NONE
...
# define VM_MTE VM_NONE
# define VM_MTE_ALLOWED VM_NONE
...
# define VM_UFFD_MINOR VM_NONE
...
#define VM_DROPPABLE VM_NONE
... and more
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 18:21 ` Dave Hansen
@ 2025-02-24 18:44 ` Jeff Xu
2025-02-24 18:52 ` Dave Hansen
2025-02-24 19:03 ` Liam R. Howlett
0 siblings, 2 replies; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 18:44 UTC (permalink / raw)
To: Dave Hansen
Cc: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 2/24/25 09:45, jeffxu@chromium.org wrote:
> > +/*
> > + * mseal of userspace process's system mappings.
> > + */
> > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
> > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED
> > +#else
> > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE
> > +#endif
>
> This ends up looking pretty wonky in practice:
>
> > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
> > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
>
> because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the
> other ones.
>
> Would it really hurt to have
>
> #ifdef CONFIG_64BIT
> /* VM is sealed, in vm_flags */
> #define VM_SEALED _BITUL(63)
> +#else
> +#define VM_SEALED VM_NONE
> #endif
>
> ?
>
VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of
the build. This is intentional. Any 32-bit code trying to use the
sealing function or the VM_SEALED flag will immediately fail
compilation. This makes it easier to identify incorrect usage.
For example:
Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
#ifdef CONFIG_64BIT
[ilog2(VM_SEALED)] = "sl",
#endif
Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
in case that "#ifdef CONFIG_64BIT" line is missing.
Please note, this has been like this since the first version of
mseal() RFC patch, and I prefer to keep it this way.
Thanks
-Jeff
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 18:44 ` Jeff Xu
@ 2025-02-24 18:52 ` Dave Hansen
2025-02-24 18:55 ` Kees Cook
2025-02-24 19:03 ` Liam R. Howlett
1 sibling, 1 reply; 43+ messages in thread
From: Dave Hansen @ 2025-02-24 18:52 UTC (permalink / raw)
To: Jeff Xu
Cc: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On 2/24/25 10:44, Jeff Xu wrote:
> For example:
> Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
>
> #ifdef CONFIG_64BIT
> [ilog2(VM_SEALED)] = "sl",
> #endif
>
> Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> in case that "#ifdef CONFIG_64BIT" line is missing.
>
> Please note, this has been like this since the first version of
> mseal() RFC patch, and I prefer to keep it this way.
That logic is reasonable. But it's different from the _vast_ majority of
other flags.
So what justifies VM_SEALED being so different? It's leading to pretty
objectively ugly code in this series.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 18:52 ` Dave Hansen
@ 2025-02-24 18:55 ` Kees Cook
2025-02-24 18:59 ` Jeff Xu
` (2 more replies)
0 siblings, 3 replies; 43+ messages in thread
From: Kees Cook @ 2025-02-24 18:55 UTC (permalink / raw)
To: Dave Hansen
Cc: Jeff Xu, akpm, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote:
> On 2/24/25 10:44, Jeff Xu wrote:
> > For example:
> > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
> >
> > #ifdef CONFIG_64BIT
> > [ilog2(VM_SEALED)] = "sl",
> > #endif
> >
> > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> > in case that "#ifdef CONFIG_64BIT" line is missing.
> >
> > Please note, this has been like this since the first version of
> > mseal() RFC patch, and I prefer to keep it this way.
>
> That logic is reasonable. But it's different from the _vast_ majority of
> other flags.
>
> So what justifies VM_SEALED being so different? It's leading to pretty
> objectively ugly code in this series.
Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define
for "should we perform system mapping sealing?" is intentionally separate
here, so that it can be Kconfig and per-arch toggled, etc.
As for the name, I have no strong opinion. Perhaps VM_SEALED_SYSTEM_MAPPING ?
-Kees
--
Kees Cook
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 18:55 ` Kees Cook
@ 2025-02-24 18:59 ` Jeff Xu
2025-02-24 19:02 ` Dave Hansen
2025-02-24 19:10 ` Liam R. Howlett
2 siblings, 0 replies; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 18:59 UTC (permalink / raw)
To: Kees Cook
Cc: Dave Hansen, akpm, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 10:55 AM Kees Cook <kees@kernel.org> wrote:
>
> On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote:
> > On 2/24/25 10:44, Jeff Xu wrote:
> > > For example:
> > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
> > >
> > > #ifdef CONFIG_64BIT
> > > [ilog2(VM_SEALED)] = "sl",
> > > #endif
> > >
> > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> > > in case that "#ifdef CONFIG_64BIT" line is missing.
> > >
> > > Please note, this has been like this since the first version of
> > > mseal() RFC patch, and I prefer to keep it this way.
> >
> > That logic is reasonable. But it's different from the _vast_ majority of
> > other flags.
> >
> > So what justifies VM_SEALED being so different? It's leading to pretty
> > objectively ugly code in this series.
>
> Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define
> for "should we perform system mapping sealing?" is intentionally separate
> here, so that it can be Kconfig and per-arch toggled, etc.
>
Ya, it is a layer of separation also. Thanks for pointing it out.
> As for the name, I have no strong opinion. Perhaps VM_SEALED_SYSTEM_MAPPING ?
>
OK.
Thanks
-Jeff
> -Kees
>
> --
> Kees Cook
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 18:55 ` Kees Cook
2025-02-24 18:59 ` Jeff Xu
@ 2025-02-24 19:02 ` Dave Hansen
2025-02-24 19:10 ` Liam R. Howlett
2 siblings, 0 replies; 43+ messages in thread
From: Dave Hansen @ 2025-02-24 19:02 UTC (permalink / raw)
To: Kees Cook
Cc: Jeff Xu, akpm, jannh, torvalds, vbabka, lorenzo.stoakes,
Liam.Howlett, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On 2/24/25 10:55, Kees Cook wrote:
>> That logic is reasonable. But it's different from the _vast_ majority of
>> other flags.
>>
>> So what justifies VM_SEALED being so different? It's leading to pretty
>> objectively ugly code in this series.
> Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define
> for "should we perform system mapping sealing?" is intentionally separate
> here, so that it can be Kconfig and per-arch toggled, etc.
Ahh, makes sense.
> As for the name, I have no strong opinion. Perhaps VM_SEALED_SYSTEM_MAPPING ?
Yeah, that'd work. Just something more consistent with the existing
naming and more compact. I think:
VM_SEALED_SYS
would fit in nicely.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 2/7] selftests: x86: test_mremap_vdso: skip if vdso is msealed
2025-02-24 17:45 ` [PATCH v6 2/7] selftests: x86: test_mremap_vdso: skip if vdso is msealed jeffxu
@ 2025-02-24 19:02 ` Kees Cook
2025-02-24 21:04 ` Liam R. Howlett
1 sibling, 0 replies; 43+ messages in thread
From: Kees Cook @ 2025-02-24 19:02 UTC (permalink / raw)
To: jeffxu
Cc: akpm, jannh, torvalds, vbabka, lorenzo.stoakes, Liam.Howlett,
adhemerval.zanella, oleg, avagin, benjamin, linux-kernel,
linux-hardening, linux-mm, jorgelo, sroettger, hch, ojeda,
thomas.weissschuh, adobriyan, johannes, pedro.falcato, hca,
willy, anna-maria, mark.rutland, linus.walleij, Jason, deller,
rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen, mingo,
ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes, groeck,
mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 05:45:08PM +0000, jeffxu@chromium.org wrote:
> From: Jeff Xu <jeffxu@chromium.org>
>
> Add code to detect if the vdso is memory sealed, skip the test
> if it is.
>
> Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Thanks, after my full 360 on reading this code, I'm back to where I was
originally: it looks good to me. :)
Reviewed-by: Kees Cook <kees@kernel.org>
--
Kees Cook
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 18:44 ` Jeff Xu
2025-02-24 18:52 ` Dave Hansen
@ 2025-02-24 19:03 ` Liam R. Howlett
2025-02-24 19:07 ` Jeff Xu
2025-02-24 19:10 ` Jeff Xu
1 sibling, 2 replies; 43+ messages in thread
From: Liam R. Howlett @ 2025-02-24 19:03 UTC (permalink / raw)
To: Jeff Xu
Cc: Dave Hansen, akpm, keescook, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
* Jeff Xu <jeffxu@chromium.org> [250224 13:44]:
> On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote:
> >
> > On 2/24/25 09:45, jeffxu@chromium.org wrote:
> > > +/*
> > > + * mseal of userspace process's system mappings.
> > > + */
> > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
> > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED
> > > +#else
> > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE
> > > +#endif
> >
> > This ends up looking pretty wonky in practice:
> >
> > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
> > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
> >
> > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the
> > other ones.
> >
> > Would it really hurt to have
> >
> > #ifdef CONFIG_64BIT
> > /* VM is sealed, in vm_flags */
> > #define VM_SEALED _BITUL(63)
> > +#else
> > +#define VM_SEALED VM_NONE
> > #endif
> >
> > ?
> >
> VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of
> the build. This is intentional. Any 32-bit code trying to use the
> sealing function or the VM_SEALED flag will immediately fail
> compilation. This makes it easier to identify incorrect usage.
>
The reason that two #defines are needed is because you can have mseal
enabled while not sealing system mappings, so for this to be clean we
need two defines.
However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion.
Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED".
> For example:
> Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
third_party?
>
> #ifdef CONFIG_64BIT
> [ilog2(VM_SEALED)] = "sl",
> #endif
>
> Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> in case that "#ifdef CONFIG_64BIT" line is missing.
I don't think it is reasonable to insist on doing things differently in
the kernel because you have external tests that would need updating.
These things can change independently, so I don't think this is a valid
argument.
If these are upstream tests, and we need these tests to work then they
can be fixed.
>
> Please note, this has been like this since the first version of
> mseal() RFC patch, and I prefer to keep it this way.
Thanks,
Liam
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 7/7] mseal, system mappings: update mseal.rst
2025-02-24 17:45 ` [PATCH v6 7/7] mseal, system mappings: update mseal.rst jeffxu
@ 2025-02-24 19:04 ` Kees Cook
2025-02-24 20:26 ` Liam R. Howlett
2025-02-25 6:07 ` Lorenzo Stoakes
2 siblings, 0 replies; 43+ messages in thread
From: Kees Cook @ 2025-02-24 19:04 UTC (permalink / raw)
To: jeffxu
Cc: akpm, jannh, torvalds, vbabka, lorenzo.stoakes, Liam.Howlett,
adhemerval.zanella, oleg, avagin, benjamin, linux-kernel,
linux-hardening, linux-mm, jorgelo, sroettger, hch, ojeda,
thomas.weissschuh, adobriyan, johannes, pedro.falcato, hca,
willy, anna-maria, mark.rutland, linus.walleij, Jason, deller,
rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen, mingo,
ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes, groeck,
mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 05:45:13PM +0000, jeffxu@chromium.org wrote:
> From: Jeff Xu <jeffxu@chromium.org>
>
> Update memory sealing documentation to include details about system
> mappings.
>
> Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Thanks, I think the list of programs is good.
Reviewed-by: Kees Cook <kees@kernel.org>
--
Kees Cook
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:03 ` Liam R. Howlett
@ 2025-02-24 19:07 ` Jeff Xu
2025-02-24 19:18 ` Liam R. Howlett
2025-02-24 19:10 ` Jeff Xu
1 sibling, 1 reply; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 19:07 UTC (permalink / raw)
To: Liam R. Howlett, Jeff Xu, Dave Hansen, akpm, keescook, jannh,
torvalds, vbabka, lorenzo.stoakes, adhemerval.zanella, oleg,
avagin, benjamin, linux-kernel, linux-hardening, linux-mm,
jorgelo, sroettger, hch, ojeda, thomas.weissschuh, adobriyan,
johannes, pedro.falcato, hca, willy, anna-maria, mark.rutland,
linus.walleij, Jason, deller, rdunlap, davem, peterx, f.fainelli,
gerg, dave.hansen, mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb,
enh, rientjes, groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett
<Liam.Howlett@oracle.com> wrote:
>
> * Jeff Xu <jeffxu@chromium.org> [250224 13:44]:
> > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote:
> > >
> > > On 2/24/25 09:45, jeffxu@chromium.org wrote:
> > > > +/*
> > > > + * mseal of userspace process's system mappings.
> > > > + */
> > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
> > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED
> > > > +#else
> > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE
> > > > +#endif
> > >
> > > This ends up looking pretty wonky in practice:
> > >
> > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
> > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
> > >
> > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the
> > > other ones.
> > >
> > > Would it really hurt to have
> > >
> > > #ifdef CONFIG_64BIT
> > > /* VM is sealed, in vm_flags */
> > > #define VM_SEALED _BITUL(63)
> > > +#else
> > > +#define VM_SEALED VM_NONE
> > > #endif
> > >
> > > ?
> > >
> > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of
> > the build. This is intentional. Any 32-bit code trying to use the
> > sealing function or the VM_SEALED flag will immediately fail
> > compilation. This makes it easier to identify incorrect usage.
> >
>
> The reason that two #defines are needed is because you can have mseal
> enabled while not sealing system mappings, so for this to be clean we
> need two defines.
>
> However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion.
> Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED".
>
> > For example:
> > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
>
> third_party?
>
Sorry, I pasted the code path from ChromeOS code base, it is actually
in the kernel itself.
fs/proc/task_mmu.c
> >
> > #ifdef CONFIG_64BIT
> > [ilog2(VM_SEALED)] = "sl",
> > #endif
> >
> > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> > in case that "#ifdef CONFIG_64BIT" line is missing.
>
> I don't think it is reasonable to insist on doing things differently in
> the kernel because you have external tests that would need updating.
> These things can change independently, so I don't think this is a valid
> argument.
>
> If these are upstream tests, and we need these tests to work then they
> can be fixed.
>
As above, this is actually kernel code, not test.
-Jeff
> >
> > Please note, this has been like this since the first version of
> > mseal() RFC patch, and I prefer to keep it this way.
>
> Thanks,
> Liam
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:03 ` Liam R. Howlett
2025-02-24 19:07 ` Jeff Xu
@ 2025-02-24 19:10 ` Jeff Xu
2025-02-24 19:25 ` Kees Cook
1 sibling, 1 reply; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 19:10 UTC (permalink / raw)
To: Liam R. Howlett, Jeff Xu, Dave Hansen, akpm, keescook, jannh,
torvalds, vbabka, lorenzo.stoakes, adhemerval.zanella, oleg,
avagin, benjamin, linux-kernel, linux-hardening, linux-mm,
jorgelo, sroettger, hch, ojeda, thomas.weissschuh, adobriyan,
johannes, pedro.falcato, hca, willy, anna-maria, mark.rutland,
linus.walleij, Jason, deller, rdunlap, davem, peterx, f.fainelli,
gerg, dave.hansen, mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb,
enh, rientjes, groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett
<Liam.Howlett@oracle.com> wrote:
>
> * Jeff Xu <jeffxu@chromium.org> [250224 13:44]:
> > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote:
> > >
> > > On 2/24/25 09:45, jeffxu@chromium.org wrote:
> > > > +/*
> > > > + * mseal of userspace process's system mappings.
> > > > + */
> > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
> > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED
> > > > +#else
> > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE
> > > > +#endif
> > >
> > > This ends up looking pretty wonky in practice:
> > >
> > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
> > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
> > >
> > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the
> > > other ones.
> > >
> > > Would it really hurt to have
> > >
> > > #ifdef CONFIG_64BIT
> > > /* VM is sealed, in vm_flags */
> > > #define VM_SEALED _BITUL(63)
> > > +#else
> > > +#define VM_SEALED VM_NONE
> > > #endif
> > >
> > > ?
> > >
> > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of
> > the build. This is intentional. Any 32-bit code trying to use the
> > sealing function or the VM_SEALED flag will immediately fail
> > compilation. This makes it easier to identify incorrect usage.
> >
>
> The reason that two #defines are needed is because you can have mseal
> enabled while not sealing system mappings, so for this to be clean we
> need two defines.
>
> However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion.
> Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED".
>
How about MSEAL_SYSTME_MAPPINGS as Kees suggested ?
The VM_SYSTEM_SEALED doesn't have the MSEAL key or the MAPPING, so it
might take longer for the new reader to understand what it is.
-Jeff
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 18:55 ` Kees Cook
2025-02-24 18:59 ` Jeff Xu
2025-02-24 19:02 ` Dave Hansen
@ 2025-02-24 19:10 ` Liam R. Howlett
2025-02-24 19:22 ` Jeff Xu
2025-02-24 19:26 ` Kees Cook
2 siblings, 2 replies; 43+ messages in thread
From: Liam R. Howlett @ 2025-02-24 19:10 UTC (permalink / raw)
To: Kees Cook
Cc: Dave Hansen, Jeff Xu, akpm, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
* Kees Cook <kees@kernel.org> [250224 13:55]:
> On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote:
> > On 2/24/25 10:44, Jeff Xu wrote:
> > > For example:
> > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
> > >
> > > #ifdef CONFIG_64BIT
> > > [ilog2(VM_SEALED)] = "sl",
> > > #endif
> > >
> > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> > > in case that "#ifdef CONFIG_64BIT" line is missing.
> > >
> > > Please note, this has been like this since the first version of
> > > mseal() RFC patch, and I prefer to keep it this way.
> >
> > That logic is reasonable. But it's different from the _vast_ majority of
> > other flags.
> >
> > So what justifies VM_SEALED being so different? It's leading to pretty
> > objectively ugly code in this series.
>
> Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define
> for "should we perform system mapping sealing?" is intentionally separate
> here, so that it can be Kconfig and per-arch toggled, etc.
>
Considering Dave is the second person that did not find the huge commit
message helpful, can we please limit the commit message to be about the
actual code and not the entire series?
I thought we said that it was worth while making this change in v5?
Thanks,
Liam
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:07 ` Jeff Xu
@ 2025-02-24 19:18 ` Liam R. Howlett
2025-02-24 19:39 ` Jeff Xu
0 siblings, 1 reply; 43+ messages in thread
From: Liam R. Howlett @ 2025-02-24 19:18 UTC (permalink / raw)
To: Jeff Xu
Cc: Dave Hansen, akpm, keescook, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
* Jeff Xu <jeffxu@chromium.org> [250224 14:07]:
...
> > >
> > > #ifdef CONFIG_64BIT
> > > [ilog2(VM_SEALED)] = "sl",
> > > #endif
> > >
> > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> > > in case that "#ifdef CONFIG_64BIT" line is missing.
> >
> > I don't think it is reasonable to insist on doing things differently in
> > the kernel because you have external tests that would need updating.
> > These things can change independently, so I don't think this is a valid
> > argument.
> >
> > If these are upstream tests, and we need these tests to work then they
> > can be fixed.
> >
> As above, this is actually kernel code, not test.
Okay, so then they could be easily fixed to work with changing the flags
and it would transparently function in your ChromeOS code base. That's a
good way of doing things.
It doesn't make sense here since, as stated in this thread, we need two
defines. But it does mean we are less locked into how this functions
and it could change later, if needed.
Also, do we need something like the above test for VM_SEALED_SYS?
Thanks,
Liam
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:10 ` Liam R. Howlett
@ 2025-02-24 19:22 ` Jeff Xu
2025-02-24 19:32 ` Liam R. Howlett
2025-02-24 19:26 ` Kees Cook
1 sibling, 1 reply; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 19:22 UTC (permalink / raw)
To: Liam R. Howlett, Kees Cook, Dave Hansen, Jeff Xu, akpm, jannh,
torvalds, vbabka, lorenzo.stoakes, adhemerval.zanella, oleg,
avagin, benjamin, linux-kernel, linux-hardening, linux-mm,
jorgelo, sroettger, hch, ojeda, thomas.weissschuh, adobriyan,
johannes, pedro.falcato, hca, willy, anna-maria, mark.rutland,
linus.walleij, Jason, deller, rdunlap, davem, peterx, f.fainelli,
gerg, dave.hansen, mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb,
enh, rientjes, groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 11:11 AM Liam R. Howlett
<Liam.Howlett@oracle.com> wrote:
>
> * Kees Cook <kees@kernel.org> [250224 13:55]:
> > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote:
> > > On 2/24/25 10:44, Jeff Xu wrote:
> > > > For example:
> > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
> > > >
> > > > #ifdef CONFIG_64BIT
> > > > [ilog2(VM_SEALED)] = "sl",
> > > > #endif
> > > >
> > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> > > > in case that "#ifdef CONFIG_64BIT" line is missing.
> > > >
> > > > Please note, this has been like this since the first version of
> > > > mseal() RFC patch, and I prefer to keep it this way.
> > >
> > > That logic is reasonable. But it's different from the _vast_ majority of
> > > other flags.
> > >
> > > So what justifies VM_SEALED being so different? It's leading to pretty
> > > objectively ugly code in this series.
> >
> > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define
> > for "should we perform system mapping sealing?" is intentionally separate
> > here, so that it can be Kconfig and per-arch toggled, etc.
> >
>
> Considering Dave is the second person that did not find the huge commit
> message helpful, can we please limit the commit message to be about the
> actual code and not the entire series?
>
> I thought we said that it was worth while making this change in v5?
>
I include the cover letter's content in the first commit message to
clearly communicate the purpose of the entire patch series, saving
maintainers' time when accepting the patch.
Should I still include that, and add what the first patch does, and
separate it from the cover letter with "----"? What do you think?
-Jeff
> Thanks,
> Liam
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:10 ` Jeff Xu
@ 2025-02-24 19:25 ` Kees Cook
2025-02-24 19:42 ` Jeff Xu
0 siblings, 1 reply; 43+ messages in thread
From: Kees Cook @ 2025-02-24 19:25 UTC (permalink / raw)
To: Jeff Xu
Cc: Liam R. Howlett, Dave Hansen, akpm, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 11:10:22AM -0800, Jeff Xu wrote:
> On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett
> <Liam.Howlett@oracle.com> wrote:
> >
> > * Jeff Xu <jeffxu@chromium.org> [250224 13:44]:
> > > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote:
> > > >
> > > > On 2/24/25 09:45, jeffxu@chromium.org wrote:
> > > > > +/*
> > > > > + * mseal of userspace process's system mappings.
> > > > > + */
> > > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
> > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED
> > > > > +#else
> > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE
> > > > > +#endif
> > > >
> > > > This ends up looking pretty wonky in practice:
> > > >
> > > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
> > > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
> > > >
> > > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the
> > > > other ones.
> > > >
> > > > Would it really hurt to have
> > > >
> > > > #ifdef CONFIG_64BIT
> > > > /* VM is sealed, in vm_flags */
> > > > #define VM_SEALED _BITUL(63)
> > > > +#else
> > > > +#define VM_SEALED VM_NONE
> > > > #endif
> > > >
> > > > ?
> > > >
> > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of
> > > the build. This is intentional. Any 32-bit code trying to use the
> > > sealing function or the VM_SEALED flag will immediately fail
> > > compilation. This makes it easier to identify incorrect usage.
> > >
> >
> > The reason that two #defines are needed is because you can have mseal
> > enabled while not sealing system mappings, so for this to be clean we
> > need two defines.
> >
> > However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion.
> > Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED".
> >
> How about MSEAL_SYSTME_MAPPINGS as Kees suggested ?
>
> The VM_SYSTEM_SEALED doesn't have the MSEAL key or the MAPPING, so it
> might take longer for the new reader to understand what it is.
I think to address Dave's concern, it should start with "VM_". We use
"SEAL" already with VM_SEALED, so the "M" in "MSEAL" may be redundant,
and to clarify the system mapping, how avout VM_SEAL_SYSMAP ? That
capture's, hopefully, Dave, Liam, and your thoughts about the naming?
--
Kees Cook
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:10 ` Liam R. Howlett
2025-02-24 19:22 ` Jeff Xu
@ 2025-02-24 19:26 ` Kees Cook
2025-02-24 19:34 ` Jeff Xu
1 sibling, 1 reply; 43+ messages in thread
From: Kees Cook @ 2025-02-24 19:26 UTC (permalink / raw)
To: Liam R. Howlett, Dave Hansen, Jeff Xu, akpm, jannh, torvalds,
vbabka, lorenzo.stoakes, adhemerval.zanella, oleg, avagin,
benjamin, linux-kernel, linux-hardening, linux-mm, jorgelo,
sroettger, hch, ojeda, thomas.weissschuh, adobriyan, johannes,
pedro.falcato, hca, willy, anna-maria, mark.rutland,
linus.walleij, Jason, deller, rdunlap, davem, peterx, f.fainelli,
gerg, dave.hansen, mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb,
enh, rientjes, groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 02:10:58PM -0500, Liam R. Howlett wrote:
> * Kees Cook <kees@kernel.org> [250224 13:55]:
> > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote:
> > > On 2/24/25 10:44, Jeff Xu wrote:
> > > > For example:
> > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
> > > >
> > > > #ifdef CONFIG_64BIT
> > > > [ilog2(VM_SEALED)] = "sl",
> > > > #endif
> > > >
> > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> > > > in case that "#ifdef CONFIG_64BIT" line is missing.
> > > >
> > > > Please note, this has been like this since the first version of
> > > > mseal() RFC patch, and I prefer to keep it this way.
> > >
> > > That logic is reasonable. But it's different from the _vast_ majority of
> > > other flags.
> > >
> > > So what justifies VM_SEALED being so different? It's leading to pretty
> > > objectively ugly code in this series.
> >
> > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define
> > for "should we perform system mapping sealing?" is intentionally separate
> > here, so that it can be Kconfig and per-arch toggled, etc.
> >
>
> Considering Dave is the second person that did not find the huge commit
> message helpful, can we please limit the commit message to be about the
> actual code and not the entire series?
>
> I thought we said that it was worth while making this change in v5?
Right, please minimize patch #1's commit log to just what it is doing,
etc, and leave the rest of the rationale in the 0/N cover letter.
--
Kees Cook
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:22 ` Jeff Xu
@ 2025-02-24 19:32 ` Liam R. Howlett
2025-02-24 19:33 ` Jeff Xu
0 siblings, 1 reply; 43+ messages in thread
From: Liam R. Howlett @ 2025-02-24 19:32 UTC (permalink / raw)
To: Jeff Xu
Cc: Kees Cook, Dave Hansen, akpm, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
* Jeff Xu <jeffxu@chromium.org> [250224 14:23]:
> On Mon, Feb 24, 2025 at 11:11 AM Liam R. Howlett
> <Liam.Howlett@oracle.com> wrote:
> >
> > * Kees Cook <kees@kernel.org> [250224 13:55]:
> > > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote:
> > > > On 2/24/25 10:44, Jeff Xu wrote:
> > > > > For example:
> > > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
> > > > >
> > > > > #ifdef CONFIG_64BIT
> > > > > [ilog2(VM_SEALED)] = "sl",
> > > > > #endif
> > > > >
> > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> > > > > in case that "#ifdef CONFIG_64BIT" line is missing.
> > > > >
> > > > > Please note, this has been like this since the first version of
> > > > > mseal() RFC patch, and I prefer to keep it this way.
> > > >
> > > > That logic is reasonable. But it's different from the _vast_ majority of
> > > > other flags.
> > > >
> > > > So what justifies VM_SEALED being so different? It's leading to pretty
> > > > objectively ugly code in this series.
> > >
> > > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define
> > > for "should we perform system mapping sealing?" is intentionally separate
> > > here, so that it can be Kconfig and per-arch toggled, etc.
> > >
> >
> > Considering Dave is the second person that did not find the huge commit
> > message helpful, can we please limit the commit message to be about the
> > actual code and not the entire series?
> >
> > I thought we said that it was worth while making this change in v5?
> >
> I include the cover letter's content in the first commit message to
> clearly communicate the purpose of the entire patch series, saving
> maintainers' time when accepting the patch.
Having more text than patch for such a patch seems unreasonable. I'd
find it more acceptable if it were a complicated race condition, but
everyone is getting lost in the summary.
>
> Should I still include that, and add what the first patch does, and
> separate it from the cover letter with "----"? What do you think?
Here is my v5 answer, I think it was clear about not putting the entire
summary into the first patch.
[1]. https://lore.kernel.org/all/ml3x5qchmnehdzz2rxsdcdghivaqffojiweuhvpvzw45u3l5bh@23sblrom3m36/
Thanks,
Liam
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:32 ` Liam R. Howlett
@ 2025-02-24 19:33 ` Jeff Xu
0 siblings, 0 replies; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 19:33 UTC (permalink / raw)
To: Liam R. Howlett, Jeff Xu, Kees Cook, Dave Hansen, akpm, jannh,
torvalds, vbabka, lorenzo.stoakes, adhemerval.zanella, oleg,
avagin, benjamin, linux-kernel, linux-hardening, linux-mm,
jorgelo, sroettger, hch, ojeda, thomas.weissschuh, adobriyan,
johannes, pedro.falcato, hca, willy, anna-maria, mark.rutland,
linus.walleij, Jason, deller, rdunlap, davem, peterx, f.fainelli,
gerg, dave.hansen, mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb,
enh, rientjes, groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 11:32 AM Liam R. Howlett
<Liam.Howlett@oracle.com> wrote:
>
> * Jeff Xu <jeffxu@chromium.org> [250224 14:23]:
> > On Mon, Feb 24, 2025 at 11:11 AM Liam R. Howlett
> > <Liam.Howlett@oracle.com> wrote:
> > >
> > > * Kees Cook <kees@kernel.org> [250224 13:55]:
> > > > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote:
> > > > > On 2/24/25 10:44, Jeff Xu wrote:
> > > > > > For example:
> > > > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
> > > > > >
> > > > > > #ifdef CONFIG_64BIT
> > > > > > [ilog2(VM_SEALED)] = "sl",
> > > > > > #endif
> > > > > >
> > > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> > > > > > in case that "#ifdef CONFIG_64BIT" line is missing.
> > > > > >
> > > > > > Please note, this has been like this since the first version of
> > > > > > mseal() RFC patch, and I prefer to keep it this way.
> > > > >
> > > > > That logic is reasonable. But it's different from the _vast_ majority of
> > > > > other flags.
> > > > >
> > > > > So what justifies VM_SEALED being so different? It's leading to pretty
> > > > > objectively ugly code in this series.
> > > >
> > > > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define
> > > > for "should we perform system mapping sealing?" is intentionally separate
> > > > here, so that it can be Kconfig and per-arch toggled, etc.
> > > >
> > >
> > > Considering Dave is the second person that did not find the huge commit
> > > message helpful, can we please limit the commit message to be about the
> > > actual code and not the entire series?
> > >
> > > I thought we said that it was worth while making this change in v5?
> > >
> > I include the cover letter's content in the first commit message to
> > clearly communicate the purpose of the entire patch series, saving
> > maintainers' time when accepting the patch.
>
> Having more text than patch for such a patch seems unreasonable. I'd
> find it more acceptable if it were a complicated race condition, but
> everyone is getting lost in the summary.
>
I will remove the cover letter from the first patch then.
> >
> > Should I still include that, and add what the first patch does, and
> > separate it from the cover letter with "----"? What do you think?
>
> Here is my v5 answer, I think it was clear about not putting the entire
> summary into the first patch.
>
Thanks.
> [1]. https://lore.kernel.org/all/ml3x5qchmnehdzz2rxsdcdghivaqffojiweuhvpvzw45u3l5bh@23sblrom3m36/
>
> Thanks,
> Liam
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:26 ` Kees Cook
@ 2025-02-24 19:34 ` Jeff Xu
0 siblings, 0 replies; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 19:34 UTC (permalink / raw)
To: Kees Cook
Cc: Liam R. Howlett, Dave Hansen, akpm, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 11:26 AM Kees Cook <kees@kernel.org> wrote:
>
> On Mon, Feb 24, 2025 at 02:10:58PM -0500, Liam R. Howlett wrote:
> > * Kees Cook <kees@kernel.org> [250224 13:55]:
> > > On Mon, Feb 24, 2025 at 10:52:13AM -0800, Dave Hansen wrote:
> > > > On 2/24/25 10:44, Jeff Xu wrote:
> > > > > For example:
> > > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c,
> > > > >
> > > > > #ifdef CONFIG_64BIT
> > > > > [ilog2(VM_SEALED)] = "sl",
> > > > > #endif
> > > > >
> > > > > Redefining VM_SEALED to VM_NONE for 32 bit won't detect the problem
> > > > > in case that "#ifdef CONFIG_64BIT" line is missing.
> > > > >
> > > > > Please note, this has been like this since the first version of
> > > > > mseal() RFC patch, and I prefer to keep it this way.
> > > >
> > > > That logic is reasonable. But it's different from the _vast_ majority of
> > > > other flags.
> > > >
> > > > So what justifies VM_SEALED being so different? It's leading to pretty
> > > > objectively ugly code in this series.
> > >
> > > Note that VM_SEALED is the "is this VMA sealed?" bit itself. The define
> > > for "should we perform system mapping sealing?" is intentionally separate
> > > here, so that it can be Kconfig and per-arch toggled, etc.
> > >
> >
> > Considering Dave is the second person that did not find the huge commit
> > message helpful, can we please limit the commit message to be about the
> > actual code and not the entire series?
> >
> > I thought we said that it was worth while making this change in v5?
>
> Right, please minimize patch #1's commit log to just what it is doing,
> etc, and leave the rest of the rationale in the 0/N cover letter.
>
Sure.
> --
> Kees Cook
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:18 ` Liam R. Howlett
@ 2025-02-24 19:39 ` Jeff Xu
2025-02-24 20:13 ` Liam R. Howlett
0 siblings, 1 reply; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 19:39 UTC (permalink / raw)
To: Liam R. Howlett, Jeff Xu, Dave Hansen, akpm, keescook, jannh,
torvalds, vbabka, lorenzo.stoakes, adhemerval.zanella, oleg,
avagin, benjamin, linux-kernel, linux-hardening, linux-mm,
jorgelo, sroettger, hch, ojeda, thomas.weissschuh, adobriyan,
johannes, pedro.falcato, hca, willy, anna-maria, mark.rutland,
linus.walleij, Jason, deller, rdunlap, davem, peterx, f.fainelli,
gerg, dave.hansen, mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb,
enh, rientjes, groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 11:18 AM Liam R. Howlett
<Liam.Howlett@oracle.com> wrote:
>
> Also, do we need something like the above test for VM_SEALED_SYS?
>
Do you mean adding selftest for sealing vdso ? or test the
VM_SEALED_SYS macro in 32 bit vs 64 bits ?
CONFIG_MSEAL_SYSTEM_MAPPINGS is by default disabled. I'm not sure
about the common practice here.
I also don't know how to detect CONFIG_MSEAL_SYSTEM_MAPPINGS from
selftest, that is needed for the selftest.
-Jeff
> Thanks,
> Liam
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:25 ` Kees Cook
@ 2025-02-24 19:42 ` Jeff Xu
2025-02-24 20:12 ` Liam R. Howlett
0 siblings, 1 reply; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 19:42 UTC (permalink / raw)
To: Kees Cook
Cc: Liam R. Howlett, Dave Hansen, akpm, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 11:25 AM Kees Cook <kees@kernel.org> wrote:
>
> On Mon, Feb 24, 2025 at 11:10:22AM -0800, Jeff Xu wrote:
> > On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett
> > <Liam.Howlett@oracle.com> wrote:
> > >
> > > * Jeff Xu <jeffxu@chromium.org> [250224 13:44]:
> > > > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote:
> > > > >
> > > > > On 2/24/25 09:45, jeffxu@chromium.org wrote:
> > > > > > +/*
> > > > > > + * mseal of userspace process's system mappings.
> > > > > > + */
> > > > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
> > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED
> > > > > > +#else
> > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE
> > > > > > +#endif
> > > > >
> > > > > This ends up looking pretty wonky in practice:
> > > > >
> > > > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
> > > > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
> > > > >
> > > > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the
> > > > > other ones.
> > > > >
> > > > > Would it really hurt to have
> > > > >
> > > > > #ifdef CONFIG_64BIT
> > > > > /* VM is sealed, in vm_flags */
> > > > > #define VM_SEALED _BITUL(63)
> > > > > +#else
> > > > > +#define VM_SEALED VM_NONE
> > > > > #endif
> > > > >
> > > > > ?
> > > > >
> > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of
> > > > the build. This is intentional. Any 32-bit code trying to use the
> > > > sealing function or the VM_SEALED flag will immediately fail
> > > > compilation. This makes it easier to identify incorrect usage.
> > > >
> > >
> > > The reason that two #defines are needed is because you can have mseal
> > > enabled while not sealing system mappings, so for this to be clean we
> > > need two defines.
> > >
> > > However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion.
> > > Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED".
> > >
> > How about MSEAL_SYSTME_MAPPINGS as Kees suggested ?
> >
> > The VM_SYSTEM_SEALED doesn't have the MSEAL key or the MAPPING, so it
> > might take longer for the new reader to understand what it is.
>
> I think to address Dave's concern, it should start with "VM_". We use
> "SEAL" already with VM_SEALED, so the "M" in "MSEAL" may be redundant,
> and to clarify the system mapping, how avout VM_SEAL_SYSMAP ? That
> capture's, hopefully, Dave, Liam, and your thoughts about the naming?
>
Liam had a comment in the previous version, asking everything related
with mseal() to have the MSEAL keyword, that is why KCONFIG change has
MSEAL.
How about VM_MSEAL_SYSMAP ?
-Jeff
> --
> Kees Cook
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:42 ` Jeff Xu
@ 2025-02-24 20:12 ` Liam R. Howlett
2025-02-24 21:08 ` Jeff Xu
0 siblings, 1 reply; 43+ messages in thread
From: Liam R. Howlett @ 2025-02-24 20:12 UTC (permalink / raw)
To: Jeff Xu
Cc: Kees Cook, Dave Hansen, akpm, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
* Jeff Xu <jeffxu@chromium.org> [250224 14:42]:
> On Mon, Feb 24, 2025 at 11:25 AM Kees Cook <kees@kernel.org> wrote:
> >
> > On Mon, Feb 24, 2025 at 11:10:22AM -0800, Jeff Xu wrote:
> > > On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett
> > > <Liam.Howlett@oracle.com> wrote:
> > > >
> > > > * Jeff Xu <jeffxu@chromium.org> [250224 13:44]:
> > > > > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote:
> > > > > >
> > > > > > On 2/24/25 09:45, jeffxu@chromium.org wrote:
> > > > > > > +/*
> > > > > > > + * mseal of userspace process's system mappings.
> > > > > > > + */
> > > > > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
> > > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED
> > > > > > > +#else
> > > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE
> > > > > > > +#endif
> > > > > >
> > > > > > This ends up looking pretty wonky in practice:
> > > > > >
> > > > > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
> > > > > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
> > > > > >
> > > > > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the
> > > > > > other ones.
> > > > > >
> > > > > > Would it really hurt to have
> > > > > >
> > > > > > #ifdef CONFIG_64BIT
> > > > > > /* VM is sealed, in vm_flags */
> > > > > > #define VM_SEALED _BITUL(63)
> > > > > > +#else
> > > > > > +#define VM_SEALED VM_NONE
> > > > > > #endif
> > > > > >
> > > > > > ?
> > > > > >
> > > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of
> > > > > the build. This is intentional. Any 32-bit code trying to use the
> > > > > sealing function or the VM_SEALED flag will immediately fail
> > > > > compilation. This makes it easier to identify incorrect usage.
> > > > >
> > > >
> > > > The reason that two #defines are needed is because you can have mseal
> > > > enabled while not sealing system mappings, so for this to be clean we
> > > > need two defines.
> > > >
> > > > However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion.
> > > > Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED".
> > > >
> > > How about MSEAL_SYSTME_MAPPINGS as Kees suggested ?
> > >
> > > The VM_SYSTEM_SEALED doesn't have the MSEAL key or the MAPPING, so it
> > > might take longer for the new reader to understand what it is.
> >
> > I think to address Dave's concern, it should start with "VM_". We use
> > "SEAL" already with VM_SEALED, so the "M" in "MSEAL" may be redundant,
> > and to clarify the system mapping, how avout VM_SEAL_SYSMAP ? That
> > capture's, hopefully, Dave, Liam, and your thoughts about the naming?
> >
> Liam had a comment in the previous version, asking everything related
> with mseal() to have the MSEAL keyword, that is why KCONFIG change has
> MSEAL.
>
> How about VM_MSEAL_SYSMAP ?
I don't recall if it was this set or previous ones, but seal is becoming
overloaded and having mseal would have been good for this one.
VM_MSEAL does gain more greping, but since we have VM_SEALED,
VM_SEAL_SYSMAP is going to show up on VM_SEAL grep, but not VM_SEALED
greps. Maybe VM_SEALED_SYSMAP would be better for searching.
Thanks,
Liam
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 19:39 ` Jeff Xu
@ 2025-02-24 20:13 ` Liam R. Howlett
0 siblings, 0 replies; 43+ messages in thread
From: Liam R. Howlett @ 2025-02-24 20:13 UTC (permalink / raw)
To: Jeff Xu
Cc: Dave Hansen, akpm, keescook, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
* Jeff Xu <jeffxu@chromium.org> [250224 14:40]:
> On Mon, Feb 24, 2025 at 11:18 AM Liam R. Howlett
> <Liam.Howlett@oracle.com> wrote:
> >
> > Also, do we need something like the above test for VM_SEALED_SYS?
> >
> Do you mean adding selftest for sealing vdso ? or test the
> VM_SEALED_SYS macro in 32 bit vs 64 bits ?
>
> CONFIG_MSEAL_SYSTEM_MAPPINGS is by default disabled. I'm not sure
> about the common practice here.
>
> I also don't know how to detect CONFIG_MSEAL_SYSTEM_MAPPINGS from
> selftest, that is needed for the selftest.
Looking more into this, I think it's fine without anything in there
changing.
Thanks,
Liam
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 7/7] mseal, system mappings: update mseal.rst
2025-02-24 17:45 ` [PATCH v6 7/7] mseal, system mappings: update mseal.rst jeffxu
2025-02-24 19:04 ` Kees Cook
@ 2025-02-24 20:26 ` Liam R. Howlett
2025-02-24 21:06 ` Jeff Xu
2025-02-25 6:07 ` Lorenzo Stoakes
2 siblings, 1 reply; 43+ messages in thread
From: Liam R. Howlett @ 2025-02-24 20:26 UTC (permalink / raw)
To: jeffxu
Cc: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
adhemerval.zanella, oleg, avagin, benjamin, linux-kernel,
linux-hardening, linux-mm, jorgelo, sroettger, hch, ojeda,
thomas.weissschuh, adobriyan, johannes, pedro.falcato, hca,
willy, anna-maria, mark.rutland, linus.walleij, Jason, deller,
rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen, mingo,
ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes, groeck,
mpe, aleksandr.mikhalitsyn, mike.rapoport
* jeffxu@chromium.org <jeffxu@chromium.org> [250224 12:45]:
> From: Jeff Xu <jeffxu@chromium.org>
>
> Update memory sealing documentation to include details about system
> mappings.
>
> Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> ---
> Documentation/userspace-api/mseal.rst | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst
> index 41102f74c5e2..10147281bf2d 100644
> --- a/Documentation/userspace-api/mseal.rst
> +++ b/Documentation/userspace-api/mseal.rst
> @@ -130,6 +130,13 @@ Use cases
>
> - Chrome browser: protect some security sensitive data structures.
Did you mean to drop this line?
>
> +- System mappings:
> + If supported by an architecture (via CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS),
> + the CONFIG_MSEAL_SYSTEM_MAPPINGS seals system mappings, e.g. vdso, vvar,
> + uprobes, sigpage, vectors, etc. CHECKPOINT_RESTORE, UML, gVisor, rr are
> + known to relocate or unmap system mapping, therefore this config can't be
> + enabled universally.
> +
> When not to use mseal
> =====================
> Applications can apply sealing to any virtual memory region from userspace,
> --
> 2.48.1.601.g30ceb7b040-goog
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 2/7] selftests: x86: test_mremap_vdso: skip if vdso is msealed
2025-02-24 17:45 ` [PATCH v6 2/7] selftests: x86: test_mremap_vdso: skip if vdso is msealed jeffxu
2025-02-24 19:02 ` Kees Cook
@ 2025-02-24 21:04 ` Liam R. Howlett
2025-02-24 21:24 ` Jeff Xu
1 sibling, 1 reply; 43+ messages in thread
From: Liam R. Howlett @ 2025-02-24 21:04 UTC (permalink / raw)
To: jeffxu
Cc: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
adhemerval.zanella, oleg, avagin, benjamin, linux-kernel,
linux-hardening, linux-mm, jorgelo, sroettger, hch, ojeda,
thomas.weissschuh, adobriyan, johannes, pedro.falcato, hca,
willy, anna-maria, mark.rutland, linus.walleij, Jason, deller,
rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen, mingo,
ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes, groeck,
mpe, aleksandr.mikhalitsyn, mike.rapoport
* jeffxu@chromium.org <jeffxu@chromium.org> [250224 12:45]:
> From: Jeff Xu <jeffxu@chromium.org>
>
> Add code to detect if the vdso is memory sealed, skip the test
> if it is.
It also skips the test if fopen fails on smaps, but maybe that's super
rare?
>
> Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> ---
> .../testing/selftests/x86/test_mremap_vdso.c | 38 +++++++++++++++++++
> 1 file changed, 38 insertions(+)
>
> diff --git a/tools/testing/selftests/x86/test_mremap_vdso.c b/tools/testing/selftests/x86/test_mremap_vdso.c
> index d53959e03593..c68077c56b22 100644
> --- a/tools/testing/selftests/x86/test_mremap_vdso.c
> +++ b/tools/testing/selftests/x86/test_mremap_vdso.c
> @@ -14,6 +14,7 @@
> #include <errno.h>
> #include <unistd.h>
> #include <string.h>
> +#include <stdbool.h>
>
> #include <sys/mman.h>
> #include <sys/auxv.h>
> @@ -55,13 +56,50 @@ static int try_to_remap(void *vdso_addr, unsigned long size)
>
> }
>
> +#define VDSO_NAME "[vdso]"
> +#define VMFLAGS "VmFlags:"
> +#define MSEAL_FLAGS "sl"
> +#define MAX_LINE_LEN 512
> +
> +bool vdso_sealed(FILE *maps)
> +{
> + char line[MAX_LINE_LEN];
> + bool has_vdso = false;
> +
> + while (fgets(line, sizeof(line), maps)) {
> + if (strstr(line, VDSO_NAME))
> + has_vdso = true;
> +
> + if (has_vdso && !strncmp(line, VMFLAGS, strlen(VMFLAGS))) {
> + if (strstr(line, MSEAL_FLAGS))
> + return true;
> +
> + return false;
> + }
> + }
> +
> + return false;
> +}
> +
> int main(int argc, char **argv, char **envp)
> {
> pid_t child;
> + FILE *maps;
>
> ksft_print_header();
> ksft_set_plan(1);
>
> + maps = fopen("/proc/self/smaps", "r");
> + if (!maps) {
> + ksft_test_result_skip("Could not open /proc/self/smaps\n");
fork() below prints errno, should this also print errno?
> + return 0;
I guess the logic here is that you might fail later because it's sealed
but you don't know? Is this rare enough not to matter?
> + }
> +
> + if (vdso_sealed(maps)) {
> + ksft_test_result_skip("vdso is sealed\n");
> + return 0;
> + }
> +
This file also has an #ifdef __i386__ later, Using it here would
prevent unnecessary scanning of the maps file. Probably not a big deal?
Do you need to close the maps file?
> child = fork();
> if (child == -1)
> ksft_exit_fail_msg("failed to fork (%d): %m\n", errno);
> --
> 2.48.1.601.g30ceb7b040-goog
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 7/7] mseal, system mappings: update mseal.rst
2025-02-24 20:26 ` Liam R. Howlett
@ 2025-02-24 21:06 ` Jeff Xu
2025-02-24 21:54 ` Jeff Xu
0 siblings, 1 reply; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 21:06 UTC (permalink / raw)
To: Liam R. Howlett, jeffxu, akpm, keescook, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 12:26 PM Liam R. Howlett
<Liam.Howlett@oracle.com> wrote:
>
> * jeffxu@chromium.org <jeffxu@chromium.org> [250224 12:45]:
> > From: Jeff Xu <jeffxu@chromium.org>
> >
> > Update memory sealing documentation to include details about system
> > mappings.
> >
> > Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> > ---
> > Documentation/userspace-api/mseal.rst | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst
> > index 41102f74c5e2..10147281bf2d 100644
> > --- a/Documentation/userspace-api/mseal.rst
> > +++ b/Documentation/userspace-api/mseal.rst
> > @@ -130,6 +130,13 @@ Use cases
> >
> > - Chrome browser: protect some security sensitive data structures.
>
> Did you mean to drop this line?
>
Ah, thank you for catching that.
-Jeff
> >
> > +- System mappings:
> > + If supported by an architecture (via CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS),
> > + the CONFIG_MSEAL_SYSTEM_MAPPINGS seals system mappings, e.g. vdso, vvar,
> > + uprobes, sigpage, vectors, etc. CHECKPOINT_RESTORE, UML, gVisor, rr are
> > + known to relocate or unmap system mapping, therefore this config can't be
> > + enabled universally.
> > +
> > When not to use mseal
> > =====================
> > Applications can apply sealing to any virtual memory region from userspace,
> > --
> > 2.48.1.601.g30ceb7b040-goog
> >
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 3/7] mseal, system mappings: enable x86-64
2025-02-24 17:45 ` [PATCH v6 3/7] mseal, system mappings: enable x86-64 jeffxu
@ 2025-02-24 21:06 ` Liam R. Howlett
0 siblings, 0 replies; 43+ messages in thread
From: Liam R. Howlett @ 2025-02-24 21:06 UTC (permalink / raw)
To: jeffxu
Cc: akpm, keescook, jannh, torvalds, vbabka, lorenzo.stoakes,
adhemerval.zanella, oleg, avagin, benjamin, linux-kernel,
linux-hardening, linux-mm, jorgelo, sroettger, hch, ojeda,
thomas.weissschuh, adobriyan, johannes, pedro.falcato, hca,
willy, anna-maria, mark.rutland, linus.walleij, Jason, deller,
rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen, mingo,
ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes, groeck,
mpe, aleksandr.mikhalitsyn, mike.rapoport
* jeffxu@chromium.org <jeffxu@chromium.org> [250224 12:45]:
> From: Jeff Xu <jeffxu@chromium.org>
>
> Provide support for CONFIG_MSEAL_SYSTEM_MAPPINGS on x86-64,
> covering the vdso, vvar, vvar_vclock.
>
> Production release testing passes on Android and Chrome OS.
>
> Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> ---
> arch/x86/Kconfig | 1 +
> arch/x86/entry/vdso/vma.c | 16 ++++++++++------
> 2 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 87198d957e2f..8fa17032ca46 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -26,6 +26,7 @@ config X86_64
> depends on 64BIT
> # Options that are inherently 64-bit kernel only:
> select ARCH_HAS_GIGANTIC_PAGE
> + select ARCH_HAS_MSEAL_SYSTEM_MAPPINGS
> select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
> select ARCH_SUPPORTS_PER_VMA_LOCK
> select ARCH_SUPPORTS_HUGE_PFNMAP if TRANSPARENT_HUGEPAGE
> diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
> index 39e6efc1a9ca..54677964d0b5 100644
> --- a/arch/x86/entry/vdso/vma.c
> +++ b/arch/x86/entry/vdso/vma.c
> @@ -247,6 +247,7 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr)
> struct mm_struct *mm = current->mm;
> struct vm_area_struct *vma;
> unsigned long text_start;
> + unsigned long vm_flags;
> int ret = 0;
>
> if (mmap_write_lock_killable(mm))
> @@ -264,11 +265,12 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr)
> /*
> * MAYWRITE to allow gdb to COW and set breakpoints
> */
> + vm_flags = VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC;
> + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
> vma = _install_special_mapping(mm,
> text_start,
> image->size,
> - VM_READ|VM_EXEC|
> - VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
> + vm_flags,
Why not just | VM_SEALED_SYSMAP here, and everywhere else?
> &vdso_mapping);
>
> if (IS_ERR(vma)) {
> @@ -276,11 +278,12 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr)
> goto up_fail;
> }
>
> + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
> + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
> vma = _install_special_mapping(mm,
> addr,
> (__VVAR_PAGES - VDSO_NR_VCLOCK_PAGES) * PAGE_SIZE,
> - VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|
> - VM_PFNMAP,
> + vm_flags,
> &vvar_mapping);
>
> if (IS_ERR(vma)) {
> @@ -289,11 +292,12 @@ static int map_vdso(const struct vdso_image *image, unsigned long addr)
> goto up_fail;
> }
>
> + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
> + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
> vma = _install_special_mapping(mm,
> addr + (__VVAR_PAGES - VDSO_NR_VCLOCK_PAGES) * PAGE_SIZE,
> VDSO_NR_VCLOCK_PAGES * PAGE_SIZE,
> - VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|
> - VM_PFNMAP,
> + vm_flags,
> &vvar_vclock_mapping);
>
> if (IS_ERR(vma)) {
> --
> 2.48.1.601.g30ceb7b040-goog
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 1/7] mseal, system mappings: kernel config and header change
2025-02-24 20:12 ` Liam R. Howlett
@ 2025-02-24 21:08 ` Jeff Xu
0 siblings, 0 replies; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 21:08 UTC (permalink / raw)
To: Liam R. Howlett, Jeff Xu, Kees Cook, Dave Hansen, akpm, jannh,
torvalds, vbabka, lorenzo.stoakes, adhemerval.zanella, oleg,
avagin, benjamin, linux-kernel, linux-hardening, linux-mm,
jorgelo, sroettger, hch, ojeda, thomas.weissschuh, adobriyan,
johannes, pedro.falcato, hca, willy, anna-maria, mark.rutland,
linus.walleij, Jason, deller, rdunlap, davem, peterx, f.fainelli,
gerg, dave.hansen, mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb,
enh, rientjes, groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 12:13 PM Liam R. Howlett
<Liam.Howlett@oracle.com> wrote:
>
> * Jeff Xu <jeffxu@chromium.org> [250224 14:42]:
> > On Mon, Feb 24, 2025 at 11:25 AM Kees Cook <kees@kernel.org> wrote:
> > >
> > > On Mon, Feb 24, 2025 at 11:10:22AM -0800, Jeff Xu wrote:
> > > > On Mon, Feb 24, 2025 at 11:03 AM Liam R. Howlett
> > > > <Liam.Howlett@oracle.com> wrote:
> > > > >
> > > > > * Jeff Xu <jeffxu@chromium.org> [250224 13:44]:
> > > > > > On Mon, Feb 24, 2025 at 10:21 AM Dave Hansen <dave.hansen@intel.com> wrote:
> > > > > > >
> > > > > > > On 2/24/25 09:45, jeffxu@chromium.org wrote:
> > > > > > > > +/*
> > > > > > > > + * mseal of userspace process's system mappings.
> > > > > > > > + */
> > > > > > > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
> > > > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_SEALED
> > > > > > > > +#else
> > > > > > > > +#define MSEAL_SYSTEM_MAPPINGS_VM_FLAG VM_NONE
> > > > > > > > +#endif
> > > > > > >
> > > > > > > This ends up looking pretty wonky in practice:
> > > > > > >
> > > > > > > > + vm_flags = VM_READ|VM_MAYREAD|VM_IO|VM_DONTDUMP|VM_PFNMAP;
> > > > > > > > + vm_flags |= MSEAL_SYSTEM_MAPPINGS_VM_FLAG;
> > > > > > >
> > > > > > > because MSEAL_SYSTEM_MAPPINGS_VM_FLAG is so much different from the
> > > > > > > other ones.
> > > > > > >
> > > > > > > Would it really hurt to have
> > > > > > >
> > > > > > > #ifdef CONFIG_64BIT
> > > > > > > /* VM is sealed, in vm_flags */
> > > > > > > #define VM_SEALED _BITUL(63)
> > > > > > > +#else
> > > > > > > +#define VM_SEALED VM_NONE
> > > > > > > #endif
> > > > > > >
> > > > > > > ?
> > > > > > >
> > > > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of
> > > > > > the build. This is intentional. Any 32-bit code trying to use the
> > > > > > sealing function or the VM_SEALED flag will immediately fail
> > > > > > compilation. This makes it easier to identify incorrect usage.
> > > > > >
> > > > >
> > > > > The reason that two #defines are needed is because you can have mseal
> > > > > enabled while not sealing system mappings, so for this to be clean we
> > > > > need two defines.
> > > > >
> > > > > However MSEAL_SYSTEM_MAPPINGS_VM_FLAG, is _way_ too long, in my opinion.
> > > > > Keeping with "VM_SEALED" I'd suggest "VM_SYSTEM_SEALED".
> > > > >
> > > > How about MSEAL_SYSTME_MAPPINGS as Kees suggested ?
> > > >
> > > > The VM_SYSTEM_SEALED doesn't have the MSEAL key or the MAPPING, so it
> > > > might take longer for the new reader to understand what it is.
> > >
> > > I think to address Dave's concern, it should start with "VM_". We use
> > > "SEAL" already with VM_SEALED, so the "M" in "MSEAL" may be redundant,
> > > and to clarify the system mapping, how avout VM_SEAL_SYSMAP ? That
> > > capture's, hopefully, Dave, Liam, and your thoughts about the naming?
> > >
> > Liam had a comment in the previous version, asking everything related
> > with mseal() to have the MSEAL keyword, that is why KCONFIG change has
> > MSEAL.
> >
> > How about VM_MSEAL_SYSMAP ?
>
> I don't recall if it was this set or previous ones, but seal is becoming
> overloaded and having mseal would have been good for this one.
>
> VM_MSEAL does gain more greping, but since we have VM_SEALED,
> VM_SEAL_SYSMAP is going to show up on VM_SEAL grep, but not VM_SEALED
> greps. Maybe VM_SEALED_SYSMAP would be better for searching.
>
OK, I will change to VM_SEALED_SYSMAP
-Jeff
> Thanks,
> Liam
>
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 2/7] selftests: x86: test_mremap_vdso: skip if vdso is msealed
2025-02-24 21:04 ` Liam R. Howlett
@ 2025-02-24 21:24 ` Jeff Xu
0 siblings, 0 replies; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 21:24 UTC (permalink / raw)
To: Liam R. Howlett, jeffxu, akpm, keescook, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 1:05 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * jeffxu@chromium.org <jeffxu@chromium.org> [250224 12:45]:
> > From: Jeff Xu <jeffxu@chromium.org>
> >
> > Add code to detect if the vdso is memory sealed, skip the test
> > if it is.
>
> It also skips the test if fopen fails on smaps, but maybe that's super
> rare?
>
If we can't access /proc/pid/smaps, that's an "environment" issue
unrelated to this test, therefore skip().
> >
> > Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> > ---
> > .../testing/selftests/x86/test_mremap_vdso.c | 38 +++++++++++++++++++
> > 1 file changed, 38 insertions(+)
> >
> > diff --git a/tools/testing/selftests/x86/test_mremap_vdso.c b/tools/testing/selftests/x86/test_mremap_vdso.c
> > index d53959e03593..c68077c56b22 100644
> > --- a/tools/testing/selftests/x86/test_mremap_vdso.c
> > +++ b/tools/testing/selftests/x86/test_mremap_vdso.c
> > @@ -14,6 +14,7 @@
> > #include <errno.h>
> > #include <unistd.h>
> > #include <string.h>
> > +#include <stdbool.h>
> >
> > #include <sys/mman.h>
> > #include <sys/auxv.h>
> > @@ -55,13 +56,50 @@ static int try_to_remap(void *vdso_addr, unsigned long size)
> >
> > }
> >
> > +#define VDSO_NAME "[vdso]"
> > +#define VMFLAGS "VmFlags:"
> > +#define MSEAL_FLAGS "sl"
> > +#define MAX_LINE_LEN 512
> > +
> > +bool vdso_sealed(FILE *maps)
> > +{
> > + char line[MAX_LINE_LEN];
> > + bool has_vdso = false;
> > +
> > + while (fgets(line, sizeof(line), maps)) {
> > + if (strstr(line, VDSO_NAME))
> > + has_vdso = true;
> > +
> > + if (has_vdso && !strncmp(line, VMFLAGS, strlen(VMFLAGS))) {
> > + if (strstr(line, MSEAL_FLAGS))
> > + return true;
> > +
> > + return false;
> > + }
> > + }
> > +
> > + return false;
> > +}
> > +
> > int main(int argc, char **argv, char **envp)
> > {
> > pid_t child;
> > + FILE *maps;
> >
> > ksft_print_header();
> > ksft_set_plan(1);
> >
> > + maps = fopen("/proc/self/smaps", "r");
> > + if (!maps) {
> > + ksft_test_result_skip("Could not open /proc/self/smaps\n");
>
> fork() below prints errno, should this also print errno?
>
Sure.
> > + return 0;
>
> I guess the logic here is that you might fail later because it's sealed
> but you don't know? Is this rare enough not to matter?
>
yes. We should check the "sl" flag of vdso before continuing, or the
results will be unpredictable.
Android currently includes this selftest. In addition, you or someone
commented that the selftest framework can randomly turn on kernel
config and run selftest, so it is necessary to add this check to skip.
> > + }
> > +
> > + if (vdso_sealed(maps)) {
> > + ksft_test_result_skip("vdso is sealed\n");
> > + return 0;
> > + }
> > +
>
> This file also has an #ifdef __i386__ later, Using it here would
> prevent unnecessary scanning of the maps file. Probably not a big deal?
>
> Do you need to close the maps file?
>
OK, I will add close for good coding practice.
Thanks
-Jeff
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 7/7] mseal, system mappings: update mseal.rst
2025-02-24 21:06 ` Jeff Xu
@ 2025-02-24 21:54 ` Jeff Xu
0 siblings, 0 replies; 43+ messages in thread
From: Jeff Xu @ 2025-02-24 21:54 UTC (permalink / raw)
To: Liam R. Howlett, jeffxu, akpm, keescook, jannh, torvalds, vbabka,
lorenzo.stoakes, adhemerval.zanella, oleg, avagin, benjamin,
linux-kernel, linux-hardening, linux-mm, jorgelo, sroettger, hch,
ojeda, thomas.weissschuh, adobriyan, johannes, pedro.falcato,
hca, willy, anna-maria, mark.rutland, linus.walleij, Jason,
deller, rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen,
mingo, ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes,
groeck, mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 1:06 PM Jeff Xu <jeffxu@chromium.org> wrote:
>
> On Mon, Feb 24, 2025 at 12:26 PM Liam R. Howlett
> <Liam.Howlett@oracle.com> wrote:
> >
> > * jeffxu@chromium.org <jeffxu@chromium.org> [250224 12:45]:
> > > From: Jeff Xu <jeffxu@chromium.org>
> > >
> > > Update memory sealing documentation to include details about system
> > > mappings.
> > >
> > > Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> > > ---
> > > Documentation/userspace-api/mseal.rst | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst
> > > index 41102f74c5e2..10147281bf2d 100644
> > > --- a/Documentation/userspace-api/mseal.rst
> > > +++ b/Documentation/userspace-api/mseal.rst
> > > @@ -130,6 +130,13 @@ Use cases
> > >
> > > - Chrome browser: protect some security sensitive data structures.
> >
> > Did you mean to drop this line?
> >
> Ah, thank you for catching that.
>
Actually, this isn't a problem here.
The "-" here is part of the text, for list, so that line is not dropped).
-Jeff
> -Jeff
>
>
> > >
> > > +- System mappings:
> > > + If supported by an architecture (via CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS),
> > > + the CONFIG_MSEAL_SYSTEM_MAPPINGS seals system mappings, e.g. vdso, vvar,
> > > + uprobes, sigpage, vectors, etc. CHECKPOINT_RESTORE, UML, gVisor, rr are
> > > + known to relocate or unmap system mapping, therefore this config can't be
> > > + enabled universally.
> > > +
> > > When not to use mseal
> > > =====================
> > > Applications can apply sealing to any virtual memory region from userspace,
> > > --
> > > 2.48.1.601.g30ceb7b040-goog
> > >
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 7/7] mseal, system mappings: update mseal.rst
2025-02-24 17:45 ` [PATCH v6 7/7] mseal, system mappings: update mseal.rst jeffxu
2025-02-24 19:04 ` Kees Cook
2025-02-24 20:26 ` Liam R. Howlett
@ 2025-02-25 6:07 ` Lorenzo Stoakes
2025-02-25 22:31 ` Jeff Xu
2 siblings, 1 reply; 43+ messages in thread
From: Lorenzo Stoakes @ 2025-02-25 6:07 UTC (permalink / raw)
To: jeffxu
Cc: akpm, keescook, jannh, torvalds, vbabka, Liam.Howlett,
adhemerval.zanella, oleg, avagin, benjamin, linux-kernel,
linux-hardening, linux-mm, jorgelo, sroettger, hch, ojeda,
thomas.weissschuh, adobriyan, johannes, pedro.falcato, hca,
willy, anna-maria, mark.rutland, linus.walleij, Jason, deller,
rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen, mingo,
ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes, groeck,
mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 05:45:13PM +0000, jeffxu@chromium.org wrote:
> From: Jeff Xu <jeffxu@chromium.org>
>
> Update memory sealing documentation to include details about system
> mappings.
>
> Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> ---
> Documentation/userspace-api/mseal.rst | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst
> index 41102f74c5e2..10147281bf2d 100644
> --- a/Documentation/userspace-api/mseal.rst
> +++ b/Documentation/userspace-api/mseal.rst
> @@ -130,6 +130,13 @@ Use cases
>
> - Chrome browser: protect some security sensitive data structures.
>
> +- System mappings:
> + If supported by an architecture (via CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS),
> + the CONFIG_MSEAL_SYSTEM_MAPPINGS seals system mappings, e.g. vdso, vvar,
> + uprobes, sigpage, vectors, etc. CHECKPOINT_RESTORE, UML, gVisor, rr are
> + known to relocate or unmap system mapping, therefore this config can't be
> + enabled universally.
Thanks for adding this.
Similar comments to the Kconfig update - you are listing features that do not
exist yet, please just list what you're doing, specifically, and avoid the vague
'etc.', we don't need to be vague.
As per the Kconfig comment - you need to be a lot more clear, I think you're
duplicating the text from there to here, so again I suggest something like:
WARNING: This feature breaks programs which rely on relocating or
unmapping system mappings.
Known broken software at the time of writing includes
CHECKPOINT_RESTORE, UML, gVisor and rr.
You also seem to be writing very little here, it's a documentation page, you can
be as verbose as you like :)
You really need to add some more detail here in general - you aren't explaining
why people would want to enable this, what you're mitigating, etc. from that you
explain _why_ it doesn't work for some things.
You're also not mentioning architectural limitations here, for instance that you
can only do this on arches that don't require VDSO relocation and listing
known-good arches.
This is a documentation file, you can go wild :) the more information here the
better.
WARNING
=======
> +
> When not to use mseal
> =====================
> Applications can apply sealing to any virtual memory region from userspace,
> --
> 2.48.1.601.g30ceb7b040-goog
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 7/7] mseal, system mappings: update mseal.rst
2025-02-25 6:07 ` Lorenzo Stoakes
@ 2025-02-25 22:31 ` Jeff Xu
2025-02-25 22:36 ` Jeff Xu
2025-02-26 5:27 ` Lorenzo Stoakes
0 siblings, 2 replies; 43+ messages in thread
From: Jeff Xu @ 2025-02-25 22:31 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: akpm, keescook, jannh, torvalds, vbabka, Liam.Howlett,
adhemerval.zanella, oleg, avagin, benjamin, linux-kernel,
linux-hardening, linux-mm, jorgelo, sroettger, hch, ojeda,
thomas.weissschuh, adobriyan, johannes, pedro.falcato, hca,
willy, anna-maria, mark.rutland, linus.walleij, Jason, deller,
rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen, mingo,
ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes, groeck,
mpe, aleksandr.mikhalitsyn, mike.rapoport
On Mon, Feb 24, 2025 at 10:07 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Mon, Feb 24, 2025 at 05:45:13PM +0000, jeffxu@chromium.org wrote:
> > From: Jeff Xu <jeffxu@chromium.org>
> >
> > Update memory sealing documentation to include details about system
> > mappings.
> >
> > Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> > ---
> > Documentation/userspace-api/mseal.rst | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst
> > index 41102f74c5e2..10147281bf2d 100644
> > --- a/Documentation/userspace-api/mseal.rst
> > +++ b/Documentation/userspace-api/mseal.rst
> > @@ -130,6 +130,13 @@ Use cases
> >
> > - Chrome browser: protect some security sensitive data structures.
> >
> > +- System mappings:
> > + If supported by an architecture (via CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS),
> > + the CONFIG_MSEAL_SYSTEM_MAPPINGS seals system mappings, e.g. vdso, vvar,
> > + uprobes, sigpage, vectors, etc. CHECKPOINT_RESTORE, UML, gVisor, rr are
> > + known to relocate or unmap system mapping, therefore this config can't be
> > + enabled universally.
>
> Thanks for adding this.
>
> Similar comments to the Kconfig update - you are listing features that do not
> exist yet, please just list what you're doing, specifically, and avoid the vague
> 'etc.', we don't need to be vague.
>
OK, I will remove etc and list the known mappings here.
> As per the Kconfig comment - you need to be a lot more clear, I think you're
> duplicating the text from there to here, so again I suggest something like:
>
> WARNING: This feature breaks programs which rely on relocating or
> unmapping system mappings.
>
> Known broken software at the time of writing includes
> CHECKPOINT_RESTORE, UML, gVisor and rr.
>
Sure.
> You also seem to be writing very little here, it's a documentation page, you can
> be as verbose as you like :)
>
> You really need to add some more detail here in general - you aren't explaining
> why people would want to enable this, what you're mitigating, etc. from that you
> explain _why_ it doesn't work for some things.
>
The mseal.rst already includes below regarding the protection/mitigation.
> You're also not mentioning architectural limitations here, for instance that you
> can only do this on arches that don't require VDSO relocation and listing
> known-good arches.
>
> This is a documentation file, you can go wild :) the more information here the
> better.
>
> WARNING
> =======
>
> > +
> > When not to use mseal
> > =====================
> > Applications can apply sealing to any virtual memory region from userspace,
> > --
> > 2.48.1.601.g30ceb7b040-goog
> >
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 7/7] mseal, system mappings: update mseal.rst
2025-02-25 22:31 ` Jeff Xu
@ 2025-02-25 22:36 ` Jeff Xu
2025-02-26 6:08 ` Lorenzo Stoakes
2025-02-26 5:27 ` Lorenzo Stoakes
1 sibling, 1 reply; 43+ messages in thread
From: Jeff Xu @ 2025-02-25 22:36 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: akpm, keescook, jannh, torvalds, vbabka, Liam.Howlett,
adhemerval.zanella, oleg, avagin, benjamin, linux-kernel,
linux-hardening, linux-mm, jorgelo, sroettger, hch, ojeda,
thomas.weissschuh, adobriyan, johannes, pedro.falcato, hca,
willy, anna-maria, mark.rutland, linus.walleij, Jason, deller,
rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen, mingo,
ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes, groeck,
mpe, aleksandr.mikhalitsyn, mike.rapoport
On Tue, Feb 25, 2025 at 2:31 PM Jeff Xu <jeffxu@chromium.org> wrote:
>
> On Mon, Feb 24, 2025 at 10:07 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Mon, Feb 24, 2025 at 05:45:13PM +0000, jeffxu@chromium.org wrote:
> > > From: Jeff Xu <jeffxu@chromium.org>
> > >
> > > Update memory sealing documentation to include details about system
> > > mappings.
> > >
> > > Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> > > ---
> > > Documentation/userspace-api/mseal.rst | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst
> > > index 41102f74c5e2..10147281bf2d 100644
> > > --- a/Documentation/userspace-api/mseal.rst
> > > +++ b/Documentation/userspace-api/mseal.rst
> > > @@ -130,6 +130,13 @@ Use cases
> > >
> > > - Chrome browser: protect some security sensitive data structures.
> > >
> > > +- System mappings:
> > > + If supported by an architecture (via CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS),
> > > + the CONFIG_MSEAL_SYSTEM_MAPPINGS seals system mappings, e.g. vdso, vvar,
> > > + uprobes, sigpage, vectors, etc. CHECKPOINT_RESTORE, UML, gVisor, rr are
> > > + known to relocate or unmap system mapping, therefore this config can't be
> > > + enabled universally.
> >
> > Thanks for adding this.
> >
> > Similar comments to the Kconfig update - you are listing features that do not
> > exist yet, please just list what you're doing, specifically, and avoid the vague
> > 'etc.', we don't need to be vague.
> >
> OK, I will remove etc and list the known mappings here.
>
> > As per the Kconfig comment - you need to be a lot more clear, I think you're
> > duplicating the text from there to here, so again I suggest something like:
> >
> > WARNING: This feature breaks programs which rely on relocating or
> > unmapping system mappings.
> >
> > Known broken software at the time of writing includes
> > CHECKPOINT_RESTORE, UML, gVisor and rr.
> >
> Sure.
>
> > You also seem to be writing very little here, it's a documentation page, you can
> > be as verbose as you like :)
> >
> > You really need to add some more detail here in general - you aren't explaining
> > why people would want to enable this, what you're mitigating, etc. from that you
> > explain _why_ it doesn't work for some things.
> >
The mseal.rst already includes below regarding the protection/mitigation.
Memory sealing additionally protects the mapping itself against
modifications. This is useful to mitigate memory corruption issues where a
corrupted pointer is passed to a memory management system. For example,
such an attacker primitive can break control-flow integrity guarantees
since read-only memory that is supposed to be trusted can become writable
or .text pages can get remapped. Memory sealing can automatically be
applied by the runtime loader to seal .text and .rodata pages and
applications can additionally seal security critical data at runtime.
I could copy some sections from cover-letter to here, specifically
for special mappings.
>
>
>
>
>
> > You're also not mentioning architectural limitations here, for instance that you
> > can only do this on arches that don't require VDSO relocation and listing
> > known-good arches.
> >
Sure, I will mention the architecture that has this enabled
(x86,arm,uml) -- I don't think there is an architecture limitation
though. mseal is a software feature. The reason why other
architectures don't have it is due to the fact that I don't have the
HW for testing
> > This is a documentation file, you can go wild :) the more information here the
> > better.
> >
> > WARNING
> > =======
> >
> > > +
> > > When not to use mseal
> > > =====================
> > > Applications can apply sealing to any virtual memory region from userspace,
> > > --
> > > 2.48.1.601.g30ceb7b040-goog
> > >
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 7/7] mseal, system mappings: update mseal.rst
2025-02-25 22:31 ` Jeff Xu
2025-02-25 22:36 ` Jeff Xu
@ 2025-02-26 5:27 ` Lorenzo Stoakes
1 sibling, 0 replies; 43+ messages in thread
From: Lorenzo Stoakes @ 2025-02-26 5:27 UTC (permalink / raw)
To: Jeff Xu
Cc: akpm, keescook, jannh, torvalds, vbabka, Liam.Howlett,
adhemerval.zanella, oleg, avagin, benjamin, linux-kernel,
linux-hardening, linux-mm, jorgelo, sroettger, hch, ojeda,
thomas.weissschuh, adobriyan, johannes, pedro.falcato, hca,
willy, anna-maria, mark.rutland, linus.walleij, Jason, deller,
rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen, mingo,
ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes, groeck,
mpe, aleksandr.mikhalitsyn, mike.rapoport
On Tue, Feb 25, 2025 at 02:31:15PM -0800, Jeff Xu wrote:
> On Mon, Feb 24, 2025 at 10:07 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Mon, Feb 24, 2025 at 05:45:13PM +0000, jeffxu@chromium.org wrote:
> > > From: Jeff Xu <jeffxu@chromium.org>
> > >
> > > Update memory sealing documentation to include details about system
> > > mappings.
> > >
> > > Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> > > ---
> > > Documentation/userspace-api/mseal.rst | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst
> > > index 41102f74c5e2..10147281bf2d 100644
> > > --- a/Documentation/userspace-api/mseal.rst
> > > +++ b/Documentation/userspace-api/mseal.rst
> > > @@ -130,6 +130,13 @@ Use cases
> > >
> > > - Chrome browser: protect some security sensitive data structures.
> > >
> > > +- System mappings:
> > > + If supported by an architecture (via CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS),
> > > + the CONFIG_MSEAL_SYSTEM_MAPPINGS seals system mappings, e.g. vdso, vvar,
> > > + uprobes, sigpage, vectors, etc. CHECKPOINT_RESTORE, UML, gVisor, rr are
> > > + known to relocate or unmap system mapping, therefore this config can't be
> > > + enabled universally.
> >
> > Thanks for adding this.
> >
> > Similar comments to the Kconfig update - you are listing features that do not
> > exist yet, please just list what you're doing, specifically, and avoid the vague
> > 'etc.', we don't need to be vague.
> >
> OK, I will remove etc and list the known mappings here.
>
> > As per the Kconfig comment - you need to be a lot more clear, I think you're
> > duplicating the text from there to here, so again I suggest something like:
> >
> > WARNING: This feature breaks programs which rely on relocating or
> > unmapping system mappings.
> >
> > Known broken software at the time of writing includes
> > CHECKPOINT_RESTORE, UML, gVisor and rr.
> >
> Sure.
>
> > You also seem to be writing very little here, it's a documentation page, you can
> > be as verbose as you like :)
> >
> > You really need to add some more detail here in general - you aren't explaining
> > why people would want to enable this, what you're mitigating, etc. from that you
> > explain _why_ it doesn't work for some things.
> >
> The mseal.rst already includes below regarding the protection/mitigation.
Not for system sealing. Please add it even if you think it'd be
duplicative. Thanks.
>
>
>
>
>
> > You're also not mentioning architectural limitations here, for instance that you
> > can only do this on arches that don't require VDSO relocation and listing
> > known-good arches.
> >
> > This is a documentation file, you can go wild :) the more information here the
> > better.
> >
> > WARNING
> > =======
> >
> > > +
> > > When not to use mseal
> > > =====================
> > > Applications can apply sealing to any virtual memory region from userspace,
> > > --
> > > 2.48.1.601.g30ceb7b040-goog
> > >
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v6 7/7] mseal, system mappings: update mseal.rst
2025-02-25 22:36 ` Jeff Xu
@ 2025-02-26 6:08 ` Lorenzo Stoakes
0 siblings, 0 replies; 43+ messages in thread
From: Lorenzo Stoakes @ 2025-02-26 6:08 UTC (permalink / raw)
To: Jeff Xu
Cc: akpm, keescook, jannh, torvalds, vbabka, Liam.Howlett,
adhemerval.zanella, oleg, avagin, benjamin, linux-kernel,
linux-hardening, linux-mm, jorgelo, sroettger, hch, ojeda,
thomas.weissschuh, adobriyan, johannes, pedro.falcato, hca,
willy, anna-maria, mark.rutland, linus.walleij, Jason, deller,
rdunlap, davem, peterx, f.fainelli, gerg, dave.hansen, mingo,
ardb, mhocko, 42.hyeyoo, peterz, ardb, enh, rientjes, groeck,
mpe, aleksandr.mikhalitsyn, mike.rapoport
On Tue, Feb 25, 2025 at 02:36:52PM -0800, Jeff Xu wrote:
> On Tue, Feb 25, 2025 at 2:31 PM Jeff Xu <jeffxu@chromium.org> wrote:
> >
> > On Mon, Feb 24, 2025 at 10:07 PM Lorenzo Stoakes
> > <lorenzo.stoakes@oracle.com> wrote:
> > >
> > > On Mon, Feb 24, 2025 at 05:45:13PM +0000, jeffxu@chromium.org wrote:
> > > > From: Jeff Xu <jeffxu@chromium.org>
> > > >
> > > > Update memory sealing documentation to include details about system
> > > > mappings.
> > > >
> > > > Signed-off-by: Jeff Xu <jeffxu@chromium.org>
> > > > ---
> > > > Documentation/userspace-api/mseal.rst | 7 +++++++
> > > > 1 file changed, 7 insertions(+)
> > > >
> > > > diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst
> > > > index 41102f74c5e2..10147281bf2d 100644
> > > > --- a/Documentation/userspace-api/mseal.rst
> > > > +++ b/Documentation/userspace-api/mseal.rst
> > > > @@ -130,6 +130,13 @@ Use cases
> > > >
> > > > - Chrome browser: protect some security sensitive data structures.
> > > >
> > > > +- System mappings:
> > > > + If supported by an architecture (via CONFIG_ARCH_HAS_MSEAL_SYSTEM_MAPPINGS),
> > > > + the CONFIG_MSEAL_SYSTEM_MAPPINGS seals system mappings, e.g. vdso, vvar,
> > > > + uprobes, sigpage, vectors, etc. CHECKPOINT_RESTORE, UML, gVisor, rr are
> > > > + known to relocate or unmap system mapping, therefore this config can't be
> > > > + enabled universally.
> > >
> > > Thanks for adding this.
> > >
> > > Similar comments to the Kconfig update - you are listing features that do not
> > > exist yet, please just list what you're doing, specifically, and avoid the vague
> > > 'etc.', we don't need to be vague.
> > >
> > OK, I will remove etc and list the known mappings here.
> >
> > > As per the Kconfig comment - you need to be a lot more clear, I think you're
> > > duplicating the text from there to here, so again I suggest something like:
> > >
> > > WARNING: This feature breaks programs which rely on relocating or
> > > unmapping system mappings.
> > >
> > > Known broken software at the time of writing includes
> > > CHECKPOINT_RESTORE, UML, gVisor and rr.
> > >
> > Sure.
> >
> > > You also seem to be writing very little here, it's a documentation page, you can
> > > be as verbose as you like :)
> > >
> > > You really need to add some more detail here in general - you aren't explaining
> > > why people would want to enable this, what you're mitigating, etc. from that you
> > > explain _why_ it doesn't work for some things.
> > >
>
> The mseal.rst already includes below regarding the protection/mitigation.
But not specifically why you'd want to do that for system mappings.
I guess you mean it's _implied_ that it would be sensible to do this for
system mappings and I suppose again I'm asking for emphasis on this
>
> Memory sealing additionally protects the mapping itself against
> modifications. This is useful to mitigate memory corruption issues where a
> corrupted pointer is passed to a memory management system. For example,
> such an attacker primitive can break control-flow integrity guarantees
> since read-only memory that is supposed to be trusted can become writable
> or .text pages can get remapped. Memory sealing can automatically be
> applied by the runtime loader to seal .text and .rodata pages and
> applications can additionally seal security critical data at runtime.
Right, this is exactly the kind of thing you need but obviously adjusted to
mention system mappings, and why they are especially problematic (again -at
risk of sounding like a security idiot here :) - I'm guessing sand box
breakout, rop, syscall, find a way to write, and un-sandboxed code now does
unexpected stuff).
>
> I could copy some sections from cover-letter to here, specifically
> for special mappings.
yeah that'd be good, while the cover letter will be copied in automatically
by Andrew to the series, it's really nice to have it at a glance in docs
for users who go to docs.kernel.org etc.
>
> >
> >
> >
> >
> >
> > > You're also not mentioning architectural limitations here, for instance that you
> > > can only do this on arches that don't require VDSO relocation and listing
> > > known-good arches.
> > >
> Sure, I will mention the architecture that has this enabled
> (x86,arm,uml) -- I don't think there is an architecture limitation
> though. mseal is a software feature. The reason why other
> architectures don't have it is due to the fact that I don't have the
> HW for testing
Yeah I mean just list the arches that you've tested, more or less.
Thanks!
>
>
>
> > > This is a documentation file, you can go wild :) the more information here the
> > > better.
> > >
> > > WARNING
> > > =======
> > >
> > > > +
> > > > When not to use mseal
> > > > =====================
> > > > Applications can apply sealing to any virtual memory region from userspace,
> > > > --
> > > > 2.48.1.601.g30ceb7b040-goog
> > > >
^ permalink raw reply [flat|nested] 43+ messages in thread
end of thread, other threads:[~2025-02-26 6:09 UTC | newest]
Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-24 17:45 [PATCH v6 0/7] mseal system mappings jeffxu
2025-02-24 17:45 ` [PATCH v6 1/7] mseal, system mappings: kernel config and header change jeffxu
2025-02-24 18:21 ` Dave Hansen
2025-02-24 18:44 ` Jeff Xu
2025-02-24 18:52 ` Dave Hansen
2025-02-24 18:55 ` Kees Cook
2025-02-24 18:59 ` Jeff Xu
2025-02-24 19:02 ` Dave Hansen
2025-02-24 19:10 ` Liam R. Howlett
2025-02-24 19:22 ` Jeff Xu
2025-02-24 19:32 ` Liam R. Howlett
2025-02-24 19:33 ` Jeff Xu
2025-02-24 19:26 ` Kees Cook
2025-02-24 19:34 ` Jeff Xu
2025-02-24 19:03 ` Liam R. Howlett
2025-02-24 19:07 ` Jeff Xu
2025-02-24 19:18 ` Liam R. Howlett
2025-02-24 19:39 ` Jeff Xu
2025-02-24 20:13 ` Liam R. Howlett
2025-02-24 19:10 ` Jeff Xu
2025-02-24 19:25 ` Kees Cook
2025-02-24 19:42 ` Jeff Xu
2025-02-24 20:12 ` Liam R. Howlett
2025-02-24 21:08 ` Jeff Xu
2025-02-24 17:45 ` [PATCH v6 2/7] selftests: x86: test_mremap_vdso: skip if vdso is msealed jeffxu
2025-02-24 19:02 ` Kees Cook
2025-02-24 21:04 ` Liam R. Howlett
2025-02-24 21:24 ` Jeff Xu
2025-02-24 17:45 ` [PATCH v6 3/7] mseal, system mappings: enable x86-64 jeffxu
2025-02-24 21:06 ` Liam R. Howlett
2025-02-24 17:45 ` [PATCH v6 4/7] mseal, system mappings: enable arm64 jeffxu
2025-02-24 17:45 ` [PATCH v6 5/7] mseal, system mappings: enable uml architecture jeffxu
2025-02-24 17:45 ` [PATCH v6 6/7] mseal, system mappings: uprobe mapping jeffxu
2025-02-24 17:45 ` [PATCH v6 7/7] mseal, system mappings: update mseal.rst jeffxu
2025-02-24 19:04 ` Kees Cook
2025-02-24 20:26 ` Liam R. Howlett
2025-02-24 21:06 ` Jeff Xu
2025-02-24 21:54 ` Jeff Xu
2025-02-25 6:07 ` Lorenzo Stoakes
2025-02-25 22:31 ` Jeff Xu
2025-02-25 22:36 ` Jeff Xu
2025-02-26 6:08 ` Lorenzo Stoakes
2025-02-26 5:27 ` Lorenzo Stoakes
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox