linux-mm.kvack.org archive mirror
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: jeffxu@chromium.org
Cc: akpm@linux-foundation.org, keescook@chromium.org,
	jannh@google.com, torvalds@linux-foundation.org, vbabka@suse.cz,
	Liam.Howlett@oracle.com, adhemerval.zanella@linaro.org,
	oleg@redhat.com, avagin@gmail.com, benjamin@sipsolutions.net,
	linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
	jorgelo@chromium.org, sroettger@google.com, hch@lst.de,
	ojeda@kernel.org, thomas.weissschuh@linutronix.de,
	adobriyan@gmail.com, johannes@sipsolutions.net,
	pedro.falcato@gmail.com, hca@linux.ibm.com, willy@infradead.org,
	anna-maria@linutronix.de, mark.rutland@arm.com,
	linus.walleij@linaro.org, Jason@zx2c4.com, deller@gmx.de,
	rdunlap@infradead.org, davem@davemloft.net, peterx@redhat.com,
	f.fainelli@gmail.com, gerg@kernel.org,
	dave.hansen@linux.intel.com, mingo@kernel.org, ardb@kernel.org,
	mhocko@suse.com, 42.hyeyoo@gmail.com, peterz@infradead.org,
	ardb@google.com, enh@google.com, rientjes@google.com,
	groeck@chromium.org, mpe@ellerman.id.au,
	aleksandr.mikhalitsyn@canonical.com, mike.rapoport@gmail.com,
	Jeff Xu <jeffxu@google.com>
Subject: Re: [PATCH v8 0/7] mseal system mappings
Date: Mon, 3 Mar 2025 11:50:28 +0000
Message-ID: <1e84edef-03c4-4544-81c1-1006bc9beee0@lucifer.local>
In-Reply-To: <20250303050921.3033083-1-jeffxu@google.com>

Great, nice descriptions, thanks!

On Mon, Mar 03, 2025 at 05:09:14AM +0000, jeffxu@chromium.org wrote:
> From: Jeff Xu <jeffxu@google.com>
>
> This is the V8 version, addressing comments from V7, with no change
> to the code logic.
>
> -------------------------------------------------------------------
> As discussed during mseal() upstream process [1], mseal() protects
> the VMAs of a given virtual memory range against modifications, such
> as the read/write (RW) and no-execute (NX) bits. For complete
> descriptions of memory sealing, please see mseal.rst [2].
>
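
As a rough illustration of the behaviour described above, here is a minimal
userspace sketch, not taken from the series. It assumes a 64-bit kernel built
with mseal() and assumes syscall number 462 (used only if the libc headers
do not define __NR_mseal). try_seal() and seal_blocks_mprotect() are
hypothetical helper names.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 462 is the mseal syscall number when the headers lack it. */
#ifndef __NR_mseal
#define __NR_mseal 462
#endif

/* Hypothetical helper: invoke mseal(addr, len, flags=0) via syscall(2). */
static int try_seal(void *addr, size_t len)
{
	return (int)syscall(__NR_mseal, addr, len, 0UL);
}

/*
 * Seal one read-only anonymous page, then try to make it writable.
 * Returns  1 if the sealed page rejected mprotect() with EPERM,
 *          0 if the page could not be set up or sealed (e.g. ENOSYS),
 *         -1 if sealing succeeded but mprotect() was not rejected.
 */
static int seal_blocks_mprotect(void)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;
	int rc;

	assert(page > 0);
	p = mmap(NULL, (size_t)page, PROT_READ,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return 0;

	if (try_seal(p, (size_t)page) != 0) {
		/* Not sealed, so the mapping can still be released. */
		munmap(p, (size_t)page);
		return 0;
	}

	errno = 0;
	rc = mprotect(p, (size_t)page, PROT_READ | PROT_WRITE);
	/* The sealed page is left mapped on purpose: munmap is blocked too. */
	return (rc == -1 && errno == EPERM) ? 1 : -1;
}
```

Calling seal_blocks_mprotect() from a small main() and printing the result is
enough to try this out. A return of 0 only means the running kernel (or a
sandbox) did not accept the mseal call.
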
> mseal() is useful for mitigating memory corruption issues where a
> corrupted pointer is passed to a memory management system. For
> example, such an attacker primitive can break control-flow integrity
> guarantees since read-only memory that is supposed to be trusted can
> become writable or .text pages can get remapped.
>
> The system mappings are read-only; memory sealing can protect them
> from ever being changed to writable or being unmapped/remapped with
> different attributes.
>
> System mappings such as vdso, vvar, vvar_vclock,
> vectors (arm compat-mode), and sigpage (arm compat-mode)
> are created by the kernel during program initialization, and could
> be sealed after creation.
>
> Unlike the aforementioned mappings, the uprobe mapping is not
> established during program startup. However, its lifetime is the same
> as the process's lifetime [3]. It could be sealed from creation.
>
> The vsyscall on x86-64 uses a special address (0xffffffffff600000),
> which is outside the mm managed range. This means mprotect, munmap, and
> mremap won't work on the vsyscall. Since sealing doesn't enhance
> the vsyscall's security, it is skipped in this patch. If we ever seal
> the vsyscall, it would probably be only for decorative purposes, i.e. showing
> the 'sl' flag in the /proc/pid/smaps. For this patch, it is ignored.
>
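
For readers who want to check the 'sl' flag mentioned above from userspace,
here is a minimal sketch. It is not the selftest from this series.
vmflags_has() and mapping_is_sealed() are hypothetical helper names, and the
parsing assumes the usual smaps layout: a mapping header line starting with a
lowercase hex address, followed by "Field:" lines including "VmFlags:".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if a "VmFlags:" line contains exactly the token `flag`. */
static int vmflags_has(const char *line, const char *flag)
{
	static const char prefix[] = "VmFlags:";
	const char *p, *start;
	size_t flen;

	assert(line != NULL && flag != NULL);
	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return 0;

	flen = strlen(flag);
	p = line + sizeof(prefix) - 1;
	while (*p) {
		/* Skip separators, then measure the next token. */
		while (*p == ' ' || *p == '\t' || *p == '\n')
			p++;
		start = p;
		while (*p && *p != ' ' && *p != '\t' && *p != '\n')
			p++;
		if (flen > 0 && (size_t)(p - start) == flen &&
		    strncmp(start, flag, flen) == 0)
			return 1;
	}
	return 0;
}

/*
 * Scan /proc/self/smaps for the first mapping whose header line contains
 * `name` (e.g. "[vdso]") and report whether its VmFlags include "sl".
 * Returns 1 if sealed, 0 if not sealed, -1 if the mapping was not found
 * or smaps could not be read.
 */
static int mapping_is_sealed(const char *name)
{
	char line[8192];
	int in_target = 0;
	int result = -1;
	FILE *f = fopen("/proc/self/smaps", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		char c = line[0];

		/* Header lines start with a lowercase hex address. */
		if ((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')) {
			in_target = strstr(line, name) != NULL;
			continue;
		}
		if (in_target && strncmp(line, "VmFlags:", 8) == 0) {
			result = vmflags_has(line, "sl");
			break;
		}
	}
	fclose(f);
	return result;
}
```

With CONFIG_MSEAL_SYSTEM_MAPPINGS enabled on a supported architecture,
mapping_is_sealed("[vdso]") would be expected to return 1. On other kernels
it should return 0, or -1 if no such mapping is listed.
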
> It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may
> alter the system mappings during restore operations. UML (User Mode Linux),
> gVisor, and rr are also known to change the vdso/vvar mappings.
> Consequently, this feature cannot be universally enabled across all
> systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default.
>
> To support mseal of system mappings, architectures must define
> CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS and update their special
> mappings calls to pass the mseal flag. Additionally, architectures must
> confirm they do not unmap/remap system mappings during the process
> lifetime. The existence of this flag for an architecture implies that
> it does not require the remapping of these system mappings during
> process lifetime, so sealing these mappings is safe from a kernel
> perspective.
>
> This version covers the x86-64 and arm64 architectures as a minimum viable
> feature.
>
> While no specific CPU hardware features are required to enable this
> feature on an architecture, memory sealing requires a 64-bit kernel. Other
> architectures can choose whether or not to adopt this feature. Currently,
> I'm not aware of any instances in the kernel code that actively
> munmap/mremap a system mapping without a request from userspace. The PPC
> does call munmap when _install_special_mapping fails for vdso; however,
> it's uncertain if this will ever fail for PPC - this needs to be
> investigated by PPC in the future [4]. The UML kernel can add this support
> when KUnit tests require it [5].
>
> In this version, we've improved the handling of system mapping sealing
> relative to previous versions. Instead of modifying the
> _install_special_mapping function itself, which would affect all
> architectures, we now call
> _install_special_mapping with a sealing flag only within the specific
> architecture that requires it. This targeted approach offers two key
> advantages: 1) It limits the code change's impact to the necessary
> architectures, and 2) It aligns with the software architecture by keeping
> the core memory management within the mm layer, while delegating the
> decision of sealing system mappings to the individual architecture, which
> is particularly relevant since 32-bit architectures never require sealing.
>
> Prior to this patch series, we explored sealing special mappings from
> userspace using glibc's dynamic linker. This approach revealed several
> issues:
> - The PT_LOAD header may report an incorrect length for vdso (smaller
>   than its actual size). The dynamic linker, which relies on PT_LOAD
>   information to determine mapping size, would then split and partially
>   seal the vdso mapping. Since each architecture has its own vdso/vvar
>   code, fixing this in the kernel would require going through each
>   architecture. Our initial goal was to enable sealing readonly mappings,
>   e.g. .text, across all architectures; sealing vdso from the kernel at
>   creation appears to be simpler than sealing vdso in glibc.
> - The [vvar] mapping header only contains address information, not length
>   information. Similar issues might exist for other special mappings.
> - Mappings like uprobe are not covered by the dynamic linker,
>   and there is no effective solution for them.
>
> This feature's security enhancements will benefit ChromeOS, Android,
> and other high security systems.
>
> Testing:
> This feature was tested on ChromeOS and Android for both x86-64 and ARM64.
> - Enable sealing and verify vdso/vvar, sigpage, vector are sealed properly,
>   i.e. "sl" shown in the smaps for those mappings, and mremap is blocked.
> - Passing various automation tests (e.g. pre-checkin) on ChromeOS and
>   Android to ensure the sealing doesn't affect the functionality of
>   Chromebook and Android phone.
>
> I also tested the feature on Ubuntu on x86-64:
> - With config disabled, vdso/vvar is not sealed.
> - With config enabled, vdso/vvar is sealed, booting up Ubuntu is OK, and
>   normal operations such as browsing the web and opening/editing docs are OK.
>
> Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1]
> Link: Documentation/userspace-api/mseal.rst [2]
> Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3]
> Link: https://lore.kernel.org/all/CABi2SkV6JJwJeviDLsq9N4ONvQ=EFANsiWkgiEOjyT9TQSt+HA@mail.gmail.com/ [4]
> Link: https://lore.kernel.org/all/202502251035.239B85A93@keescook/ [5]
>
> -------------------------------------------
> History:
>
> V8:
>   - Change ARCH_SUPPORTS_MSEAL_X to ARCH_SUPPORTS_MSEAL_X (Liam R. Howlett)
>   - Update comments in Kconfig and mseal.rst (Lorenzo Stoakes, Liam R. Howlett)
>   - Change patch header prefix to "mseal sysmap" (Lorenzo Stoakes)
>   - Remove "vm_flags =" (Kees Cook, Liam R. Howlett, Oleg Nesterov)
>   - Drop uml architecture (Lorenzo Stoakes, Kees Cook)
>   - Add a selftest to verify system mappings are sealed (Lorenzo Stoakes)
>
> V7:
>   https://lore.kernel.org/all/20250224225246.3712295-1-jeffxu@google.com/
>   - Remove cover letter from the first patch (Liam R. Howlett)
>   - Change macro name to VM_SEALED_SYSMAP (Liam R. Howlett)
>   - logging and fclose() in selftest (Liam R. Howlett)
>
> V6:
>   https://lore.kernel.org/all/20250224174513.3600914-1-jeffxu@google.com/
>   - mseal.rst: fix a typo (Randy Dunlap)
>   - security/Kconfig: add rr into note (Liam R. Howlett)
>   - remove mseal_system_mappings() and use macro instead (Liam R. Howlett)
>   - mseal.rst: add incompatible userland software (Lorenzo Stoakes)
>   - remove RFC from title (Kees Cook)
>
> V5:
>   https://lore.kernel.org/all/20250212032155.1276806-1-jeffxu@google.com/
>   - Remove kernel cmd line (Lorenzo Stoakes)
>   - Add test info (Lorenzo Stoakes)
>   - Add threat model info (Lorenzo Stoakes)
>   - Fix x86 selftest: test_mremap_vdso
>   - Restrict code change to ARM64/x86-64/UM arch only.
>   - Add userprocess.h to include seal_system_mapping().
>   - Remove sealing vsyscall.
>   - Split the patch.
>
> V4:
>   https://lore.kernel.org/all/20241125202021.3684919-1-jeffxu@google.com/
>   - ARCH_HAS_SEAL_SYSTEM_MAPPINGS (Lorenzo Stoakes)
>   - test info (Lorenzo Stoakes)
>   - Update mseal.rst (Liam R. Howlett)
>   - Update test_mremap_vdso.c (Liam R. Howlett)
>   - Misc. style, comments, doc update (Liam R. Howlett)
>
> V3:
>   https://lore.kernel.org/all/20241113191602.3541870-1-jeffxu@google.com/
>   - Revert uprobe to v1 logic (Oleg Nesterov)
>   - use CONFIG_SEAL_SYSTEM_MAPPINGS instead of _ALWAYS/_NEVER (Kees Cook)
>   - Move kernel cmd line from fs/exec.c to mm/mseal.c and
>     misc. (Liam R. Howlett)
>
> V2:
>   https://lore.kernel.org/all/20241014215022.68530-1-jeffxu@google.com/
>   - Seal uprobe always (Oleg Nesterov)
>   - Update comments and description (Randy Dunlap, Liam R. Howlett, Oleg Nesterov)
>   - Rebase to linux_main
>
> V1:
>   https://lore.kernel.org/all/20241004163155.3493183-1-jeffxu@google.com/
>
> --------------------------------------------------
>
>
> Jeff Xu (7):
>   mseal sysmap: kernel config and header change
>   selftests: x86: test_mremap_vdso: skip if vdso is msealed
>   mseal sysmap: enable x86-64
>   mseal sysmap: enable arm64
>   mseal sysmap: uprobe mapping
>   mseal sysmap: update mseal.rst
>   selftest: test system mappings are sealed.
>
>  Documentation/userspace-api/mseal.rst         |  20 ++++
>  arch/arm64/Kconfig                            |   1 +
>  arch/arm64/kernel/vdso.c                      |  12 +-
>  arch/x86/Kconfig                              |   1 +
>  arch/x86/entry/vdso/vma.c                     |   7 +-
>  include/linux/mm.h                            |  10 ++
>  init/Kconfig                                  |  22 ++++
>  kernel/events/uprobes.c                       |   3 +-
>  security/Kconfig                              |  21 ++++
>  .../mseal_system_mappings/.gitignore          |   2 +
>  .../selftests/mseal_system_mappings/Makefile  |   6 +
>  .../selftests/mseal_system_mappings/config    |   1 +
>  .../mseal_system_mappings/sysmap_is_sealed.c  | 113 ++++++++++++++++++
>  .../testing/selftests/x86/test_mremap_vdso.c  |  43 +++++++
>  14 files changed, 254 insertions(+), 8 deletions(-)
>  create mode 100644 tools/testing/selftests/mseal_system_mappings/.gitignore
>  create mode 100644 tools/testing/selftests/mseal_system_mappings/Makefile
>  create mode 100644 tools/testing/selftests/mseal_system_mappings/config
>  create mode 100644 tools/testing/selftests/mseal_system_mappings/sysmap_is_sealed.c
>
> --
> 2.48.1.711.g2feabab25a-goog
>



Thread overview: 39+ messages
2025-03-03  5:09 jeffxu
2025-03-03  5:09 ` [PATCH v8 1/7] mseal sysmap: kernel config and header change jeffxu
2025-03-03 11:51   ` Lorenzo Stoakes
2025-03-03 15:02   ` Liam R. Howlett
2025-03-03 16:37   ` Kees Cook
2025-03-03 19:29     ` Jeff Xu
2025-03-03  5:09 ` [PATCH v8 2/7] selftests: x86: test_mremap_vdso: skip if vdso is msealed jeffxu
2025-03-03 15:00   ` Liam R. Howlett
2025-03-03  5:09 ` [PATCH v8 3/7] mseal sysmap: enable x86-64 jeffxu
2025-03-03 11:53   ` Lorenzo Stoakes
2025-03-03 12:01   ` Lorenzo Stoakes
2025-03-03 19:34     ` Jeff Xu
2025-03-03 15:03   ` Liam R. Howlett
2025-03-03 16:38   ` Kees Cook
2025-03-03  5:09 ` [PATCH v8 4/7] mseal sysmap: enable arm64 jeffxu
2025-03-03 11:53   ` Lorenzo Stoakes
2025-03-03 15:04   ` Liam R. Howlett
2025-03-03 16:39   ` Kees Cook
2025-03-03  5:09 ` [PATCH v8 5/7] mseal sysmap: uprobe mapping jeffxu
2025-03-03  6:28   ` Oleg Nesterov
2025-03-03 11:54   ` Lorenzo Stoakes
2025-03-03 15:04   ` Liam R. Howlett
2025-03-03 16:39   ` Kees Cook
2025-03-03  5:09 ` [PATCH v8 6/7] mseal sysmap: update mseal.rst jeffxu
2025-03-03 11:57   ` Lorenzo Stoakes
2025-03-03 15:05   ` Liam R. Howlett
2025-03-03  5:09 ` [PATCH v8 7/7] selftest: test system mappings are sealed jeffxu
2025-03-03 12:08   ` Lorenzo Stoakes
2025-03-03 16:43     ` Lorenzo Stoakes
2025-03-03 16:48       ` Kees Cook
2025-03-03 19:46       ` Jeff Xu
2025-03-03 16:47     ` Kees Cook
2025-03-03 16:49       ` Lorenzo Stoakes
2025-03-03 17:01   ` Kees Cook
2025-03-03 20:20     ` Jeff Xu
2025-03-04 20:53       ` Jeff Xu
2025-03-03 11:50 ` Lorenzo Stoakes [this message]
2025-03-03 14:59 ` [PATCH v8 0/7] mseal system mappings Liam R. Howlett
2025-03-03 16:33   ` Kees Cook
