Hello,

I am reporting a reproducible WARNING in vma_modify() at mm/vma.c:830,
triggered via the mseal(2) syscall on Linux 7.0.0-rc5. The bug was
discovered using Syzkaller-based fuzzing.

REPORTER
--------
Antonius / Blue Dragon Security
https://bluedragonsec.com
https://github.com/bluedragonsecurity

NOTE ON RELATIONSHIP TO KNOWN BUGS
-----------------------------------
The VM_WARN_ON_VMG at mm/vma.c:830 inside vma_merge_existing_range() has
been previously encountered via madvise()+OOM conditions (reported by
syzbot+46423ed8fa1f1148c6e4 and Brad Spengler; addressed by Lorenzo's
patch "mm: abort vma_modify() on merge out of memory failure").

This report describes a DISTINCT trigger via mseal(2) that:

  1. Does NOT require fault injection or OOM pressure
  2. Is 100% reproducible on every run (fires within 1 second)
  3. Goes through a different call path:
     do_mseal() -> mseal_apply() rather than madvise_walk_vmas()
  4. Is triggered by VM_SEALED flag state inconsistency across VMAs,
     not by a failed merge commit

I could not find a prior LKML report or syzbot entry for this specific
mseal(2) trigger.

SUMMARY
-------
File:    mm/vma.c, line 830
Func:    vma_merge_existing_range()
Trigger: mseal() spanning two adjacent VMAs where the first has
         VM_SEALED set and the second does not
Via:     mseal(2) -> do_mseal() -> mseal_apply() -> vma_modify_flags()
         -> vma_modify() -> vma_merge_existing_range() -> VM_WARN_ON_VMG

AFFECTED VERSIONS
-----------------
Linux 7.0-rc3 -- confirmed (original fuzzing target)
Linux 7.0-rc4 -- confirmed (mm/vma.c unchanged rc3->rc4)
Linux 7.0-rc5 -- confirmed (mm/vma.c unchanged rc4->rc5)
Linux 6.x     -- NOT affected (mm/vma.c rewritten for 7.0)

DMESG OUTPUT (Linux 7.0.0-rc5, trimmed)
----------------------------------------
[ 1680.275764] ------------[ cut here ]------------
[ 1680.275771] WARNING: mm/vma.c:830 at vma_modify+0x35b/0x2190
[ 1680.275808] CPU: 0 UID: 1000 PID: 1661 Comm: repro_mseal_vma
[ 1680.275826] Tainted: [W]=WARN 7.0.0-rc5 #1 PREEMPT(lazy)
[ 1680.275969] Call Trace:
[ 1680.275975]
[ 1680.276030]  vma_modify_flags+0x24c/0x3c0
[ 1680.276085]  do_mseal+0x489/0x860
[ 1680.276136]  __x64_sys_mseal+0x73/0xb0
[ 1680.276187]  do_syscall_64+0x111/0x690
[ 1680.276207]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 1680.276394] ---[ end trace 0000000000000000 ]---
[ 1680.314910] vmg dumped because: VM_WARN_ON_VMG(middle &&
               ((middle != prev && vmg->start != middle->vm_start) ||
                vmg->end > middle->vm_end))

vmg state:
  vmi    [21de6000, 21e83000)
  prev   [21da6000-21de6000) flags: 0x400000000f8 (VM_SEALED set)
  middle [21de6000-21e83000) flags: 0xf8 (NOT sealed)
  vmg->start = 0x21da8000
  vmg->end   = 0x21e16000

ROOT CAUSE
----------
The warning fires in vma_merge_existing_range() at mm/vma.c:830.

Reproduction sequence:

  1. memfd_create("syz-mseal", MFD_CLOEXEC) -> fd1

  2. mmap(0x21da8000, 0xdd000, PROT_SEM, MAP_SHARED|MAP_FIXED, fd1, 0)
     -> establishes VMA at [0x21da8000 .. 0x21e85000)

  3. memfd_create("syz-mseal", MFD_CLOEXEC) -> fd2

  4. mmap(0x21da6000, 0xdd000, PROT_SEM, MAP_SHARED|MAP_FIXED, fd2, 0)
     -> remaps, leaving:
        VMA-A [0x21da6000 - 0x21de6000) pgoff=0    (fd2)
        VMA-B [0x21de6000 - 0x21e83000) pgoff=0x40 (fd2)
        VMA-C [0x21e83000 - 0x21e85000)            (leftover)

  5. mseal(mmap1_result, 0x3e000, 0)
     -> seals [0x21da8000 .. 0x21de5fff]
     -> VMA-A gets VM_SEALED (0x40000000000, matching the prev flags
        in the vmg dump) set

  6. mseal(mmap2_result, 0x70000, 0)
     -> targets [0x21da6000 .. 0x21e15fff]
     -> range spans VMA-A (sealed) and VMA-B (not sealed)

In step 6, do_mseal() calls mseal_apply() per-VMA but ultimately calls
vma_modify_flags() with the original full mseal start address
(0x21da8000). When vma_merge_existing_range() processes VMA-B as
"middle":

  vmg->start       = 0x21da8000  (original mseal start)
  middle->vm_start = 0x21de6000  (VMA-B start)
  middle != prev                 (different VMA objects)
  -> vmg->start != middle->vm_start
  -> VM_WARN_ON_VMG fires at line 830

The invariant violation occurs because the vmg->start passed to
vma_modify_flags() is not clamped to the current VMA's start when the
mseal range spans multiple VMAs with different VM_SEALED states.

IMPACT
------
- Reachable from unprivileged userspace (UID 1000, no capabilities)
- Only memfd_create(2), mmap(2), mseal(2) required
- The warning indicates that vma_merge_existing_range() operates on an
  inconsistent vmg state; in kernels built without CONFIG_DEBUG_VM,
  where VM_WARN_ON_VMG compiles to a no-op, this could result in VMA
  tree state inconsistency
- mseal is a security primitive; invariant violations in its
  application logic are security-relevant

SUGGESTED FIX DIRECTION
------------------------
In do_mseal() or mseal_apply() (mm/mseal.c), when iterating over VMAs in
the mseal range, the vmg->start passed to vma_modify_flags() should be
clamped to max(mseal_start, vma->vm_start) rather than using the
original mseal() start address. This would prevent
vma_merge_existing_range() from receiving a vmg->start that is
inconsistent with vmg->middle when the mseal range spans multiple VMAs
with different seal states.

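To make the clamping idea concrete, here is a minimal userspace sketch
that evaluates the predicate from the vmg dump against the reported
addresses, before and after clamping the start. It is only a model: the
struct and function names (model_vma, model_vmg, warn_would_fire,
clamp_start, demo_reported_state) are hypothetical stand-ins invented
for illustration and are not kernel code or a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; NOT the kernel's vm_area_struct or
 * vma_merge_struct, only the fields the dumped predicate reads. */
struct model_vma {
    unsigned long vm_start;
    unsigned long vm_end;
};

struct model_vmg {
    const struct model_vma *prev;
    const struct model_vma *middle;
    unsigned long start;
    unsigned long end;
};

/* Same boolean expression as the VM_WARN_ON_VMG text in the dmesg dump. */
bool warn_would_fire(const struct model_vmg *vmg)
{
    const struct model_vma *middle = vmg->middle;

    return middle &&
           ((middle != vmg->prev && vmg->start != middle->vm_start) ||
            vmg->end > middle->vm_end);
}

/* The clamp proposed in the text: max(mseal_start, vma->vm_start). */
unsigned long clamp_start(unsigned long mseal_start, unsigned long vm_start)
{
    return mseal_start > vm_start ? mseal_start : vm_start;
}

/* Evaluates the reported state before and after clamping vmg->start. */
void demo_reported_state(void)
{
    const struct model_vma prev = { 0x21da6000UL, 0x21de6000UL };
    const struct model_vma middle = { 0x21de6000UL, 0x21e83000UL };
    struct model_vmg vmg = { &prev, &middle, 0x21da8000UL, 0x21e16000UL };

    printf("as reported: fires=%d\n", warn_would_fire(&vmg));

    vmg.start = clamp_start(vmg.start, middle.vm_start);
    assert(vmg.start == middle.vm_start);
    printf("clamped:     fires=%d\n", warn_would_fire(&vmg));
}
```

Calling demo_reported_state() from a main() reports fires=1 for the
state in the dump and fires=0 once the start is clamped to
middle->vm_start. This only models the dumped predicate; it does not
show whether clamping in mm/mseal.c is the correct fix.
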
Alternatively, the WARN_ON in vma_merge_existing_range() may need to
account for the mseal multi-VMA iteration pattern, though fixing the
caller in do_mseal() seems more appropriate.

REPRODUCER
----------
Compile: gcc -O2 -o repro repro_mseal_vma.c && ./repro
Fires:   Within 1 second, iteration 0, no fault injection, no root

#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef __NR_memfd_create
#define __NR_memfd_create 319
#endif
#ifndef __NR_mseal
#define __NR_mseal 462
#endif

/* Map the syzkaller-style scratch area (0x32 = MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS). */
static void setup(void)
{
    syscall(__NR_mmap, 0x1ffffffff000UL, 0x1000UL, 0UL, 0x32UL, -1, 0UL);
    syscall(__NR_mmap, 0x200000000000UL, 0x1000000UL, 7UL, 0x32UL, -1, 0UL);
    syscall(__NR_mmap, 0x200001000000UL, 0x1000UL, 0UL, 0x32UL, -1, 0UL);
}

static void trigger(void)
{
    intptr_t fd1, fd2, m1, m2;

    /* Steps 1-2: first memfd, mapped with PROT_SEM (8), MAP_SHARED|MAP_FIXED (0x11). */
    memcpy((void*)0x200000000100UL, "syz-mseal\0", 10);
    fd1 = syscall(__NR_memfd_create, 0x200000000100UL, 1UL);
    if (fd1 < 0)
        return;
    m1 = syscall(__NR_mmap, 0x21da8000UL, 0xdd000UL, 8UL, 0x11UL,
                 (intptr_t)fd1, 0UL);

    /* Steps 3-4: second memfd, mapped two pages lower over the first mapping. */
    memcpy((void*)0x200000000100UL, "syz-mseal\0", 10);
    fd2 = syscall(__NR_memfd_create, 0x200000000100UL, 1UL);
    if (fd2 < 0)
        return;
    m2 = syscall(__NR_mmap, 0x21da6000UL, 0xdd000UL, 8UL, 0x11UL,
                 (intptr_t)fd2, 0UL);

    /* Steps 5-6: seal a sub-range, then seal a range spanning sealed and unsealed VMAs. */
    syscall(__NR_mseal, (uint64_t)m1, 0x3e000UL, 0UL);
    syscall(__NR_mseal, (uint64_t)m2, 0x70000UL, 0UL);
}

int main(void)
{
    setup();
    for (int i = 0;; i++) {
        int pid = fork();
        if (pid == 0) {
            trigger();
            _exit(0);
        }
        int st;
        waitpid(pid, &st, 0);
        fprintf(stderr, "[iter %d]\n", i);
    }
}

VERIFICATION
------------
Kernel:   Linux 7.0.0-rc5 #1 SMP PREEMPT_DYNAMIC x86_64
HW:       QEMU Standard PC (i440FX + PIIX), BIOS 1.17.0-debian
User:     UID 1000 (no root required)
Fires:    Iteration 0, consistently, < 1 second
mm/vma.c: Not patched in rc3->rc4 or rc4->rc5

---
Reported-by: Antonius

Please use this tag in the fix commit:
  Reported-by: Antonius
---

If this is a known issue or already fixed, please point me to the
relevant commit.
I was unable to find a matching LKML/syzbot entry for this specific
mseal(2) trigger path.

Thank you,
Antonius
Blue Dragon Security
https://bluedragonsec.com
https://github.com/bluedragonsecurity