* VM_BUG_ON_VMA in split_huge_pmd_locked: huge PMD doesn't cover full VMA range
@ 2026-02-25 13:43 Sasha Levin
2026-02-25 13:50 ` David Hildenbrand (Arm)
0 siblings, 1 reply; 2+ messages in thread
From: Sasha Levin @ 2026-02-25 13:43 UTC (permalink / raw)
To: linux-mm, linux-kernel
Cc: Lorenzo Stoakes, Andrew Morton, David Hildenbrand, Hugh Dickins,
Zi Yan, Gavin Guo
Hi,
I've been playing around with improvements to syzkaller locally, and hit the
following crash on v7.0-rc1:
vma ffff888109f988c0 start 0000555580cc0000 end 0000555580ce2000 mm ffff8881048e1780
prot 8000000000000025 anon_vma ffff88810b20f100 vm_ops 0000000000000000
pgoff 555580cc0 file 0000000000000000 private_data 0000000000000000
refcnt 1
flags: 0x100073(read|write|mayread|maywrite|mayexec|account)
------------[ cut here ]------------
kernel BUG at mm/huge_memory.c:2999!
Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
CPU: 3 UID: 0 PID: 15162 Comm: syz.7.3120 Tainted: G N 7.0.0-rc1-00001-gc5447a46efed #51 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
RIP: 0010:split_huge_pmd_locked+0x11a0/0x2f80
RSP: 0018:ffff888053cc7338 EFLAGS: 00010282
RAX: 0000000000000126 RBX: ffff888109f988d0 RCX: 0000000000000000
RDX: 0000000000000126 RSI: 0000000000000000 RDI: ffffed100a798e43
RBP: 0000555580cc0000 R08: ffffffffa3e62775 R09: 0000000000000001
R10: 0000000000000005 R11: 0000000000000000 R12: 0000000000000080
R13: 0000000000000000 R14: 0000555580c00000 R15: ffff888109f988c0
FS: 0000000000000000(0000) GS:ffff88816f701000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe2ac1907a0 CR3: 0000000021c91000 CR4: 0000000000750ef0
PKRU: 80000000
Call Trace:
<TASK>
__split_huge_pmd+0x201/0x350
unmap_page_range+0xa6a/0x3db0
unmap_single_vma+0x14b/0x230
unmap_vmas+0x28f/0x580
exit_mmap+0x203/0xa80
__mmput+0x11b/0x540
mmput+0x81/0xa0
do_exit+0x7b9/0x2c60
do_group_exit+0xd5/0x2a0
get_signal+0x1fdc/0x2340
arch_do_signal_or_restart+0x93/0x790
exit_to_user_mode_loop+0x84/0x480
do_syscall_64+0x4df/0x700
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Kernel panic - not syncing: Fatal exception
The assertion VM_BUG_ON_VMA(vma->vm_start > haddr, vma) fires at
mm/huge_memory.c:2999 because a huge PMD exists at PMD-aligned address
0x555580c00000 but the VMA only covers [0x555580cc0000, 0x555580ce2000):
a 136KB region starting 768KB past the PMD base.
---
The following analysis was performed with the help of an LLM:
The crash path is:
exit_mmap -> unmap_vmas -> unmap_page_range -> zap_pmd_range
-> sees pmd_is_huge(*pmd) is true
-> range doesn't cover full HPAGE_PMD_SIZE
-> calls __split_huge_pmd()
-> haddr = address & HPAGE_PMD_MASK = 0x555580c00000
-> __split_huge_pmd_locked()
-> VM_BUG_ON_VMA(vma->vm_start > haddr) fires
because 0x555580cc0000 > 0x555580c00000
The root cause appears to be remove_migration_pmd() (mm/huge_memory.c:4906).
This function reinstalls a huge PMD via set_pmd_at() after migration
completes, but it never checks whether the VMA still covers the full
PMD-aligned 2MB range.
Every other code path that installs a huge PMD validates VMA boundaries:
- do_huge_pmd_anonymous_page(): thp_vma_suitable_order()
- collapse_huge_page(): hugepage_vma_revalidate()
- MADV_COLLAPSE: hugepage_vma_revalidate()
- do_set_pmd() (shmem/tmpfs): thp_vma_suitable_order()
remove_migration_pmd() checks none of these.
The suspected race window is:
1. VMA [A, A+2MB) has a THP. Migration starts, PMD becomes a migration
entry.
2. Concurrently, __split_vma() runs under mmap_write_lock. It calls
vma_adjust_trans_huge() which acquires the PMD lock, splits the PMD
migration entry into 512 PTE migration entries, and releases the PMD
lock. Then VMA boundaries are modified (e.g., vma->vm_start = A+X).
3. remove_migration_ptes() runs via rmap_walk_anon() WITHOUT mmap_lock
(only the anon_vma lock). page_vma_mapped_walk() acquires the PMD
lock. If it wins the lock BEFORE step 2's split, it finds the PMD
migration entry still intact and returns with pvmw->pte == NULL.
4. remove_migration_pmd() then reinstalls the huge PMD via set_pmd_at()
without checking that the VMA (whose boundaries may have already been
modified in step 2) still covers the full PMD range.
5. Later, exit_mmap -> unmap_page_range -> zap_pmd_range encounters the
huge PMD, calls __split_huge_pmd(), and the VM_BUG_ON_VMA fires
because vma->vm_start no longer aligns with the PMD base.
The fix should add a VMA boundary check in remove_migration_pmd(). If
haddr < vma->vm_start or haddr + HPAGE_PMD_SIZE > vma->vm_end, the
function should split the PMD migration entry into PTE-level migration
entries instead of reinstalling the huge PMD, allowing PTE-level removal
to handle each page individually.
--
Thanks,
Sasha
^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: VM_BUG_ON_VMA in split_huge_pmd_locked: huge PMD doesn't cover full VMA range
2026-02-25 13:43 VM_BUG_ON_VMA in split_huge_pmd_locked: huge PMD doesn't cover full VMA range Sasha Levin
@ 2026-02-25 13:50 ` David Hildenbrand (Arm)
0 siblings, 0 replies; 2+ messages in thread
From: David Hildenbrand (Arm) @ 2026-02-25 13:50 UTC (permalink / raw)
To: Sasha Levin, linux-mm, linux-kernel
Cc: Lorenzo Stoakes, Andrew Morton, Hugh Dickins, Zi Yan, Gavin Guo
On 2/25/26 14:43, Sasha Levin wrote:
> Hi,
>
> I've been playing around with improvements to syzkaller locally, and hit
> the following crash on v7.0-rc1:
>
> [...]
>
> The assertion VM_BUG_ON_VMA(vma->vm_start > haddr, vma) fires at
> mm/huge_memory.c:2999 because a huge PMD exists at PMD-aligned address
> 0x555580c00000 but the VMA only covers [0x555580cc0000, 0x555580ce2000):
> a 136KB region starting 768KB past the PMD base.
Do you have a reproducer and would this trigger before v7.0-rc1?
Lorenzo did some changes around anon_vma locking recently, maybe related
to that.
--
Cheers,
David
^ permalink raw reply [flat|nested] 2+ messages in thread