* [syzbot] [mm?] WARNING in deferred_split_folio @ 2026-04-01 6:08 ` syzbot 2026-04-01 6:09 ` Request received Yail 2026-04-01 8:10 ` [syzbot] [mm?] WARNING in deferred_split_folio Lance Yang 0 siblings, 2 replies; 19+ messages in thread From: syzbot @ 2026-04-01 6:08 UTC (permalink / raw) To: Liam.Howlett, akpm, baohua, baolin.wang, david, dev.jain, lance.yang, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs, ziy Hello, syzbot found the following issue on: HEAD commit: cf7c3c02fdd0 Add linux-next specific files for 20260330 git tree: linux-next console output: https://syzkaller.appspot.com/x/log.txt?x=154ee46a580000 kernel config: https://syzkaller.appspot.com/x/.config?x=3944d875fa9bfb67 dashboard link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085 compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12c846ba580000 Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/053d3b49a360/disk-cf7c3c02.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/faabb37d41d0/vmlinux-cf7c3c02.xz kernel image: https://storage.googleapis.com/syzbot-assets/8d47fe92aaa8/bzImage-cf7c3c02.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401 __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline] tlb_batch_pages_flush mm/mmu_gather.c:151 [inline] tlb_flush_mmu_free mm/mmu_gather.c:417 [inline] tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424 tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549 exit_mmap+0x498/0x9e0 mm/mmap.c:1313 __mmput+0x118/0x430 kernel/fork.c:1177 exit_mm+0x18e/0x250 kernel/exit.c:581 do_exit+0x6a2/0x22c0 kernel/exit.c:962 do_group_exit+0x21b/0x2d0 kernel/exit.c:1116 __do_sys_exit_group kernel/exit.c:1127 [inline] __se_sys_exit_group 
kernel/exit.c:1125 [inline] __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1125 x64_sys_call+0x221a/0x2240 arch/x86/include/generated/asm/syscalls_64.h:232 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f ------------[ cut here ]------------ 1 WARNING: mm/huge_memory.c:4371 at deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371, CPU#1: syz.3.1110/10500 Modules linked in: CPU: 1 UID: 0 PID: 10500 Comm: syz.3.1110 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026 RIP: 0010:deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371 Code: 31 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d e9 c2 67 8d 09 cc e8 8c 73 93 ff 48 89 df 48 c7 c6 20 5b fc 8b e8 dd 2b f5 fe 90 <0f> 0b 90 e9 d4 fe ff ff e8 9f 7a 8a 09 e8 6a 73 93 ff 48 89 df 48 RSP: 0018:ffffc900047ef540 EFLAGS: 00010046 RAX: 1c05fb65cfaab100 RBX: ffffea0001840000 RCX: 0000000080000001 RDX: 0000000000000002 RSI: ffffffff8e4da1c7 RDI: ffff88807d6f9e80 RBP: ffffc900047ef610 R08: ffff8880b87247d3 R09: 1ffff110170e48fa R10: dffffc0000000000 R11: ffffed10170e48fb R12: ffffea0001840040 R13: 0000000000000000 R14: 0000000000010000 R15: 1ffff920008fdeb0 FS: 00007f32e32a76c0(0000) GS:ffff8881250e8000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f5825757930 CR3: 0000000034ad8000 CR4: 00000000003526f0 Call Trace: <TASK> migrate_folio_move mm/migrate.c:1411 [inline] migrate_folios_move mm/migrate.c:1740 [inline] migrate_pages_batch+0x319f/0x4c40 mm/migrate.c:1996 migrate_pages_sync mm/migrate.c:2026 [inline] migrate_pages+0x1c74/0x2a10 mm/migrate.c:2135 do_mbind mm/mempolicy.c:1614 [inline] kernel_mbind mm/mempolicy.c:1757 [inline] __do_sys_mbind mm/mempolicy.c:1831 [inline] __se_sys_mbind+0xe89/0x10f0 mm/mempolicy.c:1827 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x15f/0xf80 
arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f32e239c819 Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f32e32a7028 EFLAGS: 00000246 ORIG_RAX: 00000000000000ed RAX: ffffffffffffffda RBX: 00007f32e2616090 RCX: 00007f32e239c819 RDX: 0000000000000000 RSI: 0000000000800000 RDI: 0000200000001000 RBP: 00007f32e2432c91 R08: 0000000000000000 R09: 0000000000000002 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f32e2616128 R14: 00007f32e2616090 R15: 00007ffe30c61cc8 </TASK> --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. If the report is already addressed, let syzbot know by replying with: #syz fix: exact-commit-title If you want syzbot to run the reproducer, reply with: #syz test: git://repo/address.git branch-or-commit-hash If you attach or paste a git patch, syzbot will apply it before testing. If you want to overwrite report's subsystems, reply with: #syz set subsystems: new-subsystem (See the list of subsystem names on the web dashboard) If the report is a duplicate of another one, reply with: #syz dup: exact-subject-of-another-report If you want to undo deduplication, reply with: #syz undup ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 6:08 ` [syzbot] [mm?] WARNING in deferred_split_folio syzbot 2026-04-01 6:09 ` Request received Yail @ 2026-04-01 8:10 ` Lance Yang 2026-04-01 8:59 ` Lance Yang 1 sibling, 1 reply; 19+ messages in thread From: Lance Yang @ 2026-04-01 8:10 UTC (permalink / raw) To: syzbot+a7067a757858ac8eb085, usama.arif Cc: Liam.Howlett, akpm, baohua, baolin.wang, david, dev.jain, lance.yang, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs, ziy +Cc Usama On Tue, Mar 31, 2026 at 11:08:27PM -0700, syzbot wrote: >Hello, > >syzbot found the following issue on: > >HEAD commit: cf7c3c02fdd0 Add linux-next specific files for 20260330 >git tree: linux-next >console output: https://syzkaller.appspot.com/x/log.txt?x=154ee46a580000 >kernel config: https://syzkaller.appspot.com/x/.config?x=3944d875fa9bfb67 >dashboard link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085 >compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8 >syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12c846ba580000 > >Downloadable assets: >disk image: https://storage.googleapis.com/syzbot-assets/053d3b49a360/disk-cf7c3c02.raw.xz >vmlinux: https://storage.googleapis.com/syzbot-assets/faabb37d41d0/vmlinux-cf7c3c02.xz >kernel image: https://storage.googleapis.com/syzbot-assets/8d47fe92aaa8/bzImage-cf7c3c02.xz > >IMPORTANT: if you fix the issue, please add the following tag to the commit: >Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com > > free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401 > __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline] > tlb_batch_pages_flush mm/mmu_gather.c:151 [inline] > tlb_flush_mmu_free mm/mmu_gather.c:417 [inline] > tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424 > tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549 > exit_mmap+0x498/0x9e0 mm/mmap.c:1313 > __mmput+0x118/0x430 kernel/fork.c:1177 > 
exit_mm+0x18e/0x250 kernel/exit.c:581 > do_exit+0x6a2/0x22c0 kernel/exit.c:962 > do_group_exit+0x21b/0x2d0 kernel/exit.c:1116 > __do_sys_exit_group kernel/exit.c:1127 [inline] > __se_sys_exit_group kernel/exit.c:1125 [inline] > __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1125 > x64_sys_call+0x221a/0x2240 arch/x86/include/generated/asm/syscalls_64.h:232 > do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] > do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94 > entry_SYSCALL_64_after_hwframe+0x77/0x7f >------------[ cut here ]------------ >1 >WARNING: mm/huge_memory.c:4371 at deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371, CPU#1: syz.3.1110/10500 >Modules linked in: >CPU: 1 UID: 0 PID: 10500 Comm: syz.3.1110 Not tainted syzkaller #0 PREEMPT(full) >Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026 >RIP: 0010:deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371 >Code: 31 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d e9 c2 67 8d 09 cc e8 8c 73 93 ff 48 89 df 48 c7 c6 20 5b fc 8b e8 dd 2b f5 fe 90 <0f> 0b 90 e9 d4 fe ff ff e8 9f 7a 8a 09 e8 6a 73 93 ff 48 89 df 48 >RSP: 0018:ffffc900047ef540 EFLAGS: 00010046 >RAX: 1c05fb65cfaab100 RBX: ffffea0001840000 RCX: 0000000080000001 >RDX: 0000000000000002 RSI: ffffffff8e4da1c7 RDI: ffff88807d6f9e80 >RBP: ffffc900047ef610 R08: ffff8880b87247d3 R09: 1ffff110170e48fa >R10: dffffc0000000000 R11: ffffed10170e48fb R12: ffffea0001840040 >R13: 0000000000000000 R14: 0000000000010000 R15: 1ffff920008fdeb0 >FS: 00007f32e32a76c0(0000) GS:ffff8881250e8000(0000) knlGS:0000000000000000 >CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >CR2: 00007f5825757930 CR3: 0000000034ad8000 CR4: 00000000003526f0 >Call Trace: > <TASK> > migrate_folio_move mm/migrate.c:1411 [inline] Looks like a race introduced by commit[1] ("mm: migrate: requeue destination folio on deferred split queue"). 
Between folio migration (mbind) and rmap removal (exit_mmap), I guess :)

migrate_folio_move() snapshots src_partially_mapped from src before
migration:

	if (folio_order(src) > 1 &&
	    !data_race(list_empty(&src->_deferred_list))) {
		src_deferred_split = true;
		src_partially_mapped = folio_test_partially_mapped(src);
	}

Then move_to_new_folio() eventually unqueues src in
__folio_migrate_mapping():

	folio_unqueue_deferred_split(src);

After that, migration restores mappings to dst:

	if (old_page_state & PAGE_WAS_MAPPED)
		remove_migration_ptes(src, dst, 0);

At that point, dst is already visible again. A concurrent unmap path
from another sharer can then remove some of those mappings and reach
deferred_split_folio(dst, true), which sets PG_partially_mapped on
dst.

Migration then resumes and does:

	if (src_deferred_split)
		deferred_split_folio(dst, src_partially_mapped);

If the earlier snapshot from src was false, this becomes
deferred_split_folio(dst, false), but dst may already have been marked
partially mapped by the concurrent rmap-removal path, so the WARN in
deferred_split_folio() fires:

	if (partially_mapped) {
		...
	} else {
		/* partially mapped folios cannot become non-partially mapped */
		VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
	}

[1] https://lore.kernel.org/all/20260312104723.1351321-1-usama.arif@linux.dev/

> migrate_folios_move mm/migrate.c:1740 [inline]
> migrate_pages_batch+0x319f/0x4c40 mm/migrate.c:1996
> migrate_pages_sync mm/migrate.c:2026 [inline]
> migrate_pages+0x1c74/0x2a10 mm/migrate.c:2135
> do_mbind mm/mempolicy.c:1614 [inline]
> kernel_mbind mm/mempolicy.c:1757 [inline]
> __do_sys_mbind mm/mempolicy.c:1831 [inline]
> __se_sys_mbind+0xe89/0x10f0 mm/mempolicy.c:1827
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>RIP: 0033:0x7f32e239c819
>Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
>RSP: 002b:00007f32e32a7028 EFLAGS: 00000246 ORIG_RAX: 00000000000000ed
>RAX: ffffffffffffffda RBX: 00007f32e2616090 RCX: 00007f32e239c819
>RDX: 0000000000000000 RSI: 0000000000800000 RDI: 0000200000001000
>RBP: 00007f32e2432c91 R08: 0000000000000000 R09: 0000000000000002
>R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>R13: 00007f32e2616128 R14: 00007f32e2616090 R15: 00007ffe30c61cc8
> </TASK>
>
>
>---
>This report is generated by a bot. It may contain errors.
>See https://goo.gl/tpsmEJ for more information about syzbot.
>syzbot engineers can be reached at syzkaller@googlegroups.com.
>
>syzbot will keep track of this issue. See:
>https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
>If the report is already addressed, let syzbot know by replying with:
>#syz fix: exact-commit-title
>
>If you want syzbot to run the reproducer, reply with:
>#syz test: git://repo/address.git branch-or-commit-hash
>If you attach or paste a git patch, syzbot will apply it before testing.
> >If you want to overwrite report's subsystems, reply with: >#syz set subsystems: new-subsystem >(See the list of subsystem names on the web dashboard) > >If the report is a duplicate of another one, reply with: >#syz dup: exact-subject-of-another-report > >If you want to undo deduplication, reply with: >#syz undup > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 8:10 ` [syzbot] [mm?] WARNING in deferred_split_folio Lance Yang @ 2026-04-01 8:59 ` Lance Yang 2026-04-01 9:36 ` David Hildenbrand (Arm) 2026-04-01 10:16 ` David Hildenbrand (Arm) 0 siblings, 2 replies; 19+ messages in thread From: Lance Yang @ 2026-04-01 8:59 UTC (permalink / raw) To: usama.arif, david, Liam.Howlett, ziy Cc: syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs, Lance Yang On Wed, Apr 01, 2026 at 04:10:25PM +0800, Lance Yang wrote: > >+Cc Usama > >On Tue, Mar 31, 2026 at 11:08:27PM -0700, syzbot wrote: >>Hello, >> >>syzbot found the following issue on: >> >>HEAD commit: cf7c3c02fdd0 Add linux-next specific files for 20260330 >>git tree: linux-next >>console output: https://syzkaller.appspot.com/x/log.txt?x=154ee46a580000 >>kernel config: https://syzkaller.appspot.com/x/.config?x=3944d875fa9bfb67 >>dashboard link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085 >>compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8 >>syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12c846ba580000 >> >>Downloadable assets: >>disk image: https://storage.googleapis.com/syzbot-assets/053d3b49a360/disk-cf7c3c02.raw.xz >>vmlinux: https://storage.googleapis.com/syzbot-assets/faabb37d41d0/vmlinux-cf7c3c02.xz >>kernel image: https://storage.googleapis.com/syzbot-assets/8d47fe92aaa8/bzImage-cf7c3c02.xz >> >>IMPORTANT: if you fix the issue, please add the following tag to the commit: >>Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com >> >> free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401 >> __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline] >> tlb_batch_pages_flush mm/mmu_gather.c:151 [inline] >> tlb_flush_mmu_free mm/mmu_gather.c:417 [inline] >> tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424 >> tlb_finish_mmu+0xf9/0x230 
mm/mmu_gather.c:549 >> exit_mmap+0x498/0x9e0 mm/mmap.c:1313 >> __mmput+0x118/0x430 kernel/fork.c:1177 >> exit_mm+0x18e/0x250 kernel/exit.c:581 >> do_exit+0x6a2/0x22c0 kernel/exit.c:962 >> do_group_exit+0x21b/0x2d0 kernel/exit.c:1116 >> __do_sys_exit_group kernel/exit.c:1127 [inline] >> __se_sys_exit_group kernel/exit.c:1125 [inline] >> __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1125 >> x64_sys_call+0x221a/0x2240 arch/x86/include/generated/asm/syscalls_64.h:232 >> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] >> do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94 >> entry_SYSCALL_64_after_hwframe+0x77/0x7f >>------------[ cut here ]------------ >>1 >>WARNING: mm/huge_memory.c:4371 at deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371, CPU#1: syz.3.1110/10500 >>Modules linked in: >>CPU: 1 UID: 0 PID: 10500 Comm: syz.3.1110 Not tainted syzkaller #0 PREEMPT(full) >>Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026 >>RIP: 0010:deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371 >>Code: 31 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d e9 c2 67 8d 09 cc e8 8c 73 93 ff 48 89 df 48 c7 c6 20 5b fc 8b e8 dd 2b f5 fe 90 <0f> 0b 90 e9 d4 fe ff ff e8 9f 7a 8a 09 e8 6a 73 93 ff 48 89 df 48 >>RSP: 0018:ffffc900047ef540 EFLAGS: 00010046 >>RAX: 1c05fb65cfaab100 RBX: ffffea0001840000 RCX: 0000000080000001 >>RDX: 0000000000000002 RSI: ffffffff8e4da1c7 RDI: ffff88807d6f9e80 >>RBP: ffffc900047ef610 R08: ffff8880b87247d3 R09: 1ffff110170e48fa >>R10: dffffc0000000000 R11: ffffed10170e48fb R12: ffffea0001840040 >>R13: 0000000000000000 R14: 0000000000010000 R15: 1ffff920008fdeb0 >>FS: 00007f32e32a76c0(0000) GS:ffff8881250e8000(0000) knlGS:0000000000000000 >>CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>CR2: 00007f5825757930 CR3: 0000000034ad8000 CR4: 00000000003526f0 >>Call Trace: >> <TASK> >> migrate_folio_move mm/migrate.c:1411 [inline] > >Looks like a race introduced by commit[1] ("mm: migrate: requeue >destination folio on 
deferred split queue"). > >Between folio migration (mbind) and rmap removal (exit_mmap), I guess :) > >migrate_folio_move() snapshots src_partially_mapped from src before >migration: > > if (folio_order(src) > 1 && > !data_race(list_empty(&src->_deferred_list))) { > src_deferred_split = true; > src_partially_mapped = folio_test_partially_mapped(src); > } > >Then move_to_new_folio() eventually unqueues src in >__folio_migrate_mapping(): > > folio_unqueue_deferred_split(src); > >After that, migration restores mappings to dst: > > if (old_page_state & PAGE_WAS_MAPPED) > remove_migration_ptes(src, dst, 0); > >At that point, dst is already visible again. A concurrent unmap path >from another sharer can then remove some of those mappings and reach >deferred_split_folio(dst, true), which sets PG_partially_mapped on >dst. > >Migration then resumes and does: > > if (src_deferred_split) > deferred_split_folio(dst, src_partially_mapped); > >If the earlier snapshot from src was false, this becomes >deferred_split_folio(dst, false), but dst may already have been marked >partially mapped by the concurrent rmap-removal path, so the WARN in >deferred_split_folio() fires: > > if (partially_mapped) { > ... > } else { > /* partially mapped folios cannot become non-partially mapped */ > VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio); > } > >[1] https://lore.kernel.org/all/20260312104723.1351321-1-usama.arif@linux.dev/ > Perhaps the WARN is simply too strict there :) Migration already holds the folio lock on dst, while the competing rmap-removal path runs under the page-table lock. So once remove_migration_ptes(src, dst, 0) makes dst visible again, this race looks hard to avoid. 
So maybe the simplest fix is just to drop the WARN in the
!partially_mapped path:

---8<---
Subject: [PATCH 1/1] mm/thp: avoid false warning in deferred_split_folio()

From: Lance Yang <lance.yang@linux.dev>

migrate_folio_move() snapshots src_partially_mapped from src before
migration and later requeues dst after remove_migration_ptes(src, dst, 0).

Once dst is visible again, a competing rmap-removal path can legally set
PG_partially_mapped before the migration path reaches
deferred_split_folio(dst, src_partially_mapped).

Migration already holds the folio lock on dst, while the competing
rmap-removal path runs under the page-table lock. So once
remove_migration_ptes(src, dst, 0) makes dst visible again, this race
looks hard to avoid.

So just drop the WARN in the !partially_mapped path and preserve an
already-set PG_partially_mapped bit.

Link: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/
Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue")
Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
Signed-off-by: Lance Yang <lance.yang@linux.dev>
---
 mm/huge_memory.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 745eb3d0d4a7..8ea8e293dc7c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4433,9 +4433,6 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
 			mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1);
 
 		}
-	} else {
-		/* partially mapped folios cannot become non-partially mapped */
-		VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
 	}
 	if (list_empty(&folio->_deferred_list)) {
 		struct mem_cgroup *memcg;
---

Thanks,
Lance

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 8:59 ` Lance Yang @ 2026-04-01 9:36 ` David Hildenbrand (Arm) 2026-04-01 10:16 ` David Hildenbrand (Arm) 1 sibling, 0 replies; 19+ messages in thread From: David Hildenbrand (Arm) @ 2026-04-01 9:36 UTC (permalink / raw) To: Lance Yang, usama.arif, Liam.Howlett, ziy Cc: syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs, Deepanshu Kartikey On 4/1/26 10:59, Lance Yang wrote: > > On Wed, Apr 01, 2026 at 04:10:25PM +0800, Lance Yang wrote: >> >> +Cc Usama >> >> On Tue, Mar 31, 2026 at 11:08:27PM -0700, syzbot wrote: >>> Hello, >>> >>> syzbot found the following issue on: >>> >>> HEAD commit: cf7c3c02fdd0 Add linux-next specific files for 20260330 >>> git tree: linux-next >>> console output: https://syzkaller.appspot.com/x/log.txt?x=154ee46a580000 >>> kernel config: https://syzkaller.appspot.com/x/.config?x=3944d875fa9bfb67 >>> dashboard link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085 >>> compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8 >>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12c846ba580000 >>> >>> Downloadable assets: >>> disk image: https://storage.googleapis.com/syzbot-assets/053d3b49a360/disk-cf7c3c02.raw.xz >>> vmlinux: https://storage.googleapis.com/syzbot-assets/faabb37d41d0/vmlinux-cf7c3c02.xz >>> kernel image: https://storage.googleapis.com/syzbot-assets/8d47fe92aaa8/bzImage-cf7c3c02.xz >>> >>> IMPORTANT: if you fix the issue, please add the following tag to the commit: >>> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com >>> >>> free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401 >>> __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline] >>> tlb_batch_pages_flush mm/mmu_gather.c:151 [inline] >>> tlb_flush_mmu_free mm/mmu_gather.c:417 [inline] >>> tlb_flush_mmu+0x6d3/0xa30 
mm/mmu_gather.c:424 >>> tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549 >>> exit_mmap+0x498/0x9e0 mm/mmap.c:1313 >>> __mmput+0x118/0x430 kernel/fork.c:1177 >>> exit_mm+0x18e/0x250 kernel/exit.c:581 >>> do_exit+0x6a2/0x22c0 kernel/exit.c:962 >>> do_group_exit+0x21b/0x2d0 kernel/exit.c:1116 >>> __do_sys_exit_group kernel/exit.c:1127 [inline] >>> __se_sys_exit_group kernel/exit.c:1125 [inline] >>> __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1125 >>> x64_sys_call+0x221a/0x2240 arch/x86/include/generated/asm/syscalls_64.h:232 >>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] >>> do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94 >>> entry_SYSCALL_64_after_hwframe+0x77/0x7f >>> ------------[ cut here ]------------ >>> 1 >>> WARNING: mm/huge_memory.c:4371 at deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371, CPU#1: syz.3.1110/10500 >>> Modules linked in: >>> CPU: 1 UID: 0 PID: 10500 Comm: syz.3.1110 Not tainted syzkaller #0 PREEMPT(full) >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026 >>> RIP: 0010:deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371 >>> Code: 31 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d e9 c2 67 8d 09 cc e8 8c 73 93 ff 48 89 df 48 c7 c6 20 5b fc 8b e8 dd 2b f5 fe 90 <0f> 0b 90 e9 d4 fe ff ff e8 9f 7a 8a 09 e8 6a 73 93 ff 48 89 df 48 >>> RSP: 0018:ffffc900047ef540 EFLAGS: 00010046 >>> RAX: 1c05fb65cfaab100 RBX: ffffea0001840000 RCX: 0000000080000001 >>> RDX: 0000000000000002 RSI: ffffffff8e4da1c7 RDI: ffff88807d6f9e80 >>> RBP: ffffc900047ef610 R08: ffff8880b87247d3 R09: 1ffff110170e48fa >>> R10: dffffc0000000000 R11: ffffed10170e48fb R12: ffffea0001840040 >>> R13: 0000000000000000 R14: 0000000000010000 R15: 1ffff920008fdeb0 >>> FS: 00007f32e32a76c0(0000) GS:ffff8881250e8000(0000) knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: 00007f5825757930 CR3: 0000000034ad8000 CR4: 00000000003526f0 >>> Call Trace: >>> <TASK> >>> migrate_folio_move mm/migrate.c:1411 
[inline] >> >> Looks like a race introduced by commit[1] ("mm: migrate: requeue >> destination folio on deferred split queue"). >> >> Between folio migration (mbind) and rmap removal (exit_mmap), I guess :) >> >> migrate_folio_move() snapshots src_partially_mapped from src before >> migration: >> >> if (folio_order(src) > 1 && >> !data_race(list_empty(&src->_deferred_list))) { >> src_deferred_split = true; >> src_partially_mapped = folio_test_partially_mapped(src); >> } >> >> Then move_to_new_folio() eventually unqueues src in >> __folio_migrate_mapping(): >> >> folio_unqueue_deferred_split(src); >> >> After that, migration restores mappings to dst: >> >> if (old_page_state & PAGE_WAS_MAPPED) >> remove_migration_ptes(src, dst, 0); >> >> At that point, dst is already visible again. A concurrent unmap path >>from another sharer can then remove some of those mappings and reach >> deferred_split_folio(dst, true), which sets PG_partially_mapped on >> dst. >> >> Migration then resumes and does: >> >> if (src_deferred_split) >> deferred_split_folio(dst, src_partially_mapped); >> >> If the earlier snapshot from src was false, this becomes >> deferred_split_folio(dst, false), but dst may already have been marked >> partially mapped by the concurrent rmap-removal path, so the WARN in >> deferred_split_folio() fires: >> >> if (partially_mapped) { >> ... >> } else { >> /* partially mapped folios cannot become non-partially mapped */ >> VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio); >> } >> >> [1] https://lore.kernel.org/all/20260312104723.1351321-1-usama.arif@linux.dev/ >> > > Perhaps the WARN is simply too strict there :) > > Migration already holds the folio lock on dst, while the competing > rmap-removal path runs under the page-table lock. So once > remove_migration_ptes(src, dst, 0) makes dst visible again, this race > looks hard to avoid. 
>
> So maybe the simplest fix is just to drop the WARN in the
> !partially_mapped path:
>
> ---8<---
> Subject: [PATCH 1/1] mm/thp: avoid false warning in deferred_split_folio()
>
> From: Lance Yang <lance.yang@linux.dev>
>
> migrate_folio_move() snapshots src_partially_mapped from src before
> migration and later requeues dst after remove_migration_ptes(src, dst, 0).
>
> Once dst is visible again, a competing rmap-removal path can legally set
> PG_partially_mapped before the migration path reaches
> deferred_split_folio(dst, src_partially_mapped).
>
> Migration already holds the folio lock on dst, while the competing
> rmap-removal path runs under the page-table lock. So once
> remove_migration_ptes(src, dst, 0) makes dst visible again, this race
> looks hard to avoid.
>
> So just drop the WARN in the !partially_mapped path and preserve an
> already-set PG_partially_mapped bit.
>
> Link: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/
> Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue")
> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
> Signed-off-by: Lance Yang <lance.yang@linux.dev>

A fix just appeared:

https://lore.kernel.org/r/20260401084116.22219-1-kartikey406@gmail.com

Have to think about this :)

--
Cheers,

David

^ permalink raw reply	[flat|nested] 19+ messages in thread
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 8:59 ` Lance Yang 2026-04-01 9:36 ` David Hildenbrand (Arm) @ 2026-04-01 10:16 ` David Hildenbrand (Arm) 2026-04-01 10:53 ` Lance Yang 1 sibling, 1 reply; 19+ messages in thread From: David Hildenbrand (Arm) @ 2026-04-01 10:16 UTC (permalink / raw) To: Lance Yang, usama.arif, Liam.Howlett, ziy Cc: syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs On 4/1/26 10:59, Lance Yang wrote: > > On Wed, Apr 01, 2026 at 04:10:25PM +0800, Lance Yang wrote: >> >> +Cc Usama >> >> On Tue, Mar 31, 2026 at 11:08:27PM -0700, syzbot wrote: >>> Hello, >>> >>> syzbot found the following issue on: >>> >>> HEAD commit: cf7c3c02fdd0 Add linux-next specific files for 20260330 >>> git tree: linux-next >>> console output: https://syzkaller.appspot.com/x/log.txt?x=154ee46a580000 >>> kernel config: https://syzkaller.appspot.com/x/.config?x=3944d875fa9bfb67 >>> dashboard link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085 >>> compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8 >>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12c846ba580000 >>> >>> Downloadable assets: >>> disk image: https://storage.googleapis.com/syzbot-assets/053d3b49a360/disk-cf7c3c02.raw.xz >>> vmlinux: https://storage.googleapis.com/syzbot-assets/faabb37d41d0/vmlinux-cf7c3c02.xz >>> kernel image: https://storage.googleapis.com/syzbot-assets/8d47fe92aaa8/bzImage-cf7c3c02.xz >>> >>> IMPORTANT: if you fix the issue, please add the following tag to the commit: >>> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com >>> >>> free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401 >>> __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline] >>> tlb_batch_pages_flush mm/mmu_gather.c:151 [inline] >>> tlb_flush_mmu_free mm/mmu_gather.c:417 [inline] >>> tlb_flush_mmu+0x6d3/0xa30 
mm/mmu_gather.c:424 >>> tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549 >>> exit_mmap+0x498/0x9e0 mm/mmap.c:1313 >>> __mmput+0x118/0x430 kernel/fork.c:1177 >>> exit_mm+0x18e/0x250 kernel/exit.c:581 >>> do_exit+0x6a2/0x22c0 kernel/exit.c:962 >>> do_group_exit+0x21b/0x2d0 kernel/exit.c:1116 >>> __do_sys_exit_group kernel/exit.c:1127 [inline] >>> __se_sys_exit_group kernel/exit.c:1125 [inline] >>> __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1125 >>> x64_sys_call+0x221a/0x2240 arch/x86/include/generated/asm/syscalls_64.h:232 >>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] >>> do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94 >>> entry_SYSCALL_64_after_hwframe+0x77/0x7f >>> ------------[ cut here ]------------ >>> 1 >>> WARNING: mm/huge_memory.c:4371 at deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371, CPU#1: syz.3.1110/10500 >>> Modules linked in: >>> CPU: 1 UID: 0 PID: 10500 Comm: syz.3.1110 Not tainted syzkaller #0 PREEMPT(full) >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026 >>> RIP: 0010:deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371 >>> Code: 31 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d e9 c2 67 8d 09 cc e8 8c 73 93 ff 48 89 df 48 c7 c6 20 5b fc 8b e8 dd 2b f5 fe 90 <0f> 0b 90 e9 d4 fe ff ff e8 9f 7a 8a 09 e8 6a 73 93 ff 48 89 df 48 >>> RSP: 0018:ffffc900047ef540 EFLAGS: 00010046 >>> RAX: 1c05fb65cfaab100 RBX: ffffea0001840000 RCX: 0000000080000001 >>> RDX: 0000000000000002 RSI: ffffffff8e4da1c7 RDI: ffff88807d6f9e80 >>> RBP: ffffc900047ef610 R08: ffff8880b87247d3 R09: 1ffff110170e48fa >>> R10: dffffc0000000000 R11: ffffed10170e48fb R12: ffffea0001840040 >>> R13: 0000000000000000 R14: 0000000000010000 R15: 1ffff920008fdeb0 >>> FS: 00007f32e32a76c0(0000) GS:ffff8881250e8000(0000) knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: 00007f5825757930 CR3: 0000000034ad8000 CR4: 00000000003526f0 >>> Call Trace: >>> <TASK> >>> migrate_folio_move mm/migrate.c:1411 
[inline] >> >> Looks like a race introduced by commit[1] ("mm: migrate: requeue >> destination folio on deferred split queue"). >> >> Between folio migration (mbind) and rmap removal (exit_mmap), I guess :) >> >> migrate_folio_move() snapshots src_partially_mapped from src before >> migration: >> >> if (folio_order(src) > 1 && >> !data_race(list_empty(&src->_deferred_list))) { >> src_deferred_split = true; >> src_partially_mapped = folio_test_partially_mapped(src); >> } >> >> Then move_to_new_folio() eventually unqueues src in >> __folio_migrate_mapping(): >> >> folio_unqueue_deferred_split(src); >> >> After that, migration restores mappings to dst: >> >> if (old_page_state & PAGE_WAS_MAPPED) >> remove_migration_ptes(src, dst, 0); >> >> At that point, dst is already visible again. A concurrent unmap path >>from another sharer can then remove some of those mappings and reach >> deferred_split_folio(dst, true), which sets PG_partially_mapped on >> dst. >> >> Migration then resumes and does: >> >> if (src_deferred_split) >> deferred_split_folio(dst, src_partially_mapped); >> >> If the earlier snapshot from src was false, this becomes >> deferred_split_folio(dst, false), but dst may already have been marked >> partially mapped by the concurrent rmap-removal path, so the WARN in >> deferred_split_folio() fires: >> >> if (partially_mapped) { >> ... >> } else { >> /* partially mapped folios cannot become non-partially mapped */ >> VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio); >> } >> >> [1] https://lore.kernel.org/all/20260312104723.1351321-1-usama.arif@linux.dev/ >> > > Perhaps the WARN is simply too strict there :) > > Migration already holds the folio lock on dst, while the competing > rmap-removal path runs under the page-table lock. So once > remove_migration_ptes(src, dst, 0) makes dst visible again, this race > looks hard to avoid. 
> > So maybe the simplest fix is just to drop the WARN in the > !partially_mapped path: > > ---8<--- > Subject: [PATCH 1/1] mm/thp: avoid false warning in deferred_split_folio() > > From: Lance Yang <lance.yang@linux.dev> > > migrate_folio_move() snapshots src_partially_mapped from src before > migration and later requeues dst after remove_migration_ptes(src, dst, 0). > > Once dst is visible again, a competing rmap-removal path can legally set > PG_partially_mapped before the migration path reaches > deferred_split_folio(dst, src_partially_mapped). > > Migration already holds the folio lock on dst, while the competing > rmap-removal path runs under the page-table lock. So once > remove_migration_ptes(src, dst, 0) makes dst visible again, this race > looks hard to avoid. > > So just drop the WARN in the !partially_mapped path and preserve an > already-set PG_partially_mapped bit. > > Link: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/ > Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue") > Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com > Signed-off-by: Lance Yang <lance.yang@linux.dev> > --- > mm/huge_memory.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 745eb3d0d4a7..8ea8e293dc7c 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -4433,9 +4433,6 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped) > mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1); > > } > - } else { > - /* partially mapped folios cannot become non-partially mapped */ > - VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio); > } Can't we simply move the setting before restoring migration ptes? 
diff --git a/mm/migrate.c b/mm/migrate.c
index 05cb408846f2..5f222cb0ca90 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1385,6 +1385,15 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 	if (rc)
 		goto out;
 
+	/*
+	 * Requeue the destination folio on the deferred split queue if
+	 * the source was on the queue. The source is unqueued in
+	 * __folio_migrate_mapping(), so we recorded the state from
+	 * before move_to_new_folio().
+	 */
+	if (src_deferred_split)
+		deferred_split_folio(dst, src_partially_mapped);
+
 	/*
 	 * When successful, push dst to LRU immediately: so that if it
 	 * turns out to be an mlocked page, remove_migration_ptes() will
@@ -1400,16 +1409,6 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 
 	if (old_page_state & PAGE_WAS_MAPPED)
 		remove_migration_ptes(src, dst, 0);
-
-	/*
-	 * Requeue the destination folio on the deferred split queue if
-	 * the source was on the queue. The source is unqueued in
-	 * __folio_migrate_mapping(), so we recorded the state from
-	 * before move_to_new_folio().
-	 */
-	if (src_deferred_split)
-		deferred_split_folio(dst, src_partially_mapped);
-
 out_unlock_both:
 	folio_unlock(dst);
 	folio_set_owner_migrate_reason(dst, reason);

-- 
Cheers,

David

^ permalink raw reply [flat|nested] 19+ messages in thread
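For readers following the ordering argument, here is a minimal userspace sketch of it. This is not kernel code: struct model_folio, model_deferred_split() and model_migrate() are hypothetical names invented for illustration, and the "concurrent" unmap is replayed sequentially at the point where dst becomes visible. It assumes the src snapshot was "queued, not partially mapped", which is the case that triggers the WARN.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio; only the two bits we care about. */
struct model_folio {
	bool on_deferred_list;
	bool partially_mapped;
};

/*
 * Toy version of deferred_split_folio(). Returns true when the
 * "partially mapped folios cannot become non-partially mapped"
 * check would have warned.
 */
bool model_deferred_split(struct model_folio *f, bool partially_mapped)
{
	bool would_warn = false;

	if (partially_mapped)
		f->partially_mapped = true;
	else if (f->partially_mapped)
		would_warn = true;

	f->on_deferred_list = true;
	return would_warn;
}

/*
 * Replay one interleaving for dst. requeue_first selects the proposed
 * ordering (requeue before the migration PTEs are restored).
 */
bool model_migrate(bool requeue_first)
{
	struct model_folio dst = { false, false };
	bool warned = false;

	if (requeue_first)
		warned |= model_deferred_split(&dst, false);

	/*
	 * Stand-in for remove_migration_ptes(): dst is visible again and
	 * an unmap from another sharer marks it partially mapped.
	 */
	warned |= model_deferred_split(&dst, true);

	if (!requeue_first)
		warned |= model_deferred_split(&dst, false);

	/* Either ordering leaves dst on the queue. */
	assert(dst.on_deferred_list);
	return warned;
}
```

In this model, model_migrate(false) reports the warning and model_migrate(true) does not, because the requeue with the stale "false" snapshot happens before anyone else can set the bit.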
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 10:16 ` David Hildenbrand (Arm) @ 2026-04-01 10:53 ` Lance Yang 2026-04-01 11:00 ` David Hildenbrand (Arm) 0 siblings, 1 reply; 19+ messages in thread From: Lance Yang @ 2026-04-01 10:53 UTC (permalink / raw) To: david, kartikey406 Cc: lance.yang, usama.arif, Liam.Howlett, ziy, syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs +Cc Deepanshu On Wed, Apr 01, 2026 at 12:16:43PM +0200, David Hildenbrand (Arm) wrote: >On 4/1/26 10:59, Lance Yang wrote: >> >> On Wed, Apr 01, 2026 at 04:10:25PM +0800, Lance Yang wrote: >>> >>> +Cc Usama >>> >>> On Tue, Mar 31, 2026 at 11:08:27PM -0700, syzbot wrote: >>>> Hello, >>>> >>>> syzbot found the following issue on: >>>> >>>> HEAD commit: cf7c3c02fdd0 Add linux-next specific files for 20260330 >>>> git tree: linux-next >>>> console output: https://syzkaller.appspot.com/x/log.txt?x=154ee46a580000 >>>> kernel config: https://syzkaller.appspot.com/x/.config?x=3944d875fa9bfb67 >>>> dashboard link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085 >>>> compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8 >>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12c846ba580000 >>>> >>>> Downloadable assets: >>>> disk image: https://storage.googleapis.com/syzbot-assets/053d3b49a360/disk-cf7c3c02.raw.xz >>>> vmlinux: https://storage.googleapis.com/syzbot-assets/faabb37d41d0/vmlinux-cf7c3c02.xz >>>> kernel image: https://storage.googleapis.com/syzbot-assets/8d47fe92aaa8/bzImage-cf7c3c02.xz >>>> >>>> IMPORTANT: if you fix the issue, please add the following tag to the commit: >>>> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com >>>> >>>> free_pages_and_swap_cache+0x2b9/0x490 mm/swap_state.c:401 >>>> __tlb_batch_free_encoded_pages mm/mmu_gather.c:138 [inline] >>>> tlb_batch_pages_flush mm/mmu_gather.c:151 
[inline] >>>> tlb_flush_mmu_free mm/mmu_gather.c:417 [inline] >>>> tlb_flush_mmu+0x6d3/0xa30 mm/mmu_gather.c:424 >>>> tlb_finish_mmu+0xf9/0x230 mm/mmu_gather.c:549 >>>> exit_mmap+0x498/0x9e0 mm/mmap.c:1313 >>>> __mmput+0x118/0x430 kernel/fork.c:1177 >>>> exit_mm+0x18e/0x250 kernel/exit.c:581 >>>> do_exit+0x6a2/0x22c0 kernel/exit.c:962 >>>> do_group_exit+0x21b/0x2d0 kernel/exit.c:1116 >>>> __do_sys_exit_group kernel/exit.c:1127 [inline] >>>> __se_sys_exit_group kernel/exit.c:1125 [inline] >>>> __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1125 >>>> x64_sys_call+0x221a/0x2240 arch/x86/include/generated/asm/syscalls_64.h:232 >>>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] >>>> do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94 >>>> entry_SYSCALL_64_after_hwframe+0x77/0x7f >>>> ------------[ cut here ]------------ >>>> 1 >>>> WARNING: mm/huge_memory.c:4371 at deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371, CPU#1: syz.3.1110/10500 >>>> Modules linked in: >>>> CPU: 1 UID: 0 PID: 10500 Comm: syz.3.1110 Not tainted syzkaller #0 PREEMPT(full) >>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026 >>>> RIP: 0010:deferred_split_folio+0x974/0xaa0 mm/huge_memory.c:4371 >>>> Code: 31 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d e9 c2 67 8d 09 cc e8 8c 73 93 ff 48 89 df 48 c7 c6 20 5b fc 8b e8 dd 2b f5 fe 90 <0f> 0b 90 e9 d4 fe ff ff e8 9f 7a 8a 09 e8 6a 73 93 ff 48 89 df 48 >>>> RSP: 0018:ffffc900047ef540 EFLAGS: 00010046 >>>> RAX: 1c05fb65cfaab100 RBX: ffffea0001840000 RCX: 0000000080000001 >>>> RDX: 0000000000000002 RSI: ffffffff8e4da1c7 RDI: ffff88807d6f9e80 >>>> RBP: ffffc900047ef610 R08: ffff8880b87247d3 R09: 1ffff110170e48fa >>>> R10: dffffc0000000000 R11: ffffed10170e48fb R12: ffffea0001840040 >>>> R13: 0000000000000000 R14: 0000000000010000 R15: 1ffff920008fdeb0 >>>> FS: 00007f32e32a76c0(0000) GS:ffff8881250e8000(0000) knlGS:0000000000000000 >>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>> CR2: 
00007f5825757930 CR3: 0000000034ad8000 CR4: 00000000003526f0 >>>> Call Trace: >>>> <TASK> >>>> migrate_folio_move mm/migrate.c:1411 [inline] >>> >>> Looks like a race introduced by commit[1] ("mm: migrate: requeue >>> destination folio on deferred split queue"). >>> >>> Between folio migration (mbind) and rmap removal (exit_mmap), I guess :) >>> >>> migrate_folio_move() snapshots src_partially_mapped from src before >>> migration: >>> >>> if (folio_order(src) > 1 && >>> !data_race(list_empty(&src->_deferred_list))) { >>> src_deferred_split = true; >>> src_partially_mapped = folio_test_partially_mapped(src); >>> } >>> >>> Then move_to_new_folio() eventually unqueues src in >>> __folio_migrate_mapping(): >>> >>> folio_unqueue_deferred_split(src); >>> >>> After that, migration restores mappings to dst: >>> >>> if (old_page_state & PAGE_WAS_MAPPED) >>> remove_migration_ptes(src, dst, 0); >>> >>> At that point, dst is already visible again. A concurrent unmap path >>>from another sharer can then remove some of those mappings and reach >>> deferred_split_folio(dst, true), which sets PG_partially_mapped on >>> dst. >>> >>> Migration then resumes and does: >>> >>> if (src_deferred_split) >>> deferred_split_folio(dst, src_partially_mapped); >>> >>> If the earlier snapshot from src was false, this becomes >>> deferred_split_folio(dst, false), but dst may already have been marked >>> partially mapped by the concurrent rmap-removal path, so the WARN in >>> deferred_split_folio() fires: >>> >>> if (partially_mapped) { >>> ... >>> } else { >>> /* partially mapped folios cannot become non-partially mapped */ >>> VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio); >>> } >>> >>> [1] https://lore.kernel.org/all/20260312104723.1351321-1-usama.arif@linux.dev/ >>> >> >> Perhaps the WARN is simply too strict there :) >> >> Migration already holds the folio lock on dst, while the competing >> rmap-removal path runs under the page-table lock. 
So once >> remove_migration_ptes(src, dst, 0) makes dst visible again, this race >> looks hard to avoid. >> >> So maybe the simplest fix is just to drop the WARN in the >> !partially_mapped path: >> >> ---8<--- >> Subject: [PATCH 1/1] mm/thp: avoid false warning in deferred_split_folio() >> >> From: Lance Yang <lance.yang@linux.dev> >> >> migrate_folio_move() snapshots src_partially_mapped from src before >> migration and later requeues dst after remove_migration_ptes(src, dst, 0). >> >> Once dst is visible again, a competing rmap-removal path can legally set >> PG_partially_mapped before the migration path reaches >> deferred_split_folio(dst, src_partially_mapped). >> >> Migration already holds the folio lock on dst, while the competing >> rmap-removal path runs under the page-table lock. So once >> remove_migration_ptes(src, dst, 0) makes dst visible again, this race >> looks hard to avoid. >> >> So just drop the WARN in the !partially_mapped path and preserve an >> already-set PG_partially_mapped bit. >> >> Link: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/ >> Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue") >> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com >> Signed-off-by: Lance Yang <lance.yang@linux.dev> >> --- >> mm/huge_memory.c | 3 --- >> 1 file changed, 3 deletions(-) >> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >> index 745eb3d0d4a7..8ea8e293dc7c 100644 >> --- a/mm/huge_memory.c >> +++ b/mm/huge_memory.c >> @@ -4433,9 +4433,6 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped) >> mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1); >> >> } >> - } else { >> - /* partially mapped folios cannot become non-partially mapped */ >> - VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio); >> } > >Can't we simply move the setting before restoring migration ptes? 
Afraid not, it closes the remove_migration_ptes() -> deferred_split_folio() race, but opens a new one with the shrinker, IIUC Once dst is on the deferred split queue, deferred_split_scan() can pick it up immediately. The shrinker unconditionally dequeues every folio it visits: list_del_init(&folio->_deferred_list); /* always */ Then for a non-partially-mapped folio, if folio_trylock() fails (dst is still locked by migration), it falls through to: next: if (did_split || !folio_test_partially_mapped(folio)) continue; /* not requeued, dst silently lost */ so it is *not* requeued. That seems to recreate the original issue commit[1] was fixing: letting underused THPs silently fall off the deferred split queue again ... Hopefully, I didn't miss something important :) [1] https://lore.kernel.org/all/20260312104723.1351321-1-usama.arif@linux.dev/ >diff --git a/mm/migrate.c b/mm/migrate.c >index 05cb408846f2..5f222cb0ca90 100644 >--- a/mm/migrate.c >+++ b/mm/migrate.c >@@ -1385,6 +1385,15 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private, > if (rc) > goto out; > >+ /* >+ * Requeue the destination folio on the deferred split queue if >+ * the source was on the queue. The source is unqueued in >+ * __folio_migrate_mapping(), so we recorded the state from >+ * before move_to_new_folio(). >+ */ >+ if (src_deferred_split) >+ deferred_split_folio(dst, src_partially_mapped); >+ > /* > * When successful, push dst to LRU immediately: so that if it > * turns out to be an mlocked page, remove_migration_ptes() will >@@ -1400,16 +1409,6 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private, > > if (old_page_state & PAGE_WAS_MAPPED) > remove_migration_ptes(src, dst, 0); >- >- /* >- * Requeue the destination folio on the deferred split queue if >- * the source was on the queue. The source is unqueued in >- * __folio_migrate_mapping(), so we recorded the state from >- * before move_to_new_folio(). 
>- */ >- if (src_deferred_split) >- deferred_split_folio(dst, src_partially_mapped); >- > out_unlock_both: > folio_unlock(dst); > folio_set_owner_migrate_reason(dst, reason); > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 10:53 ` Lance Yang @ 2026-04-01 11:00 ` David Hildenbrand (Arm) 2026-04-01 11:20 ` Lance Yang 0 siblings, 1 reply; 19+ messages in thread From: David Hildenbrand (Arm) @ 2026-04-01 11:00 UTC (permalink / raw) To: Lance Yang, kartikey406 Cc: usama.arif, Liam.Howlett, ziy, syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs On 4/1/26 12:53, Lance Yang wrote: > > +Cc Deepanshu > > On Wed, Apr 01, 2026 at 12:16:43PM +0200, David Hildenbrand (Arm) wrote: >> On 4/1/26 10:59, Lance Yang wrote: >>> >>> >from another sharer can then remove some of those mappings and reach >>> >>> Perhaps the WARN is simply too strict there :) >>> >>> Migration already holds the folio lock on dst, while the competing >>> rmap-removal path runs under the page-table lock. So once >>> remove_migration_ptes(src, dst, 0) makes dst visible again, this race >>> looks hard to avoid. >>> >>> So maybe the simplest fix is just to drop the WARN in the >>> !partially_mapped path: >>> >>> ---8<--- >>> Subject: [PATCH 1/1] mm/thp: avoid false warning in deferred_split_folio() >>> >>> From: Lance Yang <lance.yang@linux.dev> >>> >>> migrate_folio_move() snapshots src_partially_mapped from src before >>> migration and later requeues dst after remove_migration_ptes(src, dst, 0). >>> >>> Once dst is visible again, a competing rmap-removal path can legally set >>> PG_partially_mapped before the migration path reaches >>> deferred_split_folio(dst, src_partially_mapped). >>> >>> Migration already holds the folio lock on dst, while the competing >>> rmap-removal path runs under the page-table lock. So once >>> remove_migration_ptes(src, dst, 0) makes dst visible again, this race >>> looks hard to avoid. >>> >>> So just drop the WARN in the !partially_mapped path and preserve an >>> already-set PG_partially_mapped bit. 
>>> >>> Link: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/ >>> Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue") >>> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com >>> Signed-off-by: Lance Yang <lance.yang@linux.dev> >>> --- >>> mm/huge_memory.c | 3 --- >>> 1 file changed, 3 deletions(-) >>> >>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >>> index 745eb3d0d4a7..8ea8e293dc7c 100644 >>> --- a/mm/huge_memory.c >>> +++ b/mm/huge_memory.c >>> @@ -4433,9 +4433,6 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped) >>> mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1); >>> >>> } >>> - } else { >>> - /* partially mapped folios cannot become non-partially mapped */ >>> - VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio); >>> } >> >> Can't we simply move the setting before restoring migration ptes? > > Afraid not, it closes the remove_migration_ptes() -> > deferred_split_folio() race, but opens a new one with the shrinker, IIUC > > Once dst is on the deferred split queue, deferred_split_scan() can > pick it up immediately. The shrinker unconditionally dequeues every > folio it visits: > > list_del_init(&folio->_deferred_list); /* always */ > > Then for a non-partially-mapped folio, if folio_trylock() fails > (dst is still locked by migration), it falls through to: > > next: > if (did_split || !folio_test_partially_mapped(folio)) > continue; /* not requeued, dst silently lost */ > > so it is *not* requeued. How is that different to the shrinker just trying to lock the folio before we unlock it and failing? The race already exists? To sort out that race a trylock must not result in the folio getting discarded. 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ff9a42abd1b6..521989517cd1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4558,7 +4558,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 				goto next;
 		}
 		if (!folio_trylock(folio))
-			goto next;
+			goto requeue;
 		if (!split_folio(folio)) {
 			did_split = true;
 			if (underused)
@@ -4569,6 +4569,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 next:
 		if (did_split || !folio_test_partially_mapped(folio))
 			continue;
+requeue:
 		/*
 		 * Only add back to the queue if folio is partially mapped.
 		 * If thp_underused returns false, or if split_folio fails

-- 
Cheers,

David
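The effect of the requeue label can be sketched in userspace as follows. This is an illustrative model only, not the kernel's deferred_split_scan(): struct model_folio and model_scan_one() are hypothetical names, the folio is assumed to have passed the underused check, and a split is assumed to succeed whenever the lock is taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio. */
struct model_folio {
	bool on_deferred_list;
	bool partially_mapped;
	bool locked;		/* e.g. still held by migration */
};

/*
 * Models one loop iteration of the shrinker for a single folio.
 * Returns true if the folio is still on the deferred split queue
 * afterwards. requeue_on_trylock_fail selects the proposed behavior.
 */
bool model_scan_one(struct model_folio *f, bool requeue_on_trylock_fail)
{
	bool did_split = false;

	/* The shrinker only visits queued folios ... */
	assert(f->on_deferred_list);
	/* ... and takes each one off the queue unconditionally. */
	f->on_deferred_list = false;

	if (f->locked) {
		/* Stand-in for a failed folio_trylock(). */
		if (requeue_on_trylock_fail)
			goto requeue;
		goto next;
	}

	/* Lock taken; this model assumes the split then succeeds. */
	did_split = true;
next:
	if (did_split || !f->partially_mapped)
		return f->on_deferred_list;
requeue:
	f->on_deferred_list = true;
	return true;
}
```

With a locked, non-partially-mapped folio, the model drops it from the queue when requeue_on_trylock_fail is false and keeps it when true. A locked, partially mapped folio is requeued either way, which matches the existing comment about only adding partially mapped folios back.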
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 11:00 ` David Hildenbrand (Arm) @ 2026-04-01 11:20 ` Lance Yang 2026-04-01 11:22 ` David Hildenbrand (Arm) 0 siblings, 1 reply; 19+ messages in thread From: Lance Yang @ 2026-04-01 11:20 UTC (permalink / raw) To: david Cc: lance.yang, kartikey406, usama.arif, Liam.Howlett, ziy, syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs On Wed, Apr 01, 2026 at 01:00:13PM +0200, David Hildenbrand (Arm) wrote: >On 4/1/26 12:53, Lance Yang wrote: >> >> +Cc Deepanshu >> >> On Wed, Apr 01, 2026 at 12:16:43PM +0200, David Hildenbrand (Arm) wrote: >>> On 4/1/26 10:59, Lance Yang wrote: >>>> >>>> >from another sharer can then remove some of those mappings and reach >>>> >>>> Perhaps the WARN is simply too strict there :) >>>> >>>> Migration already holds the folio lock on dst, while the competing >>>> rmap-removal path runs under the page-table lock. So once >>>> remove_migration_ptes(src, dst, 0) makes dst visible again, this race >>>> looks hard to avoid. >>>> >>>> So maybe the simplest fix is just to drop the WARN in the >>>> !partially_mapped path: >>>> >>>> ---8<--- >>>> Subject: [PATCH 1/1] mm/thp: avoid false warning in deferred_split_folio() >>>> >>>> From: Lance Yang <lance.yang@linux.dev> >>>> >>>> migrate_folio_move() snapshots src_partially_mapped from src before >>>> migration and later requeues dst after remove_migration_ptes(src, dst, 0). >>>> >>>> Once dst is visible again, a competing rmap-removal path can legally set >>>> PG_partially_mapped before the migration path reaches >>>> deferred_split_folio(dst, src_partially_mapped). >>>> >>>> Migration already holds the folio lock on dst, while the competing >>>> rmap-removal path runs under the page-table lock. So once >>>> remove_migration_ptes(src, dst, 0) makes dst visible again, this race >>>> looks hard to avoid. 
>>>> >>>> So just drop the WARN in the !partially_mapped path and preserve an >>>> already-set PG_partially_mapped bit. >>>> >>>> Link: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@google.com/ >>>> Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue") >>>> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com >>>> Signed-off-by: Lance Yang <lance.yang@linux.dev> >>>> --- >>>> mm/huge_memory.c | 3 --- >>>> 1 file changed, 3 deletions(-) >>>> >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >>>> index 745eb3d0d4a7..8ea8e293dc7c 100644 >>>> --- a/mm/huge_memory.c >>>> +++ b/mm/huge_memory.c >>>> @@ -4433,9 +4433,6 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped) >>>> mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1); >>>> >>>> } >>>> - } else { >>>> - /* partially mapped folios cannot become non-partially mapped */ >>>> - VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio); >>>> } >>> >>> Can't we simply move the setting before restoring migration ptes? >> >> Afraid not, it closes the remove_migration_ptes() -> >> deferred_split_folio() race, but opens a new one with the shrinker, IIUC >> >> Once dst is on the deferred split queue, deferred_split_scan() can >> pick it up immediately. The shrinker unconditionally dequeues every >> folio it visits: >> >> list_del_init(&folio->_deferred_list); /* always */ >> >> Then for a non-partially-mapped folio, if folio_trylock() fails >> (dst is still locked by migration), it falls through to: >> >> next: >> if (did_split || !folio_test_partially_mapped(folio)) >> continue; /* not requeued, dst silently lost */ >> >> so it is *not* requeued. > >How is that different to the shrinker just trying to lock the folio before we >unlock it and failing? The race already exists? 
Ouch, you're right, I was wrong - the trylock drop is a pre-existing issue, not caused by the reorder ;) > >To sort out that race a trylock must not result in the folio getting >discarded. Nice, LGTM! Given that the "trylock -> drop" behavior seems to exist already today, do you think it's worth fixing that together with the reorder? >diff --git a/mm/huge_memory.c b/mm/huge_memory.c >index ff9a42abd1b6..521989517cd1 100644 >--- a/mm/huge_memory.c >+++ b/mm/huge_memory.c >@@ -4558,7 +4558,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink, > goto next; > } > if (!folio_trylock(folio)) >- goto next; >+ goto requeue; > if (!split_folio(folio)) { > did_split = true; > if (underused) >@@ -4569,6 +4569,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink, > next: > if (did_split || !folio_test_partially_mapped(folio)) > continue; >+requeue: > /* > * Only add back to the queue if folio is partially mapped. > * If thp_underused returns false, or if split_folio fails
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 11:20 ` Lance Yang @ 2026-04-01 11:22 ` David Hildenbrand (Arm) 2026-04-01 11:34 ` Lance Yang 0 siblings, 1 reply; 19+ messages in thread From: David Hildenbrand (Arm) @ 2026-04-01 11:22 UTC (permalink / raw) To: Lance Yang Cc: kartikey406, usama.arif, Liam.Howlett, ziy, syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs On 4/1/26 13:20, Lance Yang wrote: > > On Wed, Apr 01, 2026 at 01:00:13PM +0200, David Hildenbrand (Arm) wrote: >> On 4/1/26 12:53, Lance Yang wrote: >>> >>> +Cc Deepanshu >>> >>> >>> Afraid not, it closes the remove_migration_ptes() -> >>> deferred_split_folio() race, but opens a new one with the shrinker, IIUC >>> >>> Once dst is on the deferred split queue, deferred_split_scan() can >>> pick it up immediately. The shrinker unconditionally dequeues every >>> folio it visits: >>> >>> list_del_init(&folio->_deferred_list); /* always */ >>> >>> Then for a non-partially-mapped folio, if folio_trylock() fails >>> (dst is still locked by migration), it falls through to: >>> >>> next: >>> if (did_split || !folio_test_partially_mapped(folio)) >>> continue; /* not requeued, dst silently lost */ >>> >>> so it is *not* requeued. >> >> How is that different to the shrinker just trying to lock the folio before we >> unlock it and failing? The race already exists? > > Ouch, you're right, I was wrong - the trylock drop is a pre-existing > issue, not caused by the reorder ;) > >> >> To sort out that race a trylock must not result in the folio getting >> discarded. > > Nice, LGTM! > > Given that the "trylock -> drop" behavior seems to exist already today, > do you think it's worth fixing that together with the reorder? I'd do it in a single shot if possible. Can you craft something? (cc stable etc) -- Cheers, David ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 11:22 ` David Hildenbrand (Arm) @ 2026-04-01 11:34 ` Lance Yang 2026-04-01 11:38 ` David Hildenbrand (Arm) 0 siblings, 1 reply; 19+ messages in thread From: Lance Yang @ 2026-04-01 11:34 UTC (permalink / raw) To: david Cc: lance.yang, kartikey406, usama.arif, Liam.Howlett, ziy, syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs On Wed, Apr 01, 2026 at 01:22:58PM +0200, David Hildenbrand (Arm) wrote: >On 4/1/26 13:20, Lance Yang wrote: >> >> On Wed, Apr 01, 2026 at 01:00:13PM +0200, David Hildenbrand (Arm) wrote: >>> On 4/1/26 12:53, Lance Yang wrote: >>>> >>>> +Cc Deepanshu >>>> >>>> >>>> Afraid not, it closes the remove_migration_ptes() -> >>>> deferred_split_folio() race, but opens a new one with the shrinker, IIUC >>>> >>>> Once dst is on the deferred split queue, deferred_split_scan() can >>>> pick it up immediately. The shrinker unconditionally dequeues every >>>> folio it visits: >>>> >>>> list_del_init(&folio->_deferred_list); /* always */ >>>> >>>> Then for a non-partially-mapped folio, if folio_trylock() fails >>>> (dst is still locked by migration), it falls through to: >>>> >>>> next: >>>> if (did_split || !folio_test_partially_mapped(folio)) >>>> continue; /* not requeued, dst silently lost */ >>>> >>>> so it is *not* requeued. >>> >>> How is that different to the shrinker just trying to lock the folio before we >>> unlock it and failing? The race already exists? >> >> Ouch, you're right, I was wrong - the trylock drop is a pre-existing >> issue, not caused by the reorder ;) >> >>> >>> To sort out that race a trylock must not result in the folio getting >>> discarded. >> >> Nice, LGTM! >> >> Given that the "trylock -> drop" behavior seems to exist already today, >> do you think it's worth fixing that together with the reorder? > >I'd do it in a single shot if possible. ACK. >Can you craft something? 
(cc stable etc) certainly, will do! But commit[1] ("mm: migrate: requeue destination folio on deferred split queue") is only in mm-stable now, not yet upstream/stable ... [1] https://lore.kernel.org/mm-commits/20260329004127.334D2C4CEF7@smtp.kernel.org/ ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 11:34 ` Lance Yang @ 2026-04-01 11:38 ` David Hildenbrand (Arm) 2026-04-01 11:41 ` Lance Yang 0 siblings, 1 reply; 19+ messages in thread From: David Hildenbrand (Arm) @ 2026-04-01 11:38 UTC (permalink / raw) To: Lance Yang Cc: kartikey406, usama.arif, Liam.Howlett, ziy, syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs On 4/1/26 13:34, Lance Yang wrote: > > On Wed, Apr 01, 2026 at 01:22:58PM +0200, David Hildenbrand (Arm) wrote: >> On 4/1/26 13:20, Lance Yang wrote: >>> >>> >>> Ouch, you're right, I was wrong - the trylock drop is a pre-existing >>> issue, not caused by the reorder ;) >>> >>> >>> Nice, LGTM! >>> >>> Given that the "trylock -> drop" behavior seems to exist already today, >>> do you think it's worth fixing that together with the reorder? >> >> I'd do it in a single shot if possible. > > ACK. > >> Can you craft something? (cc stable etc) > > certainly, will do! But commit[1] ("mm: migrate: requeue destination > folio on deferred split queue") is only in mm-stable now, not yet > upstream/stable ... It's tricky. The original commit will be backported to stable kernels, so we want also the fix to be backported to the same stable kernels. The commit id in mm-stable is "stable" now. -- Cheers, David ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 11:38 ` David Hildenbrand (Arm) @ 2026-04-01 11:41 ` Lance Yang 2026-04-01 11:44 ` David Hildenbrand (Arm) 0 siblings, 1 reply; 19+ messages in thread From: Lance Yang @ 2026-04-01 11:41 UTC (permalink / raw) To: David Hildenbrand (Arm) Cc: kartikey406, usama.arif, Liam.Howlett, ziy, syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs On 2026/4/1 19:38, David Hildenbrand (Arm) wrote: > On 4/1/26 13:34, Lance Yang wrote: >> >> On Wed, Apr 01, 2026 at 01:22:58PM +0200, David Hildenbrand (Arm) wrote: >>> On 4/1/26 13:20, Lance Yang wrote: >>>> >>>> >>>> Ouch, you're right, I was wrong - the trylock drop is a pre-existing >>>> issue, not caused by the reorder ;) >>>> >>>> >>>> Nice, LGTM! >>>> >>>> Given that the "trylock -> drop" behavior seems to exist already today, >>>> do you think it's worth fixing that together with the reorder? >>> >>> I'd do it in a single shot if possible. >> >> ACK. >> >>> Can you craft something? (cc stable etc) >> >> certainly, will do! But commit[1] ("mm: migrate: requeue destination >> folio on deferred split queue") is only in mm-stable now, not yet >> upstream/stable ... > > It's tricky. The original commit will be backported to stable kernels, > so we want also the fix to be backported to the same stable kernels. > > The commit id in mm-stable is "stable" now. Emm... Not a big deal, I guess. We can always submit a stable backport once that fix gets merged into stable :D ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [syzbot] [mm?] WARNING in deferred_split_folio 2026-04-01 11:41 ` Lance Yang @ 2026-04-01 11:44 ` David Hildenbrand (Arm) 2026-04-01 11:51 ` Lance Yang 0 siblings, 1 reply; 19+ messages in thread From: David Hildenbrand (Arm) @ 2026-04-01 11:44 UTC (permalink / raw) To: Lance Yang Cc: kartikey406, usama.arif, Liam.Howlett, ziy, syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs On 4/1/26 13:41, Lance Yang wrote: > > > On 2026/4/1 19:38, David Hildenbrand (Arm) wrote: >> On 4/1/26 13:34, Lance Yang wrote: >>> >>> >>> ACK. >>> >>> >>> certainly, will do! But commit[1] ("mm: migrate: requeue destination >>> folio on deferred split queue") is only in mm-stable now, not yet >>> upstream/stable ... >> >> It's tricky. The original commit will be backported to stable kernels, >> so we want also the fix to be backported to the same stable kernels. >> >> The commit id in mm-stable is "stable" now. > > Emm... Not a big deal, I guess. We can always submit a stable backport > once that fix gets merged into stable :D What's the problem with tagging the commit right away as Cc: stable? -- Cheers, David
* Re: [syzbot] [mm?] WARNING in deferred_split_folio
  2026-04-01 11:44 ` David Hildenbrand (Arm)
@ 2026-04-01 11:51 ` Lance Yang
  2026-04-01 11:54 ` Lance Yang
  0 siblings, 1 reply; 19+ messages in thread
From: Lance Yang @ 2026-04-01 11:51 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: kartikey406, usama.arif, Liam.Howlett, ziy,
     syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain,
     linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs

On 2026/4/1 19:44, David Hildenbrand (Arm) wrote:
> On 4/1/26 13:41, Lance Yang wrote:
>>
>>
>> On 2026/4/1 19:38, David Hildenbrand (Arm) wrote:
>>> On 4/1/26 13:34, Lance Yang wrote:
>>>>
>>>>
>>>> ACK.
>>>>
>>>>
>>>> certainly, will do! But commit[1] ("mm: migrate: requeue destination
>>>> folio on deferred split queue") is only in mm-stable now, not yet
>>>> upstream/stable ...
>>>
>>> It's tricky. The original commit will be backported to stable kernels,
>>> so we want also the fix to be backported to the same stable kernels.
>>>
>>> The commit id in mm-stable is "stable" now.
>>
>> Emm... Not a big deal, I guess. We can always submit a stable backport
>> once that fix gets merged into stable :D
>
> What's the problem with tagging the commit right away as Cc: stable?

Sure, that makes sense. I'll add "Cc: stable" to the fix as well, so
it can follow 8a8ca142a488 into stable ;)

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [syzbot] [mm?] WARNING in deferred_split_folio
  2026-04-01 11:51 ` Lance Yang
@ 2026-04-01 11:54 ` Lance Yang
  0 siblings, 0 replies; 19+ messages in thread
From: Lance Yang @ 2026-04-01 11:54 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: kartikey406, usama.arif, Liam.Howlett, ziy,
     syzbot+a7067a757858ac8eb085, akpm, baohua, baolin.wang, dev.jain,
     linux-kernel, linux-mm, ljs, npache, ryan.roberts, syzkaller-bugs

On 2026/4/1 19:51, Lance Yang wrote:
>
>
> On 2026/4/1 19:44, David Hildenbrand (Arm) wrote:
>> On 4/1/26 13:41, Lance Yang wrote:
>>>
>>>
>>> On 2026/4/1 19:38, David Hildenbrand (Arm) wrote:
>>>> On 4/1/26 13:34, Lance Yang wrote:
>>>>>
>>>>>
>>>>> ACK.
>>>>>
>>>>>
>>>>> certainly, will do! But commit[1] ("mm: migrate: requeue destination
>>>>> folio on deferred split queue") is only in mm-stable now, not yet
>>>>> upstream/stable ...
>>>>
>>>> It's tricky. The original commit will be backported to stable kernels,
>>>> so we want also the fix to be backported to the same stable kernels.
>>>>
>>>> The commit id in mm-stable is "stable" now.
>>>
>>> Emm... Not a big deal, I guess. We can always submit a stable backport
>>> once that fix gets merged into stable :D
>>
>> What's the problem with tagging the commit right away as Cc: stable?
>
> Sure, that makes sense. I'll add "Cc: stable" to the fix as well, so
> it can follow 8a8ca142a488 into stable ;)

8a8ca142a488 refers to commit "mm: migrate: requeue destination folio on
deferred split queue".

^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2] mm/page_owner: warn when stack trace depth hits PAGE_OWNER_STACK_DEPTH limit
@ 2026-03-28 21:44 ` Jiayuan Liang
  2026-03-28 17:28 ` [syzbot ci] " syzbot ci
  0 siblings, 1 reply; 19+ messages in thread
From: Jiayuan Liang @ 2026-03-28 21:44 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, Jiayuan Liang

page_owner silently truncates stack traces deeper than
PAGE_OWNER_STACK_DEPTH (16), which hides root caller information during
memory debugging.

Add a ratelimited warning to notify developers when this truncation
occurs.

Signed-off-by: Jiayuan Liang <ljykernel@163.com>
---
 mm/page_owner.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/page_owner.c b/mm/page_owner.c
index 8178e0be5..962a4f694 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -163,6 +163,9 @@ static noinline depot_stack_handle_t save_stack(gfp_t flags)
 
 	set_current_in_page_owner();
 	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 2);
+	if (nr_entries >= PAGE_OWNER_STACK_DEPTH)
+		pr_warn_ratelimited("page_owner: stack depth %u exceeds limit %u\n",
+				    nr_entries, PAGE_OWNER_STACK_DEPTH);
 	handle = stack_depot_save(entries, nr_entries, flags);
 	if (!handle)
 		handle = failure_handle;
--
2.43.0

^ permalink raw reply [flat|nested] 19+ messages in thread
* [syzbot ci] Re: mm/page_owner: warn when stack trace depth hits PAGE_OWNER_STACK_DEPTH limit 2026-03-28 21:44 ` [PATCH v2] mm/page_owner: warn when stack trace depth hits PAGE_OWNER_STACK_DEPTH limit Jiayuan Liang @ 2026-03-28 17:28 ` syzbot ci 2026-03-28 17:28 ` Request received Yail 0 siblings, 1 reply; 19+ messages in thread From: syzbot ci @ 2026-03-28 17:28 UTC (permalink / raw) To: akpm, linux-mm, ljykernel; +Cc: syzbot, syzkaller-bugs syzbot ci has tested the following series [v2] mm/page_owner: warn when stack trace depth hits PAGE_OWNER_STACK_DEPTH limit https://lore.kernel.org/all/20260328214408.2990597-1-ljykernel@163.com * [PATCH v2] mm/page_owner: warn when stack trace depth hits PAGE_OWNER_STACK_DEPTH limit and found the following issue: possible deadlock in hrtimer_start_range_ns Full report is available here: https://ci.syzbot.org/series/aa38ba8f-44db-4d01-8965-ed5a27174f71 *** possible deadlock in hrtimer_start_range_ns tree: mm-new URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/akpm/mm.git base: f46991f1780ef97efff3b668627b763581032067 arch: amd64 compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8 config: https://ci.syzbot.org/builds/2808cd4a-ad2f-49d0-8b68-eb99aeb06ed6/config syz repro: https://ci.syzbot.org/findings/6c04900a-b825-4465-90f4-694fc3a9d29b/syz_repro page_owner: stack depth 16 exceeds limit 16 ====================================================== WARNING: possible circular locking dependency detected syzkaller #0 Not tainted ------------------------------------------------------ klogd/5246 is trying to acquire lock: ffffffff8e750a00 (console_owner){-.-.}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:2026 [inline] ffffffff8e750a00 (console_owner){-.-.}-{0:0}, at: vprintk_emit+0x2cf/0x560 kernel/printk/printk.c:2478 but task is already holding lock: ffff888121028298 (hrtimer_bases.lock){-.-.}-{2:2}, at: lock_hrtimer_base 
kernel/time/hrtimer.c:172 [inline] ffff888121028298 (hrtimer_bases.lock){-.-.}-{2:2}, at: hrtimer_start_range_ns+0xc8/0x1ff0 kernel/time/hrtimer.c:1328 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (hrtimer_bases.lock){-.-.}-{2:2}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:132 [inline] _raw_spin_lock_irqsave+0x40/0x60 kernel/locking/spinlock.c:162 lock_hrtimer_base kernel/time/hrtimer.c:172 [inline] hrtimer_start_range_ns+0xc8/0x1ff0 kernel/time/hrtimer.c:1328 rpm_suspend+0x12d4/0x1750 drivers/base/power/runtime.c:632 __pm_runtime_idle+0x12f/0x1a0 drivers/base/power/runtime.c:1129 pm_runtime_put include/linux/pm_runtime.h:551 [inline] __device_attach+0x34f/0x450 drivers/base/dd.c:1111 device_initial_probe+0xa1/0xd0 drivers/base/dd.c:1148 bus_probe_device+0x12a/0x220 drivers/base/bus.c:613 device_add+0x7b6/0xb70 drivers/base/core.c:3691 serdev_controller_add+0x85/0x640 drivers/tty/serdev/core.c:775 serdev_tty_port_register+0x159/0x260 drivers/tty/serdev/serdev-ttyport.c:291 tty_port_register_device_attr_serdev+0xe7/0x170 drivers/tty/tty_port.c:187 serial_core_add_one_port drivers/tty/serial/serial_core.c:3109 [inline] serial_core_register_port+0x1123/0x28a0 drivers/tty/serial/serial_core.c:3307 serial8250_register_8250_port+0x1658/0x1fd0 drivers/tty/serial/8250/8250_core.c:822 serial_pnp_probe+0x568/0x7f0 drivers/tty/serial/8250/8250_pnp.c:480 pnp_device_probe+0x30b/0x4c0 drivers/pnp/driver.c:111 call_driver_probe drivers/base/dd.c:-1 [inline] really_probe+0x267/0xaf0 drivers/base/dd.c:721 __driver_probe_device+0x18c/0x320 drivers/base/dd.c:863 driver_probe_device+0x4f/0x240 drivers/base/dd.c:893 __driver_attach+0x34c/0x640 drivers/base/dd.c:1287 bus_for_each_dev+0x23b/0x2c0 drivers/base/bus.c:383 bus_add_driver+0x345/0x670 drivers/base/bus.c:756 driver_register+0x23a/0x320 drivers/base/driver.c:249 serial8250_init+0x8f/0x160 drivers/tty/serial/8250/8250_platform.c:317 
do_one_initcall+0x250/0x8d0 init/main.c:1382 do_initcall_level+0x104/0x190 init/main.c:1444 do_initcalls+0x59/0xa0 init/main.c:1460 kernel_init_freeable+0x2a6/0x3e0 init/main.c:1692 kernel_init+0x1d/0x1d0 init/main.c:1582 ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 -> #2 (&dev->power.lock){-...}-{3:3}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:132 [inline] _raw_spin_lock_irqsave+0x40/0x60 kernel/locking/spinlock.c:162 __pm_runtime_resume+0x10f/0x180 drivers/base/power/runtime.c:1196 pm_runtime_get include/linux/pm_runtime.h:494 [inline] __uart_start+0x171/0x460 drivers/tty/serial/serial_core.c:149 uart_write+0x265/0xa10 drivers/tty/serial/serial_core.c:633 process_output_block drivers/tty/n_tty.c:557 [inline] n_tty_write+0xd84/0x12a0 drivers/tty/n_tty.c:2366 iterate_tty_write drivers/tty/tty_io.c:1006 [inline] file_tty_write+0x559/0xa20 drivers/tty/tty_io.c:1081 new_sync_write fs/read_write.c:595 [inline] vfs_write+0x61d/0xb90 fs/read_write.c:688 ksys_write+0x150/0x270 fs/read_write.c:740 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f -> #1 (&port_lock_key){-.-.}-{3:3}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:132 [inline] _raw_spin_lock_irqsave+0x40/0x60 kernel/locking/spinlock.c:162 uart_port_lock_irqsave include/linux/serial_core.h:717 [inline] serial8250_console_write+0x150/0x1ba0 drivers/tty/serial/8250/8250_port.c:3316 console_emit_next_record kernel/printk/printk.c:3183 [inline] console_flush_one_record kernel/printk/printk.c:3269 [inline] console_flush_all+0x718/0xb20 kernel/printk/printk.c:3343 __console_flush_and_unlock kernel/printk/printk.c:3373 [inline] console_unlock+0xd1/0x1c0 kernel/printk/printk.c:3413 vprintk_emit+0x485/0x560 kernel/printk/printk.c:2479 _printk+0xdd/0x130 kernel/printk/printk.c:2504 register_console+0xbc2/0xfa0 
kernel/printk/printk.c:4208 univ8250_console_init+0x3a/0x70 drivers/tty/serial/8250/8250_core.c:515 console_init+0x10b/0x4d0 kernel/printk/printk.c:4407 start_kernel+0x22b/0x3d0 init/main.c:1147 x86_64_start_reservations+0x24/0x30 arch/x86/kernel/head64.c:310 x86_64_start_kernel+0x143/0x1c0 arch/x86/kernel/head64.c:291 common_startup_64+0x13e/0x147 -> #0 (console_owner){-.-.}-{0:0}: check_prev_add kernel/locking/lockdep.c:3165 [inline] check_prevs_add kernel/locking/lockdep.c:3284 [inline] validate_chain kernel/locking/lockdep.c:3908 [inline] __lock_acquire+0x15a5/0x2cf0 kernel/locking/lockdep.c:5237 lock_acquire+0xf0/0x2e0 kernel/locking/lockdep.c:5868 console_trylock_spinning kernel/printk/printk.c:2026 [inline] vprintk_emit+0x2eb/0x560 kernel/printk/printk.c:2478 _printk+0xdd/0x130 kernel/printk/printk.c:2504 save_stack+0x238/0x2a0 mm/page_owner.c:167 __set_page_owner+0x8d/0x4c0 mm/page_owner.c:344 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x231/0x280 mm/page_alloc.c:1860 prep_new_page mm/page_alloc.c:1868 [inline] get_page_from_freelist+0x24ba/0x2540 mm/page_alloc.c:3948 alloc_frozen_pages_nolock_noprof+0xac/0x140 mm/page_alloc.c:7795 alloc_slab_page mm/slub.c:3287 [inline] allocate_slab+0xf2/0x660 mm/slub.c:3481 new_slab mm/slub.c:3539 [inline] ___slab_alloc+0x150/0x6b0 mm/slub.c:4413 __slab_alloc_node mm/slub.c:4479 [inline] slab_alloc_node mm/slub.c:4854 [inline] kmem_cache_alloc_noprof+0x12d/0x650 mm/slub.c:4873 kmem_alloc_batch lib/debugobjects.c:371 [inline] fill_pool+0x156/0x590 lib/debugobjects.c:420 debug_objects_fill_pool lib/debugobjects.c:742 [inline] debug_object_activate+0x4a3/0x580 lib/debugobjects.c:831 debug_hrtimer_activate kernel/time/hrtimer.c:446 [inline] debug_activate kernel/time/hrtimer.c:485 [inline] enqueue_hrtimer+0x30/0x3c0 kernel/time/hrtimer.c:1089 __hrtimer_start_range_ns kernel/time/hrtimer.c:1271 [inline] hrtimer_start_range_ns+0x15ea/0x1ff0 kernel/time/hrtimer.c:1330 hrtimer_start 
include/linux/hrtimer.h:244 [inline] dummy_timer+0x4436/0x45d0 drivers/usb/gadget/udc/dummy_hcd.c:2008 __run_hrtimer kernel/time/hrtimer.c:1785 [inline] __hrtimer_run_queues+0x53a/0xcc0 kernel/time/hrtimer.c:1849 hrtimer_run_softirq+0x182/0x5a0 kernel/time/hrtimer.c:1866 handle_softirqs+0x22a/0x870 kernel/softirq.c:622 __do_softirq kernel/softirq.c:656 [inline] invoke_softirq kernel/softirq.c:496 [inline] __irq_exit_rcu+0x5f/0x150 kernel/softirq.c:723 irq_exit_rcu+0x9/0x30 kernel/softirq.c:739 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1056 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1056 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:697 check_kcov_mode kernel/kcov.c:183 [inline] __sanitizer_cov_trace_pc+0x30/0x70 kernel/kcov.c:217 number+0xc0f/0xf80 lib/vsprintf.c:571 vsnprintf+0x8e5/0xee0 lib/vsprintf.c:2912 sprintf+0xe7/0x140 lib/vsprintf.c:3111 print_time kernel/printk/printk.c:1359 [inline] info_print_prefix+0x16b/0x360 kernel/printk/printk.c:1385 record_print_text+0x176/0x450 kernel/printk/printk.c:1434 syslog_print+0x3b0/0x610 kernel/printk/printk.c:1645 do_syslog+0x583/0x7d0 kernel/printk/printk.c:1763 __do_sys_syslog kernel/printk/printk.c:1855 [inline] __se_sys_syslog kernel/printk/printk.c:1853 [inline] __x64_sys_syslog+0x7c/0x90 kernel/printk/printk.c:1853 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f other info that might help us debug this: Chain exists of: console_owner --> &dev->power.lock --> hrtimer_bases.lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(hrtimer_bases.lock); lock(&dev->power.lock); lock(hrtimer_bases.lock); lock(console_owner); *** DEADLOCK *** 4 locks held by klogd/5246: #0: ffffffff8e750948 (syslog_lock){+.+.}-{4:4}, at: syslog_print+0x4ad/0x610 kernel/printk/printk.c:1663 #1: ffff88816eebc018 (&dum_hcd->dum->lock){..-.}-{3:3}, at: 
dummy_timer+0x151/0x45d0 drivers/usb/gadget/udc/dummy_hcd.c:1821 #2: ffff888121028298 (hrtimer_bases.lock){-.-.}-{2:2}, at: lock_hrtimer_base kernel/time/hrtimer.c:172 [inline] #2: ffff888121028298 (hrtimer_bases.lock){-.-.}-{2:2}, at: hrtimer_start_range_ns+0xc8/0x1ff0 kernel/time/hrtimer.c:1328 #3: ffffffff8ef08700 (fill_pool_map-wait-type-override){+.+.}-{3:3}, at: debug_objects_fill_pool lib/debugobjects.c:741 [inline] #3: ffffffff8ef08700 (fill_pool_map-wait-type-override){+.+.}-{3:3}, at: debug_object_activate+0x47a/0x580 lib/debugobjects.c:831 stack backtrace: CPU: 0 UID: 0 PID: 5246 Comm: klogd Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 Call Trace: <IRQ> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120 print_circular_bug+0x2e1/0x300 kernel/locking/lockdep.c:2043 check_noncircular+0x12e/0x150 kernel/locking/lockdep.c:2175 check_prev_add kernel/locking/lockdep.c:3165 [inline] check_prevs_add kernel/locking/lockdep.c:3284 [inline] validate_chain kernel/locking/lockdep.c:3908 [inline] __lock_acquire+0x15a5/0x2cf0 kernel/locking/lockdep.c:5237 lock_acquire+0xf0/0x2e0 kernel/locking/lockdep.c:5868 console_trylock_spinning kernel/printk/printk.c:2026 [inline] vprintk_emit+0x2eb/0x560 kernel/printk/printk.c:2478 _printk+0xdd/0x130 kernel/printk/printk.c:2504 save_stack+0x238/0x2a0 mm/page_owner.c:167 __set_page_owner+0x8d/0x4c0 mm/page_owner.c:344 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x231/0x280 mm/page_alloc.c:1860 prep_new_page mm/page_alloc.c:1868 [inline] get_page_from_freelist+0x24ba/0x2540 mm/page_alloc.c:3948 alloc_frozen_pages_nolock_noprof+0xac/0x140 mm/page_alloc.c:7795 alloc_slab_page mm/slub.c:3287 [inline] allocate_slab+0xf2/0x660 mm/slub.c:3481 new_slab mm/slub.c:3539 [inline] ___slab_alloc+0x150/0x6b0 mm/slub.c:4413 __slab_alloc_node mm/slub.c:4479 [inline] slab_alloc_node mm/slub.c:4854 [inline] kmem_cache_alloc_noprof+0x12d/0x650 
mm/slub.c:4873 kmem_alloc_batch lib/debugobjects.c:371 [inline] fill_pool+0x156/0x590 lib/debugobjects.c:420 debug_objects_fill_pool lib/debugobjects.c:742 [inline] debug_object_activate+0x4a3/0x580 lib/debugobjects.c:831 debug_hrtimer_activate kernel/time/hrtimer.c:446 [inline] debug_activate kernel/time/hrtimer.c:485 [inline] enqueue_hrtimer+0x30/0x3c0 kernel/time/hrtimer.c:1089 __hrtimer_start_range_ns kernel/time/hrtimer.c:1271 [inline] hrtimer_start_range_ns+0x15ea/0x1ff0 kernel/time/hrtimer.c:1330 hrtimer_start include/linux/hrtimer.h:244 [inline] dummy_timer+0x4436/0x45d0 drivers/usb/gadget/udc/dummy_hcd.c:2008 __run_hrtimer kernel/time/hrtimer.c:1785 [inline] __hrtimer_run_queues+0x53a/0xcc0 kernel/time/hrtimer.c:1849 hrtimer_run_softirq+0x182/0x5a0 kernel/time/hrtimer.c:1866 handle_softirqs+0x22a/0x870 kernel/softirq.c:622 __do_softirq kernel/softirq.c:656 [inline] invoke_softirq kernel/softirq.c:496 [inline] __irq_exit_rcu+0x5f/0x150 kernel/softirq.c:723 irq_exit_rcu+0x9/0x30 kernel/softirq.c:739 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1056 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1056 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:697 RIP: 0010:check_kcov_mode kernel/kcov.c:185 [inline] RIP: 0010:__sanitizer_cov_trace_pc+0x30/0x70 kernel/kcov.c:217 Code: 04 24 65 48 8b 0d 08 4b 56 11 65 8b 15 29 4b 56 11 81 e2 00 01 ff 00 74 11 81 fa 00 01 00 00 75 35 83 b9 a4 16 00 00 00 74 2c <8b> 91 80 16 00 00 83 fa 02 75 21 48 8b 91 88 16 00 00 48 8b 32 48 RSP: 0018:ffffc900036676b8 EFLAGS: 00000246 RAX: ffffffff8ba85ecf RBX: 0000000000000031 RCX: ffff888111ff0000 RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffffc90003667742 RBP: ffffc900036677e0 R08: ffffc90003667757 R09: 0000000000000000 R10: ffffc90003667740 R11: fffff520006cceeb R12: ffffc90003667acd R13: dffffc0000000000 R14: ffffc90003667ace R15: 0000000000000002 number+0xc0f/0xf80 lib/vsprintf.c:571 
vsnprintf+0x8e5/0xee0 lib/vsprintf.c:2912 sprintf+0xe7/0x140 lib/vsprintf.c:3111 print_time kernel/printk/printk.c:1359 [inline] info_print_prefix+0x16b/0x360 kernel/printk/printk.c:1385 record_print_text+0x176/0x450 kernel/printk/printk.c:1434 syslog_print+0x3b0/0x610 kernel/printk/printk.c:1645 do_syslog+0x583/0x7d0 kernel/printk/printk.c:1763 __do_sys_syslog kernel/printk/printk.c:1855 [inline] __se_sys_syslog kernel/printk/printk.c:1853 [inline] __x64_sys_syslog+0x7c/0x90 kernel/printk/printk.c:1853 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f1089379fa7 Code: 73 01 c3 48 8b 0d 81 ce 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 67 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 ce 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffed757a078 EFLAGS: 00000206 ORIG_RAX: 0000000000000067 RAX: ffffffffffffffda RBX: 00007f10895184a0 RCX: 00007f1089379fa7 RDX: 00000000000003ff RSI: 00007f10895184a0 RDI: 0000000000000002 RBP: 0000000000000000 R08: 0000000000000007 R09: 7e78de0d48eb7984 R10: 0000000000004000 R11: 0000000000000206 R12: 00007f10895184a0 R13: 00007f1089508212 R14: 00007f108951855a R15: 00007f108951855a </TASK> ---------------- Code disassembly (best guess): 0: 04 24 add $0x24,%al 2: 65 48 8b 0d 08 4b 56 mov %gs:0x11564b08(%rip),%rcx # 0x11564b12 9: 11 a: 65 8b 15 29 4b 56 11 mov %gs:0x11564b29(%rip),%edx # 0x11564b3a 11: 81 e2 00 01 ff 00 and $0xff0100,%edx 17: 74 11 je 0x2a 19: 81 fa 00 01 00 00 cmp $0x100,%edx 1f: 75 35 jne 0x56 21: 83 b9 a4 16 00 00 00 cmpl $0x0,0x16a4(%rcx) 28: 74 2c je 0x56 * 2a: 8b 91 80 16 00 00 mov 0x1680(%rcx),%edx <-- trapping instruction 30: 83 fa 02 cmp $0x2,%edx 33: 75 21 jne 0x56 35: 48 8b 91 88 16 00 00 mov 0x1688(%rcx),%rdx 3c: 48 8b 32 mov (%rdx),%rsi 3f: 48 rex.W *** If these findings have caused you to resend the series or submit a separate fix, please add the 
following tag to your commit message: Tested-by: syzbot@syzkaller.appspotmail.com --- This report is generated by a bot. It may contain errors. syzbot ci engineers can be reached at syzkaller@googlegroups.com. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [syzbot] [mm?] [cgroups?] WARNING in page_counter_uncharge (2) @ 2026-03-28 5:14 ` syzbot 2026-03-28 5:16 ` Request received Yail 0 siblings, 1 reply; 19+ messages in thread From: syzbot @ 2026-03-28 5:14 UTC (permalink / raw) To: akpm, cgroups, hannes, linux-kernel, linux-mm, mhocko, muchun.song, netdev, roman.gushchin, shakeel.butt, syzkaller-bugs Hello, syzbot found the following issue on: HEAD commit: 5597dd284ff8 net: ti: icssg-prueth: fix missing data copy .. git tree: net console output: https://syzkaller.appspot.com/x/log.txt?x=17f536da580000 kernel config: https://syzkaller.appspot.com/x/.config?x=6754c86e8d9e4c91 dashboard link: https://syzkaller.appspot.com/bug?extid=226c1f947186f8fef796 compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=131baeda580000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=167d6f72580000 Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/b6c0ef6a1be9/disk-5597dd28.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/38b971059ff5/vmlinux-5597dd28.xz kernel image: https://storage.googleapis.com/syzbot-assets/55dd4bd79e77/bzImage-5597dd28.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+226c1f947186f8fef796@syzkaller.appspotmail.com ------------[ cut here ]------------ page_counter underflow: -512 nr_pages=512 WARNING: mm/page_counter.c:61 at page_counter_cancel mm/page_counter.c:60 [inline], CPU#1: syz.0.3396/16434 WARNING: mm/page_counter.c:61 at page_counter_uncharge+0xd2/0x150 mm/page_counter.c:184, CPU#1: syz.0.3396/16434 Modules linked in: CPU: 1 UID: 0 PID: 16434 Comm: syz.0.3396 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026 RIP: 0010:page_counter_cancel mm/page_counter.c:60 [inline] RIP: 0010:page_counter_uncharge+0xd8/0x150 
mm/page_counter.c:184 Code: f7 e8 7c 88 f8 ff 4d 8b 36 4d 85 f6 74 6e e8 cf 3f 8e ff e9 6c ff ff ff e8 c5 3f 8e ff 48 8d 3d 2e ea df 0d 4c 89 fe 48 89 da <67> 48 0f b9 3a 4c 89 f7 be 08 00 00 00 e8 e6 8a f8 ff 4c 89 f0 48 RSP: 0018:ffffc9000da772b0 EFLAGS: 00010093 RAX: ffffffff82376eab RBX: 0000000000000200 RCX: ffff88807c631e80 RDX: 0000000000000200 RSI: fffffffffffffe00 RDI: ffffffff901758e0 RBP: fffffffffffffe00 R08: ffff888032fd5387 R09: 1ffff110065faa70 R10: dffffc0000000000 R11: ffffed10065faa71 R12: 0000000000000001 R13: dffffc0000000000 R14: ffff888032fd5380 R15: fffffffffffffe00 FS: 0000000000000000(0000) GS:ffff88812555a000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f14ec9d5ff8 CR3: 000000000e54c000 CR4: 00000000003526f0 Call Trace: <TASK> __hugetlb_cgroup_uncharge_folio+0x15e/0x510 mm/hugetlb_cgroup.c:354 free_huge_folio+0xaef/0x11e0 mm/hugetlb.c:1782 folios_put_refs+0x553/0x8d0 mm/swap.c:983 folio_batch_release include/linux/pagevec.h:101 [inline] remove_inode_hugepages+0xf50/0x11a0 fs/hugetlbfs/inode.c:608 hugetlbfs_evict_inode+0xaf/0x260 fs/hugetlbfs/inode.c:623 evict+0x61e/0xb10 fs/inode.c:846 __dentry_kill+0x1a2/0x5e0 fs/dcache.c:670 finish_dput+0xc9/0x480 fs/dcache.c:879 __fput+0x691/0xa70 fs/file_table.c:477 task_work_run+0x1d9/0x270 kernel/task_work.c:233 exit_task_work include/linux/task_work.h:40 [inline] do_exit+0x70f/0x23c0 kernel/exit.c:976 do_group_exit+0x21b/0x2d0 kernel/exit.c:1118 get_signal+0x1284/0x1330 kernel/signal.c:3034 arch_do_signal_or_restart+0xbc/0x830 arch/x86/kernel/signal.c:337 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline] exit_to_user_mode_loop+0x86/0x480 kernel/entry/common.c:98 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline] syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline] syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline] do_syscall_64+0x32d/0xf80 arch/x86/entry/syscall_64.c:100 
entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f14ebb9c799 Code: Unable to access opcode bytes at 0x7f14ebb9c76f. RSP: 002b:00007f14ec9d60e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca RAX: fffffffffffffe00 RBX: 00007f14ebe16098 RCX: 00007f14ebb9c799 RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f14ebe16098 RBP: 00007f14ebe16090 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f14ebe16128 R14: 00007ffc404c7910 R15: 00007ffc404c79f8 </TASK> ---------------- Code disassembly (best guess): 0: f7 e8 imul %eax 2: 7c 88 jl 0xffffff8c 4: f8 clc 5: ff 4d 8b decl -0x75(%rbp) 8: 36 4d 85 f6 ss test %r14,%r14 c: 74 6e je 0x7c e: e8 cf 3f 8e ff call 0xff8e3fe2 13: e9 6c ff ff ff jmp 0xffffff84 18: e8 c5 3f 8e ff call 0xff8e3fe2 1d: 48 8d 3d 2e ea df 0d lea 0xddfea2e(%rip),%rdi # 0xddfea52 24: 4c 89 fe mov %r15,%rsi 27: 48 89 da mov %rbx,%rdx * 2a: 67 48 0f b9 3a ud1 (%edx),%rdi <-- trapping instruction 2f: 4c 89 f7 mov %r14,%rdi 32: be 08 00 00 00 mov $0x8,%esi 37: e8 e6 8a f8 ff call 0xfff88b22 3c: 4c 89 f0 mov %r14,%rax 3f: 48 rex.W --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. If the report is already addressed, let syzbot know by replying with: #syz fix: exact-commit-title If you want syzbot to run the reproducer, reply with: #syz test: git://repo/address.git branch-or-commit-hash If you attach or paste a git patch, syzbot will apply it before testing. 
If you want to overwrite report's subsystems, reply with: #syz set subsystems: new-subsystem (See the list of subsystem names on the web dashboard) If the report is a duplicate of another one, reply with: #syz dup: exact-subject-of-another-report If you want to undo deduplication, reply with: #syz undup ^ permalink raw reply [flat|nested] 19+ messages in thread
* [BUG] mm/vma.c:830 WARNING in vma_modify() via mseal(2) -- deterministic trigger without fault injection on Linux 7.0-rc5
@ 2026-03-27 7:46 ` antonius
  2026-03-27 8:59 ` Request received Yail
  0 siblings, 1 reply; 19+ messages in thread
From: antonius @ 2026-03-27 7:46 UTC (permalink / raw)
  To: linux-mm
  Cc: lorenzo.stoakes, liam.howlett, jeffxu, akpm, linux-kernel,
     syzkaller-bugs

[-- Attachment #1.1: Type: text/plain, Size: 7732 bytes --]

Hello,

I am reporting a reproducible WARNING in vma_modify() at mm/vma.c:830,
triggered via the mseal(2) syscall on Linux 7.0.0-rc5. The bug was
discovered using Syzkaller-based fuzzing.

REPORTER
--------
Antonius / Blue Dragon Security
https://bluedragonsec.com
https://github.com/bluedragonsecurity

NOTE ON RELATIONSHIP TO KNOWN BUGS
-----------------------------------
The VM_WARN_ON_VMG at mm/vma.c:830 inside vma_merge_existing_range() has
been previously encountered via madvise()+OOM conditions (reported by
syzbot+46423ed8fa1f1148c6e4 and Brad Spengler; addressed by Lorenzo's
patch "mm: abort vma_modify() on merge out of memory failure").

This report describes a DISTINCT trigger via mseal(2) that:

  1. Does NOT require fault injection or OOM pressure
  2. Is 100% reproducible on every run (fires within 1 second)
  3. Goes through a different call path: do_mseal() -> mseal_apply()
     rather than madvise_walk_vmas()
  4. Is triggered by VM_SEALED flag state inconsistency across VMAs,
     not by a failed merge commit

I could not find a prior LKML report or syzbot entry for this specific
mseal(2) trigger.

SUMMARY
-------
File:    mm/vma.c, line 830
Func:    vma_merge_existing_range()
Trigger: mseal() spanning two adjacent VMAs where the first has
         VM_SEALED set and the second does not
Via:     mseal(2) -> do_mseal() -> mseal_apply() -> vma_modify_flags()
         -> vma_modify() -> vma_merge_existing_range() -> VM_WARN_ON_VMG

AFFECTED VERSIONS
-----------------
Linux 7.0-rc3 -- confirmed (original fuzzing target)
Linux 7.0-rc4 -- confirmed (mm/vma.c unchanged rc3->rc4)
Linux 7.0-rc5 -- confirmed (mm/vma.c unchanged rc4->rc5)
Linux 6.x     -- NOT affected (mm/vma.c rewritten for 7.0)

DMESG OUTPUT (Linux 7.0.0-rc5, trimmed)
----------------------------------------
[ 1680.275764] ------------[ cut here ]------------
[ 1680.275771] WARNING: mm/vma.c:830 at vma_modify+0x35b/0x2190
[ 1680.275808] CPU: 0 UID: 1000 PID: 1661 Comm: repro_mseal_vma
[ 1680.275826] Tainted: [W]=WARN 7.0.0-rc5 #1 PREEMPT(lazy)
[ 1680.275969] Call Trace:
[ 1680.275975]  <TASK>
[ 1680.276030]  vma_modify_flags+0x24c/0x3c0
[ 1680.276085]  do_mseal+0x489/0x860
[ 1680.276136]  __x64_sys_mseal+0x73/0xb0
[ 1680.276187]  do_syscall_64+0x111/0x690
[ 1680.276207]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 1680.276394] ---[ end trace 0000000000000000 ]---
[ 1680.314910] vmg dumped because: VM_WARN_ON_VMG(middle &&
               ((middle != prev && vmg->start != middle->vm_start) ||
                vmg->end > middle->vm_end))

vmg state:
  vmi    [21de6000, 21e83000)
  prev   [21da6000-21de6000) flags: 0x400000000f8 (VM_SEALED set)
  middle [21de6000-21e83000) flags: 0xf8 (NOT sealed)
  vmg->start = 0x21da8000
  vmg->end   = 0x21e16000

ROOT CAUSE
----------
The bug is in vma_merge_existing_range() at mm/vma.c:830.

Reproduction sequence:

  1. memfd_create("syz-mseal", MFD_CLOEXEC) -> fd1
  2. mmap(0x21da8000, 0xdd000, PROT_SEM, MAP_SHARED|MAP_FIXED, fd1, 0)
     -> establishes VMA at [0x21da8000 .. 0x21e85000)
  3. memfd_create("syz-mseal", MFD_CLOEXEC) -> fd2
  4. mmap(0x21da6000, 0xdd000, PROT_SEM, MAP_SHARED|MAP_FIXED, fd2, 0)
     -> remaps, leaving:
        VMA-A [0x21da6000 - 0x21de6000) pgoff=0 (fd2)
        VMA-B [0x21de6000 - 0x21e83000) pgoff=0x40 (fd2)
        VMA-C [0x21e83000 - 0x21e85000) (leftover)
  5. mseal(mmap1_result, 0x3e000, 0)
     -> seals [0x21da8000 .. 0x21de5fff]
     -> VMA-A gets VM_SEALED (0x400000000000) set
  6. mseal(mmap2_result, 0x70000, 0)
     -> targets [0x21da6000 .. 0x21e15fff]
     -> range spans VMA-A (sealed) and VMA-B (not sealed)

In step 6, do_mseal() calls mseal_apply() per-VMA but ultimately calls
vma_modify_flags() with the original full mseal start address
(0x21da8000). When vma_merge_existing_range() processes VMA-B as
"middle":

  vmg->start       = 0x21da8000 (original mseal start)
  middle->vm_start = 0x21de6000 (VMA-B start)
  middle != prev   (different VMA objects)
  -> vmg->start != middle->vm_start
  -> WARN_ON fires at line 830

The invariant violation occurs because the vmg->start passed to
vma_modify_flags() is not clamped to the current VMA's start when the
mseal range spans multiple VMAs with different VM_SEALED states.

IMPACT
------
- Reachable from unprivileged userspace (UID 1000, no capabilities)
- Only memfd_create(2), mmap(2), mseal(2) required
- The WARN_ON indicates that vma_merge_existing_range() operates on an
  inconsistent vmg state; in production kernels with WARN compiled to
  no-op, this could result in VMA tree state inconsistency
- mseal is a security primitive; invariant violations in its
  application logic are security-relevant

SUGGESTED FIX DIRECTION
------------------------
In do_mseal() or mseal_apply() (mm/mseal.c), when iterating over VMAs in
the mseal range, the vmg->start passed to vma_modify_flags() should be
clamped to max(mseal_start, vma->vm_start) rather than using the
original mseal() start address. This would prevent
vma_merge_existing_range() from receiving a vmg->start that is
inconsistent with vmg->middle when the mseal range spans multiple VMAs
with different seal states.
Alternatively, the WARN_ON in vma_merge_existing_range() may need to account for the mseal multi-VMA iteration pattern, though fixing the caller in do_mseal() seems more appropriate. REPRODUCER ---------- Compile: gcc -O2 -o repro repro_mseal_vma.c && ./repro Fires: Within 1 second, iteration 0, no fault injection, no root #define _GNU_SOURCE #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/syscall.h> #include <sys/wait.h> #include <unistd.h> #ifndef __NR_memfd_create #define __NR_memfd_create 319 #endif #ifndef __NR_mseal #define __NR_mseal 462 #endif static void setup(void) { syscall(__NR_mmap, 0x1ffffffff000UL, 0x1000UL, 0UL, 0x32UL, -1, 0UL); syscall(__NR_mmap, 0x200000000000UL, 0x1000000UL, 7UL, 0x32UL, -1, 0UL); syscall(__NR_mmap, 0x200001000000UL, 0x1000UL, 0UL, 0x32UL, -1, 0UL); } static void trigger(void) { intptr_t fd1, fd2, m1, m2; memcpy((void*)0x200000000100UL, "syz-mseal\0", 10); fd1 = syscall(__NR_memfd_create, 0x200000000100UL, 1UL); if (fd1 < 0) return; m1 = syscall(__NR_mmap, 0x21da8000UL, 0xdd000UL, 8UL, 0x11UL, (intptr_t)fd1, 0UL); memcpy((void*)0x200000000100UL, "syz-mseal\0", 10); fd2 = syscall(__NR_memfd_create, 0x200000000100UL, 1UL); if (fd2 < 0) return; m2 = syscall(__NR_mmap, 0x21da6000UL, 0xdd000UL, 8UL, 0x11UL, (intptr_t)fd2, 0UL); syscall(__NR_mseal, (uint64_t)m1, 0x3e000UL, 0UL); syscall(__NR_mseal, (uint64_t)m2, 0x70000UL, 0UL); } int main(void) { setup(); for (int i = 0;; i++) { int pid = fork(); if (pid == 0) { trigger(); _exit(0); } int st; waitpid(pid, &st, 0); fprintf(stderr, "[iter %d]\n", i); } } VERIFICATION ------------ Kernel: Linux 7.0.0-rc5 #1 SMP PREEMPT_DYNAMIC x86_64 HW: QEMU Standard PC (i440FX + PIIX), BIOS 1.17.0-debian User: UID 1000 (no root required) Fires: Iteration 0, consistently, < 1 second mm/vma.c: Not patched in rc3->rc4 or rc4->rc5 --- Reported-by: Antonius <antonius@bluedragonsec.com> Please use this tag in the fix commit: Reported-by: Antonius 
<antonius@bluedragonsec.com> --- If this is a known issue or already fixed, please point me to the relevant commit. I was unable to find a matching LKML/syzbot entry for this specific mseal(2) trigger path. Thank you, Antonius Blue Dragon Security https://bluedragonsec.com https://github.com/bluedragonsecurity [-- Attachment #1.2: Type: text/html, Size: 9230 bytes --] [-- Attachment #2: repro_mseal_vma.c --] [-- Type: text/x-csrc, Size: 6231 bytes --] // SPDX-License-Identifier: GPL-2.0 /* * Reproducer: WARNING in vma_modify() at mm/vma.c:830 * * Trigger: mseal(2) spanning two adjacent VMAs where the first * has been partially sealed (VM_SEALED set), the second * has not. vma_merge_existing_range() fires WARN_ON because * vmg->start != middle->vm_start with middle != prev. * * Affected: Linux 7.0-rc3, 7.0-rc4, 7.0-rc5 (confirmed) * mm/vma.c untouched in rc3->rc4 and rc4->rc5 patches. * Not present in Linux 6.x (mm/vma.c rewritten for 7.0). * * Note: The same WARN at mm/vma.c:830 is known to trigger via * madvise()+OOM (syzbot+46423ed8fa1f1148c6e4). This * reproducer demonstrates a DISTINCT trigger via mseal(2) * that requires NO fault injection and fires deterministically. 
* * Reporter: Antonius / Blue Dragon Security * https://bluedragonsec.com * https://github.com/bluedragonsecurity * * Compile: gcc -O2 -o repro_mseal_vma repro_mseal_vma.c * Run: ./repro_mseal_vma * Verify: dmesg | grep 'WARNING.*vma\.c:830' * (fires within iteration 0, < 1 second, no root needed) * * Call path: * mseal(2) * -> do_mseal() [mm/mseal.c] * -> mseal_apply() * -> vma_modify_flags() [mm/vma.c] * -> vma_modify() * -> vma_merge_existing_range() * -> VM_WARN_ON_VMG at line 830 <-- fires here * * Condition that triggers WARN: * VM_WARN_ON_VMG(middle && * ((middle != prev && vmg->start != middle->vm_start) || * vmg->end > middle->vm_end)) * * vmg->start = 0x21da8000 (from first mseal context) * middle->vm_start = 0x21de6000 (VMA-B, not sealed) * -> vmg->start != middle->vm_start -> WARN fires */ #define _GNU_SOURCE #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/syscall.h> #include <sys/wait.h> #include <unistd.h> #ifndef __NR_memfd_create #define __NR_memfd_create 319 #endif #ifndef __NR_mseal #define __NR_mseal 462 #endif /* --------------------------------------------------------------- * Fixed workspace layout (syzbot-style) * These three mmaps establish a predictable address space so that * the trigger addresses 0x21daXXXX fall within mapped memory. * --------------------------------------------------------------- */ static void setup_workspace(void) { syscall(__NR_mmap, (uint64_t)0x1ffffffff000UL, (uint64_t)0x1000UL, (uint64_t)0UL, (uint64_t)0x32UL, /* MAP_FIXED|MAP_ANON|MAP_PRIVATE */ (intptr_t)-1, (uint64_t)0UL); syscall(__NR_mmap, (uint64_t)0x200000000000UL, (uint64_t)0x1000000UL, (uint64_t)7UL, /* PROT_READ|WRITE|EXEC */ (uint64_t)0x32UL, (intptr_t)-1, (uint64_t)0UL); syscall(__NR_mmap, (uint64_t)0x200001000000UL, (uint64_t)0x1000UL, (uint64_t)0UL, (uint64_t)0x32UL, (intptr_t)-1, (uint64_t)0UL); } /* --------------------------------------------------------------- * Core trigger. 
* * After the two mmaps + first mseal, memory layout is: * * [0x21da6000 - 0x21de5fff] VMA-A (fd2, MAP_SHARED|MAP_FIXED) * ^-- first mseal() sets VM_SEALED here * [0x21de6000 - 0x21e82fff] VMA-B (fd2, MAP_SHARED|MAP_FIXED) * ^-- NOT sealed when second mseal fires * [0x21e83000 - 0x21e84fff] VMA-C (leftover) * * Second mseal(mmap2_result, 0x70000) targets [0x21da6000-0x21e15fff], * spanning VMA-A (sealed) into VMA-B (not sealed). * * Inside do_mseal() -> mseal_apply() -> vma_modify_flags(): * The call passes the original full mseal start (0x21da8000 from the * first mseal context) as vmg->start. When vma_merge_existing_range() * is invoked for VMA-B (middle=[0x21de6000..]): * * vmg->start (0x21da8000) != middle->vm_start (0x21de6000) * AND middle != prev * -> VM_WARN_ON_VMG fires at mm/vma.c:830 * --------------------------------------------------------------- */ static void trigger(void) { intptr_t fd1, fd2, m1, m2; /* workspace string for memfd names */ memcpy((void *)0x200000000100UL, "syz-mseal\0", 10); /* fd1: first memfd, mapped at 0x21da8000 */ fd1 = syscall(__NR_memfd_create, (uint64_t)0x200000000100UL, (uint64_t)1UL); if (fd1 < 0) return; m1 = syscall(__NR_mmap, (uint64_t)0x21da8000UL, (uint64_t)0xdd000UL, (uint64_t)8UL, /* PROT_SEM */ (uint64_t)0x11UL, /* MAP_SHARED | MAP_FIXED */ (intptr_t)fd1, (uint64_t)0UL); /* fd2: second memfd, mapped at 0x21da6000 (overlaps m1 at start) */ memcpy((void *)0x200000000100UL, "syz-mseal\0", 10); fd2 = syscall(__NR_memfd_create, (uint64_t)0x200000000100UL, (uint64_t)1UL); if (fd2 < 0) return; m2 = syscall(__NR_mmap, (uint64_t)0x21da6000UL, (uint64_t)0xdd000UL, (uint64_t)8UL, (uint64_t)0x11UL, (intptr_t)fd2, (uint64_t)0UL); /* * Step 1: Partial seal on m1 range. * Seals [0x21da8000 .. 0x21de5fff] -- a subset of VMA-A. * Sets VM_SEALED (0x400000000000) on VMA-A. */ syscall(__NR_mseal, (uint64_t)m1, (uint64_t)0x3e000UL, (uint64_t)0UL); /* * Step 2: Seal spanning VMA-A (sealed) + VMA-B (not sealed). * Range [0x21da6000 .. 
0x21e15fff]. * -> vma_merge_existing_range() WARN fires. */ syscall(__NR_mseal, (uint64_t)m2, (uint64_t)0x70000UL, (uint64_t)0UL); } int main(void) { fprintf(stderr, "============================================\n" "repro_mseal_vma -- mm/vma.c:830 reproducer\n" "Reporter: Antonius / Blue Dragon Security\n" " https://bluedragonsec.com\n" " https://github.com/bluedragonsecurity" "============================================\n" "Monitor: dmesg | grep 'WARNING.*vma\\.c:830'\n\n"); setup_workspace(); for (int iter = 0;; iter++) { pid_t pid = fork(); if (pid < 0) { perror("fork"); return 1; } if (pid == 0) { trigger(); _exit(0); } int st; waitpid(pid, &st, 0); fprintf(stderr, "[iter %d]\n", iter); if (iter % 5 == 0) system("dmesg 2>/dev/null | grep -c 'WARNING.*vma\\.c:830' " "| xargs -I{} sh -c " "'[ {} -gt 0 ] && " "echo \"[+] WARNING triggered {} times total\"'"); } return 0; } [-- Attachment #3: dmesg_linux_kernel_7_rc5.png --] [-- Type: image/png, Size: 234998 bytes --] ^ permalink raw reply [flat|nested] 19+ messages in thread
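The warning condition quoted in the report, and the clamping idea from its suggested fix direction, can be checked against the dumped vmg values in a small userspace model. This is only a sketch under stated assumptions: struct model_vma, struct model_vmg, warn_condition() and clamp_to_vma() are hypothetical names invented for illustration, not kernel types or APIs, and the model says nothing about how do_mseal() actually builds its arguments.

```c
/*
 * Userspace model only. The structs below are hypothetical stand-ins,
 * not the kernel's struct vm_area_struct or struct vma_merge_struct.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct model_vma {
    uint64_t vm_start; /* inclusive */
    uint64_t vm_end;   /* exclusive */
};

struct model_vmg {
    const struct model_vma *prev;
    const struct model_vma *middle;
    uint64_t start;
    uint64_t end;
};

/* Mirrors the VM_WARN_ON_VMG expression quoted in the report. */
static bool warn_condition(const struct model_vmg *vmg)
{
    const struct model_vma *middle = vmg->middle;

    return middle &&
           ((middle != vmg->prev && vmg->start != middle->vm_start) ||
            vmg->end > middle->vm_end);
}

/*
 * Clamp a requested [start, end) range to a single VMA, following the
 * max(mseal_start, vma->vm_start) idea from the suggested fix direction.
 * Clamping the end as well is an assumption of this sketch.
 */
static struct model_vmg clamp_to_vma(uint64_t start, uint64_t end,
                                     const struct model_vma *prev,
                                     const struct model_vma *vma)
{
    struct model_vmg vmg = {
        .prev = prev,
        .middle = vma,
        .start = start > vma->vm_start ? start : vma->vm_start,
        .end = end < vma->vm_end ? end : vma->vm_end,
    };

    assert(vmg.start < vmg.end); /* the range must overlap the VMA */
    return vmg;
}
```

With the dumped state (prev = [0x21da6000, 0x21de6000), middle = [0x21de6000, 0x21e83000), start = 0x21da8000, end = 0x21e16000), warn_condition() evaluates to true. Passing the same values through clamp_to_vma() moves start to 0x21de6000 and leaves end at 0x21e16000, and the condition then evaluates to false.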
* Request received
  2026-03-27  7:46 ` [BUG] mm/vma.c:830 WARNING in vma_modify() via mseal(2) -- deterministic trigger without fault injection on Linux 7.0-rc5 antonius
@ 2026-03-27  8:59 ` Yail
  0 siblings, 0 replies; 19+ messages in thread
From: Yail @ 2026-03-27 8:59 UTC
To: antonius
Cc: Akpm, Jeffxu, Liam Howlett, Linux-kernel, Linux-mm, Lorenzo Stoakes

[-- Attachment #1: Type: text/plain, Size: 314 bytes --]

Your request (39) has been received and is being reviewed by our
support staff.

To add additional comments, reply to this email.

This email is a service from Yail. Delivered by Zendesk
<https://www.zendesk.com/support/?utm_campaign=text&utm_content=Yail&utm_medium=poweredbyzendesk&utm_source=email-notification>

[-- Attachment #2: Type: text/html, Size: 1839 bytes --]
end of thread, other threads:[~2026-04-01 11:54 UTC | newest]
Thread overview: 19+ messages
[not found] <EE70ZGRNE12@zendesk.com>
2026-04-01 6:08 ` [syzbot] [mm?] WARNING in deferred_split_folio syzbot
2026-04-01 6:09 ` Request received Yail
2026-04-01 8:10 ` [syzbot] [mm?] WARNING in deferred_split_folio Lance Yang
2026-04-01 8:59 ` Lance Yang
2026-04-01 9:36 ` David Hildenbrand (Arm)
2026-04-01 10:16 ` David Hildenbrand (Arm)
2026-04-01 10:53 ` Lance Yang
2026-04-01 11:00 ` David Hildenbrand (Arm)
2026-04-01 11:20 ` Lance Yang
2026-04-01 11:22 ` David Hildenbrand (Arm)
2026-04-01 11:34 ` Lance Yang
2026-04-01 11:38 ` David Hildenbrand (Arm)
2026-04-01 11:41 ` Lance Yang
2026-04-01 11:44 ` David Hildenbrand (Arm)
2026-04-01 11:51 ` Lance Yang
2026-04-01 11:54 ` Lance Yang
[not found] <JR421L42D7V@zendesk.com>
2026-03-28 21:44 ` [PATCH v2] mm/page_owner: warn when stack trace depth hits PAGE_OWNER_STACK_DEPTH limit Jiayuan Liang
2026-03-28 17:28 ` [syzbot ci] " syzbot ci
2026-03-28 17:28 ` Request received Yail
[not found] <594Z9JP5X6G@zendesk.com>
2026-03-28 5:14 ` [syzbot] [mm?] [cgroups?] WARNING in page_counter_uncharge (2) syzbot
2026-03-28 5:16 ` Request received Yail
[not found] <DEX62KG7X9P@zendesk.com>
2026-03-27 7:46 ` [BUG] mm/vma.c:830 WARNING in vma_modify() via mseal(2) -- deterministic trigger without fault injection on Linux 7.0-rc5 antonius
2026-03-27 8:59 ` Request received Yail