* [syzbot] [mm?] WARNING in folio_remove_rmap_ptes @ 2025-12-23 5:23 syzbot 2025-12-23 8:24 ` David Hildenbrand (Red Hat) 2025-12-24 5:35 ` Harry Yoo 0 siblings, 2 replies; 13+ messages in thread From: syzbot @ 2025-12-23 5:23 UTC (permalink / raw) To: Liam.Howlett, akpm, david, harry.yoo, jannh, linux-kernel, linux-mm, lorenzo.stoakes, riel, syzkaller-bugs, vbabka Hello, syzbot found the following issue on: HEAD commit: 9094662f6707 Merge tag 'ata-6.19-rc2' of git://git.kernel... git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1411f77c580000 kernel config: https://syzkaller.appspot.com/x/.config?x=a11e0f726bfb6765 dashboard link: https://syzkaller.appspot.com/bug?extid=b165fc2e11771c66d8ba compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11998b1a580000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=128cdb1a580000 Downloadable assets: disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-9094662f.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/5bec9d32a91c/vmlinux-9094662f.xz kernel image: https://storage.googleapis.com/syzbot-assets/3df82e1a3cec/bzImage-9094662f.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+b165fc2e11771c66d8ba@syzkaller.appspotmail.com handle_mm_fault+0x3fe/0xad0 mm/memory.c:6580 do_user_addr_fault+0x60c/0x1370 arch/x86/mm/fault.c:1336 handle_page_fault arch/x86/mm/fault.c:1476 [inline] exc_page_fault+0x64/0xc0 arch/x86/mm/fault.c:1532 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:618 ------------[ cut here ]------------ WARNING: ./include/linux/rmap.h:462 at __folio_rmap_sanity_checks include/linux/rmap.h:462 [inline], CPU#1: syz.0.18/6090 WARNING: ./include/linux/rmap.h:462 at __folio_remove_rmap mm/rmap.c:1663 [inline], CPU#1: syz.0.18/6090 WARNING: ./include/linux/rmap.h:462 at 
folio_remove_rmap_ptes+0xc27/0xfb0 mm/rmap.c:1779, CPU#1: syz.0.18/6090 Modules linked in: CPU: 1 UID: 0 PID: 6090 Comm: syz.0.18 Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:__folio_rmap_sanity_checks include/linux/rmap.h:462 [inline] RIP: 0010:__folio_remove_rmap mm/rmap.c:1663 [inline] RIP: 0010:folio_remove_rmap_ptes+0xc27/0xfb0 mm/rmap.c:1779 Code: 00 e9 49 f4 ff ff e8 a8 35 aa ff e8 c3 55 17 ff e9 98 fc ff ff e8 99 35 aa ff 48 c7 c6 80 b7 9c 8b 4c 89 e7 e8 8a 12 f5 ff 90 <0f> 0b 90 e9 5a f6 ff ff e8 7c 35 aa ff 48 8b 54 24 10 48 b8 00 00 RSP: 0018:ffffc90003f5f260 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffffea0001417f80 RCX: ffffc90003f5f144 RDX: ffff88803368c980 RSI: ffffffff8214b106 RDI: ffff88803368ce04 RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: ffff88803368d4b0 R12: ffffea0001417f80 R13: ffff888030c90500 R14: 0000000000000000 R15: ffff888012660660 FS: 00007f98fd3fe6c0(0000) GS:ffff8880d69f5000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f98fd3ddd58 CR3: 000000003661c000 CR4: 0000000000352ef0 Call Trace: <TASK> zap_present_folio_ptes mm/memory.c:1650 [inline] zap_present_ptes mm/memory.c:1708 [inline] do_zap_pte_range mm/memory.c:1810 [inline] zap_pte_range mm/memory.c:1854 [inline] zap_pmd_range mm/memory.c:1946 [inline] zap_pud_range mm/memory.c:1975 [inline] zap_p4d_range mm/memory.c:1996 [inline] unmap_page_range+0x1b7d/0x43c0 mm/memory.c:2017 unmap_single_vma+0x153/0x240 mm/memory.c:2059 unmap_vmas+0x218/0x470 mm/memory.c:2101 vms_clear_ptes+0x419/0x790 mm/vma.c:1231 vms_complete_munmap_vmas+0x1ca/0x970 mm/vma.c:1280 do_vmi_align_munmap+0x446/0x7e0 mm/vma.c:1539 do_vmi_munmap+0x204/0x3e0 mm/vma.c:1587 do_munmap+0xb6/0xf0 mm/mmap.c:1065 mremap_to+0x236/0x450 mm/mremap.c:1378 remap_move mm/mremap.c:1890 [inline] do_mremap+0x13a8/0x2020 mm/mremap.c:1933 
__do_sys_mremap+0x119/0x170 mm/mremap.c:1997 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f98fdd8f7c9 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f98fd3fe038 EFLAGS: 00000246 ORIG_RAX: 0000000000000019 RAX: ffffffffffffffda RBX: 00007f98fdfe5fa0 RCX: 00007f98fdd8f7c9 RDX: 0000000000004000 RSI: 0000000000004000 RDI: 0000200000ffc000 RBP: 00007f98fde13f91 R08: 0000200000002000 R09: 0000000000000000 R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f98fdfe6038 R14: 00007f98fdfe5fa0 R15: 00007ffd69c60518 </TASK> --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. If the report is already addressed, let syzbot know by replying with: #syz fix: exact-commit-title If you want syzbot to run the reproducer, reply with: #syz test: git://repo/address.git branch-or-commit-hash If you attach or paste a git patch, syzbot will apply it before testing. If you want to overwrite report's subsystems, reply with: #syz set subsystems: new-subsystem (See the list of subsystem names on the web dashboard) If the report is a duplicate of another one, reply with: #syz dup: exact-subject-of-another-report If you want to undo deduplication, reply with: #syz undup ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2025-12-23 5:23 [syzbot] [mm?] WARNING in folio_remove_rmap_ptes syzbot @ 2025-12-23 8:24 ` David Hildenbrand (Red Hat) 2025-12-24 2:48 ` Hillf Danton 2025-12-24 5:35 ` Harry Yoo 1 sibling, 1 reply; 13+ messages in thread From: David Hildenbrand (Red Hat) @ 2025-12-23 8:24 UTC (permalink / raw) To: syzbot, Liam.Howlett, akpm, harry.yoo, jannh, linux-kernel, linux-mm, lorenzo.stoakes, riel, syzkaller-bugs, vbabka Cc: Jann Horn On 12/23/25 06:23, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit: 9094662f6707 Merge tag 'ata-6.19-rc2' of git://git.kernel... > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=1411f77c580000 > kernel config: https://syzkaller.appspot.com/x/.config?x=a11e0f726bfb6765 > dashboard link: https://syzkaller.appspot.com/bug?extid=b165fc2e11771c66d8ba > compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11998b1a580000 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=128cdb1a580000 > > Downloadable assets: > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-9094662f.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/5bec9d32a91c/vmlinux-9094662f.xz > kernel image: https://storage.googleapis.com/syzbot-assets/3df82e1a3cec/bzImage-9094662f.xz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+b165fc2e11771c66d8ba@syzkaller.appspotmail.com > > handle_mm_fault+0x3fe/0xad0 mm/memory.c:6580 > do_user_addr_fault+0x60c/0x1370 arch/x86/mm/fault.c:1336 > handle_page_fault arch/x86/mm/fault.c:1476 [inline] > exc_page_fault+0x64/0xc0 arch/x86/mm/fault.c:1532 > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:618 > ------------[ cut here ]------------ > WARNING: ./include/linux/rmap.h:462 at __folio_rmap_sanity_checks 
include/linux/rmap.h:462 [inline], CPU#1: syz.0.18/6090

IIUC, that's the

	if (folio_test_anon(folio) && !folio_test_ksm(folio)) {
		...
		VM_WARN_ON_FOLIO(atomic_read(&anon_vma->refcount) == 0, folio);
	}

Seems to indicate that the anon_vma is no longer alive :/

Fortunately we have a reproducer.

CCing Jann who added that check "recently".

-- 
Cheers

David

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2025-12-23 8:24 ` David Hildenbrand (Red Hat) @ 2025-12-24 2:48 ` Hillf Danton 0 siblings, 0 replies; 13+ messages in thread From: Hillf Danton @ 2025-12-24 2:48 UTC (permalink / raw) To: David Hildenbrand (Red Hat) Cc: syzbot, harry.yoo, jannh, linux-kernel, linux-mm, syzkaller-bugs On Tue, 23 Dec 2025 09:24:05 +0100 "David Hildenbrand (Red Hat)" wrote: > On 12/23/25 06:23, syzbot wrote: > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit: 9094662f6707 Merge tag 'ata-6.19-rc2' of git://git.kernel... > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=1411f77c580000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=a11e0f726bfb6765 > > dashboard link: https://syzkaller.appspot.com/bug?extid=b165fc2e11771c66d8ba > > compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11998b1a580000 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=128cdb1a580000 > > > > Downloadable assets: > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-9094662f.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/5bec9d32a91c/vmlinux-9094662f.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/3df82e1a3cec/bzImage-9094662f.xz > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+b165fc2e11771c66d8ba@syzkaller.appspotmail.com > > > > handle_mm_fault+0x3fe/0xad0 mm/memory.c:6580 > > do_user_addr_fault+0x60c/0x1370 arch/x86/mm/fault.c:1336 > > handle_page_fault arch/x86/mm/fault.c:1476 [inline] > > exc_page_fault+0x64/0xc0 arch/x86/mm/fault.c:1532 > > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:618 > > ------------[ cut here ]------------ > > WARNING: ./include/linux/rmap.h:462 at __folio_rmap_sanity_checks 
include/linux/rmap.h:462 [inline], CPU#1: syz.0.18/6090 > > IIUC, that's the > > if (folio_test_anon(folio) && !folio_test_ksm(folio)) { > ... > VM_WARN_ON_FOLIO(atomic_read(&anon_vma->refcount) == 0, folio); > } > > Seems to indicate that the anon_vma is no longer alive :/ > > Fortunately we have a reproducer. > > CCing Jann who addded that check "recently". > That check looks incorrect given the atomic_inc_not_zero in folio_get_anon_vma(). ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2025-12-23 5:23 [syzbot] [mm?] WARNING in folio_remove_rmap_ptes syzbot 2025-12-23 8:24 ` David Hildenbrand (Red Hat) @ 2025-12-24 5:35 ` Harry Yoo 2025-12-30 22:02 ` David Hildenbrand (Red Hat) 1 sibling, 1 reply; 13+ messages in thread From: Harry Yoo @ 2025-12-24 5:35 UTC (permalink / raw) To: syzbot Cc: Liam.Howlett, akpm, david, jannh, linux-kernel, linux-mm, lorenzo.stoakes, riel, syzkaller-bugs, vbabka On Mon, Dec 22, 2025 at 09:23:17PM -0800, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit: 9094662f6707 Merge tag 'ata-6.19-rc2' of git://git.kernel... > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=1411f77c580000 > kernel config: https://syzkaller.appspot.com/x/.config?x=a11e0f726bfb6765 > dashboard link: https://syzkaller.appspot.com/bug?extid=b165fc2e11771c66d8ba > compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11998b1a580000 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=128cdb1a580000 > > Downloadable assets: > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-9094662f.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/5bec9d32a91c/vmlinux-9094662f.xz > kernel image: https://storage.googleapis.com/syzbot-assets/3df82e1a3cec/bzImage-9094662f.xz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+b165fc2e11771c66d8ba@syzkaller.appspotmail.com > > handle_mm_fault+0x3fe/0xad0 mm/memory.c:6580 > do_user_addr_fault+0x60c/0x1370 arch/x86/mm/fault.c:1336 > handle_page_fault arch/x86/mm/fault.c:1476 [inline] > exc_page_fault+0x64/0xc0 arch/x86/mm/fault.c:1532 > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:618 > ------------[ cut here ]------------ > WARNING: ./include/linux/rmap.h:462 at 
__folio_rmap_sanity_checks include/linux/rmap.h:462 [inline], CPU#1: syz.0.18/6090 > WARNING: ./include/linux/rmap.h:462 at __folio_remove_rmap mm/rmap.c:1663 [inline], CPU#1: syz.0.18/6090 > WARNING: ./include/linux/rmap.h:462 at folio_remove_rmap_ptes+0xc27/0xfb0 mm/rmap.c:1779, CPU#1: syz.0.18/6090 > Modules linked in: > CPU: 1 UID: 0 PID: 6090 Comm: syz.0.18 Not tainted syzkaller #0 PREEMPT(full) > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 > RIP: 0010:__folio_rmap_sanity_checks include/linux/rmap.h:462 [inline] > RIP: 0010:__folio_remove_rmap mm/rmap.c:1663 [inline] > RIP: 0010:folio_remove_rmap_ptes+0xc27/0xfb0 mm/rmap.c:1779 > Code: 00 e9 49 f4 ff ff e8 a8 35 aa ff e8 c3 55 17 ff e9 98 fc ff ff e8 99 35 aa ff 48 c7 c6 80 b7 9c 8b 4c 89 e7 e8 8a 12 f5 ff 90 <0f> 0b 90 e9 5a f6 ff ff e8 7c 35 aa ff 48 8b 54 24 10 48 b8 00 00 > RSP: 0018:ffffc90003f5f260 EFLAGS: 00010293 > RAX: 0000000000000000 RBX: ffffea0001417f80 RCX: ffffc90003f5f144 > RDX: ffff88803368c980 RSI: ffffffff8214b106 RDI: ffff88803368ce04 > RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000 > R10: 0000000000000001 R11: ffff88803368d4b0 R12: ffffea0001417f80 > R13: ffff888030c90500 R14: 0000000000000000 R15: ffff888012660660 > FS: 00007f98fd3fe6c0(0000) GS:ffff8880d69f5000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007f98fd3ddd58 CR3: 000000003661c000 CR4: 0000000000352ef0 > Call Trace: > <TASK> > zap_present_folio_ptes mm/memory.c:1650 [inline] > zap_present_ptes mm/memory.c:1708 [inline] > do_zap_pte_range mm/memory.c:1810 [inline] > zap_pte_range mm/memory.c:1854 [inline] > zap_pmd_range mm/memory.c:1946 [inline] > zap_pud_range mm/memory.c:1975 [inline] > zap_p4d_range mm/memory.c:1996 [inline] > unmap_page_range+0x1b7d/0x43c0 mm/memory.c:2017 > unmap_single_vma+0x153/0x240 mm/memory.c:2059 > unmap_vmas+0x218/0x470 mm/memory.c:2101 So this is unmapping VMAs, and it observed an 
anon_vma with refcount == 0. anon_vma's refcount isn't supposed to be zero as long as there's any anonymous memory mapped to a VMA (that's associated with the anon_vma). From the page dump below, we know that it's been allocated to a file VMA that has anon_vma (due to CoW, I think). > [ 64.399049][ T6090] page: refcount:2 mapcount:1 mapping:0000000000000000 index:0x0 pfn:0x505fe > [ 64.402037][ T6090] memcg:ffff888100078d40 > [ 64.403522][ T6090] anon flags: 0xfff0800002090c(referenced|uptodate|active|owner_2|swapbacked|node=0|zone=1|lastcpupid=0x7ff) > [ 64.407140][ T6090] raw: 00fff0800002090c 0000000000000000 dead000000000122 ffff888012660661 > [ 64.409851][ T6090] raw: 0000000000000000 0000000000000000 0000000200000000 ffff888100078d40 > [ 64.412578][ T6090] page dumped because: VM_WARN_ON_FOLIO(atomic_read(&anon_vma->refcount) == 0) > [ 64.415320][ T6090] page_owner tracks the page as allocated > [ 64.417353][ T6090] page last allocated via order 0, migratetype Movable, gfp_mask 0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), pid 6091, tgid 6089 (syz.0.18), ts 64395709171, free_ts 64007663612 > [ 64.422891][ T6090] post_alloc_hook+0x1af/0x220 > [ 64.424399][ T6090] get_page_from_freelist+0xd0b/0x31a0 > [ 64.426135][ T6090] __alloc_frozen_pages_noprof+0x25f/0x2430 > [ 64.427958][ T6090] alloc_pages_mpol+0x1fb/0x550 > [ 64.429506][ T6090] folio_alloc_mpol_noprof+0x36/0x2f0 > [ 64.431157][ T6090] vma_alloc_folio_noprof+0xed/0x1e0 > [ 64.433173][ T6090] do_fault+0x219/0x1ad0 > [ 64.434586][ T6090] __handle_mm_fault+0x1919/0x2bb0 > [ 64.436396][ T6090] handle_mm_fault+0x3fe/0xad0 > [ 64.437985][ T6090] __get_user_pages+0x54e/0x3590 > [ 64.439679][ T6090] get_user_pages_remote+0x243/0xab0 woohoo, this is faulted via GUP from another process... 
> [ 64.441359][ T6090] uprobe_write+0x22b/0x24f0 > [ 64.442887][ T6090] uprobe_write_opcode+0x99/0x1a0 > [ 64.444496][ T6090] set_swbp+0x112/0x200 > [ 64.445793][ T6090] install_breakpoint+0x14b/0xa20 > [ 64.447382][ T6090] uprobe_mmap+0x512/0x10e0 > [ 64.448874][ T6090] page last free pid 6082 tgid 6082 stack trace: > [ 64.450887][ T6090] free_unref_folios+0xa22/0x1610 > [ 64.452536][ T6090] folios_put_refs+0x4be/0x750 > [ 64.454064][ T6090] folio_batch_move_lru+0x278/0x3a0 > [ 64.455714][ T6090] __folio_batch_add_and_move+0x318/0xc30 > [ 64.457810][ T6090] folio_add_lru_vma+0xb0/0x100 > [ 64.459416][ T6090] do_anonymous_page+0x12cf/0x2190 > [ 64.461066][ T6090] __handle_mm_fault+0x1ecf/0x2bb0 > [ 64.462706][ T6090] handle_mm_fault+0x3fe/0xad0 > [ 64.464562][ T6090] do_user_addr_fault+0x60c/0x1370 > [ 64.466676][ T6090] exc_page_fault+0x64/0xc0 > [ 64.468067][ T6090] asm_exc_page_fault+0x26/0x30 > [ 64.469661][ T6090] ------------[ cut here ]------------ BUT unfortunately the report doesn't have any information regarding _when_ the refcount has been dropped to zero. Perhaps we want yet another DEBUG_VM feature to record when it's been dropped to zero and report it in the sanity check, or... imagine harder how a file VMA that has anon_vma involving CoW / GUP / migration / reclamation could somehow drop the refcount to zero? 
Sounds fun ;) -- Cheers, Harry / Hyeonggon > vms_clear_ptes+0x419/0x790 mm/vma.c:1231 > vms_complete_munmap_vmas+0x1ca/0x970 mm/vma.c:1280 > do_vmi_align_munmap+0x446/0x7e0 mm/vma.c:1539 > do_vmi_munmap+0x204/0x3e0 mm/vma.c:1587 > do_munmap+0xb6/0xf0 mm/mmap.c:1065 > mremap_to+0x236/0x450 mm/mremap.c:1378 > remap_move mm/mremap.c:1890 [inline] > do_mremap+0x13a8/0x2020 mm/mremap.c:1933 > __do_sys_mremap+0x119/0x170 mm/mremap.c:1997 > do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] > do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94 > entry_SYSCALL_64_after_hwframe+0x77/0x7f > RIP: 0033:0x7f98fdd8f7c9 > Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 > RSP: 002b:00007f98fd3fe038 EFLAGS: 00000246 ORIG_RAX: 0000000000000019 > RAX: ffffffffffffffda RBX: 00007f98fdfe5fa0 RCX: 00007f98fdd8f7c9 > RDX: 0000000000004000 RSI: 0000000000004000 RDI: 0000200000ffc000 > RBP: 00007f98fde13f91 R08: 0000200000002000 R09: 0000000000000000 > R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000000000 > R13: 00007f98fdfe6038 R14: 00007f98fdfe5fa0 R15: 00007ffd69c60518 > </TASK> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2025-12-24 5:35 ` Harry Yoo @ 2025-12-30 22:02 ` David Hildenbrand (Red Hat) 2025-12-31 6:59 ` Harry Yoo 0 siblings, 1 reply; 13+ messages in thread From: David Hildenbrand (Red Hat) @ 2025-12-30 22:02 UTC (permalink / raw) To: Harry Yoo, syzbot Cc: Liam.Howlett, akpm, jannh, linux-kernel, linux-mm, lorenzo.stoakes, riel, syzkaller-bugs, vbabka On 12/24/25 06:35, Harry Yoo wrote: > On Mon, Dec 22, 2025 at 09:23:17PM -0800, syzbot wrote: >> Hello, >> >> syzbot found the following issue on: >> >> HEAD commit: 9094662f6707 Merge tag 'ata-6.19-rc2' of git://git.kernel... >> git tree: upstream >> console output: https://syzkaller.appspot.com/x/log.txt?x=1411f77c580000 >> kernel config: https://syzkaller.appspot.com/x/.config?x=a11e0f726bfb6765 >> dashboard link: https://syzkaller.appspot.com/bug?extid=b165fc2e11771c66d8ba >> compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 >> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11998b1a580000 >> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=128cdb1a580000 >> >> Downloadable assets: >> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-9094662f.raw.xz >> vmlinux: https://storage.googleapis.com/syzbot-assets/5bec9d32a91c/vmlinux-9094662f.xz >> kernel image: https://storage.googleapis.com/syzbot-assets/3df82e1a3cec/bzImage-9094662f.xz >> >> IMPORTANT: if you fix the issue, please add the following tag to the commit: >> Reported-by: syzbot+b165fc2e11771c66d8ba@syzkaller.appspotmail.com >> >> handle_mm_fault+0x3fe/0xad0 mm/memory.c:6580 >> do_user_addr_fault+0x60c/0x1370 arch/x86/mm/fault.c:1336 >> handle_page_fault arch/x86/mm/fault.c:1476 [inline] >> exc_page_fault+0x64/0xc0 arch/x86/mm/fault.c:1532 >> asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:618 >> ------------[ cut here ]------------ >> WARNING: ./include/linux/rmap.h:462 at 
__folio_rmap_sanity_checks include/linux/rmap.h:462 [inline], CPU#1: syz.0.18/6090 >> WARNING: ./include/linux/rmap.h:462 at __folio_remove_rmap mm/rmap.c:1663 [inline], CPU#1: syz.0.18/6090 >> WARNING: ./include/linux/rmap.h:462 at folio_remove_rmap_ptes+0xc27/0xfb0 mm/rmap.c:1779, CPU#1: syz.0.18/6090 >> Modules linked in: >> CPU: 1 UID: 0 PID: 6090 Comm: syz.0.18 Not tainted syzkaller #0 PREEMPT(full) >> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 >> RIP: 0010:__folio_rmap_sanity_checks include/linux/rmap.h:462 [inline] >> RIP: 0010:__folio_remove_rmap mm/rmap.c:1663 [inline] >> RIP: 0010:folio_remove_rmap_ptes+0xc27/0xfb0 mm/rmap.c:1779 >> Code: 00 e9 49 f4 ff ff e8 a8 35 aa ff e8 c3 55 17 ff e9 98 fc ff ff e8 99 35 aa ff 48 c7 c6 80 b7 9c 8b 4c 89 e7 e8 8a 12 f5 ff 90 <0f> 0b 90 e9 5a f6 ff ff e8 7c 35 aa ff 48 8b 54 24 10 48 b8 00 00 >> RSP: 0018:ffffc90003f5f260 EFLAGS: 00010293 >> RAX: 0000000000000000 RBX: ffffea0001417f80 RCX: ffffc90003f5f144 >> RDX: ffff88803368c980 RSI: ffffffff8214b106 RDI: ffff88803368ce04 >> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000 >> R10: 0000000000000001 R11: ffff88803368d4b0 R12: ffffea0001417f80 >> R13: ffff888030c90500 R14: 0000000000000000 R15: ffff888012660660 >> FS: 00007f98fd3fe6c0(0000) GS:ffff8880d69f5000(0000) knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: 00007f98fd3ddd58 CR3: 000000003661c000 CR4: 0000000000352ef0 >> Call Trace: >> <TASK> >> zap_present_folio_ptes mm/memory.c:1650 [inline] >> zap_present_ptes mm/memory.c:1708 [inline] >> do_zap_pte_range mm/memory.c:1810 [inline] >> zap_pte_range mm/memory.c:1854 [inline] >> zap_pmd_range mm/memory.c:1946 [inline] >> zap_pud_range mm/memory.c:1975 [inline] >> zap_p4d_range mm/memory.c:1996 [inline] >> unmap_page_range+0x1b7d/0x43c0 mm/memory.c:2017 >> unmap_single_vma+0x153/0x240 mm/memory.c:2059 >> unmap_vmas+0x218/0x470 mm/memory.c:2101 > > So this is 
unmapping VMAs, and it observed an anon_vma with refcount == 0. > anon_vma's refcount isn't supposed to be zero as long as there's > any anonymous memory mapped to a VMA (that's associated with the anon_vma). > > From the page dump below, we know that it's been allocated to a file VMA > that has anon_vma (due to CoW, I think). > >> [ 64.399049][ T6090] page: refcount:2 mapcount:1 mapping:0000000000000000 index:0x0 pfn:0x505fe >> [ 64.402037][ T6090] memcg:ffff888100078d40 >> [ 64.403522][ T6090] anon flags: 0xfff0800002090c(referenced|uptodate|active|owner_2|swapbacked|node=0|zone=1|lastcpupid=0x7ff) >> [ 64.407140][ T6090] raw: 00fff0800002090c 0000000000000000 dead000000000122 ffff888012660661 >> [ 64.409851][ T6090] raw: 0000000000000000 0000000000000000 0000000200000000 ffff888100078d40 >> [ 64.412578][ T6090] page dumped because: VM_WARN_ON_FOLIO(atomic_read(&anon_vma->refcount) == 0) >> [ 64.415320][ T6090] page_owner tracks the page as allocated >> [ 64.417353][ T6090] page last allocated via order 0, migratetype Movable, gfp_mask 0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), pid 6091, tgid 6089 (syz.0.18), ts 64395709171, free_ts 64007663612 >> [ 64.422891][ T6090] post_alloc_hook+0x1af/0x220 >> [ 64.424399][ T6090] get_page_from_freelist+0xd0b/0x31a0 >> [ 64.426135][ T6090] __alloc_frozen_pages_noprof+0x25f/0x2430 >> [ 64.427958][ T6090] alloc_pages_mpol+0x1fb/0x550 >> [ 64.429506][ T6090] folio_alloc_mpol_noprof+0x36/0x2f0 >> [ 64.431157][ T6090] vma_alloc_folio_noprof+0xed/0x1e0 >> [ 64.433173][ T6090] do_fault+0x219/0x1ad0 >> [ 64.434586][ T6090] __handle_mm_fault+0x1919/0x2bb0 >> [ 64.436396][ T6090] handle_mm_fault+0x3fe/0xad0 >> [ 64.437985][ T6090] __get_user_pages+0x54e/0x3590 >> [ 64.439679][ T6090] get_user_pages_remote+0x243/0xab0 > > woohoo, this is faulted via GUP from another process... 
> >> [ 64.441359][ T6090] uprobe_write+0x22b/0x24f0 >> [ 64.442887][ T6090] uprobe_write_opcode+0x99/0x1a0 >> [ 64.444496][ T6090] set_swbp+0x112/0x200 >> [ 64.445793][ T6090] install_breakpoint+0x14b/0xa20 >> [ 64.447382][ T6090] uprobe_mmap+0x512/0x10e0 >> [ 64.448874][ T6090] page last free pid 6082 tgid 6082 stack trace: >> [ 64.450887][ T6090] free_unref_folios+0xa22/0x1610 >> [ 64.452536][ T6090] folios_put_refs+0x4be/0x750 >> [ 64.454064][ T6090] folio_batch_move_lru+0x278/0x3a0 >> [ 64.455714][ T6090] __folio_batch_add_and_move+0x318/0xc30 >> [ 64.457810][ T6090] folio_add_lru_vma+0xb0/0x100 >> [ 64.459416][ T6090] do_anonymous_page+0x12cf/0x2190 >> [ 64.461066][ T6090] __handle_mm_fault+0x1ecf/0x2bb0 >> [ 64.462706][ T6090] handle_mm_fault+0x3fe/0xad0 >> [ 64.464562][ T6090] do_user_addr_fault+0x60c/0x1370 >> [ 64.466676][ T6090] exc_page_fault+0x64/0xc0 >> [ 64.468067][ T6090] asm_exc_page_fault+0x26/0x30 >> [ 64.469661][ T6090] ------------[ cut here ]------------ > > BUT unfortunately the report doesn't have any information regarding > _when_ the refcount has been dropped to zero. > > Perhaps we want yet another DEBUG_VM feature to record when it's been > dropped to zero and report it in the sanity check, or... imagine harder > how a file VMA that has anon_vma involving CoW / GUP / migration / > reclamation could somehow drop the refcount to zero? > > Sounds fun ;) > Can we bisect the issue given that we have a reproducer? This only popped up just now, so I would assume it's actually something that went into this release that makes it trigger. -- Cheers David ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2025-12-30 22:02 ` David Hildenbrand (Red Hat) @ 2025-12-31 6:59 ` Harry Yoo 2026-01-01 13:09 ` Jeongjun Park 0 siblings, 1 reply; 13+ messages in thread From: Harry Yoo @ 2025-12-31 6:59 UTC (permalink / raw) To: David Hildenbrand (Red Hat) Cc: syzbot, Liam.Howlett, akpm, jannh, linux-kernel, linux-mm, lorenzo.stoakes, riel, syzkaller-bugs, vbabka On Tue, Dec 30, 2025 at 11:02:18PM +0100, David Hildenbrand (Red Hat) wrote: > On 12/24/25 06:35, Harry Yoo wrote: > > On Mon, Dec 22, 2025 at 09:23:17PM -0800, syzbot wrote: > > Perhaps we want yet another DEBUG_VM feature to record when it's been > > dropped to zero and report it in the sanity check, or... imagine harder > > how a file VMA that has anon_vma involving CoW / GUP / migration / > > reclamation could somehow drop the refcount to zero? > > > > Sounds fun ;) > > > > Can we bisect the issue given that we have a reproducer? Unfortunately I could not reproduce the issue with the C reproducer, even with the provided kernel config. Maybe it's a race condition and I didn't wait long enough... > This only popped up just now, so I would assume it's actually something that > went into this release that makes it trigger. I was assuming the bug has been there even before the addition of VM_WARN_ON_ONCE(), as the commit a222439e1e27 ("mm/rmap: add anon_vma lifetime debug check") says: > There have been syzkaller reports a few months ago[1][2] of UAF in rmap > walks that seems to indicate that there can be pages with elevated > mapcount whose anon_vma has already been freed, but I think we never > figured out what the cause is; and syzkaller only hit these UAFs when > memory pressure randomly caused reclaim to rmap-walk the affected pages, > so it of course didn't manage to create a reproducer. > > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios > to hopefully catch such issues more reliably. 
-- Cheers, Harry / Hyeonggon ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2025-12-31 6:59 ` Harry Yoo @ 2026-01-01 13:09 ` Jeongjun Park 2026-01-01 13:45 ` Harry Yoo 2026-01-01 16:54 ` Lorenzo Stoakes 0 siblings, 2 replies; 13+ messages in thread From: Jeongjun Park @ 2026-01-01 13:09 UTC (permalink / raw) To: harry.yoo Cc: Liam.Howlett, akpm, david, jannh, linux-kernel, linux-mm, lorenzo.stoakes, riel, syzbot+b165fc2e11771c66d8ba, syzkaller-bugs, vbabka Harry Yoo wrote: > On Tue, Dec 30, 2025 at 11:02:18PM +0100, David Hildenbrand (Red Hat) wrote: > > On 12/24/25 06:35, Harry Yoo wrote: > > > On Mon, Dec 22, 2025 at 09:23:17PM -0800, syzbot wrote: > > > Perhaps we want yet another DEBUG_VM feature to record when it's been > > > dropped to zero and report it in the sanity check, or... imagine harder > > > how a file VMA that has anon_vma involving CoW / GUP / migration / > > > reclamation could somehow drop the refcount to zero? > > > > > > Sounds fun ;) > > > > > > > Can we bisect the issue given that we have a reproducer? > > Unfortunately I could not reproduce the issue with the C reproducer, > even with the provided kernel config. Maybe it's a race condition and > I didn't wait long enough... > > > This only popped up just now, so I would assume it's actually something that > > went into this release that makes it trigger. > > I was assuming the bug has been there even before the addition of > VM_WARN_ON_ONCE(), as the commit a222439e1e27 ("mm/rmap: add anon_vma > lifetime debug check") says: > > There have been syzkaller reports a few months ago[1][2] of UAF in rmap > > walks that seems to indicate that there can be pages with elevated > > mapcount whose anon_vma has already been freed, but I think we never > > figured out what the cause is; and syzkaller only hit these UAFs when > > memory pressure randomly caused reclaim to rmap-walk the affected pages, > > so it of course didn't manage to create a reproducer. 
> > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios
> > to hopefully catch such issues more reliably.

I tested this myself and found that the bug is caused by commit
d23cb648e365 ("mm/mremap: permit mremap() move of multiple VMAs").

This commit doesn't mention anything about MREMAP_DONTUNMAP. Is it really
acceptable for MREMAP_DONTUNMAP, which maintains old_address and aliases
new_address, to use the move-only fast path?

If MREMAP_DONTUNMAP can also use the fast path, I think a sophisticated
refactoring of remap_move() is needed to manage anon_vma/rmap lifetimes.
Otherwise, adding a simple flag check to vrm_move_only() is likely
necessary.

What are your thoughts?

> -- 
> Cheers,
> Harry / Hyeonggon

Regards,

Jeongjun Park

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2026-01-01 13:09 ` Jeongjun Park @ 2026-01-01 13:45 ` Harry Yoo 2026-01-01 14:30 ` Jeongjun Park 2026-01-01 16:54 ` Lorenzo Stoakes 1 sibling, 1 reply; 13+ messages in thread From: Harry Yoo @ 2026-01-01 13:45 UTC (permalink / raw) To: Jeongjun Park Cc: Liam.Howlett, akpm, david, jannh, linux-kernel, linux-mm, lorenzo.stoakes, riel, syzbot+b165fc2e11771c66d8ba, syzkaller-bugs, vbabka On Thu, Jan 01, 2026 at 10:09:06PM +0900, Jeongjun Park wrote: > Harry Yoo wrote: > > On Tue, Dec 30, 2025 at 11:02:18PM +0100, David Hildenbrand (Red Hat) wrote: > > > On 12/24/25 06:35, Harry Yoo wrote: > > > > On Mon, Dec 22, 2025 at 09:23:17PM -0800, syzbot wrote: > > > > Perhaps we want yet another DEBUG_VM feature to record when it's been > > > > dropped to zero and report it in the sanity check, or... imagine harder > > > > how a file VMA that has anon_vma involving CoW / GUP / migration / > > > > reclamation could somehow drop the refcount to zero? > > > > > > > > Sounds fun ;) > > > > > > > > > > Can we bisect the issue given that we have a reproducer? > > > > Unfortunately I could not reproduce the issue with the C reproducer, > > even with the provided kernel config. Maybe it's a race condition and > > I didn't wait long enough... > > > > > This only popped up just now, so I would assume it's actually something that > > > went into this release that makes it trigger. 
> > > > I was assuming the bug has been there even before the addition of > > VM_WARN_ON_ONCE(), as the commit a222439e1e27 ("mm/rmap: add anon_vma > > lifetime debug check") says: > > > There have been syzkaller reports a few months ago[1][2] of UAF in rmap > > > walks that seems to indicate that there can be pages with elevated > > > mapcount whose anon_vma has already been freed, but I think we never > > > figured out what the cause is; and syzkaller only hit these UAFs when > > > memory pressure randomly caused reclaim to rmap-walk the affected pages, > > > so it of course didn't manage to create a reproducer. > > > > > > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios > > > to hopefully catch such issues more reliably. > > Hi Jeongjun, > I tested this myself and found that the bug is caused by commit > d23cb648e365 ("mm/mremap: permit mremap() move of multiple VMAs"). Oh, great. Thanks! Could you please elaborate on how you confirmed the bad commit? - Did you perform git bisection on it? - How did you reproduce the bug and how long did it take to reproduce? > This commit doesn't mention anything about MREMAP_DONTUNMAP. Is it really > acceptable for MREMAP_DONTUNMAP, which maintains old_address and aliases > new_address, to use the move-only fast path? > > If MREMAP_DONTUNMAP can also use the fast path, I think a sophisticated > refactoring of remap_move is needed to manage anon_vma/rmap lifetimes. > Otherwise, adding a simple flag check to vrm_move_only() is likely > necessary. > > What are your thoughts? It's late at night, so... let me look at this tomorrow with a clearer mind :) Happy new year, by the way! -- Cheers, Harry / Hyeonggon ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2026-01-01 13:45 ` Harry Yoo @ 2026-01-01 14:30 ` Jeongjun Park 2026-01-01 16:32 ` Lorenzo Stoakes 0 siblings, 1 reply; 13+ messages in thread From: Jeongjun Park @ 2026-01-01 14:30 UTC (permalink / raw) To: Harry Yoo Cc: Liam.Howlett, akpm, david, jannh, linux-kernel, linux-mm, lorenzo.stoakes, riel, syzbot+b165fc2e11771c66d8ba, syzkaller-bugs, vbabka Hi Harry, Harry Yoo <harry.yoo@oracle.com> wrote: > > On Thu, Jan 01, 2026 at 10:09:06PM +0900, Jeongjun Park wrote: > > Harry Yoo wrote: > > > On Tue, Dec 30, 2025 at 11:02:18PM +0100, David Hildenbrand (Red Hat) wrote: > > > > On 12/24/25 06:35, Harry Yoo wrote: > > > > > On Mon, Dec 22, 2025 at 09:23:17PM -0800, syzbot wrote: > > > > > Perhaps we want yet another DEBUG_VM feature to record when it's been > > > > > dropped to zero and report it in the sanity check, or... imagine harder > > > > > how a file VMA that has anon_vma involving CoW / GUP / migration / > > > > > reclamation could somehow drop the refcount to zero? > > > > > > > > > > Sounds fun ;) > > > > > > > > > > > > > Can we bisect the issue given that we have a reproducer? > > > > > > Unfortunately I could not reproduce the issue with the C reproducer, > > > even with the provided kernel config. Maybe it's a race condition and > > > I didn't wait long enough... > > > > > > > This only popped up just now, so I would assume it's actually something that > > > > went into this release that makes it trigger. 
> > > > > > I was assuming the bug has been there even before the addition of > > > VM_WARN_ON_ONCE(), as the commit a222439e1e27 ("mm/rmap: add anon_vma > > > lifetime debug check") says: > > > > There have been syzkaller reports a few months ago[1][2] of UAF in rmap > > > > walks that seems to indicate that there can be pages with elevated > > > > mapcount whose anon_vma has already been freed, but I think we never > > > > figured out what the cause is; and syzkaller only hit these UAFs when > > > > memory pressure randomly caused reclaim to rmap-walk the affected pages, > > > > so it of course didn't manage to create a reproducer. > > > > > > > > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios > > > > to hopefully catch such issues more reliably. > > > > > Hi Jeongjun, > > > I tested this myself and found that the bug is caused by commit > > d23cb648e365 ("mm/mremap: permit mremap() move of multiple VMAs"). > > Oh, great. Thanks! > > Could you please elaborate how you confirmed the bad commit? > > - Did you perform git bisection on it? > - How did you reproduce the bug and how long did it take to reproduce? > I tested the mremap-related commits in my local environment, building them one by one and using syzbot repro. [1] : https://syzkaller.appspot.com/text?tag=ReproC&x=128cdb1a580000 And for debugging purposes, I added the code from commit a222439e1e27 ("mm/rmap: add anon_vma lifetime debug check") and ran the test. Based on my testing, I found that the WARNING starts from commit d23cb648e365 ("mm/mremap: permit mremap() move of multiple VMAs"), which is right after commit 2cf442d74216 ("mm/mremap: clean up mlock populate behavior") in Lorenzo's mremap-related patch series. 
``` [ 105.610134][ T9699] page: refcount:2 mapcount:1 mapping:0000000000000000 index:0x0 pfn:0x5abd6 [ 105.611881][ T9699] memcg:ffff888051abc100 [ 105.612642][ T9699] anon flags: 0x4fff0800002090c(referenced|uptodate|active|owner_2|swapbacked|node=1|zone=1|lastcpupid=0x7ff) [ 105.614724][ T9699] raw: 04fff0800002090c 0000000000000000 dead000000000122 ffff888047525bb1 [ 105.616213][ T9699] raw: 0000000000000000 0000000000000000 0000000200000000 ffff888051abc100 [ 105.617791][ T9699] page dumped because: VM_WARN_ON_FOLIO(atomic_read(&anon_vma->refcount) == 0) [ 105.619364][ T9699] page_owner tracks the page as allocated [ 105.620554][ T9699] page last allocated via order 0, migratetype Movable, gfp_mask 0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), pid 9700, tgid 9698 (test), ts 105608898986, free_ts 104692063083 [ 105.623518][ T9699] post_alloc_hook+0x1be/0x230 [ 105.624454][ T9699] get_page_from_freelist+0x10c0/0x2f80 [ 105.625446][ T9699] __alloc_frozen_pages_noprof+0x256/0x2130 [ 105.626504][ T9699] alloc_pages_mpol+0x1f1/0x550 [ 105.627383][ T9699] folio_alloc_mpol_noprof+0x38/0x2f0 [...] 
[ 105.651729][ T9699] ------------[ cut here ]------------ [ 105.652694][ T9699] WARNING: CPU: 0 PID: 9699 at ./include/linux/rmap.h:472 __folio_rmap_sanity_checks+0x6c3/0x770 [ 105.654551][ T9699] Modules linked in: [ 105.655268][ T9699] CPU: 0 UID: 0 PID: 9699 Comm: test Not tainted 6.16.0-rc5-00304-gd23cb648e365-dirty #37 PREEMPT(full) [ 105.657209][ T9699] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 105.658803][ T9699] RIP: 0010:__folio_rmap_sanity_checks+0x6c3/0x770 [ 105.659959][ T9699] Code: 9a 13 00 e9 9f f9 ff ff 4c 89 e7 e8 77 9a 13 00 e9 87 fc ff ff e8 3d d9 af ff 48 c7 c6 00 b1 3b 8b 48 89 ef e8 7e 78 f6 ff 90 <0f> 0b 90 e9 82 fc ff ff e8 80 9a 13 00 e9 32 fa ff ff e8 76 9a 13 [ 105.663311][ T9699] RSP: 0018:ffffc9000baf7268 EFLAGS: 00010293 [ 105.664412][ T9699] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc9000baf714c [ 105.665796][ T9699] RDX: ffff888020668000 RSI: ffffffff82089412 RDI: ffff888020668444 [ 105.667181][ T9699] RBP: ffffea00016af580 R08: 0000000000000001 R09: ffffed1005704841 [ 105.668591][ T9699] R10: 0000000000000001 R11: 0000000000000001 R12: ffff888047525c50 [ 105.669977][ T9699] R13: ffff888047525bb0 R14: 0000000000000000 R15: 0000000000000000 [ 105.671389][ T9699] FS: 00007f781689e700(0000) GS:ffff888098559000(0000) knlGS:0000000000000000 [ 105.672968][ T9699] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 105.674147][ T9699] CR2: 00007f781687cfb8 CR3: 00000000467b8000 CR4: 0000000000752ef0 [ 105.675535][ T9699] PKRU: 55555554 [ 105.676175][ T9699] Call Trace: [ 105.676786][ T9699] <TASK> [ 105.677329][ T9699] folio_remove_rmap_ptes+0x31/0x980 [ 105.678287][ T9699] unmap_page_range+0x1b97/0x41a0 [ 105.679205][ T9699] ? __pfx_unmap_page_range+0x10/0x10 [ 105.680164][ T9699] ? uprobe_munmap+0x448/0x5d0 [ 105.681045][ T9699] ? uprobe_munmap+0x479/0x5d0 [ 105.681916][ T9699] unmap_single_vma.constprop.0+0x153/0x230 [ 105.682973][ T9699] unmap_vmas+0x1d6/0x430 [ 105.683757][ T9699] ? 
__pfx_unmap_vmas+0x10/0x10 [ 105.684681][ T9699] ? __sanitizer_cov_trace_switch+0x54/0x90 [ 105.685740][ T9699] ? mas_update_gap+0x30a/0x4f0 [ 105.686616][ T9699] vms_clear_ptes.part.0+0x368/0x690 [ 105.687573][ T9699] ? __pfx_vms_clear_ptes.part.0+0x10/0x10 [ 105.688641][ T9699] ? __pfx_mas_store_gfp+0x10/0x10 [ 105.689553][ T9699] ? unlink_anon_vmas+0x457/0x890 [ 105.690463][ T9699] vms_complete_munmap_vmas+0x6cf/0xa20 [ 105.691488][ T9699] do_vmi_align_munmap+0x426/0x800 [ 105.692429][ T9699] ? __pfx_do_vmi_align_munmap+0x10/0x10 [ 105.693456][ T9699] ? mas_walk+0x6b7/0x8c0 [ 105.694290][ T9699] do_vmi_munmap+0x1f0/0x3d0 [ 105.695128][ T9699] do_munmap+0xbd/0x100 [ 105.695883][ T9699] ? __pfx_do_munmap+0x10/0x10 [ 105.696749][ T9699] ? mas_walk+0x6b7/0x8c0 [ 105.697542][ T9699] mremap_to+0x242/0x450 [ 105.698317][ T9699] do_mremap+0xff4/0x1fe0 [ 105.699114][ T9699] ? __pfx_do_mremap+0x10/0x10 [ 105.699992][ T9699] __do_sys_mremap+0x119/0x170 [ 105.700868][ T9699] ? __pfx___do_sys_mremap+0x10/0x10 [ 105.701821][ T9699] ? __x64_sys_futex+0x1c5/0x4c0 [ 105.702712][ T9699] ? 
__x64_sys_futex+0x1ce/0x4c0 [ 105.703629][ T9699] do_syscall_64+0xcb/0xfa0 [ 105.704463][ T9699] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 105.705510][ T9699] RIP: 0033:0x7f7816996fc9 [ 105.706311][ T9699] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 8e 0d 00 f7 d8 64 89 01 48 [ 105.709665][ T9699] RSP: 002b:00007f781689de98 EFLAGS: 00000297 ORIG_RAX: 0000000000000019 [ 105.711123][ T9699] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7816996fc9 [ 105.712507][ T9699] RDX: 0000000000004000 RSI: 0000000000004000 RDI: 0000200000ffc000 [ 105.713894][ T9699] RBP: 00007f781689dec0 R08: 0000200000002000 R09: 0000000000000000 [ 105.715319][ T9699] R10: 0000000000000007 R11: 0000000000000297 R12: 00007ffddaf7c6fe [ 105.716718][ T9699] R13: 00007ffddaf7c6ff R14: 00007f781689dfc0 R15: 0000000000022000 [ 105.718113][ T9699] </TASK> [ 105.718674][ T9699] Kernel panic - not syncing: kernel: panic_on_warn set ... [ 105.719943][ T9699] CPU: 0 UID: 0 PID: 9699 Comm: test Not tainted 6.16.0-rc5-00304-gd23cb648e365-dirty #37 PREEMPT(full) [ 105.721866][ T9699] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 105.723469][ T9699] Call Trace: [ 105.724047][ T9699] <TASK> [ 105.724592][ T9699] dump_stack_lvl+0x3d/0x1b0 [ 105.725432][ T9699] panic+0x6fc/0x7b0 [ 105.726145][ T9699] ? __pfx_panic+0x10/0x10 [ 105.726955][ T9699] ? show_trace_log_lvl+0x278/0x380 [ 105.727897][ T9699] ? check_panic_on_warn+0x1f/0xc0 [ 105.728819][ T9699] ? __folio_rmap_sanity_checks+0x6c3/0x770 [ 105.729867][ T9699] check_panic_on_warn+0xb1/0xc0 [ 105.730759][ T9699] __warn+0xf6/0x3d0 [ 105.731473][ T9699] ? __folio_rmap_sanity_checks+0x6c3/0x770 [ 105.732522][ T9699] report_bug+0x2e1/0x500 [ 105.733305][ T9699] ? 
__folio_rmap_sanity_checks+0x6c3/0x770 [ 105.734354][ T9699] handle_bug+0x2dd/0x410 [ 105.735132][ T9699] exc_invalid_op+0x35/0x80 [ 105.735947][ T9699] asm_exc_invalid_op+0x1a/0x20 [ 105.736819][ T9699] RIP: 0010:__folio_rmap_sanity_checks+0x6c3/0x770 [ 105.737962][ T9699] Code: 9a 13 00 e9 9f f9 ff ff 4c 89 e7 e8 77 9a 13 00 e9 87 fc ff ff e8 3d d9 af ff 48 c7 c6 00 b1 3b 8b 48 89 ef e8 7e 78 f6 ff 90 <0f> 0b 90 e9 82 fc ff ff e8 80 9a 13 00 e9 32 fa ff ff e8 76 9a 13 [ 105.741281][ T9699] RSP: 0018:ffffc9000baf7268 EFLAGS: 00010293 [ 105.742352][ T9699] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc9000baf714c [ 105.743729][ T9699] RDX: ffff888020668000 RSI: ffffffff82089412 RDI: ffff888020668444 [...] [ 105.790634][ T9699] R13: 00007ffddaf7c6ff R14: 00007f781689dfc0 R15: 0000000000022000 [ 105.792031][ T9699] </TASK> ``` And while I haven't been able to reproduce it again, I did have one instance where a KASAN UAF was detected quite by accident during testing. So, I suspect UAF might be a low probability occurrence under certain race conditions. ``` [ 142.257627][ T9758] ================================================================== [ 142.259362][ T9758] BUG: KASAN: slab-use-after-free in folio_remove_rmap_ptes+0x260/0xfc0 [ 142.261082][ T9758] Read of size 4 at addr ffff88802856d920 by task test/9758 [ 142.262570][ T9758] [ 142.263096][ T9758] CPU: 1 UID: 0 PID: 9758 Comm: test Not tainted 6.19.0-rc2-00098-gc53f467229a7 #20 PREEMPT(full) [ 142.263119][ T9758] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 142.263134][ T9758] Call Trace: [ 142.263141][ T9758] <TASK> [ 142.263148][ T9758] dump_stack_lvl+0x116/0x1b0 [ 142.263187][ T9758] print_report+0xca/0x5f0 [ 142.263219][ T9758] ? __phys_addr+0xeb/0x180 [ 142.263239][ T9758] ? folio_remove_rmap_ptes+0x260/0xfc0 [ 142.263257][ T9758] ? folio_remove_rmap_ptes+0x260/0xfc0 [ 142.263275][ T9758] kasan_report+0xca/0x100 [ 142.263301][ T9758] ? 
folio_remove_rmap_ptes+0x260/0xfc0 [ 142.263322][ T9758] kasan_check_range+0x39/0x1c0 [ 142.263340][ T9758] folio_remove_rmap_ptes+0x260/0xfc0 [ 142.263360][ T9758] unmap_page_range+0x1c70/0x4300 [ 142.263403][ T9758] ? __pfx_unmap_page_range+0x10/0x10 [ 142.263428][ T9758] ? uprobe_munmap+0x440/0x600 [ 142.263452][ T9758] ? uprobe_munmap+0x470/0x600 [ 142.263472][ T9758] unmap_single_vma+0x153/0x230 [ 142.263499][ T9758] unmap_vmas+0x1d6/0x430 [ 142.263525][ T9758] ? __pfx_unmap_vmas+0x10/0x10 [ 142.263551][ T9758] ? __sanitizer_cov_trace_switch+0x54/0x90 [ 142.263580][ T9758] ? mas_update_gap+0x30a/0x4f0 [ 142.263620][ T9758] vms_clear_ptes.part.0+0x362/0x6b0 [ 142.263642][ T9758] ? __pfx_vms_clear_ptes.part.0+0x10/0x10 [ 142.263666][ T9758] ? __pfx_mas_store_gfp+0x10/0x10 [ 142.263684][ T9758] ? unlink_anon_vmas+0x457/0x890 [ 142.263705][ T9758] vms_complete_munmap_vmas+0x6cf/0xa20 [ 142.263728][ T9758] do_vmi_align_munmap+0x430/0x800 [ 142.263750][ T9758] ? __pfx_do_vmi_align_munmap+0x10/0x10 [ 142.263783][ T9758] ? mas_walk+0x6b7/0x8c0 [ 142.263812][ T9758] do_vmi_munmap+0x1f0/0x3d0 [ 142.263833][ T9758] do_munmap+0xb6/0xf0 [ 142.263860][ T9758] ? __pfx_do_munmap+0x10/0x10 [ 142.263889][ T9758] ? mas_walk+0x6b7/0x8c0 [ 142.263916][ T9758] mremap_to+0x242/0x450 [ 142.263936][ T9758] do_mremap+0x12b3/0x2090 [ 142.263961][ T9758] ? __pfx_do_mremap+0x10/0x10 [ 142.263987][ T9758] __do_sys_mremap+0x119/0x170 [ 142.264007][ T9758] ? __pfx___do_sys_mremap+0x10/0x10 [ 142.264030][ T9758] ? __x64_sys_futex+0x1c5/0x4d0 [ 142.264060][ T9758] ? 
__x64_sys_futex+0x1ce/0x4d0 [ 142.264095][ T9758] do_syscall_64+0xcb/0xf80 [ 142.264125][ T9758] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 142.264145][ T9758] RIP: 0033:0x7f5736fa5fc9 [ 142.264162][ T9758] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 8e 0d 00 f7 d8 64 89 01 48 [ 142.264180][ T9758] RSP: 002b:00007f5736eace98 EFLAGS: 00000297 ORIG_RAX: 0000000000000019 [ 142.264201][ T9758] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5736fa5fc9 [ 142.264213][ T9758] RDX: 0000000000004000 RSI: 0000000000004000 RDI: 0000200000ffc000 [ 142.264236][ T9758] RBP: 00007f5736eacec0 R08: 0000200000002000 R09: 0000000000000000 [ 142.264247][ T9758] R10: 0000000000000007 R11: 0000000000000297 R12: 00007fff0d19497e [ 142.264258][ T9758] R13: 00007fff0d19497f R14: 00007f5736eacfc0 R15: 0000000000022000 [ 142.264277][ T9758] </TASK> [ 142.264282][ T9758] [ 142.319909][ T9758] Allocated by task 9759: [ 142.320665][ T9758] kasan_save_stack+0x24/0x50 [ 142.321497][ T9758] kasan_save_track+0x14/0x30 [ 142.322331][ T9758] __kasan_slab_alloc+0x87/0x90 [ 142.323193][ T9758] kmem_cache_alloc_noprof+0x267/0x790 [ 142.324151][ T9758] __anon_vma_prepare+0x34b/0x610 [ 142.325035][ T9758] __vmf_anon_prepare+0x11f/0x250 [ 142.325929][ T9758] do_fault+0x190/0x1940 [ 142.326688][ T9758] __handle_mm_fault+0x1901/0x2ac0 [ 142.327581][ T9758] handle_mm_fault+0x3f9/0xac0 [ 142.328424][ T9758] __get_user_pages+0x5ac/0x3960 [ 142.329301][ T9758] get_user_pages_remote+0x28a/0xb20 [ 142.330236][ T9758] uprobe_write+0x201/0x21f0 [ 142.331052][ T9758] uprobe_write_opcode+0x99/0x1a0 [ 142.331936][ T9758] set_swbp+0x109/0x210 [ 142.332677][ T9758] install_breakpoint+0x158/0x9c0 [ 142.333558][ T9758] uprobe_mmap+0x5ab/0x1070 [ 142.334359][ T9758] vma_complete+0xa00/0xe70 [ 142.335157][ T9758] __split_vma+0xbbb/0x10f0 [ 142.335956][ T9758] 
vms_gather_munmap_vmas+0x1c5/0x12e0 [ 142.336911][ T9758] __mmap_region+0x475/0x2a70 [ 142.337740][ T9758] mmap_region+0x1b2/0x3e0 [ 142.338525][ T9758] do_mmap+0xa42/0x11e0 [ 142.339270][ T9758] vm_mmap_pgoff+0x280/0x460 [ 142.340090][ T9758] ksys_mmap_pgoff+0x330/0x5d0 [ 142.340938][ T9758] __x64_sys_mmap+0x127/0x190 [ 142.341771][ T9758] do_syscall_64+0xcb/0xf80 [ 142.342578][ T9758] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 142.343615][ T9758] [ 142.344035][ T9758] Freed by task 23: [ 142.344708][ T9758] kasan_save_stack+0x24/0x50 [ 142.345537][ T9758] kasan_save_track+0x14/0x30 [ 142.346372][ T9758] kasan_save_free_info+0x3b/0x60 [ 142.347273][ T9758] __kasan_slab_free+0x61/0x80 [ 142.348121][ T9758] slab_free_after_rcu_debug+0x109/0x300 [ 142.349105][ T9758] rcu_core+0x7a1/0x1600 [ 142.349853][ T9758] handle_softirqs+0x1d4/0x8e0 [ 142.350710][ T9758] run_ksoftirqd+0x3a/0x60 [ 142.351503][ T9758] smpboot_thread_fn+0x3d4/0xaa0 [ 142.352377][ T9758] kthread+0x3d0/0x780 [ 142.353103][ T9758] ret_from_fork+0x966/0xaf0 [ 142.353921][ T9758] ret_from_fork_asm+0x1a/0x30 [ 142.354775][ T9758] [ 142.355195][ T9758] Last potentially related work creation: [ 142.356179][ T9758] kasan_save_stack+0x24/0x50 [ 142.357013][ T9758] kasan_record_aux_stack+0xa7/0xc0 [ 142.357924][ T9758] kmem_cache_free+0x44f/0x760 [ 142.358768][ T9758] __put_anon_vma+0x114/0x390 [ 142.359596][ T9758] unlink_anon_vmas+0x57f/0x890 [ 142.360449][ T9758] move_vma+0x15e1/0x1970 [ 142.361214][ T9758] mremap_to+0x1c3/0x450 [ 142.361966][ T9758] do_mremap+0x12b3/0x2090 [ 142.362753][ T9758] __do_sys_mremap+0x119/0x170 [ 142.363596][ T9758] do_syscall_64+0xcb/0xf80 [ 142.364403][ T9758] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 142.365435][ T9758] [ 142.365858][ T9758] The buggy address belongs to the object at ffff88802856d880 [ 142.365858][ T9758] which belongs to the cache anon_vma of size 208 [ 142.368200][ T9758] The buggy address is located 160 bytes inside of [ 142.368200][ T9758] freed 
208-byte region [ffff88802856d880, ffff88802856d950) [ 142.370541][ T9758] [ 142.370967][ T9758] The buggy address belongs to the physical page: [ 142.372076][ T9758] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2856d [ 142.373580][ T9758] memcg:ffff888000180f01 [ 142.374324][ T9758] ksm flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff) [ 142.375621][ T9758] page_type: f5(slab) [ 142.376325][ T9758] raw: 00fff00000000000 ffff888040416140 ffffea000082c080 dead000000000003 [ 142.377805][ T9758] raw: 0000000000000000 00000000800f000f 00000000f5000000 ffff888000180f01 [ 142.379284][ T9758] page dumped because: kasan: bad access detected [ 142.380392][ T9758] page_owner tracks the page as allocated [ 142.381378][ T9758] page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 7254, tgid 7254 (systemd-udevd), ts 49831929003, free_ts 49824984874 [ 142.384666][ T9758] post_alloc_hook+0x1ca/0x240 [ 142.385505][ T9758] get_page_from_freelist+0xdb3/0x2a70 [ 142.386464][ T9758] __alloc_frozen_pages_noprof+0x256/0x20f0 [ 142.387499][ T9758] alloc_pages_mpol+0x1f1/0x550 [ 142.388365][ T9758] new_slab+0x2d0/0x440 [ 142.389100][ T9758] ___slab_alloc+0xdd8/0x1bc0 [ 142.389927][ T9758] __slab_alloc.constprop.0+0x66/0x110 [ 142.390882][ T9758] kmem_cache_alloc_noprof+0x4ba/0x790 [ 142.391837][ T9758] anon_vma_fork+0xe6/0x630 [ 142.392638][ T9758] dup_mmap+0x1285/0x2010 [ 142.393408][ T9758] copy_process+0x3747/0x7450 [ 142.394236][ T9758] kernel_clone+0xea/0x880 [ 142.395023][ T9758] __do_sys_clone+0xce/0x120 [ 142.395836][ T9758] do_syscall_64+0xcb/0xf80 [ 142.396646][ T9758] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 142.397680][ T9758] page last free pid 7224 tgid 7224 stack trace: [ 142.398937][ T9758] __free_frozen_pages+0x83e/0x1130 [ 142.399864][ T9758] inode_doinit_with_dentry+0xb0d/0x11f0 [ 142.400856][ T9758] selinux_d_instantiate+0x27/0x30 [ 142.401759][ T9758] 
security_d_instantiate+0x142/0x1a0 [ 142.402709][ T9758] d_splice_alias_ops+0x94/0x830 [ 142.403588][ T9758] kernfs_iop_lookup+0x23d/0x2d0 [ 142.404463][ T9758] __lookup_slow+0x251/0x480 [ 142.405280][ T9758] lookup_slow+0x51/0x80 [ 142.406032][ T9758] path_lookupat+0x5fe/0xb80 [ 142.406851][ T9758] filename_lookup+0x213/0x5e0 [ 142.407701][ T9758] vfs_statx+0xf2/0x3d0 [ 142.408433][ T9758] __do_sys_newstat+0x96/0x120 [ 142.409273][ T9758] do_syscall_64+0xcb/0xf80 [ 142.410083][ T9758] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 142.411114][ T9758] [ 142.411534][ T9758] Memory state around the buggy address: [ 142.412508][ T9758] ffff88802856d800: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 142.413894][ T9758] ffff88802856d880: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 142.415277][ T9758] >ffff88802856d900: fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc [ 142.416662][ T9758] ^ [ 142.417549][ T9758] ffff88802856d980: fc fc fa fb fb fb fb fb fb fb fb fb fb fb fb fb [ 142.419228][ T9758] ffff88802856da00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ 142.420929][ T9758] ================================================================== [ 142.422724][ T9758] Kernel panic - not syncing: KASAN: panic_on_warn set ... [ 142.424255][ T9758] CPU: 1 UID: 0 PID: 9758 Comm: test Not tainted 6.19.0-rc2-00098-gc53f467229a7 #20 PREEMPT(full) [ 142.426503][ T9758] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 142.428429][ T9758] Call Trace: [ 142.429138][ T9758] <TASK> [ 142.429774][ T9758] dump_stack_lvl+0x3d/0x1b0 [ 142.430774][ T9758] vpanic+0x679/0x710 [ 142.431639][ T9758] panic+0xc2/0xd0 [ 142.432427][ T9758] ? __pfx_panic+0x10/0x10 [ 142.433345][ T9758] ? folio_remove_rmap_ptes+0x260/0xfc0 [ 142.434491][ T9758] ? check_panic_on_warn+0x1f/0xc0 [ 142.435548][ T9758] ? folio_remove_rmap_ptes+0x260/0xfc0 [ 142.436738][ T9758] check_panic_on_warn+0xb1/0xc0 [ 142.437805][ T9758] ? 
folio_remove_rmap_ptes+0x260/0xfc0 [ 142.438986][ T9758] end_report+0x107/0x160 [ 142.439925][ T9758] kasan_report+0xd8/0x100 [ 142.440902][ T9758] ? folio_remove_rmap_ptes+0x260/0xfc0 [ 142.442082][ T9758] kasan_check_range+0x39/0x1c0 [ 142.443111][ T9758] folio_remove_rmap_ptes+0x260/0xfc0 [ 142.444269][ T9758] unmap_page_range+0x1c70/0x4300 [ 142.445370][ T9758] ? __pfx_unmap_page_range+0x10/0x10 [ 142.446520][ T9758] ? uprobe_munmap+0x440/0x600 [ 142.447558][ T9758] ? uprobe_munmap+0x470/0x600 [ 142.448596][ T9758] unmap_single_vma+0x153/0x230 [ 142.449650][ T9758] unmap_vmas+0x1d6/0x430 [ 142.450594][ T9758] ? __pfx_unmap_vmas+0x10/0x10 [ 142.451647][ T9758] ? __sanitizer_cov_trace_switch+0x54/0x90 [ 142.452911][ T9758] ? mas_update_gap+0x30a/0x4f0 [ 142.453966][ T9758] vms_clear_ptes.part.0+0x362/0x6b0 [ 142.455107][ T9758] ? __pfx_vms_clear_ptes.part.0+0x10/0x10 [ 142.456351][ T9758] ? __pfx_mas_store_gfp+0x10/0x10 [ 142.457450][ T9758] ? unlink_anon_vmas+0x457/0x890 [ 142.458523][ T9758] vms_complete_munmap_vmas+0x6cf/0xa20 [ 142.459715][ T9758] do_vmi_align_munmap+0x430/0x800 [ 142.460817][ T9758] ? __pfx_do_vmi_align_munmap+0x10/0x10 [ 142.462034][ T9758] ? mas_walk+0x6b7/0x8c0 [ 142.462971][ T9758] do_vmi_munmap+0x1f0/0x3d0 [ 142.463973][ T9758] do_munmap+0xb6/0xf0 [ 142.464861][ T9758] ? __pfx_do_munmap+0x10/0x10 [ 142.465903][ T9758] ? mas_walk+0x6b7/0x8c0 [ 142.466847][ T9758] mremap_to+0x242/0x450 [ 142.467765][ T9758] do_mremap+0x12b3/0x2090 [ 142.468727][ T9758] ? __pfx_do_mremap+0x10/0x10 [ 142.469763][ T9758] __do_sys_mremap+0x119/0x170 [ 142.470789][ T9758] ? __pfx___do_sys_mremap+0x10/0x10 [ 142.471926][ T9758] ? __x64_sys_futex+0x1c5/0x4d0 [ 142.472986][ T9758] ? 
__x64_sys_futex+0x1ce/0x4d0 [ 142.474063][ T9758] do_syscall_64+0xcb/0xf80 [ 142.475050][ T9758] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 142.476312][ T9758] RIP: 0033:0x7f5736fa5fc9 [ 142.477261][ T9758] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 8e 0d 00 f7 d8 64 89 01 48 [ 142.481362][ T9758] RSP: 002b:00007f5736eace98 EFLAGS: 00000297 ORIG_RAX: 0000000000000019 [ 142.483134][ T9758] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5736fa5fc9 [ 142.484836][ T9758] RDX: 0000000000004000 RSI: 0000000000004000 RDI: 0000200000ffc000 [ 142.486536][ T9758] RBP: 00007f5736eacec0 R08: 0000200000002000 R09: 0000000000000000 [ 142.488223][ T9758] R10: 0000000000000007 R11: 0000000000000297 R12: 00007fff0d19497e [ 142.489913][ T9758] R13: 00007fff0d19497f R14: 00007f5736eacfc0 R15: 0000000000022000 [ 142.491609][ T9758] </TASK> ``` Since there are no commits in between these two commits, I am certain that the bug is introduced by this commit. > > This commit doesn't mention anything about MREMAP_DONTUNMAP. Is it really > > acceptable for MREMAP_DONTUNMAP, which maintains old_address and aliases > > new_address, to use move-only fastpath? > > > > If MREMAP_DONTUNMAP can also use fastpath, I think a sophisticated > > refactoring of remap_move is needed to manage anon_vma/rmap lifetimes. > > Otherwise, adding simple flag check logic to vrm_move_only() is likely > > necessary. > > > > What are your thoughts? > > It's late at night, so... > let me look at at this tomorrow with a clearer mind :) > > Happy new year, by the way! Happy new year to you too! :) > > -- > Cheers, > Harry / Hyeonggon Regards, Jeongjun Park ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2026-01-01 14:30 ` Jeongjun Park @ 2026-01-01 16:32 ` Lorenzo Stoakes 2026-01-01 17:06 ` David Hildenbrand (Red Hat) 0 siblings, 1 reply; 13+ messages in thread From: Lorenzo Stoakes @ 2026-01-01 16:32 UTC (permalink / raw) To: Jeongjun Park Cc: Harry Yoo, Liam.Howlett, akpm, david, jannh, linux-kernel, linux-mm, riel, syzbot+b165fc2e11771c66d8ba, syzkaller-bugs, vbabka On Thu, Jan 01, 2026 at 11:30:52PM +0900, Jeongjun Park wrote: > > Based on my testing, I found that the WARNING starts from commit > d23cb648e365 ("mm/mremap: permit mremap() move of multiple VMAs"), > which is right after commit 2cf442d74216 ("mm/mremap: clean up mlock > populate behavior") in Lorenzo's mremap-related patch series. OK let me take a look. Thanks, Lorenzo ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2026-01-01 16:32 ` Lorenzo Stoakes @ 2026-01-01 17:06 ` David Hildenbrand (Red Hat) 2026-01-01 21:28 ` Lorenzo Stoakes 0 siblings, 1 reply; 13+ messages in thread From: David Hildenbrand (Red Hat) @ 2026-01-01 17:06 UTC (permalink / raw) To: Lorenzo Stoakes, Jeongjun Park Cc: Harry Yoo, Liam.Howlett, akpm, jannh, linux-kernel, linux-mm, riel, syzbot+b165fc2e11771c66d8ba, syzkaller-bugs, vbabka On 1/1/26 17:32, Lorenzo Stoakes wrote: > On Thu, Jan 01, 2026 at 11:30:52PM +0900, Jeongjun Park wrote: >> >> Based on my testing, I found that the WARNING starts from commit >> d23cb648e365 ("mm/mremap: permit mremap() move of multiple VMAs"), >> which is right after commit 2cf442d74216 ("mm/mremap: clean up mlock >> populate behavior") in Lorenzo's mremap-related patch series. > > OK let me take a look. Trying to make sense of the reproducer and how bpf comes into play ... I assume BPF is only used to install a uprobe. We seem to create a file0 and register a uprobe on it. We then mmap() that file with PROT_NONE. We should end up in uprobe_mmap() and trigger a COW fault -> allocate an anon_vma. So likely the bpf magic is only there to allocate an anon_vma for a PROT_NONE region. But it's all a bit confusing ... :) -- Cheers David ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2026-01-01 17:06 ` David Hildenbrand (Red Hat) @ 2026-01-01 21:28 ` Lorenzo Stoakes 0 siblings, 0 replies; 13+ messages in thread From: Lorenzo Stoakes @ 2026-01-01 21:28 UTC (permalink / raw) To: David Hildenbrand (Red Hat) Cc: Jeongjun Park, Harry Yoo, Liam.Howlett, akpm, jannh, linux-kernel, linux-mm, riel, syzbot+b165fc2e11771c66d8ba, syzkaller-bugs, vbabka On Thu, Jan 01, 2026 at 06:06:23PM +0100, David Hildenbrand (Red Hat) wrote: > On 1/1/26 17:32, Lorenzo Stoakes wrote: > > On Thu, Jan 01, 2026 at 11:30:52PM +0900, Jeongjun Park wrote: > > > > > > Based on my testing, I found that the WARNING starts from commit > > > d23cb648e365 ("mm/mremap: permit mremap() move of multiple VMAs"), > > > which is right after commit 2cf442d74216 ("mm/mremap: clean up mlock > > > populate behavior") in Lorenzo's mremap-related patch series. > > > > OK let me take a look. > > Trying to make sense of the reproducer and how bpf comes into play ... I > assume BPF is only used to install a uprobe. > > We seem to create a file0 and register a uprobe on it. > > We then mmap() that file with PROT_NONE. We should end up in uprobe_mmap() > and trigger a COW fault -> allocate an anon_vma. > > So likely the bpf magic is only there to allocate an anon_vma for a > PROT_NONE region. > > But it's all a bit confusing ... :) > > -- > Cheers > > David OK I had a huge reply going through all of Jeongjun's stuff (thanks for reporting!) but then got stuck into theories and highways and byways... all the while I couldn't repro. Well now I can repro reliably, finally! So I will dig into this more tomorrow. Having a reliable repro makes this vastly easier. I have theories... almost tempting to carry on right now but I'll end up not sleeping :) Cheers, Lorenzo ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [mm?] WARNING in folio_remove_rmap_ptes 2026-01-01 13:09 ` Jeongjun Park 2026-01-01 13:45 ` Harry Yoo @ 2026-01-01 16:54 ` Lorenzo Stoakes 1 sibling, 0 replies; 13+ messages in thread From: Lorenzo Stoakes @ 2026-01-01 16:54 UTC (permalink / raw) To: Jeongjun Park Cc: harry.yoo, Liam.Howlett, akpm, david, jannh, linux-kernel, linux-mm, riel, syzbot+b165fc2e11771c66d8ba, syzkaller-bugs, vbabka On Thu, Jan 01, 2026 at 10:09:06PM +0900, Jeongjun Park wrote: > Harry Yoo wrote: > > On Tue, Dec 30, 2025 at 11:02:18PM +0100, David Hildenbrand (Red Hat) wrote: > > > On 12/24/25 06:35, Harry Yoo wrote: > > > > On Mon, Dec 22, 2025 at 09:23:17PM -0800, syzbot wrote: > > > > Perhaps we want yet another DEBUG_VM feature to record when it's been > > > > dropped to zero and report it in the sanity check, or... imagine harder > > > > how a file VMA that has anon_vma involving CoW / GUP / migration / > > > > reclamation could somehow drop the refcount to zero? > > > > > > > > Sounds fun ;) > > > > > > > > > > Can we bisect the issue given that we have a reproducer? > > > > Unfortunately I could not reproduce the issue with the C reproducer, > > even with the provided kernel config. Maybe it's a race condition and > > I didn't wait long enough... > > > > > This only popped up just now, so I would assume it's actually something that > > > went into this release that makes it trigger. 
> > > > I was assuming the bug has been there even before the addition of > > VM_WARN_ON_ONCE(), as the commit a222439e1e27 ("mm/rmap: add anon_vma > > lifetime debug check") says: > > > There have been syzkaller reports a few months ago[1][2] of UAF in rmap > > > walks that seems to indicate that there can be pages with elevated > > > mapcount whose anon_vma has already been freed, but I think we never > > > figured out what the cause is; and syzkaller only hit these UAFs when > > > memory pressure randomly caused reclaim to rmap-walk the affected pages, > > > so it of course didn't manage to create a reproducer. > > > > > > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios > > > to hopefully catch such issues more reliably. > > > > I tested this myself and found that the bug is caused by commit > d23cb648e365 ("mm/mremap: permit mremap() move of multiple VMAs"). > > This commit doesn't mention anything about MREMAP_DONTUNMAP. Is it really > acceptable for MREMAP_DONTUNMAP, which maintains old_address and aliases > new_address, to use the move-only fast path? It's not a fast path, it permits multiple VMAs to be moved at once for convenience (most importantly - to avoid users _having to know_ how the kernel is going to handle VMA merging esp. in the light of confusing rules around merging of VMAs that map anonymous memory). When MREMAP_DONTUNMAP is used, it doesn't leave the mapping as-is, it moves all the page tables, it just leaves the existing VMA where it is. There should be no problem with doing this. Obviously the fact there's a bug suggests there _is_ a problem. This should be no different from individually mremap()'ing each of the VMAs separately. > > If MREMAP_DONTUNMAP can also use the fast path, I think a sophisticated > refactoring of remap_move is needed to manage anon_vma/rmap lifetimes. Why exactly? In dontunmap_complete() we unlink all attached anon_vma's explicitly, assuming we haven't just merged with the VMA we just moved. 
We don't have to do so for file-backed VMAs, nor should there be any
lifetime issues, because the VMA will fault in from the file on access.

> Otherwise, adding simple flag check logic to vrm_move_only() is likely
> necessary.

I'd say let's figure out the bug and see if there's any necessity for
this. So far I haven't been able to reproduce it locally... :) and it
seems you could only reproduce it once so far?

That makes this something of a pain; it seems like a race, and the fact
the repro uses BPF is also... not great for nailing this down :) But I
am looking into it.

One possibility is that it's relying on a just-so arrangement of VMAs
that triggers some horrible merge corner case. This bit of code:

	/*
	 * anon_vma links of the old vma is no longer needed after its page
	 * table has been moved.
	 */
	if (new_vma != vrm->vma && start == old_start && end == old_end)
		unlink_anon_vmas(vrm->vma);

makes me wonder whether a merge that happens to occur here triggers the
!unlink_anon_vmas() case... but then this really shouldn't be any
different from running mremap() repeatedly for each individual VMA.

> What are your thoughts?

As Ash from Alien said - I am collating :)

Happy new year to all... :) Am officially on holiday until Monday but
will try to look into this at least for today/tomorrow.

> > --
> > Cheers,
> > Harry / Hyeonggon
>
> Regards,
> Jeongjun Park

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 13+ messages in thread
end of thread, other threads:[~2026-01-01 21:29 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-23  5:23 [syzbot] [mm?] WARNING in folio_remove_rmap_ptes syzbot
2025-12-23  8:24 ` David Hildenbrand (Red Hat)
2025-12-24  2:48 ` Hillf Danton
2025-12-24  5:35 ` Harry Yoo
2025-12-30 22:02 ` David Hildenbrand (Red Hat)
2025-12-31  6:59 ` Harry Yoo
2026-01-01 13:09 ` Jeongjun Park
2026-01-01 13:45 ` Harry Yoo
2026-01-01 14:30 ` Jeongjun Park
2026-01-01 16:32 ` Lorenzo Stoakes
2026-01-01 17:06 ` David Hildenbrand (Red Hat)
2026-01-01 21:28 ` Lorenzo Stoakes
2026-01-01 16:54 ` Lorenzo Stoakes