* [syzbot] [mm?] kernel BUG in hpage_collapse_scan_file (2)
From: syzbot @ 2026-01-25 2:23 UTC
To: Liam.Howlett, akpm, baohua, baolin.wang, david, dev.jain, lance.yang, linux-kernel, linux-mm, lorenzo.stoakes, npache, ryan.roberts, syzkaller-bugs, ziy

Hello,

syzbot found the following issue on:

HEAD commit:    ca3a02fda4da Add linux-next specific files for 20260123
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=10c42452580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=10f2b64f8f12b9a4
dashboard link: https://syzkaller.appspot.com/bug?extid=bf6e6a6ca143afea5ca2
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17f7cbfa580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=112d405a580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/291ebca63a31/disk-ca3a02fd.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/b2112a214b54/vmlinux-ca3a02fd.xz
kernel image: https://storage.googleapis.com/syzbot-assets/77d1ae437e07/bzImage-ca3a02fd.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+bf6e6a6ca143afea5ca2@syzkaller.appspotmail.com

node ffff888148816ec0 offset 0 parent ffff888148817700 shift 0 count 64 values 0 array ffff88807be6b0f0 list ffff888148816ed8 ffff888148816ed8 marks 0 0 0
------------[ cut here ]------------
kernel BUG at ./include/linux/xarray.h:1441!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 6017 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/13/2026
RIP: 0010:XAS_INVALID include/linux/xarray.h:1441 [inline]
RIP: 0010:collapse_file mm/khugepaged.c:2041 [inline]
RIP: 0010:hpage_collapse_scan_file+0x4e0c/0x50e0 mm/khugepaged.c:2387
Code: ff 48 89 df 48 c7 c6 c0 8c bc 8b e8 ee 6c f6 fe 90 0f 0b 48 85 db 0f 84 29 01 00 00 e8 bd 34 91 ff 48 89 df e8 f5 c4 4b 09 90 <0f> 0b e8 ad 34 91 ff 48 89 df 48 c7 c6 c0 8c bc 8b e8 be 6c f6 fe
RSP: 0018:ffffc9000422f120 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888148816ec0 RCX: 6c90c8cc739bf400
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffffc9000422f428 R08: ffffc9000422eea7 R09: 1ffff92000845dd4
R10: dffffc0000000000 R11: fffff52000845dd5 R12: 00000003fffffffc
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000555592982500(0000) GS:ffff8881256ef000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b30363fff CR3: 0000000031994000 CR4: 00000000003526f0
Call Trace:
 <TASK>
 madvise_collapse+0x42f/0xb30 mm/khugepaged.c:2817
 madvise_vma_behavior+0x10ad/0x43f0 mm/madvise.c:1372
 madvise_walk_vmas+0x57a/0xaf0 mm/madvise.c:1721
 madvise_do_behavior+0x386/0x540 mm/madvise.c:1937
 do_madvise+0x1fa/0x2e0 mm/madvise.c:2030
 __do_sys_madvise mm/madvise.c:2039 [inline]
 __se_sys_madvise mm/madvise.c:2037 [inline]
 __x64_sys_madvise+0xa6/0xc0 mm/madvise.c:2037
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f948ad9acb9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd1ef477c8 EFLAGS: 00000246 ORIG_RAX: 000000000000001c
RAX: ffffffffffffffda RBX: 00007f948b015fa0 RCX: 00007f948ad9acb9
RDX: 0000000000000019 RSI: 0000000000600003 RDI: 0000200000000000
RBP: 00007f948ae08bf7 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f948b015fac R14: 00007f948b015fa0 R15: 00007f948b015fa0
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:XAS_INVALID include/linux/xarray.h:1441 [inline]
RIP: 0010:collapse_file mm/khugepaged.c:2041 [inline]
RIP: 0010:hpage_collapse_scan_file+0x4e0c/0x50e0 mm/khugepaged.c:2387
Code: ff 48 89 df 48 c7 c6 c0 8c bc 8b e8 ee 6c f6 fe 90 0f 0b 48 85 db 0f 84 29 01 00 00 e8 bd 34 91 ff 48 89 df e8 f5 c4 4b 09 90 <0f> 0b e8 ad 34 91 ff 48 89 df 48 c7 c6 c0 8c bc 8b e8 be 6c f6 fe
RSP: 0018:ffffc9000422f120 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888148816ec0 RCX: 6c90c8cc739bf400
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffffc9000422f428 R08: ffffc9000422eea7 R09: 1ffff92000845dd4
R10: dffffc0000000000 R11: fffff52000845dd5 R12: 00000003fffffffc
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000555592982500(0000) GS:ffff8881256ef000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000c0001002a0 CR3: 0000000031994000 CR4: 00000000003526f0


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

* Re: [syzbot] [mm?] kernel BUG in hpage_collapse_scan_file (2)
From: Lance Yang @ 2026-01-25 12:10 UTC
To: willy
Cc: syzbot+bf6e6a6ca143afea5ca2, Liam.Howlett, akpm, baohua, baolin.wang, david, dev.jain, lance.yang, linux-kernel, linux-mm, lorenzo.stoakes, npache, ryan.roberts, syzkaller-bugs, ziy

Ccing Willy.

On Sat, 24 Jan 2026 18:23:28 -0800, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    ca3a02fda4da Add linux-next specific files for 20260123
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=10c42452580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=10f2b64f8f12b9a4
> dashboard link: https://syzkaller.appspot.com/bug?extid=bf6e6a6ca143afea5ca2
> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17f7cbfa580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=112d405a580000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/291ebca63a31/disk-ca3a02fd.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/b2112a214b54/vmlinux-ca3a02fd.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/77d1ae437e07/bzImage-ca3a02fd.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+bf6e6a6ca143afea5ca2@syzkaller.appspotmail.com
>
> node ffff888148816ec0 offset 0 parent ffff888148817700 shift 0 count 64 values 0 array ffff88807be6b0f0 list ffff888148816ed8 ffff888148816ed8 marks 0 0 0
> ------------[ cut here ]------------
> kernel BUG at ./include/linux/xarray.h:1441!
> Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
> CPU: 0 UID: 0 PID: 6017 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/13/2026
> RIP: 0010:XAS_INVALID include/linux/xarray.h:1441 [inline]

Seems like that is:

```
static inline struct xa_state *XAS_INVALID(struct xa_state *xas)
{
	XA_NODE_BUG_ON(xas->xa_node, xas_valid(xas));
	return xas;
}
```

Which was added by commit 43b00759f21b (not landed upstream yet):

```
commit 43b00759f21b10142094d1ae5ff65cbb368953a3
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Sun Dec 14 10:53:31 2025 -0500

    XArray: Add extra debugging check to xas_lock and friends

    While tracking down a recent bug, we discovered somewhere that had
    forgotten to call xas_reset() before calling xas_lock(). Add a debug
    check to be sure that doesn't happen in future and fix all the places
    in the test suite which were carelessly doing just this.

    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
```

which catches places that forget to reset xas before locking.

> RIP: 0010:collapse_file mm/khugepaged.c:2041 [inline]

Yeah, maybe it caught a bug in collapse_file() ...

When we lock again with xas_lock_irq(), xas->xa_node is still pointing at
a node from the earlier xas_load(), so the BUG_ON fires, IIUC.

Fix it by calling xas_set() before xas_lock_irq() to reset the state.
And one spot in rollback doesn't actually need xas at all, just changed
it to xa_lock_irq() directly.

---8<---
commit 2003255c52846ab10cad6c2e57cda4d17dddadbe
Author: Lance Yang <lance.yang@linux.dev>
Date:   Sun Jan 25 19:37:56 2026 +0800

    HACK

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index fba6aea5bea6..3656ae491385 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2038,6 +2038,7 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 			try_to_unmap(folio,
 					TTU_IGNORE_MLOCK | TTU_BATCH_FLUSH);
 
+		xas_set(&xas, index);
 		xas_lock_irq(&xas);
 
 		VM_BUG_ON_FOLIO(folio != xa_load(xas.xa, index), folio);
@@ -2140,9 +2141,8 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 		int nr_none_check = 0;
 
 		i_mmap_lock_read(mapping);
-		xas_lock_irq(&xas);
-
 		xas_set(&xas, start);
+		xas_lock_irq(&xas);
 		for (index = start; index < end; index++) {
 			if (!xas_next(&xas)) {
 				xas_store(&xas, XA_RETRY_ENTRY);
@@ -2192,6 +2192,7 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 			goto rollback;
 		}
 	} else {
+		xas_set(&xas, start);
 		xas_lock_irq(&xas);
 	}
 
@@ -2250,9 +2251,9 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 rollback:
 	/* Something went wrong: roll back page cache changes */
 	if (nr_none) {
-		xas_lock_irq(&xas);
+		xa_lock_irq(&mapping->i_pages);
 		mapping->nrpages -= nr_none;
-		xas_unlock_irq(&xas);
+		xa_unlock_irq(&mapping->i_pages);
 		shmem_uncharge(mapping->host, nr_none);
 	}
 
---

Tested with the syzbot reproducer[1], no more crashes :)

[1] https://syzkaller.appspot.com/x/repro.c?x=112d405a580000

Cheers,
Lance

[...]

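The ordering rule discussed in the message above (reset the xa_state before taking the lock again) can be sketched in userspace. This is only a mock under stated assumptions: every `mock_*` name and `lock_after_walk()` are hypothetical, the cursor is reduced to a single cached pointer, and the lock is modeled by the debug check alone. It is not the kernel XArray implementation, and it counts violations where the kernel check would BUG.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel standing in for the "restart the walk" cursor state. */
#define MOCK_RESTART ((void *)1)

/* Hypothetical, heavily simplified stand-in for struct xa_state. */
struct mock_xas {
	unsigned long index;	/* index the cursor targets */
	void *node;		/* position cached by the last walk */
	int violations;		/* locks taken while a position was still cached */
};

/* True while the cursor still caches a position from an earlier walk. */
static bool mock_xas_valid(const struct mock_xas *xas)
{
	return xas->node != MOCK_RESTART;
}

/* Retarget the cursor and drop any cached position. */
static void mock_xas_set(struct mock_xas *xas, unsigned long index)
{
	xas->index = index;
	xas->node = MOCK_RESTART;
}

/* A lookup leaves the cursor pointing at wherever the walk ended. */
static void mock_xas_load(struct mock_xas *xas, void *position)
{
	xas->node = position;
}

/* Models only the debug check: record a violation if the cursor is stale. */
static void mock_xas_lock(struct mock_xas *xas)
{
	if (mock_xas_valid(xas))
		xas->violations++;
}

/* Walk, optionally reset, then lock; returns the number of violations seen. */
int lock_after_walk(bool reset_first)
{
	int slots[4] = {0};
	struct mock_xas xas = { .index = 0, .node = MOCK_RESTART, .violations = 0 };

	assert(!mock_xas_valid(&xas));	/* a fresh cursor starts out reset */

	mock_xas_load(&xas, slots);	/* earlier lookup caches a position */
	if (reset_first)
		mock_xas_set(&xas, 2);	/* the ordering the patch above uses */
	mock_xas_lock(&xas);
	return xas.violations;
}
```

In this mock, `lock_after_walk(false)` reports one violation and `lock_after_walk(true)` reports none. Whether resetting at each call site is the right fix for collapse_file() is a separate question that the thread leaves open.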
* Re: [syzbot] [mm?] kernel BUG in hpage_collapse_scan_file (2)
From: David Hildenbrand (Red Hat) @ 2026-01-25 18:13 UTC
To: Lance Yang, willy
Cc: syzbot+bf6e6a6ca143afea5ca2, Liam.Howlett, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, lorenzo.stoakes, npache, ryan.roberts, syzkaller-bugs, ziy

On 1/25/26 13:10, Lance Yang wrote:
> Ccing Willy.
>
> On Sat, 24 Jan 2026 18:23:28 -0800, syzbot wrote:
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:    ca3a02fda4da Add linux-next specific files for 20260123
>> git tree:       linux-next
>> console output: https://syzkaller.appspot.com/x/log.txt?x=10c42452580000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=10f2b64f8f12b9a4
>> dashboard link: https://syzkaller.appspot.com/bug?extid=bf6e6a6ca143afea5ca2
>> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17f7cbfa580000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=112d405a580000
>>
>> Downloadable assets:
>> disk image: https://storage.googleapis.com/syzbot-assets/291ebca63a31/disk-ca3a02fd.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/b2112a214b54/vmlinux-ca3a02fd.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/77d1ae437e07/bzImage-ca3a02fd.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+bf6e6a6ca143afea5ca2@syzkaller.appspotmail.com
>>
>> node ffff888148816ec0 offset 0 parent ffff888148817700 shift 0 count 64 values 0 array ffff88807be6b0f0 list ffff888148816ed8 ffff888148816ed8 marks 0 0 0
>> ------------[ cut here ]------------
>> kernel BUG at ./include/linux/xarray.h:1441!
>> Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
>> CPU: 0 UID: 0 PID: 6017 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/13/2026
>> RIP: 0010:XAS_INVALID include/linux/xarray.h:1441 [inline]
>
> Seems like that is:
>
> ```
> static inline struct xa_state *XAS_INVALID(struct xa_state *xas)
> {
> 	XA_NODE_BUG_ON(xas->xa_node, xas_valid(xas));
> 	return xas;
> }
> ```

I think there was recently already a discussion about this.

See

https://lore.kernel.org/linux-mm/aVvz3tYdu49TGkjI@mozart.vkv.me/


And where Willy said that likely it needs more thought:

https://lore.kernel.org/linux-mm/aVwm3MQ_ZDa_kU8c@casper.infradead.org/

-- 
Cheers

David

* Re: [syzbot] [mm?] kernel BUG in hpage_collapse_scan_file (2)
From: Lance Yang @ 2026-01-26 1:54 UTC
To: David Hildenbrand (Red Hat), willy
Cc: syzbot+bf6e6a6ca143afea5ca2, Liam.Howlett, akpm, baohua, baolin.wang, dev.jain, linux-kernel, linux-mm, lorenzo.stoakes, npache, ryan.roberts, syzkaller-bugs, ziy

On 2026/1/26 02:13, David Hildenbrand (Red Hat) wrote:
> On 1/25/26 13:10, Lance Yang wrote:
>> Ccing Willy.
>>
>> On Sat, 24 Jan 2026 18:23:28 -0800, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:    ca3a02fda4da Add linux-next specific files for 20260123
>>> git tree:       linux-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=10c42452580000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=10f2b64f8f12b9a4
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bf6e6a6ca143afea5ca2
>>> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17f7cbfa580000
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=112d405a580000
>>>
>>> Downloadable assets:
>>> disk image: https://storage.googleapis.com/syzbot-assets/291ebca63a31/disk-ca3a02fd.raw.xz
>>> vmlinux: https://storage.googleapis.com/syzbot-assets/b2112a214b54/vmlinux-ca3a02fd.xz
>>> kernel image: https://storage.googleapis.com/syzbot-assets/77d1ae437e07/bzImage-ca3a02fd.xz
>>>
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: syzbot+bf6e6a6ca143afea5ca2@syzkaller.appspotmail.com
>>>
>>> node ffff888148816ec0 offset 0 parent ffff888148817700 shift 0 count 64 values 0 array ffff88807be6b0f0 list ffff888148816ed8 ffff888148816ed8 marks 0 0 0
>>> ------------[ cut here ]------------
>>> kernel BUG at ./include/linux/xarray.h:1441!
>>> Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
>>> CPU: 0 UID: 0 PID: 6017 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/13/2026
>>> RIP: 0010:XAS_INVALID include/linux/xarray.h:1441 [inline]
>>
>> Seems like that is:
>>
>> ```
>> static inline struct xa_state *XAS_INVALID(struct xa_state *xas)
>> {
>> 	XA_NODE_BUG_ON(xas->xa_node, xas_valid(xas));
>> 	return xas;
>> }
>> ```
>
> I think there was recently already a discussion about this.
>
> See
>
> https://lore.kernel.org/linux-mm/aVvz3tYdu49TGkjI@mozart.vkv.me/
>
>
> And where Willy said that likely it needs more thought:
>
> https://lore.kernel.org/linux-mm/aVwm3MQ_ZDa_kU8c@casper.infradead.org/

Ah, I see. Thanks for the pointer!