* [syzbot] [mm?] WARNING in __page_table_check_ptes_set
From: syzbot @ 2024-04-21 20:16 UTC
To: akpm, linux-kernel, linux-mm, pasha.tatashin, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    4eab35893071 Add linux-next specific files for 20240417
git tree:       linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1727a61b180000
kernel config:  https://syzkaller.appspot.com/x/.config?x=27920e47287645ff
dashboard link: https://syzkaller.appspot.com/bug?extid=d8426b591c36b21c750e
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=156da22d180000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=163dfec7180000

Downloadable assets:
disk image:   https://storage.googleapis.com/syzbot-assets/9f7d6c097fb4/disk-4eab3589.raw.xz
vmlinux:      https://storage.googleapis.com/syzbot-assets/287b16352982/vmlinux-4eab3589.xz
kernel image: https://storage.googleapis.com/syzbot-assets/23839c65c573/bzImage-4eab3589.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d8426b591c36b21c750e@syzkaller.appspotmail.com

------------[ cut here ]------------
WARNING: CPU: 0 PID: 5084 at mm/page_table_check.c:199 __page_table_check_pte mm/page_table_check.c:199 [inline]
WARNING: CPU: 0 PID: 5084 at mm/page_table_check.c:199 __page_table_check_ptes_set+0x1db/0x420 mm/page_table_check.c:213
Modules linked in:
CPU: 0 PID: 5084 Comm: syz-executor382 Not tainted 6.9.0-rc4-next-20240417-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
RIP: 0010:__page_table_check_pte mm/page_table_check.c:199 [inline]
RIP: 0010:__page_table_check_ptes_set+0x1db/0x420 mm/page_table_check.c:213
Code: 48 8b 7c 24 40 48 c7 c6 80 19 46 8e e8 ee df 8e ff 41 83 fc 1d 74 18 41 83 fc 1a 75 1d e8 5d da 8e ff eb 10 e8 56 da 8e ff 90 <0f> 0b 90 eb 10 e8 4b da 8e ff 90 0f 0b 90 eb 05 e8 40 da 8e ff 48
RSP: 0018:ffffc9000366f740 EFLAGS: 00010293
RAX: ffffffff8207833a RBX: ffffc9000366f7c0 RCX: ffff888022af3c00
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000000
RBP: ffffc9000366f830 R08: ffffffff820782af R09: 1ffffd40000a6a10
R10: dffffc0000000000 R11: fffff940000a6a11 R12: 0000000000000000
R13: 0000000014d42c67 R14: 0000000000000001 R15: 0000000000000000
FS:  0000555567f79380(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000066c7e0 CR3: 0000000078cb0000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 page_table_check_ptes_set include/linux/page_table_check.h:74 [inline]
 set_ptes include/linux/pgtable.h:267 [inline]
 __ptep_modify_prot_commit include/linux/pgtable.h:1269 [inline]
 ptep_modify_prot_commit include/linux/pgtable.h:1302 [inline]
 change_pte_range mm/mprotect.c:194 [inline]
 change_pmd_range mm/mprotect.c:424 [inline]
 change_pud_range mm/mprotect.c:457 [inline]
 change_p4d_range mm/mprotect.c:480 [inline]
 change_protection_range mm/mprotect.c:508 [inline]
 change_protection+0x2770/0x3cc0 mm/mprotect.c:542
 mprotect_fixup+0x740/0xa90 mm/mprotect.c:655
 do_mprotect_pkey+0x90d/0xe00 mm/mprotect.c:820
 __do_sys_mprotect mm/mprotect.c:841 [inline]
 __se_sys_mprotect mm/mprotect.c:838 [inline]
 __x64_sys_mprotect+0x80/0x90 mm/mprotect.c:838
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f45514bf429
Code: 48 83 c4 28 c3 e8 37 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe52191598 EFLAGS: 00000246 ORIG_RAX: 000000000000000a
RAX: ffffffffffffffda RBX: 00007ffe52191768 RCX: 00007f45514bf429
RDX: 000000000000000f RSI: 0000000000004000 RDI: 0000000020ffc000
RBP: 00007f4551532610 R08: 00007ffe52191768 R09: 00007ffe52191768
R10: 00007ffe52191768 R11: 0000000000000246 R12: 0000000000000001
R13: 00007ffe52191758 R14: 0000000000000001 R15: 0000000000000001
 </TASK>

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

* Re: [syzbot] [mm?] WARNING in __page_table_check_ptes_set
From: David Hildenbrand @ 2024-04-22 10:07 UTC
To: syzbot, akpm, linux-kernel, linux-mm, pasha.tatashin, syzkaller-bugs

On 21.04.24 22:16, syzbot wrote:
> [...]
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 5084 at mm/page_table_check.c:199 __page_table_check_pte mm/page_table_check.c:199 [inline]
> WARNING: CPU: 0 PID: 5084 at mm/page_table_check.c:199 __page_table_check_ptes_set+0x1db/0x420 mm/page_table_check.c:213

I think this is

	if (pte_present(pte) && pte_uffd_wp(pte))
		WARN_ON_ONCE(pte_write(pte));

> [...]
> Call Trace:
>  <TASK>
>  page_table_check_ptes_set include/linux/page_table_check.h:74 [inline]
>  set_ptes include/linux/pgtable.h:267 [inline]
>  __ptep_modify_prot_commit include/linux/pgtable.h:1269 [inline]
>  ptep_modify_prot_commit include/linux/pgtable.h:1302 [inline]
>  change_pte_range mm/mprotect.c:194 [inline]
>  change_pmd_range mm/mprotect.c:424 [inline]
>  change_pud_range mm/mprotect.c:457 [inline]
>  change_p4d_range mm/mprotect.c:480 [inline]
>  change_protection_range mm/mprotect.c:508 [inline]
>  change_protection+0x2770/0x3cc0 mm/mprotect.c:542
>  mprotect_fixup+0x740/0xa90 mm/mprotect.c:655
>  do_mprotect_pkey+0x90d/0xe00 mm/mprotect.c:820
>  __do_sys_mprotect mm/mprotect.c:841 [inline]
>  __se_sys_mprotect mm/mprotect.c:838 [inline]
>  __x64_sys_mprotect+0x80/0x90 mm/mprotect.c:838
> [...]

Did we find a real issue that involves mprotect()?

At least can_change_pte_writable() should always return "false" for
userfaultfd_pte_wp().

Do we maybe have a uffd-wp PTE outside of a UFFD_WP VMA?

Or was the PTE already writable and we only detect it now as we call
mprotect()? (missed to detect it earlier?)

--
Cheers,

David / dhildenb

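The condition David quotes in the message above boils down to one invariant: a present PTE that carries the uffd-wp bit must not also be writable. A minimal user-space sketch of that invariant follows. The `TOY_PTE_*` bit values and both function names are made up for illustration; they do not match the kernel's real PTE layout or helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for a toy PTE; NOT the kernel's real layout. */
#define TOY_PTE_PRESENT (1u << 0)
#define TOY_PTE_WRITE   (1u << 1)
#define TOY_PTE_UFFD_WP (1u << 2)

/*
 * Returns true when the toy PTE has the combination the quoted check
 * warns about: present, uffd-wp set, and writable at the same time.
 */
static inline bool toy_pte_violates_uffd_wp(uint32_t pte)
{
	if ((pte & TOY_PTE_PRESENT) && (pte & TOY_PTE_UFFD_WP))
		return (pte & TOY_PTE_WRITE) != 0;
	return false;
}

/* User-space stand-in for the warning: abort instead of WARN_ON_ONCE(). */
static inline void toy_check_pte(uint32_t pte)
{
	assert(!toy_pte_violates_uffd_wp(pte));
}
```

With these toy flags, `TOY_PTE_PRESENT | TOY_PTE_UFFD_WP | TOY_PTE_WRITE` is the only combination that trips the check; clearing any one of the three bits makes it pass.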
* Re: [syzbot] [mm?] WARNING in __page_table_check_ptes_set
From: David Hildenbrand @ 2024-04-22 10:38 UTC
To: syzbot, akpm, linux-kernel, linux-mm, pasha.tatashin, syzkaller-bugs

On 22.04.24 12:07, David Hildenbrand wrote:
> On 21.04.24 22:16, syzbot wrote:
>> [...]
>
> Did we find a real issue that involves mprotect()?
>
> At least can_change_pte_writable() should always return "false" for
> userfaultfd_pte_wp().
>
> Do we maybe have a uffd-wp PTE outside of a UFFD_WP VMA?
>
> Or was the PTE already writable and we only detect it now as we call
> mprotect()? (missed to detect it earlier?)

Staring at the reproducer, we do


  syscall(__NR_mmap, /*addr=*/0x1ffff000ul, /*len=*/0x1000ul, /*prot=*/0ul,
          /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/ 0x32ul, /*fd=*/-1,
          /*offset=*/0ul);
  syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0x1000000ul,
          /*prot=PROT_WRITE|PROT_READ|PROT_EXEC*/ 7ul,
          /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/ 0x32ul, /*fd=*/-1,
          /*offset=*/0ul);

-> Writable anonymous memory

  syscall(__NR_mmap, /*addr=*/0x21000000ul, /*len=*/0x1000ul, /*prot=*/0ul,
          /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/ 0x32ul, /*fd=*/-1,
          /*offset=*/0ul);
  intptr_t res = 0;
  res = syscall(__NR_userfaultfd,
                /*flags=UFFD_USER_MODE_ONLY|O_NONBLOCK*/ 0x801ul);
  if (res != -1)
    r[0] = res;
  *(uint64_t*)0x200004c0 = 0xaa;
  *(uint64_t*)0x200004c8 = 0;
  *(uint64_t*)0x200004d0 = 0;
  syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc018aa3f, /*arg=*/0x200004c0ul);

-> _UFFDIO_API handshake?

  syscall(__NR_mprotect, /*addr=*/0x20ffc000ul, /*len=*/0x3000ul,
          /*prot=PROT_SEM|PROT_EXEC*/ 0xcul);

-> Protect target range R/O. I assume: no page populated yet?
-> 3 pages starting at 0x20ffc000ul;

  *(uint64_t*)0x20000180 = 0x20ffc000;
  *(uint64_t*)0x20000188 = 0x3000;
  *(uint64_t*)0x20000190 = 3;
  *(uint64_t*)0x20000198 = 0;
  syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc020aa00, /*arg=*/0x20000180ul);

-> _UFFDIO_REGISTER (aa00)
-> _range = 3 pages starting at 0x20ffc000ul
-> _mode = UFFDIO_REGISTER_MODE_MISSING | UFFDIO_REGISTER_MODE_WP

  *(uint64_t*)0x20000000 = 0x20ffd000;
  *(uint64_t*)0x20000008 = 0x20ffb000;
  *(uint64_t*)0x20000010 = 0x1000;
  *(uint64_t*)0x20000018 = 3;
  *(uint64_t*)0x20000020 = 0;
  syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc028aa03, /*arg=*/0x20000000ul);

-> _UFFDIO_COPY (aa03)
-> dst = 0x20ffd000
-> src = 0x20ffb000
-> len = 0x1000 (single page)
-> mode = UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP

-> We are copying into the R/O range. src should be R/W and trigger a page fault
   on access where we get a fresh page.

  *(uint16_t*)0x200000c0 = 1;
  *(uint64_t*)0x200000c8 = 0x20000040;
  *(uint16_t*)0x20000040 = 6;
  *(uint8_t*)0x20000042 = 0;
  *(uint8_t*)0x20000043 = 0;
  *(uint32_t*)0x20000044 = 0x7fffffff;
  res = syscall(__NR_seccomp, /*op=*/1ul, /*flags=*/0ul, /*arg=*/0x200000c0ul);
  if (res != -1)
    r[1] = res;
  syscall(__NR_open_tree, /*dfd=*/-1, /*filename=*/0ul, /*flags=*/0ul);

-> No idea what happens here and if it is relevant. If __NR_seccomp failed, we would
   not set r[1].

  syscall(__NR_close_range, /*fd=*/r[1], /*max_fd=*/-1, /*flags=*/0ul);

-> Is that closing uffd as well, especially if __NR_seccomp failed?

  syscall(__NR_mprotect, /*addr=*/0x20ffc000ul, /*len=*/0x4000ul,
          /*prot=PROT_SEM|PROT_WRITE|PROT_READ|PROT_EXEC*/ 0xful);

-> Restore write permissions. This seems to fire the uffd-wp page table check I assume.

--
Cheers,

David / dhildenb

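The raw ioctl command numbers in the syzkaller reproducer above can be decoded mechanically, which is one way to confirm guesses such as "UFFDIO_API handshake?". A minimal sketch follows, assuming a Linux system (x86-64 or another architecture using the generic ioctl encoding) where `<linux/ioctl.h>` provides the `_IOC_*` macros. `struct ioctl_fields`, `decode_ioctl()`, `mode_has()` and the `SKETCH_*` constants are hypothetical names introduced here; the bit values are assumed to mirror the `UFFDIO_REGISTER_MODE_*` definitions in `<linux/userfaultfd.h>`.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/ioctl.h> /* _IOC_TYPE, _IOC_NR, _IOC_SIZE */

/* Assumed to mirror the UFFDIO_REGISTER_MODE_* bits in <linux/userfaultfd.h>. */
#define SKETCH_REGISTER_MODE_MISSING (UINT64_C(1) << 0)
#define SKETCH_REGISTER_MODE_WP      (UINT64_C(1) << 1)
#define SKETCH_REGISTER_MODE_MINOR   (UINT64_C(1) << 2)

/* The three fields packed into an ioctl command number (direction omitted). */
struct ioctl_fields {
	unsigned int type; /* "magic" byte; 0xaa in the reproducer's commands */
	unsigned int nr;   /* command number within that type */
	unsigned int size; /* size of the argument struct, in bytes */
};

/* Split a raw command such as 0xc018aa3f into its fields. */
static inline struct ioctl_fields decode_ioctl(unsigned long cmd)
{
	struct ioctl_fields f = {
		.type = (unsigned int)_IOC_TYPE(cmd),
		.nr = (unsigned int)_IOC_NR(cmd),
		.size = (unsigned int)_IOC_SIZE(cmd),
	};
	return f;
}

/* Non-zero if `bit` is set in a mode word such as uffdio_register.mode. */
static inline int mode_has(uint64_t mode, uint64_t bit)
{
	return (mode & bit) != 0;
}
```

Under these assumptions, `decode_ioctl(0xc018aa3f)` yields type 0xaa, nr 0x3f and a 24-byte argument, and `0xc020aa00` and `0xc028aa03` decode to nr 0x00 and nr 0x03 with 32-byte and 40-byte arguments. The `uffdio_register` mode value 3 from the reproducer has the MISSING and WP bits set and the MINOR bit clear.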
* Re: [syzbot] [mm?] WARNING in __page_table_check_ptes_set
From: David Hildenbrand @ 2024-04-22 11:46 UTC
To: syzbot, akpm, linux-kernel, linux-mm, pasha.tatashin, syzkaller-bugs

On 22.04.24 12:38, David Hildenbrand wrote:
> [...]
>    syscall(__NR_close_range, /*fd=*/r[1], /*max_fd=*/-1, /*flags=*/0ul);
>
> -> Is that closing uffd as well, especially if __NR_seccomp failed?
>
>    syscall(__NR_mprotect, /*addr=*/0x20ffc000ul, /*len=*/0x4000ul,
>            /*prot=PROT_SEM|PROT_WRITE|PROT_READ|PROT_EXEC*/ 0xful);
>
> -> Restore write permissions. This seems to fire the uffd-wp page table check I assume.

I think the issue is that userfaultfd_release() will clear the VMA UFFD_WP flag,
but it will not clear PTE uffd-wp bits. So we have leftover PTE uffd-wp bits at
the time we wr-unprotect.

I thought we removed that lazy handling, but looks like we didn't consider the
"close uffd" case in:

commit f369b07c861435bd812a9d14493f71b34132ed6f
Author: Peter Xu <peterx@redhat.com>
Date:   Thu Aug 11 16:13:40 2022 -0400

    mm/uffd: reset write protection when unregister with wp-mode

close should behave just like unregister.

Simplified+readable reproducer:

#define _GNU_SOURCE
#include <stdint.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>
#include <unistd.h>

int main(void)
{
	void *src = mmap(0, 4096, PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
	void *dst = mmap(0, 4096, PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
	struct uffdio_register uffdio_register = {};
	struct uffdio_copy uffdio_copy = {};
	struct uffdio_api uffdio_api = {};
	int uffd;

	uffd = syscall(SYS_userfaultfd,
		       O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);

	uffdio_api.api = UFFD_API;
	ioctl(uffd, UFFDIO_API, &uffdio_api);

	uffdio_register.range.start = (uintptr_t)dst;
	uffdio_register.range.len = 4096;
	uffdio_register.mode = UFFDIO_REGISTER_MODE_WP;
	ioctl(uffd, UFFDIO_REGISTER, &uffdio_register);

	uffdio_copy.dst = (uintptr_t)dst;
	uffdio_copy.src = (uintptr_t)src;
	uffdio_copy.len = 4096;
	uffdio_copy.mode = UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP;
	ioctl(uffd, UFFDIO_COPY, &uffdio_copy);

	close(uffd);

	mprotect(dst, 4096, PROT_READ|PROT_WRITE);

	return 0;
}

--
Cheers,

David / dhildenb

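One way to look for leftover PTE uffd-wp bits from user space is `/proc/self/pagemap`, whose 64-bit per-page entries report the uffd-wp state in bit 57 on kernels that support it (5.13 and later, per the kernel's pagemap documentation). A minimal sketch follows; the helper names and the `SKETCH_*` constants are hypothetical, and reading pagemap may be restricted depending on privileges and kernel configuration.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit positions as described in the kernel's pagemap documentation. */
#define SKETCH_PM_UFFD_WP_BIT 57 /* pte is uffd-wp write-protected */
#define SKETCH_PM_PRESENT_BIT 63 /* page present in RAM */

/* Extract the uffd-wp flag from a raw pagemap entry. */
static inline int pagemap_entry_uffd_wp(uint64_t entry)
{
	return (int)((entry >> SKETCH_PM_UFFD_WP_BIT) & 1);
}

/* Extract the present flag from a raw pagemap entry. */
static inline int pagemap_entry_present(uint64_t entry)
{
	return (int)((entry >> SKETCH_PM_PRESENT_BIT) & 1);
}

/*
 * Read the pagemap entry covering vaddr in the calling process.
 * Returns 0 on success and -1 if pagemap cannot be opened or read.
 */
static inline int read_pagemap_entry(const void *vaddr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t offset;
	ssize_t n;
	int fd;

	if (page_size <= 0)
		return -1;

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	/* One 8-byte entry per virtual page, indexed by virtual page number. */
	offset = (off_t)((uintptr_t)vaddr / (uintptr_t)page_size) *
		 (off_t)sizeof(*entry);
	n = pread(fd, entry, sizeof(*entry), offset);
	close(fd);

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

In a reproducer like the one above, calling `read_pagemap_entry(dst, &entry)` after `close(uffd)` and then testing `pagemap_entry_uffd_wp(entry)` should indicate whether the bit survived the close, assuming the running kernel exports that bit.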
* Re: [syzbot] [mm?] WARNING in __page_table_check_ptes_set
From: Peter Xu @ 2024-04-22 13:28 UTC
To: David Hildenbrand
Cc: syzbot, akpm, linux-kernel, linux-mm, pasha.tatashin, syzkaller-bugs

On Mon, Apr 22, 2024 at 01:46:20PM +0200, David Hildenbrand wrote:
> On 22.04.24 12:38, David Hildenbrand wrote:
> > On 22.04.24 12:07, David Hildenbrand wrote:
> > > [...]
> >
> > Staring at the reproducer, we do
> >
> >
> > syscall(__NR_mmap, /*addr=*/0x1ffff000ul, /*len=*/0x1000ul, /*prot=*/0ul,
> >         /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/ 0x32ul, /*fd=*/-1,
> >         /*offset=*/0ul);
> > syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0x1000000ul,
> >         /*prot=PROT_WRITE|PROT_READ|PROT_EXEC*/ 7ul,
> >         /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/ 0x32ul, /*fd=*/-1,
> >         /*offset=*/0ul);
> >
> > -> Writable anonymous memory
> >
> > syscall(__NR_mmap, /*addr=*/0x21000000ul, /*len=*/0x1000ul, /*prot=*/0ul,
> >         /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/ 0x32ul, /*fd=*/-1,
> >         /*offset=*/0ul);
> > intptr_t res = 0;
> > res = syscall(__NR_userfaultfd,
> >               /*flags=UFFD_USER_MODE_ONLY|O_NONBLOCK*/ 0x801ul);
> > if (res != -1)
> >         r[0] = res;
> > *(uint64_t*)0x200004c0 = 0xaa;
> > *(uint64_t*)0x200004c8 = 0;
> > *(uint64_t*)0x200004d0 = 0;
> > syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc018aa3f, /*arg=*/0x200004c0ul);
> >
> > -> _UFFDIO_API handshake?
> >
> > syscall(__NR_mprotect, /*addr=*/0x20ffc000ul, /*len=*/0x3000ul,
> >         /*prot=PROT_SEM|PROT_EXEC*/ 0xcul);
> >
> > -> Protect target range R/O. I assume: no page populated yet?
> > -> 3 pages starting at 0x20ffc000ul;
> >
> > *(uint64_t*)0x20000180 = 0x20ffc000;
> > *(uint64_t*)0x20000188 = 0x3000;
> > *(uint64_t*)0x20000190 = 3;
> > *(uint64_t*)0x20000198 = 0;
> > syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc020aa00, /*arg=*/0x20000180ul);
> >
> > -> _UFFDIO_REGISTER (aa00)
> > -> _range = 3 pages starting at 0x20ffc000ul
> > -> _mode = UFFDIO_REGISTER_MODE_MISSING | UFFDIO_REGISTER_MODE_WP
> >
> > *(uint64_t*)0x20000000 = 0x20ffd000;
> > *(uint64_t*)0x20000008 = 0x20ffb000;
> > *(uint64_t*)0x20000010 = 0x1000;
> > *(uint64_t*)0x20000018 = 3;
> > *(uint64_t*)0x20000020 = 0;
> > syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc028aa03, /*arg=*/0x20000000ul);
> >
> > -> _UFFDIO_COPY (aa03)
> > -> dst = 0x20ffd000
> > -> src = 0x20ffb000
> > -> len = 0x1000 (single page)
> > -> mode = UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP
> >
> > -> We are copying into the R/O range. src should be R/W and trigger a page fault
> >    on access where we get a fresh page.
> >
> > *(uint16_t*)0x200000c0 = 1;
> > *(uint64_t*)0x200000c8 = 0x20000040;
> > *(uint16_t*)0x20000040 = 6;
> > *(uint8_t*)0x20000042 = 0;
> > *(uint8_t*)0x20000043 = 0;
> > *(uint32_t*)0x20000044 = 0x7fffffff;
> > res = syscall(__NR_seccomp, /*op=*/1ul, /*flags=*/0ul, /*arg=*/0x200000c0ul);
> > if (res != -1)
> >         r[1] = res;
> > syscall(__NR_open_tree, /*dfd=*/-1, /*filename=*/0ul, /*flags=*/0ul);
> >
> > -> No idea what happens here and if it is relevant. If __NR_seccomp failed, we would
> >    not set r[1].
> >
> > syscall(__NR_close_range, /*fd=*/r[1], /*max_fd=*/-1, /*flags=*/0ul);
> >
> > -> Is that closing uffd as well, especially if __NR_seccomp failed?
> >
> > syscall(__NR_mprotect, /*addr=*/0x20ffc000ul, /*len=*/0x4000ul,
> >         /*prot=PROT_SEM|PROT_WRITE|PROT_READ|PROT_EXEC*/ 0xful);
> >
> > -> Restore write permissions. This seems to fire the uffd-wp page table check I assume.
>
> I think the issue is that userfaultfd_release() will clear the VMA UFFD_WP flag,
> but it will not clear PTE uffd-wp bits. So we have leftover PTE uffd-wp bits at
> the time we wr-unprotect.
>
> I thought we removed that lazy handling, but looks like we didn't consider the
> "close uffd" case in:
>
> commit f369b07c861435bd812a9d14493f71b34132ed6f
> Author: Peter Xu <peterx@redhat.com>
> Date: Thu Aug 11 16:13:40 2022 -0400
>
>     mm/uffd: reset write protection when unregister with wp-mode
>
>
> close should behave just like unregister.
>
>
> Simplified+readable reproducer:
>
> #define _GNU_SOURCE
>
> #include <stdint.h>
> #include <fcntl.h>
> #include <sys/syscall.h>
> #include <sys/mman.h>
> #include <sys/types.h>
> #include <sys/ioctl.h>
> #include <linux/userfaultfd.h>
> #include <unistd.h>
>
> int main(void)
> {
>         void *src = mmap(0, 4096, PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
>         void *dst = mmap(0, 4096, PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
>         struct uffdio_register uffdio_register = {};
>         struct uffdio_copy uffdio_copy = {};
>         struct uffdio_api uffdio_api = {};
>         int uffd;
>
>         uffd = syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
>         uffdio_api.api = UFFD_API;
>         ioctl(uffd, UFFDIO_API, &uffdio_api);
>
>         uffdio_register.range.start = (uintptr_t)dst;
>         uffdio_register.range.len = 4096;
>         uffdio_register.mode = UFFDIO_REGISTER_MODE_WP;
>         ioctl(uffd, UFFDIO_REGISTER, &uffdio_register);
>
>         uffdio_copy.dst = (uintptr_t)dst;
>         uffdio_copy.src = (uintptr_t)src;
>         uffdio_copy.len = 4096;
>         uffdio_copy.mode = UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP;
>         ioctl(uffd, UFFDIO_COPY, &uffdio_copy);
>
>         close(uffd);
>
>         mprotect(dst, 4096, PROT_READ|PROT_WRITE);
>         return 0;
> }

Thanks, I'll post a patch.

PS: next time feel free to try "strace ./reproducer", it'll do the
translations and I found it handy to work with syzbot.

-- 
Peter Xu

^ permalink raw reply [flat|nested] 6+ messages in thread
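
Editorial sketch (not part of the thread): the failure mode described above, where closing the uffd clears the VMA-level UFFD_WP flag but leaves the uffd-wp bit in the PTE, can be modelled in a few lines of userspace C. This is only a toy state machine, not kernel code. All `toy_*` names are hypothetical, a single PTE stands in for the whole range, and `toy_mprotect_rw()` assumes a heavily simplified rule (a present PTE becomes writable unless the VMA is still registered for uffd-wp and the PTE carries the marker). Only the condition in `toy_check_fires()` is taken from the check quoted in the thread.

```c
#include <stdbool.h>

/* Toy stand-ins for a PTE and a VMA; these are not kernel types. */
struct toy_pte {
    bool present;
    bool write;
    bool uffd_wp;
};

struct toy_vma {
    bool uffd_wp_registered;   /* models the VMA-level UFFD_WP flag */
    struct toy_pte pte;        /* one page is enough for the model */
};

/* Same condition as the check quoted in the thread:
 * a present uffd-wp PTE must never be writable. */
bool toy_check_fires(const struct toy_pte *pte)
{
    return pte->present && pte->uffd_wp && pte->write;
}

/* UFFDIO_REGISTER with UFFDIO_REGISTER_MODE_WP. */
void toy_register_wp(struct toy_vma *vma)
{
    vma->uffd_wp_registered = true;
}

/* UFFDIO_COPY with UFFDIO_COPY_MODE_WP: the page is installed
 * present, write-protected, and marked uffd-wp. */
void toy_copy_wp(struct toy_vma *vma)
{
    vma->pte.present = true;
    vma->pte.write = false;
    vma->pte.uffd_wp = true;
}

/* Unregister: clears the VMA flag and resets the PTE marker. */
void toy_unregister(struct toy_vma *vma)
{
    vma->uffd_wp_registered = false;
    vma->pte.uffd_wp = false;
}

/* close(uffd) as described in the report: only the VMA flag is cleared,
 * the PTE marker is left behind. */
void toy_release_as_reported(struct toy_vma *vma)
{
    vma->uffd_wp_registered = false;
}

/* mprotect(PROT_READ|PROT_WRITE), simplified assumption: write access is
 * withheld only while the VMA is registered and the PTE is marked.
 * Returns true if the page table check would fire afterwards. */
bool toy_mprotect_rw(struct toy_vma *vma)
{
    bool wp_enforced = vma->uffd_wp_registered && vma->pte.uffd_wp;

    if (vma->pte.present && !wp_enforced)
        vma->pte.write = true;
    return toy_check_fires(&vma->pte);
}
```

In this model, register + copy + `toy_release_as_reported()` + `toy_mprotect_rw()` ends with a writable PTE that still carries the marker, so the check fires. Replacing the release step with `toy_unregister()` does not, which is the sense in which "close should behave just like unregister".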
* Re: [syzbot] [mm?] WARNING in __page_table_check_ptes_set
  2024-04-22 13:28 ` Peter Xu
@ 2024-04-22 15:10 ` David Hildenbrand
  0 siblings, 0 replies; 6+ messages in thread
From: David Hildenbrand @ 2024-04-22 15:10 UTC (permalink / raw)
  To: Peter Xu
  Cc: syzbot, akpm, linux-kernel, linux-mm, pasha.tatashin, syzkaller-bugs

>> commit f369b07c861435bd812a9d14493f71b34132ed6f
>> Author: Peter Xu <peterx@redhat.com>
>> Date: Thu Aug 11 16:13:40 2022 -0400
>>
>>     mm/uffd: reset write protection when unregister with wp-mode
>>
>>
>> close should behave just like unregister.
>>
>>
>> Simplified+readable reproducer:
>>
>> #define _GNU_SOURCE
>>
>> #include <stdint.h>
>> #include <fcntl.h>
>> #include <sys/syscall.h>
>> #include <sys/mman.h>
>> #include <sys/types.h>
>> #include <sys/ioctl.h>
>> #include <linux/userfaultfd.h>
>> #include <unistd.h>
>>
>> int main(void)
>> {
>>         void *src = mmap(0, 4096, PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
>>         void *dst = mmap(0, 4096, PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
>>         struct uffdio_register uffdio_register = {};
>>         struct uffdio_copy uffdio_copy = {};
>>         struct uffdio_api uffdio_api = {};
>>         int uffd;
>>
>>         uffd = syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
>>         uffdio_api.api = UFFD_API;
>>         ioctl(uffd, UFFDIO_API, &uffdio_api);
>>
>>         uffdio_register.range.start = (uintptr_t)dst;
>>         uffdio_register.range.len = 4096;
>>         uffdio_register.mode = UFFDIO_REGISTER_MODE_WP;
>>         ioctl(uffd, UFFDIO_REGISTER, &uffdio_register);
>>
>>         uffdio_copy.dst = (uintptr_t)dst;
>>         uffdio_copy.src = (uintptr_t)src;
>>         uffdio_copy.len = 4096;
>>         uffdio_copy.mode = UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP;
>>         ioctl(uffd, UFFDIO_COPY, &uffdio_copy);
>>
>>         close(uffd);
>>
>>         mprotect(dst, 4096, PROT_READ|PROT_WRITE);
>>         return 0;
>> }
>
> Thanks, I'll post a patch.
>
> PS: next time feel free to try "strace ./reproducer", it'll do the
> translations and I found it handy to work with syzbot.

Cool, was not aware that it would do that amount of translation!

-- 
Cheers,

David / dhildenb

^ permalink raw reply [flat|nested] 6+ messages in thread
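
Editorial sketch (not part of the thread): part of the translation discussed above can also be done by hand for the raw ioctl request numbers in the syz reproducer (0xc018aa3f, 0xc020aa00, 0xc028aa03). The snippet below splits a request number into its fields. It assumes the asm-generic ioctl encoding used on x86-64 (8-bit number, 8-bit type, 14-bit size, 2-bit direction); `struct ioctl_fields`, `decode_ioctl()` and `print_ioctl()` are hypothetical helper names, not an existing API.

```c
#include <stdint.h>
#include <stdio.h>

/* Fields of an ioctl request number, assuming the asm-generic layout. */
struct ioctl_fields {
    unsigned dir;   /* bits 30-31: 1 = write, 2 = read, 3 = read and write */
    unsigned size;  /* bits 16-29: size of the argument structure in bytes */
    unsigned type;  /* bits 8-15: subsystem "magic", 0xaa for userfaultfd */
    unsigned nr;    /* bits 0-7: command number within the subsystem */
};

/* Split a raw request number into its four fields. */
struct ioctl_fields decode_ioctl(uint32_t cmd)
{
    struct ioctl_fields f;

    f.dir  = (cmd >> 30) & 0x3;
    f.size = (cmd >> 16) & 0x3fff;
    f.type = (cmd >> 8) & 0xff;
    f.nr   = cmd & 0xff;
    return f;
}

/* Print one decoded request number on a single line. */
void print_ioctl(uint32_t cmd)
{
    struct ioctl_fields f = decode_ioctl(cmd);

    printf("0x%08x: dir=%u size=0x%x type=0x%x nr=0x%x\n",
           (unsigned)cmd, f.dir, f.size, f.type, f.nr);
}
```

Decoding 0xc020aa00 this way gives type 0xaa, number 0x00 and a 0x20-byte argument, matching the UFFDIO_REGISTER annotation in the walkthrough. 0xc028aa03 gives number 0x03 with a 0x28-byte argument (UFFDIO_COPY), and 0xc018aa3f gives number 0x3f with a 0x18-byte argument, consistent with the guess that it is the UFFDIO_API handshake.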
end of thread, other threads:[~2024-04-22 15:10 UTC | newest]

Thread overview: 6+ messages
2024-04-21 20:16 [syzbot] [mm?] WARNING in __page_table_check_ptes_set syzbot
2024-04-22 10:07 ` David Hildenbrand
2024-04-22 10:38   ` David Hildenbrand
2024-04-22 11:46     ` David Hildenbrand
2024-04-22 13:28       ` Peter Xu
2024-04-22 15:10         ` David Hildenbrand