* WARNING in try_grab_page
@ 2023-08-03 8:56 Yikebaer Aizezi
  2023-08-03 12:50 ` Matthew Wilcox
  2023-08-03 13:19 ` Matthew Wilcox
  0 siblings, 2 replies; 8+ messages in thread
From: Yikebaer Aizezi @ 2023-08-03 8:56 UTC
To: akpm, linux-mm; +Cc: linux-kernel

Hello,

When using Healer to fuzz the Linux kernel, the following crash was triggered on:

HEAD commit: fdf0eaf11452d72945af31804e2a1048ee1b574c (tag: v6.5-rc2)
git tree: upstream

I also tried to reproduce this bug on v6.5-rc3 (HEAD commit: 6eaae198076080886b9e7d57f4ae06fa782f90ef), and it still exists.

console output: https://drive.google.com/file/d/1Lq71bFwtEDix82PEf_193CLG6uh1Pjj9/view?usp=drive_link
kernel config: https://drive.google.com/file/d/1dApy7OR4KDYdhF96ZUowZQ1r-uLYTd-0/view?usp=drive_link
C reproducer: https://drive.google.com/file/d/1Dkj31wwYP7p-AEJeemD3yrIUr7-VdBqF/view?usp=drive_link
Syzlang reproducer: https://drive.google.com/file/d/1ib6zTs4srKI1RnUcHG3mSZ5HeW7rZhnd/view?usp=drive_link

If you fix this issue, please add the following tag to the commit:
Reported-by: Yikebaer Aizezi <yikebaer61@gmail.com>

WARNING: CPU: 0 PID: 10232 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0 mm/gup.c:229
Modules linked in:
CPU: 0 PID: 10232 Comm: syz-executor Tainted: G B 6.5.0-rc2 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
RIP: 0010:try_grab_page+0x2dd/0x3a0 mm/gup.c:229
Code: ff be 04 00 00 00 4c 89 e7 e8 cf fa 13 00 f0 41 ff 04 24 e8 65 96 cb ff 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d c3 e8 53 96 cb ff <0f> 0b e8 4c 96 cb ff 41 bc f4 ff ff ff 5b 44 89 e0 5d 41 5c 41 5d
RSP: 0018:ffffc9000e99f268 EFLAGS: 00010202
RAX: 0000000000002f80 RBX: ffffea00003ae340 RCX: ffffc90002d79000
RDX: 0000000000040000 RSI: ffffffff81ad81ed RDI: ffffea00003ae374
RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffff94000075c6e
R10: ffffea00003ae377 R11: 0000000000000000 R12: ffffea00003ae374
R13: 0000000000210052 R14: ffffea00003ae340 R15: 000000000eb8d225
FS: 00007fc6339a4640(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000002c3ea000 CR4: 0000000000752ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 follow_page_pte+0x18c/0x1610 mm/gup.c:651
 follow_pmd_mask mm/gup.c:727 [inline]
 follow_pud_mask mm/gup.c:765 [inline]
 follow_p4d_mask mm/gup.c:782 [inline]
 follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
 __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
 __get_user_pages_locked mm/gup.c:1487 [inline]
 get_user_pages_unlocked+0x183/0x580 mm/gup.c:2387
 hva_to_pfn_slow arch/x86/kvm/../../../virt/kvm/kvm_main.c:2536 [inline]
 hva_to_pfn+0x198/0xbc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2674
 __gfn_to_pfn_memslot+0x202/0x3e0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2736
 __kvm_faultin_pfn arch/x86/kvm/mmu/mmu.c:4329 [inline]
 kvm_faultin_pfn+0x21b/0x12d0 arch/x86/kvm/mmu/mmu.c:4365
 kvm_tdp_mmu_page_fault arch/x86/kvm/mmu/mmu.c:4503 [inline]
 kvm_tdp_page_fault+0x167/0x4d0 arch/x86/kvm/mmu/mmu.c:4549
 kvm_mmu_do_page_fault arch/x86/kvm/mmu/mmu_internal.h:320 [inline]
 kvm_mmu_page_fault+0x2f4/0x1a40 arch/x86/kvm/mmu/mmu.c:5756
 handle_ept_violation+0x20a/0x620 arch/x86/kvm/vmx/vmx.c:5760
 __vmx_handle_exit arch/x86/kvm/vmx/vmx.c:6539 [inline]
 vmx_handle_exit+0x4a1/0x18d0 arch/x86/kvm/vmx/vmx.c:6556
 vcpu_enter_guest arch/x86/kvm/x86.c:10848 [inline]
 vcpu_run+0x24b6/0x44b0 arch/x86/kvm/x86.c:10951
 kvm_arch_vcpu_ioctl_run+0x416/0x1830 arch/x86/kvm/x86.c:11172
 kvm_vcpu_ioctl+0x4de/0xcc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4112
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:870 [inline]
 __se_sys_ioctl fs/ioctl.c:856 [inline]
 __x64_sys_ioctl+0x173/0x1e0 fs/ioctl.c:856
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x47959d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fc6339a4068 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000006
RBP: 000000000059c0a0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c0ac
R13: 000000000000000b R14: 0000000000437250 R15: 00007fc633984000
 </TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 0 PID: 10232 Comm: syz-executor Tainted: G B 6.5.0-rc2 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x92/0xf0 lib/dump_stack.c:106
 panic+0x570/0x620 kernel/panic.c:340
 check_panic_on_warn+0x8e/0x90 kernel/panic.c:236
 __warn+0xee/0x340 kernel/panic.c:673
 __report_bug lib/bug.c:199 [inline]
 report_bug+0x25d/0x460 lib/bug.c:219
 handle_bug+0x3c/0x70 arch/x86/kernel/traps.c:324
 exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:345
 asm_exc_invalid_op+0x16/0x20 arch/x86/include/asm/idtentry.h:568
RIP: 0010:try_grab_page+0x2dd/0x3a0 mm/gup.c:229
Code: ff be 04 00 00 00 4c 89 e7 e8 cf fa 13 00 f0 41 ff 04 24 e8 65 96 cb ff 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d c3 e8 53 96 cb ff <0f> 0b e8 4c 96 cb ff 41 bc f4 ff ff ff 5b 44 89 e0 5d 41 5c 41 5d
RSP: 0018:ffffc9000e99f268 EFLAGS: 00010202
RAX: 0000000000002f80 RBX: ffffea00003ae340 RCX: ffffc90002d79000
RDX: 0000000000040000 RSI: ffffffff81ad81ed RDI: ffffea00003ae374
RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffff94000075c6e
R10: ffffea00003ae377 R11: 0000000000000000 R12: ffffea00003ae374
R13: 0000000000210052 R14: ffffea00003ae340 R15: 000000000eb8d225
 follow_page_pte+0x18c/0x1610 mm/gup.c:651
 follow_pmd_mask mm/gup.c:727 [inline]
 follow_pud_mask mm/gup.c:765 [inline]
 follow_p4d_mask mm/gup.c:782 [inline]
 follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
 __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
 __get_user_pages_locked mm/gup.c:1487 [inline]
 get_user_pages_unlocked+0x183/0x580 mm/gup.c:2387
 hva_to_pfn_slow arch/x86/kvm/../../../virt/kvm/kvm_main.c:2536 [inline]
 hva_to_pfn+0x198/0xbc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2674
 __gfn_to_pfn_memslot+0x202/0x3e0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2736
 __kvm_faultin_pfn arch/x86/kvm/mmu/mmu.c:4329 [inline]
 kvm_faultin_pfn+0x21b/0x12d0 arch/x86/kvm/mmu/mmu.c:4365
 kvm_tdp_mmu_page_fault arch/x86/kvm/mmu/mmu.c:4503 [inline]
 kvm_tdp_page_fault+0x167/0x4d0 arch/x86/kvm/mmu/mmu.c:4549
 kvm_mmu_do_page_fault arch/x86/kvm/mmu/mmu_internal.h:320 [inline]
 kvm_mmu_page_fault+0x2f4/0x1a40 arch/x86/kvm/mmu/mmu.c:5756
 handle_ept_violation+0x20a/0x620 arch/x86/kvm/vmx/vmx.c:5760
 __vmx_handle_exit arch/x86/kvm/vmx/vmx.c:6539 [inline]
 vmx_handle_exit+0x4a1/0x18d0 arch/x86/kvm/vmx/vmx.c:6556
 vcpu_enter_guest arch/x86/kvm/x86.c:10848 [inline]
 vcpu_run+0x24b6/0x44b0 arch/x86/kvm/x86.c:10951
 kvm_arch_vcpu_ioctl_run+0x416/0x1830 arch/x86/kvm/x86.c:11172
 kvm_vcpu_ioctl+0x4de/0xcc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4112
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:870 [inline]
 __se_sys_ioctl fs/ioctl.c:856 [inline]
 __x64_sys_ioctl+0x173/0x1e0 fs/ioctl.c:856
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x47959d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fc6339a4068 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000006
RBP: 000000000059c0a0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c0ac
R13: 000000000000000b R14: 0000000000437250 R15: 00007fc633984000
 </TASK>
Dumping ftrace buffer:
 (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 1 seconds..
* Re: WARNING in try_grab_page
  2023-08-03 8:56 WARNING in try_grab_page Yikebaer Aizezi
@ 2023-08-03 12:50 ` Matthew Wilcox
  2023-08-03 13:19 ` Matthew Wilcox
  1 sibling, 0 replies; 8+ messages in thread
From: Matthew Wilcox @ 2023-08-03 12:50 UTC
To: Yikebaer Aizezi; +Cc: akpm, linux-mm, linux-kernel

On Thu, Aug 03, 2023 at 04:56:03PM +0800, Yikebaer Aizezi wrote:
> console output:
> https://drive.google.com/file/d/1Lq71bFwtEDix82PEf_193CLG6uh1Pjj9/view?usp=drive_link
> kernel config: https://drive.google.com/file/d/1dApy7OR4KDYdhF96ZUowZQ1r-uLYTd-0/view?usp=drive_link
> C reproducer: https://drive.google.com/file/d/1Dkj31wwYP7p-AEJeemD3yrIUr7-VdBqF/view?usp=drive_link

Are you sure this is right?  The below stack trace shows something coming
in through the ioctl() path, but nothing in this reproducer calls ioctl().
It's just socket(), bind(), accept4() and sendmsg().  I don't see a way
to come up with this stack backtrace from that program.

> Call Trace:
>  <TASK>
>  follow_page_pte+0x18c/0x1610 mm/gup.c:651
>  follow_pmd_mask mm/gup.c:727 [inline]
>  follow_pud_mask mm/gup.c:765 [inline]
>  follow_p4d_mask mm/gup.c:782 [inline]
>  follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
>  __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
>  __get_user_pages_locked mm/gup.c:1487 [inline]
>  get_user_pages_unlocked+0x183/0x580 mm/gup.c:2387
>  hva_to_pfn_slow arch/x86/kvm/../../../virt/kvm/kvm_main.c:2536 [inline]
>  hva_to_pfn+0x198/0xbc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2674
>  __gfn_to_pfn_memslot+0x202/0x3e0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2736
>  __kvm_faultin_pfn arch/x86/kvm/mmu/mmu.c:4329 [inline]
>  kvm_faultin_pfn+0x21b/0x12d0 arch/x86/kvm/mmu/mmu.c:4365
>  kvm_tdp_mmu_page_fault arch/x86/kvm/mmu/mmu.c:4503 [inline]
>  kvm_tdp_page_fault+0x167/0x4d0 arch/x86/kvm/mmu/mmu.c:4549
>  kvm_mmu_do_page_fault arch/x86/kvm/mmu/mmu_internal.h:320 [inline]
>  kvm_mmu_page_fault+0x2f4/0x1a40 arch/x86/kvm/mmu/mmu.c:5756
>  handle_ept_violation+0x20a/0x620 arch/x86/kvm/vmx/vmx.c:5760
>  __vmx_handle_exit arch/x86/kvm/vmx/vmx.c:6539 [inline]
>  vmx_handle_exit+0x4a1/0x18d0 arch/x86/kvm/vmx/vmx.c:6556
>  vcpu_enter_guest arch/x86/kvm/x86.c:10848 [inline]
>  vcpu_run+0x24b6/0x44b0 arch/x86/kvm/x86.c:10951
>  kvm_arch_vcpu_ioctl_run+0x416/0x1830 arch/x86/kvm/x86.c:11172
>  kvm_vcpu_ioctl+0x4de/0xcc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4112
>  vfs_ioctl fs/ioctl.c:51 [inline]
>  __do_sys_ioctl fs/ioctl.c:870 [inline]
>  __se_sys_ioctl fs/ioctl.c:856 [inline]
>  __x64_sys_ioctl+0x173/0x1e0 fs/ioctl.c:856
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x47959d
> Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48
> 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fc6339a4068 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d
> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000006
> RBP: 000000000059c0a0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c0ac
> R13: 000000000000000b R14: 0000000000437250 R15: 00007fc633984000
>  </TASK>
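One quick way to check which system call a trace came from is the ORIG_RAX value in the user-space register dump, which on x86-64 holds the syscall number. The following is only a sketch and was not posted in the thread: `syscall_name_x86_64()` and its table are hypothetical helpers, and the table lists just the x86-64 syscall numbers for the calls named in this discussion.

```c
#include <stddef.h>

/* Assumption: x86-64 syscall numbers only, limited to the calls
 * mentioned in this thread. Other architectures use other numbers. */
struct syscall_entry {
	unsigned long nr;
	const char *name;
};

static const struct syscall_entry x86_64_syscalls[] = {
	{ 16,  "ioctl" },
	{ 41,  "socket" },
	{ 46,  "sendmsg" },
	{ 49,  "bind" },
	{ 239, "get_mempolicy" },
	{ 288, "accept4" },
};

/* Hypothetical helper: map an ORIG_RAX value from a trace to a name. */
static const char *syscall_name_x86_64(unsigned long orig_rax)
{
	size_t i;

	for (i = 0; i < sizeof(x86_64_syscalls) / sizeof(x86_64_syscalls[0]); i++) {
		if (x86_64_syscalls[i].nr == orig_rax)
			return x86_64_syscalls[i].name;
	}
	return "unknown";
}
```

The quoted trace has ORIG_RAX 0x10, which this table maps to ioctl, and RSI 0xae80, which is the value of the KVM_RUN ioctl request. Neither matches a program that only calls socket(), bind(), accept4() and sendmsg().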
* Re: WARNING in try_grab_page
  2023-08-03 8:56 WARNING in try_grab_page Yikebaer Aizezi
  2023-08-03 12:50 ` Matthew Wilcox
@ 2023-08-03 13:19 ` Matthew Wilcox
  2023-08-04 3:14 ` Yikebaer Aizezi
  1 sibling, 1 reply; 8+ messages in thread
From: Matthew Wilcox @ 2023-08-03 13:19 UTC
To: Yikebaer Aizezi; +Cc: akpm, linux-mm

On Thu, Aug 03, 2023 at 04:56:03PM +0800, Yikebaer Aizezi wrote:
> console output:
> https://drive.google.com/file/d/1Lq71bFwtEDix82PEf_193CLG6uh1Pjj9/view?usp=drive_link

I dug through this, and what I found troubles me.

------------[ cut here ]------------
WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0
Modules linked in:
CPU: 0 PID: 13067 Comm: syz-executor Tainted: G B 6.5.0-rc2 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
RIP: 0010:try_grab_page+0x2dd/0x3a0
Code: ff be 04 00 00 00 4c 89 e7 e8 cf fa 13 00 f0 41 ff 04 24 e8 65 96 cb ff 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d c3 e8 53 96 cb ff <0f> 0b e8 4c 96 cb ff 41 bc f4 ff ff ff 5b 44 89 e0 5d 41 5c 41 5d
RSP: 0018:ffffc9000c2777e0 EFLAGS: 00010212
RAX: 0000000000000247 RBX: ffffea00003ae340 RCX: ffffc90002bb1000
RDX: 0000000000040000 RSI: ffffffff81ad81ed RDI: ffffea00003ae374
RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffff94000075c6e
R10: ffffea00003ae377 R11: 0000000000084001 R12: ffffea00003ae374
R13: 0000000000210002 R14: ffffea00003ae340 R15: 000000000eb8d225
FS: 00007f5841a13640(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000500310 CR3: 0000000018d0c000 CR4: 0000000000750ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 ? __warn+0xe2/0x340
 ? try_grab_page+0x2dd/0x3a0
 ? report_bug+0x25d/0x460
 ? handle_bug+0x3c/0x70
 ? exc_invalid_op+0x14/0x40
 ? asm_exc_invalid_op+0x16/0x20
 ? try_grab_page+0x2dd/0x3a0
 ? try_grab_page+0x2dd/0x3a0
 follow_page_pte+0x18c/0x1610
 ? try_grab_page+0x3a0/0x3a0
 ? rcu_is_watching+0xe/0xb0
 follow_page_mask+0x2e4/0xbd0
 __get_user_pages+0x3fa/0xcf0
 ? follow_page_mask+0xbd0/0xbd0
 ? down_read_killable+0x146/0x4f0
 ? down_read_interruptible+0x4f0/0x4f0
 ? rcu_is_watching+0xe/0xb0
 __gup_longterm_locked+0x5fa/0x1ec0
 ? io_schedule_timeout+0x150/0x150
 ? rcu_is_watching+0xe/0xb0
 ? get_user_pages_unlocked+0x580/0x580
 ? lock_release+0x4f7/0x670
 ? internal_get_user_pages_fast+0xe27/0x2690
 ? lock_downgrade+0x690/0x690
 ? preempt_schedule_common+0x45/0xb0
 ? pud_huge+0x9c/0xe0
 ? pmd_huge+0xe0/0xe0
 internal_get_user_pages_fast+0x119b/0x2690
 ? mtree_load+0x1df/0x980
 ? __gup_device_huge+0x530/0x530
 ? rcu_is_watching+0xe/0xb0
 ? lock_release+0x4f7/0x670
 get_user_pages_fast+0x95/0xe0
 ? get_user_pages_fast_only+0xe0/0xe0
 do_get_mempolicy+0x50c/0xd20
 ? sp_delete+0xf0/0xf0
 ? seccomp_notify_ioctl+0xd80/0xd80
 __x64_sys_get_mempolicy+0x187/0x2a0
 ? __ia32_sys_migrate_pages+0xf0/0xf0
 ? __secure_computing+0x1ff/0x360
 do_syscall_64+0x35/0xb0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x47959d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f5841a13068 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef
RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 000000000059c0a0 R08: 0000000000000003 R09: 0000000000000000
R10: 0000000020ff9000 R11: 0000000000000246 R12: 000000000059c0ac
R13: 000000000000000b R14: 0000000000437250 R15: 00007f58419f3000
 </TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...

> WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0

That's this line:

	if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))

Called from:

	follow_page_pte+0x18c/0x1610

That did:

	ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
	pte = ptep_get(ptep);
	page = vm_normal_page(vma, address, pte);
	ret = try_grab_page(page, flags);

So we grabbed the PTE lock, looked up the PTE, translated that into
a page ... and found a page with a zero (or negative) refcount.
That's Really Bad.  I think it was a zero refcount because r08 is 0
and I don't see any other registers which have a plausible negative
32-bit number in them.

Yikebaer, could I trouble you to add this:

+++ b/mm/gup.c
@@ -226,7 +226,7 @@ int __must_check try_grab_page(struct page *page, unsigned int flags)
 {
 	struct folio *folio = page_folio(page);

-	if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
+	if (VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0, folio))
 		return -ENOMEM;

 	if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page)))

and rerun the syzkaller?  That'll give us some more information about
what has happened, although it won't tell us why it happened.  We might
need to catch someone decrementing the refcount to lower than the
mapcount to catch this ... which will be tricky, given the other things
we reuse the mapcount for.
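
The guard that fired can be pictured with a small user-space model. This is a sketch under stated assumptions, not kernel code: `toy_folio` and `toy_try_grab()` are hypothetical names, the refcount is a plain int and not an atomic, and only the refcount guard plus a simple increment are modeled (the pinning logic and the P2PDMA check in the real function are left out).

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a folio: only the reference count. */
struct toy_folio {
	int refcount;
};

/*
 * Toy version of the guard: a caller may only take an extra reference
 * on an object that somebody already holds a reference to. A count of
 * zero or below means the object should not be reachable at all.
 */
static int toy_try_grab(struct toy_folio *folio)
{
	if (folio->refcount <= 0) {
		fprintf(stderr, "toy WARN: refcount %d <= 0\n", folio->refcount);
		return -ENOMEM;
	}

	folio->refcount++;
	return 0;
}
```

Calling `toy_try_grab()` on an object with refcount 1 returns 0 and leaves the count at 2. Calling it on an object with refcount 0 prints the warning, returns -ENOMEM and leaves the count untouched, which is the situation the WARNING above reports for a page that was still mapped in a page table.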
* Re: WARNING in try_grab_page
  2023-08-03 13:19 ` Matthew Wilcox
@ 2023-08-04 3:14 ` Yikebaer Aizezi
  2023-08-04 3:42 ` Matthew Wilcox
  2023-08-04 13:35 ` David Howells
  0 siblings, 2 replies; 8+ messages in thread
From: Yikebaer Aizezi @ 2023-08-04 3:14 UTC
To: Matthew Wilcox; +Cc: akpm, linux-mm

I applied the patch and reran the reproducer. The console output was:

BUG: Bad page state in process POC pfn:0eb8d
page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xeb8d
flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner info is not present (never set?)
Modules linked in:
CPU: 0 PID: 7959 Comm: POC Not tainted 6.5.0-rc2 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd4/0xf0 lib/dump_stack.c:106
 bad_page+0x71/0x1a0 mm/page_alloc.c:533
 free_page_is_bad_report mm/page_alloc.c:974 [inline]
 free_page_is_bad mm/page_alloc.c:984 [inline]
 free_pages_prepare mm/page_alloc.c:1153 [inline]
 free_unref_page_prepare+0x5f3/0xb50 mm/page_alloc.c:2348
 free_unref_page+0x2f/0x3c0 mm/page_alloc.c:2443
 __folio_put_small mm/swap.c:106 [inline]
 __folio_put+0xa2/0x110 mm/swap.c:129
 folio_put include/linux/mm.h:1423 [inline]
 put_page include/linux/mm.h:1492 [inline]
 extract_user_to_sg lib/scatterlist.c:1151 [inline]
 extract_iter_to_sg lib/scatterlist.c:1349 [inline]
 extract_iter_to_sg+0x11ec/0x1570 lib/scatterlist.c:1339
 hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
 sock_sendmsg_nosec net/socket.c:725 [inline]
 sock_sendmsg+0xcf/0x170 net/socket.c:748
 ____sys_sendmsg+0x676/0x860 net/socket.c:2494
 ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
 __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fbd79539f29
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004
RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0
R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xeb8d
flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0)
page_owner info is not present (never set?)
------------[ cut here ]------------
WARNING: CPU: 0 PID: 7962 at mm/gup.c:229 try_grab_page+0x307/0x3c0 mm/gup.c:229
Modules linked in:
CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229
Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48 c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00
RSP: 0018:ffffc90002927178 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000
RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00
RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a
R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4
R13: 0000000000290000 R14: ffffea00003ae340 R15: ffffea00003ae340
FS: 00007fbd7961a540(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbd794d03d0 CR3: 0000000019855000 CR4: 0000000000750ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 follow_page_pte+0x18c/0x1610 mm/gup.c:651
 follow_pmd_mask mm/gup.c:727 [inline]
 follow_pud_mask mm/gup.c:765 [inline]
 follow_p4d_mask mm/gup.c:782 [inline]
 follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
 __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
 __get_user_pages_locked mm/gup.c:1487 [inline]
 __gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181
 internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179
 pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285
 iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline]
 iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831
 extract_user_to_sg lib/scatterlist.c:1123 [inline]
 extract_iter_to_sg lib/scatterlist.c:1349 [inline]
 extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339
 hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
 sock_sendmsg_nosec net/socket.c:725 [inline]
 sock_sendmsg+0xcf/0x170 net/socket.c:748
 ____sys_sendmsg+0x676/0x860 net/socket.c:2494
 ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
 __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fbd79539f29
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004
RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0
R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x92/0xf0 lib/dump_stack.c:106
 panic+0x570/0x620 kernel/panic.c:340
 check_panic_on_warn+0x8e/0x90 kernel/panic.c:236
 __warn+0xee/0x340 kernel/panic.c:673
 __report_bug lib/bug.c:199 [inline]
 report_bug+0x25d/0x460 lib/bug.c:219
 handle_bug+0x3c/0x70 arch/x86/kernel/traps.c:324
 exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:345
 asm_exc_invalid_op+0x16/0x20 arch/x86/include/asm/idtentry.h:568
RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229
Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48 c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00
RSP: 0018:ffffc90002927178 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000
RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00
RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a
R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4
R13: 0000000000290000 R14: ffffea00003ae340 R15: ffffea00003ae340
 follow_page_pte+0x18c/0x1610 mm/gup.c:651
 follow_pmd_mask mm/gup.c:727 [inline]
 follow_pud_mask mm/gup.c:765 [inline]
 follow_p4d_mask mm/gup.c:782 [inline]
 follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
 __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
 __get_user_pages_locked mm/gup.c:1487 [inline]
 __gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181
 internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179
 pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285
 iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline]
 iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831
 extract_user_to_sg lib/scatterlist.c:1123 [inline]
 extract_iter_to_sg lib/scatterlist.c:1349 [inline]
 extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339
 hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
 sock_sendmsg_nosec net/socket.c:725 [inline]
 sock_sendmsg+0xcf/0x170 net/socket.c:748
 ____sys_sendmsg+0x676/0x860 net/socket.c:2494
 ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
 __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fbd79539f29
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004
RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0
R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
Dumping ftrace buffer:
 (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 1 seconds..
---------------------------------------------------------------------------------------------

I think the ioctl() trace you asked about came from a different crash,
a WARNING in kvm_arch_vcpu_ioctl_run. Somehow the two crashes were
triggered at the same time, but I cannot figure out why. After I tried
to fix that problem and reran the C reproducer for this issue, I got the
different console output shown above.

Matthew Wilcox <willy@infradead.org> wrote on Thu, Aug 3, 2023 at 21:19:
>
> On Thu, Aug 03, 2023 at 04:56:03PM +0800, Yikebaer Aizezi wrote:
> > console output:
> > https://drive.google.com/file/d/1Lq71bFwtEDix82PEf_193CLG6uh1Pjj9/view?usp=drive_link
>
> I dug through this, and what I found troubles me.
> > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0 > Modules linked in: > CPU: 0 PID: 13067 Comm: syz-executor Tainted: G B 6.5.0-rc2 #1 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > RIP: 0010:try_grab_page+0x2dd/0x3a0 > Code: ff be 04 00 00 00 4c 89 e7 e8 cf fa 13 00 f0 41 ff 04 24 e8 65 96 cb ff 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d c3 e8 53 96 cb ff <0f> 0b e8 4c 96 cb ff 41 bc f4 ff ff ff 5b 44 89 e0 5d 41 5c 41 5d > RSP: 0018:ffffc9000c2777e0 EFLAGS: 00010212 > RAX: 0000000000000247 RBX: ffffea00003ae340 RCX: ffffc90002bb1000 > RDX: 0000000000040000 RSI: ffffffff81ad81ed RDI: ffffea00003ae374 > RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffff94000075c6e > R10: ffffea00003ae377 R11: 0000000000084001 R12: ffffea00003ae374 > R13: 0000000000210002 R14: ffffea00003ae340 R15: 000000000eb8d225 > FS: 00007f5841a13640(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000000500310 CR3: 0000000018d0c000 CR4: 0000000000750ef0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > PKRU: 55555554 > Call Trace: > <TASK> > ? __warn+0xe2/0x340 > ? try_grab_page+0x2dd/0x3a0 > ? report_bug+0x25d/0x460 > ? handle_bug+0x3c/0x70 > ? exc_invalid_op+0x14/0x40 > ? asm_exc_invalid_op+0x16/0x20 > ? try_grab_page+0x2dd/0x3a0 > ? try_grab_page+0x2dd/0x3a0 > follow_page_pte+0x18c/0x1610 > ? try_grab_page+0x3a0/0x3a0 > ? rcu_is_watching+0xe/0xb0 > follow_page_mask+0x2e4/0xbd0 > __get_user_pages+0x3fa/0xcf0 > ? follow_page_mask+0xbd0/0xbd0 > ? down_read_killable+0x146/0x4f0 > ? down_read_interruptible+0x4f0/0x4f0 > ? rcu_is_watching+0xe/0xb0 > __gup_longterm_locked+0x5fa/0x1ec0 > ? io_schedule_timeout+0x150/0x150 > ? rcu_is_watching+0xe/0xb0 > ? get_user_pages_unlocked+0x580/0x580 > ? 
lock_release+0x4f7/0x670 > ? internal_get_user_pages_fast+0xe27/0x2690 > ? lock_downgrade+0x690/0x690 > ? preempt_schedule_common+0x45/0xb0 > ? pud_huge+0x9c/0xe0 > ? pmd_huge+0xe0/0xe0 > internal_get_user_pages_fast+0x119b/0x2690 > ? mtree_load+0x1df/0x980 > ? __gup_device_huge+0x530/0x530 > ? rcu_is_watching+0xe/0xb0 > ? lock_release+0x4f7/0x670 > get_user_pages_fast+0x95/0xe0 > ? get_user_pages_fast_only+0xe0/0xe0 > do_get_mempolicy+0x50c/0xd20 > ? sp_delete+0xf0/0xf0 > ? seccomp_notify_ioctl+0xd80/0xd80 > __x64_sys_get_mempolicy+0x187/0x2a0 > ? __ia32_sys_migrate_pages+0xf0/0xf0 > ? __secure_computing+0x1ff/0x360 > do_syscall_64+0x35/0xb0 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x47959d > Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48 > RSP: 002b:00007f5841a13068 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef > RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > RBP: 000000000059c0a0 R08: 0000000000000003 R09: 0000000000000000 > R10: 0000000020ff9000 R11: 0000000000000246 R12: 000000000059c0ac > R13: 000000000000000b R14: 0000000000437250 R15: 00007f58419f3000 > </TASK> > Kernel panic - not syncing: kernel: panic_on_warn set ... > > > WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0 > > That's this line: > if (WARN_ON_ONCE(folio_ref_count(folio) <= 0)) > Called from: > follow_page_pte+0x18c/0x1610 > > That did: > ptep = pte_offset_map_lock(mm, pmd, address, &ptl); > pte = ptep_get(ptep); > page = vm_normal_page(vma, address, pte); > ret = try_grab_page(page, flags); > > So we grabbed the PTE lock, looked up the PTE, translated that into > a page ... and found a page with a zero (or negative) refcount. > That's Really Bad. 
I think it was a zero refcount because r08 is 0 > and I don't see any other registers which have a plausible negative > 32-bit number in them. > > Yikebaer, could I trouble you to add this: > > +++ b/mm/gup.c > @@ -226,7 +226,7 @@ int __must_check try_grab_page(struct page *page, unsigned int flags) > { > struct folio *folio = page_folio(page); > > - if (WARN_ON_ONCE(folio_ref_count(folio) <= 0)) > + if (VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0, folio)) > return -ENOMEM; > > if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page))) > > and rerun the syzkaller? That'll give us some more information about > what has happened, although it won't tell us why it happened. > > We might need to catch someone decrementing the refcount to lower than > the mapcount to catch this ... which will be tricky, given the other > things we reuse the mapcount for. ^ permalink raw reply [flat|nested] 8+ messages in thread
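The register reasoning quoted above can be checked by hand. A minimal sketch, assuming a user-space C helper (the name low32_as_signed is hypothetical and not from the thread): it reinterprets the low 32 bits of a dumped 64-bit register as a signed int. Applied to the later trace, R12 = 00000000fffffff4 decodes to -12, which appears to be the -ENOMEM return value (ENOMEM is 12 on Linux) rather than the refcount itself; none of the registers in the first trace decode to a small negative number.

```c
#include <stdint.h>

/*
 * Hypothetical helper: interpret the low 32 bits of a 64-bit register
 * value from an oops dump as a signed 32-bit integer.
 */
static int32_t low32_as_signed(uint64_t reg)
{
    uint32_t low = (uint32_t)reg;

    /* Two's-complement decode done by hand, so no out-of-range cast. */
    if (low & 0x80000000u)
        return -(int32_t)(~low) - 1;
    return (int32_t)low;
}
```

For example, low32_as_signed(0x00000000fffffff4ULL) evaluates to -12, and the user-space RAX value ffffffffffffffda from the same dump evaluates to -38.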
* Re: WARNING in try_grab_page 2023-08-04 3:14 ` Yikebaer Aizezi @ 2023-08-04 3:42 ` Matthew Wilcox 2023-08-04 13:32 ` Matthew Wilcox 2023-08-04 13:35 ` David Howells 1 sibling, 1 reply; 8+ messages in thread From: Matthew Wilcox @ 2023-08-04 3:42 UTC (permalink / raw) To: Yikebaer Aizezi; +Cc: akpm, linux-mm, David Howells On Fri, Aug 04, 2023 at 11:14:45AM +0800, Yikebaer Aizezi wrote: > Just patched it, then I rerun the reproduce program, and I got this > output from console: > > BUG: Bad page state in process POC pfn:0eb8d > page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000 > index:0x0 pfn:0xeb8d > flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff) > page_type: 0xffffffff() > raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000 > raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 > page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set > page_owner info is not present (never set?) > Modules linked in: > CPU: 0 PID: 7959 Comm: POC Not tainted 6.5.0-rc2 #2 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > Call Trace: > <TASK> > __dump_stack lib/dump_stack.c:88 [inline] > dump_stack_lvl+0xd4/0xf0 lib/dump_stack.c:106 > bad_page+0x71/0x1a0 mm/page_alloc.c:533 > free_page_is_bad_report mm/page_alloc.c:974 [inline] > free_page_is_bad mm/page_alloc.c:984 [inline] > free_pages_prepare mm/page_alloc.c:1153 [inline] > free_unref_page_prepare+0x5f3/0xb50 mm/page_alloc.c:2348 > free_unref_page+0x2f/0x3c0 mm/page_alloc.c:2443 > __folio_put_small mm/swap.c:106 [inline] > __folio_put+0xa2/0x110 mm/swap.c:129 > folio_put include/linux/mm.h:1423 [inline] > put_page include/linux/mm.h:1492 [inline] > extract_user_to_sg lib/scatterlist.c:1151 [inline] Ohh. I think this is something Dave Howells has a patch for. 
> extract_iter_to_sg lib/scatterlist.c:1349 [inline] > extract_iter_to_sg+0x11ec/0x1570 lib/scatterlist.c:1339 > hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119 > sock_sendmsg_nosec net/socket.c:725 [inline] > sock_sendmsg+0xcf/0x170 net/socket.c:748 > ____sys_sendmsg+0x676/0x860 net/socket.c:2494 > ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548 > __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7fbd79539f29 > Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 > 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d > 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48 > RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29 > RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004 > RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0 > R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0 > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > </TASK> > > page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000 > index:0x0 pfn:0xeb8d > flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff) > page_type: 0xffffffff() > raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000 > raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 > page dumped because: VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0) > page_owner info is not present (never set?) 
> ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 7962 at mm/gup.c:229 try_grab_page+0x307/0x3c0 mm/gup.c:229 > Modules linked in: > CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229 > Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48 > c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b > eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00 > RSP: 0018:ffffc90002927178 EFLAGS: 00010293 > RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000 > RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00 > RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a > R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4 > R13: 0000000000290000 R14: ffffea00003ae340 R15: ffffea00003ae340 > FS: 00007fbd7961a540(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007fbd794d03d0 CR3: 0000000019855000 CR4: 0000000000750ef0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > PKRU: 55555554 > Call Trace: > <TASK> > follow_page_pte+0x18c/0x1610 mm/gup.c:651 > follow_pmd_mask mm/gup.c:727 [inline] > follow_pud_mask mm/gup.c:765 [inline] > follow_p4d_mask mm/gup.c:782 [inline] > follow_page_mask+0x2e4/0xbd0 mm/gup.c:839 > __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256 > __get_user_pages_locked mm/gup.c:1487 [inline] > __gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181 > internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179 > pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285 > iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline] > iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831 > extract_user_to_sg lib/scatterlist.c:1123 [inline] > extract_iter_to_sg 
lib/scatterlist.c:1349 [inline] > extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339 > hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119 > sock_sendmsg_nosec net/socket.c:725 [inline] > sock_sendmsg+0xcf/0x170 net/socket.c:748 > ____sys_sendmsg+0x676/0x860 net/socket.c:2494 > ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548 > __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7fbd79539f29 > Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 > 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d > 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48 > RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29 > RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004 > RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0 > R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0 > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > </TASK> > > Modules linked in: > CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229 > Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48 > c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b > eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00 > RSP: 0018:ffffc90002927178 EFLAGS: 00010293 > RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000 > RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00 > RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a > R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4 > R13: 0000000000290000 R14: ffffea00003ae340 R15: 
ffffea00003ae340 > FS: 00007fbd7961a540(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007fbd794d03d0 CR3: 0000000019855000 CR4: 0000000000750ef0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > PKRU: 55555554 > Call Trace: > <TASK> > follow_page_pte+0x18c/0x1610 mm/gup.c:651 > follow_pmd_mask mm/gup.c:727 [inline] > follow_pud_mask mm/gup.c:765 [inline] > follow_p4d_mask mm/gup.c:782 [inline] > follow_page_mask+0x2e4/0xbd0 mm/gup.c:839 > __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256 > __get_user_pages_locked mm/gup.c:1487 [inline] > __gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181 > internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179 > pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285 > iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline] > iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831 > extract_user_to_sg lib/scatterlist.c:1123 [inline] > extract_iter_to_sg lib/scatterlist.c:1349 [inline] > extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339 > hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119 > sock_sendmsg_nosec net/socket.c:725 [inline] > sock_sendmsg+0xcf/0x170 net/socket.c:748 > ____sys_sendmsg+0x676/0x860 net/socket.c:2494 > ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548 > __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7fbd79539f29 > Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 > 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d > 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48 > RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29 > RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004 
> RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0 > R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0 > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > </TASK> > Kernel panic - not syncing: kernel: panic_on_warn set ... > CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > Call Trace: > <TASK> > __dump_stack lib/dump_stack.c:88 [inline] > dump_stack_lvl+0x92/0xf0 lib/dump_stack.c:106 > panic+0x570/0x620 kernel/panic.c:340 > check_panic_on_warn+0x8e/0x90 kernel/panic.c:236 > __warn+0xee/0x340 kernel/panic.c:673 > __report_bug lib/bug.c:199 [inline] > report_bug+0x25d/0x460 lib/bug.c:219 > handle_bug+0x3c/0x70 arch/x86/kernel/traps.c:324 > exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:345 > asm_exc_invalid_op+0x16/0x20 arch/x86/include/asm/idtentry.h:568 > RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229 > Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48 > c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b > eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00 > RSP: 0018:ffffc90002927178 EFLAGS: 00010293 > RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000 > RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00 > RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a > R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4 > R13: 0000000000290000 R14: ffffea00003ae340 R15: ffffea00003ae340 > follow_page_pte+0x18c/0x1610 mm/gup.c:651 > follow_pmd_mask mm/gup.c:727 [inline] > follow_pud_mask mm/gup.c:765 [inline] > follow_p4d_mask mm/gup.c:782 [inline] > follow_page_mask+0x2e4/0xbd0 mm/gup.c:839 > __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256 > __get_user_pages_locked mm/gup.c:1487 [inline] > __gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181 > internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179 
> pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285 > iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline] > iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831 > extract_user_to_sg lib/scatterlist.c:1123 [inline] > extract_iter_to_sg lib/scatterlist.c:1349 [inline] > extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339 > hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119 > sock_sendmsg_nosec net/socket.c:725 [inline] > sock_sendmsg+0xcf/0x170 net/socket.c:748 > ____sys_sendmsg+0x676/0x860 net/socket.c:2494 > ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548 > __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7fbd79539f29 > Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 > 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d > 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48 > RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29 > RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004 > RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0 > R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0 > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > </TASK> > Dumping ftrace buffer: > (ftrace buffer empty) > Kernel Offset: disabled > Rebooting in 1 seconds.. > > --------------------------------------------------------------------------------------------- > > I think the previous question you mentioned about ioctl() is triggered > because of > another crash WARNING in kvm_arch_vcpu_ioctl_run, I think somehow these > two crashes triggered at one time. But I cannot figure out why it happened. > > after I tried to fixed that problem, and rerun C reproducer on this > issue, I got > different output from console as above. 
> > > Matthew Wilcox <willy@infradead.org> 于2023年8月3日周四 21:19写道: > > > > > > On Thu, Aug 03, 2023 at 04:56:03PM +0800, Yikebaer Aizezi wrote: > > > console output: > > > https://drive.google.com/file/d/1Lq71bFwtEDix82PEf_193CLG6uh1Pjj9/view?usp=drive_link > > > > I dug through this, and what I found troubles me. > > > > ------------[ cut here ]------------ > > WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0 > > Modules linked in: > > CPU: 0 PID: 13067 Comm: syz-executor Tainted: G B 6.5.0-rc2 #1 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > > RIP: 0010:try_grab_page+0x2dd/0x3a0 > > Code: ff be 04 00 00 00 4c 89 e7 e8 cf fa 13 00 f0 41 ff 04 24 e8 65 96 cb ff 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d c3 e8 53 96 cb ff <0f> 0b e8 4c 96 cb ff 41 bc f4 ff ff ff 5b 44 89 e0 5d 41 5c 41 5d > > RSP: 0018:ffffc9000c2777e0 EFLAGS: 00010212 > > RAX: 0000000000000247 RBX: ffffea00003ae340 RCX: ffffc90002bb1000 > > RDX: 0000000000040000 RSI: ffffffff81ad81ed RDI: ffffea00003ae374 > > RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffff94000075c6e > > R10: ffffea00003ae377 R11: 0000000000084001 R12: ffffea00003ae374 > > R13: 0000000000210002 R14: ffffea00003ae340 R15: 000000000eb8d225 > > FS: 00007f5841a13640(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 0000000000500310 CR3: 0000000018d0c000 CR4: 0000000000750ef0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > PKRU: 55555554 > > Call Trace: > > <TASK> > > ? __warn+0xe2/0x340 > > ? try_grab_page+0x2dd/0x3a0 > > ? report_bug+0x25d/0x460 > > ? handle_bug+0x3c/0x70 > > ? exc_invalid_op+0x14/0x40 > > ? asm_exc_invalid_op+0x16/0x20 > > ? try_grab_page+0x2dd/0x3a0 > > ? try_grab_page+0x2dd/0x3a0 > > follow_page_pte+0x18c/0x1610 > > ? try_grab_page+0x3a0/0x3a0 > > ? 
rcu_is_watching+0xe/0xb0 > > follow_page_mask+0x2e4/0xbd0 > > __get_user_pages+0x3fa/0xcf0 > > ? follow_page_mask+0xbd0/0xbd0 > > ? down_read_killable+0x146/0x4f0 > > ? down_read_interruptible+0x4f0/0x4f0 > > ? rcu_is_watching+0xe/0xb0 > > __gup_longterm_locked+0x5fa/0x1ec0 > > ? io_schedule_timeout+0x150/0x150 > > ? rcu_is_watching+0xe/0xb0 > > ? get_user_pages_unlocked+0x580/0x580 > > ? lock_release+0x4f7/0x670 > > ? internal_get_user_pages_fast+0xe27/0x2690 > > ? lock_downgrade+0x690/0x690 > > ? preempt_schedule_common+0x45/0xb0 > > ? pud_huge+0x9c/0xe0 > > ? pmd_huge+0xe0/0xe0 > > internal_get_user_pages_fast+0x119b/0x2690 > > ? mtree_load+0x1df/0x980 > > ? __gup_device_huge+0x530/0x530 > > ? rcu_is_watching+0xe/0xb0 > > ? lock_release+0x4f7/0x670 > > get_user_pages_fast+0x95/0xe0 > > ? get_user_pages_fast_only+0xe0/0xe0 > > do_get_mempolicy+0x50c/0xd20 > > ? sp_delete+0xf0/0xf0 > > ? seccomp_notify_ioctl+0xd80/0xd80 > > __x64_sys_get_mempolicy+0x187/0x2a0 > > ? __ia32_sys_migrate_pages+0xf0/0xf0 > > ? __secure_computing+0x1ff/0x360 > > do_syscall_64+0x35/0xb0 > > entry_SYSCALL_64_after_hwframe+0x63/0xcd > > RIP: 0033:0x47959d > > Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48 > > RSP: 002b:00007f5841a13068 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef > > RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > > RBP: 000000000059c0a0 R08: 0000000000000003 R09: 0000000000000000 > > R10: 0000000020ff9000 R11: 0000000000000246 R12: 000000000059c0ac > > R13: 000000000000000b R14: 0000000000437250 R15: 00007f58419f3000 > > </TASK> > > Kernel panic - not syncing: kernel: panic_on_warn set ... 
> > > > > WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0 > > > > That's this line: > > if (WARN_ON_ONCE(folio_ref_count(folio) <= 0)) > > Called from: > > follow_page_pte+0x18c/0x1610 > > > > That did: > > ptep = pte_offset_map_lock(mm, pmd, address, &ptl); > > pte = ptep_get(ptep); > > page = vm_normal_page(vma, address, pte); > > ret = try_grab_page(page, flags); > > > > So we grabbed the PTE lock, looked up the PTE, translated that into > > a page ... and found a page with a zero (or negative) refcount. > > That's Really Bad. I think it was a zero refcount because r08 is 0 > > and I don't see any other registers which have a plausible negative > > 32-bit number in them. > > > > Yikebaer, could I trouble you to add this: > > > > +++ b/mm/gup.c > > @@ -226,7 +226,7 @@ int __must_check try_grab_page(struct page *page, unsigned int flags) > > { > > struct folio *folio = page_folio(page); > > > > - if (WARN_ON_ONCE(folio_ref_count(folio) <= 0)) > > + if (VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0, folio)) > > return -ENOMEM; > > > > if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page))) > > > > and rerun the syzkaller? That'll give us some more information about > > what has happened, although it won't tell us why it happened. > > > > We might need to catch someone decrementing the refcount to lower than > > the mapcount to catch this ... which will be tricky, given the other > > things we reuse the mapcount for. ^ permalink raw reply [flat|nested] 8+ messages in thread
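The check discussed above can be modelled outside the kernel. The following is only a user-space sketch: struct mock_folio, warn_once() and mock_try_grab() are hypothetical stand-ins, not the kernel's struct folio, WARN_ON_ONCE or try_grab_page, and the real function goes on to handle the FOLL_GET and FOLL_PIN cases, which are reduced here to a plain increment.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct folio; only the refcount is modelled. */
struct mock_folio {
    int refcount;
};

/*
 * Mimics the once-only reporting of WARN_ON_ONCE for this single call
 * site: the condition is always returned, the message is printed once.
 */
static bool warn_once(bool cond, const char *what)
{
    static bool warned;

    if (cond && !warned) {
        warned = true;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/* Refuse to take a reference on a folio that nobody should still hold. */
static int mock_try_grab(struct mock_folio *folio)
{
    if (warn_once(folio->refcount <= 0, "folio refcount <= 0"))
        return -ENOMEM;

    folio->refcount++;
    return 0;
}
```

With a refcount of 0, the first call prints the warning and returns -ENOMEM; later calls still fail but stay silent. On the reporter's kernel the first warning is also the last, because panic_on_warn is set.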
* Re: WARNING in try_grab_page 2023-08-04 3:42 ` Matthew Wilcox @ 2023-08-04 13:32 ` Matthew Wilcox 0 siblings, 0 replies; 8+ messages in thread From: Matthew Wilcox @ 2023-08-04 13:32 UTC (permalink / raw) To: Yikebaer Aizezi; +Cc: akpm, linux-mm, David Howells On Fri, Aug 04, 2023 at 04:42:27AM +0100, Matthew Wilcox wrote: > On Fri, Aug 04, 2023 at 11:14:45AM +0800, Yikebaer Aizezi wrote: > > Just patched it, then I rerun the reproduce program, and I got this > > output from console: > > > > BUG: Bad page state in process POC pfn:0eb8d > > page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000 > > index:0x0 pfn:0xeb8d > > flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff) > > page_type: 0xffffffff() > > raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000 > > raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 > > page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set > > page_owner info is not present (never set?) > > Modules linked in: > > CPU: 0 PID: 7959 Comm: POC Not tainted 6.5.0-rc2 #2 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > > Call Trace: > > <TASK> > > __dump_stack lib/dump_stack.c:88 [inline] > > dump_stack_lvl+0xd4/0xf0 lib/dump_stack.c:106 > > bad_page+0x71/0x1a0 mm/page_alloc.c:533 > > free_page_is_bad_report mm/page_alloc.c:974 [inline] > > free_page_is_bad mm/page_alloc.c:984 [inline] > > free_pages_prepare mm/page_alloc.c:1153 [inline] > > free_unref_page_prepare+0x5f3/0xb50 mm/page_alloc.c:2348 > > free_unref_page+0x2f/0x3c0 mm/page_alloc.c:2443 > > __folio_put_small mm/swap.c:106 [inline] > > __folio_put+0xa2/0x110 mm/swap.c:129 > > folio_put include/linux/mm.h:1423 [inline] > > put_page include/linux/mm.h:1492 [inline] > > extract_user_to_sg lib/scatterlist.c:1151 [inline] > > Ohh. I think this is something Dave Howells has a patch for. 
Can you try https://lore.kernel.org/mm-commits/20230726204730.B89D8C433C7@smtp.kernel.org/ ?
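The page dump and the register dumps in this thread refer to the same page, which can be confirmed with a little arithmetic. This is a sketch under two assumptions that are not stated in the thread: the vmemmap area starts at 0xffffea0000000000 (x86-64 with 4-level paging and no KASLR offset, which fits "Kernel Offset: disabled"), and sizeof(struct page) is 64 bytes. The helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed, not taken from the thread: x86-64 4-level paging, KASLR off. */
#define ASSUMED_VMEMMAP_BASE     0xffffea0000000000ULL
/* Assumed size of struct page in bytes. */
#define ASSUMED_STRUCT_PAGE_SIZE 64ULL

/* Address of the struct page that describes a given page frame number. */
static uint64_t pfn_to_page_addr(uint64_t pfn)
{
    return ASSUMED_VMEMMAP_BASE + pfn * ASSUMED_STRUCT_PAGE_SIZE;
}

/* Inverse mapping: page frame number for a struct page address. */
static uint64_t page_addr_to_pfn(uint64_t page_addr)
{
    return (page_addr - ASSUMED_VMEMMAP_BASE) / ASSUMED_STRUCT_PAGE_SIZE;
}

/* Frame number held in bits 12..51 of an x86-64 PTE-like value. */
static uint64_t pte_like_to_pfn(uint64_t value)
{
    return (value & 0x000ffffffffff000ULL) >> 12;
}
```

Under these assumptions, pfn 0xeb8d maps to ffffea00003ae340, the page address printed in the dump and held in RBX, RBP and R14. The value 000000000eb8d225 in R15 of the first trace also yields pfn 0xeb8d, so it may be the PTE that was just looked up.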
* Re: WARNING in try_grab_page 2023-08-04 3:14 ` Yikebaer Aizezi 2023-08-04 3:42 ` Matthew Wilcox @ 2023-08-04 13:35 ` David Howells 2023-08-06 7:51 ` Yikebaer Aizezi 1 sibling, 1 reply; 8+ messages in thread From: David Howells @ 2023-08-04 13:35 UTC (permalink / raw) To: Matthew Wilcox; +Cc: dhowells, Yikebaer Aizezi, akpm, linux-mm, Herbert Xu Matthew Wilcox <willy@infradead.org> wrote: > On Fri, Aug 04, 2023 at 11:14:45AM +0800, Yikebaer Aizezi wrote: > > Just patched it, then I rerun the reproduce program, and I got this > > output from console: > > > > BUG: Bad page state in process POC pfn:0eb8d > > page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000 > > index:0x0 pfn:0xeb8d > > flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff) > > page_type: 0xffffffff() > > raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000 > > raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 > > page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set > > page_owner info is not present (never set?) > > Modules linked in: > > CPU: 0 PID: 7959 Comm: POC Not tainted 6.5.0-rc2 #2 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > > Call Trace: > > <TASK> > > __dump_stack lib/dump_stack.c:88 [inline] > > dump_stack_lvl+0xd4/0xf0 lib/dump_stack.c:106 > > bad_page+0x71/0x1a0 mm/page_alloc.c:533 > > free_page_is_bad_report mm/page_alloc.c:974 [inline] > > free_page_is_bad mm/page_alloc.c:984 [inline] > > free_pages_prepare mm/page_alloc.c:1153 [inline] > > free_unref_page_prepare+0x5f3/0xb50 mm/page_alloc.c:2348 > > free_unref_page+0x2f/0x3c0 mm/page_alloc.c:2443 > > __folio_put_small mm/swap.c:106 [inline] > > __folio_put+0xa2/0x110 mm/swap.c:129 > > folio_put include/linux/mm.h:1423 [inline] > > put_page include/linux/mm.h:1492 [inline] > > extract_user_to_sg lib/scatterlist.c:1151 [inline] > > Ohh. 
I think this is something Dave Howells has a patch for. > > > extract_iter_to_sg lib/scatterlist.c:1349 [inline] > > extract_iter_to_sg+0x11ec/0x1570 lib/scatterlist.c:1339 > > hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119 > > sock_sendmsg_nosec net/socket.c:725 [inline] > > sock_sendmsg+0xcf/0x170 net/socket.c:748 > > ____sys_sendmsg+0x676/0x860 net/socket.c:2494 > > ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548 > > __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577 > > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 > > entry_SYSCALL_64_after_hwframe+0x63/0xcd This might be the fix you're looking for. https://lore.kernel.org/linux-crypto/20571.1690369076@warthog.procyon.org.uk/ Andrew has it in mm-hotfixes-unstable. David ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: WARNING in try_grab_page 2023-08-04 13:35 ` David Howells @ 2023-08-06 7:51 ` Yikebaer Aizezi 0 siblings, 0 replies; 8+ messages in thread From: Yikebaer Aizezi @ 2023-08-06 7:51 UTC (permalink / raw) To: David Howells, linux-mm, Matthew Wilcox, akpm; +Cc: Herbert Xu

I just tried this patch; it worked, and the bug was no longer triggered.

David Howells <dhowells@redhat.com> wrote on Fri, Aug 4, 2023 at 21:35: > > Matthew Wilcox <willy@infradead.org> wrote: > > > On Fri, Aug 04, 2023 at 11:14:45AM +0800, Yikebaer Aizezi wrote: > > > Just patched it, then I rerun the reproduce program, and I got this > > > output from console: > > > > > > BUG: Bad page state in process POC pfn:0eb8d > > > page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000 > > > index:0x0 pfn:0xeb8d > > > flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff) > > > page_type: 0xffffffff() > > > raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000 > > > raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 > > > page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set > > > page_owner info is not present (never set?) 
> > > Modules linked in: > > > CPU: 0 PID: 7959 Comm: POC Not tainted 6.5.0-rc2 #2 > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > > > Call Trace: > > > <TASK> > > > __dump_stack lib/dump_stack.c:88 [inline] > > > dump_stack_lvl+0xd4/0xf0 lib/dump_stack.c:106 > > > bad_page+0x71/0x1a0 mm/page_alloc.c:533 > > > free_page_is_bad_report mm/page_alloc.c:974 [inline] > > > free_page_is_bad mm/page_alloc.c:984 [inline] > > > free_pages_prepare mm/page_alloc.c:1153 [inline] > > > free_unref_page_prepare+0x5f3/0xb50 mm/page_alloc.c:2348 > > > free_unref_page+0x2f/0x3c0 mm/page_alloc.c:2443 > > > __folio_put_small mm/swap.c:106 [inline] > > > __folio_put+0xa2/0x110 mm/swap.c:129 > > > folio_put include/linux/mm.h:1423 [inline] > > > put_page include/linux/mm.h:1492 [inline] > > > extract_user_to_sg lib/scatterlist.c:1151 [inline] > > > > Ohh. I think this is something Dave Howells has a patch for. > > > > > extract_iter_to_sg lib/scatterlist.c:1349 [inline] > > > extract_iter_to_sg+0x11ec/0x1570 lib/scatterlist.c:1339 > > > hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119 > > > sock_sendmsg_nosec net/socket.c:725 [inline] > > > sock_sendmsg+0xcf/0x170 net/socket.c:748 > > > ____sys_sendmsg+0x676/0x860 net/socket.c:2494 > > > ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548 > > > __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577 > > > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > > > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 > > > entry_SYSCALL_64_after_hwframe+0x63/0xcd > > This might be the fix you're looking for. > > https://lore.kernel.org/linux-crypto/20571.1690369076@warthog.procyon.org.uk/ > > Andrew has it in mm-hotfixes-unstable. > > David > ^ permalink raw reply [flat|nested] 8+ messages in thread