* WARNING in try_grab_page
@ 2023-08-03 8:56 Yikebaer Aizezi
2023-08-03 12:50 ` Matthew Wilcox
2023-08-03 13:19 ` Matthew Wilcox
0 siblings, 2 replies; 8+ messages in thread
From: Yikebaer Aizezi @ 2023-08-03 8:56 UTC
To: akpm, linux-mm; +Cc: linux-kernel
Hello,
When using Healer to fuzz the Linux kernel, the following crash
was triggered on:
HEAD commit: fdf0eaf11452d72945af31804e2a1048ee1b574c (tag: v6.5-rc2)
git tree: upstream
I also tried to reproduce this bug on v6.5-rc3 (HEAD commit:
6eaae198076080886b9e7d57f4ae06fa782f90ef), and it still exists.
console output:
https://drive.google.com/file/d/1Lq71bFwtEDix82PEf_193CLG6uh1Pjj9/view?usp=drive_link
kernel config: https://drive.google.com/file/d/1dApy7OR4KDYdhF96ZUowZQ1r-uLYTd-0/view?usp=drive_link
C reproducer: https://drive.google.com/file/d/1Dkj31wwYP7p-AEJeemD3yrIUr7-VdBqF/view?usp=drive_link
Syzlang reproducer:
https://drive.google.com/file/d/1ib6zTs4srKI1RnUcHG3mSZ5HeW7rZhnd/view?usp=drive_link
If you fix this issue, please add the following tag to the commit:
Reported-by: Yikebaer Aizezi <yikebaer61@gmail.com>
WARNING: CPU: 0 PID: 10232 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0
mm/gup.c:229
Modules linked in:
CPU: 0 PID: 10232 Comm: syz-executor Tainted: G B 6.5.0-rc2 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
RIP: 0010:try_grab_page+0x2dd/0x3a0 mm/gup.c:229
Code: ff be 04 00 00 00 4c 89 e7 e8 cf fa 13 00 f0 41 ff 04 24 e8 65
96 cb ff 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d c3 e8 53 96 cb ff <0f> 0b
e8 4c 96 cb ff 41 bc f4 ff ff ff 5b 44 89 e0 5d 41 5c 41 5d
RSP: 0018:ffffc9000e99f268 EFLAGS: 00010202
RAX: 0000000000002f80 RBX: ffffea00003ae340 RCX: ffffc90002d79000
RDX: 0000000000040000 RSI: ffffffff81ad81ed RDI: ffffea00003ae374
RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffff94000075c6e
R10: ffffea00003ae377 R11: 0000000000000000 R12: ffffea00003ae374
R13: 0000000000210052 R14: ffffea00003ae340 R15: 000000000eb8d225
FS: 00007fc6339a4640(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000002c3ea000 CR4: 0000000000752ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
follow_page_pte+0x18c/0x1610 mm/gup.c:651
follow_pmd_mask mm/gup.c:727 [inline]
follow_pud_mask mm/gup.c:765 [inline]
follow_p4d_mask mm/gup.c:782 [inline]
follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
__get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
__get_user_pages_locked mm/gup.c:1487 [inline]
get_user_pages_unlocked+0x183/0x580 mm/gup.c:2387
hva_to_pfn_slow arch/x86/kvm/../../../virt/kvm/kvm_main.c:2536 [inline]
hva_to_pfn+0x198/0xbc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2674
__gfn_to_pfn_memslot+0x202/0x3e0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2736
__kvm_faultin_pfn arch/x86/kvm/mmu/mmu.c:4329 [inline]
kvm_faultin_pfn+0x21b/0x12d0 arch/x86/kvm/mmu/mmu.c:4365
kvm_tdp_mmu_page_fault arch/x86/kvm/mmu/mmu.c:4503 [inline]
kvm_tdp_page_fault+0x167/0x4d0 arch/x86/kvm/mmu/mmu.c:4549
kvm_mmu_do_page_fault arch/x86/kvm/mmu/mmu_internal.h:320 [inline]
kvm_mmu_page_fault+0x2f4/0x1a40 arch/x86/kvm/mmu/mmu.c:5756
handle_ept_violation+0x20a/0x620 arch/x86/kvm/vmx/vmx.c:5760
__vmx_handle_exit arch/x86/kvm/vmx/vmx.c:6539 [inline]
vmx_handle_exit+0x4a1/0x18d0 arch/x86/kvm/vmx/vmx.c:6556
vcpu_enter_guest arch/x86/kvm/x86.c:10848 [inline]
vcpu_run+0x24b6/0x44b0 arch/x86/kvm/x86.c:10951
kvm_arch_vcpu_ioctl_run+0x416/0x1830 arch/x86/kvm/x86.c:11172
kvm_vcpu_ioctl+0x4de/0xcc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4112
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl fs/ioctl.c:856 [inline]
__x64_sys_ioctl+0x173/0x1e0 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x47959d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fc6339a4068 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000006
RBP: 000000000059c0a0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c0ac
R13: 000000000000000b R14: 0000000000437250 R15: 00007fc633984000
</TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 0 PID: 10232 Comm: syz-executor Tainted: G B 6.5.0-rc2 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x92/0xf0 lib/dump_stack.c:106
panic+0x570/0x620 kernel/panic.c:340
check_panic_on_warn+0x8e/0x90 kernel/panic.c:236
__warn+0xee/0x340 kernel/panic.c:673
__report_bug lib/bug.c:199 [inline]
report_bug+0x25d/0x460 lib/bug.c:219
handle_bug+0x3c/0x70 arch/x86/kernel/traps.c:324
exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:345
asm_exc_invalid_op+0x16/0x20 arch/x86/include/asm/idtentry.h:568
RIP: 0010:try_grab_page+0x2dd/0x3a0 mm/gup.c:229
Code: ff be 04 00 00 00 4c 89 e7 e8 cf fa 13 00 f0 41 ff 04 24 e8 65
96 cb ff 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d c3 e8 53 96 cb ff <0f> 0b
e8 4c 96 cb ff 41 bc f4 ff ff ff 5b 44 89 e0 5d 41 5c 41 5d
RSP: 0018:ffffc9000e99f268 EFLAGS: 00010202
RAX: 0000000000002f80 RBX: ffffea00003ae340 RCX: ffffc90002d79000
RDX: 0000000000040000 RSI: ffffffff81ad81ed RDI: ffffea00003ae374
RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffff94000075c6e
R10: ffffea00003ae377 R11: 0000000000000000 R12: ffffea00003ae374
R13: 0000000000210052 R14: ffffea00003ae340 R15: 000000000eb8d225
follow_page_pte+0x18c/0x1610 mm/gup.c:651
follow_pmd_mask mm/gup.c:727 [inline]
follow_pud_mask mm/gup.c:765 [inline]
follow_p4d_mask mm/gup.c:782 [inline]
follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
__get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
__get_user_pages_locked mm/gup.c:1487 [inline]
get_user_pages_unlocked+0x183/0x580 mm/gup.c:2387
hva_to_pfn_slow arch/x86/kvm/../../../virt/kvm/kvm_main.c:2536 [inline]
hva_to_pfn+0x198/0xbc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2674
__gfn_to_pfn_memslot+0x202/0x3e0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2736
__kvm_faultin_pfn arch/x86/kvm/mmu/mmu.c:4329 [inline]
kvm_faultin_pfn+0x21b/0x12d0 arch/x86/kvm/mmu/mmu.c:4365
kvm_tdp_mmu_page_fault arch/x86/kvm/mmu/mmu.c:4503 [inline]
kvm_tdp_page_fault+0x167/0x4d0 arch/x86/kvm/mmu/mmu.c:4549
kvm_mmu_do_page_fault arch/x86/kvm/mmu/mmu_internal.h:320 [inline]
kvm_mmu_page_fault+0x2f4/0x1a40 arch/x86/kvm/mmu/mmu.c:5756
handle_ept_violation+0x20a/0x620 arch/x86/kvm/vmx/vmx.c:5760
__vmx_handle_exit arch/x86/kvm/vmx/vmx.c:6539 [inline]
vmx_handle_exit+0x4a1/0x18d0 arch/x86/kvm/vmx/vmx.c:6556
vcpu_enter_guest arch/x86/kvm/x86.c:10848 [inline]
vcpu_run+0x24b6/0x44b0 arch/x86/kvm/x86.c:10951
kvm_arch_vcpu_ioctl_run+0x416/0x1830 arch/x86/kvm/x86.c:11172
kvm_vcpu_ioctl+0x4de/0xcc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4112
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl fs/ioctl.c:856 [inline]
__x64_sys_ioctl+0x173/0x1e0 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x47959d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fc6339a4068 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000006
RBP: 000000000059c0a0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c0ac
R13: 000000000000000b R14: 0000000000437250 R15: 00007fc633984000
</TASK>
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 1 seconds..
* Re: WARNING in try_grab_page
2023-08-03 8:56 WARNING in try_grab_page Yikebaer Aizezi
@ 2023-08-03 12:50 ` Matthew Wilcox
2023-08-03 13:19 ` Matthew Wilcox
1 sibling, 0 replies; 8+ messages in thread
From: Matthew Wilcox @ 2023-08-03 12:50 UTC
To: Yikebaer Aizezi; +Cc: akpm, linux-mm, linux-kernel
On Thu, Aug 03, 2023 at 04:56:03PM +0800, Yikebaer Aizezi wrote:
> console output:
> https://drive.google.com/file/d/1Lq71bFwtEDix82PEf_193CLG6uh1Pjj9/view?usp=drive_link
> kernel config: https://drive.google.com/file/d/1dApy7OR4KDYdhF96ZUowZQ1r-uLYTd-0/view?usp=drive_link
> C reproducer: https://drive.google.com/file/d/1Dkj31wwYP7p-AEJeemD3yrIUr7-VdBqF/view?usp=drive_link
Are you sure this is right? The below stack trace shows something
coming in through the ioctl() path, but nothing in this reproducer
calls ioctl(). It's just socket(), bind(), accept4() and sendmsg().
I don't see a way to come up with this stack backtrace from that
program.
> Call Trace:
> <TASK>
> follow_page_pte+0x18c/0x1610 mm/gup.c:651
> follow_pmd_mask mm/gup.c:727 [inline]
> follow_pud_mask mm/gup.c:765 [inline]
> follow_p4d_mask mm/gup.c:782 [inline]
> follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
> __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
> __get_user_pages_locked mm/gup.c:1487 [inline]
> get_user_pages_unlocked+0x183/0x580 mm/gup.c:2387
> hva_to_pfn_slow arch/x86/kvm/../../../virt/kvm/kvm_main.c:2536 [inline]
> hva_to_pfn+0x198/0xbc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2674
> __gfn_to_pfn_memslot+0x202/0x3e0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2736
> __kvm_faultin_pfn arch/x86/kvm/mmu/mmu.c:4329 [inline]
> kvm_faultin_pfn+0x21b/0x12d0 arch/x86/kvm/mmu/mmu.c:4365
> kvm_tdp_mmu_page_fault arch/x86/kvm/mmu/mmu.c:4503 [inline]
> kvm_tdp_page_fault+0x167/0x4d0 arch/x86/kvm/mmu/mmu.c:4549
> kvm_mmu_do_page_fault arch/x86/kvm/mmu/mmu_internal.h:320 [inline]
> kvm_mmu_page_fault+0x2f4/0x1a40 arch/x86/kvm/mmu/mmu.c:5756
> handle_ept_violation+0x20a/0x620 arch/x86/kvm/vmx/vmx.c:5760
> __vmx_handle_exit arch/x86/kvm/vmx/vmx.c:6539 [inline]
> vmx_handle_exit+0x4a1/0x18d0 arch/x86/kvm/vmx/vmx.c:6556
> vcpu_enter_guest arch/x86/kvm/x86.c:10848 [inline]
> vcpu_run+0x24b6/0x44b0 arch/x86/kvm/x86.c:10951
> kvm_arch_vcpu_ioctl_run+0x416/0x1830 arch/x86/kvm/x86.c:11172
> kvm_vcpu_ioctl+0x4de/0xcc0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4112
> vfs_ioctl fs/ioctl.c:51 [inline]
> __do_sys_ioctl fs/ioctl.c:870 [inline]
> __se_sys_ioctl fs/ioctl.c:856 [inline]
> __x64_sys_ioctl+0x173/0x1e0 fs/ioctl.c:856
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x47959d
> Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48
> 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fc6339a4068 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d
> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000006
> RBP: 000000000059c0a0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000059c0ac
> R13: 000000000000000b R14: 0000000000437250 R15: 00007fc633984000
> </TASK>
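The syscall behind a trace like the one quoted above can be read off the saved user registers: ORIG_RAX holds the x86-64 syscall number, and for ioctl() RSI holds the request code. The following is a minimal sketch of that decoding, not part of the original report. The helper names `orig_rax_to_name` and `is_kvm_run` are hypothetical, and the table covers only the three ORIG_RAX values that appear in this thread.

```c
#include <stddef.h>
#include <string.h>

/* x86-64 syscall numbers for the ORIG_RAX values seen in this thread. */
struct sysno_name {
    unsigned long nr;
    const char *name;
};

static const struct sysno_name known_syscalls[] = {
    { 0x10, "ioctl" },         /* 16 */
    { 0x2e, "sendmsg" },       /* 46 */
    { 0xef, "get_mempolicy" }, /* 239 */
};

/* Hypothetical helper: map an ORIG_RAX value to a syscall name. */
const char *orig_rax_to_name(unsigned long orig_rax)
{
    size_t i;

    for (i = 0; i < sizeof(known_syscalls) / sizeof(known_syscalls[0]); i++) {
        if (known_syscalls[i].nr == orig_rax)
            return known_syscalls[i].name;
    }
    return "unknown";
}

/*
 * Hypothetical helper: KVM_RUN is _IO(KVMIO, 0x80) with KVMIO == 0xAE,
 * so an ioctl() whose request (RSI) is 0xae80 is a KVM_RUN call.
 */
int is_kvm_run(unsigned long orig_rax, unsigned long rsi)
{
    return orig_rax == 0x10 && rsi == 0xae80;
}
```

With ORIG_RAX 0x10 and RSI 0xae80, the quoted trace decodes to ioctl(KVM_RUN), which matches the kvm_vcpu_ioctl frames and is not a call the socket-only reproducer makes.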
* Re: WARNING in try_grab_page
2023-08-03 8:56 WARNING in try_grab_page Yikebaer Aizezi
2023-08-03 12:50 ` Matthew Wilcox
@ 2023-08-03 13:19 ` Matthew Wilcox
2023-08-04 3:14 ` Yikebaer Aizezi
1 sibling, 1 reply; 8+ messages in thread
From: Matthew Wilcox @ 2023-08-03 13:19 UTC
To: Yikebaer Aizezi; +Cc: akpm, linux-mm
On Thu, Aug 03, 2023 at 04:56:03PM +0800, Yikebaer Aizezi wrote:
> console output:
> https://drive.google.com/file/d/1Lq71bFwtEDix82PEf_193CLG6uh1Pjj9/view?usp=drive_link
I dug through this, and what I found troubles me.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0
Modules linked in:
CPU: 0 PID: 13067 Comm: syz-executor Tainted: G B 6.5.0-rc2 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
RIP: 0010:try_grab_page+0x2dd/0x3a0
Code: ff be 04 00 00 00 4c 89 e7 e8 cf fa 13 00 f0 41 ff 04 24 e8 65 96 cb ff 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d c3 e8 53 96 cb ff <0f> 0b e8 4c 96 cb ff 41 bc f4 ff ff ff 5b 44 89 e0 5d 41 5c 41 5d
RSP: 0018:ffffc9000c2777e0 EFLAGS: 00010212
RAX: 0000000000000247 RBX: ffffea00003ae340 RCX: ffffc90002bb1000
RDX: 0000000000040000 RSI: ffffffff81ad81ed RDI: ffffea00003ae374
RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffff94000075c6e
R10: ffffea00003ae377 R11: 0000000000084001 R12: ffffea00003ae374
R13: 0000000000210002 R14: ffffea00003ae340 R15: 000000000eb8d225
FS: 00007f5841a13640(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000500310 CR3: 0000000018d0c000 CR4: 0000000000750ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
? __warn+0xe2/0x340
? try_grab_page+0x2dd/0x3a0
? report_bug+0x25d/0x460
? handle_bug+0x3c/0x70
? exc_invalid_op+0x14/0x40
? asm_exc_invalid_op+0x16/0x20
? try_grab_page+0x2dd/0x3a0
? try_grab_page+0x2dd/0x3a0
follow_page_pte+0x18c/0x1610
? try_grab_page+0x3a0/0x3a0
? rcu_is_watching+0xe/0xb0
follow_page_mask+0x2e4/0xbd0
__get_user_pages+0x3fa/0xcf0
? follow_page_mask+0xbd0/0xbd0
? down_read_killable+0x146/0x4f0
? down_read_interruptible+0x4f0/0x4f0
? rcu_is_watching+0xe/0xb0
__gup_longterm_locked+0x5fa/0x1ec0
? io_schedule_timeout+0x150/0x150
? rcu_is_watching+0xe/0xb0
? get_user_pages_unlocked+0x580/0x580
? lock_release+0x4f7/0x670
? internal_get_user_pages_fast+0xe27/0x2690
? lock_downgrade+0x690/0x690
? preempt_schedule_common+0x45/0xb0
? pud_huge+0x9c/0xe0
? pmd_huge+0xe0/0xe0
internal_get_user_pages_fast+0x119b/0x2690
? mtree_load+0x1df/0x980
? __gup_device_huge+0x530/0x530
? rcu_is_watching+0xe/0xb0
? lock_release+0x4f7/0x670
get_user_pages_fast+0x95/0xe0
? get_user_pages_fast_only+0xe0/0xe0
do_get_mempolicy+0x50c/0xd20
? sp_delete+0xf0/0xf0
? seccomp_notify_ioctl+0xd80/0xd80
__x64_sys_get_mempolicy+0x187/0x2a0
? __ia32_sys_migrate_pages+0xf0/0xf0
? __secure_computing+0x1ff/0x360
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x47959d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f5841a13068 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef
RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 000000000059c0a0 R08: 0000000000000003 R09: 0000000000000000
R10: 0000000020ff9000 R11: 0000000000000246 R12: 000000000059c0ac
R13: 000000000000000b R14: 0000000000437250 R15: 00007f58419f3000
</TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
> WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0
That's this line:
if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
Called from:
follow_page_pte+0x18c/0x1610
That did:
ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
pte = ptep_get(ptep);
page = vm_normal_page(vma, address, pte);
ret = try_grab_page(page, flags);
So we grabbed the PTE lock, looked up the PTE, translated that into
a page ... and found a page with a zero (or negative) refcount.
That's Really Bad. I think it was a zero refcount because r08 is 0
and I don't see any other registers which have a plausible negative
32-bit number in them.
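To make the failing check concrete, here is a small userspace model of it. This is only a sketch and not the kernel code: `struct toy_folio` and `toy_try_grab` are hypothetical stand-ins, and everything except the reference count is left out.

```c
#include <errno.h>
#include <stdatomic.h>

/* Toy stand-in for struct folio: only the reference count is modelled. */
struct toy_folio {
    atomic_int refcount;
};

/*
 * Model of the guard at the top of try_grab_page(): a page that is still
 * mapped in a page table should hold at least one reference, so a count
 * of zero or less means something dropped a reference it did not own.
 */
int toy_try_grab(struct toy_folio *folio)
{
    if (atomic_load(&folio->refcount) <= 0)
        return -ENOMEM; /* the kernel also warns once at this point */

    atomic_fetch_add(&folio->refcount, 1);
    return 0;
}
```

In this model a folio with refcount 1 is grabbed and ends at 2, while a folio with refcount 0 is rejected with -ENOMEM and left untouched, which is the path the warning reports.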
Yikebaer, could I trouble you to add this:
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -226,7 +226,7 @@ int __must_check try_grab_page(struct page *page, unsigned int flags)
 {
 	struct folio *folio = page_folio(page);
 
-	if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
+	if (VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0, folio))
 		return -ENOMEM;
 
 	if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page)))
and rerun the syzkaller? That'll give us some more information about
what has happened, although it won't tell us why it happened.
We might need to catch someone decrementing the refcount to lower than
the mapcount to catch this ... which will be tricky, given the other
things we reuse the mapcount for.
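The kind of trap described above could look roughly like the following userspace sketch. It is an illustration under simplifying assumptions, not a proposed kernel patch: `struct toy_page` and `toy_put_checked` are hypothetical, and `mapcount` here is a plain count of mappings, whereas the kernel field is shared with other uses such as the page type.

```c
#include <stdbool.h>

/* Toy page: a plain reference count and a plain count of mappings. */
struct toy_page {
    int refcount;
    int mapcount;
};

/*
 * Drop one reference, but refuse if that would leave fewer references
 * than mappings. A false return marks the caller as the one performing
 * the unbalanced put, which is the event worth catching.
 */
bool toy_put_checked(struct toy_page *page)
{
    if (page->refcount - 1 < page->mapcount)
        return false;

    page->refcount--;
    return true;
}
```

For a page with two references and one mapping, the first put succeeds and the second is refused, identifying the offending caller at the time of the decrement rather than at the later lookup.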
* Re: WARNING in try_grab_page
2023-08-03 13:19 ` Matthew Wilcox
@ 2023-08-04 3:14 ` Yikebaer Aizezi
2023-08-04 3:42 ` Matthew Wilcox
2023-08-04 13:35 ` David Howells
0 siblings, 2 replies; 8+ messages in thread
From: Yikebaer Aizezi @ 2023-08-04 3:14 UTC
To: Matthew Wilcox; +Cc: akpm, linux-mm
I applied the patch and reran the reproducer, and got this console
output:
BUG: Bad page state in process POC pfn:0eb8d
page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000
index:0x0 pfn:0xeb8d
flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner info is not present (never set?)
Modules linked in:
CPU: 0 PID: 7959 Comm: POC Not tainted 6.5.0-rc2 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd4/0xf0 lib/dump_stack.c:106
bad_page+0x71/0x1a0 mm/page_alloc.c:533
free_page_is_bad_report mm/page_alloc.c:974 [inline]
free_page_is_bad mm/page_alloc.c:984 [inline]
free_pages_prepare mm/page_alloc.c:1153 [inline]
free_unref_page_prepare+0x5f3/0xb50 mm/page_alloc.c:2348
free_unref_page+0x2f/0x3c0 mm/page_alloc.c:2443
__folio_put_small mm/swap.c:106 [inline]
__folio_put+0xa2/0x110 mm/swap.c:129
folio_put include/linux/mm.h:1423 [inline]
put_page include/linux/mm.h:1492 [inline]
extract_user_to_sg lib/scatterlist.c:1151 [inline]
extract_iter_to_sg lib/scatterlist.c:1349 [inline]
extract_iter_to_sg+0x11ec/0x1570 lib/scatterlist.c:1339
hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
sock_sendmsg_nosec net/socket.c:725 [inline]
sock_sendmsg+0xcf/0x170 net/socket.c:748
____sys_sendmsg+0x676/0x860 net/socket.c:2494
___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
__sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fbd79539f29
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004
RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0
R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000
index:0x0 pfn:0xeb8d
flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0)
page_owner info is not present (never set?)
------------[ cut here ]------------
WARNING: CPU: 0 PID: 7962 at mm/gup.c:229 try_grab_page+0x307/0x3c0 mm/gup.c:229
Modules linked in:
CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229
Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48
c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b
eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00
RSP: 0018:ffffc90002927178 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000
RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00
RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a
R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4
R13: 0000000000290000 R14: ffffea00003ae340 R15: ffffea00003ae340
FS: 00007fbd7961a540(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbd794d03d0 CR3: 0000000019855000 CR4: 0000000000750ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
follow_page_pte+0x18c/0x1610 mm/gup.c:651
follow_pmd_mask mm/gup.c:727 [inline]
follow_pud_mask mm/gup.c:765 [inline]
follow_p4d_mask mm/gup.c:782 [inline]
follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
__get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
__get_user_pages_locked mm/gup.c:1487 [inline]
__gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181
internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179
pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285
iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline]
iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831
extract_user_to_sg lib/scatterlist.c:1123 [inline]
extract_iter_to_sg lib/scatterlist.c:1349 [inline]
extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339
hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
sock_sendmsg_nosec net/socket.c:725 [inline]
sock_sendmsg+0xcf/0x170 net/socket.c:748
____sys_sendmsg+0x676/0x860 net/socket.c:2494
___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
__sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fbd79539f29
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004
RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0
R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x92/0xf0 lib/dump_stack.c:106
panic+0x570/0x620 kernel/panic.c:340
check_panic_on_warn+0x8e/0x90 kernel/panic.c:236
__warn+0xee/0x340 kernel/panic.c:673
__report_bug lib/bug.c:199 [inline]
report_bug+0x25d/0x460 lib/bug.c:219
handle_bug+0x3c/0x70 arch/x86/kernel/traps.c:324
exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:345
asm_exc_invalid_op+0x16/0x20 arch/x86/include/asm/idtentry.h:568
RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229
Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48
c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b
eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00
RSP: 0018:ffffc90002927178 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000
RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00
RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a
R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4
R13: 0000000000290000 R14: ffffea00003ae340 R15: ffffea00003ae340
follow_page_pte+0x18c/0x1610 mm/gup.c:651
follow_pmd_mask mm/gup.c:727 [inline]
follow_pud_mask mm/gup.c:765 [inline]
follow_p4d_mask mm/gup.c:782 [inline]
follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
__get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
__get_user_pages_locked mm/gup.c:1487 [inline]
__gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181
internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179
pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285
iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline]
iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831
extract_user_to_sg lib/scatterlist.c:1123 [inline]
extract_iter_to_sg lib/scatterlist.c:1349 [inline]
extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339
hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
sock_sendmsg_nosec net/socket.c:725 [inline]
sock_sendmsg+0xcf/0x170 net/socket.c:748
____sys_sendmsg+0x676/0x860 net/socket.c:2494
___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
__sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fbd79539f29
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004
RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0
R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 1 seconds..
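A note on reading the register dump above: the check at mm/gup.c:229 returns -ENOMEM when it fires, and R12 in the trace above holds 00000000fffffff4. Read as a signed 32-bit value, that is -12, which matches ENOMEM on Linux. The sketch below shows the decoding. It assumes R12 carries the pending return value at the trap (the "41 bc f4 ff ff ff" bytes in the Code: line are consistent with that), and the helper names are hypothetical, not kernel functions.

```c
#include <errno.h>
#include <stdint.h>

/* Interpret the low 32 bits of a dumped 64-bit register as a signed int. */
static int32_t reg_low32_as_int(uint64_t reg)
{
	uint32_t low = (uint32_t)reg;

	/* Portable two's-complement decode, avoiding implementation-defined casts. */
	return (low & 0x80000000u) ? -(int32_t)(~low) - 1 : (int32_t)low;
}

/*
 * Hypothetical helper: nonzero if the low 32 bits look like a negated errno,
 * using the kernel's convention that error codes lie in [-4095, -1].
 */
static int looks_like_errno(uint64_t reg)
{
	int32_t v = reg_low32_as_int(reg);

	return v < 0 && v >= -4095;
}
```

With the values from the dump, reg_low32_as_int(0xfffffff4) gives -ENOMEM, while R13 (0000000000290000) does not look like an error code.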
---------------------------------------------------------------------------------------------
I think the ioctl() issue you asked about earlier was caused by a separate
crash, WARNING in kvm_arch_vcpu_ioctl_run. Somehow the two crashes were
triggered at the same time, but I cannot figure out why.
After I tried to fix that problem and reran the C reproducer for this
issue, I got the console output above, which differs from the earlier one.
On Thu, Aug 3, 2023 at 21:19, Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Aug 03, 2023 at 04:56:03PM +0800, Yikebaer Aizezi wrote:
> > console output:
> > https://drive.google.com/file/d/1Lq71bFwtEDix82PEf_193CLG6uh1Pjj9/view?usp=drive_link
>
> I dug through this, and what I found troubles me.
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0
> Modules linked in:
> CPU: 0 PID: 13067 Comm: syz-executor Tainted: G B 6.5.0-rc2 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> RIP: 0010:try_grab_page+0x2dd/0x3a0
> Code: ff be 04 00 00 00 4c 89 e7 e8 cf fa 13 00 f0 41 ff 04 24 e8 65 96 cb ff 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d c3 e8 53 96 cb ff <0f> 0b e8 4c 96 cb ff 41 bc f4 ff ff ff 5b 44 89 e0 5d 41 5c 41 5d
> RSP: 0018:ffffc9000c2777e0 EFLAGS: 00010212
> RAX: 0000000000000247 RBX: ffffea00003ae340 RCX: ffffc90002bb1000
> RDX: 0000000000040000 RSI: ffffffff81ad81ed RDI: ffffea00003ae374
> RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffff94000075c6e
> R10: ffffea00003ae377 R11: 0000000000084001 R12: ffffea00003ae374
> R13: 0000000000210002 R14: ffffea00003ae340 R15: 000000000eb8d225
> FS: 00007f5841a13640(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000500310 CR3: 0000000018d0c000 CR4: 0000000000750ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> <TASK>
> ? __warn+0xe2/0x340
> ? try_grab_page+0x2dd/0x3a0
> ? report_bug+0x25d/0x460
> ? handle_bug+0x3c/0x70
> ? exc_invalid_op+0x14/0x40
> ? asm_exc_invalid_op+0x16/0x20
> ? try_grab_page+0x2dd/0x3a0
> ? try_grab_page+0x2dd/0x3a0
> follow_page_pte+0x18c/0x1610
> ? try_grab_page+0x3a0/0x3a0
> ? rcu_is_watching+0xe/0xb0
> follow_page_mask+0x2e4/0xbd0
> __get_user_pages+0x3fa/0xcf0
> ? follow_page_mask+0xbd0/0xbd0
> ? down_read_killable+0x146/0x4f0
> ? down_read_interruptible+0x4f0/0x4f0
> ? rcu_is_watching+0xe/0xb0
> __gup_longterm_locked+0x5fa/0x1ec0
> ? io_schedule_timeout+0x150/0x150
> ? rcu_is_watching+0xe/0xb0
> ? get_user_pages_unlocked+0x580/0x580
> ? lock_release+0x4f7/0x670
> ? internal_get_user_pages_fast+0xe27/0x2690
> ? lock_downgrade+0x690/0x690
> ? preempt_schedule_common+0x45/0xb0
> ? pud_huge+0x9c/0xe0
> ? pmd_huge+0xe0/0xe0
> internal_get_user_pages_fast+0x119b/0x2690
> ? mtree_load+0x1df/0x980
> ? __gup_device_huge+0x530/0x530
> ? rcu_is_watching+0xe/0xb0
> ? lock_release+0x4f7/0x670
> get_user_pages_fast+0x95/0xe0
> ? get_user_pages_fast_only+0xe0/0xe0
> do_get_mempolicy+0x50c/0xd20
> ? sp_delete+0xf0/0xf0
> ? seccomp_notify_ioctl+0xd80/0xd80
> __x64_sys_get_mempolicy+0x187/0x2a0
> ? __ia32_sys_migrate_pages+0xf0/0xf0
> ? __secure_computing+0x1ff/0x360
> do_syscall_64+0x35/0xb0
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x47959d
> Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007f5841a13068 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef
> RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 000000000059c0a0 R08: 0000000000000003 R09: 0000000000000000
> R10: 0000000020ff9000 R11: 0000000000000246 R12: 000000000059c0ac
> R13: 000000000000000b R14: 0000000000437250 R15: 00007f58419f3000
> </TASK>
> Kernel panic - not syncing: kernel: panic_on_warn set ...
>
> > WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0
>
> That's this line:
> if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
> Called from:
> follow_page_pte+0x18c/0x1610
>
> That did:
> ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
> pte = ptep_get(ptep);
> page = vm_normal_page(vma, address, pte);
> ret = try_grab_page(page, flags);
>
> So we grabbed the PTE lock, looked up the PTE, translated that into
> a page ... and found a page with a zero (or negative) refcount.
> That's Really Bad. I think it was a zero refcount because r08 is 0
> and I don't see any other registers which have a plausible negative
> 32-bit number in them.
>
> Yikebaer, could I trouble you to add this:
>
> +++ b/mm/gup.c
> @@ -226,7 +226,7 @@ int __must_check try_grab_page(struct page *page, unsigned int flags)
> {
> struct folio *folio = page_folio(page);
>
> - if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
> + if (VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0, folio))
> return -ENOMEM;
>
> if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page)))
>
> and rerun the syzkaller? That'll give us some more information about
> what has happened, although it won't tell us why it happened.
>
> We might need to catch someone decrementing the refcount to lower than
> the mapcount to catch this ... which will be tricky, given the other
> things we reuse the mapcount for.
* Re: WARNING in try_grab_page
2023-08-04 3:14 ` Yikebaer Aizezi
@ 2023-08-04 3:42 ` Matthew Wilcox
2023-08-04 13:32 ` Matthew Wilcox
2023-08-04 13:35 ` David Howells
1 sibling, 1 reply; 8+ messages in thread
From: Matthew Wilcox @ 2023-08-04 3:42 UTC (permalink / raw)
To: Yikebaer Aizezi; +Cc: akpm, linux-mm, David Howells
On Fri, Aug 04, 2023 at 11:14:45AM +0800, Yikebaer Aizezi wrote:
> I just patched it and reran the reproducer program, and I got this
> output on the console:
>
> BUG: Bad page state in process POC pfn:0eb8d
> page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000
> index:0x0 pfn:0xeb8d
> flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff)
> page_type: 0xffffffff()
> raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000
> raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> page_owner info is not present (never set?)
> Modules linked in:
> CPU: 0 PID: 7959 Comm: POC Not tainted 6.5.0-rc2 #2
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0xd4/0xf0 lib/dump_stack.c:106
> bad_page+0x71/0x1a0 mm/page_alloc.c:533
> free_page_is_bad_report mm/page_alloc.c:974 [inline]
> free_page_is_bad mm/page_alloc.c:984 [inline]
> free_pages_prepare mm/page_alloc.c:1153 [inline]
> free_unref_page_prepare+0x5f3/0xb50 mm/page_alloc.c:2348
> free_unref_page+0x2f/0x3c0 mm/page_alloc.c:2443
> __folio_put_small mm/swap.c:106 [inline]
> __folio_put+0xa2/0x110 mm/swap.c:129
> folio_put include/linux/mm.h:1423 [inline]
> put_page include/linux/mm.h:1492 [inline]
> extract_user_to_sg lib/scatterlist.c:1151 [inline]
Ohh. I think this is something Dave Howells has a patch for.
> extract_iter_to_sg lib/scatterlist.c:1349 [inline]
> extract_iter_to_sg+0x11ec/0x1570 lib/scatterlist.c:1339
> hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
> sock_sendmsg_nosec net/socket.c:725 [inline]
> sock_sendmsg+0xcf/0x170 net/socket.c:748
> ____sys_sendmsg+0x676/0x860 net/socket.c:2494
> ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
> __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fbd79539f29
> Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48
> 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
> RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29
> RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004
> RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0
> R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
>
> page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000
> index:0x0 pfn:0xeb8d
> flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff)
> page_type: 0xffffffff()
> raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000
> raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> page dumped because: VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0)
> page_owner info is not present (never set?)
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 7962 at mm/gup.c:229 try_grab_page+0x307/0x3c0 mm/gup.c:229
> Modules linked in:
> CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229
> Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48
> c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b
> eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00
> RSP: 0018:ffffc90002927178 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000
> RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00
> RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a
> R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4
> R13: 0000000000290000 R14: ffffea00003ae340 R15: ffffea00003ae340
> FS: 00007fbd7961a540(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fbd794d03d0 CR3: 0000000019855000 CR4: 0000000000750ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> <TASK>
> follow_page_pte+0x18c/0x1610 mm/gup.c:651
> follow_pmd_mask mm/gup.c:727 [inline]
> follow_pud_mask mm/gup.c:765 [inline]
> follow_p4d_mask mm/gup.c:782 [inline]
> follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
> __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
> __get_user_pages_locked mm/gup.c:1487 [inline]
> __gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181
> internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179
> pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285
> iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline]
> iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831
> extract_user_to_sg lib/scatterlist.c:1123 [inline]
> extract_iter_to_sg lib/scatterlist.c:1349 [inline]
> extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339
> hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
> sock_sendmsg_nosec net/socket.c:725 [inline]
> sock_sendmsg+0xcf/0x170 net/socket.c:748
> ____sys_sendmsg+0x676/0x860 net/socket.c:2494
> ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
> __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fbd79539f29
> Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48
> 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
> RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29
> RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004
> RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0
> R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
>
> Modules linked in:
> CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229
> Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48
> c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b
> eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00
> RSP: 0018:ffffc90002927178 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000
> RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00
> RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a
> R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4
> R13: 0000000000290000 R14: ffffea00003ae340 R15: ffffea00003ae340
> FS: 00007fbd7961a540(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fbd794d03d0 CR3: 0000000019855000 CR4: 0000000000750ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> <TASK>
> follow_page_pte+0x18c/0x1610 mm/gup.c:651
> follow_pmd_mask mm/gup.c:727 [inline]
> follow_pud_mask mm/gup.c:765 [inline]
> follow_p4d_mask mm/gup.c:782 [inline]
> follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
> __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
> __get_user_pages_locked mm/gup.c:1487 [inline]
> __gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181
> internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179
> pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285
> iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline]
> iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831
> extract_user_to_sg lib/scatterlist.c:1123 [inline]
> extract_iter_to_sg lib/scatterlist.c:1349 [inline]
> extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339
> hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
> sock_sendmsg_nosec net/socket.c:725 [inline]
> sock_sendmsg+0xcf/0x170 net/socket.c:748
> ____sys_sendmsg+0x676/0x860 net/socket.c:2494
> ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
> __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fbd79539f29
> Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48
> 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
> RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29
> RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004
> RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0
> R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
> Kernel panic - not syncing: kernel: panic_on_warn set ...
> CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x92/0xf0 lib/dump_stack.c:106
> panic+0x570/0x620 kernel/panic.c:340
> check_panic_on_warn+0x8e/0x90 kernel/panic.c:236
> __warn+0xee/0x340 kernel/panic.c:673
> __report_bug lib/bug.c:199 [inline]
> report_bug+0x25d/0x460 lib/bug.c:219
> handle_bug+0x3c/0x70 arch/x86/kernel/traps.c:324
> exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:345
> asm_exc_invalid_op+0x16/0x20 arch/x86/include/asm/idtentry.h:568
> RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229
> Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48
> c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b
> eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00
> RSP: 0018:ffffc90002927178 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000
> RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00
> RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a
> R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4
> R13: 0000000000290000 R14: ffffea00003ae340 R15: ffffea00003ae340
> follow_page_pte+0x18c/0x1610 mm/gup.c:651
> follow_pmd_mask mm/gup.c:727 [inline]
> follow_pud_mask mm/gup.c:765 [inline]
> follow_p4d_mask mm/gup.c:782 [inline]
> follow_page_mask+0x2e4/0xbd0 mm/gup.c:839
> __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256
> __get_user_pages_locked mm/gup.c:1487 [inline]
> __gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181
> internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179
> pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285
> iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline]
> iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831
> extract_user_to_sg lib/scatterlist.c:1123 [inline]
> extract_iter_to_sg lib/scatterlist.c:1349 [inline]
> extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339
> hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
> sock_sendmsg_nosec net/socket.c:725 [inline]
> sock_sendmsg+0xcf/0x170 net/socket.c:748
> ____sys_sendmsg+0x676/0x860 net/socket.c:2494
> ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
> __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fbd79539f29
> Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48
> 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
> RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29
> RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004
> RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0
> R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
> Dumping ftrace buffer:
> (ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 1 seconds..
>
> ---------------------------------------------------------------------------------------------
>
> I think the ioctl() issue you asked about earlier was caused by a separate
> crash, WARNING in kvm_arch_vcpu_ioctl_run. Somehow the two crashes were
> triggered at the same time, but I cannot figure out why.
>
> After I tried to fix that problem and reran the C reproducer for this
> issue, I got the console output above, which differs from the earlier one.
>
>
> On Thu, Aug 3, 2023 at 21:19, Matthew Wilcox <willy@infradead.org> wrote:
>
>
> >
> > On Thu, Aug 03, 2023 at 04:56:03PM +0800, Yikebaer Aizezi wrote:
> > > console output:
> > > https://drive.google.com/file/d/1Lq71bFwtEDix82PEf_193CLG6uh1Pjj9/view?usp=drive_link
> >
> > I dug through this, and what I found troubles me.
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0
> > Modules linked in:
> > CPU: 0 PID: 13067 Comm: syz-executor Tainted: G B 6.5.0-rc2 #1
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> > RIP: 0010:try_grab_page+0x2dd/0x3a0
> > Code: ff be 04 00 00 00 4c 89 e7 e8 cf fa 13 00 f0 41 ff 04 24 e8 65 96 cb ff 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d c3 e8 53 96 cb ff <0f> 0b e8 4c 96 cb ff 41 bc f4 ff ff ff 5b 44 89 e0 5d 41 5c 41 5d
> > RSP: 0018:ffffc9000c2777e0 EFLAGS: 00010212
> > RAX: 0000000000000247 RBX: ffffea00003ae340 RCX: ffffc90002bb1000
> > RDX: 0000000000040000 RSI: ffffffff81ad81ed RDI: ffffea00003ae374
> > RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffff94000075c6e
> > R10: ffffea00003ae377 R11: 0000000000084001 R12: ffffea00003ae374
> > R13: 0000000000210002 R14: ffffea00003ae340 R15: 000000000eb8d225
> > FS: 00007f5841a13640(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000000500310 CR3: 0000000018d0c000 CR4: 0000000000750ef0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > PKRU: 55555554
> > Call Trace:
> > <TASK>
> > ? __warn+0xe2/0x340
> > ? try_grab_page+0x2dd/0x3a0
> > ? report_bug+0x25d/0x460
> > ? handle_bug+0x3c/0x70
> > ? exc_invalid_op+0x14/0x40
> > ? asm_exc_invalid_op+0x16/0x20
> > ? try_grab_page+0x2dd/0x3a0
> > ? try_grab_page+0x2dd/0x3a0
> > follow_page_pte+0x18c/0x1610
> > ? try_grab_page+0x3a0/0x3a0
> > ? rcu_is_watching+0xe/0xb0
> > follow_page_mask+0x2e4/0xbd0
> > __get_user_pages+0x3fa/0xcf0
> > ? follow_page_mask+0xbd0/0xbd0
> > ? down_read_killable+0x146/0x4f0
> > ? down_read_interruptible+0x4f0/0x4f0
> > ? rcu_is_watching+0xe/0xb0
> > __gup_longterm_locked+0x5fa/0x1ec0
> > ? io_schedule_timeout+0x150/0x150
> > ? rcu_is_watching+0xe/0xb0
> > ? get_user_pages_unlocked+0x580/0x580
> > ? lock_release+0x4f7/0x670
> > ? internal_get_user_pages_fast+0xe27/0x2690
> > ? lock_downgrade+0x690/0x690
> > ? preempt_schedule_common+0x45/0xb0
> > ? pud_huge+0x9c/0xe0
> > ? pmd_huge+0xe0/0xe0
> > internal_get_user_pages_fast+0x119b/0x2690
> > ? mtree_load+0x1df/0x980
> > ? __gup_device_huge+0x530/0x530
> > ? rcu_is_watching+0xe/0xb0
> > ? lock_release+0x4f7/0x670
> > get_user_pages_fast+0x95/0xe0
> > ? get_user_pages_fast_only+0xe0/0xe0
> > do_get_mempolicy+0x50c/0xd20
> > ? sp_delete+0xf0/0xf0
> > ? seccomp_notify_ioctl+0xd80/0xd80
> > __x64_sys_get_mempolicy+0x187/0x2a0
> > ? __ia32_sys_migrate_pages+0xf0/0xf0
> > ? __secure_computing+0x1ff/0x360
> > do_syscall_64+0x35/0xb0
> > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > RIP: 0033:0x47959d
> > Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48
> > RSP: 002b:00007f5841a13068 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef
> > RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d
> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> > RBP: 000000000059c0a0 R08: 0000000000000003 R09: 0000000000000000
> > R10: 0000000020ff9000 R11: 0000000000000246 R12: 000000000059c0ac
> > R13: 000000000000000b R14: 0000000000437250 R15: 00007f58419f3000
> > </TASK>
> > Kernel panic - not syncing: kernel: panic_on_warn set ...
> >
> > > WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0
> >
> > That's this line:
> > if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
> > Called from:
> > follow_page_pte+0x18c/0x1610
> >
> > That did:
> > ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
> > pte = ptep_get(ptep);
> > page = vm_normal_page(vma, address, pte);
> > ret = try_grab_page(page, flags);
> >
> > So we grabbed the PTE lock, looked up the PTE, translated that into
> > a page ... and found a page with a zero (or negative) refcount.
> > That's Really Bad. I think it was a zero refcount because r08 is 0
> > and I don't see any other registers which have a plausible negative
> > 32-bit number in them.
> >
> > Yikebaer, could I trouble you to add this:
> >
> > +++ b/mm/gup.c
> > @@ -226,7 +226,7 @@ int __must_check try_grab_page(struct page *page, unsigned int flags)
> > {
> > struct folio *folio = page_folio(page);
> >
> > - if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
> > + if (VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0, folio))
> > return -ENOMEM;
> >
> > if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page)))
> >
> > and rerun the syzkaller? That'll give us some more information about
> > what has happened, although it won't tell us why it happened.
> >
> > We might need to catch someone decrementing the refcount to lower than
> > the mapcount to catch this ... which will be tricky, given the other
> > things we reuse the mapcount for.
* Re: WARNING in try_grab_page
2023-08-04 3:42 ` Matthew Wilcox
@ 2023-08-04 13:32 ` Matthew Wilcox
0 siblings, 0 replies; 8+ messages in thread
From: Matthew Wilcox @ 2023-08-04 13:32 UTC (permalink / raw)
To: Yikebaer Aizezi; +Cc: akpm, linux-mm, David Howells
On Fri, Aug 04, 2023 at 04:42:27AM +0100, Matthew Wilcox wrote:
> On Fri, Aug 04, 2023 at 11:14:45AM +0800, Yikebaer Aizezi wrote:
> > I just patched it and reran the reproducer program, and I got this
> > output on the console:
> >
> > BUG: Bad page state in process POC pfn:0eb8d
> > page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000
> > index:0x0 pfn:0xeb8d
> > flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff)
> > page_type: 0xffffffff()
> > raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000
> > raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> > page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> > page_owner info is not present (never set?)
> > Modules linked in:
> > CPU: 0 PID: 7959 Comm: POC Not tainted 6.5.0-rc2 #2
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> > Call Trace:
> > <TASK>
> > __dump_stack lib/dump_stack.c:88 [inline]
> > dump_stack_lvl+0xd4/0xf0 lib/dump_stack.c:106
> > bad_page+0x71/0x1a0 mm/page_alloc.c:533
> > free_page_is_bad_report mm/page_alloc.c:974 [inline]
> > free_page_is_bad mm/page_alloc.c:984 [inline]
> > free_pages_prepare mm/page_alloc.c:1153 [inline]
> > free_unref_page_prepare+0x5f3/0xb50 mm/page_alloc.c:2348
> > free_unref_page+0x2f/0x3c0 mm/page_alloc.c:2443
> > __folio_put_small mm/swap.c:106 [inline]
> > __folio_put+0xa2/0x110 mm/swap.c:129
> > folio_put include/linux/mm.h:1423 [inline]
> > put_page include/linux/mm.h:1492 [inline]
> > extract_user_to_sg lib/scatterlist.c:1151 [inline]
>
> Ohh. I think this is something Dave Howells has a patch for.
Can you try
https://lore.kernel.org/mm-commits/20230726204730.B89D8C433C7@smtp.kernel.org/
?
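For context, the traces above show the pages being obtained through pin_user_pages_fast() and then released through put_page() in extract_user_to_sg(). The toy model below is a minimal userspace sketch of why such a mismatch can leave a page with a zero refcount. It assumes that the mismatch is what the linked patch addresses (the thread does not spell this out) and that the page in question is one for which pinning does not touch the refcount. The struct, the helpers, and the bias value are all illustrative stand-ins, not kernel code.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a page; this is not the kernel's struct page. */
struct toy_page {
	int refcount;
	bool pin_exempt;	/* assumption: pinning skips the refcount for this page */
};

#define TOY_PIN_BIAS 1024	/* illustrative amount added per pin */

static void toy_pin(struct toy_page *p)
{
	if (!p->pin_exempt)
		p->refcount += TOY_PIN_BIAS;
}

static void toy_unpin(struct toy_page *p)
{
	if (!p->pin_exempt)
		p->refcount -= TOY_PIN_BIAS;
}

/* A plain put always drops exactly one reference. */
static void toy_put(struct toy_page *p)
{
	p->refcount -= 1;
}

/* Same shape as the check quoted from mm/gup.c:229. */
static int toy_try_grab(const struct toy_page *p)
{
	return p->refcount <= 0 ? -ENOMEM : 0;
}

/* Pin, then release with the wrong primitive; returns the resulting refcount. */
static int release_mismatched(struct toy_page *p)
{
	toy_pin(p);
	toy_put(p);
	return p->refcount;
}

/* Pin, then release with the matching primitive; returns the resulting refcount. */
static int release_matched(struct toy_page *p)
{
	toy_pin(p);
	toy_unpin(p);
	return p->refcount;
}
```

In this model, a pin-exempt page that starts at refcount 1 ends at 0 after release_mismatched(), so the next toy_try_grab() fails, while release_matched() leaves the refcount at 1.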
* Re: WARNING in try_grab_page
2023-08-04 3:14 ` Yikebaer Aizezi
2023-08-04 3:42 ` Matthew Wilcox
@ 2023-08-04 13:35 ` David Howells
2023-08-06 7:51 ` Yikebaer Aizezi
1 sibling, 1 reply; 8+ messages in thread
From: David Howells @ 2023-08-04 13:35 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: dhowells, Yikebaer Aizezi, akpm, linux-mm, Herbert Xu
Matthew Wilcox <willy@infradead.org> wrote:
> On Fri, Aug 04, 2023 at 11:14:45AM +0800, Yikebaer Aizezi wrote:
> > I just patched it and reran the reproducer program, and I got this
> > output on the console:
> >
> > BUG: Bad page state in process POC pfn:0eb8d
> > page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000
> > index:0x0 pfn:0xeb8d
> > flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff)
> > page_type: 0xffffffff()
> > raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000
> > raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> > page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> > page_owner info is not present (never set?)
> > Modules linked in:
> > CPU: 0 PID: 7959 Comm: POC Not tainted 6.5.0-rc2 #2
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> > Call Trace:
> > <TASK>
> > __dump_stack lib/dump_stack.c:88 [inline]
> > dump_stack_lvl+0xd4/0xf0 lib/dump_stack.c:106
> > bad_page+0x71/0x1a0 mm/page_alloc.c:533
> > free_page_is_bad_report mm/page_alloc.c:974 [inline]
> > free_page_is_bad mm/page_alloc.c:984 [inline]
> > free_pages_prepare mm/page_alloc.c:1153 [inline]
> > free_unref_page_prepare+0x5f3/0xb50 mm/page_alloc.c:2348
> > free_unref_page+0x2f/0x3c0 mm/page_alloc.c:2443
> > __folio_put_small mm/swap.c:106 [inline]
> > __folio_put+0xa2/0x110 mm/swap.c:129
> > folio_put include/linux/mm.h:1423 [inline]
> > put_page include/linux/mm.h:1492 [inline]
> > extract_user_to_sg lib/scatterlist.c:1151 [inline]
>
> Ohh. I think this is something Dave Howells has a patch for.
>
> > extract_iter_to_sg lib/scatterlist.c:1349 [inline]
> > extract_iter_to_sg+0x11ec/0x1570 lib/scatterlist.c:1339
> > hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
> > sock_sendmsg_nosec net/socket.c:725 [inline]
> > sock_sendmsg+0xcf/0x170 net/socket.c:748
> > ____sys_sendmsg+0x676/0x860 net/socket.c:2494
> > ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
> > __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
> > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> > entry_SYSCALL_64_after_hwframe+0x63/0xcd
This might be the fix you're looking for.
https://lore.kernel.org/linux-crypto/20571.1690369076@warthog.procyon.org.uk/
Andrew has it in mm-hotfixes-unstable.
David
* Re: WARNING in try_grab_page
2023-08-04 13:35 ` David Howells
@ 2023-08-06 7:51 ` Yikebaer Aizezi
0 siblings, 0 replies; 8+ messages in thread
From: Yikebaer Aizezi @ 2023-08-06 7:51 UTC (permalink / raw)
To: David Howells, linux-mm, Matthew Wilcox, akpm; +Cc: Herbert Xu
I just tried this patch; it worked, and the bug was no longer triggered.
On Fri, Aug 4, 2023 at 21:35, David Howells <dhowells@redhat.com> wrote:
>
> Matthew Wilcox <willy@infradead.org> wrote:
>
> > On Fri, Aug 04, 2023 at 11:14:45AM +0800, Yikebaer Aizezi wrote:
> > > I just patched it and reran the reproducer program, and I got this
> > > output on the console:
> > >
> > > BUG: Bad page state in process POC pfn:0eb8d
> > > page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000
> > > index:0x0 pfn:0xeb8d
> > > flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff)
> > > page_type: 0xffffffff()
> > > raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000
> > > raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> > > page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> > > page_owner info is not present (never set?)
> > > Modules linked in:
> > > CPU: 0 PID: 7959 Comm: POC Not tainted 6.5.0-rc2 #2
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> > > Call Trace:
> > > <TASK>
> > > __dump_stack lib/dump_stack.c:88 [inline]
> > > dump_stack_lvl+0xd4/0xf0 lib/dump_stack.c:106
> > > bad_page+0x71/0x1a0 mm/page_alloc.c:533
> > > free_page_is_bad_report mm/page_alloc.c:974 [inline]
> > > free_page_is_bad mm/page_alloc.c:984 [inline]
> > > free_pages_prepare mm/page_alloc.c:1153 [inline]
> > > free_unref_page_prepare+0x5f3/0xb50 mm/page_alloc.c:2348
> > > free_unref_page+0x2f/0x3c0 mm/page_alloc.c:2443
> > > __folio_put_small mm/swap.c:106 [inline]
> > > __folio_put+0xa2/0x110 mm/swap.c:129
> > > folio_put include/linux/mm.h:1423 [inline]
> > > put_page include/linux/mm.h:1492 [inline]
> > > extract_user_to_sg lib/scatterlist.c:1151 [inline]
> >
> > Ohh. I think this is something Dave Howells has a patch for.
> >
> > > extract_iter_to_sg lib/scatterlist.c:1349 [inline]
> > > extract_iter_to_sg+0x11ec/0x1570 lib/scatterlist.c:1339
> > > hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
> > > sock_sendmsg_nosec net/socket.c:725 [inline]
> > > sock_sendmsg+0xcf/0x170 net/socket.c:748
> > > ____sys_sendmsg+0x676/0x860 net/socket.c:2494
> > > ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
> > > __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
> > > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> > > entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> This might be the fix you're looking for.
>
> https://lore.kernel.org/linux-crypto/20571.1690369076@warthog.procyon.org.uk/
>
> Andrew has it in mm-hotfixes-unstable.
>
> David
>