* [syzbot] [mm?] KCSAN: data-race in mas_wr_store_entry / mtree_range_walk @ 2024-09-23 9:04 syzbot 2024-09-23 9:44 ` Lorenzo Stoakes 0 siblings, 1 reply; 5+ messages in thread From: syzbot @ 2024-09-23 9:04 UTC (permalink / raw) To: Liam.Howlett, akpm, linux-kernel, linux-mm, lorenzo.stoakes, syzkaller-bugs, vbabka Hello, syzbot found the following issue on: HEAD commit: 88264981f208 Merge tag 'sched_ext-for-6.12' of git://git.k.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1237ec27980000 kernel config: https://syzkaller.appspot.com/x/.config?x=e6702f5f2b8ed242 dashboard link: https://syzkaller.appspot.com/bug?extid=e01fa33e67abb0b3b3bb compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 Unfortunately, I don't have any reproducer for this issue yet. Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/95bba355b2ed/disk-88264981.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/75966f4e5286/vmlinux-88264981.xz kernel image: https://storage.googleapis.com/syzbot-assets/7f1578876250/bzImage-88264981.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+e01fa33e67abb0b3b3bb@syzkaller.appspotmail.com ================================================================== BUG: KCSAN: data-race in mas_wr_store_entry / mtree_range_walk write to 0xffff888114555710 of 8 bytes by task 9573 on cpu 1: mas_wr_slot_store lib/maple_tree.c:3889 [inline] mas_wr_store_entry+0x146b/0x2d00 lib/maple_tree.c:4075 mas_store_prealloc+0x6bf/0x960 lib/maple_tree.c:5520 vma_iter_store mm/vma.h:470 [inline] commit_merge+0x441/0x740 mm/vma.c:609 vma_expand+0x211/0x360 mm/vma.c:1024 vma_merge_new_range+0x2cf/0x3e0 mm/vma.c:963 mmap_region+0x887/0x16e0 mm/mmap.c:1416 do_mmap+0x718/0xb60 mm/mmap.c:496 vm_mmap_pgoff+0x133/0x290 mm/util.c:588 ksys_mmap_pgoff+0xd0/0x330 mm/mmap.c:542 x64_sys_call+0x1884/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:10 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f read to 0xffff888114555710 of 8 bytes by task 9574 on cpu 0: mtree_range_walk+0x1b4/0x460 lib/maple_tree.c:2779 mas_state_walk lib/maple_tree.c:3601 [inline] mas_walk+0x16e/0x320 lib/maple_tree.c:4948 lock_vma_under_rcu+0x95/0x260 mm/memory.c:6224 do_user_addr_fault arch/x86/mm/fault.c:1329 [inline] handle_page_fault arch/x86/mm/fault.c:1481 [inline] exc_page_fault+0x150/0x650 arch/x86/mm/fault.c:1539 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623 value changed: 0x00007f8311576fff -> 0xffffffff8529a680 Reported by Kernel Concurrency Sanitizer on: CPU: 0 UID: 0 PID: 9574 Comm: syz.0.2084 Tainted: G W 6.11.0-syzkaller-08481-g88264981f208 #0 Tainted: [W]=WARN Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 ================================================================== --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. If the report is already addressed, let syzbot know by replying with: #syz fix: exact-commit-title If you want to overwrite report's subsystems, reply with: #syz set subsystems: new-subsystem (See the list of subsystem names on the web dashboard) If the report is a duplicate of another one, reply with: #syz dup: exact-subject-of-another-report If you want to undo deduplication, reply with: #syz undup ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [syzbot] [mm?] KCSAN: data-race in mas_wr_store_entry / mtree_range_walk 2024-09-23 9:04 [syzbot] [mm?] KCSAN: data-race in mas_wr_store_entry / mtree_range_walk syzbot @ 2024-09-23 9:44 ` Lorenzo Stoakes 2024-09-23 11:57 ` Marco Elver 2024-09-23 13:50 ` Lorenzo Stoakes 0 siblings, 2 replies; 5+ messages in thread From: Lorenzo Stoakes @ 2024-09-23 9:44 UTC (permalink / raw) To: syzbot; +Cc: Liam.Howlett, akpm, linux-kernel, linux-mm, syzkaller-bugs, vbabka On Mon, Sep 23, 2024 at 02:04:23AM GMT, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit: 88264981f208 Merge tag 'sched_ext-for-6.12' of git://git.k.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=1237ec27980000 > kernel config: https://syzkaller.appspot.com/x/.config?x=e6702f5f2b8ed242 > dashboard link: https://syzkaller.appspot.com/bug?extid=e01fa33e67abb0b3b3bb > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > Thanks for the report, investigating. > Unfortunately, I don't have any reproducer for this issue yet. I suspect given this is so timing-specific, a reproducer might be difficult. > > Downloadable assets: > disk image: https://storage.googleapis.com/syzbot-assets/95bba355b2ed/disk-88264981.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/75966f4e5286/vmlinux-88264981.xz > kernel image: https://storage.googleapis.com/syzbot-assets/7f1578876250/bzImage-88264981.xz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+e01fa33e67abb0b3b3bb@syzkaller.appspotmail.com > > ================================================================== > BUG: KCSAN: data-race in mas_wr_store_entry / mtree_range_walk > > write to 0xffff888114555710 of 8 bytes by task 9573 on cpu 1: > mas_wr_slot_store lib/maple_tree.c:3889 [inline] > mas_wr_store_entry+0x146b/0x2d00 lib/maple_tree.c:4075 > mas_store_prealloc+0x6bf/0x960 lib/maple_tree.c:5520 > vma_iter_store mm/vma.h:470 [inline] > commit_merge+0x441/0x740 mm/vma.c:609 > vma_expand+0x211/0x360 mm/vma.c:1024 > vma_merge_new_range+0x2cf/0x3e0 mm/vma.c:963 > mmap_region+0x887/0x16e0 mm/mmap.c:1416 > do_mmap+0x718/0xb60 mm/mmap.c:496 > vm_mmap_pgoff+0x133/0x290 mm/util.c:588 > ksys_mmap_pgoff+0xd0/0x330 mm/mmap.c:542 > x64_sys_call+0x1884/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:10 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x77/0x7f > > read to 0xffff888114555710 of 8 bytes by task 9574 on cpu 0: > mtree_range_walk+0x1b4/0x460 lib/maple_tree.c:2779 > mas_state_walk lib/maple_tree.c:3601 [inline] > mas_walk+0x16e/0x320 lib/maple_tree.c:4948 > lock_vma_under_rcu+0x95/0x260 mm/memory.c:6224 > do_user_addr_fault arch/x86/mm/fault.c:1329 [inline] > handle_page_fault arch/x86/mm/fault.c:1481 [inline] > exc_page_fault+0x150/0x650 arch/x86/mm/fault.c:1539 > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623 > > value changed: 0x00007f8311576fff -> 0xffffffff8529a680 This suggests we are failing to acquire an RCU lock on mmap() (though we have the write mmap lock). Maybe we missed an RCU lock at some point, but I'm a little baffled as to what could have changed in recent series to adjust this. I will dig into this and see what's going on. > > Reported by Kernel Concurrency Sanitizer on: > CPU: 0 UID: 0 PID: 9574 Comm: syz.0.2084 Tainted: G W 6.11.0-syzkaller-08481-g88264981f208 #0 > Tainted: [W]=WARN > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 > ================================================================== > > > --- > This report is generated by a bot. It may contain errors. > See https://goo.gl/tpsmEJ for more information about syzbot. > syzbot engineers can be reached at syzkaller@googlegroups.com. > > syzbot will keep track of this issue. See: > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. > > If the report is already addressed, let syzbot know by replying with: > #syz fix: exact-commit-title > > If you want to overwrite report's subsystems, reply with: > #syz set subsystems: new-subsystem > (See the list of subsystem names on the web dashboard) > > If the report is a duplicate of another one, reply with: > #syz dup: exact-subject-of-another-report > > If you want to undo deduplication, reply with: > #syz undup ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [syzbot] [mm?] KCSAN: data-race in mas_wr_store_entry / mtree_range_walk 2024-09-23 9:44 ` Lorenzo Stoakes @ 2024-09-23 11:57 ` Marco Elver 2024-09-23 13:59 ` Lorenzo Stoakes 2024-09-23 13:50 ` Lorenzo Stoakes 1 sibling, 1 reply; 5+ messages in thread From: Marco Elver @ 2024-09-23 11:57 UTC (permalink / raw) To: Lorenzo Stoakes Cc: syzbot, Liam.Howlett, akpm, linux-kernel, linux-mm, syzkaller-bugs, vbabka On Mon, 23 Sept 2024 at 11:44, 'Lorenzo Stoakes' via syzkaller-bugs <syzkaller-bugs@googlegroups.com> wrote: > > On Mon, Sep 23, 2024 at 02:04:23AM GMT, syzbot wrote: > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit: 88264981f208 Merge tag 'sched_ext-for-6.12' of git://git.k.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=1237ec27980000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=e6702f5f2b8ed242 > > dashboard link: https://syzkaller.appspot.com/bug?extid=e01fa33e67abb0b3b3bb > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > > > Thanks for the report, investigating. > > > Unfortunately, I don't have any reproducer for this issue yet. > > I suspect given this is so timing-specific, a reproducer might be difficult. FWIW, syzbot currently does not try to find reproducers for KCSAN, because it's too non-deterministic. The best strategy would be for a developer who has an intuition for what might be going on to create a stress-test to reproduce. > This suggests we are failing to acquire an RCU lock on mmap() (though we > have the write mmap lock). > > Maybe we missed an RCU lock at some point, but I'm a little baffled as to > what could have changed in recent series to adjust this. > > I will dig into this and see what's going on. Thanks! ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [syzbot] [mm?] KCSAN: data-race in mas_wr_store_entry / mtree_range_walk 2024-09-23 11:57 ` Marco Elver @ 2024-09-23 13:59 ` Lorenzo Stoakes 0 siblings, 0 replies; 5+ messages in thread From: Lorenzo Stoakes @ 2024-09-23 13:59 UTC (permalink / raw) To: Marco Elver Cc: syzbot, Liam.Howlett, akpm, linux-kernel, linux-mm, syzkaller-bugs, vbabka On Mon, Sep 23, 2024 at 01:57:55PM GMT, Marco Elver wrote: > On Mon, 23 Sept 2024 at 11:44, 'Lorenzo Stoakes' via syzkaller-bugs > <syzkaller-bugs@googlegroups.com> wrote: > > > > On Mon, Sep 23, 2024 at 02:04:23AM GMT, syzbot wrote: > > > Hello, > > > > > > syzbot found the following issue on: > > > > > > HEAD commit: 88264981f208 Merge tag 'sched_ext-for-6.12' of git://git.k.. > > > git tree: upstream > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1237ec27980000 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=e6702f5f2b8ed242 > > > dashboard link: https://syzkaller.appspot.com/bug?extid=e01fa33e67abb0b3b3bb > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > > > > > > Thanks for the report, investigating. > > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > I suspect given this is so timing-specific, a reproducer might be difficult. > > FWIW, syzbot currently does not try to find reproducers for KCSAN, > because it's too non-deterministic. Thanks, yes makes sense. > > The best strategy would be for a developer who has an intuition for > what might be going on to create a stress-test to reproduce. As per analysis in thread, I don't think this is really an issue, as one of the racing threads is code explicitly designed to accept and deal with races. We do obviously probably need an annotation, but unless it becomes urgent, will defer this until Liam's return to check and assess. Obviously if we see more reports I can dive back in! > > > This suggests we are failing to acquire an RCU lock on mmap() (though we > > have the write mmap lock). > > > > Maybe we missed an RCU lock at some point, but I'm a little baffled as to > > what could have changed in recent series to adjust this. > > > > I will dig into this and see what's going on. > > Thanks! ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [syzbot] [mm?] KCSAN: data-race in mas_wr_store_entry / mtree_range_walk 2024-09-23 9:44 ` Lorenzo Stoakes 2024-09-23 11:57 ` Marco Elver @ 2024-09-23 13:50 ` Lorenzo Stoakes 1 sibling, 0 replies; 5+ messages in thread From: Lorenzo Stoakes @ 2024-09-23 13:50 UTC (permalink / raw) To: syzbot Cc: Liam.Howlett, akpm, linux-kernel, linux-mm, syzkaller-bugs, vbabka, Suren Baghdasaryan + Suren for VMA lock aspects. On Mon, Sep 23, 2024 at 10:44:34AM GMT, Lorenzo Stoakes wrote: > On Mon, Sep 23, 2024 at 02:04:23AM GMT, syzbot wrote: > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit: 88264981f208 Merge tag 'sched_ext-for-6.12' of git://git.k.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=1237ec27980000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=e6702f5f2b8ed242 > > dashboard link: https://syzkaller.appspot.com/bug?extid=e01fa33e67abb0b3b3bb > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > > > Thanks for the report, investigating. TL;DR: I think there's a race, but it's in code that detects and handles races, we probably need to add a data_race() or similar annotation. I'll leave it to Liam (who is currently on leave) to address annotating this if appropriate on his return, unless this becomes more urgent in the meantime. > > > Unfortunately, I don't have any reproducer for this issue yet. > > I suspect given this is so timing-specific, a reproducer might be difficult. > > > > > Downloadable assets: > > disk image: https://storage.googleapis.com/syzbot-assets/95bba355b2ed/disk-88264981.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/75966f4e5286/vmlinux-88264981.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/7f1578876250/bzImage-88264981.xz > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+e01fa33e67abb0b3b3bb@syzkaller.appspotmail.com > > > > ================================================================== > > BUG: KCSAN: data-race in mas_wr_store_entry / mtree_range_walk > > > > write to 0xffff888114555710 of 8 bytes by task 9573 on cpu 1: This is: wr_mas->pivots[offset] = mas->index - 1; > > mas_wr_slot_store lib/maple_tree.c:3889 [inline] > > mas_wr_store_entry+0x146b/0x2d00 lib/maple_tree.c:4075 > > mas_store_prealloc+0x6bf/0x960 lib/maple_tree.c:5520 > > vma_iter_store mm/vma.h:470 [inline] > > commit_merge+0x441/0x740 mm/vma.c:609 > > vma_expand+0x211/0x360 mm/vma.c:1024 > > vma_merge_new_range+0x2cf/0x3e0 mm/vma.c:963 > > mmap_region+0x887/0x16e0 mm/mmap.c:1416 > > do_mmap+0x718/0xb60 mm/mmap.c:496 > > vm_mmap_pgoff+0x133/0x290 mm/util.c:588 > > ksys_mmap_pgoff+0xd0/0x330 mm/mmap.c:542 > > x64_sys_call+0x1884/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:10 > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83 > > entry_SYSCALL_64_after_hwframe+0x77/0x7f > > > > read to 0xffff888114555710 of 8 bytes by task 9574 on cpu 0: > > mtree_range_walk+0x1b4/0x460 lib/maple_tree.c:2779 This is: if (pivots[offset] >= mas->index) { So we appear to be racing on this. But... I don't think there's an actual race issue here, just maybe a need for a data_race() wrapper or similar. The lock_vma_under_rcu() call is an optimistic attempt to obtain a VMA, and will try again if it is unable to get a stable reference to a VMA. On vma_start_write() which the above stack writing to the maple tree invokes prior to writing the maple tree, we lock the VMA for writing, and set the vma->vm_lock_seq value equal to vma->vm_mm->mm_lock_seq, before unlocking. So we know that vma->vm_lock_seq == vma->vm_mm->mm_lock_seq right up until vma_end_write_all() is called after the write is done (via mmap_write_unlock()). Then, in lock_vma_under_rcu(), vma_start_read() is called which bails out if vma->vm_lock_seq == vma->vm_mm->mm_lock_seq, which it likely would be. If you were _very_ unlucky with timing you might still get the lock on the modified VMA, however if it was detached by then, it would have been marked detached and lock_vma_under_rcu() will bail out, finally if we got the VMA and it changed (e.g. expand above), we explicitly check that the address is still in range - if it is, we're perfectly good to use it. If lock_vma_under_rcu() bails it resorts to the slow path, no harm no foul. I don't know why this was triggered with recent changes, maybe a timing thing, but I don't _think_ we have an issue here. No part of this code path really should be different here. If we have further reports in this area or any indication of a deeper issue I'll dive back in. > > mas_state_walk lib/maple_tree.c:3601 [inline] > > mas_walk+0x16e/0x320 lib/maple_tree.c:4948 > > lock_vma_under_rcu+0x95/0x260 mm/memory.c:6224 > > do_user_addr_fault arch/x86/mm/fault.c:1329 [inline] > > handle_page_fault arch/x86/mm/fault.c:1481 [inline] > > exc_page_fault+0x150/0x650 arch/x86/mm/fault.c:1539 > > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623 > > > > value changed: 0x00007f8311576fff -> 0xffffffff8529a680 > > This suggests we are failing to acquire an RCU lock on mmap() (though we > have the write mmap lock). > > Maybe we missed an RCU lock at some point, but I'm a little baffled as to > what could have changed in recent series to adjust this. > > I will dig into this and see what's going on. > > > > > Reported by Kernel Concurrency Sanitizer on: > > CPU: 0 UID: 0 PID: 9574 Comm: syz.0.2084 Tainted: G W 6.11.0-syzkaller-08481-g88264981f208 #0 > > Tainted: [W]=WARN > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 > > ================================================================== > > > > > > --- > > This report is generated by a bot. It may contain errors. > > See https://goo.gl/tpsmEJ for more information about syzbot. > > syzbot engineers can be reached at syzkaller@googlegroups.com. > > > > syzbot will keep track of this issue. See: > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. > > > > If the report is already addressed, let syzbot know by replying with: > > #syz fix: exact-commit-title > > > > If you want to overwrite report's subsystems, reply with: > > #syz set subsystems: new-subsystem > > (See the list of subsystem names on the web dashboard) > > > > If the report is a duplicate of another one, reply with: > > #syz dup: exact-subject-of-another-report > > > > If you want to undo deduplication, reply with: > > #syz undup ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-09-23 13:59 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2024-09-23 9:04 [syzbot] [mm?] KCSAN: data-race in mas_wr_store_entry / mtree_range_walk syzbot 2024-09-23 9:44 ` Lorenzo Stoakes 2024-09-23 11:57 ` Marco Elver 2024-09-23 13:59 ` Lorenzo Stoakes 2024-09-23 13:50 ` Lorenzo Stoakes
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox