Re: [syzbot] [mm?] KCSAN: data-race in __anon_vma_prepare / __vmf_anon_prepare

From: Dmitry Vyukov <dvyukov@google.com>
To: syzbot <syzbot+f5d897f5194d92aa1769@syzkaller.appspotmail.com>
Cc: Liam.Howlett@oracle.com, akpm@linux-foundation.org,
	david@kernel.org, harry.yoo@oracle.com, jannh@google.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	lorenzo.stoakes@oracle.com, riel@surriel.com,
	syzkaller-bugs@googlegroups.com, vbabka@suse.cz
Subject: Re: [syzbot] [mm?] KCSAN: data-race in __anon_vma_prepare / __vmf_anon_prepare
Date: Wed, 14 Jan 2026 17:42:47 +0100
Message-ID: <CACT4Y+aFaijS_CvzTnHB+ecg5nzYW1-MWPSp-Ad_0ax85=DvCQ@mail.gmail.com>
In-Reply-To: <6967c517.050a0220.150504.0007.GAE@google.com>

On Wed, 14 Jan 2026 at 17:32, syzbot
<syzbot+f5d897f5194d92aa1769@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    cfd4039213e7 Merge tag 'io_uring-6.19-20251208' of git://g..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1554d992580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c3201432211be40f
> dashboard link: https://syzkaller.appspot.com/bug?extid=f5d897f5194d92aa1769
> compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/9f556ae6e3c4/disk-cfd40392.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/efcf53c1d459/vmlinux-cfd40392.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/858f42961336/bzImage-cfd40392.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+f5d897f5194d92aa1769@syzkaller.appspotmail.com
>
> ==================================================================
> BUG: KCSAN: data-race in __anon_vma_prepare / __vmf_anon_prepare
>
> write to 0xffff88811c751e80 of 8 bytes by task 13471 on cpu 1:
>  __anon_vma_prepare+0x172/0x2f0 mm/rmap.c:212
>  __vmf_anon_prepare+0x91/0x100 mm/memory.c:3673
>  hugetlb_no_page+0x1c4/0x10d0 mm/hugetlb.c:5782
>  hugetlb_fault+0x4cf/0xce0 mm/hugetlb.c:-1
>  handle_mm_fault+0x1894/0x2c60 mm/memory.c:6578
>  do_user_addr_fault+0x3fe/0x1080 arch/x86/mm/fault.c:1387
>  handle_page_fault arch/x86/mm/fault.c:1476 [inline]
>  exc_page_fault+0x62/0xa0 arch/x86/mm/fault.c:1532
>  asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:618
>  fault_in_readable+0xad/0x170 mm/gup.c:-1
>  fault_in_iov_iter_readable+0x129/0x210 lib/iov_iter.c:106
>  generic_perform_write+0x3cf/0x490 mm/filemap.c:4363
>  shmem_file_write_iter+0xc5/0xf0 mm/shmem.c:3490
>  new_sync_write fs/read_write.c:593 [inline]
>  vfs_write+0x52a/0x960 fs/read_write.c:686
>  ksys_pwrite64 fs/read_write.c:793 [inline]
>  __do_sys_pwrite64 fs/read_write.c:801 [inline]
>  __se_sys_pwrite64 fs/read_write.c:798 [inline]
>  __x64_sys_pwrite64+0xfd/0x150 fs/read_write.c:798
>  x64_sys_call+0x9f7/0x3000 arch/x86/include/generated/asm/syscalls_64.h:19
>  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>  do_syscall_64+0xd8/0x2a0 arch/x86/entry/syscall_64.c:94
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> read to 0xffff88811c751e80 of 8 bytes by task 13473 on cpu 0:
>  __vmf_anon_prepare+0x26/0x100 mm/memory.c:3667
>  hugetlb_no_page+0x1c4/0x10d0 mm/hugetlb.c:5782
>  hugetlb_fault+0x4cf/0xce0 mm/hugetlb.c:-1
>  handle_mm_fault+0x1894/0x2c60 mm/memory.c:6578
>  faultin_page mm/gup.c:1126 [inline]
>  __get_user_pages+0x1024/0x1ed0 mm/gup.c:1428
>  populate_vma_page_range mm/gup.c:1860 [inline]
>  __mm_populate+0x243/0x3a0 mm/gup.c:1963
>  mm_populate include/linux/mm.h:3701 [inline]
>  vm_mmap_pgoff+0x232/0x2e0 mm/util.c:586
>  ksys_mmap_pgoff+0x268/0x310 mm/mmap.c:604
>  x64_sys_call+0x16bb/0x3000 arch/x86/include/generated/asm/syscalls_64.h:10
>  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>  do_syscall_64+0xd8/0x2a0 arch/x86/entry/syscall_64.c:94
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> value changed: 0x0000000000000000 -> 0xffff888104ecca28
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 0 UID: 0 PID: 13473 Comm: syz.2.3219 Tainted: G        W           syzkaller #0 PREEMPT(voluntary)
> Tainted: [W]=WARN
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
> ==================================================================

Hi Harry,

I see you've been debugging:
KASAN: slab-use-after-free Read in folio_remove_rmap_ptes
https://lore.kernel.org/all/694e3dc6.050a0220.35954c.0066.GAE@google.com/T/

Could that bug be caused by this data race?
Below is an explanation from the Gemini LLM of why it considers this
race harmful. Obviously take it with a grain of salt, but with my limited
mm knowledge it does not look immediately wrong (re the rmap invariant).

However, digging into the details, I now see that the following patch
from Lorenzo is also marked as fixing "KASAN: slab-use-after-free Read in
folio_remove_rmap_ptes":

mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge
https://lore.kernel.org/all/b7930ad2b1503a657e29fe928eb33061d7eadf5b.1767638272.git.lorenzo.stoakes@oracle.com/T/

So perhaps the race is benign after all (or points to another issue?).

Here is what the LLM said about the race:
-----

The bug report is actionable and points to a harmful data race in the Linux
kernel's memory management subsystem, specifically in the handling of
anonymous `hugetlb` mappings.

**Analysis:**

1.  **Race Location:** The data race occurs on the `vma->anon_vma` field
    of a `struct vm_area_struct`.
    *   **Writer:** Task 13471 executes `__anon_vma_prepare` in `mm/rmap.c`.
        This function initializes the `anon_vma` for a VMA. It holds
        `mm->page_table_lock` and writes to `vma->anon_vma` (line 211 in the
        viewed source, corresponding to the report's `mm/rmap.c:212` area).
    *   **Reader:** Task 13473 executes `__vmf_anon_prepare` in `mm/memory.c`.
        This function is an optimization wrapper that checks if
        `vma->anon_vma` is already set (line 3666/3667) to avoid the overhead
        of `__anon_vma_prepare`. This check is performed **without** holding
        `mm->page_table_lock`.

2.  **Consistency:** The report is consistent. Both tasks are handling
    `hugetlb` page faults (`hugetlb_no_page` -> `__vmf_anon_prepare`).
    The writer stack shows it proceeded into `__anon_vma_prepare` (implying
    `vma->anon_vma` was NULL initially), while the reader stack shows it
    reading `vma->anon_vma`. The value change `0x0000000000000000 ->
    0xffff888104ecca28` confirms initialization from NULL to a pointer.

3.  **Harmfulness (Why it is not benign):**
    *   In `__anon_vma_prepare`, the code currently initializes
        `vma->anon_vma` **before** linking the VMA to the `anon_vma`
        structure via `anon_vma_chain_link`.
    *   ```c
        vma->anon_vma = anon_vma;
        anon_vma_chain_link(vma, avc, anon_vma);
        ```
    *   Because the reader (`__vmf_anon_prepare`) checks `vma->anon_vma`
        locklessly, it can see the non-NULL value before
        `anon_vma_chain_link` has completed (due to compiler/CPU
        reordering or simple preemption between the two statements).
    *   If the reader proceeds, it assumes the `anon_vma` is fully ready.
        It then maps a page and sets `folio->mapping = anon_vma`.
    *   However, if `anon_vma_chain_link` hasn't finished, the `anon_vma`
        (specifically its interval tree) does not yet contain the entry
        for this `vma`.
    *   This breaks the reverse mapping (rmap) invariant. If the kernel
        subsequently tries to unmap or migrate this page (finding it via
        `folio->mapping`), `rmap_walk` will fail to find the VMA in the
        `anon_vma`'s interval tree. This can lead to pages being
        effectively pinned, migration failures, or in worst-case
        scenarios (like memory corruption handling or specific reclaim
        paths), logical errors where a page is assumed unmapped when it
        is not.

4.  **Fix:** The fix requires enforcing ordering. `vma->anon_vma` should
    be set **after** `anon_vma_chain_link` is complete, and
    `smp_store_release` / `smp_load_acquire` (or equivalent barriers)
    should be used to ensure the reader observes the fully initialized
    state.
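
For illustration only, the "link first, then publish" ordering described in
the Fix item can be sketched in userspace with C11 atomics. This is not
kernel code and not a proposed patch: `toy_vma`, `toy_anon_vma`,
`toy_prepare` and `toy_is_prepared` are hypothetical stand-ins, the counter
merely plays the role of the interval tree, and the release store / acquire
load are userspace analogues of `smp_store_release` / `smp_load_acquire`.
The sketch also assumes a single writer, whereas the real writer serializes
on `mm->page_table_lock`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; NOT the real struct anon_vma layout. */
struct toy_anon_vma {
	int nr_linked;	/* plays the role of the interval tree contents */
};

/* Hypothetical stand-in; NOT the real struct vm_area_struct layout. */
struct toy_vma {
	_Atomic(struct toy_anon_vma *) anon_vma;
};

/*
 * Writer side (single writer assumed): do the "link" step first, then
 * publish the pointer with release semantics so the link is visible to
 * any reader that observes the non-NULL pointer.
 */
static void toy_prepare(struct toy_vma *vma, struct toy_anon_vma *av)
{
	av->nr_linked++;	/* analogue of anon_vma_chain_link() */
	atomic_store_explicit(&vma->anon_vma, av, memory_order_release);
}

/*
 * Reader side: lockless fast-path check. The acquire load pairs with the
 * release store above, so a non-NULL result implies the link is visible.
 */
static bool toy_is_prepared(struct toy_vma *vma)
{
	struct toy_anon_vma *av =
		atomic_load_explicit(&vma->anon_vma, memory_order_acquire);

	if (!av)
		return false;	/* caller would take the slow path */
	assert(av->nr_linked > 0);
	return true;
}
```

A caller would initialize the `anon_vma` field to NULL with `atomic_init()`,
run `toy_prepare()` on one thread and `toy_is_prepared()` on others. Whether
the kernel path actually needs this ordering is the open question in this
thread; the sketch only shows what the suggested pattern looks like.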
