linux-mm.kvack.org archive mirror
From: Marco Elver <elver@google.com>
To: Jann Horn <jannh@google.com>
Cc: syzkaller <syzkaller@googlegroups.com>,
	 syzbot <syzbot+189d4742d07e937d68ea@syzkaller.appspotmail.com>,
	 akpm@linux-foundation.org, baolin.wang@linux.alibaba.com,
	hughd@google.com,  linux-kernel@vger.kernel.org,
	linux-mm@kvack.org,  syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] [mm?] KCSAN: data-race in copy_page_from_iter_atomic / pagecache_isize_extended
Date: Mon, 12 May 2025 22:51:45 +0200	[thread overview]
Message-ID: <CANpmjNP7Ktjq3gUGKqQKgn1tbNZLj2ALRNvF-ZcERZ42KRD6Aw@mail.gmail.com> (raw)
In-Reply-To: <CAG48ez1BGFn7jw+FYZJxRyyjnR+rrqx1AtNQoR_Jup3tZ-dADQ@mail.gmail.com>

On Mon, 12 May 2025 at 20:33, 'Jann Horn' via syzkaller-bugs
<syzkaller-bugs@googlegroups.com> wrote:
>
> On Mon, May 12, 2025 at 7:44 PM Jann Horn <jannh@google.com> wrote:
> > On Tue, May 6, 2025 at 9:52 AM syzbot
> > <syzbot+189d4742d07e937d68ea@syzkaller.appspotmail.com> wrote:
> > > HEAD commit:    01f95500a162 Merge tag 'uml-for-linux-6.15-rc6' of git://g..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=17abbb68580000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=6154604431d9aaf9
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=189d4742d07e937d68ea
> > > compiler:       Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
> > [...]
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+189d4742d07e937d68ea@syzkaller.appspotmail.com
> > >
> > > ==================================================================
> > > BUG: KCSAN: data-race in copy_page_from_iter_atomic / pagecache_isize_extended
> >
> > I think this is a problem with the KCSAN implementation.
> >
> > This is a race between writing to a userspace-owned page and reading
> > from a userspace-owned page.
> >
> > This kind of pattern should be fairly trivial to trigger: If userspace
> > tells the kernel to read from a GUP'd page or pagecache on one thread,
> > and simultaneously tells the kernel to write to the same page on
> > another thread, we'll get a data race. This is not really a kernel
> > data race; it is more like a userspace race whose memory accesses
> > happen to go through the kernel.
> >
> > So I think the fix would be for KCSAN to ignore anything in such
> > pages. The hard part is, I'm not sure how to tell what kind of page
> > we're dealing with from the kernel, some MM people might know...
>
> Or alternatively, if we really do want data_race() operations around
> any memset() or memcpy() on userspace-controlled pages, I guess we'd
> have to pepper a lot of those around the kernel.
>
> Also, I didn't really think about some of what I wrote here - we
> certainly wouldn't want to ignore unannotated accesses to some struct
> located in pagecache that userspace can concurrently write to.
>
> Maybe it would actually make sense to do the opposite of what I said
> to some extent, special-case userspace-mapped pages such that KCSAN
> _always_ alerts on plain access to them...
>
> > distinguishing normal pagecache/anon pages from other pages might be
> > doable, but I guess it probably gets hard when thinking about
> > driver-allocated pages that were mapped into userspace vs
> > driver-allocated pages that are used internally in the driver...

There have been cases where user space was doing something unsafe, and
KCSAN caught it. While technically it's user space's bug to keep,
KCSAN is still telling us something's wrong here.

In the past we'd just ignore these bugs (never release them from
syzbot), but I think we recently changed the rules for some of these
to be sent to the mailing list. They can safely be ignored if deemed
"user space is doing something stupid".

I do think we want to surface such issues in one-off testing
scenarios. However, in the fuzzing/CI context it's not so helpful, so
we might need a way to suppress them. If there's a way to tell by
looking at the stacktrace, we could teach syzbot to ignore such data
races entirely.
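
As a rough illustration of the suppression idea above, here is a minimal
sketch in C. It is not syzbot code and it simplifies "looking at the
stacktrace" down to looking only at the two function names in the report
title. The allowlist and the names kcsan_title_is_user_race() and
is_user_data_func() are hypothetical; only the two listed kernel function
names come from this report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * HYPOTHETICAL allowlist: functions assumed to do nothing but copy or
 * zero user-visible page contents. Only the two names from this report
 * are listed; a real list would need review by MM people.
 */
static const char *const user_data_funcs[] = {
	"copy_page_from_iter_atomic",
	"pagecache_isize_extended",
};

static bool is_user_data_func(const char *name)
{
	size_t n = sizeof(user_data_funcs) / sizeof(user_data_funcs[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(name, user_data_funcs[i]) == 0)
			return true;
	return false;
}

/*
 * Parse a title of the form "BUG: KCSAN: data-race in A / B" and return
 * true only if both A and B are on the allowlist. Any other title
 * format is treated as "do not suppress".
 */
bool kcsan_title_is_user_race(const char *title)
{
	static const char prefix[] = "BUG: KCSAN: data-race in ";
	char a[128], b[128];

	if (strncmp(title, prefix, sizeof(prefix) - 1) != 0)
		return false;
	if (sscanf(title + sizeof(prefix) - 1, "%127s / %127s", a, b) != 2)
		return false;
	return is_user_data_func(a) && is_user_data_func(b);
}
```

Requiring both sides of the race to be on the list is the conservative
choice here: a race between a user-copy helper and any other kernel
function would still be reported.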



Thread overview: 5+ messages
2025-05-06  7:52 syzbot
2025-05-12 17:44 ` Jann Horn
2025-05-12 18:32   ` Jann Horn
2025-05-12 20:51     ` Marco Elver [this message]
2025-05-13 16:42       ` Jann Horn
