From: Marco Elver <elver@google.com>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: Hillf Danton <hdanton@sina.com>,
	Matthew Wilcox <willy@infradead.org>,
	 Al Viro <viro@zeniv.linux.org.uk>,
	 syzbot <syzbot+919c5a9be8433b8bf201@syzkaller.appspotmail.com>,
	 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	 syzkaller-bugs@googlegroups.com,
	Aleksandr Nogikh <nogikh@google.com>
Subject: Re: [syzbot] WARNING in do_mkdirat
Date: Mon, 12 Dec 2022 20:29:10 +0100
Message-ID: <CANpmjNNCQEXpJt1PQptyr8mrBbhWpToCRfvUT+RXmw5EA5EwVw@mail.gmail.com>
In-Reply-To: <Y5d565XVsinbNNL2@mit.edu>

On Mon, 12 Dec 2022 at 19:58, Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Mon, Dec 12, 2022 at 11:29:11AM +0800, Hillf Danton wrote:
> > > You've completely misunderstood Al's point.  He's not whining about
> > > being cc'd, he's pointing out that this is ONLY USEFUL IF THE NTFS3
> > > MAINTAINERS ARE CC'd.  And they're not.  So this is just noise.
> > > And enough noise means that signal is lost.
> >
> > Call Trace:
> >  <TASK>
> >  inode_unlock include/linux/fs.h:761 [inline]
> >  done_path_create fs/namei.c:3857 [inline]
> >  do_mkdirat+0x2de/0x550 fs/namei.c:4064
> >  __do_sys_mkdirat fs/namei.c:4076 [inline]
> >  __se_sys_mkdirat fs/namei.c:4074 [inline]
> >  __x64_sys_mkdirat+0x85/0x90 fs/namei.c:4074
> >  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >  do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
> >  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >
> > Given the call trace above, how do you know the ntfs3 guys should also be
> > Cc'd in addition to AV? What if it would take more than three months for
> > syzbot to learn the skills in your mind? What is preventing you from routing
> > the report to ntfs3?
>
> If it takes 3 months for syzbot to take a look at the source code in
> their own #!@?! reproducer, or just to take a look at the strace link
> in the dashboard:
>
> [pid  3639] mount("/dev/loop0", "./file2", "ntfs3", MS_NOSUID|MS_NOEXEC|MS_DIRSYNC|MS_I_VERSION, "") = 0
>
> There's something really wrong.  The point Al has been making (and
> I've been making for multiple years) is that Syzbot has the
> information, but unfortunately, at the moment, it is only analyzing
> the stack trace, and it is not doing things that really could be
> done automatically --- and cloud VM time is cheap, and upstream
> maintainer time is expensive.  So by not improving syzbot in a way
> that really shouldn't be all that difficult, the syzbot maintainers are
> disrespecting the time of the upstream maintainers.
>
> So sure, we could ask Linus to triage all syzbot reports --- or we
> could ask Al to triage all syzbot file system reports --- but that is
> not a good use of upstream resources.
>
> And "we didn't know this is super annoying" isn't an excuse, because
> I've been asking for things like this *before* the COVID pandemic.  So
> if the Syzbot team won't listen to observations by a random Google
> engineer who happens to be an ext4 maintainer (or rather, I'm sure
> they were listening, but they didn't consider it important enough to
> staff and put on the roadmap), maybe something a bit
> more.... assertive by Al is something that will inspire them to
> prioritize this feature request "above the fold".  :-)
>
> And Al does have a point --- if a lot of upstream maintainers consider
> Syzbot reports to be less than useful, and either auto-file the
> reports to a junk folder or just ignore them because they are busy
> and the Probability(Usefulness) is close to zero, then recovering
> from that black eye to Syzbot's reputation is going to be a lot more
> difficult than if Syzbot had been made more respectful of upstream
> maintainer time much earlier.
>
> Now, to be fair to the Syzbot team, the Syzbot console has gotten much
> better.  You can now download the syzbot trace, and download the
> mounted file system, when before, you had to do a lot more work to
> extract the file system (which is stored in separate constant C
> arrays as compressed data) from the C reproducer.  So things
> have gotten better.
>
> But at the same time, characterizing a syzbot report is something to
> be done by every file system maintainer who looks at a syzbot report,
> because there is no way to add a tag to the syzbot report that this
> particular syzbot report *really* is an ntfs3 issue.  So any
> information that a single developer figures out when triaging a bug
> (is this potentially an ext4 bug, nope, it's an ntfs3 bug) has to be
> replicated by every single kernel developer looking at the Syzbot
> dashboard.  Which again, is not respectful of upstream maintainers'
> time.

This is being worked on:
https://github.com/google/syzkaller/issues/3393#issuecomment-1330305227

Teaching a bot the pattern matching skills of a human is non-trivial.
The current design will likely do the simplest thing: regex match
reproducers and map a match to some kernel source dir, for which the
maintainers are Cc'd. If you have better suggestions on how to
mechanize subsystem selection based on a reproducer, please shout.
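
As a rough illustration of that "simplest thing", here is a minimal sketch,
not the actual syzbot design. It assumes the reproducer text contains
strace-style mount() lines like the one quoted above; the FSTYPE_TO_DIR
table, MOUNT_RE, and subsystem_dirs() are hypothetical names made up for
this example. The resulting directories could then be passed to something
like scripts/get_maintainer.pl -f to find whom to Cc.

```python
import re

# Hypothetical table: filesystem type seen in a mount() call -> kernel source dir.
FSTYPE_TO_DIR = {
    "ntfs3": "fs/ntfs3",
    "ext4": "fs/ext4",
    "btrfs": "fs/btrfs",
}

# Captures the third argument (the filesystem type) of an strace-style mount() line.
MOUNT_RE = re.compile(r'mount\(\s*"[^"]*"\s*,\s*"[^"]*"\s*,\s*"([^"]+)"')


def subsystem_dirs(reproducer_text):
    """Return the sorted kernel source dirs implied by mount() calls in the text."""
    dirs = set()
    for fstype in MOUNT_RE.findall(reproducer_text):
        # Unknown filesystem types are skipped rather than guessed at.
        if fstype in FSTYPE_TO_DIR:
            dirs.add(FSTYPE_TO_DIR[fstype])
    return sorted(dirs)


if __name__ == "__main__":
    line = ('[pid  3639] mount("/dev/loop0", "./file2", "ntfs3", '
            'MS_NOSUID|MS_NOEXEC|MS_DIRSYNC|MS_I_VERSION, "") = 0')
    print(subsystem_dirs(line))
```

A real reproducer may express the mount differently (for example through a
syzkaller pseudo-syscall rather than a plain mount() line), so the pattern
would need to cover those forms as well.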

