From: Marco Elver
Date: Mon, 12 Dec 2022 20:29:10 +0100
Subject: Re: [syzbot] WARNING in do_mkdirat
To: "Theodore Ts'o"
Cc: Hillf Danton, Matthew Wilcox, Al Viro, syzbot, linux-kernel@vger.kernel.org, linux-mm@kvack.org, syzkaller-bugs@googlegroups.com, Aleksandr Nogikh

On Mon, 12 Dec 2022 at 19:58, Theodore Ts'o wrote:
>
> On Mon, Dec 12, 2022 at 11:29:11AM +0800, Hillf Danton wrote:
> > > You've completely misunderstood Al's point. He's not whining about
> > > being cc'd, he's pointing out that this is ONLY USEFUL IF THE NTFS3
> > > MAINTAINERS ARE CC'd. And they're not. So this is just noise.
> > > And enough noise means that signal is lost.
> >
> > Call Trace:
> >
> > inode_unlock include/linux/fs.h:761 [inline]
> > done_path_create fs/namei.c:3857 [inline]
> > do_mkdirat+0x2de/0x550 fs/namei.c:4064
> > __do_sys_mkdirat fs/namei.c:4076 [inline]
> > __se_sys_mkdirat fs/namei.c:4074 [inline]
> > __x64_sys_mkdirat+0x85/0x90 fs/namei.c:4074
> > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
> > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >
> > Given the call trace above, how do you know the ntfs3 guys should also
> > be Cced in addition to AV? What if it would take more than three months
> > for syzbot to learn the skills in your mind? What is preventing you from
> > routing the report to ntfs3?
>
> If it takes 3 months for syzbot to take a look at the source code in
> their own #!@?! reproducer, or just to take a look at the strace link
> in the dashboard:
>
> [pid 3639] mount("/dev/loop0", "./file2", "ntfs3", MS_NOSUID|MS_NOEXEC|MS_DIRSYNC|MS_I_VERSION, "") = 0
>
> There's something really wrong. The point Al has been making (and
> I've been making for multiple years) is that Syzbot has the
> information, but unfortunately, at the moment, it is only analyzing
> the stack trace, and it is not doing things that really could be
> done automatically --- and cloud VM time is cheap, and upstream
> maintainer time is expensive. So by not improving syzbot in a way
> that really shouldn't be all that difficult, the syzbot maintainers are
> disrespecting the time of the upstream maintainers.
>
> So sure, we could ask Linus to triage all syzbot reports --- or we
> could ask Al to triage all syzbot file system reports --- but that is
> not a good use of upstream resources.
>
> And "we didn't know this is super annoying" isn't an excuse, because
> I've been asking for things like this *before* the COVID pandemic.
> So if the Syzbot team won't listen to observations by a random Google
> engineer who happens to be an ext4 maintainer (or rather, I'm sure
> they were listening, but they didn't consider it important enough to
> staff and put on the roadmap), maybe something a bit
> more.... assertive by Al is something that will inspire them to
> prioritize this feature request "above the fold". :-)
>
> And Al does have a point --- if a lot of upstream maintainers consider
> Syzbot reports to be less than useful, and either auto-file
> reports to a junk folder or just ignore the Syzbot reports because
> they are busy and the Probability(Usefulness) is close to zero, then
> recovering from that black eye to Syzbot's reputation is going to be a
> lot more difficult than if Syzbot had been made more respectful of
> upstream maintainer time much earlier.
>
> Now, to be fair to the Syzbot team, the Syzbot console has gotten much
> better. You can now download the syzbot trace, and download the
> mounted file system, when before, you had to do a lot more work to
> extract the file system (which is stored in separate constant C
> arrays as compressed data) from the C reproducer. So things
> have gotten better.
>
> But at the same time, characterizing a syzbot report is something that
> has to be done by every file system maintainer who looks at a syzbot
> report, because there is no way to add a tag to the syzbot report that
> this particular syzbot report *really* is an ntfs3 issue. So any
> information that a single developer figures out when triaging a bug
> (is this potentially an ext4 bug, nope, it's an ntfs3 bug) has to be
> replicated by every single kernel developer looking at the Syzbot
> dashboard. Which again, is not respectful of upstream maintainers'
> time.

This is being worked on:
https://github.com/google/syzkaller/issues/3393#issuecomment-1330305227

Teaching a bot the pattern matching skills of a human is non-trivial.
The current design will likely do the simplest thing: regex match
reproducers and map a match to some kernel source dir, for which the
maintainers are Cc'd.

If you have better suggestions on how to mechanize subsystem selection
based on a reproducer, please shout.
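
To make the "regex match and map to a source dir" idea concrete, here is a
rough sketch in Python. It is not syzbot's actual code: the regex, the
FS_TO_SOURCE_DIR table and the guess_source_dirs() helper are all made up
for illustration, and it only looks at plain mount() calls like the strace
line quoted above. A real reproducer may set up the file system through
other calls that this pattern would miss.

```python
import re

# Hypothetical table: filesystem type string -> kernel source directory.
# A real tool would need a far more complete, maintained mapping.
FS_TO_SOURCE_DIR = {
    "ntfs3": "fs/ntfs3",
    "ext4": "fs/ext4",
    "btrfs": "fs/btrfs",
}

# Captures the third argument (the filesystem type) of a mount() call as
# it appears in strace output, e.g. mount("/dev/loop0", "./file2", "ntfs3", ...).
MOUNT_RE = re.compile(r'\bmount\(\s*"[^"]*"\s*,\s*"[^"]*"\s*,\s*"([^"]+)"')


def guess_source_dirs(reproducer_text):
    """Return source dirs for every known filesystem mounted in the text.

    Order of first appearance is kept and duplicates are dropped.
    Unknown filesystem types are silently ignored.
    """
    dirs = []
    for fs_type in MOUNT_RE.findall(reproducer_text):
        source_dir = FS_TO_SOURCE_DIR.get(fs_type)
        if source_dir and source_dir not in dirs:
            dirs.append(source_dir)
    return dirs


if __name__ == "__main__":
    strace_line = (
        '[pid 3639] mount("/dev/loop0", "./file2", "ntfs3", '
        'MS_NOSUID|MS_NOEXEC|MS_DIRSYNC|MS_I_VERSION, "") = 0'
    )
    print(guess_source_dirs(strace_line))
```

Looking up who maintains the resulting directory is left out here; the
sketch only covers the matching step.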