From: Aleksandr Nogikh
Date: Fri, 16 Dec 2022 16:48:34 +0100
Subject: Re: [syzbot] WARNING in do_mkdirat
To: Al Viro
Cc: Marco Elver, "Theodore Ts'o", Hillf Danton, Matthew Wilcox, syzbot,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    syzkaller-bugs@googlegroups.com
References: <20221211002908.2210-1-hdanton@sina.com>
    <00000000000025ff8d05ef842be6@google.com>
    <20221211075612.2486-1-hdanton@sina.com>
    <20221211102208.2600-1-hdanton@sina.com>
    <20221212032911.2965-1-hdanton@sina.com>

On Tue, Dec 13, 2022 at 2:44 AM Al Viro wrote:
>
> On Mon, Dec 12, 2022 at 08:29:10PM +0100, Marco Elver wrote:
>
> > > > Given the call trace above, how do you know the ntfs3 guys should be also
> > > > Cced in addition to AV? What if it would take more than three months for
> > > > syzbot to learn the skills in your mind?
>
> Depends.  If you really are talking about the *BOT* learning to do
> that on its own, it certainly would take more than 3 months; strong AI
> is hard.  If, OTOH, it is not an AI research project and intervention of
> somebody capable of passing the Turing test does not violate the purity
> of experiment...
> Surely converting "if it mounts an image as filesystem
> of type $T, grep the tree for "MODULE_ALIAS_FS($T)" and treat that
> as if a function from the resulting file had been found in stack trace"
> into something usable for the bot should not take more than 3 months,
> should it?
>
> If expressing that rule really takes "more than three months", I would
> suggest that something is very wrong with the bot architecture...
>
> > Teaching a bot the pattern matching skills of a human is non-trivial.
> > The current design will likely do the simplest thing: regex match
> > reproducers and map a match to some kernel source dir, for which the
> > maintainers are Cc'd. If you have better suggestions on how to
> > mechanize subsystem selection based on a reproducer, please shout.
>
> Er...  Yes?  Look, it's really that simple -
>     for i in `sed -ne 's/.*syz_mount_image$\([_[:alnum:]]*\).*/\1/p' <$REPRO`; do
>         git grep -l "MODULE_ALIAS_FS(\"$i\")"
>     done | sort | uniq
> gets you the list of files.  No, I'm not suggesting to go for that kind
> of shell use, but it's clearly doable with regex and search over the source
> for fixed strings.  Unless something's drastically wrong with the way the
> bot is written, it should be capable of something as basic as that...
>
> If it can't do that kind of mapping, precalculating it for given tree is
> also not hard:
>     git grep 'MODULE_ALIAS_FS("'|sed -ne 's/\(.*\):.*MODULE_ALIAS_FS("\([_[:alnum:]]*\)".*/syz_mount_image$\2:\1/p'
> will yield lines like
>     syz_mount_image$ext2:fs/ext2/super.c
>     syz_mount_image$ext2:fs/ext4/super.c
>     syz_mount_image$ext3:fs/ext4/super.c
>     syz_mount_image$ext4:fs/ext4/super.c
> etc.  Surely turning *that* into whatever form the bot wants can't
> be terribly hard? [*]
>
> All of that assumes that pattern-matching in syzkaller reproducer is
> expressible; if "we must do everything by call trace alone" is
> a real limitation, we are SOL; stack trace simply doesn't have
> that information.
> Is there such an architectural limitation?

Thanks for the feedback, and we regret the inconvenience this may have
caused.

We've deployed a simple short-term solution to the immediate issue:
syzbot will extract the involved filesystems from reproducers and use
this information to construct the email subject line and Cc the related
people/mailing lists. This should take effect starting next week.

That being said, in response to the original feedback we have already
been planning comprehensive improvements to the subsystem selection
process that will support more than just filesystems. But unfortunately,
this is going to take longer to become available.

--
Aleksandr

>
> [*] depending upon config, ext2 could be mounted by ext2.ko and ext4.ko;
> both have the same maillist for bug reports, so this ambiguity doesn't
> matter - either match would do.
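
For illustration only, the two steps discussed in this thread (pulling
filesystem names out of a reproducer, and precalculating a
filesystem-to-source-file map) could be sketched as below. This is not
syzbot's actual implementation, which is not shown in this thread. The
function names extract_fs and alias_map and the file name fs-map.txt are
hypothetical, and `grep -o` is assumed to be available (it is in GNU and
BSD grep, but is not required by POSIX).

```shell
#!/bin/sh
# Hypothetical helper: read a syzkaller reproducer on stdin and print each
# filesystem type named in a syz_mount_image$TYPE call, one per line,
# sorted and de-duplicated.
extract_fs() {
    grep -o 'syz_mount_image\$[_[:alnum:]]*' | sed 's/.*\$//' | sort -u
}

# Hypothetical helper: read `git grep`-style lines on stdin
# ("path:...MODULE_ALIAS_FS("type")...") and print "type:path" pairs.
alias_map() {
    sed -n 's/^\([^:]*\):.*MODULE_ALIAS_FS("\([_[:alnum:]]*\)").*/\2:\1/p'
}

# Possible use, run from the top of a kernel tree (assumption: $REPRO
# names a reproducer file, as in the quoted example):
#   git grep 'MODULE_ALIAS_FS("' | alias_map > fs-map.txt
#   extract_fs < "$REPRO" | while read -r fs; do
#       grep "^$fs:" fs-map.txt
#   done
```

Unlike the quoted sed one-liner, which reports only the last match on a
line, `grep -o` here reports every syz_mount_image$TYPE occurrence, so a
reproducer line naming two filesystems yields both.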