From: Linus Torvalds
Date: Mon, 31 Aug 2020 11:21:50 -0700
Subject: Re: kernel BUG at fs/ext4/inode.c:LINE!
To: Jan Kara
Cc: syzbot, Andreas Dilger, Ext4 Developers List,
    Linux Kernel Mailing List, syzkaller-bugs, "Theodore Ts'o",
    Linux-MM, Oleg Nesterov
References: <000000000000d3a33205add2f7b2@google.com>
    <20200828100755.GG7072@quack2.suse.cz>
    <20200831100340.GA26519@quack2.suse.cz>
In-Reply-To: <20200831100340.GA26519@quack2.suse.cz>

On Mon, Aug 31, 2020 at 3:03 AM Jan Kara wrote:
>
> On Fri 28-08-20 12:07:55, Jan Kara wrote:
> >
> > Doh, so this is:
> >
> >         wait_on_page_writeback(page);
> > >>>     BUG_ON(PageWriteback(page));
> >
> > in mpage_prepare_extent_to_map(). So we have PageWriteback() page after we
> > have called wait_on_page_writeback() on a locked page. Not sure how this
> > could ever happen even less how ext4 could cause this...
>
> I was poking a bit into this and there were actually recent changes into
> page bit waiting logic by Linus. Linus, any idea?

So the main change is that now if somebody does a wake_up_page(), the
page waiter will be released - even if somebody else then set the bit
again (or possibly if the waker never cleared it!).

It used to be that the waiter went back to sleep.

Which really shouldn't matter, but if we had any code that did something like

        end_page_writeback();
        .. something does set_page_writeback() on the page again ..

then the old BUG_ON() would likely never have triggered (because the
waiter would have seen the writeback bit being set again and gone back
to sleep), but now it will.

So I would suspect a pre-existing issue that was just hidden by the
old behavior and was basically impossible to trigger unless you hit
*just* the right timing.

And now it's easy to trigger, because the first time somebody clears
PG_writeback, the wait_on_page_writeback() will just return *without*
re-testing and *without* going back to sleep.

Could there be somebody who does set_page_writeback() without holding
the page lock?

Maybe adding a

        WARN_ON_ONCE(!PageLocked(page));

at the top of __test_set_page_writeback() might find something?

Note that it looks like this problem has been reported on Android
before according to that syzbot thing. Ie, this thing:

    https://groups.google.com/g/syzkaller-android-bugs/c/2CfEdQd4EE0/m/xk_GRJEHBQAJ

looks very similar, and predates the wake_up_page() changes.

So it was probably just much _harder_ to hit before, and got easier to hit.

Hmm. In fact, googling for

        mpage_prepare_extent_to_map "kernel BUG"

seems to find stuff going back years. Here's a patchwork discussion
where you had a debug patch to try to figure it out back in 2016:

    https://patchwork.ozlabs.org/project/linux-ext4/patch/20161122133452.GF3973@quack2.suse.cz/

although that one seems to be a different BUG_ON() in the same area.

Maybe entirely unrelated, but the fact that this function shows up a
fair amount is perhaps a sign of some long-running issue..

             Linus
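
[Editorial sketch, not part of the original message.] The difference
between the old and new waiter behavior described above can be modelled
in a few lines of single-threaded userspace C. This is only a toy replay
of one interleaving, under the assumption that writeback is set again
between the wakeup and the waiter actually running. Every name in it
(toy_page, toy_wait_policy, toy_wait_then_check) is hypothetical and
made up for illustration; none of it is kernel code or a kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a page with only the writeback bit. */
struct toy_page {
	bool writeback;
};

/* Hypothetical names for the two waiter behaviors discussed above. */
enum toy_wait_policy {
	TOY_RETEST_AFTER_WAKE,	/* old: re-test the bit, sleep again if set */
	TOY_RETURN_ON_WAKE	/* new: the first wakeup releases the waiter */
};

/*
 * Replay one fixed interleaving and report whether the caller's check
 * (standing in for BUG_ON(PageWriteback(page))) would see the bit set.
 */
static bool toy_wait_then_check(enum toy_wait_policy policy)
{
	struct toy_page page = { .writeback = true };
	bool woken;

	/* 1. First writeback completes: bit cleared, sleeping waiter woken. */
	page.writeback = false;
	woken = true;

	/* 2. Before the waiter runs, writeback is started on the page again. */
	page.writeback = true;

	/* 3. The waiter gets to run; it only runs after a wakeup. */
	assert(woken);
	if (policy == TOY_RETEST_AFTER_WAKE && page.writeback) {
		/* Old: the bit is set again, so sleep until it is cleared. */
		woken = false;
		page.writeback = false;	/* second completion ... */
		woken = true;		/* ... and its wakeup */
	}
	/* New: being woken is enough; the bit is not re-tested. */

	/* 4. What the caller sees right after the wait returns. */
	return woken && page.writeback;
}

/* Print the outcome for both policies. */
static void toy_report(void)
{
	printf("retest after wake: bit set on return = %d\n",
	       toy_wait_then_check(TOY_RETEST_AFTER_WAKE));
	printf("return on wake:    bit set on return = %d\n",
	       toy_wait_then_check(TOY_RETURN_ON_WAKE));
}
```

For this interleaving the re-testing waiter returns with the bit clear,
while the return-on-wake waiter returns with the bit still set, which is
the case where a check like the BUG_ON() would fire. The model says
nothing about who sets the bit the second time; that is the open
question in the thread.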