From: Linus Torvalds
Date: Sun, 23 Oct 2022 15:38:13 -0700
Subject: Re: writeback completion soft lockup BUG in folio_wake_bit()
To: Dan Williams
Cc: Matthew Wilcox, Brian Foster, Linux-MM, linux-fsdevel, linux-xfs, Hugh Dickins
In-Reply-To: <6350a5f07bae2_6be12944c@dwillia2-xfh.jf.intel.com.notmuch>

On Wed, Oct 19, 2022 at 6:35 PM Dan Williams wrote:
>
> A report from a tester with this call trace:
>
>  watchdog: BUG: soft lockup - CPU#127 stuck for 134s! [ksoftirqd/127:782]
>  RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40 [..]

Whee.

> ...lead me to this thread. This was after I had them force all softirqs
> to run in ksoftirqd context, and run with rq_affinity == 2 to force
> I/O completion work to throttle new submissions.
>
> Willy, are these headed upstream:
>
> https://lore.kernel.org/all/YjSbHp6B9a1G3tuQ@casper.infradead.org
>
> ...or I am missing an alternate solution posted elsewhere?

Can your reporter test that patch? I think it should still apply
pretty much as-is..

And if we actually had somebody who had a test-case that was literally
fixed by getting rid of the old bookmark code, that would make
applying that patch a no-brainer.

The problem is that the original load that caused us to do that thing
in the first place isn't repeatable because it was special production
code - so removing that bookmark code because we _think_ it now hurts
more than it helps is kind of a big hurdle.

But if we had some hard confirmation from somebody that "yes, the
bookmark code is now hurting", that would make it a lot more palatable
to just remove the code that we just _think_ that probably isn't
needed any more..

             Linus