From: Linus Torvalds
Date: Wed, 16 Mar 2022 16:35:10 -0700
Subject: Re: writeback completion soft lockup BUG in folio_wake_bit()
To: Matthew Wilcox
Cc: Brian Foster, Linux-MM, linux-fsdevel, linux-xfs, Hugh Dickins

On Wed, Mar 16, 2022 at 1:59 PM Matthew Wilcox wrote:
>
> As I recall, the bookmark hack was introduced in order to handle
> lock_page() problems.  It wasn't really supposed to handle writeback,
> but nobody thought it would cause any harm (and indeed, it didn't at the
> time).  So how about we only use bookmarks for lock_page(), since
> lock_page() usually doesn't have the multiple-waker semantics that
> writeback has?

I was hoping that some of the page lock problems are gone and we could
maybe try to get rid of the bookmarks entirely.

But the page lock issues only ever showed up on some private
proprietary load and machine, so we never really got confirmation that
they are fixed. There were lots of strong signs that they were related
to the migration page locking, and it may be that the bookmark code is
only hurting these days.

See for example commit 9a1ea439b16b ("mm: put_and_wait_on_page_locked()
while page is migrated") which doesn't actually change the *locking*
side, but drops the page reference when waiting for the locked page to
be unlocked, which in turn removes a "loop and try again when
migration". And that may have been the real _fix_ for the problem.

Because while the bookmark thing avoids the NMI lockup detector firing
due to excessive hold times, the bookmarking also _causes_ that "we
now will see the same page multiple times because we dropped the lock
and somebody re-added it at the end of the queue" issue. Which seems
to be the problem here.

Ugh. I wish we had some way to test "could we just remove the bookmark
code entirely again".

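To make the "same page multiple times" effect concrete, here is a toy
userspace sketch. It is not the kernel code: struct toy_queue,
toy_enqueue(), toy_wake_walk(), the per-waiter requeue budget and the
break count of 4 are all made up for illustration. It only models a
waker that drops the queue lock every few entries, at which point
waiters it has already woken may put themselves back at the tail.

```c
#include <assert.h>

#define QUEUE_CAP 1024
#define BREAK_CNT 4	/* toy value, purely an assumption for this sketch */

struct toy_queue {
	int waiter[QUEUE_CAP];	/* waiter ids, in queue order */
	int head, tail;		/* entries in [head, tail) are still queued */
};

/* Add a waiter at the tail of the queue. */
void toy_enqueue(struct toy_queue *q, int id)
{
	assert(q->tail < QUEUE_CAP);
	q->waiter[q->tail++] = id;
}

/*
 * One wakeup walk over the queue. Every visited entry is removed and
 * "woken". requeues_left[id] says how many more times waiter id will
 * find its condition still unmet and wait again.
 *
 * With use_bookmark set, the walk drops the lock every BREAK_CNT
 * entries; the waiters woken so far get to run in that window and
 * re-add themselves at the tail, in front of the ongoing walk.
 * Without it, the lock is held for the whole walk, so nobody can
 * requeue until the walk is over (not modelled here).
 *
 * Returns how many entries this single walk had to visit.
 */
int toy_wake_walk(struct toy_queue *q, int *requeues_left, int use_bookmark)
{
	int woken[QUEUE_CAP];
	int nwoken = 0, visited = 0;

	while (q->head < q->tail) {
		woken[nwoken++] = q->waiter[q->head++];
		visited++;

		if (use_bookmark && visited % BREAK_CNT == 0) {
			/* lock dropped: woken waiters run and may requeue */
			for (int i = 0; i < nwoken; i++) {
				int id = woken[i];

				if (requeues_left[id] > 0) {
					requeues_left[id]--;
					toy_enqueue(q, id);
				}
			}
			nwoken = 0;
		}
	}
	return visited;
}
```

With 8 waiters that each wait again once, the walk visits 8 entries
when the lock is held throughout, and 16 when it drops the lock every
4 entries. Give the waiters a larger requeue budget and the bookmark
walk grows with it, while the lock-held walk stays at 8.
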
Of course, the PG_lock case also works fairly hard to not actually
remove and re-add the lock waiter to the queue, by having an actual
"wait for and get the lock" operation. The writeback bit isn't done
that way.

I do hate how we had to make folio_wait_writeback{_killable}() use
a "while" rather than an "if". It *almost* works with just a "wait for
current writeback", but not quite. See commit c2407cf7d22d ("mm: make
wait_on_page_writeback() wait for multiple pending writebacks") for
why we have to loop. Ugly, ugly.

Because I do think that "while" in the writeback waiting is a problem.
Maybe _the_ problem.

Linus
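
A toy userspace model of the "while" versus "if" point, again not the
kernel code. struct toy_folio, toy_wait_bit() and the restarts_left
knob are hypothetical, and the sketch assumes the case to handle is a
new writeback starting before the woken waiter gets to run.

```c
#include <stdbool.h>

struct toy_folio {
	bool writeback;		/* stand-in for the writeback bit */
	int restarts_left;	/* hypothetical: times writeback restarts */
	int sleeps;		/* times the waiter had to queue up and sleep */
};

/*
 * Stand-in for one "wait for current writeback" step: the waiter
 * queues up, the current writeback ends and wakes it, but a new
 * writeback may already be under way by the time it runs again.
 */
void toy_wait_bit(struct toy_folio *f)
{
	f->sleeps++;
	f->writeback = false;
	if (f->restarts_left > 0) {
		f->restarts_left--;
		f->writeback = true;
	}
}

/* "if" version: waits for the current writeback only. */
void toy_wait_writeback_if(struct toy_folio *f)
{
	if (f->writeback)
		toy_wait_bit(f);
}

/* "while" version: re-checks and goes back on the queue each time. */
void toy_wait_writeback_while(struct toy_folio *f)
{
	while (f->writeback)
		toy_wait_bit(f);
}
```

With one restart, the "if" version returns while the folio is still
under writeback, and the "while" version returns only once it is
clean, at the cost of going back to sleep a second time. That second
trip onto the queue is what the wakeup walk then gets to see again.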