From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 28 Mar 2025 18:06:48 -0700
From: Luis Chamberlain <mcgrof@kernel.org>
To: Jan Kara, Kefeng Wang, Sebastian Andrzej Siewior, Davidlohr Bueso,
	Theodore Ts'o, Ritesh Harjani
Cc: Johannes Weiner, Oliver Sang, Matthew Wilcox, David Hildenbrand,
	Alistair Popple, linux-mm@kvack.org, Christian Brauner,
	Hannes Reinecke, oe-lkp@lists.linux.dev, lkp@intel.com, John Garry,
	linux-block@vger.kernel.org, ltp@lists.linux.it, Pankaj Raghav,
	Daniel Gomez, Dave Chinner, gost.dev@samsung.com
Subject: Re: [linux-next:master] [block/bdev] 3c20917120: BUG:sleeping_function_called_from_invalid_context_at_mm/util.c
References: <20250322231440.GA1894930@cmpxchg.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Fri, Mar 28, 2025 at 05:08:40PM -0700, Luis Chamberlain wrote:
> So, moving on, I think what's best is to see how we can get
> __find_get_block() to not chug on during page migration.

Something like this maybe? It passes the initial 10 minutes of
generic/750 on ext4 while also blasting an LBS device with dd. I'll let
it soak. The second patch is the one that requires more eyeballs /
suggestions / ideas.

>From 86b2315f3c80dd4562a1a0fa0734921d3e92398f Mon Sep 17 00:00:00 2001
From: Luis Chamberlain <mcgrof@kernel.org>
Date: Fri, 28 Mar 2025 17:12:48 -0700
Subject: [PATCH 1/3] mm/migrate: add might_sleep() on __migrate_folio()

When we do page migration of large folios, folio_mc_copy() can
cond_resched() *iff* we are on a large folio. There's a hairy bug,
reported by both 0-day [0] and syzbot [1], where it has been detected
that we can call folio_mc_copy() in atomic context. Technically
speaking, that should only be possible today from buffer-head
filesystems using buffer_migrate_folio_norefs() on page migration, and
the only buffer-head filesystem with large folios is the block device
cache, so only block devices with large block sizes are affected.

However, tracing shows that folio_mc_copy() *isn't* being called as
often as we'd expect from the buffer_migrate_folio_norefs() path, as
we're likely bailing early now thanks to the check added by commit
060913999d7a ("mm: migrate: support poisoned recover from migrate
folio"). *Most* folio_mc_copy() calls in turn end up *not* being in
atomic context, and so we won't hit a splat even when using:

CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_ATOMIC_SLEEP=y

But we *want* to proactively find callers of __migrate_folio() in
atomic context, so make the might_sleep() explicit to help us root out
large folio atomic callers of migrate_folio().
Link: https://lkml.kernel.org/r/202503101536.27099c77-lkp@intel.com # [0]
Link: https://lkml.kernel.org/r/67e57c41.050a0220.2f068f.0033.GAE@google.com # [1]
Link: https://lkml.kernel.org/r/Z-c6BqCSmAnNxb57@bombadil.infradead.org # [2]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 mm/migrate.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/migrate.c b/mm/migrate.c
index f3ee6d8d5e2e..712ddd11f3f0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -751,6 +751,8 @@ static int __migrate_folio(struct address_space *mapping, struct folio *dst,
 {
 	int rc, expected_count = folio_expected_refs(mapping, src);
 
+	might_sleep();
+
 	/* Check whether src does not have extra refs before we do more work */
 	if (folio_ref_count(src) != expected_count)
 		return -EAGAIN;
-- 
2.47.2

>From 561e94951fce481bb2e5917230bec7008c131d9a Mon Sep 17 00:00:00 2001
From: Luis Chamberlain <mcgrof@kernel.org>
Date: Fri, 28 Mar 2025 17:44:10 -0700
Subject: [PATCH 2/3] fs/buffer: avoid getting buffer if it is folio migration
 candidate

Avoid giving away a buffer with __find_get_block_slow() if the folio
may be a folio migration candidate. We do this as an alternative to the
issue fixed by commit ebdf4de5642fb6 ("mm: migrate: fix reference check
race between __find_get_block() and migration"), given we've determined
that we should avoid requiring folio migration callers to hold a spin
lock while calling __migrate_folio(). This alternative simply avoids
completing __find_get_block_slow() on folio migration candidates, which
lets us later rip out the spin_lock() held on the
buffer_migrate_folio_norefs() path.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/buffer.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/buffer.c b/fs/buffer.c
index c7abb4a029dc..6e2c3837a202 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -208,6 +208,12 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
 	head = folio_buffers(folio);
 	if (!head)
 		goto out_unlock;
+
+	if (folio_test_lru(folio) &&
+	    folio_test_locked(folio) &&
+	    !folio_test_writeback(folio))
+		goto out_unlock;
+
 	bh = head;
 	do {
 		if (!buffer_mapped(bh))
-- 
2.47.2

>From af6963b73a8406162e6c2223fae600a799402e2b Mon Sep 17 00:00:00 2001
From: Luis Chamberlain <mcgrof@kernel.org>
Date: Fri, 28 Mar 2025 17:51:39 -0700
Subject: [PATCH 3/3] mm/migrate: avoid atomic context on
 buffer_migrate_folio_norefs() migration

buffer_migrate_folio_norefs() should avoid holding the spin lock in
order to ensure we can support large folios. The prior commit
"fs/buffer: avoid getting buffer if it is folio migration candidate"
ripped out the only rationale for having the atomic context, so we can
remove the spin lock call now.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 mm/migrate.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 712ddd11f3f0..f3047c685706 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -861,12 +861,12 @@ static int __buffer_migrate_folio(struct address_space *mapping,
 		}
 		bh = bh->b_this_page;
 	} while (bh != head);
+	spin_unlock(&mapping->i_private_lock);
 	if (busy) {
 		if (invalidated) {
 			rc = -EAGAIN;
 			goto unlock_buffers;
 		}
-		spin_unlock(&mapping->i_private_lock);
 		invalidate_bh_lrus();
 		invalidated = true;
 		goto recheck_buffers;
 	}
@@ -884,8 +884,6 @@ static int __buffer_migrate_folio(struct address_space *mapping,
 	} while (bh != head);
 
 unlock_buffers:
-	if (check_refs)
-		spin_unlock(&mapping->i_private_lock);
 	bh = head;
 	do {
 		unlock_buffer(bh);
-- 
2.47.2
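
As an aside, for anyone unfamiliar with what the might_sleep() in patch
1 buys us: below is a minimal userspace analogy (this is *not* kernel
code; every name in it is made up for illustration) of how the
CONFIG_DEBUG_ATOMIC_SLEEP-style check works. The idea is a per-thread
"atomic depth" counter, bumped when entering an atomic section (the
kernel's preempt_count), and a would-be-sleeping function that
complains loudly whenever it is entered with a nonzero depth:

```c
#include <stdio.h>

/* Made-up userspace stand-in for the kernel's per-CPU preempt_count. */
static __thread int atomic_depth;

/* Analogy for spin_lock()/spin_unlock() marking an atomic section. */
static void enter_atomic(void) { atomic_depth++; }
static void exit_atomic(void)  { atomic_depth--; }

/*
 * Analogy for might_sleep(): a function that may sleep calls this on
 * entry, so any atomic-context caller is flagged immediately, even on
 * runs where the function happens not to actually sleep.
 */
static int might_sleep_check(void)
{
	if (atomic_depth > 0) {
		fprintf(stderr,
			"BUG: sleeping function called from invalid context\n");
		return -1;
	}
	return 0;
}
```

This is exactly why the explicit might_sleep() in __migrate_folio()
finds the bad callers proactively: without it, the splat only fires on
the rare runs where folio_mc_copy() actually reaches cond_resched() on
a large folio while a spin lock is held.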