From: Luis Chamberlain <mcgrof@kernel.org>
To: Jan Kara <jack@suse.cz>, Kefeng Wang <wangkefeng.wang@huawei.com>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
David Bueso <dave@stgolabs.net>, Tso Ted <tytso@mit.edu>,
Ritesh Harjani <ritesh.list@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Oliver Sang <oliver.sang@intel.com>,
Matthew Wilcox <willy@infradead.org>,
David Hildenbrand <david@redhat.com>,
Alistair Popple <apopple@nvidia.com>,
linux-mm@kvack.org, Christian Brauner <brauner@kernel.org>,
Hannes Reinecke <hare@suse.de>,
oe-lkp@lists.linux.dev, lkp@intel.com,
John Garry <john.g.garry@oracle.com>,
linux-block@vger.kernel.org, ltp@lists.linux.it,
Pankaj Raghav <p.raghav@samsung.com>,
Daniel Gomez <da.gomez@samsung.com>,
Dave Chinner <david@fromorbit.com>,
gost.dev@samsung.com
Subject: Re: [linux-next:master] [block/bdev] 3c20917120: BUG:sleeping_function_called_from_invalid_context_at_mm/util.c
Date: Fri, 28 Mar 2025 18:06:48 -0700
Message-ID: <Z-dHqMtGneCVs3v5@bombadil.infradead.org>
In-Reply-To: <Z-c6BqCSmAnNxb57@bombadil.infradead.org>
On Fri, Mar 28, 2025 at 05:08:40PM -0700, Luis Chamberlain wrote:
> So, moving on, I think what's best is to see how we can get __find_get_block()
> to not chug on during page migration.
Something like this maybe? Passes initial 10 minutes of generic/750
on ext4 while also blasting an LBS device with dd. I'll let it soak.
The second patch is the one that requires more eyeballs / suggestions / ideas.
From 86b2315f3c80dd4562a1a0fa0734921d3e92398f Mon Sep 17 00:00:00 2001
From: Luis Chamberlain <mcgrof@kernel.org>
Date: Fri, 28 Mar 2025 17:12:48 -0700
Subject: [PATCH 1/3] mm/migrate: add might_sleep() on __migrate_folio()
When we do page migration of large folios, folio_mc_copy() can
cond_resched() *iff* we are on a large folio. There's a hairy
bug reported by both 0-day [0] and syzbot [1] where it has been
detected that we can call folio_mc_copy() in atomic context.
Technically speaking, that should today only be possible from
buffer-head filesystems using buffer_migrate_folio_norefs() on page
migration, and the only buffer-head user with large folios is the
block device cache, so only block devices with large block sizes
should be affected.
However, tracing shows that folio_mc_copy() *isn't* being called
as often as we'd expect from the buffer_migrate_folio_norefs() path,
as we're likely bailing early now thanks to the check added by commit
060913999d7a ("mm: migrate: support poisoned recover from migrate
folio").
*Most* folio_mc_copy() calls in turn end up *not* being in atomic
context, and so we won't hit a splat when using:
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
But we *want* to help proactively find callers of __migrate_folio() in
atomic context, so make might_sleep() explicit to help us root out
large folio atomic callers of migrate_folio().
Link: https://lkml.kernel.org/r/202503101536.27099c77-lkp@intel.com # [0]
Link: https://lkml.kernel.org/r/67e57c41.050a0220.2f068f.0033.GAE@google.com # [1]
Link: https://lkml.kernel.org/r/Z-c6BqCSmAnNxb57@bombadil.infradead.org # [2]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
mm/migrate.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/migrate.c b/mm/migrate.c
index f3ee6d8d5e2e..712ddd11f3f0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -751,6 +751,8 @@ static int __migrate_folio(struct address_space *mapping, struct folio *dst,
{
int rc, expected_count = folio_expected_refs(mapping, src);
+ might_sleep();
+
/* Check whether src does not have extra refs before we do more work */
if (folio_ref_count(src) != expected_count)
return -EAGAIN;
--
2.47.2
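To illustrate why the explicit might_sleep() helps, here is a minimal
userspace sketch. It is not kernel code: every toy_* name, the
atomic_depth counter and the 'annotated' switch are hypothetical
stand-ins, assuming only what the commit message states (the copy path
reaches a sleep point only for large folios, and the annotation checks
the context on every call).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the kernel's atomic-context tracking (hypothetical). */
static int atomic_depth;	/* > 0 while a toy "spin lock" is held */
static int splat_count;		/* number of sleep-in-atomic reports */

static void toy_spin_lock(void)
{
	atomic_depth++;
}

static void toy_spin_unlock(void)
{
	assert(atomic_depth > 0);
	atomic_depth--;
}

/* Models might_sleep(): report whenever we are in atomic context. */
static void toy_might_sleep(const char *where)
{
	if (atomic_depth > 0) {
		splat_count++;
		fprintf(stderr, "toy BUG: sleeping function called from "
			"invalid context at %s\n", where);
	}
}

/* Models folio_mc_copy(): only the multi-page case hits a sleep point. */
static void toy_folio_copy(long nr_pages)
{
	if (nr_pages > 1)
		toy_might_sleep("toy_folio_copy");
}

/* Models __migrate_folio(); 'annotated' selects whether this patch is applied. */
static void toy_migrate_folio(long nr_pages, bool annotated)
{
	if (annotated)
		toy_might_sleep("toy_migrate_folio");
	toy_folio_copy(nr_pages);
}
```

In this model, an atomic caller migrating a single-page folio goes
unnoticed without the annotation and is reported with it, which is the
proactive detection the patch is after.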
From 561e94951fce481bb2e5917230bec7008c131d9a Mon Sep 17 00:00:00 2001
From: Luis Chamberlain <mcgrof@kernel.org>
Date: Fri, 28 Mar 2025 17:44:10 -0700
Subject: [PATCH 2/3] fs/buffer: avoid getting buffer if it is folio migration
candidate
Avoid giving away a buffer with __find_get_block_slow() if the
folio may be a folio migration candidate. We do this as an alternative
fix for the issue addressed by commit ebdf4de5642fb6 ("mm: migrate: fix
reference check race between __find_get_block() and migration"), given
we've determined that we should avoid requiring folio migration callers
to hold a spin lock while calling __migrate_folio().
This alternative simply avoids completing __find_get_block_slow()
on folio migration candidates to let us later rip out the spin_lock()
held on the buffer_migrate_folio_norefs() path.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
fs/buffer.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/buffer.c b/fs/buffer.c
index c7abb4a029dc..6e2c3837a202 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -208,6 +208,12 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
head = folio_buffers(folio);
if (!head)
goto out_unlock;
+
+ if (folio_test_lru(folio) &&
+ folio_test_locked(folio) &&
+ !folio_test_writeback(folio))
+ goto out_unlock;
+
bh = head;
do {
if (!buffer_mapped(bh))
--
2.47.2
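The condition added above can be read as a small predicate. The sketch
below models it in userspace C so the truth table is easy to check.
struct toy_folio and toy_maybe_migration_candidate() are hypothetical
names; the real code reads the folio flags through folio_test_lru(),
folio_test_locked() and folio_test_writeback().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three folio flags the check reads. */
struct toy_folio {
	bool lru;
	bool locked;
	bool writeback;
};

/*
 * True when the lookup should bail out instead of handing out a buffer,
 * mirroring the condition added to __find_get_block_slow().
 */
static bool toy_maybe_migration_candidate(const struct toy_folio *f)
{
	return f->lru && f->locked && !f->writeback;
}
```

Note this is a heuristic: a folio can be locked and on the LRU for
reasons other than migration, so the lookup may also bail out in cases
where no migration is in flight.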
From af6963b73a8406162e6c2223fae600a799402e2b Mon Sep 17 00:00:00 2001
From: Luis Chamberlain <mcgrof@kernel.org>
Date: Fri, 28 Mar 2025 17:51:39 -0700
Subject: [PATCH 3/3] mm/migrate: avoid atomic context on
buffer_migrate_folio_norefs() migration
The buffer_migrate_folio_norefs() path should avoid holding the spin
lock across the migration so that we can support large folios. The
prior commit "fs/buffer: avoid getting buffer if it is folio migration
candidate" removed the only rationale for the atomic context, so we can
now release the spin lock right after the buffer reference check.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
mm/migrate.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index 712ddd11f3f0..f3047c685706 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -861,12 +861,12 @@ static int __buffer_migrate_folio(struct address_space *mapping,
}
bh = bh->b_this_page;
} while (bh != head);
+ spin_unlock(&mapping->i_private_lock);
if (busy) {
if (invalidated) {
rc = -EAGAIN;
goto unlock_buffers;
}
- spin_unlock(&mapping->i_private_lock);
invalidate_bh_lrus();
invalidated = true;
goto recheck_buffers;
@@ -884,8 +884,6 @@ static int __buffer_migrate_folio(struct address_space *mapping,
} while (bh != head);
unlock_buffers:
- if (check_refs)
- spin_unlock(&mapping->i_private_lock);
bh = head;
do {
unlock_buffer(bh);
--
2.47.2
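The resulting lock scope can be sketched in userspace as follows. This
is only a model under stated assumptions: the toy_* helpers, the
lock_depth counter and the -1 return value (standing in for -EAGAIN)
are hypothetical, and the invalidate_bh_lrus() retry is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static int lock_depth;		/* hypothetical model of i_private_lock */
static int slept_while_locked;	/* counts sleeps under the toy lock */

static void toy_lock(void)
{
	lock_depth++;
}

static void toy_unlock(void)
{
	assert(lock_depth > 0);
	lock_depth--;
}

/* Models the part of migration that may sleep on large folios. */
static void toy_sleeping_copy(void)
{
	if (lock_depth > 0)
		slept_while_locked++;
}

/* Scan the buffer reference counts; the lock covers only this scan. */
static bool toy_buffers_busy(const int *b_count, size_t n)
{
	bool busy = false;
	size_t i;

	toy_lock();
	for (i = 0; i < n; i++) {
		if (b_count[i]) {
			busy = true;
			break;
		}
	}
	toy_unlock();
	return busy;
}

/* Returns 0 on success, -1 (standing in for -EAGAIN) when busy. */
static int toy_buffer_migrate(const int *b_count, size_t n)
{
	if (toy_buffers_busy(b_count, n))
		return -1;
	toy_sleeping_copy();	/* runs with the toy lock released */
	return 0;
}
```

The model does not capture the window that opens between the unlock
and the copy; per the commit message, that window is what the check in
the previous patch is meant to cover.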