linux-mm.kvack.org archive mirror
* [PATCH] mm/readahead: read min folio constraints under invalidate lock
@ 2025-12-15 14:19 Jinchao Wang
  2025-12-15 14:22 ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Jinchao Wang @ 2025-12-15 14:19 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle),
	Andrew Morton, Christian Brauner, Hannes Reinecke,
	Luis Chamberlain, linux-fsdevel, linux-mm, linux-kernel
  Cc: stable, Jinchao Wang, syzbot+4d3cc33ef7a77041efa6,
	syzbot+fdba5cca73fee92c69d6

page_cache_ra_order() and page_cache_ra_unbounded() read mapping minimum folio
constraints before taking the invalidate lock, allowing concurrent changes to
violate page cache invariants.

Move the lookups under filemap_invalidate_lock_shared() to ensure readahead
allocations respect the mapping constraints.

Fixes: 47dd67532303 ("block/bdev: lift block size restrictions to 64k")
Reported-by: syzbot+4d3cc33ef7a77041efa6@syzkaller.appspotmail.com
Reported-by: syzbot+fdba5cca73fee92c69d6@syzkaller.appspotmail.com
Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
---
 mm/readahead.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index b415c9969176..74acd6c4f87c 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -214,7 +214,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	unsigned long index = readahead_index(ractl);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
 	unsigned long mark = ULONG_MAX, i = 0;
-	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
+	unsigned int min_nrpages;
 
 	/*
 	 * Partway through the readahead operation, we will have added
@@ -232,6 +232,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 				      lookahead_size);
 	filemap_invalidate_lock_shared(mapping);
 	index = mapping_align_index(mapping, index);
+	min_nrpages = mapping_min_folio_nrpages(mapping);
 
 	/*
 	 * As iterator `i` is aligned to min_nrpages, round_up the
@@ -467,7 +468,7 @@ void page_cache_ra_order(struct readahead_control *ractl,
 	struct address_space *mapping = ractl->mapping;
 	pgoff_t start = readahead_index(ractl);
 	pgoff_t index = start;
-	unsigned int min_order = mapping_min_folio_order(mapping);
+	unsigned int min_order;
 	pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
 	pgoff_t mark = index + ra->size - ra->async_size;
 	unsigned int nofs;
@@ -485,13 +486,16 @@ void page_cache_ra_order(struct readahead_control *ractl,
 
 	new_order = min(mapping_max_folio_order(mapping), new_order);
 	new_order = min_t(unsigned int, new_order, ilog2(ra->size));
-	new_order = max(new_order, min_order);
 
 	ra->order = new_order;
 
 	/* See comment in page_cache_ra_unbounded() */
 	nofs = memalloc_nofs_save();
 	filemap_invalidate_lock_shared(mapping);
+
+	min_order = mapping_min_folio_order(mapping);
+	new_order = max(new_order, min_order);
+
 	/*
 	 * If the new_order is greater than min_order and index is
 	 * already aligned to new_order, then this will be noop as index
-- 
2.43.0
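
For readers following the diff, a minimal userspace sketch of the order clamping in page_cache_ra_order() may help. toy_ilog2() and toy_clamp_order() are hypothetical names, not kernel functions; the three steps mirror the min()/min_t()/max() lines in the last hunk. The final argument is whichever min_order value the caller sampled, so a stale sample yields an order below the mapping's current minimum.

```c
/* Integer log2; hypothetical stand-in for the kernel's ilog2(). */
static unsigned int toy_ilog2(unsigned long n)
{
	unsigned int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/*
 * Toy model of the clamping in the diff: cap at the mapping maximum,
 * cap at log2 of the readahead window, then raise to the minimum.
 */
unsigned int toy_clamp_order(unsigned int new_order, unsigned int max_order,
			     unsigned long ra_size, unsigned int min_order)
{
	unsigned int size_order = toy_ilog2(ra_size);

	if (new_order > max_order)
		new_order = max_order;
	if (new_order > size_order)
		new_order = size_order;
	if (new_order < min_order)
		new_order = min_order;
	return new_order;
}
```

With a 32-page window and a requested order of 0, a stale min_order of 0 gives order 0, while a current min_order of 2 gives order 2.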




* Re: [PATCH] mm/readahead: read min folio constraints under invalidate lock
  2025-12-15 14:19 [PATCH] mm/readahead: read min folio constraints under invalidate lock Jinchao Wang
@ 2025-12-15 14:22 ` Matthew Wilcox
  2025-12-16  1:37   ` Jinchao Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2025-12-15 14:22 UTC (permalink / raw)
  To: Jinchao Wang
  Cc: Andrew Morton, Christian Brauner, Hannes Reinecke,
	Luis Chamberlain, linux-fsdevel, linux-mm, linux-kernel, stable,
	syzbot+4d3cc33ef7a77041efa6, syzbot+fdba5cca73fee92c69d6

On Mon, Dec 15, 2025 at 10:19:00PM +0800, Jinchao Wang wrote:
> page_cache_ra_order() and page_cache_ra_unbounded() read mapping minimum folio
> constraints before taking the invalidate lock, allowing concurrent changes to
> violate page cache invariants.
> 
> Move the lookups under filemap_invalidate_lock_shared() to ensure readahead
> allocations respect the mapping constraints.

Why are the mapping folio size constraints being changed?  They're
supposed to be set at inode instantiation and then never changed.



* Re: [PATCH] mm/readahead: read min folio constraints under invalidate lock
  2025-12-15 14:22 ` Matthew Wilcox
@ 2025-12-16  1:37   ` Jinchao Wang
  2025-12-16  2:42     ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Jinchao Wang @ 2025-12-16  1:37 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andrew Morton, Christian Brauner, Hannes Reinecke,
	Luis Chamberlain, linux-fsdevel, linux-mm, linux-kernel, stable,
	syzbot+4d3cc33ef7a77041efa6, syzbot+fdba5cca73fee92c69d6

On Mon, Dec 15, 2025 at 02:22:23PM +0000, Matthew Wilcox wrote:
> On Mon, Dec 15, 2025 at 10:19:00PM +0800, Jinchao Wang wrote:
> > page_cache_ra_order() and page_cache_ra_unbounded() read mapping minimum folio
> > constraints before taking the invalidate lock, allowing concurrent changes to
> > violate page cache invariants.
> > 
> > Move the lookups under filemap_invalidate_lock_shared() to ensure readahead
> > allocations respect the mapping constraints.
> 
> Why are the mapping folio size constraints being changed?  They're
> supposed to be set at inode instantiation and then never changed.

They can change after instantiation for block devices. In the syzbot repro:
  blkdev_ioctl() -> blkdev_bszset() -> set_blocksize() ->
  mapping_set_folio_min_order()
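
From userspace, this path is reached through the BLKBSZSET ioctl. A minimal sketch, assuming a Linux system that provides <linux/fs.h>; request_block_size() is a hypothetical helper, and this is not the syzbot reproducer, which is not shown in this thread:

```c
#include <fcntl.h>
#include <linux/fs.h>	/* BLKBSZSET */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to set the soft block size of
 * the block device at dev_path. Returns 0 on success, -1 on failure.
 */
int request_block_size(const char *dev_path, int size)
{
	int fd = open(dev_path, O_RDWR);
	int ret;

	if (fd < 0)
		return -1;

	/* The ioctl reads the requested size through a pointer to int. */
	ret = ioctl(fd, BLKBSZSET, &size);
	close(fd);
	return ret < 0 ? -1 : 0;
}
```

Issuing this while another thread reads the same device through the page cache is the kind of overlap the call chain above describes.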



* Re: [PATCH] mm/readahead: read min folio constraints under invalidate lock
  2025-12-16  1:37   ` Jinchao Wang
@ 2025-12-16  2:42     ` Matthew Wilcox
  2025-12-16  3:12       ` Jinchao Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2025-12-16  2:42 UTC (permalink / raw)
  To: Jinchao Wang
  Cc: Andrew Morton, Christian Brauner, Hannes Reinecke,
	Luis Chamberlain, linux-fsdevel, linux-mm, linux-kernel, stable,
	syzbot+4d3cc33ef7a77041efa6, syzbot+fdba5cca73fee92c69d6

On Tue, Dec 16, 2025 at 09:37:51AM +0800, Jinchao Wang wrote:
> On Mon, Dec 15, 2025 at 02:22:23PM +0000, Matthew Wilcox wrote:
> > On Mon, Dec 15, 2025 at 10:19:00PM +0800, Jinchao Wang wrote:
> > > page_cache_ra_order() and page_cache_ra_unbounded() read mapping minimum folio
> > > constraints before taking the invalidate lock, allowing concurrent changes to
> > > violate page cache invariants.
> > > 
> > > Move the lookups under filemap_invalidate_lock_shared() to ensure readahead
> > > allocations respect the mapping constraints.
> > 
> > Why are the mapping folio size constraints being changed?  They're
> > supposed to be set at inode instantiation and then never changed.
> 
> They can change after instantiation for block devices. In the syzbot repro:
>   blkdev_ioctl() -> blkdev_bszset() -> set_blocksize() ->
>   mapping_set_folio_min_order()

Oh, this is just syzbot doing stupid things.  We should probably make
blkdev_bszset() fail if somebody else has an fd open.
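
A toy model of that suggestion, as a sketch only: struct toy_bdev, its openers counter, and toy_bszset() are hypothetical and do not correspond to existing kernel code. The idea is to refuse the change with -EBUSY while any other opener exists.

```c
#include <errno.h>

/* Hypothetical model of a block device; not struct block_device. */
struct toy_bdev {
	unsigned int openers;	/* open file descriptors, caller included */
	unsigned int block_size;
};

/* Returns 0 on success, or -EBUSY if anyone else has the device open. */
int toy_bszset(struct toy_bdev *bdev, unsigned int size)
{
	if (bdev->openers > 1)
		return -EBUSY;

	bdev->block_size = size;
	return 0;
}
```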



* Re: [PATCH] mm/readahead: read min folio constraints under invalidate lock
  2025-12-16  2:42     ` Matthew Wilcox
@ 2025-12-16  3:12       ` Jinchao Wang
  2025-12-16  3:53         ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Jinchao Wang @ 2025-12-16  3:12 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andrew Morton, Christian Brauner, Hannes Reinecke,
	Luis Chamberlain, linux-fsdevel, linux-mm, linux-kernel, stable,
	syzbot+4d3cc33ef7a77041efa6, syzbot+fdba5cca73fee92c69d6

On Tue, Dec 16, 2025 at 02:42:06AM +0000, Matthew Wilcox wrote:
> On Tue, Dec 16, 2025 at 09:37:51AM +0800, Jinchao Wang wrote:
> > On Mon, Dec 15, 2025 at 02:22:23PM +0000, Matthew Wilcox wrote:
> > > On Mon, Dec 15, 2025 at 10:19:00PM +0800, Jinchao Wang wrote:
> > > > page_cache_ra_order() and page_cache_ra_unbounded() read mapping minimum folio
> > > > constraints before taking the invalidate lock, allowing concurrent changes to
> > > > violate page cache invariants.
> > > > 
> > > > Move the lookups under filemap_invalidate_lock_shared() to ensure readahead
> > > > allocations respect the mapping constraints.
> > > 
> > > Why are the mapping folio size constraints being changed?  They're
> > > supposed to be set at inode instantiation and then never changed.
> > 
> > They can change after instantiation for block devices. In the syzbot repro:
> >   blkdev_ioctl() -> blkdev_bszset() -> set_blocksize() ->
> >   mapping_set_folio_min_order()
> 
> Oh, this is just syzbot doing stupid things.  We should probably make
> blkdev_bszset() fail if somebody else has an fd open.

Thanks, that makes sense.
Tightening blkdev_bszset() would avoid the race entirely.
This change is meant as a defensive fix to prevent BUGs.



* Re: [PATCH] mm/readahead: read min folio constraints under invalidate lock
  2025-12-16  3:12       ` Jinchao Wang
@ 2025-12-16  3:53         ` Matthew Wilcox
  2025-12-18  4:03           ` Jinchao Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2025-12-16  3:53 UTC (permalink / raw)
  To: Jinchao Wang
  Cc: Andrew Morton, Christian Brauner, Hannes Reinecke,
	Luis Chamberlain, linux-fsdevel, linux-mm, linux-kernel, stable,
	syzbot+4d3cc33ef7a77041efa6, syzbot+fdba5cca73fee92c69d6

On Tue, Dec 16, 2025 at 11:12:21AM +0800, Jinchao Wang wrote:
> On Tue, Dec 16, 2025 at 02:42:06AM +0000, Matthew Wilcox wrote:
> > On Tue, Dec 16, 2025 at 09:37:51AM +0800, Jinchao Wang wrote:
> > > On Mon, Dec 15, 2025 at 02:22:23PM +0000, Matthew Wilcox wrote:
> > > > On Mon, Dec 15, 2025 at 10:19:00PM +0800, Jinchao Wang wrote:
> > > > > page_cache_ra_order() and page_cache_ra_unbounded() read mapping minimum folio
> > > > > constraints before taking the invalidate lock, allowing concurrent changes to
> > > > > violate page cache invariants.
> > > > > 
> > > > > Move the lookups under filemap_invalidate_lock_shared() to ensure readahead
> > > > > allocations respect the mapping constraints.
> > > > 
> > > > Why are the mapping folio size constraints being changed?  They're
> > > > supposed to be set at inode instantiation and then never changed.
> > > 
> > > They can change after instantiation for block devices. In the syzbot repro:
> > >   blkdev_ioctl() -> blkdev_bszset() -> set_blocksize() ->
> > >   mapping_set_folio_min_order()
> > 
> > Oh, this is just syzbot doing stupid things.  We should probably make
> > blkdev_bszset() fail if somebody else has an fd open.
> 
> Thanks, that makes sense.
> Tightening blkdev_bszset() would avoid the race entirely.
> This change is meant as a defensive fix to prevent BUGs.

Yes, but the point is that there's a lot of code which relies on
the AS_FOLIO bits not changing in the middle.  Syzbot found one of them,
but there are others.



* Re: [PATCH] mm/readahead: read min folio constraints under invalidate lock
  2025-12-16  3:53         ` Matthew Wilcox
@ 2025-12-18  4:03           ` Jinchao Wang
  0 siblings, 0 replies; 7+ messages in thread
From: Jinchao Wang @ 2025-12-18  4:03 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andrew Morton, Christian Brauner, Hannes Reinecke,
	Luis Chamberlain, linux-fsdevel, linux-mm, linux-kernel, stable,
	syzbot+4d3cc33ef7a77041efa6, syzbot+fdba5cca73fee92c69d6

On Tue, Dec 16, 2025 at 03:53:17AM +0000, Matthew Wilcox wrote:
> On Tue, Dec 16, 2025 at 11:12:21AM +0800, Jinchao Wang wrote:
> > On Tue, Dec 16, 2025 at 02:42:06AM +0000, Matthew Wilcox wrote:
> > > On Tue, Dec 16, 2025 at 09:37:51AM +0800, Jinchao Wang wrote:
> > > > On Mon, Dec 15, 2025 at 02:22:23PM +0000, Matthew Wilcox wrote:
> > > > > On Mon, Dec 15, 2025 at 10:19:00PM +0800, Jinchao Wang wrote:
> > > > > > page_cache_ra_order() and page_cache_ra_unbounded() read mapping minimum folio
> > > > > > constraints before taking the invalidate lock, allowing concurrent changes to
> > > > > > violate page cache invariants.
> > > > > > 
> > > > > > Move the lookups under filemap_invalidate_lock_shared() to ensure readahead
> > > > > > allocations respect the mapping constraints.
> > > > > 
> > > > > Why are the mapping folio size constraints being changed?  They're
> > > > > supposed to be set at inode instantiation and then never changed.
> > > > 
> > > > They can change after instantiation for block devices. In the syzbot repro:
> > > >   blkdev_ioctl() -> blkdev_bszset() -> set_blocksize() ->
> > > >   mapping_set_folio_min_order()
> > > 
> > > Oh, this is just syzbot doing stupid things.  We should probably make
> > > blkdev_bszset() fail if somebody else has an fd open.
> > 
> > Thanks, that makes sense.
> > Tightening blkdev_bszset() would avoid the race entirely.
> > This change is meant as a defensive fix to prevent BUGs.
> 
> Yes, but the point is that there's a lot of code which relies on
> the AS_FOLIO bits not changing in the middle.  Syzbot found one of them,
> but there are others.

I've been thinking about this more, and I wanted to share another
perspective if that's okay.

Rather than tracking down every place that might change AS_FOLIO bits
(like blkdev_bszset() and potentially others), what if we make the
page cache layer itself robust against such changes?

The invalidate_lock was introduced for exactly this kind of protection
(commit 730633f0b7f9 ("mm: Protect operations adding pages to page
cache with invalidate_lock")). This way, the page cache doesn't need
to rely on assumptions about what upper layers might do.

The readahead functions already hold filemap_invalidate_lock_shared(),
so moving the constraint reads under the lock adds no overhead. It
would protect against AS_FOLIO changes regardless of their source.

I think this separates concerns nicely: upper layers can change
constraints through the invalidate_lock protocol, and page cache
operations are automatically safe. But I'd really value your thoughts
on this approach - you have much more experience with these tradeoffs
than I do.

Thanks again for taking the time to discuss this.


