linux-mm.kvack.org archive mirror
* [PATCH v3 0/3] btrfs: only use bdev's page cache for super block writeback
@ 2026-01-10  3:56 Qu Wenruo
  2026-01-10  3:56 ` [PATCH v3 1/3] btrfs: use bdev_rw_virt() to read and scratch the disk super block Qu Wenruo
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Qu Wenruo @ 2026-01-10  3:56 UTC
  To: linux-btrfs, linux-mm, linux-fsdevel, linux-kernel

[CHANGELOG]
v3:
- Rebased to the latest for-next
  There are minor conflicts with the recent fix to
  read_cache_page_gfp().

- Still use the folio locked flag to track writeback
  There is a patch that reduced the size of btrfs_device to exactly 512
  bytes, so adding new wait and atomic members is no longer worthwhile.

- Delete read_cache_page_gfp() function completely
  Btrfs is the last user of that function.

v2:
- Still use page cache for super block writes
  This is to ensure user space won't see any half-baked super block
  caused by the race between bio writes and buffered reads on the bdev.

  This is exposed by generic/492, where the user space command blkid may
  fail to see the updated super block.

  This also brings a slight imbalance: our super block read is always
  uncached, but the super block write is always cached.

RFC->v1:
- Make sb_write_pointer() use bdev_rw_virt()
  That is the missing location that still uses bdev's page cache, thanks
  Johannes for exposing this one.

- Replace btrfs_release_disk_super() with kfree()
  There is no need to keep that helper, and the replacement will help
  us expose locations which are still using the old page cache, like
  the above case.

- Only scratch the magic number of a super block in
  btrfs_scratch_superblock()
  To keep the behavior the same.

- Use GFP_NOFS when allocating memory
  This is also to keep the old behavior.

  Although I'd say btrfs_read_disk_super() call sites are safe, as they
  are either scanning a device, or at mount time, thus out of the write
  path and should be safe.

  The sb_write_pointer() one still needs the old GFP_NOFS flag as it
  can be called when writing the super block.

Btrfs has a long history of using the bdev's page cache for super block
IOs. It looked even weirder in the older days, when we manually set
different page flags without going through the regular dirty -> lock ->
writeback -> clear writeback sequence.

Thankfully we're moving away from unnecessary modification of the bdev's
page flags. Starting with commit bc00965dbff7 ("btrfs: count super
block write errors in device instead of tracking folio error state"),
we no longer rely on the page cache to detect super block IO errors.

But we're still using the bdev's page cache for:

- Reading super blocks
  Reading a whole folio just to grab a 4KiB super block can be
  overkill.
  And this is the easiest one to kill: kmalloc() plus bdev_rw_virt()
  will handle it well.

- Scratching super blocks
  We can use bdev_rw_virt() to write a super block with its magic
  zeroed.

  However we also need to invalidate the cache to ensure user space
  won't see the out-of-date cached super block.

- Writing super blocks
  We're using the page cache of the bdev for a different purpose.
  We want to ensure that user space scanning tools like blkid see
  consistent content.

  If we just go the bdev_rw_virt() path, a user space read can race
  with our bio write, resulting in inconsistent contents.

  So here we still need to utilize the page cache of bdev, but with
  comments explaining why we need to.

However this brings one small change:

- Device scan is no longer cached
  For mount time it's totally fine, but every time a btrfs device is
  touched, we will submit a 4K sync read from the disk.
  The cost may not be that huge though.
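
For illustration only, here is a minimal user-space sketch of what
"scratching only the magic number" means for a scanner that inspects a
4KiB super block buffer. It is not code from this series. The helper
names sb_has_magic() and sb_scratch_magic() and the SB_* macros are
hypothetical; the magic string and its offset are the well-known btrfs
on-disk values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Well-known btrfs on-disk constants (macro names are hypothetical). */
#define SB_SIZE       4096u       /* size of one super block copy */
#define SB_MAGIC_OFF  64u         /* csum[32] + fsid[16] + bytenr + flags */
#define SB_MAGIC      "_BHRfS_M"  /* 8-byte magic, no trailing NUL on disk */
#define SB_MAGIC_LEN  8u

/* Hypothetical helper: does this 4KiB buffer carry the btrfs magic? */
static bool sb_has_magic(const unsigned char *sb)
{
	return memcmp(sb + SB_MAGIC_OFF, SB_MAGIC, SB_MAGIC_LEN) == 0;
}

/* Hypothetical helper: zero only the magic, leave every other byte alone. */
static void sb_scratch_magic(unsigned char *sb)
{
	memset(sb + SB_MAGIC_OFF, 0, SB_MAGIC_LEN);
}
```

Once the magic is zeroed, a scanner doing a check like sb_has_magic()
no longer treats the buffer as a btrfs super block, even though the
remaining fields are untouched. This only helps if the scanner reads
the updated bytes, which is why the stale cached copy has to be
invalidated after scratching.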

Qu Wenruo (3):
  btrfs: use bdev_rw_virt() to read and scratch the disk super block
  btrfs: minor improvement on super block writeback
  mm/filemap: remove read_cache_page_gfp()

 fs/btrfs/disk-io.c      | 45 +++++++++++++++----------
 fs/btrfs/super.c        |  4 +--
 fs/btrfs/volumes.c      | 74 ++++++++++++++++-------------------------
 fs/btrfs/volumes.h      |  4 +--
 fs/btrfs/zoned.c        | 26 +++++++++------
 include/linux/pagemap.h |  2 --
 mm/filemap.c            | 23 -------------
 7 files changed, 74 insertions(+), 104 deletions(-)

-- 
2.52.0



Thread overview: 6+ messages
2026-01-10  3:56 [PATCH v3 0/3] btrfs: only use bdev's page cache for super block writeback Qu Wenruo
2026-01-10  3:56 ` [PATCH v3 1/3] btrfs: use bdev_rw_virt() to read and scratch the disk super block Qu Wenruo
2026-01-10  5:56   ` Matthew Wilcox
2026-01-10  6:02     ` Qu Wenruo
2026-01-10  3:56 ` [PATCH v3 2/3] btrfs: minor improvement on super block writeback Qu Wenruo
2026-01-10  3:56 ` [PATCH v3 3/3] mm/filemap: remove read_cache_page_gfp() Qu Wenruo
