[PATCH v3 0/3] btrfs: only use bdev's page cache for super block writeback

From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 0/3] btrfs: only use bdev's page cache for super block writeback
Date: Sat, 10 Jan 2026 14:26:18 +1030
Message-ID: <cover.1768017091.git.wqu@suse.com>

[CHANGELOG]
v3:
- Rebased to the latest for-next
  There are minor conflicts with the recent fix on
  read_cache_page_gfp().

- Still use the folio locked flag to track writeback
  There is a patch that reduced the size of btrfs_device to exactly 512
  bytes, so adding new wait and atomic members is no longer worth it.

- Delete read_cache_page_gfp() function completely
  Btrfs is the last user of that function.

v2:
- Still use page cache for super block writes
  This is to ensure that user space won't see any half-baked super block
  caused by the race between bio writes and buffered reads on the bdev.

  This is exposed by generic/492, where the user space command blkid may
  fail to see the updated super block.

  This also brings a slight imbalance: our super block read is
  always uncached, but the super block write is always cached.

RFC->v1:
- Make sb_write_pointer() use bdev_rw_virt()
  That is the missing location that still uses bdev's page cache, thanks
  Johannes for exposing this one.

- Replace btrfs_release_disk_super() with kfree()
  There is no need to keep that helper, and such a replacement will help
  us expose locations which are still using the old page cache, like the
  above case.

- Only scratch the magic number of a super block in
  btrfs_scratch_superblock()
  To keep the behavior the same.

- Use GFP_NOFS when allocating memory
  This is also to keep the old behavior.

  Although I'd say the btrfs_read_disk_super() call sites are safe, as
  they are either scanning a device or running at mount time, and thus
  out of the write path.

  The sb_write_pointer() one still needs the old GFP_NOFS flag as it
  can be called when writing the super block.

Btrfs has a long history of using bdev's page cache for super block IOs.
It looked even weirder in the older days, when we were manually setting
different page flags without going through the regular dirty -> lock ->
writeback -> clear writeback sequence.

Thankfully we're moving away from unnecessary modification of bdev's
page flags. Starting with commit bc00965dbff7 ("btrfs: count super
block write errors in device instead of tracking folio error state"),
we no longer rely on the page cache to detect super block IO errors.

But we're still using the bdev's page cache for:

- Reading super blocks
  Reading a whole folio just to grab a 4KiB super block can be
  overkill.
  And this is the easiest one to kill: kmalloc() plus bdev_rw_virt() will
  handle it well.

- Scratching super blocks
  We can use bdev_rw_virt() to write a super block with its magic
  zeroed.

  However we also need to invalidate the cache to ensure the user space
  won't see the out-of-date cached super block.

- Writing super blocks
  We're using the page cache of bdev here for a different purpose:
  we want to ensure that user space scanning tools like blkid see
  consistent content.

  If we just go the bdev_rw_virt() path, a user space read can race
  with our bio write, resulting in inconsistent contents.

  So here we still need to utilize the page cache of bdev, but with
  comments explaining why we need to.

However this brings one small change:

- Device scan is no longer cached
  For mount time it's totally fine, but every time a btrfs device is
  touched, we will submit a 4K sync read from the disk.
  The cost may not be that huge though.
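
For readers who want a feel for the read and scratch paths described
above, here is a rough user space analogue. It is not the code from this
series: malloc()/pread()/pwrite() stand in for kmalloc()/bdev_rw_virt(),
and the helper names (read_super(), super_has_magic(),
scratch_super_magic()) are made up for illustration. It only mirrors the
shape of the logic (allocate a 4KiB buffer, read at the fixed primary
super block offset, zero just the magic, write the block back) and says
nothing about the caching behaviour, since user space IO on a bdev still
goes through its page cache.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for pread()/pwrite() under strict -std= modes */
#endif
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SB_OFFSET	(64 * 1024)	/* primary btrfs super block offset */
#define SB_SIZE		4096		/* on-disk super block size */
#define SB_MAGIC_OFF	64		/* magic follows csum, fsid, bytenr, flags */
#define SB_MAGIC	"_BHRfS_M"
#define SB_MAGIC_LEN	8

/*
 * Hypothetical helper: read the primary super block into a freshly
 * allocated 4KiB buffer. Returns NULL on allocation failure or short
 * read. The caller releases the buffer with a plain free().
 */
static void *read_super(int fd)
{
	void *sb = malloc(SB_SIZE);

	if (!sb)
		return NULL;
	if (pread(fd, sb, SB_SIZE, SB_OFFSET) != (ssize_t)SB_SIZE) {
		free(sb);
		return NULL;
	}
	return sb;
}

/* Hypothetical helper: return 1 if the buffer carries the btrfs magic. */
static int super_has_magic(const void *sb)
{
	return memcmp((const char *)sb + SB_MAGIC_OFF, SB_MAGIC,
		      SB_MAGIC_LEN) == 0;
}

/*
 * Hypothetical helper: write the super block back with only its magic
 * zeroed, leaving every other byte untouched. Returns 0 on success and
 * -1 on failure.
 */
static int scratch_super_magic(int fd)
{
	void *sb = read_super(fd);
	int ret = -1;

	if (!sb)
		return -1;
	memset((char *)sb + SB_MAGIC_OFF, 0, SB_MAGIC_LEN);
	if (pwrite(fd, sb, SB_SIZE, SB_OFFSET) == (ssize_t)SB_SIZE)
		ret = 0;
	free(sb);
	return ret;
}
```

Every call to read_super() here issues one 4KiB read at a fixed offset,
which is the per-scan cost mentioned above.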

Qu Wenruo (3):
  btrfs: use bdev_rw_virt() to read and scratch the disk super block
  btrfs: minor improvement on super block writeback
  mm/filemap: remove read_cache_page_gfp()

 fs/btrfs/disk-io.c      | 45 +++++++++++++++----------
 fs/btrfs/super.c        |  4 +--
 fs/btrfs/volumes.c      | 74 ++++++++++++++++-------------------------
 fs/btrfs/volumes.h      |  4 +--
 fs/btrfs/zoned.c        | 26 +++++++++------
 include/linux/pagemap.h |  2 --
 mm/filemap.c            | 23 -------------
 7 files changed, 74 insertions(+), 104 deletions(-)

-- 
2.52.0

