* [PATCH v3 0/3] btrfs: only use bdev's page cache for super block writeback
@ 2026-01-10 3:56 Qu Wenruo
2026-01-10 3:56 ` [PATCH v3 1/3] btrfs: use bdev_rw_virt() to read and scratch the disk super block Qu Wenruo
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Qu Wenruo @ 2026-01-10 3:56 UTC (permalink / raw)
To: linux-btrfs, linux-mm, linux-fsdevel, linux-kernel
[CHANGELOG]
v3:
- Rebased to the latest for-next
There are minor conflicts with the recent fix to
read_cache_page_gfp().
- Still use the folio locked flag to track writeback
A recent patch reduced the size of btrfs_device to exactly 512
bytes, so adding a new wait and atomic member is no longer worth it.
- Delete read_cache_page_gfp() function completely
Btrfs is the last user of that function.
v2:
- Still use page cache for super block writes
This is to ensure user space won't see a half-written super block
caused by the race between bio writes and buffered reads on the bdev.
This was exposed by generic/492, where the user space command blkid
may fail to see the updated super block.
This also brings a slight imbalance: our super block read is always
uncached, but the super block write is always cached.
RFC->v1:
- Make sb_write_pointer() use bdev_rw_virt()
That is the missing location that still uses bdev's page cache, thanks
Johannes for exposing this one.
- Replace btrfs_release_disk_super() with kfree()
There is no need to keep that helper, and the replacement helps
expose locations that still use the old page cache, like the
above case.
- Only scratch the magic number of a super block in
btrfs_scratch_superblock()
To keep the behavior the same.
- Use GFP_NOFS when allocating memory
This is also to keep the old behavior.
Although I'd say the btrfs_read_disk_super() call sites are safe, as
they run either during a device scan or at mount time, and are thus
outside the write path.
The sb_write_pointer() one still needs the old GFP_NOFS flag, as it
can be called when writing the super block.
Btrfs has a long history of using the bdev's page cache for super block IO.
It looked even weirder in the old days, when we manually set different
page flags without going through the regular dirty -> lock -> writeback
-> clear writeback sequence.
Thankfully we're moving away from unnecessary modification of the bdev's
page flags. Starting with commit bc00965dbff7 ("btrfs: count super
block write errors in device instead of tracking folio error state"),
we no longer rely on the page cache to detect super block IO errors.
But we're still using the bdev's page cache for:
- Reading super blocks
Reading a whole folio just to grab a 4KiB super block can be
overkill.
And this is the easiest one to kill: kmalloc() plus bdev_rw_virt()
handles it well.
- Scratching super blocks
We can use bdev_rw_virt() to write a super block with its magic
zeroed.
However we also need to invalidate the cache to ensure the user space
won't see the out-of-date cached super block.
- Writing super blocks
We're using the page cache of the bdev here, for a different purpose.
We want to ensure user space scanning tools like blkid see
consistent content.
If we just go the bdev_rw_virt() path, a user space read can race
with our bio write, resulting in inconsistent contents.
So here we still need to utilize the page cache of the bdev, but with
comments explaining why we need to.
However this brings one small change:
- Device scan is no longer cached
For mount time it's totally fine, but every time a btrfs device is
touched, we will submit a 4K sync read from the disk.
The cost may not be that huge though.
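As a rough illustration of the uncached read described above, here is a minimal user-space sketch (not the kernel code): it allocates a 4KiB buffer, reads one super block copy at its byte offset, and validates the magic and the recorded bytenr. The names sb_offset(), get_le64() and read_super() are hypothetical; pread() stands in for bdev_rw_virt(), and the field offsets assume the usual btrfs on-disk layout (csum[32], fsid[16], bytenr, flags, magic).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define SUPER_INFO_SIZE   4096u                  /* BTRFS_SUPER_INFO_SIZE */
#define SUPER_INFO_OFFSET 65536ull               /* primary copy at 64KiB */
#define SUPER_MAGIC       0x4D5F53665248425FULL  /* "_BHRfS_M" */
#define OFF_BYTENR        48                     /* after csum[32] + fsid[16] */
#define OFF_MAGIC         64                     /* after bytenr + flags */

/* Byte offset of super block copy @copy_num (0 is the primary copy). */
static uint64_t sb_offset(int copy_num)
{
	assert(copy_num >= 0 && copy_num < 3);
	return copy_num ? (16384ull << (12 * copy_num)) : SUPER_INFO_OFFSET;
}

/* Decode a little-endian u64 regardless of host byte order. */
static uint64_t get_le64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Read one super block copy into a freshly allocated buffer.
 * Returns the buffer (caller frees it) or NULL on any failure.
 */
static unsigned char *read_super(int fd, int copy_num)
{
	const uint64_t bytenr = sb_offset(copy_num);
	unsigned char *super = malloc(SUPER_INFO_SIZE);

	if (!super)
		return NULL;
	if (pread(fd, super, SUPER_INFO_SIZE, (off_t)bytenr) !=
	    (ssize_t)SUPER_INFO_SIZE ||
	    get_le64(super + OFF_MAGIC) != SUPER_MAGIC ||
	    get_le64(super + OFF_BYTENR) != bytenr) {
		free(super);
		return NULL;
	}
	return super;
}
```

Note that in user space pread() on a block device still goes through the bdev's page cache, so this only mirrors the allocate, read and validate shape of the kernel change, not its caching behavior.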
Qu Wenruo (3):
btrfs: use bdev_rw_virt() to read and scratch the disk super block
btrfs: minor improvement on super block writeback
mm/filemap: remove read_cache_page_gfp()
fs/btrfs/disk-io.c | 45 +++++++++++++++----------
fs/btrfs/super.c | 4 +--
fs/btrfs/volumes.c | 74 ++++++++++++++++-------------------------
fs/btrfs/volumes.h | 4 +--
fs/btrfs/zoned.c | 26 +++++++++------
include/linux/pagemap.h | 2 --
mm/filemap.c | 23 -------------
7 files changed, 74 insertions(+), 104 deletions(-)
--
2.52.0
* [PATCH v3 1/3] btrfs: use bdev_rw_virt() to read and scratch the disk super block
2026-01-10 3:56 [PATCH v3 0/3] btrfs: only use bdev's page cache for super block writeback Qu Wenruo
@ 2026-01-10 3:56 ` Qu Wenruo
2026-01-10 5:56 ` Matthew Wilcox
2026-01-10 3:56 ` [PATCH v3 2/3] btrfs: minor improvement on super block writeback Qu Wenruo
2026-01-10 3:56 ` [PATCH v3 3/3] mm/filemap: remove read_cache_page_gfp() Qu Wenruo
2 siblings, 1 reply; 6+ messages in thread
From: Qu Wenruo @ 2026-01-10 3:56 UTC (permalink / raw)
To: linux-btrfs, linux-mm, linux-fsdevel, linux-kernel; +Cc: Johannes Thumshirn
Currently we're using the block device page cache to read and scratch
the super block.
But that means we're reading the whole folio to grab just the super
block. This can be unnecessary, especially now that the bdev's page cache
supports large folios, not to mention systems with a page size larger
than 4K.
Furthermore, read_cache_page*() can race with device block size setting,
and thus requires extra locking.
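To put rough numbers on the whole-folio read mentioned above, the following sketch computes the page cache index and the sector number for a given byte offset, plus how much more data a folio-sized read moves than a 4KiB read. The helper names are hypothetical, and the model assumes the cached read fills exactly one folio of the given size.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT    9      /* 512-byte sectors, as in the kernel */
#define SUPER_INFO_SIZE 4096u  /* BTRFS_SUPER_INFO_SIZE */

/* Page cache index covering byte offset @bytenr for a given page size. */
static uint64_t page_index(uint64_t bytenr, unsigned int page_shift)
{
	return bytenr >> page_shift;
}

/* Sector number a direct read (bdev_rw_virt() style) would start at. */
static uint64_t sector_nr(uint64_t bytenr)
{
	return bytenr >> SECTOR_SHIFT;
}

/* Ratio of bytes read for a whole folio versus a 4KiB super block. */
static uint64_t read_amplification(unsigned int folio_shift)
{
	assert(folio_shift >= 12 && folio_shift < 64);
	return (1ull << folio_shift) / SUPER_INFO_SIZE;
}
```

For the primary super block at 64KiB, this gives page index 16 with 4K pages, page index 1 with 64K pages, and sector 128 for the direct read; a 64K folio reads 16 times more data than the super block needs.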
Modify the following routines by:
- Use kmalloc() + bdev_rw_virt() for btrfs_read_disk_super()
This means we can easily replace btrfs_release_disk_super() with a
simple kfree().
This also means there will no longer be any cached read for
btrfs_read_disk_super(), thus we can drop the @drop_cache parameter.
However, this brings a slight behavior change for
btrfs_scan_one_device(): now every time the device is scanned, btrfs
will submit a read request, no more cached scan.
- Use bdev_rw_virt() for btrfs_scratch_superblock()
Just use the memory returned by btrfs_read_disk_super() and reset the
magic number.
Then use bdev_rw_virt() to do the write.
And since we're using bio to submit writes directly to the device, not
using page cache anymore, after scratching the super block we also
have to invalidate the cache to avoid user space seeing the
out-of-date cached super block.
- Use kmalloc() and bdev_rw_virt() for sb_write_pointer()
In zoned mode we have a corner case where both super block zones are
full, and we need to determine which zone to reuse.
In that case we need to read the last super block of both zones and
compare their generations.
Here we just use regular kmalloc() + bdev_rw_virt() to do the read.
And since we're here, simplify the error handling path by always
calling kfree() on both super blocks.
Since both super block pointers are initialized to NULL, we're safe to
call kfree() on them.
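The "both zones full" case above can be sketched in user space as follows. This is only an illustration: struct fake_super, read_last_super() and the demo_* knobs are hypothetical stand-ins for the real super block and for bdev_rw_virt(), and the cleanup uses a single exit label rather than the per-branch kfree() calls in the patch. It relies on the same property, that freeing a NULL pointer is a no-op.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_SB_LOG_ZONES 2

struct fake_super {
	uint64_t generation;
};

/* Knobs for the stand-in reader below (hypothetical, for illustration). */
static uint64_t demo_generation[NR_SB_LOG_ZONES];
static int demo_fail_zone = -1;

/* Stand-in for reading the last super block of @zone from the device. */
static int read_last_super(int zone, struct fake_super *out)
{
	if (zone == demo_fail_zone)
		return -EIO;
	out->generation = demo_generation[zone];
	return 0;
}

/* Pick the zone holding the older super block, which is the one to reuse. */
static int pick_zone_to_reuse(int *zone)
{
	struct fake_super *super[NR_SB_LOG_ZONES] = { NULL };
	int ret = 0;

	assert(zone != NULL);
	for (int i = 0; i < NR_SB_LOG_ZONES; i++) {
		super[i] = malloc(sizeof(*super[i]));
		if (!super[i]) {
			ret = -ENOMEM;
			goto out;
		}
		ret = read_last_super(i, super[i]);
		if (ret < 0)
			goto out;
	}
	/* A newer super block in zone 0 means zone 1 is the one to reuse. */
	*zone = super[0]->generation > super[1]->generation ? 1 : 0;
out:
	/* Pointers start as NULL, so free() is safe on every path. */
	for (int i = 0; i < NR_SB_LOG_ZONES; i++)
		free(super[i]);
	return ret;
}
```

With generations 10 and 9 the sketch picks zone 1; if reading zone 1 fails, it returns the error and leaves the caller's zone value untouched.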
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/disk-io.c | 8 ++---
fs/btrfs/super.c | 4 +--
fs/btrfs/volumes.c | 74 ++++++++++++++++++----------------------------
fs/btrfs/volumes.h | 4 +--
fs/btrfs/zoned.c | 26 +++++++++-------
5 files changed, 51 insertions(+), 65 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7ce7afe2bdaf..0dd77b56dfdf 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3269,7 +3269,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
/*
* Read super block and check the signature bytes only
*/
- disk_super = btrfs_read_disk_super(fs_devices->latest_dev->bdev, 0, false);
+ disk_super = btrfs_read_disk_super(fs_devices->latest_dev->bdev, 0);
if (IS_ERR(disk_super)) {
ret = PTR_ERR(disk_super);
goto fail_alloc;
@@ -3285,7 +3285,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
btrfs_err(fs_info, "unsupported checksum algorithm: %u",
csum_type);
ret = -EINVAL;
- btrfs_release_disk_super(disk_super);
+ kfree(disk_super);
goto fail_alloc;
}
@@ -3301,7 +3301,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
if (btrfs_check_super_csum(fs_info, disk_super)) {
btrfs_err(fs_info, "superblock checksum mismatch");
ret = -EINVAL;
- btrfs_release_disk_super(disk_super);
+ kfree(disk_super);
goto fail_alloc;
}
@@ -3311,7 +3311,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
* the whole block of INFO_SIZE
*/
memcpy(fs_info->super_copy, disk_super, sizeof(*fs_info->super_copy));
- btrfs_release_disk_super(disk_super);
+ kfree(disk_super);
disk_super = fs_info->super_copy;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d64d303b6edc..f884260d7233 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2317,7 +2317,7 @@ static int check_dev_super(struct btrfs_device *dev)
return 0;
/* Only need to check the primary super block. */
- sb = btrfs_read_disk_super(dev->bdev, 0, true);
+ sb = btrfs_read_disk_super(dev->bdev, 0);
if (IS_ERR(sb))
return PTR_ERR(sb);
@@ -2349,7 +2349,7 @@ static int check_dev_super(struct btrfs_device *dev)
goto out;
}
out:
- btrfs_release_disk_super(sb);
+ kfree(sb);
return ret;
}
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 908a89eaeabf..2969e2b96538 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -495,7 +495,7 @@ btrfs_get_bdev_and_sb(const char *device_path, blk_mode_t flags, void *holder,
}
}
invalidate_bdev(bdev);
- *disk_super = btrfs_read_disk_super(bdev, 0, false);
+ *disk_super = btrfs_read_disk_super(bdev, 0);
if (IS_ERR(*disk_super)) {
ret = PTR_ERR(*disk_super);
bdev_fput(*bdev_file);
@@ -716,12 +716,12 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
fs_devices->rw_devices++;
list_add_tail(&device->dev_alloc_list, &fs_devices->alloc_list);
}
- btrfs_release_disk_super(disk_super);
+ kfree(disk_super);
return 0;
error_free_page:
- btrfs_release_disk_super(disk_super);
+ kfree(disk_super);
bdev_fput(bdev_file);
return -EINVAL;
@@ -1325,20 +1325,11 @@ int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
return ret;
}
-void btrfs_release_disk_super(struct btrfs_super_block *super)
-{
- struct page *page = virt_to_page(super);
-
- put_page(page);
-}
-
struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev,
- int copy_num, bool drop_cache)
+ int copy_num)
{
struct btrfs_super_block *super;
- struct page *page;
u64 bytenr, bytenr_orig;
- struct address_space *mapping = bdev->bd_mapping;
int ret;
bytenr_orig = btrfs_sb_offset(copy_num);
@@ -1352,28 +1343,19 @@ struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev,
if (bytenr + BTRFS_SUPER_INFO_SIZE >= bdev_nr_bytes(bdev))
return ERR_PTR(-EINVAL);
- if (drop_cache) {
- /* This should only be called with the primary sb. */
- ASSERT(copy_num == 0);
-
- /*
- * Drop the page of the primary superblock, so later read will
- * always read from the device.
- */
- invalidate_inode_pages2_range(mapping, bytenr >> PAGE_SHIFT,
- (bytenr + BTRFS_SUPER_INFO_SIZE) >> PAGE_SHIFT);
+ super = kmalloc(BTRFS_SUPER_INFO_SIZE, GFP_NOFS);
+ if (!super)
+ return ERR_PTR(-ENOMEM);
+ ret = bdev_rw_virt(bdev, bytenr >> SECTOR_SHIFT, super, BTRFS_SUPER_INFO_SIZE,
+ REQ_OP_READ);
+ if (ret < 0) {
+ kfree(super);
+ return ERR_PTR(ret);
}
- filemap_invalidate_lock(mapping);
- page = read_cache_page_gfp(mapping, bytenr >> PAGE_SHIFT, GFP_NOFS);
- filemap_invalidate_unlock(mapping);
- if (IS_ERR(page))
- return ERR_CAST(page);
-
- super = page_address(page);
if (btrfs_super_magic(super) != BTRFS_MAGIC ||
btrfs_super_bytenr(super) != bytenr_orig) {
- btrfs_release_disk_super(super);
+ kfree(super);
return ERR_PTR(-EINVAL);
}
@@ -1474,7 +1456,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path,
if (IS_ERR(bdev_file))
return ERR_CAST(bdev_file);
- disk_super = btrfs_read_disk_super(file_bdev(bdev_file), 0, false);
+ disk_super = btrfs_read_disk_super(file_bdev(bdev_file), 0);
if (IS_ERR(disk_super)) {
device = ERR_CAST(disk_super);
goto error_bdev_put;
@@ -1496,7 +1478,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path,
btrfs_free_stale_devices(device->devt, device);
free_disk_super:
- btrfs_release_disk_super(disk_super);
+ kfree(disk_super);
error_bdev_put:
bdev_fput(bdev_file);
@@ -2119,20 +2101,22 @@ static void btrfs_scratch_superblock(struct btrfs_fs_info *fs_info,
struct block_device *bdev, int copy_num)
{
struct btrfs_super_block *disk_super;
- const size_t len = sizeof(disk_super->magic);
const u64 bytenr = btrfs_sb_offset(copy_num);
int ret;
- disk_super = btrfs_read_disk_super(bdev, copy_num, false);
- if (IS_ERR(disk_super))
- return;
-
- memset(&disk_super->magic, 0, len);
- folio_mark_dirty(virt_to_folio(disk_super));
- btrfs_release_disk_super(disk_super);
-
- ret = sync_blockdev_range(bdev, bytenr, bytenr + len - 1);
- if (ret)
+ disk_super = btrfs_read_disk_super(bdev, copy_num);
+ if (IS_ERR(disk_super)) {
+ ret = PTR_ERR(disk_super);
+ goto out;
+ }
+ btrfs_set_super_magic(disk_super, 0);
+ ret = bdev_rw_virt(bdev, bytenr >> SECTOR_SHIFT, disk_super,
+ BTRFS_SUPER_INFO_SIZE, REQ_OP_WRITE);
+ kfree(disk_super);
+out:
+ /* Make sure userspace won't see some out-of-date cached super block. */
+ invalidate_bdev(bdev);
+ if (ret < 0)
btrfs_warn(fs_info, "error clearing superblock number %d (%d)",
copy_num, ret);
}
@@ -2462,7 +2446,7 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,
memcpy(args->fsid, disk_super->metadata_uuid, BTRFS_FSID_SIZE);
else
memcpy(args->fsid, disk_super->fsid, BTRFS_FSID_SIZE);
- btrfs_release_disk_super(disk_super);
+ kfree(disk_super);
bdev_fput(bdev_file);
return 0;
}
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 93f45410931e..6381420800fb 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -780,9 +780,7 @@ struct btrfs_chunk_map *btrfs_get_chunk_map(struct btrfs_fs_info *fs_info,
u64 logical, u64 length);
void btrfs_remove_chunk_map(struct btrfs_fs_info *fs_info, struct btrfs_chunk_map *map);
struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev,
- int copy_num, bool drop_cache);
-void btrfs_release_disk_super(struct btrfs_super_block *super);
-
+ int copy_num);
static inline void btrfs_dev_stat_inc(struct btrfs_device *dev,
int index)
{
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 2e861eef5cd8..301e342776b2 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -122,23 +122,27 @@ static int sb_write_pointer(struct block_device *bdev, struct blk_zone *zones,
return -ENOENT;
} else if (full[0] && full[1]) {
/* Compare two super blocks */
- struct address_space *mapping = bdev->bd_mapping;
- struct page *page[BTRFS_NR_SB_LOG_ZONES];
- struct btrfs_super_block *super[BTRFS_NR_SB_LOG_ZONES];
+ struct btrfs_super_block *super[BTRFS_NR_SB_LOG_ZONES] = { 0 };
for (int i = 0; i < BTRFS_NR_SB_LOG_ZONES; i++) {
u64 zone_end = (zones[i].start + zones[i].capacity) << SECTOR_SHIFT;
u64 bytenr = ALIGN_DOWN(zone_end, BTRFS_SUPER_INFO_SIZE) -
BTRFS_SUPER_INFO_SIZE;
+ int ret;
- page[i] = read_cache_page_gfp(mapping,
- bytenr >> PAGE_SHIFT, GFP_NOFS);
- if (IS_ERR(page[i])) {
- if (i == 1)
- btrfs_release_disk_super(super[0]);
- return PTR_ERR(page[i]);
+ super[i] = kmalloc(BTRFS_SUPER_INFO_SIZE, GFP_NOFS);
+ if (!super[i]) {
+ kfree(super[0]);
+ kfree(super[1]);
+ return -ENOMEM;
+ }
+ ret = bdev_rw_virt(bdev, bytenr >> SECTOR_SHIFT, super[i],
+ BTRFS_SUPER_INFO_SIZE, REQ_OP_READ);
+ if (ret < 0) {
+ kfree(super[0]);
+ kfree(super[1]);
+ return ret;
}
- super[i] = page_address(page[i]);
}
if (btrfs_super_generation(super[0]) >
@@ -148,7 +152,7 @@ static int sb_write_pointer(struct block_device *bdev, struct blk_zone *zones,
sector = zones[0].start;
for (int i = 0; i < BTRFS_NR_SB_LOG_ZONES; i++)
- btrfs_release_disk_super(super[i]);
+ kfree(super[i]);
} else if (!full[0] && (empty[1] || full[1])) {
sector = zones[0].wp;
} else if (full[0]) {
--
2.52.0
* [PATCH v3 2/3] btrfs: minor improvement on super block writeback
2026-01-10 3:56 [PATCH v3 0/3] btrfs: only use bdev's page cache for super block writeback Qu Wenruo
2026-01-10 3:56 ` [PATCH v3 1/3] btrfs: use bdev_rw_virt() to read and scratch the disk super block Qu Wenruo
@ 2026-01-10 3:56 ` Qu Wenruo
2026-01-10 3:56 ` [PATCH v3 3/3] mm/filemap: remove read_cache_page_gfp() Qu Wenruo
2 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2026-01-10 3:56 UTC (permalink / raw)
To: linux-btrfs, linux-mm, linux-fsdevel, linux-kernel
This includes:
- Move the write error handling out of the folio iteration
This is not a big deal, since our super block is never going to be
larger than a single page.
- Add a comment on why we want to lock the folio for writeback
And the fact that we use folio locked state to track if the write has
finished.
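A small user-space model of the reordered completion handler may help show the effect of the first change. Everything here is a hypothetical stand-in: struct fake_device replaces btrfs_device, the folio loop only counts unlocks, the primary flag stands for REQ_FUA, and SUPER_PRIMARY_WRITE_ERROR is assumed to match the kernel's (INT_MAX / 2) definition.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed to mirror BTRFS_SUPER_PRIMARY_WRITE_ERROR in the kernel. */
#define SUPER_PRIMARY_WRITE_ERROR (INT_MAX / 2)

struct fake_device {
	atomic_int sb_write_errors;
	int folios_unlocked;
};

/*
 * Model of the end-io handler: release every folio first, then account
 * the bio status once per bio instead of once per folio.
 */
static void end_super_write(struct fake_device *dev, int nr_folios,
			    int status, bool primary)
{
	assert(nr_folios >= 0);
	for (int i = 0; i < nr_folios; i++)
		dev->folios_unlocked++;	/* folio_unlock() + folio_put() */

	if (status) {
		/* A failed primary super block must fail the whole write. */
		if (primary)
			atomic_fetch_add(&dev->sb_write_errors,
					 SUPER_PRIMARY_WRITE_ERROR);
		else
			atomic_fetch_add(&dev->sb_write_errors, 1);
	}
}
```

In this model a failed bio spanning two folios bumps the error counter by one, whereas accounting inside the loop would bump it twice. As the commit message notes, the super block never exceeds a single page, so the difference is cosmetic in practice.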
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/disk-io.c | 37 +++++++++++++++++++++++--------------
1 file changed, 23 insertions(+), 14 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 0dd77b56dfdf..96b7b71e6911 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3650,24 +3650,24 @@ static void btrfs_end_super_write(struct bio *bio)
struct folio_iter fi;
bio_for_each_folio_all(fi, bio) {
- if (bio->bi_status) {
- btrfs_warn_rl(device->fs_info,
- "lost super block write due to IO error on %s (%d)",
- btrfs_dev_name(device),
- blk_status_to_errno(bio->bi_status));
- btrfs_dev_stat_inc_and_print(device,
- BTRFS_DEV_STAT_WRITE_ERRS);
- /* Ensure failure if the primary sb fails. */
- if (bio->bi_opf & REQ_FUA)
- atomic_add(BTRFS_SUPER_PRIMARY_WRITE_ERROR,
- &device->sb_write_errors);
- else
- atomic_inc(&device->sb_write_errors);
- }
folio_unlock(fi.folio);
folio_put(fi.folio);
}
+ if (bio->bi_status) {
+ btrfs_warn_rl(device->fs_info,
+ "lost super block write due to IO error on %s (%d)",
+ btrfs_dev_name(device),
+ blk_status_to_errno(bio->bi_status));
+ btrfs_dev_stat_inc_and_print(device,
+ BTRFS_DEV_STAT_WRITE_ERRS);
+ /* Ensure failure if the primary sb fails. */
+ if (bio->bi_opf & REQ_FUA)
+ atomic_add(BTRFS_SUPER_PRIMARY_WRITE_ERROR,
+ &device->sb_write_errors);
+ else
+ atomic_inc(&device->sb_write_errors);
+ }
bio_put(bio);
}
@@ -3721,6 +3721,15 @@ static int write_dev_supers(struct btrfs_device *device,
btrfs_csum(fs_info->csum_type, (const u8 *)sb + BTRFS_CSUM_SIZE,
BTRFS_SUPER_INFO_SIZE - BTRFS_CSUM_SIZE, sb->csum);
+ /*
+ * Lock the folio containing the super block.
+ *
+ * This will prevent user space dev scan from getting half-backed
+ * super block.
+ *
+ * Also keep the folio locked until write finished, as a way to
+ * track if the write has finished.
+ */
folio = __filemap_get_folio(mapping, bytenr >> PAGE_SHIFT,
FGP_LOCK | FGP_ACCESSED | FGP_CREAT,
GFP_NOFS);
--
2.52.0
* [PATCH v3 3/3] mm/filemap: remove read_cache_page_gfp()
2026-01-10 3:56 [PATCH v3 0/3] btrfs: only use bdev's page cache for super block writeback Qu Wenruo
2026-01-10 3:56 ` [PATCH v3 1/3] btrfs: use bdev_rw_virt() to read and scratch the disk super block Qu Wenruo
2026-01-10 3:56 ` [PATCH v3 2/3] btrfs: minor improvement on super block writeback Qu Wenruo
@ 2026-01-10 3:56 ` Qu Wenruo
2 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2026-01-10 3:56 UTC (permalink / raw)
To: linux-btrfs, linux-mm, linux-fsdevel, linux-kernel
The last user of this function is btrfs, which has migrated to use
bdev_rw_virt().
So there is no need to keep that function.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
include/linux/pagemap.h | 2 --
mm/filemap.c | 23 -----------------------
2 files changed, 25 deletions(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 31a848485ad9..2efbf6c55a96 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1002,8 +1002,6 @@ struct folio *mapping_read_folio_gfp(struct address_space *, pgoff_t index,
gfp_t flags);
struct page *read_cache_page(struct address_space *, pgoff_t index,
filler_t *filler, struct file *file);
-extern struct page * read_cache_page_gfp(struct address_space *mapping,
- pgoff_t index, gfp_t gfp_mask);
static inline struct page *read_mapping_page(struct address_space *mapping,
pgoff_t index, struct file *file)
diff --git a/mm/filemap.c b/mm/filemap.c
index ebd75684cb0a..cd167aa45934 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -4173,29 +4173,6 @@ struct page *read_cache_page(struct address_space *mapping,
}
EXPORT_SYMBOL(read_cache_page);
-/**
- * read_cache_page_gfp - read into page cache, using specified page allocation flags.
- * @mapping: the page's address_space
- * @index: the page index
- * @gfp: the page allocator flags to use if allocating
- *
- * This is the same as "read_mapping_page(mapping, index, NULL)", but with
- * any new page allocations done using the specified allocation flags.
- *
- * If the page does not get brought uptodate, return -EIO.
- *
- * The function expects mapping->invalidate_lock to be already held.
- *
- * Return: up to date page on success, ERR_PTR() on failure.
- */
-struct page *read_cache_page_gfp(struct address_space *mapping,
- pgoff_t index,
- gfp_t gfp)
-{
- return do_read_cache_page(mapping, index, NULL, NULL, gfp);
-}
-EXPORT_SYMBOL(read_cache_page_gfp);
-
/*
* Warn about a page cache invalidation failure during a direct I/O write.
*/
--
2.52.0
* Re: [PATCH v3 1/3] btrfs: use bdev_rw_virt() to read and scratch the disk super block
2026-01-10 3:56 ` [PATCH v3 1/3] btrfs: use bdev_rw_virt() to read and scratch the disk super block Qu Wenruo
@ 2026-01-10 5:56 ` Matthew Wilcox
2026-01-10 6:02 ` Qu Wenruo
0 siblings, 1 reply; 6+ messages in thread
From: Matthew Wilcox @ 2026-01-10 5:56 UTC (permalink / raw)
To: Qu Wenruo
Cc: linux-btrfs, linux-mm, linux-fsdevel, linux-kernel, Johannes Thumshirn
On Sat, Jan 10, 2026 at 02:26:19PM +1030, Qu Wenruo wrote:
> Furthermore read_cache_page*() can race with device block size setting,
> thus requires extra locking.
What? There's supposed to be sufficient locking to prevent this.
Is there a bug report I can look at?
* Re: [PATCH v3 1/3] btrfs: use bdev_rw_virt() to read and scratch the disk super block
2026-01-10 5:56 ` Matthew Wilcox
@ 2026-01-10 6:02 ` Qu Wenruo
0 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2026-01-10 6:02 UTC (permalink / raw)
To: Matthew Wilcox
Cc: linux-btrfs, linux-mm, linux-fsdevel, linux-kernel, Johannes Thumshirn
On 2026/1/10 16:26, Matthew Wilcox wrote:
> On Sat, Jan 10, 2026 at 02:26:19PM +1030, Qu Wenruo wrote:
>> Furthermore read_cache_page*() can race with device block size setting,
>> thus requires extra locking.
>
> What? There's supposed to be sufficient locking to prevent this.
> Is there a bug report I can look at?
The comment of read_cache_page_gfp() already mentions that the invalidate
lock is required, but we didn't hold it inside btrfs. That is already
fixed now, so there is nothing to worry about from the mm side:
https://lore.kernel.org/linux-btrfs/tencent_A63C4B6C74A576F566AA3C0B37CE96AC3609@qq.com/
That report and fix reminded me to finally push this series to get
rid of read_cache_page_gfp() completely.
Thanks,
Qu