From: Tal Zussman <tz2294@columbia.edu>
To: Jens Axboe <axboe@kernel.dk>,
"Tigran A. Aivazian" <aivazian.tigran@gmail.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Namjae Jeon <linkinjeon@kernel.org>,
Sungjong Seo <sj1557.seo@samsung.com>,
Yuezhang Mo <yuezhang.mo@sony.com>,
Dave Kleikamp <shaggy@kernel.org>,
Ryusuke Konishi <konishi.ryusuke@gmail.com>,
Viacheslav Dubeyko <slava@dubeyko.com>,
Konstantin Komarov <almaz.alexandrovich@paragon-software.com>,
Bob Copeland <me@bobcopeland.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
jfs-discussion@lists.sourceforge.net,
linux-nilfs@vger.kernel.org, ntfs3@lists.linux.dev,
linux-karma-devel@lists.sourceforge.net, linux-mm@kvack.org,
Tal Zussman <tz2294@columbia.edu>
Subject: [PATCH RFC v2 0/2] block: enable RWF_DONTCACHE for block devices
Date: Wed, 25 Feb 2026 17:40:55 -0500
Message-ID: <20260225-blk-dontcache-v2-0-70e7ac4f7108@columbia.edu>
Add support for using RWF_DONTCACHE with block devices and other
buffer_head-based I/O.
Dropbehind pruning needs to be done in non-IRQ context, but block
devices complete writeback in IRQ context. To fix this, we first defer
dropbehind completion initiated from IRQ context by scheduling a work
item on the system workqueue to process a batch of folios.
Then, fix up the block_write_begin() interface to allow issuing
RWF_DONTCACHE I/Os.
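
As a rough illustration of the deferral pattern described above (not the
code in patch 1/2), here is a single-threaded userspace model in C. A
context that must not do the pruning itself only records the item and
marks work as pending; a later worker call processes the whole batch.
All names here (deferred_batch, defer_item, run_deferred, prune_item,
BATCH_MAX) are hypothetical, and the model omits the locking and the
real workqueue calls that kernel code would need.

```c
#include <stdbool.h>
#include <stddef.h>

#define BATCH_MAX 16 /* hypothetical batch capacity */

struct deferred_batch {
	int items[BATCH_MAX]; /* stand-ins for folios awaiting dropbehind */
	size_t count;
	bool work_pending;    /* models "a work item is already queued" */
};

static int pruned_total;  /* lets a caller observe what the worker did */

/* Stand-in for the pruning step that must run outside IRQ context. */
static void prune_item(int item)
{
	pruned_total += item;
}

/*
 * Models the completion path running in IRQ context: only record the
 * item and flag the worker. Returns false if the batch is full.
 */
static bool defer_item(struct deferred_batch *b, int item)
{
	if (b->count == BATCH_MAX)
		return false;
	b->items[b->count++] = item;
	b->work_pending = true; /* in the kernel this would queue a work item */
	return true;
}

/*
 * Models the work function, which runs later in process context and
 * handles every queued item in one pass. Returns the number processed.
 */
static size_t run_deferred(struct deferred_batch *b, void (*process)(int))
{
	size_t n = b->count;

	for (size_t i = 0; i < n; i++)
		process(b->items[i]);
	b->count = 0;
	b->work_pending = false;
	return n;
}
```

A caller would invoke defer_item() once per completed item and
run_deferred(&batch, prune_item) whenever work_pending is set.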
This support is useful for databases that operate on raw block devices,
among other userspace applications.
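
For reference, a minimal sketch of how a userspace application might
issue such a write, using pwritev2(2). The helper name write_dontcache
is hypothetical, the fallback value for RWF_DONTCACHE is an assumption
taken from the Linux uapi header for builds against older libc headers,
and retrying without the flag on EOPNOTSUPP is only one possible policy.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>

#ifndef RWF_DONTCACHE
/* Assumed value, as in Linux uapi <linux/fs.h>; older headers lack it. */
#define RWF_DONTCACHE 0x00000080
#endif

/*
 * Hypothetical helper: write len bytes from buf at offset off, asking
 * the kernel to drop the pages from the page cache once writeback
 * completes. If the kernel or the file does not support the flag
 * (EOPNOTSUPP), retry as an ordinary buffered write.
 */
static ssize_t write_dontcache(int fd, const void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t ret;

	ret = pwritev2(fd, &iov, 1, off, RWF_DONTCACHE);
	if (ret < 0 && errno == EOPNOTSUPP)
		ret = pwritev2(fd, &iov, 1, off, 0);
	return ret;
}
```

The fd could refer to a block device node opened without O_DIRECT, or
to a regular file; as with any write, the caller still has to handle
short writes.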
I tested this (with CONFIG_BUFFER_HEAD=y) for reads and writes on a
single block device on a VM, so results may be noisy.
Reads were tested on the root partition with a 45GB range (~2x RAM).
Writes were tested on a disabled swap partition (~1GB) in a memcg of size
244MB to force reclaim pressure.
Results:
===== READS (/dev/nvme0n1p2) =====
 sec   normal MB/s  dontcache MB/s
----  ------------  --------------
   1         993.9          1799.6
   2         992.8          1693.8
   3         923.4          2565.9
   4        1013.5          3917.3
   5        1557.9          2438.2
   6        2363.4          1844.3
   7        1447.9          2048.6
   8         899.4          1951.7
   9        1246.8          1756.1
  10        1139.0          1665.6
  11        1089.7          1707.7
  12        1270.4          1736.5
  13        1244.0          1756.3
  14        1389.7          1566.2
----  ------------  --------------
 avg        1258.0          2005.4 (+59%)
==== WRITES (/dev/nvme0n1p3) =====
 sec   normal MB/s  dontcache MB/s
----  ------------  --------------
   1        2396.1          9670.6
   2        8444.8          9391.5
   3         770.8          9400.8
   4          61.5          9565.9
   5        7701.0          8832.6
   6        8634.3          9912.9
   7         469.2          9835.4
   8        8588.5          9587.2
   9        8602.2          9334.8
  10         591.1          8678.8
  11        8528.7          3847.0
----  ------------  --------------
 avg        4981.7          8914.3 (+79%)
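
The percentages on the avg rows follow from the two averages in each
table. A small sketch of the arithmetic in C, where percent_change is a
hypothetical helper name:

```c
/* Percent change of value relative to baseline. */
static double percent_change(double baseline, double value)
{
	return (value - baseline) / baseline * 100.0;
}

/*
 * Using the avg rows above:
 *   reads:  percent_change(1258.0, 2005.4) is about 59.4
 *   writes: percent_change(4981.7, 8914.3) is about 78.9
 */
```
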
---
Changes in v2:
- Add R-b from Jan Kara for 2/2.
- Add patch to defer dropbehind completion from IRQ context via a work
item (1/2).
- Add initial performance numbers to cover letter.
- Link to v1: https://lore.kernel.org/r/20260218-blk-dontcache-v1-1-fad6675ef71f@columbia.edu
---
Tal Zussman (2):
      filemap: defer dropbehind invalidation from IRQ context
      block: enable RWF_DONTCACHE for block devices

 block/fops.c                |  4 +--
 fs/bfs/file.c               |  2 +-
 fs/buffer.c                 | 12 ++++---
 fs/exfat/inode.c            |  2 +-
 fs/ext2/inode.c             |  2 +-
 fs/jfs/inode.c              |  2 +-
 fs/minix/inode.c            |  2 +-
 fs/nilfs2/inode.c           |  2 +-
 fs/nilfs2/recovery.c        |  2 +-
 fs/ntfs3/inode.c            |  2 +-
 fs/omfs/file.c              |  2 +-
 fs/udf/inode.c              |  2 +-
 fs/ufs/inode.c              |  2 +-
 include/linux/buffer_head.h |  5 +--
 mm/filemap.c                | 84 ++++++++++++++++++++++++++++++++++++++++++---
 15 files changed, 103 insertions(+), 24 deletions(-)
---
base-commit: 05f7e89ab9731565d8a62e3b5d1ec206485eeb0b
change-id: 20260218-blk-dontcache-338133dd045e
Best regards,
--
Tal Zussman <tz2294@columbia.edu>