From: Jens Axboe <axboe@kernel.dk>
To: Matthew Wilcox <willy@infradead.org>
Cc: Tal Zussman <tz2294@columbia.edu>,
"Tigran A. Aivazian" <aivazian.tigran@gmail.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Namjae Jeon <linkinjeon@kernel.org>,
Sungjong Seo <sj1557.seo@samsung.com>,
Yuezhang Mo <yuezhang.mo@sony.com>,
Dave Kleikamp <shaggy@kernel.org>,
Ryusuke Konishi <konishi.ryusuke@gmail.com>,
Viacheslav Dubeyko <slava@dubeyko.com>,
Konstantin Komarov <almaz.alexandrovich@paragon-software.com>,
Bob Copeland <me@bobcopeland.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
jfs-discussion@lists.sourceforge.net,
linux-nilfs@vger.kernel.org, ntfs3@lists.linux.dev,
linux-karma-devel@lists.sourceforge.net, linux-mm@kvack.org,
"Vishal Moola (Oracle)" <vishal.moola@gmail.com>
Subject: Re: [PATCH RFC v2 1/2] filemap: defer dropbehind invalidation from IRQ context
Date: Wed, 25 Feb 2026 20:15:28 -0700
Message-ID: <44e3e9ea-350b-4357-ba50-726e506feab5@kernel.dk>
In-Reply-To: <aZ-2G_6lDZePLSyx@casper.infradead.org>

On 2/25/26 7:55 PM, Matthew Wilcox wrote:
> On Wed, Feb 25, 2026 at 03:52:41PM -0700, Jens Axboe wrote:
>> How well does this scale? I did a patch basically the same as this, but
>> not using a folio batch though. But the main sticking point was
>> dropbehind_lock contention, to the point where I left it alone and
>> thought "ok maybe we just do this when we're done with the awful
>> buffer_head stuff". What happens if you have N threads doing IO at the
>> same time to N block devices? I suspect it'll look absolutely terrible,
>> as each thread will be banging on that dropbehind_lock.
>>
>> One solution could potentially be to use per-cpu lists for this. If you
>> have N threads working on separate block devices, they will tend to be
>> sticky to their CPU anyway.
>
> Back in 2021, I had Vishal look at switching the page cache from using
> hardirq-disabling locks to softirq-disabling locks [1]. Some of the
> feedback (which doesn't seem to be entirely findable on the lists ...)
> was that we'd be better off punting writeback completion from interrupt
> context to task context and going from spin_lock_irq() to spin_lock()
> rather than going to spin_lock_bh().
>
> I recently saw something (possibly XFS?) promoting this idea again.
> And now there's this. Perhaps the time has come to process all
> write-completions in task context, rather than everyone coming up with
> their own workqueues to solve their little piece of the problem?

Perhaps, even though the punting tends to suck... One idea I toyed with
but had to abandon due to fs freezing was letting callers that process
completions in task context anyway just do the necessary work at that
time. There's literally nothing worse than having part of a completion
happen in IRQ, then punt parts of that to a worker, and need to wait for
the worker to finish whatever it needs to do - only to then wake the
target task. We can trivially do this in io_uring, as the actual
completion is posted from the task itself anyway. We just need to have
the task do the bottom half of the completion as well, rather than some
unrelated kthread worker.

I'd be worried a generic solution would be the worst of all worlds, as
it prevents optimizations that happen in e.g. iomap and other spots, where
only completions that absolutely need to happen in task context get
punted. There's a big difference between handling a completion inline vs
needing a round-trip to some worker to do it.

--
Jens Axboe