From: Dave Chinner <dgc@kernel.org>
To: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: Christoph Hellwig <hch@lst.de>, Tal Zussman <tz2294@columbia.edu>,
	Jens Axboe <axboe@kernel.dk>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Christian Brauner <brauner@kernel.org>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Carlos Maiolino <cem@kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	Bart Van Assche <bvanassche@acm.org>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Sandeep Dhavale <dhavale@google.com>
Subject: Re: [PATCH 8/8] RFC: use a TASK_FIFO kthread for read completion support
Date: Tue, 14 Apr 2026 10:58:16 +1000
Message-ID: <ad2RKNo2FGhpzJQp@dread>
In-Reply-To: <7f0d072b-97a7-405f-bff5-d3819de2e3dd@linux.alibaba.com>

On Sat, Apr 11, 2026 at 07:44:43AM +0800, Gao Xiang wrote:
>
>
> On 2026/4/11 06:11, Dave Chinner wrote:
> > On Thu, Apr 09, 2026 at 06:02:21PM +0200, Christoph Hellwig wrote:
> > > Commit 3fffb589b9a6 ("erofs: add per-cpu threads for decompression as an
> > > option") explains why workqueue aren't great for low-latency completion
> > > handling. Switch to a per-cpu kthread to handle it instead. This code
> > > is based on the erofs code in the above commit, but further simplified
> > > by directly using a kthread instead of a kthread_work.
> > >
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> >
> > Can we please not go back to the (bad) old days of individual
> > subsystems needing their own set of per-cpu kernel tasks just
> > sitting around idle most of the time? The whole point of the
> > workqueue infrastructure was to get rid of this widely repeated
> > anti-pattern.
> >
> > If there's a latency problem with workqueue scheduling, then we
> > should be fixing that problem rather than working around it in every
> > subsystem that thinks it has a workqueue scheduling latency
> > issue...
>
> It has been "fixed" but never actually get fixed:
> https://lore.kernel.org/r/CAB=BE-QaNBn1cVK6c7LM2cLpH_Ck_9SYw-YDYEnNrtwfoyu81Q@mail.gmail.com
>
> and workqueues don't have any plan to introduce RT threads;

They don't need to (and shouldn't) introduce RT threads. Per-cpu kernel
threads already get priority over normal user tasks on scheduling
decisions. However, they do not pre-empt running kernel tasks of
the same priority.

In general, kernel threads should not use RT scheduling at all - if
the kernel uses RT priority tasks then that can interfere with user
scheduled RT tasks. This is especially true in this case, where
non-RT tasks issue the IO and the IO completion is then scheduled
with RT priority. IOWs, any unprivileged user can now impact the
processing time available to, and the response latency of, other
RT scheduled tasks the system is running.

Tejun asked Sandeep if setting the workqueue thread priority to
-19 through sysfs (i.e. making them higher priority than normal
kernel threads) had the same effect on latency as using a dedicated
per-cpu RT task thread. There was no followup.

In theory, this should provide the same benefit, because what RT
scheduling is doing is pre-empting any user and kernel task that was
running when the interrupt was delivered to execute the completion
task immediately.

Setting the workqueue to use kernel threads of a higher scheduler
priority should do the same thing, without the need to use dedicated
per-cpu RT threads.
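
A minimal userspace sketch of the sysfs experiment mentioned above, for
anyone who wants to test it. It rests on assumptions that are not stated
in this thread: that the workqueue in question is exported to sysfs
(created with WQ_SYSFS), that it exposes a "nice" attribute (as far as I
know only unbound workqueues do), and that it shows up under
/sys/devices/virtual/workqueue/. The workqueue name "my_wq" and the
helpers wq_nice_path() and wq_set_nice() are hypothetical names used
only for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Assumed sysfs location for workqueues exported with WQ_SYSFS. */
#define WQ_SYSFS_ROOT "/sys/devices/virtual/workqueue"

/*
 * Build "<root>/<wq_name>/nice" in buf.
 * Returns 0 on success, -1 if the path does not fit in len bytes.
 */
static int wq_nice_path(char *buf, size_t len, const char *wq_name)
{
	int n = snprintf(buf, len, "%s/%s/nice", WQ_SYSFS_ROOT, wq_name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write a nice value to the workqueue's (assumed) "nice" attribute.
 * Returns 0 on success or a negative errno value on failure.
 */
static int wq_set_nice(const char *wq_name, int nice)
{
	char path[256];
	FILE *f;
	int ret = 0;

	/* Nice values range from -20 (highest priority) to 19. */
	if (nice < -20 || nice > 19)
		return -EINVAL;
	if (wq_nice_path(path, sizeof(path), wq_name))
		return -ENAMETOOLONG;

	f = fopen(path, "w");
	if (!f)
		return -errno;
	if (fprintf(f, "%d\n", nice) < 0)
		ret = -EIO;
	/* sysfs write errors may only surface when the stream is flushed. */
	if (fclose(f) != 0 && !ret)
		ret = -errno;
	return ret;
}
```

Calling wq_set_nice("my_wq", -19) as root would then request the -19
priority discussed above; a negative return (e.g. -ENOENT) would suggest
that the workqueue is not exported or has no such attribute, in which
case this particular experiment does not apply as written.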
> If Sandeep has more time, I hope he could have more time to
> test since I don't work on Android anymore: In principle,
> I still think RT thread is needed somewhere for such usage
> since lowest latencies is needed.

All that is needed is for the kworker thread to be scheduled to run
immediately after the interrupt that scheduled the work exits. This
does not require dedicated per-cpu kernel tasks or RT scheduling....
> Compared to the scheduling latency issues, interested users
> don't care "individual subsystems needing their own set of
> per-cpu kernel tasks just sitting around idle most of
> the time". If end users care it more, they can just turn
> it off by Kconfig.

Distros enable all these subsystems all the time, so saying
"turn it off via kconfig" is not a viable mitigation
strategy. Proliferation of dedicated per-CPU worker task pools is a
known problem, and we really don't want to regress back to those
days when a typical system had thousands of dedicated per-cpu work
queues that largely did nothing most of the time.

-Dave.
--
Dave Chinner
dgc@kernel.org