From: Jeff Layton <jlayton@kernel.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Mike Snitzer <snitzer@kernel.org>,
Chuck Lever <chuck.lever@oracle.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-nfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 1/4] mm: fix IOCB_DONTCACHE write performance with rate-limited writeback
Date: Thu, 02 Apr 2026 08:28:42 -0400
Message-ID: <01dd135adf38e35492d957a35e22c4ba5c2283d1.camel@kernel.org>
In-Reply-To: <ac385Il8l-krKEOQ@infradead.org>

On Wed, 2026-04-01 at 22:21 -0700, Christoph Hellwig wrote:
> On Wed, Apr 01, 2026 at 03:10:58PM -0400, Jeff Layton wrote:
> > IOCB_DONTCACHE calls filemap_flush_range() with nr_to_write=LONG_MAX
> > on every write, which flushes all dirty pages in the written range.
> >
> > Under concurrent writers this creates severe serialization on the
> > writeback submission path, causing throughput to collapse to ~47% of
> > buffered I/O with multi-second tail latency. Even single-client
> > sequential writes suffer: on a 512GB file with 256GB RAM, the
> > aggressive flushing triggers dirty throttling that limits throughput
> > to 575 MB/s vs 1442 MB/s with rate-limited writeback.
>
> I'm not sure how you think the first paragraph relates to
> the second.
>
The belief is that under a heavy parallel write workload on the same
inode, the writers all end up stacking up on the mapping's xa_lock.
However, as Ritesh points out, I should probably confirm that with perf.
> > Replace the filemap_flush_range() call in generic_write_sync() with a
> > new filemap_dontcache_writeback_range() that uses two rate-limiting
> > mechanisms:
> >
> > 1. Skip-if-busy: check mapping_tagged(PAGECACHE_TAG_WRITEBACK)
> > before flushing. If writeback is already in progress on the
> > mapping, skip the flush entirely. This eliminates writeback
> > submission contention between concurrent writers.
>
> Makes sense.
>
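For reference, the skip-if-busy idea can be sketched as a toy user-space
model. This is only an illustration, not the patch: struct toy_mapping and
the toy_* helpers are hypothetical stand-ins, and the counters merely
imitate what mapping_tagged(PAGECACHE_TAG_WRITEBACK) would report.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an address_space; not the kernel struct. */
struct toy_mapping {
	long nr_dirty;      /* pages tagged dirty */
	long nr_writeback;  /* pages currently under writeback */
};

/* Imitates mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK). */
static bool toy_writeback_in_progress(const struct toy_mapping *m)
{
	return m->nr_writeback > 0;
}

/*
 * Skip-if-busy: if writeback is already in flight on the mapping, do
 * nothing. Otherwise submit up to nr_to_write dirty pages.
 * Returns the number of pages submitted (0 when the flush was skipped).
 */
static long toy_dontcache_flush(struct toy_mapping *m, long nr_to_write)
{
	long n;

	if (toy_writeback_in_progress(m))
		return 0;

	n = m->nr_dirty < nr_to_write ? m->nr_dirty : nr_to_write;
	m->nr_dirty -= n;
	m->nr_writeback += n;
	return n;
}
```

In this model a second writer that arrives while the first batch is still
under writeback returns immediately instead of queueing up behind it.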
> > 2. Proportional cap: when flushing does occur, cap nr_to_write to
> > the number of pages just written. This prevents any single
> > write from triggering a large flush that would starve concurrent
> > readers.
>
> This doesn't make any sense at all.
> filemap_flush_range/filemap_writeback always caps the number of written
> pages to the range passed in. What do you think is the change here?
>
I had some earlier results that indicated that this did help. It's
possible they were bogus though. I'll recheck that and get back to you.
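As a rough illustration of the objection, here is a toy user-space model of
why the cap may be a no-op. It assumes 4 KiB pages, ignores large folios,
and assumes the flushed range is exactly the range just written; the toy_*
names are hypothetical and none of this is kernel code.

```c
#include <limits.h>

#define TOY_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Number of pages touched by the inclusive byte range [start, end]. */
static long toy_pages_in_range(long long start, long long end)
{
	return (long)((end >> TOY_PAGE_SHIFT) - (start >> TOY_PAGE_SHIFT) + 1);
}

/*
 * Pages a range-limited flush can submit: never more than the dirty
 * pages inside the range, and never more than nr_to_write.
 */
static long toy_flush_range(long dirty_in_range, long nr_to_write)
{
	return dirty_in_range < nr_to_write ? dirty_in_range : nr_to_write;
}
```

Since the range can hold at most toy_pages_in_range() dirty pages, passing
that count as nr_to_write gives the same result as passing LONG_MAX in this
model. Any real difference would have to come from something the model
leaves out.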
> > + return filemap_writeback(mapping, start, end, WB_SYNC_NONE, &nr,
> > + WB_REASON_BACKGROUND);
>
> filemap_writeback only has 5 arguments in any tree I've looked at
> including linux-next.
>
I think this was a bad merge on my part. Mea culpa. The version in the
"dontcache" branch of my tree should be correct.

Thanks for the review!
--
Jeff Layton <jlayton@kernel.org>