linux-mm.kvack.org archive mirror
From: Jeff Layton <jlayton@kernel.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <ljs@kernel.org>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@kernel.org>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Mike Snitzer <snitzer@kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 1/4] mm: fix IOCB_DONTCACHE write performance with rate-limited writeback
Date: Thu, 02 Apr 2026 08:28:42 -0400
Message-ID: <01dd135adf38e35492d957a35e22c4ba5c2283d1.camel@kernel.org>
In-Reply-To: <ac385Il8l-krKEOQ@infradead.org>

On Wed, 2026-04-01 at 22:21 -0700, Christoph Hellwig wrote:
> On Wed, Apr 01, 2026 at 03:10:58PM -0400, Jeff Layton wrote:
> > IOCB_DONTCACHE calls filemap_flush_range() with nr_to_write=LONG_MAX
> > on every write, which flushes all dirty pages in the written range.
> > 
> > Under concurrent writers this creates severe serialization on the
> > writeback submission path, causing throughput to collapse to ~47% of
> > buffered I/O with multi-second tail latency.  Even single-client
> > sequential writes suffer: on a 512GB file with 256GB RAM, the
> > aggressive flushing triggers dirty throttling that limits throughput
> > to 575 MB/s vs 1442 MB/s with rate-limited writeback.
> 
> I'm not sure how you think the first paragraph relates to the
> second.
> 

My working theory is that under a heavy parallel write workload on the
same inode, the writers all end up stacking up on the mapping's xa_lock.
However, as Ritesh points out, I have not confirmed that yet and should
do so with perf.
 
> > Replace the filemap_flush_range() call in generic_write_sync() with a
> > new filemap_dontcache_writeback_range() that uses two rate-limiting
> > mechanisms:
> > 
> >   1. Skip-if-busy: check mapping_tagged(PAGECACHE_TAG_WRITEBACK)
> >      before flushing.  If writeback is already in progress on the
> >      mapping, skip the flush entirely.  This eliminates writeback
> >      submission contention between concurrent writers.
> 
> Makes sense.
> 
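
To make the skip-if-busy idea concrete, here is a small userspace model
in C. It is only a sketch and not the code from the patch: struct
mapping_model, its writeback_in_flight flag, dontcache_maybe_flush() and
writeback_done() are hypothetical names, with the flag standing in for
mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK). The model is
single-threaded, so the check-then-set below is not atomic and says
nothing about how concurrent writers would race in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the per-file mapping state. */
struct mapping_model {
	bool writeback_in_flight;      /* models PAGECACHE_TAG_WRITEBACK being set */
	unsigned long flushes_started; /* writers that kicked off a flush */
	unsigned long flushes_skipped; /* writers that found writeback busy */
};

/*
 * Skip-if-busy: start a flush only when no writeback is in flight.
 * Returns true if this caller started the flush, false if it skipped.
 */
static bool dontcache_maybe_flush(struct mapping_model *m)
{
	assert(m != NULL);

	if (m->writeback_in_flight) {
		m->flushes_skipped++;
		return false;
	}

	m->writeback_in_flight = true;
	m->flushes_started++;
	return true;
}

/* Models writeback completion clearing the "busy" state. */
static void writeback_done(struct mapping_model *m)
{
	assert(m != NULL);
	m->writeback_in_flight = false;
}
```

In this model a second writer that arrives while the first flush is
still in flight does no submission work at all; it only bumps the skip
counter, which is the contention-avoidance effect the description
claims.
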
> >   2. Proportional cap: when flushing does occur, cap nr_to_write to
> >      the number of pages just written.  This prevents any single
> >      write from triggering a large flush that would starve concurrent
> >      readers.
> 
> This doesn't make any sense at all.
> filemap_flush_range/filemap_writeback always caps the number of written
> pages to the range passed in.  What do you think is the change here?
> 

I had some earlier results that indicated that this did help. It's
possible they were bogus though. I'll recheck that and get back to you.
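
For reference, the arithmetic behind the objection can be sketched in
userspace C. This is an illustrative model only: MODEL_PAGE_SHIFT (4 KiB
pages) and the three helper names are assumptions, and it takes as given
the statement above that a range-limited flush never writes pages
outside [start, end].

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Pages touched by the inclusive byte range [start, end]. */
static unsigned long pages_in_range(unsigned long long start,
				    unsigned long long end)
{
	assert(end >= start);
	return (unsigned long)((end >> MODEL_PAGE_SHIFT) -
			       (start >> MODEL_PAGE_SHIFT)) + 1;
}

/* Most pages a range-limited flush can write: min(nr_to_write, range). */
static unsigned long effective_writeback_limit(unsigned long long start,
					       unsigned long long end,
					       unsigned long nr_to_write)
{
	unsigned long in_range = pages_in_range(start, end);

	return nr_to_write < in_range ? nr_to_write : in_range;
}

/* Does replacing nr_to_write=LONG_MAX with 'cap' change the limit? */
static bool cap_changes_limit(unsigned long long start,
			      unsigned long long end, unsigned long cap)
{
	return effective_writeback_limit(start, end, cap) !=
	       effective_writeback_limit(start, end, LONG_MAX);
}
```

For a page-aligned 1 MiB write at offset 0 the range covers 256 pages,
so under these assumptions a cap of 256 yields the same limit as
LONG_MAX. A cap only matters if it is smaller than the number of pages
in the range.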

> > +	return filemap_writeback(mapping, start, end, WB_SYNC_NONE, &nr,
> > +			WB_REASON_BACKGROUND);
> 
> filemap_writeback only has 5 arguments in any tree I've looked at
> including linux-next.
> 

I think this was a bad merge on my part. Mea culpa. The version in the
"dontcache" branch of my tree should be correct.

Thanks for the review!
-- 
Jeff Layton <jlayton@kernel.org>


Thread overview: 16+ messages
2026-04-01 19:10 [PATCH 0/4] mm: improve write performance with RWF_DONTCACHE Jeff Layton
2026-04-01 19:10 ` [PATCH 1/4] mm: fix IOCB_DONTCACHE write performance with rate-limited writeback Jeff Layton
2026-04-02  4:43   ` Ritesh Harjani
2026-04-02 11:59     ` Jeff Layton
2026-04-02 12:40       ` Ritesh Harjani
2026-04-02  5:21   ` Christoph Hellwig
2026-04-02 12:28     ` Jeff Layton [this message]
2026-04-06  5:44       ` Christoph Hellwig
2026-04-01 19:10 ` [PATCH 2/4] mm: add atomic flush guard for IOCB_DONTCACHE writeback Jeff Layton
2026-04-02  5:27   ` Christoph Hellwig
2026-04-02 12:49     ` Jeff Layton
2026-04-06  5:49       ` Christoph Hellwig
2026-04-06 13:32         ` Jeff Layton
2026-04-07  5:19           ` Christoph Hellwig
2026-04-01 19:11 ` [PATCH 3/4] testing: add nfsd-io-bench NFS server benchmark suite Jeff Layton
2026-04-01 19:11 ` [PATCH 4/4] testing: add dontcache-bench local filesystem " Jeff Layton
