From: Joanne Koong <joannelkoong@gmail.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-mm@kvack.org, brauner@kernel.org, willy@infradead.org,
	jack@suse.cz,  hch@infradead.org, linux-fsdevel@vger.kernel.org,
	kernel-team@meta.com
Subject: Re: [RFC PATCH v1 10/10] iomap: add granular dirty and writeback accounting
Date: Wed, 27 Aug 2025 17:08:00 -0700
Message-ID: <CAJnrk1bqZzKvxxCrok2zqWoy8tXtY+hZF9kXa8PTWmZnihOMTw@mail.gmail.com>
In-Reply-To: <CAJnrk1a0vBqcbwDGnhr2A-H26Jr=0WauX7A2VLU9wvtV3UtpDQ@mail.gmail.com>

On Fri, Aug 15, 2025 at 11:38 AM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Thu, Aug 14, 2025 at 9:38 AM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Thu, Jul 31, 2025 at 05:21:31PM -0700, Joanne Koong wrote:
> > > Add granular dirty and writeback accounting for large folios. These
> > > stats are used by the mm layer for dirty balancing and throttling.
> > > Having granular dirty and writeback accounting helps prevent
> > > over-aggressive balancing and throttling.
> > >
> > > There are 4 places in iomap this commit affects:
> > > a) filemap dirtying, which now calls filemap_dirty_folio_pages()
> > > b) writeback_iter with setting the wbc->no_stats_accounting bit and
> > > calling clear_dirty_for_io_stats()
> > > c) starting writeback, which now calls __folio_start_writeback()
> > > d) ending writeback, which now calls folio_end_writeback_pages()
> > >
> > > This relies on using the ifs->state dirty bitmap to track dirty pages in
> > > the folio. As such, this can only be utilized on filesystems where the
> > > block size >= PAGE_SIZE.
> >
> > Apologies for my slow responses this month. :)
>
> No worries at all, thanks for looking at this.
> >
> > I wonder, does this cause an observable change in the writeback
> > accounting and throttling behavior for non-fuse filesystems like XFS
> > that use large folios?  I *think* this does actually reduce throttling
> > for XFS, but it might not be so noticeable because the limits are much
> > more generous outside of fuse?
>
> I haven't run any benchmarks on non-fuse filesystems yet but that's
> what I would expect too. Will run some benchmarks to see!

I ran some benchmarks on xfs for the contrived test case I used for
fuse (e.g. writing 2 GB in 128 MB chunks and then doing 50k 50-byte
random writes) and didn't see any noticeable performance difference.
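
Roughly, the workload looks like this (a hypothetical sketch of the
reproducer, not the exact test harness I used):

/*
 * Workload sketch: sequentially write 2 GiB in 128 MiB chunks, then
 * issue 50k 50-byte buffered writes at random offsets.
 */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define FILE_SIZE	(2ULL << 30)	/* 2 GiB */
#define CHUNK_SIZE	(128ULL << 20)	/* 128 MiB */
#define NR_RAND_WRITES	50000
#define RAND_WRITE_SIZE	50

int main(void)
{
	char *chunk = malloc(CHUNK_SIZE);
	char small[RAND_WRITE_SIZE];
	unsigned long long off;
	int fd, i;

	fd = open("testfile", O_CREAT | O_TRUNC | O_WRONLY, 0644);
	if (fd < 0 || !chunk)
		return 1;
	memset(chunk, 0xab, CHUNK_SIZE);
	memset(small, 0xcd, sizeof(small));

	/* phase 1: 2 GiB sequential fill, 128 MiB at a time */
	for (off = 0; off < FILE_SIZE; off += CHUNK_SIZE)
		if (pwrite(fd, chunk, CHUNK_SIZE, off) < 0)
			return 1;

	/* phase 2: 50k small random overwrites (buffered, no O_DIRECT) */
	for (i = 0; i < NR_RAND_WRITES; i++) {
		off = (unsigned long long)rand() % (FILE_SIZE - RAND_WRITE_SIZE);
		if (pwrite(fd, small, RAND_WRITE_SIZE, off) < 0)
			return 1;
	}
	free(chunk);
	return close(fd);
}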

I re-tested on fuse, this time with strict limiting disabled, and
didn't notice any difference there either. That's probably because
with strict limiting off, the test doesn't run into the upper limit,
so there's no extra throttling to mitigate.

It's also unclear to me how often, if at all, real workloads run up
against their dirty/writeback limits.

>
> >
> > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > > ---
> > >  fs/iomap/buffered-io.c | 136 ++++++++++++++++++++++++++++++++++++++---
> > >  1 file changed, 128 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > > index bcc6e0e5334e..626c3c8399cc 100644
> > > --- a/fs/iomap/buffered-io.c
> > > +++ b/fs/iomap/buffered-io.c
> > > @@ -20,6 +20,8 @@ struct iomap_folio_state {
> > >       spinlock_t              state_lock;
> > >       unsigned int            read_bytes_pending;
> > >       atomic_t                write_bytes_pending;
> > > +     /* number of pages being currently written back */
> > > +     unsigned                nr_pages_writeback;
> > >
> > >       /*
> > >        * Each block has two bits in this bitmap:
> > > @@ -81,6 +83,25 @@ static inline bool ifs_block_is_dirty(struct folio *folio,
> > >       return test_bit(block + blks_per_folio, ifs->state);
> > >  }
> > >
> > > +static unsigned ifs_count_dirty_pages(struct folio *folio)
> > > +{
> > > +     struct iomap_folio_state *ifs = folio->private;
> > > +     struct inode *inode = folio->mapping->host;
> > > +     unsigned block_size = 1 << inode->i_blkbits;
> > > +     unsigned start_blk = 0;
> > > +     unsigned end_blk = min((unsigned)(i_size_read(inode) >> inode->i_blkbits),
> > > +                             i_blocks_per_folio(inode, folio));
> > > +     unsigned nblks = 0;
> > > +
> > > +     while (start_blk < end_blk) {
> > > +             if (ifs_block_is_dirty(folio, ifs, start_blk))
> > > +                     nblks++;
> > > +             start_blk++;
> > > +     }
> >
> > Hmm, isn't this bitmap_weight(ifs->state, blks_per_folio) ?
> >
> > Ohh wait no, the dirty bitmap doesn't start on a byte boundary because
> > the format of the bitmap is [uptodate bits][dirty bits].
> >
> > Maybe those two should be reversed, because I bet the dirty state gets
> > changed a lot more over the lifetime of a folio than the uptodate bits.
>
> I think there's the find_next_bit() helper (which Christoph also
> pointed out) that could probably be used here instead. Or at least
> that's how I see a lot of the driver code doing it for their bitmaps.
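
Something like this could work as a replacement for the loop (an
untested sketch, assuming the current [uptodate bits][dirty bits]
layout of ifs->state):

	unsigned blks_per_folio = i_blocks_per_folio(inode, folio);
	/* the dirty bits live in the second half of ifs->state */
	unsigned first = blks_per_folio;		/* first dirty bit */
	unsigned last = blks_per_folio + end_blk;	/* one past the last */
	unsigned bit, nblks = 0;

	for (bit = find_next_bit(ifs->state, last, first);
	     bit < last;
	     bit = find_next_bit(ifs->state, last, bit + 1))
		nblks++;

And if the two halves of the bitmap were swapped so that the dirty
bits came first, as you suggest, this would collapse to
bitmap_weight(ifs->state, end_blk).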

