From: Kundan Kumar <kundan.kumar@samsung.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz,
willy@infradead.org, mcgrof@kernel.org, clm@meta.com,
david@fromorbit.com, amir73il@gmail.com, axboe@kernel.dk,
hch@lst.de, ritesh.list@gmail.com, dave@stgolabs.net,
cem@kernel.org, wangyufei@vivo.com,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-xfs@vger.kernel.org, gost.dev@samsung.com,
anuj20.g@samsung.com, vishak.g@samsung.com, joshi.k@samsung.com
Subject: Re: [PATCH v3 3/6] xfs: add per-inode AG prediction map and dirty-AG bitmap
Date: Tue, 3 Feb 2026 12:50:53 +0530
Message-ID: <2c485586-83c9-4697-91fc-7b0cee697704@samsung.com>
In-Reply-To: <20260129004404.GA7712@frogsfrogsfrogs>
On 1/29/2026 6:14 AM, Darrick J. Wong wrote:
> On Fri, Jan 16, 2026 at 03:38:15PM +0530, Kundan Kumar wrote:
>> Add per-inode structures to track predicted AGs of dirty folios using
>> an xarray and bitmap. This enables efficient identification of AGs
>> involved in writeback.
>>
>> Signed-off-by: Kundan Kumar <kundan.kumar@samsung.com>
>> Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
>> ---
>> fs/xfs/xfs_icache.c | 27 +++++++++++++++++++++++++++
>> fs/xfs/xfs_inode.h | 5 +++++
>> 2 files changed, 32 insertions(+)
>>
>> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
>> index e44040206851..f97aa6d66271 100644
>> --- a/fs/xfs/xfs_icache.c
>> +++ b/fs/xfs/xfs_icache.c
>> @@ -80,6 +80,25 @@ static inline xa_mark_t ici_tag_to_mark(unsigned int tag)
>> return XFS_PERAG_BLOCKGC_MARK;
>> }
>>
>> +static int xfs_inode_init_ag_bitmap(struct xfs_inode *ip)
>> +{
>> + unsigned int bits = ip->i_mount->m_sb.sb_agcount;
>> + unsigned int nlongs;
>> +
>> + xa_init_flags(&ip->i_ag_pmap, XA_FLAGS_LOCK_IRQ);
>
> This increases the size of struct xfs_inode by 40 bytes...
>
I’ll make this lazy and sparse: move the AG writeback state behind a
pointer that is allocated on first use, and replace the bitmap with a
sparse dirty-AG set (an xarray keyed by agno), so that memory scales with
the number of AGs the inode actually touches.
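
To illustrate the direction, here is a rough userspace sketch, not the
planned kernel code. All names are made up for this example, and a small
growable array stands in for the xarray keyed by agno. The point is only
that an inode which never sees a buffered write pays for one NULL pointer,
and a written inode pays per AG it touches rather than per AG in the
filesystem.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; this models the idea in userspace only. */
struct ag_wb_state {
	uint32_t	*dirty_agnos;	/* sparse set: only AGs this inode dirtied */
	size_t		nr_dirty;
	size_t		capacity;
};

struct inode_model {
	struct ag_wb_state	*ag_wb;	/* NULL until the first buffered write */
};

/* Record that agno has dirty data; allocates the state on first use. */
static int inode_mark_ag_dirty(struct inode_model *ip, uint32_t agno)
{
	struct ag_wb_state *st = ip->ag_wb;
	size_t i;

	if (!st) {
		st = calloc(1, sizeof(*st));
		if (!st)
			return -1;
		ip->ag_wb = st;
	}

	for (i = 0; i < st->nr_dirty; i++)
		if (st->dirty_agnos[i] == agno)
			return 0;	/* already tracked */

	if (st->nr_dirty == st->capacity) {
		size_t ncap = st->capacity ? st->capacity * 2 : 4;
		uint32_t *tmp = realloc(st->dirty_agnos, ncap * sizeof(*tmp));

		if (!tmp)
			return -1;
		st->dirty_agnos = tmp;
		st->capacity = ncap;
	}
	st->dirty_agnos[st->nr_dirty++] = agno;
	return 0;
}

/* Inode teardown: nothing to do if the inode was never written. */
static void inode_free_ag_state(struct inode_model *ip)
{
	if (!ip->ag_wb)
		return;
	free(ip->ag_wb->dirty_agnos);
	free(ip->ag_wb);
	ip->ag_wb = NULL;
}
```

Unlike the v3 code, the allocation failure is returned to the caller, so
the write path can decide to fall back instead of silently running without
the state.
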
>> + ip->i_ag_dirty_bitmap = NULL;
>> + ip->i_ag_dirty_bits = bits;
>> +
>> + if (!bits)
>> + return 0;
>> +
>> + nlongs = BITS_TO_LONGS(bits);
>> + ip->i_ag_dirty_bitmap = kcalloc(nlongs, sizeof(unsigned long),
>> + GFP_NOFS);
>
> ...and there could be hundreds or thousands of AGs for each filesystem.
> That's a lot of kernel memory to handle this prediction stuff, and I'm
> not even sure what ag_dirty_bitmap does yet.
>
The bit for an AG is set in ag_dirty_bitmap at write time. During
writeback, we check which AG bits are set, wake only those AG-specific
workers, and each worker scans the page cache, filters folios tagged for
its AG, and submits the I/O.
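
In simplified userspace form, the flow described above looks roughly like
the sketch below. The helper names are hypothetical, the "wake" step is
reduced to collecting AG numbers into an array, and clearing each bit as it
is collected is an assumption of this sketch rather than something the
patch is claimed to do.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)

/* Write path: note that this inode has dirty folios predicted for agno. */
static void mark_ag_dirty(unsigned long *bitmap, unsigned int agno)
{
	bitmap[agno / BITS_PER_ULONG] |= 1UL << (agno % BITS_PER_ULONG);
}

/*
 * Writeback path: collect every AG whose bit is set and clear the bit.
 * Each entry written to out[] stands for "wake the worker for this AG".
 * out[] must have room for agcount entries. Returns the number collected.
 */
static unsigned int collect_dirty_ags(unsigned long *bitmap,
				      unsigned int agcount,
				      unsigned int *out)
{
	unsigned int agno, nr = 0;

	for (agno = 0; agno < agcount; agno++) {
		unsigned long mask = 1UL << (agno % BITS_PER_ULONG);
		unsigned long *word = &bitmap[agno / BITS_PER_ULONG];

		if (!(*word & mask))
			continue;
		*word &= ~mask;		/* assumption: consume the bit */
		out[nr++] = agno;
	}
	return nr;
}
```

AGs with no bit set are never visited by a worker, which is the whole
purpose of the bitmap: it bounds the wakeups to AGs that can have work.
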
>> +
>> + return ip->i_ag_dirty_bitmap ? 0 : -ENOMEM;
>> +}
>> +
>> /*
>> * Allocate and initialise an xfs_inode.
>> */
>> @@ -131,6 +150,8 @@ xfs_inode_alloc(
>> ip->i_next_unlinked = NULLAGINO;
>> ip->i_prev_unlinked = 0;
>>
>> + xfs_inode_init_ag_bitmap(ip);
>
> Unchecked return value???
Will correct this in the next version.
>
>> +
>> return ip;
>> }
>>
>> @@ -194,6 +215,12 @@ xfs_inode_free(
>> ip->i_ino = 0;
>> spin_unlock(&ip->i_flags_lock);
>>
>> + /* free xarray contents (values are immediate packed ints) */
>> + xa_destroy(&ip->i_ag_pmap);
>> + kfree(ip->i_ag_dirty_bitmap);
>> + ip->i_ag_dirty_bitmap = NULL;
>> + ip->i_ag_dirty_bits = 0;
>> +
>> __xfs_inode_free(ip);
>> }
>>
>> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
>> index bd6d33557194..dee449168605 100644
>> --- a/fs/xfs/xfs_inode.h
>> +++ b/fs/xfs/xfs_inode.h
>> @@ -99,6 +99,11 @@ typedef struct xfs_inode {
>> spinlock_t i_ioend_lock;
>> struct work_struct i_ioend_work;
>> struct list_head i_ioend_list;
>> +
>> + /* AG prediction map: pgoff_t -> packed u32 */
>
> What about blocksize < pagesize filesystems? Which packed agno do you
> associate with the pgoff_t?
>
> Also, do you have an xarray entry for each pgoff_t in a large folio?
>
> --D
>
pgoff_t here is the pagecache index (folio->index), i.e. the file offset
in PAGE_SIZE units, not a filesystem block index. So blocksize < PAGE_SIZE
doesn’t change the association: the packed agno is attached to the folio
at that pagecache index.
We store one xarray entry per folio, keyed by the folio's start index. We
do not create an entry for each base page inside a large folio. If a
large folio could span multiple extents/AGs, we’ll treat the hint as
advisory and tag it invalid (falling back to normal writeback routing)
rather than trying to encode per-subpage AGs.
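
As a sketch of that rule (userspace only; AG_HINT_INVALID and
folio_ag_hint() are names invented for this example, and the per-block
predicted AG numbers are assumed to be available as a plain array), the
folio gets a hint only when every block it covers agrees on the AG:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AG_HINT_INVALID	UINT32_MAX	/* hypothetical "no usable hint" value */

/*
 * Reduce the predicted AGs of all filesystem blocks backing one folio to a
 * single hint. Returns the common AG number, or AG_HINT_INVALID if the
 * blocks disagree (or there are none), in which case the caller would fall
 * back to normal writeback routing.
 */
static uint32_t folio_ag_hint(const uint32_t *block_agnos, size_t nr_blocks)
{
	size_t i;

	if (!nr_blocks)
		return AG_HINT_INVALID;

	for (i = 1; i < nr_blocks; i++)
		if (block_agnos[i] != block_agnos[0])
			return AG_HINT_INVALID;

	return block_agnos[0];
}
```

The same reduction covers both cases raised above: several small blocks
inside one page, and a large folio spanning many pages.
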
>> + struct xarray i_ag_pmap;
>> + unsigned long *i_ag_dirty_bitmap;
>> + unsigned int i_ag_dirty_bits;
>> } xfs_inode_t;
>>
>> static inline bool xfs_inode_on_unlinked_list(const struct xfs_inode *ip)
>> --
>> 2.25.1
>>
>>
>