linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Nirjhar Roy (IBM)" <nirjhar.roy.lists@gmail.com>
To: Kundan Kumar <kundan.kumar@samsung.com>,
	viro@zeniv.linux.org.uk,  brauner@kernel.org, jack@suse.cz,
	willy@infradead.org, mcgrof@kernel.org,  clm@meta.com,
	david@fromorbit.com, amir73il@gmail.com, axboe@kernel.dk,
	hch@lst.de,  ritesh.list@gmail.com, djwong@kernel.org,
	dave@stgolabs.net, cem@kernel.org,  wangyufei@vivo.com
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	 linux-xfs@vger.kernel.org, gost.dev@samsung.com,
	anuj20.g@samsung.com,  vishak.g@samsung.com, joshi.k@samsung.com
Subject: Re: [PATCH v3 4/6] xfs: tag folios with AG number during buffered write via iomap attach hook
Date: Fri, 06 Feb 2026 12:14:24 +0530	[thread overview]
Message-ID: <c400d2a68d87a29e17a08ed03489ca33ea46a8f4.camel@gmail.com> (raw)
In-Reply-To: <20260116100818.7576-5-kundan.kumar@samsung.com>

On Fri, 2026-01-16 at 15:38 +0530, Kundan Kumar wrote:
> Use the iomap attach hook to tag folios with their predicted
> allocation group at write time. Mapped extents derive AG directly;
> delalloc and hole cases use a lightweight predictor.
> 
> Signed-off-by: Kundan Kumar <kundan.kumar@samsung.com>
> Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
> ---
>  fs/xfs/xfs_iomap.c | 114 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 114 insertions(+)
> 
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 490e12cb99be..3c927ce118fe 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -12,6 +12,9 @@
>  #include "xfs_trans_resv.h"
>  #include "xfs_mount.h"
>  #include "xfs_inode.h"
> +#include "xfs_alloc.h"
> +#include "xfs_ag.h"
> +#include "xfs_ag_resv.h"
>  #include "xfs_btree.h"
>  #include "xfs_bmap_btree.h"
>  #include "xfs_bmap.h"
> @@ -92,8 +95,119 @@ xfs_iomap_valid(
>  	return true;
>  }
>  
> +static xfs_agnumber_t
> +xfs_predict_delalloc_agno(const struct xfs_inode *ip, loff_t pos, loff_t len)
> +{
> +	struct xfs_mount *mp = ip->i_mount;
> +	xfs_agnumber_t start_agno, agno, best_agno;
> +	struct xfs_perag *pag;
Nit: Coding style - I think XFS uses a tab between the data type and the identifier and makes all
the identifier align. Something like
int			a;
struct xfs_mount	*mp;
> +
Nit: Extra blank line?
> +	xfs_extlen_t free, resv, avail;
> +	xfs_extlen_t need_fsbs, min_free_fsbs;
> +	xfs_extlen_t best_free = 0;
> +	xfs_agnumber_t agcount = mp->m_sb.sb_agcount;
Similar comment as above
> +
> +	/* RT inodes allocate from the realtime volume */
> +	if (XFS_IS_REALTIME_INODE(ip))
> +		return XFS_INO_TO_AGNO(mp, ip->i_ino);
What are we returning for realtime volume? AG sizes and rtgroup sizes may be different, isn't it?
Can we use the above macro for both? Also, rt volume work with extents (which can be more than the
fsblock size) - so will the above work? 
> +
> +	start_agno =  XFS_INO_TO_AGNO(mp, ip->i_ino);
> +
> +	/*
> +	 * size-based minimum free requirement.
> +	 * Convert bytes to fsbs and require some slack.
> +	 */
> +	need_fsbs = XFS_B_TO_FSB(mp, (xfs_fsize_t)len);
> +	min_free_fsbs = need_fsbs + max_t(xfs_extlen_t, need_fsbs >> 2, 128);
A short comment explaining the above?
> +
> +	/*
> +	 * scan AGs starting at start_agno and wrapping.
> +	 * Pick the first AG that meets min_free_fsbs after reservations.
> +	 * Keep a "best" fallback = maximum (free - resv).
> +	 */
> +	best_agno = start_agno;
> +
> +	for (xfs_agnumber_t i = 0; i < agcount; i++) {
Maybe use for_each_perag_wrap_range or for_each_perag_wrap (defined in fs/xfs/libxfs/xfs_ag.h)?
> +		agno = (start_agno + i) % agcount;
> +		pag = xfs_perag_get(mp, agno);
> +
> +		if (!xfs_perag_initialised_agf(pag))
> +			goto next;
> +
> +		free = READ_ONCE(pag->pagf_freeblks);
> +		resv = xfs_ag_resv_needed(pag, XFS_AG_RESV_NONE);
> +
> +		if (free <= resv)
> +			goto next;
> +
> +		avail = free - resv;
> +
> +		if (avail >= min_free_fsbs) {
> +			xfs_perag_put(pag);
> +			return agno;
> +		}
> +
> +		if (avail > best_free) {
Just for my understanding - we are doing a largest fit selection, aren't we?
> +			best_free = avail;
> +			best_agno = agno;
> +		}
> +next:
> +		xfs_perag_put(pag);
> +	}
> +
> +	return best_agno;
> +}
> +
> +static inline xfs_agnumber_t xfs_ag_from_iomap(const struct xfs_mount *mp,
> +		const struct iomap *iomap,
> +		const struct xfs_inode *ip, loff_t pos, size_t len)
> +{
> +	if (iomap->type == IOMAP_MAPPED || iomap->type == IOMAP_UNWRITTEN) {
> +		/* iomap->addr is byte address on device for buffered I/O */
> +		xfs_fsblock_t fsb = XFS_BB_TO_FSBT(mp, BTOBB(iomap->addr));
> +
> +		return XFS_FSB_TO_AGNO(mp, fsb);
> +	} else if (iomap->type == IOMAP_HOLE || iomap->type == IOMAP_DELALLOC) {
> +		return xfs_predict_delalloc_agno(ip, pos, len);
> +	}
> +
> +	return XFS_INO_TO_AGNO(mp, ip->i_ino);
> +}
> +
> +static void xfs_agp_set(struct xfs_inode *ip, pgoff_t index,
> +			xfs_agnumber_t agno, u8 type)
> +{
> +	u32 packed = xfs_agp_pack((u32)agno, type, true);
> +
> +	/* store as immediate value */
> +	xa_store(&ip->i_ag_pmap, index, xa_mk_value(packed), GFP_NOFS);
Maybe nit: Unhandled return from xa_store() - It returns void *
> +
> +	/* Mark this AG as having potential dirty work */
> +	if (ip->i_ag_dirty_bitmap && (u32)agno < ip->i_ag_dirty_bits)
if agno >= i_ag_dirty_bits, then shouldn't we throw a warn or an ASSERT() or XFS_IS_CORRUPT() -
Darrick, any thoughts?
> +		set_bit((u32)agno, ip->i_ag_dirty_bitmap);
> +}
> +
> +static void
> +xfs_iomap_tag_folio(const struct iomap *iomap, struct folio *folio,
> +		loff_t pos, size_t len)
Nit: Coding style:
I think xfs functions declares 1 parameter in each line with the ")" just after the last parameter
on the same line. Look at something like xfs_swap_extent_rmap() defined in fs/xfs/xfs_bmap_util.c.
Not sure if the maintainers have hard preferences with this - so better to cross check.
> +{
> +	struct inode *inode;
> +	struct xfs_inode *ip;
> +	struct xfs_mount *mp;
> +	xfs_agnumber_t agno;
Coding style as mentioned before:
struct inode		*inode;
struct xfs_inod		*ip;

--NR 
> +
> +	inode = folio_mapping(folio)->host;
> +	ip = XFS_I(inode);
> +	mp = ip->i_mount;
> +
> +	agno = xfs_ag_from_iomap(mp, iomap, ip, pos, len);
> +
> +	xfs_agp_set(ip, folio->index, agno, (u8)iomap->type);
> +}
> +
>  const struct iomap_write_ops xfs_iomap_write_ops = {
>  	.iomap_valid		= xfs_iomap_valid,
> +	.tag_folio		= xfs_iomap_tag_folio,
>  };
>  
>  int



  parent reply	other threads:[~2026-02-06  6:44 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20260116101236epcas5p12ba3de776976f4ea6666e16a33ab6ec4@epcas5p1.samsung.com>
2026-01-16 10:08 ` [PATCH v3 0/6] AG aware parallel writeback for XFS Kundan Kumar
     [not found]   ` <CGME20260116101241epcas5p330f9c335a096aaaefda4b7d3c38d6038@epcas5p3.samsung.com>
2026-01-16 10:08     ` [PATCH v3 1/6] iomap: add write ops hook to attach metadata to folios Kundan Kumar
     [not found]   ` <CGME20260116101245epcas5p30269c6aa35784db67e6d6ca800a683a7@epcas5p3.samsung.com>
2026-01-16 10:08     ` [PATCH v3 2/6] xfs: add helpers to pack AG prediction info for per-folio tracking Kundan Kumar
2026-01-29  0:45       ` Darrick J. Wong
2026-02-03  7:15         ` Kundan Kumar
2026-02-05 16:39           ` Darrick J. Wong
2026-02-04  7:37       ` Nirjhar Roy (IBM)
     [not found]   ` <CGME20260116101251epcas5p1cf5b48f2efb14fe4387be3053b3c3ebc@epcas5p1.samsung.com>
2026-01-16 10:08     ` [PATCH v3 3/6] xfs: add per-inode AG prediction map and dirty-AG bitmap Kundan Kumar
2026-01-29  0:44       ` Darrick J. Wong
2026-02-03  7:20         ` Kundan Kumar
2026-02-05 16:42           ` Darrick J. Wong
2026-02-05  6:44         ` Nirjhar Roy (IBM)
2026-02-05 16:32           ` Darrick J. Wong
2026-02-06  5:41             ` Nirjhar Roy (IBM)
2026-02-05  6:36       ` Nirjhar Roy (IBM)
2026-02-05 16:36         ` Darrick J. Wong
2026-02-06  5:36           ` Nirjhar Roy (IBM)
2026-02-06  5:57             ` Darrick J. Wong
2026-02-06  6:03               ` Nirjhar Roy (IBM)
2026-02-06  7:00       ` Christoph Hellwig
     [not found]   ` <CGME20260116101256epcas5p2d6125a6bcad78c33f737fdc3484aca79@epcas5p2.samsung.com>
2026-01-16 10:08     ` [PATCH v3 4/6] xfs: tag folios with AG number during buffered write via iomap attach hook Kundan Kumar
2026-01-29  0:47       ` Darrick J. Wong
2026-01-29 22:40         ` Darrick J. Wong
2026-02-03  7:32           ` Kundan Kumar
2026-02-03  7:28         ` Kundan Kumar
2026-02-05 15:56           ` Brian Foster
2026-02-06  6:44       ` Nirjhar Roy (IBM) [this message]
     [not found]   ` <CGME20260116101259epcas5p1cfa6ab02e5a01f7c46cc78df95c57ce0@epcas5p1.samsung.com>
2026-01-16 10:08     ` [PATCH v3 5/6] xfs: add per-AG writeback workqueue infrastructure Kundan Kumar
2026-01-29 22:21       ` Darrick J. Wong
2026-02-03  7:35         ` Kundan Kumar
2026-02-06  6:46       ` Christoph Hellwig
2026-02-10 11:56       ` Nirjhar Roy (IBM)
     [not found]   ` <CGME20260116101305epcas5p497cd6d9027301853669f1c1aaffbf128@epcas5p4.samsung.com>
2026-01-16 10:08     ` [PATCH v3 6/6] xfs: offload writeback by AG using per-inode dirty bitmap and per-AG workers Kundan Kumar
2026-01-29 22:34       ` Darrick J. Wong
2026-02-03  7:40         ` Kundan Kumar
2026-02-11  9:39       ` Nirjhar Roy (IBM)
2026-01-16 16:13   ` [syzbot ci] Re: AG aware parallel writeback for XFS syzbot ci
2026-01-21 19:54   ` [PATCH v3 0/6] " Brian Foster
2026-01-22 16:15     ` Kundan Kumar
2026-01-23  9:36       ` Pankaj Raghav (Samsung)
2026-01-23 13:26       ` Brian Foster
2026-01-28 18:28         ` Kundan Kumar
2026-02-06  6:25           ` Christoph Hellwig
2026-02-06 10:07             ` Kundan Kumar
2026-02-06 17:42               ` Darrick J. Wong
2026-02-09  6:30               ` Christoph Hellwig
2026-02-09 15:54             ` Kundan Kumar
2026-02-10 15:38               ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c400d2a68d87a29e17a08ed03489ca33ea46a8f4.camel@gmail.com \
    --to=nirjhar.roy.lists@gmail.com \
    --cc=amir73il@gmail.com \
    --cc=anuj20.g@samsung.com \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=cem@kernel.org \
    --cc=clm@meta.com \
    --cc=dave@stgolabs.net \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=gost.dev@samsung.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=joshi.k@samsung.com \
    --cc=kundan.kumar@samsung.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=ritesh.list@gmail.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=vishak.g@samsung.com \
    --cc=wangyufei@vivo.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox