linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: "Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	mcgrof@kernel.org, gost.dev@samsung.com,
	akpm@linux-foundation.org, kbusch@kernel.org, djwong@kernel.org,
	chandan.babu@oracle.com, p.raghav@samsung.com,
	linux-kernel@vger.kernel.org, hare@suse.de, willy@infradead.org,
	linux-mm@kvack.org
Subject: Re: [RFC v2 14/14] xfs: enable block size larger than page size support
Date: Wed, 14 Feb 2024 08:34:01 +1100	[thread overview]
Message-ID: <ZcvgSSbIqm4N6TVJ@dread.disaster.area> (raw)
In-Reply-To: <20240213093713.1753368-15-kernel@pankajraghav.com>

On Tue, Feb 13, 2024 at 10:37:13AM +0100, Pankaj Raghav (Samsung) wrote:
> From: Pankaj Raghav <p.raghav@samsung.com>
> 
> Page cache now has the ability to have a minimum order when allocating
> a folio which is a prerequisite to add support for block size > page
> size. Enable it in XFS under CONFIG_XFS_LBS.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
> ---
>  fs/xfs/xfs_icache.c | 8 ++++++--
>  fs/xfs/xfs_super.c  | 8 +++-----
>  2 files changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index dba514a2c84d..9de81caf7ad4 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -73,6 +73,7 @@ xfs_inode_alloc(
>  	xfs_ino_t		ino)
>  {
>  	struct xfs_inode	*ip;
> +	int			min_order = 0;
>  
>  	/*
>  	 * XXX: If this didn't occur in transactions, we could drop GFP_NOFAIL
> @@ -88,7 +89,8 @@ xfs_inode_alloc(
>  	/* VFS doesn't initialise i_mode or i_state! */
>  	VFS_I(ip)->i_mode = 0;
>  	VFS_I(ip)->i_state = 0;
> -	mapping_set_large_folios(VFS_I(ip)->i_mapping);
> +	min_order = max(min_order, ilog2(mp->m_sb.sb_blocksize) - PAGE_SHIFT);
> +	mapping_set_folio_orders(VFS_I(ip)->i_mapping, min_order, MAX_PAGECACHE_ORDER);

That's pretty nasty. You're using max() to hide underflow in the
subtraction to clamp the value to zero. And you don't need ilog2()
because we have the log of the block size in the superblock already.

	int			min_order = 0;
	.....
	if (mp->m_sb.sb_blocksize > PAGE_SIZE)
		min_order = mp->m_sb.sb_blocklog - PAGE_SHIFT;

But, really why recalculate this -constant- on every inode
allocation?  That's a very hot path, so this should be set in the
M_IGEO(mp) structure (mp->m_ino_geo) at mount time and then the code
is simply:

	mapping_set_folio_orders(VFS_I(ip)->i_mapping,
			M_IGEO(mp)->min_folio_order, MAX_PAGECACHE_ORDER);

We already access the M_IGEO(mp) structure every inode allocation,
so there's little in way of additional cost here....

> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 5a2512d20bd0..6a3f0f6727eb 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1625,13 +1625,11 @@ xfs_fs_fill_super(
>  		goto out_free_sb;
>  	}
>  
> -	/*
> -	 * Until this is fixed only page-sized or smaller data blocks work.
> -	 */
> -	if (mp->m_sb.sb_blocksize > PAGE_SIZE) {
> +	if (!IS_ENABLED(CONFIG_XFS_LBS) && mp->m_sb.sb_blocksize > PAGE_SIZE) {
>  		xfs_warn(mp,
>  		"File system with blocksize %d bytes. "
> -		"Only pagesize (%ld) or less will currently work.",
> +		"Only pagesize (%ld) or less will currently work. "
> +		"Enable Experimental CONFIG_XFS_LBS for this support",
>  				mp->m_sb.sb_blocksize, PAGE_SIZE);
>  		error = -ENOSYS;
>  		goto out_free_sb;

This should just issue a warning if bs > ps.

	if (mp->m_sb.sb_blocksize > PAGE_SIZE) {
  		xfs_warn(mp,
"EXPERIMENTAL: Filesystem with Large Block Size (%d bytes) enabled.",
			mp->m_sb.sb_blocksize);
	}

-Dave.
-- 
Dave Chinner
david@fromorbit.com


  parent reply	other threads:[~2024-02-13 21:34 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-13  9:36 [RFC v2 00/14] enable bs > ps in XFS Pankaj Raghav (Samsung)
2024-02-13  9:37 ` [RFC v2 01/14] fs: Allow fine-grained control of folio sizes Pankaj Raghav (Samsung)
2024-02-13 12:03   ` Hannes Reinecke
2024-02-13 16:34   ` Darrick J. Wong
2024-02-13 21:05     ` Pankaj Raghav (Samsung)
2024-02-13 21:29       ` Darrick J. Wong
2024-02-14 19:00         ` Matthew Wilcox
2024-02-15 10:34           ` Pankaj Raghav (Samsung)
2024-02-14 18:49   ` Matthew Wilcox
2024-02-15 10:21     ` Pankaj Raghav (Samsung)
2024-02-13  9:37 ` [RFC v2 02/14] filemap: align the index to mapping_min_order in the page cache Pankaj Raghav (Samsung)
2024-02-13 12:20   ` Hannes Reinecke
2024-02-13 21:13     ` Pankaj Raghav (Samsung)
2024-02-13 22:00   ` Dave Chinner
2024-02-13  9:37 ` [RFC v2 03/14] filemap: use mapping_min_order while allocating folios Pankaj Raghav (Samsung)
2024-02-13 14:58   ` Hannes Reinecke
2024-02-13 16:38   ` Darrick J. Wong
2024-02-13 22:05   ` Dave Chinner
2024-02-14 10:13     ` Pankaj Raghav (Samsung)
2024-02-13  9:37 ` [RFC v2 04/14] readahead: set file_ra_state->ra_pages to be at least mapping_min_order Pankaj Raghav (Samsung)
2024-02-13 14:59   ` Hannes Reinecke
2024-02-13 16:46   ` Darrick J. Wong
2024-02-13 22:09   ` Dave Chinner
2024-02-14 13:32     ` Pankaj Raghav (Samsung)
2024-02-14 13:53       ` Pankaj Raghav (Samsung)
2024-02-13  9:37 ` [RFC v2 05/14] readahead: align index to mapping_min_order in ondemand_ra and force_ra Pankaj Raghav (Samsung)
2024-02-13 15:00   ` Hannes Reinecke
2024-02-13 16:46   ` Darrick J. Wong
2024-02-13 22:29   ` Dave Chinner
2024-02-14 15:10     ` Pankaj Raghav (Samsung)
2024-02-13  9:37 ` [RFC v2 06/14] readahead: rework loop in page_cache_ra_unbounded() Pankaj Raghav (Samsung)
2024-02-13 16:47   ` Darrick J. Wong
2024-02-13  9:37 ` [RFC v2 07/14] readahead: allocate folios with mapping_min_order in ra_(unbounded|order) Pankaj Raghav (Samsung)
2024-02-13 15:01   ` Hannes Reinecke
2024-02-13 16:47   ` Darrick J. Wong
2024-02-13  9:37 ` [RFC v2 08/14] mm: do not split a folio if it has minimum folio order requirement Pankaj Raghav (Samsung)
2024-02-13 15:02   ` Hannes Reinecke
2024-02-13  9:37 ` [RFC v2 09/14] mm: Support order-1 folios in the page cache Pankaj Raghav (Samsung)
2024-02-13 15:03   ` Hannes Reinecke
2024-02-13  9:37 ` [RFC v2 10/14] iomap: fix iomap_dio_zero() for fs bs > system page size Pankaj Raghav (Samsung)
2024-02-13 15:06   ` Hannes Reinecke
2024-02-13 16:30   ` Darrick J. Wong
2024-02-13 21:27     ` Pankaj Raghav (Samsung)
2024-02-13 21:30       ` Darrick J. Wong
2024-02-14 15:13         ` Pankaj Raghav (Samsung)
2024-02-13  9:37 ` [RFC v2 11/14] xfs: expose block size in stat Pankaj Raghav (Samsung)
2024-02-13 16:27   ` Darrick J. Wong
2024-02-13 21:32     ` Pankaj Raghav (Samsung)
2024-02-13  9:37 ` [RFC v2 12/14] xfs: make the calculation generic in xfs_sb_validate_fsb_count() Pankaj Raghav (Samsung)
2024-02-13 16:26   ` Darrick J. Wong
2024-02-13 21:48     ` Pankaj Raghav (Samsung)
2024-02-13 22:44       ` Dave Chinner
2024-02-14 15:51         ` Pankaj Raghav (Samsung)
2024-02-13  9:37 ` [RFC v2 13/14] xfs: add an experimental CONFIG_XFS_LBS option Pankaj Raghav (Samsung)
2024-02-13 16:39   ` Darrick J. Wong
2024-02-13 21:19   ` Dave Chinner
2024-02-13 21:54     ` Pankaj Raghav (Samsung)
2024-02-13 22:45       ` Dave Chinner
2024-02-13  9:37 ` [RFC v2 14/14] xfs: enable block size larger than page size support Pankaj Raghav (Samsung)
2024-02-13 16:20   ` Darrick J. Wong
2024-02-14 16:40     ` Pankaj Raghav (Samsung)
2024-02-13 21:34   ` Dave Chinner [this message]
2024-02-14 16:35     ` Pankaj Raghav (Samsung)
2024-02-15 22:17       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZcvgSSbIqm4N6TVJ@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=akpm@linux-foundation.org \
    --cc=chandan.babu@oracle.com \
    --cc=djwong@kernel.org \
    --cc=gost.dev@samsung.com \
    --cc=hare@suse.de \
    --cc=kbusch@kernel.org \
    --cc=kernel@pankajraghav.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=p.raghav@samsung.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox