Re: [PATCH v10 01/10] fs: Allow fine-grained control of folio sizes

From: Ryan Roberts <ryan.roberts@arm.com>
To: "Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>,
	Matthew Wilcox <willy@infradead.org>
Cc: david@fromorbit.com, chandan.babu@oracle.com, djwong@kernel.org,
	brauner@kernel.org, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org, yang@os.amperecomputing.com,
	linux-mm@kvack.org, john.g.garry@oracle.com,
	linux-fsdevel@vger.kernel.org, hare@suse.de,
	p.raghav@samsung.com, mcgrof@kernel.org, gost.dev@samsung.com,
	cl@os.amperecomputing.com, linux-xfs@vger.kernel.org, hch@lst.de,
	Zi Yan <ziy@nvidia.com>
Subject: Re: [PATCH v10 01/10] fs: Allow fine-grained control of folio sizes
Date: Wed, 17 Jul 2024 10:59:27 +0100
Message-ID: <61806152-3450-4a4f-b81f-acc6c6aeed29@arm.com>
In-Reply-To: <20240717094621.fdobfk7coyirg5e5@quentin>

On 17/07/2024 10:46, Pankaj Raghav (Samsung) wrote:
> On Tue, Jul 16, 2024 at 04:26:10PM +0100, Matthew Wilcox wrote:
>> On Mon, Jul 15, 2024 at 11:44:48AM +0200, Pankaj Raghav (Samsung) wrote:
>>> +/*
>>> + * mapping_max_folio_size_supported() - Check the max folio size supported
>>> + *
>>> + * The filesystem should call this function at mount time if there is a
>>> + * requirement on the folio mapping size in the page cache.
>>> + */
>>> +static inline size_t mapping_max_folio_size_supported(void)
>>> +{
>>> +	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
>>> +		return 1U << (PAGE_SHIFT + MAX_PAGECACHE_ORDER);
>>> +	return PAGE_SIZE;
>>> +}
>>
>> There's no need for this to be part of this patch.  I've removed stuff
>> from this patch before that's not needed, please stop adding unnecessary
>> functions.  This would logically be part of patch 10.
> 
> That makes sense. I will move it to the last patch.
> 
>>
>>> +static inline void mapping_set_folio_order_range(struct address_space *mapping,
>>> +						 unsigned int min,
>>> +						 unsigned int max)
>>> +{
>>> +	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
>>> +		return;
>>> +
>>> +	if (min > MAX_PAGECACHE_ORDER) {
>>> +		VM_WARN_ONCE(1,
>>> +	"min order > MAX_PAGECACHE_ORDER. Setting min_order to MAX_PAGECACHE_ORDER");
>>> +		min = MAX_PAGECACHE_ORDER;
>>> +	}
>>
>> This is really too much.  It's something that will never happen.  Just
>> delete the message.
>>
>>> +	if (max > MAX_PAGECACHE_ORDER) {
>>> +		VM_WARN_ONCE(1,
>>> +	"max order > MAX_PAGECACHE_ORDER. Setting max_order to MAX_PAGECACHE_ORDER");
>>> +		max = MAX_PAGECACHE_ORDER;
>>
>> Absolutely not.  If the filesystem declares it can support a block size
>> of 4TB, then good for it.  We just silently clamp it.
> 
> Hmm, but you raised the point about clamping in the previous patches[1]
> after Ryan pointed out that we should not silently clamp the order.
> 
> ```
>> It seems strange to silently clamp these? Presumably for the bs>ps usecase,
>> whatever values are passed in are a hard requirement? So wouldn't want them to
>> be silently reduced. (Especially given the recent change to reduce the size of
>> MAX_PAGECACHE_ORDER to less than PMD size in some cases).
> 
> Hm, yes.  We should probably make this return an errno.  Including
> returning an errno for !IS_ENABLED() and min > 0.
> ```
> 
> It was not clear from the conversation in the previous patches that we
> decided to just clamp the order (like it was done before).
> 
> So let's just stick with how it was done before where we clamp the
> values if min and max > MAX_PAGECACHE_ORDER?
> 
> [1] https://lore.kernel.org/linux-fsdevel/Zoa9rQbEUam467-q@casper.infradead.org/

The way I see it, there are 2 approaches we could take:

1. Implement mapping_max_folio_size_supported() and write a kernel-doc comment
for mapping_set_folio_order_range() stating that min must be <= max, and that
max must not exceed the limit reported by mapping_max_folio_size_supported().
Then emit a VM_WARN() in mapping_set_folio_order_range() if those constraints
are violated, and clamp so that the result is safe from the page cache's
perspective. The VM_WARN()s can go inline in the if statements to keep things
clean. The FS is responsible for checking mapping_max_folio_size_supported()
and ensuring that min and max meet the requirements.

2. Return an error from mapping_set_folio_order_range() (and from the other
functions that set min/max). No warning is needed, and no state is changed if
an error is returned. The FS can emit its own warning on error if it wants.

Personally I prefer option 2, but 1 is definitely less churn.
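
To make the difference concrete, here is a rough userspace sketch of the two
behaviours. It is not kernel code and not a proposed patch: the sketch_* names,
struct sketch_mapping and the SKETCH_MAX_PAGECACHE_ORDER value are all
hypothetical stand-ins, fprintf() stands in for VM_WARN(), and raising max up
to min in the clamping variant is an assumption about what "make it safe"
would mean.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for MAX_PAGECACHE_ORDER; the value is illustrative. */
#define SKETCH_MAX_PAGECACHE_ORDER 8u

/* Hypothetical stand-in for the order range kept per address_space. */
struct sketch_mapping {
	unsigned int min_order;
	unsigned int max_order;
};

/*
 * Option 1: document the constraints, warn if they are violated, and clamp
 * so the stored range is always safe. The caller is expected to have
 * checked the supported maximum beforehand.
 */
static void sketch_set_order_range_clamp(struct sketch_mapping *m,
					 unsigned int min, unsigned int max)
{
	if (min > SKETCH_MAX_PAGECACHE_ORDER) {
		fprintf(stderr, "warn: min order %u too large, clamping\n", min);
		min = SKETCH_MAX_PAGECACHE_ORDER;
	}
	if (max > SKETCH_MAX_PAGECACHE_ORDER) {
		fprintf(stderr, "warn: max order %u too large, clamping\n", max);
		max = SKETCH_MAX_PAGECACHE_ORDER;
	}
	if (max < min) {
		/* Assumption: an inverted range is repaired by raising max. */
		fprintf(stderr, "warn: max order %u < min order %u\n", max, min);
		max = min;
	}
	m->min_order = min;
	m->max_order = max;
}

/*
 * Option 2: reject an invalid range with an error and leave the existing
 * state untouched. The caller decides whether to warn or fail the mount.
 */
static int sketch_set_order_range_checked(struct sketch_mapping *m,
					  unsigned int min, unsigned int max)
{
	if (min > max || max > SKETCH_MAX_PAGECACHE_ORDER)
		return -EINVAL;

	m->min_order = min;
	m->max_order = max;
	return 0;
}
```

With the first variant, a request for (min=2, max=20) ends up stored as (2, 8)
plus a warning. With the second, the same request returns -EINVAL and the
previous range is left as it was, so the FS sees the failure directly.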

Thanks,
Ryan
