Re: [PATCH] mm/huge_memory: consolidate order-related checks into folio_split_supported()

From: Zi Yan <ziy@nvidia.com>
To: "David Hildenbrand (Red Hat)" <david@kernel.org>,
	Wei Yang <richard.weiyang@gmail.com>
Cc: willy@infradead.org, akpm@linux-foundation.org,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, baolin.wang@linux.alibaba.com,
	npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
	baohua@kernel.org, lance.yang@linux.dev,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm/huge_memory: consolidate order-related checks into folio_split_supported()
Date: Fri, 14 Nov 2025 07:43:38 -0500
Message-ID: <01FABE3A-AD4E-4A09-B971-C89503A848DF@nvidia.com>
In-Reply-To: <827fd8d8-c327-4867-9693-ec06cded55a9@kernel.org>

On 14 Nov 2025, at 3:49, David Hildenbrand (Red Hat) wrote:

> On 14.11.25 08:57, Wei Yang wrote:
>> The primary goal of the folio_split_supported() function is to validate
>> whether a folio is suitable for splitting and to bail out early if it is
>> not.
>>
>> Currently, some order-related checks are scattered throughout the
>> calling code rather than being centralized in folio_split_supported().
>>
>> This commit moves all remaining order-related validation logic into
>> folio_split_supported(). This consolidation ensures that the function
>> serves its intended purpose as a single point of failure and improves
>> the clarity and maintainability of the surrounding code.
>
> Combining the EINVAL handling sounds reasonable.
>
>>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> ---
>>   include/linux/pagemap.h |  6 +++
>>   mm/huge_memory.c        | 88 +++++++++++++++++++++--------------------
>>   2 files changed, 51 insertions(+), 43 deletions(-)
>>
>> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
>> index 09b581c1d878..d8c8df629b90 100644
>> --- a/include/linux/pagemap.h
>> +++ b/include/linux/pagemap.h
>> @@ -516,6 +516,12 @@ static inline bool mapping_large_folio_support(const struct address_space *mappi
>>   	return mapping_max_folio_order(mapping) > 0;
>>   }
>>
>> +static inline bool
>> +mapping_folio_order_supported(const struct address_space *mapping, unsigned int order)
>> +{
>> +	return (order >= mapping_min_folio_order(mapping) && order <= mapping_max_folio_order(mapping));
>> +}
>
> (unnecessary () and unnecessary long line)
>
> Style in the file seems to want:
>
> static inline bool mapping_folio_order_supported(const struct address_space *mapping,
> 						 unsigned int order)
> {
> 	return order >= mapping_min_folio_order(mapping) &&
> 	       order <= mapping_max_folio_order(mapping);
> }
>
>
> The mapping_max_folio_order() check is new now. What is the default value of that? Is it always initialized properly?
>
>> +
>>   /* Return the maximum folio size for this pagecache mapping, in bytes. */
>>   static inline size_t mapping_max_folio_size(const struct address_space *mapping)
>>   {
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 0184cd915f44..68faac843527 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -3690,34 +3690,58 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>>   bool folio_split_supported(struct folio *folio, unsigned int new_order,
>>   		enum split_type split_type, bool warns)
>>   {
>> +	const int old_order = folio_order(folio);
>
> While at it, make it "unsigned int" like new_order.
>
>> +
>> +	if (new_order >= old_order)
>> +		return -EINVAL;
>> +
>>   	if (folio_test_anon(folio)) {
>>   		/* order-1 is not supported for anonymous THP. */
>>   		VM_WARN_ONCE(warns && new_order == 1,
>>   				"Cannot split to order-1 folio");
>>   		if (new_order == 1)
>>   			return false;
>> -	} else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
>> -		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
>> -		    !mapping_large_folio_support(folio->mapping)) {
>> -			/*
>> -			 * We can always split a folio down to a single page
>> -			 * (new_order == 0) uniformly.
>> -			 *
>> -			 * For any other scenario
>> -			 *   a) uniform split targeting a large folio
>> -			 *      (new_order > 0)
>> -			 *   b) any non-uniform split
>> -			 * we must confirm that the file system supports large
>> -			 * folios.
>> -			 *
>> -			 * Note that we might still have THPs in such
>> -			 * mappings, which is created from khugepaged when
>> -			 * CONFIG_READ_ONLY_THP_FOR_FS is enabled. But in that
>> -			 * case, the mapping does not actually support large
>> -			 * folios properly.
>> -			 */
>> +	} else {
>> +		const struct address_space *mapping = NULL;
>> +
>> +		mapping = folio->mapping;
>
> const struct address_space *mapping = folio->mapping;
>
>> +
>> +		/* Truncated ? */
>> +		/*
>> +		 * TODO: add support for large shmem folio in swap cache.
>> +		 * When shmem is in swap cache, mapping is NULL and
>> +		 * folio_test_swapcache() is true.
>> +		 */
>> +		if (!mapping)
>> +			return false;
>> +
>> +		/*
>> +		 * We have two types of split:
>> +		 *
>> +		 *   a) uniform split: split folio directly to new_order.
>> +		 *   b) non-uniform split: create after-split folios with
>> +		 *      orders from (old_order - 1) to new_order.
>> +		 *
>> +		 * For file system, we encodes it supported folio order in
>> +		 * mapping->flags, which could be checked by
>> +		 * mapping_folio_order_supported().
>> +		 *
>> +		 * With these knowledge, we can know whether folio support
>> +		 * split to new_order by:
>> +		 *
>> +		 *   1. check new_order is supported first
>> +		 *   2. check (old_order - 1) is supported if
>> +		 *      SPLIT_TYPE_NON_UNIFORM
>> +		 */
>> +		if (!mapping_folio_order_supported(mapping, new_order)) {
>> +			VM_WARN_ONCE(warns,
>> +				"Cannot split file folio to unsupported order: %d", new_order);
>
> Is that really worth a VM_WARN_ONCE? We didn't have that previously IIUC, we would only return
> -EINVAL.

No, and it causes an undesired warning when LBS folios are enabled. I
explicitly removed this warning one month ago in the LBS-related patch[1].
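
For illustration only (this is not part of the patch or of the fix in [1]),
the behavior being asked for can be sketched in userspace C: the order range
check fails quietly and reports "not supported" without warning. The
mock_mapping struct, its min_order/max_order fields, and the mock_* helpers
are hypothetical stand-ins; the real limits are encoded in mapping->flags.
The LBS-like values (minimum order 2, maximum order 9) are assumptions
chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the folio order limits of a mapping. */
struct mock_mapping {
	unsigned int min_order;
	unsigned int max_order;
};

/* Range check in the style suggested in the review above. */
static bool mock_order_supported(const struct mock_mapping *mapping,
				 unsigned int order)
{
	return order >= mapping->min_order &&
	       order <= mapping->max_order;
}

/* Unsupported requests return false silently; no warning is emitted. */
static bool mock_split_supported(const struct mock_mapping *mapping,
				 unsigned int old_order,
				 unsigned int new_order, bool non_uniform)
{
	if (new_order >= old_order)
		return false;
	if (!mock_order_supported(mapping, new_order))
		return false;
	/* A non-uniform split also creates a folio of order old_order - 1. */
	if (non_uniform && !mock_order_supported(mapping, old_order - 1))
		return false;
	return true;
}

/* Small self-check on an assumed LBS-like mapping. */
static void mock_selfcheck(void)
{
	const struct mock_mapping lbs = { .min_order = 2, .max_order = 9 };

	/* Order 0 is below the minimum: an expected, quiet rejection. */
	assert(!mock_split_supported(&lbs, 9, 0, false));
	/* Splitting to the minimum order is accepted either way. */
	assert(mock_split_supported(&lbs, 9, 2, false));
	assert(mock_split_supported(&lbs, 9, 2, true));
}
```

In this sketch a split to order 0 on such a mapping is an ordinary, expected
rejection, which is why a warning on that path would be noise.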

It is so frustrating to see this part of the patch. Wei gave his Reviewed-by
on the aforementioned patch and still added this warning blindly. I am not
sure if Wei understands what he is doing, since he threw the idea to me and I
told him to just move the code without changing the logic, but he insisted on
doing it his own way and failed[2]. This retry is still wrong.

Wei, please make sure you understand the code before sending any patch.

[1] https://lore.kernel.org/linux-mm/20251017013630.139907-1-ziy@nvidia.com/
[2] https://lore.kernel.org/linux-mm/20251114030301.hkestzrk534ik7q4@master/

Best Regards,
Yan, Zi
