From: Wei Yang <richard.weiyang@gmail.com>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: akpm@linux-foundation.org, david@kernel.org,
	lorenzo.stoakes@oracle.com, ziy@nvidia.com,
	baolin.wang@linux.alibaba.com, Liam.Howlett@oracle.com,
	npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
	baohua@kernel.org, lance.yang@linux.dev, linux-mm@kvack.org
Subject: Re: [Patch v2] mm/huge_memory: consolidate order-related checks into folio_check_splittable()
Date: Sun, 4 Jan 2026 02:37:56 +0000
Message-ID: <20260104023756.jufklyl3bl64fnck@master>
In-Reply-To: <20251223122539.10726-1-richard.weiyang@gmail.com>

On Tue, Dec 23, 2025 at 12:25:39PM +0000, Wei Yang wrote:
>The primary goal of the folio_check_splittable() function is to validate
>whether a folio is suitable for splitting and to bail out early if it is
>not.
>
>Currently, some order-related checks are scattered throughout the
>calling code rather than being centralized in folio_check_splittable().
>
>This commit moves all remaining order-related validation logic into
>folio_check_splittable(). This consolidation ensures that the function
>serves its intended purpose as the single place where unsplittable folios
>are detected, and improves the clarity and maintainability of the
>surrounding code.
>
>Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>Cc: Zi Yan <ziy@nvidia.com>
>
>---
[...]
>@@ -3719,28 +3723,33 @@ int folio_check_splittable(struct folio *folio, unsigned int new_order,
> 		/* order-1 is not supported for anonymous THP. */
> 		if (new_order == 1)
> 			return -EINVAL;
>-	} else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
>-		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
>-		    !mapping_large_folio_support(folio->mapping)) {
>-			/*
>-			 * We can always split a folio down to a single page
>-			 * (new_order == 0) uniformly.
>-			 *
>-			 * For any other scenario
>-			 *   a) uniform split targeting a large folio
>-			 *      (new_order > 0)
>-			 *   b) any non-uniform split
>-			 * we must confirm that the file system supports large
>-			 * folios.
>-			 *
>-			 * Note that we might still have THPs in such
>-			 * mappings, which is created from khugepaged when
>-			 * CONFIG_READ_ONLY_THP_FOR_FS is enabled. But in that
>-			 * case, the mapping does not actually support large
>-			 * folios properly.
>-			 */
>-			return -EINVAL;
>+	} else {
>+		if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
>+			if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
>+			    !mapping_large_folio_support(folio->mapping)) {
>+				/*
>+				 * We can always split a folio down to a
>+				 * single page (new_order == 0) uniformly.
>+				 *
>+				 * For any other scenario
>+				 *   a) uniform split targeting a large folio
>+				 *      (new_order > 0)
>+				 *   b) any non-uniform split
>+				 * we must confirm that the file system
>+				 * supports large folios.
>+				 *
>+				 * Note that we might still have THPs in such
>+				 * mappings, which is created from khugepaged
>+				 * when CONFIG_READ_ONLY_THP_FOR_FS is
>+				 * enabled. But in that case, the mapping does
>+				 * not actually support large folios properly.
>+				 */
>+				return -EINVAL;
>+			}
> 		}

Hi, Happy New Year to all :-)


Following the application of this patch, a warning was reported [5]. The root
cause is an attempt to uniformly split a page cache folio down to order-0 when
the mapping has a mapping_min_folio_order() > 0.
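
For illustration, the situation boils down to something like the following
(just a sketch of the constraint added by [4], not the exact code in
folio_check_splittable()):

	/* The mapping requires large folios, e.g. because the filesystem
	 * block size is larger than PAGE_SIZE, so the minimum order is > 0.
	 */
	unsigned int min_order = mapping_min_folio_order(folio->mapping);

	/* The caller asks for a uniform split down to order-0 ... */
	unsigned int new_order = 0;

	/* ... which violates the minimum order and has to be refused. */
	if (new_order < min_order)
		return -EINVAL;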

It is worth noting that the current upstream code also denies this split, but
it does so silently. This patch simply makes the violation visible.

Upon reviewing the code history, I believe the logic introduced here is
correct. The existing comment--"We can always split a folio down to a single
page"--appears to be misleading, as it does not account for the newer
constraint that a mapping may require a minimum folio order.

Below is my analysis and suggestion:
----


The story starts with [1], which introduced splitting a folio to any lower
order. The order check was then fixed by [2], which is the basis of the
current form.

When we split a large pagecache folio, there are two possibilities:

    1) khugepaged-collapsed folio
    2) filesystem-supported large folio

For case 1), the folio can only be split to order-0, which is why the check
(IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !mapping_large_folio_support(folio->mapping))
was added.

For case 2), mapping_large_folio_support() is implied to be true, and we seem
to assume the folio can be split to any lower order. That assumption was
reasonable at the time: mapping_min_folio_order() was introduced by [3] and
the min_order check by [4], both later than [1] and [2], so when [2] was
applied there was no notion of mapping_min_folio_order() yet. (I may be
missing some background here; please correct me if I am wrong.)
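
As an illustration of why case 2) needs more care nowadays: since [3] a
filesystem can require a minimum folio order for a mapping, roughly like the
sketch below (assuming the helpers from [3] are used this way; the exact call
site of course differs per filesystem):

	/* e.g. a 16K block size filesystem on a 4K PAGE_SIZE system */
	unsigned int min_order = sb->s_blocksize_bits - PAGE_SHIFT;	/* == 2 */

	/* done once when the inode's mapping is set up */
	mapping_set_folio_min_order(inode->i_mapping, min_order);

	/* From now on every folio in this mapping is at least order-2, so a
	 * split must keep new_order >= mapping_min_folio_order(mapping).
	 */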

The introduction of [3] and [4] changed the assumption made in [2]: besides
checking mapping_large_folio_support(), mapping_min_folio_order() should also
be checked. So the comment in the current code is misleading:

	/* 
	 * We can always split a folio down to a
	 * single page (new_order == 0) uniformly.
	 */

For case 1), the comment still stands, but for case 2) we should also check
min_order against mapping_min_folio_order(), which is what [4] does.

If my understanding is correct, below is my suggested change. Feel free to
correct me if I am missing something.

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 40cf59301c21..b0ba27b0f763 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3714,28 +3718,28 @@ int folio_check_splittable(struct folio *folio, unsigned int new_order,
                /* order-1 is not supported for anonymous THP. */
                if (new_order == 1)
                        return -EINVAL;
-       } else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
-               if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
-                   !mapping_large_folio_support(folio->mapping)) {
-                       /*
-                        * We can always split a folio down to a single page
-                        * (new_order == 0) uniformly.
-                        *
-                        * For any other scenario
-                        *   a) uniform split targeting a large folio
-                        *      (new_order > 0)
-                        *   b) any non-uniform split
-                        * we must confirm that the file system supports large
-                        * folios.
-                        *
-                        * Note that we might still have THPs in such
-                        * mappings, which is created from khugepaged when
-                        * CONFIG_READ_ONLY_THP_FOR_FS is enabled. But in that
-                        * case, the mapping does not actually support large
-                        * folios properly.
-                        */
-                       return -EINVAL;
+       } else {
+               /*
+                * When splitting a large pagecache folio, it has two
+                * possibilities:
+                *
+                *   1) khugepaged collapsed folio when
+                *      CONFIG_READ_ONLY_THP_FOR_FS is enabled
+                *   2) filesystem supported folio
+                *
+                * For case 1), we only support uniform split to order-0.
+                * For case 2), we need to make sure new_order is not less
+                * than mapping_min_folio_order().
+                */
+               if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
+                       if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
+                           !mapping_large_folio_support(folio->mapping)) {
+                               return -EINVAL;
+                       }
                }
+
+               if (new_order < mapping_min_folio_order(folio->mapping))
+                       return -EINVAL;
        }


The reported warning is in madvise_cold_or_pageout_pte_range(). I don't see a
way to eliminate it, since the user may specify any range to madvise(), and
uniformly splitting to order-0 is the general approach there.

[1]: 2024-02-26 c010d47f107f mm: thp: split huge page to any lower order pages
[2]: 2024-06-07 6a50c9b512f7 mm: huge_memory: fix misused mapping_large_folio_support() for anon folios
[3]: 2024-08-22 84429b675bcf fs: Allow fine-grained control of folio sizes
[4]: 2024-08-22 e220917fa507 mm: split a folio in minimum folio order chunks
[5]: https://lore.kernel.org/all/694ac438.050a0220.35954c.0000.GAE@google.com/

-- 
Wei Yang
Help you, Help me

