* [PATCH v3 0/4] Improve folio split related functions
@ 2025-11-26  3:50 Zi Yan
  2025-11-26  3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26  3:50 UTC (permalink / raw)
  To: David Hildenbrand, Lorenzo Stoakes
  Cc: Andrew Morton, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
	Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel

Hi all,

This patchset improves several folio split related functions to avoid
future misuse. The changes are:

1. Consolidated the folio splittability checks by moving the truncated
   folio, huge zero folio, and writeback folio checks into
   folio_split_supported(). Changed the function's return type from bool
   to int and renamed it to folio_check_splittable() for clarity.

2. Replaced can_split_folio() with an open-coded comparison of
   folio_expected_ref_count() and folio_ref_count(), and introduced
   folio_cache_ref_count().

3. Changed min_order_for_split() to always return an order.

4. Fixed folio split stats counting.

Motivation
===
This is based on Wei's observation[1] and solves several potential
issues:
1. A NULL folio->mapping dereference in try_folio_split_to_order() if it
   is called on a truncated folio.
2. The unhandled negative return value of min_order_for_split() in
   mm/memory-failure.c.

There is no bug in the current code.

The code is based on mm-new with V2 reverted and can replace V2 cleanly
on the mm-new branch.


Changelog
===
From V2[3]:
1. Removed "bool warns" parameter from folio_check_splittable().

2. Removed all warnings in folio_check_splittable() and added a single
   warning in its caller, __folio_split() instead.

3. Spelled out in the folio_check_splittable() comment that folios in the
   swapcache without a mapping can be shmem or to-be-anon folios.

4. Renamed folio_cache_references() to folio_cache_ref_count().

5. Removed the extra_pins variable.

6. Replaced folio_expected_ref_count() with folio_cache_ref_count() for
   the folio_ref_unfreeze() calls in __folio_freeze_and_split_unmapped(),
   since the two are equivalent at those call sites.


From RFC[2]:
1. Renamed folio_split_supported() to folio_check_splittable(), changed
   its return type from bool to int to return error code directly, and
   added kernel-doc.

2. Moved the truncated folio, huge zero folio, and writeback checks into
   folio_check_splittable().

3. Changed the huge zero folio check's error number from -EBUSY to -EINVAL.

4. Replaced can_split_folio() with open-coded refcount checks.

5. Changed min_order_for_split() to return 0 for truncated folios instead
   of -EBUSY and added kernel-doc.

6. Fixed folio split stats counting.

Comments and feedback are welcome.

Link: https://lore.kernel.org/all/20251120004735.52z7r4xmogw7mbsj@master/ [1]
Link: https://lore.kernel.org/all/20251120035953.1115736-1-ziy@nvidia.com/ [2]
Link: https://lore.kernel.org/all/20251122025529.1562592-1-ziy@nvidia.com/ [3]


Zi Yan (4):
  mm/huge_memory: change folio_split_supported() to
    folio_check_splittable()
  mm/huge_memory: replace can_split_folio() with direct refcount
    calculation
  mm/huge_memory: make min_order_for_split() always return an order
  mm/huge_memory: fix folio split stats counting

 include/linux/huge_mm.h |  13 ++--
 mm/huge_memory.c        | 161 ++++++++++++++++++++++------------------
 mm/vmscan.c             |   3 +-
 3 files changed, 97 insertions(+), 80 deletions(-)

-- 
2.51.0




* [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
  2025-11-26  3:50 [PATCH v3 0/4] Improve folio split related functions Zi Yan
@ 2025-11-26  3:50 ` Zi Yan
  2025-11-26  4:14   ` Balbir Singh
                     ` (2 more replies)
  2025-11-26  3:50 ` [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation Zi Yan
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26  3:50 UTC (permalink / raw)
  To: David Hildenbrand, Lorenzo Stoakes
  Cc: Andrew Morton, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
	Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel

folio_split_supported(), as used in try_folio_split_to_order(), requires
folio->mapping to be non-NULL, but the current try_folio_split_to_order()
does not check that. There is no issue in the current code, since
try_folio_split_to_order() is only used in truncate_inode_partial_folio(),
where folio->mapping is not NULL.

To prevent future misuse, move the folio->mapping NULL check (i.e., the
folio is truncated) into folio_split_supported(). Since the folio->mapping
NULL check returns -EBUSY while folio_split_supported() == false means
-EINVAL, change the folio_split_supported() return type from bool to int
and return the error numbers accordingly. Rename folio_split_supported()
to folio_check_splittable() to match the return type change.

While at it, move the is_huge_zero_folio() and folio_test_writeback()
checks into folio_check_splittable() and add kernel-doc.

Remove all warnings inside folio_check_splittable() and warn in
__folio_split() instead, so that the bool warns parameter can be removed.
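
A minimal userspace model of the new convention (illustrative only: the
checks, names, and values below are simplified stand-ins; the real change
is in the diff):

#include <errno.h>
#include <stdio.h>

/* Stand-in for folio_check_splittable(): one errno-style return encodes
 * both "unsplittable" (-EINVAL) and "racing truncation" (-EBUSY). */
static int check_splittable(int is_anon, int has_mapping, unsigned int new_order)
{
	if (!is_anon && !has_mapping)
		return -EBUSY;		/* folio got truncated: retryable */
	if (is_anon && new_order == 1)
		return -EINVAL;		/* order-1 anon THP is unsupported */
	return 0;
}

int main(void)
{
	int ret = check_splittable(/* is_anon */ 1, /* has_mapping */ 1, 1);

	/* The single warning now lives at the caller, as in __folio_split(). */
	if (ret == -EINVAL)
		fprintf(stderr, "Tried to split an unsplittable folio\n");
	return ret ? 1 : 0;
}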

Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
---
 include/linux/huge_mm.h |  6 ++--
 mm/huge_memory.c        | 76 +++++++++++++++++++++++------------------
 2 files changed, 46 insertions(+), 36 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 1d439de1ca2c..66105a90b4c3 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -375,8 +375,8 @@ int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list
 int folio_split_unmapped(struct folio *folio, unsigned int new_order);
 int min_order_for_split(struct folio *folio);
 int split_folio_to_list(struct folio *folio, struct list_head *list);
-bool folio_split_supported(struct folio *folio, unsigned int new_order,
-		enum split_type split_type, bool warns);
+int folio_check_splittable(struct folio *folio, unsigned int new_order,
+			   enum split_type split_type);
 int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
 		struct list_head *list);
 
@@ -407,7 +407,7 @@ static inline int split_huge_page_to_order(struct page *page, unsigned int new_o
 static inline int try_folio_split_to_order(struct folio *folio,
 		struct page *page, unsigned int new_order)
 {
-	if (!folio_split_supported(folio, new_order, SPLIT_TYPE_NON_UNIFORM, /* warns= */ false))
+	if (folio_check_splittable(folio, new_order, SPLIT_TYPE_NON_UNIFORM))
 		return split_huge_page_to_order(&folio->page, new_order);
 	return folio_split(folio, new_order, page, NULL);
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 041b554c7115..771df0c02a4a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3688,15 +3688,40 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 	return 0;
 }
 
-bool folio_split_supported(struct folio *folio, unsigned int new_order,
-		enum split_type split_type, bool warns)
+/**
+ * folio_check_splittable() - check if a folio can be split to a given order
+ * @folio: folio to be split
+ * @new_order: the smallest order of the after split folios (since buddy
+ *             allocator like split generates folios with orders from @folio's
+ *             order - 1 to new_order).
+ * @split_type: uniform or non-uniform split
+ *
+ * folio_check_splittable() checks if @folio can be split to @new_order using
+ * @split_type method. The truncated folio check must come first.
+ *
+ * Context: folio must be locked.
+ *
+ * Return: 0 - @folio can be split to @new_order, otherwise an error number is
+ * returned.
+ */
+int folio_check_splittable(struct folio *folio, unsigned int new_order,
+			   enum split_type split_type)
 {
+	VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
+	/*
+	 * Folios that just got truncated cannot get split. Signal to the
+	 * caller that there was a race.
+	 *
+	 * TODO: this will also currently refuse folios without a mapping in the
+	 * swapcache (shmem or to-be-anon folios).
+	 */
+	if (!folio_test_anon(folio) && !folio->mapping)
+		return -EBUSY;
+
 	if (folio_test_anon(folio)) {
 		/* order-1 is not supported for anonymous THP. */
-		VM_WARN_ONCE(warns && new_order == 1,
-				"Cannot split to order-1 folio");
 		if (new_order == 1)
-			return false;
+			return -EINVAL;
 	} else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
 		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
 		    !mapping_large_folio_support(folio->mapping)) {
@@ -3717,9 +3742,7 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
 			 * case, the mapping does not actually support large
 			 * folios properly.
 			 */
-			VM_WARN_ONCE(warns,
-				"Cannot split file folio to non-0 order");
-			return false;
+			return -EINVAL;
 		}
 	}
 
@@ -3732,12 +3755,16 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
 	 * here.
 	 */
 	if ((split_type == SPLIT_TYPE_NON_UNIFORM || new_order) && folio_test_swapcache(folio)) {
-		VM_WARN_ONCE(warns,
-			"Cannot split swapcache folio to non-0 order");
-		return false;
+		return -EINVAL;
 	}
 
-	return true;
+	if (is_huge_zero_folio(folio))
+		return -EINVAL;
+
+	if (folio_test_writeback(folio))
+		return -EBUSY;
+
+	return 0;
 }
 
 static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
@@ -3922,7 +3949,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	int remap_flags = 0;
 	int extra_pins, ret;
 	pgoff_t end = 0;
-	bool is_hzp;
 
 	VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
 	VM_WARN_ON_ONCE_FOLIO(!folio_test_large(folio), folio);
@@ -3930,31 +3956,15 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	if (folio != page_folio(split_at) || folio != page_folio(lock_at))
 		return -EINVAL;
 
-	/*
-	 * Folios that just got truncated cannot get split. Signal to the
-	 * caller that there was a race.
-	 *
-	 * TODO: this will also currently refuse shmem folios that are in the
-	 * swapcache.
-	 */
-	if (!is_anon && !folio->mapping)
-		return -EBUSY;
-
 	if (new_order >= old_order)
 		return -EINVAL;
 
-	if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
-		return -EINVAL;
-
-	is_hzp = is_huge_zero_folio(folio);
-	if (is_hzp) {
-		pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
-		return -EBUSY;
+	ret = folio_check_splittable(folio, new_order, split_type);
+	if (ret) {
+		VM_WARN_ONCE(ret == -EINVAL, "Tried to split an unsplittable folio");
+		return ret;
 	}
 
-	if (folio_test_writeback(folio))
-		return -EBUSY;
-
 	if (is_anon) {
 		/*
 		 * The caller does not necessarily hold an mmap_lock that would
-- 
2.51.0




* [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation
  2025-11-26  3:50 [PATCH v3 0/4] Improve folio split related functions Zi Yan
  2025-11-26  3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
@ 2025-11-26  3:50 ` Zi Yan
  2025-11-26  9:56   ` David Hildenbrand (Red Hat)
  2025-11-26  3:50 ` [PATCH v3 3/4] mm/huge_memory: make min_order_for_split() always return an order Zi Yan
  2025-11-26  3:50 ` [PATCH v3 4/4] mm/huge_memory: fix folio split stats counting Zi Yan
  3 siblings, 1 reply; 12+ messages in thread
From: Zi Yan @ 2025-11-26  3:50 UTC (permalink / raw)
  To: David Hildenbrand, Lorenzo Stoakes
  Cc: Andrew Morton, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
	Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel

can_split_folio() is just a refcount comparison, making sure only the
split caller holds an extra pin. Open code it with
folio_expected_ref_count() != folio_ref_count() - 1. For the extra_pins
used by folio_ref_freeze(), add folio_cache_ref_count() to calculate it.
Also replace folio_expected_ref_count() with folio_cache_ref_count() for
the folio_ref_unfreeze() calls, since the two return the same value when
a folio is frozen and folio_cache_ref_count() avoids the unnecessary
folio_mapcount() in folio_expected_ref_count()'s implementation.
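
A small userspace model of the pin accounting behind the open-coded check
(the struct and numbers are made up for illustration; the kernel helpers
being mimicked are named in the comments):

#include <assert.h>

struct folio_model {
	int ref_count;	/* folio_ref_count() */
	int mapcount;	/* folio_mapcount() */
	int cache_refs;	/* folio_cache_ref_count(): nr_pages if in page/swap cache */
};

/* Simplified folio_expected_ref_count(): mapcount plus cache references
 * (the real helper also counts e.g. PG_private). */
static int expected_refs(const struct folio_model *f)
{
	return f->mapcount + f->cache_refs;
}

int main(void)
{
	/* Unmapped 512-page pagecache folio, pinned only by the split caller. */
	struct folio_model f = { .ref_count = 513, .mapcount = 0, .cache_refs = 512 };

	/* can_split_folio(folio, 1, NULL) becomes this comparison. */
	assert(expected_refs(&f) == f.ref_count - 1);

	/* Any extra pin (e.g. from GUP) bumps ref_count and the check fails. */
	f.ref_count++;
	assert(expected_refs(&f) != f.ref_count - 1);
	return 0;
}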

Suggested-by: David Hildenbrand (Red Hat) <david@kernel.org>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: Balbir Singh <balbirs@nvidia.com>
---
 include/linux/huge_mm.h |  1 -
 mm/huge_memory.c        | 48 ++++++++++++++++-------------------------
 mm/vmscan.c             |  3 ++-
 3 files changed, 21 insertions(+), 31 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 66105a90b4c3..8a52e20387b0 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -369,7 +369,6 @@ enum split_type {
 	SPLIT_TYPE_NON_UNIFORM,
 };
 
-bool can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins);
 int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		unsigned int new_order);
 int folio_split_unmapped(struct folio *folio, unsigned int new_order);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 771df0c02a4a..cab429d8fe83 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3455,23 +3455,6 @@ static void lru_add_split_folio(struct folio *folio, struct folio *new_folio,
 	}
 }
 
-/* Racy check whether the huge page can be split */
-bool can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins)
-{
-	int extra_pins;
-
-	/* Additional pins from page cache */
-	if (folio_test_anon(folio))
-		extra_pins = folio_test_swapcache(folio) ?
-				folio_nr_pages(folio) : 0;
-	else
-		extra_pins = folio_nr_pages(folio);
-	if (pextra_pins)
-		*pextra_pins = extra_pins;
-	return folio_mapcount(folio) == folio_ref_count(folio) - extra_pins -
-					caller_pins;
-}
-
 static bool page_range_has_hwpoisoned(struct page *page, long nr_pages)
 {
 	for (; nr_pages; page++, nr_pages--)
@@ -3767,11 +3750,19 @@ int folio_check_splittable(struct folio *folio, unsigned int new_order,
 	return 0;
 }
 
+/* Number of folio references from the pagecache or the swapcache. */
+static unsigned int folio_cache_ref_count(const struct folio *folio)
+{
+	if (folio_test_anon(folio) && !folio_test_swapcache(folio))
+		return 0;
+	return folio_nr_pages(folio);
+}
+
 static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
 					     struct page *split_at, struct xa_state *xas,
 					     struct address_space *mapping, bool do_lru,
 					     struct list_head *list, enum split_type split_type,
-					     pgoff_t end, int *nr_shmem_dropped, int extra_pins)
+					     pgoff_t end, int *nr_shmem_dropped)
 {
 	struct folio *end_folio = folio_next(folio);
 	struct folio *new_folio, *next;
@@ -3782,7 +3773,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 	VM_WARN_ON_ONCE(!mapping && end);
 	/* Prevent deferred_split_scan() touching ->_refcount */
 	ds_queue = folio_split_queue_lock(folio);
-	if (folio_ref_freeze(folio, 1 + extra_pins)) {
+	if (folio_ref_freeze(folio, folio_cache_ref_count(folio) + 1)) {
 		struct swap_cluster_info *ci = NULL;
 		struct lruvec *lruvec;
 		int expected_refs;
@@ -3853,7 +3844,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 
 			zone_device_private_split_cb(folio, new_folio);
 
-			expected_refs = folio_expected_ref_count(new_folio) + 1;
+			expected_refs = folio_cache_ref_count(new_folio) + 1;
 			folio_ref_unfreeze(new_folio, expected_refs);
 
 			if (do_lru)
@@ -3897,7 +3888,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 		 * Otherwise, a parallel folio_try_get() can grab @folio
 		 * and its caller can see stale page cache entries.
 		 */
-		expected_refs = folio_expected_ref_count(folio) + 1;
+		expected_refs = folio_cache_ref_count(folio) + 1;
 		folio_ref_unfreeze(folio, expected_refs);
 
 		if (do_lru)
@@ -3947,7 +3938,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	struct folio *new_folio, *next;
 	int nr_shmem_dropped = 0;
 	int remap_flags = 0;
-	int extra_pins, ret;
+	int ret;
 	pgoff_t end = 0;
 
 	VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
@@ -4028,7 +4019,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	 * Racy check if we can split the page, before unmap_folio() will
 	 * split PMDs
 	 */
-	if (!can_split_folio(folio, 1, &extra_pins)) {
+	if (folio_expected_ref_count(folio) != folio_ref_count(folio) - 1) {
 		ret = -EAGAIN;
 		goto out_unlock;
 	}
@@ -4051,8 +4042,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	}
 
 	ret = __folio_freeze_and_split_unmapped(folio, new_order, split_at, &xas, mapping,
-						true, list, split_type, end, &nr_shmem_dropped,
-						extra_pins);
+						true, list, split_type, end, &nr_shmem_dropped);
 fail:
 	if (mapping)
 		xas_unlock(&xas);
@@ -4126,20 +4116,20 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
  */
 int folio_split_unmapped(struct folio *folio, unsigned int new_order)
 {
-	int extra_pins, ret = 0;
+	int ret = 0;
 
 	VM_WARN_ON_ONCE_FOLIO(folio_mapped(folio), folio);
 	VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
 	VM_WARN_ON_ONCE_FOLIO(!folio_test_large(folio), folio);
 	VM_WARN_ON_ONCE_FOLIO(!folio_test_anon(folio), folio);
 
-	if (!can_split_folio(folio, 1, &extra_pins))
+	if (folio_expected_ref_count(folio) != folio_ref_count(folio) - 1)
 		return -EAGAIN;
 
 	local_irq_disable();
 	ret = __folio_freeze_and_split_unmapped(folio, new_order, &folio->page, NULL,
 						NULL, false, NULL, SPLIT_TYPE_UNIFORM,
-						0, NULL, extra_pins);
+						0, NULL);
 	local_irq_enable();
 	return ret;
 }
@@ -4632,7 +4622,7 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
 		 * can be split or not. So skip the check here.
 		 */
 		if (!folio_test_private(folio) &&
-		    !can_split_folio(folio, 0, NULL))
+		    folio_expected_ref_count(folio) != folio_ref_count(folio))
 			goto next;
 
 		if (!folio_trylock(folio))
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 92980b072121..3b85652a42b9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1284,7 +1284,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 					goto keep_locked;
 				if (folio_test_large(folio)) {
 					/* cannot split folio, skip it */
-					if (!can_split_folio(folio, 1, NULL))
+					if (folio_expected_ref_count(folio) !=
+					    folio_ref_count(folio) - 1)
 						goto activate_locked;
 					/*
 					 * Split partially mapped folios right away.
-- 
2.51.0




* [PATCH v3 3/4] mm/huge_memory: make min_order_for_split() always return an order
  2025-11-26  3:50 [PATCH v3 0/4] Improve folio split related functions Zi Yan
  2025-11-26  3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
  2025-11-26  3:50 ` [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation Zi Yan
@ 2025-11-26  3:50 ` Zi Yan
  2025-11-26  3:50 ` [PATCH v3 4/4] mm/huge_memory: fix folio split stats counting Zi Yan
  3 siblings, 0 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26  3:50 UTC (permalink / raw)
  To: David Hildenbrand, Lorenzo Stoakes
  Cc: Andrew Morton, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
	Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel

min_order_for_split() returns -EBUSY when the folio is truncated and cannot
be split. Since commit 77008e1b2ef7 ("mm/huge_memory: do not change
split_huge_page*() target order silently"), memory_failure() does not
handle it and passes -EBUSY to try_to_split_thp_page() directly.
try_to_split_thp_page() then returns -EINVAL, since -EBUSY becomes
0xfffffff0 (new_order is an unsigned int in __folio_split()) and this huge
new_order is rejected as invalid input. The code does not cause a bug.
soft_offline_in_use_page() also uses min_order_for_split(), but it always
passes 0 as the new_order for the split.
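
A standalone demo of the sign conversion described above (plain C, not
kernel code; -EBUSY is -16 on Linux):

#include <stdio.h>

int main(void)
{
	int order = -16;		/* -EBUSY from min_order_for_split() */
	unsigned int new_order = order;	/* __folio_split() takes unsigned int */

	/* Prints 0xfffffff0: absurdly large, so the split fails with -EINVAL. */
	printf("new_order = 0x%x\n", new_order);
	return 0;
}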

Fix it by making min_order_for_split() always return an order. When the
given folio is truncated, i.e., folio->mapping == NULL, return 0 and let a
subsequent split function handle the situation and return -EBUSY.

Add kernel-doc to min_order_for_split() to clarify its use.

Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 include/linux/huge_mm.h |  6 +++---
 mm/huge_memory.c        | 25 +++++++++++++++++++------
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 8a52e20387b0..21162493a0a0 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -372,7 +372,7 @@ enum split_type {
 int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		unsigned int new_order);
 int folio_split_unmapped(struct folio *folio, unsigned int new_order);
-int min_order_for_split(struct folio *folio);
+unsigned int min_order_for_split(struct folio *folio);
 int split_folio_to_list(struct folio *folio, struct list_head *list);
 int folio_check_splittable(struct folio *folio, unsigned int new_order,
 			   enum split_type split_type);
@@ -630,10 +630,10 @@ static inline int split_huge_page(struct page *page)
 	return -EINVAL;
 }
 
-static inline int min_order_for_split(struct folio *folio)
+static inline unsigned int min_order_for_split(struct folio *folio)
 {
 	VM_WARN_ON_ONCE_FOLIO(1, folio);
-	return -EINVAL;
+	return 0;
 }
 
 static inline int split_folio_to_list(struct folio *folio, struct list_head *list)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index cab429d8fe83..3d2396bf5763 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4221,16 +4221,29 @@ int folio_split(struct folio *folio, unsigned int new_order,
 			     SPLIT_TYPE_NON_UNIFORM);
 }
 
-int min_order_for_split(struct folio *folio)
+/**
+ * min_order_for_split() - get the minimum order @folio can be split to
+ * @folio: folio to split
+ *
+ * min_order_for_split() tells the minimum order @folio can be split to.
+ * If a file-backed folio is truncated, 0 will be returned. Any subsequent
+ * split attempt should get -EBUSY from split checking code.
+ *
+ * Return: @folio's minimum order for split
+ */
+unsigned int min_order_for_split(struct folio *folio)
 {
 	if (folio_test_anon(folio))
 		return 0;
 
-	if (!folio->mapping) {
-		if (folio_test_pmd_mappable(folio))
-			count_vm_event(THP_SPLIT_PAGE_FAILED);
-		return -EBUSY;
-	}
+	/*
+	 * If the folio got truncated, we don't know the previous mapping and
+	 * consequently the old min order. But it doesn't matter, as any split
+	 * attempt will immediately fail with -EBUSY as the folio cannot get
+	 * split until freed.
+	 */
+	if (!folio->mapping)
+		return 0;
 
 	return mapping_min_folio_order(folio->mapping);
 }
-- 
2.51.0




* [PATCH v3 4/4] mm/huge_memory: fix folio split stats counting
  2025-11-26  3:50 [PATCH v3 0/4] Improve folio split related functions Zi Yan
                   ` (2 preceding siblings ...)
  2025-11-26  3:50 ` [PATCH v3 3/4] mm/huge_memory: make min_order_for_split() always return an order Zi Yan
@ 2025-11-26  3:50 ` Zi Yan
  3 siblings, 0 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26  3:50 UTC (permalink / raw)
  To: David Hildenbrand, Lorenzo Stoakes
  Cc: Andrew Morton, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
	Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel

The "return <error code>" statements for the error checks at the beginning
of __folio_split() skip the count_vm_event() and count_mthp_stat()
accounting at the end of the function. Fix them by replacing the returns
with "ret = <error code>; goto out;".

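A tiny userspace model of the control-flow fix (the counter is a made-up
stand-in for count_vm_event()):

#include <stdio.h>

static long split_failed_events;	/* stands in for the vm event counter */

static int folio_split_model(int bad_input)
{
	int ret = 0;

	if (bad_input) {
		ret = -22;	/* -EINVAL; a bare "return -22;" would skip the stats */
		goto out;
	}
	/* ... the actual split work would go here ... */
out:
	if (ret)
		split_failed_events++;	/* accounting now runs on every exit path */
	return ret;
}

int main(void)
{
	folio_split_model(1);
	printf("split failures counted: %ld\n", split_failed_events);
	return 0;
}
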
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
---
 mm/huge_memory.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3d2396bf5763..9e984608da81 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3944,16 +3944,20 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
 	VM_WARN_ON_ONCE_FOLIO(!folio_test_large(folio), folio);
 
-	if (folio != page_folio(split_at) || folio != page_folio(lock_at))
-		return -EINVAL;
+	if (folio != page_folio(split_at) || folio != page_folio(lock_at)) {
+		ret = -EINVAL;
+		goto out;
+	}
 
-	if (new_order >= old_order)
-		return -EINVAL;
+	if (new_order >= old_order) {
+		ret = -EINVAL;
+		goto out;
+	}
 
 	ret = folio_check_splittable(folio, new_order, split_type);
 	if (ret) {
 		VM_WARN_ONCE(ret == -EINVAL, "Tried to split an unsplittable folio");
-		return ret;
+		goto out;
 	}
 
 	if (is_anon) {
-- 
2.51.0




* Re: [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
  2025-11-26  3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
@ 2025-11-26  4:14   ` Balbir Singh
  2025-11-26 16:55     ` Zi Yan
  2025-11-26  9:54   ` David Hildenbrand (Red Hat)
  2025-11-27  5:23   ` Barry Song
  2 siblings, 1 reply; 12+ messages in thread
From: Balbir Singh @ 2025-11-26  4:14 UTC (permalink / raw)
  To: Zi Yan, David Hildenbrand, Lorenzo Stoakes
  Cc: Andrew Morton, Baolin Wang, Liam R. Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
	Naoya Horiguchi, Wei Yang, linux-mm, linux-kernel

On 11/26/25 14:50, Zi Yan wrote:
> folio_split_supported() used in try_folio_split_to_order() requires
> folio->mapping to be non NULL, but current try_folio_split_to_order() does
> not check it. There is no issue in the current code, since
> try_folio_split_to_order() is only used in truncate_inode_partial_folio(),
> where folio->mapping is not NULL.
> 
> To prevent future misuse, move folio->mapping NULL check (i.e., folio is
> truncated) into folio_split_supported(). Since folio->mapping NULL check
> returns -EBUSY and folio_split_supported() == false means -EINVAL, change
> folio_split_supported() return type from bool to int and return error
> numbers accordingly. Rename folio_split_supported() to
> folio_check_splittable() to match the return type change.
> 
> While at it, move is_huge_zero_folio() check and folio_test_writeback()
> check into folio_check_splittable() and add kernel-doc.
> 
> Remove all warnings inside folio_check_splittable() and give warnings
> in __folio_split() instead, so that bool warns parameter can be removed.
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>  include/linux/huge_mm.h |  6 ++--
>  mm/huge_memory.c        | 76 +++++++++++++++++++++++------------------
>  2 files changed, 46 insertions(+), 36 deletions(-)
> 
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 1d439de1ca2c..66105a90b4c3 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -375,8 +375,8 @@ int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list
>  int folio_split_unmapped(struct folio *folio, unsigned int new_order);
>  int min_order_for_split(struct folio *folio);
>  int split_folio_to_list(struct folio *folio, struct list_head *list);
> -bool folio_split_supported(struct folio *folio, unsigned int new_order,
> -		enum split_type split_type, bool warns);
> +int folio_check_splittable(struct folio *folio, unsigned int new_order,
> +			   enum split_type split_type);
>  int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
>  		struct list_head *list);
>  
> @@ -407,7 +407,7 @@ static inline int split_huge_page_to_order(struct page *page, unsigned int new_o
>  static inline int try_folio_split_to_order(struct folio *folio,
>  		struct page *page, unsigned int new_order)
>  {
> -	if (!folio_split_supported(folio, new_order, SPLIT_TYPE_NON_UNIFORM, /* warns= */ false))
> +	if (folio_check_splittable(folio, new_order, SPLIT_TYPE_NON_UNIFORM))
>  		return split_huge_page_to_order(&folio->page, new_order);
>  	return folio_split(folio, new_order, page, NULL);
>  }
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 041b554c7115..771df0c02a4a 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3688,15 +3688,40 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>  	return 0;
>  }
>  
> -bool folio_split_supported(struct folio *folio, unsigned int new_order,
> -		enum split_type split_type, bool warns)
> +/**
> + * folio_check_splittable() - check if a folio can be split to a given order
> + * @folio: folio to be split
> + * @new_order: the smallest order of the after split folios (since buddy
> + *             allocator like split generates folios with orders from @folio's
> + *             order - 1 to new_order).
> + * @split_type: uniform or non-uniform split
> + *
> + * folio_check_splittable() checks if @folio can be split to @new_order using
> + * @split_type method. The truncated folio check must come first.
> + *
> + * Context: folio must be locked.
> + *
> + * Return: 0 - @folio can be split to @new_order, otherwise an error number is
> + * returned.
> + */
> +int folio_check_splittable(struct folio *folio, unsigned int new_order,
> +			   enum split_type split_type)
>  {
> +	VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
> +	/*
> +	 * Folios that just got truncated cannot get split. Signal to the
> +	 * caller that there was a race.
> +	 *
> +	 * TODO: this will also currently refuse folios without a mapping in the
> +	 * swapcache (shmem or to-be-anon folios).
> +	 */
> +	if (!folio_test_anon(folio) && !folio->mapping)
> +		return -EBUSY;
> +

Nit: shouldn't the order of the checks be

if (!folio->mapping && !folio_test_anon(folio))

That works better when folio->mapping is NULL.


>  	if (folio_test_anon(folio)) {
>  		/* order-1 is not supported for anonymous THP. */
> -		VM_WARN_ONCE(warns && new_order == 1,
> -				"Cannot split to order-1 folio");
>  		if (new_order == 1)
> -			return false;
> +			return -EINVAL;
>  	} else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
>  		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
>  		    !mapping_large_folio_support(folio->mapping)) {
> @@ -3717,9 +3742,7 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
>  			 * case, the mapping does not actually support large
>  			 * folios properly.
>  			 */
> -			VM_WARN_ONCE(warns,
> -				"Cannot split file folio to non-0 order");
> -			return false;
> +			return -EINVAL;
>  		}
>  	}
>  
> @@ -3732,12 +3755,16 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
>  	 * here.
>  	 */
>  	if ((split_type == SPLIT_TYPE_NON_UNIFORM || new_order) && folio_test_swapcache(folio)) {
> -		VM_WARN_ONCE(warns,
> -			"Cannot split swapcache folio to non-0 order");
> -		return false;
> +		return -EINVAL;
>  	}
>  
> -	return true;
> +	if (is_huge_zero_folio(folio))
> +		return -EINVAL;
> +
> +	if (folio_test_writeback(folio))
> +		return -EBUSY;
> +
> +	return 0;
>  }
>  
>  static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
> @@ -3922,7 +3949,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>  	int remap_flags = 0;
>  	int extra_pins, ret;
>  	pgoff_t end = 0;
> -	bool is_hzp;
>  
>  	VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
>  	VM_WARN_ON_ONCE_FOLIO(!folio_test_large(folio), folio);
> @@ -3930,31 +3956,15 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>  	if (folio != page_folio(split_at) || folio != page_folio(lock_at))
>  		return -EINVAL;
>  
> -	/*
> -	 * Folios that just got truncated cannot get split. Signal to the
> -	 * caller that there was a race.
> -	 *
> -	 * TODO: this will also currently refuse shmem folios that are in the
> -	 * swapcache.
> -	 */
> -	if (!is_anon && !folio->mapping)
> -		return -EBUSY;
> -
>  	if (new_order >= old_order)
>  		return -EINVAL;
>  
> -	if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
> -		return -EINVAL;
> -
> -	is_hzp = is_huge_zero_folio(folio);
> -	if (is_hzp) {
> -		pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
> -		return -EBUSY;
> +	ret = folio_check_splittable(folio, new_order, split_type);
> +	if (ret) {
> +		VM_WARN_ONCE(ret == -EINVAL, "Tried to split an unsplittable folio");
> +		return ret;
>  	}
>  
> -	if (folio_test_writeback(folio))
> -		return -EBUSY;
> -
>  	if (is_anon) {
>  		/*
>  		 * The caller does not necessarily hold an mmap_lock that would

Otherwise, looks good!

Acked-by: Balbir Singh <balbirs@nvidia.com>



* Re: [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
  2025-11-26  3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
  2025-11-26  4:14   ` Balbir Singh
@ 2025-11-26  9:54   ` David Hildenbrand (Red Hat)
  2025-11-26 16:59     ` Zi Yan
  2025-11-27  5:23   ` Barry Song
  2 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-26  9:54 UTC (permalink / raw)
  To: Zi Yan, Lorenzo Stoakes
  Cc: Andrew Morton, Baolin Wang, Liam R. Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
	Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel

> -	/*
> -	 * Folios that just got truncated cannot get split. Signal to the
> -	 * caller that there was a race.
> -	 *
> -	 * TODO: this will also currently refuse shmem folios that are in the
> -	 * swapcache.
> -	 */
> -	if (!is_anon && !folio->mapping)
> -		return -EBUSY;
> -
>   	if (new_order >= old_order)
>   		return -EINVAL;
>   
> -	if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
> -		return -EINVAL;
> -
> -	is_hzp = is_huge_zero_folio(folio);
> -	if (is_hzp) {
> -		pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
> -		return -EBUSY;

As we are changing that case to a VM_WARN_ONCE(), is there some path 
where we might trigger that?

I'm wondering about the split_huge_pages_all() function in particular. I 
guess the "!folio_test_lru(folio)" would protect us?

Apart from that LGTM

Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>

-- 
Cheers

David



* Re: [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation
  2025-11-26  3:50 ` [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation Zi Yan
@ 2025-11-26  9:56   ` David Hildenbrand (Red Hat)
  2025-11-26 16:59     ` Zi Yan
  0 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-26  9:56 UTC (permalink / raw)
  To: Zi Yan, Lorenzo Stoakes
  Cc: Andrew Morton, Baolin Wang, Liam R. Howlett, Nico Pache,
	Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
	Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel


>   static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
>   					     struct page *split_at, struct xa_state *xas,
>   					     struct address_space *mapping, bool do_lru,
>   					     struct list_head *list, enum split_type split_type,
> -					     pgoff_t end, int *nr_shmem_dropped, int extra_pins)
> +					     pgoff_t end, int *nr_shmem_dropped)
>   {
>   	struct folio *end_folio = folio_next(folio);
>   	struct folio *new_folio, *next;
> @@ -3782,7 +3773,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>   	VM_WARN_ON_ONCE(!mapping && end);
>   	/* Prevent deferred_split_scan() touching ->_refcount */
>   	ds_queue = folio_split_queue_lock(folio);
> -	if (folio_ref_freeze(folio, 1 + extra_pins)) {
> +	if (folio_ref_freeze(folio, folio_cache_ref_count(folio) + 1)) {
>   		struct swap_cluster_info *ci = NULL;
>   		struct lruvec *lruvec;
>   		int expected_refs;
> @@ -3853,7 +3844,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>   
>   			zone_device_private_split_cb(folio, new_folio);
>   
> -			expected_refs = folio_expected_ref_count(new_folio) + 1;
> +			expected_refs = folio_cache_ref_count(new_folio) + 1;
>   			folio_ref_unfreeze(new_folio, expected_refs);
>   
>   			if (do_lru)
> @@ -3897,7 +3888,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>   		 * Otherwise, a parallel folio_try_get() can grab @folio
>   		 * and its caller can see stale page cache entries.
>   		 */
> -		expected_refs = folio_expected_ref_count(folio) + 1;
> +		expected_refs = folio_cache_ref_count(folio) + 1;
>   		folio_ref_unfreeze(folio, expected_refs);

Can we just get rid of the expected_refs variable as well?

Apart from that LGTM, thanks!

Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>

-- 
Cheers

David



* Re: [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
  2025-11-26  4:14   ` Balbir Singh
@ 2025-11-26 16:55     ` Zi Yan
  0 siblings, 0 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26 16:55 UTC (permalink / raw)
  To: Balbir Singh
  Cc: David Hildenbrand, Lorenzo Stoakes, Andrew Morton, Baolin Wang,
	Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Miaohe Lin, Naoya Horiguchi, Wei Yang, linux-mm,
	linux-kernel

On 25 Nov 2025, at 23:14, Balbir Singh wrote:

> On 11/26/25 14:50, Zi Yan wrote:
>> folio_split_supported() used in try_folio_split_to_order() requires
>> folio->mapping to be non NULL, but current try_folio_split_to_order() does
>> not check it. There is no issue in the current code, since
>> try_folio_split_to_order() is only used in truncate_inode_partial_folio(),
>> where folio->mapping is not NULL.
>>
>> To prevent future misuse, move folio->mapping NULL check (i.e., folio is
>> truncated) into folio_split_supported(). Since folio->mapping NULL check
>> returns -EBUSY and folio_split_supported() == false means -EINVAL, change
>> folio_split_supported() return type from bool to int and return error
>> numbers accordingly. Rename folio_split_supported() to
>> folio_check_splittable() to match the return type change.
>>
>> While at it, move is_huge_zero_folio() check and folio_test_writeback()
>> check into folio_check_splittable() and add kernel-doc.
>>
>> Remove all warnings inside folio_check_splittable() and give warnings
>> in __folio_split() instead, so that bool warns parameter can be removed.
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
>> ---
>>  include/linux/huge_mm.h |  6 ++--
>>  mm/huge_memory.c        | 76 +++++++++++++++++++++++------------------
>>  2 files changed, 46 insertions(+), 36 deletions(-)
>>
>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>> index 1d439de1ca2c..66105a90b4c3 100644
>> --- a/include/linux/huge_mm.h
>> +++ b/include/linux/huge_mm.h
>> @@ -375,8 +375,8 @@ int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list
>>  int folio_split_unmapped(struct folio *folio, unsigned int new_order);
>>  int min_order_for_split(struct folio *folio);
>>  int split_folio_to_list(struct folio *folio, struct list_head *list);
>> -bool folio_split_supported(struct folio *folio, unsigned int new_order,
>> -		enum split_type split_type, bool warns);
>> +int folio_check_splittable(struct folio *folio, unsigned int new_order,
>> +			   enum split_type split_type);
>>  int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
>>  		struct list_head *list);
>>
>> @@ -407,7 +407,7 @@ static inline int split_huge_page_to_order(struct page *page, unsigned int new_o
>>  static inline int try_folio_split_to_order(struct folio *folio,
>>  		struct page *page, unsigned int new_order)
>>  {
>> -	if (!folio_split_supported(folio, new_order, SPLIT_TYPE_NON_UNIFORM, /* warns= */ false))
>> +	if (folio_check_splittable(folio, new_order, SPLIT_TYPE_NON_UNIFORM))
>>  		return split_huge_page_to_order(&folio->page, new_order);
>>  	return folio_split(folio, new_order, page, NULL);
>>  }
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 041b554c7115..771df0c02a4a 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -3688,15 +3688,40 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>>  	return 0;
>>  }
>>
>> -bool folio_split_supported(struct folio *folio, unsigned int new_order,
>> -		enum split_type split_type, bool warns)
>> +/**
>> + * folio_check_splittable() - check if a folio can be split to a given order
>> + * @folio: folio to be split
>> + * @new_order: the smallest order of the after split folios (since buddy
>> + *             allocator like split generates folios with orders from @folio's
>> + *             order - 1 to new_order).
>> + * @split_type: uniform or non-uniform split
>> + *
>> + * folio_check_splittable() checks if @folio can be split to @new_order using
>> + * @split_type method. The truncated folio check must come first.
>> + *
>> + * Context: folio must be locked.
>> + *
>> + * Return: 0 - @folio can be split to @new_order, otherwise an error number is
>> + * returned.
>> + */
>> +int folio_check_splittable(struct folio *folio, unsigned int new_order,
>> +			   enum split_type split_type)
>>  {
>> +	VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
>> +	/*
>> +	 * Folios that just got truncated cannot get split. Signal to the
>> +	 * caller that there was a race.
>> +	 *
>> +	 * TODO: this will also currently refuse folios without a mapping in the
>> +	 * swapcache (shmem or to-be-anon folios).
>> +	 */
>> +	if (!folio_test_anon(folio) && !folio->mapping)
>> +		return -EBUSY;
>> +
>
> Nit: Shouldn't the order of check be
>
> if (!folio->mapping && !folio_test_anon(folio))
>
> works better if folio->mapping is NULL

It does not matter, since folio_test_anon() checks folio->mapping too.
I can swap the order in the next version.
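
Roughly how folio_test_anon() decides (a simplified sketch of the helper
in include/linux/page-flags.h):

static inline bool folio_test_anon_sketch(const struct folio *folio)
{
	/* Anon folios tag the low bit of the mapping pointer, so the test
	 * dereferences nothing and a NULL mapping simply reads as "not anon". */
	return ((unsigned long)folio->mapping & PAGE_MAPPING_ANON) != 0;
}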

>
>
>>  	if (folio_test_anon(folio)) {
>>  		/* order-1 is not supported for anonymous THP. */
>> -		VM_WARN_ONCE(warns && new_order == 1,
>> -				"Cannot split to order-1 folio");
>>  		if (new_order == 1)
>> -			return false;
>> +			return -EINVAL;
>>  	} else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
>>  		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
>>  		    !mapping_large_folio_support(folio->mapping)) {
>> @@ -3717,9 +3742,7 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
>>  			 * case, the mapping does not actually support large
>>  			 * folios properly.
>>  			 */
>> -			VM_WARN_ONCE(warns,
>> -				"Cannot split file folio to non-0 order");
>> -			return false;
>> +			return -EINVAL;
>>  		}
>>  	}
>>
>> @@ -3732,12 +3755,16 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
>>  	 * here.
>>  	 */
>>  	if ((split_type == SPLIT_TYPE_NON_UNIFORM || new_order) && folio_test_swapcache(folio)) {
>> -		VM_WARN_ONCE(warns,
>> -			"Cannot split swapcache folio to non-0 order");
>> -		return false;
>> +		return -EINVAL;
>>  	}
>>
>> -	return true;
>> +	if (is_huge_zero_folio(folio))
>> +		return -EINVAL;
>> +
>> +	if (folio_test_writeback(folio))
>> +		return -EBUSY;
>> +
>> +	return 0;
>>  }
>>
>>  static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
>> @@ -3922,7 +3949,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>>  	int remap_flags = 0;
>>  	int extra_pins, ret;
>>  	pgoff_t end = 0;
>> -	bool is_hzp;
>>
>>  	VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
>>  	VM_WARN_ON_ONCE_FOLIO(!folio_test_large(folio), folio);
>> @@ -3930,31 +3956,15 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>>  	if (folio != page_folio(split_at) || folio != page_folio(lock_at))
>>  		return -EINVAL;
>>
>> -	/*
>> -	 * Folios that just got truncated cannot get split. Signal to the
>> -	 * caller that there was a race.
>> -	 *
>> -	 * TODO: this will also currently refuse shmem folios that are in the
>> -	 * swapcache.
>> -	 */
>> -	if (!is_anon && !folio->mapping)
>> -		return -EBUSY;
>> -
>>  	if (new_order >= old_order)
>>  		return -EINVAL;
>>
>> -	if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
>> -		return -EINVAL;
>> -
>> -	is_hzp = is_huge_zero_folio(folio);
>> -	if (is_hzp) {
>> -		pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
>> -		return -EBUSY;
>> +	ret = folio_check_splittable(folio, new_order, split_type);
>> +	if (ret) {
>> +		VM_WARN_ONCE(ret == -EINVAL, "Tried to split an unsplittable folio");
>> +		return ret;
>>  	}
>>
>> -	if (folio_test_writeback(folio))
>> -		return -EBUSY;
>> -
>>  	if (is_anon) {
>>  		/*
>>  		 * The caller does not necessarily hold an mmap_lock that would
>
> Otherwise, looks good!
>
> Acked-by: Balbir Singh <balbirs@nvidia.com>

Thanks.

Best Regards,
Yan, Zi



* Re: [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
  2025-11-26  9:54   ` David Hildenbrand (Red Hat)
@ 2025-11-26 16:59     ` Zi Yan
  0 siblings, 0 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26 16:59 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat)
  Cc: Lorenzo Stoakes, Andrew Morton, Baolin Wang, Liam R. Howlett,
	Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang,
	Miaohe Lin, Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm,
	linux-kernel

On 26 Nov 2025, at 4:54, David Hildenbrand (Red Hat) wrote:

>> -	/*
>> -	 * Folios that just got truncated cannot get split. Signal to the
>> -	 * caller that there was a race.
>> -	 *
>> -	 * TODO: this will also currently refuse shmem folios that are in the
>> -	 * swapcache.
>> -	 */
>> -	if (!is_anon && !folio->mapping)
>> -		return -EBUSY;
>> -
>>   	if (new_order >= old_order)
>>   		return -EINVAL;
>>  -	if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
>> -		return -EINVAL;
>> -
>> -	is_hzp = is_huge_zero_folio(folio);
>> -	if (is_hzp) {
>> -		pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
>> -		return -EBUSY;
>
> As we are changing that case to a VM_WARN_ONCE(), is there some path where we might trigger that?

Based on the git history, this check was added for injecting errors into
the huge zero folio and triggering memory failure handling.

>
> I'm wondering about the split_huge_pages_all() function in particular. I guess the "!folio_test_lru(folio)" would protect us?

I think so.
>
> Apart from that LGTM
>
> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
>

Thanks.


Best Regards,
Yan, Zi



* Re: [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation
  2025-11-26  9:56   ` David Hildenbrand (Red Hat)
@ 2025-11-26 16:59     ` Zi Yan
  0 siblings, 0 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26 16:59 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat)
  Cc: Lorenzo Stoakes, Andrew Morton, Baolin Wang, Liam R. Howlett,
	Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang,
	Miaohe Lin, Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm,
	linux-kernel

On 26 Nov 2025, at 4:56, David Hildenbrand (Red Hat) wrote:

>>   static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
>>   					     struct page *split_at, struct xa_state *xas,
>>   					     struct address_space *mapping, bool do_lru,
>>   					     struct list_head *list, enum split_type split_type,
>> -					     pgoff_t end, int *nr_shmem_dropped, int extra_pins)
>> +					     pgoff_t end, int *nr_shmem_dropped)
>>   {
>>   	struct folio *end_folio = folio_next(folio);
>>   	struct folio *new_folio, *next;
>> @@ -3782,7 +3773,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>>   	VM_WARN_ON_ONCE(!mapping && end);
>>   	/* Prevent deferred_split_scan() touching ->_refcount */
>>   	ds_queue = folio_split_queue_lock(folio);
>> -	if (folio_ref_freeze(folio, 1 + extra_pins)) {
>> +	if (folio_ref_freeze(folio, folio_cache_ref_count(folio) + 1)) {
>>   		struct swap_cluster_info *ci = NULL;
>>   		struct lruvec *lruvec;
>>   		int expected_refs;
>> @@ -3853,7 +3844,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>>    			zone_device_private_split_cb(folio, new_folio);
>>  -			expected_refs = folio_expected_ref_count(new_folio) + 1;
>> +			expected_refs = folio_cache_ref_count(new_folio) + 1;
>>   			folio_ref_unfreeze(new_folio, expected_refs);
>>    			if (do_lru)
>> @@ -3897,7 +3888,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>>   		 * Otherwise, a parallel folio_try_get() can grab @folio
>>   		 * and its caller can see stale page cache entries.
>>   		 */
>> -		expected_refs = folio_expected_ref_count(folio) + 1;
>> +		expected_refs = folio_cache_ref_count(folio) + 1;
>>   		folio_ref_unfreeze(folio, expected_refs);
>
> Can we just get rid of the expected_refs variable as well?

OK. Will update it.

>
> Apart from that LGTM, thanks!
>
> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>

Thanks.


Best Regards,
Yan, Zi



* Re: [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
  2025-11-26  3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
  2025-11-26  4:14   ` Balbir Singh
  2025-11-26  9:54   ` David Hildenbrand (Red Hat)
@ 2025-11-27  5:23   ` Barry Song
  2 siblings, 0 replies; 12+ messages in thread
From: Barry Song @ 2025-11-27  5:23 UTC (permalink / raw)
  To: Zi Yan
  Cc: David Hildenbrand, Lorenzo Stoakes, Andrew Morton, Baolin Wang,
	Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Lance Yang,
	Miaohe Lin, Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm,
	linux-kernel

On Wed, Nov 26, 2025 at 11:50 AM Zi Yan <ziy@nvidia.com> wrote:
>
> folio_split_supported() used in try_folio_split_to_order() requires
> folio->mapping to be non NULL, but current try_folio_split_to_order() does
> not check it. There is no issue in the current code, since
> try_folio_split_to_order() is only used in truncate_inode_partial_folio(),
> where folio->mapping is not NULL.
>
> To prevent future misuse, move folio->mapping NULL check (i.e., folio is
> truncated) into folio_split_supported(). Since folio->mapping NULL check
> returns -EBUSY and folio_split_supported() == false means -EINVAL, change
> folio_split_supported() return type from bool to int and return error
> numbers accordingly. Rename folio_split_supported() to
> folio_check_splittable() to match the return type change.
>
> While at it, move is_huge_zero_folio() check and folio_test_writeback()
> check into folio_check_splittable() and add kernel-doc.
>
> Remove all warnings inside folio_check_splittable() and give warnings
> in __folio_split() instead, so that bool warns parameter can be removed.
>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

Much cleaner than having a "warns" argument before.

Reviewed-by: Barry Song <baohua@kernel.org>

Thanks
Barry


