* [PATCH v3 0/4] Improve folio split related functions
@ 2025-11-26 3:50 Zi Yan
2025-11-26 3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26 3:50 UTC (permalink / raw)
To: David Hildenbrand, Lorenzo Stoakes
Cc: Andrew Morton, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel
Hi all,
This patchset improves several folio split related functions to avoid
future misuse. The changes are:
1. Consolidated folio splittable checks by moving truncated folio check,
huge zero folio check, and writeback folio check into
folio_split_supported(). Changed the function return type. Renamed it
to folio_check_splittable() for clarification.
2. Replaced can_split_folio() with open coded folio_expected_ref_count()
and folio_ref_count() and introduced folio_cache_ref_count().
3. Changed min_order_for_split() to always return an order.
4. Fixed folio split stats counting.
Motivation
===
This is based on Wei's observation[1] and solves several potential
issues:
1. Dereferencing NULL folio->mapping in try_folio_split_to_order() if it
is called on truncated folios.
2. Not handling the negative return value of min_order_for_split() in
mm/memory-failure.c.
There is no bug in the current code.
The code is based on mm-new with V2 reverted and can replace V2 cleanly
on mm-new branch.
Changelog
===
From V2[3]:
1. Removed "bool warns" parameter from folio_check_splittable().
2. Removed all warnings in folio_check_splittable() and added a single
warning in its caller, __folio_split() instead.
3. Spelled out in the comment in folio_check_splittable() that folios
without a mapping in the swapcache can be shmem or to-be-anon folios.
4. Renamed folio_cache_references to folio_cache_ref_count.
5. Removed extra_pins variable.
6. Replaced folio_expected_ref_count() with folio_cache_ref_count() for
folio_ref_unfreeze() uses in __folio_freeze_and_split_unmapped(),
since they are equivalent at those call sites.
From RFC[2]:
1. Renamed folio_split_supported() to folio_check_splittable(), changed
its return type from bool to int to return error code directly, and
added kernel-doc.
2. Moved truncated folio check, huge zero folio check, and writeback
check into folio_check_splittable().
3. Changed huge zero folio check's error number from -EBUSY to -EINVAL.
4. Replaced can_split_folio() with open code.
5. Changed min_order_for_split() to return 0 for truncated folio instead
of -EBUSY and added kernel-doc.
6. Fixed folio split stats counting.
Comments and feedback are welcome.
Link: https://lore.kernel.org/all/20251120004735.52z7r4xmogw7mbsj@master/ [1]
Link: https://lore.kernel.org/all/20251120035953.1115736-1-ziy@nvidia.com/ [2]
Link: https://lore.kernel.org/all/20251122025529.1562592-1-ziy@nvidia.com/ [3]
Zi Yan (4):
mm/huge_memory: change folio_split_supported() to
folio_check_splittable()
mm/huge_memory: replace can_split_folio() with direct refcount
calculation
mm/huge_memory: make min_order_for_split() always return an order
mm/huge_memory: fix folio split stats counting
include/linux/huge_mm.h | 13 ++--
mm/huge_memory.c | 161 ++++++++++++++++++++++------------------
mm/vmscan.c | 3 +-
3 files changed, 97 insertions(+), 80 deletions(-)
--
2.51.0
* [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
2025-11-26 3:50 [PATCH v3 0/4] Improve folio split related functions Zi Yan
@ 2025-11-26 3:50 ` Zi Yan
2025-11-26 4:14 ` Balbir Singh
` (2 more replies)
2025-11-26 3:50 ` [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation Zi Yan
` (2 subsequent siblings)
3 siblings, 3 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26 3:50 UTC (permalink / raw)
To: David Hildenbrand, Lorenzo Stoakes
Cc: Andrew Morton, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel
folio_split_supported() used in try_folio_split_to_order() requires
folio->mapping to be non NULL, but current try_folio_split_to_order() does
not check it. There is no issue in the current code, since
try_folio_split_to_order() is only used in truncate_inode_partial_folio(),
where folio->mapping is not NULL.
To prevent future misuse, move folio->mapping NULL check (i.e., folio is
truncated) into folio_split_supported(). Since folio->mapping NULL check
returns -EBUSY and folio_split_supported() == false means -EINVAL, change
folio_split_supported() return type from bool to int and return error
numbers accordingly. Rename folio_split_supported() to
folio_check_splittable() to match the return type change.
While at it, move is_huge_zero_folio() check and folio_test_writeback()
check into folio_check_splittable() and add kernel-doc.
Remove all warnings inside folio_check_splittable() and give warnings
in __folio_split() instead, so that bool warns parameter can be removed.
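
As an illustration only, the check ordering and the error codes described
above can be modelled in userspace C. This is a sketch, not the kernel code:
struct model_folio, its fields, and the model_* names are hypothetical
stand-ins, and the CONFIG_READ_ONLY_THP_FOR_FS / mapping_large_folio_support()
condition is collapsed into a single boolean.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio: only the state the checks read. */
struct model_folio {
	bool anon;                   /* models folio_test_anon() */
	bool has_mapping;            /* models folio->mapping != NULL */
	bool mapping_supports_large; /* simplified large folio support check */
	bool swapcache;              /* models folio_test_swapcache() */
	bool huge_zero;              /* models is_huge_zero_folio() */
	bool writeback;              /* models folio_test_writeback() */
};

enum model_split_type { MODEL_SPLIT_UNIFORM, MODEL_SPLIT_NON_UNIFORM };

/* Returns 0 if the modelled folio can be split, otherwise a negative errno. */
static int model_check_splittable(const struct model_folio *f,
				  unsigned int new_order,
				  enum model_split_type type)
{
	bool non_zero_target = type == MODEL_SPLIT_NON_UNIFORM || new_order;

	/* Truncated folio: a race, so the caller may retry later. */
	if (!f->anon && !f->has_mapping)
		return -EBUSY;

	if (f->anon) {
		/* order-1 is not supported for anonymous THP. */
		if (new_order == 1)
			return -EINVAL;
	} else if (non_zero_target && !f->mapping_supports_large) {
		return -EINVAL;
	}

	if (non_zero_target && f->swapcache)
		return -EINVAL;

	if (f->huge_zero)
		return -EINVAL;

	/* Writeback is transient, so it is reported as busy, not invalid. */
	if (f->writeback)
		return -EBUSY;

	return 0;
}
```

In this model -EINVAL marks a request that can never succeed for that folio
and target order, while -EBUSY marks a transient condition, which is why a
caller can warn on -EINVAL only.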
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
---
include/linux/huge_mm.h | 6 ++--
mm/huge_memory.c | 76 +++++++++++++++++++++++------------------
2 files changed, 46 insertions(+), 36 deletions(-)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 1d439de1ca2c..66105a90b4c3 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -375,8 +375,8 @@ int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list
int folio_split_unmapped(struct folio *folio, unsigned int new_order);
int min_order_for_split(struct folio *folio);
int split_folio_to_list(struct folio *folio, struct list_head *list);
-bool folio_split_supported(struct folio *folio, unsigned int new_order,
- enum split_type split_type, bool warns);
+int folio_check_splittable(struct folio *folio, unsigned int new_order,
+ enum split_type split_type);
int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
struct list_head *list);
@@ -407,7 +407,7 @@ static inline int split_huge_page_to_order(struct page *page, unsigned int new_o
static inline int try_folio_split_to_order(struct folio *folio,
struct page *page, unsigned int new_order)
{
- if (!folio_split_supported(folio, new_order, SPLIT_TYPE_NON_UNIFORM, /* warns= */ false))
+ if (folio_check_splittable(folio, new_order, SPLIT_TYPE_NON_UNIFORM))
return split_huge_page_to_order(&folio->page, new_order);
return folio_split(folio, new_order, page, NULL);
}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 041b554c7115..771df0c02a4a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3688,15 +3688,40 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
return 0;
}
-bool folio_split_supported(struct folio *folio, unsigned int new_order,
- enum split_type split_type, bool warns)
+/**
+ * folio_check_splittable() - check if a folio can be split to a given order
+ * @folio: folio to be split
+ * @new_order: the smallest order of the after split folios (since buddy
+ * allocator like split generates folios with orders from @folio's
+ * order - 1 to new_order).
+ * @split_type: uniform or non-uniform split
+ *
+ * folio_check_splittable() checks if @folio can be split to @new_order using
+ * @split_type method. The truncated folio check must come first.
+ *
+ * Context: folio must be locked.
+ *
+ * Return: 0 - @folio can be split to @new_order, otherwise an error number is
+ * returned.
+ */
+int folio_check_splittable(struct folio *folio, unsigned int new_order,
+ enum split_type split_type)
{
+ VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
+ /*
+ * Folios that just got truncated cannot get split. Signal to the
+ * caller that there was a race.
+ *
+ * TODO: this will also currently refuse folios without a mapping in the
+ * swapcache (shmem or to-be-anon folios).
+ */
+ if (!folio_test_anon(folio) && !folio->mapping)
+ return -EBUSY;
+
if (folio_test_anon(folio)) {
/* order-1 is not supported for anonymous THP. */
- VM_WARN_ONCE(warns && new_order == 1,
- "Cannot split to order-1 folio");
if (new_order == 1)
- return false;
+ return -EINVAL;
} else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
!mapping_large_folio_support(folio->mapping)) {
@@ -3717,9 +3742,7 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
* case, the mapping does not actually support large
* folios properly.
*/
- VM_WARN_ONCE(warns,
- "Cannot split file folio to non-0 order");
- return false;
+ return -EINVAL;
}
}
@@ -3732,12 +3755,16 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
* here.
*/
if ((split_type == SPLIT_TYPE_NON_UNIFORM || new_order) && folio_test_swapcache(folio)) {
- VM_WARN_ONCE(warns,
- "Cannot split swapcache folio to non-0 order");
- return false;
+ return -EINVAL;
}
- return true;
+ if (is_huge_zero_folio(folio))
+ return -EINVAL;
+
+ if (folio_test_writeback(folio))
+ return -EBUSY;
+
+ return 0;
}
static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
@@ -3922,7 +3949,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
int remap_flags = 0;
int extra_pins, ret;
pgoff_t end = 0;
- bool is_hzp;
VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
VM_WARN_ON_ONCE_FOLIO(!folio_test_large(folio), folio);
@@ -3930,31 +3956,15 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
if (folio != page_folio(split_at) || folio != page_folio(lock_at))
return -EINVAL;
- /*
- * Folios that just got truncated cannot get split. Signal to the
- * caller that there was a race.
- *
- * TODO: this will also currently refuse shmem folios that are in the
- * swapcache.
- */
- if (!is_anon && !folio->mapping)
- return -EBUSY;
-
if (new_order >= old_order)
return -EINVAL;
- if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
- return -EINVAL;
-
- is_hzp = is_huge_zero_folio(folio);
- if (is_hzp) {
- pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
- return -EBUSY;
+ ret = folio_check_splittable(folio, new_order, split_type);
+ if (ret) {
+ VM_WARN_ONCE(ret == -EINVAL, "Tried to split an unsplittable folio");
+ return ret;
}
- if (folio_test_writeback(folio))
- return -EBUSY;
-
if (is_anon) {
/*
* The caller does not necessarily hold an mmap_lock that would
--
2.51.0
* [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation
2025-11-26 3:50 [PATCH v3 0/4] Improve folio split related functions Zi Yan
2025-11-26 3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
@ 2025-11-26 3:50 ` Zi Yan
2025-11-26 9:56 ` David Hildenbrand (Red Hat)
2025-11-26 3:50 ` [PATCH v3 3/4] mm/huge_memory: make min_order_for_split() always return an order Zi Yan
2025-11-26 3:50 ` [PATCH v3 4/4] mm/huge_memory: fix folio split stats counting Zi Yan
3 siblings, 1 reply; 12+ messages in thread
From: Zi Yan @ 2025-11-26 3:50 UTC (permalink / raw)
To: David Hildenbrand, Lorenzo Stoakes
Cc: Andrew Morton, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel
can_split_folio() is just a refcount comparison, making sure only the
split caller holds an extra pin. Open code it with
folio_expected_ref_count() != folio_ref_count() - 1. For the extra_pins
used by folio_ref_freeze(), add folio_cache_ref_count() to calculate it.
Also replace folio_expected_ref_count() with folio_cache_ref_count() in the
folio_ref_unfreeze() callers, since both return the same value when a folio
is frozen and folio_cache_ref_count() avoids the unnecessary
folio_mapcount() call in its implementation.
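
To make the refcount arithmetic concrete, here is a small userspace sketch.
It is a simplified model under stated assumptions: struct ref_model and the
model_* helpers are hypothetical, and the expected count is taken to be cache
references plus mapcount only (any other reference sources that the real
folio_expected_ref_count() may account for are ignored).

```c
#include <stdbool.h>

/* Hypothetical model of the folio state the refcount checks look at. */
struct ref_model {
	bool anon;
	bool swapcache;
	unsigned int nr_pages;
	int mapcount;
	int refcount;
};

/* References held by the pagecache or the swapcache, as in the patch. */
static unsigned int model_cache_ref_count(const struct ref_model *f)
{
	if (f->anon && !f->swapcache)
		return 0;
	return f->nr_pages;
}

/* Simplified expected count: cache references plus one per mapping. */
static int model_expected_ref_count(const struct ref_model *f)
{
	return (int)model_cache_ref_count(f) + f->mapcount;
}

/* The removed can_split_folio() comparison, with caller_pins explicit. */
static bool model_old_can_split(const struct ref_model *f, int caller_pins)
{
	return f->mapcount ==
	       f->refcount - (int)model_cache_ref_count(f) - caller_pins;
}

/* The open-coded replacement: expected == actual - caller's own pins. */
static bool model_new_can_split(const struct ref_model *f, int caller_pins)
{
	return model_expected_ref_count(f) == f->refcount - caller_pins;
}
```

Within this model the two comparisons are the same equation rearranged, and
with mapcount == 0 (the state at the unfreeze call sites) the expected count
equals the cache reference count.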
Suggested-by: David Hildenbrand (Red Hat) <david@kernel.org>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: Balbir Singh <balbirs@nvidia.com>
---
include/linux/huge_mm.h | 1 -
mm/huge_memory.c | 48 ++++++++++++++++-------------------------
mm/vmscan.c | 3 ++-
3 files changed, 21 insertions(+), 31 deletions(-)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 66105a90b4c3..8a52e20387b0 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -369,7 +369,6 @@ enum split_type {
SPLIT_TYPE_NON_UNIFORM,
};
-bool can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins);
int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
unsigned int new_order);
int folio_split_unmapped(struct folio *folio, unsigned int new_order);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 771df0c02a4a..cab429d8fe83 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3455,23 +3455,6 @@ static void lru_add_split_folio(struct folio *folio, struct folio *new_folio,
}
}
-/* Racy check whether the huge page can be split */
-bool can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins)
-{
- int extra_pins;
-
- /* Additional pins from page cache */
- if (folio_test_anon(folio))
- extra_pins = folio_test_swapcache(folio) ?
- folio_nr_pages(folio) : 0;
- else
- extra_pins = folio_nr_pages(folio);
- if (pextra_pins)
- *pextra_pins = extra_pins;
- return folio_mapcount(folio) == folio_ref_count(folio) - extra_pins -
- caller_pins;
-}
-
static bool page_range_has_hwpoisoned(struct page *page, long nr_pages)
{
for (; nr_pages; page++, nr_pages--)
@@ -3767,11 +3750,19 @@ int folio_check_splittable(struct folio *folio, unsigned int new_order,
return 0;
}
+/* Number of folio references from the pagecache or the swapcache. */
+static unsigned int folio_cache_ref_count(const struct folio *folio)
+{
+ if (folio_test_anon(folio) && !folio_test_swapcache(folio))
+ return 0;
+ return folio_nr_pages(folio);
+}
+
static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
struct page *split_at, struct xa_state *xas,
struct address_space *mapping, bool do_lru,
struct list_head *list, enum split_type split_type,
- pgoff_t end, int *nr_shmem_dropped, int extra_pins)
+ pgoff_t end, int *nr_shmem_dropped)
{
struct folio *end_folio = folio_next(folio);
struct folio *new_folio, *next;
@@ -3782,7 +3773,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
VM_WARN_ON_ONCE(!mapping && end);
/* Prevent deferred_split_scan() touching ->_refcount */
ds_queue = folio_split_queue_lock(folio);
- if (folio_ref_freeze(folio, 1 + extra_pins)) {
+ if (folio_ref_freeze(folio, folio_cache_ref_count(folio) + 1)) {
struct swap_cluster_info *ci = NULL;
struct lruvec *lruvec;
int expected_refs;
@@ -3853,7 +3844,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
zone_device_private_split_cb(folio, new_folio);
- expected_refs = folio_expected_ref_count(new_folio) + 1;
+ expected_refs = folio_cache_ref_count(new_folio) + 1;
folio_ref_unfreeze(new_folio, expected_refs);
if (do_lru)
@@ -3897,7 +3888,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
* Otherwise, a parallel folio_try_get() can grab @folio
* and its caller can see stale page cache entries.
*/
- expected_refs = folio_expected_ref_count(folio) + 1;
+ expected_refs = folio_cache_ref_count(folio) + 1;
folio_ref_unfreeze(folio, expected_refs);
if (do_lru)
@@ -3947,7 +3938,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
struct folio *new_folio, *next;
int nr_shmem_dropped = 0;
int remap_flags = 0;
- int extra_pins, ret;
+ int ret;
pgoff_t end = 0;
VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
@@ -4028,7 +4019,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
* Racy check if we can split the page, before unmap_folio() will
* split PMDs
*/
- if (!can_split_folio(folio, 1, &extra_pins)) {
+ if (folio_expected_ref_count(folio) != folio_ref_count(folio) - 1) {
ret = -EAGAIN;
goto out_unlock;
}
@@ -4051,8 +4042,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
}
ret = __folio_freeze_and_split_unmapped(folio, new_order, split_at, &xas, mapping,
- true, list, split_type, end, &nr_shmem_dropped,
- extra_pins);
+ true, list, split_type, end, &nr_shmem_dropped);
fail:
if (mapping)
xas_unlock(&xas);
@@ -4126,20 +4116,20 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
*/
int folio_split_unmapped(struct folio *folio, unsigned int new_order)
{
- int extra_pins, ret = 0;
+ int ret = 0;
VM_WARN_ON_ONCE_FOLIO(folio_mapped(folio), folio);
VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
VM_WARN_ON_ONCE_FOLIO(!folio_test_large(folio), folio);
VM_WARN_ON_ONCE_FOLIO(!folio_test_anon(folio), folio);
- if (!can_split_folio(folio, 1, &extra_pins))
+ if (folio_expected_ref_count(folio) != folio_ref_count(folio) - 1)
return -EAGAIN;
local_irq_disable();
ret = __folio_freeze_and_split_unmapped(folio, new_order, &folio->page, NULL,
NULL, false, NULL, SPLIT_TYPE_UNIFORM,
- 0, NULL, extra_pins);
+ 0, NULL);
local_irq_enable();
return ret;
}
@@ -4632,7 +4622,7 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
* can be split or not. So skip the check here.
*/
if (!folio_test_private(folio) &&
- !can_split_folio(folio, 0, NULL))
+ folio_expected_ref_count(folio) != folio_ref_count(folio))
goto next;
if (!folio_trylock(folio))
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 92980b072121..3b85652a42b9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1284,7 +1284,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
goto keep_locked;
if (folio_test_large(folio)) {
/* cannot split folio, skip it */
- if (!can_split_folio(folio, 1, NULL))
+ if (folio_expected_ref_count(folio) !=
+ folio_ref_count(folio) - 1)
goto activate_locked;
/*
* Split partially mapped folios right away.
--
2.51.0
* [PATCH v3 3/4] mm/huge_memory: make min_order_for_split() always return an order
2025-11-26 3:50 [PATCH v3 0/4] Improve folio split related functions Zi Yan
2025-11-26 3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
2025-11-26 3:50 ` [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation Zi Yan
@ 2025-11-26 3:50 ` Zi Yan
2025-11-26 3:50 ` [PATCH v3 4/4] mm/huge_memory: fix folio split stats counting Zi Yan
3 siblings, 0 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26 3:50 UTC (permalink / raw)
To: David Hildenbrand, Lorenzo Stoakes
Cc: Andrew Morton, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel
min_order_for_split() returns -EBUSY when the folio is truncated and cannot
be split. In commit 77008e1b2ef7 ("mm/huge_memory: do not change
split_huge_page*() target order silently"), memory_failure() does not
handle it and passes -EBUSY to try_to_split_thp_page() directly.
try_to_split_thp_page() returns -EINVAL, since -EBUSY becomes 0xfffffff0
when converted to the unsigned int new_order in __folio_split() and this
large new_order is rejected as invalid input. The code does not cause a bug.
soft_offline_in_use_page() also uses min_order_for_split() but it always
passes 0 as new_order for split.
Fix it by making min_order_for_split() always return an order. When the
given folio is truncated, namely folio->mapping == NULL, return 0 and let
a subsequent split function handle the situation and return -EBUSY.
Add kernel-doc to min_order_for_split() to clarify its use.
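
The conversion described above can be reproduced in a few lines of userspace
C. This is only a sketch: the model_* helper names are hypothetical, and the
0xfffffff0 value assumes a 32-bit unsigned int and EBUSY == 16, as on Linux.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: the value an int return code takes once it is
 * passed as an unsigned int new_order argument.
 */
static unsigned int model_order_from_int(int value)
{
	return (unsigned int)value;
}

/* Mirrors the "new_order >= old_order" sanity check in __folio_split(). */
static bool model_order_is_rejected(unsigned int new_order,
				    unsigned int old_order)
{
	return new_order >= old_order;
}
```

With these helpers, model_order_from_int(-EBUSY) is far larger than any real
folio order, so model_order_is_rejected() reports it as invalid, while an
order of 0 passes the check and leaves the -EBUSY decision to the later
split code.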
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
include/linux/huge_mm.h | 6 +++---
mm/huge_memory.c | 25 +++++++++++++++++++------
2 files changed, 22 insertions(+), 9 deletions(-)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 8a52e20387b0..21162493a0a0 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -372,7 +372,7 @@ enum split_type {
int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
unsigned int new_order);
int folio_split_unmapped(struct folio *folio, unsigned int new_order);
-int min_order_for_split(struct folio *folio);
+unsigned int min_order_for_split(struct folio *folio);
int split_folio_to_list(struct folio *folio, struct list_head *list);
int folio_check_splittable(struct folio *folio, unsigned int new_order,
enum split_type split_type);
@@ -630,10 +630,10 @@ static inline int split_huge_page(struct page *page)
return -EINVAL;
}
-static inline int min_order_for_split(struct folio *folio)
+static inline unsigned int min_order_for_split(struct folio *folio)
{
VM_WARN_ON_ONCE_FOLIO(1, folio);
- return -EINVAL;
+ return 0;
}
static inline int split_folio_to_list(struct folio *folio, struct list_head *list)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index cab429d8fe83..3d2396bf5763 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4221,16 +4221,29 @@ int folio_split(struct folio *folio, unsigned int new_order,
SPLIT_TYPE_NON_UNIFORM);
}
-int min_order_for_split(struct folio *folio)
+/**
+ * min_order_for_split() - get the minimum order @folio can be split to
+ * @folio: folio to split
+ *
+ * min_order_for_split() tells the minimum order @folio can be split to.
+ * If a file-backed folio is truncated, 0 will be returned. Any subsequent
+ * split attempt should get -EBUSY from split checking code.
+ *
+ * Return: @folio's minimum order for split
+ */
+unsigned int min_order_for_split(struct folio *folio)
{
if (folio_test_anon(folio))
return 0;
- if (!folio->mapping) {
- if (folio_test_pmd_mappable(folio))
- count_vm_event(THP_SPLIT_PAGE_FAILED);
- return -EBUSY;
- }
+ /*
+ * If the folio got truncated, we don't know the previous mapping and
+ * consequently the old min order. But it doesn't matter, as any split
+ * attempt will immediately fail with -EBUSY as the folio cannot get
+ * split until freed.
+ */
+ if (!folio->mapping)
+ return 0;
return mapping_min_folio_order(folio->mapping);
}
--
2.51.0
* [PATCH v3 4/4] mm/huge_memory: fix folio split stats counting
2025-11-26 3:50 [PATCH v3 0/4] Improve folio split related functions Zi Yan
` (2 preceding siblings ...)
2025-11-26 3:50 ` [PATCH v3 3/4] mm/huge_memory: make min_order_for_split() always return an order Zi Yan
@ 2025-11-26 3:50 ` Zi Yan
3 siblings, 0 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26 3:50 UTC (permalink / raw)
To: David Hildenbrand, Lorenzo Stoakes
Cc: Andrew Morton, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel
The "return <error code>" statements for error checks at the beginning of
__folio_split() skip necessary count_vm_event() and count_mthp_stat() at
the end of the function. Fix these by replacing them with
"ret = <error code>; goto out;".
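
The difference between the two patterns can be sketched in userspace C as
follows. The struct split_stats counters and the model_* functions are
hypothetical stand-ins for the count_vm_event() and count_mthp_stat()
accounting at the end of __folio_split(); no real split is performed.

```c
#include <errno.h>

/* Hypothetical counters standing in for the split statistics. */
struct split_stats {
	unsigned long split;
	unsigned long split_failed;
};

/* Old shape: an early "return" bypasses the accounting at the end. */
static int model_split_early_return(struct split_stats *stats, int precheck_err)
{
	int ret;

	if (precheck_err)
		return precheck_err; /* failure is never counted */

	ret = 0; /* the split work would happen here */
	if (ret)
		stats->split_failed++;
	else
		stats->split++;
	return ret;
}

/* New shape: every exit funnels through "out", so failures are counted. */
static int model_split_goto_out(struct split_stats *stats, int precheck_err)
{
	int ret;

	if (precheck_err) {
		ret = precheck_err;
		goto out;
	}

	ret = 0; /* the split work would happen here */
out:
	if (ret)
		stats->split_failed++;
	else
		stats->split++;
	return ret;
}

/* Run one failing pre-check through @fn and report the failures recorded. */
static unsigned long model_failures_recorded(int (*fn)(struct split_stats *, int))
{
	struct split_stats stats = { 0, 0 };

	fn(&stats, -EINVAL);
	return stats.split_failed;
}
```

In this model the early-return variant records no failure for a rejected
request, while the goto variant records exactly one.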
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
---
mm/huge_memory.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3d2396bf5763..9e984608da81 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3944,16 +3944,20 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
VM_WARN_ON_ONCE_FOLIO(!folio_test_large(folio), folio);
- if (folio != page_folio(split_at) || folio != page_folio(lock_at))
- return -EINVAL;
+ if (folio != page_folio(split_at) || folio != page_folio(lock_at)) {
+ ret = -EINVAL;
+ goto out;
+ }
- if (new_order >= old_order)
- return -EINVAL;
+ if (new_order >= old_order) {
+ ret = -EINVAL;
+ goto out;
+ }
ret = folio_check_splittable(folio, new_order, split_type);
if (ret) {
VM_WARN_ONCE(ret == -EINVAL, "Tried to split an unsplittable folio");
- return ret;
+ goto out;
}
if (is_anon) {
--
2.51.0
* Re: [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
2025-11-26 3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
@ 2025-11-26 4:14 ` Balbir Singh
2025-11-26 16:55 ` Zi Yan
2025-11-26 9:54 ` David Hildenbrand (Red Hat)
2025-11-27 5:23 ` Barry Song
2 siblings, 1 reply; 12+ messages in thread
From: Balbir Singh @ 2025-11-26 4:14 UTC (permalink / raw)
To: Zi Yan, David Hildenbrand, Lorenzo Stoakes
Cc: Andrew Morton, Baolin Wang, Liam R. Howlett, Nico Pache,
Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
Naoya Horiguchi, Wei Yang, linux-mm, linux-kernel
On 11/26/25 14:50, Zi Yan wrote:
> folio_split_supported() used in try_folio_split_to_order() requires
> folio->mapping to be non NULL, but current try_folio_split_to_order() does
> not check it. There is no issue in the current code, since
> try_folio_split_to_order() is only used in truncate_inode_partial_folio(),
> where folio->mapping is not NULL.
>
> To prevent future misuse, move folio->mapping NULL check (i.e., folio is
> truncated) into folio_split_supported(). Since folio->mapping NULL check
> returns -EBUSY and folio_split_supported() == false means -EINVAL, change
> folio_split_supported() return type from bool to int and return error
> numbers accordingly. Rename folio_split_supported() to
> folio_check_splittable() to match the return type change.
>
> While at it, move is_huge_zero_folio() check and folio_test_writeback()
> check into folio_check_splittable() and add kernel-doc.
>
> Remove all warnings inside folio_check_splittable() and give warnings
> in __folio_split() instead, so that bool warns parameter can be removed.
>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
> ---
> include/linux/huge_mm.h | 6 ++--
> mm/huge_memory.c | 76 +++++++++++++++++++++++------------------
> 2 files changed, 46 insertions(+), 36 deletions(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 1d439de1ca2c..66105a90b4c3 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -375,8 +375,8 @@ int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list
> int folio_split_unmapped(struct folio *folio, unsigned int new_order);
> int min_order_for_split(struct folio *folio);
> int split_folio_to_list(struct folio *folio, struct list_head *list);
> -bool folio_split_supported(struct folio *folio, unsigned int new_order,
> - enum split_type split_type, bool warns);
> +int folio_check_splittable(struct folio *folio, unsigned int new_order,
> + enum split_type split_type);
> int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
> struct list_head *list);
>
> @@ -407,7 +407,7 @@ static inline int split_huge_page_to_order(struct page *page, unsigned int new_o
> static inline int try_folio_split_to_order(struct folio *folio,
> struct page *page, unsigned int new_order)
> {
> - if (!folio_split_supported(folio, new_order, SPLIT_TYPE_NON_UNIFORM, /* warns= */ false))
> + if (folio_check_splittable(folio, new_order, SPLIT_TYPE_NON_UNIFORM))
> return split_huge_page_to_order(&folio->page, new_order);
> return folio_split(folio, new_order, page, NULL);
> }
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 041b554c7115..771df0c02a4a 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3688,15 +3688,40 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
> return 0;
> }
>
> -bool folio_split_supported(struct folio *folio, unsigned int new_order,
> - enum split_type split_type, bool warns)
> +/**
> + * folio_check_splittable() - check if a folio can be split to a given order
> + * @folio: folio to be split
> + * @new_order: the smallest order of the after split folios (since buddy
> + * allocator like split generates folios with orders from @folio's
> + * order - 1 to new_order).
> + * @split_type: uniform or non-uniform split
> + *
> + * folio_check_splittable() checks if @folio can be split to @new_order using
> + * @split_type method. The truncated folio check must come first.
> + *
> + * Context: folio must be locked.
> + *
> + * Return: 0 - @folio can be split to @new_order, otherwise an error number is
> + * returned.
> + */
> +int folio_check_splittable(struct folio *folio, unsigned int new_order,
> + enum split_type split_type)
> {
> + VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
> + /*
> + * Folios that just got truncated cannot get split. Signal to the
> + * caller that there was a race.
> + *
> + * TODO: this will also currently refuse folios without a mapping in the
> + * swapcache (shmem or to-be-anon folios).
> + */
> + if (!folio_test_anon(folio) && !folio->mapping)
> + return -EBUSY;
> +
Nit: Shouldn't the order of the checks be
if (!folio->mapping && !folio_test_anon(folio))
That works better if folio->mapping is NULL.
> if (folio_test_anon(folio)) {
> /* order-1 is not supported for anonymous THP. */
> - VM_WARN_ONCE(warns && new_order == 1,
> - "Cannot split to order-1 folio");
> if (new_order == 1)
> - return false;
> + return -EINVAL;
> } else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
> if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
> !mapping_large_folio_support(folio->mapping)) {
> @@ -3717,9 +3742,7 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
> * case, the mapping does not actually support large
> * folios properly.
> */
> - VM_WARN_ONCE(warns,
> - "Cannot split file folio to non-0 order");
> - return false;
> + return -EINVAL;
> }
> }
>
> @@ -3732,12 +3755,16 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
> * here.
> */
> if ((split_type == SPLIT_TYPE_NON_UNIFORM || new_order) && folio_test_swapcache(folio)) {
> - VM_WARN_ONCE(warns,
> - "Cannot split swapcache folio to non-0 order");
> - return false;
> + return -EINVAL;
> }
>
> - return true;
> + if (is_huge_zero_folio(folio))
> + return -EINVAL;
> +
> + if (folio_test_writeback(folio))
> + return -EBUSY;
> +
> + return 0;
> }
>
> static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
> @@ -3922,7 +3949,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> int remap_flags = 0;
> int extra_pins, ret;
> pgoff_t end = 0;
> - bool is_hzp;
>
> VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
> VM_WARN_ON_ONCE_FOLIO(!folio_test_large(folio), folio);
> @@ -3930,31 +3956,15 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> if (folio != page_folio(split_at) || folio != page_folio(lock_at))
> return -EINVAL;
>
> - /*
> - * Folios that just got truncated cannot get split. Signal to the
> - * caller that there was a race.
> - *
> - * TODO: this will also currently refuse shmem folios that are in the
> - * swapcache.
> - */
> - if (!is_anon && !folio->mapping)
> - return -EBUSY;
> -
> if (new_order >= old_order)
> return -EINVAL;
>
> - if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
> - return -EINVAL;
> -
> - is_hzp = is_huge_zero_folio(folio);
> - if (is_hzp) {
> - pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
> - return -EBUSY;
> + ret = folio_check_splittable(folio, new_order, split_type);
> + if (ret) {
> + VM_WARN_ONCE(ret == -EINVAL, "Tried to split an unsplittable folio");
> + return ret;
> }
>
> - if (folio_test_writeback(folio))
> - return -EBUSY;
> -
> if (is_anon) {
> /*
> * The caller does not necessarily hold an mmap_lock that would
Otherwise, looks good!
Acked-by: Balbir Singh <balbirs@nvidia.com>
* Re: [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
2025-11-26 3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
2025-11-26 4:14 ` Balbir Singh
@ 2025-11-26 9:54 ` David Hildenbrand (Red Hat)
2025-11-26 16:59 ` Zi Yan
2025-11-27 5:23 ` Barry Song
2 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-26 9:54 UTC (permalink / raw)
To: Zi Yan, Lorenzo Stoakes
Cc: Andrew Morton, Baolin Wang, Liam R. Howlett, Nico Pache,
Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel
> - /*
> - * Folios that just got truncated cannot get split. Signal to the
> - * caller that there was a race.
> - *
> - * TODO: this will also currently refuse shmem folios that are in the
> - * swapcache.
> - */
> - if (!is_anon && !folio->mapping)
> - return -EBUSY;
> -
> if (new_order >= old_order)
> return -EINVAL;
>
> - if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
> - return -EINVAL;
> -
> - is_hzp = is_huge_zero_folio(folio);
> - if (is_hzp) {
> - pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
> - return -EBUSY;
As we are changing that case to a VM_WARN_ONCE(), is there some path
where we might trigger that?
I'm wondering about the split_huge_pages_all() function in particular. I
guess the "!folio_test_lru(folio)" would protect us?
Apart from that LGTM
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
--
Cheers
David
* Re: [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation
2025-11-26 3:50 ` [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation Zi Yan
@ 2025-11-26 9:56 ` David Hildenbrand (Red Hat)
2025-11-26 16:59 ` Zi Yan
0 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-26 9:56 UTC (permalink / raw)
To: Zi Yan, Lorenzo Stoakes
Cc: Andrew Morton, Baolin Wang, Liam R. Howlett, Nico Pache,
Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Miaohe Lin,
Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm, linux-kernel
> static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
> struct page *split_at, struct xa_state *xas,
> struct address_space *mapping, bool do_lru,
> struct list_head *list, enum split_type split_type,
> - pgoff_t end, int *nr_shmem_dropped, int extra_pins)
> + pgoff_t end, int *nr_shmem_dropped)
> {
> struct folio *end_folio = folio_next(folio);
> struct folio *new_folio, *next;
> @@ -3782,7 +3773,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
> VM_WARN_ON_ONCE(!mapping && end);
> /* Prevent deferred_split_scan() touching ->_refcount */
> ds_queue = folio_split_queue_lock(folio);
> - if (folio_ref_freeze(folio, 1 + extra_pins)) {
> + if (folio_ref_freeze(folio, folio_cache_ref_count(folio) + 1)) {
> struct swap_cluster_info *ci = NULL;
> struct lruvec *lruvec;
> int expected_refs;
> @@ -3853,7 +3844,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>
> zone_device_private_split_cb(folio, new_folio);
>
> - expected_refs = folio_expected_ref_count(new_folio) + 1;
> + expected_refs = folio_cache_ref_count(new_folio) + 1;
> folio_ref_unfreeze(new_folio, expected_refs);
>
> if (do_lru)
> @@ -3897,7 +3888,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
> * Otherwise, a parallel folio_try_get() can grab @folio
> * and its caller can see stale page cache entries.
> */
> - expected_refs = folio_expected_ref_count(folio) + 1;
> + expected_refs = folio_cache_ref_count(folio) + 1;
> folio_ref_unfreeze(folio, expected_refs);
Can we just get rid of the expected_refs variable as well?
Apart from that LGTM, thanks!
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
--
Cheers
David
* Re: [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
2025-11-26 4:14 ` Balbir Singh
@ 2025-11-26 16:55 ` Zi Yan
0 siblings, 0 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26 16:55 UTC (permalink / raw)
To: Balbir Singh
Cc: David Hildenbrand, Lorenzo Stoakes, Andrew Morton, Baolin Wang,
Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
Lance Yang, Miaohe Lin, Naoya Horiguchi, Wei Yang, linux-mm,
linux-kernel
On 25 Nov 2025, at 23:14, Balbir Singh wrote:
> On 11/26/25 14:50, Zi Yan wrote:
>> folio_split_supported() used in try_folio_split_to_order() requires
>> folio->mapping to be non NULL, but current try_folio_split_to_order() does
>> not check it. There is no issue in the current code, since
>> try_folio_split_to_order() is only used in truncate_inode_partial_folio(),
>> where folio->mapping is not NULL.
>>
>> To prevent future misuse, move folio->mapping NULL check (i.e., folio is
>> truncated) into folio_split_supported(). Since folio->mapping NULL check
>> returns -EBUSY and folio_split_supported() == false means -EINVAL, change
>> folio_split_supported() return type from bool to int and return error
>> numbers accordingly. Rename folio_split_supported() to
>> folio_check_splittable() to match the return type change.
>>
>> While at it, move is_huge_zero_folio() check and folio_test_writeback()
>> check into folio_check_splittable() and add kernel-doc.
>>
>> Remove all warnings inside folio_check_splittable() and give warnings
>> in __folio_split() instead, so that bool warns parameter can be removed.
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
>> ---
>> include/linux/huge_mm.h | 6 ++--
>> mm/huge_memory.c | 76 +++++++++++++++++++++++------------------
>> 2 files changed, 46 insertions(+), 36 deletions(-)
>>
>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>> index 1d439de1ca2c..66105a90b4c3 100644
>> --- a/include/linux/huge_mm.h
>> +++ b/include/linux/huge_mm.h
>> @@ -375,8 +375,8 @@ int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list
>> int folio_split_unmapped(struct folio *folio, unsigned int new_order);
>> int min_order_for_split(struct folio *folio);
>> int split_folio_to_list(struct folio *folio, struct list_head *list);
>> -bool folio_split_supported(struct folio *folio, unsigned int new_order,
>> - enum split_type split_type, bool warns);
>> +int folio_check_splittable(struct folio *folio, unsigned int new_order,
>> + enum split_type split_type);
>> int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
>> struct list_head *list);
>>
>> @@ -407,7 +407,7 @@ static inline int split_huge_page_to_order(struct page *page, unsigned int new_o
>> static inline int try_folio_split_to_order(struct folio *folio,
>> struct page *page, unsigned int new_order)
>> {
>> - if (!folio_split_supported(folio, new_order, SPLIT_TYPE_NON_UNIFORM, /* warns= */ false))
>> + if (folio_check_splittable(folio, new_order, SPLIT_TYPE_NON_UNIFORM))
>> return split_huge_page_to_order(&folio->page, new_order);
>> return folio_split(folio, new_order, page, NULL);
>> }
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 041b554c7115..771df0c02a4a 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -3688,15 +3688,40 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>> return 0;
>> }
>>
>> -bool folio_split_supported(struct folio *folio, unsigned int new_order,
>> - enum split_type split_type, bool warns)
>> +/**
>> + * folio_check_splittable() - check if a folio can be split to a given order
>> + * @folio: folio to be split
>> + * @new_order: the smallest order of the after split folios (since buddy
>> + * allocator like split generates folios with orders from @folio's
>> + * order - 1 to new_order).
>> + * @split_type: uniform or non-uniform split
>> + *
>> + * folio_check_splittable() checks if @folio can be split to @new_order using
>> + * @split_type method. The truncated folio check must come first.
>> + *
>> + * Context: folio must be locked.
>> + *
>> + * Return: 0 - @folio can be split to @new_order, otherwise an error number is
>> + * returned.
>> + */
>> +int folio_check_splittable(struct folio *folio, unsigned int new_order,
>> + enum split_type split_type)
>> {
>> + VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
>> + /*
>> + * Folios that just got truncated cannot get split. Signal to the
>> + * caller that there was a race.
>> + *
>> + * TODO: this will also currently refuse folios without a mapping in the
>> + * swapcache (shmem or to-be-anon folios).
>> + */
>> + if (!folio_test_anon(folio) && !folio->mapping)
>> + return -EBUSY;
>> +
>
> Nit: Shouldn't the order of check be
>
> if (!folio->mapping && !folio_test_anon(folio))
>
> works better if folio->mapping is NULL
It does not matter, since folio_test_anon() reads folio->mapping too.
I can swap the order in the next version.
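As an illustration only, here is a userspace sketch (not kernel code) of why
the two orderings are equivalent. It assumes the anon flag is carried in the
low bits of ->mapping; struct mock_folio, MOCK_MAPPING_ANON and all helper
names below are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the anon flag kept in the low bits of ->mapping. */
#define MOCK_MAPPING_ANON 0x1UL

/* Hypothetical, minimal model of a folio: only the mapping pointer matters. */
struct mock_folio {
	void *mapping;
};

/* Like the real test, this reads ->mapping, so a NULL mapping is never anon. */
static bool mock_folio_test_anon(const struct mock_folio *folio)
{
	return ((uintptr_t)folio->mapping & MOCK_MAPPING_ANON) != 0;
}

/* Order used in the patch: anon test first, then the NULL mapping test. */
static bool truncated_check_anon_first(const struct mock_folio *folio)
{
	return !mock_folio_test_anon(folio) && !folio->mapping;
}

/* Order suggested in review: NULL mapping test first, then the anon test. */
static bool truncated_check_mapping_first(const struct mock_folio *folio)
{
	return !folio->mapping && !mock_folio_test_anon(folio);
}

/* Both orderings must agree for any folio; abort loudly if they ever differ. */
static bool mock_folio_is_truncated(const struct mock_folio *folio)
{
	bool a = truncated_check_anon_first(folio);
	bool b = truncated_check_mapping_first(folio);

	assert(a == b);
	return a;
}
```

In this model both conjuncts only read ->mapping, so swapping them changes
neither the result nor the safety of the check; at most it changes which
operand is evaluated first.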
>
>
>> if (folio_test_anon(folio)) {
>> /* order-1 is not supported for anonymous THP. */
>> - VM_WARN_ONCE(warns && new_order == 1,
>> - "Cannot split to order-1 folio");
>> if (new_order == 1)
>> - return false;
>> + return -EINVAL;
>> } else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
>> if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
>> !mapping_large_folio_support(folio->mapping)) {
>> @@ -3717,9 +3742,7 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
>> * case, the mapping does not actually support large
>> * folios properly.
>> */
>> - VM_WARN_ONCE(warns,
>> - "Cannot split file folio to non-0 order");
>> - return false;
>> + return -EINVAL;
>> }
>> }
>>
>> @@ -3732,12 +3755,16 @@ bool folio_split_supported(struct folio *folio, unsigned int new_order,
>> * here.
>> */
>> if ((split_type == SPLIT_TYPE_NON_UNIFORM || new_order) && folio_test_swapcache(folio)) {
>> - VM_WARN_ONCE(warns,
>> - "Cannot split swapcache folio to non-0 order");
>> - return false;
>> + return -EINVAL;
>> }
>>
>> - return true;
>> + if (is_huge_zero_folio(folio))
>> + return -EINVAL;
>> +
>> + if (folio_test_writeback(folio))
>> + return -EBUSY;
>> +
>> + return 0;
>> }
>>
>> static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
>> @@ -3922,7 +3949,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>> int remap_flags = 0;
>> int extra_pins, ret;
>> pgoff_t end = 0;
>> - bool is_hzp;
>>
>> VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
>> VM_WARN_ON_ONCE_FOLIO(!folio_test_large(folio), folio);
>> @@ -3930,31 +3956,15 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>> if (folio != page_folio(split_at) || folio != page_folio(lock_at))
>> return -EINVAL;
>>
>> - /*
>> - * Folios that just got truncated cannot get split. Signal to the
>> - * caller that there was a race.
>> - *
>> - * TODO: this will also currently refuse shmem folios that are in the
>> - * swapcache.
>> - */
>> - if (!is_anon && !folio->mapping)
>> - return -EBUSY;
>> -
>> if (new_order >= old_order)
>> return -EINVAL;
>>
>> - if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
>> - return -EINVAL;
>> -
>> - is_hzp = is_huge_zero_folio(folio);
>> - if (is_hzp) {
>> - pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
>> - return -EBUSY;
>> + ret = folio_check_splittable(folio, new_order, split_type);
>> + if (ret) {
>> + VM_WARN_ONCE(ret == -EINVAL, "Tried to split an unsplittable folio");
>> + return ret;
>> }
>>
>> - if (folio_test_writeback(folio))
>> - return -EBUSY;
>> -
>> if (is_anon) {
>> /*
>> * The caller does not necessarily hold an mmap_lock that would
>
> Otherwise, looks good!
>
> Acked-by: Balbir Singh <balbirs@nvidia.com>
Thanks.
Best Regards,
Yan, Zi
* Re: [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
2025-11-26 9:54 ` David Hildenbrand (Red Hat)
@ 2025-11-26 16:59 ` Zi Yan
0 siblings, 0 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26 16:59 UTC (permalink / raw)
To: David Hildenbrand (Red Hat)
Cc: Lorenzo Stoakes, Andrew Morton, Baolin Wang, Liam R. Howlett,
Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang,
Miaohe Lin, Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm,
linux-kernel
On 26 Nov 2025, at 4:54, David Hildenbrand (Red Hat) wrote:
>> - /*
>> - * Folios that just got truncated cannot get split. Signal to the
>> - * caller that there was a race.
>> - *
>> - * TODO: this will also currently refuse shmem folios that are in the
>> - * swapcache.
>> - */
>> - if (!is_anon && !folio->mapping)
>> - return -EBUSY;
>> -
>> if (new_order >= old_order)
>> return -EINVAL;
>> - if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
>> - return -EINVAL;
>> -
>> - is_hzp = is_huge_zero_folio(folio);
>> - if (is_hzp) {
>> - pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
>> - return -EBUSY;
>
> As we are changing that case to a VM_WARN_ONCE(), is there some path where we might trigger that?
Based on the git history, this check was added for the case where an error
is injected into the huge zero folio and memory failure handling is triggered.
>
> I'm wondering about the split_huge_pages_all() function in particular. I guess the "!folio_test_lru(folio)" would protect us?
I think so.
>
> Apart from that LGTM
>
> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
>
Thanks.
Best Regards,
Yan, Zi
* Re: [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation
2025-11-26 9:56 ` David Hildenbrand (Red Hat)
@ 2025-11-26 16:59 ` Zi Yan
0 siblings, 0 replies; 12+ messages in thread
From: Zi Yan @ 2025-11-26 16:59 UTC (permalink / raw)
To: David Hildenbrand (Red Hat)
Cc: Lorenzo Stoakes, Andrew Morton, Baolin Wang, Liam R. Howlett,
Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang,
Miaohe Lin, Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm,
linux-kernel
On 26 Nov 2025, at 4:56, David Hildenbrand (Red Hat) wrote:
>> static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int new_order,
>> struct page *split_at, struct xa_state *xas,
>> struct address_space *mapping, bool do_lru,
>> struct list_head *list, enum split_type split_type,
>> - pgoff_t end, int *nr_shmem_dropped, int extra_pins)
>> + pgoff_t end, int *nr_shmem_dropped)
>> {
>> struct folio *end_folio = folio_next(folio);
>> struct folio *new_folio, *next;
>> @@ -3782,7 +3773,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>> VM_WARN_ON_ONCE(!mapping && end);
>> /* Prevent deferred_split_scan() touching ->_refcount */
>> ds_queue = folio_split_queue_lock(folio);
>> - if (folio_ref_freeze(folio, 1 + extra_pins)) {
>> + if (folio_ref_freeze(folio, folio_cache_ref_count(folio) + 1)) {
>> struct swap_cluster_info *ci = NULL;
>> struct lruvec *lruvec;
>> int expected_refs;
>> @@ -3853,7 +3844,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>> zone_device_private_split_cb(folio, new_folio);
>> - expected_refs = folio_expected_ref_count(new_folio) + 1;
>> + expected_refs = folio_cache_ref_count(new_folio) + 1;
>> folio_ref_unfreeze(new_folio, expected_refs);
>> if (do_lru)
>> @@ -3897,7 +3888,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>> * Otherwise, a parallel folio_try_get() can grab @folio
>> * and its caller can see stale page cache entries.
>> */
>> - expected_refs = folio_expected_ref_count(folio) + 1;
>> + expected_refs = folio_cache_ref_count(folio) + 1;
>> folio_ref_unfreeze(folio, expected_refs);
>
> Can we just get rid of the expected_refs variable as well?
OK. Will update it.
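A rough sketch of what dropping the variable could look like, written as a
userspace mock rather than the actual follow-up patch. struct mock_folio and
the mock_* helpers are hypothetical, and the per-page cache reference model
inside mock_folio_cache_ref_count() is an assumption made only for this
example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, minimal model of the fields this example needs. */
struct mock_folio {
	int refcount;	/* 0 while the folio is frozen */
	int nr_pages;
	bool in_cache;	/* page cache or swap cache */
};

/* Assumed model: one cache reference per page while in a cache, else none. */
static int mock_folio_cache_ref_count(const struct mock_folio *folio)
{
	return folio->in_cache ? folio->nr_pages : 0;
}

/* Unfreezing is only valid on a frozen (refcount == 0) folio. */
static void mock_folio_ref_unfreeze(struct mock_folio *folio, int count)
{
	assert(folio->refcount == 0);
	folio->refcount = count;
}

/*
 * Without a local expected_refs variable: the cache references plus the
 * caller's own reference are passed to the unfreeze helper directly.
 */
static void mock_unfreeze_after_split(struct mock_folio *folio)
{
	mock_folio_ref_unfreeze(folio, mock_folio_cache_ref_count(folio) + 1);
}
```

The inlined form keeps the "+ 1 for the caller's reference" next to the call
that consumes it, which is the only place the value is used in the quoted
hunks.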
>
> Apart from that LGTM, thanks!
>
> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Thanks.
Best Regards,
Yan, Zi
* Re: [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
2025-11-26 3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
2025-11-26 4:14 ` Balbir Singh
2025-11-26 9:54 ` David Hildenbrand (Red Hat)
@ 2025-11-27 5:23 ` Barry Song
2 siblings, 0 replies; 12+ messages in thread
From: Barry Song @ 2025-11-27 5:23 UTC (permalink / raw)
To: Zi Yan
Cc: David Hildenbrand, Lorenzo Stoakes, Andrew Morton, Baolin Wang,
Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Lance Yang,
Miaohe Lin, Naoya Horiguchi, Wei Yang, Balbir Singh, linux-mm,
linux-kernel
On Wed, Nov 26, 2025 at 11:50 AM Zi Yan <ziy@nvidia.com> wrote:
>
> folio_split_supported() used in try_folio_split_to_order() requires
> folio->mapping to be non NULL, but current try_folio_split_to_order() does
> not check it. There is no issue in the current code, since
> try_folio_split_to_order() is only used in truncate_inode_partial_folio(),
> where folio->mapping is not NULL.
>
> To prevent future misuse, move folio->mapping NULL check (i.e., folio is
> truncated) into folio_split_supported(). Since folio->mapping NULL check
> returns -EBUSY and folio_split_supported() == false means -EINVAL, change
> folio_split_supported() return type from bool to int and return error
> numbers accordingly. Rename folio_split_supported() to
> folio_check_splittable() to match the return type change.
>
> While at it, move is_huge_zero_folio() check and folio_test_writeback()
> check into folio_check_splittable() and add kernel-doc.
>
> Remove all warnings inside folio_check_splittable() and give warnings
> in __folio_split() instead, so that bool warns parameter can be removed.
>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Much cleaner than having a "warns" argument before.
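For readers skimming the thread, the resulting shape can be sketched in
userspace as follows. This is a simplified mock, not the kernel code: struct
mock_folio, its flags and the mock_* functions are hypothetical, and only
three of the checks from the patch are modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical folio state; each flag stands for one check in the patch. */
struct mock_folio {
	bool locked;
	bool truncated;	/* stands for: not anon and ->mapping == NULL */
	bool huge_zero;
	bool writeback;
};

/* The checker only reports an error number; it never warns by itself. */
static int mock_check_splittable(const struct mock_folio *folio)
{
	assert(folio->locked);	/* context: folio must be locked */

	if (folio->truncated)
		return -EBUSY;	/* raced with truncation, caller may retry */
	if (folio->huge_zero)
		return -EINVAL;	/* can never be split */
	if (folio->writeback)
		return -EBUSY;	/* transient condition */
	return 0;
}

static bool mock_warned;

/* The caller decides whether to warn: once, and only for -EINVAL. */
static int mock_split(const struct mock_folio *folio)
{
	int ret = mock_check_splittable(folio);

	if (ret) {
		if (ret == -EINVAL && !mock_warned) {
			mock_warned = true;
			fprintf(stderr, "Tried to split an unsplittable folio\n");
		}
		return ret;
	}
	return 0;	/* the actual split would happen here */
}
```

A caller that wants a silent probe can use the checker directly, while the
split path keeps a single warning site, so no "warns" flag has to be threaded
through.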
Reviewed-by: Barry Song <baohua@kernel.org>
Thanks
Barry
Thread overview: 12+ messages
2025-11-26 3:50 [PATCH v3 0/4] Improve folio split related functions Zi Yan
2025-11-26 3:50 ` [PATCH v3 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable() Zi Yan
2025-11-26 4:14 ` Balbir Singh
2025-11-26 16:55 ` Zi Yan
2025-11-26 9:54 ` David Hildenbrand (Red Hat)
2025-11-26 16:59 ` Zi Yan
2025-11-27 5:23 ` Barry Song
2025-11-26 3:50 ` [PATCH v3 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation Zi Yan
2025-11-26 9:56 ` David Hildenbrand (Red Hat)
2025-11-26 16:59 ` Zi Yan
2025-11-26 3:50 ` [PATCH v3 3/4] mm/huge_memory: make min_order_for_split() always return an order Zi Yan
2025-11-26 3:50 ` [PATCH v3 4/4] mm/huge_memory: fix folio split stats counting Zi Yan