linux-mm.kvack.org archive mirror
* [RFC v2 0/3] Decoupling large folios dependency on THP
@ 2025-12-06  3:08 Pankaj Raghav
  2025-12-06  3:08 ` [RFC v2 1/3] filemap: set max order to be min order if THP is disabled Pankaj Raghav
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Pankaj Raghav @ 2025-12-06  3:08 UTC (permalink / raw)
  To: Suren Baghdasaryan, Mike Rapoport, David Hildenbrand,
	Ryan Roberts, Michal Hocko, Lance Yang, Lorenzo Stoakes,
	Baolin Wang, Dev Jain, Barry Song, Andrew Morton, Nico Pache,
	Zi Yan, Vlastimil Babka, Liam R . Howlett, Jens Axboe
  Cc: linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
	gost.dev, kernel, tytso, Pankaj Raghav

File-backed large folios were initially implemented with a dependency on
the Transparent Huge Pages (THP) infrastructure. As large folio adoption
expanded across the kernel, CONFIG_TRANSPARENT_HUGEPAGE has become an
overloaded configuration option, sometimes used as a proxy for large
folio support [1][2][3].

This series is a part of the LPC talk[4], and I am sending the RFC
series to start the discussion.

There are multiple ways to solve this problem; this series is one of
them, with minimal changes. I plan to discuss other possible solutions
at the talk.

Based on my investigation, the only feature large folios depend on is
the THP splitting infrastructure. When a large folio has to be split,
either during truncation or under memory pressure, THP's splitting
infrastructure is used to split it into min-order folio chunks.

In this approach, we restrict the maximum order of large folios to the
minimum order to ensure we never use the splitting infrastructure when
THP is disabled.
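
The capping rule can be sketched as a small userspace model. This is only
an illustration, not the kernel code: clamp_order_range(), struct
order_range and MODEL_MAX_PAGECACHE_ORDER are hypothetical names, and the
value 8 is an arbitrary stand-in for MAX_PAGECACHE_ORDER. The thp_enabled
flag stands in for IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for MAX_PAGECACHE_ORDER; the real value depends on
 * the architecture and kernel configuration. */
#define MODEL_MAX_PAGECACHE_ORDER 8u

struct order_range {
	unsigned int min;
	unsigned int max;
};

/*
 * Userspace model of clamping a mapping's folio order range.
 * thp_enabled models IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE).
 */
struct order_range clamp_order_range(unsigned int min, unsigned int max,
				     bool thp_enabled)
{
	struct order_range r;

	if (min > MODEL_MAX_PAGECACHE_ORDER)
		min = MODEL_MAX_PAGECACHE_ORDER;
	if (max > MODEL_MAX_PAGECACHE_ORDER)
		max = MODEL_MAX_PAGECACHE_ORDER;

	/* Without THP, pin max to min so that no folio is ever larger than
	 * the min order and nothing needs to be split. */
	if (max < min || !thp_enabled)
		max = min;

	r.min = min;
	r.max = max;
	assert(r.min <= r.max);
	return r;
}
```

With THP disabled, a request for orders 2..8 collapses to 2..2; with THP
enabled, the same request is left as 2..8.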

I disabled THP and ran xfstests on XFS with 16k, 32k and 64k block sizes,
and the changes seem to survive the tests without any issues.
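
For reference, here is a rough sketch of how those block sizes map to a
minimum folio order, assuming a 4 KiB base page size (PAGE_SHIFT == 12);
on other page sizes the numbers differ. model_ilog2() and
min_order_for_block_size() are hypothetical helper names used only for
this illustration.

```c
#include <assert.h>

/* Assumption: 4 KiB base pages, i.e. PAGE_SHIFT == 12. */
#define MODEL_PAGE_SHIFT 12u

/* Integer log2 of a non-zero power-of-two value. */
unsigned int model_ilog2(unsigned long v)
{
	unsigned int n = 0;

	assert(v != 0 && (v & (v - 1)) == 0);
	while (v > 1) {
		v >>= 1;
		n++;
	}
	return n;
}

/* Smallest folio order such that one folio covers one filesystem block. */
unsigned int min_order_for_block_size(unsigned long bsize)
{
	unsigned int shift = model_ilog2(bsize);

	return shift > MODEL_PAGE_SHIFT ? shift - MODEL_PAGE_SHIFT : 0;
}
```

Under that assumption, 16k, 32k and 64k block sizes correspond to min
orders 2, 3 and 4 respectively, which is also the max order once THP is
disabled.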

Looking forward to some productive discussion.

P.S: Thanks to Zi, David and willy for all the ideas they provided to
solve this problem.

[1] https://lore.kernel.org/linux-mm/731d8b44-1a45-40bc-a274-8f39a7ae0f7f@lucifer.local/
[2] https://lore.kernel.org/all/aGfNKGBz9lhuK1AF@casper.infradead.org/
[3] https://lore.kernel.org/linux-ext4/20251110043226.GD2988753@mit.edu/
[4] https://lpc.events/event/19/contributions/2139/

Pankaj Raghav (3):
  filemap: set max order to be min order if THP is disabled
  huge_memory: skip warning if min order and folio order are same in
    split
  blkdev: remove CONFIG_TRANSPARENT_HUGEPAGES dependency for LBS devices

 include/linux/blkdev.h  |  5 -----
 include/linux/huge_mm.h | 40 ++++++++--------------------------------
 include/linux/pagemap.h | 17 ++++++-----------
 mm/memory.c             | 41 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 55 insertions(+), 48 deletions(-)


base-commit: e4c4d9892021888be6d874ec1be307e80382f431
-- 
2.50.1




* [RFC v2 1/3] filemap: set max order to be min order if THP is disabled
  2025-12-06  3:08 [RFC v2 0/3] Decoupling large folios dependency on THP Pankaj Raghav
@ 2025-12-06  3:08 ` Pankaj Raghav
  2025-12-06  3:08 ` [RFC v2 2/3] huge_memory: skip warning if min order and folio order are same in split Pankaj Raghav
  2025-12-06  3:08 ` [RFC v2 3/3] blkdev: remove CONFIG_TRANSPARENT_HUGEPAGES dependency for LBS devices Pankaj Raghav
  2 siblings, 0 replies; 4+ messages in thread
From: Pankaj Raghav @ 2025-12-06  3:08 UTC (permalink / raw)
  To: Suren Baghdasaryan, Mike Rapoport, David Hildenbrand,
	Ryan Roberts, Michal Hocko, Lance Yang, Lorenzo Stoakes,
	Baolin Wang, Dev Jain, Barry Song, Andrew Morton, Nico Pache,
	Zi Yan, Vlastimil Babka, Liam R . Howlett, Jens Axboe
  Cc: linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
	gost.dev, kernel, tytso, Pankaj Raghav

Large folios in the page cache depend on the splitting infrastructure from
THP. To remove the dependency between large folios and
CONFIG_TRANSPARENT_HUGEPAGE, set the max order equal to the min order if
THP is disabled. This ensures that the splitting code is not required
when THP is disabled, thereby removing the dependency between large
folios and THP.

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
 include/linux/pagemap.h | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 09b581c1d878..1bb0d4432d4b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -397,9 +397,7 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
  */
 static inline size_t mapping_max_folio_size_supported(void)
 {
-	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
-		return 1U << (PAGE_SHIFT + MAX_PAGECACHE_ORDER);
-	return PAGE_SIZE;
+	return 1U << (PAGE_SHIFT + MAX_PAGECACHE_ORDER);
 }
 
 /*
@@ -422,16 +420,17 @@ static inline void mapping_set_folio_order_range(struct address_space *mapping,
 						 unsigned int min,
 						 unsigned int max)
 {
-	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
-		return;
-
 	if (min > MAX_PAGECACHE_ORDER)
 		min = MAX_PAGECACHE_ORDER;
 
 	if (max > MAX_PAGECACHE_ORDER)
 		max = MAX_PAGECACHE_ORDER;
 
-	if (max < min)
+	/* Large folios depend on THP infrastructure for splitting.
+	 * If THP is disabled, we cap the max order to min order to avoid
+	 * splitting the folios.
+	 */
+	if ((max < min) || !IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
 		max = min;
 
 	mapping->flags = (mapping->flags & ~AS_FOLIO_ORDER_MASK) |
@@ -463,16 +462,12 @@ static inline void mapping_set_large_folios(struct address_space *mapping)
 static inline unsigned int
 mapping_max_folio_order(const struct address_space *mapping)
 {
-	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
-		return 0;
 	return (mapping->flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX;
 }
 
 static inline unsigned int
 mapping_min_folio_order(const struct address_space *mapping)
 {
-	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
-		return 0;
 	return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
 }
 
-- 
2.50.1




* [RFC v2 2/3] huge_memory: skip warning if min order and folio order are same in split
  2025-12-06  3:08 [RFC v2 0/3] Decoupling large folios dependency on THP Pankaj Raghav
  2025-12-06  3:08 ` [RFC v2 1/3] filemap: set max order to be min order if THP is disabled Pankaj Raghav
@ 2025-12-06  3:08 ` Pankaj Raghav
  2025-12-06  3:08 ` [RFC v2 3/3] blkdev: remove CONFIG_TRANSPARENT_HUGEPAGES dependency for LBS devices Pankaj Raghav
  2 siblings, 0 replies; 4+ messages in thread
From: Pankaj Raghav @ 2025-12-06  3:08 UTC (permalink / raw)
  To: Suren Baghdasaryan, Mike Rapoport, David Hildenbrand,
	Ryan Roberts, Michal Hocko, Lance Yang, Lorenzo Stoakes,
	Baolin Wang, Dev Jain, Barry Song, Andrew Morton, Nico Pache,
	Zi Yan, Vlastimil Babka, Liam R . Howlett, Jens Axboe
  Cc: linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
	gost.dev, kernel, tytso, Pankaj Raghav

When THP is disabled, the max order of file-backed large folios is capped
to the min order to avoid using the splitting infrastructure.

Currently, split calls trigger a warning when called with THP disabled.
But a split call does not have to do anything when the min order is the
same as the folio order.

So skip the warning in the folio split functions if the min order is the
same as the folio order for file-backed folios.

Due to circular dependency issues, move the definitions of the split
functions for !CONFIG_TRANSPARENT_HUGEPAGE to mm/memory.c.
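
The intended decision in the !THP split stubs can be modelled in userspace
roughly as follows. This is a sketch only: struct model_folio, struct
split_result and model_split_without_thp() are hypothetical names, and the
"warned" flag merely records where the kernel stub would reach its
VM_WARN_ON_ONCE.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the folio state the stub looks at. */
struct model_folio {
	bool anon;                      /* models folio_test_anon() */
	unsigned int order;             /* models folio_order() */
	unsigned int mapping_min_order; /* models mapping_min_folio_order() */
};

struct split_result {
	int ret;     /* the stub never splits, so this is always -EINVAL */
	bool warned; /* true where the kernel stub would emit a warning */
};

/* Model of a split request arriving while THP is compiled out. */
struct split_result model_split_without_thp(const struct model_folio *f)
{
	struct split_result r = { .ret = -EINVAL, .warned = false };

	assert(f);

	/* File-backed folio already at the min order: nothing to split, so
	 * the request is refused quietly. */
	if (!f->anon && f->mapping_min_order == f->order)
		return r;

	/* Anything else is unexpected without THP and would warn. */
	r.warned = true;
	return r;
}
```

In this model the return value is the same on both paths; only the warning
differs, which matches the description above of skipping the warning
rather than changing the result.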

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
 include/linux/huge_mm.h | 40 ++++++++--------------------------------
 mm/memory.c             | 41 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+), 32 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 21162493a0a0..71e309f2d26a 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -612,42 +612,18 @@ can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins)
 {
 	return false;
 }
-static inline int
-split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
-		unsigned int new_order)
-{
-	VM_WARN_ON_ONCE_PAGE(1, page);
-	return -EINVAL;
-}
-static inline int split_huge_page_to_order(struct page *page, unsigned int new_order)
-{
-	VM_WARN_ON_ONCE_PAGE(1, page);
-	return -EINVAL;
-}
+int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
+		unsigned int new_order);
+int split_huge_page_to_order(struct page *page, unsigned int new_order);
 static inline int split_huge_page(struct page *page)
 {
-	VM_WARN_ON_ONCE_PAGE(1, page);
-	return -EINVAL;
-}
-
-static inline unsigned int min_order_for_split(struct folio *folio)
-{
-	VM_WARN_ON_ONCE_FOLIO(1, folio);
-	return 0;
-}
-
-static inline int split_folio_to_list(struct folio *folio, struct list_head *list)
-{
-	VM_WARN_ON_ONCE_FOLIO(1, folio);
-	return -EINVAL;
+	return split_huge_page_to_list_to_order(page, NULL, 0);
 }
 
-static inline int try_folio_split_to_order(struct folio *folio,
-		struct page *page, unsigned int new_order)
-{
-	VM_WARN_ON_ONCE_FOLIO(1, folio);
-	return -EINVAL;
-}
+unsigned int min_order_for_split(struct folio *folio);
+int split_folio_to_list(struct folio *folio, struct list_head *list);
+int try_folio_split_to_order(struct folio *folio,
+		struct page *page, unsigned int new_order);
 
 static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
 static inline void reparent_deferred_split_queue(struct mem_cgroup *memcg) {}
diff --git a/mm/memory.c b/mm/memory.c
index 6675e87eb7dd..4eccdf72a46e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4020,6 +4020,47 @@ static bool __wp_can_reuse_large_anon_folio(struct folio *folio,
 {
 	BUILD_BUG();
 }
+
+int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
+				     unsigned int new_order)
+{
+	struct folio *folio = page_folio(page);
+	unsigned int order = mapping_min_folio_order(folio->mapping);
+
+	if (!folio_test_anon(folio) && order == folio_order(folio))
+		return -EINVAL;
+
+	VM_WARN_ON_ONCE_PAGE(1, page);
+	return -EINVAL;
+}
+
+int split_huge_page_to_order(struct page *page, unsigned int new_order)
+{
+	return split_huge_page_to_list_to_order(page, NULL, new_order);
+}
+
+int split_folio_to_list(struct folio *folio, struct list_head *list)
+{
+	unsigned int order = mapping_min_folio_order(folio->mapping);
+
+	if (!folio_test_anon(folio) && order == folio_order(folio))
+		return -EINVAL;
+
+	VM_WARN_ON_ONCE_FOLIO(1, folio);
+	return -EINVAL;
+}
+
+unsigned int min_order_for_split(struct folio *folio)
+{
+	return split_folio_to_list(folio, NULL);
+}
+
+
+int try_folio_split_to_order(struct folio *folio, struct page *page,
+			     unsigned int new_order)
+{
+	return split_folio_to_list(folio, NULL);
+}
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 static bool wp_can_reuse_anon_folio(struct folio *folio,
-- 
2.50.1




* [RFC v2 3/3] blkdev: remove CONFIG_TRANSPARENT_HUGEPAGES dependency for LBS devices
  2025-12-06  3:08 [RFC v2 0/3] Decoupling large folios dependency on THP Pankaj Raghav
  2025-12-06  3:08 ` [RFC v2 1/3] filemap: set max order to be min order if THP is disabled Pankaj Raghav
  2025-12-06  3:08 ` [RFC v2 2/3] huge_memory: skip warning if min order and folio order are same in split Pankaj Raghav
@ 2025-12-06  3:08 ` Pankaj Raghav
  2 siblings, 0 replies; 4+ messages in thread
From: Pankaj Raghav @ 2025-12-06  3:08 UTC (permalink / raw)
  To: Suren Baghdasaryan, Mike Rapoport, David Hildenbrand,
	Ryan Roberts, Michal Hocko, Lance Yang, Lorenzo Stoakes,
	Baolin Wang, Dev Jain, Barry Song, Andrew Morton, Nico Pache,
	Zi Yan, Vlastimil Babka, Liam R . Howlett, Jens Axboe
  Cc: linux-kernel, linux-mm, linux-block, linux-fsdevel, mcgrof,
	gost.dev, kernel, tytso, Pankaj Raghav

Now that the dependency between CONFIG_TRANSPARENT_HUGEPAGE and large
folios is removed, enable LBS devices even when the THP config is disabled.
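
To illustrate the effect of the cap, here is a userspace sketch of block
size validation. It assumes the check rejects sizes below 512 bytes, above
the cap, or not a power of two; model_validate_block_size() and
MODEL_SZ_64K are hypothetical names, and the cap is passed as a parameter
so that both the old !THP value (PAGE_SIZE, taken here as 4096) and the
new one (64 KiB) can be compared.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for SZ_64K. */
#define MODEL_SZ_64K (64ul * 1024ul)

/*
 * Model of validating a logical block size against a maximum.
 * Returns 0 if the size is acceptable, -EINVAL otherwise.
 */
int model_validate_block_size(unsigned long bsize, unsigned long max_bsize)
{
	/* The cap itself is expected to be a power of two. */
	assert(max_bsize != 0 && (max_bsize & (max_bsize - 1)) == 0);

	if (bsize < 512 || bsize > max_bsize || (bsize & (bsize - 1)) != 0)
		return -EINVAL;
	return 0;
}
```

With a 4096-byte cap, a 16k block size is rejected; with the 64 KiB cap it
is accepted regardless of the THP setting.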

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
 include/linux/blkdev.h | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 70b671a9a7f7..b6379d73f546 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -270,16 +270,11 @@ static inline dev_t disk_devt(struct gendisk *disk)
 	return MKDEV(disk->major, disk->first_minor);
 }
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 /*
  * We should strive for 1 << (PAGE_SHIFT + MAX_PAGECACHE_ORDER)
  * however we constrain this to what we can validate and test.
  */
 #define BLK_MAX_BLOCK_SIZE      SZ_64K
-#else
-#define BLK_MAX_BLOCK_SIZE      PAGE_SIZE
-#endif
-
 
 /* blk_validate_limits() validates bsize, so drivers don't usually need to */
 static inline int blk_validate_block_size(unsigned long bsize)
-- 
2.50.1


