* [PATCH v2 0/8] mm: hugetlb: allocate frozen gigantic folio
@ 2025-09-18 13:19 Kefeng Wang
2025-09-18 13:19 ` [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig() Kefeng Wang
` (7 more replies)
0 siblings, 8 replies; 23+ messages in thread
From: Kefeng Wang @ 2025-09-18 13:19 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm, Kefeng Wang
First, optimize pfn_range_valid_contig()/replace_free_hugepage_folios()
to speed up gigantic folio allocation; the time drops from 2.124s to
0.429s when allocating 200 1G HugeTLB folios.
Then introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound(),
which avoid the atomic operations on the page refcount, and convert hugetlb
to allocate frozen gigantic folios with the new helpers, which also cleans
up alloc_gigantic_folio().
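For a quick picture of the end state, the gigantic allocation path after
this series looks roughly like the sketch below (simplified from patch 8;
error handling is omitted and gfp_mask is assumed to already carry
__GFP_COMP, as it does for hugetlb callers):

  static struct folio *alloc_gigantic_folio(int order, gfp_t gfp_mask,
                                            int nid, nodemask_t *nodemask)
  {
          struct folio *folio;

          /* try the per-node hugetlb CMA areas first; returns a frozen folio */
          folio = hugetlb_cma_alloc_frozen_folio(order, gfp_mask, nid, nodemask);
          if (folio)
                  return folio;

          if (hugetlb_cma_exclusive_alloc())
                  return NULL;

          /* fall back to the generic frozen contiguous allocation */
          return (struct folio *)alloc_contig_frozen_pages(1 << order, gfp_mask,
                                                           nid, nodemask);
  }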
v2:
- Optimize gigantic folio allocation speed
- Use HPAGE_PUD_ORDER in debug_vm_pgtable
- Address some of David's comments:
- kill folio_alloc_gigantic()
- add generic cma_alloc_frozen{_compound}() instead of
cma_{alloc,free}_folio
Kefeng Wang (8):
mm: page_alloc: optimize pfn_range_valid_contig()
mm: hugetlb: optimize replace_free_hugepage_folios()
mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page()
mm: page_alloc: add split_non_compound_page()
mm: page_alloc: add alloc_contig_{range_frozen,frozen_pages}()
mm: cma: add __cma_release()
mm: cma: add cma_alloc_frozen{_compound}()
mm: hugetlb: allocate frozen pages in alloc_gigantic_folio()
include/linux/cma.h | 26 +----
include/linux/gfp.h | 52 ++++------
mm/cma.c | 104 +++++++++-----------
mm/debug_vm_pgtable.c | 38 ++++---
mm/hugetlb.c | 103 +++++++++----------
mm/hugetlb_cma.c | 27 ++---
mm/hugetlb_cma.h | 10 +-
mm/internal.h | 6 ++
mm/page_alloc.c | 224 +++++++++++++++++++++++++++++-------------
9 files changed, 318 insertions(+), 272 deletions(-)
--
2.27.0
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig()
2025-09-18 13:19 [PATCH v2 0/8] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
@ 2025-09-18 13:19 ` Kefeng Wang
2025-09-18 15:49 ` Zi Yan
` (3 more replies)
2025-09-18 13:19 ` [PATCH v2 2/8] mm: hugetlb: optimize replace_free_hugepage_folios() Kefeng Wang
` (6 subsequent siblings)
7 siblings, 4 replies; 23+ messages in thread
From: Kefeng Wang @ 2025-09-18 13:19 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm, Kefeng Wang
alloc_contig_pages() spends a lot of time in pfn_range_valid_contig().
Check whether the pages in the pfn range can be allocated before calling
alloc_contig_range(): if a page cannot be migrated, no further action is
required. Also skip unnecessary iterations for compound pages such as THP
and for non-compound high-order buddy pages, which saves a lot of time as
well. The check is racy, but the only danger is skipping too much.
A simple test on a machine with 116G free memory, allocating 120 1G
HugeTLB folios (107 successfully allocated):
time echo 120 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
Before: 0m2.124s
After: 0m0.602s
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/page_alloc.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 478beaf95f84..5b7d705e9710 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7012,6 +7012,7 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
{
unsigned long i, end_pfn = start_pfn + nr_pages;
struct page *page;
+ struct folio *folio;
for (i = start_pfn; i < end_pfn; i++) {
page = pfn_to_online_page(i);
@@ -7021,11 +7022,26 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
if (page_zone(page) != z)
return false;
- if (PageReserved(page))
+ folio = page_folio(page);
+ if (folio_test_reserved(folio))
return false;
- if (PageHuge(page))
+ if (folio_test_hugetlb(folio))
return false;
+
+ /* The following type of folios aren't migrated */
+ if (folio_test_pgtable(folio) | folio_test_stack(folio))
+ return false;
+
+ /*
+ * For compound pages such as THP and non-compound high
+ * order buddy pages, save potentially a lot of iterations
+ * if we can skip them at once.
+ */
+ if (PageCompound(page))
+ i += (1UL << compound_order(page)) - 1;
+ else if (PageBuddy(page))
+ i += (1UL << buddy_order(page)) - 1;
}
return true;
}
--
2.27.0
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 2/8] mm: hugetlb: optimize replace_free_hugepage_folios()
2025-09-18 13:19 [PATCH v2 0/8] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
2025-09-18 13:19 ` [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig() Kefeng Wang
@ 2025-09-18 13:19 ` Kefeng Wang
2025-09-30 9:57 ` David Hildenbrand
2025-09-18 13:19 ` [PATCH v2 3/8] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
` (5 subsequent siblings)
7 siblings, 1 reply; 23+ messages in thread
From: Kefeng Wang @ 2025-09-18 13:19 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm, Kefeng Wang
There is no need to replace free hugepage folios when there are no free
hugetlb folios. Since gigantic folios are not replaced, use
isolate_or_dissolve_huge_folio(), and also skip some pfn iterations for
compound pages such as THP and non-compound high-order buddy pages to
save time.
A simple test on a machine with 116G free memory, allocating 120 1G
HugeTLB folios (107 successfully allocated):
time echo 120 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
Before: 0m0.602s
After: 0m0.429s
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/hugetlb.c | 49 +++++++++++++++++++++++++++++++++++++------------
1 file changed, 37 insertions(+), 12 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1806685ea326..bc88b659a88b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2890,26 +2890,51 @@ int isolate_or_dissolve_huge_folio(struct folio *folio, struct list_head *list)
*/
int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
{
- struct folio *folio;
- int ret = 0;
+ unsigned long nr = 0;
+ struct page *page;
+ struct hstate *h;
+ LIST_HEAD(list);
+
+ /* Avoid pfn iterations if no free non-gigantic huge pages */
+ for_each_hstate(h) {
+ if (!hstate_is_gigantic(h))
+ nr += h->free_huge_pages;
+ }
- LIST_HEAD(isolate_list);
+ if (!nr)
+ return 0;
while (start_pfn < end_pfn) {
- folio = pfn_folio(start_pfn);
+ page = pfn_to_page(start_pfn);
+ nr = 1;
- /* Not to disrupt normal path by vainly holding hugetlb_lock */
- if (folio_test_hugetlb(folio) && !folio_ref_count(folio)) {
- ret = alloc_and_dissolve_hugetlb_folio(folio, &isolate_list);
- if (ret)
- break;
+ if (PageHuge(page) || PageCompound(page)) {
+ struct folio *folio = page_folio(page);
+
+ nr = 1UL << compound_order(page);
- putback_movable_pages(&isolate_list);
+ if (folio_test_hugetlb(folio) && !folio_ref_count(folio)) {
+ if (isolate_or_dissolve_huge_folio(folio, &list))
+ return -ENOMEM;
+
+ putback_movable_pages(&list);
+ }
+ } else if (PageBuddy(page)) {
+ /*
+ * Buddy order check without zone lock is unsafe and
+ * the order is maybe invalid, but race should be
+ * small, and the worst thing is skipping free hugetlb.
+ */
+ const unsigned int order = buddy_order_unsafe(page);
+
+ if (order <= MAX_PAGE_ORDER)
+ nr = 1UL << order;
}
- start_pfn++;
+
+ start_pfn += nr;
}
- return ret;
+ return 0;
}
void wait_for_freed_hugetlb_folios(void)
--
2.27.0
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 3/8] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page()
2025-09-18 13:19 [PATCH v2 0/8] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
2025-09-18 13:19 ` [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig() Kefeng Wang
2025-09-18 13:19 ` [PATCH v2 2/8] mm: hugetlb: optimize replace_free_hugepage_folios() Kefeng Wang
@ 2025-09-18 13:19 ` Kefeng Wang
2025-09-30 10:01 ` David Hildenbrand
2025-09-18 13:19 ` [PATCH v2 4/8] mm: page_alloc: add split_non_compound_page() Kefeng Wang
` (4 subsequent siblings)
7 siblings, 1 reply; 23+ messages in thread
From: Kefeng Wang @ 2025-09-18 13:19 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm, Kefeng Wang
Add a new helper to free the huge page, for consistency with
debug_vm_pgtable_alloc_huge_page(), and use HPAGE_PUD_ORDER instead of
open-coding it.
Also move free_contig_range() under CONFIG_CONTIG_ALLOC since all callers
are built with CONFIG_CONTIG_ALLOC.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
include/linux/gfp.h | 2 +-
mm/debug_vm_pgtable.c | 38 +++++++++++++++++---------------------
mm/page_alloc.c | 2 +-
3 files changed, 19 insertions(+), 23 deletions(-)
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 0ceb4e09306c..1fefb63e0480 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -437,8 +437,8 @@ extern struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_
int nid, nodemask_t *nodemask);
#define alloc_contig_pages(...) alloc_hooks(alloc_contig_pages_noprof(__VA_ARGS__))
-#endif
void free_contig_range(unsigned long pfn, unsigned long nr_pages);
+#endif
#ifdef CONFIG_CONTIG_ALLOC
static inline struct folio *folio_alloc_gigantic_noprof(int order, gfp_t gfp,
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 830107b6dd08..d7f82aa58711 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -946,22 +946,26 @@ static unsigned long __init get_random_vaddr(void)
return random_vaddr;
}
-static void __init destroy_args(struct pgtable_debug_args *args)
+static void __init
+debug_vm_pgtable_free_huge_page(struct pgtable_debug_args *args,
+ unsigned long pfn, int order)
{
- struct page *page = NULL;
+#ifdef CONFIG_CONTIG_ALLOC
+ if (args->is_contiguous_page) {
+ free_contig_range(pfn, 1 << order);
+ return;
+ }
+#endif
+ __free_pages(pfn_to_page(pfn), order);
+}
+static void __init destroy_args(struct pgtable_debug_args *args)
+{
/* Free (huge) page */
if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
has_transparent_pud_hugepage() &&
args->pud_pfn != ULONG_MAX) {
- if (args->is_contiguous_page) {
- free_contig_range(args->pud_pfn,
- (1 << (HPAGE_PUD_SHIFT - PAGE_SHIFT)));
- } else {
- page = pfn_to_page(args->pud_pfn);
- __free_pages(page, HPAGE_PUD_SHIFT - PAGE_SHIFT);
- }
-
+ debug_vm_pgtable_free_huge_page(args, args->pud_pfn, HPAGE_PUD_ORDER);
args->pud_pfn = ULONG_MAX;
args->pmd_pfn = ULONG_MAX;
args->pte_pfn = ULONG_MAX;
@@ -970,20 +974,13 @@ static void __init destroy_args(struct pgtable_debug_args *args)
if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
has_transparent_hugepage() &&
args->pmd_pfn != ULONG_MAX) {
- if (args->is_contiguous_page) {
- free_contig_range(args->pmd_pfn, (1 << HPAGE_PMD_ORDER));
- } else {
- page = pfn_to_page(args->pmd_pfn);
- __free_pages(page, HPAGE_PMD_ORDER);
- }
-
+ debug_vm_pgtable_free_huge_page(args, args->pmd_pfn, HPAGE_PMD_ORDER);
args->pmd_pfn = ULONG_MAX;
args->pte_pfn = ULONG_MAX;
}
if (args->pte_pfn != ULONG_MAX) {
- page = pfn_to_page(args->pte_pfn);
- __free_page(page);
+ __free_page(pfn_to_page(args->pte_pfn));
args->pte_pfn = ULONG_MAX;
}
@@ -1215,8 +1212,7 @@ static int __init init_args(struct pgtable_debug_args *args)
*/
if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
has_transparent_pud_hugepage()) {
- page = debug_vm_pgtable_alloc_huge_page(args,
- HPAGE_PUD_SHIFT - PAGE_SHIFT);
+ page = debug_vm_pgtable_alloc_huge_page(args, HPAGE_PUD_ORDER);
if (page) {
args->pud_pfn = page_to_pfn(page);
args->pmd_pfn = args->pud_pfn;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5b7d705e9710..b6eeae39f4d0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7113,7 +7113,6 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
}
return NULL;
}
-#endif /* CONFIG_CONTIG_ALLOC */
void free_contig_range(unsigned long pfn, unsigned long nr_pages)
{
@@ -7140,6 +7139,7 @@ void free_contig_range(unsigned long pfn, unsigned long nr_pages)
WARN(count != 0, "%lu pages are still in use!\n", count);
}
EXPORT_SYMBOL(free_contig_range);
+#endif /* CONFIG_CONTIG_ALLOC */
/*
* Effectively disable pcplists for the zone by setting the high limit to 0
--
2.27.0
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 4/8] mm: page_alloc: add split_non_compound_page()
2025-09-18 13:19 [PATCH v2 0/8] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
` (2 preceding siblings ...)
2025-09-18 13:19 ` [PATCH v2 3/8] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
@ 2025-09-18 13:19 ` Kefeng Wang
2025-09-30 10:06 ` David Hildenbrand
2025-09-18 13:19 ` [PATCH v2 5/8] mm: page_alloc: add alloc_contig_{range_frozen,frozen_pages}() Kefeng Wang
` (3 subsequent siblings)
7 siblings, 1 reply; 23+ messages in thread
From: Kefeng Wang @ 2025-09-18 13:19 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm, Kefeng Wang
Add a new split_non_compound_page() helper to simplify make_alloc_exact().
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/page_alloc.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b6eeae39f4d0..e1d229b75f27 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3042,6 +3042,15 @@ void free_unref_folios(struct folio_batch *folios)
folio_batch_reinit(folios);
}
+static void split_non_compound_page(struct page *page, unsigned int order)
+{
+ VM_BUG_ON_PAGE(PageCompound(page), page);
+
+ split_page_owner(page, order, 0);
+ pgalloc_tag_split(page_folio(page), order, 0);
+ split_page_memcg(page, order);
+}
+
/*
* split_page takes a non-compound higher-order page, and splits it into
* n (1<<order) sub-pages: page[0..n]
@@ -3054,14 +3063,12 @@ void split_page(struct page *page, unsigned int order)
{
int i;
- VM_BUG_ON_PAGE(PageCompound(page), page);
VM_BUG_ON_PAGE(!page_count(page), page);
for (i = 1; i < (1 << order); i++)
set_page_refcounted(page + i);
- split_page_owner(page, order, 0);
- pgalloc_tag_split(page_folio(page), order, 0);
- split_page_memcg(page, order);
+
+ split_non_compound_page(page, order);
}
EXPORT_SYMBOL_GPL(split_page);
@@ -5315,9 +5322,7 @@ static void *make_alloc_exact(unsigned long addr, unsigned int order,
struct page *page = virt_to_page((void *)addr);
struct page *last = page + nr;
- split_page_owner(page, order, 0);
- pgalloc_tag_split(page_folio(page), order, 0);
- split_page_memcg(page, order);
+ split_non_compound_page(page, order);
while (page < --last)
set_page_refcounted(last);
--
2.27.0
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 5/8] mm: page_alloc: add alloc_contig_{range_frozen,frozen_pages}()
2025-09-18 13:19 [PATCH v2 0/8] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
` (3 preceding siblings ...)
2025-09-18 13:19 ` [PATCH v2 4/8] mm: page_alloc: add split_non_compound_page() Kefeng Wang
@ 2025-09-18 13:19 ` Kefeng Wang
2025-09-18 13:19 ` [PATCH v2 6/8] mm: cma: add __cma_release() Kefeng Wang
` (2 subsequent siblings)
7 siblings, 0 replies; 23+ messages in thread
From: Kefeng Wang @ 2025-09-18 13:19 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm, Kefeng Wang
In order to allocate a given range of pages, or compound pages, without
incrementing their refcount, add two new helpers,
alloc_contig_{range_frozen,frozen_pages}(), which may be beneficial to
some users (e.g. hugetlb). free_contig_range_frozen() is also provided to
match alloc_contig_range_frozen(), but frozen compound pages are better
freed with free_frozen_pages().
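A minimal usage sketch for a hypothetical caller (names as introduced in
this patch; the __GFP_COMP case requires a power-of-two page count):

          struct page *page;

          /* all returned pages have a frozen (zero) refcount */
          page = alloc_contig_frozen_pages(1 << order, GFP_KERNEL | __GFP_COMP,
                                           nid, nodemask);
          if (!page)
                  return -ENOMEM;

          /* ... use the frozen compound page ... */

          /* frozen compound pages are freed with free_frozen_pages() */
          free_frozen_pages(page, order);

          /*
           * A non-compound range would instead be freed with
           * free_contig_range_frozen(page_to_pfn(page), nr_pages).
           */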
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
include/linux/gfp.h | 29 +++++--
mm/page_alloc.c | 183 +++++++++++++++++++++++++++++---------------
2 files changed, 143 insertions(+), 69 deletions(-)
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 1fefb63e0480..fbbdd8c88483 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -429,14 +429,27 @@ typedef unsigned int __bitwise acr_flags_t;
#define ACR_FLAGS_CMA ((__force acr_flags_t)BIT(0)) // allocate for CMA
/* The below functions must be run on a range from a single zone. */
-extern int alloc_contig_range_noprof(unsigned long start, unsigned long end,
- acr_flags_t alloc_flags, gfp_t gfp_mask);
-#define alloc_contig_range(...) alloc_hooks(alloc_contig_range_noprof(__VA_ARGS__))
-
-extern struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
- int nid, nodemask_t *nodemask);
-#define alloc_contig_pages(...) alloc_hooks(alloc_contig_pages_noprof(__VA_ARGS__))
-
+int alloc_contig_range_frozen_noprof(unsigned long start, unsigned long end,
+ acr_flags_t alloc_flags, gfp_t gfp_mask);
+#define alloc_contig_range_frozen(...) \
+ alloc_hooks(alloc_contig_range_frozen_noprof(__VA_ARGS__))
+
+int alloc_contig_range_noprof(unsigned long start, unsigned long end,
+ acr_flags_t alloc_flags, gfp_t gfp_mask);
+#define alloc_contig_range(...) \
+ alloc_hooks(alloc_contig_range_noprof(__VA_ARGS__))
+
+struct page *alloc_contig_frozen_pages_noprof(unsigned long nr_pages,
+ gfp_t gfp_mask, int nid, nodemask_t *nodemask);
+#define alloc_contig_frozen_pages(...) \
+ alloc_hooks(alloc_contig_frozen_pages_noprof(__VA_ARGS__))
+
+struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
+ int nid, nodemask_t *nodemask);
+#define alloc_contig_pages(...) \
+ alloc_hooks(alloc_contig_pages_noprof(__VA_ARGS__))
+
+void free_contig_range_frozen(unsigned long pfn, unsigned long nr_pages);
void free_contig_range(unsigned long pfn, unsigned long nr_pages);
#endif
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e1d229b75f27..05db9b5d584f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6782,7 +6782,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
return (ret < 0) ? ret : 0;
}
-static void split_free_pages(struct list_head *list, gfp_t gfp_mask)
+static void split_free_frozen_pages(struct list_head *list, gfp_t gfp_mask)
{
int order;
@@ -6794,11 +6794,10 @@ static void split_free_pages(struct list_head *list, gfp_t gfp_mask)
int i;
post_alloc_hook(page, order, gfp_mask);
- set_page_refcounted(page);
if (!order)
continue;
- split_page(page, order);
+ split_non_compound_page(page, order);
/* Add all subpages to the order-0 head, in sequence. */
list_del(&page->lru);
@@ -6842,28 +6841,8 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
return 0;
}
-/**
- * alloc_contig_range() -- tries to allocate given range of pages
- * @start: start PFN to allocate
- * @end: one-past-the-last PFN to allocate
- * @alloc_flags: allocation information
- * @gfp_mask: GFP mask. Node/zone/placement hints are ignored; only some
- * action and reclaim modifiers are supported. Reclaim modifiers
- * control allocation behavior during compaction/migration/reclaim.
- *
- * The PFN range does not have to be pageblock aligned. The PFN range must
- * belong to a single zone.
- *
- * The first thing this routine does is attempt to MIGRATE_ISOLATE all
- * pageblocks in the range. Once isolated, the pageblocks should not
- * be modified by others.
- *
- * Return: zero on success or negative error code. On success all
- * pages which PFN is in [start, end) are allocated for the caller and
- * need to be freed with free_contig_range().
- */
-int alloc_contig_range_noprof(unsigned long start, unsigned long end,
- acr_flags_t alloc_flags, gfp_t gfp_mask)
+int alloc_contig_range_frozen_noprof(unsigned long start, unsigned long end,
+ acr_flags_t alloc_flags, gfp_t gfp_mask)
{
const unsigned int order = ilog2(end - start);
unsigned long outer_start, outer_end;
@@ -6979,19 +6958,18 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
}
if (!(gfp_mask & __GFP_COMP)) {
- split_free_pages(cc.freepages, gfp_mask);
+ split_free_frozen_pages(cc.freepages, gfp_mask);
/* Free head and tail (if any) */
if (start != outer_start)
- free_contig_range(outer_start, start - outer_start);
+ free_contig_range_frozen(outer_start, start - outer_start);
if (end != outer_end)
- free_contig_range(end, outer_end - end);
+ free_contig_range_frozen(end, outer_end - end);
} else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
struct page *head = pfn_to_page(start);
check_new_pages(head, order);
prep_new_page(head, order, gfp_mask, 0);
- set_page_refcounted(head);
} else {
ret = -EINVAL;
WARN(true, "PFN range: requested [%lu, %lu), allocated [%lu, %lu)\n",
@@ -7001,16 +6979,48 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
undo_isolate_page_range(start, end);
return ret;
}
-EXPORT_SYMBOL(alloc_contig_range_noprof);
-static int __alloc_contig_pages(unsigned long start_pfn,
- unsigned long nr_pages, gfp_t gfp_mask)
+/**
+ * alloc_contig_range() -- tries to allocate given range of pages
+ * @start: start PFN to allocate
+ * @end: one-past-the-last PFN to allocate
+ * @alloc_flags: allocation information
+ * @gfp_mask: GFP mask. Node/zone/placement hints are ignored; only some
+ * action and reclaim modifiers are supported. Reclaim modifiers
+ * control allocation behavior during compaction/migration/reclaim.
+ *
+ * The PFN range does not have to be pageblock aligned. The PFN range must
+ * belong to a single zone.
+ *
+ * The first thing this routine does is attempt to MIGRATE_ISOLATE all
+ * pageblocks in the range. Once isolated, the pageblocks should not
+ * be modified by others.
+ *
+ * Return: zero on success or negative error code. On success all
+ * pages which PFN is in [start, end) are allocated for the caller and
+ * need to be freed with free_contig_range().
+ */
+int alloc_contig_range_noprof(unsigned long start, unsigned long end,
+ acr_flags_t alloc_flags, gfp_t gfp_mask)
{
- unsigned long end_pfn = start_pfn + nr_pages;
+ int ret;
+
+ ret = alloc_contig_range_frozen_noprof(start, end, alloc_flags, gfp_mask);
+ if (ret)
+ return ret;
+
+ if (gfp_mask & __GFP_COMP) {
+ set_page_refcounted(pfn_to_page(start));
+ } else {
+ unsigned long pfn;
+
+ for (pfn = start; pfn < end; pfn++)
+ set_page_refcounted(pfn_to_page(pfn));
+ }
- return alloc_contig_range_noprof(start_pfn, end_pfn, ACR_FLAGS_NONE,
- gfp_mask);
+ return 0;
}
+EXPORT_SYMBOL(alloc_contig_range_noprof);
static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
unsigned long nr_pages)
@@ -7059,31 +7069,8 @@ static bool zone_spans_last_pfn(const struct zone *zone,
return zone_spans_pfn(zone, last_pfn);
}
-/**
- * alloc_contig_pages() -- tries to find and allocate contiguous range of pages
- * @nr_pages: Number of contiguous pages to allocate
- * @gfp_mask: GFP mask. Node/zone/placement hints limit the search; only some
- * action and reclaim modifiers are supported. Reclaim modifiers
- * control allocation behavior during compaction/migration/reclaim.
- * @nid: Target node
- * @nodemask: Mask for other possible nodes
- *
- * This routine is a wrapper around alloc_contig_range(). It scans over zones
- * on an applicable zonelist to find a contiguous pfn range which can then be
- * tried for allocation with alloc_contig_range(). This routine is intended
- * for allocation requests which can not be fulfilled with the buddy allocator.
- *
- * The allocated memory is always aligned to a page boundary. If nr_pages is a
- * power of two, then allocated range is also guaranteed to be aligned to same
- * nr_pages (e.g. 1GB request would be aligned to 1GB).
- *
- * Allocated pages can be freed with free_contig_range() or by manually calling
- * __free_page() on each allocated page.
- *
- * Return: pointer to contiguous pages on success, or NULL if not successful.
- */
-struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
- int nid, nodemask_t *nodemask)
+struct page *alloc_contig_frozen_pages_noprof(unsigned long nr_pages,
+ gfp_t gfp_mask, int nid, nodemask_t *nodemask)
{
unsigned long ret, pfn, flags;
struct zonelist *zonelist;
@@ -7106,7 +7093,9 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
* and cause alloc_contig_range() to fail...
*/
spin_unlock_irqrestore(&zone->lock, flags);
- ret = __alloc_contig_pages(pfn, nr_pages,
+ ret = alloc_contig_range_frozen_noprof(pfn,
+ pfn + nr_pages,
+ ACR_FLAGS_NONE,
gfp_mask);
if (!ret)
return pfn_to_page(pfn);
@@ -7118,6 +7107,78 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
}
return NULL;
}
+EXPORT_SYMBOL(alloc_contig_range_frozen_noprof);
+
+void free_contig_range_frozen(unsigned long pfn, unsigned long nr_pages)
+{
+ struct folio *folio = pfn_folio(pfn);
+
+ if (folio_test_large(folio)) {
+ int expected = folio_nr_pages(folio);
+
+ WARN_ON(folio_ref_count(folio));
+
+ if (nr_pages == expected)
+ free_frozen_pages(&folio->page, folio_order(folio));
+ else
+ WARN(true, "PFN %lu: nr_pages %lu != expected %d\n",
+ pfn, nr_pages, expected);
+ return;
+ }
+
+ for (; nr_pages--; pfn++) {
+ struct page *page = pfn_to_page(pfn);
+
+ WARN_ON(page_ref_count(page));
+ free_frozen_pages(page, 0);
+ }
+}
+EXPORT_SYMBOL(free_contig_range_frozen);
+
+/**
+ * alloc_contig_pages() -- tries to find and allocate contiguous range of pages
+ * @nr_pages: Number of contiguous pages to allocate
+ * @gfp_mask: GFP mask. Node/zone/placement hints limit the search; only some
+ * action and reclaim modifiers are supported. Reclaim modifiers
+ * control allocation behavior during compaction/migration/reclaim.
+ * @nid: Target node
+ * @nodemask: Mask for other possible nodes
+ *
+ * This routine is a wrapper around alloc_contig_range(). It scans over zones
+ * on an applicable zonelist to find a contiguous pfn range which can then be
+ * tried for allocation with alloc_contig_range(). This routine is intended
+ * for allocation requests which can not be fulfilled with the buddy allocator.
+ *
+ * The allocated memory is always aligned to a page boundary. If nr_pages is a
+ * power of two, then allocated range is also guaranteed to be aligned to same
+ * nr_pages (e.g. 1GB request would be aligned to 1GB).
+ *
+ * Allocated pages can be freed with free_contig_range() or by manually calling
+ * __free_page() on each allocated page.
+ *
+ * Return: pointer to contiguous pages on success, or NULL if not successful.
+ */
+struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
+ int nid, nodemask_t *nodemask)
+{
+ struct page *page;
+
+ page = alloc_contig_frozen_pages_noprof(nr_pages, gfp_mask, nid,
+ nodemask);
+ if (!page)
+ return NULL;
+
+ if (gfp_mask & __GFP_COMP) {
+ set_page_refcounted(page);
+ } else {
+ unsigned long pfn = page_to_pfn(page);
+
+ for (; nr_pages--; pfn++)
+ set_page_refcounted(pfn_to_page(pfn));
+ }
+
+ return page;
+}
void free_contig_range(unsigned long pfn, unsigned long nr_pages)
{
--
2.27.0
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 6/8] mm: cma: add __cma_release()
2025-09-18 13:19 [PATCH v2 0/8] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
` (4 preceding siblings ...)
2025-09-18 13:19 ` [PATCH v2 5/8] mm: page_alloc: add alloc_contig_{range_frozen,frozen_pages}() Kefeng Wang
@ 2025-09-18 13:19 ` Kefeng Wang
2025-09-30 10:15 ` David Hildenbrand
2025-09-18 13:19 ` [PATCH v2 7/8] mm: cma: add cma_alloc_frozen{_compound}() Kefeng Wang
2025-09-18 13:20 ` [PATCH v2 8/8] mm: hugetlb: allocate frozen pages in alloc_gigantic_folio() Kefeng Wang
7 siblings, 1 reply; 23+ messages in thread
From: Kefeng Wang @ 2025-09-18 13:19 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm, Kefeng Wang
Kill cma_pages_valid(), which is only used in cma_release(), and clean up
the code duplication between the cma pages validity check and the cma
memrange lookup. Add a __cma_release() helper to prepare for the upcoming
frozen page release.
Reviewed-by: Jane Chu <jane.chu@oracle.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
include/linux/cma.h | 1 -
mm/cma.c | 57 ++++++++++++---------------------------------
2 files changed, 15 insertions(+), 43 deletions(-)
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 62d9c1cf6326..e5745d2aec55 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -49,7 +49,6 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
struct cma **res_cma);
extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
bool no_warn);
-extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count);
extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
diff --git a/mm/cma.c b/mm/cma.c
index 813e6dc7b095..2af8c5bc58dd 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -942,34 +942,36 @@ struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
return page ? page_folio(page) : NULL;
}
-bool cma_pages_valid(struct cma *cma, const struct page *pages,
- unsigned long count)
+static bool __cma_release(struct cma *cma, const struct page *pages,
+ unsigned long count)
{
unsigned long pfn, end;
int r;
struct cma_memrange *cmr;
- bool ret;
+
+ pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count);
if (!cma || !pages || count > cma->count)
return false;
pfn = page_to_pfn(pages);
- ret = false;
for (r = 0; r < cma->nranges; r++) {
cmr = &cma->ranges[r];
end = cmr->base_pfn + cmr->count;
- if (pfn >= cmr->base_pfn && pfn < end) {
- ret = pfn + count <= end;
+ if (pfn >= cmr->base_pfn && pfn < end && pfn + count <= end)
break;
- }
}
- if (!ret)
- pr_debug("%s(page %p, count %lu)\n",
- __func__, (void *)pages, count);
+ if (r == cma->nranges)
+ return false;
- return ret;
+ free_contig_range(pfn, count);
+ cma_clear_bitmap(cma, cmr, pfn, count);
+ cma_sysfs_account_release_pages(cma, count);
+ trace_cma_release(cma->name, pfn, pages, count);
+
+ return true;
}
/**
@@ -985,36 +987,7 @@ bool cma_pages_valid(struct cma *cma, const struct page *pages,
bool cma_release(struct cma *cma, const struct page *pages,
unsigned long count)
{
- struct cma_memrange *cmr;
- unsigned long pfn, end_pfn;
- int r;
-
- pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count);
-
- if (!cma_pages_valid(cma, pages, count))
- return false;
-
- pfn = page_to_pfn(pages);
- end_pfn = pfn + count;
-
- for (r = 0; r < cma->nranges; r++) {
- cmr = &cma->ranges[r];
- if (pfn >= cmr->base_pfn &&
- pfn < (cmr->base_pfn + cmr->count)) {
- VM_BUG_ON(end_pfn > cmr->base_pfn + cmr->count);
- break;
- }
- }
-
- if (r == cma->nranges)
- return false;
-
- free_contig_range(pfn, count);
- cma_clear_bitmap(cma, cmr, pfn, count);
- cma_sysfs_account_release_pages(cma, count);
- trace_cma_release(cma->name, pfn, pages, count);
-
- return true;
+ return __cma_release(cma, pages, count);
}
bool cma_free_folio(struct cma *cma, const struct folio *folio)
@@ -1022,7 +995,7 @@ bool cma_free_folio(struct cma *cma, const struct folio *folio)
if (WARN_ON(!folio_test_large(folio)))
return false;
- return cma_release(cma, &folio->page, folio_nr_pages(folio));
+ return __cma_release(cma, &folio->page, folio_nr_pages(folio));
}
int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
--
2.27.0
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 7/8] mm: cma: add cma_alloc_frozen{_compound}()
2025-09-18 13:19 [PATCH v2 0/8] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
` (5 preceding siblings ...)
2025-09-18 13:19 ` [PATCH v2 6/8] mm: cma: add __cma_release() Kefeng Wang
@ 2025-09-18 13:19 ` Kefeng Wang
2025-09-18 13:20 ` [PATCH v2 8/8] mm: hugetlb: allocate frozen pages in alloc_gigantic_folio() Kefeng Wang
7 siblings, 0 replies; 23+ messages in thread
From: Kefeng Wang @ 2025-09-18 13:19 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm, Kefeng Wang
Introduce the cma_alloc_frozen{_compound}() helpers to allocate pages
without incrementing their refcount, and convert hugetlb cma to use
cma_alloc_frozen_compound(). Also move cma_validate_zones() into
mm/internal.h since it has no user outside mm.
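A minimal sketch of the intended calling pattern for a hypothetical
in-mm user (hugetlb cma is converted by this patch and the next one):

          struct page *page;

          /* returns a frozen compound page of the requested order, or NULL */
          page = cma_alloc_frozen_compound(cma, order);
          if (!page)
                  return NULL;

          /*
           * Use the page while frozen, or set_page_refcounted() it if a
           * refcounted page is needed (as the hugetlb conversion below does).
           */

          /* a frozen page is released with cma_release_frozen() */
          cma_release_frozen(cma, page, 1UL << order);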
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
include/linux/cma.h | 25 +++++----------------
mm/cma.c | 55 +++++++++++++++++++++++++++++----------------
mm/hugetlb_cma.c | 22 ++++++++++--------
mm/internal.h | 6 +++++
4 files changed, 60 insertions(+), 48 deletions(-)
diff --git a/include/linux/cma.h b/include/linux/cma.h
index e5745d2aec55..4981c151ef84 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -51,29 +51,14 @@ extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int
bool no_warn);
extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
+struct page *cma_alloc_frozen(struct cma *cma, unsigned long count,
+ unsigned int align, bool no_warn);
+bool cma_release_frozen(struct cma *cma, const struct page *pages,
+ unsigned long count);
+
extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
extern bool cma_intersects(struct cma *cma, unsigned long start, unsigned long end);
extern void cma_reserve_pages_on_error(struct cma *cma);
-#ifdef CONFIG_CMA
-struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
-bool cma_free_folio(struct cma *cma, const struct folio *folio);
-bool cma_validate_zones(struct cma *cma);
-#else
-static inline struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
-{
- return NULL;
-}
-
-static inline bool cma_free_folio(struct cma *cma, const struct folio *folio)
-{
- return false;
-}
-static inline bool cma_validate_zones(struct cma *cma)
-{
- return false;
-}
-#endif
-
#endif
diff --git a/mm/cma.c b/mm/cma.c
index 2af8c5bc58dd..aa237eab49bf 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -836,7 +836,7 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
spin_unlock_irq(&cma->lock);
mutex_lock(&cma->alloc_mutex);
- ret = alloc_contig_range(pfn, pfn + count, ACR_FLAGS_CMA, gfp);
+ ret = alloc_contig_range_frozen(pfn, pfn + count, ACR_FLAGS_CMA, gfp);
mutex_unlock(&cma->alloc_mutex);
if (!ret)
break;
@@ -856,8 +856,8 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
return ret;
}
-static struct page *__cma_alloc(struct cma *cma, unsigned long count,
- unsigned int align, gfp_t gfp)
+static struct page *__cma_alloc_frozen(struct cma *cma,
+ unsigned long count, unsigned int align, gfp_t gfp)
{
struct page *page = NULL;
int ret = -ENOMEM, r;
@@ -914,6 +914,21 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
return page;
}
+struct page *cma_alloc_frozen(struct cma *cma, unsigned long count,
+ unsigned int align, bool no_warn)
+{
+ gfp_t gfp = GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0);
+
+ return __cma_alloc_frozen(cma, count, align, gfp);
+}
+
+struct page *cma_alloc_frozen_compound(struct cma *cma, unsigned int order)
+{
+ gfp_t gfp = GFP_KERNEL | __GFP_COMP | __GFP_NOWARN;
+
+ return __cma_alloc_frozen(cma, 1 << order, order, gfp);
+}
+
/**
* cma_alloc() - allocate pages from contiguous area
* @cma: Contiguous memory region for which the allocation is performed.
@@ -927,23 +942,23 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
struct page *cma_alloc(struct cma *cma, unsigned long count,
unsigned int align, bool no_warn)
{
- return __cma_alloc(cma, count, align, GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
-}
-
-struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
-{
+ unsigned long pfn;
struct page *page;
- if (WARN_ON(!order || !(gfp & __GFP_COMP)))
+ page = cma_alloc_frozen(cma, count, align, no_warn);
+ if (!page)
return NULL;
- page = __cma_alloc(cma, 1 << order, order, gfp);
+ pfn = page_to_pfn(page);
- return page ? page_folio(page) : NULL;
+ for (; count--; pfn++)
+ set_page_refcounted(pfn_to_page(pfn));
+
+ return page;
}
static bool __cma_release(struct cma *cma, const struct page *pages,
- unsigned long count)
+ unsigned long count, bool frozen)
{
unsigned long pfn, end;
int r;
@@ -966,7 +981,11 @@ static bool __cma_release(struct cma *cma, const struct page *pages,
if (r == cma->nranges)
return false;
- free_contig_range(pfn, count);
+ if (frozen)
+ free_contig_range_frozen(pfn, count);
+ else
+ free_contig_range(pfn, count);
+
cma_clear_bitmap(cma, cmr, pfn, count);
cma_sysfs_account_release_pages(cma, count);
trace_cma_release(cma->name, pfn, pages, count);
@@ -987,15 +1006,13 @@ static bool __cma_release(struct cma *cma, const struct page *pages,
bool cma_release(struct cma *cma, const struct page *pages,
unsigned long count)
{
- return __cma_release(cma, pages, count);
+ return __cma_release(cma, pages, count, false);
}
-bool cma_free_folio(struct cma *cma, const struct folio *folio)
+bool cma_release_frozen(struct cma *cma, const struct page *pages,
+ unsigned long count)
{
- if (WARN_ON(!folio_test_large(folio)))
- return false;
-
- return __cma_release(cma, &folio->page, folio_nr_pages(folio));
+ return __cma_release(cma, pages, count, true);
}
int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
diff --git a/mm/hugetlb_cma.c b/mm/hugetlb_cma.c
index e8e4dc7182d5..fc41f3b949f8 100644
--- a/mm/hugetlb_cma.c
+++ b/mm/hugetlb_cma.c
@@ -22,33 +22,37 @@ void hugetlb_cma_free_folio(struct folio *folio)
{
int nid = folio_nid(folio);
- WARN_ON_ONCE(!cma_free_folio(hugetlb_cma[nid], folio));
+ WARN_ON_ONCE(!cma_release(hugetlb_cma[nid], &folio->page,
+ folio_nr_pages(folio)));
}
-
struct folio *hugetlb_cma_alloc_folio(int order, gfp_t gfp_mask,
int nid, nodemask_t *nodemask)
{
int node;
- struct folio *folio = NULL;
+ struct folio *folio;
+ struct page *page = NULL;
if (hugetlb_cma[nid])
- folio = cma_alloc_folio(hugetlb_cma[nid], order, gfp_mask);
+ page = cma_alloc_frozen_compound(hugetlb_cma[nid], order);
- if (!folio && !(gfp_mask & __GFP_THISNODE)) {
+ if (!page && !(gfp_mask & __GFP_THISNODE)) {
for_each_node_mask(node, *nodemask) {
if (node == nid || !hugetlb_cma[node])
continue;
- folio = cma_alloc_folio(hugetlb_cma[node], order, gfp_mask);
- if (folio)
+ page = cma_alloc_frozen_compound(hugetlb_cma[node], order);
+ if (page)
break;
}
}
- if (folio)
- folio_set_hugetlb_cma(folio);
+ if (!page)
+ return NULL;
+ set_page_refcounted(page);
+ folio = page_folio(page);
+ folio_set_hugetlb_cma(folio);
return folio;
}
diff --git a/mm/internal.h b/mm/internal.h
index 1561fc2ff5b8..ffcfde60059e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -936,9 +936,15 @@ void init_cma_reserved_pageblock(struct page *page);
struct cma;
#ifdef CONFIG_CMA
+struct page *cma_alloc_frozen_compound(struct cma *cma, unsigned int order);
+bool cma_validate_zones(struct cma *cma);
void *cma_reserve_early(struct cma *cma, unsigned long size);
void init_cma_pageblock(struct page *page);
#else
+static inline bool cma_validate_zones(struct cma *cma)
+{
+ return false;
+}
static inline void *cma_reserve_early(struct cma *cma, unsigned long size)
{
return NULL;
--
2.27.0
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 8/8] mm: hugetlb: allocate frozen pages in alloc_gigantic_folio()
2025-09-18 13:19 [PATCH v2 0/8] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
` (6 preceding siblings ...)
2025-09-18 13:19 ` [PATCH v2 7/8] mm: cma: add cma_alloc_frozen{_compound}() Kefeng Wang
@ 2025-09-18 13:20 ` Kefeng Wang
7 siblings, 0 replies; 23+ messages in thread
From: Kefeng Wang @ 2025-09-18 13:20 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm, Kefeng Wang
alloc_gigantic_folio() allocates a folio via alloc_contig_range() with
its refcount incremented and then freezes it. Convert it to allocate a
frozen folio directly, which removes the atomic operations on the folio
refcount during allocation and also saves an atomic operation in
__update_and_free_hugetlb_folio(). Also rename
hugetlb_cma_{alloc,free}_folio() with a "frozen" suffix to make them more
self-explanatory.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
include/linux/gfp.h | 23 -------------------
mm/hugetlb.c | 54 ++++++++++-----------------------------------
mm/hugetlb_cma.c | 11 +++++----
mm/hugetlb_cma.h | 10 ++++-----
4 files changed, 22 insertions(+), 76 deletions(-)
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index fbbdd8c88483..82aba162f352 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -453,27 +453,4 @@ void free_contig_range_frozen(unsigned long pfn, unsigned long nr_pages);
void free_contig_range(unsigned long pfn, unsigned long nr_pages);
#endif
-#ifdef CONFIG_CONTIG_ALLOC
-static inline struct folio *folio_alloc_gigantic_noprof(int order, gfp_t gfp,
- int nid, nodemask_t *node)
-{
- struct page *page;
-
- if (WARN_ON(!order || !(gfp & __GFP_COMP)))
- return NULL;
-
- page = alloc_contig_pages_noprof(1 << order, gfp, nid, node);
-
- return page ? page_folio(page) : NULL;
-}
-#else
-static inline struct folio *folio_alloc_gigantic_noprof(int order, gfp_t gfp,
- int nid, nodemask_t *node)
-{
- return NULL;
-}
-#endif
-/* This should be paired with folio_put() rather than free_contig_range(). */
-#define folio_alloc_gigantic(...) alloc_hooks(folio_alloc_gigantic_noprof(__VA_ARGS__))
-
#endif /* __LINUX_GFP_H */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bc88b659a88b..ce5f94c15268 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -125,16 +125,6 @@ static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
unsigned long start, unsigned long end, bool take_locks);
static struct resv_map *vma_resv_map(struct vm_area_struct *vma);
-static void hugetlb_free_folio(struct folio *folio)
-{
- if (folio_test_hugetlb_cma(folio)) {
- hugetlb_cma_free_folio(folio);
- return;
- }
-
- folio_put(folio);
-}
-
static inline bool subpool_is_free(struct hugepage_subpool *spool)
{
if (spool->count)
@@ -1472,44 +1462,22 @@ static int hstate_next_node_to_free(struct hstate *h, nodemask_t *nodes_allowed)
nr_nodes--)
#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-#ifdef CONFIG_CONTIG_ALLOC
static struct folio *alloc_gigantic_folio(int order, gfp_t gfp_mask,
int nid, nodemask_t *nodemask)
{
struct folio *folio;
- bool retried = false;
-retry:
- folio = hugetlb_cma_alloc_folio(order, gfp_mask, nid, nodemask);
- if (!folio) {
- if (hugetlb_cma_exclusive_alloc())
- return NULL;
-
- folio = folio_alloc_gigantic(order, gfp_mask, nid, nodemask);
- if (!folio)
- return NULL;
- }
-
- if (folio_ref_freeze(folio, 1))
+ folio = hugetlb_cma_alloc_frozen_folio(order, gfp_mask, nid, nodemask);
+ if (folio)
return folio;
- pr_warn("HugeTLB: unexpected refcount on PFN %lu\n", folio_pfn(folio));
- hugetlb_free_folio(folio);
- if (!retried) {
- retried = true;
- goto retry;
- }
- return NULL;
-}
+ if (hugetlb_cma_exclusive_alloc())
+ return NULL;
-#else /* !CONFIG_CONTIG_ALLOC */
-static struct folio *alloc_gigantic_folio(int order, gfp_t gfp_mask, int nid,
- nodemask_t *nodemask)
-{
- return NULL;
+ folio = (struct folio *)alloc_contig_frozen_pages(1 << order, gfp_mask,
+ nid, nodemask);
+ return folio;
}
-#endif /* CONFIG_CONTIG_ALLOC */
-
#else /* !CONFIG_ARCH_HAS_GIGANTIC_PAGE */
static struct folio *alloc_gigantic_folio(int order, gfp_t gfp_mask, int nid,
nodemask_t *nodemask)
@@ -1641,9 +1609,11 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
if (unlikely(folio_test_hwpoison(folio)))
folio_clear_hugetlb_hwpoison(folio);
- folio_ref_unfreeze(folio, 1);
-
- hugetlb_free_folio(folio);
+ VM_BUG_ON_FOLIO(folio_ref_count(folio), folio);
+ if (folio_test_hugetlb_cma(folio))
+ hugetlb_cma_free_frozen_folio(folio);
+ else
+ free_frozen_pages(&folio->page, folio_order(folio));
}
/*
diff --git a/mm/hugetlb_cma.c b/mm/hugetlb_cma.c
index fc41f3b949f8..af9caaf007e4 100644
--- a/mm/hugetlb_cma.c
+++ b/mm/hugetlb_cma.c
@@ -18,16 +18,16 @@ static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
static bool hugetlb_cma_only;
static unsigned long hugetlb_cma_size __initdata;
-void hugetlb_cma_free_folio(struct folio *folio)
+void hugetlb_cma_free_frozen_folio(struct folio *folio)
{
int nid = folio_nid(folio);
- WARN_ON_ONCE(!cma_release(hugetlb_cma[nid], &folio->page,
- folio_nr_pages(folio)));
+ WARN_ON_ONCE(!cma_release_frozen(hugetlb_cma[nid], &folio->page,
+ folio_nr_pages(folio)));
}
-struct folio *hugetlb_cma_alloc_folio(int order, gfp_t gfp_mask,
- int nid, nodemask_t *nodemask)
+struct folio *hugetlb_cma_alloc_frozen_folio(int order, gfp_t gfp_mask,
+ int nid, nodemask_t *nodemask)
{
int node;
struct folio *folio;
@@ -50,7 +50,6 @@ struct folio *hugetlb_cma_alloc_folio(int order, gfp_t gfp_mask,
if (!page)
return NULL;
- set_page_refcounted(page);
folio = page_folio(page);
folio_set_hugetlb_cma(folio);
return folio;
diff --git a/mm/hugetlb_cma.h b/mm/hugetlb_cma.h
index 2c2ec8a7e134..3bc295c8c38e 100644
--- a/mm/hugetlb_cma.h
+++ b/mm/hugetlb_cma.h
@@ -3,8 +3,8 @@
#define _LINUX_HUGETLB_CMA_H
#ifdef CONFIG_CMA
-void hugetlb_cma_free_folio(struct folio *folio);
-struct folio *hugetlb_cma_alloc_folio(int order, gfp_t gfp_mask,
+void hugetlb_cma_free_frozen_folio(struct folio *folio);
+struct folio *hugetlb_cma_alloc_frozen_folio(int order, gfp_t gfp_mask,
int nid, nodemask_t *nodemask);
struct huge_bootmem_page *hugetlb_cma_alloc_bootmem(struct hstate *h, int *nid,
bool node_exact);
@@ -14,12 +14,12 @@ unsigned long hugetlb_cma_total_size(void);
void hugetlb_cma_validate_params(void);
bool hugetlb_early_cma(struct hstate *h);
#else
-static inline void hugetlb_cma_free_folio(struct folio *folio)
+static inline void hugetlb_cma_free_frozen_folio(struct folio *folio)
{
}
-static inline struct folio *hugetlb_cma_alloc_folio(int order, gfp_t gfp_mask,
- int nid, nodemask_t *nodemask)
+static inline struct folio *hugetlb_cma_alloc_frozen_folio(int order,
+ gfp_t gfp_mask, int nid, nodemask_t *nodemask)
{
return NULL;
}
--
2.27.0
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig()
2025-09-18 13:19 ` [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig() Kefeng Wang
@ 2025-09-18 15:49 ` Zi Yan
2025-09-19 2:03 ` Kefeng Wang
2025-09-19 1:40 ` kernel test robot
` (2 subsequent siblings)
3 siblings, 1 reply; 23+ messages in thread
From: Zi Yan @ 2025-09-18 15:49 UTC (permalink / raw)
To: Kefeng Wang
Cc: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
Matthew Wilcox, sidhartha.kumar, jane.chu, Vlastimil Babka,
Brendan Jackman, Johannes Weiner, linux-mm
On 18 Sep 2025, at 9:19, Kefeng Wang wrote:
> The alloc_contig_pages() spends a lot of time in pfn_range_valid_contig(),
> we could check whether the page in this pfn range could be allocated
> before alloc_contig_range(), if the page can't be migrated, no further
> action is required, and also skip some unnecessary iterations for
> compound pages such as THP and non-compound high order buddy, which
> save times a lot too. The check is racy, but the only danger is skipping
> too much.
>
> A simple test on machine with 116G free memory, allocate 120 * 1G
> HugeTLB folios(107 successfully returned),
>
> time echo 120 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
>
> Before: 0m2.124s
> After: 0m0.602s
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> mm/page_alloc.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 478beaf95f84..5b7d705e9710 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7012,6 +7012,7 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
> {
> unsigned long i, end_pfn = start_pfn + nr_pages;
> struct page *page;
> + struct folio *folio;
>
> for (i = start_pfn; i < end_pfn; i++) {
> page = pfn_to_online_page(i);
> @@ -7021,11 +7022,26 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
> if (page_zone(page) != z)
> return false;
>
> - if (PageReserved(page))
> + folio = page_folio(page);
> + if (folio_test_reserved(folio))
> return false;
>
> - if (PageHuge(page))
> + if (folio_test_hugetlb(folio))
> return false;
> +
> + /* The following type of folios aren't migrated */
s/aren’t/cannot be/
> + if (folio_test_pgtable(folio) | folio_test_stack(folio))
> + return false;
Maybe worth explicitly stating these two types of pages in the commit log.
> +
> + /*
> + * For compound pages such as THP and non-compound high
> + * order buddy pages, save potentially a lot of iterations
> + * if we can skip them at once.
> + */
> + if (PageCompound(page))
> + i += (1UL << compound_order(page)) - 1;
Just a note here: if the page is a tail page, this just moves i to the
next page instead of the next folio.
> + else if (PageBuddy(page))
> + i += (1UL << buddy_order(page)) - 1;
> }
> return true;
> }
> --
> 2.27.0
Otherwise, LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig()
2025-09-18 13:19 ` [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig() Kefeng Wang
2025-09-18 15:49 ` Zi Yan
@ 2025-09-19 1:40 ` kernel test robot
2025-09-19 5:00 ` Dev Jain
2025-09-30 9:56 ` David Hildenbrand
3 siblings, 0 replies; 23+ messages in thread
From: kernel test robot @ 2025-09-19 1:40 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
Muchun Song, Zi Yan, Matthew Wilcox
Cc: llvm, oe-kbuild-all, Linux Memory Management List,
sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, Kefeng Wang
Hi Kefeng,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
url: https://github.com/intel-lab-lkp/linux/commits/Kefeng-Wang/mm-page_alloc-optimize-pfn_range_valid_contig/20250918-212431
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20250918132000.1951232-2-wangkefeng.wang%40huawei.com
patch subject: [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig()
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20250919/202509190917.wgDdVYHL-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250919/202509190917.wgDdVYHL-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202509190917.wgDdVYHL-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> mm/page_alloc.c:7033:7: warning: use of bitwise '|' with boolean operands [-Wbitwise-instead-of-logical]
7033 | if (folio_test_pgtable(folio) | folio_test_stack(folio))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| ||
mm/page_alloc.c:7033:7: note: cast one or both operands to int to silence this warning
1 warning generated.
vim +7033 mm/page_alloc.c
7009
7010 static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
7011 unsigned long nr_pages)
7012 {
7013 unsigned long i, end_pfn = start_pfn + nr_pages;
7014 struct page *page;
7015 struct folio *folio;
7016
7017 for (i = start_pfn; i < end_pfn; i++) {
7018 page = pfn_to_online_page(i);
7019 if (!page)
7020 return false;
7021
7022 if (page_zone(page) != z)
7023 return false;
7024
7025 folio = page_folio(page);
7026 if (folio_test_reserved(folio))
7027 return false;
7028
7029 if (folio_test_hugetlb(folio))
7030 return false;
7031
7032 /* The following type of folios aren't migrated */
> 7033 if (folio_test_pgtable(folio) | folio_test_stack(folio))
7034 return false;
7035
7036 /*
7037 * For compound pages such as THP and non-compound high
7038 * order buddy pages, save potentially a lot of iterations
7039 * if we can skip them at once.
7040 */
7041 if (PageCompound(page))
7042 i += (1UL << compound_order(page)) - 1;
7043 else if (PageBuddy(page))
7044 i += (1UL << buddy_order(page)) - 1;
7045 }
7046 return true;
7047 }
7048
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig()
2025-09-18 15:49 ` Zi Yan
@ 2025-09-19 2:03 ` Kefeng Wang
0 siblings, 0 replies; 23+ messages in thread
From: Kefeng Wang @ 2025-09-19 2:03 UTC (permalink / raw)
To: Zi Yan
Cc: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
Matthew Wilcox, sidhartha.kumar, jane.chu, Vlastimil Babka,
Brendan Jackman, Johannes Weiner, linux-mm
On 2025/9/18 23:49, Zi Yan wrote:
> On 18 Sep 2025, at 9:19, Kefeng Wang wrote:
>
>> The alloc_contig_pages() spends a lot of time in pfn_range_valid_contig(),
>> we could check whether the page in this pfn range could be allocated
>> before alloc_contig_range(), if the page can't be migrated, no further
>> action is required, and also skip some unnecessary iterations for
>> compound pages such as THP and non-compound high order buddy, which
>> save times a lot too. The check is racy, but the only danger is skipping
>> too much.
>>
>> A simple test on machine with 116G free memory, allocate 120 * 1G
>> HugeTLB folios(107 successfully returned),
>>
>> time echo 120 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
>>
>> Before: 0m2.124s
>> After: 0m0.602s
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>> mm/page_alloc.c | 20 ++++++++++++++++++--
>> 1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 478beaf95f84..5b7d705e9710 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -7012,6 +7012,7 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
>> {
>> unsigned long i, end_pfn = start_pfn + nr_pages;
>> struct page *page;
>> + struct folio *folio;
>>
>> for (i = start_pfn; i < end_pfn; i++) {
>> page = pfn_to_online_page(i);
>> @@ -7021,11 +7022,26 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
>> if (page_zone(page) != z)
>> return false;
>>
>> - if (PageReserved(page))
>> + folio = page_folio(page);
>> + if (folio_test_reserved(folio))
>> return false;
>>
>> - if (PageHuge(page))
>> + if (folio_test_hugetlb(folio))
>> return false;
>> +
>> + /* The following type of folios aren't migrated */
>
> s/aren’t/cannot be/
ACK.
>
>> + if (folio_test_pgtable(folio) | folio_test_stack(folio))
>> + return false;
should be "||", will fix.
>
> Maybe worth explicitly stating these two types of pages in the commit log.
>
OK, will update.
>> +
>> + /*
>> + * For compound pages such as THP and non-compound high
>> + * order buddy pages, save potentially a lot of iterations
>> + * if we can skip them at once.
>> + */
>> + if (PageCompound(page))
>> + i += (1UL << compound_order(page)) - 1;
>
> Just a note here, if page is tail, this just move i to the next page
> instead of next folio.
As no reference is held, it is not too precise, but we optimize for the
most common scenarios.
>
>> + else if (PageBuddy(page))
>> + i += (1UL << buddy_order(page)) - 1;
>> }
>> return true;
>> }
>> --
>> 2.27.0
>
> Otherwise, LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>
>
>
Thanks.
> Best Regards,
> Yan, Zi
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig()
2025-09-18 13:19 ` [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig() Kefeng Wang
2025-09-18 15:49 ` Zi Yan
2025-09-19 1:40 ` kernel test robot
@ 2025-09-19 5:00 ` Dev Jain
2025-09-20 8:19 ` Kefeng Wang
2025-09-30 9:56 ` David Hildenbrand
3 siblings, 1 reply; 23+ messages in thread
From: Dev Jain @ 2025-09-19 5:00 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
Muchun Song, Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm
On 18/09/25 6:49 pm, Kefeng Wang wrote:
> The alloc_contig_pages() spends a lot of time in pfn_range_valid_contig(),
> we could check whether the page in this pfn range could be allocated
> before alloc_contig_range(), if the page can't be migrated, no further
> action is required, and also skip some unnecessary iterations for
> compound pages such as THP and non-compound high order buddy, which
> save times a lot too. The check is racy, but the only danger is skipping
> too much.
>
> A simple test on machine with 116G free memory, allocate 120 * 1G
> HugeTLB folios(107 successfully returned),
>
> time echo 120 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
>
> Before: 0m2.124s
> After: 0m0.602s
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> mm/page_alloc.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 478beaf95f84..5b7d705e9710 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7012,6 +7012,7 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
> {
> unsigned long i, end_pfn = start_pfn + nr_pages;
> struct page *page;
> + struct folio *folio;
>
> for (i = start_pfn; i < end_pfn; i++) {
> page = pfn_to_online_page(i);
> @@ -7021,11 +7022,26 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
> if (page_zone(page) != z)
> return false;
>
> - if (PageReserved(page))
> + folio = page_folio(page);
> + if (folio_test_reserved(folio))
> return false;
>
> - if (PageHuge(page))
> + if (folio_test_hugetlb(folio))
> return false;
> +
> + /* The following type of folios aren't migrated */
> + if (folio_test_pgtable(folio) | folio_test_stack(folio))
> + return false;
> +
> + /*
> + * For compound pages such as THP and non-compound high
> + * order buddy pages, save potentially a lot of iterations
> + * if we can skip them at once.
> + */
> + if (PageCompound(page))
> + i += (1UL << compound_order(page)) - 1;
Can we instead do
if (folio_test_large(folio))
i += folio_nr_pages(folio) - folio_page_idx(folio, page).
> + else if (PageBuddy(page))
> + i += (1UL << buddy_order(page)) - 1;
> }
> return true;
> }
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig()
2025-09-19 5:00 ` Dev Jain
@ 2025-09-20 8:19 ` Kefeng Wang
0 siblings, 0 replies; 23+ messages in thread
From: Kefeng Wang @ 2025-09-20 8:19 UTC (permalink / raw)
To: Dev Jain, Andrew Morton, David Hildenbrand, Oscar Salvador,
Muchun Song, Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm
On 2025/9/19 13:00, Dev Jain wrote:
>
> On 18/09/25 6:49 pm, Kefeng Wang wrote:
>> The alloc_contig_pages() spends a lot of time in
>> pfn_range_valid_contig(),
>> we could check whether the page in this pfn range could be allocated
>> before alloc_contig_range(), if the page can't be migrated, no further
>> action is required, and also skip some unnecessary iterations for
>> compound pages such as THP and non-compound high order buddy, which
>> save times a lot too. The check is racy, but the only danger is skipping
>> too much.
>>
>> A simple test on machine with 116G free memory, allocate 120 * 1G
>> HugeTLB folios(107 successfully returned),
>>
>> time echo 120 > /sys/kernel/mm/hugepages/hugepages-1048576kB/
>> nr_hugepages
>>
>> Before: 0m2.124s
>> After: 0m0.602s
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>> mm/page_alloc.c | 20 ++++++++++++++++++--
>> 1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 478beaf95f84..5b7d705e9710 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -7012,6 +7012,7 @@ static bool pfn_range_valid_contig(struct zone
>> *z, unsigned long start_pfn,
>> {
>> unsigned long i, end_pfn = start_pfn + nr_pages;
>> struct page *page;
>> + struct folio *folio;
>> for (i = start_pfn; i < end_pfn; i++) {
>> page = pfn_to_online_page(i);
>> @@ -7021,11 +7022,26 @@ static bool pfn_range_valid_contig(struct zone
>> *z, unsigned long start_pfn,
>> if (page_zone(page) != z)
>> return false;
>> - if (PageReserved(page))
>> + folio = page_folio(page);
>> + if (folio_test_reserved(folio))
>> return false;
>> - if (PageHuge(page))
>> + if (folio_test_hugetlb(folio))
>> return false;
>> +
>> + /* The following type of folios aren't migrated */
>> + if (folio_test_pgtable(folio) | folio_test_stack(folio))
>> + return false;
>> +
>> + /*
>> + * For compound pages such as THP and non-compound high
>> + * order buddy pages, save potentially a lot of iterations
>> + * if we can skip them at once.
>> + */
>> + if (PageCompound(page))
>> + i += (1UL << compound_order(page)) - 1;
>
> Can we instead do
> if (folio_test_large(folio))
> i += folio_nr_pages(folio) - folio_page_idx(folio, page).
I'm afraid not, see 9342bc134ae7 ("mm/memory_hotplug: fix call
folio_test_large with tail page in do_migrate_range").
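Without a reference, something along those lines could act on stale
folio metadata (a sketch of the hazard only, not a proposal):

	folio = page_folio(page);	/* page may be a tail page */
	/*
	 * Nothing pins the folio here, so it can be split or freed
	 * concurrently; folio_test_large()/folio_nr_pages()/
	 * folio_page_idx() may then read metadata of something that is
	 * no longer a head page and compute a bogus jump.
	 */
	if (folio_test_large(folio))
		i += folio_nr_pages(folio) - folio_page_idx(folio, page);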
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig()
2025-09-18 13:19 ` [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig() Kefeng Wang
` (2 preceding siblings ...)
2025-09-19 5:00 ` Dev Jain
@ 2025-09-30 9:56 ` David Hildenbrand
2025-10-09 12:40 ` Kefeng Wang
3 siblings, 1 reply; 23+ messages in thread
From: David Hildenbrand @ 2025-09-30 9:56 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton, Oscar Salvador, Muchun Song, Zi Yan,
Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm
On 18.09.25 15:19, Kefeng Wang wrote:
> The alloc_contig_pages() spends a lot of time in pfn_range_valid_contig(),
> we could check whether the page in this pfn range could be allocated
> before alloc_contig_range(), if the page can't be migrated, no further
> action is required, and also skip some unnecessary iterations for
> compound pages such as THP and non-compound high order buddy, which
> save times a lot too. The check is racy, but the only danger is skipping
> too much.
>
> A simple test on machine with 116G free memory, allocate 120 * 1G
> HugeTLB folios(107 successfully returned),
>
> time echo 120 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
>
> Before: 0m2.124s
> After: 0m0.602s
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> mm/page_alloc.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 478beaf95f84..5b7d705e9710 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7012,6 +7012,7 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
> {
> unsigned long i, end_pfn = start_pfn + nr_pages;
> struct page *page;
> + struct folio *folio;
>
> for (i = start_pfn; i < end_pfn; i++) {
> page = pfn_to_online_page(i);
> @@ -7021,11 +7022,26 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
> if (page_zone(page) != z)
> return false;
>
> - if (PageReserved(page))
> + folio = page_folio(page);
> + if (folio_test_reserved(folio))
> return false;
>
> - if (PageHuge(page))
> + if (folio_test_hugetlb(folio))
> return false;
> +
> + /* The following type of folios aren't migrated */
> + if (folio_test_pgtable(folio) | folio_test_stack(folio))
> + return false;
> +
I don't enjoy us open-coding this here. has_unmovable_pages() has much
better heuristics.
I suggest you drop this patch for now from this series, as it seems to
be independent from the rest, and instead see if you could reuse some of
the has_unmovable_pages() logic instead.
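Roughly along these lines (an untested sketch only; has_unmovable_pages()
is currently a static helper in mm/page_isolation.c, so it would have to
be exposed first, and its exact signature may differ):

	/* hypothetical reuse instead of the open-coded page checks */
	if (has_unmovable_pages(start_pfn, end_pfn, MIGRATE_MOVABLE, 0))
		return false;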
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 2/8] mm: hugetlb: optimize replace_free_hugepage_folios()
2025-09-18 13:19 ` [PATCH v2 2/8] mm: hugetlb: optimize replace_free_hugepage_folios() Kefeng Wang
@ 2025-09-30 9:57 ` David Hildenbrand
2025-10-09 12:40 ` Kefeng Wang
0 siblings, 1 reply; 23+ messages in thread
From: David Hildenbrand @ 2025-09-30 9:57 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton, Oscar Salvador, Muchun Song, Zi Yan,
Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm
On 18.09.25 15:19, Kefeng Wang wrote:
> No need to replace free hugepage folios if no free hugetlb folios,
> we don't replace gigantic folio, so use isolate_or_dissolve_huge_folio(),
> also skip some pfn iterations for compound pages such as THP and
> non-compound high order buddy to save time.
>
> A simple test on machine with 116G free memory, allocate 120 * 1G
> HugeTLB folios(107 successfully returned),
>
> time echo 120 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
>
> Before: 0m0.602s
> After: 0m0.429s
Also this patch feels misplaced in this series. I suggest you send that
out separately.
Or is there anything important that I am missing?
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 3/8] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page()
2025-09-18 13:19 ` [PATCH v2 3/8] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
@ 2025-09-30 10:01 ` David Hildenbrand
0 siblings, 0 replies; 23+ messages in thread
From: David Hildenbrand @ 2025-09-30 10:01 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton, Oscar Salvador, Muchun Song, Zi Yan,
Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm
On 18.09.25 15:19, Kefeng Wang wrote:
> Add a new helper to free a huge page, for consistency with
> debug_vm_pgtable_alloc_huge_page(), and use HPAGE_PUD_ORDER
> instead of open-coding it.
>
> Also move free_contig_range() under CONFIG_ALLOC_CONTIG
> since all callers are built with CONFIG_ALLOC_CONTIG.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
Acked-by: David Hildenbrand <david@redhat.com>
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 4/8] mm: page_alloc: add split_non_compound_page()
2025-09-18 13:19 ` [PATCH v2 4/8] mm: page_alloc: add split_non_compound_page() Kefeng Wang
@ 2025-09-30 10:06 ` David Hildenbrand
2025-10-09 12:40 ` Kefeng Wang
0 siblings, 1 reply; 23+ messages in thread
From: David Hildenbrand @ 2025-09-30 10:06 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton, Oscar Salvador, Muchun Song, Zi Yan,
Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm
On 18.09.25 15:19, Kefeng Wang wrote:
> Add new split_non_compound_page() to simplify make_alloc_exact().
>
"Factor out the splitting of non-compound page from make_alloc_exact()
and split_page() into a new helper function split_non_compound_page()".
Not sure I enjoy the name "split_non_compound_page()", but it matches
the existing theme of split_page(): we're not really splitting any
pages, we're just adjusting tracking metadata for pages part of the
original-higher-order-page so it can be freed separately later.
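For context, the typical split_page() usage pattern is roughly the
following (illustrative sketch only, error handling elided):

	/* a non-compound higher-order allocation ... */
	struct page *page = alloc_pages(GFP_KERNEL, 3);

	/* ... whose sub-pages become individually freeable after the split */
	split_page(page, 3);
	__free_page(page + 1);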
But now I think of it, the terminology is bad if you look at the
description of split_page(): "split_page takes a non-compound
higher-order page, and splits it into n (1<<order) sub-pages:
page[0..n]". It's unclear how split_non_compound_page() would really differ.
I would suggest you call the new helper simply "__split_page" ?
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 6/8] mm: cma: add __cma_release()
2025-09-18 13:19 ` [PATCH v2 6/8] mm: cma: add __cma_release() Kefeng Wang
@ 2025-09-30 10:15 ` David Hildenbrand
2025-10-09 12:40 ` Kefeng Wang
0 siblings, 1 reply; 23+ messages in thread
From: David Hildenbrand @ 2025-09-30 10:15 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton, Oscar Salvador, Muchun Song, Zi Yan,
Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm
On 18.09.25 15:19, Kefeng Wang wrote:
> Kill cma_pages_valid() which only used in cma_release(), also
> cleanup code duplication between cma pages valid checking and
> cma memrange finding, add __cma_release() helper to prepare for
> the upcoming frozen page release.
>
> Reviewed-by: Jane Chu <jane.chu@oracle.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> include/linux/cma.h | 1 -
> mm/cma.c | 57 ++++++++++++---------------------------------
> 2 files changed, 15 insertions(+), 43 deletions(-)
>
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index 62d9c1cf6326..e5745d2aec55 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -49,7 +49,6 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
> struct cma **res_cma);
> extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
> bool no_warn);
> -extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count);
> extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
>
> extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
> diff --git a/mm/cma.c b/mm/cma.c
> index 813e6dc7b095..2af8c5bc58dd 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -942,34 +942,36 @@ struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
> return page ? page_folio(page) : NULL;
> }
>
> -bool cma_pages_valid(struct cma *cma, const struct page *pages,
> - unsigned long count)
> +static bool __cma_release(struct cma *cma, const struct page *pages,
> + unsigned long count)
> {
> unsigned long pfn, end;
> int r;
> struct cma_memrange *cmr;
> - bool ret;
> +
> + pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count);
>
> if (!cma || !pages || count > cma->count)
> return false;
>
> pfn = page_to_pfn(pages);
> - ret = false;
>
> for (r = 0; r < cma->nranges; r++) {
> cmr = &cma->ranges[r];
> end = cmr->base_pfn + cmr->count;
> - if (pfn >= cmr->base_pfn && pfn < end) {
> - ret = pfn + count <= end;
> + if (pfn >= cmr->base_pfn && pfn < end && pfn + count <= end)
Are you afraid of overflows here, or why can't it simply be
if (pfn >= cmr->base_pfn && pfn + count <= end)
But I wonder if we want to keep here
if (pfn >= cmr->base_pfn && pfn < end)
And VM_WARN if the area does not completely fit into the range. See below.
> break;
> - }
> }
>
> - if (!ret)
> - pr_debug("%s(page %p, count %lu)\n",
> - __func__, (void *)pages, count);
> + if (r == cma->nranges)
> + return false;
Would we want to warn one way or the other in that case? Is it valid
that someone tries to free a wrong range?
Note that the original code had this pr_debug() in case no range for the
start pfn was found (IIUC, it's confusing) and this VM_BUG_ON(end_pfn >
cmr->base_pfn + cmr->count) in case a range was found but the area would
not completely fit into it.
You're not discussing that behavioral change in the changelog, and I
think we would want to keep some sanity checks, likely in a
VM_WARN_ON_ONCE() form.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 6/8] mm: cma: add __cma_release()
2025-09-30 10:15 ` David Hildenbrand
@ 2025-10-09 12:40 ` Kefeng Wang
0 siblings, 0 replies; 23+ messages in thread
From: Kefeng Wang @ 2025-10-09 12:40 UTC (permalink / raw)
To: David Hildenbrand, Andrew Morton, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm
On 2025/9/30 18:15, David Hildenbrand wrote:
> On 18.09.25 15:19, Kefeng Wang wrote:
>> Kill cma_pages_valid() which only used in cma_release(), also
>> cleanup code duplication between cma pages valid checking and
>> cma memrange finding, add __cma_release() helper to prepare for
>> the upcoming frozen page release.
>>
>> Reviewed-by: Jane Chu <jane.chu@oracle.com>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>> include/linux/cma.h | 1 -
>> mm/cma.c | 57 ++++++++++++---------------------------------
>> 2 files changed, 15 insertions(+), 43 deletions(-)
>>
>> diff --git a/include/linux/cma.h b/include/linux/cma.h
>> index 62d9c1cf6326..e5745d2aec55 100644
>> --- a/include/linux/cma.h
>> +++ b/include/linux/cma.h
>> @@ -49,7 +49,6 @@ extern int cma_init_reserved_mem(phys_addr_t base,
>> phys_addr_t size,
>> struct cma **res_cma);
>> extern struct page *cma_alloc(struct cma *cma, unsigned long count,
>> unsigned int align,
>> bool no_warn);
>> -extern bool cma_pages_valid(struct cma *cma, const struct page
>> *pages, unsigned long count);
>> extern bool cma_release(struct cma *cma, const struct page *pages,
>> unsigned long count);
>> extern int cma_for_each_area(int (*it)(struct cma *cma, void *data),
>> void *data);
>> diff --git a/mm/cma.c b/mm/cma.c
>> index 813e6dc7b095..2af8c5bc58dd 100644
>> --- a/mm/cma.c
>> +++ b/mm/cma.c
>> @@ -942,34 +942,36 @@ struct folio *cma_alloc_folio(struct cma *cma,
>> int order, gfp_t gfp)
>> return page ? page_folio(page) : NULL;
>> }
>> -bool cma_pages_valid(struct cma *cma, const struct page *pages,
>> - unsigned long count)
>> +static bool __cma_release(struct cma *cma, const struct page *pages,
>> + unsigned long count)
>> {
>> unsigned long pfn, end;
>> int r;
>> struct cma_memrange *cmr;
>> - bool ret;
>> +
>> + pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages,
>> count);
>> if (!cma || !pages || count > cma->count)
>> return false;
>> pfn = page_to_pfn(pages);
>> - ret = false;
>> for (r = 0; r < cma->nranges; r++) {
>> cmr = &cma->ranges[r];
>> end = cmr->base_pfn + cmr->count;
>> - if (pfn >= cmr->base_pfn && pfn < end) {
>> - ret = pfn + count <= end;
>> + if (pfn >= cmr->base_pfn && pfn < end && pfn + count <= end)
>
> Are you afraid of overflows here, or why can't it simply be
>
> if (pfn >= cmr->base_pfn && pfn + count <= end)
>
> But I wonder if we want to keep here
>
> if (pfn >= cmr->base_pfn && pfn < end)
>
> And VM_WARN if the area does not completely fit into the range. See below.
>
>
>> break;
>> - }
>> }
>> - if (!ret)
>> - pr_debug("%s(page %p, count %lu)\n",
>> - __func__, (void *)pages, count);
>> + if (r == cma->nranges)
>> + return false;
>
> Would we want to warn one way or the other in that case? Is it valid
> that someone tries to free a wrong range?
The original cma_pages_valid() checks whether the start pfn lies within a
cma range, and the whole area must fit completely within that range.
The repeated check "VM_BUG_ON(pfn + count > end)" in cma_release()
can never trigger, since we return early when cma_pages_valid() fails.
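I.e. the pre-series flow is roughly the following (trimmed sketch from
memory, not the exact code):

	/* cma_release(), trimmed */
	if (!cma_pages_valid(cma, pages, count))  /* ensures pfn + count <= end */
		return false;

	for (r = 0; r < cma->nranges; r++) {
		cmr = &cma->ranges[r];
		if (pfn >= cmr->base_pfn && pfn < cmr->base_pfn + cmr->count) {
			/* cannot fire, cma_pages_valid() checked the same bound */
			VM_BUG_ON(end_pfn > cmr->base_pfn + cmr->count);
			break;
		}
	}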
>
> Note that the original code had this pr_debug() in case no range for the
> start pfn was found (IIUC, it's confusing) and this VM_BUG_ON(end_pfn >
> cmr->base_pfn + cmr->count) in case a range was found but the area would
> not completely fit into it.
So the VM_BUG_ON is not useful.
>
> You're not discussing that behavioral change in the changelog, and I
> think we would want to keep some sanity checks, likely in a
> VM_WARN_ON_ONCE() form.
>
>
But for the error path, adding some debug info is better. Here is a quick
diff based on this patch; what do you think?
diff --git a/mm/cma.c b/mm/cma.c
index 2af8c5bc58dd..88016f4aef7f 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -959,12 +959,19 @@ static bool __cma_release(struct cma *cma, const struct page *pages,
for (r = 0; r < cma->nranges; r++) {
cmr = &cma->ranges[r];
end = cmr->base_pfn + cmr->count;
- if (pfn >= cmr->base_pfn && pfn < end && pfn + count <= end)
- break;
+ if (pfn >= cmr->base_pfn && pfn < end) {
+ if (pfn + count <= end)
+ break;
+
+ VM_WARN_ON_ONCE(1);
+ }
}
- if (r == cma->nranges)
+ if (r == cma->nranges) {
+ pr_debug("%s(no cma range match the page %p)\n",
+ __func__, (void *)pages);
return false;
+ }
free_contig_range(pfn, count);
cma_clear_bitmap(cma, cmr, pfn, count);
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/8] mm: page_alloc: optimize pfn_range_valid_contig()
2025-09-30 9:56 ` David Hildenbrand
@ 2025-10-09 12:40 ` Kefeng Wang
0 siblings, 0 replies; 23+ messages in thread
From: Kefeng Wang @ 2025-10-09 12:40 UTC (permalink / raw)
To: David Hildenbrand, Andrew Morton, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm
On 2025/9/30 17:56, David Hildenbrand wrote:
> On 18.09.25 15:19, Kefeng Wang wrote:
>> The alloc_contig_pages() spends a lot of time in
>> pfn_range_valid_contig(),
>> we could check whether the page in this pfn range could be allocated
>> before alloc_contig_range(), if the page can't be migrated, no further
>> action is required, and also skip some unnecessary iterations for
>> compound pages such as THP and non-compound high order buddy, which
>> save times a lot too. The check is racy, but the only danger is skipping
>> too much.
>>
>> A simple test on machine with 116G free memory, allocate 120 * 1G
>> HugeTLB folios(107 successfully returned),
>>
>> time echo 120 > /sys/kernel/mm/hugepages/hugepages-1048576kB/
>> nr_hugepages
>>
>> Before: 0m2.124s
>> After: 0m0.602s
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>> mm/page_alloc.c | 20 ++++++++++++++++++--
>> 1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 478beaf95f84..5b7d705e9710 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -7012,6 +7012,7 @@ static bool pfn_range_valid_contig(struct zone
>> *z, unsigned long start_pfn,
>> {
>> unsigned long i, end_pfn = start_pfn + nr_pages;
>> struct page *page;
>> + struct folio *folio;
>> for (i = start_pfn; i < end_pfn; i++) {
>> page = pfn_to_online_page(i);
>> @@ -7021,11 +7022,26 @@ static bool pfn_range_valid_contig(struct zone
>> *z, unsigned long start_pfn,
>> if (page_zone(page) != z)
>> return false;
>> - if (PageReserved(page))
>> + folio = page_folio(page);
>> + if (folio_test_reserved(folio))
>> return false;
>> - if (PageHuge(page))
>> + if (folio_test_hugetlb(folio))
>> return false;
>> +
>> + /* The following type of folios aren't migrated */
>> + if (folio_test_pgtable(folio) | folio_test_stack(folio))
>> + return false;
>> +
>
> I don't enjoy us open coding this here. has_unmovable_pages() has a much
> better heuristics.
>
> I suggest you drop this patch for now from this series, as it seems to
> be independent from the rest, and instead see if you could reuse some of
> the has_unmovable_pages() logic instead.
>
OK, I will try to check whether has_unmovable_pages() could be used.
The new patches were added when I tested alloc_contig_pages() and
alloc_contig_frozen_pages() with different GFP flags [1].
Let me remove them and resend them separately.
[1]
https://lore.kernel.org/linux-mm/39ea6d31-ec9c-4053-a875-8e86a8676a62@huawei.com/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 2/8] mm: hugetlb: optimize replace_free_hugepage_folios()
2025-09-30 9:57 ` David Hildenbrand
@ 2025-10-09 12:40 ` Kefeng Wang
0 siblings, 0 replies; 23+ messages in thread
From: Kefeng Wang @ 2025-10-09 12:40 UTC (permalink / raw)
To: David Hildenbrand, Andrew Morton, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm
On 2025/9/30 17:57, David Hildenbrand wrote:
> On 18.09.25 15:19, Kefeng Wang wrote:
>> No need to replace free hugepage folios if no free hugetlb folios,
>> we don't replace gigantic folio, so use isolate_or_dissolve_huge_folio(),
>> also skip some pfn iterations for compound pages such as THP and
>> non-compound high order buddy to save time.
>>
>> A simple test on machine with 116G free memory, allocate 120 * 1G
>> HugeTLB folios(107 successfully returned),
>>
>> time echo 120 > /sys/kernel/mm/hugepages/hugepages-1048576kB/
>> nr_hugepages
>>
>> Before: 0m0.602s
>> After: 0m0.429s
>
> Also this patch feels misplaced in this series. I suggest you send that
> out separately.
>
> Or is there anything important that I am missing?
>
Sure, let me do it separately.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 4/8] mm: page_alloc: add split_non_compound_page()
2025-09-30 10:06 ` David Hildenbrand
@ 2025-10-09 12:40 ` Kefeng Wang
0 siblings, 0 replies; 23+ messages in thread
From: Kefeng Wang @ 2025-10-09 12:40 UTC (permalink / raw)
To: David Hildenbrand, Andrew Morton, Oscar Salvador, Muchun Song,
Zi Yan, Matthew Wilcox
Cc: sidhartha.kumar, jane.chu, Vlastimil Babka, Brendan Jackman,
Johannes Weiner, linux-mm
On 2025/9/30 18:06, David Hildenbrand wrote:
> On 18.09.25 15:19, Kefeng Wang wrote:
>> Add new split_non_compound_page() to simplify make_alloc_exact().
>>
>
> "Factor out the splitting of non-compound page from make_alloc_exact()
> and split_page() into a new helper function split_non_compound_page()".
>
Thanks, will update the changelog.
> Not sure I enjoy the name "split_non_compound_page()", but it matches
> the existing theme of split_page(): we're not really splitting any
> pages, we're just adjusting tracking metadata for pages part of the
> original-higher-order-page so it can be freed separately later.
>
> But now I think of it, the terminology is bad if you look at the
> description of split_page(): "split_page takes a non-compound higher-
> order page, and splits it into n (1<<order) sub-pages: page[0..n]". It's
> unclear how split_non_compound_page() would really differ.
>
> I would suggest you call the new helper simply "__split_page" ?
>
>
OK, naming always beats me; will change it.
^ permalink raw reply [flat|nested] 23+ messages in thread