linux-mm.kvack.org archive mirror
* [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
@ 2025-12-30  7:24 Kefeng Wang
  2025-12-30  7:24 ` [PATCH v5 1/6] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
                   ` (7 more replies)
  0 siblings, 8 replies; 30+ messages in thread
From: Kefeng Wang @ 2025-12-30  7:24 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song, linux-mm
  Cc: sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox, Kefeng Wang

Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound(),
which avoid atomic operations on the page refcount, then convert
hugetlb to allocate frozen gigantic folios via the new helpers to
clean up alloc_gigantic_folio().
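
The saving can be sketched in a minimal userspace model (illustrative
names and types, not the kernel's real ones): with a frozen allocation,
hugetlb no longer needs the allocator to set the refcount to 1 only to
atomically freeze it back to 0 right away.

```c
#include <assert.h>

/* Toy model of a page: only the refcount matters here. */
struct page { int refcount; };

static int atomic_ops;	/* counts simulated atomic refcount updates */

static void set_page_refcounted(struct page *p)
{
	p->refcount = 1;
	atomic_ops++;
}

/* Models folio_ref_freeze(): cmpxchg refcount 1 -> 0. */
static int folio_ref_freeze(struct page *p)
{
	atomic_ops++;
	if (p->refcount != 1)
		return 0;
	p->refcount = 0;
	return 1;
}

/* Old flow: allocator hands back a refcounted page, hugetlb freezes it. */
static int alloc_gigantic_old(struct page *head)
{
	set_page_refcounted(head);	/* done inside the allocator */
	return folio_ref_freeze(head);	/* hugetlb wants it frozen again */
}

/* New flow: alloc_contig_frozen_pages() leaves the refcount at 0. */
static int alloc_gigantic_frozen(struct page *head)
{
	return head->refcount == 0;	/* already frozen, no atomics */
}
```

The old path spends two simulated atomic operations per folio; the
frozen path spends none.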

v5:
- address Zi Yan's comments 
  - convert one more VM_BUG_ON_PAGE to VM_WARN_ON_PAGE in split_page()
  - add a check to catch a tail page in free_contig_frozen_range()
  - remove __cma_release() and update patch3's changelog
  - put each page’s refcount in cma_release() and add find_cma_memrange()/
    __cma_release_frozen() to unify the release path, which makes the
    release pattern match the allocation pattern
  - rename alloc_buddy_hugetlb_folio() and alloc_gigantic_folio() with a
    frozen suffix
- fix wrong node id passed to cma_alloc_frozen_compound() in
  hugetlb_cma_alloc_folio(), found by lkp
- collect ACK/RB from Zi Yan/Muchun
- rebased on mm-new

v4 RESEND:
- fix set_pages_refcounted() broken by a bad git-rebase
- rebased on next1215; could also be applied to mm-new

v4:
- add VM_WARN_ON_PAGE() and use it instead of VM_BUG_ON_PAGE() in __split_page()
- add back pr_debug part in __cma_release()
- rename alloc_contig_range_frozen to alloc_contig_frozen_range, make
  alloc_contig_{range,pages}() only allocate non-compound pages, and
  adjust free_contig_range() accordingly
- add set_pages_refcounted helper to reduce code duplication
- collect ACK

v3:
- Fix build warnings/errors found by the lkp test
- Address some of David's comments:
  - Focus on frozen part and drop the optimization part
  - Rename split_non_compound_pages() to __split_pages()
  - Add back the debug print/WARN_ON when no cma range is found or the
    pfn range of the page does not fully match the cma range.

v2:
- Optimize gigantic folio allocation speed
- Use HPAGE_PUD_ORDER in debug_vm_pgtable
- Address some of David's comments:
  - kill folio_alloc_gigantic()
  - add generic cma_alloc_frozen{_compound}() instead of
    cma_{alloc,free}_folio


Kefeng Wang (6):
  mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page()
  mm: page_alloc: add __split_page()
  mm: cma: kill cma_pages_valid()
  mm: page_alloc: add alloc_contig_frozen_{range,pages}()
  mm: cma: add cma_alloc_frozen{_compound}()
  mm: hugetlb: allocate frozen pages for gigantic allocation

 include/linux/cma.h     |  27 ++----
 include/linux/gfp.h     |  52 ++++------
 include/linux/mmdebug.h |  10 ++
 mm/cma.c                | 122 +++++++++++++----------
 mm/debug_vm_pgtable.c   |  38 ++++----
 mm/hugetlb.c            |  70 ++++----------
 mm/hugetlb_cma.c        |  29 +++---
 mm/hugetlb_cma.h        |  10 +-
 mm/internal.h           |  13 +++
 mm/page_alloc.c         | 209 ++++++++++++++++++++++++++++------------
 10 files changed, 326 insertions(+), 254 deletions(-)

-- 
2.27.0



^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 1/6] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page()
  2025-12-30  7:24 [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
@ 2025-12-30  7:24 ` Kefeng Wang
  2026-01-02 18:51   ` Sid Kumar
  2025-12-30  7:24 ` [PATCH v5 2/6] mm: page_alloc: add __split_page() Kefeng Wang
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Kefeng Wang @ 2025-12-30  7:24 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song, linux-mm
  Cc: sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox, Kefeng Wang

Add a new helper to free the huge page, for consistency with
debug_vm_pgtable_alloc_huge_page(), and use HPAGE_PUD_ORDER instead
of open-coding it.

Also move free_contig_range() under CONFIG_CONTIG_ALLOC since all
callers are built with CONFIG_CONTIG_ALLOC.

Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/gfp.h   |  2 +-
 mm/debug_vm_pgtable.c | 38 +++++++++++++++++---------------------
 mm/page_alloc.c       |  2 +-
 3 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index b155929af5b1..ea053f1cfa16 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -438,8 +438,8 @@ extern struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_
 					      int nid, nodemask_t *nodemask);
 #define alloc_contig_pages(...)			alloc_hooks(alloc_contig_pages_noprof(__VA_ARGS__))
 
-#endif
 void free_contig_range(unsigned long pfn, unsigned long nr_pages);
+#endif
 
 #ifdef CONFIG_CONTIG_ALLOC
 static inline struct folio *folio_alloc_gigantic_noprof(int order, gfp_t gfp,
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index ae9b9310d96f..83cf07269f13 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -971,22 +971,26 @@ static unsigned long __init get_random_vaddr(void)
 	return random_vaddr;
 }
 
-static void __init destroy_args(struct pgtable_debug_args *args)
+static void __init
+debug_vm_pgtable_free_huge_page(struct pgtable_debug_args *args,
+		unsigned long pfn, int order)
 {
-	struct page *page = NULL;
+#ifdef CONFIG_CONTIG_ALLOC
+	if (args->is_contiguous_page) {
+		free_contig_range(pfn, 1 << order);
+		return;
+	}
+#endif
+	__free_pages(pfn_to_page(pfn), order);
+}
 
+static void __init destroy_args(struct pgtable_debug_args *args)
+{
 	/* Free (huge) page */
 	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
 	    has_transparent_pud_hugepage() &&
 	    args->pud_pfn != ULONG_MAX) {
-		if (args->is_contiguous_page) {
-			free_contig_range(args->pud_pfn,
-					  (1 << (HPAGE_PUD_SHIFT - PAGE_SHIFT)));
-		} else {
-			page = pfn_to_page(args->pud_pfn);
-			__free_pages(page, HPAGE_PUD_SHIFT - PAGE_SHIFT);
-		}
-
+		debug_vm_pgtable_free_huge_page(args, args->pud_pfn, HPAGE_PUD_ORDER);
 		args->pud_pfn = ULONG_MAX;
 		args->pmd_pfn = ULONG_MAX;
 		args->pte_pfn = ULONG_MAX;
@@ -995,20 +999,13 @@ static void __init destroy_args(struct pgtable_debug_args *args)
 	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
 	    has_transparent_hugepage() &&
 	    args->pmd_pfn != ULONG_MAX) {
-		if (args->is_contiguous_page) {
-			free_contig_range(args->pmd_pfn, (1 << HPAGE_PMD_ORDER));
-		} else {
-			page = pfn_to_page(args->pmd_pfn);
-			__free_pages(page, HPAGE_PMD_ORDER);
-		}
-
+		debug_vm_pgtable_free_huge_page(args, args->pmd_pfn, HPAGE_PMD_ORDER);
 		args->pmd_pfn = ULONG_MAX;
 		args->pte_pfn = ULONG_MAX;
 	}
 
 	if (args->pte_pfn != ULONG_MAX) {
-		page = pfn_to_page(args->pte_pfn);
-		__free_page(page);
+		__free_page(pfn_to_page(args->pte_pfn));
 
 		args->pte_pfn = ULONG_MAX;
 	}
@@ -1242,8 +1239,7 @@ static int __init init_args(struct pgtable_debug_args *args)
 	 */
 	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
 	    has_transparent_pud_hugepage()) {
-		page = debug_vm_pgtable_alloc_huge_page(args,
-				HPAGE_PUD_SHIFT - PAGE_SHIFT);
+		page = debug_vm_pgtable_alloc_huge_page(args, HPAGE_PUD_ORDER);
 		if (page) {
 			args->pud_pfn = page_to_pfn(page);
 			args->pmd_pfn = args->pud_pfn;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a045d728ae0f..206397ed33a7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7248,7 +7248,6 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
 	}
 	return NULL;
 }
-#endif /* CONFIG_CONTIG_ALLOC */
 
 void free_contig_range(unsigned long pfn, unsigned long nr_pages)
 {
@@ -7275,6 +7274,7 @@ void free_contig_range(unsigned long pfn, unsigned long nr_pages)
 	WARN(count != 0, "%lu pages are still in use!\n", count);
 }
 EXPORT_SYMBOL(free_contig_range);
+#endif /* CONFIG_CONTIG_ALLOC */
 
 /*
  * Effectively disable pcplists for the zone by setting the high limit to 0
-- 
2.27.0




* [PATCH v5 2/6] mm: page_alloc: add __split_page()
  2025-12-30  7:24 [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
  2025-12-30  7:24 ` [PATCH v5 1/6] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
@ 2025-12-30  7:24 ` Kefeng Wang
  2026-01-02 18:55   ` Sid Kumar
  2025-12-30  7:24 ` [PATCH v5 3/6] mm: cma: kill cma_pages_valid() Kefeng Wang
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Kefeng Wang @ 2025-12-30  7:24 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song, linux-mm
  Cc: sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox, Kefeng Wang

Factor out the splitting of a non-compound page from make_alloc_exact()
and split_page() into a new helper function, __split_page().

While at it, convert the VM_BUG_ON_PAGE() into a VM_WARN_ON_PAGE().
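
As a rough userspace sketch of the factoring (illustrative struct and
fields, not kernel code): split_page() keeps the per-tail refcount
loop, while the metadata work it shares with make_alloc_exact() moves
into __split_page().

```c
#include <assert.h>

/* Toy page: refcount plus a flag standing in for the split metadata
 * (split_page_owner()/pgalloc_tag_split()/split_page_memcg()). */
struct page {
	int refcount;
	int split_metadata;
};

/* Shared helper: only the metadata split, no refcounting. */
static void __split_page(struct page *pages, unsigned int order)
{
	for (int i = 0; i < (1 << order); i++)
		pages[i].split_metadata = 1;
}

/* split_page() keeps the refcounting of the tail pages. */
static void split_page(struct page *pages, unsigned int order)
{
	for (int i = 1; i < (1 << order); i++)
		pages[i].refcount = 1;	/* set_page_refcounted() on tails */
	__split_page(pages, order);
}
```

A caller like make_alloc_exact() can then use __split_page() alone and
handle refcounts itself.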

Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Muchun Song <muchun.song@linux.dev>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/mmdebug.h | 10 ++++++++++
 mm/page_alloc.c         | 21 +++++++++++++--------
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
index 14a45979cccc..ab60ffba08f5 100644
--- a/include/linux/mmdebug.h
+++ b/include/linux/mmdebug.h
@@ -47,6 +47,15 @@ void vma_iter_dump_tree(const struct vma_iterator *vmi);
 			BUG();						\
 		}							\
 	} while (0)
+#define VM_WARN_ON_PAGE(cond, page)		({			\
+	int __ret_warn = !!(cond);					\
+									\
+	if (unlikely(__ret_warn)) {					\
+		dump_page(page, "VM_WARN_ON_PAGE(" __stringify(cond)")");\
+		WARN_ON(1);						\
+	}								\
+	unlikely(__ret_warn);						\
+})
 #define VM_WARN_ON_ONCE_PAGE(cond, page)	({			\
 	static bool __section(".data..once") __warned;			\
 	int __ret_warn_once = !!(cond);					\
@@ -122,6 +131,7 @@ void vma_iter_dump_tree(const struct vma_iterator *vmi);
 #define VM_BUG_ON_MM(cond, mm) VM_BUG_ON(cond)
 #define VM_WARN_ON(cond) BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_ONCE(cond) BUILD_BUG_ON_INVALID(cond)
+#define VM_WARN_ON_PAGE(cond, page)  BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_ONCE_PAGE(cond, page)  BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_FOLIO(cond, folio)  BUILD_BUG_ON_INVALID(cond)
 #define VM_WARN_ON_ONCE_FOLIO(cond, folio)  BUILD_BUG_ON_INVALID(cond)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 206397ed33a7..b9bfbb69537e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3080,6 +3080,15 @@ void free_unref_folios(struct folio_batch *folios)
 	folio_batch_reinit(folios);
 }
 
+static void __split_page(struct page *page, unsigned int order)
+{
+	VM_WARN_ON_PAGE(PageCompound(page), page);
+
+	split_page_owner(page, order, 0);
+	pgalloc_tag_split(page_folio(page), order, 0);
+	split_page_memcg(page, order);
+}
+
 /*
  * split_page takes a non-compound higher-order page, and splits it into
  * n (1<<order) sub-pages: page[0..n]
@@ -3092,14 +3101,12 @@ void split_page(struct page *page, unsigned int order)
 {
 	int i;
 
-	VM_BUG_ON_PAGE(PageCompound(page), page);
-	VM_BUG_ON_PAGE(!page_count(page), page);
+	VM_WARN_ON_PAGE(!page_count(page), page);
 
 	for (i = 1; i < (1 << order); i++)
 		set_page_refcounted(page + i);
-	split_page_owner(page, order, 0);
-	pgalloc_tag_split(page_folio(page), order, 0);
-	split_page_memcg(page, order);
+
+	__split_page(page, order);
 }
 EXPORT_SYMBOL_GPL(split_page);
 
@@ -5383,9 +5390,7 @@ static void *make_alloc_exact(unsigned long addr, unsigned int order,
 		struct page *page = virt_to_page((void *)addr);
 		struct page *last = page + nr;
 
-		split_page_owner(page, order, 0);
-		pgalloc_tag_split(page_folio(page), order, 0);
-		split_page_memcg(page, order);
+		__split_page(page, order);
 		while (page < --last)
 			set_page_refcounted(last);
 
-- 
2.27.0




* [PATCH v5 3/6] mm: cma: kill cma_pages_valid()
  2025-12-30  7:24 [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
  2025-12-30  7:24 ` [PATCH v5 1/6] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
  2025-12-30  7:24 ` [PATCH v5 2/6] mm: page_alloc: add __split_page() Kefeng Wang
@ 2025-12-30  7:24 ` Kefeng Wang
  2025-12-30  7:24 ` [PATCH v5 4/6] mm: page_alloc: add alloc_contig_frozen_{range,pages}() Kefeng Wang
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 30+ messages in thread
From: Kefeng Wang @ 2025-12-30  7:24 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song, linux-mm
  Cc: sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox, Kefeng Wang

Kill cma_pages_valid(), which is only used in cma_release(), and also
clean up the code duplication between the cma pages validity check and
the cma memrange lookup.
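
The unified lookup can be sketched like this (userspace model; the
hypothetical find_memrange() stands in for the logic now inlined in
cma_release()): locate the memrange containing pfn, then require the
whole [pfn, pfn + count) span to fit inside it.

```c
#include <assert.h>

/* Toy version of struct cma_memrange: a base pfn and a page count. */
struct cma_memrange {
	unsigned long base_pfn;
	unsigned long count;
};

/* Return the index of the range fully containing [pfn, pfn + count),
 * or -1 when no range matches or the span overflows the range. */
static int find_memrange(const struct cma_memrange *ranges, int nranges,
			 unsigned long pfn, unsigned long count)
{
	for (int r = 0; r < nranges; r++) {
		unsigned long end_pfn = ranges[r].base_pfn + ranges[r].count;

		if (pfn >= ranges[r].base_pfn && pfn < end_pfn)
			return pfn + count <= end_pfn ? r : -1;
	}
	return -1;	/* no range contains the start pfn */
}
```

One pass answers both questions the old code asked twice: which range,
and whether the pages are valid for release.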

Reviewed-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/cma.h |  1 -
 mm/cma.c            | 48 +++++++++++----------------------------------
 2 files changed, 11 insertions(+), 38 deletions(-)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index 62d9c1cf6326..e5745d2aec55 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -49,7 +49,6 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					struct cma **res_cma);
 extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
 			      bool no_warn);
-extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
 
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
diff --git a/mm/cma.c b/mm/cma.c
index 813e6dc7b095..fe3a9eaac4e5 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -942,36 +942,6 @@ struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
 	return page ? page_folio(page) : NULL;
 }
 
-bool cma_pages_valid(struct cma *cma, const struct page *pages,
-		     unsigned long count)
-{
-	unsigned long pfn, end;
-	int r;
-	struct cma_memrange *cmr;
-	bool ret;
-
-	if (!cma || !pages || count > cma->count)
-		return false;
-
-	pfn = page_to_pfn(pages);
-	ret = false;
-
-	for (r = 0; r < cma->nranges; r++) {
-		cmr = &cma->ranges[r];
-		end = cmr->base_pfn + cmr->count;
-		if (pfn >= cmr->base_pfn && pfn < end) {
-			ret = pfn + count <= end;
-			break;
-		}
-	}
-
-	if (!ret)
-		pr_debug("%s(page %p, count %lu)\n",
-				__func__, (void *)pages, count);
-
-	return ret;
-}
-
 /**
  * cma_release() - release allocated pages
  * @cma:   Contiguous memory region for which the allocation is performed.
@@ -991,23 +961,27 @@ bool cma_release(struct cma *cma, const struct page *pages,
 
 	pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count);
 
-	if (!cma_pages_valid(cma, pages, count))
+	if (!cma || !pages || count > cma->count)
 		return false;
 
 	pfn = page_to_pfn(pages);
-	end_pfn = pfn + count;
 
 	for (r = 0; r < cma->nranges; r++) {
 		cmr = &cma->ranges[r];
-		if (pfn >= cmr->base_pfn &&
-		    pfn < (cmr->base_pfn + cmr->count)) {
-			VM_BUG_ON(end_pfn > cmr->base_pfn + cmr->count);
-			break;
+		end_pfn = cmr->base_pfn + cmr->count;
+		if (pfn >= cmr->base_pfn && pfn < end_pfn) {
+			if (pfn + count <= end_pfn)
+				break;
+
+			VM_WARN_ON_ONCE(1);
 		}
 	}
 
-	if (r == cma->nranges)
+	if (r == cma->nranges) {
+		pr_debug("%s(page %p, count %lu, no cma range matches the page range)\n",
+			 __func__, (void *)pages, count);
 		return false;
+	}
 
 	free_contig_range(pfn, count);
 	cma_clear_bitmap(cma, cmr, pfn, count);
-- 
2.27.0




* [PATCH v5 4/6] mm: page_alloc: add alloc_contig_frozen_{range,pages}()
  2025-12-30  7:24 [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
                   ` (2 preceding siblings ...)
  2025-12-30  7:24 ` [PATCH v5 3/6] mm: cma: kill cma_pages_valid() Kefeng Wang
@ 2025-12-30  7:24 ` Kefeng Wang
  2025-12-31  2:57   ` Zi Yan
  2026-01-02 21:05   ` Sid Kumar
  2025-12-30  7:24 ` [PATCH v5 5/6] mm: cma: add cma_alloc_frozen{_compound}() Kefeng Wang
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 30+ messages in thread
From: Kefeng Wang @ 2025-12-30  7:24 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song, linux-mm
  Cc: sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox, Kefeng Wang

In order to allocate a given range of pages, or compound pages, without
incrementing their refcount, add two new helpers,
alloc_contig_frozen_{range,pages}(), which may be beneficial to some
users (e.g. hugetlb).

The new alloc_contig_{range,pages}() now only accept !__GFP_COMP gfp,
and free_contig_range() is refactored to only free non-compound pages;
the only caller that frees compound pages, in cma_free_folio(), is
changed accordingly. free_contig_frozen_range() is provided to match
alloc_contig_frozen_range() and is used to free frozen pages.
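
The set_pages_refcounted() helper this patch adds to mm/internal.h has
a small asymmetry worth spelling out: a compound allocation gets a
refcount only on its head page, while a non-compound range gets
refcount 1 on every page. A userspace sketch (illustrative types, not
the kernel's):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page: refcount plus a PageHead()-like flag. */
struct page {
	int refcount;
	bool head;
};

/* Mirror of set_pages_refcounted(): head only for compound pages,
 * every page for a plain contiguous range. */
static void set_pages_refcounted(struct page *pages, unsigned long nr)
{
	if (pages[0].head) {		/* PageHead(): compound allocation */
		pages[0].refcount = 1;	/* tail pages stay at refcount 0 */
		return;
	}

	for (unsigned long i = 0; i < nr; i++)
		pages[i].refcount = 1;
}
```

This is exactly what lets alloc_contig_{range,pages}() stay thin
wrappers: allocate frozen, then apply the refcounts in one place.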

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/gfp.h |  52 +++++--------
 mm/cma.c            |   9 ++-
 mm/hugetlb.c        |   9 ++-
 mm/internal.h       |  13 ++++
 mm/page_alloc.c     | 186 ++++++++++++++++++++++++++++++++------------
 5 files changed, 184 insertions(+), 85 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index ea053f1cfa16..aa45989f410d 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -430,40 +430,30 @@ typedef unsigned int __bitwise acr_flags_t;
 #define ACR_FLAGS_CMA ((__force acr_flags_t)BIT(0)) // allocate for CMA
 
 /* The below functions must be run on a range from a single zone. */
-extern int alloc_contig_range_noprof(unsigned long start, unsigned long end,
-				     acr_flags_t alloc_flags, gfp_t gfp_mask);
-#define alloc_contig_range(...)			alloc_hooks(alloc_contig_range_noprof(__VA_ARGS__))
-
-extern struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
-					      int nid, nodemask_t *nodemask);
-#define alloc_contig_pages(...)			alloc_hooks(alloc_contig_pages_noprof(__VA_ARGS__))
-
+int alloc_contig_frozen_range_noprof(unsigned long start, unsigned long end,
+		acr_flags_t alloc_flags, gfp_t gfp_mask);
+#define alloc_contig_frozen_range(...)	\
+	alloc_hooks(alloc_contig_frozen_range_noprof(__VA_ARGS__))
+
+int alloc_contig_range_noprof(unsigned long start, unsigned long end,
+		acr_flags_t alloc_flags, gfp_t gfp_mask);
+#define alloc_contig_range(...)	\
+	alloc_hooks(alloc_contig_range_noprof(__VA_ARGS__))
+
+struct page *alloc_contig_frozen_pages_noprof(unsigned long nr_pages,
+		gfp_t gfp_mask, int nid, nodemask_t *nodemask);
+#define alloc_contig_frozen_pages(...) \
+	alloc_hooks(alloc_contig_frozen_pages_noprof(__VA_ARGS__))
+
+struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
+		int nid, nodemask_t *nodemask);
+#define alloc_contig_pages(...)	\
+	alloc_hooks(alloc_contig_pages_noprof(__VA_ARGS__))
+
+void free_contig_frozen_range(unsigned long pfn, unsigned long nr_pages);
 void free_contig_range(unsigned long pfn, unsigned long nr_pages);
 #endif
 
-#ifdef CONFIG_CONTIG_ALLOC
-static inline struct folio *folio_alloc_gigantic_noprof(int order, gfp_t gfp,
-							int nid, nodemask_t *node)
-{
-	struct page *page;
-
-	if (WARN_ON(!order || !(gfp & __GFP_COMP)))
-		return NULL;
-
-	page = alloc_contig_pages_noprof(1 << order, gfp, nid, node);
-
-	return page ? page_folio(page) : NULL;
-}
-#else
-static inline struct folio *folio_alloc_gigantic_noprof(int order, gfp_t gfp,
-							int nid, nodemask_t *node)
-{
-	return NULL;
-}
-#endif
-/* This should be paired with folio_put() rather than free_contig_range(). */
-#define folio_alloc_gigantic(...) alloc_hooks(folio_alloc_gigantic_noprof(__VA_ARGS__))
-
 DEFINE_FREE(free_page, void *, free_page((unsigned long)_T))
 
 #endif /* __LINUX_GFP_H */
diff --git a/mm/cma.c b/mm/cma.c
index fe3a9eaac4e5..0e8c146424fb 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -836,7 +836,7 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
 		spin_unlock_irq(&cma->lock);
 
 		mutex_lock(&cma->alloc_mutex);
-		ret = alloc_contig_range(pfn, pfn + count, ACR_FLAGS_CMA, gfp);
+		ret = alloc_contig_frozen_range(pfn, pfn + count, ACR_FLAGS_CMA, gfp);
 		mutex_unlock(&cma->alloc_mutex);
 		if (!ret)
 			break;
@@ -904,6 +904,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 	trace_cma_alloc_finish(name, page ? page_to_pfn(page) : 0,
 			       page, count, align, ret);
 	if (page) {
+		set_pages_refcounted(page, count);
 		count_vm_event(CMA_ALLOC_SUCCESS);
 		cma_sysfs_account_success_pages(cma, count);
 	} else {
@@ -983,7 +984,11 @@ bool cma_release(struct cma *cma, const struct page *pages,
 		return false;
 	}
 
-	free_contig_range(pfn, count);
+	if (PageHead(pages))
+		__free_pages((struct page *)pages, compound_order(pages));
+	else
+		free_contig_range(pfn, count);
+
 	cma_clear_bitmap(cma, cmr, pfn, count);
 	cma_sysfs_account_release_pages(cma, count);
 	trace_cma_release(cma->name, pfn, pages, count);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a1832da0f623..c990e439c32e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1428,12 +1428,17 @@ static struct folio *alloc_gigantic_folio(int order, gfp_t gfp_mask,
 retry:
 	folio = hugetlb_cma_alloc_folio(order, gfp_mask, nid, nodemask);
 	if (!folio) {
+		struct page *page;
+
 		if (hugetlb_cma_exclusive_alloc())
 			return NULL;
 
-		folio = folio_alloc_gigantic(order, gfp_mask, nid, nodemask);
-		if (!folio)
+		page = alloc_contig_frozen_pages(1 << order, gfp_mask, nid, nodemask);
+		if (!page)
 			return NULL;
+
+		set_page_refcounted(page);
+		folio = page_folio(page);
 	}
 
 	if (folio_ref_freeze(folio, 1))
diff --git a/mm/internal.h b/mm/internal.h
index db4e97489f66..b8737c474412 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -513,6 +513,19 @@ static inline void set_page_refcounted(struct page *page)
 	set_page_count(page, 1);
 }
 
+static inline void set_pages_refcounted(struct page *page, unsigned long nr_pages)
+{
+	unsigned long pfn = page_to_pfn(page);
+
+	if (PageHead(page)) {
+		set_page_refcounted(page);
+		return;
+	}
+
+	for (; nr_pages--; pfn++)
+		set_page_refcounted(pfn_to_page(pfn));
+}
+
 /*
  * Return true if a folio needs ->release_folio() calling upon it.
  */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b9bfbb69537e..149f7b581b62 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6882,7 +6882,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 	return (ret < 0) ? ret : 0;
 }
 
-static void split_free_pages(struct list_head *list, gfp_t gfp_mask)
+static void split_free_frozen_pages(struct list_head *list, gfp_t gfp_mask)
 {
 	int order;
 
@@ -6894,11 +6894,10 @@ static void split_free_pages(struct list_head *list, gfp_t gfp_mask)
 			int i;
 
 			post_alloc_hook(page, order, gfp_mask);
-			set_page_refcounted(page);
 			if (!order)
 				continue;
 
-			split_page(page, order);
+			__split_page(page, order);
 
 			/* Add all subpages to the order-0 head, in sequence. */
 			list_del(&page->lru);
@@ -6942,8 +6941,14 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
 	return 0;
 }
 
+static void __free_contig_frozen_range(unsigned long pfn, unsigned long nr_pages)
+{
+	for (; nr_pages--; pfn++)
+		free_frozen_pages(pfn_to_page(pfn), 0);
+}
+
 /**
- * alloc_contig_range() -- tries to allocate given range of pages
+ * alloc_contig_frozen_range() -- tries to allocate given range of frozen pages
  * @start:	start PFN to allocate
  * @end:	one-past-the-last PFN to allocate
  * @alloc_flags:	allocation information
@@ -6958,12 +6963,15 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
  * pageblocks in the range.  Once isolated, the pageblocks should not
  * be modified by others.
  *
- * Return: zero on success or negative error code.  On success all
- * pages which PFN is in [start, end) are allocated for the caller and
- * need to be freed with free_contig_range().
+ * All frozen pages which PFN is in [start, end) are allocated for the
+ * caller, and they could be freed with free_contig_frozen_range(),
+ * free_frozen_pages() also could be used to free compound frozen pages
+ * directly.
+ *
+ * Return: zero on success or negative error code.
  */
-int alloc_contig_range_noprof(unsigned long start, unsigned long end,
-			      acr_flags_t alloc_flags, gfp_t gfp_mask)
+int alloc_contig_frozen_range_noprof(unsigned long start, unsigned long end,
+		acr_flags_t alloc_flags, gfp_t gfp_mask)
 {
 	const unsigned int order = ilog2(end - start);
 	unsigned long outer_start, outer_end;
@@ -7079,19 +7087,18 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 	}
 
 	if (!(gfp_mask & __GFP_COMP)) {
-		split_free_pages(cc.freepages, gfp_mask);
+		split_free_frozen_pages(cc.freepages, gfp_mask);
 
 		/* Free head and tail (if any) */
 		if (start != outer_start)
-			free_contig_range(outer_start, start - outer_start);
+			__free_contig_frozen_range(outer_start, start - outer_start);
 		if (end != outer_end)
-			free_contig_range(end, outer_end - end);
+			__free_contig_frozen_range(end, outer_end - end);
 	} else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
 		struct page *head = pfn_to_page(start);
 
 		check_new_pages(head, order);
 		prep_new_page(head, order, gfp_mask, 0);
-		set_page_refcounted(head);
 	} else {
 		ret = -EINVAL;
 		WARN(true, "PFN range: requested [%lu, %lu), allocated [%lu, %lu)\n",
@@ -7101,16 +7108,40 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 	undo_isolate_page_range(start, end);
 	return ret;
 }
-EXPORT_SYMBOL(alloc_contig_range_noprof);
+EXPORT_SYMBOL(alloc_contig_frozen_range_noprof);
 
-static int __alloc_contig_pages(unsigned long start_pfn,
-				unsigned long nr_pages, gfp_t gfp_mask)
+/**
+ * alloc_contig_range() -- tries to allocate given range of pages
+ * @start:	start PFN to allocate
+ * @end:	one-past-the-last PFN to allocate
+ * @alloc_flags:	allocation information
+ * @gfp_mask:	GFP mask.
+ *
+ * This routine is a wrapper around alloc_contig_frozen_range(), it can't
+ * be used to allocate compound pages, the refcount of each allocated page
+ * will be set to one.
+ *
+ * All pages which PFN is in [start, end) are allocated for the caller,
+ * and should be freed with free_contig_range() or by manually calling
+ * __free_page() on each allocated page.
+ *
+ * Return: zero on success or negative error code.
+ */
+int alloc_contig_range_noprof(unsigned long start, unsigned long end,
+			      acr_flags_t alloc_flags, gfp_t gfp_mask)
 {
-	unsigned long end_pfn = start_pfn + nr_pages;
+	int ret;
 
-	return alloc_contig_range_noprof(start_pfn, end_pfn, ACR_FLAGS_NONE,
-					 gfp_mask);
+	if (WARN_ON(gfp_mask & __GFP_COMP))
+		return -EINVAL;
+
+	ret = alloc_contig_frozen_range_noprof(start, end, alloc_flags, gfp_mask);
+	if (!ret)
+		set_pages_refcounted(pfn_to_page(start), end - start);
+
+	return ret;
 }
+EXPORT_SYMBOL(alloc_contig_range_noprof);
 
 static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
 				   unsigned long nr_pages, bool skip_hugetlb,
@@ -7179,7 +7210,7 @@ static bool zone_spans_last_pfn(const struct zone *zone,
 }
 
 /**
- * alloc_contig_pages() -- tries to find and allocate contiguous range of pages
+ * alloc_contig_frozen_pages() -- tries to find and allocate contiguous range of frozen pages
  * @nr_pages:	Number of contiguous pages to allocate
  * @gfp_mask:	GFP mask. Node/zone/placement hints limit the search; only some
  *		action and reclaim modifiers are supported. Reclaim modifiers
@@ -7187,22 +7218,25 @@ static bool zone_spans_last_pfn(const struct zone *zone,
  * @nid:	Target node
  * @nodemask:	Mask for other possible nodes
  *
- * This routine is a wrapper around alloc_contig_range(). It scans over zones
- * on an applicable zonelist to find a contiguous pfn range which can then be
- * tried for allocation with alloc_contig_range(). This routine is intended
- * for allocation requests which can not be fulfilled with the buddy allocator.
+ * This routine is a wrapper around alloc_contig_frozen_range(). It scans over
+ * zones on an applicable zonelist to find a contiguous pfn range which can then
+ * be tried for allocation with alloc_contig_frozen_range(). This routine is
+ * intended for allocation requests which can not be fulfilled with the buddy
+ * allocator.
  *
  * The allocated memory is always aligned to a page boundary. If nr_pages is a
  * power of two, then allocated range is also guaranteed to be aligned to same
  * nr_pages (e.g. 1GB request would be aligned to 1GB).
  *
- * Allocated pages can be freed with free_contig_range() or by manually calling
- * __free_page() on each allocated page.
+ * Allocated frozen pages need be freed with free_contig_frozen_range(),
+ * or by manually calling free_frozen_pages() on each allocated frozen
+ * non-compound page, for compound frozen pages could be freed with
+ * free_frozen_pages() directly.
  *
- * Return: pointer to contiguous pages on success, or NULL if not successful.
+ * Return: pointer to contiguous frozen pages on success, or NULL if not successful.
  */
-struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
-				 int nid, nodemask_t *nodemask)
+struct page *alloc_contig_frozen_pages_noprof(unsigned long nr_pages,
+		gfp_t gfp_mask, int nid, nodemask_t *nodemask)
 {
 	unsigned long ret, pfn, flags;
 	struct zonelist *zonelist;
@@ -7224,13 +7258,15 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
 						   &skipped_hugetlb)) {
 				/*
 				 * We release the zone lock here because
-				 * alloc_contig_range() will also lock the zone
-				 * at some point. If there's an allocation
-				 * spinning on this lock, it may win the race
-				 * and cause alloc_contig_range() to fail...
+				 * alloc_contig_frozen_range() will also lock
+				 * the zone at some point. If there's an
+				 * allocation spinning on this lock, it may
+				 * win the race and cause allocation to fail.
 				 */
 				spin_unlock_irqrestore(&zone->lock, flags);
-				ret = __alloc_contig_pages(pfn, nr_pages,
+				ret = alloc_contig_frozen_range_noprof(pfn,
+							pfn + nr_pages,
+							ACR_FLAGS_NONE,
 							gfp_mask);
 				if (!ret)
 					return pfn_to_page(pfn);
@@ -7253,30 +7289,80 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
 	}
 	return NULL;
 }
+EXPORT_SYMBOL(alloc_contig_frozen_pages_noprof);
 
-void free_contig_range(unsigned long pfn, unsigned long nr_pages)
+/**
+ * alloc_contig_pages() -- tries to find and allocate contiguous range of pages
+ * @nr_pages:	Number of contiguous pages to allocate
+ * @gfp_mask:	GFP mask.
+ * @nid:	Target node
+ * @nodemask:	Mask for other possible nodes
+ *
+ * This routine is a wrapper around alloc_contig_frozen_pages(), it can't
+ * be used to allocate compound pages, the refcount of each allocated page
+ * will be set to one.
+ *
+ * Allocated pages can be freed with free_contig_range() or by manually
+ * calling __free_page() on each allocated page.
+ *
+ * Return: pointer to contiguous pages on success, or NULL if not successful.
+ */
+struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
+		int nid, nodemask_t *nodemask)
 {
-	unsigned long count = 0;
-	struct folio *folio = pfn_folio(pfn);
+	struct page *page;
 
-	if (folio_test_large(folio)) {
-		int expected = folio_nr_pages(folio);
+	if (WARN_ON(gfp_mask & __GFP_COMP))
+		return NULL;
 
-		if (nr_pages == expected)
-			folio_put(folio);
-		else
-			WARN(true, "PFN %lu: nr_pages %lu != expected %d\n",
-			     pfn, nr_pages, expected);
+	page = alloc_contig_frozen_pages_noprof(nr_pages, gfp_mask, nid,
+						nodemask);
+	if (page)
+		set_pages_refcounted(page, nr_pages);
+
+	return page;
+}
+EXPORT_SYMBOL(alloc_contig_pages_noprof);
+
+/**
+ * free_contig_frozen_range() -- free the contiguous range of frozen pages
+ * @pfn:	start PFN to free
+ * @nr_pages:	Number of contiguous frozen pages to free
+ *
+ * This can be used to free allocated compound or non-compound frozen pages.
+ */
+void free_contig_frozen_range(unsigned long pfn, unsigned long nr_pages)
+{
+	struct page *first_page = pfn_to_page(pfn);
+	const unsigned int order = ilog2(nr_pages);
+
+	if (WARN_ON_ONCE(first_page != compound_head(first_page)))
+		return;
+
+	if (PageHead(first_page)) {
+		WARN_ON_ONCE(order != compound_order(first_page));
+		free_frozen_pages(first_page, order);
 		return;
 	}
 
-	for (; nr_pages--; pfn++) {
-		struct page *page = pfn_to_page(pfn);
+	__free_contig_frozen_range(pfn, nr_pages);
+}
+EXPORT_SYMBOL(free_contig_frozen_range);
+
+/**
+ * free_contig_range() -- free the contiguous range of pages
+ * @pfn:	start PFN to free
+ * @nr_pages:	Number of contiguous pages to free
+ *
+ * This can only be used to free allocated non-compound pages.
+ */
+void free_contig_range(unsigned long pfn, unsigned long nr_pages)
+{
+	if (WARN_ON_ONCE(PageHead(pfn_to_page(pfn))))
+		return;
 
-		count += page_count(page) != 1;
-		__free_page(page);
-	}
-	WARN(count != 0, "%lu pages are still in use!\n", count);
+	for (; nr_pages--; pfn++)
+		__free_page(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL(free_contig_range);
 #endif /* CONFIG_CONTIG_ALLOC */
-- 
2.27.0



^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v5 5/6] mm: cma: add cma_alloc_frozen{_compound}()
  2025-12-30  7:24 [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
                   ` (3 preceding siblings ...)
  2025-12-30  7:24 ` [PATCH v5 4/6] mm: page_alloc: add alloc_contig_frozen_{range,pages}() Kefeng Wang
@ 2025-12-30  7:24 ` Kefeng Wang
  2025-12-31  2:59   ` Zi Yan
  2026-01-08  4:19   ` Dmitry Baryshkov
  2025-12-30  7:24 ` [PATCH v5 6/6] mm: hugetlb: allocate frozen pages for gigantic allocation Kefeng Wang
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 30+ messages in thread
From: Kefeng Wang @ 2025-12-30  7:24 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song, linux-mm
  Cc: sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox, Kefeng Wang

Introduce the cma_alloc_frozen{_compound}() helpers to allocate pages
without incrementing their refcount, then convert hugetlb CMA to use
cma_alloc_frozen_compound() and cma_release_frozen() and remove the
now-unused cma_{alloc,free}_folio(). Also move the cma_validate_zones()
declaration into mm/internal.h since it has no outside users.

After the above changes, set_pages_refcounted() is only called on
non-compound pages, so remove its PageHead handling.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/cma.h |  26 +++--------
 mm/cma.c            | 107 +++++++++++++++++++++++++++++---------------
 mm/hugetlb_cma.c    |  24 +++++-----
 mm/internal.h       |  10 ++---
 4 files changed, 97 insertions(+), 70 deletions(-)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index e5745d2aec55..e2a690f7e77e 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -51,29 +51,15 @@ extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int
 			      bool no_warn);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
 
+struct page *cma_alloc_frozen(struct cma *cma, unsigned long count,
+		unsigned int align, bool no_warn);
+struct page *cma_alloc_frozen_compound(struct cma *cma, unsigned int order);
+bool cma_release_frozen(struct cma *cma, const struct page *pages,
+		unsigned long count);
+
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
 extern bool cma_intersects(struct cma *cma, unsigned long start, unsigned long end);
 
 extern void cma_reserve_pages_on_error(struct cma *cma);
 
-#ifdef CONFIG_CMA
-struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
-bool cma_free_folio(struct cma *cma, const struct folio *folio);
-bool cma_validate_zones(struct cma *cma);
-#else
-static inline struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
-{
-	return NULL;
-}
-
-static inline bool cma_free_folio(struct cma *cma, const struct folio *folio)
-{
-	return false;
-}
-static inline bool cma_validate_zones(struct cma *cma)
-{
-	return false;
-}
-#endif
-
 #endif
diff --git a/mm/cma.c b/mm/cma.c
index 0e8c146424fb..5713becc602b 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -856,8 +856,8 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
 	return ret;
 }
 
-static struct page *__cma_alloc(struct cma *cma, unsigned long count,
-		       unsigned int align, gfp_t gfp)
+static struct page *__cma_alloc_frozen(struct cma *cma,
+		unsigned long count, unsigned int align, gfp_t gfp)
 {
 	struct page *page = NULL;
 	int ret = -ENOMEM, r;
@@ -904,7 +904,6 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 	trace_cma_alloc_finish(name, page ? page_to_pfn(page) : 0,
 			       page, count, align, ret);
 	if (page) {
-		set_pages_refcounted(page, count);
 		count_vm_event(CMA_ALLOC_SUCCESS);
 		cma_sysfs_account_success_pages(cma, count);
 	} else {
@@ -915,6 +914,21 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 	return page;
 }
 
+struct page *cma_alloc_frozen(struct cma *cma, unsigned long count,
+		unsigned int align, bool no_warn)
+{
+	gfp_t gfp = GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0);
+
+	return __cma_alloc_frozen(cma, count, align, gfp);
+}
+
+struct page *cma_alloc_frozen_compound(struct cma *cma, unsigned int order)
+{
+	gfp_t gfp = GFP_KERNEL | __GFP_COMP | __GFP_NOWARN;
+
+	return __cma_alloc_frozen(cma, 1 << order, order, gfp);
+}
+
 /**
  * cma_alloc() - allocate pages from contiguous area
  * @cma:   Contiguous memory region for which the allocation is performed.
@@ -927,43 +941,27 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
  */
 struct page *cma_alloc(struct cma *cma, unsigned long count,
 		       unsigned int align, bool no_warn)
-{
-	return __cma_alloc(cma, count, align, GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
-}
-
-struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
 {
 	struct page *page;
 
-	if (WARN_ON(!order || !(gfp & __GFP_COMP)))
-		return NULL;
-
-	page = __cma_alloc(cma, 1 << order, order, gfp);
+	page = cma_alloc_frozen(cma, count, align, no_warn);
+	if (page)
+		set_pages_refcounted(page, count);
 
-	return page ? page_folio(page) : NULL;
+	return page;
 }
 
-/**
- * cma_release() - release allocated pages
- * @cma:   Contiguous memory region for which the allocation is performed.
- * @pages: Allocated pages.
- * @count: Number of allocated pages.
- *
- * This function releases memory allocated by cma_alloc().
- * It returns false when provided pages do not belong to contiguous area and
- * true otherwise.
- */
-bool cma_release(struct cma *cma, const struct page *pages,
-		 unsigned long count)
+static struct cma_memrange *find_cma_memrange(struct cma *cma,
+		const struct page *pages, unsigned long count)
 {
-	struct cma_memrange *cmr;
+	struct cma_memrange *cmr = NULL;
 	unsigned long pfn, end_pfn;
 	int r;
 
 	pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count);
 
 	if (!cma || !pages || count > cma->count)
-		return false;
+		return NULL;
 
 	pfn = page_to_pfn(pages);
 
@@ -981,27 +979,66 @@ bool cma_release(struct cma *cma, const struct page *pages,
 	if (r == cma->nranges) {
 		pr_debug("%s(page %p, count %lu, no cma range matches the page range)\n",
 			 __func__, (void *)pages, count);
-		return false;
+		return NULL;
 	}
 
-	if (PageHead(pages))
-		__free_pages((struct page *)pages, compound_order(pages));
-	else
-		free_contig_range(pfn, count);
+	return cmr;
+}
+
+static void __cma_release_frozen(struct cma *cma, struct cma_memrange *cmr,
+		const struct page *pages, unsigned long count)
+{
+	unsigned long pfn = page_to_pfn(pages);
+
+	pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count);
 
+	free_contig_frozen_range(pfn, count);
 	cma_clear_bitmap(cma, cmr, pfn, count);
 	cma_sysfs_account_release_pages(cma, count);
 	trace_cma_release(cma->name, pfn, pages, count);
+}
+
+/**
+ * cma_release() - release allocated pages
+ * @cma:   Contiguous memory region for which the allocation is performed.
+ * @pages: Allocated pages.
+ * @count: Number of allocated pages.
+ *
+ * This function releases memory allocated by cma_alloc().
+ * It returns false when provided pages do not belong to contiguous area and
+ * true otherwise.
+ */
+bool cma_release(struct cma *cma, const struct page *pages,
+		 unsigned long count)
+{
+	struct cma_memrange *cmr;
+	unsigned long pfn, i;
+
+	cmr = find_cma_memrange(cma, pages, count);
+	if (!cmr)
+		return false;
+
+	pfn = page_to_pfn(pages);
+	for (i = 0; i < count; i++, pfn++)
+		VM_WARN_ON(!put_page_testzero(pfn_to_page(pfn)));
+
+	__cma_release_frozen(cma, cmr, pages, count);
 
 	return true;
 }
 
-bool cma_free_folio(struct cma *cma, const struct folio *folio)
+bool cma_release_frozen(struct cma *cma, const struct page *pages,
+		unsigned long count)
 {
-	if (WARN_ON(!folio_test_large(folio)))
+	struct cma_memrange *cmr;
+
+	cmr = find_cma_memrange(cma, pages, count);
+	if (!cmr)
 		return false;
 
-	return cma_release(cma, &folio->page, folio_nr_pages(folio));
+	__cma_release_frozen(cma, cmr, pages, count);
+
+	return true;
 }
 
 int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
diff --git a/mm/hugetlb_cma.c b/mm/hugetlb_cma.c
index e8e4dc7182d5..9469a7bd673f 100644
--- a/mm/hugetlb_cma.c
+++ b/mm/hugetlb_cma.c
@@ -20,35 +20,39 @@ static unsigned long hugetlb_cma_size __initdata;
 
 void hugetlb_cma_free_folio(struct folio *folio)
 {
-	int nid = folio_nid(folio);
+	folio_ref_dec(folio);
 
-	WARN_ON_ONCE(!cma_free_folio(hugetlb_cma[nid], folio));
+	WARN_ON_ONCE(!cma_release_frozen(hugetlb_cma[folio_nid(folio)],
+					 &folio->page, folio_nr_pages(folio)));
 }
 
-
 struct folio *hugetlb_cma_alloc_folio(int order, gfp_t gfp_mask,
 				      int nid, nodemask_t *nodemask)
 {
 	int node;
-	struct folio *folio = NULL;
+	struct folio *folio;
+	struct page *page = NULL;
 
 	if (hugetlb_cma[nid])
-		folio = cma_alloc_folio(hugetlb_cma[nid], order, gfp_mask);
+		page = cma_alloc_frozen_compound(hugetlb_cma[nid], order);
 
-	if (!folio && !(gfp_mask & __GFP_THISNODE)) {
+	if (!page && !(gfp_mask & __GFP_THISNODE)) {
 		for_each_node_mask(node, *nodemask) {
 			if (node == nid || !hugetlb_cma[node])
 				continue;
 
-			folio = cma_alloc_folio(hugetlb_cma[node], order, gfp_mask);
-			if (folio)
+			page = cma_alloc_frozen_compound(hugetlb_cma[node], order);
+			if (page)
 				break;
 		}
 	}
 
-	if (folio)
-		folio_set_hugetlb_cma(folio);
+	if (!page)
+		return NULL;
 
+	set_page_refcounted(page);
+	folio = page_folio(page);
+	folio_set_hugetlb_cma(folio);
 	return folio;
 }
 
diff --git a/mm/internal.h b/mm/internal.h
index b8737c474412..ae7bd87f97b1 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -517,11 +517,6 @@ static inline void set_pages_refcounted(struct page *page, unsigned long nr_page
 {
 	unsigned long pfn = page_to_pfn(page);
 
-	if (PageHead(page)) {
-		set_page_refcounted(page);
-		return;
-	}
-
 	for (; nr_pages--; pfn++)
 		set_page_refcounted(pfn_to_page(pfn));
 }
@@ -949,9 +944,14 @@ void init_cma_reserved_pageblock(struct page *page);
 struct cma;
 
 #ifdef CONFIG_CMA
+bool cma_validate_zones(struct cma *cma);
 void *cma_reserve_early(struct cma *cma, unsigned long size);
 void init_cma_pageblock(struct page *page);
 #else
+static inline bool cma_validate_zones(struct cma *cma)
+{
+	return false;
+}
 static inline void *cma_reserve_early(struct cma *cma, unsigned long size)
 {
 	return NULL;
-- 
2.27.0




* [PATCH v5 6/6] mm: hugetlb: allocate frozen pages for gigantic allocation
  2025-12-30  7:24 [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
                   ` (4 preceding siblings ...)
  2025-12-30  7:24 ` [PATCH v5 5/6] mm: cma: add cma_alloc_frozen{_compound}() Kefeng Wang
@ 2025-12-30  7:24 ` Kefeng Wang
  2025-12-31  2:50   ` Muchun Song
  2025-12-31  3:00   ` Zi Yan
  2025-12-30 18:17 ` [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Andrew Morton
  2026-01-07 17:31 ` Claudiu Beznea
  7 siblings, 2 replies; 30+ messages in thread
From: Kefeng Wang @ 2025-12-30  7:24 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song, linux-mm
  Cc: sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox, Kefeng Wang

alloc_gigantic_folio() allocates a folio with its refcount incremented
and then freezes it. Convert it to allocate a frozen folio directly,
which removes the atomic operations on the folio refcount at allocation
time and saves them during __update_and_free_hugetlb_folio() too.

Besides, rename hugetlb_cma_{alloc,free}_folio(), alloc_gigantic_folio()
and alloc_buddy_hugetlb_folio() to include "frozen", which makes them
more self-explanatory.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/hugetlb.c     | 75 +++++++++++++-----------------------------------
 mm/hugetlb_cma.c |  9 ++----
 mm/hugetlb_cma.h | 10 +++----
 3 files changed, 28 insertions(+), 66 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c990e439c32e..cb296c6912f6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -121,16 +121,6 @@ static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
 		unsigned long start, unsigned long end, bool take_locks);
 static struct resv_map *vma_resv_map(struct vm_area_struct *vma);
 
-static void hugetlb_free_folio(struct folio *folio)
-{
-	if (folio_test_hugetlb_cma(folio)) {
-		hugetlb_cma_free_folio(folio);
-		return;
-	}
-
-	folio_put(folio);
-}
-
 static inline bool subpool_is_free(struct hugepage_subpool *spool)
 {
 	if (spool->count)
@@ -1417,52 +1407,25 @@ static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
 	return NULL;
 }
 
-#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-#ifdef CONFIG_CONTIG_ALLOC
-static struct folio *alloc_gigantic_folio(int order, gfp_t gfp_mask,
+#if defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE) && defined(CONFIG_CONTIG_ALLOC)
+static struct folio *alloc_gigantic_frozen_folio(int order, gfp_t gfp_mask,
 		int nid, nodemask_t *nodemask)
 {
 	struct folio *folio;
-	bool retried = false;
 
-retry:
-	folio = hugetlb_cma_alloc_folio(order, gfp_mask, nid, nodemask);
-	if (!folio) {
-		struct page *page;
-
-		if (hugetlb_cma_exclusive_alloc())
-			return NULL;
-
-		page = alloc_contig_frozen_pages(1 << order, gfp_mask, nid, nodemask);
-		if (!page)
-			return NULL;
-
-		set_page_refcounted(page);
-		folio = page_folio(page);
-	}
-
-	if (folio_ref_freeze(folio, 1))
+	folio = hugetlb_cma_alloc_frozen_folio(order, gfp_mask, nid, nodemask);
+	if (folio)
 		return folio;
 
-	pr_warn("HugeTLB: unexpected refcount on PFN %lu\n", folio_pfn(folio));
-	hugetlb_free_folio(folio);
-	if (!retried) {
-		retried = true;
-		goto retry;
-	}
-	return NULL;
-}
+	if (hugetlb_cma_exclusive_alloc())
+		return NULL;
 
-#else /* !CONFIG_CONTIG_ALLOC */
-static struct folio *alloc_gigantic_folio(int order, gfp_t gfp_mask, int nid,
-					  nodemask_t *nodemask)
-{
-	return NULL;
+	folio = (struct folio *)alloc_contig_frozen_pages(1 << order, gfp_mask,
+							  nid, nodemask);
+	return folio;
 }
-#endif /* CONFIG_CONTIG_ALLOC */
-
-#else /* !CONFIG_ARCH_HAS_GIGANTIC_PAGE */
-static struct folio *alloc_gigantic_folio(int order, gfp_t gfp_mask, int nid,
+#else /* !CONFIG_ARCH_HAS_GIGANTIC_PAGE || !CONFIG_CONTIG_ALLOC */
+static struct folio *alloc_gigantic_frozen_folio(int order, gfp_t gfp_mask, int nid,
 					  nodemask_t *nodemask)
 {
 	return NULL;
@@ -1592,9 +1555,11 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
 	if (unlikely(folio_test_hwpoison(folio)))
 		folio_clear_hugetlb_hwpoison(folio);
 
-	folio_ref_unfreeze(folio, 1);
-
-	hugetlb_free_folio(folio);
+	VM_BUG_ON_FOLIO(folio_ref_count(folio), folio);
+	if (folio_test_hugetlb_cma(folio))
+		hugetlb_cma_free_frozen_folio(folio);
+	else
+		free_frozen_pages(&folio->page, folio_order(folio));
 }
 
 /*
@@ -1874,7 +1839,7 @@ struct address_space *hugetlb_folio_mapping_lock_write(struct folio *folio)
 	return NULL;
 }
 
-static struct folio *alloc_buddy_hugetlb_folio(int order, gfp_t gfp_mask,
+static struct folio *alloc_buddy_frozen_folio(int order, gfp_t gfp_mask,
 		int nid, nodemask_t *nmask, nodemask_t *node_alloc_noretry)
 {
 	struct folio *folio;
@@ -1930,10 +1895,10 @@ static struct folio *only_alloc_fresh_hugetlb_folio(struct hstate *h,
 		nid = numa_mem_id();
 
 	if (order_is_gigantic(order))
-		folio = alloc_gigantic_folio(order, gfp_mask, nid, nmask);
+		folio = alloc_gigantic_frozen_folio(order, gfp_mask, nid, nmask);
 	else
-		folio = alloc_buddy_hugetlb_folio(order, gfp_mask, nid, nmask,
-						  node_alloc_noretry);
+		folio = alloc_buddy_frozen_folio(order, gfp_mask, nid, nmask,
+						 node_alloc_noretry);
 	if (folio)
 		init_new_hugetlb_folio(folio);
 	return folio;
diff --git a/mm/hugetlb_cma.c b/mm/hugetlb_cma.c
index 9469a7bd673f..e9f63648df6d 100644
--- a/mm/hugetlb_cma.c
+++ b/mm/hugetlb_cma.c
@@ -18,16 +18,14 @@ static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
 static bool hugetlb_cma_only;
 static unsigned long hugetlb_cma_size __initdata;
 
-void hugetlb_cma_free_folio(struct folio *folio)
+void hugetlb_cma_free_frozen_folio(struct folio *folio)
 {
-	folio_ref_dec(folio);
-
 	WARN_ON_ONCE(!cma_release_frozen(hugetlb_cma[folio_nid(folio)],
 					 &folio->page, folio_nr_pages(folio)));
 }
 
-struct folio *hugetlb_cma_alloc_folio(int order, gfp_t gfp_mask,
-				      int nid, nodemask_t *nodemask)
+struct folio *hugetlb_cma_alloc_frozen_folio(int order, gfp_t gfp_mask,
+		int nid, nodemask_t *nodemask)
 {
 	int node;
 	struct folio *folio;
@@ -50,7 +48,6 @@ struct folio *hugetlb_cma_alloc_folio(int order, gfp_t gfp_mask,
 	if (!page)
 		return NULL;
 
-	set_page_refcounted(page);
 	folio = page_folio(page);
 	folio_set_hugetlb_cma(folio);
 	return folio;
diff --git a/mm/hugetlb_cma.h b/mm/hugetlb_cma.h
index 2c2ec8a7e134..3bc295c8c38e 100644
--- a/mm/hugetlb_cma.h
+++ b/mm/hugetlb_cma.h
@@ -3,8 +3,8 @@
 #define _LINUX_HUGETLB_CMA_H
 
 #ifdef CONFIG_CMA
-void hugetlb_cma_free_folio(struct folio *folio);
-struct folio *hugetlb_cma_alloc_folio(int order, gfp_t gfp_mask,
+void hugetlb_cma_free_frozen_folio(struct folio *folio);
+struct folio *hugetlb_cma_alloc_frozen_folio(int order, gfp_t gfp_mask,
 				      int nid, nodemask_t *nodemask);
 struct huge_bootmem_page *hugetlb_cma_alloc_bootmem(struct hstate *h, int *nid,
 						    bool node_exact);
@@ -14,12 +14,12 @@ unsigned long hugetlb_cma_total_size(void);
 void hugetlb_cma_validate_params(void);
 bool hugetlb_early_cma(struct hstate *h);
 #else
-static inline void hugetlb_cma_free_folio(struct folio *folio)
+static inline void hugetlb_cma_free_frozen_folio(struct folio *folio)
 {
 }
 
-static inline struct folio *hugetlb_cma_alloc_folio(int order, gfp_t gfp_mask,
-		int nid, nodemask_t *nodemask)
+static inline struct folio *hugetlb_cma_alloc_frozen_folio(int order,
+		gfp_t gfp_mask,	int nid, nodemask_t *nodemask)
 {
 	return NULL;
 }
-- 
2.27.0




* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2025-12-30  7:24 [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
                   ` (5 preceding siblings ...)
  2025-12-30  7:24 ` [PATCH v5 6/6] mm: hugetlb: allocate frozen pages for gigantic allocation Kefeng Wang
@ 2025-12-30 18:17 ` Andrew Morton
  2026-01-07 17:31 ` Claudiu Beznea
  7 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2025-12-30 18:17 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: David Hildenbrand, Oscar Salvador, Muchun Song, linux-mm,
	sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox

On Tue, 30 Dec 2025 15:24:16 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:

> Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
> which avoid atomic operation about page refcount, and then convert to
> allocate frozen gigantic folio by the new helpers in hugetlb to cleanup
> the alloc_gigantic_folio().

Thanks, I queued this for testing along with a note that review input
on patches 4, 5 and 6 is desired.



* Re: [PATCH v5 6/6] mm: hugetlb: allocate frozen pages for gigantic allocation
  2025-12-30  7:24 ` [PATCH v5 6/6] mm: hugetlb: allocate frozen pages for gigantic allocation Kefeng Wang
@ 2025-12-31  2:50   ` Muchun Song
  2025-12-31  3:00   ` Zi Yan
  1 sibling, 0 replies; 30+ messages in thread
From: Muchun Song @ 2025-12-31  2:50 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Andrew Morton, David Hildenbrand, Oscar Salvador, linux-mm,
	sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox



> On Dec 30, 2025, at 15:24, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> 
> The alloc_gigantic_folio() allocates a folio with refcount increated
> and then freeze it, convert to allocate a frozen folio to remove the
> atomic operation about folio refcount, and saving atomic operation
> during __update_and_free_hugetlb_folio() too.
> 
> Besides, rename hugetlb_cma_{alloc,free}_folio(), alloc_gigantic_folio()
> and alloc_buddy_hugetlb_folio() with frozen which make them more
> self-explanatory.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Reviewed-by: Muchun Song <muchun.song@linux.dev>

Thanks.




* Re: [PATCH v5 4/6] mm: page_alloc: add alloc_contig_frozen_{range,pages}()
  2025-12-30  7:24 ` [PATCH v5 4/6] mm: page_alloc: add alloc_contig_frozen_{range,pages}() Kefeng Wang
@ 2025-12-31  2:57   ` Zi Yan
  2026-01-02 21:05   ` Sid Kumar
  1 sibling, 0 replies; 30+ messages in thread
From: Zi Yan @ 2025-12-31  2:57 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
	linux-mm, sidhartha.kumar, jane.chu, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox

On 30 Dec 2025, at 2:24, Kefeng Wang wrote:

> In order to allocate given range of pages or allocate compound
> pages without incrementing their refcount, adding two new helper
> alloc_contig_frozen_{range,pages}() which may be beneficial
> to some users (eg hugetlb).
>
> The new alloc_contig_{range,pages} only take !__GFP_COMP gfp now,
> and the free_contig_range() is refactored to only free non-compound
> pages, the only caller to free compound pages in cma_free_folio() is
> changed accordingly, and the free_contig_frozen_range() is provided
> to match the alloc_contig_frozen_range(), which is used to free
> frozen pages.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  include/linux/gfp.h |  52 +++++--------
>  mm/cma.c            |   9 ++-
>  mm/hugetlb.c        |   9 ++-
>  mm/internal.h       |  13 ++++
>  mm/page_alloc.c     | 186 ++++++++++++++++++++++++++++++++------------
>  5 files changed, 184 insertions(+), 85 deletions(-)
>
LGTM. Thanks.

Reviewed-by: Zi Yan <ziy@nvidia.com>

Best Regards,
Yan, Zi



* Re: [PATCH v5 5/6] mm: cma: add cma_alloc_frozen{_compound}()
  2025-12-30  7:24 ` [PATCH v5 5/6] mm: cma: add cma_alloc_frozen{_compound}() Kefeng Wang
@ 2025-12-31  2:59   ` Zi Yan
  2026-01-08  4:19   ` Dmitry Baryshkov
  1 sibling, 0 replies; 30+ messages in thread
From: Zi Yan @ 2025-12-31  2:59 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
	linux-mm, sidhartha.kumar, jane.chu, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox

On 30 Dec 2025, at 2:24, Kefeng Wang wrote:

> Introduce cma_alloc_frozen{_compound}() helper to alloc pages without
> incrementing their refcount, then convert hugetlb cma to use the
> cma_alloc_frozen_compound() and cma_release_frozen() and remove the
> unused cma_{alloc,free}_folio(), also move the cma_validate_zones()
> into mm/internal.h since no outside user.
>
> The set_pages_refcounted() is only called to set non-compound pages
> after above changes, so remove the processing about PageHead.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  include/linux/cma.h |  26 +++--------
>  mm/cma.c            | 107 +++++++++++++++++++++++++++++---------------
>  mm/hugetlb_cma.c    |  24 +++++-----
>  mm/internal.h       |  10 ++---
>  4 files changed, 97 insertions(+), 70 deletions(-)
>
LGTM. Thanks.

Reviewed-by: Zi Yan <ziy@nvidia.com>

Best Regards,
Yan, Zi



* Re: [PATCH v5 6/6] mm: hugetlb: allocate frozen pages for gigantic allocation
  2025-12-30  7:24 ` [PATCH v5 6/6] mm: hugetlb: allocate frozen pages for gigantic allocation Kefeng Wang
  2025-12-31  2:50   ` Muchun Song
@ 2025-12-31  3:00   ` Zi Yan
  1 sibling, 0 replies; 30+ messages in thread
From: Zi Yan @ 2025-12-31  3:00 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
	linux-mm, sidhartha.kumar, jane.chu, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox

On 30 Dec 2025, at 2:24, Kefeng Wang wrote:

> The alloc_gigantic_folio() allocates a folio with refcount increated
> and then freeze it, convert to allocate a frozen folio to remove the
> atomic operation about folio refcount, and saving atomic operation
> during __update_and_free_hugetlb_folio() too.
>
> Besides, rename hugetlb_cma_{alloc,free}_folio(), alloc_gigantic_folio()
> and alloc_buddy_hugetlb_folio() with frozen which make them more
> self-explanatory.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  mm/hugetlb.c     | 75 +++++++++++++-----------------------------------
>  mm/hugetlb_cma.c |  9 ++----
>  mm/hugetlb_cma.h | 10 +++----
>  3 files changed, 28 insertions(+), 66 deletions(-)
>
LGTM. Thanks.

Reviewed-by: Zi Yan <ziy@nvidia.com>

Best Regards,
Yan, Zi



* Re: [PATCH v5 1/6] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page()
  2025-12-30  7:24 ` [PATCH v5 1/6] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
@ 2026-01-02 18:51   ` Sid Kumar
  0 siblings, 0 replies; 30+ messages in thread
From: Sid Kumar @ 2026-01-02 18:51 UTC (permalink / raw)
  To: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
	Muchun Song, linux-mm
  Cc: jane.chu, Zi Yan, Vlastimil Babka, Brendan Jackman,
	Johannes Weiner, Matthew Wilcox


On 12/30/25 1:24 AM, Kefeng Wang wrote:
> Add a new helper to free huge page to be consistency to
> debug_vm_pgtable_alloc_huge_page(), and use HPAGE_PUD_ORDER
> instead of open-code.
>
> Also move the free_contig_range() under CONFIG_ALLOC_CONTIG
> since all caller are built with CONFIG_ALLOC_CONTIG.
>
> Acked-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Muchun Song <muchun.song@linux.dev>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>   include/linux/gfp.h   |  2 +-
>   mm/debug_vm_pgtable.c | 38 +++++++++++++++++---------------------
>   mm/page_alloc.c       |  2 +-
>   3 files changed, 19 insertions(+), 23 deletions(-)

Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>

> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index b155929af5b1..ea053f1cfa16 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -438,8 +438,8 @@ extern struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_
>   					      int nid, nodemask_t *nodemask);
>   #define alloc_contig_pages(...)			alloc_hooks(alloc_contig_pages_noprof(__VA_ARGS__))
>   
> -#endif
>   void free_contig_range(unsigned long pfn, unsigned long nr_pages);
> +#endif
>   
>   #ifdef CONFIG_CONTIG_ALLOC
>   static inline struct folio *folio_alloc_gigantic_noprof(int order, gfp_t gfp,
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index ae9b9310d96f..83cf07269f13 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -971,22 +971,26 @@ static unsigned long __init get_random_vaddr(void)
>   	return random_vaddr;
>   }
>   
> -static void __init destroy_args(struct pgtable_debug_args *args)
> +static void __init
> +debug_vm_pgtable_free_huge_page(struct pgtable_debug_args *args,
> +		unsigned long pfn, int order)
>   {
> -	struct page *page = NULL;
> +#ifdef CONFIG_CONTIG_ALLOC
> +	if (args->is_contiguous_page) {
> +		free_contig_range(pfn, 1 << order);
> +		return;
> +	}
> +#endif
> +	__free_pages(pfn_to_page(pfn), order);
> +}
>   
> +static void __init destroy_args(struct pgtable_debug_args *args)
> +{
>   	/* Free (huge) page */
>   	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
>   	    has_transparent_pud_hugepage() &&
>   	    args->pud_pfn != ULONG_MAX) {
> -		if (args->is_contiguous_page) {
> -			free_contig_range(args->pud_pfn,
> -					  (1 << (HPAGE_PUD_SHIFT - PAGE_SHIFT)));
> -		} else {
> -			page = pfn_to_page(args->pud_pfn);
> -			__free_pages(page, HPAGE_PUD_SHIFT - PAGE_SHIFT);
> -		}
> -
> +		debug_vm_pgtable_free_huge_page(args, args->pud_pfn, HPAGE_PUD_ORDER);
>   		args->pud_pfn = ULONG_MAX;
>   		args->pmd_pfn = ULONG_MAX;
>   		args->pte_pfn = ULONG_MAX;
> @@ -995,20 +999,13 @@ static void __init destroy_args(struct pgtable_debug_args *args)
>   	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
>   	    has_transparent_hugepage() &&
>   	    args->pmd_pfn != ULONG_MAX) {
> -		if (args->is_contiguous_page) {
> -			free_contig_range(args->pmd_pfn, (1 << HPAGE_PMD_ORDER));
> -		} else {
> -			page = pfn_to_page(args->pmd_pfn);
> -			__free_pages(page, HPAGE_PMD_ORDER);
> -		}
> -
> +		debug_vm_pgtable_free_huge_page(args, args->pmd_pfn, HPAGE_PMD_ORDER);
>   		args->pmd_pfn = ULONG_MAX;
>   		args->pte_pfn = ULONG_MAX;
>   	}
>   
>   	if (args->pte_pfn != ULONG_MAX) {
> -		page = pfn_to_page(args->pte_pfn);
> -		__free_page(page);
> +		__free_page(pfn_to_page(args->pte_pfn));
>   
>   		args->pte_pfn = ULONG_MAX;
>   	}
> @@ -1242,8 +1239,7 @@ static int __init init_args(struct pgtable_debug_args *args)
>   	 */
>   	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
>   	    has_transparent_pud_hugepage()) {
> -		page = debug_vm_pgtable_alloc_huge_page(args,
> -				HPAGE_PUD_SHIFT - PAGE_SHIFT);
> +		page = debug_vm_pgtable_alloc_huge_page(args, HPAGE_PUD_ORDER);
>   		if (page) {
>   			args->pud_pfn = page_to_pfn(page);
>   			args->pmd_pfn = args->pud_pfn;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a045d728ae0f..206397ed33a7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7248,7 +7248,6 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
>   	}
>   	return NULL;
>   }
> -#endif /* CONFIG_CONTIG_ALLOC */
>   
>   void free_contig_range(unsigned long pfn, unsigned long nr_pages)
>   {
> @@ -7275,6 +7274,7 @@ void free_contig_range(unsigned long pfn, unsigned long nr_pages)
>   	WARN(count != 0, "%lu pages are still in use!\n", count);
>   }
>   EXPORT_SYMBOL(free_contig_range);
> +#endif /* CONFIG_CONTIG_ALLOC */
>   
>   /*
>    * Effectively disable pcplists for the zone by setting the high limit to 0


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 2/6] mm: page_alloc: add __split_page()
  2025-12-30  7:24 ` [PATCH v5 2/6] mm: page_alloc: add __split_page() Kefeng Wang
@ 2026-01-02 18:55   ` Sid Kumar
  0 siblings, 0 replies; 30+ messages in thread
From: Sid Kumar @ 2026-01-02 18:55 UTC (permalink / raw)
  To: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
	Muchun Song, linux-mm
  Cc: jane.chu, Zi Yan, Vlastimil Babka, Brendan Jackman,
	Johannes Weiner, Matthew Wilcox


On 12/30/25 1:24 AM, Kefeng Wang wrote:
> Factor out the splitting of a non-compound page from make_alloc_exact()
> and split_page() into a new helper function, __split_page().
>
> While at it, convert the VM_BUG_ON_PAGE() into a VM_WARN_ON_PAGE().
>
> Acked-by: David Hildenbrand <david@redhat.com>
> Acked-by: Muchun Song <muchun.song@linux.dev>
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>

> ---
>   include/linux/mmdebug.h | 10 ++++++++++
>   mm/page_alloc.c         | 21 +++++++++++++--------
>   2 files changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/mmdebug.h b/include/linux/mmdebug.h
> index 14a45979cccc..ab60ffba08f5 100644
> --- a/include/linux/mmdebug.h
> +++ b/include/linux/mmdebug.h
> @@ -47,6 +47,15 @@ void vma_iter_dump_tree(const struct vma_iterator *vmi);
>   			BUG();						\
>   		}							\
>   	} while (0)
> +#define VM_WARN_ON_PAGE(cond, page)		({			\
> +	int __ret_warn = !!(cond);					\
> +									\
> +	if (unlikely(__ret_warn)) {					\
> +		dump_page(page, "VM_WARN_ON_PAGE(" __stringify(cond)")");\
> +		WARN_ON(1);						\
> +	}								\
> +	unlikely(__ret_warn);						\
> +})
>   #define VM_WARN_ON_ONCE_PAGE(cond, page)	({			\
>   	static bool __section(".data..once") __warned;			\
>   	int __ret_warn_once = !!(cond);					\
> @@ -122,6 +131,7 @@ void vma_iter_dump_tree(const struct vma_iterator *vmi);
>   #define VM_BUG_ON_MM(cond, mm) VM_BUG_ON(cond)
>   #define VM_WARN_ON(cond) BUILD_BUG_ON_INVALID(cond)
>   #define VM_WARN_ON_ONCE(cond) BUILD_BUG_ON_INVALID(cond)
> +#define VM_WARN_ON_PAGE(cond, page)  BUILD_BUG_ON_INVALID(cond)
>   #define VM_WARN_ON_ONCE_PAGE(cond, page)  BUILD_BUG_ON_INVALID(cond)
>   #define VM_WARN_ON_FOLIO(cond, folio)  BUILD_BUG_ON_INVALID(cond)
>   #define VM_WARN_ON_ONCE_FOLIO(cond, folio)  BUILD_BUG_ON_INVALID(cond)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 206397ed33a7..b9bfbb69537e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3080,6 +3080,15 @@ void free_unref_folios(struct folio_batch *folios)
>   	folio_batch_reinit(folios);
>   }
>   
> +static void __split_page(struct page *page, unsigned int order)
> +{
> +	VM_WARN_ON_PAGE(PageCompound(page), page);
> +
> +	split_page_owner(page, order, 0);
> +	pgalloc_tag_split(page_folio(page), order, 0);
> +	split_page_memcg(page, order);
> +}
> +
>   /*
>    * split_page takes a non-compound higher-order page, and splits it into
>    * n (1<<order) sub-pages: page[0..n]
> @@ -3092,14 +3101,12 @@ void split_page(struct page *page, unsigned int order)
>   {
>   	int i;
>   
> -	VM_BUG_ON_PAGE(PageCompound(page), page);
> -	VM_BUG_ON_PAGE(!page_count(page), page);
> +	VM_WARN_ON_PAGE(!page_count(page), page);
>   
>   	for (i = 1; i < (1 << order); i++)
>   		set_page_refcounted(page + i);
> -	split_page_owner(page, order, 0);
> -	pgalloc_tag_split(page_folio(page), order, 0);
> -	split_page_memcg(page, order);
> +
> +	__split_page(page, order);
>   }
>   EXPORT_SYMBOL_GPL(split_page);
>   
> @@ -5383,9 +5390,7 @@ static void *make_alloc_exact(unsigned long addr, unsigned int order,
>   		struct page *page = virt_to_page((void *)addr);
>   		struct page *last = page + nr;
>   
> -		split_page_owner(page, order, 0);
> -		pgalloc_tag_split(page_folio(page), order, 0);
> -		split_page_memcg(page, order);
> +		__split_page(page, order);
>   		while (page < --last)
>   			set_page_refcounted(last);
>   


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 4/6] mm: page_alloc: add alloc_contig_frozen_{range,pages}()
  2025-12-30  7:24 ` [PATCH v5 4/6] mm: page_alloc: add alloc_contig_frozen_{range,pages}() Kefeng Wang
  2025-12-31  2:57   ` Zi Yan
@ 2026-01-02 21:05   ` Sid Kumar
  1 sibling, 0 replies; 30+ messages in thread
From: Sid Kumar @ 2026-01-02 21:05 UTC (permalink / raw)
  To: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
	Muchun Song, linux-mm
  Cc: jane.chu, Zi Yan, Vlastimil Babka, Brendan Jackman,
	Johannes Weiner, Matthew Wilcox


On 12/30/25 1:24 AM, Kefeng Wang wrote:
> In order to allocate a given range of pages, or compound pages,
> without incrementing their refcounts, add two new helpers,
> alloc_contig_frozen_{range,pages}(), which may be beneficial
> to some users (e.g. hugetlb).
>
> The new alloc_contig_{range,pages}() now only accept !__GFP_COMP
> gfp masks, and free_contig_range() is refactored to free only
> non-compound pages; the sole caller freeing compound pages,
> cma_free_folio(), is changed accordingly. free_contig_frozen_range()
> is provided to match alloc_contig_frozen_range() and is used to
> free frozen pages.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>

> ---
>   include/linux/gfp.h |  52 +++++--------
>   mm/cma.c            |   9 ++-
>   mm/hugetlb.c        |   9 ++-
>   mm/internal.h       |  13 ++++
>   mm/page_alloc.c     | 186 ++++++++++++++++++++++++++++++++------------
>   5 files changed, 184 insertions(+), 85 deletions(-)
>
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index ea053f1cfa16..aa45989f410d 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -430,40 +430,30 @@ typedef unsigned int __bitwise acr_flags_t;
>   #define ACR_FLAGS_CMA ((__force acr_flags_t)BIT(0)) // allocate for CMA
>   
>   /* The below functions must be run on a range from a single zone. */
> -extern int alloc_contig_range_noprof(unsigned long start, unsigned long end,
> -				     acr_flags_t alloc_flags, gfp_t gfp_mask);
> -#define alloc_contig_range(...)			alloc_hooks(alloc_contig_range_noprof(__VA_ARGS__))
> -
> -extern struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
> -					      int nid, nodemask_t *nodemask);
> -#define alloc_contig_pages(...)			alloc_hooks(alloc_contig_pages_noprof(__VA_ARGS__))
> -
> +int alloc_contig_frozen_range_noprof(unsigned long start, unsigned long end,
> +		acr_flags_t alloc_flags, gfp_t gfp_mask);
> +#define alloc_contig_frozen_range(...)	\
> +	alloc_hooks(alloc_contig_frozen_range_noprof(__VA_ARGS__))
> +
> +int alloc_contig_range_noprof(unsigned long start, unsigned long end,
> +		acr_flags_t alloc_flags, gfp_t gfp_mask);
> +#define alloc_contig_range(...)	\
> +	alloc_hooks(alloc_contig_range_noprof(__VA_ARGS__))
> +
> +struct page *alloc_contig_frozen_pages_noprof(unsigned long nr_pages,
> +		gfp_t gfp_mask, int nid, nodemask_t *nodemask);
> +#define alloc_contig_frozen_pages(...) \
> +	alloc_hooks(alloc_contig_frozen_pages_noprof(__VA_ARGS__))
> +
> +struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
> +		int nid, nodemask_t *nodemask);
> +#define alloc_contig_pages(...)	\
> +	alloc_hooks(alloc_contig_pages_noprof(__VA_ARGS__))
> +
> +void free_contig_frozen_range(unsigned long pfn, unsigned long nr_pages);
>   void free_contig_range(unsigned long pfn, unsigned long nr_pages);
>   #endif
>   
> -#ifdef CONFIG_CONTIG_ALLOC
> -static inline struct folio *folio_alloc_gigantic_noprof(int order, gfp_t gfp,
> -							int nid, nodemask_t *node)
> -{
> -	struct page *page;
> -
> -	if (WARN_ON(!order || !(gfp & __GFP_COMP)))
> -		return NULL;
> -
> -	page = alloc_contig_pages_noprof(1 << order, gfp, nid, node);
> -
> -	return page ? page_folio(page) : NULL;
> -}
> -#else
> -static inline struct folio *folio_alloc_gigantic_noprof(int order, gfp_t gfp,
> -							int nid, nodemask_t *node)
> -{
> -	return NULL;
> -}
> -#endif
> -/* This should be paired with folio_put() rather than free_contig_range(). */
> -#define folio_alloc_gigantic(...) alloc_hooks(folio_alloc_gigantic_noprof(__VA_ARGS__))
> -
>   DEFINE_FREE(free_page, void *, free_page((unsigned long)_T))
>   
>   #endif /* __LINUX_GFP_H */
> diff --git a/mm/cma.c b/mm/cma.c
> index fe3a9eaac4e5..0e8c146424fb 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -836,7 +836,7 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
>   		spin_unlock_irq(&cma->lock);
>   
>   		mutex_lock(&cma->alloc_mutex);
> -		ret = alloc_contig_range(pfn, pfn + count, ACR_FLAGS_CMA, gfp);
> +		ret = alloc_contig_frozen_range(pfn, pfn + count, ACR_FLAGS_CMA, gfp);
>   		mutex_unlock(&cma->alloc_mutex);
>   		if (!ret)
>   			break;
> @@ -904,6 +904,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
>   	trace_cma_alloc_finish(name, page ? page_to_pfn(page) : 0,
>   			       page, count, align, ret);
>   	if (page) {
> +		set_pages_refcounted(page, count);
>   		count_vm_event(CMA_ALLOC_SUCCESS);
>   		cma_sysfs_account_success_pages(cma, count);
>   	} else {
> @@ -983,7 +984,11 @@ bool cma_release(struct cma *cma, const struct page *pages,
>   		return false;
>   	}
>   
> -	free_contig_range(pfn, count);
> +	if (PageHead(pages))
> +		__free_pages((struct page *)pages, compound_order(pages));
> +	else
> +		free_contig_range(pfn, count);
> +
>   	cma_clear_bitmap(cma, cmr, pfn, count);
>   	cma_sysfs_account_release_pages(cma, count);
>   	trace_cma_release(cma->name, pfn, pages, count);
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a1832da0f623..c990e439c32e 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1428,12 +1428,17 @@ static struct folio *alloc_gigantic_folio(int order, gfp_t gfp_mask,
>   retry:
>   	folio = hugetlb_cma_alloc_folio(order, gfp_mask, nid, nodemask);
>   	if (!folio) {
> +		struct page *page;
> +
>   		if (hugetlb_cma_exclusive_alloc())
>   			return NULL;
>   
> -		folio = folio_alloc_gigantic(order, gfp_mask, nid, nodemask);
> -		if (!folio)
> +		page = alloc_contig_frozen_pages(1 << order, gfp_mask, nid, nodemask);
> +		if (!page)
>   			return NULL;
> +
> +		set_page_refcounted(page);
> +		folio = page_folio(page);
>   	}
>   
>   	if (folio_ref_freeze(folio, 1))
> diff --git a/mm/internal.h b/mm/internal.h
> index db4e97489f66..b8737c474412 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -513,6 +513,19 @@ static inline void set_page_refcounted(struct page *page)
>   	set_page_count(page, 1);
>   }
>   
> +static inline void set_pages_refcounted(struct page *page, unsigned long nr_pages)
> +{
> +	unsigned long pfn = page_to_pfn(page);
> +
> +	if (PageHead(page)) {
> +		set_page_refcounted(page);
> +		return;
> +	}
> +
> +	for (; nr_pages--; pfn++)
> +		set_page_refcounted(pfn_to_page(pfn));
> +}
> +
>   /*
>    * Return true if a folio needs ->release_folio() calling upon it.
>    */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b9bfbb69537e..149f7b581b62 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6882,7 +6882,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>   	return (ret < 0) ? ret : 0;
>   }
>   
> -static void split_free_pages(struct list_head *list, gfp_t gfp_mask)
> +static void split_free_frozen_pages(struct list_head *list, gfp_t gfp_mask)
>   {
>   	int order;
>   
> @@ -6894,11 +6894,10 @@ static void split_free_pages(struct list_head *list, gfp_t gfp_mask)
>   			int i;
>   
>   			post_alloc_hook(page, order, gfp_mask);
> -			set_page_refcounted(page);
>   			if (!order)
>   				continue;
>   
> -			split_page(page, order);
> +			__split_page(page, order);
>   
>   			/* Add all subpages to the order-0 head, in sequence. */
>   			list_del(&page->lru);
> @@ -6942,8 +6941,14 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
>   	return 0;
>   }
>   
> +static void __free_contig_frozen_range(unsigned long pfn, unsigned long nr_pages)
> +{
> +	for (; nr_pages--; pfn++)
> +		free_frozen_pages(pfn_to_page(pfn), 0);
> +}
> +
>   /**
> - * alloc_contig_range() -- tries to allocate given range of pages
> + * alloc_contig_frozen_range() -- tries to allocate given range of frozen pages
>    * @start:	start PFN to allocate
>    * @end:	one-past-the-last PFN to allocate
>    * @alloc_flags:	allocation information
> @@ -6958,12 +6963,15 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
>    * pageblocks in the range.  Once isolated, the pageblocks should not
>    * be modified by others.
>    *
> - * Return: zero on success or negative error code.  On success all
> - * pages which PFN is in [start, end) are allocated for the caller and
> - * need to be freed with free_contig_range().
> + * All frozen pages whose PFN is in [start, end) are allocated for the
> + * caller and can be freed with free_contig_frozen_range();
> + * free_frozen_pages() can also be used to free compound frozen pages
> + * directly.
> + *
> + * Return: zero on success or negative error code.
>    */
> -int alloc_contig_range_noprof(unsigned long start, unsigned long end,
> -			      acr_flags_t alloc_flags, gfp_t gfp_mask)
> +int alloc_contig_frozen_range_noprof(unsigned long start, unsigned long end,
> +		acr_flags_t alloc_flags, gfp_t gfp_mask)
>   {
>   	const unsigned int order = ilog2(end - start);
>   	unsigned long outer_start, outer_end;
> @@ -7079,19 +7087,18 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>   	}
>   
>   	if (!(gfp_mask & __GFP_COMP)) {
> -		split_free_pages(cc.freepages, gfp_mask);
> +		split_free_frozen_pages(cc.freepages, gfp_mask);
>   
>   		/* Free head and tail (if any) */
>   		if (start != outer_start)
> -			free_contig_range(outer_start, start - outer_start);
> +			__free_contig_frozen_range(outer_start, start - outer_start);
>   		if (end != outer_end)
> -			free_contig_range(end, outer_end - end);
> +			__free_contig_frozen_range(end, outer_end - end);
>   	} else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>   		struct page *head = pfn_to_page(start);
>   
>   		check_new_pages(head, order);
>   		prep_new_page(head, order, gfp_mask, 0);
> -		set_page_refcounted(head);
>   	} else {
>   		ret = -EINVAL;
>   		WARN(true, "PFN range: requested [%lu, %lu), allocated [%lu, %lu)\n",
> @@ -7101,16 +7108,40 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>   	undo_isolate_page_range(start, end);
>   	return ret;
>   }
> -EXPORT_SYMBOL(alloc_contig_range_noprof);
> +EXPORT_SYMBOL(alloc_contig_frozen_range_noprof);
>   
> -static int __alloc_contig_pages(unsigned long start_pfn,
> -				unsigned long nr_pages, gfp_t gfp_mask)
> +/**
> + * alloc_contig_range() -- tries to allocate given range of pages
> + * @start:	start PFN to allocate
> + * @end:	one-past-the-last PFN to allocate
> + * @alloc_flags:	allocation information
> + * @gfp_mask:	GFP mask.
> + *
> + * This routine is a wrapper around alloc_contig_frozen_range(); it cannot
> + * be used to allocate compound pages, and the refcount of each allocated
> + * page is set to one.
> + *
> + * All pages whose PFN is in [start, end) are allocated for the caller,
> + * and should be freed with free_contig_range() or by manually calling
> + * __free_page() on each allocated page.
> + *
> + * Return: zero on success or negative error code.
> + */
> +int alloc_contig_range_noprof(unsigned long start, unsigned long end,
> +			      acr_flags_t alloc_flags, gfp_t gfp_mask)
>   {
> -	unsigned long end_pfn = start_pfn + nr_pages;
> +	int ret;
>   
> -	return alloc_contig_range_noprof(start_pfn, end_pfn, ACR_FLAGS_NONE,
> -					 gfp_mask);
> +	if (WARN_ON(gfp_mask & __GFP_COMP))
> +		return -EINVAL;
> +
> +	ret = alloc_contig_frozen_range_noprof(start, end, alloc_flags, gfp_mask);
> +	if (!ret)
> +		set_pages_refcounted(pfn_to_page(start), end - start);
> +
> +	return ret;
>   }
> +EXPORT_SYMBOL(alloc_contig_range_noprof);
>   
>   static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
>   				   unsigned long nr_pages, bool skip_hugetlb,
> @@ -7179,7 +7210,7 @@ static bool zone_spans_last_pfn(const struct zone *zone,
>   }
>   
>   /**
> - * alloc_contig_pages() -- tries to find and allocate contiguous range of pages
> + * alloc_contig_frozen_pages() -- tries to find and allocate contiguous range of frozen pages
>    * @nr_pages:	Number of contiguous pages to allocate
>    * @gfp_mask:	GFP mask. Node/zone/placement hints limit the search; only some
>    *		action and reclaim modifiers are supported. Reclaim modifiers
> @@ -7187,22 +7218,25 @@ static bool zone_spans_last_pfn(const struct zone *zone,
>    * @nid:	Target node
>    * @nodemask:	Mask for other possible nodes
>    *
> - * This routine is a wrapper around alloc_contig_range(). It scans over zones
> - * on an applicable zonelist to find a contiguous pfn range which can then be
> - * tried for allocation with alloc_contig_range(). This routine is intended
> - * for allocation requests which can not be fulfilled with the buddy allocator.
> + * This routine is a wrapper around alloc_contig_frozen_range(). It scans over
> + * zones on an applicable zonelist to find a contiguous pfn range which can then
> + * be tried for allocation with alloc_contig_frozen_range(). This routine is
> + * intended for allocation requests which can not be fulfilled with the buddy
> + * allocator.
>    *
>    * The allocated memory is always aligned to a page boundary. If nr_pages is a
>    * power of two, then allocated range is also guaranteed to be aligned to same
>    * nr_pages (e.g. 1GB request would be aligned to 1GB).
>    *
> - * Allocated pages can be freed with free_contig_range() or by manually calling
> - * __free_page() on each allocated page.
> + * Allocated frozen pages need to be freed with free_contig_frozen_range(),
> + * or by manually calling free_frozen_pages() on each allocated frozen
> + * non-compound page; compound frozen pages can be freed with
> + * free_frozen_pages() directly.
>    *
> - * Return: pointer to contiguous pages on success, or NULL if not successful.
> + * Return: pointer to contiguous frozen pages on success, or NULL if not successful.
>    */
> -struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
> -				 int nid, nodemask_t *nodemask)
> +struct page *alloc_contig_frozen_pages_noprof(unsigned long nr_pages,
> +		gfp_t gfp_mask, int nid, nodemask_t *nodemask)
>   {
>   	unsigned long ret, pfn, flags;
>   	struct zonelist *zonelist;
> @@ -7224,13 +7258,15 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
>   						   &skipped_hugetlb)) {
>   				/*
>   				 * We release the zone lock here because
> -				 * alloc_contig_range() will also lock the zone
> -				 * at some point. If there's an allocation
> -				 * spinning on this lock, it may win the race
> -				 * and cause alloc_contig_range() to fail...
> +				 * alloc_contig_frozen_range() will also lock
> +				 * the zone at some point. If there's an
> +				 * allocation spinning on this lock, it may
> +				 * win the race and cause allocation to fail.
>   				 */
>   				spin_unlock_irqrestore(&zone->lock, flags);
> -				ret = __alloc_contig_pages(pfn, nr_pages,
> +				ret = alloc_contig_frozen_range_noprof(pfn,
> +							pfn + nr_pages,
> +							ACR_FLAGS_NONE,
>   							gfp_mask);
>   				if (!ret)
>   					return pfn_to_page(pfn);
> @@ -7253,30 +7289,80 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
>   	}
>   	return NULL;
>   }
> +EXPORT_SYMBOL(alloc_contig_frozen_pages_noprof);
>   
> -void free_contig_range(unsigned long pfn, unsigned long nr_pages)
> +/**
> + * alloc_contig_pages() -- tries to find and allocate contiguous range of pages
> + * @nr_pages:	Number of contiguous pages to allocate
> + * @gfp_mask:	GFP mask.
> + * @nid:	Target node
> + * @nodemask:	Mask for other possible nodes
> + *
> + * This routine is a wrapper around alloc_contig_frozen_pages(); it cannot
> + * be used to allocate compound pages, and the refcount of each allocated
> + * page is set to one.
> + *
> + * Allocated pages can be freed with free_contig_range() or by manually
> + * calling __free_page() on each allocated page.
> + *
> + * Return: pointer to contiguous pages on success, or NULL if not successful.
> + */
> +struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
> +		int nid, nodemask_t *nodemask)
>   {
> -	unsigned long count = 0;
> -	struct folio *folio = pfn_folio(pfn);
> +	struct page *page;
>   
> -	if (folio_test_large(folio)) {
> -		int expected = folio_nr_pages(folio);
> +	if (WARN_ON(gfp_mask & __GFP_COMP))
> +		return NULL;
>   
> -		if (nr_pages == expected)
> -			folio_put(folio);
> -		else
> -			WARN(true, "PFN %lu: nr_pages %lu != expected %d\n",
> -			     pfn, nr_pages, expected);
> +	page = alloc_contig_frozen_pages_noprof(nr_pages, gfp_mask, nid,
> +						nodemask);
> +	if (page)
> +		set_pages_refcounted(page, nr_pages);
> +
> +	return page;
> +}
> +EXPORT_SYMBOL(alloc_contig_pages_noprof);
> +
> +/**
> + * free_contig_frozen_range() -- free the contiguous range of frozen pages
> + * @pfn:	start PFN to free
> + * @nr_pages:	Number of contiguous frozen pages to free
> + *
> + * This can be used to free the allocated compound/non-compound frozen pages.
> + */
> +void free_contig_frozen_range(unsigned long pfn, unsigned long nr_pages)
> +{
> +	struct page *first_page = pfn_to_page(pfn);
> +	const unsigned int order = ilog2(nr_pages);
> +
> +	if (WARN_ON_ONCE(first_page != compound_head(first_page)))
> +		return;
> +
> +	if (PageHead(first_page)) {
> +		WARN_ON_ONCE(order != compound_order(first_page));
> +		free_frozen_pages(first_page, order);
>   		return;
>   	}
>   
> -	for (; nr_pages--; pfn++) {
> -		struct page *page = pfn_to_page(pfn);
> +	__free_contig_frozen_range(pfn, nr_pages);
> +}
> +EXPORT_SYMBOL(free_contig_frozen_range);
> +
> +/**
> + * free_contig_range() -- free the contiguous range of pages
> + * @pfn:	start PFN to free
> + * @nr_pages:	Number of contiguous pages to free
> + *
> + * This can be only used to free the allocated non-compound pages.
> + */
> +void free_contig_range(unsigned long pfn, unsigned long nr_pages)
> +{
> +	if (WARN_ON_ONCE(PageHead(pfn_to_page(pfn))))
> +		return;
>   
> -		count += page_count(page) != 1;
> -		__free_page(page);
> -	}
> -	WARN(count != 0, "%lu pages are still in use!\n", count);
> +	for (; nr_pages--; pfn++)
> +		__free_page(pfn_to_page(pfn));
>   }
>   EXPORT_SYMBOL(free_contig_range);
>   #endif /* CONFIG_CONTIG_ALLOC */


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2025-12-30  7:24 [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
                   ` (6 preceding siblings ...)
  2025-12-30 18:17 ` [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Andrew Morton
@ 2026-01-07 17:31 ` Claudiu Beznea
  2026-01-07 18:25   ` Andrew Morton
                     ` (2 more replies)
  7 siblings, 3 replies; 30+ messages in thread
From: Claudiu Beznea @ 2026-01-07 17:31 UTC (permalink / raw)
  To: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
	Muchun Song, linux-mm
  Cc: sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox

Hi,

On 12/30/25 09:24, Kefeng Wang wrote:
> Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
> which avoid atomic operation about page refcount, and then convert to
> allocate frozen gigantic folio by the new helpers in hugetlb to cleanup
> the alloc_gigantic_folio().

I'm seeing the following issues on the Renesas RZ/G3S SoC when doing 
suspend to idle:

[  129.539064] Freezing user space processes
[  129.545037] Freezing user space processes completed (elapsed 0.005 seconds)
[  129.552078] OOM killer disabled.
[  129.555335] Freezing remaining freezable tasks
[  129.561405] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[  129.636729] Unable to handle kernel paging request at virtual address dead000000000108
[  129.644674] Mem abort info:
[  129.647456]   ESR = 0x0000000096000044
[  129.651190]   EC = 0x25: DABT (current EL), IL = 32 bits
[  129.656482]   SET = 0, FnV = 0
[  129.659523]   EA = 0, S1PTW = 0
[  129.662650]   FSC = 0x04: level 0 translation fault
[  129.667507] Data abort info:
[  129.670374]   ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[  129.675837]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[  129.680867]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  129.686158] [dead000000000108] address between user and kernel address ranges
[  129.693267] Internal error: Oops: 0000000096000044 [#1]  SMP
[  129.698905] Modules linked in: nvme nvme_core snd_soc_simple_card snd_soc_simple_card_utils snd_soc_rz_ssi snd_soc_da7213 renesas_usbhs snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd soundcore rzg3s_thermal clk_vbattb rzg2l_adc rtc_renesas_rtca3 industrialio_adc sha256 cfg80211 bluetooth ecdh_generic ecc rfkill fuse drm backlight ipv6
[  129.730189] CPU: 0 UID: 0 PID: 282 Comm: python3 Not tainted 6.19.0-rc4-next-20260107-00002-g608ca48d0994 #1 PREEMPT
[  129.740765] Hardware name: Renesas SMARC EVK version 2 based on r9a08g045s33 (DT)
[  129.748223] pstate: a04000c5 (NzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  129.755160] pc : free_pcppages_bulk+0x12c/0x204
[  129.759701] lr : free_pcppages_bulk+0x168/0x204
[  129.764219] sp : ffff80008392b7e0
[  129.767520] x29: ffff80008392b7e0 x28: ffff00003cff96b0 x27: ffff00003fe25700
[  129.774638] x26: ffff800081e66bd8 x25: 0000000000000001 x24: 0000000000000025
[  129.781755] x23: ffff00003cff9680 x22: ffff00003cff9690 x21: dead000000000100
[  129.788872] x20: 0000000000000001 x19: 0000000000000000 x18: 0000000000000020
[  129.795989] x17: ffff00000f3d0a00 x16: 0000000000000006 x15: 000000e7ebec93d2
[  129.803106] x14: 0000000000000005 x13: dead000000000100 x12: 0000000000000038
[  129.810223] x11: 0000000000000000 x10: 0000000000000001 x9 : 0000000000000000
[  129.817339] x8 : 0000000000000000 x7 : ffff00003fe258a8 x6 : dead000000000122
[  129.824456] x5 : dead000000000122 x4 : ffff00003fe14628 x3 : fffffdffc0f799c8
[  129.831573] x2 : 0401010101010101 x1 : 000000000007de67 x0 : fffffdffc0f799c0
[  129.838691] Call trace:
[  129.841129]  free_pcppages_bulk+0x12c/0x204 (P)
[  129.845653]  free_frozen_page_commit.constprop.0+0x27c/0x478
[  129.851300]  __free_frozen_pages+0x1a0/0x63c
[  129.855562]  free_contig_frozen_range+0xd0/0x118
[  129.860165]  cma_release+0x7c/0xd8
[  129.863568]  dma_free_contiguous+0x2c/0x74
[  129.867657]  dma_direct_free+0xd8/0x1b0
[  129.871482]  dma_free_attrs+0x84/0xf8
[  129.875140]  ravb_ring_free+0x5c/0x1b4
[  129.878888]  ravb_close+0x12c/0x1d4
[  129.882368]  ravb_suspend+0x60/0x16c
[  129.885935]  device_suspend+0x148/0x3f4
[  129.889766]  dpm_suspend+0x1b0/0x2ac
[  129.893332]  dpm_suspend_start+0x54/0x70
[  129.897245]  suspend_devices_and_enter+0x124/0x4b8
[  129.902026]  pm_suspend+0x1a4/0x1f0
[  129.905506]  state_store+0x8c/0x110
[  129.908985]  kobj_attr_store+0x18/0x2c
[  129.912727]  sysfs_kf_write+0x7c/0x94
[  129.916384]  kernfs_fop_write_iter+0x128/0x1b8
[  129.920815]  vfs_write+0x2ac/0x350
[  129.924210]  ksys_write+0x68/0xfc
[  129.927517]  __arm64_sys_write+0x1c/0x28
[  129.931431]  invoke_syscall+0x48/0x10c
[  129.935177]  el0_svc_common.constprop.0+0xc0/0xe0
[  129.939871]  do_el0_svc+0x1c/0x28
[  129.943180]  el0_svc+0x34/0x10c
[  129.946319]  el0t_64_sync_handler+0xa0/0xe4
[  129.950492]  el0t_64_sync+0x198/0x19c

Using ./scripts/decode_stacktrace.sh on this leads to the following output:

./scripts/decode_stacktrace.sh build-arm64/vmlinux < out
[  490.453272] Call trace:
[  490.455711]  free_pcppages_bulk (include/linux/list.h:203 include/linux/list.h:226 include/linux/list.h:237 mm/page_alloc.c:1525) (P)
[  490.460234]  free_frozen_page_commit.constprop.0 (include/linux/spinlock.h:392 mm/page_alloc.c:2919)
[  490.465881]  __free_frozen_pages (mm/page_alloc.c:3003)
[  490.470143]  free_contig_frozen_range (mm/page_alloc.c:6977 mm/page_alloc.c:7379)
[  490.474747]  cma_release (mm/cma.c:996 mm/cma.c:1025)
[  490.478149]  dma_free_contiguous (kernel/dma/contiguous.c:430)
[  490.482240]  dma_direct_free (kernel/dma/direct.c:351)
[  490.486064]  dma_free_attrs (kernel/dma/mapping.c:688)
[  490.489723]  ravb_ring_free (drivers/net/ethernet/renesas/ravb_main.c:249 drivers/net/ethernet/renesas/ravb_main.c:260)
[  490.493469]  ravb_close (drivers/net/ethernet/renesas/ravb_main.c:2406)
[  490.496950]  ravb_suspend (drivers/net/ethernet/renesas/ravb_main.c:3225)
[  490.500516]  device_suspend (drivers/base/power/main.c:504 drivers/base/power/main.c:1965)
[  490.504347]  dpm_suspend (drivers/base/power/main.c:2049)
[  490.507916]  dpm_suspend_start (drivers/base/power/main.c:2282)
[  490.511829]  suspend_devices_and_enter (kernel/power/suspend.c:523)
[  490.516609]  pm_suspend (kernel/power/suspend.c:621 kernel/power/suspend.c:644)
[  490.520088]  state_store (kernel/power/main.c:819)
[  490.523568]  kobj_attr_store (lib/kobject.c:842)
[  490.527310]  sysfs_kf_write (fs/sysfs/file.c:143)
[  490.530967]  kernfs_fop_write_iter (fs/kernfs/file.c:352)
[  490.535398]  vfs_write (fs/read_write.c:593 fs/read_write.c:686)
[  490.538793]  ksys_write (fs/read_write.c:738)
[  490.542101]  __arm64_sys_write (fs/read_write.c:746)
[  490.546014]  invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:54)
[  490.549762]  el0_svc_common.constprop.0 (arch/arm64/kernel/syscall.c:70)
[  490.554454]  do_el0_svc (arch/arm64/kernel/syscall.c:152)
[  490.557763]  el0_svc (arch/arm64/include/asm/irqflags.h:55 arch/arm64/include/asm/irqflags.h:76 arch/arm64/kernel/entry-common.c:80 arch/arm64/kernel/entry-common.c:725)
[  490.560901]  el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:744)
[  490.565074]  el0t_64_sync (arch/arm64/kernel/entry.S:596)

Reverting this series makes the failures go away. Should drivers now be
handling things differently, or do you think there is something buggy in
the ravb driver?

Thank you,
Claudiu


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2026-01-07 17:31 ` Claudiu Beznea
@ 2026-01-07 18:25   ` Andrew Morton
  2026-01-07 18:26   ` Zi Yan
  2026-01-07 18:39   ` Mark Brown
  2 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2026-01-07 18:25 UTC (permalink / raw)
  To: Claudiu Beznea
  Cc: Kefeng Wang, David Hildenbrand, Oscar Salvador, Muchun Song,
	linux-mm, sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox

On Wed, 7 Jan 2026 19:31:30 +0200 Claudiu Beznea <claudiu.beznea@tuxon.dev> wrote:

> On 12/30/25 09:24, Kefeng Wang wrote:
> > Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
> > which avoid atomic operation about page refcount, and then convert to
> > allocate frozen gigantic folio by the new helpers in hugetlb to cleanup
> > the alloc_gigantic_folio().
> 
> I'm seeing the following issues on the Renesas RZ/G3S SoC when doing 
> suspend to idle:

Thanks.  For now I'll move this series back into mm.git's mm-new
branch, so it will no longer be published in linux-next.



* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2026-01-07 17:31 ` Claudiu Beznea
  2026-01-07 18:25   ` Andrew Morton
@ 2026-01-07 18:26   ` Zi Yan
  2026-01-07 18:39   ` Mark Brown
  2 siblings, 0 replies; 30+ messages in thread
From: Zi Yan @ 2026-01-07 18:26 UTC (permalink / raw)
  To: Claudiu Beznea
  Cc: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
	Muchun Song, linux-mm, sidhartha.kumar, jane.chu,
	Vlastimil Babka, Brendan Jackman, Johannes Weiner,
	Matthew Wilcox

On 7 Jan 2026, at 12:31, Claudiu Beznea wrote:

> Hi,
>
> On 12/30/25 09:24, Kefeng Wang wrote:
>> Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
>> which avoid atomic operation about page refcount, and then convert to
>> allocate frozen gigantic folio by the new helpers in hugetlb to cleanup
>> the alloc_gigantic_folio().
>
> I'm seeing the following issues on the Renesas RZ/G3S SoC when doing suspend to idle:
>
> [  129.539064] Freezing user space processes
> [  129.545037] Freezing user space processes completed (elapsed 0.005 seconds)
> [  129.552078] OOM killer disabled.
> [  129.555335] Freezing remaining freezable tasks
> [  129.561405] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> [  129.636729] Unable to handle kernel paging request at virtual address dead000000000108

This is LIST_POISON1 + 8, i.e. the fault happened in __list_del() when
assigning to next->prev. That means list_del() had already been performed
on page->pcplist, since list_del() sets ->next to LIST_POISON1.

> [  129.644674] Mem abort info:
> [  129.647456]   ESR = 0x0000000096000044
> [  129.651190]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  129.656482]   SET = 0, FnV = 0
> [  129.659523]   EA = 0, S1PTW = 0
> [  129.662650]   FSC = 0x04: level 0 translation fault
> [  129.667507] Data abort info:
> [  129.670374]   ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
> [  129.675837]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> [  129.680867]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [  129.686158] [dead000000000108] address between user and kernel address ranges
> [  129.693267] Internal error: Oops: 0000000096000044 [#1]  SMP
> [  129.698905] Modules linked in: nvme nvme_core snd_soc_simple_card snd_soc_simple_card_utils snd_soc_rz_ssi snd_soc_da7213 renesas_usbhs snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd soundcore rzg3s_thermal clk_vbattb rzg2l_adc rtc_renesas_rtca3 industrialio_adc sha256 cfg80211 bluetooth ecdh_generic ecc rfkill fuse drm backlight ipv6
> [  129.730189] CPU: 0 UID: 0 PID: 282 Comm: python3 Not tainted 6.19.0-rc4-next-20260107-00002-g608ca48d0994 #1 PREEMPT
> [  129.740765] Hardware name: Renesas SMARC EVK version 2 based on r9a08g045s33 (DT)
> [  129.748223] pstate: a04000c5 (NzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  129.755160] pc : free_pcppages_bulk+0x12c/0x204
> [  129.759701] lr : free_pcppages_bulk+0x168/0x204
> [  129.764219] sp : ffff80008392b7e0
> [  129.767520] x29: ffff80008392b7e0 x28: ffff00003cff96b0 x27: ffff00003fe25700
> [  129.774638] x26: ffff800081e66bd8 x25: 0000000000000001 x24: 0000000000000025
> [  129.781755] x23: ffff00003cff9680 x22: ffff00003cff9690 x21: dead000000000100
> [  129.788872] x20: 0000000000000001 x19: 0000000000000000 x18: 0000000000000020
> [  129.795989] x17: ffff00000f3d0a00 x16: 0000000000000006 x15: 000000e7ebec93d2
> [  129.803106] x14: 0000000000000005 x13: dead000000000100 x12: 0000000000000038
> [  129.810223] x11: 0000000000000000 x10: 0000000000000001 x9 : 0000000000000000
> [  129.817339] x8 : 0000000000000000 x7 : ffff00003fe258a8 x6 : dead000000000122
> [  129.824456] x5 : dead000000000122 x4 : ffff00003fe14628 x3 : fffffdffc0f799c8
> [  129.831573] x2 : 0401010101010101 x1 : 000000000007de67 x0 : fffffdffc0f799c0
> [  129.838691] Call trace:
> [  129.841129]  free_pcppages_bulk+0x12c/0x204 (P)
> [  129.845653]  free_frozen_page_commit.constprop.0+0x27c/0x478
> [  129.851300]  __free_frozen_pages+0x1a0/0x63c
> [  129.855562]  free_contig_frozen_range+0xd0/0x118
> [  129.860165]  cma_release+0x7c/0xd8
> [  129.863568]  dma_free_contiguous+0x2c/0x74
> [  129.867657]  dma_direct_free+0xd8/0x1b0
> [  129.871482]  dma_free_attrs+0x84/0xf8
> [  129.875140]  ravb_ring_free+0x5c/0x1b4
> [  129.878888]  ravb_close+0x12c/0x1d4
> [  129.882368]  ravb_suspend+0x60/0x16c
> [  129.885935]  device_suspend+0x148/0x3f4
> [  129.889766]  dpm_suspend+0x1b0/0x2ac
> [  129.893332]  dpm_suspend_start+0x54/0x70
> [  129.897245]  suspend_devices_and_enter+0x124/0x4b8
> [  129.902026]  pm_suspend+0x1a4/0x1f0
> [  129.905506]  state_store+0x8c/0x110
> [  129.908985]  kobj_attr_store+0x18/0x2c
> [  129.912727]  sysfs_kf_write+0x7c/0x94
> [  129.916384]  kernfs_fop_write_iter+0x128/0x1b8
> [  129.920815]  vfs_write+0x2ac/0x350
> [  129.924210]  ksys_write+0x68/0xfc
> [  129.927517]  __arm64_sys_write+0x1c/0x28
> [  129.931431]  invoke_syscall+0x48/0x10c
> [  129.935177]  el0_svc_common.constprop.0+0xc0/0xe0
> [  129.939871]  do_el0_svc+0x1c/0x28
> [  129.943180]  el0_svc+0x34/0x10c
> [  129.946319]  el0t_64_sync_handler+0xa0/0xe4
> [  129.950492]  el0t_64_sync+0x198/0x19c
>
> Using ./scripts/decode_stacktrace.sh on this leads to the following output:
>
> ./scripts/decode_stacktrace.sh build-arm64/vmlinux < out
> [  490.453272] Call trace:
> [  490.455711]  free_pcppages_bulk (include/linux/list.h:203 include/linux/list.h:226 include/linux/list.h:237 mm/page_alloc.c:1525) (P)
> [  490.460234]  free_frozen_page_commit.constprop.0 (include/linux/spinlock.h:392 mm/page_alloc.c:2919)
> [  490.465881]  __free_frozen_pages (mm/page_alloc.c:3003)
> [  490.470143]  free_contig_frozen_range (mm/page_alloc.c:6977 mm/page_alloc.c:7379)
> [  490.474747]  cma_release (mm/cma.c:996 mm/cma.c:1025)
> [  490.478149]  dma_free_contiguous (kernel/dma/contiguous.c:430)
> [  490.482240]  dma_direct_free (kernel/dma/direct.c:351)
> [  490.486064]  dma_free_attrs (kernel/dma/mapping.c:688)
> [  490.489723]  ravb_ring_free (drivers/net/ethernet/renesas/ravb_main.c:249 drivers/net/ethernet/renesas/ravb_main.c:260)
> [  490.493469]  ravb_close (drivers/net/ethernet/renesas/ravb_main.c:2406)
> [  490.496950]  ravb_suspend (drivers/net/ethernet/renesas/ravb_main.c:3225)
> [  490.500516]  device_suspend (drivers/base/power/main.c:504 drivers/base/power/main.c:1965)
> [  490.504347]  dpm_suspend (drivers/base/power/main.c:2049)
> [  490.507916]  dpm_suspend_start (drivers/base/power/main.c:2282)
> [  490.511829]  suspend_devices_and_enter (kernel/power/suspend.c:523)
> [  490.516609]  pm_suspend (kernel/power/suspend.c:621 kernel/power/suspend.c:644)
> [  490.520088]  state_store (kernel/power/main.c:819)
> [  490.523568]  kobj_attr_store (lib/kobject.c:842)
> [  490.527310]  sysfs_kf_write (fs/sysfs/file.c:143)
> [  490.530967]  kernfs_fop_write_iter (fs/kernfs/file.c:352)
> [  490.535398]  vfs_write (fs/read_write.c:593 fs/read_write.c:686)
> [  490.538793]  ksys_write (fs/read_write.c:738)
> [  490.542101]  __arm64_sys_write (fs/read_write.c:746)
> [  490.546014]  invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:54)
> [  490.549762]  el0_svc_common.constprop.0 (arch/arm64/kernel/syscall.c:70)
> [  490.554454]  do_el0_svc (arch/arm64/kernel/syscall.c:152)
> [  490.557763]  el0_svc (arch/arm64/include/asm/irqflags.h:55 arch/arm64/include/asm/irqflags.h:76 arch/arm64/kernel/entry-common.c:80 arch/arm64/kernel/entry-common.c:725)
> [  490.560901]  el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:744)
> [  490.565074]  el0t_64_sync (arch/arm64/kernel/entry.S:596)
>
> Reverting this series leads to no more failures. Should things be handled differently now in the drivers? Do you consider there is something buggy in the ravb driver?

Is it possible to do a bisect? That would help find the issue. I think
the series does not intend to change how existing cma_* APIs behave.

Thanks.

Best Regards,
Yan, Zi



* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2026-01-07 17:31 ` Claudiu Beznea
  2026-01-07 18:25   ` Andrew Morton
  2026-01-07 18:26   ` Zi Yan
@ 2026-01-07 18:39   ` Mark Brown
  2026-01-07 18:50     ` Andrew Morton
  2026-01-07 19:38     ` Zi Yan
  2 siblings, 2 replies; 30+ messages in thread
From: Mark Brown @ 2026-01-07 18:39 UTC (permalink / raw)
  To: Claudiu Beznea
  Cc: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
	Muchun Song, linux-mm, sidhartha.kumar, jane.chu, Zi Yan,
	Vlastimil Babka, Brendan Jackman, Johannes Weiner,
	Matthew Wilcox


On Wed, Jan 07, 2026 at 07:31:30PM +0200, Claudiu Beznea wrote:
> On 12/30/25 09:24, Kefeng Wang wrote:

> > Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
> > which avoid atomic operation about page refcount, and then convert to
> > allocate frozen gigantic folio by the new helpers in hugetlb to cleanup
> > the alloc_gigantic_folio().
> 
> I'm seeing the following issues on the Renesas RZ/G3S SoC when doing suspend
> to idle:
> 
> [  129.636729] Unable to handle kernel paging request at virtual address
> dead000000000108
> [  129.644674] Mem abort info:
> [  129.647456]   ESR = 0x0000000096000044

This also introduces OOMs on a Raspberry Pi 3B+ running an NFS root
(probably the relevant factor) while running audio tests (the tests
themselves are likely not relevant):

[   64.064256] Unable to handle kernel paging request at virtual address fffffdffc1000000

...

[   64.087583] Call trace:
[   64.087586]  kmem_cache_free+0x88/0x434 (P)
[   64.087598]  skb_free_head+0x9c/0xb8
[   64.087608]  skb_release_data+0x120/0x174
[   64.087615]  __kfree_skb+0x2c/0x44
[   64.087622]  tcp_data_queue+0x948/0xe50

Full log:

   https://lava.sirena.org.uk/scheduler/job/2341856#L1721

Bisection identifies:

[fb9a328d30400dbc8b2ea5a57daeb28bedac398b] mm: cma: add cma_alloc_frozen{_compound}()

as the commit that introduces the issue.  Bisect log with links to
further test runs:

# bad: [f96074c6d01d8a5e9e2fccd0bba5f2ed654c1f2d] Add linux-next specific files for 20260107
# good: [20dcabcbe843d74f3f6d2c8b5a4bd14443997697] Merge branch 'for-linux-next-fixes' of https://gitlab.freedesktop.org/drm/misc/kernel.git
# good: [75d208bddcca55ec31481420fbb4d6c9703ba195] spi: stm32: avoid __maybe_unused and use pm_ptr
# good: [04b61513dfe40f80f0dcc795003637b510522b3c] ASoC: SDCA: Replace use of system_wq with system_dfl_wq
# good: [9bf0bd7bdea6c402007ffb784dd0c0f704aa2310] ASoC: nau8821: Sort #include directives
# good: [52ddc0106c77ff0eacf07b309833ae6e6a4e8587] ASoC: es8328: Remove duplicate DAPM routes
# good: [96d337436fe0921177a6090aeb5bb214753654fc] spi: dt-bindings: at91: add microchip,lan9691-spi
# good: [4c5e6d5b31bc623d89185d551681ab91cfd037c9] ASoC: codecs: ES8389: Update clock configuration
# good: [211243b69533e968cc6f0259fb80ffee02fbe0ca] firmware: cs_dsp: test_bin: Add tests for offsets > 0xffff
# good: [420739112e95c9bb286b4e87875706925970abd3] ASoC: rt5575: Add the codec driver for the ALC5575
# good: [25abdc151a448a17d500ea9468ce32582c479faa] ASoC: rt1320: fix the remainder calculation of r0 value
# good: [284853affe73fe1ca9786bd52b934eb9d420a942] ASoC: rt1320: fix size_t format string
# good: [45e9066f3a487e9e26b842644364d045af054775] ASoC: Intel: avs: replace strcmp with sysfs_streq
# good: [0f698d742f628d02ab2a222f8cf5f793443865d0] spi: bcm63xx-hsspi: add support for 1-2-2 read ops
# good: [8db50f0fa43efe8799fd40b872dcdd39a90d7549] ASoC: rt1320: fix the warning the string may be truncated
# good: [b0655377aa5a410df02d89170c20141a1a5bbc28] rust: regulator: replace `kernel::c_str!` with C-Strings
# good: [c6bca73d699cfe00d3419566fdb2a45e112f44b0] ASoC: rt1320: Fix retry checking in rt1320_rae_load()
# good: [4ab48cc63e15cb619d641d1edf9a15a0a98875b2] ASoC: qcom: audioreach: Constify function arguments
# good: [a2a631830deb382a3d27b6f52b2d654a3e6bb427] ASoC: qcom: Constify APR/GPR result structs
# good: [99a3ef1e81cd1775bc1f8cc2ad188b1fc755d5cd] ASoC: SDCA: Add ASoC jack hookup in class driver
# good: [32a708ba5db50cf928a1f1b2039ceef33de2c286] regulator: Add rt8092 support
# good: [7a8447fc71a09000cee5a2372b6efde45735d2c8] ASoC: codecs: wcd939x-sdw: use devres for regmap allocation
# good: [b39ef93a2e5b5f4289a3486d8a94a09a1e6a4c67] spi: stm32: perform small transfer in polling mode
# good: [3622dc47a4b13e0ec86358c7b54a0b33bfcaa03c] ASoC: codec: rt286: Use devm_request_threaded_irq to manage IRQ lifetime and fix smatch warning
# good: [2a28b5240f2b328495c6565d277f438dbc583d61] ASoC: SOF: ipc4-control: Add support for generic bytes control
# good: [7f7b350e4a65446f5d52ea8ae99e12eac8a972db] spi: stm32-qspi: Remove unneeded semicolon
# good: [f764645cb85a8b8f58067289cdfed28f6c1cdf49] ASoC: codecs: tas2780: tidyup format check in tas2780_set_fmt()
# good: [02e7af5b6423d2dbf82f852572f2fa8c00aafb19] ASoC: Intel: sof_rt5682: add tas2563 speaker amp support
# good: [9a6bc0a406608e2520f18d996483c9d2e4a9fb27] ASoC: codecs: ES8326: Add kcontrol for DRE
# good: [29c8c00d9f9db5fb659b6f05f9e8964afc13f3e2] spi: add driver for NXP XSPI controller
# good: [1303c2903889b01d581083ed92e439e7544dd3e5] MAINTAINERS: Add MAINTAINERS entry for the ATCSPI200 SPI controller driver
# good: [524ee559948d8d079b13466e70fa741f909699c0] ASoC: SOF: Intel: hda: Only check SSP MCLK mask in case of IPC3
# good: [f25c7d709b93602ee9a08eba522808a18e1f5d56] ASoC: SOF: Intel: pci-nvl: Set on_demand_dsp_boot for NVL-S
# good: [f4acea9eef704607d1a950909ce3a52a770d6be2] spi: dt-bindings: st,stm32-spi: add 'power-domains' property
# good: [fee876b2ec75dcc18fdea154eae1f5bf14d82659] spi: stm32-qspi: Simplify SMIE interrupt test
# good: [b884e34994ca41f7b7819f3c41b78ff494787b27] spi: spi-fsl-lpspi: convert min_t() to simple min()
# good: [0bb160c92ad400c692984763996b758458adea17] ASoC: qcom: Minor readability improve with new lines
# good: [81acbdc51bbbec822a1525481f2f70677c47aee0] ASoC: sdw-mockup: Drop dummy remove function
# good: [03d281f384768610bf90697bce9e35d3d596de77] rust: regulator: add __rust_helper to helpers
# good: [ba9b28652c75b07383e267328f1759195d5430f7] spi: imx: enable DMA mode for target operation
# good: [9e92c559d49d6fb903af17a31a469aac51b1766d] regulator: max77675: Add MAX77675 regulator driver
# good: [124f6155f3d97b0e33f178c10a5138a42c8fd207] ASoC: renesas: rz-ssi: Add support for 32 bits sample width
# good: [aa30193af8873b3ccfd70a4275336ab6cbd4e5e6] ASoC: Intel: catpt: Drop superfluous space in PCM code
# good: [e39011184f23de3d04ca8e80b4df76c9047b4026] ASoC: SDCA: functions: Fix confusing cleanup.h syntax
# good: [fa08b566860bca8ebf9300090b85174c34de7ca5] spi: rzv2h-rspi: add support for DMA mode
# good: [6c177775dcc5e70a64ddf4ee842c66af498f2c7c] Merge branch 'next/drivers' into for-next
git bisect start 'f96074c6d01d8a5e9e2fccd0bba5f2ed654c1f2d' '20dcabcbe843d74f3f6d2c8b5a4bd14443997697' '75d208bddcca55ec31481420fbb4d6c9703ba195' '04b61513dfe40f80f0dcc795003637b510522b3c' '9bf0bd7bdea6c402007ffb784dd0c0f704aa2310' '52ddc0106c77ff0eacf07b309833ae6e6a4e8587' '96d337436fe0921177a6090aeb5bb214753654fc' '4c5e6d5b31bc623d89185d551681ab91cfd037c9' '211243b69533e968cc6f0259fb80ffee02fbe0ca' '420739112e95c9bb286b4e87875706925970abd3' '25abdc151a448a17d500ea9468ce32582c479faa' '284853affe73fe1ca9786bd52b934eb9d420a942' '45e9066f3a487e9e26b842644364d045af054775' '0f698d742f628d02ab2a222f8cf5f793443865d0' '8db50f0fa43efe8799fd40b872dcdd39a90d7549' 'b0655377aa5a410df02d89170c20141a1a5bbc28' 'c6bca73d699cfe00d3419566fdb2a45e112f44b0' '4ab48cc63e15cb619d641d1edf9a15a0a98875b2' 'a2a631830deb382a3d27b6f52b2d654a3e6bb427' '99a3ef1e81cd1775bc1f8cc2ad188b1fc755d5cd' '32a708ba5db50cf928a1f1b2039ceef33de2c286' '7a8447fc71a09000cee5a2372b6efde45735d2c8' 'b39ef93a2e5b5f4289a3486d8a94a09a1e6a4c67' '3622dc47a4b13e0ec86358c7b54a0b33bfcaa03c' '2a28b5240f2b328495c6565d277f438dbc583d61' '7f7b350e4a65446f5d52ea8ae99e12eac8a972db' 'f764645cb85a8b8f58067289cdfed28f6c1cdf49' '02e7af5b6423d2dbf82f852572f2fa8c00aafb19' '9a6bc0a406608e2520f18d996483c9d2e4a9fb27' '29c8c00d9f9db5fb659b6f05f9e8964afc13f3e2' '1303c2903889b01d581083ed92e439e7544dd3e5' '524ee559948d8d079b13466e70fa741f909699c0' 'f25c7d709b93602ee9a08eba522808a18e1f5d56' 'f4acea9eef704607d1a950909ce3a52a770d6be2' 'fee876b2ec75dcc18fdea154eae1f5bf14d82659' 'b884e34994ca41f7b7819f3c41b78ff494787b27' '0bb160c92ad400c692984763996b758458adea17' '81acbdc51bbbec822a1525481f2f70677c47aee0' '03d281f384768610bf90697bce9e35d3d596de77' 'ba9b28652c75b07383e267328f1759195d5430f7' '9e92c559d49d6fb903af17a31a469aac51b1766d' '124f6155f3d97b0e33f178c10a5138a42c8fd207' 'aa30193af8873b3ccfd70a4275336ab6cbd4e5e6' 'e39011184f23de3d04ca8e80b4df76c9047b4026' 'fa08b566860bca8ebf9300090b85174c34de7ca5' '6c177775dcc5e70a64ddf4ee842c66af498f2c7c'
# test job: [75d208bddcca55ec31481420fbb4d6c9703ba195] https://lava.sirena.org.uk/scheduler/job/2337459
# test job: [04b61513dfe40f80f0dcc795003637b510522b3c] https://lava.sirena.org.uk/scheduler/job/2337698
# test job: [9bf0bd7bdea6c402007ffb784dd0c0f704aa2310] https://lava.sirena.org.uk/scheduler/job/2331109
# test job: [52ddc0106c77ff0eacf07b309833ae6e6a4e8587] https://lava.sirena.org.uk/scheduler/job/2331449
# test job: [96d337436fe0921177a6090aeb5bb214753654fc] https://lava.sirena.org.uk/scheduler/job/2330452
# test job: [4c5e6d5b31bc623d89185d551681ab91cfd037c9] https://lava.sirena.org.uk/scheduler/job/2331888
# test job: [211243b69533e968cc6f0259fb80ffee02fbe0ca] https://lava.sirena.org.uk/scheduler/job/2330732
# test job: [420739112e95c9bb286b4e87875706925970abd3] https://lava.sirena.org.uk/scheduler/job/2331724
# test job: [25abdc151a448a17d500ea9468ce32582c479faa] https://lava.sirena.org.uk/scheduler/job/2307411
# test job: [284853affe73fe1ca9786bd52b934eb9d420a942] https://lava.sirena.org.uk/scheduler/job/2298008
# test job: [45e9066f3a487e9e26b842644364d045af054775] https://lava.sirena.org.uk/scheduler/job/2295689
# test job: [0f698d742f628d02ab2a222f8cf5f793443865d0] https://lava.sirena.org.uk/scheduler/job/2295186
# test job: [8db50f0fa43efe8799fd40b872dcdd39a90d7549] https://lava.sirena.org.uk/scheduler/job/2292094
# test job: [b0655377aa5a410df02d89170c20141a1a5bbc28] https://lava.sirena.org.uk/scheduler/job/2291691
# test job: [c6bca73d699cfe00d3419566fdb2a45e112f44b0] https://lava.sirena.org.uk/scheduler/job/2290187
# test job: [4ab48cc63e15cb619d641d1edf9a15a0a98875b2] https://lava.sirena.org.uk/scheduler/job/2290923
# test job: [a2a631830deb382a3d27b6f52b2d654a3e6bb427] https://lava.sirena.org.uk/scheduler/job/2281809
# test job: [99a3ef1e81cd1775bc1f8cc2ad188b1fc755d5cd] https://lava.sirena.org.uk/scheduler/job/2290873
# test job: [32a708ba5db50cf928a1f1b2039ceef33de2c286] https://lava.sirena.org.uk/scheduler/job/2279418
# test job: [7a8447fc71a09000cee5a2372b6efde45735d2c8] https://lava.sirena.org.uk/scheduler/job/2271746
# test job: [b39ef93a2e5b5f4289a3486d8a94a09a1e6a4c67] https://lava.sirena.org.uk/scheduler/job/2269621
# test job: [3622dc47a4b13e0ec86358c7b54a0b33bfcaa03c] https://lava.sirena.org.uk/scheduler/job/2268671
# test job: [2a28b5240f2b328495c6565d277f438dbc583d61] https://lava.sirena.org.uk/scheduler/job/2266226
# test job: [7f7b350e4a65446f5d52ea8ae99e12eac8a972db] https://lava.sirena.org.uk/scheduler/job/2263933
# test job: [f764645cb85a8b8f58067289cdfed28f6c1cdf49] https://lava.sirena.org.uk/scheduler/job/2264479
# test job: [02e7af5b6423d2dbf82f852572f2fa8c00aafb19] https://lava.sirena.org.uk/scheduler/job/2262695
# test job: [9a6bc0a406608e2520f18d996483c9d2e4a9fb27] https://lava.sirena.org.uk/scheduler/job/2264395
# test job: [29c8c00d9f9db5fb659b6f05f9e8964afc13f3e2] https://lava.sirena.org.uk/scheduler/job/2264013
# test job: [1303c2903889b01d581083ed92e439e7544dd3e5] https://lava.sirena.org.uk/scheduler/job/2263454
# test job: [524ee559948d8d079b13466e70fa741f909699c0] https://lava.sirena.org.uk/scheduler/job/2243928
# test job: [f25c7d709b93602ee9a08eba522808a18e1f5d56] https://lava.sirena.org.uk/scheduler/job/2244095
# test job: [f4acea9eef704607d1a950909ce3a52a770d6be2] https://lava.sirena.org.uk/scheduler/job/2243878
# test job: [fee876b2ec75dcc18fdea154eae1f5bf14d82659] https://lava.sirena.org.uk/scheduler/job/2231258
# test job: [b884e34994ca41f7b7819f3c41b78ff494787b27] https://lava.sirena.org.uk/scheduler/job/2232599
# test job: [0bb160c92ad400c692984763996b758458adea17] https://lava.sirena.org.uk/scheduler/job/2233069
# test job: [81acbdc51bbbec822a1525481f2f70677c47aee0] https://lava.sirena.org.uk/scheduler/job/2232790
# test job: [03d281f384768610bf90697bce9e35d3d596de77] https://lava.sirena.org.uk/scheduler/job/2231108
# test job: [ba9b28652c75b07383e267328f1759195d5430f7] https://lava.sirena.org.uk/scheduler/job/2231415
# test job: [9e92c559d49d6fb903af17a31a469aac51b1766d] https://lava.sirena.org.uk/scheduler/job/2232475
# test job: [124f6155f3d97b0e33f178c10a5138a42c8fd207] https://lava.sirena.org.uk/scheduler/job/2232888
# test job: [aa30193af8873b3ccfd70a4275336ab6cbd4e5e6] https://lava.sirena.org.uk/scheduler/job/2232739
# test job: [e39011184f23de3d04ca8e80b4df76c9047b4026] https://lava.sirena.org.uk/scheduler/job/2232438
# test job: [fa08b566860bca8ebf9300090b85174c34de7ca5] https://lava.sirena.org.uk/scheduler/job/2232925
# test job: [6c177775dcc5e70a64ddf4ee842c66af498f2c7c] https://lava.sirena.org.uk/scheduler/job/2203261
# test job: [f96074c6d01d8a5e9e2fccd0bba5f2ed654c1f2d] https://lava.sirena.org.uk/scheduler/job/2341856
# bad: [f96074c6d01d8a5e9e2fccd0bba5f2ed654c1f2d] Add linux-next specific files for 20260107
git bisect bad f96074c6d01d8a5e9e2fccd0bba5f2ed654c1f2d
# test job: [2fdca3c405e768b297c7833abd381798ec67c12f] https://lava.sirena.org.uk/scheduler/job/2342141
# bad: [2fdca3c405e768b297c7833abd381798ec67c12f] Merge branch 'libcrypto-next' of https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git
git bisect bad 2fdca3c405e768b297c7833abd381798ec67c12f
# test job: [711e711d9a1413a312c4e529460dc6359d7a2e21] https://lava.sirena.org.uk/scheduler/job/2342498
# bad: [711e711d9a1413a312c4e529460dc6359d7a2e21] Merge branch 'next' of https://git.kernel.org/pub/scm/linux/kernel/git/jenswi/linux-tee.git
git bisect bad 711e711d9a1413a312c4e529460dc6359d7a2e21
# test job: [eeebc729d8edff51c7d53a8da80ab694fb00ec59] https://lava.sirena.org.uk/scheduler/job/2342751
# bad: [eeebc729d8edff51c7d53a8da80ab694fb00ec59] Merge branch 'soc_fsl' of https://git.kernel.org/pub/scm/linux/kernel/git/chleroy/linux.git
git bisect bad eeebc729d8edff51c7d53a8da80ab694fb00ec59
# test job: [99f5256ce0398314379c4d43f960bb05c6a68e86] https://lava.sirena.org.uk/scheduler/job/2343002
# bad: [99f5256ce0398314379c4d43f960bb05c6a68e86] Merge branch 'mm-unstable' of https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect bad 99f5256ce0398314379c4d43f960bb05c6a68e86
# test job: [0d560834ac1e15bda48c3df99397479d584ad49a] https://lava.sirena.org.uk/scheduler/job/2343255
# good: [0d560834ac1e15bda48c3df99397479d584ad49a] mm, swap: never bypass the swap cache even for SWP_SYNCHRONOUS_IO
git bisect good 0d560834ac1e15bda48c3df99397479d584ad49a
# test job: [f7722c416c0d4fc2183170daf0703cd7a18fb1e2] https://lava.sirena.org.uk/scheduler/job/2343380
# good: [f7722c416c0d4fc2183170daf0703cd7a18fb1e2] mm/damon/tests/core-kunit: verify the 'age' field in damon_test_split_at()
git bisect good f7722c416c0d4fc2183170daf0703cd7a18fb1e2
# test job: [af14267376cc7aea8183a2b01efe5458768652d5] https://lava.sirena.org.uk/scheduler/job/2343575
# bad: [af14267376cc7aea8183a2b01efe5458768652d5] mips: introduce arch_zone_limits_init()
git bisect bad af14267376cc7aea8183a2b01efe5458768652d5
# test job: [bb25558519874e266399381588490c1d1ffae133] https://lava.sirena.org.uk/scheduler/job/2343772
# bad: [bb25558519874e266399381588490c1d1ffae133] mm: hugetlb: allocate frozen pages for gigantic allocation
git bisect bad bb25558519874e266399381588490c1d1ffae133
# test job: [8c670157c80aa96182795e801ea542a20aff86f1] https://lava.sirena.org.uk/scheduler/job/2343923
# good: [8c670157c80aa96182795e801ea542a20aff86f1] mm/damon/tests/core-kunit: remove a redundant test case and add a new test case in damos_test_commit_quota_goal()
git bisect good 8c670157c80aa96182795e801ea542a20aff86f1
# test job: [d0a464d968a308259b9dc5af39c1059a83c0524d] https://lava.sirena.org.uk/scheduler/job/2343979
# good: [d0a464d968a308259b9dc5af39c1059a83c0524d] mm: cma: kill cma_pages_valid()
git bisect good d0a464d968a308259b9dc5af39c1059a83c0524d
# test job: [fb9a328d30400dbc8b2ea5a57daeb28bedac398b] https://lava.sirena.org.uk/scheduler/job/2344132
# bad: [fb9a328d30400dbc8b2ea5a57daeb28bedac398b] mm: cma: add cma_alloc_frozen{_compound}()
git bisect bad fb9a328d30400dbc8b2ea5a57daeb28bedac398b
# test job: [c619019d04ad8de96657cd0c8f1251ac14944338] https://lava.sirena.org.uk/scheduler/job/2344251
# good: [c619019d04ad8de96657cd0c8f1251ac14944338] mm: page_alloc: add alloc_contig_frozen_{range,pages}()
git bisect good c619019d04ad8de96657cd0c8f1251ac14944338
# first bad commit: [fb9a328d30400dbc8b2ea5a57daeb28bedac398b] mm: cma: add cma_alloc_frozen{_compound}()



* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2026-01-07 18:39   ` Mark Brown
@ 2026-01-07 18:50     ` Andrew Morton
  2026-01-07 19:38     ` Zi Yan
  1 sibling, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2026-01-07 18:50 UTC (permalink / raw)
  To: Mark Brown
  Cc: Claudiu Beznea, Kefeng Wang, David Hildenbrand, Oscar Salvador,
	Muchun Song, linux-mm, sidhartha.kumar, jane.chu, Zi Yan,
	Vlastimil Babka, Brendan Jackman, Johannes Weiner,
	Matthew Wilcox

On Wed, 7 Jan 2026 18:39:17 +0000 Mark Brown <broonie@kernel.org> wrote:

> On Wed, Jan 07, 2026 at 07:31:30PM +0200, Claudiu Beznea wrote:
> > On 12/30/25 09:24, Kefeng Wang wrote:
> 
> > > Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
> > > which avoid atomic operation about page refcount, and then convert to
> > > allocate frozen gigantic folio by the new helpers in hugetlb to cleanup
> > > the alloc_gigantic_folio().
> > 
> > I'm seeing the following issues on the Renesas RZ/G3S SoC when doing suspend
> > to idle:
> > 
> > [  129.636729] Unable to handle kernel paging request at virtual address
> > dead000000000108
> > [  129.644674] Mem abort info:
> > [  129.647456]   ESR = 0x0000000096000044
> 
> This is also introducing OOMs when doing at least audio tests (I don't
> think these are super relevant) on Raspberry Pi 3B+ running NFS root
> (probably more relevant):
> 
> [   64.064256] Unable to handle kernel paging request at virtual address fffffdffc1000000

Thanks.  I'll fully drop this series from mm.git.



* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2026-01-07 18:39   ` Mark Brown
  2026-01-07 18:50     ` Andrew Morton
@ 2026-01-07 19:38     ` Zi Yan
       [not found]       ` <CGME20260107225819eucas1p2de678d4e810fdbde87192b83033a814c@eucas1p2.samsung.com>
                         ` (3 more replies)
  1 sibling, 4 replies; 30+ messages in thread
From: Zi Yan @ 2026-01-07 19:38 UTC (permalink / raw)
  To: Mark Brown, Claudiu Beznea
  Cc: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
	Muchun Song, linux-mm, sidhartha.kumar, jane.chu,
	Vlastimil Babka, Brendan Jackman, Johannes Weiner,
	Matthew Wilcox

On 7 Jan 2026, at 13:39, Mark Brown wrote:

> On Wed, Jan 07, 2026 at 07:31:30PM +0200, Claudiu Beznea wrote:
>> On 12/30/25 09:24, Kefeng Wang wrote:
>
>>> Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
>>> which avoid atomic operation about page refcount, and then convert to
>>> allocate frozen gigantic folio by the new helpers in hugetlb to cleanup
>>> the alloc_gigantic_folio().
>>
>> I'm seeing the following issues on the Renesas RZ/G3S SoC when doing suspend
>> to idle:
>>
>> [  129.636729] Unable to handle kernel paging request at virtual address
>> dead000000000108
>> [  129.644674] Mem abort info:
>> [  129.647456]   ESR = 0x0000000096000044
>
> This is also introducing OOMs when doing at least audio tests (I don't
> think these are super relevant) on Raspberry Pi 3B+ running NFS root
> (probably more relevant):
>
> [   64.064256] Unable to handle kernel paging request at virtual address fffffdffc1000000
>
> ...
>
> [   64.087583] Call trace:
> [   64.087586]  kmem_cache_free+0x88/0x434 (P)
> [   64.087598]  skb_free_head+0x9c/0xb8
> [   64.087608]  skb_release_data+0x120/0x174
> [   64.087615]  __kfree_skb+0x2c/0x44
> [   64.087622]  tcp_data_queue+0x948/0xe50
>
> Full log:
>
>   https://lava.sirena.org.uk/scheduler/job/2341856#L1721
>
> Bisection identifies:
>
> [fb9a328d30400dbc8b2ea5a57daeb28bedac398b] mm: cma: add cma_alloc_frozen{_compound}()
>
> as being the comit that introduces the issue.  Bisect log with links to
> further test runs:

Hi Mark and Claudiu,

Can you try the patch below to see if it fixes the issue? In
cma_release(), count is consumed by the loop that drops each page's
refcount: the `for (; count--; pfn++)` loop leaves the unsigned count
wrapped to -1 (ULONG_MAX) on exit, so __cma_release_frozen() is then
called with a bogus count and releases pages it should not.

Thanks.


From ece23da65ea7210e1fcb51ee9c27aec19b84811c Mon Sep 17 00:00:00 2001
From: Zi Yan <ziy@nvidia.com>
Date: Wed, 7 Jan 2026 14:23:15 -0500
Subject: [PATCH] mm/cma: fix cma_release by not decreasing count to 0.
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/cma.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/cma.c b/mm/cma.c
index 5713becc602b..408b07f6fddd 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -1013,13 +1013,14 @@ bool cma_release(struct cma *cma, const struct page *pages,
 {
 	struct cma_memrange *cmr;
 	unsigned long pfn;
+	unsigned long i;

 	cmr = find_cma_memrange(cma, pages, count);
 	if (!cmr)
 		return false;

 	pfn = page_to_pfn(pages);
-	for (; count--; pfn++)
+	for (i = 0; i < count; i++, pfn++)
 		VM_WARN_ON(!put_page_testzero(pfn_to_page(pfn)));

 	__cma_release_frozen(cma, cmr, pages, count);
-- 
2.51.0




Best Regards,
Yan, Zi



* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
       [not found]       ` <CGME20260107225819eucas1p2de678d4e810fdbde87192b83033a814c@eucas1p2.samsung.com>
@ 2026-01-07 22:58         ` Marek Szyprowski
  0 siblings, 0 replies; 30+ messages in thread
From: Marek Szyprowski @ 2026-01-07 22:58 UTC (permalink / raw)
  To: Zi Yan, Mark Brown, Claudiu Beznea
  Cc: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
	Muchun Song, linux-mm, sidhartha.kumar, jane.chu,
	Vlastimil Babka, Brendan Jackman, Johannes Weiner,
	Matthew Wilcox

On 07.01.2026 20:38, Zi Yan wrote:
> On 7 Jan 2026, at 13:39, Mark Brown wrote:
>> On Wed, Jan 07, 2026 at 07:31:30PM +0200, Claudiu Beznea wrote:
>>> On 12/30/25 09:24, Kefeng Wang wrote:
>>>> Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
>>>> which avoid atomic operation about page refcount, and then convert to
>>>> allocate frozen gigantic folio by the new helpers in hugetlb to cleanup
>>>> the alloc_gigantic_folio().
>>> I'm seeing the following issues on the Renesas RZ/G3S SoC when doing suspend
>>> to idle:
>>>
>>> [  129.636729] Unable to handle kernel paging request at virtual address
>>> dead000000000108
>>> [  129.644674] Mem abort info:
>>> [  129.647456]   ESR = 0x0000000096000044
>> This is also introducing OOMs when doing at least audio tests (I don't
>> think these are super relevant) on Raspberry Pi 3B+ running NFS root
>> (probably more relevant):
>>
>> [   64.064256] Unable to handle kernel paging request at virtual address fffffdffc1000000
>>
>> ...
>>
>> [   64.087583] Call trace:
>> [   64.087586]  kmem_cache_free+0x88/0x434 (P)
>> [   64.087598]  skb_free_head+0x9c/0xb8
>> [   64.087608]  skb_release_data+0x120/0x174
>> [   64.087615]  __kfree_skb+0x2c/0x44
>> [   64.087622]  tcp_data_queue+0x948/0xe50
>>
>> Full log:
>>
>>    https://lava.sirena.org.uk/scheduler/job/2341856#L1721
>>
>> Bisection identifies:
>>
>> [fb9a328d30400dbc8b2ea5a57daeb28bedac398b] mm: cma: add cma_alloc_frozen{_compound}()
>>
>> as being the commit that introduces the issue.  Bisect log with links to
>> further test runs:
> Hi Mark and Claudiu,
>
> Can you try the patch below to see if it fixes the issue? Basically,
> in cma_release(), count was used to drop page ref and decreased to 0,
> but after the loop, count becomes -1 and __cma_release_frozen()
> is releasing unnecessary pages.

I ran into the same issue on my test farm with next-20260107, then 
bisected to this patchset. I can confirm that the below patch fixes it. 
Feel free to add:

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

>  From ece23da65ea7210e1fcb51ee9c27aec19b84811c Mon Sep 17 00:00:00 2001
> From: Zi Yan <ziy@nvidia.com>
> Date: Wed, 7 Jan 2026 14:23:15 -0500
> Subject: [PATCH] mm/cma: fix cma_release by not decreasing count to 0.
> Content-Type: text/plain; charset="utf-8"
>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>   mm/cma.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/cma.c b/mm/cma.c
> index 5713becc602b..408b07f6fddd 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -1013,13 +1013,14 @@ bool cma_release(struct cma *cma, const struct page *pages,
>   {
>   	struct cma_memrange *cmr;
>   	unsigned long pfn;
> +	unsigned long i;
>
>   	cmr = find_cma_memrange(cma, pages, count);
>   	if (!cmr)
>   		return false;
>
>   	pfn = page_to_pfn(pages);
> -	for (; count--; pfn++)
> +	for (i = 0; i < count; i++, pfn++)
>   		VM_WARN_ON(!put_page_testzero(pfn_to_page(pfn)));
>
>   	__cma_release_frozen(cma, cmr, pages, count);

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2026-01-07 19:38     ` Zi Yan
       [not found]       ` <CGME20260107225819eucas1p2de678d4e810fdbde87192b83033a814c@eucas1p2.samsung.com>
@ 2026-01-08  1:05       ` Kefeng Wang
  2026-01-08  1:53         ` Kefeng Wang
  2026-01-08  9:00       ` Claudiu Beznea
  2026-01-08  9:14       ` Konrad Dybcio
  3 siblings, 1 reply; 30+ messages in thread
From: Kefeng Wang @ 2026-01-08  1:05 UTC (permalink / raw)
  To: Zi Yan, Mark Brown, Claudiu Beznea
  Cc: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
	linux-mm, sidhartha.kumar, jane.chu, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox



On 2026/1/8 3:38, Zi Yan wrote:
> On 7 Jan 2026, at 13:39, Mark Brown wrote:
> 
>> On Wed, Jan 07, 2026 at 07:31:30PM +0200, Claudiu Beznea wrote:
>>> On 12/30/25 09:24, Kefeng Wang wrote:
>>
>>>> Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
>>>> which avoid atomic operations on the page refcount, and then convert
>>>> hugetlb to allocate frozen gigantic folios via the new helpers to
>>>> clean up alloc_gigantic_folio().
>>>
>>> I'm seeing the following issues on the Renesas RZ/G3S SoC when doing suspend
>>> to idle:
>>>
>>> [  129.636729] Unable to handle kernel paging request at virtual address
>>> dead000000000108
>>> [  129.644674] Mem abort info:
>>> [  129.647456]   ESR = 0x0000000096000044
>>
>> This is also introducing OOMs when doing at least audio tests (I don't
>> think these are super relevant) on Raspberry Pi 3B+ running NFS root
>> (probably more relevant):
>>
>> [   64.064256] Unable to handle kernel paging request at virtual address fffffdffc1000000
>>
>> ...
>>
>> [   64.087583] Call trace:
>> [   64.087586]  kmem_cache_free+0x88/0x434 (P)
>> [   64.087598]  skb_free_head+0x9c/0xb8
>> [   64.087608]  skb_release_data+0x120/0x174
>> [   64.087615]  __kfree_skb+0x2c/0x44
>> [   64.087622]  tcp_data_queue+0x948/0xe50
>>
>> Full log:
>>
>>    https://lava.sirena.org.uk/scheduler/job/2341856#L1721
>>
>> Bisection identifies:
>>
>> [fb9a328d30400dbc8b2ea5a57daeb28bedac398b] mm: cma: add cma_alloc_frozen{_compound}()
>>
>> as being the commit that introduces the issue.  Bisect log with links to
>> further test runs:
> 
> Hi Mark and Claudiu,

Thanks for the reports.

> 
> Can you try the patch below to see if it fixes the issue? Basically,
> in cma_release(), count was used to drop page ref and decreased to 0,
> but after the loop, count becomes -1 and __cma_release_frozen()
> is releasing unnecessary pages.

Oh, sorry for introducing the regression. My previous self-tests were
more focused on the hugetlb part, so I should have been more careful
about this. Thanks to Zi Yan for the quick fix; I will do more checking
and testing of the non-hugetlb part.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2026-01-08  1:05       ` Kefeng Wang
@ 2026-01-08  1:53         ` Kefeng Wang
  2026-01-08  3:25           ` Zi Yan
  0 siblings, 1 reply; 30+ messages in thread
From: Kefeng Wang @ 2026-01-08  1:53 UTC (permalink / raw)
  To: Zi Yan, Mark Brown, Claudiu Beznea
  Cc: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
	linux-mm, sidhartha.kumar, jane.chu, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox



On 2026/1/8 9:05, Kefeng Wang wrote:
> 
> 
> On 2026/1/8 3:38, Zi Yan wrote:
>> On 7 Jan 2026, at 13:39, Mark Brown wrote:
>>
>>> On Wed, Jan 07, 2026 at 07:31:30PM +0200, Claudiu Beznea wrote:
>>>> On 12/30/25 09:24, Kefeng Wang wrote:
>>>
>>>>> Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
>>>>> which avoid atomic operations on the page refcount, and then convert
>>>>> hugetlb to allocate frozen gigantic folios via the new helpers to
>>>>> clean up alloc_gigantic_folio().
>>>>
>>>> I'm seeing the following issues on the Renesas RZ/G3S SoC when doing 
>>>> suspend
>>>> to idle:
>>>>
>>>> [  129.636729] Unable to handle kernel paging request at virtual 
>>>> address
>>>> dead000000000108
>>>> [  129.644674] Mem abort info:
>>>> [  129.647456]   ESR = 0x0000000096000044
>>>
>>> This is also introducing OOMs when doing at least audio tests (I don't
>>> think these are super relevant) on Raspberry Pi 3B+ running NFS root
>>> (probably more relevant):
>>>
>>> [   64.064256] Unable to handle kernel paging request at virtual 
>>> address fffffdffc1000000
>>>
>>> ...
>>>
>>> [   64.087583] Call trace:
>>> [   64.087586]  kmem_cache_free+0x88/0x434 (P)
>>> [   64.087598]  skb_free_head+0x9c/0xb8
>>> [   64.087608]  skb_release_data+0x120/0x174
>>> [   64.087615]  __kfree_skb+0x2c/0x44
>>> [   64.087622]  tcp_data_queue+0x948/0xe50
>>>
>>> Full log:
>>>
>>>    https://lava.sirena.org.uk/scheduler/job/2341856#L1721
>>>
>>> Bisection identifies:
>>>
>>> [fb9a328d30400dbc8b2ea5a57daeb28bedac398b] mm: cma: add 
>>> cma_alloc_frozen{_compound}()
>>>
>>> as being the commit that introduces the issue.  Bisect log with links to
>>> further test runs:
>>
>> Hi Mark and Claudiu,
> 
> Thanks for the reports.
> 
>>
>> Can you try the patch below to see if it fixes the issue? Basically,
>> in cma_release(), count was used to drop page ref and decreased to 0,
>> but after the loop, count becomes -1 and __cma_release_frozen()
>> is releasing unnecessary pages.
> 
> Oh, sorry for introducing the regression. My previous self-tests were
> more focused on the hugetlb part, so I should have been more careful
> about this. Thanks to Zi Yan for the quick fix; I will do more checking
> and testing of the non-hugetlb part.
> 
> 

Based on the cma_debug interface, I performed quick tests to reproduce
the issue before Zi's fix. With the fix applied, the crash was resolved,
and no other issues were observed during testing.






^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2026-01-08  1:53         ` Kefeng Wang
@ 2026-01-08  3:25           ` Zi Yan
  2026-01-08  7:10             ` Kefeng Wang
  0 siblings, 1 reply; 30+ messages in thread
From: Zi Yan @ 2026-01-08  3:25 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Mark Brown, Claudiu Beznea, Andrew Morton, David Hildenbrand,
	Oscar Salvador, Muchun Song, linux-mm, sidhartha.kumar, jane.chu,
	Vlastimil Babka, Brendan Jackman, Johannes Weiner,
	Matthew Wilcox

On 7 Jan 2026, at 20:53, Kefeng Wang wrote:

> On 2026/1/8 9:05, Kefeng Wang wrote:
>>
>>
>> On 2026/1/8 3:38, Zi Yan wrote:
>>> On 7 Jan 2026, at 13:39, Mark Brown wrote:
>>>
>>>> On Wed, Jan 07, 2026 at 07:31:30PM +0200, Claudiu Beznea wrote:
>>>>> On 12/30/25 09:24, Kefeng Wang wrote:
>>>>
>>>>>> Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
>>>>>> which avoid atomic operations on the page refcount, and then convert
>>>>>> hugetlb to allocate frozen gigantic folios via the new helpers to
>>>>>> clean up alloc_gigantic_folio().
>>>>>
>>>>> I'm seeing the following issues on the Renesas RZ/G3S SoC when doing suspend
>>>>> to idle:
>>>>>
>>>>> [  129.636729] Unable to handle kernel paging request at virtual address
>>>>> dead000000000108
>>>>> [  129.644674] Mem abort info:
>>>>> [  129.647456]   ESR = 0x0000000096000044
>>>>
>>>> This is also introducing OOMs when doing at least audio tests (I don't
>>>> think these are super relevant) on Raspberry Pi 3B+ running NFS root
>>>> (probably more relevant):
>>>>
>>>> [   64.064256] Unable to handle kernel paging request at virtual address fffffdffc1000000
>>>>
>>>> ...
>>>>
>>>> [   64.087583] Call trace:
>>>> [   64.087586]  kmem_cache_free+0x88/0x434 (P)
>>>> [   64.087598]  skb_free_head+0x9c/0xb8
>>>> [   64.087608]  skb_release_data+0x120/0x174
>>>> [   64.087615]  __kfree_skb+0x2c/0x44
>>>> [   64.087622]  tcp_data_queue+0x948/0xe50
>>>>
>>>> Full log:
>>>>
>>>>    https://lava.sirena.org.uk/scheduler/job/2341856#L1721
>>>>
>>>> Bisection identifies:
>>>>
>>>> [fb9a328d30400dbc8b2ea5a57daeb28bedac398b] mm: cma: add cma_alloc_frozen{_compound}()
>>>>
>>>> as being the commit that introduces the issue.  Bisect log with links to
>>>> further test runs:
>>>
>>> Hi Mark and Claudiu,
>>
>> Thanks for the reports.
>>
>>>
>>> Can you try the patch below to see if it fixes the issue? Basically,
>>> in cma_release(), count was used to drop page ref and decreased to 0,
>>> but after the loop, count becomes -1 and __cma_release_frozen()
>>> is releasing unnecessary pages.
>>
>> Oh, sorry for introducing the regression. My previous self-tests were
>> more focused on the hugetlb part, so I should have been more careful
>> about this. Thanks to Zi Yan for the quick fix; I will do more checking
>> and testing of the non-hugetlb part.
>>
>>
>
> Based on the cma_debug interface, I performed quick tests to reproduce
> the issue before Zi's fix. With the fix applied, the crash was resolved,
> and no other issues were observed during testing.

It seems that the series has been dropped from Andrew’s tree. You probably
want to send a new version with the fix folded in.

Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 5/6] mm: cma: add cma_alloc_frozen{_compound}()
  2025-12-30  7:24 ` [PATCH v5 5/6] mm: cma: add cma_alloc_frozen{_compound}() Kefeng Wang
  2025-12-31  2:59   ` Zi Yan
@ 2026-01-08  4:19   ` Dmitry Baryshkov
  2026-01-08  6:57     ` Kefeng Wang
  1 sibling, 1 reply; 30+ messages in thread
From: Dmitry Baryshkov @ 2026-01-08  4:19 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
	linux-mm, sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox

On Tue, Dec 30, 2025 at 03:24:21PM +0800, Kefeng Wang wrote:
> Introduce the cma_alloc_frozen{_compound}() helpers to allocate pages
> without incrementing their refcount, then convert hugetlb cma to use
> cma_alloc_frozen_compound() and cma_release_frozen(), remove the
> unused cma_{alloc,free}_folio(), and move cma_validate_zones() into
> mm/internal.h since it has no outside users.
> 
> The set_pages_refcounted() is only called on non-compound pages after
> the above changes, so remove the PageHead handling.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  include/linux/cma.h |  26 +++--------
>  mm/cma.c            | 107 +++++++++++++++++++++++++++++---------------
>  mm/hugetlb_cma.c    |  24 +++++-----
>  mm/internal.h       |  10 ++---
>  4 files changed, 97 insertions(+), 70 deletions(-)
> 

This breaks booting of Qualcomm RB3 Gen2:

[    9.500774] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[    9.509862] Mem abort info:
[    9.512745]   ESR = 0x0000000096000004
[    9.516597]   EC = 0x25: DABT (current EL), IL = 32 bits
[    9.522050]   SET = 0, FnV = 0
[    9.525194]   EA = 0, S1PTW = 0
[    9.528429]   FSC = 0x04: level 0 translation fault
[    9.533440] Data abort info:
[    9.536400]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[    9.542030]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    9.547224]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    9.552684] [0000000000000008] user address but active_mm is swapper
[    9.559237] Internal error: Oops: 0000000096000004 [#1]  SMP
[    9.565054] Modules linked in:
[    9.568202] CPU: 7 UID: 0 PID: 59 Comm: kworker/u32:1 Not tainted 6.19.0-rc3-00212-gfb9a328d3040 #4016 PREEMPT
[    9.578552] Hardware name: Qualcomm Technologies, Inc. Robotics RB3gen2 (DT)
[    9.585792] Workqueue: events_unbound deferred_probe_work_func
[    9.591791] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    9.598941] pc : __get_pfnblock_flags_mask.isra.0+0x30/0x60
[    9.604669] lr : free_pcppages_bulk+0x120/0x274
[    9.609327] sp : ffff8000808ab610
[    9.612741] x29: ffff8000808ab610 x28: 0000000000000001 x27: ffff0001797fa380
[    9.620071] x26: ffffa5ded75b3800 x25: 0000000000000004 x24: ffff0001797fa390
[    9.627399] x23: ffffa5ded7850ea0 x22: 0000000000000001 x21: 000000000000003a
[    9.634728] x20: ffff00017f41b020 x19: ffff0001797fa3f0 x18: ffff000080a2a9e8
[    9.642057] x17: ffff000080a2a9e8 x16: ffff0000803c1b50 x15: 0000000000000003
[    9.649385] x14: 0000000000000028 x13: 0000000000006aa1 x12: 0000000000000000
[    9.656707] x11: 0000000000000001 x10: 0000000000000000 x9 : fffffdffc1f8c000
[    9.664037] x8 : ffff00017f41b028 x7 : ffff00017f41ae00 x6 : fffffdffc1f8c008
[    9.671364] x5 : fffffc08070506c0 x4 : 000001fffff8100e x3 : 0001fffff8100e0a
[    9.678693] x2 : 0000000000000000 x1 : 0000000000000017 x0 : fffffc08070506c0
[    9.686021] Call trace:
[    9.688537]  __get_pfnblock_flags_mask.isra.0+0x30/0x60 (P)
[    9.694257]  free_frozen_page_commit.isra.0+0x1a8/0x478
[    9.699628]  __free_frozen_pages+0x240/0x5c0
[    9.704013]  free_contig_frozen_range+0xc8/0x110
[    9.708764]  __cma_release_frozen+0x54/0x188
[    9.713159]  cma_release+0x4c/0x78
[    9.716661]  dma_free_contiguous+0x2c/0x74
[    9.720870]  dma_direct_free+0xf4/0x188
[    9.724814]  dma_free_attrs+0xa8/0x1d8
[    9.728666]  qcom_scm_pas_init_image+0x178/0x18c
[    9.733414]  qcom_mdt_pas_init+0x130/0x23c
[    9.737623]  qcom_mdt_load+0x44/0xa0
[    9.741299]  venus_boot+0x14c/0x2e8
[    9.744891]  venus_probe+0x32c/0x5d8
[    9.748567]  platform_probe+0x5c/0xa4
[    9.752332]  really_probe+0xbc/0x2c0
[    9.756009]  __driver_probe_device+0x78/0x120
[    9.760483]  driver_probe_device+0x3c/0x160
[    9.764779]  __device_attach_driver+0xb8/0x140
[    9.769347]  bus_for_each_drv+0x88/0xe8
[    9.773290]  __device_attach+0xa0/0x198
[    9.777232]  device_initial_probe+0x50/0x54
[    9.781527]  bus_probe_device+0x38/0xac
[    9.785468]  deferred_probe_work_func+0x90/0xcc
[    9.790130]  process_one_work+0x214/0x64c
[    9.794251]  worker_thread+0x1bc/0x360
[    9.798103]  kthread+0x14c/0x220
[    9.801424]  ret_from_fork+0x10/0x20
[    9.805103] Code: f8647842 f100005f 8b231043 9a821062 (f9400442)
[    9.811358] ---[ end trace 0000000000000000 ]---

-- 
With best wishes
Dmitry


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 5/6] mm: cma: add cma_alloc_frozen{_compound}()
  2026-01-08  4:19   ` Dmitry Baryshkov
@ 2026-01-08  6:57     ` Kefeng Wang
  0 siblings, 0 replies; 30+ messages in thread
From: Kefeng Wang @ 2026-01-08  6:57 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Andrew Morton, David Hildenbrand, Oscar Salvador, Muchun Song,
	linux-mm, sidhartha.kumar, jane.chu, Zi Yan, Vlastimil Babka,
	Brendan Jackman, Johannes Weiner, Matthew Wilcox



On 2026/1/8 12:19, Dmitry Baryshkov wrote:
> On Tue, Dec 30, 2025 at 03:24:21PM +0800, Kefeng Wang wrote:
>> Introduce the cma_alloc_frozen{_compound}() helpers to allocate pages
>> without incrementing their refcount, then convert hugetlb cma to use
>> cma_alloc_frozen_compound() and cma_release_frozen(), remove the
>> unused cma_{alloc,free}_folio(), and move cma_validate_zones() into
>> mm/internal.h since it has no outside users.
>>
>> The set_pages_refcounted() is only called on non-compound pages after
>> the above changes, so remove the PageHead handling.
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>   include/linux/cma.h |  26 +++--------
>>   mm/cma.c            | 107 +++++++++++++++++++++++++++++---------------
>>   mm/hugetlb_cma.c    |  24 +++++-----
>>   mm/internal.h       |  10 ++---
>>   4 files changed, 97 insertions(+), 70 deletions(-)
>>
> 
> This breaks booting of Qualcomm RB3 Gen2:

Thanks for your report, and sorry for the regression. Zi has posted a
fix [1]; could you try it? I will do more testing and resend a new version.

[1] 
https://lore.kernel.org/linux-mm/7253A444-97D1-4256-9AD9-BCFF66437510@nvidia.com/

> 
> [    9.500774] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> [    9.509862] Mem abort info:
> [    9.512745]   ESR = 0x0000000096000004
> [    9.516597]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    9.522050]   SET = 0, FnV = 0
> [    9.525194]   EA = 0, S1PTW = 0
> [    9.528429]   FSC = 0x04: level 0 translation fault
> [    9.533440] Data abort info:
> [    9.536400]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> [    9.542030]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [    9.547224]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [    9.552684] [0000000000000008] user address but active_mm is swapper
> [    9.559237] Internal error: Oops: 0000000096000004 [#1]  SMP
> [    9.565054] Modules linked in:
> [    9.568202] CPU: 7 UID: 0 PID: 59 Comm: kworker/u32:1 Not tainted 6.19.0-rc3-00212-gfb9a328d3040 #4016 PREEMPT
> [    9.578552] Hardware name: Qualcomm Technologies, Inc. Robotics RB3gen2 (DT)
> [    9.585792] Workqueue: events_unbound deferred_probe_work_func
> [    9.591791] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    9.598941] pc : __get_pfnblock_flags_mask.isra.0+0x30/0x60
> [    9.604669] lr : free_pcppages_bulk+0x120/0x274
> [    9.609327] sp : ffff8000808ab610
> [    9.612741] x29: ffff8000808ab610 x28: 0000000000000001 x27: ffff0001797fa380
> [    9.620071] x26: ffffa5ded75b3800 x25: 0000000000000004 x24: ffff0001797fa390
> [    9.627399] x23: ffffa5ded7850ea0 x22: 0000000000000001 x21: 000000000000003a
> [    9.634728] x20: ffff00017f41b020 x19: ffff0001797fa3f0 x18: ffff000080a2a9e8
> [    9.642057] x17: ffff000080a2a9e8 x16: ffff0000803c1b50 x15: 0000000000000003
> [    9.649385] x14: 0000000000000028 x13: 0000000000006aa1 x12: 0000000000000000
> [    9.656707] x11: 0000000000000001 x10: 0000000000000000 x9 : fffffdffc1f8c000
> [    9.664037] x8 : ffff00017f41b028 x7 : ffff00017f41ae00 x6 : fffffdffc1f8c008
> [    9.671364] x5 : fffffc08070506c0 x4 : 000001fffff8100e x3 : 0001fffff8100e0a
> [    9.678693] x2 : 0000000000000000 x1 : 0000000000000017 x0 : fffffc08070506c0
> [    9.686021] Call trace:
> [    9.688537]  __get_pfnblock_flags_mask.isra.0+0x30/0x60 (P)
> [    9.694257]  free_frozen_page_commit.isra.0+0x1a8/0x478
> [    9.699628]  __free_frozen_pages+0x240/0x5c0
> [    9.704013]  free_contig_frozen_range+0xc8/0x110
> [    9.708764]  __cma_release_frozen+0x54/0x188
> [    9.713159]  cma_release+0x4c/0x78
> [    9.716661]  dma_free_contiguous+0x2c/0x74
> [    9.720870]  dma_direct_free+0xf4/0x188
> [    9.724814]  dma_free_attrs+0xa8/0x1d8
> [    9.728666]  qcom_scm_pas_init_image+0x178/0x18c
> [    9.733414]  qcom_mdt_pas_init+0x130/0x23c
> [    9.737623]  qcom_mdt_load+0x44/0xa0
> [    9.741299]  venus_boot+0x14c/0x2e8
> [    9.744891]  venus_probe+0x32c/0x5d8
> [    9.748567]  platform_probe+0x5c/0xa4
> [    9.752332]  really_probe+0xbc/0x2c0
> [    9.756009]  __driver_probe_device+0x78/0x120
> [    9.760483]  driver_probe_device+0x3c/0x160
> [    9.764779]  __device_attach_driver+0xb8/0x140
> [    9.769347]  bus_for_each_drv+0x88/0xe8
> [    9.773290]  __device_attach+0xa0/0x198
> [    9.777232]  device_initial_probe+0x50/0x54
> [    9.781527]  bus_probe_device+0x38/0xac
> [    9.785468]  deferred_probe_work_func+0x90/0xcc
> [    9.790130]  process_one_work+0x214/0x64c
> [    9.794251]  worker_thread+0x1bc/0x360
> [    9.798103]  kthread+0x14c/0x220
> [    9.801424]  ret_from_fork+0x10/0x20
> [    9.805103] Code: f8647842 f100005f 8b231043 9a821062 (f9400442)
> [    9.811358] ---[ end trace 0000000000000000 ]---
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2026-01-08  3:25           ` Zi Yan
@ 2026-01-08  7:10             ` Kefeng Wang
  0 siblings, 0 replies; 30+ messages in thread
From: Kefeng Wang @ 2026-01-08  7:10 UTC (permalink / raw)
  To: Zi Yan
  Cc: Mark Brown, Claudiu Beznea, Andrew Morton, David Hildenbrand,
	Oscar Salvador, Muchun Song, linux-mm, sidhartha.kumar, jane.chu,
	Vlastimil Babka, Brendan Jackman, Johannes Weiner,
	Matthew Wilcox



On 2026/1/8 11:25, Zi Yan wrote:
> On 7 Jan 2026, at 20:53, Kefeng Wang wrote:
> 
>> On 2026/1/8 9:05, Kefeng Wang wrote:
>>>
>>>
>>> On 2026/1/8 3:38, Zi Yan wrote:
>>>> On 7 Jan 2026, at 13:39, Mark Brown wrote:
>>>>
>>>>> On Wed, Jan 07, 2026 at 07:31:30PM +0200, Claudiu Beznea wrote:
>>>>>> On 12/30/25 09:24, Kefeng Wang wrote:
>>>>>
>>>>>>> Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
>>>>>>> which avoid atomic operations on the page refcount, and then convert
>>>>>>> hugetlb to allocate frozen gigantic folios via the new helpers to
>>>>>>> clean up alloc_gigantic_folio().
>>>>>>
>>>>>> I'm seeing the following issues on the Renesas RZ/G3S SoC when doing suspend
>>>>>> to idle:
>>>>>>
>>>>>> [  129.636729] Unable to handle kernel paging request at virtual address
>>>>>> dead000000000108
>>>>>> [  129.644674] Mem abort info:
>>>>>> [  129.647456]   ESR = 0x0000000096000044
>>>>>
>>>>> This is also introducing OOMs when doing at least audio tests (I don't
>>>>> think these are super relevant) on Raspberry Pi 3B+ running NFS root
>>>>> (probably more relevant):
>>>>>
>>>>> [   64.064256] Unable to handle kernel paging request at virtual address fffffdffc1000000
>>>>>
>>>>> ...
>>>>>
>>>>> [   64.087583] Call trace:
>>>>> [   64.087586]  kmem_cache_free+0x88/0x434 (P)
>>>>> [   64.087598]  skb_free_head+0x9c/0xb8
>>>>> [   64.087608]  skb_release_data+0x120/0x174
>>>>> [   64.087615]  __kfree_skb+0x2c/0x44
>>>>> [   64.087622]  tcp_data_queue+0x948/0xe50
>>>>>
>>>>> Full log:
>>>>>
>>>>>     https://lava.sirena.org.uk/scheduler/job/2341856#L1721
>>>>>
>>>>> Bisection identifies:
>>>>>
>>>>> [fb9a328d30400dbc8b2ea5a57daeb28bedac398b] mm: cma: add cma_alloc_frozen{_compound}()
>>>>>
>>>>> as being the commit that introduces the issue.  Bisect log with links to
>>>>> further test runs:
>>>>
>>>> Hi Mark and Claudiu,
>>>
>>> Thanks for the reports.
>>>
>>>>
>>>> Can you try the patch below to see if it fixes the issue? Basically,
>>>> in cma_release(), count was used to drop page ref and decreased to 0,
>>>> but after the loop, count becomes -1 and __cma_release_frozen()
>>>> is releasing unnecessary pages.
>>>
>>> Oh, sorry for introducing the regression. My previous self-tests were
>>> more focused on the hugetlb part, so I should have been more careful
>>> about this. Thanks to Zi Yan for the quick fix; I will do more checking
>>> and testing of the non-hugetlb part.
>>>
>>>
>>
>> Based on the cma_debug interface, I performed quick tests to reproduce
>> the issue before Zi's fix. With the fix applied, the crash was resolved,
>> and no other issues were observed during testing.
> 
> It seems that the series has been dropped from Andrew’s tree. You probably
> want to send a new version with the fix folded in.
> 

OK, thanks for the reminder, will do.

> Best Regards,
> Yan, Zi
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2026-01-07 19:38     ` Zi Yan
       [not found]       ` <CGME20260107225819eucas1p2de678d4e810fdbde87192b83033a814c@eucas1p2.samsung.com>
  2026-01-08  1:05       ` Kefeng Wang
@ 2026-01-08  9:00       ` Claudiu Beznea
  2026-01-08  9:14       ` Konrad Dybcio
  3 siblings, 0 replies; 30+ messages in thread
From: Claudiu Beznea @ 2026-01-08  9:00 UTC (permalink / raw)
  To: Zi Yan, Mark Brown
  Cc: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
	Muchun Song, linux-mm, sidhartha.kumar, jane.chu,
	Vlastimil Babka, Brendan Jackman, Johannes Weiner,
	Matthew Wilcox

Hi,

On 1/7/26 21:38, Zi Yan wrote:
> On 7 Jan 2026, at 13:39, Mark Brown wrote:
> 
>> On Wed, Jan 07, 2026 at 07:31:30PM +0200, Claudiu Beznea wrote:
>>> On 12/30/25 09:24, Kefeng Wang wrote:
>>
>>>> Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
>>>> which avoid atomic operations on the page refcount, and then convert
>>>> hugetlb to allocate frozen gigantic folios via the new helpers to
>>>> clean up alloc_gigantic_folio().
>>>
>>> I'm seeing the following issues on the Renesas RZ/G3S SoC when doing suspend
>>> to idle:
>>>
>>> [  129.636729] Unable to handle kernel paging request at virtual address
>>> dead000000000108
>>> [  129.644674] Mem abort info:
>>> [  129.647456]   ESR = 0x0000000096000044
>>
>> This is also introducing OOMs when doing at least audio tests (I don't
>> think these are super relevant) on Raspberry Pi 3B+ running NFS root
>> (probably more relevant):
>>
>> [   64.064256] Unable to handle kernel paging request at virtual address fffffdffc1000000
>>
>> ...
>>
>> [   64.087583] Call trace:
>> [   64.087586]  kmem_cache_free+0x88/0x434 (P)
>> [   64.087598]  skb_free_head+0x9c/0xb8
>> [   64.087608]  skb_release_data+0x120/0x174
>> [   64.087615]  __kfree_skb+0x2c/0x44
>> [   64.087622]  tcp_data_queue+0x948/0xe50
>>
>> Full log:
>>
>>    https://lava.sirena.org.uk/scheduler/job/2341856#L1721
>>
>> Bisection identifies:
>>
>> [fb9a328d30400dbc8b2ea5a57daeb28bedac398b] mm: cma: add cma_alloc_frozen{_compound}()
>>
>> as being the commit that introduces the issue.  Bisect log with links to
>> further test runs:
> 
> Hi Mark and Claudiu,
> 
> Can you try the patch below to see if it fixes the issue? Basically,
> in cma_release(), count was used to drop page ref and decreased to 0,
> but after the loop, count becomes -1 and __cma_release_frozen()
> is releasing unnecessary pages.
> 
> Thanks.
> 
> 
>  From ece23da65ea7210e1fcb51ee9c27aec19b84811c Mon Sep 17 00:00:00 2001
> From: Zi Yan <ziy@nvidia.com>
> Date: Wed, 7 Jan 2026 14:23:15 -0500
> Subject: [PATCH] mm/cma: fix cma_release by not decreasing count to 0.
> Content-Type: text/plain; charset="utf-8"
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>   mm/cma.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/cma.c b/mm/cma.c
> index 5713becc602b..408b07f6fddd 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -1013,13 +1013,14 @@ bool cma_release(struct cma *cma, const struct page *pages,
>   {
>   	struct cma_memrange *cmr;
>   	unsigned long pfn;
> +	unsigned long i;
> 
>   	cmr = find_cma_memrange(cma, pages, count);
>   	if (!cmr)
>   		return false;
> 
>   	pfn = page_to_pfn(pages);
> -	for (; count--; pfn++)
> +	for (i = 0; i < count; i++, pfn++)
>   		VM_WARN_ON(!put_page_testzero(pfn_to_page(pfn)));
> 
>   	__cma_release_frozen(cma, cmr, pages, count);

This solves my use case as well. If any, you can add:

Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>

Thank you,
Claudiu


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio
  2026-01-07 19:38     ` Zi Yan
                         ` (2 preceding siblings ...)
  2026-01-08  9:00       ` Claudiu Beznea
@ 2026-01-08  9:14       ` Konrad Dybcio
  3 siblings, 0 replies; 30+ messages in thread
From: Konrad Dybcio @ 2026-01-08  9:14 UTC (permalink / raw)
  To: Zi Yan, Mark Brown, Claudiu Beznea
  Cc: Kefeng Wang, Andrew Morton, David Hildenbrand, Oscar Salvador,
	Muchun Song, linux-mm, sidhartha.kumar, jane.chu,
	Vlastimil Babka, Brendan Jackman, Johannes Weiner,
	Matthew Wilcox

On 1/7/26 8:38 PM, Zi Yan wrote:
> On 7 Jan 2026, at 13:39, Mark Brown wrote:
> 
>> On Wed, Jan 07, 2026 at 07:31:30PM +0200, Claudiu Beznea wrote:
>>> On 12/30/25 09:24, Kefeng Wang wrote:
>>
>>>> Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound()
>>>> which avoid atomic operations on the page refcount, and then convert
>>>> hugetlb to allocate frozen gigantic folios via the new helpers to
>>>> clean up alloc_gigantic_folio().
>>>
>>> I'm seeing the following issues on the Renesas RZ/G3S SoC when doing suspend
>>> to idle:
>>>
>>> [  129.636729] Unable to handle kernel paging request at virtual address
>>> dead000000000108
>>> [  129.644674] Mem abort info:
>>> [  129.647456]   ESR = 0x0000000096000044
>>
>> This is also introducing OOMs when doing at least audio tests (I don't
>> think these are super relevant) on Raspberry Pi 3B+ running NFS root
>> (probably more relevant):
>>
>> [   64.064256] Unable to handle kernel paging request at virtual address fffffdffc1000000
>>
>> ...
>>
>> [   64.087583] Call trace:
>> [   64.087586]  kmem_cache_free+0x88/0x434 (P)
>> [   64.087598]  skb_free_head+0x9c/0xb8
>> [   64.087608]  skb_release_data+0x120/0x174
>> [   64.087615]  __kfree_skb+0x2c/0x44
>> [   64.087622]  tcp_data_queue+0x948/0xe50
>>
>> Full log:
>>
>>   https://lava.sirena.org.uk/scheduler/job/2341856#L1721
>>
>> Bisection identifies:
>>
>> [fb9a328d30400dbc8b2ea5a57daeb28bedac398b] mm: cma: add cma_alloc_frozen{_compound}()
>>
>> as being the commit that introduces the issue.  Bisect log with links to
>> further test runs:
> 
> Hi Mark and Claudiu,
> 
> Can you try the patch below to see if it fixes the issue? Basically,
> in cma_release(), count was used to drop page ref and decreased to 0,
> but after the loop, count becomes -1 and __cma_release_frozen()
> is releasing unnecessary pages.
> 
> Thanks.
> 
> 
> From ece23da65ea7210e1fcb51ee9c27aec19b84811c Mon Sep 17 00:00:00 2001
> From: Zi Yan <ziy@nvidia.com>
> Date: Wed, 7 Jan 2026 14:23:15 -0500
> Subject: [PATCH] mm/cma: fix cma_release by not decreasing count to 0.
> Content-Type: text/plain; charset="utf-8"
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---

This also fixes booting on a number of Qualcomm platforms

Tested-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>

Konrad


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2026-01-08  9:14 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-30  7:24 [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
2025-12-30  7:24 ` [PATCH v5 1/6] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
2026-01-02 18:51   ` Sid Kumar
2025-12-30  7:24 ` [PATCH v5 2/6] mm: page_alloc: add __split_page() Kefeng Wang
2026-01-02 18:55   ` Sid Kumar
2025-12-30  7:24 ` [PATCH v5 3/6] mm: cma: kill cma_pages_valid() Kefeng Wang
2025-12-30  7:24 ` [PATCH v5 4/6] mm: page_alloc: add alloc_contig_frozen_{range,pages}() Kefeng Wang
2025-12-31  2:57   ` Zi Yan
2026-01-02 21:05   ` Sid Kumar
2025-12-30  7:24 ` [PATCH v5 5/6] mm: cma: add cma_alloc_frozen{_compound}() Kefeng Wang
2025-12-31  2:59   ` Zi Yan
2026-01-08  4:19   ` Dmitry Baryshkov
2026-01-08  6:57     ` Kefeng Wang
2025-12-30  7:24 ` [PATCH v5 6/6] mm: hugetlb: allocate frozen pages for gigantic allocation Kefeng Wang
2025-12-31  2:50   ` Muchun Song
2025-12-31  3:00   ` Zi Yan
2025-12-30 18:17 ` [PATCH v5 mm-new 0/6] mm: hugetlb: allocate frozen gigantic folio Andrew Morton
2026-01-07 17:31 ` Claudiu Beznea
2026-01-07 18:25   ` Andrew Morton
2026-01-07 18:26   ` Zi Yan
2026-01-07 18:39   ` Mark Brown
2026-01-07 18:50     ` Andrew Morton
2026-01-07 19:38     ` Zi Yan
     [not found]       ` <CGME20260107225819eucas1p2de678d4e810fdbde87192b83033a814c@eucas1p2.samsung.com>
2026-01-07 22:58         ` Marek Szyprowski
2026-01-08  1:05       ` Kefeng Wang
2026-01-08  1:53         ` Kefeng Wang
2026-01-08  3:25           ` Zi Yan
2026-01-08  7:10             ` Kefeng Wang
2026-01-08  9:00       ` Claudiu Beznea
2026-01-08  9:14       ` Konrad Dybcio

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox