> On Feb 15, 2025, at 15:20, yangge1116@126.com wrote:
>
> From: Ge Yang
>
> Since the introduction of commit b65d4adbc0f0 ("mm: hugetlb: defer freeing
> of HugeTLB pages"), which supports deferring the freeing of HugeTLB pages,
> the allocation of contiguous memory through cma_alloc() may fail
> probabilistically.
>
> In the CMA allocation process, if it is found that the CMA area is occupied
> by in-use hugepage folios, these in-use hugepage folios need to be migrated
> to another location. When there are no available hugepage folios in the
> free HugeTLB pool during the migration of in-use HugeTLB pages, new folios
> are allocated from the buddy system. A temporary state is set on the newly
> allocated folio. Upon completion of the hugepage folio migration, the
> temporary state is transferred from the new folios to the old folios.
> Normally, when the old folios with the temporary state are freed, it is
> directly released back to the buddy system. However, due to the deferred
> freeing of HugeTLB pages, the PageBuddy() check fails, ultimately leading
> to the failure of cma_alloc().
>
> Here is a simplified call trace illustrating the process:
> cma_alloc()
>     ->__alloc_contig_migrate_range() // Migrate in-use hugepage
>         ->unmap_and_move_huge_page()
>             ->folio_putback_hugetlb() // Free old folios
>     ->test_pages_isolated()
>         ->__test_page_isolated_in_pageblock()
>             ->PageBuddy(page) // Check if the page is in buddy
>
> To resolve this issue, we have implemented a function named
> wait_for_hugepage_folios_freed(). This function ensures that the hugepage
> folios are properly released back to the buddy system after their migration
> is completed. By invoking wait_for_hugepage_folios_freed() before calling
> PageBuddy(), we ensure that PageBuddy() will succeed.
>
> Fixes: b65d4adbc0f0 ("mm: hugetlb: defer freeing of HugeTLB pages")
> Signed-off-by: Ge Yang
> ---
>
> V2:
> - flush all folios at once suggested by David
>
>  include/linux/hugetlb.h |  5 +++++
>  mm/hugetlb.c            |  8 ++++++++
>  mm/page_isolation.c     | 10 ++++++++++
>  3 files changed, 23 insertions(+)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 6c6546b..04708b0 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -697,6 +697,7 @@ bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
>
>  int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
>  int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
> +void wait_for_hugepage_folios_freed(void);
>  struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>  				unsigned long addr, bool cow_from_owner);
>  struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
> @@ -1092,6 +1093,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
>  	return 0;
>  }
>
> +static inline void wait_for_hugepage_folios_freed(void)
> +{
> +}
> +
>  static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>  					   unsigned long addr,
>  					   bool cow_from_owner)
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 30bc34d..36dd3e4 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2955,6 +2955,14 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
>  	return ret;
>  }
>
> +void wait_for_hugepage_folios_freed(void)

We usually use the "hugetlb" term now instead of "huge_page" to
differentiate it from THP. So I suggest naming it
wait_for_hugetlb_folios_freed().

> +{
> +	struct hstate *h;
> +
> +	for_each_hstate(h)
> +		flush_free_hpage_work(h);

Because all hstates share the same work item to defer the freeing of
hugetlb pages, we only need to flush once. Directly using
flush_work(&free_hpage_work) is enough.
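
Something along these lines; a rough, untested sketch. The struct
work_struct and flush_work() definitions below are userspace stand-ins
made up so the snippet builds on its own (assumptions, not kernel code).
In mm/hugetlb.c the function body would simply call the real flush_work()
on the existing free_hpage_work:

```c
#include <stdbool.h>

/* Userspace stand-ins (assumptions) so the sketch builds outside the kernel. */
struct work_struct {
	int queued;	/* folios still waiting to be freed */
	int flushes;	/* how many times the work was flushed */
};

/* Models the single deferred-free work item shared by all hstates. */
static struct work_struct free_hpage_work = { .queued = 3 };

/* Stand-in for the kernel's flush_work(): drain everything queued. */
static bool flush_work(struct work_struct *work)
{
	bool was_pending = work->queued > 0;

	work->queued = 0;
	work->flushes++;
	return was_pending;
}

/* One flush covers every hstate, so no for_each_hstate() loop is needed. */
void wait_for_hugetlb_folios_freed(void)
{
	flush_work(&free_hpage_work);
}
```
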
> +}
> +
>  typedef enum {
>  	/*
>  	 * For either 0/1: we checked the per-vma resv map, and one resv
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 8ed53ee0..f56cf02 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -615,6 +615,16 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
>  	int ret;
>
>  	/*
> +	 * Due to the deferred freeing of HugeTLB folios, the hugepage folios may
> +	 * not immediately release to the buddy system. This can cause PageBuddy()
> +	 * to fail in __test_page_isolated_in_pageblock(). To ensure that the
> +	 * hugepage folios are properly released back to the buddy system, we

hugetlb folios, pls.

Thanks,
Muchun

> +	 * invoke the wait_for_hugepage_folios_freed() function to wait for the
> +	 * release to complete.
> +	 */
> +	wait_for_hugepage_folios_freed();
> +
> +	/*
>  	 * Note: pageblock_nr_pages != MAX_PAGE_ORDER. Then, chunks of free
>  	 * pages are not aligned to pageblock_nr_pages.
>  	 * Then we just check migratetype first.
> --
> 2.7.4
>