From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Oscar Salvador <osalvador@suse.de>,
Muchun Song <muchun.song@linux.dev>, Zi Yan <ziy@nvidia.com>,
Matthew Wilcox <willy@infradead.org>
Cc: <sidhartha.kumar@oracle.com>, <jane.chu@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, <linux-mm@kvack.org>
Subject: Re: [PATCH 4/4] mm: hugetlb: allocate frozen pages in alloc_gigantic_folio()
Date: Sat, 13 Sep 2025 12:13:36 +0800 [thread overview]
Message-ID: <39ea6d31-ec9c-4053-a875-8e86a8676a62@huawei.com> (raw)
In-Reply-To: <5ede233c-c8c6-4067-afb3-df94b4222cda@redhat.com>
On 2025/9/13 2:07, David Hildenbrand wrote:
>
>>>>
>>>> Would be interesting trying to see how much overhead would remain when
>>>> just dealing
>>
>> Patch2 skip atomic update in split_page() with
>> alloc_contig_frozen_pages(),
>> I could test performance about old alloc_contig_pages() with/without
>> GFP_COMP and new alloc_contig_frozen_pages() when allocate same size.
>
> Yes, that would be very interesting.
TEST1: alloc_contig_frozen_pages(split without atomic) + GFP_KERNEL
TEST2: alloc_contig_pages(split with atomic) + GFP_KERNEL
TEST3: alloc_contig_frozen_pages(split without atomic) +
GFP_KERNEL|__GFP_COMP
TEST4: alloc_contig_pages(split with atomic) + GFP_KERNEL|__GFP_COMP
Alloc/free 10 * 1G, the result shows as usec
TEST1 TEST2 TEST3 TEST4
alloc
301024 296947 286681 288125
297099 297194 286810 289906
297134 300445 289950 288393
free
215014 215332 16730 16613
214730 214948 16516 16588
215727 214945 16589 16507
The perf profile show below,
split_free_frozen_pages()'s proportion is lower than split_free_pages,
but the total time is not reduced.
TEST1 alloc:
- sysctl_test_sysctl_handler.part.0
- 96.92% alloc_contig_frozen_pages_noprof
- 82.69% pfn_range_valid_contig
pfn_to_online_page
- 13.97% alloc_contig_range_frozen_noprof
4.39% replace_free_hugepage_folios
- 4.34% split_free_frozen_pages
__list_add_valid_or_report
- 3.22% undo_isolate_page_range
- 3.04% unset_migratetype_isolate
- __move_freepages_block_isolate
2.96% __move_freepages_block
- 0.86% __drain_all_pages
drain_pages_zone
free_pcppages_bulk
- 0.59% start_isolate_page_range
- 0.51% set_migratetype_isolate
__move_freepages_block_isolate
TEST2 alloc:
- sysctl_test_sysctl_handler.part.0
- 99.82% alloc_contig_pages_old_noprof
- 81.59% pfn_range_valid_contig
pfn_to_online_page
- 18.16% alloc_contig_range_old_noprof.constprop.0
- 8.10% split_free_pages
4.72% __split_page
1.38% __list_add_valid_or_report
5.83% replace_free_hugepage_folios
- 3.07% undo_isolate_page_range
- 3.01% unset_migratetype_isolate
- __move_freepages_block_isolate
2.88% __move_freepages_block
TEST3 alloc:
- sysctl_test_sysctl_handler.part.0
- 99.85% alloc_contig_frozen_pages_noprof
- 86.04% pfn_range_valid_contig
pfn_to_online_page
- 13.52% alloc_contig_range_frozen_noprof
4.23% replace_free_hugepage_folios
- 3.99% prep_new_page
prep_compound_page
- 3.86% undo_isolate_page_range
- 3.43% unset_migratetype_isolate
- __move_freepages_block_isolate
3.35% __move_freepages_block
0.54% __alloc_contig_migrate_range
>
>>
>>>>
>>>> OTOH, maybe we can leave GFP_COMPOUND support in but make the function
>>>> more generic, not limited to folios (I suspect many users will not want
>>>> folios, except hugetlb).
>>>>
>>
>> Maybe just only add cma_alloc_frozen(), and
>> cma_alloc()/hugetlb_cma_alloc_folio()
>> is the wrapper that calls it and set page refcount, like what we did in
>> other frozen allocation.
>>
>> struct page *cma_alloc_frozen(struct cma *cma, unsigned long count,
>> unsigned int align, gfp_t gfp);
>
> I think it's weird that cma_alloc_frozen() consumes gfp_t when
> cma_alloc() doesn't. So I would wish that we can clean that up (->
> remove gfp if possible).
>
> So maybe we really want a
>
> cma_alloc_frozen
>
> and
>
> cma_alloc_frozen_compound
>
> whereby the latter consumes an "order" instead of "count + align".
>
>>
>>>> Maybe just a
>>>>
>>>> struct page * cma_alloc_compound(struct cma *cma, unsigned int order,
>>>> unsigned int align, bool no_warn);
>>>
>>> ^ no need for the align as I realized, just like
>>> cma_alloc_folio() doesn't have.
>>
>> since cma_alloc_frozen is more generic, we need keep align,
>>
>>>
>>> I do wonder why we decided to allow cma_alloc_folio() to consume gfp_t
>>> flags when we don't do the same for cma_alloc().
>>>
>>
>> cma_alloc_folio() now is called by hugetlb allocation, the gfp could be
>> htlb_alloc_mask(), with/without __GFP_THISNODE/__GFP_RETRY_MAYFAIL...
>> and this could be used in alloc_contig_frozen_pages(),(eg,
>> gfp_zone).
>
> Take a look at alloc_contig_range_noprof() where I added
>
> __alloc_contig_verify_gfp_mask()
>
> Essentially, we ignore
>
> GFP_ZONEMASK | __GFP_RECLAIMABLE | __GFP_WRITE | __GFP_HARDWALL
> | __GFP_THISNODE | __GFP_MOVABLE
>
> And we *always* set __GFP_RETRY_MAYFAIL
>
> I'd argue that hugetlb passing in random parameters that don't really
> make sense for contig allcoations is rather an issue and makes you
> believe that they would have any effect.
>
> If cma_alloc doesn't provide them a cma_alloc_frozen or cma_alloc_folio
> shouldn't provide them.
Oh, I make a mistake, the cma_alloc_folio calls alloc_contig_range
(not alloc_contig_pages) to alloc page from special cma ranges, and
__GFP_THISNODE only needed by hugetlb_cma_alloc_folio(), so the gfp
flags is not useful for cma_alloc_frozen or cma_alloc_folio, let's
remove it.
>
>>
>> For cma_alloc() unconditional use GFP_KERNEL from commit
>> 6518202970c1 "mm/cma: remove unsupported gfp_mask parameter from
>> cma_alloc()", but this is another story.
>>
>> Back to this patchset, just add a new cma_alloc_frozen() shown
>> above and directly call it in cma_alloc()/hugetlb_cma_alloc_folio(),
>> get rid of folio_alloc_frozen_gigantic() and cma_alloc_folio(), and
>> we could do more optimization in the next step.
>
> Yeah, that's probably a good first step, but while at it I would hope we
> could just cleanup the gfp stuff and provide a more consistent interface.
>
So
struct page *cma_alloc_frozen(struct cma *cma, unsigned long count,
unsigned int align, bool no_warn) in include/linux/cma.h
and
struct page *cma_alloc_frozen_compound(struct cma *cma, unsigned int
order) in mm/internal.h since maybe only hugetlb use it.
or keep cma_alloc_folio() but remove gfp,
struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
Any comments?
Thanks.
prev parent reply other threads:[~2025-09-13 4:13 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-11 6:56 [PATCH 0/4] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
2025-09-11 6:56 ` [PATCH 1/4] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
2025-09-12 6:58 ` Kefeng Wang
2025-09-11 6:56 ` [PATCH 2/4] mm: page_alloc: add alloc_contig_{range_frozen,frozen_pages}() Kefeng Wang
2025-09-11 6:56 ` [PATCH 3/4] mm: cma: add __cma_release() Kefeng Wang
2025-09-11 6:56 ` [PATCH 4/4] mm: hugetlb: allocate frozen pages in alloc_gigantic_folio() Kefeng Wang
2025-09-11 8:25 ` David Hildenbrand
2025-09-11 9:11 ` Kefeng Wang
2025-09-11 18:56 ` David Hildenbrand
2025-09-12 6:57 ` Kefeng Wang
2025-09-12 7:18 ` David Hildenbrand
2025-09-12 7:23 ` David Hildenbrand
2025-09-12 9:12 ` Kefeng Wang
2025-09-12 18:07 ` David Hildenbrand
2025-09-13 4:13 ` Kefeng Wang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=39ea6d31-ec9c-4053-a875-8e86a8676a62@huawei.com \
--to=wangkefeng.wang@huawei.com \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=jackmanb@google.com \
--cc=jane.chu@oracle.com \
--cc=linux-mm@kvack.org \
--cc=muchun.song@linux.dev \
--cc=osalvador@suse.de \
--cc=sidhartha.kumar@oracle.com \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox