From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Oscar Salvador <osalvador@suse.de>,
Muchun Song <muchun.song@linux.dev>
Cc: <sidhartha.kumar@oracle.com>, <jane.chu@oracle.com>,
Zi Yan <ziy@nvidia.com>, Vlastimil Babka <vbabka@suse.cz>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, <linux-mm@kvack.org>,
Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v3 3/6] mm: page_alloc: add alloc_contig_{range_frozen,frozen_pages}()
Date: Thu, 23 Oct 2025 20:06:23 +0800 [thread overview]
Message-ID: <35ec1c59-d6b2-4840-aea0-2e5c219d9d99@huawei.com> (raw)
In-Reply-To: <a10446f4-d069-45dc-94c8-7c22fcf7f30f@huawei.com>
On 2025/10/20 23:21, Kefeng Wang wrote:
> + Matthew
>
> On 2025/10/20 21:07, David Hildenbrand wrote:
>>
>>>>
>>>>> +void free_contig_range_frozen(unsigned long pfn, unsigned long
>>>>> nr_pages)
>>>>> +{
>>>>> + struct folio *folio = pfn_folio(pfn);
>>>>> +
>>>>> + if (folio_test_large(folio)) {
>>>>> + int expected = folio_nr_pages(folio);
>>>>> +
>>>>> + WARN_ON(folio_ref_count(folio));
>>>>> +
>>>>> + if (nr_pages == expected)
>>>>> + free_frozen_pages(&folio->page, folio_order(folio));
>>>>> + else
>>>>> + WARN(true, "PFN %lu: nr_pages %lu != expected %d\n",
>>>>> + pfn, nr_pages, expected);
>>>>> + return;
>>>>> + }
>>>>> +
>>>>> + for (; nr_pages--; pfn++) {
>>>>> + struct page *page = pfn_to_page(pfn);
>>>>> +
>>>>> + WARN_ON(page_ref_count(page));
>>>>> + free_frozen_pages(page, 0);
>>>>> + }
>>>>
>>>> That's mostly a copy-and-paste of free_contig_range().
>>>>
>>>> I wonder if there is some way to avoid duplicating a lot of
>>>> free_contig_range() here. Hmmm.
>>>>
>>>> Also, the folio stuff in there looks a bit weird I'm afraid.
>>>>
>>>> Can't we just refuse to free compound pages throught this interface and
>>>> free_contig_range() ? IIRC only hugetlb uses it and uses folio_put()
>>>> either way?
>>>>
>>>> Then we can just document that compound allocations are to be freed
>>>> differently.
>>>
>>>
>>> There is a case for cma_free_folio, which calls free_contig_range for
>>> both in cma_release(), but I will try to check whether we could avoid
>>> the folio stuff in free_contig_range().
>>
>>
>> Ah, right, there is hugetlb_cma_free_folio()->cma_free_folio().
>>
>> And we need that, because we have to make sure that CMA stats are
>> updated properly.
>>
>> All compound page handling in the freeing path is just nasty and not
>> particularly future-proof regarding memdescs.
>>
>> I wonder if we could just teach alloc_contig to never hand out
>> compound pages and then let the freeing path similarly assert that
>> there are no compound pages.
>>
>> Whoever wants a compound page (currently only hugetlb?) can create
>> that from a frozen range. Before returning the frozen range the
>> compound page can be dissolved. That way also any memdesc can be
>> allocated/freed by the caller later.
>>
>> The only nasty thing is the handing of splitting/merging of
>> set_page_owner/page_table_check_alloc etc. :(
>>
>>
>>
>> As an alternative, we could only allow compound pages for frozen
>> pages. This way, we'd force any caller to handle the allocation/
>> freeing of the memdesc in the future manually.
>>
>> Essentially, only allow GFP_COMPOUND on the frozen interface, which we
>> would convert hugetlb to.
>>
>> That means that we can simplify free_contig_range() [no need to handle
>> compound pages]. For free_contig_frozen_range() we would skip refcount
>> checks on that level and do something like:
I tried to only allocate/free non-compound pages in
alloc_contig_{range,pages}() and free_contig_range(),
The new added alloc_contig_frozen_{range,pages} can allocate compound/
non-compound frozen pages, let's discuss in the new version[1].
[1]
https://lore.kernel.org/linux-mm/20251023115940.3573158-1-wangkefeng.wang@huawei.com/
>>
>> void free_contig_frozen_range(unsigned long pfn, unsigned long nr_pages)
>> {
>> struct page *first_page = pfn_to_page(pfn)
>> const unsigned int order = ilog2(nr_pages);
>>
>> if (PageHead(first_page)) {
>> WARN_ON_ONCE(order != compound_order(first_page));
>> free_frozen_pages(first_page, order);
>> return;
>> }
>>
>> for (; nr_pages--; pfn++)
>> free_frozen_pages(pfn_to_page(pfn), 0);
>> }
>>
>> CCing Willy, I don't know yet what will be better in the future. But
>> the folio stuff in there screams for problems.
>>
>
> Sorry forget to add cc in v3, the full link[1],
>
> [1] https://lore.kernel.org/linux-mm/20251013133854.2466530-1-
> wangkefeng.wang@huawei.com/
>
>
>
next prev parent reply other threads:[~2025-10-23 12:06 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-13 13:38 [PATCH v3 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 1/6] mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page() Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 2/6] mm: page_alloc: add __split_page() Kefeng Wang
2025-10-13 19:44 ` David Hildenbrand
2025-10-14 3:45 ` Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 3/6] mm: page_alloc: add alloc_contig_{range_frozen,frozen_pages}() Kefeng Wang
2025-10-16 20:53 ` David Hildenbrand
2025-10-17 7:19 ` Kefeng Wang
2025-10-20 13:07 ` David Hildenbrand
2025-10-20 15:21 ` Kefeng Wang
2025-10-23 12:06 ` Kefeng Wang [this message]
2025-10-13 13:38 ` [PATCH v3 4/6] mm: cma: add __cma_release() Kefeng Wang
2025-10-13 19:48 ` David Hildenbrand
2025-10-14 3:45 ` Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 5/6] mm: cma: add cma_alloc_frozen{_compound}() Kefeng Wang
2025-10-13 13:38 ` [PATCH v3 6/6] mm: hugetlb: allocate frozen pages in alloc_gigantic_folio() Kefeng Wang
2025-10-16 1:20 ` [PATCH v3 0/6] mm: hugetlb: allocate frozen gigantic folio Kefeng Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=35ec1c59-d6b2-4840-aea0-2e5c219d9d99@huawei.com \
--to=wangkefeng.wang@huawei.com \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=jackmanb@google.com \
--cc=jane.chu@oracle.com \
--cc=linux-mm@kvack.org \
--cc=muchun.song@linux.dev \
--cc=osalvador@suse.de \
--cc=sidhartha.kumar@oracle.com \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox