From: Jinjiang Tu <tujinjiang@huawei.com>
To: Zi Yan <ziy@nvidia.com>, Andrew Morton <akpm@linux-foundation.org>
Cc: <yuzhao@google.com>, <david@redhat.com>, <linux-mm@kvack.org>,
<wangkefeng.wang@huawei.com>, <sunnanyong@huawei.com>
Subject: Re: [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
Date: Sat, 19 Apr 2025 08:54:38 +0800
Message-ID: <a443a135-fedb-2a99-22ef-b4c3d9610542@huawei.com>
In-Reply-To: <6E553AA1-5B53-4E52-9940-3B8E0DE36FC1@nvidia.com>
On 2025/4/19 5:32, Zi Yan wrote:
> Hi Jinjiang,
>
> On 17 Apr 2025, at 22:59, Andrew Morton wrote:
>
>> On Wed, 12 Mar 2025 16:47:05 +0800 Jinjiang Tu <tujinjiang@huawei.com> wrote:
>>
>>> When calling alloc_contig_range() with __GFP_COMP and the order of
>>> requested pfn range is pageblock_order, less than MAX_ORDER, I triggered
>>> WARNING as follows:
>>>
>>> PFN range: requested [2150105088, 2150105600), allocated [2150105088, 2150106112)
>>> WARNING: CPU: 3 PID: 580 at mm/page_alloc.c:6877 alloc_contig_range+0x280/0x340
> Basically, you are using alloc_contig_range() to allocate a compound page
> that can be allocated from buddy allocator, since order is < MAX_ORDER.
> What is the use case? Why is alloc_contig_range() used?
In the CMA case, alloc_contig_range() is used to allocate from a requested pfn range,
and the order of that range may be < MAX_ORDER.
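To make the numbers in the WARNING concrete, here is a small userspace sketch (not kernel code) of the order arithmetic. The helper names are hypothetical stand-ins for is_power_of_2() and ilog2(), and the ranges are the ones printed in the report.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's is_power_of_2(). */
static bool pfn_count_is_pow2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Userspace stand-in for ilog2(); n must be non-zero. */
static int pfn_count_ilog2(unsigned long n)
{
	int order = 0;

	assert(n != 0);
	while (n >>= 1)
		order++;
	return order;
}

/* Order of the pfn range [start, end), or -1 if its size is not a power of two. */
static int pfn_range_order(unsigned long start, unsigned long end)
{
	unsigned long nr = end - start;

	assert(end > start);
	return pfn_count_is_pow2(nr) ? pfn_count_ilog2(nr) : -1;
}
```

With the values from the report, pfn_range_order(2150105088, 2150105600) gives the order of the requested range (512 pages), and pfn_range_order(2150105088, 2150106112) gives the order of the range that was actually found (1024 pages), which is one order larger.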
>
>>> alloc_contig_range() marks pageblocks of the requested pfn range to be
>>> isolated, migrate these pages if they are in use and will be freed to
>>> MIGRATE_ISOLATED freelist.
>>>
>>> Suppose two alloc_contig_range() calls at the same time and the requested
>>> pfn range are [0x80280000, 0x80280200) and [0x80280200, 0x80280400)
>>> respectively. Suppose the two memory range are in use, then
>>> alloc_contig_range() will migrate and free these pages to MIGRATE_ISOLATED
>>> freelist. __free_one_page() will merge MIGRATE_ISOLATE buddy to larger
>>> buddy, resulting in a MAX_ORDER buddy. Finally, find_large_buddy() in
>>> alloc_contig_range() returns a MAX_ORDER buddy and results in WARNING.
>>>
>>> To fix it, call free_contig_range() to free the excess pfn range.
>> This has been in mm-hotfixes for a month without issue. Is there any
>> reviewer interest?
>>
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -6528,7 +6528,8 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>> goto done;
>>> }
>>>
>>> - if (!(gfp_mask & __GFP_COMP)) {
>>> + if (!(gfp_mask & __GFP_COMP) ||
>>> + (is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)) {
>>> split_free_pages(cc.freepages, gfp_mask);
> This does not look right to me. When a compound page is requested,
> alloc_contig_range() should give a compound page, but split_free_pages()
> will make the free page as a list of contiguous order-0 pages.
>
> I do not think we should keep this patch.
>
> Jinjiang, let me know if I miss anything.
After split_free_pages(), the code below is executed to collapse the contiguous order-0 pages
into a compound page.
	if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
		struct page *head = pfn_to_page(start);
		int order = ilog2(end - start);

		check_new_pages(head, order);
		prep_new_page(head, order, gfp_mask, 0);
		set_page_refcounted(head);
	}
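The control flow with the patch applied can be modelled in userspace roughly as follows. This is only a sketch of the decision logic, not the kernel implementation: the struct, the function names, and SKETCH_MAX_PAGE_ORDER (assumed to be 10, matching the 1024-page buddy in the report) are all hypothetical, and no pages are actually split or freed.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: MAX_PAGE_ORDER is 10, matching the 1024-page buddy in the report. */
#define SKETCH_MAX_PAGE_ORDER 10

struct contig_outcome {
	bool split;              /* split_free_pages() path was taken */
	unsigned long head_free; /* pages that would be freed before start */
	unsigned long tail_free; /* pages that would be freed from end onwards */
	bool compound;           /* range is re-prepared as one compound page */
	int order;               /* order of that compound page, or -1 */
};

static bool sketch_is_pow2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

static int sketch_ilog2(unsigned long n)
{
	int order = 0;

	while (n >>= 1)
		order++;
	return order;
}

/* Models the tail of alloc_contig_range_noprof() as changed by the patch. */
static struct contig_outcome
sketch_patched_tail(unsigned long start, unsigned long end,
		    unsigned long outer_start, unsigned long outer_end,
		    bool gfp_comp)
{
	struct contig_outcome out = { false, 0, 0, false, -1 };
	unsigned long nr = end - start;

	assert(outer_start <= start && start < end && end <= outer_end);

	if (!gfp_comp ||
	    (sketch_is_pow2(nr) && sketch_ilog2(nr) < SKETCH_MAX_PAGE_ORDER)) {
		out.split = true;
		/* Free head and tail (if any), then shrink the outer range. */
		out.head_free = start - outer_start;
		out.tail_free = outer_end - end;
		outer_start = start;
		outer_end = end;
		if (!gfp_comp)
			return out;
	}

	if (start == outer_start && end == outer_end && sketch_is_pow2(nr)) {
		out.compound = true;
		out.order = sketch_ilog2(nr);
	}
	return out;
}
```

For the reported case, sketch_patched_tail(0x80280000, 0x80280200, 0x80280000, 0x80280400, true) takes the split path, trims the 512 excess pages at the tail, and then still reports a compound page for the requested range, which is the point made above.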
Thanks.
>
>>> /* Free head and tail (if any) */
>>> @@ -6536,7 +6537,15 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>> free_contig_range(outer_start, start - outer_start);
>>> if (end != outer_end)
>>> free_contig_range(end, outer_end - end);
>>> - } else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>> +
>>> + outer_start = start;
>>> + outer_end = end;
>>> +
>>> + if (!(gfp_mask & __GFP_COMP))
>>> + goto done;
>>> + }
>>> +
>>> + if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>> struct page *head = pfn_to_page(start);
>>> int order = ilog2(end - start);
>>>
>
> Best Regards,
> Yan, Zi
Thread overview: 8+ messages
2025-03-12 8:47 Jinjiang Tu
2025-04-18 2:59 ` Andrew Morton
2025-04-18 21:32 ` Zi Yan
2025-04-19 0:54 ` Jinjiang Tu [this message]
2025-04-19 1:32 ` Zi Yan
2025-04-19 1:50 ` Jinjiang Tu
2025-05-12 6:38 ` Jinjiang Tu
2025-05-12 7:58 ` David Hildenbrand