linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jinjiang Tu <tujinjiang@huawei.com>
To: Zi Yan <ziy@nvidia.com>, Andrew Morton <akpm@linux-foundation.org>
Cc: <yuzhao@google.com>, <david@redhat.com>, <linux-mm@kvack.org>,
	<wangkefeng.wang@huawei.com>, <sunnanyong@huawei.com>
Subject: Re: [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
Date: Sat, 19 Apr 2025 08:54:38 +0800	[thread overview]
Message-ID: <a443a135-fedb-2a99-22ef-b4c3d9610542@huawei.com> (raw)
In-Reply-To: <6E553AA1-5B53-4E52-9940-3B8E0DE36FC1@nvidia.com>


在 2025/4/19 5:32, Zi Yan 写道:
> Hi Jinjiang,
>
> On 17 Apr 2025, at 22:59, Andrew Morton wrote:
>
>> On Wed, 12 Mar 2025 16:47:05 +0800 Jinjiang Tu <tujinjiang@huawei.com> wrote:
>>
>>> When calling alloc_contig_range() with __GFP_COMP and the order of
>>> requested pfn range is pageblock_order, less than MAX_ORDER, I triggered
>>> WARNING as follows:
>>>
>>>   PFN range: requested [2150105088, 2150105600), allocated [2150105088, 2150106112)
>>>   WARNING: CPU: 3 PID: 580 at mm/page_alloc.c:6877 alloc_contig_range+0x280/0x340
> Basically, you are using alloc_contig_range() to allocate a compound page
> that can be allocated from buddy allocator, since order is < MAX_ORDER.
> What is the use case? Why is alloc_contig_range() used?

In CMA case, alloc_contig_range() is used to allocate from requested pfn range, and the order may
be < MAX_ORDER.

>
>>> alloc_contig_range() marks pageblocks of the requested pfn range to be
>>> isolated, migrate these pages if they are in use and will be freed to
>>> MIGRATE_ISOLATED freelist.
>>>
>>> Suppose two alloc_contig_range() calls at the same time and the requested
>>> pfn range are [0x80280000, 0x80280200) and [0x80280200, 0x80280400)
>>> respectively. Suppose the two memory range are in use, then
>>> alloc_contig_range() will migrate and free these pages to MIGRATE_ISOLATED
>>> freelist. __free_one_page() will merge MIGRATE_ISOLATE buddy to larger
>>> buddy, resulting in a MAX_ORDER buddy. Finally, find_large_buddy() in
>>> alloc_contig_range() returns a MAX_ORDER buddy and results in WARNING.
>>>
>>> To fix it, call free_contig_range() to free the excess pfn range.
>> This has been in mm-hotfixes for a month without issue.  Is there any
>> reviewer interest?
>>
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -6528,7 +6528,8 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>>   		goto done;
>>>   	}
>>>
>>> -	if (!(gfp_mask & __GFP_COMP)) {
>>> +	if (!(gfp_mask & __GFP_COMP) ||
>>> +		(is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)) {
>>>   		split_free_pages(cc.freepages, gfp_mask);
> This does not look right to me. When a compound page is requested,
> alloc_contig_range() should give a compound page, but split_free_pages()
> will make the free page as a list of contiguous order-0 pages.
>
> I do not think we should keep this patch.
>
> Jinjiang, let me know if I miss anything.

After split_free_pages(), below code is execucted to collapse the contiguous order-0 pages
to a compound page.

   if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
         struct page *head = pfn_to_page(start);
         int order = ilog2(end - start);

         check_new_pages(head, order);
         prep_new_page(head, order, gfp_mask, 0);
         set_page_refcounted(head);

   }

Thanks.

>
>>>   		/* Free head and tail (if any) */
>>> @@ -6536,7 +6537,15 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>>   			free_contig_range(outer_start, start - outer_start);
>>>   		if (end != outer_end)
>>>   			free_contig_range(end, outer_end - end);
>>> -	} else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>> +
>>> +		outer_start = start;
>>> +		outer_end = end;
>>> +
>>> +		if (!(gfp_mask & __GFP_COMP))
>>> +			goto done;
>>> +	}
>>> +
>>> +	if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>>   		struct page *head = pfn_to_page(start);
>>>   		int order = ilog2(end - start);
>>>
>
> Best Regards,
> Yan, Zi


  reply	other threads:[~2025-04-19  0:55 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-12  8:47 Jinjiang Tu
2025-04-18  2:59 ` Andrew Morton
2025-04-18 21:32   ` Zi Yan
2025-04-19  0:54     ` Jinjiang Tu [this message]
2025-04-19  1:32       ` Zi Yan
2025-04-19  1:50         ` Jinjiang Tu
2025-05-12  6:38         ` Jinjiang Tu
2025-05-12  7:58           ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a443a135-fedb-2a99-22ef-b4c3d9610542@huawei.com \
    --to=tujinjiang@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linux-mm@kvack.org \
    --cc=sunnanyong@huawei.com \
    --cc=wangkefeng.wang@huawei.com \
    --cc=yuzhao@google.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox