From: Zi Yan <ziy@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: Jinjiang Tu <tujinjiang@huawei.com>,
akpm@linux-foundation.org, yuzhao@google.com, linux-mm@kvack.org,
wangkefeng.wang@huawei.com
Subject: Re: [PATCH v2] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
Date: Fri, 25 Apr 2025 07:04:55 -0400
Message-ID: <5CD028AA-64B7-4A93-8679-AAC5869B8C15@nvidia.com>
In-Reply-To: <0a6fa00f-0e48-4101-b2ad-23c9a964b740@redhat.com>

On 25 Apr 2025, at 6:33, David Hildenbrand wrote:
> On 21.04.25 03:36, Jinjiang Tu wrote:
>> When calling alloc_contig_range() with __GFP_COMP and the order of the
>> requested pfn range is pageblock_order, which is less than MAX_ORDER, I
>> triggered the following WARNING:
>>
>> PFN range: requested [2150105088, 2150105600), allocated [2150105088, 2150106112)
>> WARNING: CPU: 3 PID: 580 at mm/page_alloc.c:6877 alloc_contig_range+0x280/0x340
>>
>
> Just to verify: there is no such in-tree user, right?
>
>> alloc_contig_range() marks the pageblocks of the requested pfn range as
>> isolated, migrates the pages if they are in use, and frees them to the
>> MIGRATE_ISOLATE freelist.
>>
>> Suppose two alloc_contig_range() calls run at the same time and the
>> requested pfn ranges are [0x80280000, 0x80280200) and
>> [0x80280200, 0x80280400) respectively. Suppose both memory ranges are in
>> use; alloc_contig_range() will then migrate and free these pages to the
>> MIGRATE_ISOLATE freelist. __free_one_page() will merge the MIGRATE_ISOLATE
>> buddies into a larger buddy, resulting in a MAX_ORDER buddy. Finally,
>> find_large_buddy() in alloc_contig_range() returns a MAX_ORDER buddy,
>> which results in the WARNING.
>>
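The merge described above can be checked with plain PFN arithmetic. This is a
userspace sketch only, not kernel code: it assumes pageblock_order is 9 and
MAX_ORDER is 10 (implied by the 512-page request growing to 1024 pages in the
WARNING), and find_buddy_pfn(), merge_buddies() and DEMO_MAX_ORDER are
hypothetical stand-ins that only mirror the buddy XOR rule.

```c
#include <assert.h>

/* Assumption for illustration: largest buddy order is 10 (1024 pages). */
#define DEMO_MAX_ORDER 10U

/* PFN of the buddy of the free block starting at pfn with the given order. */
static unsigned long find_buddy_pfn(unsigned long pfn, unsigned int order)
{
	return pfn ^ (1UL << order);
}

/*
 * If blocks a and b are buddies at 'order', store the first PFN of the
 * merged (order + 1) block in *merged and return 1; otherwise return 0.
 */
static int merge_buddies(unsigned long a, unsigned long b,
			 unsigned int order, unsigned long *merged)
{
	assert(order < DEMO_MAX_ORDER);
	if (find_buddy_pfn(a, order) != b)
		return 0;
	*merged = a & b;	/* the lower of the two buddies */
	return 1;
}
```

With the two ranges from the commit message, 0x80280000 and 0x80280200 are
order-9 buddies, so once both are free they merge into one order-10 block
starting at 0x80280000, which matches the "allocated" range in the WARNING.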
>> To fix it, call free_contig_range() to free the excess pfn range.
>>
>> Fixes: e98337d11bbd ("mm/contig_alloc: support __GFP_COMP")
>> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
>> ---
>> Changelog since v1:
>> * Add comment and remove redundant code, suggested by Zi Yan
>>
>> mm/page_alloc.c | 20 ++++++++++++++++++--
>> 1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 579789600a3c..f0162ab991ad 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6440,6 +6440,7 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>> .alloc_contig = true,
>> };
>> INIT_LIST_HEAD(&cc.migratepages);
>> + bool is_range_aligned;
>
> is "aligned" the right word? Aligned to what?
>
> I do wonder if we could do the following on top, checking that the range is suitable for __GFP_COMP earlier.
>
The change below makes the code cleaner.

Acked-by: Zi Yan <ziy@nvidia.com>
>
> From 6c414d786db74b1494f7cf66ebf911c01995d20a Mon Sep 17 00:00:00 2001
> From: David Hildenbrand <david@redhat.com>
> Date: Fri, 25 Apr 2025 12:32:15 +0200
> Subject: [PATCH] tmp
>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> mm/page_alloc.c | 24 ++++++++++++------------
> 1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 57aa64dc74a05..85312903dcd8c 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6682,6 +6682,7 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
> int alloc_contig_range_noprof(unsigned long start, unsigned long end,
> unsigned migratetype, gfp_t gfp_mask)
> {
> + const int range_order = ilog2(end - start);
> unsigned long outer_start, outer_end;
> int ret = 0;
> @@ -6695,12 +6696,19 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
> .alloc_contig = true,
> };
> INIT_LIST_HEAD(&cc.migratepages);
> - bool is_range_aligned;
> gfp_mask = current_gfp_context(gfp_mask);
> if (__alloc_contig_verify_gfp_mask(gfp_mask, (gfp_t *)&cc.gfp_mask))
> return -EINVAL;
> + /* __GFP_COMP may only be used for certain aligned+sized ranges. */
> + if ((gfp_mask & __GFP_COMP) &&
> + (!is_power_of_2(end - start) || !IS_ALIGNED(start, 1 << range_order))) {
> + WARN_ONCE(true, "PFN range: requested [%lu, %lu) is not suitable for __GFP_COMP\n",
> + start, end);
> + return -EINVAL;
> + }
> +
> /*
> * What we do here is we mark all pageblocks in range as
> * MIGRATE_ISOLATE. Because pageblock and max order pages may
> @@ -6789,9 +6797,7 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
> * isolated free pages can have higher order than the requested
> * one. Use split_free_pages() to free out of range pages.
> */
> - is_range_aligned = is_power_of_2(end - start);
> - if (!(gfp_mask & __GFP_COMP) ||
> - (is_range_aligned && ilog2(end - start) < MAX_PAGE_ORDER)) {
> + if (!(gfp_mask & __GFP_COMP) || range_order < MAX_PAGE_ORDER) {
> split_free_pages(cc.freepages, gfp_mask);
> /* Free head and tail (if any) */
> @@ -6802,22 +6808,16 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
> outer_start = start;
> outer_end = end;
> -
> - if (!(gfp_mask & __GFP_COMP))
> - goto done;
> }
> - if (start == outer_start && end == outer_end && is_range_aligned) {
> + if (gfp_mask & __GFP_COMP) {
> struct page *head = pfn_to_page(start);
> int order = ilog2(end - start);
> + VM_WARN_ON_ONCE(outer_start != start || outer_end != end);
> check_new_pages(head, order);
> prep_new_page(head, order, gfp_mask, 0);
> set_page_refcounted(head);
> - } else {
> - ret = -EINVAL;
> - WARN(true, "PFN range: requested [%lu, %lu), allocated [%lu, %lu)\n",
> - start, end, outer_start, outer_end);
> }
> done:
> undo_isolate_page_range(start, end, migratetype);
> --
> 2.49.0
>
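For reference, the early __GFP_COMP check in the patch above can be tried out
in userspace. This is a minimal sketch under stated assumptions, not the
kernel implementation: is_power_of_2() and ilog2_ul() are local stand-ins for
the kernel helpers, pfn_range_ok_for_comp() is a hypothetical name, and the
end <= start guard is an addition for the standalone version.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the kernel's is_power_of_2(). */
static bool is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Local stand-in for ilog2(): floor(log2(n)) for n > 0. */
static int ilog2_ul(unsigned long n)
{
	int log = -1;

	assert(n != 0);
	while (n) {
		n >>= 1;
		log++;
	}
	return log;
}

/*
 * Hypothetical helper: true if [start, end) could back one compound page,
 * i.e. the size is a power of two and start is aligned to that size.
 */
static bool pfn_range_ok_for_comp(unsigned long start, unsigned long end)
{
	unsigned long nr_pages;

	if (end <= start)	/* guard added for this sketch only */
		return false;
	nr_pages = end - start;
	if (!is_power_of_2(nr_pages))
		return false;
	/* Same test as IS_ALIGNED(start, 1 << range_order). */
	return (start & ((1UL << ilog2_ul(nr_pages)) - 1)) == 0;
}
```

The range from the report, [0x80280000, 0x80280200), passes; a 512-page range
starting at 0x80280100 fails on alignment, and a 768-page range fails on size.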
>
> --
> Cheers,
>
> David / dhildenb
--
Best Regards,
Yan, Zi