linux-mm.kvack.org archive mirror
From: Zi Yan <ziy@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: Jinjiang Tu <tujinjiang@huawei.com>,
	akpm@linux-foundation.org, yuzhao@google.com, linux-mm@kvack.org,
	wangkefeng.wang@huawei.com
Subject: Re: [PATCH v2] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
Date: Fri, 25 Apr 2025 07:04:55 -0400
Message-ID: <5CD028AA-64B7-4A93-8679-AAC5869B8C15@nvidia.com>
In-Reply-To: <0a6fa00f-0e48-4101-b2ad-23c9a964b740@redhat.com>

On 25 Apr 2025, at 6:33, David Hildenbrand wrote:

> On 21.04.25 03:36, Jinjiang Tu wrote:
>> When calling alloc_contig_range() with __GFP_COMP and the order of
>> requested pfn range is pageblock_order, less than MAX_ORDER, I triggered
>> WARNING as follows:
>>
>>   PFN range: requested [2150105088, 2150105600), allocated [2150105088, 2150106112)
>>   WARNING: CPU: 3 PID: 580 at mm/page_alloc.c:6877 alloc_contig_range+0x280/0x340
>>
>
> Just to verify: there is no such in-tree user, right?
>
>> alloc_contig_range() marks the pageblocks of the requested pfn range as
>> isolated and migrates any pages that are in use; the pages are then freed
>> to the MIGRATE_ISOLATE freelist.
>>
>> Suppose two alloc_contig_range() calls run at the same time and the
>> requested pfn ranges are [0x80280000, 0x80280200) and [0x80280200, 0x80280400)
>> respectively. If both memory ranges are in use, alloc_contig_range()
>> migrates and frees these pages to the MIGRATE_ISOLATE freelist.
>> __free_one_page() then merges the MIGRATE_ISOLATE buddies into a larger
>> buddy, resulting in a MAX_ORDER buddy. Finally, find_large_buddy() in
>> alloc_contig_range() returns that MAX_ORDER buddy, which triggers the WARNING.
>>
>> To fix it, call free_contig_range() to free the excess pfn range.
>>
>> Fixes: e98337d11bbd ("mm/contig_alloc: support __GFP_COMP")
>> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
>> ---
>> Changelog since v1:
>>   * Add comment and remove redundant code, suggested by Zi Yan
>>
>>   mm/page_alloc.c | 20 ++++++++++++++++++--
>>   1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 579789600a3c..f0162ab991ad 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6440,6 +6440,7 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>   		.alloc_contig = true,
>>   	};
>>   	INIT_LIST_HEAD(&cc.migratepages);
>> +	bool is_range_aligned;
>
> is "aligned" the right word? Aligned to what?
>
> I do wonder if we could do the following on top, checking that the range is suitable for __GFP_COMP earlier.
>

The change below makes the code cleaner. Acked-by: Zi Yan <ziy@nvidia.com>

>
> From 6c414d786db74b1494f7cf66ebf911c01995d20a Mon Sep 17 00:00:00 2001
> From: David Hildenbrand <david@redhat.com>
> Date: Fri, 25 Apr 2025 12:32:15 +0200
> Subject: [PATCH] tmp
>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/page_alloc.c | 24 ++++++++++++------------
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 57aa64dc74a05..85312903dcd8c 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6682,6 +6682,7 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
>  int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>  		       unsigned migratetype, gfp_t gfp_mask)
>  {
> +	const int range_order = ilog2(end - start);
>  	unsigned long outer_start, outer_end;
>  	int ret = 0;
>
> @@ -6695,12 +6696,19 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>  		.alloc_contig = true,
>  	};
>  	INIT_LIST_HEAD(&cc.migratepages);
> -	bool is_range_aligned;
>
>  	gfp_mask = current_gfp_context(gfp_mask);
>  	if (__alloc_contig_verify_gfp_mask(gfp_mask, (gfp_t *)&cc.gfp_mask))
>  		return -EINVAL;
>
> +	/* __GFP_COMP may only be used for certain aligned+sized ranges. */
> +	if ((gfp_mask & __GFP_COMP) &&
> +	    (!is_power_of_2(end - start) || !IS_ALIGNED(start, 1 << range_order))) {
> +		WARN_ONCE(true, "PFN range: requested [%lu, %lu) is not suitable for __GFP_COMP\n",
> +			  start, end);
> +		return -EINVAL;
> +	}
> +
>  	/*
>  	 * What we do here is we mark all pageblocks in range as
>  	 * MIGRATE_ISOLATE.  Because pageblock and max order pages may
> @@ -6789,9 +6797,7 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>  	 * isolated free pages can have higher order than the requested
>  	 * one. Use split_free_pages() to free out of range pages.
>  	 */
> -	is_range_aligned = is_power_of_2(end - start);
> -	if (!(gfp_mask & __GFP_COMP) ||
> -		(is_range_aligned && ilog2(end - start) < MAX_PAGE_ORDER)) {
> +	if (!(gfp_mask & __GFP_COMP) || range_order < MAX_PAGE_ORDER) {
>  		split_free_pages(cc.freepages, gfp_mask);
>
>  		/* Free head and tail (if any) */
> @@ -6802,22 +6808,16 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>
>  		outer_start = start;
>  		outer_end = end;
> -
> -		if (!(gfp_mask & __GFP_COMP))
> -			goto done;
>  	}
>
> -	if (start == outer_start && end == outer_end && is_range_aligned) {
> +	if (gfp_mask & __GFP_COMP) {
>  		struct page *head = pfn_to_page(start);
>  		int order = ilog2(end - start);
>
> +		VM_WARN_ON_ONCE(outer_start != start || outer_end != end);
>  		check_new_pages(head, order);
>  		prep_new_page(head, order, gfp_mask, 0);
>  		set_page_refcounted(head);
> -	} else {
> -		ret = -EINVAL;
> -		WARN(true, "PFN range: requested [%lu, %lu), allocated [%lu, %lu)\n",
> -		     start, end, outer_start, outer_end);
>  	}
>  done:
>  	undo_isolate_page_range(start, end, migratetype);
> -- 
> 2.49.0
>
>
> -- 
> Cheers,
>
> David / dhildenb


--
Best Regards,
Yan, Zi


