* [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
@ 2025-03-12 8:47 Jinjiang Tu
2025-04-18 2:59 ` Andrew Morton
0 siblings, 1 reply; 8+ messages in thread
From: Jinjiang Tu @ 2025-03-12 8:47 UTC (permalink / raw)
To: akpm, yuzhao, ziy, david; +Cc: linux-mm, wangkefeng.wang, sunnanyong
When calling alloc_contig_range() with __GFP_COMP and the order of
requested pfn range is pageblock_order, less than MAX_ORDER, I triggered
WARNING as follows:
PFN range: requested [2150105088, 2150105600), allocated [2150105088, 2150106112)
WARNING: CPU: 3 PID: 580 at mm/page_alloc.c:6877 alloc_contig_range+0x280/0x340
alloc_contig_range() marks pageblocks of the requested pfn range to be
isolated, migrate these pages if they are in use and will be freed to
MIGRATE_ISOLATED freelist.
Suppose two alloc_contig_range() calls at the same time and the requested
pfn range are [0x80280000, 0x80280200) and [0x80280200, 0x80280400)
respectively. Suppose the two memory range are in use, then
alloc_contig_range() will migrate and free these pages to MIGRATE_ISOLATED
freelist. __free_one_page() will merge MIGRATE_ISOLATE buddy to larger
buddy, resulting in a MAX_ORDER buddy. Finally, find_large_buddy() in
alloc_contig_range() returns a MAX_ORDER buddy and results in WARNING.
To fix it, call free_contig_range() to free the excess pfn range.
Fixes: e98337d11bbd ("mm/contig_alloc: support __GFP_COMP")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
mm/page_alloc.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 579789600a3c..c1260968e89e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6528,7 +6528,8 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
goto done;
}
- if (!(gfp_mask & __GFP_COMP)) {
+ if (!(gfp_mask & __GFP_COMP) ||
+ (is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)) {
split_free_pages(cc.freepages, gfp_mask);
/* Free head and tail (if any) */
@@ -6536,7 +6537,15 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
free_contig_range(outer_start, start - outer_start);
if (end != outer_end)
free_contig_range(end, outer_end - end);
- } else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
+
+ outer_start = start;
+ outer_end = end;
+
+ if (!(gfp_mask & __GFP_COMP))
+ goto done;
+ }
+
+ if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
struct page *head = pfn_to_page(start);
int order = ilog2(end - start);
--
2.43.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
2025-03-12 8:47 [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER Jinjiang Tu
@ 2025-04-18 2:59 ` Andrew Morton
2025-04-18 21:32 ` Zi Yan
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2025-04-18 2:59 UTC (permalink / raw)
To: Jinjiang Tu; +Cc: yuzhao, ziy, david, linux-mm, wangkefeng.wang, sunnanyong
On Wed, 12 Mar 2025 16:47:05 +0800 Jinjiang Tu <tujinjiang@huawei.com> wrote:
> When calling alloc_contig_range() with __GFP_COMP and the order of
> requested pfn range is pageblock_order, less than MAX_ORDER, I triggered
> WARNING as follows:
>
> PFN range: requested [2150105088, 2150105600), allocated [2150105088, 2150106112)
> WARNING: CPU: 3 PID: 580 at mm/page_alloc.c:6877 alloc_contig_range+0x280/0x340
>
> alloc_contig_range() marks pageblocks of the requested pfn range to be
> isolated, migrate these pages if they are in use and will be freed to
> MIGRATE_ISOLATED freelist.
>
> Suppose two alloc_contig_range() calls at the same time and the requested
> pfn range are [0x80280000, 0x80280200) and [0x80280200, 0x80280400)
> respectively. Suppose the two memory range are in use, then
> alloc_contig_range() will migrate and free these pages to MIGRATE_ISOLATED
> freelist. __free_one_page() will merge MIGRATE_ISOLATE buddy to larger
> buddy, resulting in a MAX_ORDER buddy. Finally, find_large_buddy() in
> alloc_contig_range() returns a MAX_ORDER buddy and results in WARNING.
>
> To fix it, call free_contig_range() to free the excess pfn range.
This has been in mm-hotfixes for a month without issue. Is there any
reviewer interest?
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6528,7 +6528,8 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
> goto done;
> }
>
> - if (!(gfp_mask & __GFP_COMP)) {
> + if (!(gfp_mask & __GFP_COMP) ||
> + (is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)) {
> split_free_pages(cc.freepages, gfp_mask);
>
> /* Free head and tail (if any) */
> @@ -6536,7 +6537,15 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
> free_contig_range(outer_start, start - outer_start);
> if (end != outer_end)
> free_contig_range(end, outer_end - end);
> - } else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
> +
> + outer_start = start;
> + outer_end = end;
> +
> + if (!(gfp_mask & __GFP_COMP))
> + goto done;
> + }
> +
> + if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
> struct page *head = pfn_to_page(start);
> int order = ilog2(end - start);
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
2025-04-18 2:59 ` Andrew Morton
@ 2025-04-18 21:32 ` Zi Yan
2025-04-19 0:54 ` Jinjiang Tu
0 siblings, 1 reply; 8+ messages in thread
From: Zi Yan @ 2025-04-18 21:32 UTC (permalink / raw)
To: Andrew Morton, Jinjiang Tu
Cc: yuzhao, david, linux-mm, wangkefeng.wang, sunnanyong
Hi Jinjiang,
On 17 Apr 2025, at 22:59, Andrew Morton wrote:
> On Wed, 12 Mar 2025 16:47:05 +0800 Jinjiang Tu <tujinjiang@huawei.com> wrote:
>
>> When calling alloc_contig_range() with __GFP_COMP and the order of
>> requested pfn range is pageblock_order, less than MAX_ORDER, I triggered
>> WARNING as follows:
>>
>> PFN range: requested [2150105088, 2150105600), allocated [2150105088, 2150106112)
>> WARNING: CPU: 3 PID: 580 at mm/page_alloc.c:6877 alloc_contig_range+0x280/0x340
Basically, you are using alloc_contig_range() to allocate a compound page
that can be allocated from buddy allocator, since order is < MAX_ORDER.
What is the use case? Why is alloc_contig_range() used?
>>
>> alloc_contig_range() marks pageblocks of the requested pfn range to be
>> isolated, migrate these pages if they are in use and will be freed to
>> MIGRATE_ISOLATED freelist.
>>
>> Suppose two alloc_contig_range() calls at the same time and the requested
>> pfn range are [0x80280000, 0x80280200) and [0x80280200, 0x80280400)
>> respectively. Suppose the two memory range are in use, then
>> alloc_contig_range() will migrate and free these pages to MIGRATE_ISOLATED
>> freelist. __free_one_page() will merge MIGRATE_ISOLATE buddy to larger
>> buddy, resulting in a MAX_ORDER buddy. Finally, find_large_buddy() in
>> alloc_contig_range() returns a MAX_ORDER buddy and results in WARNING.
>>
>> To fix it, call free_contig_range() to free the excess pfn range.
>
> This has been in mm-hotfixes for a month without issue. Is there any
> reviewer interest?
>
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6528,7 +6528,8 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>> goto done;
>> }
>>
>> - if (!(gfp_mask & __GFP_COMP)) {
>> + if (!(gfp_mask & __GFP_COMP) ||
>> + (is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)) {
>> split_free_pages(cc.freepages, gfp_mask);
This does not look right to me. When a compound page is requested,
alloc_contig_range() should give a compound page, but split_free_pages()
will make the free page as a list of contiguous order-0 pages.
I do not think we should keep this patch.
Jinjiang, let me know if I miss anything.
>>
>> /* Free head and tail (if any) */
>> @@ -6536,7 +6537,15 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>> free_contig_range(outer_start, start - outer_start);
>> if (end != outer_end)
>> free_contig_range(end, outer_end - end);
>> - } else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>> +
>> + outer_start = start;
>> + outer_end = end;
>> +
>> + if (!(gfp_mask & __GFP_COMP))
>> + goto done;
>> + }
>> +
>> + if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>> struct page *head = pfn_to_page(start);
>> int order = ilog2(end - start);
>>
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
2025-04-18 21:32 ` Zi Yan
@ 2025-04-19 0:54 ` Jinjiang Tu
2025-04-19 1:32 ` Zi Yan
0 siblings, 1 reply; 8+ messages in thread
From: Jinjiang Tu @ 2025-04-19 0:54 UTC (permalink / raw)
To: Zi Yan, Andrew Morton
Cc: yuzhao, david, linux-mm, wangkefeng.wang, sunnanyong
在 2025/4/19 5:32, Zi Yan 写道:
> Hi Jinjiang,
>
> On 17 Apr 2025, at 22:59, Andrew Morton wrote:
>
>> On Wed, 12 Mar 2025 16:47:05 +0800 Jinjiang Tu <tujinjiang@huawei.com> wrote:
>>
>>> When calling alloc_contig_range() with __GFP_COMP and the order of
>>> requested pfn range is pageblock_order, less than MAX_ORDER, I triggered
>>> WARNING as follows:
>>>
>>> PFN range: requested [2150105088, 2150105600), allocated [2150105088, 2150106112)
>>> WARNING: CPU: 3 PID: 580 at mm/page_alloc.c:6877 alloc_contig_range+0x280/0x340
> Basically, you are using alloc_contig_range() to allocate a compound page
> that can be allocated from buddy allocator, since order is < MAX_ORDER.
> What is the use case? Why is alloc_contig_range() used?
In CMA case, alloc_contig_range() is used to allocate from requested pfn range, and the order may
be < MAX_ORDER.
>
>>> alloc_contig_range() marks pageblocks of the requested pfn range to be
>>> isolated, migrate these pages if they are in use and will be freed to
>>> MIGRATE_ISOLATED freelist.
>>>
>>> Suppose two alloc_contig_range() calls at the same time and the requested
>>> pfn range are [0x80280000, 0x80280200) and [0x80280200, 0x80280400)
>>> respectively. Suppose the two memory range are in use, then
>>> alloc_contig_range() will migrate and free these pages to MIGRATE_ISOLATED
>>> freelist. __free_one_page() will merge MIGRATE_ISOLATE buddy to larger
>>> buddy, resulting in a MAX_ORDER buddy. Finally, find_large_buddy() in
>>> alloc_contig_range() returns a MAX_ORDER buddy and results in WARNING.
>>>
>>> To fix it, call free_contig_range() to free the excess pfn range.
>> This has been in mm-hotfixes for a month without issue. Is there any
>> reviewer interest?
>>
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -6528,7 +6528,8 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>> goto done;
>>> }
>>>
>>> - if (!(gfp_mask & __GFP_COMP)) {
>>> + if (!(gfp_mask & __GFP_COMP) ||
>>> + (is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)) {
>>> split_free_pages(cc.freepages, gfp_mask);
> This does not look right to me. When a compound page is requested,
> alloc_contig_range() should give a compound page, but split_free_pages()
> will make the free page as a list of contiguous order-0 pages.
>
> I do not think we should keep this patch.
>
> Jinjiang, let me know if I miss anything.
After split_free_pages(), below code is execucted to collapse the contiguous order-0 pages
to a compound page.
if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
struct page *head = pfn_to_page(start);
int order = ilog2(end - start);
check_new_pages(head, order);
prep_new_page(head, order, gfp_mask, 0);
set_page_refcounted(head);
}
Thanks.
>
>>> /* Free head and tail (if any) */
>>> @@ -6536,7 +6537,15 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>> free_contig_range(outer_start, start - outer_start);
>>> if (end != outer_end)
>>> free_contig_range(end, outer_end - end);
>>> - } else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>> +
>>> + outer_start = start;
>>> + outer_end = end;
>>> +
>>> + if (!(gfp_mask & __GFP_COMP))
>>> + goto done;
>>> + }
>>> +
>>> + if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>> struct page *head = pfn_to_page(start);
>>> int order = ilog2(end - start);
>>>
>
> Best Regards,
> Yan, Zi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
2025-04-19 0:54 ` Jinjiang Tu
@ 2025-04-19 1:32 ` Zi Yan
2025-04-19 1:50 ` Jinjiang Tu
2025-05-12 6:38 ` Jinjiang Tu
0 siblings, 2 replies; 8+ messages in thread
From: Zi Yan @ 2025-04-19 1:32 UTC (permalink / raw)
To: Jinjiang Tu
Cc: Andrew Morton, yuzhao, david, linux-mm, wangkefeng.wang, sunnanyong
On 18 Apr 2025, at 20:54, Jinjiang Tu wrote:
> 在 2025/4/19 5:32, Zi Yan 写道:
>> Hi Jinjiang,
>>
>> On 17 Apr 2025, at 22:59, Andrew Morton wrote:
>>
>>> On Wed, 12 Mar 2025 16:47:05 +0800 Jinjiang Tu <tujinjiang@huawei.com> wrote:
>>>
>>>> When calling alloc_contig_range() with __GFP_COMP and the order of
>>>> requested pfn range is pageblock_order, less than MAX_ORDER, I triggered
>>>> WARNING as follows:
>>>>
>>>> PFN range: requested [2150105088, 2150105600), allocated [2150105088, 2150106112)
>>>> WARNING: CPU: 3 PID: 580 at mm/page_alloc.c:6877 alloc_contig_range+0x280/0x340
>> Basically, you are using alloc_contig_range() to allocate a compound page
>> that can be allocated from buddy allocator, since order is < MAX_ORDER.
>> What is the use case? Why is alloc_contig_range() used?
>
> In CMA case, alloc_contig_range() is used to allocate from requested pfn range, and the order may
> be < MAX_ORDER.
But why do you need __GFP_COMP? I thought __GFP_COMP was only used for
1GB hugetlb.
>
>>
>>>> alloc_contig_range() marks pageblocks of the requested pfn range to be
>>>> isolated, migrate these pages if they are in use and will be freed to
>>>> MIGRATE_ISOLATED freelist.
>>>>
>>>> Suppose two alloc_contig_range() calls at the same time and the requested
>>>> pfn range are [0x80280000, 0x80280200) and [0x80280200, 0x80280400)
>>>> respectively. Suppose the two memory range are in use, then
>>>> alloc_contig_range() will migrate and free these pages to MIGRATE_ISOLATED
>>>> freelist. __free_one_page() will merge MIGRATE_ISOLATE buddy to larger
>>>> buddy, resulting in a MAX_ORDER buddy. Finally, find_large_buddy() in
>>>> alloc_contig_range() returns a MAX_ORDER buddy and results in WARNING.
>>>>
>>>> To fix it, call free_contig_range() to free the excess pfn range.
>>> This has been in mm-hotfixes for a month without issue. Is there any
>>> reviewer interest?
>>>
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -6528,7 +6528,8 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>>> goto done;
>>>> }
>>>>
>>>> - if (!(gfp_mask & __GFP_COMP)) {
>>>> + if (!(gfp_mask & __GFP_COMP) ||
>>>> + (is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)) {
>>>> split_free_pages(cc.freepages, gfp_mask);
>> This does not look right to me. When a compound page is requested,
>> alloc_contig_range() should give a compound page, but split_free_pages()
>> will make the free page as a list of contiguous order-0 pages.
>>
>> I do not think we should keep this patch.
>>
>> Jinjiang, let me know if I miss anything.
>
> After split_free_pages(), below code is execucted to collapse the contiguous order-0 pages
> to a compound page.
OK, got it. Since cc.freepages can be MAX_PAGE_ORDER and the requested
PFN range is smaller than MAX_PAGE_ORDER. A more “right” way of handling
this would be in the second if (the one shown below), you check order
against the cc.freepages order and free out of requested range pages.
But that might be too complicated. Your approach is simpler.
Can you add a comment above
“(is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)” to
explain why __GFP_COMP needs split_free_pages()? Something like:
With __GFP_COMP and the requested order < MAX_PAGE_ORDER, isolated free
pages can have higher order than the requested one. Use split_free_pages()
to free out of range pages.
>
> if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
is_power_of_2(end - start) is used twice. Can you add a variable for it
if that sounds good to you?
Thanks.
> struct page *head = pfn_to_page(start);
> int order = ilog2(end - start);
>
> check_new_pages(head, order);
> prep_new_page(head, order, gfp_mask, 0);
> set_page_refcounted(head);
>
> }
>
> Thanks.
>
>>
>>>> /* Free head and tail (if any) */
>>>> @@ -6536,7 +6537,15 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>>> free_contig_range(outer_start, start - outer_start);
>>>> if (end != outer_end)
>>>> free_contig_range(end, outer_end - end);
>>>> - } else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>>> +
>>>> + outer_start = start;
>>>> + outer_end = end;
>>>> +
>>>> + if (!(gfp_mask & __GFP_COMP))
>>>> + goto done;
>>>> + }
>>>> +
>>>> + if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>>> struct page *head = pfn_to_page(start);
>>>> int order = ilog2(end - start);
>>>>
>>
>> Best Regards,
>> Yan, Zi
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
2025-04-19 1:32 ` Zi Yan
@ 2025-04-19 1:50 ` Jinjiang Tu
2025-05-12 6:38 ` Jinjiang Tu
1 sibling, 0 replies; 8+ messages in thread
From: Jinjiang Tu @ 2025-04-19 1:50 UTC (permalink / raw)
To: Zi Yan
Cc: Andrew Morton, yuzhao, david, linux-mm, wangkefeng.wang, sunnanyong
在 2025/4/19 9:32, Zi Yan 写道:
> On 18 Apr 2025, at 20:54, Jinjiang Tu wrote:
>
>> 在 2025/4/19 5:32, Zi Yan 写道:
>>> Hi Jinjiang,
>>>
>>> On 17 Apr 2025, at 22:59, Andrew Morton wrote:
>>>
>>>> On Wed, 12 Mar 2025 16:47:05 +0800 Jinjiang Tu <tujinjiang@huawei.com> wrote:
>>>>
>>>>> When calling alloc_contig_range() with __GFP_COMP and the order of
>>>>> requested pfn range is pageblock_order, less than MAX_ORDER, I triggered
>>>>> WARNING as follows:
>>>>>
>>>>> PFN range: requested [2150105088, 2150105600), allocated [2150105088, 2150106112)
>>>>> WARNING: CPU: 3 PID: 580 at mm/page_alloc.c:6877 alloc_contig_range+0x280/0x340
>>> Basically, you are using alloc_contig_range() to allocate a compound page
>>> that can be allocated from buddy allocator, since order is < MAX_ORDER.
>>> What is the use case? Why is alloc_contig_range() used?
>> In CMA case, alloc_contig_range() is used to allocate from requested pfn range, and the order may
>> be < MAX_ORDER.
> But why do you need __GFP_COMP? I thought __GFP_COMP was only used for
> 1GB hugetlb.
In my use case, I use __GFP_COMP to make 2M allocation and free from CMA faster,
which may occur frequently.
Since alloc_contig_range_noprof() is exported to drivers, it's necessary to handle
order < MAX_ORDER case when __GFP_COMP is set.
>>>>> alloc_contig_range() marks pageblocks of the requested pfn range to be
>>>>> isolated, migrate these pages if they are in use and will be freed to
>>>>> MIGRATE_ISOLATED freelist.
>>>>>
>>>>> Suppose two alloc_contig_range() calls at the same time and the requested
>>>>> pfn range are [0x80280000, 0x80280200) and [0x80280200, 0x80280400)
>>>>> respectively. Suppose the two memory range are in use, then
>>>>> alloc_contig_range() will migrate and free these pages to MIGRATE_ISOLATED
>>>>> freelist. __free_one_page() will merge MIGRATE_ISOLATE buddy to larger
>>>>> buddy, resulting in a MAX_ORDER buddy. Finally, find_large_buddy() in
>>>>> alloc_contig_range() returns a MAX_ORDER buddy and results in WARNING.
>>>>>
>>>>> To fix it, call free_contig_range() to free the excess pfn range.
>>>> This has been in mm-hotfixes for a month without issue. Is there any
>>>> reviewer interest?
>>>>
>>>>> --- a/mm/page_alloc.c
>>>>> +++ b/mm/page_alloc.c
>>>>> @@ -6528,7 +6528,8 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>>>> goto done;
>>>>> }
>>>>>
>>>>> - if (!(gfp_mask & __GFP_COMP)) {
>>>>> + if (!(gfp_mask & __GFP_COMP) ||
>>>>> + (is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)) {
>>>>> split_free_pages(cc.freepages, gfp_mask);
>>> This does not look right to me. When a compound page is requested,
>>> alloc_contig_range() should give a compound page, but split_free_pages()
>>> will make the free page as a list of contiguous order-0 pages.
>>>
>>> I do not think we should keep this patch.
>>>
>>> Jinjiang, let me know if I miss anything.
>> After split_free_pages(), below code is execucted to collapse the contiguous order-0 pages
>> to a compound page.
> OK, got it. Since cc.freepages can be MAX_PAGE_ORDER and the requested
> PFN range is smaller than MAX_PAGE_ORDER. A more “right” way of handling
> this would be in the second if (the one shown below), you check order
> against the cc.freepages order and free out of requested range pages.
> But that might be too complicated. Your approach is simpler.
>
> Can you add a comment above
> “(is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)” to
> explain why __GFP_COMP needs split_free_pages()? Something like:
>
> With __GFP_COMP and the requested order < MAX_PAGE_ORDER, isolated free
> pages can have higher order than the requested one. Use split_free_pages()
> to free out of range pages.
>
>
>> if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
> is_power_of_2(end - start) is used twice. Can you add a variable for it
> if that sounds good to you?
>
> Thanks.
Thanks, I will update this patch ASAP.
>> struct page *head = pfn_to_page(start);
>> int order = ilog2(end - start);
>>
>> check_new_pages(head, order);
>> prep_new_page(head, order, gfp_mask, 0);
>> set_page_refcounted(head);
>>
>> }
>>
>> Thanks.
>>
>>>>> /* Free head and tail (if any) */
>>>>> @@ -6536,7 +6537,15 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>>>> free_contig_range(outer_start, start - outer_start);
>>>>> if (end != outer_end)
>>>>> free_contig_range(end, outer_end - end);
>>>>> - } else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>>>> +
>>>>> + outer_start = start;
>>>>> + outer_end = end;
>>>>> +
>>>>> + if (!(gfp_mask & __GFP_COMP))
>>>>> + goto done;
>>>>> + }
>>>>> +
>>>>> + if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>>>> struct page *head = pfn_to_page(start);
>>>>> int order = ilog2(end - start);
>>>>>
>>> Best Regards,
>>> Yan, Zi
>
> Best Regards,
> Yan, Zi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
2025-04-19 1:32 ` Zi Yan
2025-04-19 1:50 ` Jinjiang Tu
@ 2025-05-12 6:38 ` Jinjiang Tu
2025-05-12 7:58 ` David Hildenbrand
1 sibling, 1 reply; 8+ messages in thread
From: Jinjiang Tu @ 2025-05-12 6:38 UTC (permalink / raw)
To: Zi Yan, david
Cc: Andrew Morton, yuzhao, linux-mm, wangkefeng.wang, sunnanyong
在 2025/4/19 9:32, Zi Yan 写道:
> On 18 Apr 2025, at 20:54, Jinjiang Tu wrote:
>
>> 在 2025/4/19 5:32, Zi Yan 写道:
>>> Hi Jinjiang,
>>>
>>> On 17 Apr 2025, at 22:59, Andrew Morton wrote:
>>>
>>>> On Wed, 12 Mar 2025 16:47:05 +0800 Jinjiang Tu <tujinjiang@huawei.com> wrote:
>>>>
>>>>> When calling alloc_contig_range() with __GFP_COMP and the order of
>>>>> requested pfn range is pageblock_order, less than MAX_ORDER, I triggered
>>>>> WARNING as follows:
>>>>>
>>>>> PFN range: requested [2150105088, 2150105600), allocated [2150105088, 2150106112)
>>>>> WARNING: CPU: 3 PID: 580 at mm/page_alloc.c:6877 alloc_contig_range+0x280/0x340
>>> Basically, you are using alloc_contig_range() to allocate a compound page
>>> that can be allocated from buddy allocator, since order is < MAX_ORDER.
>>> What is the use case? Why is alloc_contig_range() used?
>> In CMA case, alloc_contig_range() is used to allocate from requested pfn range, and the order may
>> be < MAX_ORDER.
> But why do you need __GFP_COMP? I thought __GFP_COMP was only used for
> 1GB hugetlb.
>
>>>>> alloc_contig_range() marks pageblocks of the requested pfn range to be
>>>>> isolated, migrate these pages if they are in use and will be freed to
>>>>> MIGRATE_ISOLATED freelist.
>>>>>
>>>>> Suppose two alloc_contig_range() calls at the same time and the requested
>>>>> pfn range are [0x80280000, 0x80280200) and [0x80280200, 0x80280400)
>>>>> respectively. Suppose the two memory range are in use, then
>>>>> alloc_contig_range() will migrate and free these pages to MIGRATE_ISOLATED
>>>>> freelist. __free_one_page() will merge MIGRATE_ISOLATE buddy to larger
>>>>> buddy, resulting in a MAX_ORDER buddy. Finally, find_large_buddy() in
>>>>> alloc_contig_range() returns a MAX_ORDER buddy and results in WARNING.
>>>>>
>>>>> To fix it, call free_contig_range() to free the excess pfn range.
>>>> This has been in mm-hotfixes for a month without issue. Is there any
>>>> reviewer interest?
>>>>
>>>>> --- a/mm/page_alloc.c
>>>>> +++ b/mm/page_alloc.c
>>>>> @@ -6528,7 +6528,8 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>>>> goto done;
>>>>> }
>>>>>
>>>>> - if (!(gfp_mask & __GFP_COMP)) {
>>>>> + if (!(gfp_mask & __GFP_COMP) ||
>>>>> + (is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)) {
>>>>> split_free_pages(cc.freepages, gfp_mask);
>>> This does not look right to me. When a compound page is requested,
>>> alloc_contig_range() should give a compound page, but split_free_pages()
>>> will make the free page as a list of contiguous order-0 pages.
>>>
>>> I do not think we should keep this patch.
>>>
>>> Jinjiang, let me know if I miss anything.
>> After split_free_pages(), below code is execucted to collapse the contiguous order-0 pages
>> to a compound page.
This is wrong. After split_free_pages(), these pages are order-0 pages with refcount 1. Call
prep_new_page(head) and set_page_refcounted(head) lead to VM_BUG_ON_PAGE(page_ref_count(page), page).
> OK, got it. Since cc.freepages can be MAX_PAGE_ORDER and the requested
> PFN range is smaller than MAX_PAGE_ORDER. A more “right” way of handling
> this would be in the second if (the one shown below), you check order
> against the cc.freepages order and free out of requested range pages.
> But that might be too complicated. Your approach is simpler.
For (outer_start != start) || (outer_end != end) case, We could split the pages to order-0 list,
and only call post_alloc_hook/set_page_refcounted for pages in [outer_start, start), [end, outer_end) range,
and call free_contig_range() for them. How about this approach?
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2ff19413e876..a80767621c52 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6625,6 +6625,37 @@ static void split_free_pages(struct list_head *list, gfp_t gfp_mask)
}
}
+static void split_pages_to_order0(struct list_head *list)
+{
+ int order;
+
+ for (order = 0; order < NR_PAGE_ORDERS; order++) {
+ struct page *page, *next;
+ int nr_pages = 1 << order;
+
+ list_for_each_entry_safe(page, next, &list[order], lru) {
+ int i;
+
+ list_del(&page->lru);
+ for (i = 0; i < nr_pages; i++)
+ list_add_tail(&page[i].lru, &list[0]);
+ }
+ }
+}
+
+static void free_pfn_range(unsigned long start, unsigned end, gfp_t gfp_mask)
+{
+ struct page *page;
+ unsigned long i;
+
+ page = pfn_to_page(start);
+ for (i = 0; i < end - start; ++i, ++page) {
+ post_alloc_hook(page, 0, gfp_mask);
+ set_page_refcounted(page);
+ }
+ free_contig_range(start, end - start);
+}
+
static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
{
const gfp_t reclaim_mask = __GFP_IO | __GFP_FS | __GFP_RECLAIM;
@@ -6800,7 +6831,7 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
* isolated free pages can have higher order than the requested
* one. Use split_free_pages() to free out of range pages.
*/
- if (!(gfp_mask & __GFP_COMP) || range_order < MAX_PAGE_ORDER) {
+ if (!(gfp_mask & __GFP_COMP)) {
split_free_pages(cc.freepages, gfp_mask);
/* Free head and tail (if any) */
@@ -6809,8 +6840,17 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
if (end != outer_end)
free_contig_range(end, outer_end - end);
- outer_start = start;
- outer_end = end;
+ } else if ((outer_start != start) || (end != outer_end)) {
+ struct page *page;
+ int i;
+
+ split_pages_to_order0(cc.freepages);
+
+ if (start != outer_start)
+ free_pfn_range(outer_start, start, gfp_mask);
+
+ if (end != outer_end)
+ free_pfn_range(end, outer_end, gfp_mask);
}
>
> Can you add a comment above
> “(is_power_of_2(end - start) && ilog2(end - start) < MAX_PAGE_ORDER)” to
> explain why __GFP_COMP needs split_free_pages()? Something like:
>
> With __GFP_COMP and the requested order < MAX_PAGE_ORDER, isolated free
> pages can have higher order than the requested one. Use split_free_pages()
> to free out of range pages.
>
>
>> if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
> is_power_of_2(end - start) is used twice. Can you add a variable for it
> if that sounds good to you?
>
> Thanks.
>
>> struct page *head = pfn_to_page(start);
>> int order = ilog2(end - start);
>>
>> check_new_pages(head, order);
>> prep_new_page(head, order, gfp_mask, 0);
>> set_page_refcounted(head);
>>
>> }
>>
>> Thanks.
>>
>>>>> /* Free head and tail (if any) */
>>>>> @@ -6536,7 +6537,15 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>>>> free_contig_range(outer_start, start - outer_start);
>>>>> if (end != outer_end)
>>>>> free_contig_range(end, outer_end - end);
>>>>> - } else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>>>> +
>>>>> + outer_start = start;
>>>>> + outer_end = end;
>>>>> +
>>>>> + if (!(gfp_mask & __GFP_COMP))
>>>>> + goto done;
>>>>> + }
>>>>> +
>>>>> + if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>>>> struct page *head = pfn_to_page(start);
>>>>> int order = ilog2(end - start);
>>>>>
>>> Best Regards,
>>> Yan, Zi
>
> Best Regards,
> Yan, Zi
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER
2025-05-12 6:38 ` Jinjiang Tu
@ 2025-05-12 7:58 ` David Hildenbrand
0 siblings, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2025-05-12 7:58 UTC (permalink / raw)
To: Jinjiang Tu, Zi Yan
Cc: Andrew Morton, yuzhao, linux-mm, wangkefeng.wang, sunnanyong
>>> to a compound page.
>
> This is wrong. After split_free_pages(), these pages are order-0 pages with refcount 1. Call
> prep_new_page(head) and set_page_refcounted(head) lead to VM_BUG_ON_PAGE(page_ref_count(page), page).
>
>> OK, got it. Since cc.freepages can be MAX_PAGE_ORDER and the requested
>> PFN range is smaller than MAX_PAGE_ORDER. A more “right” way of handling
>> this would be in the second if (the one shown below), you check order
>> against the cc.freepages order and free out of requested range pages.
>> But that might be too complicated. Your approach is simpler.
>
> For (outer_start != start) || (outer_end != end) case, We could split the pages to order-0 list,
> and only call post_alloc_hook/set_page_refcounted for pages in [outer_start, start), [end, outer_end) range,
> and call free_contig_range() for them. How about this approach?
>
As there is no in-tree user of this functionality, I'm wondering if we
should simply fail the allocation if GFP_COMP is specified with a
non-suitable range?
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-05-12 7:59 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-03-12 8:47 [PATCH] mm/contig_alloc: fix alloc_contig_range when __GFP_COMP and order < MAX_ORDER Jinjiang Tu
2025-04-18 2:59 ` Andrew Morton
2025-04-18 21:32 ` Zi Yan
2025-04-19 0:54 ` Jinjiang Tu
2025-04-19 1:32 ` Zi Yan
2025-04-19 1:50 ` Jinjiang Tu
2025-05-12 6:38 ` Jinjiang Tu
2025-05-12 7:58 ` David Hildenbrand
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox