* [PATCH] mm/page_alloc: check the correct buddy if it is a starting block
@ 2025-09-04  2:06 Wei Yang
  2025-09-05  2:01 ` Zi Yan
  2025-09-05  9:40 ` David Hildenbrand
  0 siblings, 2 replies; 6+ messages in thread
From: Wei Yang @ 2025-09-04  2:06 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, Wei Yang, Johannes Weiner, Zi Yan, David Hildenbrand,
	Baolin Wang, Vlastimil Babka

find_large_buddy() searches for a buddy based on start_pfn, which may be
different from the page's pfn, e.g. when the page is not pageblock aligned,
because prep_move_freepages_block() always aligns start_pfn to the
pageblock boundary.

This means that when we find a starting block at start_pfn, it may
theoretically check the wrong page.

The good news is the page passed to __move_freepages_block_isolate() has
only two possible cases:

  * page is pageblock aligned
  * page is __first_valid_page() of this block

So it is safe for the first case, and it won't get a buddy larger than
pageblock for the second case.

To eliminate the ambiguity, unify the handling for starting/tail block.
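
As an aside, a minimal userspace sketch of the divergence (illustrative
only, not kernel code; the pageblock size and helper below are assumptions
made for the example):

#include <stdio.h>

/* Assume a pageblock of 512 pages (order-9) for illustration. */
#define PAGEBLOCK_NR_PAGES 512UL

/* Models how prep_move_freepages_block() derives start_pfn. */
static unsigned long pageblock_start(unsigned long pfn)
{
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

int main(void)
{
	/* First valid page of a block that is not pageblock aligned. */
	unsigned long page_pfn = 0x10040;
	unsigned long start_pfn = pageblock_start(page_pfn);

	/*
	 * find_large_buddy(start_pfn) works on start_pfn, so a
	 * "starting block" check has to look at pfn_to_page(start_pfn),
	 * not at the original page, whenever the two differ.
	 */
	printf("page_pfn=%#lx start_pfn=%#lx differ=%d\n",
	       page_pfn, start_pfn, page_pfn != start_pfn);
	return 0;
}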

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Vlastimil Babka <vbabka@suse.cz>

---
Tried memory online/offline, which looks good.
---
 mm/page_alloc.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5183a646fa09..453cd361386c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2093,6 +2093,7 @@ static bool __move_freepages_block_isolate(struct zone *zone,
 	unsigned long start_pfn, pfn;
 	int from_mt;
 	int to_mt;
+	struct page *buddy;
 
 	if (isolate == get_pageblock_isolate(page)) {
 		VM_WARN_ONCE(1, "%s a pageblock that is already in that state",
@@ -2107,10 +2108,10 @@ static bool __move_freepages_block_isolate(struct zone *zone,
 	if (pageblock_order == MAX_PAGE_ORDER)
 		goto move;
 
-	/* We're a tail block in a larger buddy */
 	pfn = find_large_buddy(start_pfn);
-	if (pfn != start_pfn) {
-		struct page *buddy = pfn_to_page(pfn);
+	buddy = pfn_to_page(pfn);
+	/* We're a part of a larger buddy */
+	if (PageBuddy(buddy) && buddy_order(buddy) > pageblock_order) {
 		int order = buddy_order(buddy);
 
 		del_page_from_free_list(buddy, zone, order,
@@ -2120,16 +2121,6 @@ static bool __move_freepages_block_isolate(struct zone *zone,
 		return true;
 	}
 
-	/* We're the starting block of a larger buddy */
-	if (PageBuddy(page) && buddy_order(page) > pageblock_order) {
-		int order = buddy_order(page);
-
-		del_page_from_free_list(page, zone, order,
-					get_pfnblock_migratetype(page, pfn));
-		toggle_pageblock_isolate(page, isolate);
-		split_large_buddy(zone, page, pfn, order, FPI_NONE);
-		return true;
-	}
 move:
 	/* Use MIGRATETYPE_MASK to get non-isolate migratetype */
 	if (isolate) {
-- 
2.34.1




* Re: [PATCH] mm/page_alloc: check the correct buddy if it is a starting block
  2025-09-04  2:06 [PATCH] mm/page_alloc: check the correct buddy if it is a starting block Wei Yang
@ 2025-09-05  2:01 ` Zi Yan
  2025-09-05  3:11   ` Wei Yang
  2025-09-05  9:40 ` David Hildenbrand
  1 sibling, 1 reply; 6+ messages in thread
From: Zi Yan @ 2025-09-05  2:01 UTC (permalink / raw)
  To: Wei Yang
  Cc: akpm, linux-mm, Johannes Weiner, David Hildenbrand, Baolin Wang,
	Vlastimil Babka

On 3 Sep 2025, at 22:06, Wei Yang wrote:

> find_large_buddy() searches for a buddy based on start_pfn, which may be
> different from the page's pfn, e.g. when the page is not pageblock aligned,
> because prep_move_freepages_block() always aligns start_pfn to the
> pageblock boundary.
>
> This means that when we find a starting block at start_pfn, it may
> theoretically check the wrong page.

and not split the free page as it is supposed to, causing a freelist migratetype
mismatch.

>
> The good news is the page passed to __move_freepages_block_isolate() has
> only two possible cases:
>
>   * page is pageblock aligned
>   * page is __first_valid_page() of this block
>
> So it is safe for the first case, and it won't get a buddy larger than
> pageblock for the second case.
>
> To eliminate the ambiguity, unify the handling for starting/tail block.

To fix the issue, check the returned pfn of find_large_buddy() to decide
whether to split the free page:
1. if it is not a PageBuddy pfn, no split;
2. if it is a PageBuddy pfn but order <= pageblock_order, no split;
3. if it is a PageBuddy pfn with order > pageblock_order,
   start_pfn is either in the starting block or tail block, split the PageBuddy
   at pageblock_order level.
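
In code, the above reduces to the single check in the patched hunk. A rough
sketch of the decision flow (only the condition itself is taken from the
diff; the comments paraphrase the three cases):

	pfn = find_large_buddy(start_pfn);
	buddy = pfn_to_page(pfn);

	if (!PageBuddy(buddy))
		goto move;	/* case 1: no free buddy at that pfn, no split */
	if (buddy_order(buddy) <= pageblock_order)
		goto move;	/* case 2: free, but not larger than a pageblock, no split */
	/*
	 * case 3: start_pfn sits inside a buddy larger than a pageblock,
	 * so take it off the free list, toggle isolation and split it back
	 * down to pageblock_order (the del/toggle/split calls in the hunk),
	 * then return true.
	 */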


Otherwise, LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>
>
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
>
> ---
> Tried memory online/offline, which looks good.
> ---
>  mm/page_alloc.c | 17 ++++-------------
>  1 file changed, 4 insertions(+), 13 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5183a646fa09..453cd361386c 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2093,6 +2093,7 @@ static bool __move_freepages_block_isolate(struct zone *zone,
>  	unsigned long start_pfn, pfn;
>  	int from_mt;
>  	int to_mt;
> +	struct page *buddy;
>
>  	if (isolate == get_pageblock_isolate(page)) {
>  		VM_WARN_ONCE(1, "%s a pageblock that is already in that state",
> @@ -2107,10 +2108,10 @@ static bool __move_freepages_block_isolate(struct zone *zone,
>  	if (pageblock_order == MAX_PAGE_ORDER)
>  		goto move;
>
> -	/* We're a tail block in a larger buddy */
>  	pfn = find_large_buddy(start_pfn);
> -	if (pfn != start_pfn) {
> -		struct page *buddy = pfn_to_page(pfn);
> +	buddy = pfn_to_page(pfn);
> +	/* We're a part of a larger buddy */
> +	if (PageBuddy(buddy) && buddy_order(buddy) > pageblock_order) {
>  		int order = buddy_order(buddy);
>
>  		del_page_from_free_list(buddy, zone, order,
> @@ -2120,16 +2121,6 @@ static bool __move_freepages_block_isolate(struct zone *zone,
>  		return true;
>  	}
>
> -	/* We're the starting block of a larger buddy */
> -	if (PageBuddy(page) && buddy_order(page) > pageblock_order) {
> -		int order = buddy_order(page);
> -
> -		del_page_from_free_list(page, zone, order,
> -					get_pfnblock_migratetype(page, pfn));
> -		toggle_pageblock_isolate(page, isolate);
> -		split_large_buddy(zone, page, pfn, order, FPI_NONE);
> -		return true;
> -	}
>  move:
>  	/* Use MIGRATETYPE_MASK to get non-isolate migratetype */
>  	if (isolate) {
> -- 
> 2.34.1


--
Best Regards,
Yan, Zi



* Re: [PATCH] mm/page_alloc: check the correct buddy if it is a starting block
  2025-09-05  2:01 ` Zi Yan
@ 2025-09-05  3:11   ` Wei Yang
  2025-09-05  3:25     ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: Wei Yang @ 2025-09-05  3:11 UTC (permalink / raw)
  To: Zi Yan
  Cc: Wei Yang, akpm, linux-mm, Johannes Weiner, David Hildenbrand,
	Baolin Wang, Vlastimil Babka

On Thu, Sep 04, 2025 at 10:01:01PM -0400, Zi Yan wrote:
>On 3 Sep 2025, at 22:06, Wei Yang wrote:
>
>> find_large_buddy() searches for a buddy based on start_pfn, which may be
>> different from the page's pfn, e.g. when the page is not pageblock aligned,
>> because prep_move_freepages_block() always aligns start_pfn to the
>> pageblock boundary.
>>
>> This means that when we find a starting block at start_pfn, it may
>> theoretically check the wrong page.
>
>and not split the free page as it is supposed to, causing a freelist migratetype
>mismatch.
>

Thanks, this is important.

>>
>> The good news is the page passed to __move_freepages_block_isolate() has
>> only two possible cases:
>>
>>   * page is pageblock aligned
>>   * page is __first_valid_page() of this block
>>
>> So it is safe for the first case, and it won't get a buddy larger than
>> pageblock for the second case.
>>
>> To eliminate the ambiguity, unify the handling for starting/tail block.
>
>To fix the issue, check the returned pfn of find_large_buddy() to decide
>whether to split the free page:
>1. if it is not a PageBuddy pfn, no split;
>2. if it is a PageBuddy pfn but order <= pageblock_order, no split;
>3. if it is a PageBuddy pfn with order > pageblock_order,
>   start_pfn is either in the starting block or tail block, split the PageBuddy
>   at pageblock_order level.
>
>
>Otherwise, LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>

Thanks.

@Andrew, if it is not a bother, would you mind adjusting the changelog?

-- 
Wei Yang
Help you, Help me



* Re: [PATCH] mm/page_alloc: check the correct buddy if it is a starting block
  2025-09-05  3:11   ` Wei Yang
@ 2025-09-05  3:25     ` Andrew Morton
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2025-09-05  3:25 UTC (permalink / raw)
  To: Wei Yang
  Cc: Zi Yan, linux-mm, Johannes Weiner, David Hildenbrand,
	Baolin Wang, Vlastimil Babka

On Fri, 5 Sep 2025 03:11:30 +0000 Wei Yang <richard.weiyang@gmail.com> wrote:

> >>
> >> The good news is the page passed to __move_freepages_block_isolate() has
> >> only two possible cases:
> >>
> >>   * page is pageblock aligned
> >>   * page is __first_valid_page() of this block
> >>
> >> So it is safe for the first case, and it won't get a buddy larger than
> >> pageblock for the second case.
> >>
> >> To eliminate the ambiguity, unify the handling for starting/tail block.
> >
> >To fix the issue, check the returned pfn of find_large_buddy() to decide
> >whether to split the free page:
> >1. if it is not a PageBuddy pfn, no split;
> >2. if it is a PageBuddy pfn but order <= pageblock_order, no split;
> >3. if it is a PageBuddy pfn with order > pageblock_order,
> >   start_pfn is either in the starting block or tail block, split the PageBuddy
> >   at pageblock_order level.
> >
> >
> >Otherwise, LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>
> 
> Thanks.
> 
> @Andrew, if it is not a bother, would you mind adjusting the changelog?

This?

From: Wei Yang <richard.weiyang@gmail.com>
Subject: mm/page_alloc: check the correct buddy if it is a starting block
Date: Thu, 4 Sep 2025 02:06:54 +0000

find_large_buddy() searches for a buddy based on start_pfn, which may be
different from the page's pfn, e.g. when the page is not pageblock aligned,
because prep_move_freepages_block() always aligns start_pfn to the
pageblock boundary.

This means that when we find a starting block at start_pfn, it may
theoretically check the wrong page.

The good news is the page passed to __move_freepages_block_isolate() has
only two possible cases:

  * page is pageblock aligned
  * page is __first_valid_page() of this block

So it is safe for the first case, and it won't get a buddy larger than
pageblock for the second case.

To fix the issue, check the returned pfn of find_large_buddy() to
decide whether to split the free page:

1. if it is not a PageBuddy pfn, no split;
2. if it is a PageBuddy pfn but order <= pageblock_order, no split;
3. if it is a PageBuddy pfn with order > pageblock_order,
   start_pfn is either in the starting block or tail block, split the PageBuddy
   at pageblock_order level.

Link: https://lkml.kernel.org/r/20250904020654.28689-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>



* Re: [PATCH] mm/page_alloc: check the correct buddy if it is a starting block
  2025-09-04  2:06 [PATCH] mm/page_alloc: check the correct buddy if it is a starting block Wei Yang
  2025-09-05  2:01 ` Zi Yan
@ 2025-09-05  9:40 ` David Hildenbrand
  2025-09-05 13:35   ` Wei Yang
  1 sibling, 1 reply; 6+ messages in thread
From: David Hildenbrand @ 2025-09-05  9:40 UTC (permalink / raw)
  To: Wei Yang, akpm
  Cc: linux-mm, Johannes Weiner, Zi Yan, Baolin Wang, Vlastimil Babka

On 04.09.25 04:06, Wei Yang wrote:
> find_large_buddy() searches for a buddy based on start_pfn, which may be
> different from the page's pfn, e.g. when the page is not pageblock aligned,
> because prep_move_freepages_block() always aligns start_pfn to the
> pageblock boundary.
>
> This means that when we find a starting block at start_pfn, it may
> theoretically check the wrong page.
> 
> The good news is the page passed to __move_freepages_block_isolate() has
> only two possible cases:
> 
>    * page is pageblock aligned
>    * page is __first_valid_page() of this block
> 
> So it is safe for the first case, and it won't get a buddy larger than
> pageblock for the second case.
> 
> To eliminate the ambiguity, unify the handling for starting/tail block.
> 
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> 
> ---
> Tried memory online/offline, which looks good.
> ---
>   mm/page_alloc.c | 17 ++++-------------
>   1 file changed, 4 insertions(+), 13 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5183a646fa09..453cd361386c 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2093,6 +2093,7 @@ static bool __move_freepages_block_isolate(struct zone *zone,
>   	unsigned long start_pfn, pfn;
>   	int from_mt;
>   	int to_mt;
> +	struct page *buddy;
>   
>   	if (isolate == get_pageblock_isolate(page)) {
>   		VM_WARN_ONCE(1, "%s a pageblock that is already in that state",
> @@ -2107,10 +2108,10 @@ static bool __move_freepages_block_isolate(struct zone *zone,
>   	if (pageblock_order == MAX_PAGE_ORDER)
>   		goto move;
>   
> -	/* We're a tail block in a larger buddy */
>   	pfn = find_large_buddy(start_pfn);
> -	if (pfn != start_pfn) {
> -		struct page *buddy = pfn_to_page(pfn);
> +	buddy = pfn_to_page(pfn);

While at it, can you please rename "pfn" to "buddy_pfn" ?

> +	/* We're a part of a larger buddy */
> +	if (PageBuddy(buddy) && buddy_order(buddy) > pageblock_order) {
>   		int order = buddy_order(buddy);

Yes, I think this should work.
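
FWIW, the rename would presumably end up looking something like this
(hypothetical sketch, not an actual v2; the declaration and any other use
of "pfn" in the function would need the same treatment):

	buddy_pfn = find_large_buddy(start_pfn);
	buddy = pfn_to_page(buddy_pfn);
	/* We're a part of a larger buddy */
	if (PageBuddy(buddy) && buddy_order(buddy) > pageblock_order) {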

-- 
Cheers

David / dhildenb




* Re: [PATCH] mm/page_alloc: check the correct buddy if it is a starting block
  2025-09-05  9:40 ` David Hildenbrand
@ 2025-09-05 13:35   ` Wei Yang
  0 siblings, 0 replies; 6+ messages in thread
From: Wei Yang @ 2025-09-05 13:35 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Wei Yang, akpm, linux-mm, Johannes Weiner, Zi Yan, Baolin Wang,
	Vlastimil Babka

On Fri, Sep 05, 2025 at 11:40:42AM +0200, David Hildenbrand wrote:
>On 04.09.25 04:06, Wei Yang wrote:
>> find_large_buddy() searches for a buddy based on start_pfn, which may be
>> different from the page's pfn, e.g. when the page is not pageblock aligned,
>> because prep_move_freepages_block() always aligns start_pfn to the
>> pageblock boundary.
>>
>> This means that when we find a starting block at start_pfn, it may
>> theoretically check the wrong page.
>> 
>> The good news is the page passed to __move_freepages_block_isolate() has
>> only two possible cases:
>> 
>>    * page is pageblock aligned
>>    * page is __first_valid_page() of this block
>> 
>> So it is safe for the first case, and it won't get a buddy larger than
>> pageblock for the second case.
>> 
>> To eliminate the ambiguity, unify the handling for starting/tail block.
>> 
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> 
>> ---
>> Tried memory online/offline, which looks good.
>> ---
>>   mm/page_alloc.c | 17 ++++-------------
>>   1 file changed, 4 insertions(+), 13 deletions(-)
>> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 5183a646fa09..453cd361386c 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2093,6 +2093,7 @@ static bool __move_freepages_block_isolate(struct zone *zone,
>>   	unsigned long start_pfn, pfn;
>>   	int from_mt;
>>   	int to_mt;
>> +	struct page *buddy;
>>   	if (isolate == get_pageblock_isolate(page)) {
>>   		VM_WARN_ONCE(1, "%s a pageblock that is already in that state",
>> @@ -2107,10 +2108,10 @@ static bool __move_freepages_block_isolate(struct zone *zone,
>>   	if (pageblock_order == MAX_PAGE_ORDER)
>>   		goto move;
>> -	/* We're a tail block in a larger buddy */
>>   	pfn = find_large_buddy(start_pfn);
>> -	if (pfn != start_pfn) {
>> -		struct page *buddy = pfn_to_page(pfn);
>> +	buddy = pfn_to_page(pfn);
>
>While at it, can you please rename "pfn" to "buddy_pfn" ?
>

Ok. 

>> +	/* We're a part of a larger buddy */
>> +	if (PageBuddy(buddy) && buddy_order(buddy) > pageblock_order) {
>>   		int order = buddy_order(buddy);
>
>Yes, I think this should work.
>

@Andrew, it looks like I will send a v2 with an updated changelog. Thanks

-- 
Wei Yang
Help you, Help me


