* [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order @ 2025-08-28 9:16 Wei Yang 2025-08-29 3:02 ` Zi Yan 0 siblings, 1 reply; 8+ messages in thread From: Wei Yang @ 2025-08-28 9:16 UTC (permalink / raw) To: akpm Cc: linux-mm, Wei Yang, Johannes Weiner, Zi Yan, Vlastimil Babka, David Hildenbrand We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large buddy. While if the order is less than start_pfn aligned order, we would get the same pfn and do the same check again. Iterate from start_pfn aligned order to reduce duplicated work. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Zi Yan <ziy@nvidia.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> --- I build this and run, but not sure how fully test this. --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 27ea4c7acd15..7f2dfd30106f 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2033,7 +2033,7 @@ static int move_freepages_block(struct zone *zone, struct page *page, /* Look for a buddy that straddles start_pfn */ static unsigned long find_large_buddy(unsigned long start_pfn) { - int order = 0; + int order = start_pfn ? __ffs(start_pfn) : MAX_PAGE_ORDER; struct page *page; unsigned long pfn = start_pfn; -- 2.34.1 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order 2025-08-28 9:16 [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order Wei Yang @ 2025-08-29 3:02 ` Zi Yan 2025-08-30 1:25 ` Wei Yang 2025-08-30 2:15 ` Wei Yang 0 siblings, 2 replies; 8+ messages in thread From: Zi Yan @ 2025-08-29 3:02 UTC (permalink / raw) To: Wei Yang Cc: akpm, linux-mm, Johannes Weiner, Vlastimil Babka, David Hildenbrand On 28 Aug 2025, at 5:16, Wei Yang wrote: > We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large > buddy. While if the order is less than start_pfn aligned order, we would > get the same pfn and do the same check again. > > Iterate from start_pfn aligned order to reduce duplicated work. > > Signed-off-by: Wei Yang <richard.weiyang@gmail.com> > Cc: Johannes Weiner <hannes@cmpxchg.org> > Cc: Zi Yan <ziy@nvidia.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: David Hildenbrand <david@redhat.com> > > --- > I build this and run, but not sure how fully test this. > --- > mm/page_alloc.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 27ea4c7acd15..7f2dfd30106f 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -2033,7 +2033,7 @@ static int move_freepages_block(struct zone *zone, struct page *page, > /* Look for a buddy that straddles start_pfn */ > static unsigned long find_large_buddy(unsigned long start_pfn) > { > - int order = 0; > + int order = start_pfn ? __ffs(start_pfn) : MAX_PAGE_ORDER; > struct page *page; > unsigned long pfn = start_pfn; I think it is right, but the code is very subtle and hard to understand after the change. It is better to add comment to explain it. Paste the code below for more context: while (!PageBuddy(page = pfn_to_page(pfn))) { /* Nothing found */ if (++order > MAX_PAGE_ORDER) return start_pfn; pfn &= ~0UL << order; } The code tries to find a PageBuddy starting from start_pfn starting from order=0. When entering the while loop, it means PageBuddy cannot be order-0 and ++order increases the order by 1. Your change fast forwards the process based on start_pfn. If start_pfn is not an order-0 page, based on first set bit in start_pfn and how buddy page is chosen, the next possible PageBuddy order can only be __ffs(start_pfn) + 1. Your code starts order at __ffs(start_pfn) and it works because "if (++order > MAX_PAGE_ORDER)" increases order to __ffs(start_pfn) + 1. Can you add a comment on your "int order = ..."? Something like: If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn has minimal order of __ffs(start_pfn) + 1. Fastforward order to __ffs(start_pfn) to remove unnecessary work in the while below. Feel free to reword the above. With the added comment, feel free to add Reviewed-by: Zi Yan <ziy@nvidia.com> BTW, I also notice that when start_pfn is an order-0 PageBuddy, the "if (pfn + (1 << buddy_order(page)) > start_pfn)" check below would be true even if there is no buddy straddles start_pfn, although "return pfn" gives the same results as "return start_pfn" (no straddle). The original code before the addition of find_large_buddy() (commit fd919a85cd55 ("mm: page_isolation: prepare for hygienic freelists")) checks start_pfn == pfn before the straddle check, so the correct code should check start_pfn == pfn and return early. But since current code is functionally equivalent. Maybe adding a comment about it would be sufficient. Something like: When the found buddy order is 0, the check would give false positive, but the returned result is still correct, since pfn is the same as start_pfn. -- Best Regards, Yan, Zi ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order 2025-08-29 3:02 ` Zi Yan @ 2025-08-30 1:25 ` Wei Yang 2025-08-30 3:20 ` Vishal Moola (Oracle) 2025-08-31 1:28 ` Zi Yan 2025-08-30 2:15 ` Wei Yang 1 sibling, 2 replies; 8+ messages in thread From: Wei Yang @ 2025-08-30 1:25 UTC (permalink / raw) To: Zi Yan Cc: Wei Yang, akpm, linux-mm, Johannes Weiner, Vlastimil Babka, David Hildenbrand On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote: >On 28 Aug 2025, at 5:16, Wei Yang wrote: > >> We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large >> buddy. While if the order is less than start_pfn aligned order, we would >> get the same pfn and do the same check again. >> >> Iterate from start_pfn aligned order to reduce duplicated work. >> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com> >> Cc: Johannes Weiner <hannes@cmpxchg.org> >> Cc: Zi Yan <ziy@nvidia.com> >> Cc: Vlastimil Babka <vbabka@suse.cz> >> Cc: David Hildenbrand <david@redhat.com> >> >> --- >> I build this and run, but not sure how fully test this. >> --- >> mm/page_alloc.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> index 27ea4c7acd15..7f2dfd30106f 100644 >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -2033,7 +2033,7 @@ static int move_freepages_block(struct zone *zone, struct page *page, >> /* Look for a buddy that straddles start_pfn */ >> static unsigned long find_large_buddy(unsigned long start_pfn) >> { >> - int order = 0; >> + int order = start_pfn ? __ffs(start_pfn) : MAX_PAGE_ORDER; >> struct page *page; >> unsigned long pfn = start_pfn; > Hi, Zi Yan Thanks for the review. >I think it is right, but the code is very subtle and hard to understand >after the change. It is better to add comment to explain it. One thing I want to point out is in __move_freepages_block_isolate(), find_large_buddy() is always given a pageblock aligned start_pfn. This means if start_pfn is not a free page, it would always try 10 times until give up. > >Paste the code below for more context: > > while (!PageBuddy(page = pfn_to_page(pfn))) { > /* Nothing found */ > if (++order > MAX_PAGE_ORDER) > return start_pfn; > pfn &= ~0UL << order; > } > > > >The code tries to find a PageBuddy starting from start_pfn starting from >order=0. When entering the while loop, it means PageBuddy cannot be order-0 >and ++order increases the order by 1. Your change fast forwards the process >based on start_pfn. If start_pfn is not an order-0 page, based on first >set bit in start_pfn and how buddy page is chosen, the next possible PageBuddy >order can only be __ffs(start_pfn) + 1. Your code starts order at __ffs(start_pfn) >and it works because "if (++order > MAX_PAGE_ORDER)" increases order >to __ffs(start_pfn) + 1. > >Can you add a comment on your "int order = ..."? Something like: > >If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn >has minimal order of __ffs(start_pfn) + 1. Fastforward order to __ffs(start_pfn) >to remove unnecessary work in the while below. Sure, I would add a comment above the assignment. But in my mind, we are not farst forward order, but start check from order of __ffs(start_pfn). How about: (not good at commento) /* * We start find large buddy from start_pfn order, since a * !PageBuddy() means all lower order page is !PageBuddy(). */ > >Feel free to reword the above. > >With the added comment, feel free to add Reviewed-by: Zi Yan <ziy@nvidia.com> > > >BTW, I also notice that when start_pfn is an order-0 PageBuddy, the >"if (pfn + (1 << buddy_order(page)) > start_pfn)" check below would be true >even if there is no buddy straddles start_pfn, although "return pfn" >gives the same results as "return start_pfn" (no straddle). The original >code before the addition of find_large_buddy() (commit fd919a85cd55 ("mm: >page_isolation: prepare for hygienic freelists")) checks start_pfn == pfn >before the straddle check, so the correct code should check start_pfn == pfn >and return early. But since current code is functionally equivalent. >Maybe adding a comment about it would be sufficient. Something like: The comment above this function says "a buddy that straddles start_pfn", this looks good to me. An order-0 page that start from start_pfn also means straddle start_pfn. This may differ from original logic, but seems not wrong. > >When the found buddy order is 0, the check would give false positive, >but the returned result is still correct, since pfn is the same as start_pfn. > I prefer not adding this. > > >-- >Best Regards, >Yan, Zi -- Wei Yang Help you, Help me ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order 2025-08-30 1:25 ` Wei Yang @ 2025-08-30 3:20 ` Vishal Moola (Oracle) 2025-08-30 7:48 ` Wei Yang 2025-08-31 1:28 ` Zi Yan 1 sibling, 1 reply; 8+ messages in thread From: Vishal Moola (Oracle) @ 2025-08-30 3:20 UTC (permalink / raw) To: Wei Yang Cc: Zi Yan, akpm, linux-mm, Johannes Weiner, Vlastimil Babka, David Hildenbrand On Sat, Aug 30, 2025 at 01:25:05AM +0000, Wei Yang wrote: > On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote: > >On 28 Aug 2025, at 5:16, Wei Yang wrote: > > > >> We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large > >> buddy. While if the order is less than start_pfn aligned order, we would > >> get the same pfn and do the same check again. > >> > >> Iterate from start_pfn aligned order to reduce duplicated work. > >> > >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com> > >> Cc: Johannes Weiner <hannes@cmpxchg.org> > >> Cc: Zi Yan <ziy@nvidia.com> > >> Cc: Vlastimil Babka <vbabka@suse.cz> > >> Cc: David Hildenbrand <david@redhat.com> > >> > >> --- > >I think it is right, but the code is very subtle and hard to understand > >after the change. It is better to add comment to explain it. > > One thing I want to point out is in __move_freepages_block_isolate(), > find_large_buddy() is always given a pageblock aligned start_pfn. This means > if start_pfn is not a free page, it would always try 10 times until give up. > > > > >Paste the code below for more context: > > > > while (!PageBuddy(page = pfn_to_page(pfn))) { > > /* Nothing found */ > > if (++order > MAX_PAGE_ORDER) > > return start_pfn; > > pfn &= ~0UL << order; > > } > > > > > > > >The code tries to find a PageBuddy starting from start_pfn starting from > >order=0. When entering the while loop, it means PageBuddy cannot be order-0 > >and ++order increases the order by 1. Your change fast forwards the process > >based on start_pfn. If start_pfn is not an order-0 page, based on first > >set bit in start_pfn and how buddy page is chosen, the next possible PageBuddy > >order can only be __ffs(start_pfn) + 1. Your code starts order at __ffs(start_pfn) > >and it works because "if (++order > MAX_PAGE_ORDER)" increases order > >to __ffs(start_pfn) + 1. > > > >Can you add a comment on your "int order = ..."? Something like: > > > >If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn > >has minimal order of __ffs(start_pfn) + 1. Fastforward order to __ffs(start_pfn) > >to remove unnecessary work in the while below. > > Sure, I would add a comment above the assignment. > > But in my mind, we are not farst forward order, but start check from order of > __ffs(start_pfn). > > > How about: (not good at commento) > > /* > * We start find large buddy from start_pfn order, since a > * !PageBuddy() means all lower order page is !PageBuddy(). > */ Personally, the way Zi worded it makes more sense to me. Maybe replacing the second sentence with your comment is fine, but I think Zi's first sentence provides important context in a clear way. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order 2025-08-30 3:20 ` Vishal Moola (Oracle) @ 2025-08-30 7:48 ` Wei Yang 0 siblings, 0 replies; 8+ messages in thread From: Wei Yang @ 2025-08-30 7:48 UTC (permalink / raw) To: Vishal Moola (Oracle) Cc: Wei Yang, Zi Yan, akpm, linux-mm, Johannes Weiner, Vlastimil Babka, David Hildenbrand On Fri, Aug 29, 2025 at 08:20:04PM -0700, Vishal Moola (Oracle) wrote: >On Sat, Aug 30, 2025 at 01:25:05AM +0000, Wei Yang wrote: >> On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote: >> >On 28 Aug 2025, at 5:16, Wei Yang wrote: >> > >> >> We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large >> >> buddy. While if the order is less than start_pfn aligned order, we would >> >> get the same pfn and do the same check again. >> >> >> >> Iterate from start_pfn aligned order to reduce duplicated work. >> >> >> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com> >> >> Cc: Johannes Weiner <hannes@cmpxchg.org> >> >> Cc: Zi Yan <ziy@nvidia.com> >> >> Cc: Vlastimil Babka <vbabka@suse.cz> >> >> Cc: David Hildenbrand <david@redhat.com> >> >> >> >> --- >> >I think it is right, but the code is very subtle and hard to understand >> >after the change. It is better to add comment to explain it. >> >> One thing I want to point out is in __move_freepages_block_isolate(), >> find_large_buddy() is always given a pageblock aligned start_pfn. This means >> if start_pfn is not a free page, it would always try 10 times until give up. >> >> > >> >Paste the code below for more context: >> > >> > while (!PageBuddy(page = pfn_to_page(pfn))) { >> > /* Nothing found */ >> > if (++order > MAX_PAGE_ORDER) >> > return start_pfn; >> > pfn &= ~0UL << order; >> > } >> > >> > >> > >> >The code tries to find a PageBuddy starting from start_pfn starting from >> >order=0. When entering the while loop, it means PageBuddy cannot be order-0 >> >and ++order increases the order by 1. Your change fast forwards the process >> >based on start_pfn. If start_pfn is not an order-0 page, based on first >> >set bit in start_pfn and how buddy page is chosen, the next possible PageBuddy >> >order can only be __ffs(start_pfn) + 1. Your code starts order at __ffs(start_pfn) >> >and it works because "if (++order > MAX_PAGE_ORDER)" increases order >> >to __ffs(start_pfn) + 1. >> > >> >Can you add a comment on your "int order = ..."? Something like: >> > >> >If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn >> >has minimal order of __ffs(start_pfn) + 1. Fastforward order to __ffs(start_pfn) >> >to remove unnecessary work in the while below. >> >> Sure, I would add a comment above the assignment. >> >> But in my mind, we are not farst forward order, but start check from order of >> __ffs(start_pfn). >> >> >> How about: (not good at commento) >> >> /* >> * We start find large buddy from start_pfn order, since a >> * !PageBuddy() means all lower order page is !PageBuddy(). >> */ > >Personally, the way Zi worded it makes more sense to me. Maybe replacing >the second sentence with your comment is fine, but I think Zi's first >sentence provides important context in a clear way. Hi, Vishal Thanks for your comment. I am ok with it. Let's see Zi's opnion. And I will send a new version. :-) -- Wei Yang Help you, Help me ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order 2025-08-30 1:25 ` Wei Yang 2025-08-30 3:20 ` Vishal Moola (Oracle) @ 2025-08-31 1:28 ` Zi Yan 2025-08-31 3:35 ` Wei Yang 1 sibling, 1 reply; 8+ messages in thread From: Zi Yan @ 2025-08-31 1:28 UTC (permalink / raw) To: Wei Yang Cc: akpm, linux-mm, Johannes Weiner, Vlastimil Babka, David Hildenbrand On 29 Aug 2025, at 21:25, Wei Yang wrote: > On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote: >> On 28 Aug 2025, at 5:16, Wei Yang wrote: >> >>> We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large >>> buddy. While if the order is less than start_pfn aligned order, we would >>> get the same pfn and do the same check again. >>> >>> Iterate from start_pfn aligned order to reduce duplicated work. >>> >>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com> >>> Cc: Johannes Weiner <hannes@cmpxchg.org> >>> Cc: Zi Yan <ziy@nvidia.com> >>> Cc: Vlastimil Babka <vbabka@suse.cz> >>> Cc: David Hildenbrand <david@redhat.com> >>> >>> --- >>> I build this and run, but not sure how fully test this. >>> --- >>> mm/page_alloc.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>> index 27ea4c7acd15..7f2dfd30106f 100644 >>> --- a/mm/page_alloc.c >>> +++ b/mm/page_alloc.c >>> @@ -2033,7 +2033,7 @@ static int move_freepages_block(struct zone *zone, struct page *page, >>> /* Look for a buddy that straddles start_pfn */ >>> static unsigned long find_large_buddy(unsigned long start_pfn) >>> { >>> - int order = 0; >>> + int order = start_pfn ? __ffs(start_pfn) : MAX_PAGE_ORDER; >>> struct page *page; >>> unsigned long pfn = start_pfn; >> > > Hi, Zi Yan > > Thanks for the review. > >> I think it is right, but the code is very subtle and hard to understand >> after the change. It is better to add comment to explain it. > > One thing I want to point out is in __move_freepages_block_isolate(), > find_large_buddy() is always given a pageblock aligned start_pfn. This means > if start_pfn is not a free page, it would always try 10 times until give up. find_large_buddy() is used to deal with free pages, so start_pfn is likely to be a free page. > >> >> Paste the code below for more context: >> >> while (!PageBuddy(page = pfn_to_page(pfn))) { >> /* Nothing found */ >> if (++order > MAX_PAGE_ORDER) >> return start_pfn; >> pfn &= ~0UL << order; >> } >> >> >> >> The code tries to find a PageBuddy starting from start_pfn starting from >> order=0. When entering the while loop, it means PageBuddy cannot be order-0 >> and ++order increases the order by 1. Your change fast forwards the process >> based on start_pfn. If start_pfn is not an order-0 page, based on first >> set bit in start_pfn and how buddy page is chosen, the next possible PageBuddy >> order can only be __ffs(start_pfn) + 1. Your code starts order at __ffs(start_pfn) >> and it works because "if (++order > MAX_PAGE_ORDER)" increases order >> to __ffs(start_pfn) + 1. >> >> Can you add a comment on your "int order = ..."? Something like: >> >> If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn >> has minimal order of __ffs(start_pfn) + 1. Fastforward order to __ffs(start_pfn) >> to remove unnecessary work in the while below. > > Sure, I would add a comment above the assignment. > > But in my mind, we are not farst forward order, but start check from order of > __ffs(start_pfn). Sure. > > > How about: (not good at commento) > > /* > * We start find large buddy from start_pfn order, since a It is unclear what start_pfn order means. > * !PageBuddy() means all lower order page is !PageBuddy(). > */ Here you assume start_pfn is not PageBuddy() already, but it can be an order-0 PageBuddy(). That is why my comment explicitly excluded the case to begin with. How about? If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn has minimal order of __ffs(start_pfn) + 1. Start checking the order with __ffs(start_pfn). If start_pfn is order-0, the starting order does not matter. > >> >> Feel free to reword the above. >> >> With the added comment, feel free to add Reviewed-by: Zi Yan <ziy@nvidia.com> >> >> >> BTW, I also notice that when start_pfn is an order-0 PageBuddy, the >> "if (pfn + (1 << buddy_order(page)) > start_pfn)" check below would be true >> even if there is no buddy straddles start_pfn, although "return pfn" >> gives the same results as "return start_pfn" (no straddle). The original >> code before the addition of find_large_buddy() (commit fd919a85cd55 ("mm: >> page_isolation: prepare for hygienic freelists")) checks start_pfn == pfn >> before the straddle check, so the correct code should check start_pfn == pfn >> and return early. But since current code is functionally equivalent. >> Maybe adding a comment about it would be sufficient. Something like: > > The comment above this function says "a buddy that straddles start_pfn", this > looks good to me. An order-0 page that start from start_pfn also means > straddle start_pfn. > > This may differ from original logic, but seems not wrong. Straddle means a buddy page has start_pfn in the middle and the caller in __move_freepages_block_isolate() needs to split the buddy page. > >> >> When the found buddy order is 0, the check would give false positive, >> but the returned result is still correct, since pfn is the same as start_pfn. >> > > I prefer not adding this. No strong opinion about it. -- Best Regards, Yan, Zi ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order 2025-08-31 1:28 ` Zi Yan @ 2025-08-31 3:35 ` Wei Yang 0 siblings, 0 replies; 8+ messages in thread From: Wei Yang @ 2025-08-31 3:35 UTC (permalink / raw) To: Zi Yan Cc: Wei Yang, akpm, linux-mm, Johannes Weiner, Vlastimil Babka, David Hildenbrand On Sat, Aug 30, 2025 at 09:28:24PM -0400, Zi Yan wrote: >On 29 Aug 2025, at 21:25, Wei Yang wrote: > >> On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote: >>> On 28 Aug 2025, at 5:16, Wei Yang wrote: >>> >>>> We iterate pfn from order 0 to MAX_PAGE_ORDER aligned to find large >>>> buddy. While if the order is less than start_pfn aligned order, we would >>>> get the same pfn and do the same check again. >>>> >>>> Iterate from start_pfn aligned order to reduce duplicated work. >>>> >>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com> >>>> Cc: Johannes Weiner <hannes@cmpxchg.org> >>>> Cc: Zi Yan <ziy@nvidia.com> >>>> Cc: Vlastimil Babka <vbabka@suse.cz> >>>> Cc: David Hildenbrand <david@redhat.com> >>>> >>>> --- >>>> I build this and run, but not sure how fully test this. >>>> --- >>>> mm/page_alloc.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>>> index 27ea4c7acd15..7f2dfd30106f 100644 >>>> --- a/mm/page_alloc.c >>>> +++ b/mm/page_alloc.c >>>> @@ -2033,7 +2033,7 @@ static int move_freepages_block(struct zone *zone, struct page *page, >>>> /* Look for a buddy that straddles start_pfn */ >>>> static unsigned long find_large_buddy(unsigned long start_pfn) >>>> { >>>> - int order = 0; >>>> + int order = start_pfn ? __ffs(start_pfn) : MAX_PAGE_ORDER; >>>> struct page *page; >>>> unsigned long pfn = start_pfn; >>> >> >> Hi, Zi Yan >> >> Thanks for the review. >> >>> I think it is right, but the code is very subtle and hard to understand >>> after the change. It is better to add comment to explain it. >> >> One thing I want to point out is in __move_freepages_block_isolate(), >> find_large_buddy() is always given a pageblock aligned start_pfn. This means >> if start_pfn is not a free page, it would always try 10 times until give up. > >find_large_buddy() is used to deal with free pages, so start_pfn is likely >to be a free page. > I am not that familiar with the background, my question here. I see the call flow is: alloc_contig_pages_noprof() __alloc_contig_pages(pfn, ) <-- start_pfn start_isolate_page_range() isolate_single_pageblock() set_migratetype_isolate() pageblock_isolate_and_move_free_pages() find_large_buddy() If my understanding is correct, the start_pfn comes from __alloc_contig_pages(), which is get from zone's pfn iteration. I don't see it filter non-free page. So the possibility of free/non-free is equal to me. Maybe I missed something? [...] >> How about: (not good at commento) >> >> /* >> * We start find large buddy from start_pfn order, since a > >It is unclear what start_pfn order means. > >> * !PageBuddy() means all lower order page is !PageBuddy(). >> */ > >Here you assume start_pfn is not PageBuddy() already, but it >can be an order-0 PageBuddy(). That is why my comment explicitly >excluded the case to begin with. > >How about? > >If start_pfn is not an order-0 PageBuddy, next PageBuddy containing start_pfn >has minimal order of __ffs(start_pfn) + 1. Start checking the order with >__ffs(start_pfn). If start_pfn is order-0, the starting order does not matter. > Thanks, will use this one. >> >>> >>> Feel free to reword the above. >>> >>> With the added comment, feel free to add Reviewed-by: Zi Yan <ziy@nvidia.com> >>> >>> >>> BTW, I also notice that when start_pfn is an order-0 PageBuddy, the >>> "if (pfn + (1 << buddy_order(page)) > start_pfn)" check below would be true >>> even if there is no buddy straddles start_pfn, although "return pfn" >>> gives the same results as "return start_pfn" (no straddle). The original >>> code before the addition of find_large_buddy() (commit fd919a85cd55 ("mm: >>> page_isolation: prepare for hygienic freelists")) checks start_pfn == pfn >>> before the straddle check, so the correct code should check start_pfn == pfn >>> and return early. But since current code is functionally equivalent. >>> Maybe adding a comment about it would be sufficient. Something like: >> >> The comment above this function says "a buddy that straddles start_pfn", this >> looks good to me. An order-0 page that start from start_pfn also means >> straddle start_pfn. >> >> This may differ from original logic, but seems not wrong. > >Straddle means a buddy page has start_pfn in the middle and the caller Ok, maybe the word "straddle" literally means it. If so, it really confuse. >in __move_freepages_block_isolate() needs to split the buddy page. Do you mean if start_pfn is the head of a large buddy page, we don't split it in __move_freepages_block_isolate()? Per my understanding, if this pageblock is a part of a large buddy, no matter it is at the beginning or end, we should split it. Just want to confirm the behavior with you in case I misunderstand. -- Wei Yang Help you, Help me ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order 2025-08-29 3:02 ` Zi Yan 2025-08-30 1:25 ` Wei Yang @ 2025-08-30 2:15 ` Wei Yang 1 sibling, 0 replies; 8+ messages in thread From: Wei Yang @ 2025-08-30 2:15 UTC (permalink / raw) To: Zi Yan Cc: Wei Yang, akpm, linux-mm, Johannes Weiner, Vlastimil Babka, David Hildenbrand On Thu, Aug 28, 2025 at 11:02:33PM -0400, Zi Yan wrote: [...] > >BTW, I also notice that when start_pfn is an order-0 PageBuddy, the >"if (pfn + (1 << buddy_order(page)) > start_pfn)" check below would be true >even if there is no buddy straddles start_pfn, although "return pfn" >gives the same results as "return start_pfn" (no straddle). The original >code before the addition of find_large_buddy() (commit fd919a85cd55 ("mm: >page_isolation: prepare for hygienic freelists")) checks start_pfn == pfn >before the straddle check, so the correct code should check start_pfn == pfn >and return early. But since current code is functionally equivalent. >Maybe adding a comment about it would be sufficient. Something like: > >When the found buddy order is 0, the check would give false positive, >but the returned result is still correct, since pfn is the same as start_pfn. Hm... found another thing. In __move_freepages_block_isolate(), when checking "We're the starting block of a large buddy", variable "page" may not be the large buddy page we found from find_large_buddy(). Since prep_move_freepages_block() set start_pfn to pageblock_start_pfn(page_to_pfn(page)), which may be different. -- Wei Yang Help you, Help me ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-08-31 3:35 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-08-28 9:16 [PATCH] mm/page_alloc: find_large_buddy() from start_pfn aligned order Wei Yang 2025-08-29 3:02 ` Zi Yan 2025-08-30 1:25 ` Wei Yang 2025-08-30 3:20 ` Vishal Moola (Oracle) 2025-08-30 7:48 ` Wei Yang 2025-08-31 1:28 ` Zi Yan 2025-08-31 3:35 ` Wei Yang 2025-08-30 2:15 ` Wei Yang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox