[PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v0] mm/page_alloc: Cleanup for  __del_page_from_free_list()
@ 2025-09-25  8:50 zhongjinji
  2025-09-30 13:55 ` Vlastimil Babka
  0 siblings, 1 reply; 9+ messages in thread
From: zhongjinji @ 2025-09-25  8:50 UTC (permalink / raw)
  To: akpm
  Cc: vbabka, surenb, mhocko, jackmanb, hannes, ziy, linux-mm,
	linux-kernel, liulu.liu, feng.han, zhongjinji

It is unnecessary to set page->private in __del_page_from_free_list().

If the page is about to be allocated, page->private will be cleared by
post_alloc_hook() before the page is handed out. If the page is expanded
or merged, page->private will be reset by set_buddy_order, and no one
will retrieve the page's buddy_order without the PageBuddy flag being set.
If the page is isolated, it will also reset page->private when it
succeeds.

Since __del_page_from_free_list() is a hot path in the kernel, it would be
better to remove the unnecessary set_page_private().

Signed-off-by: zhongjinji <zhongjinji@honor.com>
---
 mm/page_alloc.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d1d037f97c5f..1999eb7e7c14 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon

 	list_del(&page->buddy_list);
 	__ClearPageBuddy(page);
-	set_page_private(page, 0);
 	zone->free_area[order].nr_free--;

 	if (order >= pageblock_order && !is_migrate_isolate(migratetype))
-- 
2.17.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list()
  2025-09-25  8:50 [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() zhongjinji
@ 2025-09-30 13:55 ` Vlastimil Babka
  2025-09-30 14:28   ` Zi Yan
  0 siblings, 1 reply; 9+ messages in thread
From: Vlastimil Babka @ 2025-09-30 13:55 UTC (permalink / raw)
  To: zhongjinji, akpm
  Cc: surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel,
	liulu.liu, feng.han

On 9/25/25 10:50, zhongjinji wrote:
> It is unnecessary to set page->private in __del_page_from_free_list().
> 
> If the page is about to be allocated, page->private will be cleared by
> post_alloc_hook() before the page is handed out. If the page is expanded
> or merged, page->private will be reset by set_buddy_order, and no one
> will retrieve the page's buddy_order without the PageBuddy flag being set.
> If the page is isolated, it will also reset page->private when it
> succeeds.

Seems correct.

> Since __del_page_from_free_list() is a hot path in the kernel, it would be
> better to remove the unnecessary set_page_private().
> 
> Signed-off-by: zhongjinji <zhongjinji@honor.com>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/page_alloc.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d1d037f97c5f..1999eb7e7c14 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon
>  
>  	list_del(&page->buddy_list);
>  	__ClearPageBuddy(page);
> -	set_page_private(page, 0);
>  	zone->free_area[order].nr_free--;
>  
>  	if (order >= pageblock_order && !is_migrate_isolate(migratetype))



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list()
  2025-09-30 13:55 ` Vlastimil Babka
@ 2025-09-30 14:28   ` Zi Yan
  2025-09-30 15:20     ` Vlastimil Babka
  2025-10-01  4:38     ` jinji zhong
  0 siblings, 2 replies; 9+ messages in thread
From: Zi Yan @ 2025-09-30 14:28 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: zhongjinji, akpm, surenb, mhocko, jackmanb, hannes, linux-mm,
	linux-kernel, liulu.liu, feng.han

On 30 Sep 2025, at 9:55, Vlastimil Babka wrote:

> On 9/25/25 10:50, zhongjinji wrote:
>> It is unnecessary to set page->private in __del_page_from_free_list().
>>
>> If the page is about to be allocated, page->private will be cleared by
>> post_alloc_hook() before the page is handed out. If the page is expanded
>> or merged, page->private will be reset by set_buddy_order, and no one
>> will retrieve the page's buddy_order without the PageBuddy flag being set.
>> If the page is isolated, it will also reset page->private when it
>> succeeds.
>
> Seems correct.

This means high order free pages will have head[2N].private set to a non-zero
value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ...
head[N*(2^M)].private is M and head[0].private is the actual free page order.
If such a high order free page is used as high order folio, it should be fine.
But if user allocates a non-compound high order page and uses split_page()
to get a list of order-0 pages from this high order page, some pages will
have non zero private. I wonder if these users are prepared for that.

For example, kernel/events/ring_buffer.c does it. In its comment, it says
“set its first page's private to this order; !PagePrivate(page) means it's
just a normal page.”
(see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634)

I wonder if non zero page->private would cause any issue there.

Maybe split_page() should set all page->private to 0.

Let me know if I get anything wrong.

>
>> Since __del_page_from_free_list() is a hot path in the kernel, it would be
>> better to remove the unnecessary set_page_private().
>>
>> Signed-off-by: zhongjinji <zhongjinji@honor.com>
>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>
>> ---
>>  mm/page_alloc.c | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index d1d037f97c5f..1999eb7e7c14 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon
>>
>>  	list_del(&page->buddy_list);
>>  	__ClearPageBuddy(page);
>> -	set_page_private(page, 0);
>>  	zone->free_area[order].nr_free--;
>>
>>  	if (order >= pageblock_order && !is_migrate_isolate(migratetype))

Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list()
  2025-09-30 14:28   ` Zi Yan
@ 2025-09-30 15:20     ` Vlastimil Babka
  2025-09-30 15:28       ` Zi Yan
  2025-10-01  4:38     ` jinji zhong
  1 sibling, 1 reply; 9+ messages in thread
From: Vlastimil Babka @ 2025-09-30 15:20 UTC (permalink / raw)
  To: Zi Yan, Matthew Wilcox
  Cc: zhongjinji, akpm, surenb, mhocko, jackmanb, hannes, linux-mm,
	linux-kernel, liulu.liu, feng.han

On 9/30/25 16:28, Zi Yan wrote:
> On 30 Sep 2025, at 9:55, Vlastimil Babka wrote:
> 
>> On 9/25/25 10:50, zhongjinji wrote:
>>> It is unnecessary to set page->private in __del_page_from_free_list().
>>>
>>> If the page is about to be allocated, page->private will be cleared by
>>> post_alloc_hook() before the page is handed out. If the page is expanded
>>> or merged, page->private will be reset by set_buddy_order, and no one
>>> will retrieve the page's buddy_order without the PageBuddy flag being set.
>>> If the page is isolated, it will also reset page->private when it
>>> succeeds.
>>
>> Seems correct.
> 
> This means high order free pages will have head[2N].private set to a non-zero
> value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ...
> head[N*(2^M)].private is M and head[0].private is the actual free page order.

Hm right, tail pages...

> If such a high order free page is used as high order folio, it should be fine.

We don't reinterpret the private field in any of the first X tail pages for
folios? That would be bad too.

> But if user allocates a non-compound high order page and uses split_page()
> to get a list of order-0 pages from this high order page, some pages will
> have non zero private. I wonder if these users are prepared for that.
> 
> For example, kernel/events/ring_buffer.c does it. In its comment, it says
> “set its first page's private to this order; !PagePrivate(page) means it's
> just a normal page.”
> (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634)
> 
> I wonder if non zero page->private would cause any issue there.
> 
> Maybe split_page() should set all page->private to 0.
> 
> Let me know if I get anything wrong.

Maybe we could postpone this optimization until struct pages are shrunk.

>>
>>> Since __del_page_from_free_list() is a hot path in the kernel, it would be
>>> better to remove the unnecessary set_page_private().
>>>
>>> Signed-off-by: zhongjinji <zhongjinji@honor.com>
>>
>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>
>>> ---
>>>  mm/page_alloc.c | 1 -
>>>  1 file changed, 1 deletion(-)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index d1d037f97c5f..1999eb7e7c14 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon
>>>
>>>  	list_del(&page->buddy_list);
>>>  	__ClearPageBuddy(page);
>>> -	set_page_private(page, 0);
>>>  	zone->free_area[order].nr_free--;
>>>
>>>  	if (order >= pageblock_order && !is_migrate_isolate(migratetype))
> 
> 
> Best Regards,
> Yan, Zi



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list()
  2025-09-30 15:20     ` Vlastimil Babka
@ 2025-09-30 15:28       ` Zi Yan
  0 siblings, 0 replies; 9+ messages in thread
From: Zi Yan @ 2025-09-30 15:28 UTC (permalink / raw)
  To: Vlastimil Babka, Matthew Wilcox
  Cc: zhongjinji, akpm, surenb, mhocko, jackmanb, hannes, linux-mm,
	linux-kernel, liulu.liu, feng.han

On 30 Sep 2025, at 11:20, Vlastimil Babka wrote:

> On 9/30/25 16:28, Zi Yan wrote:
>> On 30 Sep 2025, at 9:55, Vlastimil Babka wrote:
>>
>>> On 9/25/25 10:50, zhongjinji wrote:
>>>> It is unnecessary to set page->private in __del_page_from_free_list().
>>>>
>>>> If the page is about to be allocated, page->private will be cleared by
>>>> post_alloc_hook() before the page is handed out. If the page is expanded
>>>> or merged, page->private will be reset by set_buddy_order, and no one
>>>> will retrieve the page's buddy_order without the PageBuddy flag being set.
>>>> If the page is isolated, it will also reset page->private when it
>>>> succeeds.
>>>
>>> Seems correct.
>>
>> This means high order free pages will have head[2N].private set to a non-zero
>> value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ...
>> head[N*(2^M)].private is M and head[0].private is the actual free page order.
>
> Hm right, tail pages...
>
>> If such a high order free page is used as high order folio, it should be fine.
>
> We don't reinterpret the private field in any of the first X tail pages for
> folios? That would be bad too.

I thought so until I see the comment above page_private():
“page_private can be used on tail pages.”

Now I am not so confident about “it should be fine”.

>
>> But if user allocates a non-compound high order page and uses split_page()
>> to get a list of order-0 pages from this high order page, some pages will
>> have non zero private. I wonder if these users are prepared for that.
>>
>> For example, kernel/events/ring_buffer.c does it. In its comment, it says
>> “set its first page's private to this order; !PagePrivate(page) means it's
>> just a normal page.”
>> (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634)
>>
>> I wonder if non zero page->private would cause any issue there.
>>
>> Maybe split_page() should set all page->private to 0.
>>
>> Let me know if I get anything wrong.
>
> Maybe we could postpone this optimization until struct pages are shrunk.

Yes, I agree. But I think willy could give a more definitive answer here.

>
>>>
>>>> Since __del_page_from_free_list() is a hot path in the kernel, it would be
>>>> better to remove the unnecessary set_page_private().
>>>>
>>>> Signed-off-by: zhongjinji <zhongjinji@honor.com>
>>>
>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>
>>>> ---
>>>>  mm/page_alloc.c | 1 -
>>>>  1 file changed, 1 deletion(-)
>>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index d1d037f97c5f..1999eb7e7c14 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon
>>>>
>>>>  	list_del(&page->buddy_list);
>>>>  	__ClearPageBuddy(page);
>>>> -	set_page_private(page, 0);
>>>>  	zone->free_area[order].nr_free--;
>>>>
>>>>  	if (order >= pageblock_order && !is_migrate_isolate(migratetype))
>>
>>
>> Best Regards,
>> Yan, Zi


Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list()
  2025-09-30 14:28   ` Zi Yan
  2025-09-30 15:20     ` Vlastimil Babka
@ 2025-10-01  4:38     ` jinji zhong
  2025-10-03 15:18       ` Zi Yan
  1 sibling, 1 reply; 9+ messages in thread
From: jinji zhong @ 2025-10-01  4:38 UTC (permalink / raw)
  To: ziy
  Cc: akpm, feng.han, hannes, jackmanb, linux-kernel, linux-mm,
	liulu.liu, mhocko, surenb, vbabka, zhongjinji

> On 30 Sep 2025, at 9:55, Vlastimil Babka wrote:

> >On 9/25/25 10:50, zhongjinji wrote:
> >> It is unnecessary to set page->private in __del_page_from_free_list().
> >>
> >> If the page is about to be allocated, page->private will be cleared by
> >> post_alloc_hook() before the page is handed out. If the page is expanded
> >> or merged, page->private will be reset by set_buddy_order, and no one
> >> will retrieve the page's buddy_order without the PageBuddy flag being set.
> >> If the page is isolated, it will also reset page->private when it
> >> succeeds.
> >
> >Seems correct.

> This means high order free pages will have head[2N].private set to a non-zero
> value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ...
> head[N*(2^M)].private is M and head[0].private is the actual free page order.
> If such a high order free page is used as high order folio, it should be fine.
> But if user allocates a non-compound high order page and uses split_page()
> to get a list of order-0 pages from this high order page, some pages will
> have non zero private. I wonder if these users are prepared for that.

Having non-empty page->private in tail pages of non-compound high-order
pages is not an issue, as pages from the pcp lists never guarantee their
initial state. If ensuring empty page->private for tail pages is required,
we should handle this in prep_new_page(), similar to the approach taken in
prep_compound_page().

> For example, kernel/events/ring_buffer.c does it. In its comment, it says
> “set its first page's private to this order; !PagePrivate(page) means it's
> just a normal page.”
> (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634)

PagePrivate is a flag in page->flags that indicates page->private is
already in use. While PageBuddy serves a similar purpose, it additionally
signifies that the page is part of the buddy system.

> I wonder if non zero page->private would cause any issue there.

> Maybe split_page() should set all page->private to 0.

> Let me know if I get anything wrong.

> >
> >> Since __del_page_from_free_list() is a hot path in the kernel, it would be
> >> better to remove the unnecessary set_page_private().
> >>
> >> Signed-off-by: zhongjinji <zhongjinji@honor.com>
> >
> >Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> >
> >> ---
> >>  mm/page_alloc.c | 1 -
> >>  1 file changed, 1 deletion(-)
> >>
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index d1d037f97c5f..1999eb7e7c14 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon
> >>
> >>  	list_del(&page->buddy_list);
> >>  	__ClearPageBuddy(page);
> >> -	set_page_private(page, 0);
> >>  	zone->free_area[order].nr_free--;
> >>
> >>  	if (order >= pageblock_order && !is_migrate_isolate(migratetype))


> Best Regards,
> Yan, Zi



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list()
  2025-10-01  4:38     ` jinji zhong
@ 2025-10-03 15:18       ` Zi Yan
  2025-10-20 15:06         ` jinji zhong
  0 siblings, 1 reply; 9+ messages in thread
From: Zi Yan @ 2025-10-03 15:18 UTC (permalink / raw)
  To: jinji zhong
  Cc: akpm, feng.han, hannes, jackmanb, linux-kernel, linux-mm,
	liulu.liu, mhocko, surenb, vbabka, zhongjinji

On 1 Oct 2025, at 0:38, jinji zhong wrote:

>> On 30 Sep 2025, at 9:55, Vlastimil Babka wrote:
>
>>> On 9/25/25 10:50, zhongjinji wrote:
>>>> It is unnecessary to set page->private in __del_page_from_free_list().
>>>>
>>>> If the page is about to be allocated, page->private will be cleared by
>>>> post_alloc_hook() before the page is handed out. If the page is expanded
>>>> or merged, page->private will be reset by set_buddy_order, and no one
>>>> will retrieve the page's buddy_order without the PageBuddy flag being set.
>>>> If the page is isolated, it will also reset page->private when it
>>>> succeeds.
>>>
>>> Seems correct.
>
>> This means high order free pages will have head[2N].private set to a non-zero
>> value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ...
>> head[N*(2^M)].private is M and head[0].private is the actual free page order.
>> If such a high order free page is used as high order folio, it should be fine.
>> But if user allocates a non-compound high order page and uses split_page()
>> to get a list of order-0 pages from this high order page, some pages will
>> have non zero private. I wonder if these users are prepared for that.
>
> Having non-empty page->private in tail pages of non-compound high-order
> pages is not an issue, as pages from the pcp lists never guarantee their
> initial state. If ensuring empty page->private for tail pages is required,

Sure. But is it because all page allocation users return used pages with
->private set back to 0? And can all page allocation users handle non-zero
->private? Otherwise, it can cause subtle bugs.

> we should handle this in prep_new_page(), similar to the approach taken in
> prep_compound_page().
>
>> For example, kernel/events/ring_buffer.c does it. In its comment, it says
>> “set its first page's private to this order; !PagePrivate(page) means it's
>> just a normal page.”
>> (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634)
>
> PagePrivate is a flag in page->flags that indicates page->private is
> already in use. While PageBuddy serves a similar purpose, it additionally
> signifies that the page is part of the buddy system.

OK. You mean ->private will never be used if PagePrivate is not set
in ring buffer code?

If you are confident about it is OK to make some pages’ ->private not being
zero at allocation, I am not going to block the patch.

>
>> I wonder if non zero page->private would cause any issue there.
>
>> Maybe split_page() should set all page->private to 0.
>
>> Let me know if I get anything wrong.
>
>>>
>>>> Since __del_page_from_free_list() is a hot path in the kernel, it would be
>>>> better to remove the unnecessary set_page_private().
>>>>
>>>> Signed-off-by: zhongjinji <zhongjinji@honor.com>
>>>
>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>
>>>> ---
>>>>  mm/page_alloc.c | 1 -
>>>>  1 file changed, 1 deletion(-)
>>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index d1d037f97c5f..1999eb7e7c14 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon
>>>>
>>>>  	list_del(&page->buddy_list);
>>>>  	__ClearPageBuddy(page);
>>>> -	set_page_private(page, 0);
>>>>  	zone->free_area[order].nr_free--;
>>>>
>>>>  	if (order >= pageblock_order && !is_migrate_isolate(migratetype))
>
>
>> Best Regards,
>> Yan, Zi


Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list()
  2025-10-03 15:18       ` Zi Yan
@ 2025-10-20 15:06         ` jinji zhong
  2025-10-20 19:55           ` Zi Yan
  0 siblings, 1 reply; 9+ messages in thread
From: jinji zhong @ 2025-10-20 15:06 UTC (permalink / raw)
  To: ziy
  Cc: akpm, feng.han, hannes, jackmanb, jinji.z.zhong, linux-kernel,
	linux-mm, liulu.liu, mhocko, surenb, vbabka, zhongjinji

>On 1 Oct 2025, at 0:38, jinji zhong wrote:
>
>>> On 30 Sep 2025, at 9:55, Vlastimil Babka wrote:
>>
>>>> On 9/25/25 10:50, zhongjinji wrote:
>>>>> It is unnecessary to set page->private in __del_page_from_free_list().
>>>>>
>>>>> If the page is about to be allocated, page->private will be cleared by
>>>>> post_alloc_hook() before the page is handed out. If the page is expanded
>>>>> or merged, page->private will be reset by set_buddy_order, and no one
>>>>> will retrieve the page's buddy_order without the PageBuddy flag being set.
>>>>> If the page is isolated, it will also reset page->private when it
>>>>> succeeds.
>>>>
>>>> Seems correct.
>>
>>> This means high order free pages will have head[2N].private set to a non-zero
>>> value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ...
>>> head[N*(2^M)].private is M and head[0].private is the actual free page order.
>>> If such a high order free page is used as high order folio, it should be fine.
>>> But if user allocates a non-compound high order page and uses split_page()
>>> to get a list of order-0 pages from this high order page, some pages will
>>> have non zero private. I wonder if these users are prepared for that.
>>
>> Having non-empty page->private in tail pages of non-compound high-order
>> pages is not an issue, as pages from the pcp lists never guarantee their
>> initial state. If ensuring empty page->private for tail pages is required,
>
>Sure. But is it because all page allocation users return used pages with

Some users [2] do not reset the private back to 0. When the page is a tail
page, the non-zero private value will persist until the page is split.

>->private set back to 0? And can all page allocation users handle non-zero
>->private? Otherwise, it can cause subtle bugs.

Yes, you are right. Some users(like swapfile [1]) cannot handle non-zero private.

>> we should handle this in prep_new_page(), similar to the approach taken in
>> prep_compound_page().
>>
>>> For example, kernel/events/ring_buffer.c does it. In its comment, it says
>>> “set its first page's private to this order; !PagePrivate(page) means it's
>>> just a normal page.”
>>> (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634)
>>
>> PagePrivate is a flag in page->flags that indicates page->private is
>> already in use. While PageBuddy serves a similar purpose, it additionally
>> signifies that the page is part of the buddy system.
>
>OK. You mean ->private will never be used if PagePrivate is not set
>in ring buffer code?

In the ring buffer code, it only uses the private field of the head page,
but I recently found that the swapfile [1] is assuming page->private is zero,
even if the page is a tail page, which seems a bit dangerous. Adding this
patch will make this situation worse.

link: https://elixir.bootlin.com/linux/v6.17/source/mm/swapfile.c#L3745 [1]
link: https://elixir.bootlin.com/linux/v6.17/source/mm/swapfile.c#L3887 [2]

>If you are confident about it is OK to make some pages’ ->private not being
>zero at allocation, I am not going to block the patch.
>
>>
>>> I wonder if non zero page->private would cause any issue there.
>>
>>> Maybe split_page() should set all page->private to 0.
>>
>>> Let me know if I get anything wrong.
>>
>>>>
>>>>> Since __del_page_from_free_list() is a hot path in the kernel, it would be
>>>>> better to remove the unnecessary set_page_private().
>>>>>
>>>>> Signed-off-by: zhongjinji <zhongjinji@honor.com>
>>>>
>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>>
>>>>> ---
>>>>>  mm/page_alloc.c | 1 -
>>>>>  1 file changed, 1 deletion(-)
>>>>>
>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>>> index d1d037f97c5f..1999eb7e7c14 100644
>>>>> --- a/mm/page_alloc.c
>>>>> +++ b/mm/page_alloc.c
>>>>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon
>>>>>
>>>>>  	list_del(&page->buddy_list);
>>>>>  	__ClearPageBuddy(page);
>>>>> -	set_page_private(page, 0);
>>>>>  	zone->free_area[order].nr_free--;
>>>>>
>>>>>  	if (order >= pageblock_order && !is_migrate_isolate(migratetype))
>>
>>
>>> Best Regards,
>>> Yan, Zi
>
>
>Best Regards,
>Yan, Zi


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list()
  2025-10-20 15:06         ` jinji zhong
@ 2025-10-20 19:55           ` Zi Yan
  0 siblings, 0 replies; 9+ messages in thread
From: Zi Yan @ 2025-10-20 19:55 UTC (permalink / raw)
  To: jinji zhong
  Cc: akpm, feng.han, hannes, jackmanb, linux-kernel, linux-mm,
	liulu.liu, mhocko, surenb, vbabka, zhongjinji

On 20 Oct 2025, at 11:06, jinji zhong wrote:

>> On 1 Oct 2025, at 0:38, jinji zhong wrote:
>>
>>>> On 30 Sep 2025, at 9:55, Vlastimil Babka wrote:
>>>
>>>>> On 9/25/25 10:50, zhongjinji wrote:
>>>>>> It is unnecessary to set page->private in __del_page_from_free_list().
>>>>>>
>>>>>> If the page is about to be allocated, page->private will be cleared by
>>>>>> post_alloc_hook() before the page is handed out. If the page is expanded
>>>>>> or merged, page->private will be reset by set_buddy_order, and no one
>>>>>> will retrieve the page's buddy_order without the PageBuddy flag being set.
>>>>>> If the page is isolated, it will also reset page->private when it
>>>>>> succeeds.
>>>>>
>>>>> Seems correct.
>>>
>>>> This means high order free pages will have head[2N].private set to a non-zero
>>>> value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ...
>>>> head[N*(2^M)].private is M and head[0].private is the actual free page order.
>>>> If such a high order free page is used as high order folio, it should be fine.
>>>> But if user allocates a non-compound high order page and uses split_page()
>>>> to get a list of order-0 pages from this high order page, some pages will
>>>> have non zero private. I wonder if these users are prepared for that.
>>>
>>> Having non-empty page->private in tail pages of non-compound high-order
>>> pages is not an issue, as pages from the pcp lists never guarantee their
>>> initial state. If ensuring empty page->private for tail pages is required,
>>
>> Sure. But is it because all page allocation users return used pages with
>
> Some users [2] do not reset the private back to 0. When the page is a tail
> page, the non-zero private value will persist until the page is split.
>
>> ->private set back to 0? And can all page allocation users handle non-zero
>> ->private? Otherwise, it can cause subtle bugs.
>
> Yes, you are right. Some users(like swapfile [1]) cannot handle non-zero private.
>
>>> we should handle this in prep_new_page(), similar to the approach taken in
>>> prep_compound_page().
>>>
>>>> For example, kernel/events/ring_buffer.c does it. In its comment, it says
>>>> “set its first page's private to this order; !PagePrivate(page) means it's
>>>> just a normal page.”
>>>> (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634)
>>>
>>> PagePrivate is a flag in page->flags that indicates page->private is
>>> already in use. While PageBuddy serves a similar purpose, it additionally
>>> signifies that the page is part of the buddy system.
>>
>> OK. You mean ->private will never be used if PagePrivate is not set
>> in ring buffer code?
>
> In the ring buffer code, it only uses the private field of the head page,
> but I recently found that the swapfile [1] is assuming page->private is zero,
> even if the page is a tail page, which seems a bit dangerous. Adding this
> patch will make this situation worse.

Yeah, thank you for the detective work. The comment in [1] sounds really
alarming, as the code relies on a specific behavior:
    /*
	 *Page allocation does not initialize the page's lru field,
	 * but it does always reset its private field.
     */

We can revisit your patch when page->private is gone like Vlastimil suggested.

>
> link: https://elixir.bootlin.com/linux/v6.17/source/mm/swapfile.c#L3745 [1]
> link: https://elixir.bootlin.com/linux/v6.17/source/mm/swapfile.c#L3887 [2]
>
>> If you are confident about it is OK to make some pages’ ->private not being
>> zero at allocation, I am not going to block the patch.
>>
>>>
>>>> I wonder if non zero page->private would cause any issue there.
>>>
>>>> Maybe split_page() should set all page->private to 0.
>>>
>>>> Let me know if I get anything wrong.
>>>
>>>>>
>>>>>> Since __del_page_from_free_list() is a hot path in the kernel, it would be
>>>>>> better to remove the unnecessary set_page_private().
>>>>>>
>>>>>> Signed-off-by: zhongjinji <zhongjinji@honor.com>
>>>>>
>>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>>>>>
>>>>>> ---
>>>>>>  mm/page_alloc.c | 1 -
>>>>>>  1 file changed, 1 deletion(-)
>>>>>>
>>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>>>> index d1d037f97c5f..1999eb7e7c14 100644
>>>>>> --- a/mm/page_alloc.c
>>>>>> +++ b/mm/page_alloc.c
>>>>>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon
>>>>>>
>>>>>>  	list_del(&page->buddy_list);
>>>>>>  	__ClearPageBuddy(page);
>>>>>> -	set_page_private(page, 0);
>>>>>>  	zone->free_area[order].nr_free--;
>>>>>>
>>>>>>  	if (order >= pageblock_order && !is_migrate_isolate(migratetype))
>>>
>>>
>>>> Best Regards,
>>>> Yan, Zi
>>
>>
>> Best Regards,
>> Yan, Zi


--
Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-10-20 19:56 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-09-25  8:50 [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() zhongjinji
2025-09-30 13:55 ` Vlastimil Babka
2025-09-30 14:28   ` Zi Yan
2025-09-30 15:20     ` Vlastimil Babka
2025-09-30 15:28       ` Zi Yan
2025-10-01  4:38     ` jinji zhong
2025-10-03 15:18       ` Zi Yan
2025-10-20 15:06         ` jinji zhong
2025-10-20 19:55           ` Zi Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox