* [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() @ 2025-09-25 8:50 zhongjinji 2025-09-30 13:55 ` Vlastimil Babka 0 siblings, 1 reply; 9+ messages in thread From: zhongjinji @ 2025-09-25 8:50 UTC (permalink / raw) To: akpm Cc: vbabka, surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel, liulu.liu, feng.han, zhongjinji It is unnecessary to set page->private in __del_page_from_free_list(). If the page is about to be allocated, page->private will be cleared by post_alloc_hook() before the page is handed out. If the page is expanded or merged, page->private will be reset by set_buddy_order, and no one will retrieve the page's buddy_order without the PageBuddy flag being set. If the page is isolated, it will also reset page->private when it succeeds. Since __del_page_from_free_list() is a hot path in the kernel, it would be better to remove the unnecessary set_page_private(). Signed-off-by: zhongjinji <zhongjinji@honor.com> --- mm/page_alloc.c | 1 - 1 file changed, 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d1d037f97c5f..1999eb7e7c14 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon list_del(&page->buddy_list); __ClearPageBuddy(page); - set_page_private(page, 0); zone->free_area[order].nr_free--; if (order >= pageblock_order && !is_migrate_isolate(migratetype)) -- 2.17.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() 2025-09-25 8:50 [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() zhongjinji @ 2025-09-30 13:55 ` Vlastimil Babka 2025-09-30 14:28 ` Zi Yan 0 siblings, 1 reply; 9+ messages in thread From: Vlastimil Babka @ 2025-09-30 13:55 UTC (permalink / raw) To: zhongjinji, akpm Cc: surenb, mhocko, jackmanb, hannes, ziy, linux-mm, linux-kernel, liulu.liu, feng.han On 9/25/25 10:50, zhongjinji wrote: > It is unnecessary to set page->private in __del_page_from_free_list(). > > If the page is about to be allocated, page->private will be cleared by > post_alloc_hook() before the page is handed out. If the page is expanded > or merged, page->private will be reset by set_buddy_order, and no one > will retrieve the page's buddy_order without the PageBuddy flag being set. > If the page is isolated, it will also reset page->private when it > succeeds. Seems correct. > Since __del_page_from_free_list() is a hot path in the kernel, it would be > better to remove the unnecessary set_page_private(). > > Signed-off-by: zhongjinji <zhongjinji@honor.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> > --- > mm/page_alloc.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index d1d037f97c5f..1999eb7e7c14 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon > > list_del(&page->buddy_list); > __ClearPageBuddy(page); > - set_page_private(page, 0); > zone->free_area[order].nr_free--; > > if (order >= pageblock_order && !is_migrate_isolate(migratetype)) ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() 2025-09-30 13:55 ` Vlastimil Babka @ 2025-09-30 14:28 ` Zi Yan 2025-09-30 15:20 ` Vlastimil Babka 2025-10-01 4:38 ` jinji zhong 0 siblings, 2 replies; 9+ messages in thread From: Zi Yan @ 2025-09-30 14:28 UTC (permalink / raw) To: Vlastimil Babka Cc: zhongjinji, akpm, surenb, mhocko, jackmanb, hannes, linux-mm, linux-kernel, liulu.liu, feng.han On 30 Sep 2025, at 9:55, Vlastimil Babka wrote: > On 9/25/25 10:50, zhongjinji wrote: >> It is unnecessary to set page->private in __del_page_from_free_list(). >> >> If the page is about to be allocated, page->private will be cleared by >> post_alloc_hook() before the page is handed out. If the page is expanded >> or merged, page->private will be reset by set_buddy_order, and no one >> will retrieve the page's buddy_order without the PageBuddy flag being set. >> If the page is isolated, it will also reset page->private when it >> succeeds. > > Seems correct. This means high order free pages will have head[2N].private set to a non-zero value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ... head[N*(2^M)].private is M and head[0].private is the actual free page order. If such a high order free page is used as high order folio, it should be fine. But if user allocates a non-compound high order page and uses split_page() to get a list of order-0 pages from this high order page, some pages will have non zero private. I wonder if these users are prepared for that. For example, kernel/events/ring_buffer.c does it. In its comment, it says “set its first page's private to this order; !PagePrivate(page) means it's just a normal page.” (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634) I wonder if non zero page->private would cause any issue there. Maybe split_page() should set all page->private to 0. Let me know if I get anything wrong. > >> Since __del_page_from_free_list() is a hot path in the kernel, it would be >> better to remove the unnecessary set_page_private(). >> >> Signed-off-by: zhongjinji <zhongjinji@honor.com> > > Reviewed-by: Vlastimil Babka <vbabka@suse.cz> > >> --- >> mm/page_alloc.c | 1 - >> 1 file changed, 1 deletion(-) >> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> index d1d037f97c5f..1999eb7e7c14 100644 >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon >> >> list_del(&page->buddy_list); >> __ClearPageBuddy(page); >> - set_page_private(page, 0); >> zone->free_area[order].nr_free--; >> >> if (order >= pageblock_order && !is_migrate_isolate(migratetype)) Best Regards, Yan, Zi ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() 2025-09-30 14:28 ` Zi Yan @ 2025-09-30 15:20 ` Vlastimil Babka 2025-09-30 15:28 ` Zi Yan 2025-10-01 4:38 ` jinji zhong 1 sibling, 1 reply; 9+ messages in thread From: Vlastimil Babka @ 2025-09-30 15:20 UTC (permalink / raw) To: Zi Yan, Matthew Wilcox Cc: zhongjinji, akpm, surenb, mhocko, jackmanb, hannes, linux-mm, linux-kernel, liulu.liu, feng.han On 9/30/25 16:28, Zi Yan wrote: > On 30 Sep 2025, at 9:55, Vlastimil Babka wrote: > >> On 9/25/25 10:50, zhongjinji wrote: >>> It is unnecessary to set page->private in __del_page_from_free_list(). >>> >>> If the page is about to be allocated, page->private will be cleared by >>> post_alloc_hook() before the page is handed out. If the page is expanded >>> or merged, page->private will be reset by set_buddy_order, and no one >>> will retrieve the page's buddy_order without the PageBuddy flag being set. >>> If the page is isolated, it will also reset page->private when it >>> succeeds. >> >> Seems correct. > > This means high order free pages will have head[2N].private set to a non-zero > value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ... > head[N*(2^M)].private is M and head[0].private is the actual free page order. Hm right, tail pages... > If such a high order free page is used as high order folio, it should be fine. We don't reinterpret the private field in any of the first X tail pages for folios? That would be bad too. > But if user allocates a non-compound high order page and uses split_page() > to get a list of order-0 pages from this high order page, some pages will > have non zero private. I wonder if these users are prepared for that. > > For example, kernel/events/ring_buffer.c does it. In its comment, it says > “set its first page's private to this order; !PagePrivate(page) means it's > just a normal page.” > (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634) > > I wonder if non zero page->private would cause any issue there. > > Maybe split_page() should set all page->private to 0. > > Let me know if I get anything wrong. Maybe we could postpone this optimization until struct pages are shrunk. >> >>> Since __del_page_from_free_list() is a hot path in the kernel, it would be >>> better to remove the unnecessary set_page_private(). >>> >>> Signed-off-by: zhongjinji <zhongjinji@honor.com> >> >> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> >> >>> --- >>> mm/page_alloc.c | 1 - >>> 1 file changed, 1 deletion(-) >>> >>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>> index d1d037f97c5f..1999eb7e7c14 100644 >>> --- a/mm/page_alloc.c >>> +++ b/mm/page_alloc.c >>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon >>> >>> list_del(&page->buddy_list); >>> __ClearPageBuddy(page); >>> - set_page_private(page, 0); >>> zone->free_area[order].nr_free--; >>> >>> if (order >= pageblock_order && !is_migrate_isolate(migratetype)) > > > Best Regards, > Yan, Zi ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() 2025-09-30 15:20 ` Vlastimil Babka @ 2025-09-30 15:28 ` Zi Yan 0 siblings, 0 replies; 9+ messages in thread From: Zi Yan @ 2025-09-30 15:28 UTC (permalink / raw) To: Vlastimil Babka, Matthew Wilcox Cc: zhongjinji, akpm, surenb, mhocko, jackmanb, hannes, linux-mm, linux-kernel, liulu.liu, feng.han On 30 Sep 2025, at 11:20, Vlastimil Babka wrote: > On 9/30/25 16:28, Zi Yan wrote: >> On 30 Sep 2025, at 9:55, Vlastimil Babka wrote: >> >>> On 9/25/25 10:50, zhongjinji wrote: >>>> It is unnecessary to set page->private in __del_page_from_free_list(). >>>> >>>> If the page is about to be allocated, page->private will be cleared by >>>> post_alloc_hook() before the page is handed out. If the page is expanded >>>> or merged, page->private will be reset by set_buddy_order, and no one >>>> will retrieve the page's buddy_order without the PageBuddy flag being set. >>>> If the page is isolated, it will also reset page->private when it >>>> succeeds. >>> >>> Seems correct. >> >> This means high order free pages will have head[2N].private set to a non-zero >> value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ... >> head[N*(2^M)].private is M and head[0].private is the actual free page order. > > Hm right, tail pages... > >> If such a high order free page is used as high order folio, it should be fine. > > We don't reinterpret the private field in any of the first X tail pages for > folios? That would be bad too. I thought so until I see the comment above page_private(): “page_private can be used on tail pages.” Now I am not so confident about “it should be fine”. > >> But if user allocates a non-compound high order page and uses split_page() >> to get a list of order-0 pages from this high order page, some pages will >> have non zero private. I wonder if these users are prepared for that. >> >> For example, kernel/events/ring_buffer.c does it. In its comment, it says >> “set its first page's private to this order; !PagePrivate(page) means it's >> just a normal page.” >> (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634) >> >> I wonder if non zero page->private would cause any issue there. >> >> Maybe split_page() should set all page->private to 0. >> >> Let me know if I get anything wrong. > > Maybe we could postpone this optimization until struct pages are shrunk. Yes, I agree. But I think willy could give a more definitive answer here. > >>> >>>> Since __del_page_from_free_list() is a hot path in the kernel, it would be >>>> better to remove the unnecessary set_page_private(). >>>> >>>> Signed-off-by: zhongjinji <zhongjinji@honor.com> >>> >>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> >>> >>>> --- >>>> mm/page_alloc.c | 1 - >>>> 1 file changed, 1 deletion(-) >>>> >>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>>> index d1d037f97c5f..1999eb7e7c14 100644 >>>> --- a/mm/page_alloc.c >>>> +++ b/mm/page_alloc.c >>>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon >>>> >>>> list_del(&page->buddy_list); >>>> __ClearPageBuddy(page); >>>> - set_page_private(page, 0); >>>> zone->free_area[order].nr_free--; >>>> >>>> if (order >= pageblock_order && !is_migrate_isolate(migratetype)) >> >> >> Best Regards, >> Yan, Zi Best Regards, Yan, Zi ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() 2025-09-30 14:28 ` Zi Yan 2025-09-30 15:20 ` Vlastimil Babka @ 2025-10-01 4:38 ` jinji zhong 2025-10-03 15:18 ` Zi Yan 1 sibling, 1 reply; 9+ messages in thread From: jinji zhong @ 2025-10-01 4:38 UTC (permalink / raw) To: ziy Cc: akpm, feng.han, hannes, jackmanb, linux-kernel, linux-mm, liulu.liu, mhocko, surenb, vbabka, zhongjinji > On 30 Sep 2025, at 9:55, Vlastimil Babka wrote: > >On 9/25/25 10:50, zhongjinji wrote: > >> It is unnecessary to set page->private in __del_page_from_free_list(). > >> > >> If the page is about to be allocated, page->private will be cleared by > >> post_alloc_hook() before the page is handed out. If the page is expanded > >> or merged, page->private will be reset by set_buddy_order, and no one > >> will retrieve the page's buddy_order without the PageBuddy flag being set. > >> If the page is isolated, it will also reset page->private when it > >> succeeds. > > > >Seems correct. > This means high order free pages will have head[2N].private set to a non-zero > value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ... > head[N*(2^M)].private is M and head[0].private is the actual free page order. > If such a high order free page is used as high order folio, it should be fine. > But if user allocates a non-compound high order page and uses split_page() > to get a list of order-0 pages from this high order page, some pages will > have non zero private. I wonder if these users are prepared for that. Having non-empty page->private in tail pages of non-compound high-order pages is not an issue, as pages from the pcp lists never guarantee their initial state. If ensuring empty page->private for tail pages is required, we should handle this in prep_new_page(), similar to the approach taken in prep_compound_page(). > For example, kernel/events/ring_buffer.c does it. In its comment, it says > “set its first page's private to this order; !PagePrivate(page) means it's > just a normal page.” > (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634) PagePrivate is a flag in page->flags that indicates page->private is already in use. While PageBuddy serves a similar purpose, it additionally signifies that the page is part of the buddy system. > I wonder if non zero page->private would cause any issue there. > Maybe split_page() should set all page->private to 0. > Let me know if I get anything wrong. > > > >> Since __del_page_from_free_list() is a hot path in the kernel, it would be > >> better to remove the unnecessary set_page_private(). > >> > >> Signed-off-by: zhongjinji <zhongjinji@honor.com> > > > >Reviewed-by: Vlastimil Babka <vbabka@suse.cz> > > > >> --- > >> mm/page_alloc.c | 1 - > >> 1 file changed, 1 deletion(-) > >> > >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c > >> index d1d037f97c5f..1999eb7e7c14 100644 > >> --- a/mm/page_alloc.c > >> +++ b/mm/page_alloc.c > >> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon > >> > >> list_del(&page->buddy_list); > >> __ClearPageBuddy(page); > >> - set_page_private(page, 0); > >> zone->free_area[order].nr_free--; > >> > >> if (order >= pageblock_order && !is_migrate_isolate(migratetype)) > Best Regards, > Yan, Zi ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() 2025-10-01 4:38 ` jinji zhong @ 2025-10-03 15:18 ` Zi Yan 2025-10-20 15:06 ` jinji zhong 0 siblings, 1 reply; 9+ messages in thread From: Zi Yan @ 2025-10-03 15:18 UTC (permalink / raw) To: jinji zhong Cc: akpm, feng.han, hannes, jackmanb, linux-kernel, linux-mm, liulu.liu, mhocko, surenb, vbabka, zhongjinji On 1 Oct 2025, at 0:38, jinji zhong wrote: >> On 30 Sep 2025, at 9:55, Vlastimil Babka wrote: > >>> On 9/25/25 10:50, zhongjinji wrote: >>>> It is unnecessary to set page->private in __del_page_from_free_list(). >>>> >>>> If the page is about to be allocated, page->private will be cleared by >>>> post_alloc_hook() before the page is handed out. If the page is expanded >>>> or merged, page->private will be reset by set_buddy_order, and no one >>>> will retrieve the page's buddy_order without the PageBuddy flag being set. >>>> If the page is isolated, it will also reset page->private when it >>>> succeeds. >>> >>> Seems correct. > >> This means high order free pages will have head[2N].private set to a non-zero >> value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ... >> head[N*(2^M)].private is M and head[0].private is the actual free page order. >> If such a high order free page is used as high order folio, it should be fine. >> But if user allocates a non-compound high order page and uses split_page() >> to get a list of order-0 pages from this high order page, some pages will >> have non zero private. I wonder if these users are prepared for that. > > Having non-empty page->private in tail pages of non-compound high-order > pages is not an issue, as pages from the pcp lists never guarantee their > initial state. If ensuring empty page->private for tail pages is required, Sure. But is it because all page allocation users return used pages with ->private set back to 0? And can all page allocation users handle non-zero ->private? Otherwise, it can cause subtle bugs. > we should handle this in prep_new_page(), similar to the approach taken in > prep_compound_page(). > >> For example, kernel/events/ring_buffer.c does it. In its comment, it says >> “set its first page's private to this order; !PagePrivate(page) means it's >> just a normal page.” >> (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634) > > PagePrivate is a flag in page->flags that indicates page->private is > already in use. While PageBuddy serves a similar purpose, it additionally > signifies that the page is part of the buddy system. OK. You mean ->private will never be used if PagePrivate is not set in ring buffer code? If you are confident about it is OK to make some pages’ ->private not being zero at allocation, I am not going to block the patch. > >> I wonder if non zero page->private would cause any issue there. > >> Maybe split_page() should set all page->private to 0. > >> Let me know if I get anything wrong. > >>> >>>> Since __del_page_from_free_list() is a hot path in the kernel, it would be >>>> better to remove the unnecessary set_page_private(). >>>> >>>> Signed-off-by: zhongjinji <zhongjinji@honor.com> >>> >>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> >>> >>>> --- >>>> mm/page_alloc.c | 1 - >>>> 1 file changed, 1 deletion(-) >>>> >>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>>> index d1d037f97c5f..1999eb7e7c14 100644 >>>> --- a/mm/page_alloc.c >>>> +++ b/mm/page_alloc.c >>>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon >>>> >>>> list_del(&page->buddy_list); >>>> __ClearPageBuddy(page); >>>> - set_page_private(page, 0); >>>> zone->free_area[order].nr_free--; >>>> >>>> if (order >= pageblock_order && !is_migrate_isolate(migratetype)) > > >> Best Regards, >> Yan, Zi Best Regards, Yan, Zi ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() 2025-10-03 15:18 ` Zi Yan @ 2025-10-20 15:06 ` jinji zhong 2025-10-20 19:55 ` Zi Yan 0 siblings, 1 reply; 9+ messages in thread From: jinji zhong @ 2025-10-20 15:06 UTC (permalink / raw) To: ziy Cc: akpm, feng.han, hannes, jackmanb, jinji.z.zhong, linux-kernel, linux-mm, liulu.liu, mhocko, surenb, vbabka, zhongjinji >On 1 Oct 2025, at 0:38, jinji zhong wrote: > >>> On 30 Sep 2025, at 9:55, Vlastimil Babka wrote: >> >>>> On 9/25/25 10:50, zhongjinji wrote: >>>>> It is unnecessary to set page->private in __del_page_from_free_list(). >>>>> >>>>> If the page is about to be allocated, page->private will be cleared by >>>>> post_alloc_hook() before the page is handed out. If the page is expanded >>>>> or merged, page->private will be reset by set_buddy_order, and no one >>>>> will retrieve the page's buddy_order without the PageBuddy flag being set. >>>>> If the page is isolated, it will also reset page->private when it >>>>> succeeds. >>>> >>>> Seems correct. >> >>> This means high order free pages will have head[2N].private set to a non-zero >>> value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ... >>> head[N*(2^M)].private is M and head[0].private is the actual free page order. >>> If such a high order free page is used as high order folio, it should be fine. >>> But if user allocates a non-compound high order page and uses split_page() >>> to get a list of order-0 pages from this high order page, some pages will >>> have non zero private. I wonder if these users are prepared for that. >> >> Having non-empty page->private in tail pages of non-compound high-order >> pages is not an issue, as pages from the pcp lists never guarantee their >> initial state. If ensuring empty page->private for tail pages is required, > >Sure. But is it because all page allocation users return used pages with Some users [2] do not reset the private back to 0. When the page is a tail page, the non-zero private value will persist until the page is split. >->private set back to 0? And can all page allocation users handle non-zero >->private? Otherwise, it can cause subtle bugs. Yes, you are right. Some users(like swapfile [1]) cannot handle non-zero private. >> we should handle this in prep_new_page(), similar to the approach taken in >> prep_compound_page(). >> >>> For example, kernel/events/ring_buffer.c does it. In its comment, it says >>> “set its first page's private to this order; !PagePrivate(page) means it's >>> just a normal page.” >>> (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634) >> >> PagePrivate is a flag in page->flags that indicates page->private is >> already in use. While PageBuddy serves a similar purpose, it additionally >> signifies that the page is part of the buddy system. > >OK. You mean ->private will never be used if PagePrivate is not set >in ring buffer code? In the ring buffer code, it only uses the private field of the head page, but I recently found that the swapfile [1] is assuming page->private is zero, even if the page is a tail page, which seems a bit dangerous. Adding this patch will make this situation worse. link: https://elixir.bootlin.com/linux/v6.17/source/mm/swapfile.c#L3745 [1] link: https://elixir.bootlin.com/linux/v6.17/source/mm/swapfile.c#L3887 [2] >If you are confident about it is OK to make some pages’ ->private not being >zero at allocation, I am not going to block the patch. > >> >>> I wonder if non zero page->private would cause any issue there. >> >>> Maybe split_page() should set all page->private to 0. >> >>> Let me know if I get anything wrong. >> >>>> >>>>> Since __del_page_from_free_list() is a hot path in the kernel, it would be >>>>> better to remove the unnecessary set_page_private(). >>>>> >>>>> Signed-off-by: zhongjinji <zhongjinji@honor.com> >>>> >>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> >>>> >>>>> --- >>>>> mm/page_alloc.c | 1 - >>>>> 1 file changed, 1 deletion(-) >>>>> >>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>>>> index d1d037f97c5f..1999eb7e7c14 100644 >>>>> --- a/mm/page_alloc.c >>>>> +++ b/mm/page_alloc.c >>>>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon >>>>> >>>>> list_del(&page->buddy_list); >>>>> __ClearPageBuddy(page); >>>>> - set_page_private(page, 0); >>>>> zone->free_area[order].nr_free--; >>>>> >>>>> if (order >= pageblock_order && !is_migrate_isolate(migratetype)) >> >> >>> Best Regards, >>> Yan, Zi > > >Best Regards, >Yan, Zi ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() 2025-10-20 15:06 ` jinji zhong @ 2025-10-20 19:55 ` Zi Yan 0 siblings, 0 replies; 9+ messages in thread From: Zi Yan @ 2025-10-20 19:55 UTC (permalink / raw) To: jinji zhong Cc: akpm, feng.han, hannes, jackmanb, linux-kernel, linux-mm, liulu.liu, mhocko, surenb, vbabka, zhongjinji On 20 Oct 2025, at 11:06, jinji zhong wrote: >> On 1 Oct 2025, at 0:38, jinji zhong wrote: >> >>>> On 30 Sep 2025, at 9:55, Vlastimil Babka wrote: >>> >>>>> On 9/25/25 10:50, zhongjinji wrote: >>>>>> It is unnecessary to set page->private in __del_page_from_free_list(). >>>>>> >>>>>> If the page is about to be allocated, page->private will be cleared by >>>>>> post_alloc_hook() before the page is handed out. If the page is expanded >>>>>> or merged, page->private will be reset by set_buddy_order, and no one >>>>>> will retrieve the page's buddy_order without the PageBuddy flag being set. >>>>>> If the page is isolated, it will also reset page->private when it >>>>>> succeeds. >>>>> >>>>> Seems correct. >>> >>>> This means high order free pages will have head[2N].private set to a non-zero >>>> value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ... >>>> head[N*(2^M)].private is M and head[0].private is the actual free page order. >>>> If such a high order free page is used as high order folio, it should be fine. >>>> But if user allocates a non-compound high order page and uses split_page() >>>> to get a list of order-0 pages from this high order page, some pages will >>>> have non zero private. I wonder if these users are prepared for that. >>> >>> Having non-empty page->private in tail pages of non-compound high-order >>> pages is not an issue, as pages from the pcp lists never guarantee their >>> initial state. If ensuring empty page->private for tail pages is required, >> >> Sure. But is it because all page allocation users return used pages with > > Some users [2] do not reset the private back to 0. When the page is a tail > page, the non-zero private value will persist until the page is split. > >> ->private set back to 0? And can all page allocation users handle non-zero >> ->private? Otherwise, it can cause subtle bugs. > > Yes, you are right. Some users(like swapfile [1]) cannot handle non-zero private. > >>> we should handle this in prep_new_page(), similar to the approach taken in >>> prep_compound_page(). >>> >>>> For example, kernel/events/ring_buffer.c does it. In its comment, it says >>>> “set its first page's private to this order; !PagePrivate(page) means it's >>>> just a normal page.” >>>> (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634) >>> >>> PagePrivate is a flag in page->flags that indicates page->private is >>> already in use. While PageBuddy serves a similar purpose, it additionally >>> signifies that the page is part of the buddy system. >> >> OK. You mean ->private will never be used if PagePrivate is not set >> in ring buffer code? > > In the ring buffer code, it only uses the private field of the head page, > but I recently found that the swapfile [1] is assuming page->private is zero, > even if the page is a tail page, which seems a bit dangerous. Adding this > patch will make this situation worse. Yeah, thank you for the detective work. The comment in [1] sounds really alarming, as the code relies on a specific behavior: /* *Page allocation does not initialize the page's lru field, * but it does always reset its private field. */ We can revisit your patch when page->private is gone like Vlastimil suggested. > > link: https://elixir.bootlin.com/linux/v6.17/source/mm/swapfile.c#L3745 [1] > link: https://elixir.bootlin.com/linux/v6.17/source/mm/swapfile.c#L3887 [2] > >> If you are confident about it is OK to make some pages’ ->private not being >> zero at allocation, I am not going to block the patch. >> >>> >>>> I wonder if non zero page->private would cause any issue there. >>> >>>> Maybe split_page() should set all page->private to 0. >>> >>>> Let me know if I get anything wrong. >>> >>>>> >>>>>> Since __del_page_from_free_list() is a hot path in the kernel, it would be >>>>>> better to remove the unnecessary set_page_private(). >>>>>> >>>>>> Signed-off-by: zhongjinji <zhongjinji@honor.com> >>>>> >>>>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> >>>>> >>>>>> --- >>>>>> mm/page_alloc.c | 1 - >>>>>> 1 file changed, 1 deletion(-) >>>>>> >>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>>>>> index d1d037f97c5f..1999eb7e7c14 100644 >>>>>> --- a/mm/page_alloc.c >>>>>> +++ b/mm/page_alloc.c >>>>>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon >>>>>> >>>>>> list_del(&page->buddy_list); >>>>>> __ClearPageBuddy(page); >>>>>> - set_page_private(page, 0); >>>>>> zone->free_area[order].nr_free--; >>>>>> >>>>>> if (order >= pageblock_order && !is_migrate_isolate(migratetype)) >>> >>> >>>> Best Regards, >>>> Yan, Zi >> >> >> Best Regards, >> Yan, Zi -- Best Regards, Yan, Zi ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-10-20 19:56 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-09-25 8:50 [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list() zhongjinji 2025-09-30 13:55 ` Vlastimil Babka 2025-09-30 14:28 ` Zi Yan 2025-09-30 15:20 ` Vlastimil Babka 2025-09-30 15:28 ` Zi Yan 2025-10-01 4:38 ` jinji zhong 2025-10-03 15:18 ` Zi Yan 2025-10-20 15:06 ` jinji zhong 2025-10-20 19:55 ` Zi Yan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox