linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm,page_owner: Defer enablement of static branch
@ 2024-04-09 13:17 Oscar Salvador
  2024-04-09 13:26 ` Vlastimil Babka
  2024-04-09 13:27 ` Kefeng Wang
  0 siblings, 2 replies; 3+ messages in thread
From: Oscar Salvador @ 2024-04-09 13:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Vlastimil Babka, Kefeng Wang, Oscar Salvador

Kefeng Wang reported that he was seeing some memory leaks with kmemleak
with page_owner enabled.
The reason behind is that we enable the page_owner_inited static branch
and then proceed with the linking of stack_list struct to dummy_stack,
which means that exists a race window between these two steps where we
can have pages already being allocated calling add_stack_record_to_list(),
allocating objects and linking them to stack_list, but then we set
stack_list pointing to dummy_stack in init_page_owner.
Which means that the objects that have been allocated during that time
window are unreferenced and lost.

Fix this by deferring the enablement of the branch until we have properly
set up the list.

Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Closes: https://lore.kernel.org/linux-mm/74b147b0-718d-4d50-be75-d6afc801cd24@huawei.com/
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Fixes: 4bedfb314bdd ("mm,page_owner: maintain own list of stack_records structs")
Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
Special thanks and kudos go to Kefeng Wang for helping me out chasing
down this bug, as I could not reproduce it with any of my machines, and
to Vlastimil to bring another pair of eyes, which was very helpful.

 mm/page_owner.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_owner.c b/mm/page_owner.c
index 9bef0b442863..742f432e5bf0 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -118,7 +118,6 @@ static __init void init_page_owner(void)
 	register_dummy_stack();
 	register_failure_stack();
 	register_early_stack();
-	static_branch_enable(&page_owner_inited);
 	init_early_allocated_pages();
 	/* Initialize dummy and failure stacks and link them to stack_list */
 	dummy_stack.stack_record = __stack_depot_get_stack_record(dummy_handle);
@@ -129,6 +128,7 @@ static __init void init_page_owner(void)
 		refcount_set(&failure_stack.stack_record->count, 1);
 	dummy_stack.next = &failure_stack;
 	stack_list = &dummy_stack;
+	static_branch_enable(&page_owner_inited);
 }
 
 struct page_ext_operations page_owner_ops = {
-- 
2.44.0



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] mm,page_owner: Defer enablement of static branch
  2024-04-09 13:17 [PATCH] mm,page_owner: Defer enablement of static branch Oscar Salvador
@ 2024-04-09 13:26 ` Vlastimil Babka
  2024-04-09 13:27 ` Kefeng Wang
  1 sibling, 0 replies; 3+ messages in thread
From: Vlastimil Babka @ 2024-04-09 13:26 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton; +Cc: linux-kernel, linux-mm, Kefeng Wang

On 4/9/24 3:17 PM, Oscar Salvador wrote:
> Kefeng Wang reported that he was seeing some memory leaks with kmemleak
> with page_owner enabled.
> The reason behind is that we enable the page_owner_inited static branch
> and then proceed with the linking of stack_list struct to dummy_stack,
> which means that exists a race window between these two steps where we
> can have pages already being allocated calling add_stack_record_to_list(),
> allocating objects and linking them to stack_list, but then we set
> stack_list pointing to dummy_stack in init_page_owner.
> Which means that the objects that have been allocated during that time
> window are unreferenced and lost.
> 
> Fix this by deferring the enablement of the branch until we have properly
> set up the list.
> 
> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> Closes: https://lore.kernel.org/linux-mm/74b147b0-718d-4d50-be75-d6afc801cd24@huawei.com/
> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> Fixes: 4bedfb314bdd ("mm,page_owner: maintain own list of stack_records structs")

That's in rc1 so ideally mm-hotfixes.

> Signed-off-by: Oscar Salvador <osalvador@suse.de>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
> Special thanks and kudos go to Kefeng Wang for helping me out chasing
> down this bug, as I could not reproduce it with any of my machines, and
> to Vlastimil to bring another pair of eyes, which was very helpful.
> 
>  mm/page_owner.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index 9bef0b442863..742f432e5bf0 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -118,7 +118,6 @@ static __init void init_page_owner(void)
>  	register_dummy_stack();
>  	register_failure_stack();
>  	register_early_stack();
> -	static_branch_enable(&page_owner_inited);
>  	init_early_allocated_pages();
>  	/* Initialize dummy and failure stacks and link them to stack_list */
>  	dummy_stack.stack_record = __stack_depot_get_stack_record(dummy_handle);
> @@ -129,6 +128,7 @@ static __init void init_page_owner(void)
>  		refcount_set(&failure_stack.stack_record->count, 1);
>  	dummy_stack.next = &failure_stack;
>  	stack_list = &dummy_stack;
> +	static_branch_enable(&page_owner_inited);
>  }
>  
>  struct page_ext_operations page_owner_ops = {



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] mm,page_owner: Defer enablement of static branch
  2024-04-09 13:17 [PATCH] mm,page_owner: Defer enablement of static branch Oscar Salvador
  2024-04-09 13:26 ` Vlastimil Babka
@ 2024-04-09 13:27 ` Kefeng Wang
  1 sibling, 0 replies; 3+ messages in thread
From: Kefeng Wang @ 2024-04-09 13:27 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton; +Cc: linux-kernel, linux-mm, Vlastimil Babka



On 2024/4/9 21:17, Oscar Salvador wrote:
> Kefeng Wang reported that he was seeing some memory leaks with kmemleak
> with page_owner enabled.
> The reason behind is that we enable the page_owner_inited static branch
> and then proceed with the linking of stack_list struct to dummy_stack,
> which means that exists a race window between these two steps where we
> can have pages already being allocated calling add_stack_record_to_list(),
> allocating objects and linking them to stack_list, but then we set
> stack_list pointing to dummy_stack in init_page_owner.
> Which means that the objects that have been allocated during that time
> window are unreferenced and lost.
> 
> Fix this by deferring the enablement of the branch until we have properly
> set up the list.
> 
> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> Closes: https://lore.kernel.org/linux-mm/74b147b0-718d-4d50-be75-d6afc801cd24@huawei.com/
> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> Fixes: 4bedfb314bdd ("mm,page_owner: maintain own list of stack_records structs")
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> ---
> Special thanks and kudos go to Kefeng Wang for helping me out chasing
> down this bug, as I could not reproduce it with any of my machines, and
> to Vlastimil to bring another pair of eyes, which was very helpful.
> 
The issue is found by accident when test my migrate changes, thanks for
your great job and quick response, this does pass my test, thanks ;)

>   mm/page_owner.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index 9bef0b442863..742f432e5bf0 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -118,7 +118,6 @@ static __init void init_page_owner(void)
>   	register_dummy_stack();
>   	register_failure_stack();
>   	register_early_stack();
> -	static_branch_enable(&page_owner_inited);
>   	init_early_allocated_pages();
>   	/* Initialize dummy and failure stacks and link them to stack_list */
>   	dummy_stack.stack_record = __stack_depot_get_stack_record(dummy_handle);
> @@ -129,6 +128,7 @@ static __init void init_page_owner(void)
>   		refcount_set(&failure_stack.stack_record->count, 1);
>   	dummy_stack.next = &failure_stack;
>   	stack_list = &dummy_stack;
> +	static_branch_enable(&page_owner_inited);
>   }
>   
>   struct page_ext_operations page_owner_ops = {


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-04-09 13:28 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-09 13:17 [PATCH] mm,page_owner: Defer enablement of static branch Oscar Salvador
2024-04-09 13:26 ` Vlastimil Babka
2024-04-09 13:27 ` Kefeng Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox