* [PATCH] mm/hugetlb: fix set_max_huge_pages() when there are surplus pages
@ 2025-02-25 14:19 Jinjiang Tu
  2025-02-26 15:57 ` David Hildenbrand
  0 siblings, 1 reply; 6+ messages in thread
From: Jinjiang Tu @ 2025-02-25 14:19 UTC (permalink / raw)
  To: muchun.song, akpm, mel, lee.schermerhorn, andi, david; +Cc: linux-mm

In set_max_huge_pages(), min_count should mean the in-use persistent
huge pages, but it also counts surplus huge pages. It will leads to failing
to free the free huge pages of a node.
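
For reference, persistent_huge_pages() in mm/hugetlb.c already excludes
surplus pages (a simplified sketch of the helper):

static unsigned long persistent_huge_pages(struct hstate *h)
{
	/* persistently allocated pages, i.e. total minus overcommit surplus */
	return h->nr_huge_pages - h->surplus_huge_pages;
}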

Steps to reproduce:
1) create 5 huge pages in Node 0
2) run a program to use all the huge pages
3) create 5 huge pages in Node 1
4) echo 0 > nr_hugepages for Node 1 to free the huge pages
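
For example (assuming 2 MiB huge pages; the consuming program below is
hypothetical):

echo 5 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
./use_hugepages &	# hypothetical: mmaps and touches all 5 huge pages
echo 5 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
echo 0 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages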

The result:
        Node 0    Node 1
Total     5         5
Free      0         5
Surp      5         5

Fixes: 9a30523066cd ("hugetlb: add per node hstate attributes")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
 mm/hugetlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea1..783faec7360b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3758,7 +3758,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	 * and won't grow the pool anywhere else. Not until one of the
 	 * sysctls are changed, or the surplus pages go out of use.
 	 */
-	min_count = h->resv_huge_pages + h->nr_huge_pages - h->free_huge_pages;
+	min_count = h->resv_huge_pages + persistent_huge_pages(h) - h->free_huge_pages;
 	min_count = max(count, min_count);
 	try_to_free_low(h, min_count, nodes_allowed);
 
-- 
2.43.0




* Re: [PATCH] mm/hugetlb: fix set_max_huge_pages() when there are surplus pages
  2025-02-25 14:19 [PATCH] mm/hugetlb: fix set_max_huge_pages() when there are surplus pages Jinjiang Tu
@ 2025-02-26 15:57 ` David Hildenbrand
  2025-02-27  3:11   ` Jinjiang Tu
  0 siblings, 1 reply; 6+ messages in thread
From: David Hildenbrand @ 2025-02-26 15:57 UTC (permalink / raw)
  To: Jinjiang Tu, muchun.song, akpm, mel, lee.schermerhorn, andi; +Cc: linux-mm

On 25.02.25 15:19, Jinjiang Tu wrote:
> In set_max_huge_pages(), min_count should mean the in-use persistent
> huge pages, but it also counts surplus huge pages. It will leads to failing

s/leads/lead/

> to free the free huge pages of a node.
> 
> Steps to reproduce:
> 1) create 5 huge pages in Node 0
> 2) run a program to use all the huge pages
> 3) create 5 huge pages in Node 1
> 4) echo 0 > nr_hugepages for Node 1 to free the huge pages
> 
> The result:
>          Node 0    Node 1
> Total     5         5
> Free      0         5
> Surp      5         5

Can you also share the results after your change?


> 
> Fixes: 9a30523066cd ("hugetlb: add per node hstate attributes")
> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
> ---
>   mm/hugetlb.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 163190e89ea1..783faec7360b 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3758,7 +3758,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
>   	 * and won't grow the pool anywhere else. Not until one of the
>   	 * sysctls are changed, or the surplus pages go out of use.
>   	 */
> -	min_count = h->resv_huge_pages + h->nr_huge_pages - h->free_huge_pages;
> +	min_count = h->resv_huge_pages + persistent_huge_pages(h) - h->free_huge_pages;
>   	min_count = max(count, min_count);
>   	try_to_free_low(h, min_count, nodes_allowed);
>   


-- 
Cheers,

David / dhildenb




* Re: [PATCH] mm/hugetlb: fix set_max_huge_pages() when there are surplus pages
  2025-02-26 15:57 ` David Hildenbrand
@ 2025-02-27  3:11   ` Jinjiang Tu
  2025-02-27  6:19     ` Oscar Salvador
  0 siblings, 1 reply; 6+ messages in thread
From: Jinjiang Tu @ 2025-02-27  3:11 UTC (permalink / raw)
  To: David Hildenbrand, muchun.song, akpm, mel, lee.schermerhorn, andi
  Cc: linux-mm


On 2025/2/26 23:57, David Hildenbrand wrote:
> On 25.02.25 15:19, Jinjiang Tu wrote:
>> In set_max_huge_pages(), min_count should mean the in-use persistent
>> huge pages, but it also counts surplus huge pages. It will leads to failing
>
> s/leads/lead/
>
>> to free the free huge pages of a node.
>>
>> Steps to reproduce:
>> 1) create 5 huge pages in Node 0
>> 2) run a program to use all the huge pages
>> 3) create 5 huge pages in Node 1
>> 4) echo 0 > nr_hugepages for Node 1 to free the huge pages
>>
>> The result:
>>          Node 0    Node 1
>> Total     5         5
>> Free      0         5
>> Surp      5         5
>
> Can you also share the results after your change?
With this patch, step 4) destroys the 5 huge pages in Node 1

The result with this patch:
        Node 0    Node 1
Total     5         0
Free      0         0
Surp      5         0
>
>
>>
>> Fixes: 9a30523066cd ("hugetlb: add per node hstate attributes")
>> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
>> ---
>>   mm/hugetlb.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 163190e89ea1..783faec7360b 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -3758,7 +3758,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
>>        * and won't grow the pool anywhere else. Not until one of the
>>        * sysctls are changed, or the surplus pages go out of use.
>>        */
>> -    min_count = h->resv_huge_pages + h->nr_huge_pages - h->free_huge_pages;
>> +    min_count = h->resv_huge_pages + persistent_huge_pages(h) - h->free_huge_pages;
>>       min_count = max(count, min_count);
>>       try_to_free_low(h, min_count, nodes_allowed);
>
>



* Re: [PATCH] mm/hugetlb: fix set_max_huge_pages() when there are surplus pages
  2025-02-27  3:11   ` Jinjiang Tu
@ 2025-02-27  6:19     ` Oscar Salvador
  2025-02-27  7:06       ` Jinjiang Tu
  2025-03-04 13:44       ` Jinjiang Tu
  0 siblings, 2 replies; 6+ messages in thread
From: Oscar Salvador @ 2025-02-27  6:19 UTC (permalink / raw)
  To: Jinjiang Tu
  Cc: David Hildenbrand, muchun.song, akpm, mel, lee.schermerhorn,
	andi, linux-mm

On Thu, Feb 27, 2025 at 11:11:25AM +0800, Jinjiang Tu wrote:
> The result with this patch:
>         Node 0    Node 1
> Total     5         0
> Free      0         0
> Surp      5         0

Why would node0 have any surplus pages?

Also, this does not reproduce on my machine.

Are you overcommitting? Are you using the hvo optimization?
If so, it might be that we cannot allocate vmemmap pages and those
hugetlb pages stay in a surplus state?

From bulk_vmemmap_restore_error():

 * If unable to restore a hugetlb page, the hugetlb
 * page is made a surplus page and removed from the list.
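
Roughly, in bulk_vmemmap_restore_error() (a simplified sketch, not the
exact kernel code):

	list_for_each_entry_safe(folio, t_folio, folio_list, lru) {
		if (hugetlb_vmemmap_restore_folio(h, folio)) {
			/* vmemmap pages could not be re-allocated, so keep
			 * the folio as a surplus page instead of freeing it
			 * back to the buddy allocator */
			list_del(&folio->lru);
			spin_lock_irq(&hugetlb_lock);
			add_hugetlb_folio(h, folio, true);
			spin_unlock_irq(&hugetlb_lock);
		}
	}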


-- 
Oscar Salvador
SUSE Labs



* Re: [PATCH] mm/hugetlb: fix set_max_huge_pages() when there are surplus pages
  2025-02-27  6:19     ` Oscar Salvador
@ 2025-02-27  7:06       ` Jinjiang Tu
  2025-03-04 13:44       ` Jinjiang Tu
  1 sibling, 0 replies; 6+ messages in thread
From: Jinjiang Tu @ 2025-02-27  7:06 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: David Hildenbrand, muchun.song, akpm, mel, lee.schermerhorn,
	andi, linux-mm


On 2025/2/27 14:19, Oscar Salvador wrote:
> On Thu, Feb 27, 2025 at 11:11:25AM +0800, Jinjiang Tu wrote:
>> The result with this patch:
>>         Node 0    Node 1
>> Total     5         0
>> Free      0         0
>> Surp      5         0
> Why would node0 have any surplus pages?
>
> Also, this does not reproduce on my machine.
Sorry, I missed a step between steps 2) and 3). After step 2), run:

echo 0 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

So node 0 has 5 surplus pages.
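
Putting it together, the full sequence is (assuming 2 MiB huge pages; the
consuming program is hypothetical):

echo 5 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
./use_hugepages &	# hypothetical: mmaps and touches all 5 huge pages
echo 0 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 5 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
echo 0 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages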
>
> Are you overcommitting? Are you using the hvo optimization?
> If so, it might be that we cannot allocate vmemmap pages and those
> hugetlb pages stay in a surplus state?
>
> From bulk_vmemmap_restore_error():
>
>   * If unable to restore a hugetlb page, the hugetlb
>   * page is made a surplus page and removed from the list.
>
>



* Re: [PATCH] mm/hugetlb: fix set_max_huge_pages() when there are surplus pages
  2025-02-27  6:19     ` Oscar Salvador
  2025-02-27  7:06       ` Jinjiang Tu
@ 2025-03-04 13:44       ` Jinjiang Tu
  1 sibling, 0 replies; 6+ messages in thread
From: Jinjiang Tu @ 2025-03-04 13:44 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: David Hildenbrand, muchun.song, akpm, mel, lee.schermerhorn,
	andi, linux-mm


On 2025/2/27 14:19, Oscar Salvador wrote:
> On Thu, Feb 27, 2025 at 11:11:25AM +0800, Jinjiang Tu wrote:
>> The result with this patch:
>>         Node 0    Node 1
>> Total     5         0
>> Free      0         0
>> Surp      5         0
> Why would node0 have any surplus pages?
>
> Also, this does not reproduce on my machine.
>
> Are you overcommitting? Are you using the hvo optimization?
> If so, it might be that we cannot allocate vmemmap pages and those
> hugetlb pages stay in a surplus state?
>
> From bulk_vmemmap_restore_error():
>
>   * If unable to restore a hugetlb page, the hugetlb
>   * page is made a surplus page and removed from the list.

So surplus pages may be free pages and not reserved ones. In that case,
this patch shouldn't subtract surplus_huge_pages from min_count, otherwise
these free pages will be subtracted twice.
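
A worked example with hypothetical numbers: suppose nr_huge_pages = 5,
surplus_huge_pages = 5 (all of them free because the vmemmap restore
failed), free_huge_pages = 5 and resv_huge_pages = 0. Then:

	min_count = resv_huge_pages + persistent_huge_pages(h) - free_huge_pages
	          = 0 + (5 - 5) - 5

The five free surplus pages are dropped once inside persistent_huge_pages()
and again via free_huge_pages, and since min_count is an unsigned long the
subtraction wraps to a huge value.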

>
>

