linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/hugeltb: fix nodes huge page allocation when there are surplus pages
@ 2023-08-26  3:57 Xueshi Hu
  2023-08-28 23:34 ` Mike Kravetz
  0 siblings, 1 reply; 2+ messages in thread
From: Xueshi Hu @ 2023-08-26  3:57 UTC (permalink / raw)
  To: Mike Kravetz, Muchun Song, Andrew Morton, Oscar Salvador,
	Naoya Horiguchi
  Cc: Xueshi Hu, linux-mm, linux-kernel

In set_nr_huge_pages(), local variable "count" is used to record
persistent_huge_pages(), but when it cames to nodes huge page allocation,
the semantics changes to nr_huge_pages. When there exists surplus huge
pages and using the interface under
/sys/devices/system/node/node*/hugepages to change huge page pool size,
this difference can result in the allocation of an unexpected number of
huge pages.

Steps to reproduce the bug:

Starting with:

				  Node 0          Node 1    Total
	HugePages_Total             0.00            0.00     0.00
	HugePages_Free              0.00            0.00     0.00
	HugePages_Surp              0.00            0.00     0.00

create 100 huge pages in Node 0 and consume it, then set Node 0 's
nr_hugepages to 0.

yields:

				  Node 0          Node 1    Total
	HugePages_Total           200.00            0.00   200.00
	HugePages_Free              0.00            0.00     0.00
	HugePages_Surp            200.00            0.00   200.00

write 100 to Node 1's nr_hugepages

		echo 100 > /sys/devices/system/node/node1/\
	hugepages/hugepages-2048kB/nr_hugepages

gets:

				  Node 0          Node 1    Total
	HugePages_Total           200.00          400.00   600.00
	HugePages_Free              0.00          400.00   400.00
	HugePages_Surp            200.00            0.00   200.00

Kernel is expected to create only 100 huge pages and it gives 200.

Fixes: fd875dca7c71 ("hugetlbfs: fix potential over/underflow setting node specific nr_hugepages")
Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com>
---
 mm/hugetlb.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6da626bfb52e..54e2e2e12aa9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3494,7 +3494,9 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	if (nid != NUMA_NO_NODE) {
 		unsigned long old_count = count;
 
-		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
+		count += persistent_huge_pages(h) -
+			 (h->nr_huge_pages_node[nid] -
+			  h->surplus_huge_pages_node[nid]);
 		/*
 		 * User may have specified a large count value which caused the
 		 * above calculation to overflow.  In this case, they wanted
-- 
2.40.1



^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH] mm/hugeltb: fix nodes huge page allocation when there are surplus pages
  2023-08-26  3:57 [PATCH] mm/hugeltb: fix nodes huge page allocation when there are surplus pages Xueshi Hu
@ 2023-08-28 23:34 ` Mike Kravetz
  0 siblings, 0 replies; 2+ messages in thread
From: Mike Kravetz @ 2023-08-28 23:34 UTC (permalink / raw)
  To: Xueshi Hu
  Cc: Muchun Song, Andrew Morton, Oscar Salvador, Naoya Horiguchi,
	linux-mm, linux-kernel

On 08/26/23 11:57, Xueshi Hu wrote:
> In set_nr_huge_pages(), local variable "count" is used to record
> persistent_huge_pages(), but when it cames to nodes huge page allocation,
> the semantics changes to nr_huge_pages. When there exists surplus huge
> pages and using the interface under
> /sys/devices/system/node/node*/hugepages to change huge page pool size,
> this difference can result in the allocation of an unexpected number of
> huge pages.
> 
> Steps to reproduce the bug:
> 
> Starting with:
> 
> 				  Node 0          Node 1    Total
> 	HugePages_Total             0.00            0.00     0.00
> 	HugePages_Free              0.00            0.00     0.00
> 	HugePages_Surp              0.00            0.00     0.00
> 
> create 100 huge pages in Node 0 and consume it, then set Node 0 's
> nr_hugepages to 0.
> 
> yields:
> 
> 				  Node 0          Node 1    Total
> 	HugePages_Total           200.00            0.00   200.00
> 	HugePages_Free              0.00            0.00     0.00
> 	HugePages_Surp            200.00            0.00   200.00
> 
> write 100 to Node 1's nr_hugepages
> 
> 		echo 100 > /sys/devices/system/node/node1/\
> 	hugepages/hugepages-2048kB/nr_hugepages
> 
> gets:
> 
> 				  Node 0          Node 1    Total
> 	HugePages_Total           200.00          400.00   600.00
> 	HugePages_Free              0.00          400.00   400.00
> 	HugePages_Surp            200.00            0.00   200.00
> 
> Kernel is expected to create only 100 huge pages and it gives 200.
> 
> Fixes: fd875dca7c71 ("hugetlbfs: fix potential over/underflow setting node specific nr_hugepages")
> Signed-off-by: Xueshi Hu <xueshi.hu@smartx.com>
> ---
>  mm/hugetlb.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Thank you!

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

A note about the Fixes: tag.  Commit fd875dca7c71 moved the root cause of this
issue (the line changed below) from the routine __nr_hugepages_store_common
to the routine set_max_huge_pages.  I believe the actual issue has existed
since  commit 9a30523066cde that added hugetlb node specific support way back
in 2009 (2.6.32 timeframe).

If 9a30523066cde is used in the fixes tag, there will be some non-automatic
backports for releases where fd875dca7c71 does not exist.  I would suggest
changing the fixes tag.  We can ignore the broken backports for older releases,
as I really doubt this is a real issue for many/any people.
-- 
Mike Kravetz

> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 6da626bfb52e..54e2e2e12aa9 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3494,7 +3494,9 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
>  	if (nid != NUMA_NO_NODE) {
>  		unsigned long old_count = count;
>  
> -		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
> +		count += persistent_huge_pages(h) -
> +			 (h->nr_huge_pages_node[nid] -
> +			  h->surplus_huge_pages_node[nid]);
>  		/*
>  		 * User may have specified a large count value which caused the
>  		 * above calculation to overflow.  In this case, they wanted
> -- 
> 2.40.1
> 


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2023-08-28 23:35 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-26  3:57 [PATCH] mm/hugeltb: fix nodes huge page allocation when there are surplus pages Xueshi Hu
2023-08-28 23:34 ` Mike Kravetz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox