linux-mm.kvack.org archive mirror
* [PATCH v4] mm/hugetlb: fix set_max_huge_pages() when there are surplus pages
@ 2025-04-09  5:59 Jinjiang Tu
  2025-04-10  2:14 ` Andrew Morton
  0 siblings, 1 reply; 2+ messages in thread
From: Jinjiang Tu @ 2025-04-09  5:59 UTC (permalink / raw)
  To: osalvador, david, akpm, muchun.song; +Cc: linux-mm, wangkefeng.wang, tujinjiang

In set_max_huge_pages(), min_count is computed with surplus huge pages
included. In some cases this prevents huge pages from being freed, and
they end up accounted as surplus instead.

One way to solve this is to subtract surplus_huge_pages directly, but we
cannot do so blindly: some surplus pages may also be free pages, which
can happen when we fail to restore the vmemmap for HVO-optimized pages.
In that case we would subtract the same page twice.

To work around this, first compute the number of free persistent pages,
and use that together with the surplus pages to compute min_count.

Steps to reproduce:
1) create 5 hugetlb folios in Node0
2) run a program to use all the hugetlb folios
3) echo 0 > nr_hugepages for Node0 to free the hugetlb folios. Because
they are still in use, the 5 hugetlb folios in Node0 are accounted as
surplus instead.
4) create 5 hugetlb folios in Node1
5) echo 0 > nr_hugepages for Node1 to free the hugetlb folios

The result without this patch:
        Node0    Node1
Total     5         5
Free      0         5
Surp      5         5

The result with this patch:
        Node0    Node1
Total     5         0
Free      0         0
Surp      5         0
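
For illustration only, the min_count arithmetic can be modeled in user
space. This is a minimal sketch, not kernel code: struct hstate_model and
the function names are hypothetical stand-ins for the struct hstate
counters, the state before step 5 (10 pages in total, 5 free, 5 surplus,
0 reserved) is inferred from the steps above, and the effective target
count is assumed to be 0.

```c
#include <assert.h>

/* Hypothetical user-space stand-in for the struct hstate counters. */
struct hstate_model {
	unsigned long nr_huge_pages;
	unsigned long free_huge_pages;
	unsigned long surplus_huge_pages;
	unsigned long resv_huge_pages;
};

/* Mirrors persistent_huge_pages(): total pages minus surplus pages. */
unsigned long persistent_pages(const struct hstate_model *h)
{
	return h->nr_huge_pages - h->surplus_huge_pages;
}

unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Formula removed by the patch: surplus pages inflate min_count. */
unsigned long min_count_old(const struct hstate_model *h, unsigned long count)
{
	unsigned long min_count;

	min_count = h->resv_huge_pages + h->nr_huge_pages - h->free_huge_pages;
	return max_ul(count, min_count);
}

/* Formula added by the patch: only free persistent pages are subtracted. */
unsigned long min_count_new(const struct hstate_model *h, unsigned long count)
{
	unsigned long persistent_free_count = h->free_huge_pages;
	unsigned long min_count;

	if (h->free_huge_pages > persistent_pages(h)) {
		if (h->free_huge_pages > h->surplus_huge_pages)
			persistent_free_count -= h->surplus_huge_pages;
		else
			persistent_free_count = 0;
	}
	min_count = h->resv_huge_pages + persistent_pages(h) -
		    persistent_free_count;
	return max_ul(count, min_count);
}

/* Assumed global state just before step 5 (inferred, not measured). */
struct hstate_model repro_state_before_step5(void)
{
	struct hstate_model h = {
		.nr_huge_pages = 10,     /* 5 on Node0 + 5 on Node1 */
		.free_huge_pages = 5,    /* the new Node1 pages */
		.surplus_huge_pages = 5, /* the in-use Node0 pages */
		.resv_huge_pages = 0,    /* assumption: no reservations */
	};
	return h;
}

/* Checks the two formulas against the assumed state, with count == 0. */
void check_repro(void)
{
	struct hstate_model h = repro_state_before_step5();

	assert(min_count_old(&h, 0) == 5); /* nothing may be freed */
	assert(min_count_new(&h, 0) == 0); /* all 5 free pages may go */
}
```

Under these assumptions the old formula yields a min_count of 5, which
equals the number of persistent pages, so nothing can be freed. The new
formula yields 0, so the 5 free pages on Node1 can be released.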

Fixes: 9a30523066cd ("hugetlb: add per node hstate attributes")
Acked-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
Changelog since v3:
 * update changelog, suggested by Oscar Salvador
 * collect ack from Oscar Salvador

 mm/hugetlb.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 39f92aad7bd1..e4aed3557339 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3825,6 +3825,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
 static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 			      nodemask_t *nodes_allowed)
 {
+	unsigned long persistent_free_count;
 	unsigned long min_count;
 	unsigned long allocated;
 	struct folio *folio;
@@ -3959,8 +3960,24 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	 * though, we'll note that we're not allowed to exceed surplus
 	 * and won't grow the pool anywhere else. Not until one of the
 	 * sysctls are changed, or the surplus pages go out of use.
+	 *
+	 * min_count is the expected number of persistent pages, we
+	 * shouldn't calculate min_count by using
+	 * resv_huge_pages + persistent_huge_pages() - free_huge_pages,
+	 * because there may exist free surplus huge pages, and this will
+	 * lead to subtracting twice. Free surplus huge pages come from HVO
+	 * failing to restore vmemmap, see comments in the callers of
+	 * hugetlb_vmemmap_restore_folio(). Thus, we should calculate
+	 * persistent free count first.
 	 */
-	min_count = h->resv_huge_pages + h->nr_huge_pages - h->free_huge_pages;
+	persistent_free_count = h->free_huge_pages;
+	if (h->free_huge_pages > persistent_huge_pages(h)) {
+		if (h->free_huge_pages > h->surplus_huge_pages)
+			persistent_free_count -= h->surplus_huge_pages;
+		else
+			persistent_free_count = 0;
+	}
+	min_count = h->resv_huge_pages + persistent_huge_pages(h) - persistent_free_count;
 	min_count = max(count, min_count);
 	try_to_free_low(h, min_count, nodes_allowed);
 
-- 
2.43.0



* Re: [PATCH v4] mm/hugetlb: fix set_max_huge_pages() when there are surplus pages
  2025-04-09  5:59 [PATCH v4] mm/hugetlb: fix set_max_huge_pages() when there are surplus pages Jinjiang Tu
@ 2025-04-10  2:14 ` Andrew Morton
  0 siblings, 0 replies; 2+ messages in thread
From: Andrew Morton @ 2025-04-10  2:14 UTC (permalink / raw)
  To: Jinjiang Tu; +Cc: osalvador, david, muchun.song, linux-mm, wangkefeng.wang

On Wed, 9 Apr 2025 13:59:57 +0800 Jinjiang Tu <tujinjiang@huawei.com> wrote:

> Changelog since v3:
>  * update changelog, suggested by Oscar Salvador

Thanks, I updated the changelog in-place.

