From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jinjiang Tu <tujinjiang@huawei.com>
Subject: [PATCH v2] mm/hugetlb: fix surplus pages in dissolve_free_huge_page()
Date: Tue, 4 Mar 2025 21:21:06 +0800
Message-ID: <20250304132106.2872754-1-tujinjiang@huawei.com>
X-Mailer: git-send-email 2.43.0
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
In dissolve_free_huge_page(), free huge pages are dissolved without
adjusting the surplus count. However, free huge pages may be accounted
as surplus pages, which leads to a wrong surplus count.

I reproduced this issue on qemu. The steps are:

1) Node1 is memory-less at first.
   Hot-add memory to node1 by executing these two commands in the qemu
   monitor:
     object_add memory-backend-ram,id=mem1,size=1G
     device_add pc-dimm,id=dimm1,memdev=mem1,node=1
2) Online one memory block of Node1 with:
     echo online_movable > /sys/devices/system/node/node1/memoryX/state
3) Create 64 huge pages for node1.
4) Run a program to reserve (but not consume) all the huge pages.
5) echo 0 > nr_huge_pages for node1. After this step, the free huge
   pages in Node1 are surplus.
6) Create 80 huge pages for node0.
7) Offline the memory of node1. The memory range to offline contains
   the free surplus huge pages created in step 3) ~ step 5):
     echo offline > /sys/devices/system/node/node1/memoryX/state
8) Kill the program from step 4).

The result:
           Node0    Node1
total         80        0
free          80        0
surplus        0       61

To fix it, adjust the surplus count when destroying huge pages in
dissolve_free_hugetlb_folio() if the node has surplus pages.

The result with this patch:
           Node0    Node1
total         80        0
free          80        0
surplus        0        0

Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Acked-by: David Hildenbrand
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
Changelog since v1:
  improve commit message, suggested by David Hildenbrand

 mm/hugetlb.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea1..2a24ade9d157 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2135,6 +2135,8 @@ int dissolve_free_hugetlb_folio(struct folio *folio)
 
 	if (!folio_ref_count(folio)) {
 		struct hstate *h = folio_hstate(folio);
+		bool adjust_surplus = false;
+
 		if (!available_huge_pages(h))
 			goto out;
 
@@ -2157,7 +2159,9 @@ int dissolve_free_hugetlb_folio(struct folio *folio)
 			goto retry;
 		}
 
-		remove_hugetlb_folio(h, folio, false);
+		if (h->surplus_huge_pages_node[folio_nid(folio)])
+			adjust_surplus = true;
+		remove_hugetlb_folio(h, folio, adjust_surplus);
 		h->max_huge_pages--;
 		spin_unlock_irq(&hugetlb_lock);
 
@@ -2177,7 +2181,7 @@ int dissolve_free_hugetlb_folio(struct folio *folio)
 		rc = hugetlb_vmemmap_restore_folio(h, folio);
 		if (rc) {
 			spin_lock_irq(&hugetlb_lock);
-			add_hugetlb_folio(h, folio, false);
+			add_hugetlb_folio(h, folio, adjust_surplus);
 			h->max_huge_pages++;
 			goto out;
 		}
-- 
2.43.0