From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xueshi Hu
To: mike.kravetz@oracle.com, muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, Xueshi Hu
Subject: [PATCH 3/3] mm/hugetlb: fix nodes huge page allocation when there are surplus pages
Date: Sun, 30 Jul 2023 20:51:56 +0800
Message-Id: <20230730125156.207301-4-xueshi.hu@smartx.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230730125156.207301-1-xueshi.hu@smartx.com>
References: <20230730125156.207301-1-xueshi.hu@smartx.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

In set_nr_huge_pages(), the local variable "count" is meant to hold a
persistent_huge_pages() value, but in the per-node allocation path its
semantics change to nr_huge_pages. When surplus huge pages exist and the
interface under /sys/devices/system/node/node*/hugepages is used to
change the huge page pool size, this mismatch can result in the
allocation of an unexpected number of huge pages.

Steps to reproduce the bug:

Starting with:

                  Node 0          Node 1           Total
HugePages_Total     0.00            0.00            0.00
HugePages_Free      0.00            0.00            0.00
HugePages_Surp      0.00            0.00            0.00

Create 100 huge pages on Node 0 and consume them, then set Node 0's
nr_hugepages to 0. This yields:

                  Node 0          Node 1           Total
HugePages_Total   200.00            0.00          200.00
HugePages_Free      0.00            0.00            0.00
HugePages_Surp    200.00            0.00          200.00

Write 100 to Node 1's nr_hugepages:

  echo 100 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages

This gives:

                  Node 0          Node 1           Total
HugePages_Total   200.00          400.00          600.00
HugePages_Free      0.00          400.00          400.00
HugePages_Surp    200.00            0.00          200.00

The tables are in MB: with 2048 kB huge pages, the 400.00 on Node 1 is
200 pages. The kernel is expected to create only 100 huge pages, but it
creates 200.

Signed-off-by: Xueshi Hu
---
 mm/hugetlb.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 56647235ab21..8ed4fffdebda 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3490,7 +3490,9 @@ static int set_nr_huge_pages(struct hstate *h, unsigned long count, int nid,
 	if (nid != NUMA_NO_NODE) {
 		unsigned long old_count = count;
 
-		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
+		count += persistent_huge_pages(h) -
+			 (h->nr_huge_pages_node[nid] -
+			  h->surplus_huge_pages_node[nid]);
 		/*
 		 * User may have specified a large count value which caused the
 		 * above calculation to overflow. In this case, they wanted
-- 
2.40.1
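
The arithmetic behind the reproduction can be checked with a small
userspace model. This is only a sketch, not kernel code: struct
hstate_model, target_before_patch(), target_after_patch(),
pages_to_allocate() and repro_state() are hypothetical names made up
for illustration, and the allocation step is simplified to "grow the
pool until the persistent count reaches the target". Only the two
"count" expressions are taken from the diff above.

```c
#include <assert.h>

#define MAX_NODES 2

/* Hypothetical stand-in for the few hstate counters used by the patch. */
struct hstate_model {
	unsigned long nr_huge_pages;
	unsigned long surplus_huge_pages;
	unsigned long nr_huge_pages_node[MAX_NODES];
	unsigned long surplus_huge_pages_node[MAX_NODES];
};

/* Persistent pages are the pool pages that are not surplus. */
static inline unsigned long persistent_huge_pages(const struct hstate_model *h)
{
	assert(h->surplus_huge_pages <= h->nr_huge_pages);
	return h->nr_huge_pages - h->surplus_huge_pages;
}

/* Global target before the patch: adds totals (surplus included) to count. */
static inline unsigned long target_before_patch(const struct hstate_model *h,
						unsigned long count, int nid)
{
	return count + h->nr_huge_pages - h->nr_huge_pages_node[nid];
}

/* Global target after the patch: adds only persistent pages of other nodes. */
static inline unsigned long target_after_patch(const struct hstate_model *h,
					       unsigned long count, int nid)
{
	return count + persistent_huge_pages(h) -
	       (h->nr_huge_pages_node[nid] - h->surplus_huge_pages_node[nid]);
}

/* Simplifying assumption: allocate until persistent pages reach the target. */
static inline unsigned long pages_to_allocate(const struct hstate_model *h,
					      unsigned long target)
{
	unsigned long persistent = persistent_huge_pages(h);

	return target > persistent ? target - persistent : 0;
}

/* Reproduction state: 100 in-use surplus pages on node 0, none on node 1. */
static inline struct hstate_model repro_state(void)
{
	struct hstate_model h = {
		.nr_huge_pages = 100,
		.surplus_huge_pages = 100,
		.nr_huge_pages_node = { 100, 0 },
		.surplus_huge_pages_node = { 100, 0 },
	};

	return h;
}
```

With the state from repro_state() and a request of 100 pages on node 1,
the old expression gives a target of 100 + 100 - 0 = 200 while the
persistent count is 0, so 200 pages would be allocated in this model.
The patched expression gives 100 + 0 - (0 - 0) = 100, so 100 pages.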