Message-ID: <1bb5fadd-583c-4c56-b52f-37eee516c1dd@infradead.org>
Date: Sat, 19 Apr 2025 10:20:14 -0700
From: Randy Dunlap <rdunlap@infradead.org>
Subject: Re: [RFC PATCH] docs: hugetlbpage.rst: add free surplus huge pages description
To: Jinjiang Tu <tujinjiang@huawei.com>, osalvador@suse.de, muchun.song@linux.dev, akpm@linux-foundation.org, david@redhat.com, corbet@lwn.net
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org, wangkefeng.wang@huawei.com
In-Reply-To: <20250419073214.2688926-1-tujinjiang@huawei.com>
References: <20250419073214.2688926-1-tujinjiang@huawei.com>

On 4/19/25 12:32 AM, Jinjiang Tu wrote:
> When echo 0 > /proc/sys/vm/nr_hugepages runs concurrently with freeing
> in-use huge pages back to the huge page pool, some free huge pages may
> fail to be destroyed and are accounted as surplus. The counts then look
> like this:
>
> HugePages_Total:    1024
> HugePages_Free:     1024
> HugePages_Surp:     1024
>
> When set_max_huge_pages() decreases the pool size, it first returns free
> pages to the buddy allocator and then accounts the remaining pages as
> surplus. Between the two steps, the hugetlb_lock is released to free
> memory and then acquired again. If another process frees huge pages to
> the pool between the two steps, those free huge pages will be accounted
> as surplus.
>
> Besides, free surplus huge pages can also result from a failure to
> restore the vmemmap.
>
> Once either situation occurs, users cannot directly shrink the huge page
> pool via echo 0 > nr_hugepages and should instead use one of two ways to
> destroy these free surplus huge pages:
> 1) echo $nr_surplus > nr_hugepages to convert the surplus free huge pages
>    to persistent free huge pages first, and then echo 0 > nr_hugepages to
>    destroy these huge pages.
> 2) allocate these free surplus huge pages; the kernel will try to destroy
>    them when they are freed.
>
> However, there is no documentation describing this, so users may be
> confused and not know how to handle such a case. Update the documentation.
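For reference, the two-step recovery in 1) above amounts to something like the
following shell sketch (a minimal illustration only, not part of the patch;
it assumes a bash-like shell run as root, the default huge page size, and
that the surplus count is read from /proc/meminfo):

    # Count the free surplus huge pages currently stranded in the pool.
    nr_surplus=$(awk '/HugePages_Surp/ {print $2}' /proc/meminfo)

    # Step 1: raise the persistent pool target to the surplus count so the
    # surplus free huge pages are converted to persistent free huge pages.
    echo "$nr_surplus" > /proc/sys/vm/nr_hugepages

    # Step 2: shrink the persistent pool to zero, which now destroys them.
    echo 0 > /proc/sys/vm/nr_hugepages

    # Verify: HugePages_Total/Free/Surp should all read 0 again.
    grep -i hugepages /proc/meminfo
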
>
> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
> ---
>  Documentation/admin-guide/mm/hugetlbpage.rst | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
> index 67a941903fd2..0456cefae039 100644
> --- a/Documentation/admin-guide/mm/hugetlbpage.rst
> +++ b/Documentation/admin-guide/mm/hugetlbpage.rst
> @@ -239,6 +239,17 @@ this condition holds--that is, until ``nr_hugepages+nr_overcommit_hugepages`` is
>  increased sufficiently, or the surplus huge pages go out of use and are freed--
>  no more surplus huge pages will be allowed to be allocated.
>
> +Caveat: Shrinking the persistent huge page pool via ``nr_hugepages`` may be
> +concurrent with freeing in-use huge pages to the huge page pool, leading to some
> +huge pages are still in the huge page pool and accounted as surplus. Besides,
> +When the feature of freeing unused vmemmap pages associated with each hugetlb page when
> +is enabled, free huge page may be accounted as surplus too. In such two cases, users
> +couldn't directly shrink the huge page pool via echo 0 to ``nr_hugepages``, should

                                                                           but should

Also, please limit each line to <80 characters.

> +echo $nr_surplus to ``nr_hugepages`` to convert the surplus free huge pages to
> +persistent free huge pages first, and then echo 0 to ``nr_hugepages`` to destroy
> +these huge pages. Another way to destroy is allocating these free surplus huge
> +pages and these huge pages will be tried to destroy when they are freed.
> +

But I don't see why this is a user problem to be solved by users...

> With support for multiple huge page pools at run-time available, much of
> the huge page userspace interface in ``/proc/sys/vm`` has been duplicated in
> sysfs.

-- 
~Randy