linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Xueshi Hu <xueshi.hu@smartx.com>,
	mike.kravetz@oracle.com, muchun.song@linux.dev, corbet@lwn.net,
	akpm@linux-foundation.org, n-horiguchi@ah.jp.nec.com,
	osalvador@suse.de
Cc: linux-mm@kvack.org
Subject: Re: [PATCH v2 1/4] mm/hugetlb: fix the inconsistency of /proc/sys/vm/nr_huge_pages
Date: Tue, 8 Aug 2023 09:58:32 +0200	[thread overview]
Message-ID: <5b404d86-6b6f-b6c7-6286-f2ce3c4b5424@redhat.com> (raw)
In-Reply-To: <79508337-08c1-7926-afd9-af21ee128949@smartx.com>

On 08.08.23 04:28, Xueshi Hu wrote:
> On 8/7/23 23:15, David Hildenbrand wrote:
>> On 06.08.23 09:48, Xueshi Hu wrote:
>>> There are currently three 'nr_hugepages' used to export the number of
>>> huge
>>> pages:
>>> 1. /proc/sys/vm/nr_hugepages
>>> 2. /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
>>> 3. /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>>>
>>> For consistency, all three 'nr_hugepages' should return the total number
>>> of huge pages. When written, the number of persistent huge pages will be
>>> adjusted to the specified value.
>>>
>>> But, /proc/sys/vm/nr_hugepages returns the number of persistent huge
>>> pages.
>>>
>>
>> But that's documented behavior, no?
>>
>> Documentation/admin-guide/mm/hugetlbpage.rst
>>
>> ``/proc/sys/vm/nr_hugepages`` indicates the current number of
>> "persistent" huge
>> pages in the kernel's huge page pool.  "Persistent" huge pages will be
>> returned to the huge page pool when freed by a task.  A user with root
>> privileges can dynamically allocate more or free some persistent huge pages
>> by increasing or decreasing the value of ``nr_hugepages``.
>>
> 
> Actually, Documentation/admin-guide/mm/hugetlbpage.rst is contradictory
> about the definition of /proc/sys/vm/nr_hugepages.
> 
> The documentation says:
> 
> - ``/proc/sys/vm/nr_hugepages`` indicates the current number of
> "persistent" huge.
> 
> But, the documentation also says:
> 
> - The ``/proc`` interfaces discussed above have been retained for
> backwards compatibility.

Yes. And why shouldn't the behavior of these toggles be retained for backwards compatibility?

> 
> - The ``nr_hugepages`` attribute returns the total number of huge pages on
> the specified node.  When this attribute is written, the number of
> persistent huge pages on the parent node will be adjusted to the specified
> value, if sufficient resources exist, regardless of the task's mempolicy
> or cpuset constraints.

That is part of the "/sys/devices/system/node/node[0-9]*/hugepages/" docuemntation,
not "/proc/sys/vm/nr_hugepages" documentation.

It's in the "Per Node Hugepages Attributes" section.


If I am not wrong, that documentation -- including the usage of the "persistent"
term -- were introduced in 2009 already:

commit 267b4c281b4a43c8f3d965c791d3a7fd62448733
Author: Lee Schermerhorn <lee.schermerhorn@hp.com>
Date:   Mon Dec 14 17:58:30 2009 -0800

     hugetlb: update hugetlb documentation for NUMA controls
     
     Update the kernel huge tlb documentation to describe the numa memory
     policy based huge page management.  Additionaly, the patch includes a fair
     amount of rework to improve consistency, eliminate duplication and set the
     context for documenting the memory policy interaction.
     
So this has been documented behavior for a long time.

> 
> So, I create the patch 4 to make the documentation more clear.
> 
> If such subtle inconsistencies result in unexpected behavior, it can be
> challenging for a system administrator to detect.

Your patch is changing the traditional, documented behavior to then change
the documentation to match the new implementation.

How can you be sure nobody relies on exactly that traditional, well documented
behavior?

Why change the behavior of an interface that is kept for backwards compatibility,
that might still be in use under the assumptions that the behavior is exactly
like that (and for at least the last 14 years)?

-- 
Cheers,

David / dhildenb



  reply	other threads:[~2023-08-08  7:58 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-06  7:48 [PATCH v2 0/4] mm/hugetlb: fix /sys and /proc fs dealing with persistent hugepages Xueshi Hu
2023-08-06  7:48 ` [PATCH v2 1/4] mm/hugetlb: fix the inconsistency of /proc/sys/vm/nr_huge_pages Xueshi Hu
2023-08-07 15:15   ` David Hildenbrand
2023-08-08  2:28     ` Xueshi Hu
2023-08-08  7:58       ` David Hildenbrand [this message]
2023-08-08  9:13         ` Xueshi Hu
2023-08-10  0:17           ` Mike Kravetz
2023-08-10  7:34             ` David Hildenbrand
2023-08-10 22:15               ` Mike Kravetz
2023-08-25  4:02               ` Xueshi Hu
2023-08-25 21:15                 ` Mike Kravetz
2023-08-26  2:51                   ` Xueshi Hu
2023-08-06  7:48 ` [PATCH v2 2/4] mm/hugeltb: clean up hstate::max_huge_pages Xueshi Hu
2023-08-07 15:17   ` David Hildenbrand
2023-08-06  7:48 ` [PATCH v2 3/4] mm/hugeltb: fix nodes huge page allocation when there are surplus pages Xueshi Hu
2023-08-06  7:48 ` [PATCH v2 4/4] docs: hugetlbpage.rst: make the meaning of persistent hugetlb pages clear Xueshi Hu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5b404d86-6b6f-b6c7-6286-f2ce3c4b5424@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=corbet@lwn.net \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=muchun.song@linux.dev \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=osalvador@suse.de \
    --cc=xueshi.hu@smartx.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox