linux-mm.kvack.org archive mirror
From: John Hubbard <jhubbard@nvidia.com>
To: Ryan Roberts <ryan.roberts@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Matthew Wilcox <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>, Zi Yan <ziy@nvidia.com>,
	Barry Song <21cnbao@gmail.com>,
	Alistair Popple <apopple@nvidia.com>,
	William Kucharski <william.kucharski@oracle.com>
Cc: linux-mm@kvack.org, Barry Song <v-songbaohua@oppo.com>
Subject: Re: [PATCH v2] tools/mm: Add thpmaps script to dump THP usage info
Date: Mon, 15 Jan 2024 13:30:12 -0800
Message-ID: <f23e6f32-397e-4821-a0fe-f2a0bb6e2fe0@nvidia.com>
In-Reply-To: <9acb1684-7c5a-41c4-9a23-edad73e55585@arm.com>

On 1/15/24 07:56, Ryan Roberts wrote:
...
>> But yes, let me work up some improved documentation and send it out for your
>> review. The reason it's a bit terse at the moment is that I'm using Python's
>> ArgumentParser for the documentation, and it removes all line breaks from the
>> description which makes it hard to format longer form docs. Anyway, that's a bad
>> excuse for bad docs so I'll figure out a solution.
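
For the line-break problem described above, one possible approach is
argparse's RawDescriptionHelpFormatter, which prints the description
verbatim instead of re-wrapping it. This is only a minimal sketch: the
DESCRIPTION text is a shortened, hypothetical stand-in, and build_parser()
is an illustrative helper, not code from the actual script.

```python
import argparse
import textwrap

# Hypothetical, shortened description. Only the blank line and the list
# layout matter for this illustration.
DESCRIPTION = textwrap.dedent("""\
    Prints information about how transparent huge pages are mapped.

    THP Statistics
    --------------
    - anon-thp-pte-aligned-<size>kB
    - file-thp-pte-aligned-<size>kB
    """)


def build_parser(formatter_class):
    """Build a parser whose description is rendered by formatter_class."""
    return argparse.ArgumentParser(
        prog="thpmaps",
        description=DESCRIPTION,
        formatter_class=formatter_class,
    )


# Default formatter: whitespace in the description is collapsed and re-wrapped.
default_help = build_parser(argparse.HelpFormatter).format_help()

# Raw formatter: the description keeps its own line breaks.
raw_help = build_parser(argparse.RawDescriptionHelpFormatter).format_help()

if __name__ == "__main__":
    print(raw_help)
```

The option help strings are still wrapped automatically with this
formatter; only the description and epilog are left untouched.
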
> 
> Here is my proposed documentation. If you could take a look and let me know if
> it makes sense, then I'll modify the tool to conform:
> 

Looks great. One typo fix and a note, below.

> --8<--
> 
> $ ./thpmaps --help
> 
> usage: thpmaps [-h] [--pid pid | --cgroup path] [--rollup] [--cont size[KMG]]
>                 [--inc-smaps] [--inc-empty] [--periodic sleep_ms]
> 
> Prints information about how transparent huge pages are mapped, either system-
> wide, or for a specified process or cgroup.
> 
> A default set of statistics is always generated for THP mappings. However, it is

How the system-wide case works is interesting enough to the sysadmin to merit a
few words. Something along these lines, approximately:

  -----
  When run without options, cgroups v1 or v2 (depending on what is active
  on the system) is used in order to get a listing of all user space pids.
  That pid list is passed into the core script, as if the user had provided
  "--pids pid1 pid2 ...".
  -----

This reminds me that maybe a --pids option would be helpful. What do you think?
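
To make the pid-gathering step concrete, here is a rough sketch of how a
pid list could be collected from a cgroup hierarchy. It assumes that each
cgroup directory exposes a cgroup.procs file with one pid per line;
pids_in_cgroup() is a hypothetical helper name, and the sketch makes no
attempt to filter out kernel threads.

```python
import os


def pids_in_cgroup(cgroup_path):
    """Collect every pid listed in cgroup.procs under cgroup_path.

    Walks the directory tree so that child cgroups are included, and
    returns the pids as a sorted list without duplicates.
    """
    pids = set()
    for dirpath, _dirnames, filenames in os.walk(cgroup_path):
        if "cgroup.procs" not in filenames:
            continue
        with open(os.path.join(dirpath, "cgroup.procs")) as procs:
            # One pid per line; skip any blank lines.
            pids.update(int(line) for line in procs if line.strip())
    return sorted(pids)
```

On a system with the hierarchy mounted at /sys/fs/cgroup (an assumption
about the mount point), pids_in_cgroup("/sys/fs/cgroup") would yield the
list that is then handed to the core script.
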


> also possible to generate additional statistics for "contiguous block mappings"
> where the block size is user-defined.
> 
> Statistics are maintained independently for anonymous and file-backed
> (pagecache) memory and are shown both in kB and as a percentage of either total
> anonymous or total file-backed memory as appropriate.
> 
> THP Statistics
> --------------
> 
> Statistics are always generated for fully- and contiguously-mapped THPs whose
> mapping address is aligned to their size, for each <size> supported by the
> system. Separate counters describe THPs mapped by PTE vs those mapped by PMD
> (although note that a THP can only be mapped by PMD if it is PMD-sized):
> 
> - anon-thp-pte-aligned-<size>kB
> - file-thp-pte-aligned-<size>kB
> - anon-thp-pmd-aligned-<size>kB
> - file-thp-pmd-aligned-<size>kB
> 
> Similarly, statistics are always generated for fully- and contiguously-mapped
> THPs whose mapping address is *not* aligned to their size, for each <size>
> supported by the system. Due to the unaligned mapping, it is impossible to map
> by PMD, so there are only PTE counters for this case:
> 
> - anon-thp-pte-unaligned-<size>kB
> - file-thp-pte-unaligned-<size>kB
> 
> Statistics are also always generated for mapped pages that belong to a THP but
> where the THP is *not* fully- and contiguously-mapped. These "partial"
> mappings are all counted in the same counter regardless of the size of the THP
> that is partially mapped:
> 
> - anon-thp-pte-partial
> - file-thp-pte-partial
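
The three cases described above (aligned, unaligned, partial) can be
summarized as a small decision function. This is a sketch of the accounting
rules as worded in the help text, not the tool's implementation:
thp_counter() and all of its parameters are hypothetical names.

```python
def thp_counter(kind, thp_size, vaddr, mapped_bytes, contiguous, by_pmd=False):
    """Return the counter name that one mapping of one THP falls under.

    kind         -- "anon" or "file"
    thp_size     -- size of the THP in bytes
    vaddr        -- virtual address at which the THP is mapped
    mapped_bytes -- how much of the THP is actually mapped
    contiguous   -- True if the mapped pages are virtually contiguous
    by_pmd       -- True if mapped by PMD (only possible for PMD-sized THPs)
    """
    fully_mapped = contiguous and mapped_bytes == thp_size
    if not fully_mapped:
        # Partial mappings share one counter regardless of THP size.
        return f"{kind}-thp-pte-partial"

    size_kb = thp_size // 1024
    if vaddr % thp_size != 0:
        # An unaligned mapping can never be mapped by PMD.
        return f"{kind}-thp-pte-unaligned-{size_kb}kB"

    level = "pmd" if by_pmd else "pte"
    return f"{kind}-thp-{level}-aligned-{size_kb}kB"
```

For example, a fully mapped 64K anonymous THP at address 0x10000 lands in
anon-thp-pte-aligned-64kB, while the same THP mapped at 0x11000 lands in
anon-thp-pte-unaligned-64kB.
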
> 
> Contiguous Block Statistics
> ---------------------------
> 
> An optional, additional set of statistics is generated for every contiguous
> block size specified with `--cont <size>`. These statistics show how much memory
> is mapped in contiguous blocks of <size> and also aligned to <size>. A given
> contiguous block must all belong to the same THP, but there is no requirement
> for it to be the *whole* THP. Separate counters describe contiguous blocks
> mapped by PTE vs those mapped by PMD:
> 
> - anon-cont-pte-aligned-<size>kB
> - file-cont-pte-aligned-<size>kB
> - anon-cont-pmd-aligned-<size>kB
> - file-cont-pmd-aligned-<size>kB
> 
> As an example, if montiroing 64K contiguous blocks (--cont 64K), there are a

typo: "monitoring"

> number of sources that could provide such blocks: a fully- and contiguously-
> mapped 64K THP that is aligned to a 64K boundary would provide 1 block. A fully-
> and contiguously-mapped 128K THP that is aligned to at least a 64K boundary
> would provide 2 blocks. Or a 128K THP that maps its first 100K, but contiguously
> and starting at a 64K boundary would provide 1 block. A fully- and contiguously-
> mapped 2M THP would provide 32 blocks. There are many other possible
> permutations.
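
The block counts in this example can be checked with a little arithmetic.
The sketch below assumes the input is a single virtually contiguous range
that belongs to one THP; cont_blocks() is a hypothetical helper, not part
of the tool.

```python
def cont_blocks(map_start, map_len, block_size):
    """Count block_size-aligned blocks fully covered by a mapped range.

    The range is [map_start, map_start + map_len), in bytes, and is assumed
    to be one contiguously mapped portion of a single THP.
    """
    # First block boundary at or after the start of the range (round up).
    first = -(-map_start // block_size) * block_size
    # Last block boundary at or before the end of the range (round down).
    last = ((map_start + map_len) // block_size) * block_size
    return max(0, (last - first) // block_size)


K = 1024
M = 1024 * K

print(cont_blocks(0x10000, 64 * K, 64 * K))    # 64K THP, 64K-aligned
print(cont_blocks(0x20000, 128 * K, 64 * K))   # 128K THP, fully mapped
print(cont_blocks(0x10000, 100 * K, 64 * K))   # first 100K of a 128K THP
print(cont_blocks(0x200000, 2 * M, 64 * K))    # 2M THP, fully mapped
```

These four calls correspond to the four cases in the paragraph above and
give 1, 2, 1 and 32 blocks respectively.
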
> 
> optional arguments:
>    -h, --help           show this help message and exit
>    --pid pid            Process id of the target process. --pid and --cgroup are
>                         mutually exclusive. If neither is provided, all
>                         processes are scanned to provide system-wide information.
>    --cgroup path        Path to the target cgroup in sysfs. Iterates over every
>                         pid in the cgroup and its children. --pid and --cgroup
>                         are mutually exclusive. If neither is provided, all
>                         processes are scanned to provide system-wide information.
>    --rollup             Sum the per-vma statistics to provide a summary over the
>                         whole system, process or cgroup.
>    --cont size[KMG]     Adds stats for memory that is mapped in contiguous blocks
>                         of <size> and also aligned to <size>. May be issued
>                         multiple times to track multiple sized blocks. Useful to
>                         infer e.g. arm64 contpte and hpa mappings. Size must be a
>                         power-of-2 number of pages.
>    --inc-smaps          Include all numerical, additive /proc/<pid>/smaps stats
>                         in the output.
>    --inc-empty          Show all statistics including those whose value is 0.
>    --periodic sleep_ms  Run in a loop, polling every sleep_ms milliseconds.
> 
> Requires root privilege to access pagemap and kpageflags.
> 
> --8<--
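
As an illustration of the option semantics described above, here is one way
the size[KMG] argument and the --pid/--cgroup exclusivity could be expressed
with argparse. It is a sketch under stated assumptions: PAGE_SIZE is
hard-coded to 4096, and parse_size() and build_parser() are hypothetical
names rather than the script's own.

```python
import argparse
import re

PAGE_SIZE = 4096  # assumption; a real tool would query the system page size


def parse_size(text):
    """Convert e.g. '64K' or '2M' to bytes, enforcing the --cont rules."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text)
    if not match:
        raise argparse.ArgumentTypeError(f"invalid size: {text!r}")
    shift = {"": 0, "K": 10, "M": 20, "G": 30}[match.group(2)]
    size = int(match.group(1)) << shift

    # Size must be a power-of-2 number of pages.
    pages, remainder = divmod(size, PAGE_SIZE)
    if remainder or pages == 0 or pages & (pages - 1):
        raise argparse.ArgumentTypeError(
            f"{text!r} is not a power-of-2 number of pages")
    return size


def build_parser():
    parser = argparse.ArgumentParser(prog="thpmaps")
    # --pid and --cgroup are mutually exclusive; neither means system-wide.
    target = parser.add_mutually_exclusive_group()
    target.add_argument("--pid", metavar="pid", type=int)
    target.add_argument("--cgroup", metavar="path")
    # --cont may be given several times to track several block sizes.
    parser.add_argument("--cont", metavar="size", type=parse_size,
                        action="append", default=[])
    return parser
```

With this, "--cont 64K --cont 2M" yields the two sizes in bytes, whereas
"--cont 96K" is rejected because 24 pages is not a power of 2.
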

It's all looking much more understandable now, very nice.

thanks,
-- 
John Hubbard
NVIDIA


