From: John Hubbard <jhubbard@nvidia.com>
To: Ryan Roberts <ryan.roberts@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Matthew Wilcox <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>, Zi Yan <ziy@nvidia.com>,
	Barry Song <21cnbao@gmail.com>,
	Alistair Popple <apopple@nvidia.com>,
	William Kucharski <william.kucharski@oracle.com>
Cc: linux-mm@kvack.org, Barry Song <v-songbaohua@oppo.com>
Subject: Re: [PATCH v2] tools/mm: Add thpmaps script to dump THP usage info
Date: Thu, 11 Jan 2024 10:17:34 -0800
Message-ID: <ebe47d0f-35e3-4c49-8128-8e705b57adbe@nvidia.com>
In-Reply-To: <43230798-af22-4f59-b37c-8257bae32af8@arm.com>

On 1/11/24 03:54, Ryan Roberts wrote:
...
> I'm not sure exactly what you are asking. The "cont" counters are counting
> blocks of contiguous, naturally aligned physical memory, which are also mapped
> contiguously and aligned. So a smaller --cont would always include all the
> memory captured in a larger --cont. In this case, it's all the *file-backed*
> memory (as highlighted in the label name) so nothing to do with (m)THP. But where
> you have THP, --cont doesn't care what the underlying THP size is as long as its
> requirements are met, so PMD-sized THPs would be included in e.g.
> *anon*-cont-aligned-128kB.
> 
> Note that the "--cont" counters don't directly count memory that is PTE-mapped
> with the contiguous bit set in the page table; they just count memory that meets
> the alignment, size and mapping requirements. On arm64 systems with the contpte
> series, the contiguous bit would be used here, but it's not part of what's
> getting measured.
> 

The "cont" and "naturally aligned" terms are difficult here, even though
I'm familiar with the implementation. But putting on my systems
monitoring hat, these terms are not helping people as much as I'd like,
because:

a) "Contiguous" is not really a unique situation, so measuring large pages
    that are "contiguous" is confusing. All folios are contiguous, and
    anything a pte points to is contiguous as well. So --cont really
    throws off the user/reader.

b) "Naturally aligned" is also tricky, because "natural" is not explained.
    Here it means NAPOT (naturally aligned power of two, a term I saw in
    the RISC-V docs).
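
To make the two terms concrete, here is a minimal sketch of the check as
described in the quoted text above: a block counts only if its size is a
power of two, its pages are physically consecutive, and both its virtual
and physical start addresses are multiples of that size. This is an
illustration only, not code from the thpmaps patch; the function names
and the 4 KiB default page size are assumptions.

```python
from typing import Sequence


def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def is_naturally_aligned(addr: int, size: int) -> bool:
    """True if addr is a multiple of size, where size is a power of two."""
    if not is_power_of_two(size):
        raise ValueError("size must be a power of two")
    return (addr & (size - 1)) == 0


def is_cont_block(vaddr: int, pfns: Sequence[int], page_size: int = 4096) -> bool:
    """Hypothetical check for one candidate block.

    vaddr is the virtual address of the first page; pfns holds the
    physical frame number backing each page, in mapping order.
    """
    size = len(pfns) * page_size
    if not is_power_of_two(size):
        return False
    # Physically contiguous: each frame directly follows the previous one.
    consecutive = all(b == a + 1 for a, b in zip(pfns, pfns[1:]))
    return (
        consecutive
        and is_naturally_aligned(vaddr, size)
        and is_naturally_aligned(pfns[0] * page_size, size)
    )
```

Note that nothing here looks at the folio size or at the contiguous bit
in the page table, which matches the point that those are not what is
being measured.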

After spending a day or two exploring running systems with this, I'd
like to suggest:

1) Measure "native PMD THPs" vs. pte-mapped mTHPs. This provides a lot
of information: whether mTHP is configured as expected, whether it is
helping or not, etc.

2) Not having to list out all the mTHP sizes would be nice. Instead,
just use the possible sizes from /sys/kernel/mm/transparent_hugepage/* ,
unless the user specifies sizes.
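
A rough sketch of what that discovery could look like, assuming the
per-size controls appear as directories named hugepages-<size>kB under
/sys/kernel/mm/transparent_hugepage (as on kernels with mTHP support).
The function name and the fallback to an empty list are my own
assumptions, not part of the patch.

```python
import os
import re

THP_SYSFS = "/sys/kernel/mm/transparent_hugepage"


def available_thp_sizes_kb(base=THP_SYSFS):
    """Return the THP sizes (in kB) advertised under base, sorted ascending.

    Returns an empty list if base does not exist, e.g. on a kernel
    built without THP support.
    """
    try:
        entries = os.listdir(base)
    except FileNotFoundError:
        return []
    sizes = []
    for name in entries:
        # Assumed layout: one directory per size, e.g. "hugepages-64kB".
        match = re.fullmatch(r"hugepages-(\d+)kB", name)
        if match and os.path.isdir(os.path.join(base, name)):
            sizes.append(int(match.group(1)))
    return sorted(sizes)
```

The script could then use this list as the default and let an explicit
size option from the user override it.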

...
>>>                          (e.g. /sys/fs/cgroup for cgroup-v2 or
>>>                          /sys/fs/cgroup/pids for cgroup-v1). Exactly one
>>>                          of --pid and --cgroup must be provided.
>>
>> Maybe we could add "--global" to that list. That would look, in order,
>> inside cgroups2 and cgroups, for a list of pids, and then run as if
>> --cgroup /sys/fs/cgroup or --cgroup /sys/fs/cgroup/pids were specified.
> 
> I think actually it might be better just to make global the default when neither
> --pid nor --cgroup is provided? And in this case, I'll just grab all the pids
> from /proc rather than traverse the cgroup hierarchy, that way it will work on
> systems without cgroups. Does that work for you?

Yes! That was my initial idea, in fact, and after over-thinking it for
a while, it turned into the above. haha :)
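
For reference, grabbing all the pids from /proc can be sketched as
below. This is only an illustration of the idea, not code from the
patch; the function name and the proc_root parameter are assumptions.

```python
import os


def all_pids(proc_root="/proc"):
    """Return every pid visible under proc_root, sorted ascending.

    Each running process shows up as a directory whose name is its
    decimal pid; everything else (self, meminfo, ...) is skipped.
    """
    return sorted(
        int(name) for name in os.listdir(proc_root) if name.isdecimal()
    )
```

A process can exit between this listing and a later read of its
/proc/<pid> files, so a caller would need to tolerate a pid vanishing.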


thanks,
-- 
John Hubbard
NVIDIA


