From: Barry Song <21cnbao@gmail.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Zenghui Yu <yuzenghui@huawei.com>,
Matthew Wilcox <willy@infradead.org>,
David Hildenbrand <david@redhat.com>,
Kefeng Wang <wangkefeng.wang@huawei.com>,
John Hubbard <jhubbard@nvidia.com>, Zi Yan <ziy@nvidia.com>,
Alistair Popple <apopple@nvidia.com>,
linux-mm@kvack.org
Subject: Re: [RFC PATCH v1] tools/mm: Add thpmaps script to dump THP usage info
Date: Wed, 3 Jan 2024 19:44:51 +1300
Message-ID: <CAGsJ_4wgvuEE1GpmJU3fkJ35Xx0Ue-bLohRfivy4XEDc4N8fPw@mail.gmail.com>
In-Reply-To: <20240102153828.1002295-1-ryan.roberts@arm.com>
On Wed, Jan 3, 2024 at 4:38 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> With the proliferation of large folios for file-backed memory, and more
> recently the introduction of multi-size THP for anonymous memory, it is
> becoming useful to be able to see exactly how large folios are mapped
> into processes. For some architectures (e.g. arm64), if most memory is
> mapped using contpte-sized and -aligned blocks, TLB usage can be
> optimized so it's useful to see where these requirements are and are not
> being met.
>
> thpmaps is a Python utility that reads /proc/<pid>/smaps,
> /proc/<pid>/pagemap and /proc/kpageflags to print information about how
> transparent huge pages (both file and anon) are mapped to a specified
> process or cgroup. It aims to help users debug and optimize their
> workloads. In future we may wish to introduce stats directly into the
> kernel (e.g. smaps or similar), but for now this provides a short term
> solution without the need to introduce any new ABI.
>
> Run with help option for a full listing of the arguments:
>
> # thpmaps --help
>
> --8<--
> usage: thpmaps [-h] [--pid pid] [--cgroup path] [--summary]
> [--cont size[KMG]] [--inc-smaps] [--inc-empty]
> [--periodic sleep_ms]
>
> Prints information about how transparent huge pages are mapped to a
> specified process or cgroup. Shows statistics for fully-mapped THPs of
> every size, mapped both naturally aligned and unaligned for both file
> and anonymous memory. See [anon|file]-thp-[aligned|unaligned]-<size>kB
> keys. Shows statistics for mapped pages that belong to a THP but which
> are not fully mapped. See [anon|file]-thp-partial keys. Optionally
> shows statistics for naturally aligned, contiguous blocks of memory of
> a specified size (when --cont is provided). See [anon|file]-cont-
> aligned-<size>kB keys. Statistics are shown in kB and as a percentage
> of either total anon or file memory as appropriate.
>
> options:
> -h, --help show this help message and exit
> --pid pid Process id of the target process. Exactly one of
> --pid and --cgroup must be provided.
> --cgroup path Path to the target cgroup in sysfs. Iterates
> over every pid in the cgroup. Exactly one of
> --pid and --cgroup must be provided.
> --summary Sum the per-vma statistics to provide a summary
> over the whole process or cgroup.
> --cont size[KMG] Adds anon and file stats for naturally aligned,
> contiguously mapped blocks of the specified
> size. May be issued multiple times to track
> multiple sized blocks. Useful to infer e.g.
> arm64 contpte and hpa mappings. Size must be a
> power-of-2 number of pages.
> --inc-smaps Include all numerical, additive
> /proc/<pid>/smaps stats in the output.
> --inc-empty Show all statistics including those whose value
> is 0.
> --periodic sleep_ms Run in a loop, polling every sleep_ms
> milliseconds.
>
> Requires root privilege to access pagemap and kpageflags.
> --8<--
>
> Example command to summarise fully and partially mapped THPs and 64K
> contiguous blocks over all VMAs in a single process (--inc-empty forces
> printing stats that are 0):
>
> # ./thpmaps --pid 10837 --cont 64K --summary --inc-empty
>
> --8<--
> anon-thp-aligned-16kB: 16 kB ( 0%)
> anon-thp-aligned-32kB: 0 kB ( 0%)
> anon-thp-aligned-64kB: 4194304 kB (100%)
> anon-thp-aligned-128kB: 0 kB ( 0%)
> anon-thp-aligned-256kB: 0 kB ( 0%)
> anon-thp-aligned-512kB: 0 kB ( 0%)
> anon-thp-aligned-1024kB: 0 kB ( 0%)
> anon-thp-aligned-2048kB: 0 kB ( 0%)
> anon-thp-unaligned-16kB: 0 kB ( 0%)
> anon-thp-unaligned-32kB: 0 kB ( 0%)
> anon-thp-unaligned-64kB: 0 kB ( 0%)
> anon-thp-unaligned-128kB: 0 kB ( 0%)
> anon-thp-unaligned-256kB: 0 kB ( 0%)
> anon-thp-unaligned-512kB: 0 kB ( 0%)
> anon-thp-unaligned-1024kB: 0 kB ( 0%)
> anon-thp-unaligned-2048kB: 0 kB ( 0%)
> anon-thp-partial: 0 kB ( 0%)
> file-thp-aligned-16kB: 16 kB ( 1%)
> file-thp-aligned-32kB: 64 kB ( 5%)
> file-thp-aligned-64kB: 640 kB (50%)
> file-thp-aligned-128kB: 128 kB (10%)
> file-thp-aligned-256kB: 0 kB ( 0%)
> file-thp-aligned-512kB: 0 kB ( 0%)
> file-thp-aligned-1024kB: 0 kB ( 0%)
> file-thp-aligned-2048kB: 0 kB ( 0%)
> file-thp-unaligned-16kB: 16 kB ( 1%)
> file-thp-unaligned-32kB: 32 kB ( 3%)
> file-thp-unaligned-64kB: 64 kB ( 5%)
> file-thp-unaligned-128kB: 0 kB ( 0%)
> file-thp-unaligned-256kB: 0 kB ( 0%)
> file-thp-unaligned-512kB: 0 kB ( 0%)
> file-thp-unaligned-1024kB: 0 kB ( 0%)
> file-thp-unaligned-2048kB: 0 kB ( 0%)
> file-thp-partial: 12 kB ( 1%)
> anon-cont-aligned-64kB: 4194304 kB (100%)
> file-cont-aligned-64kB: 768 kB (61%)
> --8<--
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
Hi Ryan,
I ran a couple of test cases with different parameters, and it seems to
work correctly.
I just don't understand the output below: what is the meaning of 000000ce
at the beginning of each line?
/thpmaps --pid 206 --cont 64K
000000ce 0000aaaadbb20000-0000aaaadbb21000 r-xp 00000000 fe:00 00426969 /root/a.out
000000ce 0000aaaadbb3f000-0000aaaadbb40000 r--p 0000f000 fe:00 00426969 /root/a.out
000000ce 0000aaaadbb40000-0000aaaadbb41000 rw-p 00010000 fe:00 00426969 /root/a.out
000000ce 0000ffff702c0000-0000ffffb02c0000 rw-p 00000000 00:00 00000000
anon-thp-aligned-64kB: 473920 kB (100%)
anon-cont-aligned-64kB: 473920 kB (100%)
000000ce 0000ffffb02c0000-0000ffffb044c000 r-xp 00000000 fe:00 00395429 /usr/lib/aarch64-linux-gnu/libc.so.6
000000ce 0000ffffb044c000-0000ffffb045d000 ---p 0018c000 fe:00 00395429 /usr/lib/aarch64-linux-gnu/libc.so.6
000000ce 0000ffffb045d000-0000ffffb0460000 r--p 0018d000 fe:00 00395429 /usr/lib/aarch64-linux-gnu/libc.so.6
000000ce 0000ffffb0460000-0000ffffb0462000 rw-p 00190000 fe:00 00395429 /usr/lib/aarch64-linux-gnu/libc.so.6
000000ce 0000ffffb0462000-0000ffffb046f000 rw-p 00000000 00:00 00000000
000000ce 0000ffffb0477000-0000ffffb049d000 r-xp 00000000 fe:00 00393893 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
000000ce 0000ffffb04b0000-0000ffffb04b2000 rw-p 00000000 00:00 00000000
000000ce 0000ffffb04b2000-0000ffffb04b4000 r--p 00000000 00:00 00000000 [vvar]
000000ce 0000ffffb04b4000-0000ffffb04b5000 r-xp 00000000 00:00 00000000 [vdso]
000000ce 0000ffffb04b5000-0000ffffb04b7000 r--p 0002e000 fe:00 00393893 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
000000ce 0000ffffb04b7000-0000ffffb04b9000 rw-p 00030000 fe:00 00393893 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
000000ce 0000ffffdaba4000-0000ffffdabc5000 rw-p 00000000 00:00 00000000 [stack]
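Judging from vma_print() in the quoted patch, the first field of each
VMA line is formatted with {:08x} from the pid argument, and 206 is 0xce,
so the prefix appears to be the target pid printed in hexadecimal. A
minimal sketch to check that reading (format_pid_prefix is a hypothetical
helper name used only for illustration, it is not part of the patch):

```python
def format_pid_prefix(pid):
    # Same format spec that vma_print() in the quoted patch applies to the
    # pid: zero-padded to 8 digits, lowercase hexadecimal.
    return "{:08x}".format(pid)


prefix = format_pid_prefix(206)
print(prefix)  # prints 000000ce
```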
>
> I've found this very useful for debugging, and I know others have requested a
> way to check if mTHP and contpte are working, so thought this might be a good short
> term solution until we figure out how best to add stats in the kernel?
>
> Thanks,
> Ryan
>
> tools/mm/Makefile | 9 +-
> tools/mm/thpmaps | 573 ++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 578 insertions(+), 4 deletions(-)
> create mode 100755 tools/mm/thpmaps
>
> diff --git a/tools/mm/Makefile b/tools/mm/Makefile
> index 1c5606cc3334..7bb03606b9ea 100644
> --- a/tools/mm/Makefile
> +++ b/tools/mm/Makefile
> @@ -3,7 +3,8 @@
> #
> include ../scripts/Makefile.include
>
> -TARGETS=page-types slabinfo page_owner_sort
> +BUILD_TARGETS=page-types slabinfo page_owner_sort
> +INSTALL_TARGETS = $(BUILD_TARGETS) thpmaps
>
> LIB_DIR = ../lib/api
> LIBS = $(LIB_DIR)/libapi.a
> @@ -11,9 +12,9 @@ LIBS = $(LIB_DIR)/libapi.a
> CFLAGS += -Wall -Wextra -I../lib/ -pthread
> LDFLAGS += $(LIBS) -pthread
>
> -all: $(TARGETS)
> +all: $(BUILD_TARGETS)
>
> -$(TARGETS): $(LIBS)
> +$(BUILD_TARGETS): $(LIBS)
>
> $(LIBS):
> make -C $(LIB_DIR)
> @@ -29,4 +30,4 @@ sbindir ?= /usr/sbin
>
> install: all
> install -d $(DESTDIR)$(sbindir)
> - install -m 755 -p $(TARGETS) $(DESTDIR)$(sbindir)
> + install -m 755 -p $(INSTALL_TARGETS) $(DESTDIR)$(sbindir)
> diff --git a/tools/mm/thpmaps b/tools/mm/thpmaps
> new file mode 100755
> index 000000000000..af9b19f63eb4
> --- /dev/null
> +++ b/tools/mm/thpmaps
> @@ -0,0 +1,573 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: GPL-2.0-only
> +# Copyright (C) 2024 ARM Ltd.
> +#
> +# Utility providing smaps-like output detailing transparent hugepage usage.
> +# For more info, run:
> +# ./thpmaps --help
> +#
> +# Requires numpy:
> +# pip3 install numpy
> +
> +
> +import argparse
> +import collections
> +import math
> +import os
> +import re
> +import resource
> +import shutil
> +import sys
> +import time
> +import numpy as np
> +
> +
> +with open('/sys/kernel/mm/transparent_hugepage/hpage_pmd_size') as f:
> +    PAGE_SIZE = resource.getpagesize()
> +    PAGE_SHIFT = int(math.log2(PAGE_SIZE))
> +    PMD_SIZE = int(f.read())
> +    PMD_ORDER = int(math.log2(PMD_SIZE / PAGE_SIZE))
> +
> +
> +def align_forward(v, a):
> +    return (v + (a - 1)) & ~(a - 1)
> +
> +
> +def align_offset(v, a):
> +    return v & (a - 1)
> +
> +
> +def nrkb(nr):
> +    # Convert number of pages to KB.
> +    return (nr << PAGE_SHIFT) >> 10
> +
> +
> +def odkb(order):
> +    # Convert page order to KB.
> +    return nrkb(1 << order)
> +
> +
> +def cont_ranges_all(arrs):
> +    # Given a list of arrays, find the ranges for which values are monotonically
> +    # incrementing in all arrays.
> +    assert(len(arrs) > 0)
> +    sz = len(arrs[0])
> +    for arr in arrs:
> +        assert(arr.shape == (sz,))
> +    r = np.full(sz, 2)
> +    d = np.diff(arrs[0]) == 1
> +    for dd in [np.diff(arr) == 1 for arr in arrs[1:]]:
> +        d &= dd
> +    r[1:] -= d
> +    r[:-1] -= d
> +    return [np.repeat(arr, r).reshape(-1, 2) for arr in arrs]
> +
> +
> +class ArgException(Exception):
> +    pass
> +
> +
> +class FileIOException(Exception):
> +    pass
> +
> +
> +class BinArrayFile:
> +    # Base class used to read /proc/<pid>/pagemap and /proc/kpageflags into a
> +    # numpy array. Use inherited class in a with clause to ensure file is
> +    # closed when it goes out of scope.
> +    def __init__(self, filename, element_size):
> +        self.element_size = element_size
> +        self.filename = filename
> +        self.fd = os.open(self.filename, os.O_RDONLY)
> +
> +    def cleanup(self):
> +        os.close(self.fd)
> +
> +    def __enter__(self):
> +        return self
> +
> +    def __exit__(self, exc_type, exc_val, exc_tb):
> +        self.cleanup()
> +
> +    def _readin(self, offset, buffer):
> +        length = os.preadv(self.fd, (buffer,), offset)
> +        if len(buffer) != length:
> +            raise FileIOException('error: {} failed to read {} bytes at {:x}'
> +                            .format(self.filename, len(buffer), offset))
> +
> +    def _toarray(self, buf):
> +        assert(self.element_size == 8)
> +        return np.frombuffer(buf, dtype=np.uint64)
> +
> +    def getv(self, vec):
> +        sz = 0
> +        for region in vec:
> +            sz += int(region[1] - region[0] + 1) * self.element_size
> +        buf = bytearray(sz)
> +        view = memoryview(buf)
> +        pos = 0
> +        for region in vec:
> +            offset = int(region[0]) * self.element_size
> +            length = int(region[1] - region[0] + 1) * self.element_size
> +            self._readin(offset, view[pos:pos+length])
> +            pos += length
> +        return self._toarray(buf)
> +
> +    def get(self, index, nr=1):
> +        offset = index * self.element_size
> +        length = nr * self.element_size
> +        buf = bytearray(length)
> +        self._readin(offset, buf)
> +        return self._toarray(buf)
> +
> +
> +PM_PAGE_PRESENT = 1 << 63
> +PM_PFN_MASK = (1 << 55) - 1
> +
> +class PageMap(BinArrayFile):
> +    # Read ranges of a given pid's pagemap into a numpy array.
> +    def __init__(self, pid='self'):
> +        super().__init__(f'/proc/{pid}/pagemap', 8)
> +
> +
> +KPF_ANON = 1 << 12
> +KPF_COMPOUND_HEAD = 1 << 15
> +KPF_COMPOUND_TAIL = 1 << 16
> +
> +class KPageFlags(BinArrayFile):
> +    # Read ranges of /proc/kpageflags into a numpy array.
> +    def __init__(self):
> +        super().__init__('/proc/kpageflags', 8)
> +
> +
> +VMA = collections.namedtuple('VMA', [
> +    'name',
> +    'start',
> +    'end',
> +    'read',
> +    'write',
> +    'execute',
> +    'private',
> +    'pgoff',
> +    'major',
> +    'minor',
> +    'inode',
> +    'stats',
> +])
> +
> +class VMAList:
> +    # A container for VMAs, parsed from /proc/<pid>/smaps. Iterate over the
> +    # instance to receive VMAs.
> +    head_regex = re.compile(r"^([\da-f]+)-([\da-f]+) ([r-])([w-])([x-])([ps]) ([\da-f]+) ([\da-f]+):([\da-f]+) ([\da-f]+)\s*(.*)$")
> +    kb_item_regex = re.compile(r"(\w+):\s*(\d+)\s*kB")
> +
> +    def __init__(self, pid='self'):
> +        def is_vma(line):
> +            return self.head_regex.search(line) is not None
> +
> +        def get_vma(line):
> +            m = self.head_regex.match(line)
> +            if m is None:
> +                return None
> +            return VMA(
> +                name=m.group(11),
> +                start=int(m.group(1), 16),
> +                end=int(m.group(2), 16),
> +                read=m.group(3) == 'r',
> +                write=m.group(4) == 'w',
> +                execute=m.group(5) == 'x',
> +                private=m.group(6) == 'p',
> +                pgoff=int(m.group(7), 16),
> +                major=int(m.group(8), 16),
> +                minor=int(m.group(9), 16),
> +                inode=int(m.group(10), 16),
> +                stats={},
> +            )
> +
> +        def get_value(line):
> +            # Currently only handle the KB stats because they are summed for
> +            # --summary. Core code doesn't know how to combine other stats.
> +            exclude = ['KernelPageSize', 'MMUPageSize']
> +            m = self.kb_item_regex.search(line)
> +            if m:
> +                param = m.group(1)
> +                if param not in exclude:
> +                    value = int(m.group(2))
> +                    return param, value
> +            return None, None
> +
> +        def parse_smaps(file):
> +            vmas = []
> +            i = 0
> +
> +            line = file.readline()
> +
> +            while True:
> +                if not line:
> +                    break
> +                line = line.strip()
> +
> +                i += 1
> +
> +                vma = get_vma(line)
> +                if vma is None:
> +                    raise FileIOException(f'error: could not parse line {i}: "{line}"')
> +
> +                while True:
> +                    line = file.readline()
> +                    if not line:
> +                        break
> +                    line = line.strip()
> +                    if is_vma(line):
> +                        break
> +
> +                    i += 1
> +
> +                    param, value = get_value(line)
> +                    if param:
> +                        vma.stats[param] = {'type': None, 'value': value}
> +
> +                vmas.append(vma)
> +
> +            return vmas
> +
> +        with open(f'/proc/{pid}/smaps', 'r') as file:
> +            self.vmas = parse_smaps(file)
> +
> +    def __iter__(self):
> +        yield from self.vmas
> +
> +
> +def thp_parse(max_order, kpageflags, vfns, pfns, anons, heads):
> +    # Given 4 same-sized arrays representing a range within a page table backed
> +    # by THPs (vfns: virtual frame numbers, pfns: physical frame numbers, anons:
> +    # True if page is anonymous, heads: True if page is head of a THP), return a
> +    # dictionary of statistics describing the mapped THPs.
> +    stats = {
> +        'file': {
> +            'partial': 0,
> +            'aligned': [0] * (max_order + 1),
> +            'unaligned': [0] * (max_order + 1),
> +        },
> +        'anon': {
> +            'partial': 0,
> +            'aligned': [0] * (max_order + 1),
> +            'unaligned': [0] * (max_order + 1),
> +        },
> +    }
> +
> +    indexes = np.arange(len(vfns), dtype=np.uint64)
> +    ranges = cont_ranges_all([indexes, vfns, pfns])
> +    for rindex, rpfn in zip(ranges[0], ranges[2]):
> +        index_next = int(rindex[0])
> +        index_end = int(rindex[1]) + 1
> +        pfn_end = int(rpfn[1]) + 1
> +
> +        folios = indexes[index_next:index_end][heads[index_next:index_end]]
> +
> +        # Account pages for any partially mapped THP at the front. In that case,
> +        # the first page of the range is a tail.
> +        nr = (int(folios[0]) if len(folios) else index_end) - index_next
> +        stats['anon' if anons[index_next] else 'file']['partial'] += nr
> +
> +        # Account pages for any partially mapped THP at the back. In that case,
> +        # the next page after the range is a tail.
> +        if len(folios):
> +            flags = int(kpageflags.get(pfn_end)[0])
> +            if flags & KPF_COMPOUND_TAIL:
> +                nr = index_end - int(folios[-1])
> +                folios = folios[:-1]
> +                index_end -= nr
> +                stats['anon' if anons[index_end - 1] else 'file']['partial'] += nr
> +
> +        # Account fully mapped THPs in the middle of the range.
> +        if len(folios):
> +            folio_nrs = np.append(np.diff(folios), np.uint64(index_end - folios[-1]))
> +            folio_orders = np.log2(folio_nrs).astype(np.uint64)
> +            for index, order in zip(folios, folio_orders):
> +                index = int(index)
> +                order = int(order)
> +                nr = 1 << order
> +                vfn = int(vfns[index])
> +                align = 'aligned' if align_forward(vfn, nr) == vfn else 'unaligned'
> +                anon = 'anon' if anons[index] else 'file'
> +                stats[anon][align][order] += nr
> +
> +    rstats = {}
> +
> +    def flatten_sub(type, subtype, stats):
> +        for od, nr in enumerate(stats[2:], 2):
> +            rstats[f"{type}-thp-{subtype}-{odkb(od)}kB"] = {'type': type, 'value': nrkb(nr)}
> +
> +    def flatten_type(type, stats):
> +        flatten_sub(type, 'aligned', stats['aligned'])
> +        flatten_sub(type, 'unaligned', stats['unaligned'])
> +        rstats[f"{type}-thp-partial"] = {'type': type, 'value': nrkb(stats['partial'])}
> +
> +    flatten_type('anon', stats['anon'])
> +    flatten_type('file', stats['file'])
> +
> +    return rstats
> +
> +
> +def cont_parse(order, vfns, pfns, anons, heads):
> +    # Given 4 same-sized arrays representing a range within a page table backed
> +    # by THPs (vfns: virtual frame numbers, pfns: physical frame numbers, anons:
> +    # True if page is anonymous, heads: True if page is head of a THP), return a
> +    # dictionary of statistics describing the contiguous blocks.
> +    nr_cont = 1 << order
> +    nr_anon = 0
> +    nr_file = 0
> +
> +    ranges = cont_ranges_all([np.arange(len(vfns), dtype=np.uint64), vfns, pfns])
> +    for rindex, rvfn, rpfn in zip(*ranges):
> +        index_next = int(rindex[0])
> +        index_end = int(rindex[1]) + 1
> +        vfn_start = int(rvfn[0])
> +        pfn_start = int(rpfn[0])
> +
> +        if align_offset(pfn_start, nr_cont) != align_offset(vfn_start, nr_cont):
> +            continue
> +
> +        off = align_forward(vfn_start, nr_cont) - vfn_start
> +        index_next += off
> +
> +        while index_next + nr_cont <= index_end:
> +            folio_boundary = heads[index_next+1:index_next+nr_cont].any()
> +            if not folio_boundary:
> +                if anons[index_next]:
> +                    nr_anon += nr_cont
> +                else:
> +                    nr_file += nr_cont
> +            index_next += nr_cont
> +
> +    return {
> +        f"anon-cont-aligned-{nrkb(nr_cont)}kB": {'type': 'anon', 'value': nrkb(nr_anon)},
> +        f"file-cont-aligned-{nrkb(nr_cont)}kB": {'type': 'file', 'value': nrkb(nr_file)},
> +    }
> +
> +
> +def vma_print(vma, pid):
> +    # Prints a VMA instance in a format similar to smaps. The main difference is
> +    # that the pid is included as the first value.
> +    print("{:08x} {:016x}-{:016x} {}{}{}{} {:08x} {:02x}:{:02x} {:08x} {}"
> +        .format(
> +            pid, vma.start, vma.end,
> +            'r' if vma.read else '-', 'w' if vma.write else '-',
> +            'x' if vma.execute else '-', 'p' if vma.private else 's',
> +            vma.pgoff, vma.major, vma.minor, vma.inode, vma.name
> +        ))
> +
> +
> +def stats_print(stats, tot_anon, tot_file, inc_empty):
> +    # Print a statistics dictionary.
> +    label_field = 32
> +    for label, stat in stats.items():
> +        type = stat['type']
> +        value = stat['value']
> +        if value or inc_empty:
> +            pad = max(0, label_field - len(label) - 1)
> +            if type == 'anon' and tot_anon > 0:
> +                percent = f' ({value / tot_anon:3.0%})'
> +            elif type == 'file' and tot_file > 0:
> +                percent = f' ({value / tot_file:3.0%})'
> +            else:
> +                percent = ''
> +            print(f"{label}:{' ' * pad}{value:8} kB{percent}")
> +
> +
> +def vma_parse(vma, pagemap, kpageflags, contorders):
> +    # Generate thp and cont statistics for a single VMA.
> +    start = vma.start >> PAGE_SHIFT
> +    end = vma.end >> PAGE_SHIFT
> +
> +    pmes = pagemap.get(start, end - start)
> +    present = pmes & PM_PAGE_PRESENT != 0
> +    pfns = pmes & PM_PFN_MASK
> +    pfns = pfns[present]
> +    vfns = np.arange(start, end, dtype=np.uint64)
> +    vfns = vfns[present]
> +
> +    flags = kpageflags.getv(cont_ranges_all([pfns])[0])
> +    anons = flags & KPF_ANON != 0
> +    heads = flags & KPF_COMPOUND_HEAD != 0
> +    tails = flags & KPF_COMPOUND_TAIL != 0
> +    thps = heads | tails
> +
> +    tot_anon = np.count_nonzero(anons)
> +    tot_file = np.size(anons) - tot_anon
> +    tot_anon = nrkb(tot_anon)
> +    tot_file = nrkb(tot_file)
> +
> +    vfns = vfns[thps]
> +    pfns = pfns[thps]
> +    anons = anons[thps]
> +    heads = heads[thps]
> +
> +    thpstats = thp_parse(PMD_ORDER, kpageflags, vfns, pfns, anons, heads)
> +    contstats = [cont_parse(order, vfns, pfns, anons, heads) for order in contorders]
> +
> +    return {
> +        **thpstats,
> +        **{k: v for s in contstats for k, v in s.items()}
> +    }, tot_anon, tot_file
> +
> +
> +def do_main(args):
> +    pids = set()
> +    summary = {}
> +    summary_anon = 0
> +    summary_file = 0
> +
> +    if args.cgroup:
> +        with open(f'{args.cgroup}/cgroup.procs') as pidfile:
> +            for line in pidfile.readlines():
> +                pids.add(int(line.strip()))
> +    else:
> +        pids.add(args.pid)
> +
> +    for pid in pids:
> +        try:
> +            with PageMap(pid) as pagemap:
> +                with KPageFlags() as kpageflags:
> +                    for vma in VMAList(pid):
> +                        if (vma.read or vma.write or vma.execute) and vma.stats['Rss']['value'] > 0:
> +                            stats, vma_anon, vma_file = vma_parse(vma, pagemap, kpageflags, args.cont)
> +                        else:
> +                            stats = {}
> +                            vma_anon = 0
> +                            vma_file = 0
> +                        if args.inc_smaps:
> +                            stats = {**vma.stats, **stats}
> +                        if args.summary:
> +                            for k, v in stats.items():
> +                                if k in summary:
> +                                    assert(summary[k]['type'] == v['type'])
> +                                    summary[k]['value'] += v['value']
> +                                else:
> +                                    summary[k] = v
> +                            summary_anon += vma_anon
> +                            summary_file += vma_file
> +                        else:
> +                            vma_print(vma, pid)
> +                            stats_print(stats, vma_anon, vma_file, args.inc_empty)
> +        except FileNotFoundError:
> +            if not args.cgroup:
> +                raise
> +        except ProcessLookupError:
> +            if not args.cgroup:
> +                raise
> +
> +    if args.summary:
> +        stats_print(summary, summary_anon, summary_file, args.inc_empty)
> +
> +
> +def main():
> +    def formatter(prog):
> +        width = shutil.get_terminal_size().columns
> +        width -= 2
> +        width = min(80, width)
> +        return argparse.HelpFormatter(prog, width=width)
> +
> +    def size2order(human):
> +        units = {"K": 2**10, "M": 2**20, "G": 2**30}
> +        unit = 1
> +        if human[-1] in units:
> +            unit = units[human[-1]]
> +            human = human[:-1]
> +        try:
> +            size = int(human)
> +        except ValueError:
> +            raise ArgException('error: --cont value must be integer size with optional KMG unit')
> +        size *= unit
> +        order = int(math.log2(size / PAGE_SIZE))
> +        if order < 1:
> +            raise ArgException('error: --cont value must be size of at least 2 pages')
> +        if (1 << order) * PAGE_SIZE != size:
> +            raise ArgException('error: --cont value must be size of power-of-2 pages')
> +        return order
> +
> +    parser = argparse.ArgumentParser(formatter_class=formatter,
> +        description="""Prints information about how transparent huge pages are
> +                    mapped to a specified process or cgroup.
> +
> +                    Shows statistics for fully-mapped THPs of every size, mapped
> +                    both naturally aligned and unaligned for both file and
> +                    anonymous memory. See
> +                    [anon|file]-thp-[aligned|unaligned]-<size>kB keys.
> +
> +                    Shows statistics for mapped pages that belong to a THP but
> +                    which are not fully mapped. See [anon|file]-thp-partial
> +                    keys.
> +
> +                    Optionally shows statistics for naturally aligned,
> +                    contiguous blocks of memory of a specified size (when --cont
> +                    is provided). See [anon|file]-cont-aligned-<size>kB keys.
> +
> +                    Statistics are shown in kB and as a percentage of either
> +                    total anon or file memory as appropriate.""",
> +        epilog="""Requires root privilege to access pagemap and kpageflags.""")
> +
> +    parser.add_argument('--pid',
> +        metavar='pid', required=False, type=int,
> +        help="""Process id of the target process. Exactly one of --pid and
> +            --cgroup must be provided.""")
> +
> +    parser.add_argument('--cgroup',
> +        metavar='path', required=False,
> +        help="""Path to the target cgroup in sysfs. Iterates over every pid in
> +            the cgroup. Exactly one of --pid and --cgroup must be provided.""")
> +
> +    parser.add_argument('--summary',
> +        required=False, default=False, action='store_true',
> +        help="""Sum the per-vma statistics to provide a summary over the whole
> +            process or cgroup.""")
> +
> +    parser.add_argument('--cont',
> +        metavar='size[KMG]', required=False, default=[], action='append',
> +        help="""Adds anon and file stats for naturally aligned, contiguously
> +            mapped blocks of the specified size. May be issued multiple times to
> +            track multiple sized blocks. Useful to infer e.g. arm64 contpte and
> +            hpa mappings. Size must be a power-of-2 number of pages.""")
> +
> +    parser.add_argument('--inc-smaps',
> +        required=False, default=False, action='store_true',
> +        help="""Include all numerical, additive /proc/<pid>/smaps stats in the
> +            output.""")
> +
> +    parser.add_argument('--inc-empty',
> +        required=False, default=False, action='store_true',
> +        help="""Show all statistics including those whose value is 0.""")
> +
> +    parser.add_argument('--periodic',
> +        metavar='sleep_ms', required=False, type=int,
> +        help="""Run in a loop, polling every sleep_ms milliseconds.""")
> +
> +    args = parser.parse_args()
> +
> +    try:
> +        if (args.pid and args.cgroup) or \
> +           (not args.pid and not args.cgroup):
> +            raise ArgException("error: Exactly one of --pid and --cgroup must be provided.")
> +
> +        args.cont = [size2order(cont) for cont in args.cont]
> +    except ArgException as e:
> +        parser.print_usage()
> +        raise
> +
> +    if args.periodic:
> +        while True:
> +            do_main(args)
> +            print()
> +            time.sleep(args.periodic / 1000)
> +    else:
> +        do_main(args)
> +
> +
> +if __name__ == "__main__":
> +    try:
> +        main()
> +    except Exception as e:
> +        prog = os.path.basename(sys.argv[0])
> +        print(f'{prog}: {e}')
> +        sys.exit(1)
> --
> 2.25.1
>
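For anyone else reading the script: cont_ranges_all() in the quoted patch
appears to split its input arrays into runs where every array increases
by exactly 1 per element, and to return the inclusive [first, last] pair
of each run for each array. A pure-Python sketch of that reading follows;
cont_ranges is a hypothetical name and this is not the numpy
implementation from the patch, only an illustration of what it seems to
compute:

```python
def cont_ranges(*seqs):
    # For each input sequence, return the inclusive [first, last] values of
    # every run in which ALL sequences increase by exactly 1 per step.
    n = len(seqs[0])
    assert all(len(s) == n for s in seqs)
    out = [[] for _ in seqs]
    if n == 0:
        return out
    start = 0
    for i in range(1, n + 1):
        # A run ends at the final element, or wherever any sequence fails
        # to step by exactly 1.
        if i == n or any(s[i] - s[i - 1] != 1 for s in seqs):
            for ranges, s in zip(out, seqs):
                ranges.append([s[start], s[i - 1]])
            start = i
    return out


# Made-up frame numbers: the virtual run 10..13 is split in two because the
# physical frames jump from 101 to 200 in the middle of it.
vfns = [10, 11, 12, 13, 20, 21]
pfns = [100, 101, 200, 201, 300, 301]
vranges, pranges = cont_ranges(vfns, pfns)
print(vranges)  # [[10, 11], [12, 13], [20, 21]]
print(pranges)  # [[100, 101], [200, 201], [300, 301]]
```

Under that reading, thp_parse() and cont_parse() only ever look at runs
that are contiguous both virtually and physically.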