[PATCH v2] smaps: Report correct page sizes with THP


* [PATCH v2] smaps: Report correct page sizes with THP
@ 2026-02-25 23:27 Andi Kleen
  2026-02-26 12:08 ` Usama Arif
  2026-02-26 17:31 ` David Hildenbrand (Arm)
  0 siblings, 2 replies; 3+ messages in thread
From: Andi Kleen @ 2026-02-25 23:27 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, Andi Kleen

The earlier version of this patch kit wasn't that well received,
with the main objection being the lack of support for mTHP. This variant
tracks any mTHP sizes in a VMA and reports them with MMUPageSizeN in smaps,
with increasing N.  The base page size is still reported without a
number suffix to stay compatible.

The nice thing is that the patch is actually simpler and more
straightforward than the THP-only variant. I also improved the
documentation.

Recently I wasted quite some time debugging why THP didn't work, when it
was just smaps always reporting the base page size. It has separate
counts for (non-m) THP, but using them is not always obvious.
I left KernelPageSize alone.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 Documentation/filesystems/proc.rst |  8 ++++++--
 fs/proc/task_mmu.c                 | 14 +++++++++++++-
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index b0c0d1b45b99..c5102ef7a2eb 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -452,6 +452,7 @@ Memory Area, or VMA) there is a series of lines such as the following::
     Size:               1084 kB
     KernelPageSize:        4 kB
     MMUPageSize:           4 kB
+    MMUPageSize2:       2048 kB
     Rss:                 892 kB
     Pss:                 374 kB
     Pss_Dirty:             0 kB
@@ -476,14 +477,17 @@ Memory Area, or VMA) there is a series of lines such as the following::
     VmFlags: rd ex mr mw me dw
 
 The first of these lines shows the same information as is displayed for
-the mapping in /proc/PID/maps.  Following lines show the size of the
+the mapping in /proc/PID/maps (except that there might be more page sizes
+if the mapping has them).
+Following lines show the size of the
 mapping (size); the size of each page allocated when backing a VMA
 (KernelPageSize), which is usually the same as the size in the page table
 entries; the page size used by the MMU when backing a VMA (in most cases,
 the same as KernelPageSize); the amount of the mapping that is currently
 resident in RAM (RSS); the process's proportional share of this mapping
 (PSS); and the number of clean and dirty shared and private pages in the
-mapping.
+mapping. If the mapping has multiple page sizes, there might be multiple
+numbered MMUPageSize entries.
 
 The "proportional set size" (PSS) of a process is the count of pages it has
 in memory, where each page is divided by the number of processes sharing it.
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e091931d7ca1..8bfd8b13c2ed 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -874,6 +874,7 @@ struct mem_size_stats {
 	unsigned long shared_hugetlb;
 	unsigned long private_hugetlb;
 	unsigned long ksm;
+	unsigned long compound_orders;
 	u64 pss;
 	u64 pss_anon;
 	u64 pss_file;
@@ -942,6 +943,9 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
 	if (young || folio_test_young(folio) || folio_test_referenced(folio))
 		mss->referenced += size;
 
+	mss->compound_orders |=
+		BIT_ULL(compound ? folio_large_order(folio) : 0);
+
 	/*
 	 * Then accumulate quantities that may depend on sharing, or that may
 	 * differ page-by-page.
@@ -1371,6 +1375,7 @@ static int show_smap(struct seq_file *m, void *v)
 {
 	struct vm_area_struct *vma = v;
 	struct mem_size_stats mss = {};
+	int i, cnt = 0;
 
 	smap_gather_stats(vma, &mss, 0);
 
@@ -1378,7 +1383,14 @@ static int show_smap(struct seq_file *m, void *v)
 
 	SEQ_PUT_DEC("Size:           ", vma->vm_end - vma->vm_start);
 	SEQ_PUT_DEC(" kB\nKernelPageSize: ", vma_kernel_pagesize(vma));
-	SEQ_PUT_DEC(" kB\nMMUPageSize:    ", vma_mmu_pagesize(vma));
+
+	for_each_set_bit(i, &mss.compound_orders, BITS_PER_LONG) {
+		if (cnt++ == 0)
+			SEQ_PUT_DEC(" kB\nMMUPageSize:    ", PAGE_SIZE << i);
+		else
+			seq_printf(m, " kB\nMMUPageSize%d:   %8u",
+					cnt, 1 << (PAGE_SHIFT-10+i));
+	}
 	seq_puts(m, " kB\n");
 
 	__show_smap(m, &mss, false);
-- 
2.53.0
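
The reporting scheme in the patch above can be modeled outside the kernel. The following is a minimal userspace sketch, not the kernel code: `PAGE_SHIFT_MODEL`, `MAX_ORDER_MODEL`, `record_order()`, `order_to_kb()` and `mask_to_sizes_kb()` are hypothetical names, and a 4 kB base page (shift of 12) is assumed. It shows how folio orders accumulate into a bitmask and how each set bit converts to a size in kB, using the same arithmetic as `1 << (PAGE_SHIFT-10+i)` in the hunk.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: 4 kB base pages (PAGE_SHIFT of 12). */
#define PAGE_SHIFT_MODEL 12

/* Largest order + 1 whose size in kB still fits in an unsigned long. */
#define MAX_ORDER_MODEL \
	((unsigned int)(sizeof(unsigned long) * CHAR_BIT) - (PAGE_SHIFT_MODEL - 10))

/* Set the bit for one folio order, modeling "compound_orders |= BIT(order)". */
static unsigned long record_order(unsigned long mask, unsigned int order)
{
	assert(order < MAX_ORDER_MODEL);
	return mask | (1UL << order);
}

/* Size in kB of a folio of the given order: (PAGE_SIZE << order) / 1024. */
static unsigned long order_to_kb(unsigned int order)
{
	assert(order < MAX_ORDER_MODEL);
	return 1UL << (PAGE_SHIFT_MODEL - 10 + order);
}

/*
 * Walk the mask from the lowest set bit upward and store each size in kB.
 * Returns the number of sizes written (at most max).
 */
static int mask_to_sizes_kb(unsigned long mask, unsigned long *out, int max)
{
	unsigned int order;
	int n = 0;

	for (order = 0; order < MAX_ORDER_MODEL && n < max; order++)
		if (mask & (1UL << order))
			out[n++] = order_to_kb(order);
	return n;
}
```

Recording the same order twice leaves the mask unchanged, so the mask only says which sizes occur in the VMA, not how many folios of each size there are or how much of each folio is mapped.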




* Re: [PATCH v2] smaps: Report correct page sizes with THP
  2026-02-25 23:27 [PATCH v2] smaps: Report correct page sizes with THP Andi Kleen
@ 2026-02-26 12:08 ` Usama Arif
  2026-02-26 17:31 ` David Hildenbrand (Arm)
  1 sibling, 0 replies; 3+ messages in thread
From: Usama Arif @ 2026-02-26 12:08 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Usama Arif, linux-mm, akpm

On Wed, 25 Feb 2026 15:27:08 -0800 Andi Kleen <ak@linux.intel.com> wrote:

> The earlier version of this patch kit wasn't that well received,
> with the main objection being the lack of support for mTHP. This variant
> tracks any mTHP sizes in a VMA and reports them with MMUPageSizeN in smaps,
> with increasing N.  The base page size is still reported without a
> number suffix to stay compatible.
> 
> The nice thing is that the patch is actually simpler and more
> straightforward than the THP-only variant. I also improved the
> documentation.
> 
> Recently I wasted quite some time debugging why THP didn't work, when it
> was just smaps always reporting the base page size. It has separate
> counts for (non-m) THP, but using them is not always obvious.
> I left KernelPageSize alone.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  Documentation/filesystems/proc.rst |  8 ++++++--
>  fs/proc/task_mmu.c                 | 14 +++++++++++++-
>  2 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index b0c0d1b45b99..c5102ef7a2eb 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -452,6 +452,7 @@ Memory Area, or VMA) there is a series of lines such as the following::
>      Size:               1084 kB
>      KernelPageSize:        4 kB
>      MMUPageSize:           4 kB
> +    MMUPageSize2:       2048 kB
>      Rss:                 892 kB
>      Pss:                 374 kB
>      Pss_Dirty:             0 kB
> @@ -476,14 +477,17 @@ Memory Area, or VMA) there is a series of lines such as the following::
>      VmFlags: rd ex mr mw me dw
>  
>  The first of these lines shows the same information as is displayed for
> -the mapping in /proc/PID/maps.  Following lines show the size of the
> +the mapping in /proc/PID/maps (except that there might be more page sizes
> +if the mapping has them).
> +Following lines show the size of the
>  mapping (size); the size of each page allocated when backing a VMA
>  (KernelPageSize), which is usually the same as the size in the page table
>  entries; the page size used by the MMU when backing a VMA (in most cases,
>  the same as KernelPageSize); the amount of the mapping that is currently
>  resident in RAM (RSS); the process's proportional share of this mapping
>  (PSS); and the number of clean and dirty shared and private pages in the
> -mapping.
> +mapping. If the mapping has multiple page sizes, there might be multiple
> +numbered MMUPageSize entries.
>  
>  The "proportional set size" (PSS) of a process is the count of pages it has
>  in memory, where each page is divided by the number of processes sharing it.
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index e091931d7ca1..8bfd8b13c2ed 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -874,6 +874,7 @@ struct mem_size_stats {
>  	unsigned long shared_hugetlb;
>  	unsigned long private_hugetlb;
>  	unsigned long ksm;
> +	unsigned long compound_orders;
>  	u64 pss;
>  	u64 pss_anon;
>  	u64 pss_file;
> @@ -942,6 +943,9 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
>  	if (young || folio_test_young(folio) || folio_test_referenced(folio))
>  		mss->referenced += size;
>  
> +	mss->compound_orders |=
> +		BIT_ULL(compound ? folio_large_order(folio) : 0);
> +
>  	/*
>  	 * Then accumulate quantities that may depend on sharing, or that may
>  	 * differ page-by-page.
> @@ -1371,6 +1375,7 @@ static int show_smap(struct seq_file *m, void *v)
>  {
>  	struct vm_area_struct *vma = v;
>  	struct mem_size_stats mss = {};
> +	int i, cnt = 0;
>  
>  	smap_gather_stats(vma, &mss, 0);
>  
> @@ -1378,7 +1383,14 @@ static int show_smap(struct seq_file *m, void *v)
>  
>  	SEQ_PUT_DEC("Size:           ", vma->vm_end - vma->vm_start);
>  	SEQ_PUT_DEC(" kB\nKernelPageSize: ", vma_kernel_pagesize(vma));
> -	SEQ_PUT_DEC(" kB\nMMUPageSize:    ", vma_mmu_pagesize(vma));
> +
> +	for_each_set_bit(i, &mss.compound_orders, BITS_PER_LONG) {

Hello Andi!

When a VMA has no resident pages (e.g., freshly mmap'd but not yet
faulted), compound_orders will be zero and the for_each_set_bit loop
will not execute at all. This means no MMUPageSize line is emitted
for that VMA.

Previously, vma_mmu_pagesize() was called unconditionally and always
produced the MMUPageSize field. Userspace tools that parse smaps and
expect MMUPageSize to always be present would break on VMAs with no
resident pages. Should we always add it?

Thanks
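
The empty-mask case described above can be reproduced in a userspace model. This is only a sketch under stated assumptions: `format_mmu_lines()` and its `fallback_kb` parameter are hypothetical, 4 kB base pages are assumed, the output merely imitates the smaps layout, and the fallback stands in for the value that `vma_mmu_pagesize()` used to supply. With `fallback_kb` set to 0 the function mimics the patch and writes nothing for a mask of zero; with a nonzero fallback it always emits an unnumbered `MMUPageSize` line.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#define PAGE_SHIFT_MODEL 12	/* assumption: 4 kB base pages */

/*
 * Write smaps-like MMUPageSize lines for every order set in mask.
 * If no line was written and fallback_kb is nonzero, write one
 * unnumbered line carrying fallback_kb instead.
 * Returns the number of complete lines written.
 */
static int format_mmu_lines(char *buf, size_t len, unsigned long mask,
			    unsigned long fallback_kb)
{
	const unsigned int max_order =
		(unsigned int)(sizeof(mask) * CHAR_BIT) - (PAGE_SHIFT_MODEL - 10);
	unsigned int order;
	size_t used = 0;
	int cnt = 0;
	int n;

	assert(buf && len > 0);
	buf[0] = '\0';

	for (order = 0; order < max_order; order++) {
		unsigned long kb = 1UL << (PAGE_SHIFT_MODEL - 10 + order);

		if (!(mask & (1UL << order)))
			continue;
		if (cnt == 0)	/* first size found: no numeric suffix */
			n = snprintf(buf + used, len - used,
				     "MMUPageSize:    %8lu kB\n", kb);
		else		/* further sizes: suffix 2, 3, ... */
			n = snprintf(buf + used, len - used,
				     "MMUPageSize%d:   %8lu kB\n", cnt + 1, kb);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the truncated line */
			break;
		}
		used += (size_t)n;
		cnt++;
	}

	if (cnt == 0 && fallback_kb) {
		n = snprintf(buf, len, "MMUPageSize:    %8lu kB\n", fallback_kb);
		if (n >= 0 && (size_t)n < len)
			cnt = 1;
		else
			buf[0] = '\0';
	}
	return cnt;
}
```

In this model the unnumbered line carries the smallest order present, so a VMA that holds only order-9 folios would show 2048 kB there rather than the base page size.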

> +		if (cnt++ == 0)
> +			SEQ_PUT_DEC(" kB\nMMUPageSize:    ", PAGE_SIZE << i);
> +		else
> +			seq_printf(m, " kB\nMMUPageSize%d:   %8u",
> +					cnt, 1 << (PAGE_SHIFT-10+i));
> +	}
>  	seq_puts(m, " kB\n");
>  
>  	__show_smap(m, &mss, false);
> -- 
> 2.53.0
> 
> 
> 



* Re: [PATCH v2] smaps: Report correct page sizes with THP
  2026-02-25 23:27 [PATCH v2] smaps: Report correct page sizes with THP Andi Kleen
  2026-02-26 12:08 ` Usama Arif
@ 2026-02-26 17:31 ` David Hildenbrand (Arm)
  1 sibling, 0 replies; 3+ messages in thread
From: David Hildenbrand (Arm) @ 2026-02-26 17:31 UTC (permalink / raw)
  To: Andi Kleen, linux-mm; +Cc: akpm

On 2/26/26 00:27, Andi Kleen wrote:
> The earlier version of this patch kit wasn't that well received,
> with the main objection being the lack of support for mTHP. This variant
> tracks any mTHP sizes in a VMA and reports them with MMUPageSizeN in smaps,
> with increasing N.  The base page size is still reported without a
> number suffix to stay compatible.
> 
> The nice thing is that the patch is actually simpler and more
> straightforward than the THP-only variant. I also improved the
> documentation.
> 
> Recently I wasted quite some time debugging why THP didn't work, when it
> was just smaps always reporting the base page size. It has separate
> counts for (non-m) THP, but using them is not always obvious.
> I left KernelPageSize alone.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---

You should CC people who commented on earlier versions.

I still don't like this.


a) A folio having a certain order does not imply that the hardware actually
coalesces anything. MMUPageSize is misleading in that case.

b) Finding a folio of a certain order does not imply that it is even fully
mapped in there.

c) PTE coalescing on AMD can even span folios.

But more importantly:

d) MMUPageSize is independent of the actual page mappings, and I don't
   think we should change these semantics.


Let's see why MMUPageSize was added in the first place:

commit 3340289ddf29ca75c3acfb3a6b72f234b2f74d5c
Author: Mel Gorman <mel@csn.ul.ie>
Date:   Tue Jan 6 14:38:54 2009 -0800

    mm: report the MMU pagesize in /proc/pid/smaps
    
    The KernelPageSize entry in /proc/pid/smaps is the pagesize used by the
    kernel to back a VMA.  This matches the size used by the MMU in the
    majority of cases.  However, one counter-example occurs on PPC64 kernels
    whereby a kernel using 64K as a base pagesize may still use 4K pages for
    the MMU on older processor.  To distinguish, this patch reports
    MMUPageSize as the pagesize used by the MMU in /proc/pid/smaps.


So instead of 64K (PAGE_SIZE), they reported 4K. Always. Even if nothing is mapped.

So you could indicate all MMUPageSize values that the hardware could possibly
support here. I don't think that would be very helpful.

We once discussed exporting more stats here (similar to AnonHugePages/ShmemPmdMapped, ...)
but we were concerned about creating a mess with mTHP stats.

For this reason, Ryan developed a tool (tools/mm/thpmaps) to introspect the
actual mappings.

See

commit 2444172cfde45a3d6e655f50c620727c76bab4a2
Author: Ryan Roberts <ryan.roberts@arm.com>
Date:   Tue Jan 16 14:12:35 2024 +0000

    tools/mm: add thpmaps script to dump THP usage info
    
    With the proliferation of large folios for file-backed memory, and more
    recently the introduction of multi-size THP for anonymous memory, it is
    becoming useful to be able to see exactly how large folios are mapped into
    processes.  For some architectures (e.g.  arm64), if most memory is mapped
    using contpte-sized and -aligned blocks, TLB usage can be optimized so
    it's useful to see where these requirements are and are not being met.
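
The PMD-level counters mentioned above (AnonHugePages, ShmemPmdMapped) are already exported per VMA in smaps and can be summed with a few lines of C. The following is a minimal sketch and not a replacement for thpmaps: `smaps_field_kb()` and `sum_smaps_field()` are hypothetical helper names, and the code assumes the usual `Key:   <value> kB` line layout. It cannot see mTHP sizes, which is the gap this thread is about.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * If line starts with "<key>:", parse the kB value that follows into *kb
 * and return 1; otherwise return 0. The ':' check keeps "MMUPageSize"
 * from matching a longer key such as "MMUPageSize2".
 */
static int smaps_field_kb(const char *line, const char *key, unsigned long *kb)
{
	size_t klen;

	assert(line && key && kb);
	klen = strlen(key);
	if (strncmp(line, key, klen) != 0 || line[klen] != ':')
		return 0;
	return sscanf(line + klen + 1, "%lu kB", kb) == 1;
}

/* Sum one field over all VMAs in an smaps-formatted stream. */
static unsigned long sum_smaps_field(FILE *f, const char *key)
{
	char line[512];
	unsigned long kb, total = 0;

	assert(f && key);
	while (fgets(line, sizeof(line), f))
		if (smaps_field_kb(line, key, &kb))
			total += kb;
	return total;
}
```

A caller would open `/proc/self/smaps` (or `/proc/PID/smaps` with sufficient privileges) with `fopen()` and pass the stream to `sum_smaps_field(f, "AnonHugePages")`.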


-- 
Cheers,

David

