linux-mm.kvack.org archive mirror
From: Kent Overstreet <kent.overstreet@linux.dev>
To: David Wang <00107082@163.com>
Cc: cachen@purestorage.com, akpm@linux-foundation.org, cl@gentwo.org,
	 corbet@lwn.net, dennis@kernel.org, hannes@cmpxchg.org,
	harry.yoo@oracle.com,  jackmanb@google.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	 linux-mm@kvack.org, mhocko@suse.com, rientjes@google.com,
	roman.gushchin@linux.dev,  surenb@google.com, tj@kernel.org,
	vbabka@suse.cz, yzhong@purestorage.com,  ziy@nvidia.com
Subject: Re: [PATCH] alloc_tag: add per-NUMA node stats
Date: Thu, 12 Jun 2025 11:37:12 -0400
Message-ID: <ub5knll6sof6sbl4elcrdpmf7ptyds6xfusio672fgyt6sxeja@3awoyjpq7xev>
In-Reply-To: <20250612053605.5911-1-00107082@163.com>

On Thu, Jun 12, 2025 at 01:36:05PM +0800, David Wang wrote:
> Hi, 
> 
> On Tue, 10 Jun 2025 17:30:53 -0600 Casey Chen <cachen@purestorage.com> wrote:
> > Add support for tracking per-NUMA node statistics in /proc/allocinfo.
> > Previously, each alloc_tag had a single set of counters (bytes and
> > calls), aggregated across all CPUs. With this change, each CPU can
> > maintain separate counters for each NUMA node, allowing finer-grained
> > memory allocation profiling.
> > 
> > This feature is controlled by the new
> > CONFIG_MEM_ALLOC_PROFILING_PER_NUMA_STATS option:
> > 
> > * When enabled (=y), the output includes per-node statistics following
> >   the total bytes/calls:
> > 
> > <size> <calls> <tag info>
> > ...
> > 315456       9858     mm/dmapool.c:338 func:pool_alloc_page
> >         nid0     94912        2966
> >         nid1     220544       6892
> > 7680         60       mm/dmapool.c:254 func:dma_pool_create
> >         nid0     4224         33
> >         nid1     3456         27
> > 
> > * When disabled (=n), the output remains unchanged:
> > <size> <calls> <tag info>
> > ...
> > 315456       9858     mm/dmapool.c:338 func:pool_alloc_page
> > 7680         60       mm/dmapool.c:254 func:dma_pool_create
> > 
> > To minimize memory overhead, per-NUMA stats counters are dynamically
> > allocated using the percpu allocator. PERCPU_DYNAMIC_RESERVE has been
> > increased to ensure sufficient space for in-kernel alloc_tag counters.
> > 
> > For in-kernel alloc_tag instances, pcpu_alloc_noprof() is used to
> > allocate counters. These allocations are excluded from the profiling
> > statistics themselves.
> 
> Considering NUMA balancing, I have two questions:
> 1. Do we need the granularity of calling sites?
> We need that granularity to identify a possible memory leak, or a place
> where memory usage can be optimized.
> But for a NUMA imbalance, the calling site is mostly *innocent*; the
> clue normally lies in the CPU making the allocation, the memory
> interface, etc...
> The point is: when a NUMA imbalance happens, can it be fixed by
> adjusting the calling sites?
> Isn't <cpu, memory interface/slab name, numa id> enough as a key for
> NUMA stats analysis?

kmalloc_node().

Per callsite is the right granularity.

But AFAIK correlating profiling information with the allocation is still
an entirely manual process, so that's the part I'm interested in right
now.

Under the hood memory allocation profiling gives you the ability to map
any specific allocation to the line of code that owns it - that is, map
kernel virtual address to codetag.

But I don't know if perf collects _data_ addresses on cache misses. Does
anyone?



Thread overview: 15+ messages
2025-06-10 23:30 Casey Chen
2025-06-11  1:21 ` Andrew Morton
2025-06-11  1:33   ` Casey Chen
2025-06-11  3:47     ` Kent Overstreet
2025-06-11  3:41   ` Kent Overstreet
2025-06-12  5:36 ` David Wang
2025-06-12 15:37   ` Kent Overstreet [this message]
2025-06-18 22:16 ` Kent Overstreet
2025-07-08 21:52   ` David Rientjes
2025-07-08 22:38     ` Christoph Lameter (Ampere)
2025-07-09 19:14       ` David Rientjes
2025-07-08 22:53     ` Casey Chen
2025-07-08 23:07     ` Casey Chen
2025-07-10  5:54     ` Sourav Panda
  -- strict thread matches above, loose matches on Subject: below --
2025-05-30  0:39 [PATCH 0/1] alloc_tag: add per-numa " Casey Chen
2025-05-30  0:39 ` [PATCH] " Casey Chen
