Date: Thu, 12 Jun 2025 11:37:12 -0400
From: Kent Overstreet
To: David Wang <00107082@163.com>
Cc: cachen@purestorage.com, akpm@linux-foundation.org, cl@gentwo.org, corbet@lwn.net, dennis@kernel.org, hannes@cmpxchg.org, harry.yoo@oracle.com, jackmanb@google.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mhocko@suse.com, rientjes@google.com, roman.gushchin@linux.dev, surenb@google.com, tj@kernel.org, vbabka@suse.cz, yzhong@purestorage.com, ziy@nvidia.com
Subject: Re: [PATCH] alloc_tag: add per-NUMA node stats
References: <20250610233053.973796-1-cachen@purestorage.com> <20250612053605.5911-1-00107082@163.com>
In-Reply-To: <20250612053605.5911-1-00107082@163.com>

On Thu, Jun 12, 2025 at 01:36:05PM +0800, David Wang wrote:
> Hi,
>
> On Tue, 10 Jun 2025 17:30:53 -0600 Casey Chen wrote:
> > Add support for tracking per-NUMA node statistics in /proc/allocinfo.
> > Previously, each alloc_tag had a single set of counters (bytes and
> > calls), aggregated across all CPUs. With this change, each CPU can
> > maintain separate counters for each NUMA node, allowing finer-grained
> > memory allocation profiling.
> >
> > This feature is controlled by the new
> > CONFIG_MEM_ALLOC_PROFILING_PER_NUMA_STATS option:
> >
> > * When enabled (=y), the output includes per-node statistics following
> >   the total bytes/calls:
> >
> >
> > ...
> > 315456       9858     mm/dmapool.c:338 func:pool_alloc_page
> >         nid0     94912        2966
> >         nid1     220544       6892
> > 7680         60       mm/dmapool.c:254 func:dma_pool_create
> >         nid0     4224         33
> >         nid1     3456         27
> >
> > * When disabled (=n), the output remains unchanged:
> >
> > ...
> > 315456       9858     mm/dmapool.c:338 func:pool_alloc_page
> > 7680         60       mm/dmapool.c:254 func:dma_pool_create
> >
> > To minimize memory overhead, per-NUMA stats counters are dynamically
> > allocated using the percpu allocator. PERCPU_DYNAMIC_RESERVE has been
> > increased to ensure sufficient space for in-kernel alloc_tag counters.
> >
> > For in-kernel alloc_tag instances, pcpu_alloc_noprof() is used to
> > allocate counters. These allocations are excluded from the profiling
> > statistics themselves.
>
> Considering NUMA balance, I have two questions:
> 1. Do we need the granularity of calling sites?
> We need that granularity to identify a possible memory leak, or somewhere
> we can optimize its memory usage.
> But for NUMA unbalance, the calling site would mostly be *innocent*, the
> clue normally lies in the cpu making memory allocation, memory interface, etc...
> The point is, when NUMA unbalance happened, can it be fixed by adjusting the calling sites?
> Isn't enough to be used as key for numa
> stats analysis?

kmalloc_node().

Per callsite is the right granularity.

But AFAIK correlating profiling information with the allocation is still
an entirely manual process, so that's the part I'm interested in right
now.

Under the hood memory allocation profiling gives you the ability to map
any specific allocation to the line of code that owns it - that is, map
kernel virtual address to codetag.

But I don't know if perf collects _data_ addresses on cache misses. Does
anyone?
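
To illustrate what per-callsite granularity could give you for NUMA analysis, here is a minimal userspace sketch that parses the per-node output format proposed in the quoted patch description and computes each callsite's byte share per node. The sample text is copied from the patch description; the format is only what that patch proposes, and the names `Callsite`, `parse_allocinfo` and `node_share` are hypothetical helpers made up for this example, not part of any kernel tooling.

```python
import re
from dataclasses import dataclass, field

# Sample copied from the quoted patch description. On a kernel carrying
# the proposed patch this text would presumably come from /proc/allocinfo.
SAMPLE = """\
315456 9858 mm/dmapool.c:338 func:pool_alloc_page
        nid0 94912 2966
        nid1 220544 6892
7680 60 mm/dmapool.c:254 func:dma_pool_create
        nid0 4224 33
        nid1 3456 27
"""

# "<bytes> <calls> <file>:<line> func:<name>" (totals for one callsite)
TAG_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S+:\d+)\s+func:(\S+)")
# "nid<N> <bytes> <calls>" (per-node line following a callsite line)
NODE_RE = re.compile(r"^\s*nid(\d+)\s+(\d+)\s+(\d+)")


@dataclass
class Callsite:
    location: str
    func: str
    total_bytes: int
    total_calls: int
    nodes: dict = field(default_factory=dict)  # nid -> (bytes, calls)


def parse_allocinfo(text):
    """Group each nidN line under the callsite line that precedes it."""
    sites = []
    for line in text.splitlines():
        m = NODE_RE.match(line)
        if m and sites:
            nid, nbytes, ncalls = (int(g) for g in m.groups())
            sites[-1].nodes[nid] = (nbytes, ncalls)
            continue
        m = TAG_RE.match(line)
        if m:
            sites.append(Callsite(m.group(3), m.group(4),
                                  int(m.group(1)), int(m.group(2))))
    return sites


def node_share(site):
    """Fraction of a callsite's bytes that live on each node."""
    total = sum(b for b, _ in site.nodes.values())
    if total == 0:
        return {}
    return {nid: b / total for nid, (b, _) in site.nodes.items()}


if __name__ == "__main__":
    for site in parse_allocinfo(SAMPLE):
        shares = ", ".join(f"nid{n}: {s:.1%}"
                           for n, s in sorted(node_share(site).items()))
        print(f"{site.location} {site.func}: {shares}")
```

A callsite whose bytes land mostly on one node is a candidate for a closer look at whether it should be passing an explicit node, as with kmalloc_node(). This only summarizes counters; it does not address the open question above of correlating perf samples with the owning allocation.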