From: Casey Chen
Date: Tue, 3 Jun 2025 17:55:03 -0700
Subject: Re: [PATCH 0/1] alloc_tag: add per-numa node stats
To: Suren Baghdasaryan
Cc: Kent Overstreet, linux-mm@kvack.org, yzhong@purestorage.com
References: <20250530003944.2929392-1-cachen@purestorage.com> <5iiwnofmnx565g3xv3zdt35b7qkuwylzedkidnav72t24asswj@omjgyjnauulg>

On Tue, Jun 3, 2025 at 8:01 AM Suren Baghdasaryan wrote:
>
> On Mon, Jun 2, 2025 at 2:32 PM Suren Baghdasaryan wrote:
> >
> > On Mon, Jun 2, 2025 at 1:48 PM Casey Chen wrote:
> > >
> > > On Fri, May 30, 2025 at 5:05 PM Kent Overstreet wrote:
> > > >
> > > > On Fri, May 30, 2025 at 02:45:57PM -0700, Casey Chen wrote:
> > > > > On Thu, May 29, 2025 at 6:11 PM Kent Overstreet wrote:
> > > > > >
> > > > > > On Thu, May 29, 2025 at 06:39:43PM -0600, Casey Chen wrote:
> > > > > > > The patch is based on 4aab42ee1e4e ("mm/zblock: make active_list rcu_list")
> > > > > > > from branch mm-new of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> > > > > > >
> > > > > > > The patch adds per-NUMA alloc_tag stats. Bytes/calls in total and per-NUMA
> > > > > > > nodes are displayed in a single row for each alloc_tag in /proc/allocinfo.
> > > > > > > Also percpu allocation is marked and its stats are stored on NUMA node 0.
> > > > > > > For example, the resulting file looks like below.
> > > > > > >
> > > > > > > percpu y total       8588    2147 numa0       8588    2147 numa1          0       0 kernel/irq/irqdesc.c:425 func:alloc_desc
> > > > > > > percpu n total     447232    1747 numa0     269568    1053 numa1     177664     694 lib/maple_tree.c:165 func:mt_alloc_bulk
> > > > > > > percpu n total      83200     325 numa0      30976     121 numa1      52224     204 lib/maple_tree.c:160 func:mt_alloc_one
> > > > > > > ...
> > > > > > > percpu n total     364800    5700 numa0     109440    1710 numa1     255360    3990 drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1410 [mlx5_core] func:mlx5_alloc_cmd_msg
> > > > > > > percpu n total    1249280   39040 numa0     374784   11712 numa1     874496   27328 drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1376 [mlx5_core] func:alloc_cmd_box
> > > > > >
> > > > > > Err, what is 'percpu y/n'?
> > > > > >
> > > > >
> > > > > Mark percpu allocation with 'percpu y/n' because for percpu allocation
> > > > > stats, 'bytes' is per-cpu, we have to multiply it by the number of
> > > > > CPUs to get the total bytes. Mark it so we know the exact amount of
> > > > > memory used. Any /proc/allocinfo parser can understand it and make
> > > > > correct calculations.
> > > >
> > > > Ok, just wanted to be sure it wasn't something else.
> > > > Let's shorten that though, a single character should suffice (we
> > > > already have a header that can explain what it is) - if you're
> > > > growing the width we don't want to overflow.
> > > >
> > >
> > > Does it have a header?
> >
> > Yes. See print_allocinfo_header().
>
> I was thinking if instead of changing /proc/allocinfo format to
> contain both total and per-node information we can keep it as is
> (containing only totals) while exposing per-node information inside
> new /sys/devices/system/node/node/allocinfo files. That seems
> cleaner to me.
>

The output of /sys/devices/system/node/node/allocinfo is strictly
limited to a single PAGE_SIZE, so it cannot display stats for all tags.

> I'm also not a fan of "percpu y" tags as that requires the reader to
> know how many CPUs were in the system to make the calculation (you
> might get the allocinfo content from a system you have no access to
> and no additional information). Maybe we can have "per-cpu bytes" and
> "total bytes" columns instead? For per-cpu allocations these will be
> different, for all other allocations these two columns will contain
> the same number.

I plan to remove 'percpu y/n' from this patch and implement it later.

> > > > >
> > > > > >
> > > > > > >
> > > > > > > To save memory, we dynamically allocate per-NUMA node stats counters once the
> > > > > > > system boots up and knows how many NUMA nodes are available. percpu allocators
> > > > > > > are used for memory allocation, hence the increase of PERCPU_DYNAMIC_RESERVE.
> > > > > > >
> > > > > > > For in-kernel alloc_tags, pcpu_alloc_noprof() is called so the memory for
> > > > > > > these counters is not accounted in profiling stats.
> > > > > > >
> > > > > > > For loadable modules, __alloc_percpu_gfp() is called and memory is accounted.
> > > > > >
> > > > > > Intriguing, but I'd make it a kconfig option, AFAIK this would mainly be
> > > > > > of interest to people looking at optimizing allocations to make sure
> > > > > > they're on the right numa node?
> > > > >
> > > > > Yes, to help us know if there is a NUMA imbalance issue and make some
> > > > > optimizations. I can make it a kconfig. Does anybody else have any
> > > > > opinion about this feature? Thanks!
> > > >
> > > > I would like to see some other opinions from potential users, have you
> > > > been circulating it?
> > >
> > > We have been using it internally for a while. I don't know who the
> > > potential users are and how to reach them so I am sharing it here to
> > > collect opinions from others.
> >
> > Should definitely have a separate Kconfig option. Have you measured
> > the memory and performance overhead of this change?
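
To make the 'percpu y/n' discussion above concrete, here is a minimal sketch of how a userspace tool might parse one row of the format quoted from the cover letter and compute total bytes. It assumes the row layout is exactly as in the quoted sample (percpu flag, "total" pair, then "numaN" pairs, then the tag location); that layout is only a proposal in this thread and may change. The names parse_row, total_bytes, TagStats and the nr_cpus argument are hypothetical and not part of any kernel interface.

```python
import re
from typing import Dict, NamedTuple, Tuple


class TagStats(NamedTuple):
    percpu: bool                           # True when the row is marked "percpu y"
    total: Tuple[int, int]                 # (bytes, calls) from the "total" column
    per_node: Dict[int, Tuple[int, int]]   # node id -> (bytes, calls)
    location: str                          # file:line [module] func:name, verbatim


# Matches the per-node column label used in the quoted sample, e.g. "numa0".
NODE_RE = re.compile(r"numa(\d+)")


def parse_row(line: str) -> TagStats:
    """Parse one row of the format shown in the quoted cover letter."""
    tok = line.split()
    if len(tok) < 5 or tok[0] != "percpu" or tok[2] != "total":
        raise ValueError(f"unrecognized row: {line!r}")
    percpu = tok[1] == "y"
    total = (int(tok[3]), int(tok[4]))

    per_node: Dict[int, Tuple[int, int]] = {}
    i = 5
    # Consume "numaN <bytes> <calls>" triples until the location starts.
    while i + 2 < len(tok):
        m = NODE_RE.fullmatch(tok[i])
        if m is None:
            break
        per_node[int(m.group(1))] = (int(tok[i + 1]), int(tok[i + 2]))
        i += 3
    return TagStats(percpu, total, per_node, " ".join(tok[i:]))


def total_bytes(stats: TagStats, nr_cpus: int) -> int:
    """Bytes actually used, following the explanation in the thread:
    for percpu rows the reported bytes are per CPU, so scale by nr_cpus.
    nr_cpus must be supplied by the caller; the row does not carry it."""
    nbytes = stats.total[0]
    return nbytes * nr_cpus if stats.percpu else nbytes


if __name__ == "__main__":
    sample = ("percpu y total 8588 2147 numa0 8588 2147 numa1 0 0 "
              "kernel/irq/irqdesc.c:425 func:alloc_desc")
    row = parse_row(sample)
    print(row.location, total_bytes(row, nr_cpus=4))
```

The nr_cpus parameter illustrates the objection raised above: a reader who only has the file contents cannot compute the total for a percpu row without knowing the CPU count of the originating system. A real tool would also need to skip any header lines before calling parse_row.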