From: Suren Baghdasaryan
Date: Wed, 4 Jun 2025 08:21:48 -0700
Subject: Re: [PATCH 0/1] alloc_tag: add per-numa node stats
To: Casey Chen
Cc: Kent Overstreet, linux-mm@kvack.org, yzhong@purestorage.com
References: <20250530003944.2929392-1-cachen@purestorage.com> <5iiwnofmnx565g3xv3zdt35b7qkuwylzedkidnav72t24asswj@omjgyjnauulg>

On Tue, Jun 3, 2025 at 5:55 PM Casey Chen wrote:
>
> On Tue, Jun 3, 2025 at 8:01 AM Suren Baghdasaryan wrote:
> >
> > On Mon, Jun 2, 2025 at 2:32 PM Suren Baghdasaryan wrote:
> > >
> > > On Mon, Jun 2, 2025 at 1:48 PM Casey Chen wrote:
> > > >
> > > > On Fri, May 30, 2025 at 5:05 PM Kent Overstreet wrote:
> > > > >
> > > > > On Fri, May 30, 2025 at 02:45:57PM -0700, Casey Chen wrote:
> > > > > > On Thu, May 29, 2025 at 6:11 PM Kent Overstreet wrote:
> > > > > > >
> > > > > > > On Thu, May 29, 2025 at 06:39:43PM -0600, Casey Chen wrote:
> > > > > > > > The patch is based on 4aab42ee1e4e ("mm/zblock: make active_list rcu_list")
> > > > > > > > from branch mm-new of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> > > > > > > >
> > > > > > > > The patch adds per-NUMA alloc_tag stats. Bytes/calls in total and per-NUMA
> > > > > > > > nodes are displayed in a single row for each alloc_tag in /proc/allocinfo.
> > > > > > > > Also percpu allocation is marked and its stats are stored on NUMA node 0.
> > > > > > > > For example, the resulting file looks like below.
> > > > > > > >
> > > > > > > > percpu y total    8588   2147 numa0   8588   2147 numa1      0      0 kernel/irq/irqdesc.c:425 func:alloc_desc
> > > > > > > > percpu n total  447232   1747 numa0 269568   1053 numa1 177664    694 lib/maple_tree.c:165 func:mt_alloc_bulk
> > > > > > > > percpu n total   83200    325 numa0  30976    121 numa1  52224    204 lib/maple_tree.c:160 func:mt_alloc_one
> > > > > > > > ...
> > > > > > > > percpu n total  364800   5700 numa0 109440   1710 numa1 255360   3990 drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1410 [mlx5_core] func:mlx5_alloc_cmd_msg
> > > > > > > > percpu n total 1249280  39040 numa0 374784  11712 numa1 874496  27328 drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1376 [mlx5_core] func:alloc_cmd_box
> > > > > > >
> > > > > > > Err, what is 'percpu y/n'?
> > > > > > >
> > > > > >
> > > > > > Mark percpu allocation with 'percpu y/n' because for percpu allocation
> > > > > > stats, 'bytes' is per-cpu, we have to multiply it by the number of
> > > > > > CPUs to get the total bytes. Mark it so we know the exact amount of
> > > > > > memory used. Any /proc/allocinfo parser can understand it and make
> > > > > > correct calculations.
> > > > >
> > > > > Ok, just wanted to be sure it wasn't something else. Let's shorten that
> > > > > though, a single character should suffice (we already have a header that
> > > > > can explain what it is) - if you're growing the width we don't want to
> > > > > overflow.
> > > > >
> > > >
> > > > Does it have a header?
> > >
> > > Yes. See print_allocinfo_header().
> >
> > I was thinking if instead of changing /proc/allocinfo format to
> > contain both total and per-node information we can keep it as is
> > (containing only totals) while exposing per-node information inside
> > new /sys/devices/system/node/node/allocinfo files. That seems
> > cleaner to me.
> >
>
> The output of /sys/devices/system/node/node/allocinfo is
> strictly limited to a single PAGE_SIZE and it cannot display stats for
> all tags.

Ugh, that's a pity. Another option would be to add a "nid" column like
this when this config is specified:

nid     bytes    calls
0        8588     2147 kernel/irq/irqdesc.c:425 func:alloc_desc
1           0        0 kernel/irq/irqdesc.c:425 func:alloc_desc
...

It bloats the file size but looks more structured to me.

>
> > I'm also not a fan of "percpu y" tags as that requires the reader to
> > know how many CPUs were in the system to make the calculation (you
> > might get the allocinfo content from a system you have no access to
> > and no additional information). Maybe we can have "per-cpu bytes" and
> > "total bytes" columns instead? For per-cpu allocations these will be
> > different, for all other allocations these two columns will contain
> > the same number.
>
> I plan to remove 'percpu y/n' from this patch and implement it later.
>
> > > > > > > >
> > > > > > > > To save memory, we dynamically allocate per-NUMA node stats counter once the
> > > > > > > > system boots up and knows how many NUMA nodes are available. percpu allocators
> > > > > > > > are used for memory allocation hence increase PERCPU_DYNAMIC_RESERVE.
> > > > > > > >
> > > > > > > > For in-kernel alloc_tags, pcpu_alloc_noprof() is called so the memory for
> > > > > > > > these counters is not accounted in profiling stats.
> > > > > > > >
> > > > > > > > For loadable modules, __alloc_percpu_gfp() is called and memory is accounted.
> > > > > > >
> > > > > > > Intriguing, but I'd make it a kconfig option, AFAIK this would mainly be
> > > > > > > of interest to people looking at optimizing allocations to make sure
> > > > > > > they're on the right numa node?
> > > > > >
> > > > > > Yes, to help us know if there is a NUMA imbalance issue and make some
> > > > > > optimizations. I can make it a kconfig. Does anybody else have any
> > > > > > opinion about this feature? Thanks!
> > > > >
> > > > > I would like to see some other opinions from potential users, have you
> > > > > been circulating it?
> > > >
> > > > We have been using it internally for a while. I don't know who the
> > > > potential users are and how to reach them so I am sharing it here to
> > > > collect opinions from others.
> > >
> > > Should definitely have a separate Kconfig option. Have you measured
> > > the memory and performance overhead of this change?
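
To illustrate how a userspace tool might consume the "nid" column layout
suggested above, here is a minimal sketch. It assumes the layout exactly as
written in this message (a "nid bytes calls" header followed by one row per
node and tag); that layout is only a proposal in this thread, not an
existing /proc/allocinfo format, and the names SAMPLE, parse_per_node and
totals are made up for the example. The maple_tree rows reuse the per-node
numbers from the example output quoted earlier.

```python
from collections import defaultdict

# Sample text in the *proposed* layout from this thread (an assumption, not
# an existing kernel format): one row per (node, tag), with the columns
# "nid bytes calls <tag>".
SAMPLE = """\
nid     bytes    calls
0        8588     2147 kernel/irq/irqdesc.c:425 func:alloc_desc
1           0        0 kernel/irq/irqdesc.c:425 func:alloc_desc
0      269568     1053 lib/maple_tree.c:165 func:mt_alloc_bulk
1      177664      694 lib/maple_tree.c:165 func:mt_alloc_bulk
"""


def parse_per_node(text):
    """Return {tag: {nid: (bytes, calls)}} parsed from the hypothetical layout."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        # Split into at most four fields so the tag keeps its inner spaces.
        fields = line.split(None, 3)
        # Skip the header row, blank lines, and anything malformed.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        nid, nbytes, calls, tag = fields
        stats[tag][int(nid)] = (int(nbytes), int(calls))
    return dict(stats)


def totals(stats):
    """Sum bytes and calls across all nodes for each tag."""
    return {
        tag: (
            sum(b for b, _ in per_node.values()),
            sum(c for _, c in per_node.values()),
        )
        for tag, per_node in stats.items()
    }


if __name__ == "__main__":
    for tag, (nbytes, calls) in totals(parse_per_node(SAMPLE)).items():
        print(f"{nbytes:>10} {calls:>8} {tag}")
```

With one row per node, the per-tag totals that /proc/allocinfo reports today
can be recovered by summing over nodes, so a reader of the file would not
need a separate "total" column.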