From: Suren Baghdasaryan
Date: Tue, 3 Jun 2025 13:18:27 -0700
Subject: Re: [PATCH 0/1] alloc_tag: add per-numa node stats
To: Casey Chen
Cc: Kent Overstreet, linux-mm@kvack.org, yzhong@purestorage.com

On Tue, Jun 3, 2025 at 1:01 PM Casey Chen wrote:
>
> On Mon, Jun 2, 2025 at 2:32 PM Suren Baghdasaryan wrote:
> >
> > On Mon, Jun 2, 2025 at 1:48 PM Casey Chen wrote:
> > >
> > > On Fri, May 30, 2025 at 5:05 PM Kent Overstreet wrote:
> > > >
> > > > On Fri, May 30, 2025 at 02:45:57PM -0700, Casey Chen wrote:
> > > > > On Thu, May 29, 2025 at 6:11 PM Kent Overstreet wrote:
> > > > > >
> > > > > > On Thu, May 29, 2025 at 06:39:43PM -0600, Casey Chen wrote:
> > > > > > > The patch is based 4aab42ee1e4e ("mm/zblock: make active_list rcu_list")
> > > > > > > from branch mm-new of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> > > > > > >
> > > > > > > The patch adds per-NUMA alloc_tag stats. Bytes/calls in total and per-NUMA
> > > > > > > nodes are displayed in a single row for each alloc_tag in /proc/allocinfo.
> > > > > > > Also percpu allocation is marked and its stats is stored on NUMA node 0.
> > > > > > > For example, the resulting file looks like below.
> > > > > > >
> > > > > > > percpu y total    8588  2147 numa0   8588  2147 numa1      0     0 kernel/irq/irqdesc.c:425 func:alloc_desc
> > > > > > > percpu n total  447232  1747 numa0 269568  1053 numa1 177664   694 lib/maple_tree.c:165 func:mt_alloc_bulk
> > > > > > > percpu n total   83200   325 numa0  30976   121 numa1  52224   204 lib/maple_tree.c:160 func:mt_alloc_one
> > > > > > > ...
> > > > > > > percpu n total  364800  5700 numa0 109440  1710 numa1 255360  3990 drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1410 [mlx5_core] func:mlx5_alloc_cmd_msg
> > > > > > > percpu n total 1249280 39040 numa0 374784 11712 numa1 874496 27328 drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1376 [mlx5_core] func:alloc_cmd_box
> > > > > >
> > > > > > Err, what is 'percpu y/n'?
> > > > > >
> > > > >
> > > > > Mark percpu allocation with 'percpu y/n' because for percpu allocation
> > > > > stats, 'bytes' is per-cpu, we have to multiply it by the number of
> > > > > CPUs to get the total bytes. Mark it so we know the exact amount of
> > > > > memory used. Any /proc/allocinfo parser can understand it and make
> > > > > correct calculations.
> > > >
> > > > Ok, just wanted to be sure it wasn't something else. Let's shorten that
> > > > though, a single character should suffice (we already have a header that
> > > > can explain what it is) - if you're growing the width we don't want to
> > > > overflow.
> > > >
> > >
> > > Does it have a header ?
> >
> > Yes. See print_allocinfo_header().
> >
> > > > > > >
> > > > > > > To save memory, we dynamically allocate per-NUMA node stats counter once the
> > > > > > > system boots up and knows how many NUMA nodes available. percpu allocators
> > > > > > > are used for memory allocation hence increase PERCPU_DYNAMIC_RESERVE.
> > > > > > >
> > > > > > > For in-kernel alloc_tags, pcpu_alloc_noprof() is called so the memory for
> > > > > > > these counters are not accounted in profiling stats.
> > > > > > >
> > > > > > > For loadable modules, __alloc_percpu_gfp() is called and memory is accounted.
> > > > > >
> > > > > > Intruiging, but I'd make it a kconfig option, AFAIK this would mainly be
> > > > > > of interest to people looking at optimizing allocations to make sure
> > > > > > they're on the right numa node?
> > > > >
> > > > > Yes, to help us know if there is an NUMA imbalance issue and make some
> > > > > optimizations. I can make it a kconfig. Does anybody else have any
> > > > > opinion about this feature ? Thanks!
> > > >
> > > > I would like to see some other opinions from potential users, have you
> > > > been circulating it?
> > >
> > > We have been using it internally for a while. I don't know who the
> > > potential users are and how to reach them so I am sharing it here to
> > > collect opinions from others.
> >
> > Should definitely have a separate Kconfig option. Have you measured
> > the memory and performance overhead of this change?
>
> I can make it a Kconfig option. Let's say,
> CONFIG_MEM_ALLOC_PER_NUMA_STATS=y/n, which controls the number of
> counter per CPU.
> If CONFIG_MEM_ALLOC_PER_NUMA_STATS=y, num_counter_percpu =
> num_possible_nodes(), else num_counter_percpu = 1.
>
> There is some memory cost.
> Additional memory used = Number of
> additional NUMA nodes * Number of CPUs * Number of tags * Size of each
> counter
> For example, in one of my testbeds, additional memory used = 1 (two
> nodes in total) * 112 (number of CPUs) * 4540 (number of tags) * 16
> (size of counter), ~8MiB. This testbed has a total 755 GiB of memory.

Please add these numbers in the changelog when you post the next version.
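
For reference, the estimate quoted above can be reproduced with a small
userspace sketch. This is only an illustration of the formula from the
thread, not code from the patch: the function and parameter names are
made up here, and the counters-per-CPU rule simply mirrors the proposed
CONFIG_MEM_ALLOC_PER_NUMA_STATS behaviour (one counter per possible node
when enabled, a single counter otherwise).

```python
# Sketch of the overhead formula quoted in the thread. All names below are
# illustrative assumptions; none of them come from the patch itself.


def counters_per_cpu(per_numa_stats: bool, possible_nodes: int) -> int:
    """Counters kept per CPU for each tag: one per possible NUMA node when
    per-NUMA stats are enabled, otherwise a single counter."""
    return possible_nodes if per_numa_stats else 1


def additional_bytes(possible_nodes: int, cpus: int, tags: int,
                     counter_size: int, per_numa_stats: bool = True) -> int:
    """Extra memory compared with one counter per CPU per tag:
    additional nodes * CPUs * tags * size of each counter."""
    extra_counters = counters_per_cpu(per_numa_stats, possible_nodes) - 1
    return extra_counters * cpus * tags * counter_size


if __name__ == "__main__":
    # Testbed figures quoted above: 2 nodes, 112 CPUs, 4540 tags, 16-byte counters.
    extra = additional_bytes(possible_nodes=2, cpus=112, tags=4540, counter_size=16)
    print(f"{extra} bytes = {extra / 2**20:.2f} MiB")  # 8135680 bytes = 7.76 MiB
```

With the testbed figures this gives 8135680 bytes, consistent with the
~8MiB quoted above. With the option disabled the extra cost is zero.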