From: Casey Chen <cachen@purestorage.com>
To: linux-mm@kvack.org, surenb@google.com, kent.overstreet@linux.dev
Cc: yzhong@purestorage.com, cachen@purestorage.com
Subject: [PATCH] alloc_tag: add per-numa node stats
Date: Thu, 29 May 2025 18:39:44 -0600
Message-Id: <20250530003944.2929392-2-cachen@purestorage.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250530003944.2929392-1-cachen@purestorage.com>
References: <20250530003944.2929392-1-cachen@purestorage.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Add per-NUMA-node stats to each alloc_tag. Previously there was a single
alloc_tag_counters per CPU; now each CPU has one counter per NUMA node. The
total bytes/calls and the per-node bytes/calls are displayed together on a
single row for each alloc_tag in /proc/allocinfo.

Note that per-NUMA stats do not make sense for percpu allocations: the NUMA
nodes backing a per-CPU area vary, each CPU usually getting its copy from its
local node, and there is no way to attribute those stats to a node per CPU, so
all stats for a percpu allocation are stored under node 0. Also, the 'bytes'
field reports only the amount needed by a single CPU; to get the total,
multiply it by the number of possible CPUs (for example, a 64-byte percpu
allocation on a system with 8 possible CPUs occupies 8 * 64 = 512 bytes in
total, but is reported as 64). A new boolean 'percpu' field marks percpu
allocations in /proc/allocinfo.

To minimize memory usage, the alloc_tag stats counters are dynamically
allocated with the percpu allocator. PERCPU_DYNAMIC_RESERVE is increased to
accommodate the counters for in-kernel alloc_tags. For in-kernel alloc_tags,
pcpu_alloc_noprof() is used to allocate the stats counters, so these
allocations are not themselves accounted in the profiling stats.
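
Example output, for illustration only: the call sites and numbers below are
hypothetical and assume a two-node machine; the column layout follows the
"percpu %c total ..." / "numa%d ..." format strings added to
alloc_tag_to_text() below. The second row shows a percpu allocation, whose
stats are all kept under node 0:

  percpu n total        73728       18 numa0        40960       10 numa1        32768        8 mm/filemap.c:963 func:__filemap_get_folio
  percpu y total         2048        4 numa0         2048        4 numa1            0        0 mm/memcontrol.c:5123 func:mem_cgroup_alloc

__show_mem() prints the same per-node columns in its allocation report.
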
Signed-off-by: Casey Chen <cachen@purestorage.com>
Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com>
---
 include/linux/alloc_tag.h | 49 ++++++++++++++++++++++++++++-----------
 include/linux/codetag.h   |  4 ++++
 include/linux/percpu.h    |  2 +-
 lib/alloc_tag.c           | 43 ++++++++++++++++++++++++++++------
 mm/page_alloc.c           | 35 ++++++++++++++--------------
 mm/percpu.c               |  8 +++++--
 mm/show_mem.c             | 20 ++++++++++++----
 mm/slub.c                 | 11 ++++++---
 8 files changed, 123 insertions(+), 49 deletions(-)

diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h
index 8f7931eb7d16..99d4a1823e51 100644
--- a/include/linux/alloc_tag.h
+++ b/include/linux/alloc_tag.h
@@ -15,6 +15,8 @@
 #include
 #include
 
+extern int num_numa_nodes;
+
 struct alloc_tag_counters {
 	u64 bytes;
 	u64 calls;
@@ -134,16 +136,34 @@ static inline bool mem_alloc_profiling_enabled(void)
 				   &mem_alloc_profiling_key);
 }
 
+static inline struct alloc_tag_counters alloc_tag_read_nid(struct alloc_tag *tag, int nid)
+{
+	struct alloc_tag_counters v = { 0, 0 };
+	struct alloc_tag_counters *counters;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		counters = per_cpu_ptr(tag->counters, cpu);
+		v.bytes += counters[nid].bytes;
+		v.calls += counters[nid].calls;
+	}
+
+	return v;
+}
+
 static inline struct alloc_tag_counters alloc_tag_read(struct alloc_tag *tag)
 {
 	struct alloc_tag_counters v = { 0, 0 };
-	struct alloc_tag_counters *counter;
+	struct alloc_tag_counters *counters;
 	int cpu;
+	int nid;
 
 	for_each_possible_cpu(cpu) {
-		counter = per_cpu_ptr(tag->counters, cpu);
-		v.bytes += counter->bytes;
-		v.calls += counter->calls;
+		counters = per_cpu_ptr(tag->counters, cpu);
+		for (nid = 0; nid < num_numa_nodes; nid++) {
+			v.bytes += counters[nid].bytes;
+			v.calls += counters[nid].calls;
+		}
 	}
 
 	return v;
@@ -179,7 +199,7 @@ static inline bool __alloc_tag_ref_set(union codetag_ref *ref, struct alloc_tag
 	return true;
 }
 
-static inline bool alloc_tag_ref_set(union codetag_ref *ref, struct alloc_tag *tag)
+static inline bool alloc_tag_ref_set(union codetag_ref *ref, struct alloc_tag *tag, int nid)
 {
 	if (unlikely(!__alloc_tag_ref_set(ref, tag)))
 		return false;
@@ -190,17 +210,18 @@ static inline bool alloc_tag_ref_set(union codetag_ref *ref, struct alloc_tag *t
 	 * Each new reference for every sub-allocation needs to increment call
 	 * counter because when we free each part the counter will be decremented.
 	 */
-	this_cpu_inc(tag->counters->calls);
+	this_cpu_inc(tag->counters[nid].calls);
 	return true;
 }
 
-static inline void alloc_tag_add(union codetag_ref *ref, struct alloc_tag *tag, size_t bytes)
+static inline void alloc_tag_add(union codetag_ref *ref, struct alloc_tag *tag,
+				 int nid, size_t bytes)
 {
-	if (likely(alloc_tag_ref_set(ref, tag)))
-		this_cpu_add(tag->counters->bytes, bytes);
+	if (likely(alloc_tag_ref_set(ref, tag, nid)))
+		this_cpu_add(tag->counters[nid].bytes, bytes);
 }
 
-static inline void alloc_tag_sub(union codetag_ref *ref, size_t bytes)
+static inline void alloc_tag_sub(union codetag_ref *ref, int nid, size_t bytes)
 {
 	struct alloc_tag *tag;
 
@@ -215,8 +236,8 @@ static inline void alloc_tag_sub(union codetag_ref *ref, size_t bytes)
 
 	tag = ct_to_alloc_tag(ref->ct);
 
-	this_cpu_sub(tag->counters->bytes, bytes);
-	this_cpu_dec(tag->counters->calls);
+	this_cpu_sub(tag->counters[nid].bytes, bytes);
+	this_cpu_dec(tag->counters[nid].calls);
 
 	ref->ct = NULL;
 }
@@ -228,8 +249,8 @@ static inline void alloc_tag_sub(union codetag_ref *ref, size_t bytes)
 #define DEFINE_ALLOC_TAG(_alloc_tag)
 static inline bool mem_alloc_profiling_enabled(void) { return false; }
 static inline void alloc_tag_add(union codetag_ref *ref, struct alloc_tag *tag,
-				 size_t bytes) {}
-static inline void alloc_tag_sub(union codetag_ref *ref, size_t bytes) {}
+				 int nid, size_t bytes) {}
+static inline void alloc_tag_sub(union codetag_ref *ref, int nid, size_t bytes) {}
 #define alloc_tag_record(p) do {} while (0)
 
 #endif /* CONFIG_MEM_ALLOC_PROFILING */
diff --git a/include/linux/codetag.h b/include/linux/codetag.h
index 5f2b9a1f722c..79d6b96c61f6 100644
--- a/include/linux/codetag.h
+++ b/include/linux/codetag.h
@@ -16,6 +16,10 @@ struct module;
 #define CODETAG_SECTION_START_PREFIX "__start_"
 #define CODETAG_SECTION_STOP_PREFIX "__stop_"
 
+enum codetag_flags {
+	CODETAG_PERCPU_ALLOC = (1 << 0),	/* codetag tracking percpu allocation */
+};
+
 /*
  * An instance of this structure is created in a special ELF section at every
  * code location being tagged. At runtime, the special section is treated as
diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 85bf8dd9f087..d92c27fbcd0d 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -43,7 +43,7 @@
 # define PERCPU_DYNAMIC_SIZE_SHIFT 12
 #endif /* LOCKDEP and PAGE_SIZE > 4KiB */
 #else
-#define PERCPU_DYNAMIC_SIZE_SHIFT 10
+#define PERCPU_DYNAMIC_SIZE_SHIFT 13
 #endif
 
 /*
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index d48b80f3f007..b4d2d5663c4c 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -42,6 +42,9 @@ struct allocinfo_private {
 	bool print_header;
 };
 
+int num_numa_nodes;
+static unsigned long pcpu_counters_size;
+
 static void *allocinfo_start(struct seq_file *m, loff_t *pos)
 {
 	struct allocinfo_private *priv;
@@ -95,9 +98,16 @@ static void alloc_tag_to_text(struct seq_buf *out, struct codetag *ct)
 {
 	struct alloc_tag *tag = ct_to_alloc_tag(ct);
 	struct alloc_tag_counters counter = alloc_tag_read(tag);
-	s64 bytes = counter.bytes;
+	int nid;
+
+	seq_buf_printf(out, "percpu %c total %12lli %8llu ",
+		       ct->flags & CODETAG_PERCPU_ALLOC ? 'y' : 'n',
+		       counter.bytes, counter.calls);
+	for (nid = 0; nid < num_numa_nodes; nid++) {
+		counter = alloc_tag_read_nid(tag, nid);
+		seq_buf_printf(out, "numa%d %12lli %8llu ", nid, counter.bytes, counter.calls);
+	}
 
-	seq_buf_printf(out, "%12lli %8llu ", bytes, counter.calls);
 	codetag_to_text(out, ct);
 	seq_buf_putc(out, ' ');
 	seq_buf_putc(out, '\n');
@@ -184,7 +194,7 @@ void pgalloc_tag_split(struct folio *folio, int old_order, int new_order)
 
 		if (get_page_tag_ref(folio_page(folio, i), &ref, &handle)) {
 			/* Set new reference to point to the original tag */
-			alloc_tag_ref_set(&ref, tag);
+			alloc_tag_ref_set(&ref, tag, folio_nid(folio));
 			update_page_tag_ref(handle, &ref);
 			put_page_tag_ref(handle);
 		}
@@ -247,19 +257,36 @@ static void shutdown_mem_profiling(bool remove_file)
 void __init alloc_tag_sec_init(void)
 {
 	struct alloc_tag *last_codetag;
+	int i;
 
 	if (!mem_profiling_support)
 		return;
 
-	if (!static_key_enabled(&mem_profiling_compressed))
-		return;
-
 	kernel_tags.first_tag = (struct alloc_tag *)kallsyms_lookup_name(
 					SECTION_START(ALLOC_TAG_SECTION_NAME));
 	last_codetag = (struct alloc_tag *)kallsyms_lookup_name(
 					SECTION_STOP(ALLOC_TAG_SECTION_NAME));
 	kernel_tags.count = last_codetag - kernel_tags.first_tag;
 
+	num_numa_nodes = num_possible_nodes();
+	pcpu_counters_size = num_numa_nodes * sizeof(struct alloc_tag_counters);
+	for (i = 0; i < kernel_tags.count; i++) {
+		/* Each CPU has one counter per numa node */
+		kernel_tags.first_tag[i].counters =
+			pcpu_alloc_noprof(pcpu_counters_size,
+					  sizeof(struct alloc_tag_counters),
+					  false, GFP_KERNEL | __GFP_ZERO);
+		if (!kernel_tags.first_tag[i].counters) {
+			while (--i >= 0)
+				free_percpu(kernel_tags.first_tag[i].counters);
+			pr_info("Failed to allocate per-cpu alloc_tag counters\n");
+			return;
+		}
+	}
+
+	if (!static_key_enabled(&mem_profiling_compressed))
+		return;
+
 	/* Check if kernel tags fit into page flags */
 	if (kernel_tags.count > (1UL << NR_UNUSED_PAGEFLAG_BITS)) {
 		shutdown_mem_profiling(false); /* allocinfo file does not exist yet */
@@ -622,7 +649,9 @@ static int load_module(struct module *mod, struct codetag *start, struct codetag
 	stop_tag = ct_to_alloc_tag(stop);
 	for (tag = start_tag; tag < stop_tag; tag++) {
 		WARN_ON(tag->counters);
-		tag->counters = alloc_percpu(struct alloc_tag_counters);
+		tag->counters = __alloc_percpu_gfp(pcpu_counters_size,
+						   sizeof(struct alloc_tag_counters),
+						   GFP_KERNEL | __GFP_ZERO);
 		if (!tag->counters) {
 			while (--tag >= start_tag) {
 				free_percpu(tag->counters);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 90b06f3d004c..8219d8de6f97 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1107,58 +1107,59 @@ void __clear_page_tag_ref(struct page *page)
 
 /* Should be called only if mem_alloc_profiling_enabled() */
 static noinline
 void __pgalloc_tag_add(struct page *page, struct task_struct *task,
-		       unsigned int nr)
+		       int nid, unsigned int nr)
 {
 	union pgtag_ref_handle handle;
 	union codetag_ref ref;
 
 	if (get_page_tag_ref(page, &ref, &handle)) {
-		alloc_tag_add(&ref, task->alloc_tag, PAGE_SIZE * nr);
+		alloc_tag_add(&ref, task->alloc_tag, nid, PAGE_SIZE * nr);
 		update_page_tag_ref(handle, &ref);
 		put_page_tag_ref(handle);
 	}
 }
 
 static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
-				   unsigned int nr)
+				   int nid, unsigned int nr)
 {
 	if (mem_alloc_profiling_enabled())
-		__pgalloc_tag_add(page, task, nr);
+		__pgalloc_tag_add(page, task, nid, nr);
 }
 
 /* Should be called only if mem_alloc_profiling_enabled() */
 static noinline
-void __pgalloc_tag_sub(struct page *page, unsigned int nr)
+void __pgalloc_tag_sub(struct page *page, int nid, unsigned int nr)
 {
 	union pgtag_ref_handle handle;
 	union codetag_ref ref;
 
 	if (get_page_tag_ref(page, &ref, &handle)) {
-		alloc_tag_sub(&ref, PAGE_SIZE * nr);
+		alloc_tag_sub(&ref, nid, PAGE_SIZE * nr);
 		update_page_tag_ref(handle, &ref);
 		put_page_tag_ref(handle);
 	}
 }
 
-static inline void pgalloc_tag_sub(struct page *page, unsigned int nr)
+static inline void pgalloc_tag_sub(struct page *page, int nid, unsigned int nr)
 {
 	if (mem_alloc_profiling_enabled())
-		__pgalloc_tag_sub(page, nr);
+		__pgalloc_tag_sub(page, nid, nr);
 }
 
 /* When tag is not NULL, assuming mem_alloc_profiling_enabled */
-static inline void pgalloc_tag_sub_pages(struct alloc_tag *tag, unsigned int nr)
+static inline void pgalloc_tag_sub_pages(struct alloc_tag *tag,
+					 int nid, unsigned int nr)
 {
 	if (tag)
-		this_cpu_sub(tag->counters->bytes, PAGE_SIZE * nr);
+		this_cpu_sub(tag->counters[nid].bytes, PAGE_SIZE * nr);
 }
 
 #else /* CONFIG_MEM_ALLOC_PROFILING */
 
 static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
-				   unsigned int nr) {}
-static inline void pgalloc_tag_sub(struct page *page, unsigned int nr) {}
-static inline void pgalloc_tag_sub_pages(struct alloc_tag *tag, unsigned int nr) {}
+				   int nid, unsigned int nr) {}
+static inline void pgalloc_tag_sub(struct page *page, int nid, unsigned int nr) {}
+static inline void pgalloc_tag_sub_pages(struct alloc_tag *tag, int nid, unsigned int nr) {}
 
 #endif /* CONFIG_MEM_ALLOC_PROFILING */
@@ -1197,7 +1198,7 @@ __always_inline bool free_pages_prepare(struct page *page,
 		/* Do not let hwpoison pages hit pcplists/buddy */
 		reset_page_owner(page, order);
 		page_table_check_free(page, order);
-		pgalloc_tag_sub(page, 1 << order);
+		pgalloc_tag_sub(page, page_to_nid(page), 1 << order);
 
 		/*
 		 * The page is isolated and accounted for.
@@ -1251,7 +1252,7 @@ __always_inline bool free_pages_prepare(struct page *page,
 	page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 	reset_page_owner(page, order);
 	page_table_check_free(page, order);
-	pgalloc_tag_sub(page, 1 << order);
+	pgalloc_tag_sub(page, page_to_nid(page), 1 << order);
 
 	if (!PageHighMem(page)) {
 		debug_check_no_locks_freed(page_address(page),
@@ -1707,7 +1708,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 
 	set_page_owner(page, order, gfp_flags);
 	page_table_check_alloc(page, order);
-	pgalloc_tag_add(page, current, 1 << order);
+	pgalloc_tag_add(page, current, page_to_nid(page), 1 << order);
 }
 
 static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
@@ -5064,7 +5065,7 @@ static void ___free_pages(struct page *page, unsigned int order,
 	if (put_page_testzero(page))
 		__free_frozen_pages(page, order, fpi_flags);
 	else if (!head) {
-		pgalloc_tag_sub_pages(tag, (1 << order) - 1);
+		pgalloc_tag_sub_pages(tag, page_to_nid(page), (1 << order) - 1);
 		while (order-- > 0)
 			__free_frozen_pages(page + (1 << order), order,
 					    fpi_flags);
diff --git a/mm/percpu.c b/mm/percpu.c
index b35494c8ede2..130450e9718e 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1691,15 +1691,19 @@ static void pcpu_alloc_tag_alloc_hook(struct pcpu_chunk *chunk, int off,
 				      size_t size)
 {
 	if (mem_alloc_profiling_enabled() && likely(chunk->obj_exts)) {
+		/* For percpu allocation, store all alloc_tag stats on numa node 0 */
 		alloc_tag_add(&chunk->obj_exts[off >> PCPU_MIN_ALLOC_SHIFT].tag,
-			      current->alloc_tag, size);
+			      current->alloc_tag, 0, size);
+		if (current->alloc_tag)
+			current->alloc_tag->ct.flags |= CODETAG_PERCPU_ALLOC;
 	}
 }
 
 static void pcpu_alloc_tag_free_hook(struct pcpu_chunk *chunk, int off,
 				     size_t size)
 {
+	/* percpu alloc_tag stats is stored on numa node 0 so subtract from node 0 */
 	if (mem_alloc_profiling_enabled() && likely(chunk->obj_exts))
-		alloc_tag_sub(&chunk->obj_exts[off >> PCPU_MIN_ALLOC_SHIFT].tag, size);
+		alloc_tag_sub(&chunk->obj_exts[off >> PCPU_MIN_ALLOC_SHIFT].tag, 0, size);
 }
 
 #else
 static void pcpu_alloc_tag_alloc_hook(struct pcpu_chunk *chunk, int off,
diff --git a/mm/show_mem.c b/mm/show_mem.c
index 03e8d968fd1a..132b3aa82d83 100644
--- a/mm/show_mem.c
+++ b/mm/show_mem.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2008 Johannes Weiner
  */
 
+#include
 #include
 #include
 #include
@@ -433,18 +434,27 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
 			struct alloc_tag *tag = ct_to_alloc_tag(ct);
 			struct alloc_tag_counters counter = alloc_tag_read(tag);
 			char bytes[10];
+			int nid;
 
 			string_get_size(counter.bytes, 1, STRING_UNITS_2,
 					bytes, sizeof(bytes));
+			pr_notice("percpu %c total %12s %8llu ",
+				  ct->flags & CODETAG_PERCPU_ALLOC ? 'y' : 'n',
+				  bytes, counter.calls);
+
+			for (nid = 0; nid < num_numa_nodes; nid++) {
+				counter = alloc_tag_read_nid(tag, nid);
+				string_get_size(counter.bytes, 1, STRING_UNITS_2,
+						bytes, sizeof(bytes));
+				pr_notice("numa%d %12s %8llu ", nid, bytes, counter.calls);
+			}
 
 			/* Same as alloc_tag_to_text() but w/o intermediate buffer */
 			if (ct->modname)
-				pr_notice("%12s %8llu %s:%u [%s] func:%s\n",
-					  bytes, counter.calls, ct->filename,
+				pr_notice("%s:%u [%s] func:%s\n", ct->filename,
 					  ct->lineno, ct->modname, ct->function);
 			else
-				pr_notice("%12s %8llu %s:%u func:%s\n",
-					  bytes, counter.calls, ct->filename,
-					  ct->lineno, ct->function);
+				pr_notice("%s:%u func:%s\n",
+					  ct->filename, ct->lineno, ct->function);
 		}
 	}
 }
diff --git a/mm/slub.c b/mm/slub.c
index be8b09e09d30..068b88b85d80 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2104,8 +2104,12 @@ __alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
 	 * If other users appear then mem_alloc_profiling_enabled()
 	 * check should be added before alloc_tag_add().
 	 */
-	if (likely(obj_exts))
-		alloc_tag_add(&obj_exts->ref, current->alloc_tag, s->size);
+	if (likely(obj_exts)) {
+		struct page *page = virt_to_page(object);
+
+		alloc_tag_add(&obj_exts->ref, current->alloc_tag,
+			      page_to_nid(page), s->size);
+	}
 }
 
 static inline void
@@ -2133,8 +2137,9 @@ __alloc_tagging_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p
 
 	for (i = 0; i < objects; i++) {
 		unsigned int off = obj_to_index(s, slab, p[i]);
+		struct page *page = virt_to_page(p[i]);
 
-		alloc_tag_sub(&obj_exts[off].ref, s->size);
+		alloc_tag_sub(&obj_exts[off].ref, page_to_nid(page), s->size);
 	}
 }
-- 
2.34.1