* [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
@ 2025-09-15 23:02 Suren Baghdasaryan
2025-09-15 23:05 ` Suren Baghdasaryan
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-15 23:02 UTC (permalink / raw)
To: akpm
Cc: kent.overstreet, vbabka, hannes, usamaarif642, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, surenb, linux-mm, linux-kernel
While rare, memory allocation profiling can contain inaccurate counters
if slab object extension vector allocation fails. That allocation might
succeed later but prior to that, slab allocations that would have used
that object extension vector will not be accounted for. To indicate
incorrect counters, an "accurate:no" marker is appended to the call site
line in the /proc/allocinfo output.
Bump up /proc/allocinfo version to reflect the change in the file format
and update documentation.
Example output with invalid counters:
allocinfo - version: 2.0
           0        0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
           0        0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
           0        0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
           0        0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
           0        0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
           0        0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
           0        0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
       49152       48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
       32768        1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
           0        0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Usama Arif <usamaarif642@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
---
Changes since v1[1]:
- Changed the marker from asterisk to accurate:no pair, per Andrew Morton
- Documented /proc/allocinfo v2 format
- Update the changelog
- Added Acked-by from v1 since the functionality is the same,
per Shakeel Butt, Usama Arif and Johannes Weiner
[1] https://lore.kernel.org/all/20250909234942.1104356-1-surenb@google.com/
Documentation/filesystems/proc.rst | 4 ++++
include/linux/alloc_tag.h | 12 ++++++++++++
include/linux/codetag.h | 5 ++++-
lib/alloc_tag.c | 4 +++-
mm/slub.c | 2 ++
5 files changed, 25 insertions(+), 2 deletions(-)
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 915a3e44bc12..1776a06571c2 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -1009,6 +1009,10 @@ number, module (if originates from a loadable module) and the function calling
the allocation. The number of bytes allocated and number of calls at each
location are reported. The first line indicates the version of the file, the
second line is the header listing fields in the file.
+If file version is 2.0 or higher then each line may contain additional
+<key>:<value> pairs representing extra information about the call site.
+For example if the counters are not accurate, the line will be appended with
+"accurate:no" pair.
Example output.
diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h
index 9ef2633e2c08..d40ac39bfbe8 100644
--- a/include/linux/alloc_tag.h
+++ b/include/linux/alloc_tag.h
@@ -221,6 +221,16 @@ static inline void alloc_tag_sub(union codetag_ref *ref, size_t bytes)
ref->ct = NULL;
}
+static inline void alloc_tag_set_inaccurate(struct alloc_tag *tag)
+{
+ tag->ct.flags |= CODETAG_FLAG_INACCURATE;
+}
+
+static inline bool alloc_tag_is_inaccurate(struct alloc_tag *tag)
+{
+ return !!(tag->ct.flags & CODETAG_FLAG_INACCURATE);
+}
+
#define alloc_tag_record(p) ((p) = current->alloc_tag)
#else /* CONFIG_MEM_ALLOC_PROFILING */
@@ -230,6 +240,8 @@ static inline bool mem_alloc_profiling_enabled(void) { return false; }
static inline void alloc_tag_add(union codetag_ref *ref, struct alloc_tag *tag,
size_t bytes) {}
static inline void alloc_tag_sub(union codetag_ref *ref, size_t bytes) {}
+static inline void alloc_tag_set_inaccurate(struct alloc_tag *tag) {}
+static inline bool alloc_tag_is_inaccurate(struct alloc_tag *tag) { return false; }
#define alloc_tag_record(p) do {} while (0)
#endif /* CONFIG_MEM_ALLOC_PROFILING */
diff --git a/include/linux/codetag.h b/include/linux/codetag.h
index 457ed8fd3214..8ea2a5f7c98a 100644
--- a/include/linux/codetag.h
+++ b/include/linux/codetag.h
@@ -16,13 +16,16 @@ struct module;
#define CODETAG_SECTION_START_PREFIX "__start_"
#define CODETAG_SECTION_STOP_PREFIX "__stop_"
+/* codetag flags */
+#define CODETAG_FLAG_INACCURATE (1 << 0)
+
/*
* An instance of this structure is created in a special ELF section at every
* code location being tagged. At runtime, the special section is treated as
* an array of these.
*/
struct codetag {
- unsigned int flags; /* used in later patches */
+ unsigned int flags;
unsigned int lineno;
const char *modname;
const char *function;
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index 79891528e7b6..12ff80bbbd22 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -80,7 +80,7 @@ static void allocinfo_stop(struct seq_file *m, void *arg)
static void print_allocinfo_header(struct seq_buf *buf)
{
/* Output format version, so we can change it. */
- seq_buf_printf(buf, "allocinfo - version: 1.0\n");
+ seq_buf_printf(buf, "allocinfo - version: 2.0\n");
seq_buf_printf(buf, "# <size> <calls> <tag info>\n");
}
@@ -92,6 +92,8 @@ static void alloc_tag_to_text(struct seq_buf *out, struct codetag *ct)
seq_buf_printf(out, "%12lli %8llu ", bytes, counter.calls);
codetag_to_text(out, ct);
+ if (unlikely(alloc_tag_is_inaccurate(tag)))
+ seq_buf_printf(out, " accurate:no");
seq_buf_putc(out, ' ');
seq_buf_putc(out, '\n');
}
diff --git a/mm/slub.c b/mm/slub.c
index af343ca570b5..9c04f29ee8de 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2143,6 +2143,8 @@ __alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
*/
if (likely(obj_exts))
alloc_tag_add(&obj_exts->ref, current->alloc_tag, s->size);
+ else
+ alloc_tag_set_inaccurate(current->alloc_tag);
}
static inline void
--
2.51.0.384.g4c02a37b29-goog
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-15 23:02 [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output Suren Baghdasaryan
@ 2025-09-15 23:05 ` Suren Baghdasaryan
2025-09-16 0:11 ` Andrew Morton
2025-09-16 12:57 ` Vlastimil Babka
2 siblings, 0 replies; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-15 23:05 UTC (permalink / raw)
To: akpm
Cc: kent.overstreet, vbabka, hannes, usamaarif642, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel
On Mon, Sep 15, 2025 at 4:02 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> While rare, memory allocation profiling can contain inaccurate counters
> if slab object extension vector allocation fails. That allocation might
> succeed later but prior to that, slab allocations that would have used
> that object extension vector will not be accounted for. To indicate
> incorrect counters, "accurate:no" marker is appended to the call site
> line in the /proc/allocinfo output.
> Bump up /proc/allocinfo version to reflect the change in the file format
> and update documentation.
>
> ...
>
> ---
> Changes since v1[1]:
> ...

Sorry, forgot the --base=auto when formatting the patch. It's based on mm-new.

> ...
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-15 23:02 [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output Suren Baghdasaryan
2025-09-15 23:05 ` Suren Baghdasaryan
@ 2025-09-16 0:11 ` Andrew Morton
2025-09-16 2:48 ` Suren Baghdasaryan
2025-09-16 12:57 ` Vlastimil Babka
2 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2025-09-16 0:11 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: kent.overstreet, vbabka, hannes, usamaarif642, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel
On Mon, 15 Sep 2025 16:02:24 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> While rare, memory allocation profiling can contain inaccurate counters
> if slab object extension vector allocation fails. That allocation might
> succeed later but prior to that, slab allocations that would have used
> that object extension vector will not be accounted for. To indicate
> incorrect counters, "accurate:no" marker is appended to the call site
> line in the /proc/allocinfo output.
> Bump up /proc/allocinfo version to reflect the change in the file format
> and update documentation.
>
> Example output with invalid counters:
> allocinfo - version: 2.0
>            0        0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
>            0        0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
>            0        0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
>            0        0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
>            0        0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
>            0        0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
>            0        0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
>        49152       48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
>        32768        1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
>            0        0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
>
> ...
>
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -1009,6 +1009,10 @@ number, module (if originates from a loadable module) and the function calling
> the allocation. The number of bytes allocated and number of calls at each
> location are reported. The first line indicates the version of the file, the
> second line is the header listing fields in the file.
> +If file version is 2.0 or higher then each line may contain additional
> +<key>:<value> pairs representing extra information about the call site.
> +For example if the counters are not accurate, the line will be appended with
> +"accurate:no" pair.

Perhaps we can tell people what accurate:no actually means. It is a
rather disturbing thing to see! How worried should our users be about
it?
^ permalink raw reply [flat|nested] 20+ messages in thread
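For readers who consume /proc/allocinfo from user space, the v2 format described in the documentation hunk above (size, calls, tag info, then optional <key>:<value> pairs) could be parsed roughly as follows. This is a minimal sketch and not part of the patch: struct allocinfo_entry and parse_allocinfo_line() are hypothetical names chosen for illustration, and the sketch assumes that "accurate:no" appears as a whitespace-separated token after the location field, as in the example output.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one /proc/allocinfo data line (illustration only). */
struct allocinfo_entry {
	long long bytes;		/* first column: bytes allocated */
	unsigned long long calls;	/* second column: number of calls */
	char location[256];		/* third column: "<file>:<line>" */
	bool accurate;			/* false when "accurate:no" is present */
};

/*
 * Parse one line. Returns 0 on success and -1 when the line is not a data
 * line, for example the "allocinfo - version: ..." line or the "#" header.
 */
static int parse_allocinfo_line(const char *line, struct allocinfo_entry *e)
{
	static const char marker[] = "accurate:no";
	const size_t marker_len = sizeof(marker) - 1;
	const char *p;
	int consumed = 0;

	if (sscanf(line, "%lld %llu %255s%n", &e->bytes, &e->calls,
		   e->location, &consumed) < 3)
		return -1;

	/* Counters are assumed accurate unless the marker says otherwise. */
	e->accurate = true;

	/* Walk the remaining whitespace-separated tokens. */
	p = line + consumed;
	while (*p) {
		size_t len;

		p += strspn(p, " \t\n");
		len = strcspn(p, " \t\n");
		if (len == marker_len && strncmp(p, marker, marker_len) == 0)
			e->accurate = false;
		/* Other tokens, including unknown pairs, are ignored. */
		p += len;
	}
	return 0;
}
```

Ignoring unknown tokens is a deliberate choice in this sketch: the documentation text says lines "may contain additional" pairs, so a tolerant reader keeps working if further keys are added later.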
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 0:11 ` Andrew Morton
@ 2025-09-16 2:48 ` Suren Baghdasaryan
2025-09-16 2:56 ` Andrew Morton
0 siblings, 1 reply; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-16 2:48 UTC (permalink / raw)
To: Andrew Morton
Cc: kent.overstreet, vbabka, hannes, usamaarif642, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel
On Mon, Sep 15, 2025 at 5:11 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 15 Sep 2025 16:02:24 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
>
> > ...
> >
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -1009,6 +1009,10 @@ number, module (if originates from a loadable module) and the function calling
> > the allocation. The number of bytes allocated and number of calls at each
> > location are reported. The first line indicates the version of the file, the
> > second line is the header listing fields in the file.
> > +If file version is 2.0 or higher then each line may contain additional
> > +<key>:<value> pairs representing extra information about the call site.
> > +For example if the counters are not accurate, the line will be appended with
> > +"accurate:no" pair.
>
> Perhaps we can tell people what accurate:no actually means. It is a
> rather disturbing thing to see! How worried should our users be about
> it?

Right. How about adding a section like this:

Supported markers in v2:
accurate:no
Absolute values of the counters in this line are not
accurate because of the failure to allocate storage required
to track some of the allocations made at this location.
Deltas in these counters are accurate, therefore counters
can be used to track allocation size and count changes.

If this looks good, could you fold it into the existing patch or
should I respin?
^ permalink raw reply [flat|nested] 20+ messages in thread
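The proposed wording says that deltas remain usable even when a line carries accurate:no. A small sketch of what that means for a monitoring tool: take two samples of the same call site and compare them, rather than reading either absolute value. The types and the allocinfo_diff() helper below are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/* Hypothetical sample of one call site's counters at a point in time. */
struct allocinfo_sample {
	long long bytes;
	unsigned long long calls;
};

/* Signed change between two samples of the same call site. */
struct allocinfo_delta {
	long long bytes;
	long long calls;
};

/* Compute after - before for both counters. */
static struct allocinfo_delta allocinfo_diff(struct allocinfo_sample before,
					     struct allocinfo_sample after)
{
	struct allocinfo_delta d = {
		.bytes = after.bytes - before.bytes,
		.calls = (long long)after.calls - (long long)before.calls,
	};

	return d;
}

/* Print a delta in a form that makes growth or shrinkage visible. */
static void allocinfo_print_delta(const char *where, struct allocinfo_delta d)
{
	printf("%+12lld %+8lld %s\n", d.bytes, d.calls, where);
}
```

With samples of (49152 bytes, 48 calls) and then (53248 bytes, 52 calls), the helper reports +4096 bytes and +4 calls, which is the kind of change tracking the proposed text refers to.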
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 2:48 ` Suren Baghdasaryan
@ 2025-09-16 2:56 ` Andrew Morton
2025-09-16 3:34 ` Suren Baghdasaryan
0 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2025-09-16 2:56 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: kent.overstreet, vbabka, hannes, usamaarif642, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel
On Mon, 15 Sep 2025 19:48:14 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> > Perhaps we can tell people what accurate:no actually means. It is a
> > rather disturbing thing to see! How worried should our users be about
> > it?
>
> Right. How about adding a section like this:
>
> Supported markers in v2:
> accurate:no
> Absolute values of the counters in this line are not
> accurate because of the failure to allocate storage required
> to track some of the allocations made at this location.
> Deltas in these counters are accurate, therefore counters
> can be used to track allocation size and count changes.
>
>
> If this looks good,

looks awesome ;)

> could you fold it into the existing patch or
> should I respin?

A little fixlet would be preferred (by me, at least).
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 2:56 ` Andrew Morton
@ 2025-09-16 3:34 ` Suren Baghdasaryan
2025-09-16 4:21 ` Andrew Morton
0 siblings, 1 reply; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-16 3:34 UTC (permalink / raw)
To: Andrew Morton
Cc: kent.overstreet, vbabka, hannes, usamaarif642, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel
On Mon, Sep 15, 2025 at 7:56 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 15 Sep 2025 19:48:14 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
>
> > > Perhaps we can tell people what accurate:no actually means. It is a
> > > rather disturbing thing to see! How worried should our users be about
> > > it?
> >
> > Right. How about adding a section like this:
> >
> > Supported markers in v2:
> > accurate:no
> > Absolute values of the counters in this line are not
> > accurate because of the failure to allocate storage required
> > to track some of the allocations made at this location.
> > Deltas in these counters are accurate, therefore counters
> > can be used to track allocation size and count changes.
> >
> >
> > If this looks good,
>
> looks awesome ;)
>
> > could you fold it into the existing patch or
> > should I respin?
>
> A little fixlet would be preferred (by me, at least).

Ok, should I post a fixup patch or you will do that in-place?
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 3:34 ` Suren Baghdasaryan
@ 2025-09-16 4:21 ` Andrew Morton
2025-09-16 4:39 ` Suren Baghdasaryan
0 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2025-09-16 4:21 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: kent.overstreet, vbabka, hannes, usamaarif642, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel
On Mon, 15 Sep 2025 20:34:33 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> > > could you fold it into the existing patch or
> > > should I respin?
> >
> > A little fixlet would be preferred (by me, at least).
>
> Ok, should I post a fixup patch or you will do that in-place?

I think the former, please. Your intent is preferable to my
interpretation of your intent and we get all the nice patch metadata to
track everything.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 4:21 ` Andrew Morton
@ 2025-09-16 4:39 ` Suren Baghdasaryan
2025-09-16 16:02 ` Suren Baghdasaryan
0 siblings, 1 reply; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-16 4:39 UTC (permalink / raw)
To: Andrew Morton
Cc: kent.overstreet, vbabka, hannes, usamaarif642, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel
On Mon, Sep 15, 2025 at 9:21 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 15 Sep 2025 20:34:33 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
>
> > > > could you fold it into the existing patch or
> > > > should I respin?
> > >
> > > A little fixlet would be preferred (by me, at least).
> >
> > Ok, should I post a fixup patch or you will do that in-place?
>
> I think the former, please. Your intent is preferable to my
> interpretation of your intent and we get all the nice patch metadata to
> track everything.

Will do first thing in the morning. Thanks!
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 4:39 ` Suren Baghdasaryan
@ 2025-09-16 16:02 ` Suren Baghdasaryan
0 siblings, 0 replies; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-16 16:02 UTC (permalink / raw)
To: Andrew Morton
Cc: kent.overstreet, vbabka, hannes, usamaarif642, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel
On Mon, Sep 15, 2025 at 9:39 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Mon, Sep 15, 2025 at 9:21 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Mon, 15 Sep 2025 20:34:33 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> >
> > > > > could you fold it into the existing patch or
> > > > > should I respin?
> > > >
> > > > A little fixlet would be preferred (by me, at least).
> > >
> > > Ok, should I post a fixup patch or you will do that in-place?
> >
> > I think the former, please. Your intent is preferable to my
> > interpretation of your intent and we get all the nice patch metadata to
> > track everything.
>
> Will do first thing in the morning. Thanks!

Posted at: https://lore.kernel.org/all/20250916160110.266190-1-surenb@google.com/
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output 2025-09-15 23:02 [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output Suren Baghdasaryan 2025-09-15 23:05 ` Suren Baghdasaryan 2025-09-16 0:11 ` Andrew Morton @ 2025-09-16 12:57 ` Vlastimil Babka 2025-09-16 15:51 ` Suren Baghdasaryan 2 siblings, 1 reply; 20+ messages in thread From: Vlastimil Babka @ 2025-09-16 12:57 UTC (permalink / raw) To: Suren Baghdasaryan, akpm Cc: kent.overstreet, hannes, usamaarif642, rientjes, roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason, pasha.tatashin, souravpanda, linux-mm, linux-kernel On 9/16/25 01:02, Suren Baghdasaryan wrote: > While rare, memory allocation profiling can contain inaccurate counters > if slab object extension vector allocation fails. That allocation might > succeed later but prior to that, slab allocations that would have used > that object extension vector will not be accounted for. To indicate > incorrect counters, "accurate:no" marker is appended to the call site > line in the /proc/allocinfo output. > Bump up /proc/allocinfo version to reflect the change in the file format > and update documentation. 
> > Example output with invalid counters: > allocinfo - version: 2.0 > 0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes > 0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add > 0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no > 0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set > 0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc > 0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale > 0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs > 49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no > 32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create > 0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device > > Suggested-by: Johannes Weiner <hannes@cmpxchg.org> > Signed-off-by: Suren Baghdasaryan <surenb@google.com> > Acked-by: Shakeel Butt <shakeel.butt@linux.dev> > Acked-by: Usama Arif <usamaarif642@gmail.com> > Acked-by: Johannes Weiner <hannes@cmpxchg.org> With this format you could instead print the accumulated size of allocations that could not allocate their objext (for the given tag). It should be then an upper bound of the actual error, because obviously we cannot recognize moments where these allocations are freed - so we don't know for which tag to decrement. Maybe it could be more useful output than the yes/no information, although of course require more storage in struct codetag, so I don't know if it's worth it. Maybe a global counter of sum size for all these missed objexts could be also maintained, and that wouldn't be an upper bound but an actual current error, that is if we can precisely determine that when freeing an object, we don't have a tag to decrement because objext allocation had failed on it and thus that allocation had incremented this global error counter and it's correct to decrement it. 
> ---
> Changes since v1[1]:
> - Changed the marker from asterisk to accurate:no pair, per Andrew Morton
> - Documented /proc/allocinfo v2 format
> - Update the changelog
> - Added Acked-by from v2 since the functionality is the same,
> per Shakeel Butt, Usama Arif and Johannes Weiner
>
> [1] https://lore.kernel.org/all/20250909234942.1104356-1-surenb@google.com/
>
>  Documentation/filesystems/proc.rst |  4 ++++
>  include/linux/alloc_tag.h          | 12 ++++++++++++
>  include/linux/codetag.h            |  5 ++++-
>  lib/alloc_tag.c                    |  4 +++-
>  mm/slub.c                          |  2 ++
>  5 files changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 915a3e44bc12..1776a06571c2 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -1009,6 +1009,10 @@ number, module (if originates from a loadable module) and the function calling
>  the allocation. The number of bytes allocated and number of calls at each
>  location are reported. The first line indicates the version of the file, the
>  second line is the header listing fields in the file.
> +If file version is 2.0 or higher then each line may contain additional
> +<key>:<value> pairs representing extra information about the call site.
> +For example if the counters are not accurate, the line will be appended with
> +"accurate:no" pair.
>
>  Example output.
>
> diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h
> index 9ef2633e2c08..d40ac39bfbe8 100644
> --- a/include/linux/alloc_tag.h
> +++ b/include/linux/alloc_tag.h
> @@ -221,6 +221,16 @@ static inline void alloc_tag_sub(union codetag_ref *ref, size_t bytes)
>  	ref->ct = NULL;
>  }
>
> +static inline void alloc_tag_set_inaccurate(struct alloc_tag *tag)
> +{
> +	tag->ct.flags |= CODETAG_FLAG_INACCURATE;
> +}
> +
> +static inline bool alloc_tag_is_inaccurate(struct alloc_tag *tag)
> +{
> +	return !!(tag->ct.flags & CODETAG_FLAG_INACCURATE);
> +}
> +
>  #define alloc_tag_record(p) ((p) = current->alloc_tag)
>
>  #else /* CONFIG_MEM_ALLOC_PROFILING */
> @@ -230,6 +240,8 @@ static inline bool mem_alloc_profiling_enabled(void) { return false; }
>  static inline void alloc_tag_add(union codetag_ref *ref, struct alloc_tag *tag,
>  				 size_t bytes) {}
>  static inline void alloc_tag_sub(union codetag_ref *ref, size_t bytes) {}
> +static inline void alloc_tag_set_inaccurate(struct alloc_tag *tag) {}
> +static inline bool alloc_tag_is_inaccurate(struct alloc_tag *tag) { return false; }
>  #define alloc_tag_record(p) do {} while (0)
>
>  #endif /* CONFIG_MEM_ALLOC_PROFILING */
> diff --git a/include/linux/codetag.h b/include/linux/codetag.h
> index 457ed8fd3214..8ea2a5f7c98a 100644
> --- a/include/linux/codetag.h
> +++ b/include/linux/codetag.h
> @@ -16,13 +16,16 @@ struct module;
>  #define CODETAG_SECTION_START_PREFIX "__start_"
>  #define CODETAG_SECTION_STOP_PREFIX "__stop_"
>
> +/* codetag flags */
> +#define CODETAG_FLAG_INACCURATE (1 << 0)
> +
>  /*
>   * An instance of this structure is created in a special ELF section at every
>   * code location being tagged. At runtime, the special section is treated as
>   * an array of these.
>   */
>  struct codetag {
> -	unsigned int flags; /* used in later patches */
> +	unsigned int flags;
>  	unsigned int lineno;
>  	const char *modname;
>  	const char *function;
> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> index 79891528e7b6..12ff80bbbd22 100644
> --- a/lib/alloc_tag.c
> +++ b/lib/alloc_tag.c
> @@ -80,7 +80,7 @@ static void allocinfo_stop(struct seq_file *m, void *arg)
>  static void print_allocinfo_header(struct seq_buf *buf)
>  {
>  	/* Output format version, so we can change it. */
> -	seq_buf_printf(buf, "allocinfo - version: 1.0\n");
> +	seq_buf_printf(buf, "allocinfo - version: 2.0\n");
>  	seq_buf_printf(buf, "# <size> <calls> <tag info>\n");
>  }
>
> @@ -92,6 +92,8 @@ static void alloc_tag_to_text(struct seq_buf *out, struct codetag *ct)
>
>  	seq_buf_printf(out, "%12lli %8llu ", bytes, counter.calls);
>  	codetag_to_text(out, ct);
> +	if (unlikely(alloc_tag_is_inaccurate(tag)))
> +		seq_buf_printf(out, " accurate:no");
>  	seq_buf_putc(out, ' ');
>  	seq_buf_putc(out, '\n');
>  }
> diff --git a/mm/slub.c b/mm/slub.c
> index af343ca570b5..9c04f29ee8de 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2143,6 +2143,8 @@ __alloc_tagging_slab_alloc_hook(struct kmem_cache *s, void *object, gfp_t flags)
>  	 */
>  	if (likely(obj_exts))
>  		alloc_tag_add(&obj_exts->ref, current->alloc_tag, s->size);
> +	else
> +		alloc_tag_set_inaccurate(current->alloc_tag);
>  }
>
>  static inline void
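
For readers who consume /proc/allocinfo from userspace, here is a minimal sketch of a parser for one data line in the format documented by the patch. It assumes data lines look like the example output in the commit message: size, call count, location, then optional tokens. The names `allocinfo_entry` and `parse_allocinfo_line` are hypothetical and are not part of the kernel or of any existing tool.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one /proc/allocinfo data line. */
struct allocinfo_entry {
	long long bytes;
	unsigned long long calls;
	char location[256];	/* e.g. "mm/slub.c:2143" */
	bool accurate;		/* false only when "accurate:no" is present */
};

/*
 * Parse one data line. Returns 0 on success, or -1 if the line does not
 * start with "<size> <calls> <location>" (for example the version line or
 * the "#" header line). Tokens other than "accurate:no" are skipped.
 */
static int parse_allocinfo_line(const char *line, struct allocinfo_entry *e)
{
	int off = 0;

	if (sscanf(line, "%lld %llu %255s%n", &e->bytes, &e->calls,
		   e->location, &off) != 3)
		return -1;

	e->accurate = true;

	const char *p = line + off;
	char tok[128];
	int n;

	/* Walk the remaining whitespace-separated tokens. */
	while (sscanf(p, " %127s%n", tok, &n) == 1) {
		if (strcmp(tok, "accurate:no") == 0)
			e->accurate = false;
		p += n;
	}
	return 0;
}
```

Skipping unrecognised tokens instead of rejecting the line is a design choice made for this sketch, so that further <key>:<value> pairs would not break it.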
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 12:57 ` Vlastimil Babka
@ 2025-09-16 15:51 ` Suren Baghdasaryan
2025-09-16 21:11 ` Usama Arif
0 siblings, 1 reply; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-16 15:51 UTC (permalink / raw)
To: Vlastimil Babka
Cc: akpm, kent.overstreet, hannes, usamaarif642, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel

On Tue, Sep 16, 2025 at 5:57 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 9/16/25 01:02, Suren Baghdasaryan wrote:
> > While rare, memory allocation profiling can contain inaccurate counters
> > if slab object extension vector allocation fails. That allocation might
> > succeed later but prior to that, slab allocations that would have used
> > that object extension vector will not be accounted for. To indicate
> > incorrect counters, "accurate:no" marker is appended to the call site
> > line in the /proc/allocinfo output.
> > Bump up /proc/allocinfo version to reflect the change in the file format
> > and update documentation.
> >
> > Example output with invalid counters:
> > allocinfo - version: 2.0
> > 0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
> > 0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
> > 0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
> > 0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
> > 0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
> > 0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
> > 0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
> > 49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
> > 32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
> > 0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
> >
> > Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> > Acked-by: Usama Arif <usamaarif642@gmail.com>
> > Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>
> With this format you could instead print the accumulated size of allocations
> that could not allocate their objext (for the given tag). It should be then
> an upper bound of the actual error, because obviously we cannot recognize
> moments where these allocations are freed - so we don't know for which tag
> to decrement. Maybe it could be more useful output than the yes/no
> information, although of course require more storage in struct codetag, so I
> don't know if it's worth it.

Yeah, I'm reluctant to add more fields to the codetag and increase the
overhead until we have a use case. If that happens, with the new
format we can add something like error_size:<value> to indicate the
amount of the error.
>
> Maybe a global counter of sum size for all these missed objexts could be
> also maintained, and that wouldn't be an upper bound but an actual current
> error, that is if we can precisely determine that when freeing an object, we
> don't have a tag to decrement because objext allocation had failed on it and
> thus that allocation had incremented this global error counter and it's
> correct to decrement it.

That's a good idea and should be doable without too much overhead. Thanks!
For the UAPI... I think for this case IOCTL would work and the use
scenario would be that the user sees the "accurate:no" mark and issues
ioctl command to retrieve this global counter value.
Usama, since you initiated this feature request, do you think such a
counter would be useful?
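
The two accounting ideas discussed in this message can be contrasted with a small userspace model. This is only a sketch of the bookkeeping, not kernel code: `site_model`, `global_missed_bytes`, `model_alloc_missed` and `model_free_missed` are hypothetical names, and the model simply assumes the freeing path can tell that an object was never accounted, which is the open condition raised in the quoted text.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-call-site state; this is not the kernel's struct codetag. */
struct site_model {
	atomic_size_t missed_bytes;	/* only grows: an upper bound on the error */
};

/* Hypothetical global counter of live bytes that no call site accounts for. */
static atomic_size_t global_missed_bytes;

/* An allocation from this call site could not be accounted. */
static void model_alloc_missed(struct site_model *site, size_t size)
{
	atomic_fetch_add(&site->missed_bytes, size);
	atomic_fetch_add(&global_missed_bytes, size);
}

/*
 * An object known to be unaccounted is freed. Its call site is unknown at
 * this point, so only the global counter can be corrected.
 */
static void model_free_missed(size_t size)
{
	atomic_fetch_sub(&global_missed_bytes, size);
}
```

In this model the per-site value never decreases, which is why it can only bound the error from above, while the global value tracks the current total.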
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 15:51 ` Suren Baghdasaryan
@ 2025-09-16 21:11 ` Usama Arif
2025-09-16 21:46 ` Suren Baghdasaryan
0 siblings, 1 reply; 20+ messages in thread
From: Usama Arif @ 2025-09-16 21:11 UTC (permalink / raw)
To: Suren Baghdasaryan, Vlastimil Babka
Cc: akpm, kent.overstreet, hannes, rientjes, roman.gushchin,
harry.yoo, shakeel.butt, 00107082, pyyjason, pasha.tatashin,
souravpanda, linux-mm, linux-kernel

On 16/09/2025 16:51, Suren Baghdasaryan wrote:
> On Tue, Sep 16, 2025 at 5:57 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> On 9/16/25 01:02, Suren Baghdasaryan wrote:
>>> While rare, memory allocation profiling can contain inaccurate counters
>>> if slab object extension vector allocation fails. That allocation might
>>> succeed later but prior to that, slab allocations that would have used
>>> that object extension vector will not be accounted for. To indicate
>>> incorrect counters, "accurate:no" marker is appended to the call site
>>> line in the /proc/allocinfo output.
>>> Bump up /proc/allocinfo version to reflect the change in the file format
>>> and update documentation.
>>>
>>> Example output with invalid counters:
>>> allocinfo - version: 2.0
>>> 0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
>>> 0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
>>> 0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
>>> 0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
>>> 0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
>>> 0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
>>> 0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
>>> 49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
>>> 32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
>>> 0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
>>>
>>> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
>>> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
>>> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
>>> Acked-by: Usama Arif <usamaarif642@gmail.com>
>>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>>
>> With this format you could instead print the accumulated size of allocations
>> that could not allocate their objext (for the given tag). It should be then
>> an upper bound of the actual error, because obviously we cannot recognize
>> moments where these allocations are freed - so we don't know for which tag
>> to decrement. Maybe it could be more useful output than the yes/no
>> information, although of course require more storage in struct codetag, so I
>> don't know if it's worth it.
>
> Yeah, I'm reluctant to add more fields to the codetag and increase the
> overhead until we have a use case. If that happens, with the new
> format we can add something like error_size:<value> to indicate the
> amount of the error.
>
>>
>> Maybe a global counter of sum size for all these missed objexts could be
>> also maintained, and that wouldn't be an upper bound but an actual current
>> error, that is if we can precisely determine that when freeing an object, we
>> don't have a tag to decrement because objext allocation had failed on it and
>> thus that allocation had incremented this global error counter and it's
>> correct to decrement it.
>
> That's a good idea and should be doable without too much overhead. Thanks!
> For the UAPI... I think for this case IOCTL would work and the use
> scenario would be that the user sees the "accurate:no" mark and issues
> ioctl command to retrieve this global counter value.
> Usama, since you initiated this feature request, do you think such a
> counter would be useful?
>

Hmm, I really don't like suggesting changing /proc/allocinfo as it will break parsers,
but it might be better to put it there?
If the value is in the file, I imagine people will be more prone to looking at it?
I am not completely sure if everyone will do an ioctl to try and find this out?
Especially if you just have infra that is just automatically collecting info from
this file.
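
One way an automated collector could cope with the format change raised in this message is to read the version line before deciding how to treat the rest of each data line. The following is a minimal sketch under that assumption. `parse_allocinfo_version` and `allocinfo_has_kv_pairs` are hypothetical helper names; the version string layout is taken from `print_allocinfo_header()` in the patch.

```c
#include <stdio.h>

/*
 * Parse the first line of /proc/allocinfo, e.g. "allocinfo - version: 2.0".
 * Returns 0 and fills major/minor on success, -1 otherwise.
 */
static int parse_allocinfo_version(const char *line, int *major, int *minor)
{
	if (sscanf(line, "allocinfo - version: %d.%d", major, minor) != 2)
		return -1;
	return 0;
}

/* The patch documents extra <key>:<value> pairs for version 2.0 or higher. */
static int allocinfo_has_kv_pairs(int major)
{
	return major >= 2;
}
```

A collector written this way can keep its 1.0 code path unchanged and only look for trailing pairs such as "accurate:no" when the reported major version is 2 or higher.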
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 21:11 ` Usama Arif
@ 2025-09-16 21:46 ` Suren Baghdasaryan
2025-09-16 21:52 ` Usama Arif
0 siblings, 1 reply; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-16 21:46 UTC (permalink / raw)
To: Usama Arif
Cc: Vlastimil Babka, akpm, kent.overstreet, hannes, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel

On Tue, Sep 16, 2025 at 2:11 PM Usama Arif <usamaarif642@gmail.com> wrote:
>
>
>
> On 16/09/2025 16:51, Suren Baghdasaryan wrote:
> > On Tue, Sep 16, 2025 at 5:57 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> >>
> >> On 9/16/25 01:02, Suren Baghdasaryan wrote:
> >>> While rare, memory allocation profiling can contain inaccurate counters
> >>> if slab object extension vector allocation fails. That allocation might
> >>> succeed later but prior to that, slab allocations that would have used
> >>> that object extension vector will not be accounted for. To indicate
> >>> incorrect counters, "accurate:no" marker is appended to the call site
> >>> line in the /proc/allocinfo output.
> >>> Bump up /proc/allocinfo version to reflect the change in the file format
> >>> and update documentation.
> >>>
> >>> Example output with invalid counters:
> >>> allocinfo - version: 2.0
> >>> 0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
> >>> 0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
> >>> 0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
> >>> 0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
> >>> 0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
> >>> 0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
> >>> 0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
> >>> 49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
> >>> 32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
> >>> 0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
> >>>
> >>> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> >>> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> >>> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> >>> Acked-by: Usama Arif <usamaarif642@gmail.com>
> >>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> >>
> >> With this format you could instead print the accumulated size of allocations
> >> that could not allocate their objext (for the given tag). It should be then
> >> an upper bound of the actual error, because obviously we cannot recognize
> >> moments where these allocations are freed - so we don't know for which tag
> >> to decrement. Maybe it could be more useful output than the yes/no
> >> information, although of course require more storage in struct codetag, so I
> >> don't know if it's worth it.
> >
> > Yeah, I'm reluctant to add more fields to the codetag and increase the
> > overhead until we have a use case. If that happens, with the new
> > format we can add something like error_size:<value> to indicate the
> > amount of the error.
> >
> >>
> >> Maybe a global counter of sum size for all these missed objexts could be
> >> also maintained, and that wouldn't be an upper bound but an actual current
> >> error, that is if we can precisely determine that when freeing an object, we
> >> don't have a tag to decrement because objext allocation had failed on it and
> >> thus that allocation had incremented this global error counter and it's
> >> correct to decrement it.
> >
> > That's a good idea and should be doable without too much overhead. Thanks!
> > For the UAPI... I think for this case IOCTL would work and the use
> > scenario would be that the user sees the "accurate:no" mark and issues
> > ioctl command to retrieve this global counter value.
> > Usama, since you initiated this feature request, do you think such a
> > counter would be useful?
> >
>
>
> Hmm, I really don't like suggesting changing /proc/allocinfo as it will break parsers,
> but it might be better to put it there?
> If the value is in the file, I imagine people will be more prone to looking at it?
> I am not completely sure if everyone will do an ioctl to try and find this out?
> Especially if you just have infra that is just automatically collecting info from
> this file.

The current file reports per-codetag data and not global counters. We
could report it somewhere in the header but the first question to
answer is: would this be really useful (not in a way of "nice to
have" but for a concrete usecase)? If not then I would suggest keeping
things simple until there is a need for it.
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 21:46 ` Suren Baghdasaryan
@ 2025-09-16 21:52 ` Usama Arif
2025-09-16 22:26 ` Suren Baghdasaryan
0 siblings, 1 reply; 20+ messages in thread
From: Usama Arif @ 2025-09-16 21:52 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: Vlastimil Babka, akpm, kent.overstreet, hannes, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel

On 16/09/2025 22:46, Suren Baghdasaryan wrote:
> On Tue, Sep 16, 2025 at 2:11 PM Usama Arif <usamaarif642@gmail.com> wrote:
>>
>>
>>
>> On 16/09/2025 16:51, Suren Baghdasaryan wrote:
>>> On Tue, Sep 16, 2025 at 5:57 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>>>>
>>>> On 9/16/25 01:02, Suren Baghdasaryan wrote:
>>>>> While rare, memory allocation profiling can contain inaccurate counters
>>>>> if slab object extension vector allocation fails. That allocation might
>>>>> succeed later but prior to that, slab allocations that would have used
>>>>> that object extension vector will not be accounted for. To indicate
>>>>> incorrect counters, "accurate:no" marker is appended to the call site
>>>>> line in the /proc/allocinfo output.
>>>>> Bump up /proc/allocinfo version to reflect the change in the file format
>>>>> and update documentation.
>>>>>
>>>>> Example output with invalid counters:
>>>>> allocinfo - version: 2.0
>>>>> 0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
>>>>> 0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
>>>>> 0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
>>>>> 0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
>>>>> 0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
>>>>> 0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
>>>>> 0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
>>>>> 49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
>>>>> 32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
>>>>> 0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
>>>>>
>>>>> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
>>>>> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
>>>>> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
>>>>> Acked-by: Usama Arif <usamaarif642@gmail.com>
>>>>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>>>>
>>>> With this format you could instead print the accumulated size of allocations
>>>> that could not allocate their objext (for the given tag). It should be then
>>>> an upper bound of the actual error, because obviously we cannot recognize
>>>> moments where these allocations are freed - so we don't know for which tag
>>>> to decrement. Maybe it could be more useful output than the yes/no
>>>> information, although of course require more storage in struct codetag, so I
>>>> don't know if it's worth it.
>>>
>>> Yeah, I'm reluctant to add more fields to the codetag and increase the
>>> overhead until we have a use case. If that happens, with the new
>>> format we can add something like error_size:<value> to indicate the
>>> amount of the error.
>>> >>>> >>>> Maybe a global counter of sum size for all these missed objexts could be >>>> also maintained, and that wouldn't be an upper bound but an actual current >>>> error, that is if we can precisely determine that when freeing an object, we >>>> don't have a tag to decrement because objext allocation had failed on it and >>>> thus that allocation had incremented this global error counter and it's >>>> correct to decrement it. >>> >>> That's a good idea and should be doable without too much overhead. Thanks! >>> For the UAPI... I think for this case IOCTL would work and the use >>> scenario would be that the user sees the "accurate:no" mark and issues >>> ioctl command to retrieve this global counter value. >>> Usama, since you initiated this feature request, do you think such a >>> counter would be useful? >>> >> >> >> hmm, I really dont like suggesting changing /proc/allocinfo as it will break parsers, >> but it might be better to put it there? >> If the value is in the file, I imagine people will be more prone to looking at it? >> I am not completely sure if everyone will do an ioctl to try and find this out? >> Especially if you just have infra that is just automatically collecting info from >> this file. > > The current file reports per-codetag data and not global counters. We > could report it somewhere in the header but the first question to > answer is: would this be really useful (not in a way of "nice to > have" but for a concrete usecase)? If not then I would suggest keeping > things simple until there is a need for it. > I think its a nice to have. I can't think of a concrete usecase at present. I guess a potential usecase is if you are trying to use memory allocation profiling to debug OOMs and the missed objects size is very large. I guess we wont know until this happens, but I would hope this number is usually small. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 21:52 ` Usama Arif
@ 2025-09-16 22:26 ` Suren Baghdasaryan
2025-09-16 22:27 ` Suren Baghdasaryan
2025-09-17 7:38 ` Vlastimil Babka
0 siblings, 2 replies; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-16 22:26 UTC (permalink / raw)
To: Usama Arif
Cc: Vlastimil Babka, akpm, kent.overstreet, hannes, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel

On Tue, Sep 16, 2025 at 9:52 PM Usama Arif <usamaarif642@gmail.com> wrote:
> I think it's a nice to have. I can't think of a concrete usecase at present.
>
> I guess a potential usecase is if you are trying to use memory allocation
> profiling to debug OOMs and the missed objects size is very large. I guess we
> won't know until this happens, but I would hope this number is usually small.

Hmm. Missing a large allocation and not knowing about it can be a problem...
I'll start sketching a patch to see if tracking such a global counter
has any drawbacks and in the meantime I'm open to suggestions on how
to expose it to the userspace.

About concerns on the IOCTL interface, would it be more usable if we
get the alloctop [1] or a similar tool which can be used to easily
issue such commands into kernel/tools?

[1] https://android-review.git.corp.google.com/c/platform/system/memory/libmeminfo/+/3431860
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 22:26 ` Suren Baghdasaryan
@ 2025-09-16 22:27 ` Suren Baghdasaryan
2025-09-17 21:09 ` Usama Arif
2025-09-17 7:38 ` Vlastimil Babka
1 sibling, 1 reply; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-16 22:27 UTC (permalink / raw)
To: Usama Arif
Cc: Vlastimil Babka, akpm, kent.overstreet, hannes, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel

On Tue, Sep 16, 2025 at 10:26 PM Suren Baghdasaryan <surenb@google.com> wrote:
> About concerns on the IOCTL interface, would it be more usable if we
> get the alloctop [1] or a similar tool which can be used to easily
> issue such commands into kernel/tools?
>
> [1] https://android-review.git.corp.google.com/c/platform/system/memory/libmeminfo/+/3431860

Ugh, sorry. Externally accessible link would be
https://android-review.googlesource.com/c/platform/system/memory/libmeminfo/+/3431860
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 22:27 ` Suren Baghdasaryan
@ 2025-09-17 21:09 ` Usama Arif
2025-09-17 23:04 ` Suren Baghdasaryan
0 siblings, 1 reply; 20+ messages in thread
From: Usama Arif @ 2025-09-17 21:09 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: Vlastimil Babka, akpm, kent.overstreet, hannes, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel

On 16/09/2025 23:27, Suren Baghdasaryan wrote:
> On Tue, Sep 16, 2025 at 10:26 PM Suren Baghdasaryan <surenb@google.com> wrote:
>> About concerns on the IOCTL interface, would it be more usable if we
>> get the alloctop [1] or a similar tool which can be used to easily
>> issue such commands into kernel/tools?
>>
>> [1] https://android-review.git.corp.google.com/c/platform/system/memory/libmeminfo/+/3431860
>
> Ugh, sorry. Externally accessible link would be
> https://android-review.googlesource.com/c/platform/system/memory/libmeminfo/+/3431860

Yeah this would be nice to have. We do have something very similar in our
infra, to basically sort by size and store only top x entries.

When doing it manually, I just do
sort -g /proc/allocinfo|tail -n 30|numfmt --to=iec
which is copied from the kernel doc.
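For readers who want the same top-N view with the new marker taken into account, the pipeline above can be approximated in a few lines of Python. This is a minimal sketch, not an existing tool: the line layout (`<size> <calls> <file>:<line> func:<name>` with an optional trailing `accurate:no`) is taken from the example output in the patch description, and the names `AllocEntry`, `parse_allocinfo` and `top_by_size` are made up for illustration.

```python
from typing import Iterable, List, NamedTuple


class AllocEntry(NamedTuple):
    size: int       # bytes currently accounted to this call site
    calls: int      # number of live allocations accounted to it
    location: str   # "<file>:<line>"
    func: str       # value of the "func:" field, "" if absent
    accurate: bool  # False when the line carries "accurate:no"


def parse_allocinfo(lines: Iterable[str]) -> List[AllocEntry]:
    """Parse allocinfo-style lines, skipping the header and blank lines."""
    entries = []
    for line in lines:
        fields = line.split()
        # Data lines start with two integers; anything else is a header.
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # Remaining fields are treated as key:value pairs; others are ignored.
        attrs = dict(f.split(":", 1) for f in fields[3:] if ":" in f)
        entries.append(AllocEntry(
            size=int(fields[0]),
            calls=int(fields[1]),
            location=fields[2],
            func=attrs.get("func", ""),
            accurate=attrs.get("accurate", "yes") != "no",
        ))
    return entries


def top_by_size(entries: List[AllocEntry], n: int = 30) -> List[AllocEntry]:
    """Return the n largest call sites, biggest first."""
    return sorted(entries, key=lambda e: e.size, reverse=True)[:n]


# Sample lines copied from the example output in the patch description.
SAMPLE = """\
allocinfo - version: 2.0
0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
"""

if __name__ == "__main__":
    for e in top_by_size(parse_allocinfo(SAMPLE.splitlines()), n=2):
        flag = "" if e.accurate else "  accurate:no"
        print(f"{e.size:>8} {e.calls:>4} {e.location} {e.func}{flag}")
```

On a kernel with memory allocation profiling enabled, passing `open("/proc/allocinfo")` to `parse_allocinfo` would feed the live file through the same path. Fields the sketch does not know about are ignored rather than rejected, so it should keep working if more `key:value` fields are appended later.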
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-17 21:09 ` Usama Arif
@ 2025-09-17 23:04 ` Suren Baghdasaryan
0 siblings, 0 replies; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-17 23:04 UTC (permalink / raw)
To: Usama Arif
Cc: Vlastimil Babka, akpm, kent.overstreet, hannes, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel

On Wed, Sep 17, 2025 at 2:10 PM Usama Arif <usamaarif642@gmail.com> wrote:
> Yeah this would be nice to have. We do have something very similar in our
> infra, to basically sort by size and store only top x entries.
>
> When doing it manually, I just do
> sort -g /proc/allocinfo|tail -n 30|numfmt --to=iec
> which is copied from the kernel doc.

Got it. I guess if we get an upstream tool like that which is kept
in-sync with kernel's UAPI and new features, that would make the
maintenance easier for everyone.
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-16 22:26 ` Suren Baghdasaryan
2025-09-16 22:27 ` Suren Baghdasaryan
@ 2025-09-17 7:38 ` Vlastimil Babka
2025-09-17 23:02 ` Suren Baghdasaryan
1 sibling, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2025-09-17 7:38 UTC (permalink / raw)
To: Suren Baghdasaryan, Usama Arif
Cc: akpm, kent.overstreet, hannes, rientjes, roman.gushchin,
harry.yoo, shakeel.butt, 00107082, pyyjason, pasha.tatashin,
souravpanda, linux-mm, linux-kernel

On 9/17/25 00:26, Suren Baghdasaryan wrote:
> Hmm. Missing a large allocation and not knowing about it can be a problem...
> I'll start sketching a patch to see if tracking such a global counter
> has any drawbacks and in the meantime I'm open to suggestions on how
> to expose it to the userspace.

Could it be made to look like an actual tag in the output?
e.g. lib/alloc_tag.c:1234 func:untracked_slab_objects

(probably some better name conveying it's unknown due to failure to allocate
objexts)

Maybe even implemented in a way that it's not a specially crafted output line.

> About concerns on the IOCTL interface, would it be more usable if we
> get the alloctop [1] or a similar tool which can be used to easily
> issue such commands into kernel/tools?
>
> [1] https://android-review.git.corp.google.com/c/platform/system/memory/libmeminfo/+/3431860
* Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
2025-09-17 7:38 ` Vlastimil Babka
@ 2025-09-17 23:02 ` Suren Baghdasaryan
0 siblings, 0 replies; 20+ messages in thread
From: Suren Baghdasaryan @ 2025-09-17 23:02 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Usama Arif, akpm, kent.overstreet, hannes, rientjes,
roman.gushchin, harry.yoo, shakeel.butt, 00107082, pyyjason,
pasha.tatashin, souravpanda, linux-mm, linux-kernel

On Wed, Sep 17, 2025 at 12:38 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> On 9/17/25 00:26, Suren Baghdasaryan wrote:
> > Hmm. Missing a large allocation and not knowing about it can be a problem...
> > I'll start sketching a patch to see if tracking such a global counter
> > has any drawbacks and in the meantime I'm open to suggestions on how
> > to expose it to the userspace.
>
> Could it be made to look like an actual tag in the output?
> e.g. lib/alloc_tag.c:1234 func:untracked_slab_objects

Technically I think we can do that but it feels a bit hacky... I'll
keep this option in mind and wait for more suggestions. Thanks
Vlastimil!

> (probably some better name conveying it's unknown due to failure to allocate
> objexts)
>
> Maybe even implemented in a way that it's not a specially crafted output line.
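The accounting idea discussed in this subthread (a global counter for slab objects whose extension vector could not be allocated, optionally shown as a pseudo-tag line) can be modelled in userspace. The sketch below is a toy model under stated assumptions, not kernel code and not the patch Suren mentions: it assumes the free path can tell that an object was never tagged, which is exactly the condition Vlastimil raises, and every name in it (`AllocProfileModel`, `objext_ok`, the `mm/example.c:10` call site) is hypothetical. The pseudo-tag text reuses the placeholder from Vlastimil's mail.

```python
from collections import defaultdict

# Placeholder text taken from the mail above; a final name was left open there.
UNTRACKED_TAG = "lib/alloc_tag.c:1234 func:untracked_slab_objects"


class AllocProfileModel:
    """Toy userspace model of per-call-site accounting plus one global
    counter for allocations that could not be tagged. Not kernel code."""

    def __init__(self):
        self.tags = defaultdict(lambda: [0, 0])  # call site -> [bytes, calls]
        self.untracked = [0, 0]                  # global [bytes, calls]
        self._objects = {}                       # handle -> (site or None, size)
        self._next_handle = 0

    def alloc(self, site, size, objext_ok=True):
        """Account one allocation. objext_ok=False models a failed
        extension-vector allocation, so the call site cannot be recorded."""
        counters = self.tags[site] if objext_ok else self.untracked
        counters[0] += size
        counters[1] += 1
        handle = self._next_handle
        self._next_handle += 1
        # Assumption: the free path can see that the object carries no tag.
        self._objects[handle] = (site if objext_ok else None, size)
        return handle

    def free(self, handle):
        """Undo the accounting done by alloc() for this object."""
        site, size = self._objects.pop(handle)
        counters = self.untracked if site is None else self.tags[site]
        counters[0] -= size
        counters[1] -= 1

    def report(self):
        """Render allocinfo-like lines, with the global counter shown as
        one extra pseudo-tag line at the end."""
        lines = [f"{b} {c} {site}" for site, (b, c) in sorted(self.tags.items())]
        lines.append(f"{self.untracked[0]} {self.untracked[1]} {UNTRACKED_TAG}")
        return lines


if __name__ == "__main__":
    model = AllocProfileModel()
    model.alloc("mm/example.c:10 func:example_alloc", 64)
    model.alloc("mm/example.c:10 func:example_alloc", 128, objext_ok=False)
    print("\n".join(model.report()))
```

In this model the untracked counter goes back down when an untagged object is freed, so it reflects the current error rather than an upper bound. Whether the kernel can make that distinction cheaply on the free path is the open question in the thread.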
end of thread, other threads:[~2025-09-17 23:05 UTC | newest]

Thread overview: 20+ messages
2025-09-15 23:02 [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output Suren Baghdasaryan
2025-09-15 23:05 ` Suren Baghdasaryan
2025-09-16 0:11 ` Andrew Morton
2025-09-16 2:48 ` Suren Baghdasaryan
2025-09-16 2:56 ` Andrew Morton
2025-09-16 3:34 ` Suren Baghdasaryan
2025-09-16 4:21 ` Andrew Morton
2025-09-16 4:39 ` Suren Baghdasaryan
2025-09-16 16:02 ` Suren Baghdasaryan
2025-09-16 12:57 ` Vlastimil Babka
2025-09-16 15:51 ` Suren Baghdasaryan
2025-09-16 21:11 ` Usama Arif
2025-09-16 21:46 ` Suren Baghdasaryan
2025-09-16 21:52 ` Usama Arif
2025-09-16 22:26 ` Suren Baghdasaryan
2025-09-16 22:27 ` Suren Baghdasaryan
2025-09-17 21:09 ` Usama Arif
2025-09-17 23:04 ` Suren Baghdasaryan
2025-09-17 7:38 ` Vlastimil Babka
2025-09-17 23:02 ` Suren Baghdasaryan