From: Suren Baghdasaryan <surenb@google.com>
To: Usama Arif <usamaarif642@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
akpm@linux-foundation.org, kent.overstreet@linux.dev,
hannes@cmpxchg.org, rientjes@google.com,
roman.gushchin@linux.dev, harry.yoo@oracle.com,
shakeel.butt@linux.dev, 00107082@163.com, pyyjason@gmail.com,
pasha.tatashin@soleen.com, souravpanda@google.com,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
Date: Tue, 16 Sep 2025 22:26:10 +0000
Message-ID: <CAJuCfpEULVxMixDjrk_xg7+3+97dkcMmkDd++BaR17X4tDSs6Q@mail.gmail.com>
In-Reply-To: <e5e3d96a-d0aa-4466-8303-5a7e8f96bbe5@gmail.com>
On Tue, Sep 16, 2025 at 9:52 PM Usama Arif <usamaarif642@gmail.com> wrote:
>
>
>
> On 16/09/2025 22:46, Suren Baghdasaryan wrote:
> > On Tue, Sep 16, 2025 at 2:11 PM Usama Arif <usamaarif642@gmail.com> wrote:
> >>
> >>
> >>
> >> On 16/09/2025 16:51, Suren Baghdasaryan wrote:
> >>> On Tue, Sep 16, 2025 at 5:57 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> >>>>
> >>>> On 9/16/25 01:02, Suren Baghdasaryan wrote:
> >>>>> While rare, memory allocation profiling can contain inaccurate counters
> >>>>> if slab object extension vector allocation fails. That allocation might
> >>>>> succeed later but prior to that, slab allocations that would have used
> >>>>> that object extension vector will not be accounted for. To indicate
> >>>>> incorrect counters, "accurate:no" marker is appended to the call site
> >>>>> line in the /proc/allocinfo output.
> >>>>> Bump up /proc/allocinfo version to reflect the change in the file format
> >>>>> and update documentation.
> >>>>>
> >>>>> Example output with invalid counters:
> >>>>> allocinfo - version: 2.0
> >>>>> 0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
> >>>>> 0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
> >>>>> 0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
> >>>>> 0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
> >>>>> 0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
> >>>>> 0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
> >>>>> 0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
> >>>>> 49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
> >>>>> 32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
> >>>>> 0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
> >>>>>
> >>>>> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> >>>>> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> >>>>> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> >>>>> Acked-by: Usama Arif <usamaarif642@gmail.com>
> >>>>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> >>>>
> >>>> With this format you could instead print the accumulated size of allocations
> >>>> that could not allocate their objext (for the given tag). It would then be
> >>>> an upper bound of the actual error, because obviously we cannot recognize
> >>>> the moments when these allocations are freed - so we don't know for which tag
> >>>> to decrement. Maybe it could be more useful output than the yes/no
> >>>> information, although it would of course require more storage in struct
> >>>> codetag, so I don't know if it's worth it.
> >>>
> >>> Yeah, I'm reluctant to add more fields to the codetag and increase the
> >>> overhead until we have a use case. If that happens, with the new
> >>> format we can add something like error_size:<value> to indicate the
> >>> amount of the error.
> >>>
> >>>>
> >>>> Maybe a global counter of sum size for all these missed objexts could be
> >>>> also maintained, and that wouldn't be an upper bound but an actual current
> >>>> error, that is if we can precisely determine that when freeing an object, we
> >>>> don't have a tag to decrement because objext allocation had failed on it and
> >>>> thus that allocation had incremented this global error counter and it's
> >>>> correct to decrement it.
> >>>
> >>> That's a good idea and should be doable without too much overhead. Thanks!
> >>> For the UAPI... I think for this case IOCTL would work and the use
> >>> scenario would be that the user sees the "accurate:no" mark and issues
> >>> ioctl command to retrieve this global counter value.
> >>> Usama, since you initiated this feature request, do you think such a
> >>> counter would be useful?
> >>>
> >>
> >>
> >> Hmm, I really don't like suggesting changing /proc/allocinfo as it will break parsers,
> >> but it might be better to put it there?
> >> If the value is in the file, I imagine people will be more likely to look at it?
> >> I am not completely sure that everyone will do an ioctl to try and find this out,
> >> especially if you have infra that is just automatically collecting info from
> >> this file.
> >
> > The current file reports per-codetag data and not global counters. We
> > could report it somewhere in the header but the first question to
> > answer is: would this be really useful (not in a way of "nice to
> > have" but for a concrete usecase)? If not then I would suggest keeping
> > things simple until there is a need for it.
> >
>
> I think it's a nice-to-have. I can't think of a concrete usecase at present.
>
> I guess a potential usecase is if you are trying to use memory allocation
> profiling to debug OOMs and the missed objects size is very large. I guess we
> won't know until this happens, but I would hope this number is usually small.
Hmm. Missing a large allocation and not knowing about it can be a problem...
I'll start sketching a patch to see if tracking such a global counter
has any drawbacks. In the meantime, I'm open to suggestions on how
to expose it to userspace.
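To make the accounting being discussed concrete, here is a minimal
userspace sketch of such a global counter. It is not kernel code and not
the eventual patch: untracked_bytes, struct obj_state, account_alloc() and
account_free() are hypothetical names, and the per-object "tagged" flag
simply assumes we can tell at free time that the extension vector
allocation had failed for that object, which is the open question here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical global: bytes currently allocated but charged to no tag. */
static atomic_size_t untracked_bytes;

/* Hypothetical per-object state, standing in for whatever the slab can tell us. */
struct obj_state {
	size_t size;
	bool tagged;	/* false if the extension vector allocation failed */
};

/* On allocation: charge the global counter only when no tag could be charged. */
static void account_alloc(struct obj_state *obj, size_t size, bool objext_ok)
{
	obj->size = size;
	obj->tagged = objext_ok;
	if (!objext_ok)
		atomic_fetch_add_explicit(&untracked_bytes, size,
					  memory_order_relaxed);
}

/* On free: uncharge the global counter only for objects that were charged to it. */
static void account_free(const struct obj_state *obj)
{
	if (!obj->tagged)
		atomic_fetch_sub_explicit(&untracked_bytes, obj->size,
					  memory_order_relaxed);
}

/* Current error: total size of live allocations missing from the per-tag counters. */
static size_t untracked_bytes_read(void)
{
	return atomic_load_explicit(&untracked_bytes, memory_order_relaxed);
}
```

Relaxed ordering is used because the value is only a statistic. Whether
the free path can cheaply recover the equivalent of "tagged" is exactly
what the real patch would have to show.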
Regarding the concerns about the IOCTL interface: would it be more usable
if we got alloctop [1], or a similar tool that can easily issue such
commands, into the kernel's tools/ directory?
[1] https://android-review.git.corp.google.com/c/platform/system/memory/libmeminfo/+/3431860
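
For automated collectors that only read the file, picking up the new
marker could look roughly like the sketch below. It assumes the version
2.0 line layout from the example output in the patch description
("<bytes> <calls> <file>:<line> func:<name>", optionally followed by
"accurate:no"); struct allocinfo_entry and parse_allocinfo_line() are
hypothetical names for illustration, not part of alloctop or the kernel.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record holding the fields a collector would care about. */
struct allocinfo_entry {
	unsigned long long bytes;
	unsigned long long calls;
	bool accurate;	/* false when the line carries "accurate:no" */
};

/*
 * Parse one call-site line. Returns false for lines that do not start
 * with two numeric counters, such as the "allocinfo - version: 2.0" header.
 */
static bool parse_allocinfo_line(const char *line, struct allocinfo_entry *e)
{
	if (sscanf(line, "%llu %llu", &e->bytes, &e->calls) != 2)
		return false;
	/* Absence of the marker means the counters are considered accurate. */
	e->accurate = strstr(line, " accurate:no") == NULL;
	return true;
}
```

A parser written this way ignores any trailing key:value fields it does
not know about, so a later addition such as error_size:<value> would not
break it.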
>