From: Suren Baghdasaryan <surenb@google.com>
To: Usama Arif <usamaarif642@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
akpm@linux-foundation.org, kent.overstreet@linux.dev,
hannes@cmpxchg.org, rientjes@google.com,
roman.gushchin@linux.dev, harry.yoo@oracle.com,
shakeel.butt@linux.dev, 00107082@163.com, pyyjason@gmail.com,
pasha.tatashin@soleen.com, souravpanda@google.com,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/1] alloc_tag: mark inaccurate allocation counters in /proc/allocinfo output
Date: Wed, 17 Sep 2025 16:04:48 -0700
Message-ID: <CAJuCfpFBov_2F9Kx5Csio=hOe8kY1yXjmg_z8dXU=ZUQ_-wmaQ@mail.gmail.com>
In-Reply-To: <d451dce9-2217-4351-bc53-09967fa86cca@gmail.com>
On Wed, Sep 17, 2025 at 2:10 PM Usama Arif <usamaarif642@gmail.com> wrote:
>
>
>
> On 16/09/2025 23:27, Suren Baghdasaryan wrote:
> > On Tue, Sep 16, 2025 at 10:26 PM Suren Baghdasaryan <surenb@google.com> wrote:
> >>
> >> On Tue, Sep 16, 2025 at 9:52 PM Usama Arif <usamaarif642@gmail.com> wrote:
> >>>
> >>>
> >>>
> >>> On 16/09/2025 22:46, Suren Baghdasaryan wrote:
> >>>> On Tue, Sep 16, 2025 at 2:11 PM Usama Arif <usamaarif642@gmail.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 16/09/2025 16:51, Suren Baghdasaryan wrote:
> >>>>>> On Tue, Sep 16, 2025 at 5:57 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> >>>>>>>
> >>>>>>> On 9/16/25 01:02, Suren Baghdasaryan wrote:
> >>>>>>>> While rare, memory allocation profiling can contain inaccurate counters
> >>>>>>>> if slab object extension vector allocation fails. That allocation might
> >>>>>>>> succeed later but prior to that, slab allocations that would have used
> >>>>>>>> that object extension vector will not be accounted for. To indicate
> >>>>>>>> incorrect counters, "accurate:no" marker is appended to the call site
> >>>>>>>> line in the /proc/allocinfo output.
> >>>>>>>> Bump up /proc/allocinfo version to reflect the change in the file format
> >>>>>>>> and update documentation.
> >>>>>>>>
> >>>>>>>> Example output with invalid counters:
> >>>>>>>> allocinfo - version: 2.0
> >>>>>>>> 0 0 arch/x86/kernel/kdebugfs.c:105 func:create_setup_data_nodes
> >>>>>>>> 0 0 arch/x86/kernel/alternative.c:2090 func:alternatives_smp_module_add
> >>>>>>>> 0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
> >>>>>>>> 0 0 arch/x86/kernel/fpu/regset.c:160 func:xstateregs_set
> >>>>>>>> 0 0 arch/x86/kernel/fpu/xstate.c:1590 func:fpstate_realloc
> >>>>>>>> 0 0 arch/x86/kernel/cpu/aperfmperf.c:379 func:arch_enable_hybrid_capacity_scale
> >>>>>>>> 0 0 arch/x86/kernel/cpu/amd_cache_disable.c:258 func:init_amd_l3_attrs
> >>>>>>>> 49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
> >>>>>>>> 32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
> >>>>>>>> 0 0 arch/x86/kernel/cpu/mce/amd.c:1341 func:mce_threshold_create_device
> >>>>>>>>
> >>>>>>>> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> >>>>>>>> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> >>>>>>>> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> >>>>>>>> Acked-by: Usama Arif <usamaarif642@gmail.com>
> >>>>>>>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> >>>>>>>
> >>>>>>> With this format you could instead print the accumulated size of allocations
> >>>>>>> that could not allocate their objext (for the given tag). It should be then
> >>>>>>> an upper bound of the actual error, because obviously we cannot recognize
> >>>>>>> moments where these allocations are freed - so we don't know for which tag
> >>>>>>> to decrement. Maybe it could be more useful output than the yes/no
> >>>>>>> information, although it would of course require more storage in struct
> >>>>>>> codetag, so I don't know if it's worth it.
> >>>>>>
> >>>>>> Yeah, I'm reluctant to add more fields to the codetag and increase the
> >>>>>> overhead until we have a use case. If that happens, with the new
> >>>>>> format we can add something like error_size:<value> to indicate the
> >>>>>> amount of the error.
> >>>>>>
> >>>>>>>
> >>>>>>> Maybe a global counter of sum size for all these missed objexts could be
> >>>>>>> also maintained, and that wouldn't be an upper bound but an actual current
> >>>>>>> error, that is if we can precisely determine that when freeing an object, we
> >>>>>>> don't have a tag to decrement because objext allocation had failed on it and
> >>>>>>> thus that allocation had incremented this global error counter and it's
> >>>>>>> correct to decrement it.
> >>>>>>
> >>>>>> That's a good idea and should be doable without too much overhead. Thanks!
> >>>>>> For the UAPI... I think for this case IOCTL would work and the use
> >>>>>> scenario would be that the user sees the "accurate:no" mark and issues
> >>>>>> ioctl command to retrieve this global counter value.
> >>>>>> Usama, since you initiated this feature request, do you think such a
> >>>>>> counter would be useful?
> >>>>>>
> >>>>>
> >>>>>
> >>>>> Hmm, I really don't like suggesting changes to /proc/allocinfo, as they will break
> >>>>> parsers, but it might be better to put it there?
> >>>>> If the value is in the file, I imagine people will be more likely to look at it.
> >>>>> I am not sure everyone will issue an ioctl to find this out,
> >>>>> especially if you have infra that just automatically collects info from
> >>>>> this file.
> >>>>
> >>>> The current file reports per-codetag data and not global counters. We
> >>>> could report it somewhere in the header but the first question to
> >>>> answer is: would this be really useful (not in a way of "nice to
> >>>> have" but for a concrete usecase)? If not then I would suggest keeping
> >>>> things simple until there is a need for it.
> >>>>
> >>>
> >>> I think it's a nice-to-have. I can't think of a concrete use case at present.
> >>>
> >>> I guess a potential use case is if you are trying to use memory allocation
> >>> profiling to debug OOMs and the missed objects size is very large. I guess we
> >>> won't know until this happens, but I would hope this number is usually small.
> >>
> >> Hmm. Missing a large allocation and not knowing about it can be a problem...
> >> I'll start sketching a patch to see if tracking such a global counter
> >> has any drawbacks and in the meantime I'm open to suggestions on how
> >> to expose it to the userspace.
> >>
> >> About concerns on the IOCTL interface, would it be more usable if we
> >> get the alloctop [1] or a similar tool which can be used to easily
> >> issue such commands into kernel/tools?
> >>
> >> [1] https://android-review.git.corp.google.com/c/platform/system/memory/libmeminfo/+/3431860
> >
> > Ugh, sorry. The externally accessible link would be
> > https://android-review.googlesource.com/c/platform/system/memory/libmeminfo/+/3431860
> >
>
> Yeah this would be nice to have. We do have something very similar in our infra, to basically
> sort by size and store only top x entries.
>
> When doing manually, I just do sort -g /proc/allocinfo|tail -n 30|numfmt --to=iec which is copied from
> the kernel doc.

Got it. I guess if we get an upstream tool like that, kept in sync with
the kernel's UAPI and new features, it would make maintenance easier
for everyone.
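
For illustration only, here is a rough Python sketch of what the parsing side of such a tool might look like with the version 2.0 format. It is not alloctop and not a proposed implementation. It assumes each data line is "<size> <calls> <location>" followed by key:value attributes, as in the example output quoted above, and that a missing "accurate" attribute means the counters are accurate. The names AllocEntry, parse_allocinfo and top_by_size are made up for this sketch.

```python
from typing import Iterable, List, NamedTuple


class AllocEntry(NamedTuple):
    size: int        # bytes currently attributed to the call site
    calls: int       # number of live allocations from the call site
    location: str    # "file:line" of the call site
    func: str        # value of the func: attribute, "" if absent
    accurate: bool   # False only when the line carries "accurate:no"


def parse_allocinfo(lines: Iterable[str]) -> List[AllocEntry]:
    """Parse allocinfo-style lines, skipping headers and blank lines."""
    entries = []
    for line in lines:
        fields = line.split()
        # Data lines start with two integers; anything else is a header.
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # Remaining fields are treated as key:value attributes (assumption);
        # fields without a colon are ignored.
        attrs = dict(f.split(":", 1) for f in fields[3:] if ":" in f)
        entries.append(AllocEntry(
            size=int(fields[0]),
            calls=int(fields[1]),
            location=fields[2],
            func=attrs.get("func", ""),
            accurate=attrs.get("accurate", "yes") != "no",
        ))
    return entries


def top_by_size(entries: List[AllocEntry], n: int = 30) -> List[AllocEntry]:
    """Return the n largest entries by size, largest first."""
    return sorted(entries, key=lambda e: e.size, reverse=True)[:n]


# Sample lines copied from the example output in the patch description.
SAMPLE = """allocinfo - version: 2.0
0 0 arch/x86/kernel/alternative.c:127 func:__its_alloc accurate:no
49152 48 arch/x86/kernel/cpu/mce/core.c:2709 func:mce_device_create accurate:no
32768 1 arch/x86/kernel/cpu/mce/genpool.c:132 func:mce_gen_pool_create
"""

entries = parse_allocinfo(SAMPLE.splitlines())
inaccurate = [e for e in entries if not e.accurate]
```

On a real system the same function could be fed the lines of /proc/allocinfo instead of SAMPLE, and the inaccurate list gives a collector a cheap way to flag call sites whose counters should not be trusted.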