From: Usama Arif <usamaarif642@gmail.com>
To: Suren Baghdasaryan <surenb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeel.butt@linux.dev>,
Linux Memory Management List <linux-mm@kvack.org>,
kent.overstreet@linux.dev, vlad.wing@gmail.com
Subject: Re: Memory allocation profiling warnings in memory bound systems
Date: Mon, 19 May 2025 18:23:59 +0100
Message-ID: <d4011fd4-8899-43b9-8c27-6a3ba25f5d86@gmail.com>
In-Reply-To: <CAJuCfpFi+JASfrEH+zr9by=DE7XwVW-txHUhoGQ_AspCZZk7+g@mail.gmail.com>
On 19/05/2025 17:42, Suren Baghdasaryan wrote:
> On Mon, May 19, 2025 at 9:08 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>>
>> On Mon, May 19, 2025 at 08:50:28AM -0700, Suren Baghdasaryan wrote:
>>> On Mon, May 19, 2025 at 6:33 AM Usama Arif <usamaarif642@gmail.com> wrote:
>>>>
>>>>
>>>> +cc Vlad
>>>>
>>>> On 19/05/2025 14:31, Usama Arif wrote:
>>>>> Hi,
>>>>>
>>>>> We have started enabling memory allocation profiling (with kernel 6.13) in our fleet
>>>>> and are seeing a large number of warnings (reported by Vlad Poenaru) due to failure
>>>>> in allocation of slab object extensions on services that are memory bound. I have attached
>>>>> one of the logs at the end.
>>>>>
>>>>> Does it make sense to change the slabobj_ext to be allocated via kvcalloc and also change
>>>>> the WARN to WARN_ONCE (or maybe even pr_debug?) like the diff below? A large number of
>>>>> prints for this in a short time may mask any real issues in the system during memory
>>>>> pressure being reported in dmesg. I tried to see if there were any changes after 6.13
>>>>> to this code but didn't find any, but thought will check before sending below as a patch.
>>>>>
>>>>> diff --git a/mm/slub.c b/mm/slub.c
>>>>> index c2151c9fee22..4595ca190cd9 100644
>>>>> --- a/mm/slub.c
>>>>> +++ b/mm/slub.c
>>>>> @@ -1961,7 +1961,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>>>>> gfp &= ~OBJCGS_CLEAR_MASK;
>>>>> /* Prevent recursive extension vector allocation */
>>>>> gfp |= __GFP_NO_OBJ_EXT;
>>>>> - vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
>>>>> + vec = kvcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
>>>
>>> Hi Usama,
>>> Is the allocation larger than page size? IIUC, unless allocation size
>>> is over PAGE_SIZE, kvcalloc_node() will not fall back to vmalloc (see:
>>> https://elixir.bootlin.com/linux/v6.14.7/source/mm/util.c#L668). How
>>> big is the allocation when it fails in your case?
>>
>> Digging through the reports, it appears we're encountering both. We've
>> seen a zswap slab where the slab is order-0 and slabext is
>> higher-order (8 byte objects, 512 objsperslab, 1 pageperslab), but
>> also biovec-max where it's the other way round (4k byte objects, 8
>> objsperslab, 8 pagesperslab).
>>
>> In the first case, vmalloc would help. In the second it wouldn't.
>
> Ok, then I don't see any downside to changing to kvcalloc_node() here.
> Let's do it.
>
>>
>> The second case is interesting. The higher-order slab succeeds because
>> bios use a mempool; but the system is so depleted that the order-0 for
>> the slabext fails.
>
> I see.
>
>>
>> I'm not sure there is much we can do about this tbh. It would seem
>> overkill to add a mempool or grant the tracking access to system-wide
>> emergency reserves.
>
> Yeah, with the system under so much memory pressure we probably have
> bigger issues than extension vector allocation failures.
>
>>
>> A warn-once would probably make sense nonetheless.
>
> Agree.
>
>>
>> It might also make sense to flag the line item for that callsite in
>> the reporting file, to make it obvious that the counter is compromised
>> and is missing allocations?
>
> Good idea. We could output something like 'X' instead of the number if
> the value is known to be invalid. I can look into it. Will also have
> to raise the file version so that parsers can handle this change.
>
Thanks, I will send the above diff as patches.
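
To put rough numbers on the two cases Johannes describes above, here is a
small userspace sketch. It assumes 4 KiB pages and a 16-byte
struct slabobj_ext (an objcg pointer plus a codetag reference on a 64-bit
build with both memcg and allocation profiling enabled); both constants and
the helper names are assumptions for illustration, not kernel definitions.
It only models the size condition Suren mentioned, i.e. that kvcalloc_node()
can fall back to vmalloc only when the request is larger than PAGE_SIZE.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL
/* Assumption: sizeof(struct slabobj_ext) == 16 on this configuration. */
#define SKETCH_SLABOBJ_EXT_SIZE 16UL

/* Size in bytes of the extension vector for a slab holding `objects` objects. */
static size_t sketch_ext_vec_bytes(size_t objects)
{
	return objects * SKETCH_SLABOBJ_EXT_SIZE;
}

/*
 * True when the vector is larger than one page, which is the only case
 * in which a vmalloc fallback is possible at all.
 */
static bool sketch_vmalloc_fallback_possible(size_t objects)
{
	return sketch_ext_vec_bytes(objects) > SKETCH_PAGE_SIZE;
}
```

Under these assumptions the zswap slab (512 objects) needs an 8192-byte
vector, so the fallback can help, while biovec-max (8 objects) needs only
128 bytes and still depends on an order-0 allocation succeeding.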
When the value is known to be inaccurate, it might be better to print the
number with an [X] next to it to show that it is inaccurate? Maybe an
inaccurate number is better than no number?
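
A minimal userspace sketch of what I mean is below. The struct, the helper
name and the line layout are all hypothetical and are not the current
reporting file format; the only point is that the counters stay in the
output and the marker is appended when the callsite is known to have
missed allocations.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-callsite counters; not a kernel structure. */
struct sketch_counter {
	unsigned long long bytes;
	unsigned long long calls;
	bool inaccurate;	/* set once an extension vector allocation failed */
};

/*
 * Format one report line into buf. The counters are always printed;
 * " [X]" is appended after them when the values are known to be
 * incomplete. Returns the value returned by snprintf().
 */
static int sketch_format_line(char *buf, size_t len,
			      const struct sketch_counter *c,
			      const char *site)
{
	return snprintf(buf, len, "%llu %llu%s %s",
			c->bytes, c->calls,
			c->inaccurate ? " [X]" : "",
			site);
}
```

A parser that does not know about the marker would see an extra field, so
this would still need the file version bump you mentioned.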