On Wed, Jan 29, 2025 at 1:50 AM Vlastimil Babka wrote:
>
> On 1/29/25 01:03, Steven Rostedt wrote:
> > On Tue, 28 Jan 2025 15:43:13 -0800
> > Suren Baghdasaryan wrote:
> >
> >> > How slow is it to always do the call instead of inlining?
> >>
> >> Let's see... The additional overhead if we always call is:
> >>
> >> Little core: 2.42%
> >> Middle core: 1.23%
> >> Big core: 0.66%
> >>
> >> Not a huge deal because the overhead of memory profiling when enabled
> >> is much higher. So, maybe for simplicity I should indeed always call?
> >
> > That's what I was thinking, unless the other maintainers are OK with this
> > special logic.
>
> If it's acceptable, I would prefer to always call.

Ok, I'll post that version. If this becomes an issue we can reconsider later.

> But at the same time make
> sure the static key test is really inlined, i.e. force inline
> alloc_tagging_slab_alloc_hook() (see my other reply looking at the disassembly).

Sorry, I should have made it clear that I uninlined
alloc_tagging_slab_alloc_hook() only to localize the relevant code. In
reality it is inlined. Since the inlined output is quite big, I'm
attaching the disassembly of kmem_cache_alloc_noprof(), which has
alloc_tagging_slab_alloc_hook() inlined in it.

>
> Well or rather just open-code the contents of the
> alloc_tagging_slab_alloc_hook and alloc_tagging_slab_free_hook (as they look
> after this patch) into the callers. It's just two lines. The extra layer is
> just unnecessary distraction.

alloc_tagging_slab_alloc_hook() is inlined, so there is no need to open-code it.

>
> Then it's probably inevitable the actual hook content after the static key
> test should not be inline even with
> CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT as the result would be inlined
> into too many places. But since we remove one call layer anyway thanks to
> above, even without the full inlining the resulting performance could
> hopefully be fine (compared to the state before your series).

Agreed. Thanks for the feedback!
I'll prepare v2 in which the hook body is never inlined (always call),
with no dependency on CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT.

>
> > -- Steve
>
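
For reference, the arrangement discussed above (the enabled test inlined at
each call site, the hook body always reached through a call) could be
sketched in userspace C roughly as follows. This is only an illustration
under stated assumptions: profiling_enabled is a plain flag standing in for
the static key, and slab_alloc_hook(), slab_alloc_hook_slow() and
demo_hook_calls() are hypothetical names, not the functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: a plain flag stands in for the kernel static key. */
static bool profiling_enabled;
static unsigned long hook_calls;

/* Out-of-line slow path: the actual hook body would live here. */
__attribute__((noinline))
static void slab_alloc_hook_slow(void *obj)
{
	(void)obj;
	hook_calls++;
}

/* Inlined fast path: only the enabled test is emitted at each call site. */
static inline __attribute__((always_inline))
void slab_alloc_hook(void *obj)
{
	if (__builtin_expect(profiling_enabled, 0))
		slab_alloc_hook_slow(obj);
}

/* Hypothetical driver: one call with profiling off, one with it on. */
static unsigned long demo_hook_calls(void)
{
	int object;

	profiling_enabled = false;
	hook_calls = 0;
	slab_alloc_hook(&object);	/* disabled: slow path is skipped */
	assert(hook_calls == 0);

	profiling_enabled = true;
	slab_alloc_hook(&object);	/* enabled: one out-of-line call */
	return hook_calls;
}
```

With this split, a disabled call site pays only for the test, and the
size of the hook body does not get duplicated into every caller.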