From this perspective, it seems better to add this to the trace.
Shall I send a patch that does that?

________________________________
From: David Hildenbrand
Sent: 2025-08-16 15:49:52
To: Andrew Morton
Cc: Xiang Gao; lorenzo.stoakes@oracle.com; Liam.Howlett@oracle.com; vbabka@suse.cz; rppt@kernel.org; surenb@google.com; mhocko@suse.com; linux-mm@kvack.org; linux-kernel@vger.kernel.org; 高翔
Subject: [External Mail]Re: [PATCH] mm/cma: print total and used pages in cma_alloc()

On 16.08.25 09:34, Andrew Morton wrote:
> On Sat, 16 Aug 2025 08:56:47 +0200 David Hildenbrand wrote:
>
>> On 16.08.25 08:45, Andrew Morton wrote:
>>> On Sat, 16 Aug 2025 08:27:39 +0200 David Hildenbrand wrote:
>>>
>>>>> @@ -858,8 +869,8 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
>>>>>  	if (!cma || !cma->count)
>>>>>  		return page;
>>>>>
>>>>> -	pr_debug("%s(cma %p, name: %s, count %lu, align %d)\n", __func__,
>>>>> -		(void *)cma, cma->name, count, align);
>>>>> +	pr_debug("%s(cma %p, name: %s, total pages: %lu, used pages: %lu, request pages: %lu, align %d)\n",
>>>>> +		__func__, (void *)cma, cma->name, cma->count, cma_get_used_pages(cma), count, align);
>>>>
>>>> ^ one space missing for proper indentation.
>>>>
>>>> But doing another spinlock cycle just for debugging purposes? That does
>>>> not feel right, sorry.
>>>
>>> If we're calling pr_debug() frequently enough for this to matter, we
>>> have other problems!
>>
>> We call it for each and every actual CMA allocation? I really don't see
>> why we want to just randomly make CMA allocation latency worse.
>
> pr_debug() is 12 million times more expensive than a spin_lock()!
>
>> Is the existing pr_debug() a problem? Maybe. But who actually has debug
>> messages enabled in any sane setup?
>
> Nobody, clearly.  If anyone enabled pr_debug() in here, they'd
> immediately have to remove those statements to get any work done.  Kill
> it.

I just learned that pr_debug() on a !CONFIG_DEBUG kernel translates to
no_printk(), which is just a mostly-empty macro that doesn't really use
any of the parameters.

I would assume the cma_get_used_pages() would get completely optimized
out in that case.

So, I don't care, but ... moving to tracing seems much more reasonable.

-- 
Cheers

David / dhildenb
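
For illustration, the no_printk() point above can be sketched in plain
userspace C. This is not the kernel's code: my_no_printk(),
fake_get_used_pages() and the two demo functions are hypothetical
stand-ins. The one assumption carried over is that the disabled variant
wraps the print call in an if (0) branch, so the arguments are still
type-checked by the compiler but never evaluated at run time.

```c
#include <stdio.h>

/* Hypothetical stand-in for cma_get_used_pages(); counts its own calls. */
static unsigned long used_pages_calls;

static unsigned long fake_get_used_pages(void)
{
	used_pages_calls++;	/* the real helper would take a spinlock here */
	return 42;
}

/*
 * Sketch of a no_printk()-style macro (assumption, userspace only):
 * the call sits in dead code, so its arguments are compiled and
 * type-checked but never executed. ##__VA_ARGS__ is a GNU extension
 * accepted by gcc and clang.
 */
#define my_no_printk(fmt, ...) \
	do { if (0) printf(fmt, ##__VA_ARGS__); } while (0)

/* Debug output disabled: returns how often the helper was evaluated. */
static unsigned long demo_debug_disabled(unsigned long count)
{
	used_pages_calls = 0;
	my_no_printk("used pages: %lu, request pages: %lu\n",
		     fake_get_used_pages(), count);
	return used_pages_calls;
}

/* Debug output enabled: the helper runs once per message. */
static unsigned long demo_debug_enabled(unsigned long count)
{
	used_pages_calls = 0;
	printf("used pages: %lu, request pages: %lu\n",
	       fake_get_used_pages(), count);
	return used_pages_calls;
}
```

With the disabled variant the helper is never called, which matches the
expectation that the extra lock cycle only costs anything once debug
output is actually switched on.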