From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 25 Apr 2025 05:35:22 +0100
From: Matthew Wilcox <willy@infradead.org>
To: Huan Yang
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
	Muchun Song, Andrew Morton, Petr Mladek, Vlastimil Babka,
	Rasmus Villemoes, Francesco Valla, Raul E Rangel,
	"Paul E. McKenney", Huang Shijie, Guo Weikang,
	"Uladzislau Rezki (Sony)", KP Singh, cgroups@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	opensource.kernel@vivo.com
Subject: Re: [PATCH v3 0/3] Use kmem_cache for memcg alloc
References: <20250425031935.76411-1-link@vivo.com>
In-Reply-To: <20250425031935.76411-1-link@vivo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Fri, Apr 25, 2025 at 11:19:22AM +0800, Huan Yang wrote:
> Key Observations:
> 1. Both structures use kmalloc with requested sizes between 2KB and 4KB.
> 2. Allocation alignment forces 4KB slab usage due to the predefined
>    kmalloc sizes (64B, 128B, ..., 2KB, 4KB, 8KB).
> 3. Memory waste per memcg instance:
>    Base struct:     4096 - 2312 = 1784 bytes
>    Per-node struct: 4096 - 2896 = 1200 bytes
>    Total waste:     2984 bytes (1-node system)
>    NUMA scaling:    (1200 + 8) * nr_node_ids bytes
> So, it's a little wasteful.
[...]
> This indicates that the `mem_cgroup` struct now requests 2312 bytes and
> is allocated 2368 bytes, while `mem_cgroup_per_node` requests 2896 bytes
> and is allocated 2944 bytes.
> The slight increase in allocated size is due to `SLAB_HWCACHE_ALIGN` in
> the `kmem_cache`.
>
> Without `SLAB_HWCACHE_ALIGN`, the allocation might appear as:
>
> # mem_cgroup struct allocation
>   sh-9269  [003] .....  80.396366: kmem_cache_alloc:
>     call_site=mem_cgroup_css_alloc+0xbc/0x5d4 ptr=000000005b12b475
>     bytes_req=2312 bytes_alloc=2312 gfp_flags=GFP_KERNEL|__GFP_ZERO
>     node=-1 accounted=false
>
> # mem_cgroup_per_node allocation
>   sh-9269  [003] .....  80.396411: kmem_cache_alloc:
>     call_site=mem_cgroup_css_alloc+0x1b8/0x5d4 ptr=00000000f347adc6
>     bytes_req=2896 bytes_alloc=2896 gfp_flags=GFP_KERNEL|__GFP_ZERO
>     node=0 accounted=false
>
> While `bytes_alloc` now matches `bytes_req`, this patchset defaults to
> using `SLAB_HWCACHE_ALIGN`, as it is generally considered more
> beneficial for performance. Please let me know if there are any issues
> or if I've misunderstood anything.

This isn't really the right way to think about this.  Memory is
ultimately allocated from the page allocator, so what you want to know
is how many objects you get per page.
Before, it's one object per page (since both objects are between 2k and
4k, and so rounded up to 4k).  After, slab will create slabs of a
certain order to minimise waste, while also not inflating the allocation
order too high.  Let's assume it goes all the way to order 3 (like
kmalloc-4k does), so you want to know how many objects fit in a 32KiB
allocation.

With HWCACHE_ALIGN, you get floor(32768/2368) = 13 and
floor(32768/2944) = 11.  Without HWCACHE_ALIGN, you get
floor(32768/2312) = 14 and floor(32768/2896) = 11.  So there is a
packing advantage to turning off HWCACHE_ALIGN for the first cache (no
difference for the second).  BUT!  Now you have cacheline aliasing
between adjacent objects, and that's probably bad.  It's the kind of
performance problem that's really hard to see.

Anyway, you've gone from allocating 8 objects per 32KiB to allocating
13 objects per 32KiB, a 62% improvement in memory consumption.