From: "Christoph Lameter (Ampere)"
Date: Thu, 24 Apr 2025 09:39:08 -0700 (PDT)
To: Mateusz Guzik
cc: Harry Yoo, Vlastimil Babka, David Rientjes, Andrew Morton, Dennis Zhou,
    Tejun Heo, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Vlad Buslov,
    Yevgeny Kliteynik, Jan Kara, Byungchul Park, linux-mm@kvack.org,
    netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/7] Reviving the slab destructor to tackle the percpu allocator scalability problem
References: <20250424080755.272925-1-harry.yoo@oracle.com> <80208a6c-ec42-6260-5f6f-b3c5c2788fcd@gentwo.org>

On Thu, 24 Apr 2025, Mateusz Guzik wrote:

> > You could allocate larger percpu areas for a batch of them and
> > then assign as needed.
>
> I was considering a mechanism like that earlier, but the changes
> needed to make it happen would result in worse state for the
> alloc/free path.
>
> RSS counters are embedded into mm with only the per-cpu areas being a
> pointer. The machinery maintains a global list of all of their
> instances, i.e. the pointers to internal to mm_struct. That is to say
> even if you deserialized allocation of percpu memory itself, you would
> still globally serialize on adding/removing the counters to the global
> list.
>
> But suppose this got reworked somehow and this bit ceases to be a problem.
>
> Another spot where mm alloc/free globally serializes (at least on
> x86_64) is pgd_alloc/free on the global pgd_lock.
>
> Suppose you managed to decompose the lock into a finer granularity, to
> the point where it does not pose a problem from contention standpoint.
> Even then that's work which does not have to happen there.
>
> General theme is there is a lot of expensive work happening when
> dealing with mm lifecycle (*both* from single- and multi-threaded
> standpoint) and preferably it would only be dealt with once per
> object's existence.

Maybe change the lifecycle? Allocate a batch of entries from the slab
allocator up front and reuse them for multiple mm_structs as the need
arises.

Do not free them back to the slab allocator until too many of them are
sitting around idle?

You may also want to avoid counter updates with this scheme if you only
count the batches in use. The counts will become a bit fuzzy but you
improve scalability.
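
A rough user-space sketch of the batching idea, not kernel code and not
taken from the patch series. Everything named here (struct obj, struct
obj_cache, BATCH_NR, IDLE_MAX, the cache_* helpers) is hypothetical, and
malloc()/free() merely stand in for the slab allocator. There is no
locking, so it only shows the lifecycle: the expensive setup runs once per
object when a batch is obtained, objects are reused across get/put cycles,
memory goes back only when too many objects sit idle, and the only counter
maintained is the number of batches held.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BATCH_NR 4  /* objects obtained from the backing allocator per refill */
#define IDLE_MAX 8  /* idle objects tolerated before a batch is handed back */

/* Hypothetical stand-in for an object with costly setup (think mm_struct). */
struct obj {
	struct obj *next;  /* idle-list link, only meaningful while idle */
	int initialized;   /* expensive state that survives reuse */
};

struct obj_cache {
	struct obj *idle;           /* singly linked list of idle objects */
	size_t nr_idle;             /* length of the idle list */
	size_t nr_batches;          /* batches currently held */
	size_t nr_expensive_inits;  /* how often the costly setup ran */
};

/* Obtain one batch and pay the setup cost once per object. */
static void cache_refill(struct obj_cache *c)
{
	for (int i = 0; i < BATCH_NR; i++) {
		struct obj *o = malloc(sizeof(*o));  /* stands in for a slab allocation */

		if (!o)
			abort();
		o->initialized = 1;  /* the expensive part happens only here */
		c->nr_expensive_inits++;
		o->next = c->idle;
		c->idle = o;
		c->nr_idle++;
	}
	c->nr_batches++;
}

/* Hand out an already initialized object, refilling only when empty. */
static struct obj *cache_get(struct obj_cache *c)
{
	struct obj *o;

	if (!c->idle)
		cache_refill(c);
	o = c->idle;
	c->idle = o->next;
	c->nr_idle--;
	return o;
}

/* Keep the object for reuse; trim one batch once too many are idle. */
static void cache_put(struct obj_cache *c, struct obj *o)
{
	o->next = c->idle;
	c->idle = o;
	c->nr_idle++;

	if (c->nr_idle > IDLE_MAX) {
		assert(c->nr_idle >= BATCH_NR);
		for (int i = 0; i < BATCH_NR; i++) {
			struct obj *victim = c->idle;

			c->idle = victim->next;
			c->nr_idle--;
			free(victim);  /* teardown would run here, once per object */
		}
		c->nr_batches--;
	}
}

/* Fuzzy upper bound on objects in use, derived from the batch count alone. */
static size_t cache_estimate(const struct obj_cache *c)
{
	return c->nr_batches * BATCH_NR;
}

/* Usage: six objects live at peak, then 100 get/put cycles on a warm cache. */
static size_t demo_reuse(struct obj_cache *c)
{
	struct obj *live[6];

	for (int i = 0; i < 6; i++)
		live[i] = cache_get(c);
	for (int i = 0; i < 6; i++)
		cache_put(c, live[i]);
	for (int i = 0; i < 100; i++)
		cache_put(c, cache_get(c));
	return c->nr_expensive_inits;
}
```

Starting from a zeroed struct obj_cache, demo_reuse() reports 8 expensive
initializations for 106 get/put pairs, and cache_estimate() overstates the
number of objects in use by at most the number of idle ones. A real
implementation would need per-CPU or otherwise scalable idle lists to avoid
reintroducing a global lock.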