Message-ID: <5820b18ef0ba48c33a62553fcc444c47f963b907.camel@surriel.com>
Subject: Re: [PATCH v6 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes
From: Rik van Riel
To: Peter Zijlstra
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, bp@alien8.de,
    dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
    nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
    linux-mm@kvack.org, akpm@linux-foundation.org, jannh@google.com,
    mhklinux@outlook.com, andrew.cooper3@citrix.com
Date: Wed, 22 Jan 2025 20:13:03 -0500
In-Reply-To: <20250122083835.GE7145@noisy.programming.kicks-ass.net>
References: <20250120024104.1924753-1-riel@surriel.com>
    <20250120024104.1924753-10-riel@surriel.com>
    <20250122083835.GE7145@noisy.programming.kicks-ass.net>

On Wed, 2025-01-22 at 09:38 +0100, Peter Zijlstra wrote:
>
> Looking at this more... I'm left wondering, did 'we' look at any other
> architecture code at all?
>
> For example, look at arch/arm64/mm/context.c and see how their reset
> works. Notably, they are not at all limited to reclaiming free'd ASIDs,
> but will very aggressively take back all ASIDs except for the current
> running ones.
>

I did look at the ARM64 code, and while their reset is much nicer,
it looks like that comes at a cost on each process at context
switch time.

In new_context(), there is a call to check_update_reserved_asid(),
which will iterate over all CPUs to check whether this process's
ASID is part of the reserved list that got carried over during
the rollover.

I don't know if that would scale well enough to work on systems
with thousands of CPUs.

> If we want to move towards relying on broadcast TBLI, we'll need to
> go in that direction.

For single threaded processes, which are still very common,
a local flush would likely be faster than broadcast flushes,
even if multiple broadcast flushes can be pending simultaneously.

For very large systems with a large number of processes, I agree
we want to move in that direction, but we may need to figure out
whether or not everybody taking the cpu_asid_lock at rollover
time, and then scanning all other CPUs from
check_update_reserved_asid(), with the lock held, would scale to
systems with thousands of CPUs.

Everybody taking the cpu_asid_lock would probably be fine, if
they didn't all have to scan over all the CPUs.

If we can figure out a more scalable way to do the new_context()
stuff, this would definitely be the way to go.

-- 
All Rights Reversed.
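
For reference, the scan that the scaling concern above is about can be
modelled in user space roughly as follows. This is a simplified sketch,
not the kernel code: MODEL_NR_CPUS and slots_scanned are hypothetical
names made up for illustration, and the plain array stands in for the
per-CPU reserved ASID storage. The only point is that every call visits
one slot per CPU, whether or not the ASID is found.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4096	/* hypothetical CPU count for this model */

/* Stand-in for the per-CPU reserved ASID slots (assumption: one per CPU). */
static uint64_t reserved_asids[MODEL_NR_CPUS];

/* Hypothetical counter, only here to make the O(nr CPUs) cost visible. */
static unsigned long slots_scanned;

/*
 * If 'asid' was reserved on any CPU at the last rollover, rewrite those
 * reservations to 'newasid' and report a hit. The loop cannot stop at
 * the first match, so every CPU's slot is visited on hit and on miss.
 */
static bool check_update_reserved_asid(uint64_t asid, uint64_t newasid)
{
	bool hit = false;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		slots_scanned++;
		if (reserved_asids[cpu] == asid) {
			hit = true;
			reserved_asids[cpu] = newasid;
		}
	}
	return hit;
}
```

In this model, one call costs MODEL_NR_CPUS slot visits. If every task
switching in after a rollover makes such a call while holding a single
global lock, the work done under the lock grows with both the number of
tasks and the number of CPUs, which is the part that may not scale.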