Message-ID: <0a24ce5d1ae0c782c9c676466bd7051693852997.camel@surriel.com>
Subject: Re: [PATCH v3 00/12] AMD broadcast TLB invalidation
From: Rik van Riel
To: Dave Hansen, x86@kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
 dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
 tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
 akpm@linux-foundation.org, nadav.amit@gmail.com,
 zhengqi.arch@bytedance.com, linux-mm@kvack.org
Date: Sat, 11 Jan 2025 21:46:53 -0500
References: <20241230175550.4046587-1-riel@surriel.com>

On Mon, 2025-01-06 at 11:03 -0800, Dave Hansen wrote:
>
> So can we call them "global", "shared" or "system" ASIDs, please?
>

I have renamed them to global ASIDs.

> Second, the TLB_NR_DYN_ASIDS was picked because it's roughly the number
> of distinct PCIDs that the CPU can keep in the TLB at once (at least on
> Intel). Let's say a CPU has 6 mm's in the per-cpu ASID space and another
> 6 in the shared/broadcast space. At that point, PCIDs might not be doing
> much good because the TLB can't store entries for 12 PCIDs.
>

If the CPU has 12 runnable processes, we may have various other
performance issues, too, like the system simply not having enough
CPU power to run all the runnable tasks.

Most of the systems I have looked at seem to average between 0.2
and 2 runnable tasks per CPU, depending on whether the workload
is CPU bound, or memory/IO bound.

> Is there any comprehension in this series? Should we be indexing
> cpu_tlbstate.ctxs[] by a *context* number rather than by the ASID that
> it's running as?
>

We only need cpu_tlbstate.ctxs[] for the per-CPU ASID space, in
order to look up which process is assigned to which slot.

We do not need it for global ASID numbers, which are always
the same everywhere.

> Last, I'm not 100% convinced we want to do this whole thing. The
> will-it-scale numbers are nice. But given the complexity of this, I
> think we need some actual, real end users to stand up and say exactly
> how this is important in *PRODUCTION* to them.
>

Do any of these count? :)

https://www.phoronix.com/review/amd-invlpgb-linux

I am hoping to gather some real world numbers as well, and
will work with some workload owners to get some numbers.

-- 
All Rights Reversed.
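
For illustration only, here is a rough userspace sketch of the lookup
described above: a per-CPU table is consulted only for dynamic ASIDs,
while a global ASID is used directly because it means the same thing on
every CPU. This is not code from the patch series. The names
cpu_tlbstate_sketch, mm_sketch and choose_asid are hypothetical, and the
sketch assumes that global ASIDs are numbered at or above
TLB_NR_DYN_ASIDS, that ctx_id 0 marks an empty slot, and that slots are
recycled round-robin.

```c
#include <assert.h>
#include <stdint.h>

#define TLB_NR_DYN_ASIDS 6      /* per-CPU dynamic ASID slots */

/* Hypothetical stand-in for one entry of cpu_tlbstate.ctxs[]. */
struct tlb_context {
    uint64_t ctx_id;            /* which mm owns this slot; 0 = empty (assumption) */
};

/* Hypothetical per-CPU state: only the dynamic ASID space is tracked. */
struct cpu_tlbstate_sketch {
    struct tlb_context ctxs[TLB_NR_DYN_ASIDS];
};

/* Hypothetical mm: global_asid == 0 means "no global ASID assigned". */
struct mm_sketch {
    uint64_t ctx_id;            /* assumed non-zero for a live mm */
    uint16_t global_asid;       /* assumed >= TLB_NR_DYN_ASIDS when set */
};

/*
 * Pick the ASID to run mm with on this CPU.
 * A global ASID needs no per-CPU lookup at all. Otherwise search the
 * per-CPU slots, and recycle one (requesting a flush) if mm has none.
 */
static int choose_asid(struct cpu_tlbstate_sketch *cpu,
                       const struct mm_sketch *mm,
                       unsigned int *next_slot, int *need_flush)
{
    unsigned int i, slot;

    assert(cpu && mm && next_slot && need_flush);

    if (mm->global_asid) {
        *need_flush = 0;
        return mm->global_asid;
    }

    for (i = 0; i < TLB_NR_DYN_ASIDS; i++) {
        if (cpu->ctxs[i].ctx_id == mm->ctx_id) {
            *need_flush = 0;    /* slot already belongs to this mm */
            return (int)i;
        }
    }

    /* No slot found: take the next one round-robin and flush it. */
    slot = (*next_slot)++ % TLB_NR_DYN_ASIDS;
    cpu->ctxs[slot].ctx_id = mm->ctx_id;
    *need_flush = 1;
    return (int)slot;
}
```

In this sketch an mm with a global ASID never touches ctxs[], so it
does not compete for the TLB_NR_DYN_ASIDS per-CPU slots.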