From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
	dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
	akpm@linux-foundation.org, nadav.amit@gmail.com,
	zhengqi.arch@bytedance.com, linux-mm@kvack.org, Rik van Riel
Subject: [PATCH 11/12] x86/mm: enable AMD translation cache extensions
Date: Mon, 30 Dec 2024 12:53:12 -0500
Message-ID: <20241230175550.4046587-12-riel@surriel.com>
In-Reply-To: <20241230175550.4046587-1-riel@surriel.com>
References: <20241230175550.4046587-1-riel@surriel.com>

With AMD TCE (translation cache extensions) only the intermediate
mappings that cover the address range zapped by INVLPG / INVLPGB get
invalidated, rather than all intermediate mappings getting zapped at
every TLB invalidation.

This can help reduce the TLB miss rate, by keeping more intermediate
mappings in the cache.

From the AMD manual:

Translation Cache Extension (TCE) Bit. Bit 15, read/write.
Setting this bit to 1 changes how the INVLPG, INVLPGB, and INVPCID
instructions operate on TLB entries. When this bit is 0, these
instructions remove the target PTE from the TLB as well as all
upper-level table entries that are cached in the TLB, whether or not
they are associated with the target PTE. When this bit is set, these
instructions will remove the target PTE and only those upper-level
entries that lead to the target PTE in the page table hierarchy,
leaving unrelated upper-level entries intact.

Signed-off-by: Rik van Riel
---
 arch/x86/kernel/cpu/amd.c |  8 ++++++++
 arch/x86/mm/tlb.c         | 10 +++++++---
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 226b8fc64bfc..4dc42705aaca 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -1143,6 +1143,14 @@ static void cpu_detect_tlb_amd(struct cpuinfo_x86 *c)
 
 	/* Max number of pages INVLPGB can invalidate in one shot */
 	invlpgb_count_max = (edx & 0xffff) + 1;
+
+	/* If supported, enable translation cache extensions (TCE) */
+	cpuid(0x80000001, &eax, &ebx, &ecx, &edx);
+	if (ecx & BIT(17)) {
+		u64 msr = native_read_msr(MSR_EFER);
+		msr |= BIT(15);
+		wrmsrl(MSR_EFER, msr);
+	}
 }
 
 static const struct cpu_dev amd_cpu_dev = {
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 454a370494d3..585d0731ca9f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -477,7 +477,7 @@ static void broadcast_tlb_flush(struct flush_tlb_info *info)
 	if (info->stride_shift > PMD_SHIFT)
 		maxnr = 1;
 
-	if (info->end == TLB_FLUSH_ALL) {
+	if (info->end == TLB_FLUSH_ALL || info->freed_tables) {
 		invlpgb_flush_single_pcid(kern_pcid(asid));
 		/* Do any CPUs supporting INVLPGB need PTI? */
 		if (static_cpu_has(X86_FEATURE_PTI))
@@ -1110,7 +1110,7 @@ static void flush_tlb_func(void *info)
 	 *
 	 * The only question is whether to do a full or partial flush.
 	 *
-	 * We do a partial flush if requested and two extra conditions
+	 * We do a partial flush if requested and three extra conditions
 	 * are met:
 	 *
 	 * 1. f->new_tlb_gen == local_tlb_gen + 1. We have an invariant that
@@ -1137,10 +1137,14 @@ static void flush_tlb_func(void *info)
 	 *    date. By doing a full flush instead, we can increase
 	 *    local_tlb_gen all the way to mm_tlb_gen and we can probably
 	 *    avoid another flush in the very near future.
+	 *
+	 * 3. No page tables were freed. If page tables were freed, a full
+	 *    flush ensures intermediate translations in the TLB get flushed.
 	 */
 	if (f->end != TLB_FLUSH_ALL &&
 	    f->new_tlb_gen == local_tlb_gen + 1 &&
-	    f->new_tlb_gen == mm_tlb_gen) {
+	    f->new_tlb_gen == mm_tlb_gen &&
+	    !f->freed_tables) {
 		/* Partial flush */
 		unsigned long addr = f->start;
 
-- 
2.47.1
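
Illustration only, not part of the patch: a minimal user-space sketch of
the two decisions the patch makes, for readers who want to poke at the
logic outside the kernel. Every sketch_* / SKETCH_* name below is a
hypothetical stand-in, not a kernel symbol. The bit positions (CPUID
0x80000001 ECX bit 17, EFER bit 15) are the ones tested and set in the
amd.c hunk, and the partial-flush predicate mirrors the condition in
flush_tlb_func() after this change. The value chosen for
SKETCH_TLB_FLUSH_ALL is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
#define SKETCH_TLB_FLUSH_ALL	(~0UL)	/* assumed "flush everything" marker */
#define SKETCH_CPUID_TCE_BIT	17	/* CPUID 0x80000001 ECX bit tested by the patch */
#define SKETCH_EFER_TCE_BIT	15	/* EFER bit set by the patch */

/* Only the fields of the flush request that the decision looks at. */
struct sketch_flush_info {
	unsigned long end;
	uint64_t new_tlb_gen;
	bool freed_tables;
};

/*
 * A partial flush is allowed only if a range was requested, the request
 * is exactly one generation ahead of the local TLB, it brings the TLB
 * fully up to date, and no page tables were freed.
 */
static bool sketch_partial_flush_ok(const struct sketch_flush_info *f,
				    uint64_t local_tlb_gen,
				    uint64_t mm_tlb_gen)
{
	assert(f != NULL);

	return f->end != SKETCH_TLB_FLUSH_ALL &&
	       f->new_tlb_gen == local_tlb_gen + 1 &&
	       f->new_tlb_gen == mm_tlb_gen &&
	       !f->freed_tables;
}

/* Compute the EFER value to write back, given the CPUID ECX word. */
static uint64_t sketch_efer_with_tce(uint32_t cpuid_ecx, uint64_t efer)
{
	if (cpuid_ecx & (UINT32_C(1) << SKETCH_CPUID_TCE_BIT))
		efer |= UINT64_C(1) << SKETCH_EFER_TCE_BIT;

	return efer;
}
```

With TCE enabled, a ranged invalidation is no longer guaranteed to drop
unrelated upper-level entries, which is presumably why freed_tables
forces the full-flush path in both hunks of tlb.c. In the sketch,
setting freed_tables on an otherwise eligible request flips
sketch_partial_flush_ok() from true to false.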