From: Samuel Holland
To: yunhui cui
Cc: Palmer Dabbelt, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Alexandre Ghiti, Jisheng Zhang
Subject: Re: [External] [PATCH v5 08/13] riscv: Avoid TLB flush loops when affected by SiFive CIP-1200
Date: Mon, 11 Mar 2024 19:35:49 -0500
Message-ID: <100d5414-11fa-4c47-9c35-51f5fad2d6e6@sifive.com>
References: <20240229232211.161961-1-samuel.holland@sifive.com> <20240229232211.161961-9-samuel.holland@sifive.com>

Hi Yunhui,

On 2024-02-29 8:48 PM, yunhui cui wrote:
> Hi Samuel,
>
> On Fri, Mar 1, 2024 at 7:22 AM Samuel Holland wrote:
>>
>> Since implementations affected by SiFive errata CIP-1200 always use the
>> global variant of the sfence.vma instruction, they only need to execute
>> the instruction once. The range-based loop only hurts performance.
>>
>> Signed-off-by: Samuel Holland
>> ---
>>
>> (no changes since v4)
>>
>> Changes in v4:
>>  - Only set tlb_flush_all_threshold when CONFIG_MMU=y.
>>
>> Changes in v3:
>>  - New patch for v3
>>
>>  arch/riscv/errata/sifive/errata.c | 5 +++++
>>  arch/riscv/include/asm/tlbflush.h | 2 ++
>>  arch/riscv/mm/tlbflush.c          | 2 +-
>>  3 files changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/errata/sifive/errata.c b/arch/riscv/errata/sifive/errata.c
>> index 3d9a32d791f7..716cfedad3a2 100644
>> --- a/arch/riscv/errata/sifive/errata.c
>> +++ b/arch/riscv/errata/sifive/errata.c
>> @@ -42,6 +42,11 @@ static bool errata_cip_1200_check_func(unsigned long arch_id, unsigned long imp
>>  		return false;
>>  	if ((impid & 0xffffff) > 0x200630 || impid == 0x1200626)
>>  		return false;
>> +
>> +#ifdef CONFIG_MMU
>> +	tlb_flush_all_threshold = 0;
>> +#endif
>> +
>>  	return true;
>>  }
>>
>> diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
>> index 463b615d7728..8e329721375b 100644
>> --- a/arch/riscv/include/asm/tlbflush.h
>> +++ b/arch/riscv/include/asm/tlbflush.h
>> @@ -66,6 +66,8 @@ void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
>>  			       unsigned long uaddr);
>>  void arch_flush_tlb_batched_pending(struct mm_struct *mm);
>>  void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
>> +
>> +extern unsigned long tlb_flush_all_threshold;
>>  #else /* CONFIG_MMU */
>>  #define local_flush_tlb_all() do { } while (0)
>>  #endif /* CONFIG_MMU */
>> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
>> index 365e0a0e4725..22870f213188 100644
>> --- a/arch/riscv/mm/tlbflush.c
>> +++ b/arch/riscv/mm/tlbflush.c
>> @@ -11,7 +11,7 @@
>>   * Flush entire TLB if number of entries to be flushed is greater
>>   * than the threshold below.
>>   */
>> -static unsigned long tlb_flush_all_threshold __read_mostly = 64;
>> +unsigned long tlb_flush_all_threshold __read_mostly = 64;
>>
>>  static void local_flush_tlb_range_threshold_asid(unsigned long start,
>>  						 unsigned long size,
>> --
>> 2.43.1
>>
>
> If local_flush_tlb_all_asid() is used every time, more PTWs will be
> generated. Will such modifications definitely improve the overall
> performance?

The change in this commit applies specifically to older SiFive SoCs with
a bug that makes single-page sfence.vma instructions unsafe to use. In
this case, a single call to local_flush_tlb_all_asid() is optimal, yes.

> Hi Alex, Samuel,
> The relationship between flush_xx_range_asid() and nr_ptes is
> basically linear growth (y=kx +b), while flush_all_asid() has nothing
> to do with nr_ptes (y=c).
> Some TLBs may do some optimization. The operation of flush all itself
> requires very few cycles, but there is a certain delay between
> consecutive flush all.
> The intersection of the two straight lines is the optimal solution of
> tlb_flush_all_threshold. In actual situations, continuous
> flush_all_asid will not occur. One problem caused by flush_all_asid()
> is that multiple flush entries require PTW, which causes greater
> latency.
> Therefore, the value of tlb_flush_all_threshold needs to be considered
> or quantified. Maybe doing local_flush_tlb_page_asid() based on the
> actual nr_ptes_in_range would give better overall performance.
> What do you think?

Yes, this was something Alex brought up when adding this threshold, that
it should be tuned for various scenarios. That still needs to be done.
This patch just covers one specific case where we know the optimal
answer due to an erratum.

Regards,
Samuel
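
For readers following the threshold discussion, here is a minimal userspace sketch of the two ideas in this thread. It is not the kernel code. should_flush_all() only models the "greater than the threshold" rule from the comment in the quoted diff, and break_even_threshold() models the y = kx + b versus y = c intersection described above. Both helper names, the rounding-up of the range to whole entries, and all cost values are assumptions made for illustration; only tlb_flush_all_threshold and its default of 64 come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Same name and default as in the quoted diff; a plain global here. */
unsigned long tlb_flush_all_threshold = 64;

/*
 * Hypothetical helper: returns true when a range covering `size` bytes
 * with entries every `stride` bytes would be handled by one full flush
 * instead of a per-entry loop.
 */
bool should_flush_all(unsigned long size, unsigned long stride)
{
	unsigned long nr_ptes;

	assert(stride != 0);

	/* Round up so that a partial entry still counts as one entry. */
	nr_ptes = (size + stride - 1) / stride;

	return nr_ptes > tlb_flush_all_threshold;
}

/*
 * Hypothetical cost model from the thread: a range flush of x entries
 * costs k*x + b, a full flush costs a constant c (arbitrary units).
 * Returns the largest x for which the range flush is no more expensive
 * than the full flush, i.e. floor((c - b) / k).
 */
unsigned long break_even_threshold(unsigned long k, unsigned long b,
				   unsigned long c)
{
	/* A full flush is never more expensive: always prefer it. */
	if (k == 0 || c <= b)
		return 0;

	return (c - b) / k;
}
```

With the threshold set to 0, as the patch does for affected parts, should_flush_all() is true for any non-empty range, so the per-page loop is never taken. On other hardware, k, b and c would have to be measured before break_even_threshold() could suggest a value.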