From: yunhui cui
Date: Tue, 12 Mar 2024 09:51:58 +0800
Subject: Re: [External] [PATCH v5 08/13] riscv: Avoid TLB flush loops when affected by SiFive CIP-1200
To: Samuel Holland
Cc: Palmer Dabbelt, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Alexandre Ghiti, Jisheng Zhang

Hi Samuel,

On Tue, Mar 12, 2024 at 8:36 AM Samuel Holland wrote:
>
> Hi Yunhui,
>
> On 2024-02-29 8:48 PM, yunhui cui wrote:
> > Hi Samuel,
> >
> > On Fri, Mar 1, 2024 at 7:22 AM Samuel Holland wrote:
> >>
> >> Since implementations affected by SiFive errata CIP-1200 always use the
> >> global variant of the sfence.vma instruction, they only need to execute
> >> the instruction once. The range-based loop only hurts performance.
> >>
> >> Signed-off-by: Samuel Holland
> >> ---
> >>
> >> (no changes since v4)
> >>
> >> Changes in v4:
> >>  - Only set tlb_flush_all_threshold when CONFIG_MMU=y.
> >>
> >> Changes in v3:
> >>  - New patch for v3
> >>
> >>  arch/riscv/errata/sifive/errata.c | 5 +++++
> >>  arch/riscv/include/asm/tlbflush.h | 2 ++
> >>  arch/riscv/mm/tlbflush.c          | 2 +-
> >>  3 files changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/riscv/errata/sifive/errata.c b/arch/riscv/errata/sifive/errata.c
> >> index 3d9a32d791f7..716cfedad3a2 100644
> >> --- a/arch/riscv/errata/sifive/errata.c
> >> +++ b/arch/riscv/errata/sifive/errata.c
> >> @@ -42,6 +42,11 @@ static bool errata_cip_1200_check_func(unsigned long arch_id, unsigned long imp
> >>                 return false;
> >>         if ((impid & 0xffffff) > 0x200630 || impid == 0x1200626)
> >>                 return false;
> >> +
> >> +#ifdef CONFIG_MMU
> >> +       tlb_flush_all_threshold = 0;
> >> +#endif
> >> +
> >>         return true;
> >>  }
> >>
> >> diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
> >> index 463b615d7728..8e329721375b 100644
> >> --- a/arch/riscv/include/asm/tlbflush.h
> >> +++ b/arch/riscv/include/asm/tlbflush.h
> >> @@ -66,6 +66,8 @@ void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
> >>                                unsigned long uaddr);
> >>  void arch_flush_tlb_batched_pending(struct mm_struct *mm);
> >>  void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
> >> +
> >> +extern unsigned long tlb_flush_all_threshold;
> >>  #else /* CONFIG_MMU */
> >>  #define local_flush_tlb_all() do { } while (0)
> >>  #endif /* CONFIG_MMU */
> >> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> >> index 365e0a0e4725..22870f213188 100644
> >> --- a/arch/riscv/mm/tlbflush.c
> >> +++ b/arch/riscv/mm/tlbflush.c
> >> @@ -11,7 +11,7 @@
> >>   * Flush entire TLB if number of entries to be flushed is greater
> >>   * than the threshold below.
> >>   */
> >> -static unsigned long tlb_flush_all_threshold __read_mostly = 64;
> >> +unsigned long tlb_flush_all_threshold __read_mostly = 64;
> >>
> >>  static void local_flush_tlb_range_threshold_asid(unsigned long start,
> >>                                                   unsigned long size,
> >> --
> >> 2.43.1
> >>
> >
> > If local_flush_tlb_all_asid() is used every time, more PTWs will be
> > generated. Will such modifications definitely improve the overall
> > performance?
>
> This change in this commit specifically applies to older SiFive SoCs with a bug
> making single-page sfence.vma instructions unsafe to use. In this case, a single
> call to local_flush_tlb_all_asid() is optimal, yes.

Would it be clearer to add this explanation to the git commit
description?

>
> > Hi Alex, Samuel,
> > The relationship between flush_xx_range_asid() and nr_ptes is
> > basically linear growth (y=kx +b), while flush_all_asid() has nothing
> > to do with nr_ptes (y=c).
> > Some TLBs may do some optimization. The operation of flush all itself
> > requires very few cycles, but there is a certain delay between
> > consecutive flush all.
> > The intersection of the two straight lines is the optimal solution of
> > tlb_flush_all_threshold. In actual situations, continuous
> > flush_all_asid will not occur. One problem caused by flush_all_asid()
> > is that multiple flush entries require PTW, which causes greater
> > latency.
> > Therefore, the value of tlb_flush_all_threshold needs to be considered
> > or quantified. Maybe doing local_flush_tlb_page_asid() based on the
> > actual nr_ptes_in_range would give better overall performance.
> > What do you think?
>
> Yes, this was something Alex brought up when adding this threshold, that it
> should be tuned for various scenarios. That still needs to be done. This patch
> just covers one specific case where we know the optimal answer due to an erratum.
>
> Regards,
> Samuel
>

Thanks,
Yunhui
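
P.S. To make the break-even argument from the thread concrete, here is a
rough sketch. It assumes a ranged flush of n PTEs costs about k*n + b and a
full flush costs a constant c, so the lines cross at n = (c - b) / k. The
struct, the function name and all cost values are hypothetical and would
have to be measured per platform; this is not kernel code.

```c
#include <assert.h>

/*
 * Hypothetical cost model (an assumption for illustration, not kernel code):
 *   ranged flush of n PTEs: cost = k * n + b
 *   full flush:             cost = c
 */
struct flush_cost_model {
	unsigned long k; /* assumed cost of one single-page flush */
	unsigned long b; /* assumed fixed overhead of the ranged loop */
	unsigned long c; /* assumed cost of a full flush, including later PTWs */
};

/*
 * Return the largest n for which the ranged flush is no more expensive
 * than a full flush, i.e. the largest n with k * n + b <= c.
 * A result of 0 means the full flush is never the more expensive choice
 * for n >= 1.
 */
static unsigned long break_even_threshold(const struct flush_cost_model *m)
{
	assert(m->k > 0); /* a zero per-page cost would make the model meaningless */

	if (m->c <= m->b)
		return 0;

	return (m->c - m->b) / m->k;
}
```

With the made-up values k = 10, b = 5 and c = 645 the function returns 64.
When a full flush is never more expensive than the loop (c <= b), it
returns 0, which is the value the patch assigns on CIP-1200 parts.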