From: Zhenyu Ye
Subject: [RFC PATCH v5 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64
Date: Wed, 8 Jul 2020 20:40:31 +0800
Message-ID: <20200708124031.1414-3-yezhenyu2@huawei.com>
In-Reply-To: <20200708124031.1414-1-yezhenyu2@huawei.com>
References: <20200708124031.1414-1-yezhenyu2@huawei.com>

Add the __TLBI_VADDR_RANGE macro and rewrite __flush_tlb_range().
In this patch, the TLBI RANGE feature is only used when stride == PAGE_SIZE,
because when stride > PAGE_SIZE, usually only a small number of pages need
to be flushed and the classic tlbi instructions are more effective.

We could also use 'end - start < threshold' to decide which way to go;
however, different hardware may have different thresholds, so I'm not sure
whether that is feasible.

Signed-off-by: Zhenyu Ye
---
 arch/arm64/include/asm/tlbflush.h | 104 ++++++++++++++++++++++++++----
 1 file changed, 90 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index bc3949064725..30975ddb8f06 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -50,6 +50,16 @@
 		__tlbi(op, (arg) | USER_ASID_FLAG);				\
 } while (0)
 
+#define __tlbi_last_level(op1, op2, arg, last_level) do {		\
+	if (last_level) {						\
+		__tlbi(op1, arg);					\
+		__tlbi_user(op1, arg);					\
+	} else {							\
+		__tlbi(op2, arg);					\
+		__tlbi_user(op2, arg);					\
+	}								\
+} while (0)
+
 /* This macro creates a properly formatted VA operand for the TLBI */
 #define __TLBI_VADDR(addr, asid)					\
 	({								\
@@ -59,6 +69,60 @@
 		__ta;							\
 	})
 
+/*
+ * Get translation granule of the system, which is decided by
+ * PAGE_SIZE.  Used by TTL.
+ *  - 4KB	: 1
+ *  - 16KB	: 2
+ *  - 64KB	: 3
+ */
+static inline unsigned long get_trans_granule(void)
+{
+	switch (PAGE_SIZE) {
+	case SZ_4K:
+		return 1;
+	case SZ_16K:
+		return 2;
+	case SZ_64K:
+		return 3;
+	default:
+		return 0;
+	}
+}
+
+/*
+ * This macro creates a properly formatted VA operand for the TLBI RANGE.
+ * The value bit assignments are:
+ *
+ * +----------+------+-------+-------+-------+----------------------+
+ * |   ASID   |  TG  | SCALE |  NUM  |  TTL  |        BADDR         |
+ * +-----------------+-------+-------+-------+----------------------+
+ * |63      48|47  46|45   44|43   39|38   37|36                   0|
+ *
+ * The address range is determined by the following formula:
+ * [BADDR, BADDR + (NUM + 1) * 2^(5*SCALE + 1) * PAGESIZE)
+ *
+ */
+#define __TLBI_VADDR_RANGE(addr, asid, scale, num, ttl)			\
+	({								\
+		unsigned long __ta = (addr) >> PAGE_SHIFT;		\
+		__ta &= GENMASK_ULL(36, 0);				\
+		__ta |= (unsigned long)(ttl) << 37;			\
+		__ta |= (unsigned long)(num) << 39;			\
+		__ta |= (unsigned long)(scale) << 44;			\
+		__ta |= get_trans_granule() << 46;			\
+		__ta |= (unsigned long)(asid) << 48;			\
+		__ta;							\
+	})
+
+/* These macros are used by the TLBI RANGE feature. */
+#define __TLBI_RANGE_PAGES(num, scale)	(((num) + 1) << (5 * (scale) + 1))
+#define MAX_TLBI_RANGE_PAGES		__TLBI_RANGE_PAGES(31, 3)
+
+#define TLBI_RANGE_MASK			GENMASK_ULL(4, 0)
+#define __TLBI_RANGE_NUM(range, scale)	\
+	(((range) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK)
+
 /*
  * TLB Invalidation
  * ================
@@ -181,32 +245,44 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 				     unsigned long start, unsigned long end,
 				     unsigned long stride, bool last_level)
 {
+	int num = 0;
+	int scale = 0;
 	unsigned long asid = ASID(vma->vm_mm);
 	unsigned long addr;
+	unsigned long range_pages;
 
 	start = round_down(start, stride);
 	end = round_up(end, stride);
+	range_pages = (end - start) >> PAGE_SHIFT;
 
-	if ((end - start) >= (MAX_TLBI_OPS * stride)) {
+	if ((!cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) &&
+	     (end - start) >= (MAX_TLBI_OPS * stride)) ||
+	    range_pages >= MAX_TLBI_RANGE_PAGES) {
 		flush_tlb_mm(vma->vm_mm);
 		return;
 	}
 
-	/* Convert the stride into units of 4k */
-	stride >>= 12;
-
-	start = __TLBI_VADDR(start, asid);
-	end = __TLBI_VADDR(end, asid);
-
 	dsb(ishst);
-	for (addr = start; addr < end; addr += stride) {
-		if (last_level) {
-			__tlbi(vale1is, addr);
-			__tlbi_user(vale1is, addr);
-		} else {
-			__tlbi(vae1is, addr);
-			__tlbi_user(vae1is, addr);
+	while (range_pages > 0) {
+		if (cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) &&
+		    stride == PAGE_SIZE && range_pages % 2 == 0) {
+			num = __TLBI_RANGE_NUM(range_pages, scale) - 1;
+			if (num >= 0) {
+				addr = __TLBI_VADDR_RANGE(start, asid, scale,
+							  num, 0);
+				__tlbi_last_level(rvale1is, rvae1is, addr,
+						  last_level);
+				start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT;
+				range_pages -= __TLBI_RANGE_PAGES(num, scale);
+			}
+			scale++;
+			continue;
 		}
+
+		addr = __TLBI_VADDR(start, asid);
+		__tlbi_last_level(vale1is, vae1is, addr, last_level);
+		start += stride;
+		range_pages -= stride >> PAGE_SHIFT;
 	}
 	dsb(ish);
 }
-- 
2.19.1
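
[Note: the sketch below is not part of the patch. It is a small user-space
illustration of how the while loop above splits a flush of N pages at
PAGE_SIZE stride into TLBI RANGE operations, reusing the
__TLBI_RANGE_PAGES/__TLBI_RANGE_NUM formulas exactly as defined in the
patch. The decompose()/main()/printf scaffolding is purely hypothetical,
and it assumes a CPU with ARM64_HAS_TLBI_RANGE and an input below
MAX_TLBI_RANGE_PAGES, since the patch falls back to flush_tlb_mm() or
per-page TLBIs otherwise.]

#include <stdio.h>

/* Formulas copied from the patch. */
#define __TLBI_RANGE_PAGES(num, scale)	(((num) + 1) << (5 * (scale) + 1))
#define MAX_TLBI_RANGE_PAGES		__TLBI_RANGE_PAGES(31, 3)
#define TLBI_RANGE_MASK			0x1fUL	/* GENMASK_ULL(4, 0) */
#define __TLBI_RANGE_NUM(range, scale)	\
	(((range) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK)

/*
 * Mirror of the loop in __flush_tlb_range() for stride == PAGE_SIZE:
 * while pages remain, try a range op at the current scale; an odd
 * remainder falls back to one classic per-page invalidation.
 */
static void decompose(unsigned long pages)
{
	int scale = 0;

	printf("%lu pages:\n", pages);
	while (pages > 0) {
		if (pages % 2 == 0) {
			long num = (long)__TLBI_RANGE_NUM(pages, scale) - 1;

			if (num >= 0) {
				unsigned long chunk = __TLBI_RANGE_PAGES(num, scale);

				printf("  rva*e1is: scale=%d num=%ld -> %lu pages\n",
				       scale, num, chunk);
				pages -= chunk;
			}
			scale++;
			continue;
		}
		/* Odd remainder: one classic va*e1is on a single page. */
		printf("  va*e1is: 1 page\n");
		pages -= 1;
	}
}

int main(void)
{
	decompose(7);	/* one classic page, then one range op of 6 pages */
	decompose(256);	/* a single range op: scale=1, num=3 */
	return 0;
}

For example, 7 pages become one per-page invalidation plus a single range
operation covering 6 pages (scale=0, num=2), and 256 pages collapse into one
range operation (scale=1, num=3).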