Subject: Re: [PATCH v2 2/2] arm64: tlb: Use the TLBI RANGE feature in arm64
From: Zhenyu Ye
To: Catalin Marinas
Date: Sat, 11 Jul 2020 14:50:46 +0800
References: <20200710094420.517-1-yezhenyu2@huawei.com> <20200710094420.517-3-yezhenyu2@huawei.com> <20200710183158.GE11839@gaia>
In-Reply-To: <20200710183158.GE11839@gaia>

Hi Catalin,

On 2020/7/11 2:31, Catalin Marinas wrote:
> On Fri, Jul 10, 2020 at 05:44:20PM +0800, Zhenyu Ye wrote:
>> -	if ((end - start) >= (MAX_TLBI_OPS * stride)) {
>> +	if ((!cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) &&
>> +	     (end - start) >= (MAX_TLBI_OPS * stride)) ||
>> +	    pages >= MAX_TLBI_RANGE_PAGES) {
>> 		flush_tlb_mm(vma->vm_mm);
>> 		return;
>> 	}
>
> I think we can use strictly greater here rather than greater or equal.
> MAX_TLBI_RANGE_PAGES can be encoded as num 31, scale 3.

Sorry, we can't.  For a boundary value (such as 2^6 pages), we have two
ways to express it in TLBI RANGE operations:

1. scale = 0, num = 31.
2. scale = 1, num = 0.

I used the second way in the following implementation.  However,
MAX_TLBI_RANGE_PAGES can only be expressed as scale = 3, num = 31.  So
if we use strictly greater here, an error will happen when the number of
pages in the range is equal to MAX_TLBI_RANGE_PAGES.

There are two ways to avoid this bug:

1. Just keep 'greater or equal' here.  The ARM64 specification does not
   specify how we must flush tlb entries in this case, and
   flush_tlb_mm() is also a good choice for such a wide range of pages.

2. Add a check in the loop, like below (though this makes the code a
   bit ugly):

	num = __TLBI_RANGE_NUM(pages, scale) - 1;
	/* scale = 4, num = 0 is equal to scale = 3, num = 31. */
	if (scale == 4 && num == 0) {
		scale = 3;
		num = 31;
	}
	if (num >= 0) {
		...

Which one do you prefer, and how do you want to fix this error?  Just a
fix patch again?

>
>> -	/* Convert the stride into units of 4k */
>> -	stride >>= 12;
>> +	dsb(ishst);
>>
>> -	start = __TLBI_VADDR(start, asid);
>> -	end = __TLBI_VADDR(end, asid);
>> +	/*
>> +	 * When cpu does not support TLBI RANGE feature, we flush the tlb
>> +	 * entries one by one at the granularity of 'stride'.
>> +	 * When cpu supports the TLBI RANGE feature, then:
>> +	 *    1. If pages is odd, flush the first page through non-RANGE
>> +	 *       instruction;
>> +	 *    2. For remaining pages: The minimum range granularity is
>> +	 *       decided by 'scale', so we can not flush all pages by one
>> +	 *       instruction in some cases.
>> +	 *       Here, we start from scale = 0, flush the corresponding
>> +	 *       pages (from 2^(5*scale + 1) to 2^(5*(scale + 1) + 1)),
>> +	 *       and increase it until no pages are left.
>> +	 */
>> +	while (pages > 0) {
>
> I did some simple checks on ((end - start) % stride) and never
> triggered.  I had a slight worry that pages could become negative (and
> we'd loop forever since it's unsigned long) for some mismatched stride
> and flush size.  It doesn't seem like it.
>

The start and end are round_down/up in the function:

	start = round_down(start, stride);
	end = round_up(end, stride);

So the flush size and stride will never mismatch.

Thanks,
Zhenyu