Message-ID: <40f1b5ad-2165-bb81-1ff5-89786373fa14@arm.com>
Date: Mon, 14 Nov 2022 19:49:13 +0530
Subject: Re: [PATCH v5 2/2] arm64: support batched/deferred tlb shootdown during page reclamation
From: Anshuman Khandual
To: Yicong Yang, akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-arm-kernel@lists.infradead.org, x86@kernel.org, catalin.marinas@arm.com,
 will@kernel.org, linux-doc@vger.kernel.org
Cc: yangyicong@hisilicon.com, corbet@lwn.net, peterz@infradead.org, arnd@arndb.de,
 punit.agrawal@bytedance.com, linux-kernel@vger.kernel.org,
 darren@os.amperecomputing.com, huzhanyuan@oppo.com, lipeifeng@oppo.com,
 zhangshiming@oppo.com, guojian@oppo.com, realmz6@gmail.com,
 linux-mips@vger.kernel.org, openrisc@lists.librecores.org,
 linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
 linux-s390@vger.kernel.org, Barry Song <21cnbao@gmail.com>,
 wangkefeng.wang@huawei.com, xhao@linux.alibaba.com, prime.zeng@hisilicon.com,
 Barry Song, Nadav Amit, Mel Gorman
References: <20221028081255.19157-1-yangyicong@huawei.com>
 <20221028081255.19157-3-yangyicong@huawei.com>
 <86fbdc8c-0dcb-9b8f-d843-63460d8b1d6a@arm.com>
 <9982dac0-9f2e-112a-d440-467c8e8f8aa4@huawei.com>
In-Reply-To: <9982dac0-9f2e-112a-d440-467c8e8f8aa4@huawei.com>

On 11/14/22 14:16, Yicong Yang wrote:
> On 2022/11/14 11:29, Anshuman Khandual wrote:
>>
>> On 10/28/22 13:42, Yicong Yang wrote:
>>> +static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
>>> +{
>>> +	/*
>>> +	 * TLB batched flush is proved to be beneficial for systems with large
>>> +	 * number of CPUs, especially system with more than 8 CPUs. TLB shutdown
>>> +	 * is cheap on small systems which may not need this feature. So use
>>> +	 * a threshold for enabling this to avoid potential side effects on
>>> +	 * these platforms.
>>> +	 */
>>> +	if (num_online_cpus() <= CONFIG_ARM64_NR_CPUS_FOR_BATCHED_TLB)
>>> +		return false;
>>> +
>>> +#ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI
>>> +	if (unlikely(this_cpu_has_cap(ARM64_WORKAROUND_REPEAT_TLBI)))
>>> +		return false;
>>> +#endif
>> should_defer_flush() is immediately followed by set_tlb_ubc_flush_pending() which calls
>> arch_tlbbatch_add_mm(), triggering the actual TLBI flush via __flush_tlb_page_nosync().
>> It should be okay to check capability with this_cpu_has_cap() as the entire call chain
>> here is executed on the same cpu. But just wondering if cpus_have_const_cap() would be
>> simpler, consistent, and also cost effective ?
>>
> ok. Checked cpus_have_const_cap() I think it matches your words.
>
>> Regardless, a comment is needed before the #ifdef block explaining why it does not make
>> sense to defer/batch when __tlbi()/__tlbi_user() implementation will execute 'dsb(ish)'
>> between two TLBI instructions to workaround the errata.
>>
> The workaround for the errata mentioned the affected platforms need the tlbi+dsb to be done
> twice, so I'm not sure if we defer the final dsb will cause any problem so I think the judgement
> here is used for safety. I have no such platform to test if it's ok to defer the last dsb.

We should not defer the TLB flush on such systems, as ensured by the above test and the
'false' return that follows it. The only question is whether this decision should be taken
at the level of the individual CPU (which is affected by the errata) or at the level of the
whole system.

What is required now:

- Replace this_cpu_has_cap() with cpus_have_const_cap()?
- Add the following comment before the #ifdef check

/*
 * TLB flush deferral is not required on systems which are affected by
 * ARM64_WORKAROUND_REPEAT_TLBI, as the __tlbi()/__tlbi_user() implementation
 * will have two consecutive TLBI instructions with a dsb(ish) in between,
 * defeating the purpose (i.e. saving the overall 'dsb ish' cost).
 */
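
To make the reasoning behind that comment concrete, here is a rough user-space cost model. It is not kernel code and it is only a sketch under stated assumptions: every name in it (struct tlb_cost, model_tlbi(), model_immediate_flush(), model_batched_flush(), model_should_defer()) is hypothetical, it only counts instructions, and it ignores the barrier issued before each TLBI. It assumes, as described in this thread, that the workaround turns each invalidation into "TLBI; dsb(ish); TLBI".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy cost model (hypothetical, not kernel code): instructions issued per flush. */
struct tlb_cost {
	size_t tlbi;	/* TLBI instructions issued */
	size_t dsb;	/* dsb(ish) barriers issued */
};

/*
 * Model of a single invalidation. With the REPEAT_TLBI workaround the
 * invalidation is assumed to become: TLBI; dsb(ish); TLBI.
 */
static void model_tlbi(struct tlb_cost *c, bool repeat_tlbi)
{
	assert(c != NULL);
	c->tlbi++;
	if (repeat_tlbi) {
		c->dsb++;
		c->tlbi++;
	}
}

/* Non-batched path: every page gets its own completion barrier. */
static struct tlb_cost model_immediate_flush(size_t pages, bool repeat_tlbi)
{
	struct tlb_cost c = { 0, 0 };

	for (size_t i = 0; i < pages; i++) {
		model_tlbi(&c, repeat_tlbi);
		c.dsb++;
	}
	return c;
}

/* Batched path: invalidate each page without waiting, then one deferred barrier. */
static struct tlb_cost model_batched_flush(size_t pages, bool repeat_tlbi)
{
	struct tlb_cost c = { 0, 0 };

	for (size_t i = 0; i < pages; i++)
		model_tlbi(&c, repeat_tlbi);
	if (pages > 0)
		c.dsb++;
	return c;
}

/* Mirrors the two checks in the quoted patch, with plain parameters. */
static bool model_should_defer(unsigned int online_cpus, unsigned int threshold,
			       bool repeat_tlbi)
{
	if (online_cpus <= threshold)
		return false;
	if (repeat_tlbi)
		return false;
	return true;
}
```

In this model, flushing 32 pages without the workaround costs 32 barriers when done immediately and 1 when batched. With the workaround the batched path still issues 33 barriers (one inside each invalidation plus the deferred one), so most of the saving that batching is meant to deliver is gone, which is the point the proposed comment makes.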