Date: Thu, 5 Jan 2023 18:14:58 +0000
From: Catalin Marinas
To: Yicong Yang
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-arm-kernel@lists.infradead.org, x86@kernel.org, will@kernel.org,
 anshuman.khandual@arm.com, linux-doc@vger.kernel.org, corbet@lwn.net,
 peterz@infradead.org, arnd@arndb.de, punit.agrawal@bytedance.com,
 linux-kernel@vger.kernel.org, darren@os.amperecomputing.com,
 yangyicong@hisilicon.com, huzhanyuan@oppo.com, lipeifeng@oppo.com,
 zhangshiming@oppo.com, guojian@oppo.com, realmz6@gmail.com,
 linux-mips@vger.kernel.org, openrisc@lists.librecores.org,
 linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
 linux-s390@vger.kernel.org, Barry Song <21cnbao@gmail.com>,
 wangkefeng.wang@huawei.com, xhao@linux.alibaba.com,
 prime.zeng@hisilicon.com, Barry Song, Nadav Amit, Mel Gorman
Subject: Re: [PATCH v7 2/2] arm64: support batched/deferred tlb shootdown
 during page reclamation
References: <20221117082648.47526-1-yangyicong@huawei.com>
 <20221117082648.47526-3-yangyicong@huawei.com>
In-Reply-To: <20221117082648.47526-3-yangyicong@huawei.com>

On Thu, Nov 17, 2022 at 04:26:48PM +0800, Yicong Yang wrote:
> It is tested on 4,8,128 CPU platforms and shows to be beneficial on
> large systems but may not have improvement on small systems like on
> a 4 CPU platform. So make ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH depends
> on CONFIG_EXPERT for this stage and make this disabled on systems
> with less than 8 CPUs. User can modify this threshold according to
> their own platforms by CONFIG_NR_CPUS_FOR_BATCHED_TLB.

What's the overhead of such batching on systems with 4 or fewer CPUs? If
it isn't noticeable, I'd rather have it always on than some number
chosen on whichever SoC you tested.

Another option would be to make this a sysctl tunable.

>  .../features/vm/TLB/arch-support.txt |  2 +-
>  arch/arm64/Kconfig                   |  6 +++
>  arch/arm64/include/asm/tlbbatch.h    | 12 +++++
>  arch/arm64/include/asm/tlbflush.h    | 52 ++++++++++++++++++-
>  arch/x86/include/asm/tlbflush.h      |  5 +-
>  include/linux/mm_types_task.h        |  4 +-
>  mm/rmap.c                            | 10 ++--

Please keep any function prototype changes in a preparatory patch so
that the arm64 one only introduces the arch specific changes. Easier to
review.

> +static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
> +{
> +	/*
> +	 * TLB batched flush is proved to be beneficial for systems with large
> +	 * number of CPUs, especially system with more than 8 CPUs. TLB shutdown
> +	 * is cheap on small systems which may not need this feature. So use
> +	 * a threshold for enabling this to avoid potential side effects on
> +	 * these platforms.
> +	 */
> +	if (num_online_cpus() < CONFIG_ARM64_NR_CPUS_FOR_BATCHED_TLB)
> +		return false;

The x86 implementation tracks the cpumask of where a task has run. We
don't have such tracking on arm64 and I don't think it matters. As
noticed/described in this series, the bottleneck is the actual DSB
synchronisation (which sends a DVM Sync message to all the other CPUs
and waits for a DVM Complete response). So I think it makes sense not to
bother with an mm_cpumask().

What this patch aims to optimise is actually the number of DSBs issued
on an SMP system by ptep_clear_flush().

The DVM is not an architected concept (well, it's part of AMBA AXI).
I'd be curious to know how such a patch behaves on Apple's M1/M2
hardware.

My preference would be to have this always on for num_online_cpus() > 1
if there's no overhead.

--
Catalin
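
To make the DSB-count argument above concrete, here is a rough
user-space sketch, not kernel code. Every name in it (struct flush_cost,
cost_unbatched(), cost_batched(), should_defer()) is hypothetical. It
assumes each ptep_clear_flush()-style unmap pays for one TLBI plus one
completion DSB, while a batched unmap pays for one TLBI per page and a
single DSB when the batch is flushed. It deliberately ignores any other
barriers and says nothing about real latencies.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical cost model (an assumption, not measured data):
 * count the TLB invalidates and the completion DSBs needed to
 * unmap npages pages.
 */
struct flush_cost {
	size_t tlbi;	/* TLB invalidate operations issued */
	size_t dsb;	/* completion barriers (DVM Sync round trips) */
};

/* One TLBI and one DSB per page, as with a flush on every unmap. */
static struct flush_cost cost_unbatched(size_t npages)
{
	struct flush_cost c = { .tlbi = npages, .dsb = npages };

	return c;
}

/* One TLBI per page, but a single DSB when the batch is flushed. */
static struct flush_cost cost_batched(size_t npages)
{
	struct flush_cost c = { .tlbi = npages, .dsb = npages ? 1 : 0 };

	return c;
}

/*
 * Deferral policy preferred in the reply: batch whenever more than
 * one CPU is online, with no per-SoC threshold.
 */
static bool should_defer(unsigned int online_cpus)
{
	return online_cpus > 1;
}
```

Under this model the TLBI count is identical in both cases. Only the
number of DSBs changes, which is why a CPU-count threshold looks
unnecessary unless batching itself has measurable overhead.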