From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton,
    Uladzislau Rezki, Christoph Hellwig, David Hildenbrand,
    "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual,
    Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 06/11] arm64/mm: Hoist barriers out of set_ptes_anysz() loop
Date: Tue, 22 Apr 2025 09:18:14 +0100
Message-ID: <20250422081822.1836315-7-ryan.roberts@arm.com>
In-Reply-To: <20250422081822.1836315-1-ryan.roberts@arm.com>
References: <20250422081822.1836315-1-ryan.roberts@arm.com>

set_ptes_anysz() previously called __set_pte() for each PTE in the range,
which would conditionally issue a DSB and ISB to make the new PTE value
immediately visible to the table walker if the new PTE was valid and for
kernel space.

We can do better than this; let's hoist those barriers out of the loop
so that they are only issued once at the end of the loop. We then reduce
the cost by the number of PTEs in the range.

Reviewed-by: Catalin Marinas
Reviewed-by: Anshuman Khandual
Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index d80aa9ba0a16..39c331743b69 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -320,13 +320,11 @@ static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
 	WRITE_ONCE(*ptep, pte);
 }
 
-static inline void __set_pte(pte_t *ptep, pte_t pte)
+static inline void __set_pte_complete(pte_t pte)
 {
-	__set_pte_nosync(ptep, pte);
-
 	/*
 	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
-	 * or update_mmu_cache() have the necessary barriers.
+	 * has the necessary barriers.
 	 */
 	if (pte_valid_not_user(pte)) {
 		dsb(ishst);
@@ -334,6 +332,12 @@ static inline void __set_pte(pte_t *ptep, pte_t pte)
 	}
 }
 
+static inline void __set_pte(pte_t *ptep, pte_t pte)
+{
+	__set_pte_nosync(ptep, pte);
+	__set_pte_complete(pte);
+}
+
 static inline pte_t __ptep_get(pte_t *ptep)
 {
 	return READ_ONCE(*ptep);
@@ -658,12 +662,14 @@ static inline void __set_ptes_anysz(struct mm_struct *mm, pte_t *ptep,
 
 	for (;;) {
 		__check_safe_pte_update(mm, ptep, pte);
-		__set_pte(ptep, pte);
+		__set_pte_nosync(ptep, pte);
 		if (--nr == 0)
 			break;
 		ptep++;
 		pte = pte_advance_pfn(pte, stride);
 	}
+
+	__set_pte_complete(pte);
 }
 
 static inline void __set_ptes(struct mm_struct *mm,
-- 
2.43.0
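
For readers who want to see the effect of the change in isolation, the
following is a minimal userspace sketch, not kernel code. Every name in
it (the model_* helpers, MODEL_VALID, MODEL_USER, MODEL_PFN_STEP and the
barrier counter) is a hypothetical stand-in: the counter replaces the
real dsb(ishst)/isb() pair, and the two flag bits are a simplified
substitute for what pte_valid_not_user() checks. It assumes, as the
patch appears to rely on, that advancing the PFN leaves the attribute
bits unchanged, so testing the last PTE written gives the same answer
as testing any other PTE in the range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types and bits; not the real arm64 PTE layout. */
typedef uint64_t pte_t;

#define MODEL_VALID    (1ULL << 0)   /* entry is valid */
#define MODEL_USER     (1ULL << 1)   /* entry maps user space */
#define MODEL_PFN_STEP (1ULL << 12)  /* advance by one page frame */

/* Counts how many times the (modelled) DSB + ISB pair is issued. */
static unsigned long barrier_count;

static void model_barriers(void)
{
	barrier_count++;
}

/* Simplified stand-in for pte_valid_not_user(). */
static bool model_valid_not_user(pte_t pte)
{
	return (pte & (MODEL_VALID | MODEL_USER)) == MODEL_VALID;
}

/* Stand-in for __set_pte_nosync(): store only, no barriers. */
static void model_set_pte_nosync(pte_t *ptep, pte_t pte)
{
	*ptep = pte;
}

/* Stand-in for __set_pte_complete(): barriers for valid kernel PTEs. */
static void model_set_pte_complete(pte_t pte)
{
	if (model_valid_not_user(pte))
		model_barriers();
}

/* Old shape: store and (conditionally) synchronise on every entry. */
static void set_range_per_entry(pte_t *ptep, pte_t pte, unsigned int nr)
{
	assert(nr > 0);
	for (;;) {
		model_set_pte_nosync(ptep, pte);
		model_set_pte_complete(pte);
		if (--nr == 0)
			break;
		ptep++;
		pte += MODEL_PFN_STEP;
	}
}

/* New shape: store every entry, then synchronise once after the loop. */
static void set_range_hoisted(pte_t *ptep, pte_t pte, unsigned int nr)
{
	assert(nr > 0);
	for (;;) {
		model_set_pte_nosync(ptep, pte);
		if (--nr == 0)
			break;
		ptep++;
		pte += MODEL_PFN_STEP;
	}

	model_set_pte_complete(pte);
}
```

In this model, writing a range of nr valid kernel entries bumps the
counter nr times with set_range_per_entry() and once with
set_range_hoisted(), while both leave identical values in the table.
For user or invalid entries neither variant issues barriers.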