Message-ID: <4b706492-abaf-44f9-92dc-ba5aadc80c31@arm.com>
Date: Fri, 4 Apr 2025 09:41:19 +0530
Subject: Re: [PATCH v3 06/11] arm64/mm: Hoist barriers out of set_ptes_anysz() loop
To: Ryan Roberts, Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand, "Matthew Wilcox (Oracle)", Mark Rutland, Alexandre Ghiti, Kevin Brodsky
Cc: linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20250304150444.3788920-1-ryan.roberts@arm.com> <20250304150444.3788920-7-ryan.roberts@arm.com>
From: Anshuman Khandual
In-Reply-To: <20250304150444.3788920-7-ryan.roberts@arm.com>

On 3/4/25 20:34, Ryan Roberts wrote:
> set_ptes_anysz() previously called __set_pte() for each PTE in the
> range, which would conditionally issue a DSB and ISB to make the new PTE
> value immediately visible to the table walker if the new PTE was valid
> and for kernel space.
>
> We can do better than this; let's hoist those barriers out of the loop
> so that they are only issued once at the end of the loop. We then reduce
> the cost by the number of PTEs in the range.
>
> Signed-off-by: Ryan Roberts
> ---
>  arch/arm64/include/asm/pgtable.h | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index e255a36380dc..1898c3069c43 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -317,13 +317,11 @@ static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
>  	WRITE_ONCE(*ptep, pte);
>  }
>
> -static inline void __set_pte(pte_t *ptep, pte_t pte)
> +static inline void __set_pte_complete(pte_t pte)
>  {
> -	__set_pte_nosync(ptep, pte);
> -
>  	/*
>  	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
> -	 * or update_mmu_cache() have the necessary barriers.
> +	 * has the necessary barriers.
>  	 */
>  	if (pte_valid_not_user(pte)) {
>  		dsb(ishst);
> @@ -331,6 +329,12 @@ static inline void __set_pte(pte_t *ptep, pte_t pte)
>  	}
>  }
>
> +static inline void __set_pte(pte_t *ptep, pte_t pte)
> +{
> +	__set_pte_nosync(ptep, pte);
> +	__set_pte_complete(pte);
> +}
> +
>  static inline pte_t __ptep_get(pte_t *ptep)
>  {
>  	return READ_ONCE(*ptep);
> @@ -647,12 +651,14 @@ static inline void set_ptes_anysz(struct mm_struct *mm, pte_t *ptep, pte_t pte,
>
>  	for (;;) {
>  		__check_safe_pte_update(mm, ptep, pte);
> -		__set_pte(ptep, pte);
> +		__set_pte_nosync(ptep, pte);
>  		if (--nr == 0)
>  			break;
>  		ptep++;
>  		pte = pte_advance_pfn(pte, stride);
>  	}
> +
> +	__set_pte_complete(pte);
>  }
>
>  static inline void __set_ptes(struct mm_struct *mm,

Reviewed-by: Anshuman Khandual
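
For readers following the thread, the effect of the hoisting can be
illustrated with a small userspace model. This is only a sketch under stated
assumptions, not the kernel code: every model_* name, the bit layout and the
barrier counter are hypothetical stand-ins, and the DSB/ISB pair is simulated
by incrementing a counter. It relies on the observation that advancing the
PFN does not change the valid/user attributes, so testing the last PTE value
after the loop gives the same answer as testing each one inside it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types and bits; NOT the kernel's definitions. */
typedef uint64_t model_pte_t;

#define MODEL_PTE_VALID	(1ULL << 0)
#define MODEL_PTE_USER	(1ULL << 6)
#define MODEL_PFN_SHIFT	12

/* Counts simulated DSB+ISB pairs instead of issuing real barriers. */
static unsigned long barrier_count;

/* Simplified stand-in for the "valid and kernel" check. */
static bool model_valid_not_user(model_pte_t pte)
{
	return (pte & (MODEL_PTE_VALID | MODEL_PTE_USER)) == MODEL_PTE_VALID;
}

/* Write the entry without any synchronisation. */
static void model_set_pte_nosync(model_pte_t *ptep, model_pte_t pte)
{
	*ptep = pte;
}

/* Issue the (simulated) barriers only for valid kernel entries. */
static void model_set_pte_complete(model_pte_t pte)
{
	if (model_valid_not_user(pte))
		barrier_count++;
}

/* Old shape: write plus conditional barrier for every entry. */
static void model_set_ptes_per_entry(model_pte_t *ptep, model_pte_t pte,
				     unsigned int nr)
{
	assert(nr > 0);
	for (;;) {
		model_set_pte_nosync(ptep, pte);
		model_set_pte_complete(pte);
		if (--nr == 0)
			break;
		ptep++;
		pte += 1ULL << MODEL_PFN_SHIFT;	/* advance to next PFN */
	}
}

/* New shape: write every entry, then one conditional barrier at the end. */
static void model_set_ptes_hoisted(model_pte_t *ptep, model_pte_t pte,
				   unsigned int nr)
{
	assert(nr > 0);
	for (;;) {
		model_set_pte_nosync(ptep, pte);
		if (--nr == 0)
			break;
		ptep++;
		pte += 1ULL << MODEL_PFN_SHIFT;	/* advance to next PFN */
	}
	/* Attributes are unchanged by the PFN advance, so the last value works. */
	model_set_pte_complete(pte);
}
```

In this model, setting a range of 8 valid kernel entries bumps the counter 8
times with the per-entry variant and once with the hoisted variant, while
both leave identical values in the table. A range of user entries bumps it
zero times in either case.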