From: Mark Rutland
To: Andy Lutomirski
Cc: x86@kernel.org, Dave Hansen, LKML, linux-mm@kvack.org,
    Andrew Morton, Mathieu Desnoyers, Nicholas Piggin, Peter Zijlstra,
    Russell King, linux-arm-kernel@lists.infradead.org
Date: Thu, 17 Jun 2021 11:40:46 +0100
Subject: Re: [PATCH 7/8] membarrier: Remove arm (32) support for SYNC_CORE
Message-ID: <20210617103524.GA82133@C02TD0UTHF1T.local>
In-Reply-To: <2142129092ff9aa00e600c42a26c4015b7f5ceec.1623813516.git.luto@kernel.org>
References: <2142129092ff9aa00e600c42a26c4015b7f5ceec.1623813516.git.luto@kernel.org>

On Tue, Jun 15, 2021 at 08:21:12PM -0700, Andy Lutomirski wrote:
> On arm32, the only way to safely flush icache from usermode is to call
> cacheflush(2).  This also handles any required pipeline flushes, so
> membarrier's SYNC_CORE feature is useless on arm.  Remove it.

Unfortunately, it's a bit more complicated than that, and these days
SYNC_CORE is equally necessary on arm as on arm64.

This is something that changed in the architecture over time, but since
ARMv7 we generally need both the cache maintenance *and* a context
synchronization event (the latter must occur on the CPU which will
execute the instructions).

If you look at the latest ARMv7-AR manual (ARM DDI 0406C.d), section
A3.5.4 "Concurrent modification and execution of instructions" covers
this. That manual can be found at:

  https://developer.arm.com/documentation/ddi0406/latest/

Likewise for ARMv8-A; the latest manual (ARM DDI 0487G.a) covers this in
sections B2.2.5 and E2.3.5. That manual can be found at:

  https://developer.arm.com/documentation/ddi0487/ga

I am not sure about exactly what's required on 11MPcore, since that's
somewhat of a special case as the only SMP design prior to ARMv7-A
mandating broadcast maintenance.

For intuition's sake, one reason for this is that once a CPU has fetched
an instruction from an instruction cache into its pipeline and that
instruction is "in-flight", changes to that instruction cache are not
guaranteed to affect the "in-flight" copy (which e.g. could be
decomposed into micro-ops and so on). While these parts of a CPU aren't
necessarily designed as caches, they effectively transiently cache a
stale copy of the instruction while it is being executed.

This is more pronounced on newer designs with more complex execution
pipelines (e.g. with bigger windows for out-of-order execution and
speculation), and generally it's unlikely for this to be noticed on
smaller/simpler designs.

As above, modifying instructions requires two things:

1) Making sure that *subsequent* instruction fetches will see the new
   instructions. This is what cacheflush(2) does, and this is similar to
   what SW does on arm64 with DC CVAU + IC IVAU instructions and
   associated memory barriers.

2) Making sure that a CPU fetches the instructions *after* the cache
   maintenance is complete. There are a few ways to do this:

   * A context synchronization event (e.g. an ISB or exception return)
     on the CPU that will execute the instructions. This is what
     membarrier(SYNC_CORE) does.

   * In ARMv8-A there are some restrictions on the order in which
     modified instructions are guaranteed to be observed (e.g. if you
     publish a function, then subsequently install a branch to that new
     function), where an ISB may not be necessary.

     In the latest ARMv8-A manual as linked above, those are described
     in sections:

     - B2.3.8 "Ordering of instruction fetches" (for 64-bit)
     - E2.3.8 "Ordering of instruction fetches" (for 32-bit)

   * Where we can guarantee that a CPU cannot possibly have an
     instruction in-flight (e.g. due to a lack of a mapping to fetch
     instructions from), nothing is necessary. This is what we rely on
     when faulting in code pages. In these cases, the CPU is liable to
     take a fault on the missing translation anyway.

Thanks,
Mark.

>
> Cc: Mathieu Desnoyers
> Cc: Nicholas Piggin
> Cc: Peter Zijlstra
> Cc: Russell King
> Cc: linux-arm-kernel@lists.infradead.org
> Signed-off-by: Andy Lutomirski
> ---
>  arch/arm/Kconfig | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 24804f11302d..89a885fba724 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -10,7 +10,6 @@ config ARM
>  	select ARCH_HAS_FORTIFY_SOURCE
>  	select ARCH_HAS_KEEPINITRD
>  	select ARCH_HAS_KCOV
> -	select ARCH_HAS_MEMBARRIER_SYNC_CORE
>  	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
>  	select ARCH_HAS_PTE_SPECIAL if ARM_LPAE
>  	select ARCH_HAS_PHYS_TO_DMA
> --
> 2.31.1
>
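
For illustration, the two requirements described in the reply (cache
maintenance, then a context synchronization event on each CPU that will
run the code) could be combined from userspace roughly as follows. This
is a minimal sketch under stated assumptions, not code from this thread:
the helper names code_sync_init() and publish_code() are hypothetical,
it assumes a GCC or Clang toolchain where __builtin___clear_cache()
performs the cache maintenance (on 32-bit arm Linux this is expected to
end up in cacheflush(2)), and it assumes kernel headers that define the
membarrier SYNC_CORE commands (Linux 4.16 or later).

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc provides no membarrier() wrapper, so go through syscall(2). */
static int membarrier(int cmd, unsigned int flags, int cpu_id)
{
	return (int)syscall(__NR_membarrier, cmd, flags, cpu_id);
}

/*
 * Hypothetical helper: call once at start-up.
 * Returns 0 if this process is registered for SYNC_CORE, or -1 if the
 * kernel or architecture does not offer it.
 */
static int code_sync_init(void)
{
	int mask = membarrier(MEMBARRIER_CMD_QUERY, 0, 0);

	if (mask < 0 || !(mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE))
		return -1;

	return membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE,
			  0, 0);
}

/*
 * Hypothetical helper: copy new instructions into place and make them
 * safe to execute from any thread of this process.
 * Returns 0 on success, or -1 if the SYNC_CORE barrier failed.
 */
static int publish_code(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);

	/* Step 1: make subsequent instruction fetches see the new bytes. */
	__builtin___clear_cache((char *)dst, (char *)dst + len);

	/*
	 * Step 2: ask the kernel for a context synchronization event on
	 * every CPU currently running a thread of this process.
	 */
	return membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0, 0);
}
```

A process would call code_sync_init() once and publish_code() after each
modification. If either returns -1, SYNC_CORE is not available, and each
thread that will execute the new code would presumably need its own
context synchronization event before doing so.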