Date: Thu, 17 Jun 2021 14:51:33 +0100
From: Mark Rutland
To: Andy Lutomirski
Cc: "Russell King (Oracle)", the arch/x86 maintainers, Dave Hansen,
    Linux Kernel Mailing List, linux-mm@kvack.org, Andrew Morton,
    Mathieu Desnoyers, Nicholas Piggin, "Peter Zijlstra (Intel)",
    linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 7/8] membarrier: Remove arm (32) support for SYNC_CORE
Message-ID: <20210617135133.GA86101@C02TD0UTHF1T.local>
References: <2142129092ff9aa00e600c42a26c4015b7f5ceec.1623813516.git.luto@kernel.org>
    <20210617103524.GA82133@C02TD0UTHF1T.local>
    <20210617112305.GK22278@shell.armlinux.org.uk>
    <20210617113349.GB82133@C02TD0UTHF1T.local>
    <394219d4-36a6-4e7f-a03c-8590551b099a@www.fastmail.com>
In-Reply-To: <394219d4-36a6-4e7f-a03c-8590551b099a@www.fastmail.com>

On Thu, Jun 17, 2021 at 06:41:41AM -0700, Andy Lutomirski wrote:
>
>
> On Thu, Jun 17, 2021, at 4:33 AM, Mark Rutland wrote:
> > On Thu, Jun 17, 2021 at 12:23:05PM +0100, Russell King (Oracle) wrote:
> > > On Thu, Jun 17, 2021 at 11:40:46AM +0100, Mark Rutland wrote:
> > > > On Tue, Jun 15, 2021 at 08:21:12PM -0700, Andy Lutomirski wrote:
> > > > > On arm32, the only way to safely flush icache from usermode is to call
> > > > > cacheflush(2).  This also handles any required pipeline flushes, so
> > > > > membarrier's SYNC_CORE feature is useless on arm.  Remove it.
> > > >
> > > > Unfortunately, it's a bit more complicated than that, and these days
> > > > SYNC_CORE is equally necessary on arm as on arm64. This is something
> > > > that changed in the architecture over time, but since ARMv7 we generally
> > > > need both the cache maintenance *and* a context synchronization event
> > > > (the latter must occur on the CPU which will execute the instructions).
> > > >
> > > > If you look at the latest ARMv7-AR manual (ARM DDI 406C.d), section
> > > > A3.5.4 "Concurrent modification and execution of instructions" covers
> > > > this. That manual can be found at:
> > > >
> > > > 	https://developer.arm.com/documentation/ddi0406/latest/
> > >
> > > Looking at that, sys_cacheflush() meets this. The manual details a
> > > series of cache maintenance calls in "step 1" that the modifying thread
> > > must issue - this is exactly what sys_cacheflush() does. The same is
> > > true for ARMv6, except the "ISB" terminology is replaced by a
> > > "PrefetchFlush" terminology. (I checked DDI0100I).
> > >
> > > "step 2" requires an ISB on the "other CPU" prior to executing that
> > > code. As I understand it, in ARMv7, userspace can issue an ISB itself.
> > >
> > > For ARMv6K, it doesn't have ISB, but instead has a CP15 instruction
> > > for this that isn't available to userspace. This is where we come to
> > > the situation about ARM 11MPCore, and whether we continue to support
> > > it or not.
> > >
> > > So, I think we're completely fine with ARMv7 under 32-bit ARM kernels
> > > as userspace has everything that's required. ARMv6K is a different
> > > matter as we've already identified for several reasons.
> >
> > Sure, and I agree we should not change cacheflush().
> >
> > The point of membarrier(SYNC_CORE) is that you can move the cost of that
> > ISB out of the fast-path in the executing thread(s) and into the
> > slow-path on the thread which generated the code.
> >
> > So e.g. rather than an executing thread always having to do:
> >
> > 	LDR , []
> > 	ISB	// in case funcptr was just updated
> > 	BLR
> >
> > ... you have the thread generating the code use membarrier(SYNC_CORE)
> > prior to publishing the funcptr, and the fast-path on all the executing
> > threads can be:
> >
> > 	LDR []
> > 	BLR
> >
> > ... and thus I think we still want membarrier(SYNC_CORE) so that people
> > can do this, even if there are other means to achieve the same
> > functionality.
>
> I had the impression that sys_cacheflush() did that.  Am I wrong?

Currently sys_cacheflush() doesn't do this, and IIUC it has never done
remote context synchronization even for architectures that need that
(e.g. x86 requiring a serializing instruction).

> In any event, I'm even more convinced that no new SYNC_CORE arches
> should be added. We need a new API that just does the right thing.

My intuition is the other way around, and that this is a generally
useful thing for architectures that require context synchronization.

It's not clear to me what "the right thing" would mean specifically, and
on architectures with userspace cache maintenance, JITs can usually do
the most optimal maintenance, and only need help for the context
synchronization.

Thanks,
Mark.
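
For illustration, the slow-path/fast-path split described above could be
sketched in C roughly as follows. This is only a sketch under stated
assumptions, not code from this series: the names jit_fn, published_fn,
demo_entry, sync_core_register(), publish_code() and call_published()
are hypothetical, the cache maintenance goes through the compiler's
__builtin___clear_cache(), and the acquire load on the fast path is a
conservative choice rather than something the thread prescribes. The
only kernel interface used is the membarrier(2) syscall with its
MEMBARRIER_CMD_*_PRIVATE_EXPEDITED_SYNC_CORE commands.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/membarrier.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical signature for a piece of generated code. */
typedef int (*jit_fn)(int);

/* Hypothetical publication slot read by the executing threads. */
static _Atomic(jit_fn) published_fn;

/* Stand-in for freshly generated code, so the sketch can be exercised. */
static int demo_entry(int x)
{
	return x + 1;
}

/* Thin wrapper around the raw syscall (cpu_id argument unused here). */
static int sys_membarrier(int cmd, unsigned int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags, 0);
}

/*
 * One-time setup: a process must register before it may issue the
 * private expedited SYNC_CORE command.
 * Returns 0 if SYNC_CORE is usable, -1 otherwise.
 */
static int sync_core_register(void)
{
	int mask = sys_membarrier(MEMBARRIER_CMD_QUERY, 0);

	if (mask < 0 || !(mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE))
		return -1;
	if (sys_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, 0))
		return -1;
	return 0;
}

/*
 * Slow path, run by the thread that generated the code in [begin, end):
 * cache maintenance first, then context synchronization on the other
 * threads of the process, and only then publication of the pointer.
 * Returns 0 on success, -1 if the membarrier call failed (in which case
 * nothing is published).
 */
static int publish_code(char *begin, char *end, jit_fn entry)
{
	assert(begin <= end);
	__builtin___clear_cache(begin, end);
	if (sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0))
		return -1;
	atomic_store_explicit(&published_fn, entry, memory_order_release);
	return 0;
}

/*
 * Fast path, run by executing threads: load the pointer and branch,
 * with no ISB-style barrier. Returns -1 if nothing is published yet.
 */
static int call_published(int arg)
{
	jit_fn fn = atomic_load_explicit(&published_fn, memory_order_acquire);

	return fn ? fn(arg) : -1;
}
```

A caller would run sync_core_register() once at startup and fall back
to a per-call barrier on the executing side if it returns -1; after a
successful publish_code(buf, buf + len, entry), every thread can use
call_published() without paying for the synchronization itself.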