Date: Thu, 17 Jun 2021 16:05:03 +0200
From: Peter Zijlstra
To: Andy Lutomirski
Cc: Mark Rutland, "Russell King (Oracle)", the arch/x86 maintainers,
	Dave Hansen, Linux Kernel Mailing List, linux-mm@kvack.org,
	Andrew Morton, Mathieu Desnoyers, Nicholas Piggin,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 7/8] membarrier: Remove arm (32) support for SYNC_CORE
In-Reply-To: <394219d4-36a6-4e7f-a03c-8590551b099a@www.fastmail.com>

On Thu, Jun 17, 2021 at 06:41:41AM -0700, Andy Lutomirski wrote:
> On Thu, Jun 17, 2021, at 4:33 AM, Mark Rutland wrote:

> > Sure, and I agree we should not change cacheflush().
> >
> > The point of membarrier(SYNC_CORE) is that you can move the cost of that
> > ISB out of the fast-path in the executing thread(s) and into the
> > slow-path on the thread which generated the code.
> >
> > So e.g. rather than an executing thread always having to do:
> >
> > 	LDR , []
> > 	ISB	// in case funcptr was just updated
> > 	BLR
> >
> > ... you have the thread generating the code use membarrier(SYNC_CORE)
> > prior to publishing the funcptr, and the fast-path on all the executing
> > threads can be:
> >
> > 	LDR []
> > 	BLR
> >
> > ... and thus I think we still want membarrier(SYNC_CORE) so that people
> > can do this, even if there are other means to achieve the same
> > functionality.
>
> I had the impression that sys_cacheflush() did that.  Am I wrong?

Yes, sys_cacheflush() only does what it says on the tin (and only
correctly for hardware broadcast -- everything except 11mpcore).
It only invalidates the caches, but not the per CPU derived state like
prefetch buffers and micro-op buffers, and certainly not instructions
already in flight.

So anything OoO needs at the very least a complete pipeline stall
injected, but probably something stronger to make it flush the buffers.

> In any event, I’m even more convinced that no new SYNC_CORE arches
> should be added. We need a new API that just does the right thing.

I really don't understand why you hate the thing so much; SYNC_CORE is a
means of injecting whatever instruction is required to flush all uarch
state related to instructions on all threads (not all CPUs) of a process
as efficiently as possible.

The alternative is sending signals to all threads (including the
non-running ones) which is known to scale very poorly indeed, or, as
Mark suggests above, have very expensive instructions unconditionally in
the instruction stream, which is also undesired.
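
For illustration only, the publish pattern discussed in this thread could
look roughly like the userspace sketch below. It is a sketch under stated
assumptions, not code from the patch series: published_fn, do_membarrier(),
sync_core_register(), publish_code(), call_published() and stub_double()
are hypothetical names, stub_double() is an ordinary C function standing in
for freshly generated code, and the instruction writes and cache maintenance
that a real JIT must do before the barrier are omitted. The membarrier
commands are the ones defined in <linux/membarrier.h>.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/membarrier.h>

typedef int (*jit_fn)(int);

/* Hypothetical slot through which generated code is published. */
static _Atomic(jit_fn) published_fn;

/* Stand-in for generated code; a real JIT would emit instructions. */
static int stub_double(int x)
{
	return 2 * x;
}

/* Raw syscall wrapper (hypothetical helper name). */
static int do_membarrier(int cmd)
{
	return (int)syscall(__NR_membarrier, cmd, 0, 0);
}

/* Register once per process. Returns 0 on success or -errno. */
static int sync_core_register(void)
{
	if (do_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE) < 0)
		return -errno;
	return 0;
}

/*
 * Slow path, run by the thread that generated the code: issue the
 * SYNC_CORE barrier first, and only then publish the pointer.
 * Returns 0 on success or -errno; on failure nothing is published.
 */
static int publish_code(jit_fn fn)
{
	assert(fn != NULL);

	/* ... instruction writes and cache maintenance omitted ... */

	if (do_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE) < 0)
		return -errno;

	atomic_store_explicit(&published_fn, fn, memory_order_release);
	return 0;
}

/*
 * Fast path, run by executing threads: load the pointer and call it,
 * with no per-call synchronization instruction.
 * Returns -1 if nothing has been published yet (sketch convention).
 */
static int call_published(int x)
{
	jit_fn fn = atomic_load_explicit(&published_fn, memory_order_acquire);

	return fn ? fn(x) : -1;
}
```

A caller would run sync_core_register() once at startup and check its
return value; if registration or the barrier fails (old kernel, unsupported
architecture), it would have to fall back to the more expensive scheme with
the synchronizing instruction in the fast path.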