Date: Tue, 26 Nov 2013 09:11:06 -0800
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra
Cc: Will Deacon, Tim Chen, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	linux-kernel@vger.kernel.org, linux-mm, linux-arch@vger.kernel.org,
	Linus Torvalds, Waiman Long, Andrea Arcangeli, Alex Shi, Andi Kleen,
	Michel Lespinasse, Davidlohr Bueso, Matthew R Wilcox, Dave Hansen,
	Rik van Riel, Peter Hurley, Raghavendra K T, George Spelvin,
	"H. Peter Anvin", Arnd Bergmann, Aswin Chandramouleeswaran,
	Scott J Norton, "Figo.zhang"
Subject: Re: [PATCH v6 4/5] MCS Lock: Barrier corrections
Message-ID: <20131126171106.GJ4138@linux.vnet.ibm.com>
In-Reply-To: <20131126095945.GI10022@twins.programming.kicks-ass.net>

On Tue, Nov 26, 2013 at 10:59:45AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 25, 2013 at 03:52:52PM -0800, Paul E. McKenney wrote:
> > On Mon, Nov 25, 2013 at 07:27:15PM +0100, Peter Zijlstra wrote:
> > > On Mon, Nov 25, 2013 at 10:02:50AM -0800, Paul E. McKenney wrote:
> > > > And if the two locks are different, then the guarantee applies only
> > > > when the unlock and lock are on the same CPU, in which case, as Linus
> > > > noted, the xchg() on entry to the slow path does the job for us.
> > >
> > > But in that case we rely on the fact that the thing is part of a
> > > composite and we should no longer call it load_acquire, because frankly
> > > it doesn't have acquire semantics anymore because the read can escape
> > > out.
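
For example, on Power the two flavors of store-release might look
something like the following.  (These names and definitions are
illustrative only, not part of any actual proposal.)

	/*
	 * Hypothetical "local" release: lwsync orders prior accesses
	 * before the releasing store as seen by any thread that reads
	 * the stored value, which suffices for circular buffers.
	 */
	#define smp_store_release_local(p, v) \
	do { \
		__asm__ __volatile__("lwsync" : : : "memory"); \
		ACCESS_ONCE(*(p)) = (v); \
	} while (0)

	/*
	 * Hypothetical "global" release: full sync makes the ordering
	 * visible to all CPUs, which is what queued locks need.
	 */
	#define smp_store_release_global(p, v) \
	do { \
		__asm__ __volatile__("sync" : : : "memory"); \
		ACCESS_ONCE(*(p)) = (v); \
	} while (0)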
> >
> > Actually, load-acquire and store-release are only required to provide
> > ordering in the threads/CPUs doing the load-acquire/store-release
> > operations.  It is just that we require something stronger than minimal
> > load-acquire/store-release to make a Linux-kernel lock.
>
> I suspect we're talking past one another here; but our Document
> describes ACQUIRE/RELEASE semantics such that
>
> 	RELEASE
> 	ACQUIRE
>
> matches a full barrier, regardless of whether it is the same lock or
> not.

Ah, got it!

> If you now want to weaken this definition, then that needs consideration
> because we actually rely on things like
>
> 	spin_unlock(l1);
> 	spin_lock(l2);
>
> being full barriers.
>
> Now granted, for lock operations we have actual atomic ops in between
> which would cure x86, but it would leave us confused about the barrier
> semantics.
>
> So please, either:
>
>  A) we have the strong ACQUIRE/RELEASE semantics as currently described,
>     and therefore any RELEASE+ACQUIRE pair must form a full barrier, and
>     our proposed primitives are non-compliant and need strengthening; or
>
>  B) we go fudge about with the definitions.

Another approach would be to have local and global variants, so that the
local variants have acquire/release semantics that are guaranteed to be
visible only in the involved threads (sufficient for circular buffers),
while the global ones are visible globally, and thus sufficient for
queued locks.

> But given the current description of our ACQUIRE barrier, we simply
> cannot claim the proposed primitives are good on x86 IMO.
>
> Also, instead of the smp_store_release() I would argue that
> smp_load_acquire() is the one that needs the full barrier, even on PPC.
>
> Because our ACQUIRE dis-allows loads/stores leaking out upwards, and
> both TSO and PPC lwsync allow just that, so the smp_load_acquire() is
> the one that needs the full barrier.

You lost me on this one.  Here is x86 ACQUIRE for X:

	r1 = ACCESS_ONCE(X);

Since x86 does not reorder loads with later loads or stores, this should
be sufficient.  For powerpc:

	r1 = ACCESS_ONCE(X);
	lwsync;

And lwsync does not allow prior loads to be reordered with later loads
or stores, so this should also be sufficient.

In both cases, a RELEASE+ACQUIRE pair provides a full barrier as long as
RELEASE has the right stuff in it.  So what am I missing?

							Thanx, Paul
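
PS:  To make the above concrete, here is roughly how I would expect
smp_load_acquire() to be implemented on the two architectures.  This is
only a sketch from memory, not the actual patch:

	/*
	 * x86 (TSO): a compiler barrier suffices, because the hardware
	 * never reorders a load with a later load or store.
	 */
	#define smp_load_acquire(p) \
	({ \
		typeof(*(p)) ___v = ACCESS_ONCE(*(p)); \
		barrier(); \
		___v; \
	})

	/*
	 * powerpc: lwsync keeps the acquiring load ordered before all
	 * subsequent loads and stores.
	 */
	#define smp_load_acquire(p) \
	({ \
		typeof(*(p)) ___v = ACCESS_ONCE(*(p)); \
		__asm__ __volatile__("lwsync" : : : "memory"); \
		___v; \
	})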