From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pb0-f49.google.com (mail-pb0-f49.google.com [209.85.160.49]) by kanga.kvack.org (Postfix) with ESMTP id ACC176B0036 for ; Wed, 20 Nov 2013 12:14:10 -0500 (EST) Received: by mail-pb0-f49.google.com with SMTP id jt11so5673933pbb.22 for ; Wed, 20 Nov 2013 09:14:10 -0800 (PST) Received: from psmtp.com ([74.125.245.155]) by mx.google.com with SMTP id am2si14691268pad.212.2013.11.20.09.14.08 for ; Wed, 20 Nov 2013 09:14:09 -0800 (PST) Received: from /spool/local by e35.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 20 Nov 2013 10:14:07 -0700 Received: from b03cxnp08025.gho.boulder.ibm.com (b03cxnp08025.gho.boulder.ibm.com [9.17.130.17]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id BC6323E4003F for ; Wed, 20 Nov 2013 10:14:03 -0700 (MST) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp08025.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id rAKFCGmY25886842 for ; Wed, 20 Nov 2013 16:12:16 +0100 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id rAKHGtKt010430 for ; Wed, 20 Nov 2013 10:16:57 -0700 Date: Wed, 20 Nov 2013 09:14:00 -0800 From: "Paul E. McKenney" Subject: Re: [PATCH v6 4/5] MCS Lock: Barrier corrections Message-ID: <20131120171400.GI4138@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <1384911463.11046.454.camel@schen9-DESK> <20131120153123.GF4138@linux.vnet.ibm.com> <20131120154643.GG19352@mudshark.cambridge.arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20131120154643.GG19352@mudshark.cambridge.arm.com> Sender: owner-linux-mm@kvack.org List-ID: To: Will Deacon Cc: Tim Chen , Ingo Molnar , Andrew Morton , Thomas Gleixner , "linux-kernel@vger.kernel.org" , linux-mm , "linux-arch@vger.kernel.org" , Linus Torvalds , Waiman Long , Andrea Arcangeli , Alex Shi , Andi Kleen , Michel Lespinasse , Davidlohr Bueso , Matthew R Wilcox , Dave Hansen , Peter Zijlstra , Rik van Riel , Peter Hurley , Raghavendra K T , George Spelvin , "H. Peter Anvin" , Arnd Bergmann , Aswin Chandramouleeswaran , Scott J Norton , "Figo.zhang" On Wed, Nov 20, 2013 at 03:46:43PM +0000, Will Deacon wrote: > Hi Paul, > > On Wed, Nov 20, 2013 at 03:31:23PM +0000, Paul E. McKenney wrote: > > On Tue, Nov 19, 2013 at 05:37:43PM -0800, Tim Chen wrote: > > > @@ -68,7 +72,12 @@ void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node) > > > while (!(next = ACCESS_ONCE(node->next))) > > > arch_mutex_cpu_relax(); > > > } > > > - ACCESS_ONCE(next->locked) = 1; > > > - smp_wmb(); > > > + /* > > > + * Pass lock to next waiter. > > > + * smp_store_release() provides a memory barrier to ensure > > > + * all operations in the critical section has been completed > > > + * before unlocking. > > > + */ > > > + smp_store_release(&next->locked, 1); > > > > However, there is one problem with this that I missed yesterday. > > > > Documentation/memory-barriers.txt requires that an unlock-lock pair > > provide a full barrier, but this is not guaranteed if we use > > smp_store_release() for unlock and smp_load_acquire() for lock. > > At least one of these needs a full memory barrier. > > Hmm, so in the following case: > > Access A > unlock() /* release semantics */ > lock() /* acquire semantics */ > Access B > > A cannot pass beyond the unlock() and B cannot pass the before the lock(). > > I agree that accesses between the unlock and the lock can be move across both > A and B, but that doesn't seem to matter by my reading of the above. > > What is the problematic scenario you have in mind? Are you thinking of the > lock() moving before the unlock()? That's only permitted by RCpc afaiu, > which I don't think any architectures supported by Linux implement... > (ARMv8 acquire/release is RCsc). If smp_load_acquire() and smp_store_release() are both implemented using lwsync on powerpc, and if Access A is a store and Access B is a load, then Access A and Access B can be reordered. Of course, if every other architecture will be providing RCsc implementations for smp_load_acquire() and smp_store_release(), which would not be a bad thing, then another approach is for powerpc to use sync rather than lwsync for one or the other of smp_load_acquire() or smp_store_release(). Thanx, Paul -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org