linux-mm.kvack.org archive mirror
* [PATCH v3 0/4] MCS Lock: MCS lock code cleanup and optimizations
       [not found] <cover.1383771175.git.tim.c.chen@linux.intel.com>
@ 2013-11-06 21:36 ` Tim Chen
  2013-11-06 21:41   ` Davidlohr Bueso
  2013-11-06 21:42   ` H. Peter Anvin
  2013-11-06 21:37 ` [PATCH v3 1/5] MCS Lock: Restructure the MCS lock defines and locking code into its own file Tim Chen
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 27+ messages in thread
From: Tim Chen @ 2013-11-06 21:36 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton, Thomas Gleixner
  Cc: linux-kernel, linux-mm, linux-arch, Linus Torvalds, Waiman Long,
	Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
	Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, Peter Hurley, Paul E.McKenney, Tim Chen,
	Raghavendra K T, George Spelvin, H. Peter Anvin, Arnd Bergmann,
	Aswin Chandramouleeswaran, Scott J Norton, Will Deacon,
	Figo.zhang

In this patch series, we separate out the MCS lock code that was
previously embedded in mutex.c.  This allows for easier reuse of the
MCS lock in other places like rwsem and qrwlock.  We also did some
micro-optimizations and barrier cleanup.

These patches were previously part of the rwsem optimization patch
series, but we now separate them out.

Tim Chen

v3:
1. modified memory barriers to support non x86 architectures that have
weak memory ordering.

v2:
1. changed the mcs_spin_lock export to a GPL export symbol
2. corrected references to mcs_spin_lock


Jason Low (2):
  MCS Lock: optimizations and extra comments
  MCS Lock: Barrier corrections

Tim Chen (1):
  MCS Lock: Restructure the MCS lock defines and locking code into its
    own file

Waiman Long (2):
  MCS Lock: Make mcs_spinlock.h includable in other files
  MCS Lock: Allow architecture specific memory barrier in lock/unlock

 arch/x86/include/asm/barrier.h |    6 +++
 include/linux/mcs_spinlock.h   |   25 ++++++++++
 include/linux/mutex.h          |    5 +-
 kernel/Makefile                |    6 +-
 kernel/mcs_spinlock.c          |   96 ++++++++++++++++++++++++++++++++++++++++
 kernel/mutex.c                 |   60 +++----------------------
 6 files changed, 140 insertions(+), 58 deletions(-)
 create mode 100644 include/linux/mcs_spinlock.h
 create mode 100644 kernel/mcs_spinlock.c

-- 
1.7.4.4



* [PATCH v3 1/5] MCS Lock: Restructure the MCS lock defines and locking code into its own file
       [not found] <cover.1383771175.git.tim.c.chen@linux.intel.com>
  2013-11-06 21:36 ` [PATCH v3 0/4] MCS Lock: MCS lock code cleanup and optimizations Tim Chen
@ 2013-11-06 21:37 ` Tim Chen
  2013-11-06 21:37 ` [PATCH v3 2/5] MCS Lock: optimizations and extra comments Tim Chen
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Tim Chen @ 2013-11-06 21:37 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton, Thomas Gleixner
  Cc: linux-kernel, linux-mm, linux-arch, Linus Torvalds, Waiman Long,
	Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
	Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, Peter Hurley, Paul E.McKenney, Tim Chen,
	Raghavendra K T, George Spelvin, H. Peter Anvin, Arnd Bergmann,
	Aswin Chandramouleeswaran, Scott J Norton, Will Deacon,
	Figo.zhang

We will need the MCS lock code for doing optimistic spinning for rwsem.
Extracting the MCS code from mutex.c and putting it into its own file
allows us to reuse this code easily for rwsem.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
---
 include/linux/mcs_spinlock.h |   64 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/mutex.h        |    5 ++-
 kernel/mutex.c               |   60 ++++----------------------------------
 3 files changed, 74 insertions(+), 55 deletions(-)
 create mode 100644 include/linux/mcs_spinlock.h

diff --git a/include/linux/mcs_spinlock.h b/include/linux/mcs_spinlock.h
new file mode 100644
index 0000000..b5de3b0
--- /dev/null
+++ b/include/linux/mcs_spinlock.h
@@ -0,0 +1,64 @@
+/*
+ * MCS lock defines
+ *
+ * This file contains the main data structure and API definitions of MCS lock.
+ *
+ * The MCS lock (proposed by Mellor-Crummey and Scott) is a simple spin-lock
+ * with the desirable properties of being fair, and with each cpu trying
+ * to acquire the lock spinning on a local variable.
+ * It avoids the expensive cache line bouncing that common test-and-set
+ * spin-lock implementations incur.
+ */
+#ifndef __LINUX_MCS_SPINLOCK_H
+#define __LINUX_MCS_SPINLOCK_H
+
+struct mcs_spinlock {
+	struct mcs_spinlock *next;
+	int locked; /* 1 if lock acquired */
+};
+
+/*
+ * We don't inline mcs_spin_lock() so that perf can correctly account for the
+ * time spent in this lock function.
+ */
+static noinline
+void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
+{
+	struct mcs_spinlock *prev;
+
+	/* Init node */
+	node->locked = 0;
+	node->next   = NULL;
+
+	prev = xchg(lock, node);
+	if (likely(prev == NULL)) {
+		/* Lock acquired */
+		node->locked = 1;
+		return;
+	}
+	ACCESS_ONCE(prev->next) = node;
+	smp_wmb();
+	/* Wait until the lock holder passes the lock down */
+	while (!ACCESS_ONCE(node->locked))
+		arch_mutex_cpu_relax();
+}
+
+static void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
+{
+	struct mcs_spinlock *next = ACCESS_ONCE(node->next);
+
+	if (likely(!next)) {
+		/*
+		 * Release the lock by setting it to NULL
+		 */
+		if (cmpxchg(lock, node, NULL) == node)
+			return;
+		/* Wait until the next pointer is set */
+		while (!(next = ACCESS_ONCE(node->next)))
+			arch_mutex_cpu_relax();
+	}
+	ACCESS_ONCE(next->locked) = 1;
+	smp_wmb();
+}
+
+#endif /* __LINUX_MCS_SPINLOCK_H */
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index bab49da..32a32e6 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -46,6 +46,7 @@
  * - detects multi-task circular deadlocks and prints out all affected
  *   locks and tasks (and only those tasks)
  */
+struct mcs_spinlock;
 struct mutex {
 	/* 1: unlocked, 0: locked, negative: locked, possible waiters */
 	atomic_t		count;
@@ -55,7 +56,7 @@ struct mutex {
 	struct task_struct	*owner;
 #endif
 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
-	void			*spin_mlock;	/* Spinner MCS lock */
+	struct mcs_spinlock	*mcs_lock;	/* Spinner MCS lock */
 #endif
 #ifdef CONFIG_DEBUG_MUTEXES
 	const char 		*name;
@@ -179,4 +180,4 @@ extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
 # define arch_mutex_cpu_relax() cpu_relax()
 #endif
 
-#endif
+#endif /* __LINUX_MUTEX_H */
diff --git a/kernel/mutex.c b/kernel/mutex.c
index d24105b..e08b183 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -25,6 +25,7 @@
 #include <linux/spinlock.h>
 #include <linux/interrupt.h>
 #include <linux/debug_locks.h>
+#include <linux/mcs_spinlock.h>
 
 /*
  * In the DEBUG case we are using the "NULL fastpath" for mutexes,
@@ -52,7 +53,7 @@ __mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
 	INIT_LIST_HEAD(&lock->wait_list);
 	mutex_clear_owner(lock);
 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
-	lock->spin_mlock = NULL;
+	lock->mcs_lock = NULL;
 #endif
 
 	debug_mutex_init(lock, name, key);
@@ -111,54 +112,7 @@ EXPORT_SYMBOL(mutex_lock);
  * more or less simultaneously, the spinners need to acquire a MCS lock
  * first before spinning on the owner field.
  *
- * We don't inline mspin_lock() so that perf can correctly account for the
- * time spent in this lock function.
  */
-struct mspin_node {
-	struct mspin_node *next ;
-	int		  locked;	/* 1 if lock acquired */
-};
-#define	MLOCK(mutex)	((struct mspin_node **)&((mutex)->spin_mlock))
-
-static noinline
-void mspin_lock(struct mspin_node **lock, struct mspin_node *node)
-{
-	struct mspin_node *prev;
-
-	/* Init node */
-	node->locked = 0;
-	node->next   = NULL;
-
-	prev = xchg(lock, node);
-	if (likely(prev == NULL)) {
-		/* Lock acquired */
-		node->locked = 1;
-		return;
-	}
-	ACCESS_ONCE(prev->next) = node;
-	smp_wmb();
-	/* Wait until the lock holder passes the lock down */
-	while (!ACCESS_ONCE(node->locked))
-		arch_mutex_cpu_relax();
-}
-
-static void mspin_unlock(struct mspin_node **lock, struct mspin_node *node)
-{
-	struct mspin_node *next = ACCESS_ONCE(node->next);
-
-	if (likely(!next)) {
-		/*
-		 * Release the lock by setting it to NULL
-		 */
-		if (cmpxchg(lock, node, NULL) == node)
-			return;
-		/* Wait until the next pointer is set */
-		while (!(next = ACCESS_ONCE(node->next)))
-			arch_mutex_cpu_relax();
-	}
-	ACCESS_ONCE(next->locked) = 1;
-	smp_wmb();
-}
 
 /*
  * Mutex spinning code migrated from kernel/sched/core.c
@@ -448,7 +402,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 
 	for (;;) {
 		struct task_struct *owner;
-		struct mspin_node  node;
+		struct mcs_spinlock  node;
 
 		if (use_ww_ctx && ww_ctx->acquired > 0) {
 			struct ww_mutex *ww;
@@ -470,10 +424,10 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		 * If there's an owner, wait for it to either
 		 * release the lock or go to sleep.
 		 */
-		mspin_lock(MLOCK(lock), &node);
+		mcs_spin_lock(&lock->mcs_lock, &node);
 		owner = ACCESS_ONCE(lock->owner);
 		if (owner && !mutex_spin_on_owner(lock, owner)) {
-			mspin_unlock(MLOCK(lock), &node);
+			mcs_spin_unlock(&lock->mcs_lock, &node);
 			goto slowpath;
 		}
 
@@ -488,11 +442,11 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 			}
 
 			mutex_set_owner(lock);
-			mspin_unlock(MLOCK(lock), &node);
+			mcs_spin_unlock(&lock->mcs_lock, &node);
 			preempt_enable();
 			return 0;
 		}
-		mspin_unlock(MLOCK(lock), &node);
+		mcs_spin_unlock(&lock->mcs_lock, &node);
 
 		/*
 		 * When there's no owner, we might have preempted between the
-- 
1.7.4.4




* [PATCH v3 2/5] MCS Lock: optimizations and extra comments
       [not found] <cover.1383771175.git.tim.c.chen@linux.intel.com>
  2013-11-06 21:36 ` [PATCH v3 0/4] MCS Lock: MCS lock code cleanup and optimizations Tim Chen
  2013-11-06 21:37 ` [PATCH v3 1/5] MCS Lock: Restructure the MCS lock defines and locking code into its own file Tim Chen
@ 2013-11-06 21:37 ` Tim Chen
  2013-11-06 21:47   ` Tim Chen
  2013-11-06 21:37 ` [PATCH v3 3/5] MCS Lock: Barrier corrections Tim Chen
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Tim Chen @ 2013-11-06 21:37 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton, Thomas Gleixner
  Cc: linux-kernel, linux-mm, linux-arch, Linus Torvalds, Waiman Long,
	Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
	Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, Peter Hurley, Paul E.McKenney, Tim Chen,
	Raghavendra K T, George Spelvin, H. Peter Anvin, Arnd Bergmann,
	Aswin Chandramouleeswaran, Scott J Norton, Will Deacon,
	Figo.zhang

Remove an unnecessary operation and mark the cmpxchg(lock, node, NULL) == node
check in mcs_spin_unlock() as likely(), since most of the time a race does
not occur.

Also add more comments describing how the local node is used in MCS locks.
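
For illustration, a minimal call-site sketch (`some_lock' is a
hypothetical shared `struct mcs_spinlock *'; the node itself lives on
the caller's stack):

	struct mcs_spinlock node;

	mcs_spin_lock(&some_lock, &node);	/* queue up, spin on our own node */
	/* ... critical section ... */
	mcs_spin_unlock(&some_lock, &node);	/* hand the lock to the next waiter */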

Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 include/linux/mcs_spinlock.h |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/mcs_spinlock.h b/include/linux/mcs_spinlock.h
index b5de3b0..96f14299 100644
--- a/include/linux/mcs_spinlock.h
+++ b/include/linux/mcs_spinlock.h
@@ -18,6 +18,12 @@ struct mcs_spinlock {
 };
 
 /*
+ * In order to acquire the lock, the caller should declare a local node and
+ * pass a reference of the node to this function in addition to the lock.
+ * If the lock has already been acquired, then this will proceed to spin
+ * on this node->locked until the previous lock holder sets the node->locked
+ * in mcs_spin_unlock().
+ *
  * We don't inline mcs_spin_lock() so that perf can correctly account for the
  * time spent in this lock function.
  */
@@ -33,7 +39,6 @@ void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
 	prev = xchg(lock, node);
 	if (likely(prev == NULL)) {
 		/* Lock acquired */
-		node->locked = 1;
 		return;
 	}
 	ACCESS_ONCE(prev->next) = node;
@@ -43,6 +48,10 @@ void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
 		arch_mutex_cpu_relax();
 }
 
+/*
+ * Releases the lock. The caller should pass in the corresponding node that
+ * was used to acquire the lock.
+ */
 static void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
 {
 	struct mcs_spinlock *next = ACCESS_ONCE(node->next);
@@ -51,7 +60,7 @@ static void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *nod
 		/*
 		 * Release the lock by setting it to NULL
 		 */
-		if (cmpxchg(lock, node, NULL) == node)
+		if (likely(cmpxchg(lock, node, NULL) == node))
 			return;
 		/* Wait until the next pointer is set */
 		while (!(next = ACCESS_ONCE(node->next)))
-- 
1.7.4.4




* [PATCH v3 3/5] MCS Lock: Barrier corrections
       [not found] <cover.1383771175.git.tim.c.chen@linux.intel.com>
                   ` (2 preceding siblings ...)
  2013-11-06 21:37 ` [PATCH v3 2/5] MCS Lock: optimizations and extra comments Tim Chen
@ 2013-11-06 21:37 ` Tim Chen
  2013-11-07  1:39   ` Linus Torvalds
  2013-11-06 21:37 ` [PATCH v3 4/5] MCS Lock: Make mcs_spinlock.h includable in other files Tim Chen
  2013-11-06 21:37 ` [PATCH v3 5/5] MCS Lock: Allow architecture specific memory barrier in lock/unlock Tim Chen
  5 siblings, 1 reply; 27+ messages in thread
From: Tim Chen @ 2013-11-06 21:37 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton, Thomas Gleixner
  Cc: linux-kernel, linux-mm, linux-arch, Linus Torvalds, Waiman Long,
	Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
	Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, Peter Hurley, Paul E.McKenney, Tim Chen,
	Raghavendra K T, George Spelvin, H. Peter Anvin, Arnd Bergmann,
	Aswin Chandramouleeswaran, Scott J Norton, Will Deacon,
	Figo.zhang

This patch corrects the way memory barriers are used in the MCS lock
and removes those that are not needed.  It also adds comments on all
barriers.

Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 include/linux/mcs_spinlock.h |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/mcs_spinlock.h b/include/linux/mcs_spinlock.h
index 96f14299..93d445d 100644
--- a/include/linux/mcs_spinlock.h
+++ b/include/linux/mcs_spinlock.h
@@ -36,16 +36,19 @@ void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
 	node->locked = 0;
 	node->next   = NULL;
 
+	/* xchg() provides a memory barrier */
 	prev = xchg(lock, node);
 	if (likely(prev == NULL)) {
 		/* Lock acquired */
 		return;
 	}
 	ACCESS_ONCE(prev->next) = node;
-	smp_wmb();
 	/* Wait until the lock holder passes the lock down */
 	while (!ACCESS_ONCE(node->locked))
 		arch_mutex_cpu_relax();
+
+	/* Make sure subsequent operations happen after the lock is acquired */
+	smp_rmb();
 }
 
 /*
@@ -58,6 +61,7 @@ static void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *nod
 
 	if (likely(!next)) {
 		/*
+		 * cmpxchg() provides a memory barrier.
 		 * Release the lock by setting it to NULL
 		 */
 		if (likely(cmpxchg(lock, node, NULL) == node))
@@ -65,9 +69,14 @@ static void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *nod
 		/* Wait until the next pointer is set */
 		while (!(next = ACCESS_ONCE(node->next)))
 			arch_mutex_cpu_relax();
+	} else {
+		/*
+		 * Make sure all operations within the critical section
+		 * happen before the lock is released.
+		 */
+		smp_wmb();
 	}
 	ACCESS_ONCE(next->locked) = 1;
-	smp_wmb();
 }
 
 #endif /* __LINUX_MCS_SPINLOCK_H */
-- 
1.7.4.4




* [PATCH v3 4/5] MCS Lock: Make mcs_spinlock.h includable in other files
       [not found] <cover.1383771175.git.tim.c.chen@linux.intel.com>
                   ` (3 preceding siblings ...)
  2013-11-06 21:37 ` [PATCH v3 3/5] MCS Lock: Barrier corrections Tim Chen
@ 2013-11-06 21:37 ` Tim Chen
  2013-11-06 21:41   ` Tim Chen
  2013-11-06 21:37 ` [PATCH v3 5/5] MCS Lock: Allow architecture specific memory barrier in lock/unlock Tim Chen
  5 siblings, 1 reply; 27+ messages in thread
From: Tim Chen @ 2013-11-06 21:37 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton, Thomas Gleixner
  Cc: linux-kernel, linux-mm, linux-arch, Linus Torvalds, Waiman Long,
	Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
	Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, Peter Hurley, Paul E.McKenney, Tim Chen,
	Raghavendra K T, George Spelvin, H. Peter Anvin, Arnd Bergmann,
	Aswin Chandramouleeswaran, Scott J Norton, Will Deacon,
	Figo.zhang

The following changes are made to enable the mcs_spinlock.h file to be
widely included in other files without causing problems:

1) Include a number of prerequisite header files and define
   arch_mutex_cpu_relax(), if not previously defined.
2) Make mcs_spin_unlock() an inlined function and
   rename mcs_spin_lock() to _raw_mcs_spin_lock(), which is also an
   inlined function.
3) Create a new mcs_spinlock.c file to contain the non-inlined
   mcs_spin_lock() function.
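
The resulting call structure is, as a sketch:

	mcs_spin_lock()			/* kernel/mcs_spinlock.c, non-inlined so
					 * perf can account for the time spent */
	  -> _raw_mcs_spin_lock()	/* include/linux/mcs_spinlock.h, inlined */

	mcs_spin_unlock()		/* include/linux/mcs_spinlock.h, inlined */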

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 include/linux/mcs_spinlock.h |   27 ++++++++++++++++++++++-----
 kernel/Makefile              |    6 +++---
 kernel/mcs_spinlock.c        |   21 +++++++++++++++++++++
 3 files changed, 46 insertions(+), 8 deletions(-)
 create mode 100644 kernel/mcs_spinlock.c

diff --git a/include/linux/mcs_spinlock.h b/include/linux/mcs_spinlock.h
index 93d445d..f2c71e8 100644
--- a/include/linux/mcs_spinlock.h
+++ b/include/linux/mcs_spinlock.h
@@ -12,11 +12,27 @@
 #ifndef __LINUX_MCS_SPINLOCK_H
 #define __LINUX_MCS_SPINLOCK_H
 
+/*
+ * asm/processor.h may define arch_mutex_cpu_relax().
+ * If it is not defined, cpu_relax() will be used.
+ */
+#include <asm/barrier.h>
+#include <asm/cmpxchg.h>
+#include <asm/processor.h>
+#include <linux/compiler.h>
+
+#ifndef arch_mutex_cpu_relax
+# define arch_mutex_cpu_relax() cpu_relax()
+#endif
+
 struct mcs_spinlock {
 	struct mcs_spinlock *next;
 	int locked; /* 1 if lock acquired */
 };
 
+extern
+void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node);
+
 /*
  * In order to acquire the lock, the caller should declare a local node and
  * pass a reference of the node to this function in addition to the lock.
@@ -24,11 +40,11 @@ struct mcs_spinlock {
  * on this node->locked until the previous lock holder sets the node->locked
  * in mcs_spin_unlock().
  *
- * We don't inline mcs_spin_lock() so that perf can correctly account for the
- * time spent in this lock function.
+ * The _raw_mcs_spin_lock() function should not be called directly. Instead,
+ * users should call mcs_spin_lock().
  */
-static noinline
-void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
+static inline
+void _raw_mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
 {
 	struct mcs_spinlock *prev;
 
@@ -55,7 +71,8 @@ void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
  * Releases the lock. The caller should pass in the corresponding node that
  * was used to acquire the lock.
  */
-static void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
+static inline
+void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
 {
 	struct mcs_spinlock *next = ACCESS_ONCE(node->next);
 
diff --git a/kernel/Makefile b/kernel/Makefile
index 1ce4755..2ad8454 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -50,9 +50,9 @@ obj-$(CONFIG_SMP) += smp.o
 ifneq ($(CONFIG_SMP),y)
 obj-y += up.o
 endif
-obj-$(CONFIG_SMP) += spinlock.o
-obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock.o
-obj-$(CONFIG_PROVE_LOCKING) += spinlock.o
+obj-$(CONFIG_SMP) += spinlock.o mcs_spinlock.o
+obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock.o mcs_spinlock.o
+obj-$(CONFIG_PROVE_LOCKING) += spinlock.o mcs_spinlock.o
 obj-$(CONFIG_UID16) += uid16.o
 obj-$(CONFIG_MODULES) += module.o
 obj-$(CONFIG_MODULE_SIG) += module_signing.o modsign_pubkey.o modsign_certificate.o
diff --git a/kernel/mcs_spinlock.c b/kernel/mcs_spinlock.c
new file mode 100644
index 0000000..3c55626
--- /dev/null
+++ b/kernel/mcs_spinlock.c
@@ -0,0 +1,21 @@
+/*
+ * MCS lock
+ *
+ * The MCS lock (proposed by Mellor-Crummey and Scott) is a simple spin-lock
+ * with the desirable properties of being fair, and with each cpu trying
+ * to acquire the lock spinning on a local variable.
+ * It avoids the expensive cache line bouncing that common test-and-set
+ * spin-lock implementations incur.
+ */
+#include <linux/mcs_spinlock.h>
+#include <linux/export.h>
+
+/*
+ * We don't inline mcs_spin_lock() so that perf can correctly account for the
+ * time spent in this lock function.
+ */
+void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
+{
+	_raw_mcs_spin_lock(lock, node);
+}
+EXPORT_SYMBOL_GPL(mcs_spin_lock);
-- 
1.7.4.4




* [PATCH v3 5/5] MCS Lock: Allow architecture specific memory barrier in lock/unlock
       [not found] <cover.1383771175.git.tim.c.chen@linux.intel.com>
                   ` (4 preceding siblings ...)
  2013-11-06 21:37 ` [PATCH v3 4/5] MCS Lock: Make mcs_spinlock.h includable in other files Tim Chen
@ 2013-11-06 21:37 ` Tim Chen
  2013-11-06 21:42   ` Tim Chen
  5 siblings, 1 reply; 27+ messages in thread
From: Tim Chen @ 2013-11-06 21:37 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton, Thomas Gleixner
  Cc: linux-kernel, linux-mm, linux-arch, Linus Torvalds, Waiman Long,
	Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
	Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, Peter Hurley, Paul E.McKenney, Tim Chen,
	Raghavendra K T, George Spelvin, H. Peter Anvin, Arnd Bergmann,
	Aswin Chandramouleeswaran, Scott J Norton, Will Deacon,
	Figo.zhang

This patch moves the decision of what kind of memory barriers are used
in the MCS lock and unlock functions to the architecture-specific
layer.  It also moves the actual lock/unlock code to the mcs_spinlock.c
file.

A full memory barrier will be used if the following macros are not
defined:
 1) smp_mb__before_critical_section()
 2) smp_mb__after_critical_section()

For the x86 architecture, only a compiler barrier is needed.
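
For example, on an architecture that defines neither macro, the tail of
the lock slow path effectively becomes (a sketch of the fallback
expansion):

	/* Wait until the lock holder passes the lock down */
	while (!ACCESS_ONCE(node->locked))
		arch_mutex_cpu_relax();

	smp_mb();	/* fallback for smp_mb__before_critical_section() */

while on x86 the same hook expands to just a compiler barrier().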

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/include/asm/barrier.h |    6 +++
 include/linux/mcs_spinlock.h   |   78 +-------------------------------------
 kernel/mcs_spinlock.c          |   81 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 86 insertions(+), 79 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index c6cd358..6d0172c 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -92,6 +92,12 @@
 #endif
 #define smp_read_barrier_depends()	read_barrier_depends()
 #define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
+
+#if !defined(CONFIG_X86_PPRO_FENCE) && !defined(CONFIG_X86_OOSTORE)
+# define smp_mb__before_critical_section()	barrier()
+# define smp_mb__after_critical_section()	barrier()
+#endif
+
 #else
 #define smp_mb()	barrier()
 #define smp_rmb()	barrier()
diff --git a/include/linux/mcs_spinlock.h b/include/linux/mcs_spinlock.h
index f2c71e8..d54bb23 100644
--- a/include/linux/mcs_spinlock.h
+++ b/include/linux/mcs_spinlock.h
@@ -12,19 +12,6 @@
 #ifndef __LINUX_MCS_SPINLOCK_H
 #define __LINUX_MCS_SPINLOCK_H
 
-/*
- * asm/processor.h may define arch_mutex_cpu_relax().
- * If it is not defined, cpu_relax() will be used.
- */
-#include <asm/barrier.h>
-#include <asm/cmpxchg.h>
-#include <asm/processor.h>
-#include <linux/compiler.h>
-
-#ifndef arch_mutex_cpu_relax
-# define arch_mutex_cpu_relax() cpu_relax()
-#endif
-
 struct mcs_spinlock {
 	struct mcs_spinlock *next;
 	int locked; /* 1 if lock acquired */
@@ -32,68 +19,7 @@ struct mcs_spinlock {
 
 extern
 void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node);
-
-/*
- * In order to acquire the lock, the caller should declare a local node and
- * pass a reference of the node to this function in addition to the lock.
- * If the lock has already been acquired, then this will proceed to spin
- * on this node->locked until the previous lock holder sets the node->locked
- * in mcs_spin_unlock().
- *
- * The _raw_mcs_spin_lock() function should not be called directly. Instead,
- * users should call mcs_spin_lock().
- */
-static inline
-void _raw_mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
-{
-	struct mcs_spinlock *prev;
-
-	/* Init node */
-	node->locked = 0;
-	node->next   = NULL;
-
-	/* xchg() provides a memory barrier */
-	prev = xchg(lock, node);
-	if (likely(prev == NULL)) {
-		/* Lock acquired */
-		return;
-	}
-	ACCESS_ONCE(prev->next) = node;
-	/* Wait until the lock holder passes the lock down */
-	while (!ACCESS_ONCE(node->locked))
-		arch_mutex_cpu_relax();
-
-	/* Make sure subsequent operations happen after the lock is acquired */
-	smp_rmb();
-}
-
-/*
- * Releases the lock. The caller should pass in the corresponding node that
- * was used to acquire the lock.
- */
-static inline
-void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
-{
-	struct mcs_spinlock *next = ACCESS_ONCE(node->next);
-
-	if (likely(!next)) {
-		/*
-		 * cmpxchg() provides a memory barrier.
-		 * Release the lock by setting it to NULL
-		 */
-		if (likely(cmpxchg(lock, node, NULL) == node))
-			return;
-		/* Wait until the next pointer is set */
-		while (!(next = ACCESS_ONCE(node->next)))
-			arch_mutex_cpu_relax();
-	} else {
-		/*
-		 * Make sure all operations within the critical section
-		 * happen before the lock is released.
-		 */
-		smp_wmb();
-	}
-	ACCESS_ONCE(next->locked) = 1;
-}
+extern
+void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node);
 
 #endif /* __LINUX_MCS_SPINLOCK_H */
diff --git a/kernel/mcs_spinlock.c b/kernel/mcs_spinlock.c
index 3c55626..2dfd207 100644
--- a/kernel/mcs_spinlock.c
+++ b/kernel/mcs_spinlock.c
@@ -7,15 +7,90 @@
 * It avoids the expensive cache line bouncing that common test-and-set
 * spin-lock implementations incur.
  */
+/*
+ * asm/processor.h may define arch_mutex_cpu_relax().
+ * If it is not defined, cpu_relax() will be used.
+ */
+#include <asm/barrier.h>
+#include <asm/cmpxchg.h>
+#include <asm/processor.h>
+#include <linux/compiler.h>
 #include <linux/mcs_spinlock.h>
 #include <linux/export.h>
 
+#ifndef arch_mutex_cpu_relax
+# define arch_mutex_cpu_relax() cpu_relax()
+#endif
+
 /*
- * We don't inline mcs_spin_lock() so that perf can correctly account for the
- * time spent in this lock function.
+ * Fall back to a full memory barrier if these macros are not defined
+ * in an architecture-specific header file.
+ */
+#ifndef smp_mb__before_critical_section
+#define	smp_mb__before_critical_section()	smp_mb()
+#endif
+
+#ifndef smp_mb__after_critical_section
+#define	smp_mb__after_critical_section()	smp_mb()
+#endif
+
+
+/*
+ * In order to acquire the lock, the caller should declare a local node and
+ * pass a reference of the node to this function in addition to the lock.
+ * If the lock has already been acquired, then this will proceed to spin
+ * on this node->locked until the previous lock holder sets the node->locked
+ * in mcs_spin_unlock().
  */
 void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
 {
-	_raw_mcs_spin_lock(lock, node);
+	struct mcs_spinlock *prev;
+
+	/* Init node */
+	node->locked = 0;
+	node->next   = NULL;
+
+	/* xchg() provides a memory barrier */
+	prev = xchg(lock, node);
+	if (likely(prev == NULL)) {
+		/* Lock acquired */
+		return;
+	}
+	ACCESS_ONCE(prev->next) = node;
+	/* Wait until the lock holder passes the lock down */
+	while (!ACCESS_ONCE(node->locked))
+		arch_mutex_cpu_relax();
+
+	/* Make sure subsequent operations happen after the lock is acquired */
+	smp_mb__before_critical_section();
 }
 EXPORT_SYMBOL_GPL(mcs_spin_lock);
+
+/*
+ * Releases the lock. The caller should pass in the corresponding node that
+ * was used to acquire the lock.
+ */
+void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
+{
+	struct mcs_spinlock *next = ACCESS_ONCE(node->next);
+
+	if (likely(!next)) {
+		/*
+		 * cmpxchg() provides a memory barrier.
+		 * Release the lock by setting it to NULL
+		 */
+		if (likely(cmpxchg(lock, node, NULL) == node))
+			return;
+		/* Wait until the next pointer is set */
+		while (!(next = ACCESS_ONCE(node->next)))
+			arch_mutex_cpu_relax();
+	} else {
+		/*
+		 * Make sure all operations within the critical section
+		 * happen before the lock is released.
+		 */
+		smp_mb__after_critical_section();
+	}
+	ACCESS_ONCE(next->locked) = 1;
+}
+EXPORT_SYMBOL_GPL(mcs_spin_unlock);
-- 
1.7.4.4



* Re: [PATCH v3 4/5] MCS Lock: Make mcs_spinlock.h includable in other files
  2013-11-06 21:37 ` [PATCH v3 4/5] MCS Lock: Make mcs_spinlock.h includable in other files Tim Chen
@ 2013-11-06 21:41   ` Tim Chen
  0 siblings, 0 replies; 27+ messages in thread
From: Tim Chen @ 2013-11-06 21:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Thomas Gleixner, linux-kernel, linux-mm,
	linux-arch, Linus Torvalds, Waiman Long, Andrea Arcangeli,
	Alex Shi, Andi Kleen, Michel Lespinasse, Davidlohr Bueso,
	Matthew R Wilcox, Dave Hansen, Peter Zijlstra, Rik van Riel,
	Peter Hurley, Paul E.McKenney, Raghavendra K T, George Spelvin,
	H. Peter Anvin, Arnd Bergmann, Aswin Chandramouleeswaran,
	Scott J Norton, Will Deacon, Figo.zhang

On Wed, 2013-11-06 at 13:37 -0800, Tim Chen wrote:
> The following changes are made to enable the mcs_spinlock.h file to be
> widely included in other files without causing problems:
> 
> 1) Include a number of prerequisite header files and define
>    arch_mutex_cpu_relax(), if not previously defined.
> 2) Make mcs_spin_unlock() an inlined function and
>    rename mcs_spin_lock() to _raw_mcs_spin_lock(), which is also an
>    inlined function.
> 3) Create a new mcs_spinlock.c file to contain the non-inlined
>    mcs_spin_lock() function.
> 
> Signed-off-by: Waiman Long <Waiman.Long@hp.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>

Should be Acked-by: Tim Chen <tim.c.chen@linux.intel.com>

> ---
>  include/linux/mcs_spinlock.h |   27 ++++++++++++++++++++++-----
>  kernel/Makefile              |    6 +++---
>  kernel/mcs_spinlock.c        |   21 +++++++++++++++++++++
>  3 files changed, 46 insertions(+), 8 deletions(-)
>  create mode 100644 kernel/mcs_spinlock.c
> 
> diff --git a/include/linux/mcs_spinlock.h b/include/linux/mcs_spinlock.h
> index 93d445d..f2c71e8 100644
> --- a/include/linux/mcs_spinlock.h
> +++ b/include/linux/mcs_spinlock.h
> @@ -12,11 +12,27 @@
>  #ifndef __LINUX_MCS_SPINLOCK_H
>  #define __LINUX_MCS_SPINLOCK_H
>  
> +/*
> + * asm/processor.h may define arch_mutex_cpu_relax().
> + * If it is not defined, cpu_relax() will be used.
> + */
> +#include <asm/barrier.h>
> +#include <asm/cmpxchg.h>
> +#include <asm/processor.h>
> +#include <linux/compiler.h>
> +
> +#ifndef arch_mutex_cpu_relax
> +# define arch_mutex_cpu_relax() cpu_relax()
> +#endif
> +
>  struct mcs_spinlock {
>  	struct mcs_spinlock *next;
>  	int locked; /* 1 if lock acquired */
>  };
>  
> +extern
> +void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node);
> +
>  /*
>   * In order to acquire the lock, the caller should declare a local node and
>   * pass a reference of the node to this function in addition to the lock.
> @@ -24,11 +40,11 @@ struct mcs_spinlock {
>   * on this node->locked until the previous lock holder sets the node->locked
>   * in mcs_spin_unlock().
>   *
> - * We don't inline mcs_spin_lock() so that perf can correctly account for the
> - * time spent in this lock function.
> + * The _raw_mcs_spin_lock() function should not be called directly. Instead,
> + * users should call mcs_spin_lock().
>   */
> -static noinline
> -void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
> +static inline
> +void _raw_mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
>  {
>  	struct mcs_spinlock *prev;
>  
> @@ -55,7 +71,8 @@ void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
>   * Releases the lock. The caller should pass in the corresponding node that
>   * was used to acquire the lock.
>   */
> -static void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
> +static inline
> +void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
>  {
>  	struct mcs_spinlock *next = ACCESS_ONCE(node->next);
>  
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 1ce4755..2ad8454 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -50,9 +50,9 @@ obj-$(CONFIG_SMP) += smp.o
>  ifneq ($(CONFIG_SMP),y)
>  obj-y += up.o
>  endif
> -obj-$(CONFIG_SMP) += spinlock.o
> -obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock.o
> -obj-$(CONFIG_PROVE_LOCKING) += spinlock.o
> +obj-$(CONFIG_SMP) += spinlock.o mcs_spinlock.o
> +obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock.o mcs_spinlock.o
> +obj-$(CONFIG_PROVE_LOCKING) += spinlock.o mcs_spinlock.o
>  obj-$(CONFIG_UID16) += uid16.o
>  obj-$(CONFIG_MODULES) += module.o
>  obj-$(CONFIG_MODULE_SIG) += module_signing.o modsign_pubkey.o modsign_certificate.o
> diff --git a/kernel/mcs_spinlock.c b/kernel/mcs_spinlock.c
> new file mode 100644
> index 0000000..3c55626
> --- /dev/null
> +++ b/kernel/mcs_spinlock.c
> @@ -0,0 +1,21 @@
> +/*
> + * MCS lock
> + *
> + * The MCS lock (proposed by Mellor-Crummey and Scott) is a simple spin-lock
> + * with the desirable properties of being fair, and with each cpu trying
> + * to acquire the lock spinning on a local variable.
> + * It avoids the expensive cache line bouncing that common test-and-set
> + * spin-lock implementations incur.
> + */
> +#include <linux/mcs_spinlock.h>
> +#include <linux/export.h>
> +
> +/*
> + * We don't inline mcs_spin_lock() so that perf can correctly account for the
> + * time spent in this lock function.
> + */
> +void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
> +{
> +	_raw_mcs_spin_lock(lock, node);
> +}
> +EXPORT_SYMBOL_GPL(mcs_spin_lock);



* Re: [PATCH v3 0/4] MCS Lock: MCS lock code cleanup and optimizations
  2013-11-06 21:36 ` [PATCH v3 0/4] MCS Lock: MCS lock code cleanup and optimizations Tim Chen
@ 2013-11-06 21:41   ` Davidlohr Bueso
  2013-11-06 23:55     ` Tim Chen
  2013-11-06 21:42   ` H. Peter Anvin
  1 sibling, 1 reply; 27+ messages in thread
From: Davidlohr Bueso @ 2013-11-06 21:41 UTC (permalink / raw)
  To: Tim Chen
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, linux-kernel,
	linux-mm, linux-arch, Linus Torvalds, Waiman Long,
	Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
	Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, Peter Hurley, Paul E.McKenney, Raghavendra K T,
	George Spelvin, H. Peter Anvin, Arnd Bergmann,
	Aswin Chandramouleeswaran, Scott J Norton, Will Deacon,
	Figo.zhang

On Wed, 2013-11-06 at 13:36 -0800, Tim Chen wrote:
> In this patch series, we separate out the MCS lock code that was
> previously embedded in mutex.c.  This allows for easier reuse of the
> MCS lock in other places like rwsem and qrwlock.  We also did some
> micro-optimizations and barrier cleanup.
> 
> These patches were previously part of the rwsem optimization patch
> series, but we now separate them out.
> 
> Tim Chen
> 
> v3:
> 1. modified memory barriers to support non x86 architectures that have
> weak memory ordering.
> 
> v2:
> 1. changed the mcs_spin_lock export to a GPL export symbol
> 2. corrected references to mcs_spin_lock
> 
> 
> Jason Low (2):
>   MCS Lock: optimizations and extra comments
>   MCS Lock: Barrier corrections
> 
> Tim Chen (1):
>   MCS Lock: Restructure the MCS lock defines and locking code into its
>     own file
> 
> Waiman Long (2):
>   MCS Lock: Make mcs_spinlock.h includable in other files
>   MCS Lock: Allow architecture specific memory barrier in lock/unlock
> 
>  arch/x86/include/asm/barrier.h |    6 +++
>  include/linux/mcs_spinlock.h   |   25 ++++++++++
>  include/linux/mutex.h          |    5 +-
>  kernel/Makefile                |    6 +-
>  kernel/mcs_spinlock.c          |   96 ++++++++++++++++++++++++++++++++++++++++
>  kernel/mutex.c                 |   60 +++----------------------
>  6 files changed, 140 insertions(+), 58 deletions(-)
>  create mode 100644 include/linux/mcs_spinlock.h
>  create mode 100644 kernel/mcs_spinlock.c

Hmm, I noticed that Peter's patchset to move the locking mechanisms into
their own directory is now in -tip, i.e.:

http://marc.info/?l=linux-kernel&m=138373682928585

So we'll have problems applying this patchset; it would probably be best
to rebase on top.
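
e.g. something like (a sketch; the remote and branch names are
hypothetical):

	git fetch tip
	git rebase tip/core/locking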

Thanks,
Davidlohr


* Re: [PATCH v3 5/5] MCS Lock: Allow architecture specific memory barrier in lock/unlock
  2013-11-06 21:37 ` [PATCH v3 5/5] MCS Lock: Allow architecture specific memory barrier in lock/unlock Tim Chen
@ 2013-11-06 21:42   ` Tim Chen
  0 siblings, 0 replies; 27+ messages in thread
From: Tim Chen @ 2013-11-06 21:42 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Thomas Gleixner, linux-kernel, linux-mm,
	linux-arch, Linus Torvalds, Waiman Long, Andrea Arcangeli,
	Alex Shi, Andi Kleen, Michel Lespinasse, Davidlohr Bueso,
	Matthew R Wilcox, Dave Hansen, Peter Zijlstra, Rik van Riel,
	Peter Hurley, Paul E.McKenney, Raghavendra K T, George Spelvin,
	H. Peter Anvin, Arnd Bergmann, Aswin Chandramouleeswaran,
	Scott J Norton, Will Deacon, Figo.zhang

On Wed, 2013-11-06 at 13:37 -0800, Tim Chen wrote:
> This patch moves the decision of what kind of memory barriers are used
> in the MCS lock and unlock functions to the architecture-specific
> layer.  It also moves the actual lock/unlock code to the mcs_spinlock.c
> file.
> 
> A full memory barrier will be used if the following macros are not
> defined:
>  1) smp_mb__before_critical_section()
>  2) smp_mb__after_critical_section()
> 
> For the x86 architecture, only a compiler barrier is needed.
> 
> Signed-off-by: Waiman Long <Waiman.Long@hp.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>

Should be Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>  arch/x86/include/asm/barrier.h |    6 +++
>  include/linux/mcs_spinlock.h   |   78 +-------------------------------------
>  kernel/mcs_spinlock.c          |   81 ++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 86 insertions(+), 79 deletions(-)
> 
> diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
> index c6cd358..6d0172c 100644
> --- a/arch/x86/include/asm/barrier.h
> +++ b/arch/x86/include/asm/barrier.h
> @@ -92,6 +92,12 @@
>  #endif
>  #define smp_read_barrier_depends()	read_barrier_depends()
>  #define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
> +
> +#if !defined(CONFIG_X86_PPRO_FENCE) && !defined(CONFIG_X86_OOSTORE)
> +# define smp_mb__before_critical_section()	barrier()
> +# define smp_mb__after_critical_section()	barrier()
> +#endif
> +
>  #else
>  #define smp_mb()	barrier()
>  #define smp_rmb()	barrier()
> diff --git a/include/linux/mcs_spinlock.h b/include/linux/mcs_spinlock.h
> index f2c71e8..d54bb23 100644
> --- a/include/linux/mcs_spinlock.h
> +++ b/include/linux/mcs_spinlock.h
> @@ -12,19 +12,6 @@
>  #ifndef __LINUX_MCS_SPINLOCK_H
>  #define __LINUX_MCS_SPINLOCK_H
>  
> -/*
> - * asm/processor.h may define arch_mutex_cpu_relax().
> - * If it is not defined, cpu_relax() will be used.
> - */
> -#include <asm/barrier.h>
> -#include <asm/cmpxchg.h>
> -#include <asm/processor.h>
> -#include <linux/compiler.h>
> -
> -#ifndef arch_mutex_cpu_relax
> -# define arch_mutex_cpu_relax() cpu_relax()
> -#endif
> -
>  struct mcs_spinlock {
>  	struct mcs_spinlock *next;
>  	int locked; /* 1 if lock acquired */
> @@ -32,68 +19,7 @@ struct mcs_spinlock {
>  
>  extern
>  void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node);
> -
> -/*
> - * In order to acquire the lock, the caller should declare a local node and
> - * pass a reference of the node to this function in addition to the lock.
> - * If the lock has already been acquired, then this will proceed to spin
> - * on this node->locked until the previous lock holder sets the node->locked
> - * in mcs_spin_unlock().
> - *
> - * The _raw_mcs_spin_lock() function should not be called directly. Instead,
> - * users should call mcs_spin_lock().
> - */
> -static inline
> -void _raw_mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
> -{
> -	struct mcs_spinlock *prev;
> -
> -	/* Init node */
> -	node->locked = 0;
> -	node->next   = NULL;
> -
> -	/* xchg() provides a memory barrier */
> -	prev = xchg(lock, node);
> -	if (likely(prev == NULL)) {
> -		/* Lock acquired */
> -		return;
> -	}
> -	ACCESS_ONCE(prev->next) = node;
> -	/* Wait until the lock holder passes the lock down */
> -	while (!ACCESS_ONCE(node->locked))
> -		arch_mutex_cpu_relax();
> -
> -	/* Make sure subsequent operations happen after the lock is acquired */
> -	smp_rmb();
> -}
> -
> -/*
> - * Releases the lock. The caller should pass in the corresponding node that
> - * was used to acquire the lock.
> - */
> -static inline
> -void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
> -{
> -	struct mcs_spinlock *next = ACCESS_ONCE(node->next);
> -
> -	if (likely(!next)) {
> -		/*
> -		 * cmpxchg() provides a memory barrier.
> -		 * Release the lock by setting it to NULL
> -		 */
> -		if (likely(cmpxchg(lock, node, NULL) == node))
> -			return;
> -		/* Wait until the next pointer is set */
> -		while (!(next = ACCESS_ONCE(node->next)))
> -			arch_mutex_cpu_relax();
> -	} else {
> -		/*
> -		 * Make sure all operations within the critical section
> -		 * happen before the lock is released.
> -		 */
> -		smp_wmb();
> -	}
> -	ACCESS_ONCE(next->locked) = 1;
> -}
> +extern
> +void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node);
>  
>  #endif /* __LINUX_MCS_SPINLOCK_H */
> diff --git a/kernel/mcs_spinlock.c b/kernel/mcs_spinlock.c
> index 3c55626..2dfd207 100644
> --- a/kernel/mcs_spinlock.c
> +++ b/kernel/mcs_spinlock.c
> @@ -7,15 +7,90 @@
>   * It avoids the expensive cache line bouncing that common test-and-set
>   * spin-lock implementations incur.
>   */
> +/*
> + * asm/processor.h may define arch_mutex_cpu_relax().
> + * If it is not defined, cpu_relax() will be used.
> + */
> +#include <asm/barrier.h>
> +#include <asm/cmpxchg.h>
> +#include <asm/processor.h>
> +#include <linux/compiler.h>
>  #include <linux/mcs_spinlock.h>
>  #include <linux/export.h>
>  
> +#ifndef arch_mutex_cpu_relax
> +# define arch_mutex_cpu_relax() cpu_relax()
> +#endif
> +
>  /*
> - * We don't inline mcs_spin_lock() so that perf can correctly account for the
> - * time spent in this lock function.
> + * Fall back to a full memory barrier if these macros are not defined
> + * in an architecture-specific header file.
> + */
> +#ifndef smp_mb__before_critical_section
> +#define	smp_mb__before_critical_section()	smp_mb()
> +#endif
> +
> +#ifndef smp_mb__after_critical_section
> +#define	smp_mb__after_critical_section()	smp_mb()
> +#endif
> +
> +
> +/*
> + * In order to acquire the lock, the caller should declare a local node and
> + * pass a reference of the node to this function in addition to the lock.
> + * If the lock has already been acquired, then this will proceed to spin
> + * on this node->locked until the previous lock holder sets the node->locked
> + * in mcs_spin_unlock().
>   */
>  void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
>  {
> -	_raw_mcs_spin_lock(lock, node);
> +	struct mcs_spinlock *prev;
> +
> +	/* Init node */
> +	node->locked = 0;
> +	node->next   = NULL;
> +
> +	/* xchg() provides a memory barrier */
> +	prev = xchg(lock, node);
> +	if (likely(prev == NULL)) {
> +		/* Lock acquired */
> +		return;
> +	}
> +	ACCESS_ONCE(prev->next) = node;
> +	/* Wait until the lock holder passes the lock down */
> +	while (!ACCESS_ONCE(node->locked))
> +		arch_mutex_cpu_relax();
> +
> +	/* Make sure subsequent operations happen after the lock is acquired */
> +	smp_mb__before_critical_section();
>  }
>  EXPORT_SYMBOL_GPL(mcs_spin_lock);
> +
> +/*
> + * Releases the lock. The caller should pass in the corresponding node that
> + * was used to acquire the lock.
> + */
> +void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
> +{
> +	struct mcs_spinlock *next = ACCESS_ONCE(node->next);
> +
> +	if (likely(!next)) {
> +		/*
> +		 * cmpxchg() provides a memory barrier.
> +		 * Release the lock by setting it to NULL
> +		 */
> +		if (likely(cmpxchg(lock, node, NULL) == node))
> +			return;
> +		/* Wait until the next pointer is set */
> +		while (!(next = ACCESS_ONCE(node->next)))
> +			arch_mutex_cpu_relax();
> +	} else {
> +		/*
> +		 * Make sure all operations within the critical section
> +		 * happen before the lock is released.
> +		 */
> +		smp_mb__after_critical_section();
> +	}
> +	ACCESS_ONCE(next->locked) = 1;
> +}
> +EXPORT_SYMBOL_GPL(mcs_spin_unlock);



* Re: [PATCH v3 0/4] MCS Lock: MCS lock code cleanup and optimizations
  2013-11-06 21:36 ` [PATCH v3 0/4] MCS Lock: MCS lock code cleanup and optimizations Tim Chen
  2013-11-06 21:41   ` Davidlohr Bueso
@ 2013-11-06 21:42   ` H. Peter Anvin
  2013-11-06 21:59     ` Michel Lespinasse
  1 sibling, 1 reply; 27+ messages in thread
From: H. Peter Anvin @ 2013-11-06 21:42 UTC (permalink / raw)
  To: Tim Chen, Ingo Molnar, Andrew Morton, Thomas Gleixner
  Cc: linux-kernel, linux-mm, linux-arch, Linus Torvalds, Waiman Long,
	Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
	Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, Peter Hurley, Paul E.McKenney, Raghavendra K T,
	George Spelvin, Arnd Bergmann, Aswin Chandramouleeswaran,
	Scott J Norton, Will Deacon, Figo.zhang

On 11/06/2013 01:36 PM, Tim Chen wrote:
> In this patch series, we separate out the MCS lock code that was
> previously embedded in mutex.c.  This allows for easier reuse of the
> MCS lock in other places like rwsem and qrwlock.  We also did some
> micro-optimizations and barrier cleanup.
> 
> These patches were previously part of the rwsem optimization patch
> series, but we now separate them out.
> 
> Tim Chen

Perhaps I'm missing something here, but what is an MCS lock and what is
its value?

	-hpa



* Re: [PATCH v3 2/5] MCS Lock: optimizations and extra comments
  2013-11-06 21:37 ` [PATCH v3 2/5] MCS Lock: optimizations and extra comments Tim Chen
@ 2013-11-06 21:47   ` Tim Chen
  0 siblings, 0 replies; 27+ messages in thread
From: Tim Chen @ 2013-11-06 21:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Thomas Gleixner, linux-kernel, linux-mm,
	linux-arch, Linus Torvalds, Waiman Long, Andrea Arcangeli,
	Alex Shi, Andi Kleen, Michel Lespinasse, Davidlohr Bueso,
	Matthew R Wilcox, Dave Hansen, Peter Zijlstra, Rik van Riel,
	Peter Hurley, Paul E.McKenney, Raghavendra K T, George Spelvin,
	H. Peter Anvin, Arnd Bergmann, Aswin Chandramouleeswaran,
	Scott J Norton, Will Deacon, Figo.zhang

On Wed, 2013-11-06 at 13:37 -0800, Tim Chen wrote:
> Remove an unnecessary operation and mark the cmpxchg(lock, node, NULL) == node
> check in mcs_spin_unlock() as likely(), since most of the time a race does
> not occur.
> 
> Also add more comments describing how the local node is used in MCS locks.
> 
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Jason Low <jason.low2@hp.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>

Should be Acked-by: Tim Chen <tim.c.chen@linux.intel.com>.  
My fat fingers accidentally added my Signed-off-by to all patches.

Tim

> ---
>  include/linux/mcs_spinlock.h |   13 +++++++++++--
>  1 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mcs_spinlock.h b/include/linux/mcs_spinlock.h
> index b5de3b0..96f14299 100644
> --- a/include/linux/mcs_spinlock.h
> +++ b/include/linux/mcs_spinlock.h
> @@ -18,6 +18,12 @@ struct mcs_spinlock {
>  };
>  
>  /*
> + * In order to acquire the lock, the caller should declare a local node and
> + * pass a reference of the node to this function in addition to the lock.
> + * If the lock has already been acquired, then this will proceed to spin
> + * on this node->locked until the previous lock holder sets the node->locked
> + * in mcs_spin_unlock().
> + *
>   * We don't inline mcs_spin_lock() so that perf can correctly account for the
>   * time spent in this lock function.
>   */
> @@ -33,7 +39,6 @@ void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
>  	prev = xchg(lock, node);
>  	if (likely(prev == NULL)) {
>  		/* Lock acquired */
> -		node->locked = 1;
>  		return;
>  	}
>  	ACCESS_ONCE(prev->next) = node;
> @@ -43,6 +48,10 @@ void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
>  		arch_mutex_cpu_relax();
>  }
>  
> +/*
> + * Releases the lock. The caller should pass in the corresponding node that
> + * was used to acquire the lock.
> + */
>  static void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
>  {
>  	struct mcs_spinlock *next = ACCESS_ONCE(node->next);
> @@ -51,7 +60,7 @@ static void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *nod
>  		/*
>  		 * Release the lock by setting it to NULL
>  		 */
> -		if (cmpxchg(lock, node, NULL) == node)
> +		if (likely(cmpxchg(lock, node, NULL) == node))
>  			return;
>  		/* Wait until the next pointer is set */
>  		while (!(next = ACCESS_ONCE(node->next)))



* Re: [PATCH v3 0/4] MCS Lock: MCS lock code cleanup and optimizations
  2013-11-06 21:42   ` H. Peter Anvin
@ 2013-11-06 21:59     ` Michel Lespinasse
  0 siblings, 0 replies; 27+ messages in thread
From: Michel Lespinasse @ 2013-11-06 21:59 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Tim Chen, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	linux-kernel, linux-mm, linux-arch, Linus Torvalds, Waiman Long,
	Andrea Arcangeli, Alex Shi, Andi Kleen, Davidlohr Bueso,
	Matthew R Wilcox, Dave Hansen, Peter Zijlstra, Rik van Riel,
	Peter Hurley, Paul E.McKenney, Raghavendra K T, George Spelvin,
	Arnd Bergmann, Aswin Chandramouleeswaran, Scott J Norton,
	Will Deacon, Figo.zhang

On Wed, Nov 6, 2013 at 1:42 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> Perhaps I'm missing something here, but what is MCS lock and what is the
> value?

It's a kind of queued lock where each waiter spins on a separate
memory word, instead of having them all spin on the lock's memory
word.  This helps with scalability when many waiters queue on the same
lock.
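
Schematically (a sketch of the queue; the lock word tracks the tail):

    CPU0 node --next--> CPU1 node --next--> CPU2 node  <-- *lock (tail)
    (lock holder)       (spins on its       (spins on its
                         own ->locked)       own ->locked)

On unlock, CPU0 sets CPU1's node->locked and the lock is handed down
the queue; a new waiter xchg()s itself in at the tail.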

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


* Re: [PATCH v3 0/4] MCS Lock: MCS lock code cleanup and optimizations
  2013-11-06 21:41   ` Davidlohr Bueso
@ 2013-11-06 23:55     ` Tim Chen
  0 siblings, 0 replies; 27+ messages in thread
From: Tim Chen @ 2013-11-06 23:55 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, linux-kernel,
	linux-mm, linux-arch, Linus Torvalds, Waiman Long,
	Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
	Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, Peter Hurley, Paul E.McKenney, Raghavendra K T,
	George Spelvin, H. Peter Anvin, Arnd Bergmann,
	Aswin Chandramouleeswaran, Scott J Norton, Will Deacon,
	Figo.zhang

On Wed, 2013-11-06 at 13:41 -0800, Davidlohr Bueso wrote:
> On Wed, 2013-11-06 at 13:36 -0800, Tim Chen wrote:
> > In this patch series, we separate out the MCS lock code that was
> > previously embedded in mutex.c.  This allows for easier reuse of the
> > MCS lock in other places like rwsem and qrwlock.  We also did some
> > micro-optimizations and barrier cleanup.
> > 
> > These patches were previously part of the rwsem optimization patch
> > series, but we now separate them out.
> > 
> > Tim Chen
> > 
> > v3:
> > 1. modified memory barriers to support non x86 architectures that have
> > weak memory ordering.
> > 
> > v2:
> > 1. changed the mcs_spin_lock export to a GPL export symbol
> > 2. corrected references to mcs_spin_lock
> > 
> > 
> > Jason Low (2):
> >   MCS Lock: optimizations and extra comments
> >   MCS Lock: Barrier corrections
> > 
> > Tim Chen (1):
> >   MCS Lock: Restructure the MCS lock defines and locking code into its
> >     own file
> > 
> > Waiman Long (2):
> >   MCS Lock: Make mcs_spinlock.h includable in other files
> >   MCS Lock: Allow architecture specific memory barrier in lock/unlock
> > 
> >  arch/x86/include/asm/barrier.h |    6 +++
> >  include/linux/mcs_spinlock.h   |   25 ++++++++++
> >  include/linux/mutex.h          |    5 +-
> >  kernel/Makefile                |    6 +-
> >  kernel/mcs_spinlock.c          |   96 ++++++++++++++++++++++++++++++++++++++++
> >  kernel/mutex.c                 |   60 +++----------------------
> >  6 files changed, 140 insertions(+), 58 deletions(-)
> >  create mode 100644 include/linux/mcs_spinlock.h
> >  create mode 100644 kernel/mcs_spinlock.c
> 
> Hmm I noticed that Peter's patchset to move locking mechanisms into a
> unique directory is now in -tip, ie:
> 
> http://marc.info/?l=linux-kernel&m=138373682928585
> 
> So we'll have problems applying this patchset, it would probably be best
> to rebase on top.

Good point.  Will update the patchset.

Tim

> 
> Thanks,
> Davidlohr
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-06 21:37 ` [PATCH v3 3/5] MCS Lock: Barrier corrections Tim Chen
@ 2013-11-07  1:39   ` Linus Torvalds
  2013-11-07  4:29     ` Waiman Long
                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Linus Torvalds @ 2013-11-07  1:39 UTC (permalink / raw)
  To: Tim Chen
  Cc: Arnd Bergmann, Figo. zhang, Aswin Chandramouleeswaran,
	Rik van Riel, Waiman Long, Raghavendra K T, Paul E.McKenney,
	linux-arch, Andi Kleen, Peter Zijlstra, George Spelvin,
	Michel Lespinasse, Ingo Molnar, Peter Hurley, H. Peter Anvin,
	Andrew Morton, linux-mm, Alex Shi, Andrea Arcangeli,
	Scott J Norton, linux-kernel, Thomas Gleixner, Dave Hansen,
	Matthew R Wilcox, Will Deacon, Davidlohr Bueso

Sorry about the HTML crap, the internet connection is too slow for my
normal email habits, so I'm using my phone.

I think the barriers are still totally wrong for the locking functions.

Adding an smp_rmb after waiting for the lock is pure BS. Writes in the
locked region could percolate out of the locked region.
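
(A sketch of the concern, with a hypothetical variable "shared"
standing in for any write inside the locked region:

    while (!ACCESS_ONCE(node->locked))
            arch_mutex_cpu_relax();
    smp_rmb();                      /* orders reads only ... */
    ACCESS_ONCE(shared) = 1;        /* ... the rmb gives this store
                                       no ordering at all */

Only an acquire on the lock operation itself keeps both reads and
writes penned inside the critical section.)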

The thing is, you cannot do the memory ordering for locks in any sane
generic way. Not using our current barrier system. On x86 (and many others)
the smp_rmb will work fine, because writes are never moved earlier. But on
other architectures you really need an acquire to get a lock efficiently.
No separate barriers. An acquire needs to be on the instruction that does
the lock.

Same goes for unlock. On x86 any store is a fine unlock, but on other
architectures you need a store with a release marker.

So no amount of barriers will ever do this correctly. Sure, you can add
full memory barriers and it will be "correct" but it will be unbearably
slow, and add totally unnecessary serialization. So *correct* locking will
require architecture support.

     Linus
On Nov 7, 2013 6:37 AM, "Tim Chen" <tim.c.chen@linux.intel.com> wrote:

> This patch corrects the way memory barriers are used in the MCS lock
> and removes ones that are not needed. Also add comments on all barriers.
>
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Jason Low <jason.low2@hp.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>  include/linux/mcs_spinlock.h |   13 +++++++++++--
>  1 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/mcs_spinlock.h b/include/linux/mcs_spinlock.h
> index 96f14299..93d445d 100644
> --- a/include/linux/mcs_spinlock.h
> +++ b/include/linux/mcs_spinlock.h
> @@ -36,16 +36,19 @@ void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
>         node->locked = 0;
>         node->next   = NULL;
>
> +       /* xchg() provides a memory barrier */
>         prev = xchg(lock, node);
>         if (likely(prev == NULL)) {
>                 /* Lock acquired */
>                 return;
>         }
>         ACCESS_ONCE(prev->next) = node;
> -       smp_wmb();
>         /* Wait until the lock holder passes the lock down */
>         while (!ACCESS_ONCE(node->locked))
>                 arch_mutex_cpu_relax();
> +
> +       /* Make sure subsequent operations happen after the lock is acquired */
> +       smp_rmb();
>  }
>
>  /*
> @@ -58,6 +61,7 @@ static void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *nod
>
>         if (likely(!next)) {
>                 /*
> +                * cmpxchg() provides a memory barrier.
>                  * Release the lock by setting it to NULL
>                  */
>                 if (likely(cmpxchg(lock, node, NULL) == node))
> @@ -65,9 +69,14 @@ static void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *nod
>                 /* Wait until the next pointer is set */
>                 while (!(next = ACCESS_ONCE(node->next)))
>                         arch_mutex_cpu_relax();
> +       } else {
> +               /*
> +                * Make sure all operations within the critical section
> +                * happen before the lock is released.
> +                */
> +               smp_wmb();
>         }
>         ACCESS_ONCE(next->locked) = 1;
> -       smp_wmb();
>  }
>
>  #endif /* __LINUX_MCS_SPINLOCK_H */
> --
> 1.7.4.4
>
>
>
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07  1:39   ` Linus Torvalds
@ 2013-11-07  4:29     ` Waiman Long
  2013-11-07  8:13     ` Ingo Molnar
  2013-11-07  9:55     ` Michel Lespinasse
  2 siblings, 0 replies; 27+ messages in thread
From: Waiman Long @ 2013-11-07  4:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tim Chen, Arnd Bergmann, Figo. zhang, Aswin Chandramouleeswaran,
	Rik van Riel, Raghavendra K T, Paul E.McKenney, linux-arch,
	Andi Kleen, Peter Zijlstra, George Spelvin, Michel Lespinasse,
	Ingo Molnar, Peter Hurley, H. Peter Anvin, Andrew Morton,
	linux-mm, Alex Shi, Andrea Arcangeli, Scott J Norton,
	linux-kernel, Thomas Gleixner, Dave Hansen, Matthew R Wilcox,
	Will Deacon, Davidlohr Bueso

On 11/06/2013 08:39 PM, Linus Torvalds wrote:
>
> Sorry about the HTML crap, the internet connection is too slow for my 
> normal email habits, so I'm using my phone.
>
> I think the barriers are still totally wrong for the locking functions.
>
> Adding an smp_rmb after waiting for the lock is pure BS. Writes in the 
> locked region could percolate out of the locked region.
>
> The thing is, you cannot do the memory ordering for locks in any sane
> generic way. Not using our current barrier system. On x86 (and many 
> others) the smp_rmb will work fine, because writes are never moved 
> earlier. But on other architectures you really need an acquire to get 
> a lock efficiently. No separate barriers. An acquire needs to be on 
> the instruction that does the lock.
>
> Same goes for unlock. On x86 any store is a fine unlock, but on other 
> architectures you need a store with a release marker.
>
> So no amount of barriers will ever do this correctly. Sure, you can 
> add full memory barriers and it will be "correct" but it will be 
> unbearably slow, and add totally unnecessary serialization. So 
> *correct* locking will require architecture support.
>
>

Yes, we realized that we can't do it in a generic way without
introducing unwanted overhead. So I sent out another patch to do it
in an architecture specific way, enabling each architecture to choose
its own memory barrier. It was at the end of the v3 and v4 patch series.

-Longman


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07  1:39   ` Linus Torvalds
  2013-11-07  4:29     ` Waiman Long
@ 2013-11-07  8:13     ` Ingo Molnar
  2013-11-07  8:22       ` Linus Torvalds
  2013-11-07  9:55     ` Michel Lespinasse
  2 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2013-11-07  8:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tim Chen, Arnd Bergmann, Figo. zhang, Aswin Chandramouleeswaran,
	Rik van Riel, Waiman Long, Raghavendra K T, Paul E.McKenney,
	linux-arch, Andi Kleen, Peter Zijlstra, George Spelvin,
	Michel Lespinasse, Ingo Molnar, Peter Hurley, H. Peter Anvin,
	Andrew Morton, linux-mm, Alex Shi, Andrea Arcangeli,
	Scott J Norton, linux-kernel, Thomas Gleixner, Dave Hansen,
	Matthew R Wilcox, Will Deacon, Davidlohr Bueso


Linus,

A more general maintenance question: do you agree with the whole idea to 
factor out the MCS logic from mutex.c to make it reusable?

This optimization patch makes me think it's a useful thing to do:

  [PATCH v3 2/5] MCS Lock: optimizations and extra comments

as that kicks back optimizations to the mutex code as well. It also 
brought some spotlight on mutex code that it would not have gotten 
otherwise.

That advantage is also its disadvantage: additional coupling between rwsem
and mutex logic internals. But it's not as if this change would be hard
to undo, so I'm in general in favor of this direction ...

So unless you object to this direction, I planned to apply this 
preparatory series to the locking tree once we are all happy with all the 
fine details.

Thanks,

	Ingo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07  8:13     ` Ingo Molnar
@ 2013-11-07  8:22       ` Linus Torvalds
  2013-11-07  8:25         ` Ingo Molnar
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2013-11-07  8:22 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Waiman Long, Arnd Bergmann, Rik van Riel,
	Aswin Chandramouleeswaran, Paul E.McKenney, Raghavendra K T,
	Figo. zhang, linux-arch, Andi Kleen, Peter Zijlstra,
	George Spelvin, Tim Chen, Michel Lespinasse, Ingo Molnar,
	Peter Hurley, H. Peter Anvin, Andrew Morton, linux-mm,
	Andrea Arcangeli, Alex Shi, linux-kernel, Scott J Norton,
	Thomas Gleixner, Dave Hansen, Matthew R Wilcox, Will Deacon,
	Davidlohr Bueso

I don't necessarily mind the factoring out, I just think it needs to be
really solid and clear if - and *before* - we do this. We do *not* want to
factor out some half-arsed implementation and then have later patches to
fix up the crud. Nor when multiple different locks then use that common
code.

So I think it needs to be *clearly* great code before it gets factored out.
Because before it is great code, it should not be shared with anything else.

     Linus
On Nov 7, 2013 5:13 PM, "Ingo Molnar" <mingo@kernel.org> wrote:

>
> Linus,
>
> A more general maintenance question: do you agree with the whole idea to
> factor out the MCS logic from mutex.c to make it reusable?
>
> This optimization patch makes me think it's a useful thing to do:
>
>   [PATCH v3 2/5] MCS Lock: optimizations and extra comments
>
> as that kicks back optimizations to the mutex code as well. It also
> brought some spotlight on mutex code that it would not have gotten
> otherwise.
>
> That advantage is also its disadvantage: additional coupling between rwsem
> and mutex logic internals. But not like it's overly hard to undo this
> change, so I'm in general in favor of this direction ...
>
> So unless you object to this direction, I planned to apply this
> preparatory series to the locking tree once we are all happy with all the
> fine details.
>
> Thanks,
>
>         Ingo
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07  8:22       ` Linus Torvalds
@ 2013-11-07  8:25         ` Ingo Molnar
  0 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2013-11-07  8:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Waiman Long, Arnd Bergmann, Rik van Riel,
	Aswin Chandramouleeswaran, Paul E.McKenney, Raghavendra K T,
	Figo. zhang, linux-arch, Andi Kleen, Peter Zijlstra,
	George Spelvin, Tim Chen, Michel Lespinasse, Ingo Molnar,
	Peter Hurley, H. Peter Anvin, Andrew Morton, linux-mm,
	Andrea Arcangeli, Alex Shi, linux-kernel, Scott J Norton,
	Thomas Gleixner, Dave Hansen, Matthew R Wilcox, Will Deacon,
	Davidlohr Bueso


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> I don't necessarily mind the factoring out, I just think it needs to be 
> really solid and clear if - and *before* - we do this. [...]

Okay, agreed.

> [...] We do *not* want to factor out some half-arsed implementation and 
> then have later patches to fix up the crud. Nor when multiple different 
> locks then use that common code.
> 
> So I think it needs to be *clearly* great code before it gets factored 
> out. Because before it is great code, it should not be shared with 
> anything else.

Ok, we'll go through it with a fine comb and I won't rush merging it.

Thanks,

	Ingo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07  1:39   ` Linus Torvalds
  2013-11-07  4:29     ` Waiman Long
  2013-11-07  8:13     ` Ingo Molnar
@ 2013-11-07  9:55     ` Michel Lespinasse
  2013-11-07 12:06       ` Linus Torvalds
  2 siblings, 1 reply; 27+ messages in thread
From: Michel Lespinasse @ 2013-11-07  9:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tim Chen, Arnd Bergmann, Figo. zhang, Aswin Chandramouleeswaran,
	Rik van Riel, Waiman Long, Raghavendra K T, Paul E.McKenney,
	linux-arch, Andi Kleen, Peter Zijlstra, George Spelvin,
	Ingo Molnar, Peter Hurley, H. Peter Anvin, Andrew Morton,
	linux-mm, Alex Shi, Andrea Arcangeli, Scott J Norton,
	linux-kernel, Thomas Gleixner, Dave Hansen, Matthew R Wilcox,
	Will Deacon, Davidlohr Bueso

On Wed, Nov 6, 2013 at 5:39 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Sorry about the HTML crap, the internet connection is too slow for my normal
> email habits, so I'm using my phone.
>
> I think the barriers are still totally wrong for the locking functions.
>
> Adding an smp_rmb after waiting for the lock is pure BS. Writes in the
> locked region could percolate out of the locked region.
>
> The thing is, you cannot do the memory ordering for locks in any sane
> generic way. Not using our current barrier system. On x86 (and many others)
> the smp_rmb will work fine, because writes are never moved earlier. But on
> other architectures you really need an acquire to get a lock efficiently. No
> separate barriers. An acquire needs to be on the instruction that does the
> lock.
>
> Same goes for unlock. On x86 any store is a fine unlock, but on other
> architectures you need a store with a release marker.
>
> So no amount of barriers will ever do this correctly. Sure, you can add full
> memory barriers and it will be "correct" but it will be unbearably slow, and
> add totally unnecessary serialization. So *correct* locking will require
> architecture support.

Rather than writing arch-specific locking code, would you agree to
introduce acquire and release memory operations?

The semantics of an acquire memory operation would be: the specified
memory operation occurs, and any reads or writes after that operation
are guaranteed not to be reordered before it (useful to implement lock
acquisitions).
The semantics of a release memory operation would be: the specified
memory operation occurs, and any reads or writes before that operation
are guaranteed not to be reordered after it (useful to implement lock
releases).
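
As a sketch, with hypothetical xchg_acquire() / store_release()
helpers and an illustrative lock word l->locked:

    old = xchg_acquire(&l->locked, 1);  /* later accesses cannot be
                                           reordered before this */
    /* ... critical section ... */
    store_release(&l->locked, 0);       /* earlier accesses cannot be
                                           reordered after this */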

Now each arch would still need to define several acquire and release
operations, but this is quite a useful model to build generic code on.
For example, the fast path for the x86 spinlock implementation could
be expressed generically as an acquire fetch-and-add (for
__ticket_spin_lock) and a release add (for __ticket_spin_unlock).

Do you think this is a useful direction to move in?

Thanks,

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07  9:55     ` Michel Lespinasse
@ 2013-11-07 12:06       ` Linus Torvalds
  2013-11-07 12:50         ` Michel Lespinasse
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2013-11-07 12:06 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: Waiman Long, Arnd Bergmann, Rik van Riel,
	Aswin Chandramouleeswaran, Paul E.McKenney, Raghavendra K T,
	Figo. zhang, linux-arch, Andi Kleen, Peter Zijlstra,
	George Spelvin, Tim Chen, Ingo Molnar, Peter Hurley,
	H. Peter Anvin, Andrew Morton, linux-mm, Andrea Arcangeli,
	Alex Shi, linux-kernel, Scott J Norton, Thomas Gleixner,
	Dave Hansen, Matthew R Wilcox, Will Deacon, Davidlohr Bueso

On Nov 7, 2013 6:55 PM, "Michel Lespinasse" <walken@google.com> wrote:
>
> Rather than writing arch-specific locking code, would you agree to
> introduce acquire and release memory operations?

Yes, that's probably the right thing to do. What ops do we need? Store with
release, cmpxchg and load with acquire? Anything else?

      Linus


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07 12:06       ` Linus Torvalds
@ 2013-11-07 12:50         ` Michel Lespinasse
  2013-11-07 14:31           ` Paul E. McKenney
  0 siblings, 1 reply; 27+ messages in thread
From: Michel Lespinasse @ 2013-11-07 12:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Waiman Long, Arnd Bergmann, Rik van Riel,
	Aswin Chandramouleeswaran, Paul E.McKenney, Raghavendra K T,
	Figo. zhang, linux-arch, Andi Kleen, Peter Zijlstra,
	George Spelvin, Tim Chen, Ingo Molnar, Peter Hurley,
	H. Peter Anvin, Andrew Morton, linux-mm, Andrea Arcangeli,
	Alex Shi, linux-kernel, Scott J Norton, Thomas Gleixner,
	Dave Hansen, Matthew R Wilcox, Will Deacon, Davidlohr Bueso

On Thu, Nov 7, 2013 at 4:06 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Nov 7, 2013 6:55 PM, "Michel Lespinasse" <walken@google.com> wrote:
>>
>> Rather than writing arch-specific locking code, would you agree to
>> introduce acquire and release memory operations?
>
> Yes, that's probably the right thing to do. What ops do we need? Store with
> release, cmpxchg and load with acquire? Anything else?

Depends on what lock types we want to implement on top; for MCS we would need:
- xchg acquire (common case) and load acquire (for spinning on our
locker's wait word)
- cmpxchg release (when there is no next locker) and store release
(when writing to the next locker's wait word)
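
Sketched with such hypothetical primitives, the MCS paths would read
something like:

    /* lock */
    prev = xchg_acquire(lock, node);
    if (likely(prev == NULL))
            return;                 /* lock acquired uncontended */
    ACCESS_ONCE(prev->next) = node;
    while (!load_acquire(&node->locked))
            arch_mutex_cpu_relax();

    /* unlock, once a next locker is known */
    store_release(&next->locked, 1);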

One downside of the proposal is that using a load acquire for spinning
puts the memory barrier within the spin loop. So this model is very
intuitive and does not add unnecessary barriers on x86, but it may
place the barriers in a suboptimal place for architectures that need
them.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07 12:50         ` Michel Lespinasse
@ 2013-11-07 14:31           ` Paul E. McKenney
  2013-11-07 19:59             ` Michel Lespinasse
  0 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2013-11-07 14:31 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: Linus Torvalds, Waiman Long, Arnd Bergmann, Rik van Riel,
	Aswin Chandramouleeswaran, Raghavendra K T, Figo. zhang,
	linux-arch, Andi Kleen, Peter Zijlstra, George Spelvin, Tim Chen,
	Ingo Molnar, Peter Hurley, H. Peter Anvin, Andrew Morton,
	linux-mm, Andrea Arcangeli, Alex Shi, linux-kernel,
	Scott J Norton, Thomas Gleixner, Dave Hansen, Matthew R Wilcox,
	Will Deacon, Davidlohr Bueso

On Thu, Nov 07, 2013 at 04:50:23AM -0800, Michel Lespinasse wrote:
> On Thu, Nov 7, 2013 at 4:06 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Nov 7, 2013 6:55 PM, "Michel Lespinasse" <walken@google.com> wrote:
> >>
> >> Rather than writing arch-specific locking code, would you agree to
> >> introduce acquire and release memory operations?
> >
> > Yes, that's probably the right thing to do. What ops do we need? Store with
> > release, cmpxchg and load with acquire? Anything else?
> 
> Depends on what lock types we want to implement on top; for MCS we would need:
> - xchg acquire (common case) and load acquire (for spinning on our
> locker's wait word)
> - cmpxchg release (when there is no next locker) and store release
> (when writing to the next locker's wait word)
> 
> One downside of the proposal is that using a load acquire for spinning
> puts the memory barrier within the spin loop. So this model is very
> intuitive and does not add unnecessary barriers on x86, but it may
> place the barriers in a suboptimal place for architectures that need
> them.

OK, I will bite...  Why is a barrier in the spinloop suboptimal?

Can't say that I have tried measuring it, but the barrier should not
normally result in interconnect traffic.  Given that the barrier is
required anyway, it should not affect lock-acquisition latency.

So what am I missing here?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07 14:31           ` Paul E. McKenney
@ 2013-11-07 19:59             ` Michel Lespinasse
  2013-11-07 21:15               ` Tim Chen
  0 siblings, 1 reply; 27+ messages in thread
From: Michel Lespinasse @ 2013-11-07 19:59 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Linus Torvalds, Waiman Long, Arnd Bergmann, Rik van Riel,
	Aswin Chandramouleeswaran, Raghavendra K T, Figo. zhang,
	linux-arch, Andi Kleen, Peter Zijlstra, George Spelvin, Tim Chen,
	Ingo Molnar, Peter Hurley, H. Peter Anvin, Andrew Morton,
	linux-mm, Andrea Arcangeli, Alex Shi, LKML, Scott J Norton,
	Thomas Gleixner, Dave Hansen, Matthew R Wilcox, Will Deacon,
	Davidlohr Bueso

On Thu, Nov 7, 2013 at 6:31 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Thu, Nov 07, 2013 at 04:50:23AM -0800, Michel Lespinasse wrote:
>> On Thu, Nov 7, 2013 at 4:06 AM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> >
>> > On Nov 7, 2013 6:55 PM, "Michel Lespinasse" <walken@google.com> wrote:
>> >>
>> >> Rather than writing arch-specific locking code, would you agree to
>> >> introduce acquire and release memory operations?
>> >
>> > Yes, that's probably the right thing to do. What ops do we need? Store with
>> > release, cmpxchg and load with acquire? Anything else?
>>
>> Depends on what lock types we want to implement on top; for MCS we would need:
>> - xchg acquire (common case) and load acquire (for spinning on our
>> locker's wait word)
>> - cmpxchg release (when there is no next locker) and store release
>> (when writing to the next locker's wait word)
>>
>> One downside of the proposal is that using a load acquire for spinning
>> puts the memory barrier within the spin loop. So this model is very
>> intuitive and does not add unnecessary barriers on x86, but it may
>> place the barriers in a suboptimal place for architectures that need
>> them.
>
> OK, I will bite...  Why is a barrier in the spinloop suboptimal?

It's probably not a big deal - all I meant to say is that if you were
manually placing barriers, you would probably put one after the loop
instead. I don't deal much with architectures where such barriers are
needed, so I don't know for sure if the difference means much.

> Can't say that I have tried measuring it, but the barrier should not
> normally result in interconnect traffic.  Given that the barrier is
> required anyway, it should not affect lock-acquisition latency.

Agree

> So what am I missing here?

I think you read my second email as me trying to shoot down a proposal
- I wasn't, as I really like the acquire/release model and find it
easy to program with, which is why I'm proposing it in the first
place. I just wanted to be upfront about all potential downsides, so
we can consider them and see if they are significant - I don't think
they are, but I'm not the best person to judge that as I mostly just
deal with x86 stuff.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07 19:59             ` Michel Lespinasse
@ 2013-11-07 21:15               ` Tim Chen
  2013-11-07 22:21                 ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Tim Chen @ 2013-11-07 21:15 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: Paul McKenney, Linus Torvalds, Waiman Long, Arnd Bergmann,
	Rik van Riel, Aswin Chandramouleeswaran, Raghavendra K T,
	Figo. zhang, linux-arch, Andi Kleen, Peter Zijlstra,
	George Spelvin, Ingo Molnar, Peter Hurley, H. Peter Anvin,
	Andrew Morton, linux-mm, Andrea Arcangeli, Alex Shi, LKML,
	Scott J Norton, Thomas Gleixner, Dave Hansen, Matthew R Wilcox,
	Will Deacon, Davidlohr Bueso

On Thu, 2013-11-07 at 11:59 -0800, Michel Lespinasse wrote:
> On Thu, Nov 7, 2013 at 6:31 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Thu, Nov 07, 2013 at 04:50:23AM -0800, Michel Lespinasse wrote:
> >> On Thu, Nov 7, 2013 at 4:06 AM, Linus Torvalds
> >> <torvalds@linux-foundation.org> wrote:
> >> >
> >> > On Nov 7, 2013 6:55 PM, "Michel Lespinasse" <walken@google.com> wrote:
> >> >>
> >> >> Rather than writing arch-specific locking code, would you agree to
> >> >> introduce acquire and release memory operations?
> >> >
> >> > Yes, that's probably the right thing to do. What ops do we need? Store with
> >> > release, cmpxchg and load with acquire? Anything else?
> >>
> >> Depends on what lock types we want to implement on top; for MCS we would need:
> >> - xchg acquire (common case) and load acquire (for spinning on our
> >> locker's wait word)
> >> - cmpxchg release (when there is no next locker) and store release
> >> (when writing to the next locker's wait word)
> >>
> >> One downside of the proposal is that using a load acquire for spinning
> >> puts the memory barrier within the spin loop. So this model is very
> >> intuitive and does not add unnecessary barriers on x86, but it may
> >> place the barriers in a suboptimal place for architectures that need
> >> them.
> >
> > OK, I will bite...  Why is a barrier in the spinloop suboptimal?
> 
> It's probably not a big deal - all I meant to say is that if you were
> manually placing barriers, you would probably put one after the loop
> instead. I don't deal much with architectures where such barriers are
> needed, so I don't know for sure if the difference means much.

We could do a load acquire at the end of the spin loop in the lock
function, rather than in the spin loop itself, if the cost of a barrier
within the spin loop is a concern.
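
Something like the following, with a hypothetical load_acquire():

    /* spin on plain loads, pay for the acquire just once */
    while (!ACCESS_ONCE(node->locked))
            arch_mutex_cpu_relax();
    (void)load_acquire(&node->locked);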

Michel, are you planning to do an implementation of
load-acquire/store-release functions for various architectures?

Or is the approach of arch specific memory barrier for MCS 
an acceptable one before load-acquire and store-release
are available?  Are there any technical issues remaining with 
the patchset after including Waiman's arch specific barrier?

Tim

> 
> > Can't say that I have tried measuring it, but the barrier should not
> > normally result in interconnect traffic.  Given that the barrier is
> > required anyway, it should not affect lock-acquisition latency.
> 
> Agree
> 
> > So what am I missing here?
> 
> I think you read my second email as me trying to shoot down a proposal
> - I wasn't, as I really like the acquire/release model and find it
> easy to program with, which is why I'm proposing it in the first
> place. I just wanted to be upfront about all potential downsides, so
> we can consider them and see if they are significant - I don't think
> they are, but I'm not the best person to judge that as I mostly just
> deal with x86 stuff.
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07 21:15               ` Tim Chen
@ 2013-11-07 22:21                 ` Peter Zijlstra
  2013-11-07 22:43                   ` Michel Lespinasse
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2013-11-07 22:21 UTC (permalink / raw)
  To: Tim Chen
  Cc: Michel Lespinasse, Paul McKenney, Linus Torvalds, Waiman Long,
	Arnd Bergmann, Rik van Riel, Aswin Chandramouleeswaran,
	Raghavendra K T, Figo. zhang, linux-arch, Andi Kleen,
	George Spelvin, Ingo Molnar, Peter Hurley, H. Peter Anvin,
	Andrew Morton, linux-mm, Andrea Arcangeli, Alex Shi, LKML,
	Scott J Norton, Thomas Gleixner, Dave Hansen, Matthew R Wilcox,
	Will Deacon, Davidlohr Bueso

On Thu, Nov 07, 2013 at 01:15:51PM -0800, Tim Chen wrote:
> Michel, are you planning to do an implementation of
> load-acquire/store-release functions for various architectures?

A little something like this:
http://marc.info/?l=linux-arch&m=138386254111507

It so happens we were working on that the past week or so due to another
issue ;-)


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07 22:21                 ` Peter Zijlstra
@ 2013-11-07 22:43                   ` Michel Lespinasse
  2013-11-08  1:16                     ` Tim Chen
  0 siblings, 1 reply; 27+ messages in thread
From: Michel Lespinasse @ 2013-11-07 22:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tim Chen, Paul McKenney, Linus Torvalds, Waiman Long,
	Arnd Bergmann, Rik van Riel, Aswin Chandramouleeswaran,
	Raghavendra K T, Figo. zhang, linux-arch, Andi Kleen,
	George Spelvin, Ingo Molnar, Peter Hurley, H. Peter Anvin,
	Andrew Morton, linux-mm, Andrea Arcangeli, Alex Shi, LKML,
	Scott J Norton, Thomas Gleixner, Dave Hansen, Matthew R Wilcox,
	Will Deacon, Davidlohr Bueso

On Thu, Nov 7, 2013 at 2:21 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Nov 07, 2013 at 01:15:51PM -0800, Tim Chen wrote:
>> Michel, are you planning to do an implementation of
>> load-acquire/store-release functions for various architectures?
>
> A little something like this:
> http://marc.info/?l=linux-arch&m=138386254111507
>
> It so happens we were working on that the past week or so due to another
> issue ;-)

Haha, awesome, I wasn't aware of this effort.

Tim: my approach would be to provide the acquire/release operations in
arch-specific include files, and have a default implementation using
barriers for arches that don't provide these new ops. That way you make
it work on all arches at once (using the default implementation) and
make it fast on any arch that cares.
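
For the arches that don't, a fallback sketch (these names are just
placeholders; the real interface is whatever Peter's patchset settles
on) could be built from the existing barriers:

    #ifndef load_acquire
    #define load_acquire(p)                                             \
            ({ typeof(*(p)) ___v = ACCESS_ONCE(*(p)); smp_mb(); ___v; })
    #endif

    #ifndef store_release
    #define store_release(p, v)                                         \
            do { smp_mb(); ACCESS_ONCE(*(p)) = (v); } while (0)
    #endif

Arches with native acquire/release instructions would then override
these in their asm/barrier.h.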

>> Or is the approach of arch specific memory barrier for MCS
>> an acceptable one before load-acquire and store-release
>> are available?  Are there any technical issues remaining with
>> the patchset after including Waiman's arch specific barrier?

I don't want to stand in the way of Waiman's change, and I had
actually taken the same approach with arch-specific barriers when
proposing some queue spinlocks in the past; however I do feel that
this comes back regularly enough that having acquire/release
primitives available would help, hence my proposal.

That said, earlier in the thread Linus said we should probably get all
our ducks in a row before going forward with this, so...

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 3/5] MCS Lock: Barrier corrections
  2013-11-07 22:43                   ` Michel Lespinasse
@ 2013-11-08  1:16                     ` Tim Chen
  0 siblings, 0 replies; 27+ messages in thread
From: Tim Chen @ 2013-11-08  1:16 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: Peter Zijlstra, Paul McKenney, Linus Torvalds, Waiman Long,
	Arnd Bergmann, Rik van Riel, Aswin Chandramouleeswaran,
	Raghavendra K T, Figo. zhang, linux-arch, Andi Kleen,
	George Spelvin, Ingo Molnar, Peter Hurley, H. Peter Anvin,
	Andrew Morton, linux-mm, Andrea Arcangeli, Alex Shi, LKML,
	Scott J Norton, Thomas Gleixner, Dave Hansen, Matthew R Wilcox,
	Will Deacon, Davidlohr Bueso

On Thu, 2013-11-07 at 14:43 -0800, Michel Lespinasse wrote:
> On Thu, Nov 7, 2013 at 2:21 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Thu, Nov 07, 2013 at 01:15:51PM -0800, Tim Chen wrote:
> >> Michel, are you planning to do an implementation of
> >> load-acquire/store-release functions for various architectures?
> >
> > A little something like this:
> > http://marc.info/?l=linux-arch&m=138386254111507
> >
> > It so happens we were working on that the past week or so due to another
> > issue ;-)
> 
> Haha, awesome, I wasn't aware of this effort.
> 
> Tim: my approach would be to provide the acquire/release operations in
> arch-specific include files, and have a default implementation using
> barriers for arches who don't provide these new ops. That way you make
> it work on all arches at once (using the default implementation) and
> make it fast on any arch that cares.
> 
> >> Or is the approach of arch specific memory barrier for MCS
> >> an acceptable one before load-acquire and store-release
> >> are available?  Are there any technical issues remaining with
> >> the patchset after including Waiman's arch specific barrier?
> 
> I don't want to stand in the way of Waiman's change, and I had
> actually taken the same approach with arch-specific barriers when
> proposing some queue spinlocks in the past; however I do feel that
> this comes back regularly enough that having acquire/release
> primitives available would help, hence my proposal.
> 
> That said, earlier in the thread Linus said we should probably get all
> our ducks in a row before going forward with this, so...
> 

With load_acquire and store_release implemented, it should be
pretty straightforward to implement MCS with them.  I'll respin
the patch series with these primitives.

Thanks.

Tim


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread

Thread overview: 27+ messages
     [not found] <cover.1383771175.git.tim.c.chen@linux.intel.com>
2013-11-06 21:36 ` [PATCH v3 0/4] MCS Lock: MCS lock code cleanup and optimizations Tim Chen
2013-11-06 21:41   ` Davidlohr Bueso
2013-11-06 23:55     ` Tim Chen
2013-11-06 21:42   ` H. Peter Anvin
2013-11-06 21:59     ` Michel Lespinasse
2013-11-06 21:37 ` [PATCH v3 1/5] MCS Lock: Restructure the MCS lock defines and locking code into its own file Tim Chen
2013-11-06 21:37 ` [PATCH v3 2/5] MCS Lock: optimizations and extra comments Tim Chen
2013-11-06 21:47   ` Tim Chen
2013-11-06 21:37 ` [PATCH v3 3/5] MCS Lock: Barrier corrections Tim Chen
2013-11-07  1:39   ` Linus Torvalds
2013-11-07  4:29     ` Waiman Long
2013-11-07  8:13     ` Ingo Molnar
2013-11-07  8:22       ` Linus Torvalds
2013-11-07  8:25         ` Ingo Molnar
2013-11-07  9:55     ` Michel Lespinasse
2013-11-07 12:06       ` Linus Torvalds
2013-11-07 12:50         ` Michel Lespinasse
2013-11-07 14:31           ` Paul E. McKenney
2013-11-07 19:59             ` Michel Lespinasse
2013-11-07 21:15               ` Tim Chen
2013-11-07 22:21                 ` Peter Zijlstra
2013-11-07 22:43                   ` Michel Lespinasse
2013-11-08  1:16                     ` Tim Chen
2013-11-06 21:37 ` [PATCH v3 4/5] MCS Lock: Make mcs_spinlock.h includable in other files Tim Chen
2013-11-06 21:41   ` Tim Chen
2013-11-06 21:37 ` [PATCH v3 5/5] MCS Lock: Allow architecture specific memory barrier in lock/unlock Tim Chen
2013-11-06 21:42   ` Tim Chen
