* [PATCH 0/5] Remove the XFS mrlock
@ 2023-09-07 17:47 Matthew Wilcox (Oracle)
  2023-09-07 17:47 ` [PATCH 1/5] locking: Add rwsem_is_write_locked() Matthew Wilcox (Oracle)
                   ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-09-07 17:47 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

XFS has an mrlock wrapper around the rwsem which adds only the
functionality of knowing whether the rwsem is currently held in read
or write mode.  Both regular rwsems and rt-rwsems know this; they just
don't expose it as an API.  By adding that API, we can remove the XFS
mrlock as well as improve the debug assertions for the mmap_lock when
lockdep is disabled.
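
For example, by the end of the series the new helper ends up being used
like this (both hunks are taken from the patches below):

	/* mm (patch 2) */
	VM_BUG_ON_MM(!rwsem_is_write_locked(&mm->mmap_lock), mm);

	/* xfs, once the mrlock wrapper is gone (patches 4 and 5) */
	if (lock_flags & XFS_ILOCK_EXCL)
		return rwsem_is_write_locked(&ip->i_lock);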

I'd like acks from the locking people; it should then probably go upstream
through the XFS tree, since that's where the patch series touches the
most code.

Matthew Wilcox (Oracle) (5):
  locking: Add rwsem_is_write_locked()
  mm: Use rwsem_is_write_locked in mmap_assert_write_locked
  xfs: Use rwsem_is_write_locked()
  xfs: Remove mrlock wrapper
  xfs: Stop using lockdep to assert that locks are held

 fs/xfs/mrlock.h           | 78 ---------------------------------------
 fs/xfs/xfs_inode.c        | 61 ++++++++++--------------------
 fs/xfs/xfs_inode.h        |  2 +-
 fs/xfs/xfs_iops.c         |  4 +-
 fs/xfs/xfs_linux.h        |  2 +-
 fs/xfs/xfs_super.c        |  4 +-
 include/linux/mmap_lock.h |  2 +-
 include/linux/rwbase_rt.h |  5 +++
 include/linux/rwsem.h     | 10 +++++
 9 files changed, 40 insertions(+), 128 deletions(-)
 delete mode 100644 fs/xfs/mrlock.h

-- 
2.40.1




* [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 17:47 [PATCH 0/5] Remove the XFS mrlock Matthew Wilcox (Oracle)
@ 2023-09-07 17:47 ` Matthew Wilcox (Oracle)
  2023-09-07 18:05   ` Waiman Long
  2023-09-07 19:08   ` Peter Zijlstra
  2023-09-07 17:47 ` [PATCH 2/5] mm: Use rwsem_is_write_locked in mmap_assert_write_locked Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 33+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-09-07 17:47 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

Several places want to know whether the lock is held by a writer, instead
of just whether it's held.  We can implement this for both normal and
rt rwsems.  RWSEM_WRITER_LOCKED is declared in rwsem.c and exposing
it outside that file might tempt other people to use it, so just use
a comment to note that's what the 1 means, and help anybody find it if
they're looking to change the implementation.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/rwbase_rt.h |  5 +++++
 include/linux/rwsem.h     | 10 ++++++++++
 2 files changed, 15 insertions(+)

diff --git a/include/linux/rwbase_rt.h b/include/linux/rwbase_rt.h
index 1d264dd08625..3c25b14edc05 100644
--- a/include/linux/rwbase_rt.h
+++ b/include/linux/rwbase_rt.h
@@ -31,6 +31,11 @@ static __always_inline bool rw_base_is_locked(struct rwbase_rt *rwb)
 	return atomic_read(&rwb->readers) != READER_BIAS;
 }
 
+static __always_inline bool rw_base_is_write_locked(struct rwbase_rt *rwb)
+{
+	return atomic_read(&rwb->readers) == WRITER_BIAS;
+}
+
 static __always_inline bool rw_base_is_contended(struct rwbase_rt *rwb)
 {
 	return atomic_read(&rwb->readers) > 0;
diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index 1dd530ce8b45..0f78b8d2e653 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -72,6 +72,11 @@ static inline int rwsem_is_locked(struct rw_semaphore *sem)
 	return atomic_long_read(&sem->count) != 0;
 }
 
+static inline int rwsem_is_write_locked(struct rw_semaphore *sem)
+{
+	return atomic_long_read(&sem->count) & 1 /* RWSEM_WRITER_LOCKED */;
+}
+
 #define RWSEM_UNLOCKED_VALUE		0L
 #define __RWSEM_COUNT_INIT(name)	.count = ATOMIC_LONG_INIT(RWSEM_UNLOCKED_VALUE)
 
@@ -157,6 +162,11 @@ static __always_inline int rwsem_is_locked(struct rw_semaphore *sem)
 	return rw_base_is_locked(&sem->rwbase);
 }
 
+static __always_inline int rwsem_is_write_locked(struct rw_semaphore *sem)
+{
+	return rw_base_is_write_locked(&sem->rwbase);
+}
+
 static __always_inline int rwsem_is_contended(struct rw_semaphore *sem)
 {
 	return rw_base_is_contended(&sem->rwbase);
-- 
2.40.1




* [PATCH 2/5] mm: Use rwsem_is_write_locked in mmap_assert_write_locked
  2023-09-07 17:47 [PATCH 0/5] Remove the XFS mrlock Matthew Wilcox (Oracle)
  2023-09-07 17:47 ` [PATCH 1/5] locking: Add rwsem_is_write_locked() Matthew Wilcox (Oracle)
@ 2023-09-07 17:47 ` Matthew Wilcox (Oracle)
  2023-09-07 17:47 ` [PATCH 3/5] xfs: Use rwsem_is_write_locked() Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-09-07 17:47 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

This slightly strengthens our checks when lockdep is disabled.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mmap_lock.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 8d38dcb6d044..0258b06668e8 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -69,7 +69,7 @@ static inline void mmap_assert_locked(struct mm_struct *mm)
 static inline void mmap_assert_write_locked(struct mm_struct *mm)
 {
 	lockdep_assert_held_write(&mm->mmap_lock);
-	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);
+	VM_BUG_ON_MM(!rwsem_is_write_locked(&mm->mmap_lock), mm);
 }
 
 #ifdef CONFIG_PER_VMA_LOCK
-- 
2.40.1




* [PATCH 3/5] xfs: Use rwsem_is_write_locked()
  2023-09-07 17:47 [PATCH 0/5] Remove the XFS mrlock Matthew Wilcox (Oracle)
  2023-09-07 17:47 ` [PATCH 1/5] locking: Add rwsem_is_write_locked() Matthew Wilcox (Oracle)
  2023-09-07 17:47 ` [PATCH 2/5] mm: Use rwsem_is_write_locked in mmap_assert_write_locked Matthew Wilcox (Oracle)
@ 2023-09-07 17:47 ` Matthew Wilcox (Oracle)
  2023-09-08  9:09   ` Christoph Hellwig
  2023-09-07 17:47 ` [PATCH 4/5] xfs: Remove mrlock wrapper Matthew Wilcox (Oracle)
  2023-09-07 17:47 ` [PATCH 5/5] xfs: Stop using lockdep to assert that locks are held Matthew Wilcox (Oracle)
  4 siblings, 1 reply; 33+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-09-07 17:47 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

This avoids using the mr_writer field to check the XFS ILOCK is held
for write.  It also improves the checking we do when lockdep is disabled.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/xfs/xfs_inode.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 360fe83a334f..e58d84d23f49 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -339,8 +339,11 @@ __xfs_rwsem_islocked(
 	struct rw_semaphore	*rwsem,
 	bool			shared)
 {
-	if (!debug_locks)
+	if (!debug_locks) {
+		if (!shared)
+			return rwsem_is_write_locked(rwsem);
 		return rwsem_is_locked(rwsem);
+	}
 
 	if (!shared)
 		return lockdep_is_held_type(rwsem, 0);
@@ -359,12 +362,10 @@ xfs_isilocked(
 	struct xfs_inode	*ip,
 	uint			lock_flags)
 {
-	if (lock_flags & (XFS_ILOCK_EXCL|XFS_ILOCK_SHARED)) {
-		if (!(lock_flags & XFS_ILOCK_SHARED))
-			return !!ip->i_lock.mr_writer;
+	if (lock_flags & XFS_ILOCK_SHARED)
 		return rwsem_is_locked(&ip->i_lock.mr_lock);
-	}
-
+	if (lock_flags & XFS_ILOCK_EXCL)
+		return rwsem_is_write_locked(&ip->i_lock.mr_lock);
 	if (lock_flags & (XFS_MMAPLOCK_EXCL|XFS_MMAPLOCK_SHARED)) {
 		return __xfs_rwsem_islocked(&VFS_I(ip)->i_mapping->invalidate_lock,
 				(lock_flags & XFS_MMAPLOCK_SHARED));
-- 
2.40.1




* [PATCH 4/5] xfs: Remove mrlock wrapper
  2023-09-07 17:47 [PATCH 0/5] Remove the XFS mrlock Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2023-09-07 17:47 ` [PATCH 3/5] xfs: Use rwsem_is_write_locked() Matthew Wilcox (Oracle)
@ 2023-09-07 17:47 ` Matthew Wilcox (Oracle)
  2023-09-07 17:47 ` [PATCH 5/5] xfs: Stop using lockdep to assert that locks are held Matthew Wilcox (Oracle)
  4 siblings, 0 replies; 33+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-09-07 17:47 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

mrlock was an rwsem wrapper that also recorded whether the lock was
held for read or write.  Now that we can ask the generic code whether
the lock is held for read or write, we can remove this wrapper and use
an rwsem directly.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/xfs/mrlock.h    | 78 ----------------------------------------------
 fs/xfs/xfs_inode.c | 18 +++++------
 fs/xfs/xfs_inode.h |  2 +-
 fs/xfs/xfs_iops.c  |  4 +--
 fs/xfs/xfs_linux.h |  2 +-
 fs/xfs/xfs_super.c |  4 +--
 6 files changed, 14 insertions(+), 94 deletions(-)
 delete mode 100644 fs/xfs/mrlock.h

diff --git a/fs/xfs/mrlock.h b/fs/xfs/mrlock.h
deleted file mode 100644
index 79155eec341b..000000000000
--- a/fs/xfs/mrlock.h
+++ /dev/null
@@ -1,78 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (c) 2000-2006 Silicon Graphics, Inc.
- * All Rights Reserved.
- */
-#ifndef __XFS_SUPPORT_MRLOCK_H__
-#define __XFS_SUPPORT_MRLOCK_H__
-
-#include <linux/rwsem.h>
-
-typedef struct {
-	struct rw_semaphore	mr_lock;
-#if defined(DEBUG) || defined(XFS_WARN)
-	int			mr_writer;
-#endif
-} mrlock_t;
-
-#if defined(DEBUG) || defined(XFS_WARN)
-#define mrinit(mrp, name)	\
-	do { (mrp)->mr_writer = 0; init_rwsem(&(mrp)->mr_lock); } while (0)
-#else
-#define mrinit(mrp, name)	\
-	do { init_rwsem(&(mrp)->mr_lock); } while (0)
-#endif
-
-#define mrlock_init(mrp, t,n,s)	mrinit(mrp, n)
-#define mrfree(mrp)		do { } while (0)
-
-static inline void mraccess_nested(mrlock_t *mrp, int subclass)
-{
-	down_read_nested(&mrp->mr_lock, subclass);
-}
-
-static inline void mrupdate_nested(mrlock_t *mrp, int subclass)
-{
-	down_write_nested(&mrp->mr_lock, subclass);
-#if defined(DEBUG) || defined(XFS_WARN)
-	mrp->mr_writer = 1;
-#endif
-}
-
-static inline int mrtryaccess(mrlock_t *mrp)
-{
-	return down_read_trylock(&mrp->mr_lock);
-}
-
-static inline int mrtryupdate(mrlock_t *mrp)
-{
-	if (!down_write_trylock(&mrp->mr_lock))
-		return 0;
-#if defined(DEBUG) || defined(XFS_WARN)
-	mrp->mr_writer = 1;
-#endif
-	return 1;
-}
-
-static inline void mrunlock_excl(mrlock_t *mrp)
-{
-#if defined(DEBUG) || defined(XFS_WARN)
-	mrp->mr_writer = 0;
-#endif
-	up_write(&mrp->mr_lock);
-}
-
-static inline void mrunlock_shared(mrlock_t *mrp)
-{
-	up_read(&mrp->mr_lock);
-}
-
-static inline void mrdemote(mrlock_t *mrp)
-{
-#if defined(DEBUG) || defined(XFS_WARN)
-	mrp->mr_writer = 0;
-#endif
-	downgrade_write(&mrp->mr_lock);
-}
-
-#endif /* __XFS_SUPPORT_MRLOCK_H__ */
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index e58d84d23f49..c3cd73c29868 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -208,9 +208,9 @@ xfs_ilock(
 	}
 
 	if (lock_flags & XFS_ILOCK_EXCL)
-		mrupdate_nested(&ip->i_lock, XFS_ILOCK_DEP(lock_flags));
+		down_write_nested(&ip->i_lock, XFS_ILOCK_DEP(lock_flags));
 	else if (lock_flags & XFS_ILOCK_SHARED)
-		mraccess_nested(&ip->i_lock, XFS_ILOCK_DEP(lock_flags));
+		down_read_nested(&ip->i_lock, XFS_ILOCK_DEP(lock_flags));
 }
 
 /*
@@ -251,10 +251,10 @@ xfs_ilock_nowait(
 	}
 
 	if (lock_flags & XFS_ILOCK_EXCL) {
-		if (!mrtryupdate(&ip->i_lock))
+		if (!down_write_trylock(&ip->i_lock))
 			goto out_undo_mmaplock;
 	} else if (lock_flags & XFS_ILOCK_SHARED) {
-		if (!mrtryaccess(&ip->i_lock))
+		if (!down_read_trylock(&ip->i_lock))
 			goto out_undo_mmaplock;
 	}
 	return 1;
@@ -303,9 +303,9 @@ xfs_iunlock(
 		up_read(&VFS_I(ip)->i_mapping->invalidate_lock);
 
 	if (lock_flags & XFS_ILOCK_EXCL)
-		mrunlock_excl(&ip->i_lock);
+		up_write(&ip->i_lock);
 	else if (lock_flags & XFS_ILOCK_SHARED)
-		mrunlock_shared(&ip->i_lock);
+		up_read(&ip->i_lock);
 
 	trace_xfs_iunlock(ip, lock_flags, _RET_IP_);
 }
@@ -324,7 +324,7 @@ xfs_ilock_demote(
 		~(XFS_IOLOCK_EXCL|XFS_MMAPLOCK_EXCL|XFS_ILOCK_EXCL)) == 0);
 
 	if (lock_flags & XFS_ILOCK_EXCL)
-		mrdemote(&ip->i_lock);
+		downgrade_write(&ip->i_lock);
 	if (lock_flags & XFS_MMAPLOCK_EXCL)
 		downgrade_write(&VFS_I(ip)->i_mapping->invalidate_lock);
 	if (lock_flags & XFS_IOLOCK_EXCL)
@@ -363,9 +363,9 @@ xfs_isilocked(
 	uint			lock_flags)
 {
 	if (lock_flags & XFS_ILOCK_SHARED)
-		return rwsem_is_locked(&ip->i_lock.mr_lock);
+		return rwsem_is_locked(&ip->i_lock);
 	if (lock_flags & XFS_ILOCK_EXCL)
-		return rwsem_is_write_locked(&ip->i_lock.mr_lock);
+		return rwsem_is_write_locked(&ip->i_lock);
 	if (lock_flags & (XFS_MMAPLOCK_EXCL|XFS_MMAPLOCK_SHARED)) {
 		return __xfs_rwsem_islocked(&VFS_I(ip)->i_mapping->invalidate_lock,
 				(lock_flags & XFS_MMAPLOCK_SHARED));
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 7547caf2f2ab..18941cd21b81 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -39,7 +39,7 @@ typedef struct xfs_inode {
 
 	/* Transaction and locking information. */
 	struct xfs_inode_log_item *i_itemp;	/* logging information */
-	mrlock_t		i_lock;		/* inode lock */
+	struct rw_semaphore	i_lock;		/* inode lock */
 	atomic_t		i_pincount;	/* inode pin count */
 	struct llist_node	i_gclist;	/* deferred inactivation list */
 
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 2ededd3f6b8c..ba64c9c5d3ab 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1280,9 +1280,9 @@ xfs_setup_inode(
 		 */
 		lockdep_set_class(&inode->i_rwsem,
 				  &inode->i_sb->s_type->i_mutex_dir_key);
-		lockdep_set_class(&ip->i_lock.mr_lock, &xfs_dir_ilock_class);
+		lockdep_set_class(&ip->i_lock, &xfs_dir_ilock_class);
 	} else {
-		lockdep_set_class(&ip->i_lock.mr_lock, &xfs_nondir_ilock_class);
+		lockdep_set_class(&ip->i_lock, &xfs_nondir_ilock_class);
 	}
 
 	/*
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index e9d317a3dafe..15fdaef578fe 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -22,7 +22,6 @@ typedef __u32			xfs_nlink_t;
 #include "xfs_types.h"
 
 #include "kmem.h"
-#include "mrlock.h"
 
 #include <linux/semaphore.h>
 #include <linux/mm.h>
@@ -51,6 +50,7 @@ typedef __u32			xfs_nlink_t;
 #include <linux/notifier.h>
 #include <linux/delay.h>
 #include <linux/log2.h>
+#include <linux/rwsem.h>
 #include <linux/spinlock.h>
 #include <linux/random.h>
 #include <linux/ctype.h>
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 1f77014c6e1a..d190afbcfe04 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -739,9 +739,7 @@ xfs_fs_inode_init_once(
 	/* xfs inode */
 	atomic_set(&ip->i_pincount, 0);
 	spin_lock_init(&ip->i_flags_lock);
-
-	mrlock_init(&ip->i_lock, MRLOCK_ALLOW_EQUAL_PRI|MRLOCK_BARRIER,
-		     "xfsino", ip->i_ino);
+	init_rwsem(&ip->i_lock);
 }
 
 /*
-- 
2.40.1




* [PATCH 5/5] xfs: Stop using lockdep to assert that locks are held
  2023-09-07 17:47 [PATCH 0/5] Remove the XFS mrlock Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2023-09-07 17:47 ` [PATCH 4/5] xfs: Remove mrlock wrapper Matthew Wilcox (Oracle)
@ 2023-09-07 17:47 ` Matthew Wilcox (Oracle)
  4 siblings, 0 replies; 33+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-09-07 17:47 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

Lockdep does not know that the worker thread has inherited the lock
from its caller.  Rather than dance around moving the ownership from the
caller to the thread and back again, just remove the lockdep assertions
and rely on the rwsem itself to tell us whether _somebody_ is holding
the lock at the moment.

__xfs_rwsem_islocked() simplifies into a trivial function, which is easy
to inline into xfs_isilocked().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/xfs/xfs_inode.c | 40 ++++++++--------------------------------
 1 file changed, 8 insertions(+), 32 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index c3cd73c29868..81ee6bf8c662 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -334,29 +334,6 @@ xfs_ilock_demote(
 }
 
 #if defined(DEBUG) || defined(XFS_WARN)
-static inline bool
-__xfs_rwsem_islocked(
-	struct rw_semaphore	*rwsem,
-	bool			shared)
-{
-	if (!debug_locks) {
-		if (!shared)
-			return rwsem_is_write_locked(rwsem);
-		return rwsem_is_locked(rwsem);
-	}
-
-	if (!shared)
-		return lockdep_is_held_type(rwsem, 0);
-
-	/*
-	 * We are checking that the lock is held at least in shared
-	 * mode but don't care that it might be held exclusively
-	 * (i.e. shared | excl). Hence we check if the lock is held
-	 * in any mode rather than an explicit shared mode.
-	 */
-	return lockdep_is_held_type(rwsem, -1);
-}
-
 bool
 xfs_isilocked(
 	struct xfs_inode	*ip,
@@ -366,15 +343,14 @@ xfs_isilocked(
 		return rwsem_is_locked(&ip->i_lock);
 	if (lock_flags & XFS_ILOCK_EXCL)
 		return rwsem_is_write_locked(&ip->i_lock);
-	if (lock_flags & (XFS_MMAPLOCK_EXCL|XFS_MMAPLOCK_SHARED)) {
-		return __xfs_rwsem_islocked(&VFS_I(ip)->i_mapping->invalidate_lock,
-				(lock_flags & XFS_MMAPLOCK_SHARED));
-	}
-
-	if (lock_flags & (XFS_IOLOCK_EXCL | XFS_IOLOCK_SHARED)) {
-		return __xfs_rwsem_islocked(&VFS_I(ip)->i_rwsem,
-				(lock_flags & XFS_IOLOCK_SHARED));
-	}
+	if (lock_flags & XFS_MMAPLOCK_SHARED)
+		return rwsem_is_locked(&VFS_I(ip)->i_mapping->invalidate_lock);
+	if (lock_flags & XFS_MMAPLOCK_EXCL)
+		return rwsem_is_write_locked(&VFS_I(ip)->i_mapping->invalidate_lock);
+	if (lock_flags & XFS_IOLOCK_SHARED)
+		return rwsem_is_locked(&VFS_I(ip)->i_rwsem);
+	if (lock_flags & XFS_IOLOCK_EXCL)
+		return rwsem_is_write_locked(&VFS_I(ip)->i_rwsem);
 
 	ASSERT(0);
 	return false;
-- 
2.40.1




* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 17:47 ` [PATCH 1/5] locking: Add rwsem_is_write_locked() Matthew Wilcox (Oracle)
@ 2023-09-07 18:05   ` Waiman Long
  2023-09-07 19:33     ` Matthew Wilcox
  2023-09-07 19:08   ` Peter Zijlstra
  1 sibling, 1 reply; 33+ messages in thread
From: Waiman Long @ 2023-09-07 18:05 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong, linux-xfs

On 9/7/23 13:47, Matthew Wilcox (Oracle) wrote:
> Several places want to know whether the lock is held by a writer, instead
> of just whether it's held.  We can implement this for both normal and
> rt rwsems.  RWSEM_WRITER_LOCKED is declared in rwsem.c and exposing
> it outside that file might tempt other people to use it, so just use
> a comment to note that's what the 1 means, and help anybody find it if
> they're looking to change the implementation.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>   include/linux/rwbase_rt.h |  5 +++++
>   include/linux/rwsem.h     | 10 ++++++++++
>   2 files changed, 15 insertions(+)
>
> diff --git a/include/linux/rwbase_rt.h b/include/linux/rwbase_rt.h
> index 1d264dd08625..3c25b14edc05 100644
> --- a/include/linux/rwbase_rt.h
> +++ b/include/linux/rwbase_rt.h
> @@ -31,6 +31,11 @@ static __always_inline bool rw_base_is_locked(struct rwbase_rt *rwb)
>   	return atomic_read(&rwb->readers) != READER_BIAS;
>   }
>   
> +static __always_inline bool rw_base_is_write_locked(struct rwbase_rt *rwb)
> +{
> +	return atomic_read(&rwb->readers) == WRITER_BIAS;
> +}
> +
>   static __always_inline bool rw_base_is_contended(struct rwbase_rt *rwb)
>   {
>   	return atomic_read(&rwb->readers) > 0;
> diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
> index 1dd530ce8b45..0f78b8d2e653 100644
> --- a/include/linux/rwsem.h
> +++ b/include/linux/rwsem.h
> @@ -72,6 +72,11 @@ static inline int rwsem_is_locked(struct rw_semaphore *sem)
>   	return atomic_long_read(&sem->count) != 0;
>   }
>   
> +static inline int rwsem_is_write_locked(struct rw_semaphore *sem)
> +{
> +	return atomic_long_read(&sem->count) & 1 /* RWSEM_WRITER_LOCKED */;
> +}

I would prefer you move the various RWSEM_* count bit macros from 
kernel/locking/rwsem.c to under the !PREEMPT_RT block and directly use 
RWSEM_WRITER_LOCKED instead of hardcoding a value of 1.

Cheers,
Longman




* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 17:47 ` [PATCH 1/5] locking: Add rwsem_is_write_locked() Matthew Wilcox (Oracle)
  2023-09-07 18:05   ` Waiman Long
@ 2023-09-07 19:08   ` Peter Zijlstra
  2023-09-07 19:20     ` Matthew Wilcox
  1 sibling, 1 reply; 33+ messages in thread
From: Peter Zijlstra @ 2023-09-07 19:08 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Ingo Molnar, Will Deacon, Waiman Long, linux-kernel, linux-mm,
	Chandan Babu R, Darrick J . Wong, linux-xfs

On Thu, Sep 07, 2023 at 06:47:01PM +0100, Matthew Wilcox (Oracle) wrote:
> Several places want to know whether the lock is held by a writer, instead
> of just whether it's held.  We can implement this for both normal and
> rt rwsems.  RWSEM_WRITER_LOCKED is declared in rwsem.c and exposing
> it outside that file might tempt other people to use it, so just use
> a comment to note that's what the 1 means, and help anybody find it if
> they're looking to change the implementation.

I'm presuming this is deep in a callchain where they know they hold the
lock, but they lost in what capacity?

In general I strongly dislike the whole _is_locked family, because it
gives very poorly defined semantics if used by anybody but the owner.

If these new functions are indeed to be used only by lock holders to
determine what kind of lock they hold, could we please put:

	lockdep_assert_held()

in them?
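
Something like the below, just as a sketch of what I mean (same body as
your patch, with the assertion added):

	static inline int rwsem_is_write_locked(struct rw_semaphore *sem)
	{
		/* Only the lock holder can ask this meaningfully. */
		lockdep_assert_held(sem);
		return atomic_long_read(&sem->count) & 1 /* RWSEM_WRITER_LOCKED */;
	}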



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 19:08   ` Peter Zijlstra
@ 2023-09-07 19:20     ` Matthew Wilcox
  2023-09-07 19:38       ` Peter Zijlstra
  0 siblings, 1 reply; 33+ messages in thread
From: Matthew Wilcox @ 2023-09-07 19:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Will Deacon, Waiman Long, linux-kernel, linux-mm,
	Chandan Babu R, Darrick J . Wong, linux-xfs

On Thu, Sep 07, 2023 at 09:08:10PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 07, 2023 at 06:47:01PM +0100, Matthew Wilcox (Oracle) wrote:
> > Several places want to know whether the lock is held by a writer, instead
> > of just whether it's held.  We can implement this for both normal and
> > rt rwsems.  RWSEM_WRITER_LOCKED is declared in rwsem.c and exposing
> > it outside that file might tempt other people to use it, so just use
> > a comment to note that's what the 1 means, and help anybody find it if
> > they're looking to change the implementation.
> 
> I'm presuming this is deep in a callchain where they know they hold the
> lock, but they lost in what capacity?

No, it's just assertions.  You can see that in patch 3 where it's
used in functions called things like "xfs_islocked".

> In general I strongly dislike the whole _is_locked family, because it
> gives very poorly defined semantics if used by anybody but the owner.
> 
> If these new functions are indeed to be used only by lock holders to
> determine what kind of lock they hold, could we please put:
> 
> 	lockdep_assert_held()
> 
> in them?

Patch 2 shows it in use in the MM code.  We already have a
lockdep_assert_held_write(), but most people don't enable lockdep, so
we also have VM_BUG_ON_MM(!rwsem_is_write_locked(&mm->mmap_lock), mm)
to give us a good assertion when lockdep is disabled.

XFS has a problem with using lockdep in general, which is that a worker
thread can be spawned and use the fact that the spawner is holding the
lock.  There's no mechanism for the worker thread to ask "Does struct
task_struct *p hold the lock?".
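
The pattern is roughly the below (names invented purely for illustration;
this is not the actual XFS code):

	struct handoff {
		struct work_struct	work;
		struct rw_semaphore	*lock;
	};

	static void handoff_worker(struct work_struct *work)
	{
		struct handoff *h = container_of(work, struct handoff, work);

		/* ... operate on whatever the rwsem protects ... */
		up_write(h->lock);	/* released by a task that never acquired it */
	}

	static void handoff_submit(struct handoff *h)
	{
		down_write(h->lock);	/* lockdep records *this* task as owner */
		INIT_WORK(&h->work, handoff_worker);
		queue_work(system_unbound_wq, &h->work);
		/* return with the lock still held; the worker unlocks it */
	}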



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 18:05   ` Waiman Long
@ 2023-09-07 19:33     ` Matthew Wilcox
  2023-09-07 21:06       ` Waiman Long
  0 siblings, 1 reply; 33+ messages in thread
From: Matthew Wilcox @ 2023-09-07 19:33 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, linux-kernel, linux-mm,
	Chandan Babu R, Darrick J . Wong, linux-xfs

On Thu, Sep 07, 2023 at 02:05:54PM -0400, Waiman Long wrote:
> On 9/7/23 13:47, Matthew Wilcox (Oracle) wrote:
> > +static inline int rwsem_is_write_locked(struct rw_semaphore *sem)
> > +{
> > +	return atomic_long_read(&sem->count) & 1 /* RWSEM_WRITER_LOCKED */;
> > +}
> 
> I would prefer you move the various RWSEM_* count bit macros from
> kernel/locking/rwsem.c to under the !PREEMPT_RT block and directly use
> RWSEM_WRITER_LOCKED instead of hardcoding a value of 1.

Just to be clear, you want the ~50 lines from:

/*
 * On 64-bit architectures, the bit definitions of the count are:
...
#define RWSEM_READ_FAILED_MASK  (RWSEM_WRITER_MASK|RWSEM_FLAG_WAITERS|\
                                 RWSEM_FLAG_HANDOFF|RWSEM_FLAG_READFAIL)

moved from rwsem.c to rwsem.h?

Or just these four lines:

#define RWSEM_WRITER_LOCKED     (1UL << 0)
#define RWSEM_FLAG_WAITERS      (1UL << 1)
#define RWSEM_FLAG_HANDOFF      (1UL << 2)
#define RWSEM_FLAG_READFAIL     (1UL << (BITS_PER_LONG - 1))




* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 19:20     ` Matthew Wilcox
@ 2023-09-07 19:38       ` Peter Zijlstra
  2023-09-07 23:00         ` Dave Chinner
  2023-09-08  0:01         ` Matthew Wilcox
  0 siblings, 2 replies; 33+ messages in thread
From: Peter Zijlstra @ 2023-09-07 19:38 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Ingo Molnar, Will Deacon, Waiman Long, linux-kernel, linux-mm,
	Chandan Babu R, Darrick J . Wong, linux-xfs

On Thu, Sep 07, 2023 at 08:20:30PM +0100, Matthew Wilcox wrote:
> On Thu, Sep 07, 2023 at 09:08:10PM +0200, Peter Zijlstra wrote:
> > On Thu, Sep 07, 2023 at 06:47:01PM +0100, Matthew Wilcox (Oracle) wrote:
> > > Several places want to know whether the lock is held by a writer, instead
> > > of just whether it's held.  We can implement this for both normal and
> > > rt rwsems.  RWSEM_WRITER_LOCKED is declared in rwsem.c and exposing
> > > it outside that file might tempt other people to use it, so just use
> > > a comment to note that's what the 1 means, and help anybody find it if
> > > they're looking to change the implementation.
> > 
> > I'm presuming this is deep in a callchain where they know they hold the
> > lock, but they lost in what capacity?
> 
> No, it's just assertions.  You can see that in patch 3 where it's
> used in functions called things like "xfs_islocked".

Right, but if you're not the lock owner, your answer to the question is
a dice-roll, it might be locked, it might not be.

> > In general I strongly dislike the whole _is_locked family, because it
> > gives very poorly defined semantics if used by anybody but the owner.
> > 
> > If these new functions are indeed to be used only by lock holders to
> > determine what kind of lock they hold, could we please put:
> > 
> > 	lockdep_assert_held()
> > 
> > in them?
> 
> Patch 2 shows it in use in the MM code.  We already have a
> lockdep_assert_held_write(), but most people don't enable lockdep, so

Most devs should run with lockdep on when writing new code, and I know
the sanitizer robots run with lockdep on.

In general there seems to be a ton of lockdep on coverage.

> we also have VM_BUG_ON_MM(!rwsem_is_write_locked(&mm->mmap_lock), mm)
> to give us a good assertion when lockdep is disabled.

Is that really worth it still? I mean, much of these assertions pre-date
lockdep.

> XFS has a problem with using lockdep in general, which is that a worker
> thread can be spawned and use the fact that the spawner is holding the
> lock.  There's no mechanism for the worker thread to ask "Does struct
> task_struct *p hold the lock?".

Will be somewhat tricky to make happen -- but might be doable. It is
however an interface that is *very* hard to use correctly. Basically I
think you want to also assert that your target task 'p' is blocked,
right?

That is: assert @p is blocked and holds @lock.



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 19:33     ` Matthew Wilcox
@ 2023-09-07 21:06       ` Waiman Long
  2023-09-07 23:47         ` Waiman Long
  0 siblings, 1 reply; 33+ messages in thread
From: Waiman Long @ 2023-09-07 21:06 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, linux-kernel, linux-mm,
	Chandan Babu R, Darrick J . Wong, linux-xfs


On 9/7/23 15:33, Matthew Wilcox wrote:
> On Thu, Sep 07, 2023 at 02:05:54PM -0400, Waiman Long wrote:
>> On 9/7/23 13:47, Matthew Wilcox (Oracle) wrote:
>>> +static inline int rwsem_is_write_locked(struct rw_semaphore *sem)
>>> +{
>>> +	return atomic_long_read(&sem->count) & 1 /* RWSEM_WRITER_LOCKED */;
>>> +}
>> I would prefer you move the various RWSEM_* count bit macros from
>> kernel/locking/rwsem.c to under the !PREEMPT_RT block and directly use
>> RWSEM_WRITER_LOCKED instead of hardcoding a value of 1.
> Just to be clear, you want the ~50 lines from:
>
> /*
>   * On 64-bit architectures, the bit definitions of the count are:
> ...
> #define RWSEM_READ_FAILED_MASK  (RWSEM_WRITER_MASK|RWSEM_FLAG_WAITERS|\
>                                   RWSEM_FLAG_HANDOFF|RWSEM_FLAG_READFAIL)
>
> moved from rwsem.c to rwsem.h?
>
> Or just these four lines:
>
> #define RWSEM_WRITER_LOCKED     (1UL << 0)
> #define RWSEM_FLAG_WAITERS      (1UL << 1)
> #define RWSEM_FLAG_HANDOFF      (1UL << 2)
> #define RWSEM_FLAG_READFAIL     (1UL << (BITS_PER_LONG - 1))

I think just the first 3 lines will be enough. Maybe a bit of comment 
about these bit flags in the count atomic_long value.
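
Something like this, as a rough sketch (only the bits the helper needs,
plus a comment pointing back at rwsem.c):

	/*
	 * Low bits of sem->count; these must match the definitions in
	 * kernel/locking/rwsem.c.
	 */
	#define RWSEM_WRITER_LOCKED	(1UL << 0)
	#define RWSEM_FLAG_WAITERS	(1UL << 1)
	#define RWSEM_FLAG_HANDOFF	(1UL << 2)

	static inline int rwsem_is_write_locked(struct rw_semaphore *sem)
	{
		return atomic_long_read(&sem->count) & RWSEM_WRITER_LOCKED;
	}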

Cheers,
Longman





* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 19:38       ` Peter Zijlstra
@ 2023-09-07 23:00         ` Dave Chinner
  2023-09-08 10:44           ` Peter Zijlstra
  2023-09-08  0:01         ` Matthew Wilcox
  1 sibling, 1 reply; 33+ messages in thread
From: Dave Chinner @ 2023-09-07 23:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Matthew Wilcox, Ingo Molnar, Will Deacon, Waiman Long,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Thu, Sep 07, 2023 at 09:38:38PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 07, 2023 at 08:20:30PM +0100, Matthew Wilcox wrote:
> > On Thu, Sep 07, 2023 at 09:08:10PM +0200, Peter Zijlstra wrote:
> > > On Thu, Sep 07, 2023 at 06:47:01PM +0100, Matthew Wilcox (Oracle) wrote:
> > > > Several places want to know whether the lock is held by a writer, instead
> > > > of just whether it's held.  We can implement this for both normal and
> > > > rt rwsems.  RWSEM_WRITER_LOCKED is declared in rwsem.c and exposing
> > > > it outside that file might tempt other people to use it, so just use
> > > > a comment to note that's what the 1 means, and help anybody find it if
> > > > they're looking to change the implementation.
> > > 
> > > I'm presuming this is deep in a callchain where they know they hold the
> > > lock, but they lost in what capacity?
> > 
> > No, it's just assertions.  You can see that in patch 3 where it's
> > used in functions called things like "xfs_islocked".
> 
> Right, but if you're not the lock owner, your answer to the question is
> a dice-roll, it might be locked, it might not be.

Except that the person writing the code knows the call chain that
leads up to that code, and so they have a pretty good idea whether
the object should be locked or not. If we are running that code, and
the object is locked, then it's pretty much guaranteed that the
owner of the lock is code that executed the check, because otherwise
we have a *major lock implementation bug*.

i.e. if we get to a place where rwsem_is_write_locked() fires
because some other task holds the lock, it almost always means we
have *two* tasks holding the lock exclusively.

Yes, it's these non-lockdep checks in XFS that have found rwsem
implementation bugs in the past. We've had them fire when the lock
was write locked even though we know a few lines earlier it was taken
as a read lock, or when it was still marked write locked after it
should have been unlocked, etc., because the rwsem code failed to
enforce rw exclusion semantics correctly.

So, really, these lock checks should be considered in the context of
the code that is running them and what such a "false detection"
would actually mean. In my experience, a false detection like you
talk about above means "rwsems are broken", not that there is a
problem with the code using the rwsems or the rwsem state check.

> > > In general I strongly dislike the whole _is_locked family, because it
> > > gives very poorly defined semantics if used by anybody but the owner.
> > > 
> > > If these new functions are indeed to be used only by lock holders to
> > > determine what kind of lock they hold, could we please put:
> > > 
> > > 	lockdep_assert_held()
> > > 
> > > in them?
> > 
> > Patch 2 shows it in use in the MM code.  We already have a
> > lockdep_assert_held_write(), but most people don't enable lockdep, so
> 
> Most devs should run with lockdep on when writing new code, and I know
> the sanitizer robots run with lockdep on.
> 
> In general there seems to be a ton of lockdep on coverage.

*cough*

Bit locks, semaphores, and all sorts of other constructs for IO
serialisation (like inode_dio_wait()) have no lockdep coverage at
all. IOWs, large chunks of many filesystems, the VFS and the VM have
little to no lockdep coverage at all.

> > we also have VM_BUG_ON_MM(!rwsem_is_write_locked(&mm->mmap_lock), mm)
> > to give us a good assertion when lockdep is disabled.
> 
> Is that really worth it still? I mean, much of these assertions pre-date
> lockdep.

And we're trying to propagate them because lockdep isn't a viable
option for day to day testing of filesystems because of it's
overhead vs how infrequently it finds new problems.

> > XFS has a problem with using lockdep in general, which is that a worker
> > thread can be spawned and use the fact that the spawner is holding the
> > lock.  There's no mechanism for the worker thread to ask "Does struct
> > task_struct *p hold the lock?".
> 
> Will be somewhat tricky to make happen -- but might be doable. It is
> however an interface that is *very* hard to use correctly. Basically I
> think you want to also assert that your target task 'p' is blocked,
> right?
> 
> That is: assert @p is blocked and holds @lock.

That addresses the immediate symptom; it doesn't address the large
problem with lockdep and needing non-owner rwsem semantics.

i.e. synchronous task based locking models don't work for
asynchronous multi-stage pipeline processing engines like XFS. The
lock protects the data object and follows the data object through
the processing pipeline, whilst the original submitter moves on to
the next operation to processes without blocking.

This is the non-blocking, async processing model that io_uring
development is pushing filesystems towards, so assuming that we only
hand a lock to a single worker task and then wait for it complete
(i.e. synchronous operation) flies in the face of current
development directions...

-Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 21:06       ` Waiman Long
@ 2023-09-07 23:47         ` Waiman Long
  2023-09-08  0:44           ` Dave Chinner
  0 siblings, 1 reply; 33+ messages in thread
From: Waiman Long @ 2023-09-07 23:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, linux-kernel, linux-mm,
	Chandan Babu R, Darrick J . Wong, linux-xfs


On 9/7/23 17:06, Waiman Long wrote:
>
> On 9/7/23 15:33, Matthew Wilcox wrote:
>> On Thu, Sep 07, 2023 at 02:05:54PM -0400, Waiman Long wrote:
>>> On 9/7/23 13:47, Matthew Wilcox (Oracle) wrote:
>>>> +static inline int rwsem_is_write_locked(struct rw_semaphore *sem)
>>>> +{
>>>> +    return atomic_long_read(&sem->count) & 1 /* RWSEM_WRITER_LOCKED */;
>>>> +}
>>> I would prefer you move the various RWSEM_* count bit macros from
>>> kernel/locking/rwsem.c to under the !PREEMPT_RT block and directly use
>>> RWSEM_WRITER_LOCKED instead of hardcoding a value of 1.
>> Just to be clear, you want the ~50 lines from:
>>
>> /*
>>   * On 64-bit architectures, the bit definitions of the count are:
>> ...
>> #define RWSEM_READ_FAILED_MASK (RWSEM_WRITER_MASK|RWSEM_FLAG_WAITERS|\
>> RWSEM_FLAG_HANDOFF|RWSEM_FLAG_READFAIL)
>>
>> moved from rwsem.c to rwsem.h?
>>
>> Or just these four lines:
>>
>> #define RWSEM_WRITER_LOCKED     (1UL << 0)
>> #define RWSEM_FLAG_WAITERS      (1UL << 1)
>> #define RWSEM_FLAG_HANDOFF      (1UL << 2)
>> #define RWSEM_FLAG_READFAIL     (1UL << (BITS_PER_LONG - 1))
>
> I think just the first 3 lines will be enough. Maybe a bit of comment 
> about these bit flags in the count atomic_long value.

Actually, the old rwsem implementation won't allow you to reliably 
determine if a rwsem is write locked because the xadd instruction is 
used for write locking and the code had to back out the WRITER_BIAS if 
the attempt failed. Maybe that is why XFS has its own code to check if a 
rwsem is write locked which is needed with the old rwsem implementation.

The new implementation makes this check reliable. Still it is not easy 
to check if a rwsem is read locked as the check will be rather 
complicated and probably racy.

Cheers,
Longman




* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 19:38       ` Peter Zijlstra
  2023-09-07 23:00         ` Dave Chinner
@ 2023-09-08  0:01         ` Matthew Wilcox
  1 sibling, 0 replies; 33+ messages in thread
From: Matthew Wilcox @ 2023-09-08  0:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Will Deacon, Waiman Long, linux-kernel, linux-mm,
	Chandan Babu R, Darrick J . Wong, linux-xfs

On Thu, Sep 07, 2023 at 09:38:38PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 07, 2023 at 08:20:30PM +0100, Matthew Wilcox wrote:
> > On Thu, Sep 07, 2023 at 09:08:10PM +0200, Peter Zijlstra wrote:
> > > On Thu, Sep 07, 2023 at 06:47:01PM +0100, Matthew Wilcox (Oracle) wrote:
> > > > Several places want to know whether the lock is held by a writer, instead
> > > > of just whether it's held.  We can implement this for both normal and
> > > > rt rwsems.  RWSEM_WRITER_LOCKED is declared in rwsem.c and exposing
> > > > it outside that file might tempt other people to use it, so just use
> > > > a comment to note that's what the 1 means, and help anybody find it if
> > > > they're looking to change the implementation.
> > > 
> > > I'm presuming this is deep in a callchain where they know they hold the
> > > lock, but they lost in what capacity?
> > 
> > No, it's just assertions.  You can see that in patch 3 where it's
> > used in functions called things like "xfs_islocked".
> 
> Right, but if you're not the lock owner, your answer to the question is
> a dice-roll, it might be locked, it might not be.

Sure, but if you are the lock owner, it's guaranteed to work.  And if
you hold it for read, and check that it's held for write, that will
definitely fail.  I agree "somebody else is holding it" gives you a
false negative, but most locks aren't contended, so it'll fail the
assertion often enough.

> > Patch 2 shows it in use in the MM code.  We already have a
> > lockdep_assert_held_write(), but most people don't enable lockdep, so
> 
> Most devs should run with lockdep on when writing new code, and I know
> the sanitizer robots run with lockdep on.
> 
> In general there seems to be a ton of lockdep on coverage.

Hm.  Enabling lockdep causes an xfstests run to go up from 6000 seconds
to 8000 seconds for me.  That's a significant reduction in the number
of test cycles I can run per day.

> > we also have VM_BUG_ON_MM(!rwsem_is_write_locked(&mm->mmap_lock), mm)
> > to give us a good assertion when lockdep is disabled.
> 
> Is that really worth it still? I mean, much of these assertions pre-date
> lockdep.

That's true.  Is it possible to do a cheaper version of lockdep where it
only records that you have the lock and doesn't, eg, check for locking
dependencies?  I haven't analysed lockdep to see what the expensive part
is; for all I know it's recording the holders, and this idea is
worthless.

> > XFS has a problem with using lockdep in general, which is that a worker
> > thread can be spawned and use the fact that the spawner is holding the
> > lock.  There's no mechanism for the worker thread to ask "Does struct
> > task_struct *p hold the lock?".
> 
> Will be somewhat tricky to make happen -- but might be doable. It is
> however an interface that is *very* hard to use correctly. Basically I
> think you want to also assert that your target task 'p' is blocked,
> right?
> 
> That is: assert @p is blocked and holds @lock.

Probably, although I think there might be a race where the worker thread
starts running before the waiter starts waiting?

In conversation with Darrick & Dave, I learned that XFS does this in
order to avoid overflowing the kernel stack.  So if you've been thinking
about "we
could support segmented stacks", it would avoid having to support this
lockdep handoff.  It'd probably work a lot better for the scheduler too.



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 23:47         ` Waiman Long
@ 2023-09-08  0:44           ` Dave Chinner
  0 siblings, 0 replies; 33+ messages in thread
From: Dave Chinner @ 2023-09-08  0:44 UTC (permalink / raw)
  To: Waiman Long
  Cc: Matthew Wilcox, Peter Zijlstra, Ingo Molnar, Will Deacon,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Thu, Sep 07, 2023 at 07:47:31PM -0400, Waiman Long wrote:
> 
> On 9/7/23 17:06, Waiman Long wrote:
> > 
> > On 9/7/23 15:33, Matthew Wilcox wrote:
> > > On Thu, Sep 07, 2023 at 02:05:54PM -0400, Waiman Long wrote:
> > > > On 9/7/23 13:47, Matthew Wilcox (Oracle) wrote:
> > > > > +static inline int rwsem_is_write_locked(struct rw_semaphore *sem)
> > > > > +{
> > > > > +    return atomic_long_read(&sem->count) & 1 /* RWSEM_WRITER_LOCKED */;
> > > > > +}
> > > > I would prefer you move the various RWSEM_* count bit macros from
> > > > kernel/locking/rwsem.c to under the !PREEMPT_RT block and directly use
> > > > RWSEM_WRITER_LOCKED instead of hardcoding a value of 1.
> > > Just to be clear, you want the ~50 lines from:
> > > 
> > > /*
> > >   * On 64-bit architectures, the bit definitions of the count are:
> > > ...
> > > #define RWSEM_READ_FAILED_MASK (RWSEM_WRITER_MASK|RWSEM_FLAG_WAITERS|\
> > > RWSEM_FLAG_HANDOFF|RWSEM_FLAG_READFAIL)
> > > 
> > > moved from rwsem.c to rwsem.h?
> > > 
> > > Or just these four lines:
> > > 
> > > #define RWSEM_WRITER_LOCKED     (1UL << 0)
> > > #define RWSEM_FLAG_WAITERS      (1UL << 1)
> > > #define RWSEM_FLAG_HANDOFF      (1UL << 2)
> > > #define RWSEM_FLAG_READFAIL     (1UL << (BITS_PER_LONG - 1))
> > 
> > I think just the first 3 lines will be enough. Maybe a bit of comment
> > about these bit flags in the count atomic_long value.
> 
> Actually, the old rwsem implementation won't allow you to reliably determine
> if a rwsem is write locked because the xadd instruction is used for write
> locking and the code had to back out the WRITER_BIAS if the attempt failed.
> Maybe that is why XFS has its own code to check if a rwsem is write locked
> which is needed with the old rwsem implementation.

mrlocks pre-date rwsems entirely on Linux.  mrlocks were introduced
to XFS as part of the port from Irix back in 2000. This originally
had a 'ismrlocked()' function for checking lock state.

In 2003, this was expanded to allow explicit lock type checks via
'mrislocked_access() and 'mrislocked_update()' wrappers that checked
internal counters to determine how it was locked.

In 2004, the mrlock was converted to use the generic kernel rwsem
implementation, and because that couldn't be used to track writers,
the mrlock included a mr_writer boolean field to indicate it was
write locked for the purpose of implementing the existing debug
checks. Hence the mrlock debug code has always had reliable
differentiation of read vs write state, whereas we couldn't do that
natively with rwsems for a real long time.

The mrlocks have essentially remained unchanged since 2004 - this
long predates lockdep, and it lives on because it gives us something
lockdep doesn't: zero overhead locking validation.

> The new implementation makes this check reliable. Still it is not easy to
> check if a rwsem is read locked as the check will be rather complicated and
> probably racy.

You can't look at these locking checks in isolation. These checks
are done in code paths where we expect the caller to have already
locked the rwsem in the manner required. Hence there should be no races
with rwsem state changes at all.

If we see a locking assert fire, it means we either screwed up
the XFS locking completely (no races necessary), or there's a bug in
the rwsem implementation. The latter case has occurred several
times; the rwsem locking checks in XFS have uncovered more than one
rwsem implementation bug in the past...

IOWs, the explicit lock state checks and asserts provide a
lock implementation validation mechanism that lockdep doesn't....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: [PATCH 3/5] xfs: Use rwsem_is_write_locked()
  2023-09-07 17:47 ` [PATCH 3/5] xfs: Use rwsem_is_write_locked() Matthew Wilcox (Oracle)
@ 2023-09-08  9:09   ` Christoph Hellwig
  2023-09-08  9:10     ` Christoph Hellwig
  0 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2023-09-08  9:09 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Thu, Sep 07, 2023 at 06:47:03PM +0100, Matthew Wilcox (Oracle) wrote:
> This avoids using the mr_writer field to check the XFS ILOCK is held
> for write.  It also improves the checking we do when lockdep is disabled.

Hmm, is there any reason to keep the lockdep-based version of
__xfs_rwsem_islocked around now at all?



* Re: [PATCH 3/5] xfs: Use rwsem_is_write_locked()
  2023-09-08  9:09   ` Christoph Hellwig
@ 2023-09-08  9:10     ` Christoph Hellwig
  0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2023-09-08  9:10 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Fri, Sep 08, 2023 at 02:09:51AM -0700, Christoph Hellwig wrote:
> On Thu, Sep 07, 2023 at 06:47:03PM +0100, Matthew Wilcox (Oracle) wrote:
> > This avoids using the mr_writer field to check the XFS ILOCK is held
> > for write.  It also improves the checking we do when lockdep is disabled.
> 
> Hmm, is there any reason to keep the lockdep-based version of
> __xfs_rwsem_islocked around now at all?

Ok, looks like the last patch fixes that.



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-07 23:00         ` Dave Chinner
@ 2023-09-08 10:44           ` Peter Zijlstra
  2023-09-10 22:56             ` Dave Chinner
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Zijlstra @ 2023-09-08 10:44 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Matthew Wilcox, Ingo Molnar, Will Deacon, Waiman Long,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Fri, Sep 08, 2023 at 09:00:08AM +1000, Dave Chinner wrote:

> > Right, but if you're not the lock owner, your answer to the question is
> > a dice-roll, it might be locked, it might not be.
> 
> Except that the person writing the code knows the call chain that
> leads up to that code, and so they have a pretty good idea whether
> the object should be locked or not. If we are running that code, and
> the object is locked, then it's pretty much guaranteed that the
> owner of the lock is code that executed the check, because otherwise
> we have a *major lock implementation bug*.

Agreed, and this is fine. However there's been some very creative
'use' of the _is_locked() class of functions in the past that did not
follow 'common' sense.

If all usage was: I should be holding this, lets check. I probably
wouldn't have this bad feeling about things.

> > Most devs should run with lockdep on when writing new code, and I know
> > the sanitizer robots run with lockdep on.
> > 
> > In general there seems to be a ton of lockdep on coverage.
> 
> *cough*
> 
> Bit locks, semaphores, and all sorts of other constructs for IO
> serialisation (like inode_dio_wait()) have no lockdep coverage at
> all. IOWs, large chunks of many filesystems, the VFS and the VM have
> little to no lockdep coverage at all.

True, however I was commenting on the assertion that vm code has
duplicate asserts with the implication that was because not a lot of
people run with lockdep on.

> > > we also have VM_BUG_ON_MM(!rwsem_is_write_locked(&mm->mmap_lock), mm)
> > > to give us a good assertion when lockdep is disabled.
> > 
> > Is that really worth it still? I mean, much of these assertions pre-date
> > lockdep.
> 
> And we're trying to propagate them because lockdep isn't a viable
> option for day to day testing of filesystems because of it's
> overhead vs how infrequently it finds new problems.

... in XFS. Lockdep avoids a giant pile of broken from entering the
kernel and the robots still report plenty.

> > > XFS has a problem with using lockdep in general, which is that a worker
> > > thread can be spawned and use the fact that the spawner is holding the
> > > lock.  There's no mechanism for the worker thread to ask "Does struct
> > > task_struct *p hold the lock?".
> > 
> > Will be somewhat tricky to make happen -- but might be doable. It is
> > however an interface that is *very* hard to use correctly. Basically I
> > think you want to also assert that your target task 'p' is blocked,
> > right?
> > 
> > That is: assert @p is blocked and holds @lock.
> 
> That addresses the immediate symptom; it doesn't address the large
> problem with lockdep and needing non-owner rwsem semantics.
> 
> i.e. synchronous task based locking models don't work for
> asynchronous multi-stage pipeline processing engines like XFS. The
> lock protects the data object and follows the data object through
> the processing pipeline, whilst the original submitter moves on to
> the next operation to processes without blocking.
> 
> This is the non-blocking, async processing model that io_uring
> development is pushing filesystems towards, so assuming that we only
> hand a lock to a single worker task and then wait for it complete
> (i.e. synchronous operation) flies in the face of current
> development directions...

I was looking at things from an interface abuse perspective. How easy is
it to do the wrong thing. As said, we've had a bunch of really dodgy
code with the _is_locked class of functions, hence my desire to find
something else.

As to the whole non-owner locking, yes, that's problematic. I'm not
convinced async operations require non-owner locking, at the same time I
do see that IO completions pose a challenge.

Coming from the schedulability and real-time corner, non-owner locks are
a nightmare because of the inversions. So yeah, fun to be had I'm sure.



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-08 10:44           ` Peter Zijlstra
@ 2023-09-10 22:56             ` Dave Chinner
  2023-09-10 23:17               ` Matthew Wilcox
  0 siblings, 1 reply; 33+ messages in thread
From: Dave Chinner @ 2023-09-10 22:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Matthew Wilcox, Ingo Molnar, Will Deacon, Waiman Long,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Fri, Sep 08, 2023 at 12:44:34PM +0200, Peter Zijlstra wrote:
> On Fri, Sep 08, 2023 at 09:00:08AM +1000, Dave Chinner wrote:
> 
> > > Right, but if you're not the lock owner, your answer to the question is
> > > a dice-roll, it might be locked, it might not be.
> > 
> > Except that the person writing the code knows the call chain that
> > leads up to that code, and so they have a pretty good idea whether
> > the object should be locked or not. If we are running that code, and
> > the object is locked, then it's pretty much guaranteed that the
> > owner of the lock is code that executed the check, because otherwise
> > we have a *major lock implementation bug*.
> 
> Agreed, and this is fine. However there's been some very creative
> 'use' of the _is_locked() class of functions in the past that did not
> follow 'common' sense.
> 
> If all usage was: I should be holding this, lets check. I probably
> wouldn't have this bad feeling about things.

So your argument against such an interface is essentially "we can't
have nice things because someone might abuse them"?

> > > Most devs should run with lockdep on when writing new code, and I know
> > > the sanitizer robots run with lockdep on.
> > > 
> > > In general there seems to be a ton of lockdep on coverage.
> > 
> > *cough*
> > 
> > Bit locks, semaphores, and all sorts of other constructs for IO
> > serialisation (like inode_dio_wait()) have no lockdep coverage at
> > all. IOWs, large chunks of many filesystems, the VFS and the VM have
> > little to no lockdep coverage at all.
> 
> True, however I was commenting on the assertion that vm code has
> duplicate asserts with the implication that was because not a lot of
> people run with lockdep on.

I think that implication is pretty much spot on the money for any
subsystem that has significant correctness testing overhead. A
single fstests regression test pass for a single configuration for
XFS takes well over 4 hours to run. If I add lockdep, it's about 7
hours. If I add lockdep and KASAN, it's closer to 12 hours. It just
isn't viable to run test kernels with these options day-in, day-out.
Maybe once a release I'll run a weekend sanity check with them
enabled, but otherwise the rare issue they find just isn't worth the
cost of enabling them....


> > > > we also have VM_BUG_ON_MM(!rwsem_is_write_locked(&mm->mmap_lock), mm)
> > > > to give us a good assertion when lockdep is disabled.
> > > 
> > > Is that really worth it still? I mean, much of these assertions pre-date
> > > lockdep.
> > 
> > And we're trying to propagate them because lockdep isn't a viable
> > option for day to day testing of filesystems because of it's
> > overhead vs how infrequently it finds new problems.
> 
> ... in XFS. Lockdep avoids a giant pile of broken from entering the
> kernel and the robots still report plenty.

Nobody is suggesting that lockdep gets replaced by these functions.
They are *in addition* to lockdep, and are used to give us 99.9%
certainty that locks are being used correctly without adding any
runtime overhead at all.

That's the whole point - it is simple introspection code that will
find most gross locking mistakes that people make very quickly,
without any real costs. If you're worried about people abusing
introspection code, then you're forgetting that they can just dig
at the rwsem guts directly themselves. Adding an interface
that does introspection right and enables the internal
implementation to change without breaking anything is a no brainer;
it stops people from just digging in the guts of the structure and
guaranteeing that code will break if the implementation changes...
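
To make that last point concrete, here's a sketch (not code from this
series; 'sem' stands for whatever rwsem the caller believes it holds
exclusively) of the difference between guts-digging and the helper:

	/* open coded: silently breaks if the rwsem internals ever change */
	ASSERT(atomic_long_read(&sem->count) & 1);	/* bit 0 == writer */

	/* the proposed helper tracks the implementation for us */
	ASSERT(rwsem_is_write_locked(sem));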

> > > > XFS has a problem with using lockdep in general, which is that a worker
> > > > thread can be spawned and use the fact that the spawner is holding the
> > > > lock.  There's no mechanism for the worker thread to ask "Does struct
> > > > task_struct *p hold the lock?".
> > > 
> > > Will be somewhat tricky to make happen -- but might be doable. It is
> > > however an interface that is *very* hard to use correctly. Basically I
> > > think you want to also assert that your target task 'p' is blocked,
> > > right?
> > > 
> > > That is: assert @p is blocked and holds @lock.
> > 
> > That addresses the immediate symptom; it doesn't address the large
> > problem with lockdep and needing non-owner rwsem semantics.
> > 
> > i.e. synchronous task based locking models don't work for
> > asynchronous multi-stage pipeline processing engines like XFS. The
> > lock protects the data object and follows the data object through
> > the processing pipeline, whilst the original submitter moves on to
> > the next operation to processes without blocking.
> > 
> > This is the non-blocking, async processing model that io_uring
> > development is pushing filesystems towards, so assuming that we only
> > hand a lock to a single worker task and then wait for it complete
> > (i.e. synchronous operation) flies in the face of current
> > development directions...
> 
> I was looking at things from an interface abuse perspective. How easy is
> it to do the wrong thing. As said, we've had a bunch of really dodgy
> code with the _is_locked class of functions, hence my desire to find
> something else.
> 
> As to the whole non-owner locking, yes, that's problematic. I'm not
> convinced async operations require non-owner locking, at the same time I
> do see that IO completions pose a challenge.
> 
> Coming from the schedulability and real-time corner, non-owner locks are
> a nightmare because of the inversions. So yeah, fun to be had I'm sure.

I'm not sure you understand the scope of the problem with modern
filesystems vs RT processing. The moment code enters a modern
filesystem, it gives up all hope of real-time response guarantees.
There is currently nothing an RT process can do but wait for the
filesystem to finish with the locks it holds, and the wait times are
effectively unbounded because tens of thousands of IOs may need to
be done before the lock is dropped and the RT task can make
progress.

Priority inheritance for the lock owner won't make any difference
here, because the latency is not caused by something running on a
CPU.  IOWs, lock inversions and non-owner locks are the very least
of the problems for RT priority apps when it comes to filesystem
operations.

The solution for RT priority apps to avoid priority inversions in
filesystems is going to be io_uring. i.e. the initial NOWAIT operation
is done with RT priority in the RT task itself, but if it would block
it gets punted to a background worker for async processing and the RT
priority task moves on to the next thing it needs to do.
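
As a purely illustrative sketch of that model (fs_op_submit(),
FS_OP_NOWAIT, fs_async_wq and struct fs_op are made-up names, not
actual io_uring internals):

static void rt_task_submit(struct fs_op *op)
{
	/* RT context: attempt the operation without blocking on fs locks */
	int ret = fs_op_submit(op, FS_OP_NOWAIT);

	if (ret != -EAGAIN)
		return;		/* completed (or failed) without blocking */

	/*
	 * Would have blocked: punt to a normal-priority worker and let
	 * the RT task move straight on to its next operation.
	 */
	queue_work(fs_async_wq, &op->work);
}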

All the background async operations are performed with the same
(non-RT) priority, so we just don't need to care about priority
inversions or the problems RT has with non-owner lock contexts. The
RT tasks themselves don't care, either, because they never get
stuck waiting on a filesystem lock that a lower priority task might
hold, or on an operation that might require unbounded amounts of IO
to complete (e.g. transaction reservations).

IOWs, if we want to make "RT with filesystems" a reality, we need to
stop worrying about constraining lock implementations and handling
priority inversions. Instead, we need to look at how to make
filesystem infrastructure fully non-blocking for RT priority tasks,
and at writing RT applications to use that infrastructure....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-10 22:56             ` Dave Chinner
@ 2023-09-10 23:17               ` Matthew Wilcox
  2023-09-11  0:55                 ` Dave Chinner
  0 siblings, 1 reply; 33+ messages in thread
From: Matthew Wilcox @ 2023-09-10 23:17 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Mon, Sep 11, 2023 at 08:56:45AM +1000, Dave Chinner wrote:
> On Fri, Sep 08, 2023 at 12:44:34PM +0200, Peter Zijlstra wrote:
> > Agreed, and this is fine. However there's been some very creative
> > 'use' of the _is_locked() class of functions in the past that did not
> > follow 'common' sense.
> > 
> > If all usage was: I should be holding this, lets check. I probably
> > wouldn't have this bad feeling about things.
> 
> So your argument against such an interface is essentially "we can't
> have nice things because someone might abuse them"?

Some people are very creative ...

I was thinking about how to handle this better.  We could have

static inline void rwsem_assert_locked(const struct rw_semaphore *sem)
{
	BUG_ON(atomic_long_read(&sem->count) == 0);
}

static inline void rwsem_assert_write_locked(const struct rw_semaphore *sem)
{
	BUG_ON((atomic_long_read(&sem->count) & 1) != 1);
}

but then we'd also need to change how XFS currently uses the ASSERT()
macro to be ASSERT_LOCKED(ip, flags), and I understand this code is also
used in userspace, so it'd involve changing that shim, and we're getting
way past the amount of code I'm comfortable changing, and way past the
amount of time I should be spending on this.
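
For illustration only, one hypothetical shape such a macro could take,
mapping XFS's existing lock flags onto the assert functions above (I'm
not volunteering to write this):

#define ASSERT_LOCKED(ip, flags)				\
do {								\
	if ((flags) & XFS_IOLOCK_EXCL)				\
		rwsem_assert_write_locked(&VFS_I(ip)->i_rwsem);	\
	else if ((flags) & XFS_IOLOCK_SHARED)			\
		rwsem_assert_locked(&VFS_I(ip)->i_rwsem);	\
} while (0)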

And then there'd be the inevitable bikeshedding about "don't use BUG_ON"
and it's probably just for the best if I walk away at this point,
becoming the third person to fail to remove the mrlock.

I'll keep reading this thread to see if some consensus emerges, but I'm
not optimistic.



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-10 23:17               ` Matthew Wilcox
@ 2023-09-11  0:55                 ` Dave Chinner
  2023-09-11  2:15                   ` Waiman Long
  0 siblings, 1 reply; 33+ messages in thread
From: Dave Chinner @ 2023-09-11  0:55 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Mon, Sep 11, 2023 at 12:17:18AM +0100, Matthew Wilcox wrote:
> On Mon, Sep 11, 2023 at 08:56:45AM +1000, Dave Chinner wrote:
> > On Fri, Sep 08, 2023 at 12:44:34PM +0200, Peter Zijlstra wrote:
> > > Agreed, and this is fine. However there's been some very creative
> > > 'use' of the _is_locked() class of functions in the past that did not
> > > follow 'common' sense.
> > > 
> > > If all usage was: I should be holding this, lets check. I probably
> > > wouldn't have this bad feeling about things.
> > 
> > So your argument against such an interface is essentially "we can't
> > have nice things because someone might abuse them"?
> 
> Some people are very creative ...

Sure, but that's no reason to stop anyone else from making progress.

> I was thinking about how to handle this better.  We could have
> 
> static inline void rwsem_assert_locked(const struct rw_semaphore *sem)
> {
> 	BUG_ON(atomic_long_read(&sem->count) == 0);
> }
> 
> static inline void rwsem_assert_write_locked(const struct rw_semaphore *sem)
> {
> 	BUG_ON((atomic_long_read(&sem->count) & 1) != 1);
> }

We already have CONFIG_DEBUG_RWSEMS, so we can put these
introspection interfaces inside debug code, and make any attempt to
use them outside of debug builds break the build, e.g.:

#ifdef CONFIG_DEBUG_RWSEMS
/*
 * rwsem locked checks can only be used by conditionally compiled
 * subsystem debug code. It is not valid to use them in normal
 * production code.
 */
static inline bool rwsem_is_write_locked(struct rw_semaphore *sem)
{
	....
}

static inline bool rwsem_is_locked(struct rw_semaphore *sem)
{
	....
}
#else /* !CONFIG_DEBUG_RWSEMS */
#define rwsem_is_write_locked(sem)	BUILD_BUG()
#define rwsem_is_locked(sem)		BUILD_BUG()
#endif /* CONFIG_DEBUG_RWSEMS */

And now we simply add a single line to subsystem Kconfig debug
options to turn on rwsem introspection for their debug checks like
so:

 config XFS_DEBUG
 	bool "XFS Debugging support"
 	depends on XFS_FS
+	select DEBUG_RWSEMS
 	help
 	  Say Y here to get an XFS build with many debugging features,
 	  including ASSERT checks, function wrappers around macros,

> but then we'd also need to change how XFS currently uses the ASSERT()
> macro to be ASSERT_LOCKED(ip, flags), and I understand this code is also
> used in userspace, so it'd involve changing that shim, and we're getting
> way past the amount of code I'm comfortable changing, and way past the
> amount of time I should be spending on this.
> 
> And then there'd be the inevitable bikeshedding about "don't use BUG_ON"
> and it's probably just for the best if I walk away at this point,
> becoming the third person to fail to remove the mrlock.

Yeah, which further points out how ridiculous the situation is. This
is useful debug code and it can *obviously* and *easily* be
constrained to debug environments.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-11  0:55                 ` Dave Chinner
@ 2023-09-11  2:15                   ` Waiman Long
  2023-09-11 22:29                     ` Dave Chinner
  0 siblings, 1 reply; 33+ messages in thread
From: Waiman Long @ 2023-09-11  2:15 UTC (permalink / raw)
  To: Dave Chinner, Matthew Wilcox
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, linux-kernel, linux-mm,
	Chandan Babu R, Darrick J . Wong, linux-xfs


On 9/10/23 20:55, Dave Chinner wrote:
> On Mon, Sep 11, 2023 at 12:17:18AM +0100, Matthew Wilcox wrote:
>> On Mon, Sep 11, 2023 at 08:56:45AM +1000, Dave Chinner wrote:
>>> On Fri, Sep 08, 2023 at 12:44:34PM +0200, Peter Zijlstra wrote:
>>>> Agreed, and this is fine. However there's been some very creative
>>>> 'use' of the _is_locked() class of functions in the past that did not
>>>> follow 'common' sense.
>>>>
>>>> If all usage was: I should be holding this, lets check. I probably
>>>> wouldn't have this bad feeling about things.
>>> So your argument against such an interface is essentially "we can't
>>> have nice things because someone might abuse them"?
>> Some people are very creative ...
> Sure, but that's no reason to stop anyone else from making progress.
>
>> I was thinking about how to handle this better.  We could have
>>
>> static inline void rwsem_assert_locked(const struct rw_semaphore *sem)
>> {
>> 	BUG_ON(atomic_long_read(&sem->count) == 0);
>> }
>>
>> static inline void rwsem_assert_write_locked(const struct rw_semaphore *sem)
>> {
>> 	BUG_ON((atomic_long_read(&sem->count) & 1) != 1);
>> }
> We already have CONFIG_DEBUG_RWSEMS, so we can put these
> introspection interfaces inside debug code, and make any attempt to
> use them outside of debug builds break the build. e.g:
>
> #if DEBUG_RWSEMS
> /*
>   * rwsem locked checks can only be used by conditionally compiled
>   * subsystem debug code. It is not valid to use them in normal
>   * production code.
>   */
> static inline bool rwsem_is_write_locked()
> {
> 	....
> }
>
> static inline bool rwsem_is_locked()
> {
> 	....
> }
> #else /* !DEBUG_RWSEMS */
> #define rwsem_is_write_locked()		BUILD_BUG()
> #define rwsem_is_locked()		BUILD_BUG()
> #endif /* DEBUG_RWSEMS */
>
> And now we simply add a single line to subsystem Kconfig debug
> options to turn on rwsem introspection for their debug checks like
> so:
>
>   config XFS_DEBUG
>   	bool "XFS Debugging support"
>   	depends on XFS_FS
> +	select RWSEM_DEBUG
>   	help
>   	  Say Y here to get an XFS build with many debugging features,
>   	  including ASSERT checks, function wrappers around macros,

That may be a workable compromise. Actually, I am not against having
them defined even outside of DEBUG_RWSEMS. We already have
mutex_is_locked() defined and used in a lot of places. So this is just
providing the rwsem equivalents.

I also agree that these APIs can be misused by other users. I think we
should clearly document the caveats of using these: unless the caller
has some other means of keeping the lock state stable, the result may
no longer be true right after the test. It is simply the lock state at
a certain moment in time. Callers are using them at their own risk.
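
Something along these lines in the kernel-doc, perhaps (wording is only
a suggestion):

/**
 * rwsem_is_write_locked - check whether a rwsem is currently write-locked
 * @sem: the rwsem to check
 *
 * Note: this is only a snapshot of the lock state. Unless the caller
 * itself holds the lock (or otherwise keeps the lock state stable), the
 * result may already be stale by the time it is used. It is intended
 * for "I should be holding this" style assertions, not for deciding
 * whether some other task currently holds the lock.
 */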

Cheers,
Longman




* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-11  2:15                   ` Waiman Long
@ 2023-09-11 22:29                     ` Dave Chinner
  2023-09-12  9:03                       ` Peter Zijlstra
  0 siblings, 1 reply; 33+ messages in thread
From: Dave Chinner @ 2023-09-11 22:29 UTC (permalink / raw)
  To: Waiman Long
  Cc: Matthew Wilcox, Peter Zijlstra, Ingo Molnar, Will Deacon,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Sun, Sep 10, 2023 at 10:15:59PM -0400, Waiman Long wrote:
> 
> On 9/10/23 20:55, Dave Chinner wrote:
> > On Mon, Sep 11, 2023 at 12:17:18AM +0100, Matthew Wilcox wrote:
> > > On Mon, Sep 11, 2023 at 08:56:45AM +1000, Dave Chinner wrote:
> > > > On Fri, Sep 08, 2023 at 12:44:34PM +0200, Peter Zijlstra wrote:
> > > > > Agreed, and this is fine. However there's been some very creative
> > > > > 'use' of the _is_locked() class of functions in the past that did not
> > > > > follow 'common' sense.
> > > > > 
> > > > > If all usage was: I should be holding this, lets check. I probably
> > > > > wouldn't have this bad feeling about things.
> > > > So your argument against such an interface is essentially "we can't
> > > > have nice things because someone might abuse them"?
> > > Some people are very creative ...
> > Sure, but that's no reason to stop anyone else from making progress.
> > 
> > > I was thinking about how to handle this better.  We could have
> > > 
> > > static inline void rwsem_assert_locked(const struct rw_semaphore *sem)
> > > {
> > > 	BUG_ON(atomic_long_read(&sem->count) == 0);
> > > }
> > > 
> > > static inline void rwsem_assert_write_locked(const struct rw_semaphore *sem)
> > > {
> > > 	BUG_ON((atomic_long_read(&sem->count) & 1) != 1);
> > > }
> > We already have CONFIG_DEBUG_RWSEMS, so we can put these
> > introspection interfaces inside debug code, and make any attempt to
> > use them outside of debug builds break the build. e.g:
> > 
> > #if DEBUG_RWSEMS
> > /*
> >   * rwsem locked checks can only be used by conditionally compiled
> >   * subsystem debug code. It is not valid to use them in normal
> >   * production code.
> >   */
> > static inline bool rwsem_is_write_locked()
> > {
> > 	....
> > }
> > 
> > static inline bool rwsem_is_locked()
> > {
> > 	....
> > }
> > #else /* !DEBUG_RWSEMS */
> > #define rwsem_is_write_locked()		BUILD_BUG()
> > #define rwsem_is_locked()		BUILD_BUG()
> > #endif /* DEBUG_RWSEMS */
> > 
> > And now we simply add a single line to subsystem Kconfig debug
> > options to turn on rwsem introspection for their debug checks like
> > so:
> > 
> >   config XFS_DEBUG
> >   	bool "XFS Debugging support"
> >   	depends on XFS_FS
> > +	select RWSEM_DEBUG
> >   	help
> >   	  Say Y here to get an XFS build with many debugging features,
> >   	  including ASSERT checks, function wrappers around macros,
> 
> That may be a possible compromise. Actually, I am not against having them
> defined even outside the DEBUG_RWSEMS. We already have mutex_is_locked()
> defined and used in a lot of places. So this is just providing the rwsem
> equivalents.

So, once again, we have mixed messages from the lock maintainers.
One says "no, it might get abused", another says "I'm fine with
that", and now we have a maintainer disagreement stalemate.

This is dysfunctional.

Peter, Waiman, please make a decision one way or the other about
allowing rwsems to support native write lock checking. In the
absence of an actual yes/no decision, do we assume that the
maintainers don't actually care about it and we should just
submit it straight to Linus?

-Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-11 22:29                     ` Dave Chinner
@ 2023-09-12  9:03                       ` Peter Zijlstra
  2023-09-12 12:28                         ` Matthew Wilcox
  2023-09-12 23:16                         ` Dave Chinner
  0 siblings, 2 replies; 33+ messages in thread
From: Peter Zijlstra @ 2023-09-12  9:03 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Waiman Long, Matthew Wilcox, Ingo Molnar, Will Deacon,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Tue, Sep 12, 2023 at 08:29:55AM +1000, Dave Chinner wrote:

> So, once again, we have mixed messages from the lock maintainers.
> One says "no, it might get abused", another says "I'm fine with
> that", and now we have a maintainer disagreement stalemate.

I didn't say no, I was trying to see if there are alternatives because
the is_locked pattern has a history of abuse.

If not, then sure we can do this; it's not like I managed to get rid of
mutex_is_locked() -- and I actually tried at some point :/

And just now I grepped for it, and look what I find:

drivers/hid/hid-nintendo.c:     if (unlikely(mutex_is_locked(&ctlr->output_mutex))) {
drivers/nvdimm/btt.c:           if (mutex_is_locked(&arena->err_lock)

And there's more :-(

Also, please just calm down already..



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-12  9:03                       ` Peter Zijlstra
@ 2023-09-12 12:28                         ` Matthew Wilcox
  2023-09-12 13:52                           ` Peter Zijlstra
  2023-09-12 23:16                         ` Dave Chinner
  1 sibling, 1 reply; 33+ messages in thread
From: Matthew Wilcox @ 2023-09-12 12:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Chinner, Waiman Long, Ingo Molnar, Will Deacon,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Tue, Sep 12, 2023 at 11:03:42AM +0200, Peter Zijlstra wrote:
> If not, then sure we can do this; it's not like I managed to get rid of
> muteX_is_locked() -- and I actually tried at some point :/
> 
> And just now I grepped for it, and look what I find:
> 
> drivers/hid/hid-nintendo.c:     if (unlikely(mutex_is_locked(&ctlr->output_mutex))) {
> drivers/nvdimm/btt.c:           if (mutex_is_locked(&arena->err_lock)
> 
> And there's more :-(

Are these actually abuse?  I looked at these two, and they both seem to
be asking "Does somebody else currently have this mutex?" rather than
"Do I have this mutex?".



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-12 12:28                         ` Matthew Wilcox
@ 2023-09-12 13:52                           ` Peter Zijlstra
  2023-09-12 13:58                             ` Matthew Wilcox
  2023-09-12 14:02                             ` Peter Zijlstra
  0 siblings, 2 replies; 33+ messages in thread
From: Peter Zijlstra @ 2023-09-12 13:52 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Dave Chinner, Waiman Long, Ingo Molnar, Will Deacon,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Tue, Sep 12, 2023 at 01:28:13PM +0100, Matthew Wilcox wrote:
> On Tue, Sep 12, 2023 at 11:03:42AM +0200, Peter Zijlstra wrote:
> > If not, then sure we can do this; it's not like I managed to get rid of
> > muteX_is_locked() -- and I actually tried at some point :/
> > 
> > And just now I grepped for it, and look what I find:
> > 
> > drivers/hid/hid-nintendo.c:     if (unlikely(mutex_is_locked(&ctlr->output_mutex))) {
> > drivers/nvdimm/btt.c:           if (mutex_is_locked(&arena->err_lock)
> > 
> > And there's more :-(
> 
> Are these actually abuse?  I looked at these two, and they both seem to
> be asking "Does somebody else currently have this mutex?" rather than
> "Do I have this mutex?".

It's effectively a random number generator in that capacity. Someone
might have it or might have had it when you looked and no longer have
it, or might have it now but not when you asked.




* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-12 13:52                           ` Peter Zijlstra
@ 2023-09-12 13:58                             ` Matthew Wilcox
  2023-09-12 14:23                               ` Peter Zijlstra
  2023-09-12 14:02                             ` Peter Zijlstra
  1 sibling, 1 reply; 33+ messages in thread
From: Matthew Wilcox @ 2023-09-12 13:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Chinner, Waiman Long, Ingo Molnar, Will Deacon,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Tue, Sep 12, 2023 at 03:52:13PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 12, 2023 at 01:28:13PM +0100, Matthew Wilcox wrote:
> > On Tue, Sep 12, 2023 at 11:03:42AM +0200, Peter Zijlstra wrote:
> > > If not, then sure we can do this; it's not like I managed to get rid of
> > > muteX_is_locked() -- and I actually tried at some point :/
> > > 
> > > And just now I grepped for it, and look what I find:
> > > 
> > > drivers/hid/hid-nintendo.c:     if (unlikely(mutex_is_locked(&ctlr->output_mutex))) {
> > > drivers/nvdimm/btt.c:           if (mutex_is_locked(&arena->err_lock)
> > > 
> > > And there's more :-(
> > 
> > Are these actually abuse?  I looked at these two, and they both seem to
> > be asking "Does somebody else currently have this mutex?" rather than
> > "Do I have this mutex?".
> 
> It's effectively a random number generator in that capacity. Someone
> might have it or might have had it when you looked and no longer have
> it, or might have it now but not when you asked.

Well, no.

                if (mutex_is_locked(&arena->err_lock)
                                || arena->freelist[lane].has_err) {
                        nd_region_release_lane(btt->nd_region, lane);

                        ret = arena_clear_freelist_error(arena, lane);

So that's "Is somebody currently processing an error, or have they
already finished setting an error".  Sure, it's somewhat racy, but
it looks like a performance optimisation, not something that needs
100% accuracy.

The other one's in a similar boat; an optimisation if anyone else is
currently holding this mutex:

        /*
         * Immediately after receiving a report is the most reliable time to
         * send a subcommand to the controller. Wake any subcommand senders
         * waiting for a report.
         */
        if (unlikely(mutex_is_locked(&ctlr->output_mutex))) {
                spin_lock_irqsave(&ctlr->lock, flags);
                ctlr->received_input_report = true;
                spin_unlock_irqrestore(&ctlr->lock, flags);
                wake_up(&ctlr->wait);
        }

Sure, they might not still be holding it, or it may have been grabbed
one clock tick later; that just means they miss out on this optimisation.



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-12 13:52                           ` Peter Zijlstra
  2023-09-12 13:58                             ` Matthew Wilcox
@ 2023-09-12 14:02                             ` Peter Zijlstra
  1 sibling, 0 replies; 33+ messages in thread
From: Peter Zijlstra @ 2023-09-12 14:02 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Dave Chinner, Waiman Long, Ingo Molnar, Will Deacon,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Tue, Sep 12, 2023 at 03:52:13PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 12, 2023 at 01:28:13PM +0100, Matthew Wilcox wrote:
> > On Tue, Sep 12, 2023 at 11:03:42AM +0200, Peter Zijlstra wrote:
> > > If not, then sure we can do this; it's not like I managed to get rid of
> > > muteX_is_locked() -- and I actually tried at some point :/
> > > 
> > > And just now I grepped for it, and look what I find:
> > > 
> > > drivers/hid/hid-nintendo.c:     if (unlikely(mutex_is_locked(&ctlr->output_mutex))) {
> > > drivers/nvdimm/btt.c:           if (mutex_is_locked(&arena->err_lock)
> > > 
> > > And there's more :-(
> > 
> > Are these actually abuse?  I looked at these two, and they both seem to
> > be asking "Does somebody else currently have this mutex?" rather than
> > "Do I have this mutex?".
> 
> It's effectively a random number generator in that capacity. Someone
> might have it or might have had it when you looked and no longer have
> it, or might have it now but not when you asked.

Also, there's more fun; the 'is_locked' store from spin_lock() (or
mutex, or whatever) is not ordered vs any other write inside the
critical section.

So something like:

	bar = 0;

	CPU0			CPU1

	spin_lock(&foo);
	bar = 1;		x = READ_ONCE(bar);
				y = spin_is_locked(&foo);
	spin_unlock(&foo);


can have x==1 && y==0, even though CPU0 is currently inside the critical
section.

Normally that doesn't matter, and for the program-order case where you
ask 'am I holding the lock' this obviously cannot go wrong. But the
moment you ask: 'is someone else holding the lock' it all goes sideways
real fast.

We've been there, done that, got a t-shirt etc..



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-12 13:58                             ` Matthew Wilcox
@ 2023-09-12 14:23                               ` Peter Zijlstra
  2023-09-12 15:27                                 ` Darrick J. Wong
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Zijlstra @ 2023-09-12 14:23 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Dave Chinner, Waiman Long, Ingo Molnar, Will Deacon,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Tue, Sep 12, 2023 at 02:58:32PM +0100, Matthew Wilcox wrote:
> On Tue, Sep 12, 2023 at 03:52:13PM +0200, Peter Zijlstra wrote:
> > On Tue, Sep 12, 2023 at 01:28:13PM +0100, Matthew Wilcox wrote:
> > > On Tue, Sep 12, 2023 at 11:03:42AM +0200, Peter Zijlstra wrote:
> > > > If not, then sure we can do this; it's not like I managed to get rid of
> > > > muteX_is_locked() -- and I actually tried at some point :/
> > > > 
> > > > And just now I grepped for it, and look what I find:
> > > > 
> > > > drivers/hid/hid-nintendo.c:     if (unlikely(mutex_is_locked(&ctlr->output_mutex))) {
> > > > drivers/nvdimm/btt.c:           if (mutex_is_locked(&arena->err_lock)
> > > > 
> > > > And there's more :-(
> > > 
> > > Are these actually abuse?  I looked at these two, and they both seem to
> > > be asking "Does somebody else currently have this mutex?" rather than
> > > "Do I have this mutex?".
> > 
> > It's effectively a random number generator in that capacity. Someone
> > might have it or might have had it when you looked and no longer have
> > it, or might have it now but not when you asked.
> 
> Well, no.
> 
>                 if (mutex_is_locked(&arena->err_lock)
>                                 || arena->freelist[lane].has_err) {
>                         nd_region_release_lane(btt->nd_region, lane);
> 
>                         ret = arena_clear_freelist_error(arena, lane);
> 
> So that's "Is somebody currently processing an error, or have they
> already finished setting an error".  Sure, it's somewhat racy, but
> it looks like a performance optimisation, not something that needs
> 100% accuracy.

We're arguing past one another, I think. Yes, mutex_is_locked() is a
random number generator when asked about something you don't own. But it
might not be a bug because the code is OK with races.

It is still fully dodgy IMO; such usage is pretty close to UB.



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-12 14:23                               ` Peter Zijlstra
@ 2023-09-12 15:27                                 ` Darrick J. Wong
  2023-09-13  8:59                                   ` Peter Zijlstra
  0 siblings, 1 reply; 33+ messages in thread
From: Darrick J. Wong @ 2023-09-12 15:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Matthew Wilcox, Dave Chinner, Waiman Long, Ingo Molnar,
	Will Deacon, linux-kernel, linux-mm, Chandan Babu R, linux-xfs

On Tue, Sep 12, 2023 at 04:23:00PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 12, 2023 at 02:58:32PM +0100, Matthew Wilcox wrote:
> > On Tue, Sep 12, 2023 at 03:52:13PM +0200, Peter Zijlstra wrote:
> > > On Tue, Sep 12, 2023 at 01:28:13PM +0100, Matthew Wilcox wrote:
> > > > On Tue, Sep 12, 2023 at 11:03:42AM +0200, Peter Zijlstra wrote:
> > > > > If not, then sure we can do this; it's not like I managed to get rid of
> > > > > muteX_is_locked() -- and I actually tried at some point :/
> > > > > 
> > > > > And just now I grepped for it, and look what I find:
> > > > > 
> > > > > drivers/hid/hid-nintendo.c:     if (unlikely(mutex_is_locked(&ctlr->output_mutex))) {
> > > > > drivers/nvdimm/btt.c:           if (mutex_is_locked(&arena->err_lock)
> > > > > 
> > > > > And there's more :-(
> > > > 
> > > > Are these actually abuse?  I looked at these two, and they both seem to
> > > > be asking "Does somebody else currently have this mutex?" rather than
> > > > "Do I have this mutex?".
> > > 
> > > It's effectively a random number generator in that capacity. Someone
> > > might have it or might have had it when you looked and no longer have
> > > it, or might have it now but not when you asked.
> > 
> > Well, no.
> > 
> >                 if (mutex_is_locked(&arena->err_lock)
> >                                 || arena->freelist[lane].has_err) {
> >                         nd_region_release_lane(btt->nd_region, lane);
> > 
> >                         ret = arena_clear_freelist_error(arena, lane);
> > 
> > So that's "Is somebody currently processing an error, or have they
> > already finished setting an error".  Sure, it's somewhat racy, but
> > it looks like a performance optimisation, not something that needs
> > 100% accuracy.
> 
> We're arguing past one another I think. Yes mutex_is_locked() is a
> random number generator when asked for something you don't own. But it
> might not be a bug because the code is ok with races.
> 
> It is still fully dodgy IMO, such usage is pretty close to UB.

My 2 cents here:

I could live with Longman's suggestion of an rwsem_assert_is_locked that
only exists if DEBUG_RWSEMS is enabled.  Something like:

#ifdef CONFIG_DEBUG_RWSEMS
static inline bool __rwsem_assert_is_locked(struct rw_semaphore *rwsem,
		const char *file, int line)
{
	bool ret = rwsem_is_locked(rwsem);
	if (!ret)
		WARN(1, "!rwsem_is_locked(rwsem) at %s line %d", file, line);
	return ret;
}
#define rwsem_assert_is_locked(r) \
	__rwsem_assert_is_locked((r), __FILE__, __LINE__)
#endif

and then XFS could do:

	ASSERT(rwsem_assert_is_locked(&VFS_I(ip)->i_rwsem));

Wherein ASSERT is only #defined if CONFIG_XFS_DEBUG, and XFS_DEBUG
selects DEBUG_RWSEMS, per Longman's suggestion.  That'd work for what we
want it for (simple, cheap lock checking) without becoming a general
lock-abuse predicate.
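
A write-locked flavour could presumably follow the same pattern (again
only a sketch; the bit-0 test mirrors what Matthew posted earlier in
the thread for the non-RT rwsem):

#ifdef CONFIG_DEBUG_RWSEMS
static inline bool __rwsem_assert_is_write_locked(struct rw_semaphore *rwsem,
		const char *file, int line)
{
	/* bit 0 of ->count is the writer-held bit */
	bool ret = (atomic_long_read(&rwsem->count) & 1) != 0;

	if (!ret)
		WARN(1, "!rwsem_is_write_locked(rwsem) at %s line %d",
		     file, line);
	return ret;
}
#define rwsem_assert_is_write_locked(r) \
	__rwsem_assert_is_write_locked((r), __FILE__, __LINE__)
#endif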

--D



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-12  9:03                       ` Peter Zijlstra
  2023-09-12 12:28                         ` Matthew Wilcox
@ 2023-09-12 23:16                         ` Dave Chinner
  1 sibling, 0 replies; 33+ messages in thread
From: Dave Chinner @ 2023-09-12 23:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Matthew Wilcox, Ingo Molnar, Will Deacon,
	linux-kernel, linux-mm, Chandan Babu R, Darrick J . Wong,
	linux-xfs

On Tue, Sep 12, 2023 at 11:03:42AM +0200, Peter Zijlstra wrote:
> On Tue, Sep 12, 2023 at 08:29:55AM +1000, Dave Chinner wrote:
> 
> > So, once again, we have mixed messages from the lock maintainers.
> > One says "no, it might get abused", another says "I'm fine with
> > that", and now we have a maintainer disagreement stalemate.
> 
> I didn't say no, I was trying to see if there's alternatives because the
> is_locked pattern has a history of abuse.

Yet you haven't suggested or commented on the proposed methods to
avoid abuse - you are still arguing that it can be abused. Go back
and read what I proposed before continuing to argue about
mutex_is_locked()....

> If not, then sure we can do this; it's not like I managed to get rid of
> muteX_is_locked() -- and I actually tried at some point :/
> 
> And just now I grepped for it, and look what I find:
> 
> drivers/hid/hid-nintendo.c:     if (unlikely(mutex_is_locked(&ctlr->output_mutex))) {
> drivers/nvdimm/btt.c:           if (mutex_is_locked(&arena->err_lock)
> 
> And there's more :-(

.... like this.

Seriously: if we put it behind CONFIG_DEBUG_RWSEMS, and only turn
that on when other subsystem debug code wants the rwsem
introspection, why does anyone outside that subsystem even care how
it gets used? It won't even get used in production kernels, because
nobody will turn something on that requires rwsem debug in a
production kernel.

If you are still concerned that this will happen, then do the same
as we've done for trace_printk() and other debug-only functions:
dump a big warning at boot time that rwsem debug is enabled and this
is not a supported production kernel configuration.
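
Sketch only, but something as simple as this would cover it:

#ifdef CONFIG_DEBUG_RWSEMS
static int __init rwsem_debug_notice(void)
{
	pr_warn("** NOTICE: DEBUG_RWSEMS is enabled - this is not a "
		"supported production kernel configuration **\n");
	return 0;
}
late_initcall(rwsem_debug_notice);
#endif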

> Also, please just calm down already..

I'm perfectly calm and relaxed.  Asking for a definitive decision
between co-maintainers who are giving decidedly mixed signals is a
very reasonable request to make.  Just because you may not like what
such a request implies about how the code is being maintained, it
doesn't mean I'm the slightest bit upset, hysterical or irrational.

-Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: [PATCH 1/5] locking: Add rwsem_is_write_locked()
  2023-09-12 15:27                                 ` Darrick J. Wong
@ 2023-09-13  8:59                                   ` Peter Zijlstra
  0 siblings, 0 replies; 33+ messages in thread
From: Peter Zijlstra @ 2023-09-13  8:59 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Matthew Wilcox, Dave Chinner, Waiman Long, Ingo Molnar,
	Will Deacon, linux-kernel, linux-mm, Chandan Babu R, linux-xfs

On Tue, Sep 12, 2023 at 08:27:15AM -0700, Darrick J. Wong wrote:

> I could live with Longman's suggestion of an rwsem_assert_is_locked that
> only exists if DEBUG_RWSEMS is enabled.  Something like:
> 
> #ifdef CONFIG_DEBUG_RWSEMS
> static inline bool __rwsem_assert_is_locked(struct rw_semaphore *rwsem,
> 		const char *file, int line)
> {
> 	bool ret = rwsem_is_locked(rwsem);
> 	if (!ret)
> 		WARN(1, "!rwsem_is_locked(rwsem) at %s line %d", file, line);
> 	return ret;
> }
> #define rwsem_assert_is_locked(r) \
> 	__rwsem_assert_is_locked((r), __FILE__, __LINE__)
> #endif
> 
> and then XFS could do:
> 
> 	ASSERT(rwsem_assert_is_locked(&VFS_I(ip)->i_rwsem));
> 
> Wherein ASSERT is only #defined if CONFIG_XFS_DEBUG, and XFS_DEBUG
> selects DEBUG_RWSEMS, per Longman's suggestion.  That's work for what we
> want it for (simple cheap lock checking) without becoming a general
> lockabuse predicate.

Ack.



Thread overview: 33+ messages
2023-09-07 17:47 [PATCH 0/5] Remove the XFS mrlock Matthew Wilcox (Oracle)
2023-09-07 17:47 ` [PATCH 1/5] locking: Add rwsem_is_write_locked() Matthew Wilcox (Oracle)
2023-09-07 18:05   ` Waiman Long
2023-09-07 19:33     ` Matthew Wilcox
2023-09-07 21:06       ` Waiman Long
2023-09-07 23:47         ` Waiman Long
2023-09-08  0:44           ` Dave Chinner
2023-09-07 19:08   ` Peter Zijlstra
2023-09-07 19:20     ` Matthew Wilcox
2023-09-07 19:38       ` Peter Zijlstra
2023-09-07 23:00         ` Dave Chinner
2023-09-08 10:44           ` Peter Zijlstra
2023-09-10 22:56             ` Dave Chinner
2023-09-10 23:17               ` Matthew Wilcox
2023-09-11  0:55                 ` Dave Chinner
2023-09-11  2:15                   ` Waiman Long
2023-09-11 22:29                     ` Dave Chinner
2023-09-12  9:03                       ` Peter Zijlstra
2023-09-12 12:28                         ` Matthew Wilcox
2023-09-12 13:52                           ` Peter Zijlstra
2023-09-12 13:58                             ` Matthew Wilcox
2023-09-12 14:23                               ` Peter Zijlstra
2023-09-12 15:27                                 ` Darrick J. Wong
2023-09-13  8:59                                   ` Peter Zijlstra
2023-09-12 14:02                             ` Peter Zijlstra
2023-09-12 23:16                         ` Dave Chinner
2023-09-08  0:01         ` Matthew Wilcox
2023-09-07 17:47 ` [PATCH 2/5] mm: Use rwsem_is_write_locked in mmap_assert_write_locked Matthew Wilcox (Oracle)
2023-09-07 17:47 ` [PATCH 3/5] xfs: Use rwsem_is_write_locked() Matthew Wilcox (Oracle)
2023-09-08  9:09   ` Christoph Hellwig
2023-09-08  9:10     ` Christoph Hellwig
2023-09-07 17:47 ` [PATCH 4/5] xfs: Remove mrlock wrapper Matthew Wilcox (Oracle)
2023-09-07 17:47 ` [PATCH 5/5] xfs: Stop using lockdep to assert that locks are held Matthew Wilcox (Oracle)
