From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Oliver Sang <oliver.sang@intel.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: Re: [PATCH] mm/mmap: Clean up validate_mm() calls
Date: Wed, 5 Jul 2023 16:46:29 -0400
Message-ID: <20230705204629.clctvnx4qdqoexyp@revolver>
In-Reply-To: <20230704184752.6lwrytfirr4huu34@revolver>
[-- Attachment #1: Type: text/plain, Size: 2324 bytes --]
* Liam R. Howlett <Liam.Howlett@Oracle.com> [230704 14:47]:
> * Linus Torvalds <torvalds@linux-foundation.org> [230704 14:36]:
> > On Tue, 4 Jul 2023 at 11:25, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > >
> > > validate_mm() calls are too spread out and duplicated in numerous
> > > locations. Also, now that the stack write is done under the write lock,
> > > it is not necessary to validate the mm prior to write operations.
> >
> > So while I applied the fixes directly since I was doing all the
> > write-locking stuff (and asked for the locking cleanup), I'm hoping
> > these kinds of cleanups will now go back to normal and go through
> > Andrew.
> >
> > I do have a question related to the write locking: now that we should
> > always hold the mmap lock for writing when doing any modifications,
> > can the "lock_is_held()" assertions be tightened?
> >
> > Right now it's "any locking", but for actual modification it should
> > probably be using
> >
> > lockdep_is_held_type(mt->ma_external_lock, 1)
For completeness of the email thread: it turns out we want 0 as the last
parameter.
(include/linux/lockdep.h)
/*
* Acquire a lock.
*
* Values for "read":
*
* 0: exclusive (write) acquire
* 1: read-acquire (no recursion allowed)
* 2: read-acquire with same-instance recursion allowed
*
* Values for check:
*
* 0: simple checks (freeing, held-at-exit-time, etc.)
* 1: full validation
*/
...
/*
* Same "read" as for lock_acquire(), except -1 means any.
*/
extern int lock_is_held_type(const struct lockdep_map *lock, int read);
> >
> > but there's just one 'mt_lock_is_held()' function (presumably because
> > the internal lock is always just a spinlock that doesn't have the
> > reader/writer distinction).
>
> Ah, yes. I was trying to do just that, but ran into an issue and backed
> out of fully fixing this portion up until later.
>
Here are two patches to increase the strictness of the maple tree
locking. I've boot tested them on x86_64 with the bot's config and
confirmed the lockdep problem is resolved.
The first introduces the new mt_write_locked() function, which verifies
that the lock is held for writing.
The second updates the munmap path to avoid triggering warnings when
the mmap_lock is dropped prior to freeing the detached VMAs.
Thanks,
Liam
[-- Attachment #2: 0001-maple_tree-Be-more-strict-about-locking.patch --]
[-- Type: text/x-diff, Size: 2895 bytes --]
From c214e54a20258ca9c3ff787b435b04a1d900ad21 Mon Sep 17 00:00:00 2001
From: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Date: Wed, 5 Jul 2023 12:37:47 -0400
Subject: [PATCH 1/2] maple_tree: Be more strict about locking
Use lockdep to check that the maple tree write path holds the lock in
write mode.
Introduce mt_write_lock_is_held() to check if the lock is held for
writing. Update the necessary checks for rcu_dereference_protected() to
use the new write lock check.
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
include/linux/maple_tree.h | 12 ++++++++++--
lib/maple_tree.c | 10 ++++++++--
2 files changed, 18 insertions(+), 4 deletions(-)
diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 295548cca8b3..f856d67a5d7c 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -184,12 +184,20 @@ enum maple_type {
#ifdef CONFIG_LOCKDEP
typedef struct lockdep_map *lockdep_map_p;
-#define mt_lock_is_held(mt) lock_is_held(mt->ma_external_lock)
+#define mt_write_lock_is_held(mt) \
+ (!(mt)->ma_external_lock || \
+ lock_is_held_type((mt)->ma_external_lock, 0))
+
+#define mt_lock_is_held(mt) \
+ (!(mt)->ma_external_lock || \
+ lock_is_held((mt)->ma_external_lock))
+
#define mt_set_external_lock(mt, lock) \
(mt)->ma_external_lock = &(lock)->dep_map
#else
typedef struct { /* nothing */ } lockdep_map_p;
-#define mt_lock_is_held(mt) 1
+#define mt_lock_is_held(mt) 1
+#define mt_write_lock_is_held(mt) 1
#define mt_set_external_lock(mt, lock) do { } while (0)
#endif
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index bfffbb7cab26..1c9eab89e34b 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -804,6 +804,12 @@ static inline void __rcu **ma_slots(struct maple_node *mn, enum maple_type mt)
}
}
+static inline bool mt_write_locked(const struct maple_tree *mt)
+{
+ return mt_external_lock(mt) ? mt_write_lock_is_held(mt) :
+ lockdep_is_held(&mt->ma_lock);
+}
+
static inline bool mt_locked(const struct maple_tree *mt)
{
return mt_external_lock(mt) ? mt_lock_is_held(mt) :
@@ -819,7 +825,7 @@ static inline void *mt_slot(const struct maple_tree *mt,
static inline void *mt_slot_locked(struct maple_tree *mt, void __rcu **slots,
unsigned char offset)
{
- return rcu_dereference_protected(slots[offset], mt_locked(mt));
+ return rcu_dereference_protected(slots[offset], mt_write_locked(mt));
}
/*
* mas_slot_locked() - Get the slot value when holding the maple tree lock.
@@ -862,7 +868,7 @@ static inline void *mas_root(struct ma_state *mas)
static inline void *mt_root_locked(struct maple_tree *mt)
{
- return rcu_dereference_protected(mt->ma_root, mt_locked(mt));
+ return rcu_dereference_protected(mt->ma_root, mt_write_locked(mt));
}
/*
--
2.39.2
[-- Attachment #3: 0002-mm-mmap-Change-detached-vma-locking-scheme.patch --]
[-- Type: text/x-diff, Size: 1065 bytes --]
From 58fd73f90e7331678a728ada9bf92013105afbc1 Mon Sep 17 00:00:00 2001
From: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Date: Wed, 5 Jul 2023 14:47:49 -0400
Subject: [PATCH 2/2] mm/mmap: Change detached vma locking scheme
Don't set the external lock to the mmap_lock, so that the detached VMA
tree does not complain about being unlocked when the mmap_lock is
dropped prior to freeing the tree.
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
mm/mmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 964a8aa59297..3bb5a4e1f4f1 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2426,7 +2426,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
unsigned long locked_vm = 0;
MA_STATE(mas_detach, &mt_detach, 0, 0);
mt_init_flags(&mt_detach, vmi->mas.tree->ma_flags & MT_FLAGS_LOCK_MASK);
- mt_set_external_lock(&mt_detach, &mm->mmap_lock);
+ mt_detach.ma_external_lock = NULL;
/*
* If we need to split any vma, do it now to save pain later.
--
2.39.2
Thread overview: 4+ messages
2023-07-04 18:24 Liam R. Howlett
2023-07-04 18:36 ` Linus Torvalds
2023-07-04 18:47   ` Liam R. Howlett
2023-07-05 20:46     ` Liam R. Howlett [this message]