From: Vlastimil Babka <vbabka@suse.cz>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Shakeel Butt <shakeel.butt@linux.dev>,
Jann Horn <jannh@google.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-rt-devel@lists.linux.dev,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
Boqun Feng <boqun.feng@gmail.com>,
Waiman Long <longman@redhat.com>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Clark Williams <clrkwllms@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH v2 1/2] mm/vma: use lockdep where we can, reduce duplication
Date: Tue, 20 Jan 2026 14:53:30 +0100
Message-ID: <30f843d9-03cf-4c7c-8a29-8e11b12e47e4@suse.cz>
In-Reply-To: <bd410fd91d22eef37f9b617327cafce33b267b09.1768855783.git.lorenzo.stoakes@oracle.com>
On 1/19/26 21:59, Lorenzo Stoakes wrote:
> We introduce vma_is_read_locked(), which must deal with the case in which
> VMA write lock sets refcnt to VMA_LOCK_OFFSET or VMA_LOCK_OFFSET +
> 1. Luckily is_vma_writer_only() already exists which we can use to check
> this.
So I think there's a bit of a caveat in that
- is_vma_writer_only() may be a false positive if there is a temporary
  reader of a detached vma (per the comments in vma_mark_detached())
- hence vma_is_read_locked() may be a false negative
- hence vma_assert_locked() might wrongly assume that we should not assert
  being a reader, so we vma_assert_write_locked() instead, and fail
However the above should mean it could be only us who is the temporary
reader. And we are not going to use vma_assert_locked() during the temporary
reader part (in vma_start_read()).
So it's probably fine, but maybe worth some comments to prevent people
getting suspicious and reconstructing this?
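To make the refcount states above concrete, here is a minimal user-space
model (not kernel code). It assumes that attachment contributes 1 to the
refcount, each reader contributes 1, a writer adds VMA_LOCK_OFFSET, and that
is_vma_writer_only() amounts to "writer bit set and refcnt at most
VMA_LOCK_OFFSET + 1". The constant's value and all model_* names are made up
for illustration.

```c
#include <stdbool.h>

/* Illustrative value only; the real constant lives in the kernel headers. */
#define MODEL_VMA_LOCK_OFFSET 0x40000000u

/* Build a refcount from its assumed components. */
static unsigned int model_refcnt(bool attached, unsigned int readers,
				 bool writer)
{
	return (attached ? 1u : 0u) + readers +
	       (writer ? MODEL_VMA_LOCK_OFFSET : 0u);
}

/*
 * Assumed equivalent of is_vma_writer_only(): a writer is present and
 * there is at most one other reference.
 */
static bool model_is_writer_only(unsigned int refcnt)
{
	return (refcnt & MODEL_VMA_LOCK_OFFSET) &&
	       refcnt <= MODEL_VMA_LOCK_OFFSET + 1;
}

/* Same logic as vma_is_read_locked() in the quoted patch. */
static bool model_is_read_locked(unsigned int refcnt)
{
	return refcnt > 1 && !model_is_writer_only(refcnt);
}
```

In this model, "attached + writer" and "detached + writer + one temporary
reader" both produce VMA_LOCK_OFFSET + 1, so model_is_read_locked() reports
false for the latter even though a reader exists.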
But I think perhaps also vma_assert_locked() could, with lockdep enabled
(similarly to vma_assert_stabilised() in patch 2), use the
"lock_is_held(&vma->vmlock_dep_map)" condition (without immediately
asserting it) for the primary reader vs writer decision, and not rely on
vma_is_read_locked()? Because lockdep has the precise information.
It would likely make things more ugly, or require more refactoring, but
hopefully worthwhile?
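A rough user-space sketch of that idea, only to show the difference in the
reader vs writer decision. It is not a proposed implementation: the
lockdep_read_held field stands in for lock_is_held(&vma->vmlock_dep_map), the
refcount check is an assumed equivalent of is_vma_writer_only(), and the
constant's value and all model_* names are hypothetical.

```c
#include <stdbool.h>

/* Illustrative value only; the real constant lives in the kernel headers. */
#define MODEL_VMA_LOCK_OFFSET 0x40000000u

/* Which assertion vma_assert_locked() would go on to perform. */
enum model_check {
	MODEL_ASSERT_READ_HELD,		/* lockdep assertion on the read lock */
	MODEL_ASSERT_WRITE_LOCKED,	/* vma_assert_write_locked() */
};

struct model_vma {
	unsigned int refcnt;		/* stand-in for vm_refcnt */
	bool lockdep_read_held;		/* stand-in for lock_is_held() */
};

/* Refcount-based check, same logic as vma_is_read_locked() in the patch. */
static bool model_is_read_locked(unsigned int refcnt)
{
	bool writer_only = (refcnt & MODEL_VMA_LOCK_OFFSET) &&
			   refcnt <= MODEL_VMA_LOCK_OFFSET + 1;

	return refcnt > 1 && !writer_only;
}

/* Decision as in the quoted patch: based on the refcount alone. */
static enum model_check model_pick_by_refcnt(const struct model_vma *vma)
{
	return model_is_read_locked(vma->refcnt) ? MODEL_ASSERT_READ_HELD
						 : MODEL_ASSERT_WRITE_LOCKED;
}

/* Suggested alternative: with lockdep enabled, let lockdep's view decide. */
static enum model_check model_pick_by_lockdep(const struct model_vma *vma)
{
	return vma->lockdep_read_held ? MODEL_ASSERT_READ_HELD
				      : MODEL_ASSERT_WRITE_LOCKED;
}
```

For a state with refcnt == VMA_LOCK_OFFSET + 1 where lockdep says we hold the
read lock, the refcount-based decision picks the write-lock assertion while
the lockdep-based one picks the read-lock assertion. In every other state the
model can express where lockdep agrees with the refcount, the two decisions
match.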
> We then try to make vma_assert_locked() use lockdep as far as we can.
>
> Unfortunately the VMA lock implementation does not even try to track VMA
> write locks using lockdep, so we cannot track the lock this way.
>
> This is less egregious than it might seem as VMA write locks are predicated
> on mmap write locks, which we do lockdep assert.
>
> vma_assert_write_locked() already asserts the mmap write lock is taken so
> we get that checked implicitly.
> However for read locks we do indeed use lockdep, via rwsem_acquire_read()
> called in vma_start_read() and rwsem_release_read() called in
> vma_refcount_put() called in turn by vma_end_read().
>
> Therefore we perform a lockdep assertion if the VMA is known to be
> read-locked.
>
> If it is write-locked, we assert the mmap lock instead, with a lockdep
> check if lockdep is enabled.
>
> If lockdep is not enabled, we just check that locks are in place.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
> include/linux/mmap_lock.h | 34 ++++++++++++++++++++++++++++++----
> 1 file changed, 30 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
> index b50416fbba20..6979222882f1 100644
> --- a/include/linux/mmap_lock.h
> +++ b/include/linux/mmap_lock.h
> @@ -236,6 +236,13 @@ int vma_start_write_killable(struct vm_area_struct *vma)
>  	return __vma_start_write(vma, mm_lock_seq, TASK_KILLABLE);
>  }
>  
> +static inline bool vma_is_read_locked(const struct vm_area_struct *vma)
> +{
> +	const unsigned int refcnt = refcount_read(&vma->vm_refcnt);
> +
> +	return refcnt > 1 && !is_vma_writer_only(refcnt);
> +}
> +
>  static inline void vma_assert_write_locked(struct vm_area_struct *vma)
>  {
>  	unsigned int mm_lock_seq;
> @@ -243,12 +250,31 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma)
>  	VM_BUG_ON_VMA(!__is_vma_write_locked(vma, &mm_lock_seq), vma);
>  }
>  
> +/**
> + * vma_assert_locked() - Assert that @vma is either read or write locked and
> + * that we have ownership of that lock (if lockdep is enabled).
> + * @vma: The VMA we assert.
> + *
> + * If lockdep is enabled, we ensure ownership of the VMA lock. Otherwise we
> + * assert that we are VMA write-locked, which implicitly asserts that we hold
> + * the mmap write lock.
> + */
>  static inline void vma_assert_locked(struct vm_area_struct *vma)
>  {
> -	unsigned int mm_lock_seq;
> -
> -	VM_BUG_ON_VMA(refcount_read(&vma->vm_refcnt) <= 1 &&
> -		      !__is_vma_write_locked(vma, &mm_lock_seq), vma);
> +	/*
> +	 * VMA locks currently only utilise lockdep for read locks, as
> +	 * vma_end_write_all() releases an unknown number of VMA write locks and
> +	 * we don't currently walk the maple tree to identify which locks are
> +	 * released even under CONFIG_LOCKDEP.
> +	 *
> +	 * However, VMA write locks are predicated on an mmap write lock, which
> +	 * we DO track under lockdep, and which vma_assert_write_locked()
> +	 * asserts.
> +	 */
> +	if (vma_is_read_locked(vma))
> +		lockdep_assert(lock_is_held(&vma->vmlock_dep_map));
> +	else
> +		vma_assert_write_locked(vma);
>  }
>  
>  static inline bool vma_is_attached(struct vm_area_struct *vma)