From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: peterz@infradead.org, willy@infradead.org,
	liam.howlett@oracle.com,  lorenzo.stoakes@oracle.com,
	mhocko@suse.com, vbabka@suse.cz,  hannes@cmpxchg.org,
	mjguzik@gmail.com, oliver.sang@intel.com,
	 mgorman@techsingularity.net, david@redhat.com,
	peterx@redhat.com,  oleg@redhat.com, dave@stgolabs.net,
	paulmck@kernel.org, brauner@kernel.org,  dhowells@redhat.com,
	hdanton@sina.com, hughd@google.com,  lokeshgidra@google.com,
	minchan@google.com, jannh@google.com,  shakeel.butt@linux.dev,
	souravpanda@google.com, pasha.tatashin@soleen.com,
	 klarasmodin@gmail.com, richard.weiyang@gmail.com,
	corbet@lwn.net,  linux-doc@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,  kernel-team@android.com,
	surenb@google.com,  "Liam R. Howlett" <Liam.Howlett@Oracle.com>
Subject: [PATCH v8 16/16] docs/mm: document latest changes to vm_lock
Date: Wed,  8 Jan 2025 18:30:25 -0800
Message-ID: <20250109023025.2242447-17-surenb@google.com>
In-Reply-To: <20250109023025.2242447-1-surenb@google.com>

Change the documentation to reflect that vm_lock is integrated into vma
and replaced with vm_refcnt.
Document the newly introduced vma_start_read_locked{_nested} functions.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
---
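For readers who want to experiment with the scheme the documentation change
below describes, here is a minimal userspace sketch in C11. It is an
illustrative model only, not the kernel implementation: every model_* name,
MODEL_WRITER_BIT and its value, and the initial counter values are
assumptions made for this example. It models read locks raising a reference
count and failing when the VMA's sequence number matches that of the mm, and
write locks setting a writer bit, waiting for readers, and then publishing
the sequence number.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical writer flag; the bit position is illustrative only. */
#define MODEL_WRITER_BIT 0x40000000u

struct model_mm {
	atomic_uint mm_lock_seq;	/* bumped when the mmap write lock is dropped */
};

struct model_vma {
	struct model_mm *mm;
	atomic_uint vm_refcnt;		/* reader count plus the writer bit */
	atomic_uint vm_lock_seq;	/* equals mm_lock_seq while write-locked */
};

static void model_init(struct model_mm *mm, struct model_vma *vma)
{
	atomic_init(&mm->mm_lock_seq, 1);
	vma->mm = mm;
	atomic_init(&vma->vm_refcnt, 0);
	atomic_init(&vma->vm_lock_seq, 0);	/* differs from the mm: unlocked */
}

/* Optimistic read lock: returns false instead of waiting. */
static bool model_vma_start_read(struct model_vma *vma)
{
	unsigned int old = atomic_load(&vma->vm_refcnt);

	/* Take a reference unless a writer has announced itself. */
	do {
		if (old & MODEL_WRITER_BIT)
			return false;
	} while (!atomic_compare_exchange_weak(&vma->vm_refcnt, &old, old + 1));

	/* Matching sequence numbers mean the VMA is write-locked. */
	if (atomic_load(&vma->vm_lock_seq) ==
	    atomic_load(&vma->mm->mm_lock_seq)) {
		atomic_fetch_sub(&vma->vm_refcnt, 1);
		return false;
	}
	return true;
}

static void model_vma_end_read(struct model_vma *vma)
{
	unsigned int old = atomic_fetch_sub(&vma->vm_refcnt, 1);

	assert((old & ~MODEL_WRITER_BIT) > 0);	/* must have held a reference */
}

/* The caller is assumed to hold the mmap write lock (not modelled here). */
static void model_vma_start_write(struct model_vma *vma)
{
	/* Announce the writer so that new readers back off. */
	atomic_fetch_or(&vma->vm_refcnt, MODEL_WRITER_BIT);

	/* Wait for existing readers to leave; this model simply spins. */
	while (atomic_load(&vma->vm_refcnt) != MODEL_WRITER_BIT)
		;

	/* Publish the write-locked state, then clear the writer bit. */
	atomic_store(&vma->vm_lock_seq, atomic_load(&vma->mm->mm_lock_seq));
	atomic_fetch_and(&vma->vm_refcnt, ~MODEL_WRITER_BIT);
}

/* Dropping the mmap write lock releases every VMA write lock at once. */
static void model_mmap_write_unlock(struct model_mm *mm)
{
	atomic_fetch_add(&mm->mm_lock_seq, 1);
}
```

In this model a read lock succeeds on a freshly initialised VMA, fails after
model_vma_start_write() until model_mmap_write_unlock() bumps the mm sequence
number, and succeeds again afterwards. This mirrors why no vma_end_write
counterpart is needed.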
 Documentation/mm/process_addrs.rst | 44 ++++++++++++++++++------------
 1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
index 81417fa2ed20..f573de936b5d 100644
--- a/Documentation/mm/process_addrs.rst
+++ b/Documentation/mm/process_addrs.rst
@@ -716,9 +716,14 @@ calls :c:func:`!rcu_read_lock` to ensure that the VMA is looked up in an RCU
 critical section, then attempts to VMA lock it via :c:func:`!vma_start_read`,
 before releasing the RCU lock via :c:func:`!rcu_read_unlock`.
 
-VMA read locks hold the read lock on the :c:member:`!vma->vm_lock` semaphore for
-their duration and the caller of :c:func:`!lock_vma_under_rcu` must release it
-via :c:func:`!vma_end_read`.
+In cases where the user already holds the mmap read lock, :c:func:`!vma_start_read_locked`
+and :c:func:`!vma_start_read_locked_nested` can be used. These functions do not
+fail due to lock contention, but the caller should still check their return values
+in case they fail for other reasons.
+
+VMA read locks increment the :c:member:`!vma.vm_refcnt` reference counter for their
+duration and the caller of :c:func:`!lock_vma_under_rcu` must drop it via
+:c:func:`!vma_end_read`.
 
 VMA **write** locks are acquired via :c:func:`!vma_start_write` in instances where a
 VMA is about to be modified, unlike :c:func:`!vma_start_read` the lock is always
@@ -726,9 +731,9 @@ acquired. An mmap write lock **must** be held for the duration of the VMA write
 lock, releasing or downgrading the mmap write lock also releases the VMA write
 lock so there is no :c:func:`!vma_end_write` function.
 
-Note that a semaphore write lock is not held across a VMA lock. Rather, a
-sequence number is used for serialisation, and the write semaphore is only
-acquired at the point of write lock to update this.
+Note that when write-locking a VMA, the :c:member:`!vma.vm_refcnt` is temporarily
+modified so that readers can detect the presence of a writer. The reference counter is
+restored once the VMA sequence number used for serialisation is updated.
 
 This ensures the semantics we require - VMA write locks provide exclusive write
 access to the VMA.
@@ -738,7 +743,7 @@ Implementation details
 
 The VMA lock mechanism is designed to be a lightweight means of avoiding the use
 of the heavily contended mmap lock. It is implemented using a combination of a
-read/write semaphore and sequence numbers belonging to the containing
+reference counter and sequence numbers belonging to the containing
 :c:struct:`!struct mm_struct` and the VMA.
 
 Read locks are acquired via :c:func:`!vma_start_read`, which is an optimistic
@@ -779,28 +784,31 @@ release of any VMA locks on its release makes sense, as you would never want to
 keep VMAs locked across entirely separate write operations. It also maintains
 correct lock ordering.
 
-Each time a VMA read lock is acquired, we acquire a read lock on the
-:c:member:`!vma->vm_lock` read/write semaphore and hold it, while checking that
-the sequence count of the VMA does not match that of the mm.
+Each time a VMA read lock is acquired, we increment the :c:member:`!vma.vm_refcnt`
+reference counter and check that the sequence count of the VMA does not match
+that of the mm.
 
-If it does, the read lock fails. If it does not, we hold the lock, excluding
-writers, but permitting other readers, who will also obtain this lock under RCU.
+If it does, the read lock fails and :c:member:`!vma.vm_refcnt` is dropped.
+If it does not, we keep the reference counter raised, excluding writers, but
+permitting other readers, who can also obtain this lock under RCU.
 
 Importantly, maple tree operations performed in :c:func:`!lock_vma_under_rcu`
 are also RCU safe, so the whole read lock operation is guaranteed to function
 correctly.
 
-On the write side, we acquire a write lock on the :c:member:`!vma->vm_lock`
-read/write semaphore, before setting the VMA's sequence number under this lock,
-also simultaneously holding the mmap write lock.
+On the write side, we set a bit in :c:member:`!vma.vm_refcnt` which can't be
+modified by readers and wait for all readers to drop their reference count.
+Once there are no readers, the VMA's sequence number is set to match that of the
+mm. During this entire operation the mmap write lock is held.
 
 This way, if any read locks are in effect, :c:func:`!vma_start_write` will sleep
 until these are finished and mutual exclusion is achieved.
 
-After setting the VMA's sequence number, the lock is released, avoiding
-complexity with a long-term held write lock.
+After setting the VMA's sequence number, the bit in :c:member:`!vma.vm_refcnt`
+indicating a writer is cleared. From this point on, the VMA's sequence number will
+indicate its write-locked state until the mmap write lock is dropped or downgraded.
 
-This clever combination of a read/write semaphore and sequence count allows for
+This clever combination of a reference counter and sequence count allows for
 fast RCU-based per-VMA lock acquisition (especially on page fault, though
 utilised elsewhere) with minimal complexity around lock ordering.
 
-- 
2.47.1.613.gc27f4b7a9f-goog


