From: Peter Zijlstra <peterz@infradead.org>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, willy@infradead.org,
    liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
    mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org,
    mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
    oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
    hughd@google.com, lokeshgidra@google.com, minchan@google.com,
    jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
    pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v6 10/16] mm: replace vm_lock and detached flag with a reference count
Date: Tue, 17 Dec 2024 11:30:35 +0100
Message-ID: <20241217103035.GD11133@noisy.programming.kicks-ass.net>
In-Reply-To: <CAJuCfpEu_rZkC+ktWXE=rA-VenFBZR9VQ-SnVkDbXUqsd3Ys_A@mail.gmail.com>
On Mon, Dec 16, 2024 at 01:44:45PM -0800, Suren Baghdasaryan wrote:
> On Mon, Dec 16, 2024 at 1:38 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Mon, Dec 16, 2024 at 11:24:13AM -0800, Suren Baghdasaryan wrote:
> > > +static inline void vma_refcount_put(struct vm_area_struct *vma)
> > > +{
> > > +	int refcnt;
> > > +
> > > +	if (!__refcount_dec_and_test(&vma->vm_refcnt, &refcnt)) {
> > > +		rwsem_release(&vma->vmlock_dep_map, _RET_IP_);
> > > +
> > > +		if (refcnt & VMA_STATE_LOCKED)
> > > +			rcuwait_wake_up(&vma->vm_mm->vma_writer_wait);
> > > +	}
> > > +}
> > > +
> > >  /*
> > >   * Try to read-lock a vma. The function is allowed to occasionally yield false
> > >   * locked result to avoid performance overhead, in which case we fall back to
> > > @@ -710,6 +728,8 @@ static inline void vma_lock_init(struct vm_area_struct *vma)
> > >   */
> > >  static inline bool vma_start_read(struct vm_area_struct *vma)
> > >  {
> > > +	int oldcnt;
> > > +
> > >  	/*
> > >  	 * Check before locking. A race might cause false locked result.
> > >  	 * We can use READ_ONCE() for the mm_lock_seq here, and don't need
> > > @@ -720,13 +740,20 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
> > >  	if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(vma->vm_mm->mm_lock_seq.sequence))
> > >  		return false;
> > >
> > > +
> > > +	rwsem_acquire_read(&vma->vmlock_dep_map, 0, 0, _RET_IP_);
> > > +	/* Limit at VMA_STATE_LOCKED - 2 to leave one count for a writer */
> > > +	if (unlikely(!__refcount_inc_not_zero_limited(&vma->vm_refcnt, &oldcnt,
> > > +						      VMA_STATE_LOCKED - 2))) {
> > > +		rwsem_release(&vma->vmlock_dep_map, _RET_IP_);
> > >  		return false;
> > > +	}
> > > +	lock_acquired(&vma->vmlock_dep_map, _RET_IP_);
> > >
> > >  	/*
> > > +	 * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
> > >  	 * False unlocked result is impossible because we modify and check
> > > +	 * vma->vm_lock_seq under vma->vm_refcnt protection and mm->mm_lock_seq
> > >  	 * modification invalidates all existing locks.
> > >  	 *
> > >  	 * We must use ACQUIRE semantics for the mm_lock_seq so that if we are
> > > @@ -734,10 +761,12 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
> > >  	 * after it has been unlocked.
> > >  	 * This pairs with RELEASE semantics in vma_end_write_all().
> > >  	 */
> > > +	if (oldcnt & VMA_STATE_LOCKED ||
> > > +	    unlikely(vma->vm_lock_seq == raw_read_seqcount(&vma->vm_mm->mm_lock_seq))) {
> > > +		vma_refcount_put(vma);
> >
> > Suppose we have detach race with a concurrent RCU lookup like:
> >
> >                                       vma = mas_lookup();
> >
> >       vma_start_write();
> >       mas_detach();
> >                                       vma_start_read()
> >                                       rwsem_acquire_read()
> >                                       inc // success
> >       vma_mark_detach();
> >       dec_and_test // assumes 1->0
> >                    // is actually 2->1
> >
> >                                       if (vm_lock_seq == vma->vm_mm_mm_lock_seq) // true
> >                                         vma_refcount_put
> >                                           dec_and_test() // 1->0
> >                                             *NO* rwsem_release()
> >
>
> Yes, this is possible. I think that's not a problem until we start
> reusing the vmas and I deal with this race later in this patchset.
> I think what you described here is the same race I mention in the
> description of this patch:
> https://lore.kernel.org/all/20241216192419.2970941-14-surenb@google.com/
> I introduce vma_ensure_detached() in that patch to handle this case
> and ensure that vmas are detached before they are returned into the
> slab cache for reuse. Does that make sense?

So I just replied there, and no, I don't think it makes sense. Just put
the kmem_cache_free() in vma_refcount_put(), to be done when the count
hits 0.
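
A minimal userspace sketch of the free-on-zero idea, assuming a plain
atomic counter in place of refcount_t and malloc()/free() in place of the
vma slab cache. The names struct vma_model, model_vma_alloc(),
model_vma_refcount_put() and model_detach_with_reader() are hypothetical
and only illustrate the shape; this is not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace model only; hypothetical names, not kernel definitions. */
struct vma_model {
	atomic_int refcnt;	/* stands in for vma->vm_refcnt */
};

static struct vma_model *model_vma_alloc(void)
{
	struct vma_model *vma = malloc(sizeof(*vma));

	if (vma)
		atomic_init(&vma->refcnt, 1);	/* attached vma holds one reference */
	return vma;
}

/*
 * Drop one reference. Whoever takes the count to zero frees the object,
 * so no separate "make sure it is detached before freeing" step is needed.
 * Returns true if this call freed the object.
 */
static bool model_vma_refcount_put(struct vma_model *vma)
{
	if (atomic_fetch_sub(&vma->refcnt, 1) != 1)
		return false;
	free(vma);		/* kmem_cache_free() would go here in the kernel */
	return true;
}

/*
 * Writer detaches while a reader still holds a reference.
 * Returns 0 if the last put (the reader's) was the one that freed.
 */
static int model_detach_with_reader(void)
{
	struct vma_model *vma = model_vma_alloc();
	bool writer_freed, reader_freed;

	if (!vma)
		return -1;
	atomic_fetch_add(&vma->refcnt, 1);		/* reader: 1 -> 2 */
	writer_freed = model_vma_refcount_put(vma);	/* writer: 2 -> 1 */
	reader_freed = model_vma_refcount_put(vma);	/* reader: 1 -> 0, frees */

	return (!writer_freed && reader_freed) ? 0 : 1;
}
```

In this model it does not matter whether the writer or the racing reader
drops the last reference; the object is freed exactly once, by whichever
put reaches zero.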

Anyway, my point was more about the weird entanglement of lockdep and
the refcount. Just pull the lockdep annotation out of _put() and put it
explicitly in the vma_start_read() error paths and vma_end_read().
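
To make the imbalance concrete, here is a small single-threaded replay of
the interleaving from the quoted diagram. It is a sketch under stated
assumptions: an atomic int stands in for vm_refcnt, and a plain counter
(dep_held) stands in for the read annotations lockdep believes are held.
All names (struct vma_dep_model, dep_model_put(),
dep_model_put_entangled(), dep_model_replay()) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model; these are stand-ins, not the kernel's definitions. */
struct vma_dep_model {
	atomic_int refcnt;	/* models vma->vm_refcnt, 1 while attached */
	int dep_held;		/* models read annotations lockdep thinks are held */
};

/* Plain reference drop with no lockdep side effects; true on reaching 0. */
static bool dep_model_put(struct vma_dep_model *vma)
{
	return atomic_fetch_sub(&vma->refcnt, 1) == 1;
}

/* Entangled variant as in the quoted patch: release only when not last. */
static void dep_model_put_entangled(struct vma_dep_model *vma)
{
	if (!dep_model_put(vma))
		vma->dep_held--;	/* rwsem_release() */
}

/*
 * Replay the interleaving from the quoted diagram on one thread and
 * return how many read annotations are still recorded at the end.
 */
static int dep_model_replay(bool entangled)
{
	struct vma_dep_model vma = { .dep_held = 0 };

	atomic_init(&vma.refcnt, 1);

	vma.dep_held++;				/* reader: rwsem_acquire_read() */
	atomic_fetch_add(&vma.refcnt, 1);	/* reader: inc succeeds, 1 -> 2 */
	dep_model_put(&vma);			/* writer: detach, 2 -> 1 */

	/* reader: sequence check says "locked", so back out (1 -> 0) */
	if (entangled) {
		dep_model_put_entangled(&vma);
	} else {
		vma.dep_held--;			/* explicit rwsem_release() */
		dep_model_put(&vma);
	}

	assert(atomic_load(&vma.refcnt) == 0);
	return vma.dep_held;
}
```

In this model dep_model_replay(true) leaves one annotation outstanding,
because the reader's put is the one that reaches zero and skips the
release, while dep_model_replay(false) leaves none, since the release is
done explicitly on the error path regardless of the refcount outcome.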

Additionally, having vma_end_write() would allow you to put a lockdep
annotation in vma_{start,end}_write() -- which was, I think, the original
reason I proposed it a while back; that, and improved clarity when
reading the code, since explicitly marking the end of a section is
helpful.