From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>,
Shakeel Butt <shakeel.butt@linux.dev>,
David Hildenbrand <david@kernel.org>,
Rik van Riel <riel@surriel.com>, Harry Yoo <harry.yoo@oracle.com>,
Jann Horn <jannh@google.com>, Mike Rapoport <rppt@kernel.org>,
Michal Hocko <mhocko@suse.com>, Pedro Falcato <pfalcato@suse.de>,
Chris Li <chriscli@google.com>,
Barry Song <v-songbaohua@oppo.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/8] mm/rmap: remove unnecessary root lock dance in anon_vma clone, unmap
Date: Thu, 8 Jan 2026 17:46:11 +0000
Message-ID: <351668f9-dca4-4df8-bcfc-0fa165523edc@lucifer.local>
In-Reply-To: <CAJuCfpGQQid_VPx9Y1TE4ozXEQM8tWixxLDnS3cvrM3sdT84QQ@mail.gmail.com>
On Tue, Jan 06, 2026 at 12:58:46PM -0800, Suren Baghdasaryan wrote:
> On Tue, Jan 6, 2026 at 5:58 AM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Mon, Dec 29, 2025 at 02:17:53PM -0800, Suren Baghdasaryan wrote:
> > > On Wed, Dec 17, 2025 at 4:27 AM Lorenzo Stoakes
> > > <lorenzo.stoakes@oracle.com> wrote:
> > > >
> > > > The root anon_vma of all anon_vma's linked to a VMA must by definition be
> > > > the same - a VMA and all of its descendants/ancestors must exist in the
> > > > same CoW chain.
> > > >
> > > > Commit bb4aa39676f7 ("mm: avoid repeated anon_vma lock/unlock sequences in
> > > > anon_vma_clone()") introduced paranoid checking of the root anon_vma
> > > > remaining the same throughout all AVC's in 2011.
> > > >
> > > > I think 15 years later we can safely assume that this is always the case.
> > > >
> > > > Additionally, since unfaulted VMAs being cloned from or unlinked are
> > > > no-op's, we can simply lock the anon_vma's associated with this rather than
> > > > doing any specific dance around this.
> > > >
> > > > This removes unnecessary checks and makes it clear that the root anon_vma
> > > > is shared between all anon_vma's in a given VMA's anon_vma_chain.
> > > >
> > > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > > > ---
> > > > mm/rmap.c | 48 ++++++++++++------------------------------------
> > > > 1 file changed, 12 insertions(+), 36 deletions(-)
> > > >
> > > > diff --git a/mm/rmap.c b/mm/rmap.c
> > > > index 9332d1cbc643..60134a566073 100644
> > > > --- a/mm/rmap.c
> > > > +++ b/mm/rmap.c
> > > > @@ -231,32 +231,6 @@ int __anon_vma_prepare(struct vm_area_struct *vma)
> > > > return -ENOMEM;
> > > > }
> > > >
> > > > -/*
> > > > - * This is a useful helper function for locking the anon_vma root as
> > > > - * we traverse the vma->anon_vma_chain, looping over anon_vma's that
> > > > - * have the same vma.
> > > > - *
> > > > - * Such anon_vma's should have the same root, so you'd expect to see
> > > > - * just a single mutex_lock for the whole traversal.
> > > > - */
> > > > -static inline struct anon_vma *lock_anon_vma_root(struct anon_vma *root, struct anon_vma *anon_vma)
> > > > -{
> > > > - struct anon_vma *new_root = anon_vma->root;
> > > > - if (new_root != root) {
> > > > - if (WARN_ON_ONCE(root))
> > > > - up_write(&root->rwsem);
> > > > - root = new_root;
> > > > - down_write(&root->rwsem);
> > > > - }
> > > > - return root;
> > > > -}
> > > > -
> > > > -static inline void unlock_anon_vma_root(struct anon_vma *root)
> > > > -{
> > > > - if (root)
> > > > - up_write(&root->rwsem);
> > > > -}
> > > > -
> > > > static void check_anon_vma_clone(struct vm_area_struct *dst,
> > > > struct vm_area_struct *src)
> > > > {
> > > > @@ -307,26 +281,25 @@ static void check_anon_vma_clone(struct vm_area_struct *dst,
> > > > int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> > > > {
> > > > struct anon_vma_chain *avc, *pavc;
> > > > - struct anon_vma *root = NULL;
> > > >
> > > > if (!src->anon_vma)
> > > > return 0;
> > > >
> > > > check_anon_vma_clone(dst, src);
> > > >
> > > > + anon_vma_lock_write(src->anon_vma);
> > > > list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
> > > > struct anon_vma *anon_vma;
> > > >
> > > > avc = anon_vma_chain_alloc(GFP_NOWAIT);
> > > > if (unlikely(!avc)) {
> > > > - unlock_anon_vma_root(root);
> > > > - root = NULL;
> > > > + anon_vma_unlock_write(src->anon_vma);
> > > > avc = anon_vma_chain_alloc(GFP_KERNEL);
> > > > if (!avc)
> > > > goto enomem_failure;
> > > > + anon_vma_lock_write(src->anon_vma);
> > >
> > > So, we drop and then reacquire src->anon_vma->root->rwsem, expecting
> > > src->anon_vma and src->anon_vma->root to be the same. And IIUC
> >
> > I mean did you read the commit message? :)
> >
> > We're not expecting that, they _have_ to be the same. It simply makes no sense
> > for them _not_ to be the same.
>
> Sorry, maybe I chose my words badly to explain my concern. I meant
> that we expect those fields to still be valid between the time when we
> drop and re-acquire the lock. The comment next to anon_vma.rwsem
> definition says "W: modification, R: walking the list". Here we are
> walking the list with the lock but are dropping the lock in the
> process. I think there needs to be an explanation why this is safe.

This already happened before this patch though? And yes, it's sketchy.

I don't think this is necessary as I change this later anyway, so we'd just be
adding an explanation I'd have to delete later.

I already provide an explanation of the locking when I go ahead and change the
scope of the anon_vma rmap lock elsewhere, so this general 'explaining lock
scope' pattern is present in the final result of the series.

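
For anyone following along, the pattern being discussed can be sketched in
userspace roughly as below. This is only an illustration under stated
assumptions, not the kernel code: struct node, struct chain, alloc_nowait(),
alloc_blocking() and clone_chain() are all hypothetical names, a pthread mutex
stands in for the anon_vma rwsem, and malloc() stands in for both
GFP_NOWAIT and GFP_KERNEL allocations.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry in an anon_vma_chain. */
struct node {
	struct node *next;
	int value;
};

/* Hypothetical list whose traversal is meant to happen under `lock`. */
struct chain {
	pthread_mutex_t lock;
	struct node *head;
	unsigned int nr_lock_drops;	/* times the slow path was taken */
};

/* Models a GFP_NOWAIT allocation: never sleeps, may fail (forced via `fail`). */
static struct node *alloc_nowait(bool fail)
{
	return fail ? NULL : malloc(sizeof(struct node));
}

/* Models a GFP_KERNEL allocation: may sleep, so call it without the lock. */
static struct node *alloc_blocking(void)
{
	return malloc(sizeof(struct node));
}

/*
 * Copy every entry of src onto the front of dst. Returns 0 on success,
 * -1 on allocation failure (src->lock is released either way; the caller
 * frees whatever was already added to dst).
 */
static int clone_chain(struct chain *dst, struct chain *src, int fail_value)
{
	pthread_mutex_lock(&src->lock);
	for (struct node *cur = src->head; cur; cur = cur->next) {
		struct node *copy = alloc_nowait(cur->value == fail_value);

		if (!copy) {
			/* Slow path: drop the lock before a sleeping allocation. */
			pthread_mutex_unlock(&src->lock);
			/*
			 * Nothing in this sketch keeps `cur` valid while the
			 * lock is dropped; that has to come from elsewhere.
			 */
			copy = alloc_blocking();
			if (!copy)
				return -1;
			pthread_mutex_lock(&src->lock);
			dst->nr_lock_drops++;
		}
		copy->value = cur->value;
		copy->next = dst->head;
		dst->head = copy;
	}
	pthread_mutex_unlock(&src->lock);
	return 0;
}
```

The comment in the slow path marks the window the question is about: the
traversal resumes from `cur` after re-locking, so the sketch is only correct
if some outer guarantee keeps the source list unchanged in the meantime.
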
>
>
> >
> > This is kind of the entire point of the patch.
> >
> > > src->vm_mm's mmap lock is what guarantees all this. If so, could you
> > > please add a clarifying comment here?
> >
> > No that's not what guarantees it? I don't understand what you mean?
> >
> > I mean in a sense, if you had a totally broken situation where you didn't take
> > exclusive locks and could do some horribly broken racing here, then sure you
> > might end up with something broken, but I think it's super confusing to say 'oh
> > this lock guarantees it', well no it guarantees that you aren't completely
> > broken, what guarantees the shared root is how anon_vma_fork() works, which is
> > to:
> >
> > - Clone.
> > - If not reused an anon_vma (which by recursion would also have same root)
> > allocate new anon_vma.
> > - If allocated new, set root to source VMA's anon_vma, which by definition also
> > has to be in its anon_vma_chain and have the same root (itself, if we're
> > cloning from the ultimate parent).
> >
> > But I don't think it'd be helpful to document all this, or we get into _adding_
> > confusion by putting _too much_ in a comment.
> >
> > So I guess I'll just say, as I do in the newly introduced
> > cleanup_partial_anon_vmas():
> >
> > /* All anon_vma's share the same root. */
>
> Yeah, my concern was not the root being different but that the list
> itself is stable after we drop the lock.

Again, any explanation I add here would end up being deleted in a later
patch where I extensively change this, which doesn't seem like a useful
thing to do in the series.

So I think we should leave it as-is.

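
As an aside, the shared-root property described in the quoted text above
(a forked anon_vma takes its root from the source VMA's anon_vma) can be
modelled with a tiny userspace sketch. It is a simplified model under stated
assumptions, not the kernel implementation: struct anon_vma_model,
model_alloc(), model_fork() and model_same_root() are hypothetical names, and
reference counting, locking and anon_vma reuse are all left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, stripped-down model of an anon_vma: only root and parent. */
struct anon_vma_model {
	struct anon_vma_model *root;
	struct anon_vma_model *parent;
};

/* A freshly allocated object starts out as its own root and parent. */
static struct anon_vma_model *model_alloc(void)
{
	struct anon_vma_model *av = malloc(sizeof(*av));

	if (av) {
		av->root = av;
		av->parent = av;
	}
	return av;
}

/*
 * Model of the fork step: the new object inherits the root of the object
 * it is forked from, so the root never changes along a chain of forks.
 */
static struct anon_vma_model *model_fork(struct anon_vma_model *src)
{
	struct anon_vma_model *av = model_alloc();

	if (av) {
		av->root = src->root;
		av->parent = src;
	}
	return av;
}

/* True if both objects belong to the same tree, i.e. share a root. */
static bool model_same_root(const struct anon_vma_model *a,
			    const struct anon_vma_model *b)
{
	return a->root == b->root;
}
```

In this model, forking from the ultimate parent and then forking again from
the child leaves all three objects pointing at the same root, which is the
induction behind the one-line comment proposed above.
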
Thanks, Lorenzo