linux-mm.kvack.org archive mirror
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	David Hildenbrand <david@kernel.org>,
	Rik van Riel <riel@surriel.com>, Harry Yoo <harry.yoo@oracle.com>,
	Jann Horn <jannh@google.com>, Mike Rapoport <rppt@kernel.org>,
	Michal Hocko <mhocko@suse.com>, Pedro Falcato <pfalcato@suse.de>,
	Chris Li <chriscli@google.com>,
	Barry Song <v-songbaohua@oppo.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/8] mm/rmap: remove unnecessary root lock dance in anon_vma clone, unmap
Date: Tue, 6 Jan 2026 13:58:23 +0000
Message-ID: <2ec5bf63-139e-4e8b-85e2-efb48adc93fb@lucifer.local>
In-Reply-To: <CAJuCfpEWwdnNs458wSb__C0eHSEvbvJ9nK3ryQWLTjA3+b3BmQ@mail.gmail.com>

On Mon, Dec 29, 2025 at 02:17:53PM -0800, Suren Baghdasaryan wrote:
> On Wed, Dec 17, 2025 at 4:27 AM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > The root anon_vma of all anon_vma's linked to a VMA must by definition be
> > the same - a VMA and all of its descendants/ancestors must exist in the
> > same CoW chain.
> >
> > Commit bb4aa39676f7 ("mm: avoid repeated anon_vma lock/unlock sequences in
> > anon_vma_clone()") introduced paranoid checking of the root anon_vma
> > remaining the same throughout all AVC's in 2011.
> >
> > I think 15 years later we can safely assume that this is always the case.
> >
> > Additionally, since unfaulted VMAs being cloned from or unlinked are
> > no-op's, we can simply lock the anon_vma's associated with this rather than
> > doing any specific dance around this.
> >
> > This removes unnecessary checks and makes it clear that the root anon_vma
> > is shared between all anon_vma's in a given VMA's anon_vma_chain.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > ---
> >  mm/rmap.c | 48 ++++++++++++------------------------------------
> >  1 file changed, 12 insertions(+), 36 deletions(-)
> >
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 9332d1cbc643..60134a566073 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -231,32 +231,6 @@ int __anon_vma_prepare(struct vm_area_struct *vma)
> >         return -ENOMEM;
> >  }
> >
> > -/*
> > - * This is a useful helper function for locking the anon_vma root as
> > - * we traverse the vma->anon_vma_chain, looping over anon_vma's that
> > - * have the same vma.
> > - *
> > - * Such anon_vma's should have the same root, so you'd expect to see
> > - * just a single mutex_lock for the whole traversal.
> > - */
> > -static inline struct anon_vma *lock_anon_vma_root(struct anon_vma *root, struct anon_vma *anon_vma)
> > -{
> > -       struct anon_vma *new_root = anon_vma->root;
> > -       if (new_root != root) {
> > -               if (WARN_ON_ONCE(root))
> > -                       up_write(&root->rwsem);
> > -               root = new_root;
> > -               down_write(&root->rwsem);
> > -       }
> > -       return root;
> > -}
> > -
> > -static inline void unlock_anon_vma_root(struct anon_vma *root)
> > -{
> > -       if (root)
> > -               up_write(&root->rwsem);
> > -}
> > -
> >  static void check_anon_vma_clone(struct vm_area_struct *dst,
> >                                  struct vm_area_struct *src)
> >  {
> > @@ -307,26 +281,25 @@ static void check_anon_vma_clone(struct vm_area_struct *dst,
> >  int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> >  {
> >         struct anon_vma_chain *avc, *pavc;
> > -       struct anon_vma *root = NULL;
> >
> >         if (!src->anon_vma)
> >                 return 0;
> >
> >         check_anon_vma_clone(dst, src);
> >
> > +       anon_vma_lock_write(src->anon_vma);
> >         list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
> >                 struct anon_vma *anon_vma;
> >
> >                 avc = anon_vma_chain_alloc(GFP_NOWAIT);
> >                 if (unlikely(!avc)) {
> > -                       unlock_anon_vma_root(root);
> > -                       root = NULL;
> > +                       anon_vma_unlock_write(src->anon_vma);
> >                         avc = anon_vma_chain_alloc(GFP_KERNEL);
> >                         if (!avc)
> >                                 goto enomem_failure;
> > +                       anon_vma_lock_write(src->anon_vma);
>
> So, we drop and then reacquire src->anon_vma->root->rwsem, expecting
> src->anon_vma and src->anon_vma->root to be the same. And IIUC

I mean did you read the commit message? :)

We're not expecting that, they _have_ to be the same. It simply makes no sense
for them _not_ to be the same.

This is kind of the entire point of the patch.

> src->vm_mm's mmap lock is what guarantees all this. If so, could you
> please add a clarifying comment here?

No, that's not what guarantees it. I don't understand what you mean?

I mean, in a sense, if you had a totally broken situation where you didn't take
exclusive locks and could race horribly here, then sure, you might end up with
something broken. But I think it's super confusing to say 'oh, this lock
guarantees it'. The lock guarantees that you aren't completely broken; what
guarantees the shared root is how anon_vma_fork() works, which is to:

- Clone.
- If an anon_vma was not reused (a reused one would, by recursion, also have
  the same root), allocate a new anon_vma.
- If a new one was allocated, set its root to that of the source VMA's
  anon_vma, which by definition also has to be in its anon_vma_chain and have
  the same root (itself, if we're cloning from the ultimate parent).

But I don't think it'd be helpful to document all this, or we get into _adding_
confusion by putting _too much_ in a comment.

So I guess I'll just say, as I do in the newly introduced
cleanup_partial_anon_vmas():

	/* All anon_vma's share the same root. */

>
> >                 }
> >                 anon_vma = pavc->anon_vma;
> > -               root = lock_anon_vma_root(root, anon_vma);
> >                 anon_vma_chain_link(dst, avc, anon_vma);
> >
> >                 /*
> > @@ -343,7 +316,8 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> >         }
> >         if (dst->anon_vma)
> >                 dst->anon_vma->num_active_vmas++;
> > -       unlock_anon_vma_root(root);
> > +
> > +       anon_vma_unlock_write(src->anon_vma);
> >         return 0;
> >
> >   enomem_failure:
> > @@ -438,15 +412,17 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
> >  void unlink_anon_vmas(struct vm_area_struct *vma)
> >  {
> >         struct anon_vma_chain *avc, *next;
> > -       struct anon_vma *root = NULL;
> > +       struct anon_vma *active_anon_vma = vma->anon_vma;
> >
> >         /* Always hold mmap lock, read-lock on unmap possibly. */
> >         mmap_assert_locked(vma->vm_mm);
> >
> >         /* Unfaulted is a no-op. */
> > -       if (!vma->anon_vma)
> > +       if (!active_anon_vma)
> >                 return;
> >
> > +       anon_vma_lock_write(active_anon_vma);
> > +
> >         /*
> >          * Unlink each anon_vma chained to the VMA.  This list is ordered
> >          * from newest to oldest, ensuring the root anon_vma gets freed last.
> > @@ -454,7 +430,6 @@ void unlink_anon_vmas(struct vm_area_struct *vma)
> >         list_for_each_entry_safe(avc, next, &vma->anon_vma_chain, same_vma) {
> >                 struct anon_vma *anon_vma = avc->anon_vma;
> >
> > -               root = lock_anon_vma_root(root, anon_vma);
> >                 anon_vma_interval_tree_remove(avc, &anon_vma->rb_root);
> >
> >                 /*
> > @@ -470,13 +445,14 @@ void unlink_anon_vmas(struct vm_area_struct *vma)
> >                 anon_vma_chain_free(avc);
> >         }
> >
> > -       vma->anon_vma->num_active_vmas--;
> > +       active_anon_vma->num_active_vmas--;
> >         /*
> >          * vma would still be needed after unlink, and anon_vma will be prepared
> >          * when handle fault.
> >          */
> >         vma->anon_vma = NULL;
> > -       unlock_anon_vma_root(root);
> > +       anon_vma_unlock_write(active_anon_vma);
> > +
> >
> >         /*
> >          * Iterate the list once more, it now only contains empty and unlinked
> > --
> > 2.52.0
> >

