From: Jann Horn <jannh@google.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Barry Song <21cnbao@gmail.com>,
Nicolas Geoffray <ngeoffray@google.com>,
Lokesh Gidra <lokeshgidra@google.com>,
David Hildenbrand <david@redhat.com>,
Harry Yoo <harry.yoo@oracle.com>,
Suren Baghdasaryan <surenb@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@surriel.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Linux-MM <linux-mm@kvack.org>,
Kalesh Singh <kaleshsingh@google.com>,
SeongJae Park <sj@kernel.org>,
Barry Song <v-songbaohua@oppo.com>, Peter Xu <peterx@redhat.com>
Subject: Re: [DISCUSSION] anon_vma root lock contention and per anon_vma lock
Date: Thu, 11 Sep 2025 20:22:13 +0200
Message-ID: <CAG48ez0GVWV024kPe6kSV8c0LO7coACYXf9-85iqw+T+paUi3Q@mail.gmail.com>
In-Reply-To: <a67129f8-9ff6-4109-bbbf-4209f6dfa3be@lucifer.local>
On Thu, Sep 11, 2025 at 10:29 AM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
> On Thu, Sep 11, 2025 at 07:17:01PM +1200, Barry Song wrote:
> > Hi All,
> >
> > I’m aware that Lokesh started a discussion on the concurrency issue
> > between userfaultfd_move and memory reclamation [1]. However, my
> > concern is different, so I’m starting a separate discussion.
> >
> > In the process tree, many processes may share anon_vma->root, even if
> > they don’t share the anon_vma itself. This causes serious lock contention
> > between memory reclamation (which calls folio_referenced and try_to_unmap)
> > and other processes calling fork(), exit(), mprotect(), etc.
>
> Well, when you say lock contention, I mean - we need to have a lock that is held
> over the entire fork tree, as we are cloning references to them.
>
> This is at the anon_vma level - so the folio might be exclusive, but other
> folios there might not be.
>
> Note that I'm working on a radical rework of anon_vma's at the moment (time
> is not in my favour given other tasks + review workload, but it _is_
> happening).
>
> So I'm interested to gather real world usecase data on how best to
> implement things and this is interesting re: that.
>
> My proposed approach would use something like ranged locks. It's a bit
> fuzzy right now so definitely interested in putting some meat on that.
>
> >
> > On Android, this issue becomes more severe since many processes are
> > descendants of zygote.
> >
> > Memory reclamation path:
> > folio_lock_anon_vma_read
> >
> > mprotect path:
> > mprotect
> > split_vma
> > anon_vma_clone
> >
> > fork / copy_process path:
> > copy_process
> > dup_mmap
> > anon_vma_fork
> >
> > exit path:
> > exit_mmap
> > free_pgtables
> > unlink_anon_vmas
> >
> > To be honest, memory reclamation—especially folio_referenced()—is a
> > problem. It is called very frequently and can block other important
> > user threads waiting for the anon_vma root lock, causing UI lag.
> >
> > I have a rough idea: since the vast majority of anon folios are actually
> > exclusive (I observed almost 98% of Android anon folios fall into this
> > category), they don’t need to iterate the anon_vma tree. They belong to
> > a single process, and even for rmap, it is per-process.
> >
> > I propose introducing a per-anon_vma lock. For exclusive folios whose
> > anon_vma is not shared, we could use this per-anon_vma lock.
>
> I'm not sure how adding _more_ locks is going to reduce contention :) and
> the anon_vma's are all linked to their parents etc. etc. so it's simply not
> ok to hold one lock and not the others when making changes.

folio_referenced() only wants to look at mappings of a single folio,
right? And it only uses the anon_vma of that folio? So as long as we
can guarantee that the folio can't concurrently change which anon_vma
it is associated with, folio_referenced() really only cares about the
specific anon_vma that the folio is associated with, and the anon_vmas
of other folios in the VMAs we traverse are irrelevant?

Basically I think paths that come through the rmap would usually be
able to use such a fine-grained lock, while paths that come through
the MM would often have to use more coarse locking.

Of course paths requiring coarse locking (like for splitting VMAs and
such) would then have to take a pile of locks, one lock per anon_vma
associated with a given VMA. That part shouldn't be overly complicated
though, we'd mainly have to make sure that there is a consistent lock
ordering (such as "if you want to lock multiple anon_vmas, you have to
lock the root anon_vma before the others").