From: Barry Song <21cnbao@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Nicolas Geoffray <ngeoffray@google.com>,
	Lokesh Gidra <lokeshgidra@google.com>,
	 David Hildenbrand <david@redhat.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	 Harry Yoo <harry.yoo@oracle.com>,
	Suren Baghdasaryan <surenb@google.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@surriel.com>,
	 "Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
	 Linux-MM <linux-mm@kvack.org>,
	Kalesh Singh <kaleshsingh@google.com>,
	 SeongJae Park <sj@kernel.org>,
	Barry Song <v-songbaohua@oppo.com>, Peter Xu <peterx@redhat.com>
Subject: Re: [DISCUSSION] anon_vma root lock contention and per anon_vma lock
Date: Mon, 15 Sep 2025 08:23:38 +0800
Message-ID: <CAGsJ_4xDRB_F-T42WnhqpmwLyiZRwLGqx9vDf_d5TFALsCRX4A@mail.gmail.com>
In-Reply-To: <aMdVcH7KCXBvLtFP@casper.infradead.org>

On Mon, Sep 15, 2025 at 7:53 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Sep 11, 2025 at 07:17:01PM +1200, Barry Song wrote:
> > In the process tree, many processes may share anon_vma->root, even if
> > they don’t share the anon_vma itself. This causes serious lock contention
> > between memory reclamation (which calls folio_referenced and try_to_unmap)
> > and other processes calling fork(), exit(), mprotect(), etc.
> >
> > On Android, this issue becomes more severe since many processes are
> > descendants of zygote.
>
> I'm not nearly as familiar with anon_vma as, well, the rest of you
> are.  As I understand this situation, usually after fork(), a process
> calls exec() and the VMAs evaporate.  Android is different in that after
> the zygote calls fork(), there is no exec() and so the VMAs stay COW.
>
> I wonder if we could fix this by adding a new syscall:
>
>         mremap(addr, size, size, MREMAP_COW_NOW);
>
> That would create a new VMA that contains the COWed pages from the
> old VMA, but crucially no longer attached to the anon_vma root of
> the zygote.  You wouldn't want to call this for every VMA, of course.
> Just the ones which are likely to be fully COWed.
>
> Maybe this isn't practical, but I thought it worth suggesting.

Thank you for the suggestion, Matthew.

Lorenzo suggested possibly unlinking the child anon_vma from the root once all
folios have been CoW-ed:

"Right now, even if you entirely CoW everything in a VMA, we are still
attached to parents with all the overhead. That's something I can look at."

My concern is that it’s difficult to determine whether a VMA has been completely
CoW-ed, and a single shared folio would prevent the unlink.
So I’m not sure this approach would work.
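
To make the concern concrete, here is a toy userspace model, not kernel
code. struct toy_vma and toy_vma_can_unlink() are hypothetical names, and
the per-page "shared" flag is an assumption made for illustration; the
kernel keeps no such ready-made array, which is part of why the check is
hard. The sketch only shows that the answer needs a full scan and that one
shared page is enough to block the unlink.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical toy model: one flag per page of a VMA, true while the
 * page is still shared with the parent (not yet CoW-ed).
 */
struct toy_vma {
    const bool *page_shared;
    size_t nr_pages;
};

/* Unlinking from the root would only be safe once no page is shared. */
static bool toy_vma_can_unlink(const struct toy_vma *vma)
{
    assert(vma != NULL);
    for (size_t i = 0; i < vma->nr_pages; i++) {
        if (vma->page_shared[i])
            return false;   /* a single shared page blocks the unlink */
    }
    return true;
}
```

Even in this simplified form the check is linear in the size of the VMA
and has to be repeated whenever a page might have changed state.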

You seem to be proposing a forced CoW as a way to safely unlink from the root.

A side effect is the potential for sudden, heavy memory allocation,
whereas CoW lets asynchronous tasks such as kswapd work concurrently.
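
As a rough userspace illustration of that allocation burst, the sketch
below writes one byte to every page of a range. Run in a forked child
over a private anonymous mapping, each write would take its CoW fault
immediately, so all copies are allocated in one go. touch_pages_for_write()
is a hypothetical helper, and it only emulates the "force the copies" half
of the idea: nothing here detaches the VMA from the anon_vma root.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one byte per page-sized step so that every
 * page in [addr, addr + len) takes its write fault now. Assumes addr is
 * page-aligned if every page must be covered. Returns the number of
 * writes performed.
 */
static size_t touch_pages_for_write(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = addr;
    size_t touched = 0;

    assert(addr != NULL || len == 0);
    if (page <= 0)
        return 0;
    for (size_t off = 0; off < len; off += (size_t)page) {
        p[off] = p[off];    /* store the same value: contents unchanged */
        touched++;
    }
    return touched;
}
```

All faults happen back to back in the caller's context, which is the
behaviour that leaves little room for background reclaim to keep up.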

Another issue is the extra memory use from folios that could have been
shared but aren’t—likely minor on Android, since only a small portion
of memory is actually shared, based on our observations.

Calling mremap for each VMA might be difficult. Something applied to the
whole process could be more practical—similar to exec, but only
performing CoW and unlinking the anon_vma root.

On the other hand, most anon folios are not actually shared, yet
folio_referenced and try_to_unmap still take the entire root lock.
In reality, they only care about their own node—no need to iterate
the whole tree.
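
A toy model of the contention, again userspace only and with hypothetical
names (toy_lock, toy_anon_vma, toy_anon_vma_init, toy_anon_vma_lock). It
assumes the current arrangement in which locking any anon_vma means taking
the lock that lives in its root, so two apps forked from the same zygote
end up on the same lock even though they share no anon_vma.

```c
#include <assert.h>
#include <stddef.h>

struct toy_lock {
    unsigned long acquisitions;     /* stand-in for a real rwsem */
};

struct toy_anon_vma {
    struct toy_anon_vma *root;      /* points to itself for the root */
    struct toy_lock lock;           /* only the root's copy is ever used */
};

/* A child inherits the root of its parent; a parentless node is a root. */
static void toy_anon_vma_init(struct toy_anon_vma *av,
                              struct toy_anon_vma *parent)
{
    assert(av != NULL);
    av->root = parent ? parent->root : av;
    av->lock.acquisitions = 0;
}

/* "Locking" any node resolves to the lock of the shared root. */
static struct toy_lock *toy_anon_vma_lock(struct toy_anon_vma *av)
{
    assert(av != NULL && av->root != NULL);
    av->root->lock.acquisitions++;
    return &av->root->lock;
}
```

With one zygote root and two children, both children return the same
toy_lock pointer and the per-node locks are never touched, which is the
waste a per-anon_vma lock would aim to remove.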

I still think optimizing from that angle could be a better entry point :-)

Thanks
Barry


