From: Matthew Wilcox <willy@infradead.org>
To: Barry Song <21cnbao@gmail.com>
Cc: Nicolas Geoffray <ngeoffray@google.com>,
Lokesh Gidra <lokeshgidra@google.com>,
David Hildenbrand <david@redhat.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Harry Yoo <harry.yoo@oracle.com>,
Suren Baghdasaryan <surenb@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@surriel.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
Linux-MM <linux-mm@kvack.org>,
Kalesh Singh <kaleshsingh@google.com>,
SeongJae Park <sj@kernel.org>, Barry Song <v-songbaohua@oppo.com>,
Peter Xu <peterx@redhat.com>
Subject: Re: [DISCUSSION] anon_vma root lock contention and per anon_vma lock
Date: Mon, 15 Sep 2025 03:50:01 +0100
Message-ID: <aMd-2argDQCHww_Q@casper.infradead.org>
In-Reply-To: <CAGsJ_4xDRB_F-T42WnhqpmwLyiZRwLGqx9vDf_d5TFALsCRX4A@mail.gmail.com>

On Mon, Sep 15, 2025 at 08:23:38AM +0800, Barry Song wrote:
> > I wonder if we could fix this by adding a new syscall:
> >
> > mremap(addr, size, size, MREMAP_COW_NOW);
> >
> > That would create a new VMA that contains the COWed pages from the
> > old VMA, but crucially no longer attached to the anon_vma root of
> > the zygote. You wouldn't want to call this for every VMA, of course.
> > Just the ones which are likely to be fully COWed.
> >
> > Maybe this isn't practical, but I thought it worth suggesting.
>
> Lorenzo suggested possibly unlinking the child anon_vma from the root once all
> folios have been CoW-ed:
>
> "Right now, even if you entirely CoW everything in a VMA, we are still
> attached to parents with all the overhead. That's something I can look at.
> "
>
> My concern is that it’s difficult to determine whether a VMA has been completely
> CoW-ed, and a single shared folio would prevent the unlink.
> So I’m not sure this approach would work.
I'm concerned that tracking how many folios remain shared may be
inefficient. That information would also need to be gathered in both
parent and child.
> You seem to be proposing a forced CoW as a way to safely unlink from the root.
>
> A side effect is the potential for sudden, heavy memory allocation,
> whereas CoW lets asynchronous tasks such as kswapd work concurrently.
Perhaps you could help us out with some stats on that -- how much
anonymous memory starts out shared between the zygote and a newly
spawned process?
> Another issue is the extra memory use from folios that could have been
> shared but aren’t—likely minor on Android, since only a small portion
> of memory is actually shared, based on our observations.
>
> Calling mremap for each VMA might be difficult. Something applied to the
> whole process could be more practical—similar to exec, but only
> performing CoW and unlinking the anon_vma root.
That seems like it would be worse for memory consumption than doing it
on the VMAs in question.
Another possibility would be for the zygote to set a flag on the VMA,
say EAGER_COW, which forces a COW of all pages as soon as the first one
is COWed. But then we're paying at fault time rather than at a
predictable point in a syscall.
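As a rough illustration of when the copying cost is paid in each case,
here is a userspace toy model (not kernel code). The names toy_vma,
toy_write_fault, toy_fully_cowed and TOY_NPAGES are made up for this
sketch, and EAGER_COW is only the hypothetical flag floated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8 /* hypothetical VMA size, in pages */

/* Toy stand-in for a VMA inherited over fork(); not a kernel structure. */
struct toy_vma {
    bool eager_cow;                /* models the hypothetical EAGER_COW flag */
    bool private_copy[TOY_NPAGES]; /* true once the child owns page i */
};

/* Model a write fault on page idx; returns how many pages get copied. */
static size_t toy_write_fault(struct toy_vma *vma, size_t idx)
{
    size_t copied = 0;

    assert(idx < TOY_NPAGES);

    /* Page already private: nothing left to copy. */
    if (vma->private_copy[idx])
        return 0;

    /* Lazy COW: copy only the page that faulted. */
    if (!vma->eager_cow) {
        vma->private_copy[idx] = true;
        return 1;
    }

    /* Eager COW: the first fault copies every page that is still shared. */
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        if (!vma->private_copy[i]) {
            vma->private_copy[i] = true;
            copied++;
        }
    }
    return copied;
}

/* True once no page in the toy VMA is shared with the parent any more. */
static bool toy_fully_cowed(const struct toy_vma *vma)
{
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        if (!vma->private_copy[i])
            return false;
    }
    return true;
}
```

In this model a lazy VMA only becomes fully COWed after every page has
been written, while an eager one gets there on the first write fault,
at the price of that fault doing TOY_NPAGES copies.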
Another point in favour of COW_NOW or EAGER_COW is that we can choose to
allocate folios of the appropriate size at that time. Unless something's
changed, I think we always COW individual pages rather than multiple
pages at once.
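
For gathering numbers without any kernel change, a child could
approximate the memory cost of a forced COW from userspace by writing
to every page of a range it inherited over fork(). A minimal sketch,
assuming a page-aligned, writable, private mapping; force_cow_range is
a hypothetical helper name. It only breaks the sharing of the pages. It
does not detach the VMA from the zygote's anon_vma root, which is the
extra step MREMAP_COW_NOW is meant to provide.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write one byte per page so that each page takes a write fault.
 * addr is assumed to be page-aligned and writable (assumption of this
 * sketch). Returns the number of pages touched.
 */
static size_t force_cow_range(volatile unsigned char *addr, size_t len,
                              size_t page_size)
{
    size_t touched = 0;

    assert(addr != NULL || len == 0);

    /* A zero page size would never advance the loop below. */
    if (page_size == 0)
        return 0;

    for (size_t off = 0; off < len; off += page_size) {
        /* volatile forces a real load and store; contents are unchanged. */
        addr[off] = addr[off];
        touched++;
    }
    return touched;
}
```

Pass the value of sysconf(_SC_PAGESIZE) as page_size. Comparing the
child's private memory before and after the call would give a rough
figure for how much was still shared with the zygote.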