linux-mm.kvack.org archive mirror
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: lsf-pc@lists.linux-foundation.org
Cc: linux-mm@kvack.org, David Hildenbrand <david@kernel.org>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Suren Baghdasaryan <surenb@google.com>,
	Pedro Falcato <pfalcato@suse.de>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Harry Yoo <harry.yoo@oracle.com>, Rik van Riel <riel@surriel.com>,
	Jann Horn <jannh@google.com>, Chris Li <chriscli@google.com>,
	Barry Song <baohua@kernel.org>
Subject: [LSM/MM/BPF TOPIC] The Future of the Anonymous Reverse Mapping
Date: Thu, 19 Feb 2026 19:28:34 +0000
Message-ID: <8aa41d47-ee41-4af1-a334-587a34fe865d@lucifer.local>

Currently we track the reverse mapping between folios and VMAs at a VMA level,
utilising a complicated and confusing combination of anon_vma objects and the
anon_vma_chain objects (AVCs) linking them, which must be updated whenever VMAs
are split, merged, remapped or forked.
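
To make the shape of that relationship concrete, here is a deliberately tiny
user-space model. It is a sketch only: every toy_* name is hypothetical, plain
linked lists stand in for the kernel's interval trees, and locking, reference
counting and freeing are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy user-space model; these are NOT the kernel's definitions. */
struct toy_vma;
struct toy_anon_vma;

/* One link between a VMA and an anon_vma (the role an AVC plays). */
struct toy_avc {
	struct toy_vma *vma;
	struct toy_anon_vma *anon_vma;
	struct toy_avc *next_same_vma;      /* next link owned by the same VMA */
	struct toy_avc *next_same_anon_vma; /* a list here, an interval tree in the kernel */
};

struct toy_anon_vma {
	struct toy_avc *links; /* every VMA that might map this anon_vma's folios */
};

struct toy_vma {
	unsigned long pgoff_start; /* first page offset covered */
	unsigned long pgoff_end;   /* one past the last page offset covered */
	struct toy_anon_vma *anon_vma; /* where newly faulted folios attach */
	struct toy_avc *chain;         /* every anon_vma this VMA is linked to */
};

/* Link vma to av. Returns 0 on success, -1 if allocation fails. */
static int toy_link(struct toy_vma *vma, struct toy_anon_vma *av)
{
	struct toy_avc *avc = malloc(sizeof(*avc));

	if (!avc)
		return -1;
	avc->vma = vma;
	avc->anon_vma = av;
	avc->next_same_vma = vma->chain;
	vma->chain = avc;
	avc->next_same_anon_vma = av->links;
	av->links = avc;
	return 0;
}

/*
 * Fork: the child VMA is linked to every anon_vma the parent is linked to
 * (it may still map the parent's folios), plus a fresh anon_vma of its own.
 */
static int toy_fork(struct toy_vma *child, const struct toy_vma *parent,
		    struct toy_anon_vma *child_av)
{
	const struct toy_avc *avc;

	assert(child && parent && child_av);
	child->pgoff_start = parent->pgoff_start;
	child->pgoff_end = parent->pgoff_end;
	child->anon_vma = child_av;
	child->chain = NULL;
	for (avc = parent->chain; avc; avc = avc->next_same_vma)
		if (toy_link(child, avc->anon_vma))
			return -1;
	return toy_link(child, child_av);
}

/* rmap walk: count the VMAs linked to av that cover page offset pgoff. */
static size_t toy_rmap_count(const struct toy_anon_vma *av, unsigned long pgoff)
{
	const struct toy_avc *avc;
	size_t n = 0;

	for (avc = av->links; avc; avc = avc->next_same_anon_vma)
		if (pgoff >= avc->vma->pgoff_start && pgoff < avc->vma->pgoff_end)
			n++;
	return n;
}
```

In this model, after a single fork a walk from the parent's anon_vma visits
both the parent and child VMAs, while a walk from the child's anon_vma visits
only the child. Even this toy needs every link updated on each fork, which
hints at why split, merge and remap are painful in the real thing.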

It's further complicated by various optimisations intended to avoid scalability
issues in locking and memory allocation.

I have done recent work to improve the situation [0] which has also led to a
reported improvement in lock scalability [1], but fundamentally the situation
remains the same.

The logic is actually, when you think hard enough about it, a fairly
reasonable means of implementing the reverse mapping at a VMA level.

It is, however, a very broken abstraction as it stands. In order to work with
the logic, you have to essentially keep a broad understanding of the entire
implementation in your head at one time - that is, not much is really
abstracted.

This results in confusion, mistakes, and bit rot. It's also very time-consuming
to work with - personally I've gone to the lengths of writing a private set of
slides for myself on the topic as a reminder each time I come back to it.

There are also issues with lock scalability - the use of interval trees to
maintain a connection between an anon_vma and AVCs connected to VMAs requires
that a lock must be held across the entire 'CoW hierarchy' of parent and child
VMAs whenever performing an rmap walk or performing a merge, split, remap or
fork.

This is because we tear down all interval tree mappings and reestablish them
each time we might see changes in VMA geometry. This is an issue Barry Song
identified as problematic in a real world use case [2].

So what do we do to improve the situation?

Recently I have been working on an experimental new approach to the anonymous
reverse mapping, in which we instead track anonymous remaps, and then use the
VMA's virtual page offset to locate VMAs from the folio.
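
As a purely illustrative picture of what "tracking remaps" could mean, the
sketch below assumes a folio remembers the virtual page offset it was faulted
in at, and that each anonymous remap is recorded as an (old offset, new offset,
length) triple. The toy_* names and the flat array of records are hypothetical
and are not taken from the experimental implementation; fork, locking and
MAP_PRIVATE file mappings are all ignored.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 12 /* assumes 4 KiB pages */

/* Hypothetical record of one anonymous remap, in units of pages. */
struct toy_remap {
	unsigned long old_pgoff; /* first virtual page before the move */
	unsigned long new_pgoff; /* first virtual page after the move */
	unsigned long nr_pages;  /* number of pages moved */
};

/*
 * Translate the virtual page offset a folio was faulted in at into the
 * offset it occupies now, by replaying the recorded remaps oldest first.
 */
static unsigned long toy_current_pgoff(unsigned long pgoff,
				       const struct toy_remap *remaps,
				       size_t nr)
{
	size_t i;

	assert(remaps || nr == 0);
	for (i = 0; i < nr; i++) {
		const struct toy_remap *r = &remaps[i];

		/* Only offsets inside the moved range are affected. */
		if (pgoff >= r->old_pgoff && pgoff - r->old_pgoff < r->nr_pages)
			pgoff = r->new_pgoff + (pgoff - r->old_pgoff);
	}
	return pgoff;
}

/* Convert a virtual page offset to a virtual address. */
static unsigned long toy_pgoff_to_addr(unsigned long pgoff)
{
	return pgoff << TOY_PAGE_SHIFT;
}
```

With no remaps recorded, the translation is the identity, so the lookup reduces
to finding the VMA that covers the folio's original virtual address. The cost
of the approach shows up only for ranges that have actually been moved.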

I have got the implementation working to the point where it tracks the exact
same VMAs as the anon_vma implementation, and it seems a lot of it can be done
under RCU.

It avoids the need to maintain expensive mappings at a VMA level, though it
incurs a cost in tracking remaps, and MAP_PRIVATE files are very much a TODO
(they maintain a file vma->vm_pgoff, even when CoW'd, so the remap tracking is
pretty sub-optimal).

I am investigating whether I can change how MAP_PRIVATE file-backed mappings
work to avoid this issue, and will be developing tests to see how lock
scalability, throughput and memory usage compare to the anon_vma approach under
different workloads.

This experiment may or may not work out, but either way it will be interesting
to discuss.

By the time LSF/MM comes around I may even have already decided on a different
approach, but that's what makes things interesting :)

[0]:https://lore.kernel.org/all/cover.1767711638.git.lorenzo.stoakes@oracle.com/
[1]:https://lore.kernel.org/all/202602061747.855f053f-lkp@intel.com/
[2]:https://lore.kernel.org/linux-mm/CAGsJ_4x=YsQR=nNcHA-q=0vg0b7ok=81C_qQqKmoJ+BZ+HVduQ@mail.gmail.com/

Cheers, Lorenzo


