From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>,
Shakeel Butt <shakeel.butt@linux.dev>,
David Hildenbrand <david@kernel.org>,
Rik van Riel <riel@surriel.com>, Harry Yoo <harry.yoo@oracle.com>,
Jann Horn <jannh@google.com>, Mike Rapoport <rppt@kernel.org>,
Michal Hocko <mhocko@suse.com>, Pedro Falcato <pfalcato@suse.de>,
Chris Li <chriscli@google.com>,
Barry Song <v-songbaohua@oppo.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/8] mm/rmap: improve anon_vma_clone(), unlink_anon_vmas() comments, add asserts
Date: Wed, 17 Dec 2025 12:27:03 +0000 [thread overview]
Message-ID: <3acc90a8613d5e2ea8882d60b5677228e6fe624d.1765970117.git.lorenzo.stoakes@oracle.com> (raw)
In-Reply-To: <cover.1765970117.git.lorenzo.stoakes@oracle.com>
Add kdoc comments, describe exactly what these functinos are used for in
detail, pointing out importantly that the anon_vma_clone() !dst->anon_vma
&& src->anon_vma dance is ONLY for fork.
Both are confusing functions that will be refactored in a subsequent patch
but the first stage is establishing documentation and some invariatns.
Add some basic CONFIG_DEBUG_VM asserts that help document expected state,
specifically:
anon_vma_clone()
- mmap write lock held.
- We do nothing if src VMA is not faulted.
- The destination VMA has no anon_vma_chain yet.
- We are always operating on the same active VMA (i.e. vma->anon-vma).
- If not forking, must operate on the same mm_struct.
unlink_anon_vmas()
- mmap lock held (read on unmap downgraded).
- That unfaulted VMAs are no-ops.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
mm/rmap.c | 82 +++++++++++++++++++++++++++++++++++++++++++------------
1 file changed, 64 insertions(+), 18 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index d6799afe1114..0e34c0a69fbc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -257,30 +257,60 @@ static inline void unlock_anon_vma_root(struct anon_vma *root)
up_write(&root->rwsem);
}
-/*
- * Attach the anon_vmas from src to dst.
- * Returns 0 on success, -ENOMEM on failure.
- *
- * anon_vma_clone() is called by vma_expand(), vma_merge(), __split_vma(),
- * copy_vma() and anon_vma_fork(). The first four want an exact copy of src,
- * while the last one, anon_vma_fork(), may try to reuse an existing anon_vma to
- * prevent endless growth of anon_vma. Since dst->anon_vma is set to NULL before
- * call, we can identify this case by checking (!dst->anon_vma &&
- * src->anon_vma).
- *
- * If (!dst->anon_vma && src->anon_vma) is true, this function tries to find
- * and reuse existing anon_vma which has no vmas and only one child anon_vma.
- * This prevents degradation of anon_vma hierarchy to endless linear chain in
- * case of constantly forking task. On the other hand, an anon_vma with more
- * than one child isn't reused even if there was no alive vma, thus rmap
- * walker has a good chance of avoiding scanning the whole hierarchy when it
- * searches where page is mapped.
+static void check_anon_vma_clone(struct vm_area_struct *dst,
+ struct vm_area_struct *src)
+{
+ /* The write lock must be held. */
+ mmap_assert_write_locked(src->vm_mm);
+ /* If not a fork (implied by dst->anon_vma) then must be on same mm. */
+ VM_WARN_ON_ONCE(dst->anon_vma && dst->vm_mm != src->vm_mm);
+
+ /* No source anon_vma is a no-op. */
+ VM_WARN_ON_ONCE(!src->anon_vma && !list_empty(&src->anon_vma_chain));
+ VM_WARN_ON_ONCE(!src->anon_vma && dst->anon_vma);
+ /* We are establishing a new anon_vma_chain. */
+ VM_WARN_ON_ONCE(!list_empty(&dst->anon_vma_chain));
+ /*
+ * On fork, dst->anon_vma is set NULL (temporarily). Otherwise, anon_vma
+ * must be the same across dst and src.
+ */
+ VM_WARN_ON_ONCE(dst->anon_vma && dst->anon_vma != src->anon_vma);
+}
+
+/**
+ * anon_vma_clone - Establishes new anon_vma_chain objects in @dst linking to
+ * all of the anon_vma objects contained within @src anon_vma_chain's.
+ * @dst: The destination VMA with an empty anon_vma_chain.
+ * @src: The source VMA we wish to duplicate.
+ *
+ * This is the heart of the VMA side of the anon_vma implementation - we invoke
+ * this function whenever we need to set up a new VMA's anon_vma state.
+ *
+ * This is invoked for:
+ *
+ * - VMA Merge, but only when @dst is unfaulted and @src is faulted - meaning we
+ * clone @src into @dst.
+ * - VMA split.
+ * - VMA (m)remap.
+ * - Fork of faulted VMA.
+ *
+ * In all cases other than fork this is simply a duplication. Fork additionally
+ * adds a new active anon_vma.
+ *
+ * ONLY in the case of fork do we try to 'reuse' existing anon_vma's in an
+ * anon_vma hierarchy, reusing anon_vma's which have no VMA associated with them
+ * but do have a single child. This is to avoid waste of memory when repeatedly
+ * forking.
+ *
+ * Returns: 0 on success, -ENOMEM on failure.
*/
int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
{
struct anon_vma_chain *avc, *pavc;
struct anon_vma *root = NULL;
+ check_anon_vma_clone(dst, src);
+
list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
struct anon_vma *anon_vma;
@@ -392,11 +422,27 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
return -ENOMEM;
}
+/**
+ * unlink_anon_vmas() - remove all links between a VMA and anon_vma's, freeing
+ * anon_vma_chain objects.
+ * @vma: The VMA whose links to anon_vma objects is to be severed.
+ *
+ * As part of the process anon_vma_chain's are freed,
+ * anon_vma->num_children,num_active_vmas is updated as required and, if the
+ * relevant anon_vma references no further VMAs, its reference count is
+ * decremented.
+ */
void unlink_anon_vmas(struct vm_area_struct *vma)
{
struct anon_vma_chain *avc, *next;
struct anon_vma *root = NULL;
+ /* Always hold mmap lock, read-lock on unmap possibly. */
+ mmap_assert_locked(vma->vm_mm);
+
+ /* Unfaulted is a no-op. */
+ VM_WARN_ON_ONCE(!vma->anon_vma && !list_empty(&vma->anon_vma_chain));
+
/*
* Unlink each anon_vma chained to the VMA. This list is ordered
* from newest to oldest, ensuring the root anon_vma gets freed last.
--
2.52.0
next prev parent reply other threads:[~2025-12-17 12:27 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-17 12:27 [PATCH 0/8] mm: clean up anon_vma implementation Lorenzo Stoakes
2025-12-17 12:27 ` Lorenzo Stoakes [this message]
2025-12-19 18:22 ` [PATCH 1/8] mm/rmap: improve anon_vma_clone(), unlink_anon_vmas() comments, add asserts Liam R. Howlett
2025-12-29 21:18 ` Suren Baghdasaryan
2025-12-30 21:21 ` Suren Baghdasaryan
2025-12-17 12:27 ` [PATCH 2/8] mm/rmap: skip unfaulted VMAs on anon_vma clone, unlink Lorenzo Stoakes
2025-12-19 18:28 ` Liam R. Howlett
2025-12-29 21:41 ` Suren Baghdasaryan
2025-12-17 12:27 ` [PATCH 3/8] mm/rmap: remove unnecessary root lock dance in anon_vma clone, unmap Lorenzo Stoakes
2025-12-29 22:17 ` Suren Baghdasaryan
2025-12-17 12:27 ` [PATCH 4/8] mm/rmap: remove anon_vma_merge() function Lorenzo Stoakes
2025-12-30 19:35 ` Suren Baghdasaryan
2025-12-17 12:27 ` [PATCH 5/8] mm/rmap: make anon_vma functions internal Lorenzo Stoakes
2025-12-30 19:38 ` Suren Baghdasaryan
2025-12-17 12:27 ` [PATCH 6/8] mm/mmap_lock: add vma_is_attached() helper Lorenzo Stoakes
2025-12-30 19:50 ` Suren Baghdasaryan
2025-12-17 12:27 ` [PATCH 7/8] mm/rmap: allocate anon_vma_chain objects unlocked when possible Lorenzo Stoakes
2025-12-30 21:35 ` Suren Baghdasaryan
2025-12-17 12:27 ` [PATCH 8/8] mm/rmap: separate out fork-only logic on anon_vma_clone() Lorenzo Stoakes
2025-12-30 22:02 ` Suren Baghdasaryan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3acc90a8613d5e2ea8882d60b5677228e6fe624d.1765970117.git.lorenzo.stoakes@oracle.com \
--to=lorenzo.stoakes@oracle.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=chriscli@google.com \
--cc=david@kernel.org \
--cc=harry.yoo@oracle.com \
--cc=jannh@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=pfalcato@suse.de \
--cc=riel@surriel.com \
--cc=rppt@kernel.org \
--cc=shakeel.butt@linux.dev \
--cc=surenb@google.com \
--cc=v-songbaohua@oppo.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox