From: "Liam R. Howlett" <Liam.Howlett@oracle.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	David Hildenbrand <david@kernel.org>,
	Rik van Riel <riel@surriel.com>, Harry Yoo <harry.yoo@oracle.com>,
	Jann Horn <jannh@google.com>, Mike Rapoport <rppt@kernel.org>,
	Michal Hocko <mhocko@suse.com>, Pedro Falcato <pfalcato@suse.de>,
	Chris Li <chriscli@google.com>,
	Barry Song <v-songbaohua@oppo.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 8/8] mm/rmap: separate out fork-only logic on anon_vma_clone()
Date: Tue, 6 Jan 2026 14:27:37 -0500
Message-ID: <oecut2v62gxlisp24gxgzwzxm3y3kiqcbkrheuxwkwyrb45apl@qtyhfsv34wi7>
In-Reply-To: <a4f87ba914697a47419f12901a995759fd977fdd.1767711638.git.lorenzo.stoakes@oracle.com>

* Lorenzo Stoakes <lorenzo.stoakes@oracle.com> [260106 10:05]:
> Specify which operation is being performed to anon_vma_clone(), which
> allows us to do checks specific to each operation type, as well as to
> separate out and make clear that the anon_vma reuse logic is absolutely
> specific to fork only.
> 
> This opens the door to further refactorings and refinements later as we
> have more information to work with.
> 
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

A few minor things, but this looks correct.

Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

> ---
>  mm/internal.h                    | 11 ++++-
>  mm/rmap.c                        | 74 ++++++++++++++++++++++----------
>  mm/vma.c                         |  6 +--
>  tools/testing/vma/vma_internal.h | 11 ++++-
>  4 files changed, 74 insertions(+), 28 deletions(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index 4ba784023a9f..8baa7bd2b8f7 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -244,7 +244,16 @@ static inline void anon_vma_unlock_read(struct anon_vma *anon_vma)
>  
>  struct anon_vma *folio_get_anon_vma(const struct folio *folio);
>  
> -int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src);
> +/* Operations which modify VMAs. */
> +enum vma_operation {
> +	VMA_OP_SPLIT,
> +	VMA_OP_MERGE_UNFAULTED,
> +	VMA_OP_REMAP,
> +	VMA_OP_FORK,
> +};
> +
> +int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src,
> +	enum vma_operation operation);
>  int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma);
>  int  __anon_vma_prepare(struct vm_area_struct *vma);
>  void unlink_anon_vmas(struct vm_area_struct *vma);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 8f4393546bce..336b27e00238 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -233,12 +233,13 @@ int __anon_vma_prepare(struct vm_area_struct *vma)
>  }
>  
>  static void check_anon_vma_clone(struct vm_area_struct *dst,
> -				 struct vm_area_struct *src)
> +				 struct vm_area_struct *src,
> +				 enum vma_operation operation)

You could save a line here by putting src and operation on the same line
and indenting the continuation with only two tabs, but sure.  The same
applies to the earlier patches.
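
To be concrete about the layout I mean, here is a rough userspace sketch.
The struct definitions are stand-ins, MODEL_WARN_ON() and nr_warnings are
hypothetical replacements for VM_WARN_ON_ONCE() so it builds outside the
kernel, and only the same-mm check from the quoted hunk is carried over:

```c
#include <assert.h>

/* Stand-ins for the kernel types; only what the sketch needs. */
struct mm_struct { int unused; };
struct vm_area_struct { struct mm_struct *vm_mm; };

enum vma_operation {
	VMA_OP_SPLIT,
	VMA_OP_MERGE_UNFAULTED,
	VMA_OP_REMAP,
	VMA_OP_FORK,
};

/* Hypothetical stand-in for VM_WARN_ON_ONCE(): counts hits instead of warning. */
static int nr_warnings;
#define MODEL_WARN_ON(cond) do { if (cond) nr_warnings++; } while (0)

/* src and operation share one continuation line, indented with two tabs. */
static void check_anon_vma_clone(struct vm_area_struct *dst,
		struct vm_area_struct *src, enum vma_operation operation)
{
	assert(dst && src);

	/* If not a fork then must be on same mm. */
	MODEL_WARN_ON(operation != VMA_OP_FORK && dst->vm_mm != src->vm_mm);
}
```

That is three lines of prototype down to two, nothing more.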

>  {
>  	/* The write lock must be held. */
>  	mmap_assert_write_locked(src->vm_mm);
> -	/* If not a fork (implied by dst->anon_vma) then must be on same mm. */
> -	VM_WARN_ON_ONCE(dst->anon_vma && dst->vm_mm != src->vm_mm);
> +	/* If not a fork then must be on same mm. */
> +	VM_WARN_ON_ONCE(operation != VMA_OP_FORK && dst->vm_mm != src->vm_mm);
>  
>  	/* If we have anything to do src->anon_vma must be provided. */
>  	VM_WARN_ON_ONCE(!src->anon_vma && !list_empty(&src->anon_vma_chain));
> @@ -250,6 +251,40 @@ static void check_anon_vma_clone(struct vm_area_struct *dst,
>  	 * must be the same across dst and src.
>  	 */
>  	VM_WARN_ON_ONCE(dst->anon_vma && dst->anon_vma != src->anon_vma);
> +	/*
> +	 * Essentially equivalent to above - if not a no-op, we should expect
> +	 * dst->anon_vma to be set for everything except a fork.
> +	 */
> +	VM_WARN_ON_ONCE(operation != VMA_OP_FORK && src->anon_vma &&
> +			!dst->anon_vma);
> +	/* For the anon_vma to be compatible, it can only be singular. */
> +	VM_WARN_ON_ONCE(operation == VMA_OP_MERGE_UNFAULTED &&
> +			!list_is_singular(&src->anon_vma_chain));
> +#ifdef CONFIG_PER_VMA_LOCK
> +	/* Only merging an unfaulted VMA leaves the destination attached. */
> +	VM_WARN_ON_ONCE(operation != VMA_OP_MERGE_UNFAULTED &&
> +			vma_is_attached(dst));
> +#endif
> +}
> +

The "try" prefix seems to imply that we can return something saying it
didn't work out, but this returns void.  Naming is hard.
reuse_anon_vma_if_necessary() seems insane, so I don't really have
anything better.
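
For illustration only, this is roughly what the "try" name would lead me
to expect: a hypothetical bool-returning variant, not what the patch does.
The structs are userspace stand-ins holding just the fields the helper
touches, and nothing in the series needs the return value today:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins carrying only the fields the quoted helper touches. */
struct anon_vma {
	unsigned long num_children;
	unsigned long num_active_vmas;
};

struct vm_area_struct {
	struct anon_vma *anon_vma;
};

/*
 * Hypothetical variant: same logic as the quoted helper, but reporting
 * whether @anon_vma was actually installed in @dst.
 */
static bool try_to_reuse_anon_vma(struct vm_area_struct *dst,
		struct anon_vma *anon_vma)
{
	assert(dst && anon_vma);

	/* If already populated, nothing to do. */
	if (dst->anon_vma)
		return false;

	/* Only reuse an anon_vma with no active VMAs and at most one child. */
	if (anon_vma->num_active_vmas > 0)
		return false;
	if (anon_vma->num_children > 1)
		return false;

	dst->anon_vma = anon_vma;
	anon_vma->num_active_vmas++;
	return true;
}
```

With no caller consuming the result, keeping it void is defensible, which
is why the complaint is only about the name.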

> +static void try_to_reuse_anon_vma(struct vm_area_struct *dst,
> +				  struct anon_vma *anon_vma)
> +{
> +	/* If already populated, nothing to do.*/
> +	if (dst->anon_vma)
> +		return;

This is only called for VMA_OP_FORK, so how would dst->anon_vma already
be populated?

I assume this is for a later refinement?
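
Going only by the quoted hunk, one way it could already be set: the helper
is called from the per-AVC block in anon_vma_clone(), so if that block runs
once per chain entry (an assumption, since the loop header is not in the
quoted context), a successful reuse on one pass leaves dst->anon_vma set
for the remaining passes.  A rough userspace model of that reading, with
stand-in structs, an array in place of the list, and a hypothetical
model_fork_clone() driver:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins carrying only the fields the quoted helper touches. */
struct anon_vma {
	unsigned long num_children;
	unsigned long num_active_vmas;
};

struct vm_area_struct {
	struct anon_vma *anon_vma;
};

/* The quoted helper, transcribed onto the stand-in types. */
static void try_to_reuse_anon_vma(struct vm_area_struct *dst,
		struct anon_vma *anon_vma)
{
	if (dst->anon_vma)
		return;
	if (anon_vma->num_active_vmas > 0)
		return;
	if (anon_vma->num_children > 1)
		return;

	dst->anon_vma = anon_vma;
	anon_vma->num_active_vmas++;
}

/*
 * Hypothetical model of the fork path: call the helper once per chain
 * entry and return how many calls found dst->anon_vma already populated.
 */
static size_t model_fork_clone(struct vm_area_struct *dst,
		struct anon_vma **chain, size_t nr)
{
	size_t already_populated = 0;
	size_t i;

	assert(dst && chain);

	for (i = 0; i < nr; i++) {
		if (dst->anon_vma)
			already_populated++;
		try_to_reuse_anon_vma(dst, chain[i]);
	}
	return already_populated;
}
```

If that reading is right, the early return is what stops a second reuse
within the same clone; if not, the check looks dead for now.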

> +
> +	/*
> +	 * We reuse an anon_vma if any linking VMAs were unmapped and it has
> +	 * only a single child at most.
> +	 */
> +	if (anon_vma->num_active_vmas > 0)
> +		return;
> +	if (anon_vma->num_children > 1)
> +		return;
> +
> +	dst->anon_vma = anon_vma;
> +	anon_vma->num_active_vmas++;
>  }
>  
>  static void cleanup_partial_anon_vmas(struct vm_area_struct *vma);
> @@ -259,6 +294,7 @@ static void cleanup_partial_anon_vmas(struct vm_area_struct *vma);
>   * all of the anon_vma objects contained within @src anon_vma_chain's.
>   * @dst: The destination VMA with an empty anon_vma_chain.
>   * @src: The source VMA we wish to duplicate.
> + * @operation: The type of operation which resulted in the clone.
>   *
>   * This is the heart of the VMA side of the anon_vma implementation - we invoke
>   * this function whenever we need to set up a new VMA's anon_vma state.
> @@ -281,17 +317,17 @@ static void cleanup_partial_anon_vmas(struct vm_area_struct *vma);
>   *
>   * Returns: 0 on success, -ENOMEM on failure.
>   */
> -int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> +int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src,
> +		   enum vma_operation operation)
>  {
>  	struct anon_vma_chain *avc, *pavc;
> +	struct anon_vma *active_anon_vma = src->anon_vma;
>  
> -	check_anon_vma_clone(dst, src);
> +	check_anon_vma_clone(dst, src, operation);
>  
> -	if (!src->anon_vma)
> +	if (!active_anon_vma)
>  		return 0;
>  
> -	check_anon_vma_clone(dst, src);
> -
>  	/*
>  	 * Allocate AVCs. We don't need an anon_vma lock for this as we
>  	 * are not updating the anon_vma rbtree nor are we changing
> @@ -317,22 +353,14 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
>  		struct anon_vma *anon_vma = avc->anon_vma;
>  
>  		anon_vma_interval_tree_insert(avc, &anon_vma->rb_root);
> -
> -		/*
> -		 * Reuse existing anon_vma if it has no vma and only one
> -		 * anon_vma child.
> -		 *
> -		 * Root anon_vma is never reused:
> -		 * it has self-parent reference and at least one child.
> -		 */
> -		if (!dst->anon_vma && src->anon_vma &&
> -		    anon_vma->num_children < 2 &&
> -		    anon_vma->num_active_vmas == 0)
> -			dst->anon_vma = anon_vma;
> +		if (operation == VMA_OP_FORK)
> +			try_to_reuse_anon_vma(dst, anon_vma);
>  	}
> -	if (dst->anon_vma)
> +
> +	if (operation != VMA_OP_FORK)
>  		dst->anon_vma->num_active_vmas++;
> -	anon_vma_unlock_write(src->anon_vma);
> +
> +	anon_vma_unlock_write(active_anon_vma);
>  	return 0;
>  
>   enomem_failure:
> @@ -362,7 +390,7 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
>  	 * First, attach the new VMA to the parent VMA's anon_vmas,
>  	 * so rmap can find non-COWed pages in child processes.
>  	 */
> -	error = anon_vma_clone(vma, pvma);
> +	error = anon_vma_clone(vma, pvma, VMA_OP_FORK);
>  	if (error)
>  		return error;
>  
> diff --git a/mm/vma.c b/mm/vma.c
> index 4294ecdc23a5..2a063d6568d9 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -528,7 +528,7 @@ __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  	if (err)
>  		goto out_free_vmi;
>  
> -	err = anon_vma_clone(new, vma);
> +	err = anon_vma_clone(new, vma, VMA_OP_SPLIT);
>  	if (err)
>  		goto out_free_mpol;
>  
> @@ -626,7 +626,7 @@ static int dup_anon_vma(struct vm_area_struct *dst,
>  
>  		vma_assert_write_locked(dst);
>  		dst->anon_vma = src->anon_vma;
> -		ret = anon_vma_clone(dst, src);
> +		ret = anon_vma_clone(dst, src, VMA_OP_MERGE_UNFAULTED);
>  		if (ret)
>  			return ret;
>  
> @@ -1899,7 +1899,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
>  		vma_set_range(new_vma, addr, addr + len, pgoff);
>  		if (vma_dup_policy(vma, new_vma))
>  			goto out_free_vma;
> -		if (anon_vma_clone(new_vma, vma))
> +		if (anon_vma_clone(new_vma, vma, VMA_OP_REMAP))
>  			goto out_free_mempol;
>  		if (new_vma->vm_file)
>  			get_file(new_vma->vm_file);
> diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
> index 93e5792306d9..7fa56dcc53a6 100644
> --- a/tools/testing/vma/vma_internal.h
> +++ b/tools/testing/vma/vma_internal.h
> @@ -600,6 +600,14 @@ struct mmap_action {
>  	bool hide_from_rmap_until_complete :1;
>  };
>  
> +/* Operations which modify VMAs. */
> +enum vma_operation {
> +	VMA_OP_SPLIT,
> +	VMA_OP_MERGE_UNFAULTED,
> +	VMA_OP_REMAP,
> +	VMA_OP_FORK,
> +};
> +
>  /*
>   * Describes a VMA that is about to be mmap()'ed. Drivers may choose to
>   * manipulate mutable fields which will cause those fields to be updated in the
> @@ -1157,7 +1165,8 @@ static inline int vma_dup_policy(struct vm_area_struct *src, struct vm_area_stru
>  	return 0;
>  }
>  
> -static inline int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> +static inline int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src,
> +				 enum vma_operation operation)
>  {
>  	/* For testing purposes. We indicate that an anon_vma has been cloned. */
>  	if (src->anon_vma != NULL) {
> -- 
> 2.52.0
> 

