linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Cc: "Liam R . Howlett" <Liam.Howlett@oracle.com>
Subject: Re: [PATCH 07/10] mm: avoid using vma_merge() for new VMAs
Date: Thu, 8 Aug 2024 18:45:43 +0200
Message-ID: <28774b9d-b74e-4028-acef-5d4f09a5d36a@suse.cz>
In-Reply-To: <cf40652a2c3f6b987623f8f11a514618718546f7.1722849860.git.lorenzo.stoakes@oracle.com>

On 8/5/24 14:13, Lorenzo Stoakes wrote:
> In mmap_region() and do_brk_flags() we open code scenarios where we prefer
> to use vma_expand() rather than invoke a full vma_merge() operation.
> 
> Abstract this logic and eliminate all of the open-coding, and also use the
> same logic for all cases where we add new VMAs, so that rather than
> ultimately using vma_merge() we use vma_expand().
> 
> We implement this by replacing vma_merge_new_vma() with this newly
> abstracted logic.
> 
> Doing so removes duplication and simplifies VMA merging in all such cases,
> laying the ground for us to eliminate the merging of new VMAs in
> vma_merge() altogether.
> 
> This makes it far easier to understand what is happening in these cases
> avoiding confusion, bugs and allowing for future optimisation.
> 
> As a result of this change we are also able to make vma_prepare(),
> init_vma_prep(), vma_complete(), can_vma_merge_before() and
> can_vma_merge_after() static and internal to vma.c.

That's really great, but it would be even better if these code moves could
be a separate patch as it would make reviewing so much easier. But with git
diff's --color-moved to the rescue, let me try...

> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
>  mm/mmap.c                        |  79 ++---
>  mm/vma.c                         | 482 +++++++++++++++++++------------
>  mm/vma.h                         |  51 +---
>  tools/testing/vma/vma_internal.h |   6 +
>  4 files changed, 324 insertions(+), 294 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index f6593a81f73d..c03f50f46396 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1363,8 +1363,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  {
>  	struct mm_struct *mm = current->mm;
>  	struct vm_area_struct *vma = NULL;
> -	struct vm_area_struct *next, *prev, *merge;
> -	pgoff_t pglen = len >> PAGE_SHIFT;
> +	struct vm_area_struct *merge;
>  	unsigned long charged = 0;
>  	unsigned long end = addr + len;
>  	bool writable_file_mapping = false;
> @@ -1411,44 +1410,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  		vm_flags |= VM_ACCOUNT;
>  	}
>  
> -	next = vmg.next = vma_next(&vmi);
> -	prev = vmg.prev = vma_prev(&vmi);
> -	if (vm_flags & VM_SPECIAL) {
> -		if (prev)
> -			vma_iter_next_range(&vmi);
> -		goto cannot_expand;
> -	}
> -
> -	/* Attempt to expand an old mapping */
> -	/* Check next */
> -	if (next && next->vm_start == end && can_vma_merge_before(&vmg)) {
> -		/* We can adjust this as can_vma_merge_after() doesn't touch */
> -		vmg.end = next->vm_end;
> -		vma = vmg.vma = next;
> -		vmg.pgoff = next->vm_pgoff - pglen;
> -
> -		/* We may merge our NULL anon_vma with non-NULL in next. */
> -		vmg.anon_vma = vma->anon_vma;
> -	}
> -
> -	/* Check prev */
> -	if (prev && prev->vm_end == addr && can_vma_merge_after(&vmg)) {
> -		vmg.start = prev->vm_start;
> -		vma = vmg.vma = prev;
> -		vmg.pgoff = prev->vm_pgoff;
> -	} else if (prev) {
> -		vma_iter_next_range(&vmi);
> -	}
> -
> -	/* Actually expand, if possible */
> -	if (vma && !vma_expand(&vmg)) {
> -		khugepaged_enter_vma(vma, vm_flags);
> +	vma = vma_merge_new_vma(&vmg);
> +	if (vma)
>  		goto expanded;
> -	}
> -
> -	if (vma == prev)
> -		vma_iter_set(&vmi, addr);
> -cannot_expand:
>  
>  	/*
>  	 * Determine the object being mapped and call the appropriate
> @@ -1493,10 +1457,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  		 * If vm_flags changed after call_mmap(), we should try merge
>  		 * vma again as we may succeed this time.
>  		 */
> -		if (unlikely(vm_flags != vma->vm_flags && prev)) {
> -			merge = vma_merge_new_vma_wrapper(&vmi, prev, vma,
> -							  vma->vm_start, vma->vm_end,
> -							  vma->vm_pgoff);
> +		if (unlikely(vm_flags != vma->vm_flags && vmg.prev)) {
> +			merge = vma_merge_new_vma(&vmg);

Can this even succeed if we don't update vmg.flags first? Previously the
wrapper would take the flags from vma.
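
To make the concern concrete, here is a minimal userspace sketch (not kernel
code). All struct and function names are simplified, hypothetical stand-ins.
It assumes the merge check compares the flags stored in the descriptor, so if
->mmap() changed the VMA's flags and the descriptor was not refreshed, the
retry sees the same stale flags that already failed to merge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_vma {
	unsigned long start, end;	/* [start, end) */
	unsigned long flags;
};

struct model_vmg {
	struct model_vma *prev;		/* VMA just before the new range, if any */
	unsigned long start, end;	/* proposed new range */
	unsigned long flags;		/* flags the merge check compares against */
};

/* Merge with prev only if it is adjacent and the descriptor's flags match. */
static bool model_can_merge_after(const struct model_vmg *vmg)
{
	return vmg->prev && vmg->prev->end == vmg->start &&
	       vmg->prev->flags == vmg->flags;
}

/*
 * Retry after a driver hook changed the new VMA's flags. If refresh is
 * false the descriptor keeps its stale flags, which is the case being
 * asked about.
 */
static bool model_retry_merge(struct model_vmg *vmg,
			      const struct model_vma *new_vma, bool refresh)
{
	if (refresh)
		vmg->flags = new_vma->flags;
	return model_can_merge_after(vmg);
}
```

In this model the retry can only succeed when the descriptor's flags are
refreshed from the VMA before the second attempt.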

> +
>  			if (merge) {
>  				/*
>  				 * ->mmap() can change vma->vm_file and fput

<snip>

> +/*
> + * vma_merge_new_vma - Attempt to merge a new VMA into address space
> + *
> + * @vmg: Describes the VMA we are adding, in the range @vmg->start to @vmg->end
> + *       (exclusive), which we try to merge with any adjacent VMAs if possible.
> + *
> + * We are about to add a VMA to the address space starting at @vmg->start and
> + * ending at @vmg->end. There are three different possible scenarios:
> + *
> + * 1. There is a VMA with identical properties immediately adjacent to the
> + *    proposed new VMA [@vmg->start, @vmg->end) either before or after it -
> + *    EXPAND that VMA:
> + *
> + * Proposed:       |-----|  or  |-----|
> + * Existing:  |----|                  |----|
> + *
> + * 2. There are VMAs with identical properties immediately adjacent to the
> + *    proposed new VMA [@vmg->start, @vmg->end) both before AND after it -
> + *    EXPAND the former and REMOVE the latter:
> + *
> + * Proposed:       |-----|
> + * Existing:  |----|     |----|
> + *
> + * 3. There are no VMAs immediately adjacent to the proposed new VMA or those
> + *    VMAs do not have identical attributes - NO MERGE POSSIBLE.
> + *
> + * In instances where we can merge, this function returns the expanded VMA which
> + * will have its range adjusted accordingly and the underlying maple tree also
> + * adjusted.
> + *
> + * Returns: In instances where no merge was possible, NULL. Otherwise, a pointer
> + *          to the VMA we expanded.
> + *
> + * This function also adjusts @vmg to provide @vmg->prev and @vmg->next if
> + * neither already specified, and adjusts [@vmg->start, @vmg->end) to span the
> + * expanded range.
> + *
> + * ASSUMPTIONS:
> + * - The caller must hold a WRITE lock on the mm_struct->mmap_lock.
> + * - The caller must have determined that [@vmg->start, @vmg->end) is empty.

Should we be paranoid and assert something?
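
As a rough idea of what such a check would verify, here is a userspace
sketch (not kernel code; the names are hypothetical and the linear scan
merely stands in for whatever lookup the kernel would use). It tests that no
existing VMA overlaps the half-open range [start, end).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified VMA: covers [start, end). */
struct model_vma {
	unsigned long start, end;
};

/* Return true if no VMA in the array overlaps [start, end). */
static bool model_range_is_empty(const struct model_vma *vmas, size_t n,
				 unsigned long start, unsigned long end)
{
	for (size_t i = 0; i < n; i++) {
		/* Two half-open ranges overlap iff each starts before the other ends. */
		if (vmas[i].start < end && start < vmas[i].end)
			return false;
	}
	return true;
}
```

In the kernel this would presumably be wrapped in a debug-only warning, in
the same spirit as the VM_WARN_ON(vmg->vma) the function already has.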

> + */
> +struct vm_area_struct *vma_merge_new_vma(struct vma_merge_struct *vmg)
> +{
> +	bool is_special = vmg->flags & VM_SPECIAL;
> +	struct vm_area_struct *prev = vmg->prev;
> +	struct vm_area_struct *next = vmg->next;
> +	unsigned long start = vmg->start;
> +	unsigned long end = vmg->end;
> +	pgoff_t pgoff = vmg->pgoff;
> +	pgoff_t pglen = PHYS_PFN(end - start);
> +
> +	VM_WARN_ON(vmg->vma);
> +
> +	if (!prev && !next) {
> +		/*
> +		 * Since the caller must have determined that the requested
> +		 * range is empty, vmg->vmi will be left pointing at the VMA
> +		 * immediately prior.
> +		 */

OK, that's perhaps not that obvious, as it seems copy_vma() is doing some
special dance to ensure this. Should we add it to the ASSUMPTIONS and assert
it, or is there a maple tree operation we can do to ensure it, ideally one
that's very cheap if the iterator is already positioned the way we want?
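
To spell out the property I mean, here is a userspace sketch (not kernel
code). It models the tree as a sorted, non-overlapping array and the iterator
as an index; every name is hypothetical. The check passes only if the cursor
rests on the VMA immediately before the empty range (or on nothing, when no
VMA precedes it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified VMA: covers [start, end). */
struct model_vma {
	unsigned long start, end;
};

/*
 * vmas is sorted and non-overlapping; cur is the index the cursor rests
 * on, or -1 when nothing precedes the range. Returns true if the VMA at
 * cur ends at or before start and the following VMA, if any, begins at
 * or after end. Assumes start < end.
 */
static bool model_cursor_is_prior(const struct model_vma *vmas, size_t n,
				  long cur, unsigned long start,
				  unsigned long end)
{
	if (cur < -1 || cur >= (long)n)
		return false;
	if (cur >= 0 && vmas[cur].end > start)
		return false;
	if ((size_t)(cur + 1) < n && vmas[cur + 1].start < end)
		return false;
	return true;
}
```

With a sorted array this is two comparisons, which is the kind of cost I
would hope for when the iterator is already in the right place.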

> +		next = vmg->next = vma_next(vmg->vmi);
> +		prev = vmg->prev = vma_prev(vmg->vmi);
> +
> +		/* Avoid maple tree re-walk. */
> +		if (is_special && prev)
> +			vma_iter_next_range(vmg->vmi);

I wish I knew what this did, but it seems to be the same as what the old
code did, so hopefully that's fine.

> +	}
> +
> +	/* If special mapping or no adjacent VMAs, nothing to merge. */
> +	if (is_special || (!prev && !next))
> +		return NULL;
> +
> +	/* If we can merge with the following VMA, adjust vmg accordingly. */
> +	if (next && next->vm_start == end && can_vma_merge_before(vmg)) {
> +		/*
> +		 * We can adjust this here as can_vma_merge_after() doesn't
> +		 * touch vmg->end.
> +		 */
> +		vmg->end = next->vm_end;
> +		vmg->vma = next;
> +		vmg->pgoff = next->vm_pgoff - pglen;
> +
> +		vmg->anon_vma = next->anon_vma;
> +	}
> +
> +	/* If we can merge with the previous VMA, adjust vmg accordingly. */
> +	if (prev && prev->vm_end == start && can_vma_merge_after(vmg)) {
> +		vmg->start = prev->vm_start;
> +		vmg->vma = prev;
> +		vmg->pgoff = prev->vm_pgoff;
> +	} else if (prev) {
> +		vma_iter_next_range(vmg->vmi);
> +	}

Sigh... ditto.
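
For reference while reviewing, the three scenarios from the kernel-doc
comment can be modelled in userspace as follows (not kernel code; the names
are hypothetical, and "identical properties" is reduced to a flags
comparison, which is an assumption that ignores file, pgoff, anon_vma and
policy checks). Both adjacency tests use the original range, mirroring how
the quoted function keeps start and end in locals before adjusting vmg.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified VMA: covers [start, end) with some flags. */
struct model_vma {
	unsigned long start, end, flags;
};

enum model_merge {
	MODEL_NO_MERGE,			/* scenario 3 */
	MODEL_EXPAND_NEXT,		/* scenario 1, neighbour after */
	MODEL_EXPAND_PREV,		/* scenario 1, neighbour before */
	MODEL_EXPAND_PREV_REMOVE_NEXT,	/* scenario 2 */
};

/*
 * Decide how a new range [*start, *end) merges with its neighbours and
 * widen the range to the span the expanded VMA would cover.
 */
static enum model_merge model_merge_new(const struct model_vma *prev,
					const struct model_vma *next,
					unsigned long *start,
					unsigned long *end,
					unsigned long flags)
{
	/* Evaluate both neighbours against the original, unmodified range. */
	int merge_next = next && next->start == *end && next->flags == flags;
	int merge_prev = prev && prev->end == *start && prev->flags == flags;

	if (merge_next)
		*end = next->end;
	if (merge_prev)
		*start = prev->start;

	if (merge_prev && merge_next)
		return MODEL_EXPAND_PREV_REMOVE_NEXT;
	if (merge_prev)
		return MODEL_EXPAND_PREV;
	if (merge_next)
		return MODEL_EXPAND_NEXT;
	return MODEL_NO_MERGE;
}
```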


