linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 07/10] mm: avoid using vma_merge() for new VMAs
Date: Fri, 9 Aug 2024 11:14:53 +0100	[thread overview]
Message-ID: <0570a5ed-2583-4d96-9d92-95865ee65b97@lucifer.local> (raw)
In-Reply-To: <neghnx3e4alozuee2v7ucu5utced3olxscg7ponqtt7f3tr4cs@73az35wxv5ed>

On Thu, Aug 08, 2024 at 03:06:14PM GMT, Liam R. Howlett wrote:
> * Liam R. Howlett <Liam.Howlett@oracle.com> [240808 14:34]:
> > * Lorenzo Stoakes <lorenzo.stoakes@oracle.com> [240808 14:02]:
> > > On Thu, Aug 08, 2024 at 06:45:43PM GMT, Vlastimil Babka wrote:
> > > > On 8/5/24 14:13, Lorenzo Stoakes wrote:
> > > > > In mmap_region() and do_brk_flags() we open code scenarios where we prefer
> > > > > to use vma_expand() rather than invoke a full vma_merge() operation.
> > > > >
> > > > > Abstract this logic and eliminate all of the open-coding, and also use the
> > > > > same logic for all cases where we add new VMAs to, rather than ultimately
> > > > > use vma_merge(), rather use vma_expand().
> > > > >
> > > > > We implement this by replacing vma_merge_new_vma() with this newly
> > > > > abstracted logic.
> > > > >
> > > > > Doing so removes duplication and simplifies VMA merging in all such cases,
> > > > > laying the ground for us to eliminate the merging of new VMAs in
> > > > > vma_merge() altogether.
> > > > >
> > > > > This makes it far easier to understand what is happening in these cases
> > > > > avoiding confusion, bugs and allowing for future optimisation.
> > > > >
> > > > > As a result of this change we are also able to make vma_prepare(),
> > > > > init_vma_prep(), vma_complete(), can_vma_merge_before() and
> > > > > can_vma_merge_after() static and internal to vma.c.
> > > >
> > > > That's really great, but it would be even better if these code moves could
> > > > be a separate patch as it would make reviewing so much easier. But with git
> > > > diff's --color-moved to the rescue, let me try...
> > >
> > > Will separate out on respin.
> > >
> > > >
> > > > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > > > > ---
> > > > >  mm/mmap.c                        |  79 ++---
> > > > >  mm/vma.c                         | 482 +++++++++++++++++++------------
> > > > >  mm/vma.h                         |  51 +---
> > > > >  tools/testing/vma/vma_internal.h |   6 +
> > > > >  4 files changed, 324 insertions(+), 294 deletions(-)
> >
> > ...
> > > > > + */
> > > > > +struct vm_area_struct *vma_merge_new_vma(struct vma_merge_struct *vmg)
> > > > > +{
> > > > > +	bool is_special = vmg->flags & VM_SPECIAL;
> > > > > +	struct vm_area_struct *prev = vmg->prev;
> > > > > +	struct vm_area_struct *next = vmg->next;
> > > > > +	unsigned long start = vmg->start;
> > > > > +	unsigned long end = vmg->end;
> > > > > +	pgoff_t pgoff = vmg->pgoff;
> > > > > +	pgoff_t pglen = PHYS_PFN(end - start);
> > > > > +
> > > > > +	VM_WARN_ON(vmg->vma);
> > > > > +
> > > > > +	if (!prev && !next) {
> > > > > +		/*
> > > > > +		 * Since the caller must have determined that the requested
> > > > > +		 * range is empty, vmg->vmi will be left pointing at the VMA
> > > > > +		 * immediately prior.
> > > > > +		 */
> > > >
> > > > OK that's perhaps not that obvious, as it seems copy_vma() is doing some
> > > > special dance to ensure this. Should we add it to the ASSUMPTIONS and assert
> > > > it, or is there a maple tree operation we can do to ensure it, ideally if
> > > > it's very cheap if the iterator is already set the way we want it to be?
> > > >
> > >
> > > To be fair this is something that was previously assumed, and I just added
> > > a comment.
> > >
> > > Will add to assumptions, and again I think any assert should be done in
> > > such a way that under non-CONFIG_DEBUG_VM nothing happens, maybe
> > > VM_WARN_ON()?
> > >
> > > Will try to come up with something.
>
> Something like:
>
> VM_BUG_ON(vma_iter_end(vmg->vmi) > start);
>
> > >
> > > > > +		next = vmg->next = vma_next(vmg->vmi);
>
> and:
>
> VM_BUG_ON(vma_iter_addr(vmg->vmi) < end);
>

Ack x2.

Thought VM_BUG_ON() was 'not done' these days though... but checkpatch.pl
has become rather hit or miss as to what should be given attention to or
not.

> > > > > +		prev = vmg->prev = vma_prev(vmg->vmi);
> > > > > +
> > > > > +		/* Avoid maple tree re-walk. */
> > > > > +		if (is_special && prev)
> > > > > +			vma_iter_next_range(vmg->vmi);
> > > >
> > > > I wish I knew what this did but seems it's the same as the old code did so
> > > > hopefully that's fine.
> > >
> > > I think point is that we are about to exit, so we'd be left pointing at
> > > prev. But since we're exiting in just a second, we want to be pointing at
> > > the next vma which will become the prev of the next merge attempt.
> > >
> > > Liam can maybe elucidate further.
> >
> > What you have to remember is that the vma iterator (vmg->vmi above),
> > contains (or, basically is) a maple state (usually written as mas).  We
> > keep state of the maple tree walker so that we don't have to keep
> > re-walking to find the same thing.  We move around the tree with this
> > maple state because going prev/next is faster from leaves (almost always
> > just the next thing in the nodes array of pointers).
> >
> > We use the maple state to write as well, so the maple state needs to
> > point to the correct location in the tree for a write.
> >
> > The maple tree is a range-based tree, so each entry exists for a span of
> > values.  A write happens at the lowest index and can overwrite
> > subsequent values.  This means that the maple state needs to point to
> > the range containing the lowest index for the write (if it's pointing to
> > a node - it could walk from the top).
> >
> > A side effect of writing to the lowest index is that we need to point to
> > the previous vma if we are going to 'expand' the vma.  The range is
> > essentially going to be from prev->start to "whatever we are expanding
> > over".
> >
> > In the old code, the vm_flags & VM_SPECIAL code meant there was no way
> > an expansion was going to happen, but we've moved the maple state to the
> > wrong location for a write of a new vma - so this vma_iter_next_range()
> > just moves it back.  Then we "goto cannot_expand".
> >


  reply	other threads:[~2024-08-09 10:15 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-05 12:13 [PATCH 00/10] mm: remove vma_merge() Lorenzo Stoakes
2024-08-05 12:13 ` [PATCH 01/10] tools: improve vma test Makefile Lorenzo Stoakes
2024-08-05 12:13 ` [PATCH 02/10] mm: introduce vma_merge_struct and abstract merge parameters Lorenzo Stoakes
2024-08-06 12:47   ` Petr Tesařík
2024-08-06 13:43     ` Lorenzo Stoakes
2024-08-06 14:06       ` Petr Tesařík
2024-08-06 14:20         ` Lorenzo Stoakes
2024-08-06 14:32           ` Petr Tesařík
2024-08-08 12:49   ` Vlastimil Babka
2024-08-08 17:18     ` Lorenzo Stoakes
2024-08-08 20:07   ` Liam R. Howlett
2024-08-09 10:11     ` Lorenzo Stoakes
2024-08-05 12:13 ` [PATCH 03/10] mm: abstract duplicated policy comparison Lorenzo Stoakes
2024-08-06 12:50   ` Petr Tesařík
2024-08-05 12:13 ` [PATCH 04/10] mm: abstract parameters for vma_expand/shrink() Lorenzo Stoakes
2024-08-06 12:54   ` Petr Tesařík
     [not found]   ` <f12608ec-9c40-4977-a5a6-479f86b44e80@kernel.org>
2024-08-08 15:45     ` Lorenzo Stoakes
2024-08-08 20:20       ` Liam R. Howlett
2024-08-09 10:18         ` Lorenzo Stoakes
2024-08-14 13:53       ` Lorenzo Stoakes
2024-08-05 12:13 ` [PATCH 05/10] mm: abstract vma_merge_new_vma() to use vma_merge_struct Lorenzo Stoakes
     [not found]   ` <82b802e0-94fd-4cca-ad8f-ea2d85bcae64@kernel.org>
2024-08-08 15:52     ` Lorenzo Stoakes
2024-08-05 12:13 ` [PATCH 06/10] tools: add VMA merge tests Lorenzo Stoakes
2024-08-05 12:13 ` [PATCH 07/10] mm: avoid using vma_merge() for new VMAs Lorenzo Stoakes
2024-08-06 13:04   ` Petr Tesařík
2024-08-06 13:44     ` Lorenzo Stoakes
2024-08-08 16:45   ` Vlastimil Babka
2024-08-08 18:02     ` Lorenzo Stoakes
2024-08-08 18:34       ` Liam R. Howlett
2024-08-08 19:06         ` Liam R. Howlett
2024-08-09 10:14           ` Lorenzo Stoakes [this message]
2024-08-09 15:23   ` Liam R. Howlett
2024-08-09 17:20     ` Lorenzo Stoakes
2024-08-05 12:13 ` [PATCH 08/10] mm: introduce commit_merge(), abstracting merge operation Lorenzo Stoakes
2024-08-06 13:41   ` Petr Tesařík
2024-08-06 13:48     ` Lorenzo Stoakes
2024-08-06 14:13       ` Petr Tesařík
2024-08-06 14:30         ` Lorenzo Stoakes
2024-08-06 14:39           ` Petr Tesařík
2024-08-09 10:15   ` Vlastimil Babka
2024-08-09 10:53     ` Lorenzo Stoakes
2024-08-05 12:13 ` [PATCH 09/10] mm: refactor vma_merge() into modify-only vma_merge_modified() Lorenzo Stoakes
2024-08-06 13:42   ` Petr Tesařík
2024-08-06 13:52     ` Lorenzo Stoakes
2024-08-09 13:44   ` Vlastimil Babka
2024-08-09 13:57     ` Lorenzo Stoakes
2024-08-05 12:13 ` [PATCH 10/10] mm: rework vm_ops->close() handling on VMA merge Lorenzo Stoakes
2024-08-06 13:55   ` Petr Tesařík
2024-08-06 14:08     ` Lorenzo Stoakes
2024-08-06 14:21       ` Petr Tesařík
2024-08-06 14:42         ` Lorenzo Stoakes
2024-08-09 14:25   ` Vlastimil Babka
2024-08-09 14:37     ` Lorenzo Stoakes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0570a5ed-2583-4d96-9d92-95865ee65b97@lucifer.local \
    --to=lorenzo.stoakes@oracle.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox