From: "Liam R. Howlett" <Liam.Howlett@oracle.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Suren Baghdasaryan <surenb@google.com>,
Vlastimil Babka <vbabka@suse.cz>,
Lorenzo Stoakes <lstoakes@gmail.com>,
Matthew Wilcox <willy@infradead.org>,
sidhartha.kumar@oracle.com,
"Paul E . McKenney" <paulmck@kernel.org>,
Bert Karwatzki <spasswolf@web.de>, Jiri Olsa <olsajiri@gmail.com>,
linux-kernel@vger.kernel.org, Kees Cook <kees@kernel.org>
Subject: Re: [PATCH v3 11/16] mm/mmap: Track start and end of munmap in vma_munmap_struct
Date: Mon, 8 Jul 2024 10:45:38 -0400 [thread overview]
Message-ID: <jghcpyibgtemzndnpeqncc6qboym3uk26nxsjsht2sllcgaheh@37zpmuehbhdq> (raw)
In-Reply-To: <37ea5831-2163-4086-8b2c-baff3be2e5ad@lucifer.local>
* Lorenzo Stoakes <lorenzo.stoakes@oracle.com> [240705 16:27]:
> On Thu, Jul 04, 2024 at 02:27:13PM GMT, Liam R. Howlett wrote:
> > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> >
> > Set the start and end address for munmap when the prev and next are
> > gathered. This is needed to avoid incorrect addresses being used during
> > the vms_complete_munmap_vmas() function if the prev/next vma are
> > expanded.
>
> When we spoke about this separately you mentioned that specific arches may
> be more likely to encounter this issue, perhaps worth mentioning something
> about that in the commit msg? Unless I misunderstood you.
What we spoke about was mappings outside vmas, that is between two vmas
may have mappings on certain archs - I'm not entirely sure on this or if
it's still something we have to worry about. That is, we use the
prev->vm_end and next->vm_start as the unmapping range instead of the
actual vma start and end.
There is also the upper and lower limits if prev or next does not exist.
See git id 6ee8630e02be6, and e2cdef8c847b4 - probably from an older git
history than kernel.org: https://github.com/mpe/linux-fullhistory.git
What I am trying to avoid here is using the prev->vm_end address for
munmap when we are changing the prev->vm_end to expand over the area we
are mapping. And the same for expanding next backwards.
>
> >
> > Add a new helper vms_complete6ee8630e02be6_pte_clear(), which is needed later and
> > will avoid growing the argument list to unmap_region() beyond the 9 it
> > already has.
>
> My word.
>
> >
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > ---
> > mm/internal.h | 2 ++
> > mm/mmap.c | 34 +++++++++++++++++++++++++++-------
> > 2 files changed, 29 insertions(+), 7 deletions(-)
> >
> > diff --git a/mm/internal.h b/mm/internal.h
> > index 8cbbbe7d40f3..4c9f06669cc4 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -1493,6 +1493,8 @@ struct vma_munmap_struct {
> > struct list_head *uf; /* Userfaultfd list_head */
> > unsigned long start; /* Aligned start addr */
> > unsigned long end; /* Aligned end addr */
> > + unsigned long unmap_start;
> > + unsigned long unmap_end;
> > int vma_count; /* Number of vmas that will be removed */
> > unsigned long nr_pages; /* Number of pages being removed */
> > unsigned long locked_vm; /* Number of locked pages */
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index ecf55d32e804..45443a53be76 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -525,6 +525,8 @@ static inline void init_vma_munmap(struct vma_munmap_struct *vms,
> > vms->vma_count = 0;
> > vms->nr_pages = vms->locked_vm = vms->nr_accounted = 0;
> > vms->exec_vm = vms->stack_vm = vms->data_vm = 0;
> > + vms->unmap_start = FIRST_USER_ADDRESS;
> > + vms->unmap_end = USER_PGTABLES_CEILING;
> > }
> >
> > /*
> > @@ -2610,6 +2612,26 @@ static inline void abort_munmap_vmas(struct ma_state *mas_detach)
> > __mt_destroy(mas_detach->tree);
> > }
> >
> > +
> > +static void vms_complete_pte_clear(struct vma_munmap_struct *vms,
> > + struct ma_state *mas_detach, bool mm_wr_locked)
> > +{
> > + struct mmu_gather tlb;
> > +
> > + /*
> > + * We can free page tables without write-locking mmap_lock because VMAs
> > + * were isolated before we downgraded mmap_lock.
> > + */
> > + mas_set(mas_detach, 1);
> > + lru_add_drain();
> > + tlb_gather_mmu(&tlb, vms->mm);
> > + update_hiwater_rss(vms->mm);
> > + unmap_vmas(&tlb, mas_detach, vms->vma, vms->start, vms->end, vms->vma_count, mm_wr_locked);
> > + mas_set(mas_detach, 1);
>
> I know it's necessary as unmap_vmas() will adjust mas_detach, but it kind
> of aesthetically sucks to set it to 1, do some stuff, then set it to 1
> again. But this is not a big deal :>)
>
> > + free_pgtables(&tlb, mas_detach, vms->vma, vms->unmap_start, vms->unmap_end, mm_wr_locked);
>
> Yeah this bit definitely needs a comment I think, this is very confusing
> indeed. Under what circumstances will these differ from [vms->start,
> vms->end), etc.?
>
> I'm guessing it's to do with !vms->prev and !vms->next needing to be set to
> [FIRST_USER_ADDRESS, USER_PGTABLES_CEILING)?
Yes, exactly. Since we are setting the range to unmap, we can just set
it to the correct value during the gather stage of the VMAs.
>
> > + tlb_finish_mmu(&tlb);
> > +}
> > +
> > /*
> > * vms_complete_munmap_vmas() - Finish the munmap() operation
> > * @vms: The vma munmap struct
> > @@ -2631,13 +2653,7 @@ static void vms_complete_munmap_vmas(struct vma_munmap_struct *vms,
> > if (vms->unlock)
> > mmap_write_downgrade(mm);
> >
> > - /*
> > - * We can free page tables without write-locking mmap_lock because VMAs
> > - * were isolated before we downgraded mmap_lock.
> > - */
> > - mas_set(mas_detach, 1);
> > - unmap_region(mm, mas_detach, vms->vma, vms->prev, vms->next,
> > - vms->start, vms->end, vms->vma_count, !vms->unlock);
> > + vms_complete_pte_clear(vms, mas_detach, !vms->unlock);
> > /* Update high watermark before we lower total_vm */
> > update_hiwater_vm(mm);
> > /* Stat accounting */
> > @@ -2699,6 +2715,8 @@ static int vms_gather_munmap_vmas(struct vma_munmap_struct *vms,
> > goto start_split_failed;
> > }
> > vms->prev = vma_prev(vms->vmi);
> > + if (vms->prev)
> > + vms->unmap_start = vms->prev->vm_end;
> >
> > /*
> > * Detach a range of VMAs from the mm. Using next as a temp variable as
> > @@ -2757,6 +2775,8 @@ static int vms_gather_munmap_vmas(struct vma_munmap_struct *vms,
> > }
> >
> > vms->next = vma_next(vms->vmi);
> > + if (vms->next)
> > + vms->unmap_end = vms->next->vm_start;
> >
> > #if defined(CONFIG_DEBUG_VM_MAPLE_TREE)
> > /* Make sure no VMAs are about to be lost. */
> > --
> > 2.43.0
> >
>
> Other than wanting some extra comments, this looks fine and I know how
> hard-won the unmap range bit of this change was so:
>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
next prev parent reply other threads:[~2024-07-08 14:46 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-04 18:27 [PATCH v3 00/16] Avoid MAP_FIXED gap exposure Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 01/16] mm/mmap: Correctly position vma_iterator in __split_vma() Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 02/16] mm/mmap: Introduce abort_munmap_vmas() Liam R. Howlett
2024-07-05 17:02 ` Lorenzo Stoakes
2024-07-05 18:12 ` Liam R. Howlett
2024-07-10 16:06 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 03/16] mm/mmap: Introduce vmi_complete_munmap_vmas() Liam R. Howlett
2024-07-05 17:39 ` Lorenzo Stoakes
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 04/16] mm/mmap: Extract the gathering of vmas from do_vmi_align_munmap() Liam R. Howlett
2024-07-05 18:01 ` Lorenzo Stoakes
2024-07-05 18:41 ` Liam R. Howlett
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-10 16:32 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 05/16] mm/mmap: Introduce vma_munmap_struct for use in munmap operations Liam R. Howlett
2024-07-05 18:39 ` Lorenzo Stoakes
2024-07-05 19:09 ` Liam R. Howlett
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-10 16:30 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 06/16] mm/mmap: Change munmap to use vma_munmap_struct() for accounting and surrounding vmas Liam R. Howlett
2024-07-05 19:27 ` Lorenzo Stoakes
2024-07-05 19:59 ` Liam R. Howlett
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-10 17:29 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 07/16] mm/mmap: Extract validate_mm() from vma_complete() Liam R. Howlett
2024-07-05 19:35 ` Lorenzo Stoakes
2024-07-10 16:06 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 08/16] mm/mmap: Inline munmap operation in mmap_region() Liam R. Howlett
2024-07-05 19:39 ` Lorenzo Stoakes
2024-07-05 20:00 ` Liam R. Howlett
2024-07-10 16:15 ` Suren Baghdasaryan
2024-07-10 16:35 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 09/16] mm/mmap: Expand mmap_region() munmap call Liam R. Howlett
2024-07-05 20:06 ` Lorenzo Stoakes
2024-07-05 20:30 ` Liam R. Howlett
2024-07-05 20:36 ` Lorenzo Stoakes
2024-07-08 14:49 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 10/16] mm/mmap: Reposition vma iterator in mmap_region() Liam R. Howlett
2024-07-05 20:18 ` Lorenzo Stoakes
2024-07-05 20:56 ` Liam R. Howlett
2024-07-08 11:08 ` Lorenzo Stoakes
2024-07-08 16:43 ` Liam R. Howlett
2024-07-10 16:48 ` Suren Baghdasaryan
2024-07-10 17:18 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 11/16] mm/mmap: Track start and end of munmap in vma_munmap_struct Liam R. Howlett
2024-07-05 20:27 ` Lorenzo Stoakes
2024-07-08 14:45 ` Liam R. Howlett [this message]
2024-07-10 17:14 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 12/16] mm/mmap: Clean up unmap_region() argument list Liam R. Howlett
2024-07-05 20:33 ` Lorenzo Stoakes
2024-07-10 17:14 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 13/16] mm/mmap: Avoid zeroing vma tree in mmap_region() Liam R. Howlett
2024-07-08 12:18 ` Lorenzo Stoakes
2024-07-08 19:10 ` Liam R. Howlett
2024-07-09 14:27 ` Lorenzo Stoakes
2024-07-09 18:43 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 14/16] mm/mmap: Use PHYS_PFN " Liam R. Howlett
2024-07-08 12:21 ` Lorenzo Stoakes
2024-07-09 18:35 ` Liam R. Howlett
2024-07-09 18:42 ` Lorenzo Stoakes
2024-07-10 17:32 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 15/16] mm/mmap: Use vms accounted pages " Liam R. Howlett
2024-07-08 12:43 ` Lorenzo Stoakes
2024-07-10 17:43 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 16/16] mm/mmap: Move may_expand_vm() check " Liam R. Howlett
2024-07-08 12:52 ` Lorenzo Stoakes
2024-07-08 20:43 ` Liam R. Howlett
2024-07-09 14:42 ` Liam R. Howlett
2024-07-09 14:51 ` Lorenzo Stoakes
2024-07-09 14:52 ` Liam R. Howlett
2024-07-09 18:13 ` Dave Hansen
2024-07-09 14:45 ` Lorenzo Stoakes
2024-07-10 12:28 ` Michael Ellerman
2024-07-10 12:45 ` Lorenzo Stoakes
2024-07-10 12:59 ` LEROY Christophe
2024-07-10 16:09 ` Liam R. Howlett
2024-07-10 19:27 ` Dmitry Safonov
2024-07-10 21:04 ` LEROY Christophe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=jghcpyibgtemzndnpeqncc6qboym3uk26nxsjsht2sllcgaheh@37zpmuehbhdq \
--to=liam.howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=kees@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=lstoakes@gmail.com \
--cc=olsajiri@gmail.com \
--cc=paulmck@kernel.org \
--cc=sidhartha.kumar@oracle.com \
--cc=spasswolf@web.de \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox