From: Suren Baghdasaryan <surenb@google.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
Lorenzo Stoakes <lstoakes@gmail.com>,
Matthew Wilcox <willy@infradead.org>,
sidhartha.kumar@oracle.com,
"Paul E . McKenney" <paulmck@kernel.org>,
Bert Karwatzki <spasswolf@web.de>, Jiri Olsa <olsajiri@gmail.com>,
linux-kernel@vger.kernel.org, Kees Cook <kees@kernel.org>
Subject: Re: [PATCH v3 11/16] mm/mmap: Track start and end of munmap in vma_munmap_struct
Date: Wed, 10 Jul 2024 10:14:32 -0700 [thread overview]
Message-ID: <CAJuCfpGg7_CcfQJOSAhcdSVHJY-QTT0dz90nSC8PNgf+xp3cOQ@mail.gmail.com> (raw)
In-Reply-To: <37ea5831-2163-4086-8b2c-baff3be2e5ad@lucifer.local>
On Fri, Jul 5, 2024 at 1:27 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Thu, Jul 04, 2024 at 02:27:13PM GMT, Liam R. Howlett wrote:
> > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> >
> > Set the start and end address for munmap when the prev and next are
> > gathered. This is needed to avoid incorrect addresses being used during
> > the vms_complete_munmap_vmas() function if the prev/next vma are
> > expanded.
>
> When we spoke about this separately you mentioned that specific arches may
> be more likely to encounter this issue, perhaps worth mentioning something
> about that in the commit msg? Unless I misunderstood you.
>
> >
> > Add a new helper vms_complete_pte_clear(), which is needed later and
> > will avoid growing the argument list to unmap_region() beyond the 9 it
> > already has.
>
> My word.
>
> >
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > ---
> > mm/internal.h | 2 ++
> > mm/mmap.c | 34 +++++++++++++++++++++++++++-------
> > 2 files changed, 29 insertions(+), 7 deletions(-)
> >
> > diff --git a/mm/internal.h b/mm/internal.h
> > index 8cbbbe7d40f3..4c9f06669cc4 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -1493,6 +1493,8 @@ struct vma_munmap_struct {
> > struct list_head *uf; /* Userfaultfd list_head */
> > unsigned long start; /* Aligned start addr */
> > unsigned long end; /* Aligned end addr */
> > + unsigned long unmap_start;
> > + unsigned long unmap_end;
> > int vma_count; /* Number of vmas that will be removed */
> > unsigned long nr_pages; /* Number of pages being removed */
> > unsigned long locked_vm; /* Number of locked pages */
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index ecf55d32e804..45443a53be76 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -525,6 +525,8 @@ static inline void init_vma_munmap(struct vma_munmap_struct *vms,
> > vms->vma_count = 0;
> > vms->nr_pages = vms->locked_vm = vms->nr_accounted = 0;
> > vms->exec_vm = vms->stack_vm = vms->data_vm = 0;
> > + vms->unmap_start = FIRST_USER_ADDRESS;
> > + vms->unmap_end = USER_PGTABLES_CEILING;
> > }
> >
> > /*
> > @@ -2610,6 +2612,26 @@ static inline void abort_munmap_vmas(struct ma_state *mas_detach)
> > __mt_destroy(mas_detach->tree);
> > }
> >
> > +
> > +static void vms_complete_pte_clear(struct vma_munmap_struct *vms,
> > + struct ma_state *mas_detach, bool mm_wr_locked)
> > +{
> > + struct mmu_gather tlb;
> > +
> > + /*
> > + * We can free page tables without write-locking mmap_lock because VMAs
> > + * were isolated before we downgraded mmap_lock.
> > + */
> > + mas_set(mas_detach, 1);
> > + lru_add_drain();
> > + tlb_gather_mmu(&tlb, vms->mm);
> > + update_hiwater_rss(vms->mm);
> > + unmap_vmas(&tlb, mas_detach, vms->vma, vms->start, vms->end, vms->vma_count, mm_wr_locked);
> > + mas_set(mas_detach, 1);
>
> I know it's necessary as unmap_vmas() will adjust mas_detach, but it kind
> of aesthetically sucks to set it to 1, do some stuff, then set it to 1
> again. But this is not a big deal :>)
>
> > + free_pgtables(&tlb, mas_detach, vms->vma, vms->unmap_start, vms->unmap_end, mm_wr_locked);
>
> Yeah this bit definitely needs a comment I think, this is very confusing
> indeed. Under what circumstances will these differ from [vms->start,
> vms->end), etc.?
>
> I'm guessing it's to do with !vms->prev and !vms->next needing to be set to
> [FIRST_USER_ADDRESS, USER_PGTABLES_CEILING)?
>
> > + tlb_finish_mmu(&tlb);
> > +}
> > +
> > /*
> > * vms_complete_munmap_vmas() - Finish the munmap() operation
> > * @vms: The vma munmap struct
> > @@ -2631,13 +2653,7 @@ static void vms_complete_munmap_vmas(struct vma_munmap_struct *vms,
> > if (vms->unlock)
> > mmap_write_downgrade(mm);
> >
> > - /*
> > - * We can free page tables without write-locking mmap_lock because VMAs
> > - * were isolated before we downgraded mmap_lock.
> > - */
> > - mas_set(mas_detach, 1);
> > - unmap_region(mm, mas_detach, vms->vma, vms->prev, vms->next,
> > - vms->start, vms->end, vms->vma_count, !vms->unlock);
> > + vms_complete_pte_clear(vms, mas_detach, !vms->unlock);
> > /* Update high watermark before we lower total_vm */
> > update_hiwater_vm(mm);
> > /* Stat accounting */
> > @@ -2699,6 +2715,8 @@ static int vms_gather_munmap_vmas(struct vma_munmap_struct *vms,
> > goto start_split_failed;
> > }
> > vms->prev = vma_prev(vms->vmi);
> > + if (vms->prev)
> > + vms->unmap_start = vms->prev->vm_end;
> >
> > /*
> > * Detach a range of VMAs from the mm. Using next as a temp variable as
> > @@ -2757,6 +2775,8 @@ static int vms_gather_munmap_vmas(struct vma_munmap_struct *vms,
> > }
> >
> > vms->next = vma_next(vms->vmi);
> > + if (vms->next)
> > + vms->unmap_end = vms->next->vm_start;
> >
> > #if defined(CONFIG_DEBUG_VM_MAPLE_TREE)
> > /* Make sure no VMAs are about to be lost. */
> > --
> > 2.43.0
> >
>
> Other than wanting some extra comments, this looks fine and I know how
> hard-won the unmap range bit of this change was so:
>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Ok, another case when code duplication will be removed in the next patch. LGTM.
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
next prev parent reply other threads:[~2024-07-10 17:14 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-04 18:27 [PATCH v3 00/16] Avoid MAP_FIXED gap exposure Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 01/16] mm/mmap: Correctly position vma_iterator in __split_vma() Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 02/16] mm/mmap: Introduce abort_munmap_vmas() Liam R. Howlett
2024-07-05 17:02 ` Lorenzo Stoakes
2024-07-05 18:12 ` Liam R. Howlett
2024-07-10 16:06 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 03/16] mm/mmap: Introduce vmi_complete_munmap_vmas() Liam R. Howlett
2024-07-05 17:39 ` Lorenzo Stoakes
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 04/16] mm/mmap: Extract the gathering of vmas from do_vmi_align_munmap() Liam R. Howlett
2024-07-05 18:01 ` Lorenzo Stoakes
2024-07-05 18:41 ` Liam R. Howlett
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-10 16:32 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 05/16] mm/mmap: Introduce vma_munmap_struct for use in munmap operations Liam R. Howlett
2024-07-05 18:39 ` Lorenzo Stoakes
2024-07-05 19:09 ` Liam R. Howlett
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-10 16:30 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 06/16] mm/mmap: Change munmap to use vma_munmap_struct() for accounting and surrounding vmas Liam R. Howlett
2024-07-05 19:27 ` Lorenzo Stoakes
2024-07-05 19:59 ` Liam R. Howlett
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-10 17:29 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 07/16] mm/mmap: Extract validate_mm() from vma_complete() Liam R. Howlett
2024-07-05 19:35 ` Lorenzo Stoakes
2024-07-10 16:06 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 08/16] mm/mmap: Inline munmap operation in mmap_region() Liam R. Howlett
2024-07-05 19:39 ` Lorenzo Stoakes
2024-07-05 20:00 ` Liam R. Howlett
2024-07-10 16:15 ` Suren Baghdasaryan
2024-07-10 16:35 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 09/16] mm/mmap: Expand mmap_region() munmap call Liam R. Howlett
2024-07-05 20:06 ` Lorenzo Stoakes
2024-07-05 20:30 ` Liam R. Howlett
2024-07-05 20:36 ` Lorenzo Stoakes
2024-07-08 14:49 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 10/16] mm/mmap: Reposition vma iterator in mmap_region() Liam R. Howlett
2024-07-05 20:18 ` Lorenzo Stoakes
2024-07-05 20:56 ` Liam R. Howlett
2024-07-08 11:08 ` Lorenzo Stoakes
2024-07-08 16:43 ` Liam R. Howlett
2024-07-10 16:48 ` Suren Baghdasaryan
2024-07-10 17:18 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 11/16] mm/mmap: Track start and end of munmap in vma_munmap_struct Liam R. Howlett
2024-07-05 20:27 ` Lorenzo Stoakes
2024-07-08 14:45 ` Liam R. Howlett
2024-07-10 17:14 ` Suren Baghdasaryan [this message]
2024-07-04 18:27 ` [PATCH v3 12/16] mm/mmap: Clean up unmap_region() argument list Liam R. Howlett
2024-07-05 20:33 ` Lorenzo Stoakes
2024-07-10 17:14 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 13/16] mm/mmap: Avoid zeroing vma tree in mmap_region() Liam R. Howlett
2024-07-08 12:18 ` Lorenzo Stoakes
2024-07-08 19:10 ` Liam R. Howlett
2024-07-09 14:27 ` Lorenzo Stoakes
2024-07-09 18:43 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 14/16] mm/mmap: Use PHYS_PFN " Liam R. Howlett
2024-07-08 12:21 ` Lorenzo Stoakes
2024-07-09 18:35 ` Liam R. Howlett
2024-07-09 18:42 ` Lorenzo Stoakes
2024-07-10 17:32 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 15/16] mm/mmap: Use vms accounted pages " Liam R. Howlett
2024-07-08 12:43 ` Lorenzo Stoakes
2024-07-10 17:43 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 16/16] mm/mmap: Move may_expand_vm() check " Liam R. Howlett
2024-07-08 12:52 ` Lorenzo Stoakes
2024-07-08 20:43 ` Liam R. Howlett
2024-07-09 14:42 ` Liam R. Howlett
2024-07-09 14:51 ` Lorenzo Stoakes
2024-07-09 14:52 ` Liam R. Howlett
2024-07-09 18:13 ` Dave Hansen
2024-07-09 14:45 ` Lorenzo Stoakes
2024-07-10 12:28 ` Michael Ellerman
2024-07-10 12:45 ` Lorenzo Stoakes
2024-07-10 12:59 ` LEROY Christophe
2024-07-10 16:09 ` Liam R. Howlett
2024-07-10 19:27 ` Dmitry Safonov
2024-07-10 21:04 ` LEROY Christophe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAJuCfpGg7_CcfQJOSAhcdSVHJY-QTT0dz90nSC8PNgf+xp3cOQ@mail.gmail.com \
--to=surenb@google.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=kees@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=lstoakes@gmail.com \
--cc=olsajiri@gmail.com \
--cc=paulmck@kernel.org \
--cc=sidhartha.kumar@oracle.com \
--cc=spasswolf@web.de \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox