From: Mina Almasry <almasrymina@google.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Ken Chen <kenchen@google.com>,
Chris Kennelly <ckennelly@google.com>,
Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
Kirill Shutemov <kirill@shutemov.name>
Subject: Re: [PATCH v6 1/2] mm, hugepages: add mremap() support for hugepage backed vma
Date: Wed, 13 Oct 2021 12:55:51 -0700 [thread overview]
Message-ID: <CAHS8izOaywTt3wDD0i+PJiq+Hmpdr3RwMrB8HpiZ4gEVyO2Y3g@mail.gmail.com> (raw)
In-Reply-To: <8ade8697-3f03-c2d4-f7a9-e64cb600ad33@oracle.com>
On Tue, Oct 12, 2021 at 4:51 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 10/11/21 6:17 PM, Mina Almasry wrote:
> > Support mremap() for hugepage backed vma segment by simply repositioning
> > page table entries. The page table entries are repositioned to the new
> > virtual address on mremap().
> >
> > Hugetlb mremap() support is of course generic; my motivating use case
> > is a library (hugepage_text), which reloads the ELF text of executables
> > in hugepages. This significantly increases the execution performance of
> > said executables.
> >
> > Restricts the mremap operation on hugepages to up to the size of the
> > original mapping as the underlying hugetlb reservation is not yet
> > capable of handling remapping to a larger size.
> >
> > During the mremap() operation we detect pmd_share'd mappings and we
> > unshare those during the mremap(). On access and fault the sharing is
> > established again.
> >
> > Signed-off-by: Mina Almasry <almasrymina@google.com>
>
> Thanks!
>
> Just some minor nits below. If you agree with the suggestions and make
> the changes, you can add:
>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
>
Thank you as always for your review and patience. I've applied the
changes and uploaded v7 with the reviewed-by. Any chance I can get an
Ack at least on the associated test patch? Or should I send these
somewhere else for review?
> > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> > index ebaba02706c87..c6b70f1ede6bf 100644
> > --- a/include/linux/hugetlb.h
> > +++ b/include/linux/hugetlb.h
> > @@ -124,6 +124,7 @@ struct hugepage_subpool *hugepage_new_subpool(struct hstate *h, long max_hpages,
> > void hugepage_put_subpool(struct hugepage_subpool *spool);
> >
> > void reset_vma_resv_huge_pages(struct vm_area_struct *vma);
> > +void clear_vma_resv_huge_pages(struct vm_area_struct *vma);
> > int hugetlb_sysctl_handler(struct ctl_table *, int, void *, size_t *, loff_t *);
> > int hugetlb_overcommit_handler(struct ctl_table *, int, void *, size_t *,
> > loff_t *);
> > @@ -132,6 +133,10 @@ int hugetlb_treat_movable_handler(struct ctl_table *, int, void *, size_t *,
> > int hugetlb_mempolicy_sysctl_handler(struct ctl_table *, int, void *, size_t *,
> > loff_t *);
> >
> > +int move_hugetlb_page_tables(struct vm_area_struct *vma,
> > + struct vm_area_struct *new_vma,
> > + unsigned long old_addr, unsigned long new_addr,
> > + unsigned long len);
> > int copy_hugetlb_page_range(struct mm_struct *, struct mm_struct *, struct vm_area_struct *);
> > long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *,
> > struct page **, struct vm_area_struct **,
> > @@ -215,6 +220,10 @@ static inline void reset_vma_resv_huge_pages(struct vm_area_struct *vma)
> > {
> > }
> >
> > +static inline void clear_vma_resv_huge_pages(struct vm_area_struct *vma)
> > +{
> > +}
> > +
> > static inline unsigned long hugetlb_total_pages(void)
> > {
> > return 0;
> > @@ -262,6 +271,12 @@ static inline int copy_hugetlb_page_range(struct mm_struct *dst,
> > return 0;
> > }
> >
> > +#define move_hugetlb_page_tables(vma, new_vma, old_addr, new_addr, len) \
> > + ({ \
> > + BUG(); \
> > + 0; \
> > + })
> > +
>
> Any reason why you did not make this a static inline? Trying to save
> code in the !CONFIG_HUGETLB case? macros seem to end up causing more
> issues down the line. I would suggest making this a static inline
> unless there is a good reason to keep as a macro.
>
No good reason. I converted to static inline.
> > static inline void hugetlb_report_meminfo(struct seq_file *m)
> > {
> > }
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 6d2f4c25dd9fb..6e91cd3905e73 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> ...
> > +int move_hugetlb_page_tables(struct vm_area_struct *vma,
> > + struct vm_area_struct *new_vma,
> > + unsigned long old_addr, unsigned long new_addr,
> > + unsigned long len)
> > +{
> > + struct hstate *h = hstate_vma(vma);
> > + struct address_space *mapping = vma->vm_file->f_mapping;
> > + unsigned long sz = huge_page_size(h);
> > + struct mm_struct *mm = vma->vm_mm;
> > + unsigned long old_end = old_addr + len;
> > + unsigned long old_addr_copy;
> > + pte_t *src_pte, *dst_pte;
> > + struct mmu_notifier_range range;
> > +
> > + mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, old_addr,
> > + old_end);
> > + adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
> > + mmu_notifier_invalidate_range_start(&range);
> > + /* Prevent race with file truncation */
> > + i_mmap_lock_write(mapping);
> > + for (; old_addr < old_end; old_addr += sz, new_addr += sz) {
> > + src_pte = huge_pte_offset(mm, old_addr, sz);
> > + if (!src_pte)
> > + continue;
> > + if (huge_pte_none(huge_ptep_get(src_pte)))
> > + continue;
> > +
> > + /* old_addr arg to huge_pmd_unshare() is a pointer and so the
> > + * arg may be modified. Pass a copy instead to preserve the
> > + * value in old_arg.
>
> value in old_addr.
>
Fixed.
> > + */
> > + old_addr_copy = old_addr;
> > +
> > + if (huge_pmd_unshare(mm, vma, &old_addr_copy, src_pte))
> > + continue;
> > +
> > + dst_pte = huge_pte_alloc(mm, new_vma, new_addr, sz);
> > + if (!dst_pte)
> > + break;
> > +
> > + move_huge_pte(vma, old_addr, new_addr, src_pte);
> > + }
> > + i_mmap_unlock_write(mapping);
> > + flush_tlb_range(vma, old_end - len, old_end);
> > + mmu_notifier_invalidate_range_end(&range);
> > +
> > + return len + old_addr - old_end;
> > +}
> > +
> > static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
> > unsigned long start, unsigned long end,
> > struct page *ref_page)
> > @@ -6280,7 +6385,8 @@ static unsigned long page_table_shareable(struct vm_area_struct *svma,
> > return saddr;
> > }
> >
> > -static bool vma_shareable(struct vm_area_struct *vma, unsigned long addr)
> > +static bool hugetlb_vma_shareable(struct vm_area_struct *vma,
> > + unsigned long addr)
> > {
> > unsigned long base = addr & PUD_MASK;
> > unsigned long end = base + PUD_SIZE;
> > @@ -6299,7 +6405,7 @@ bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr)
> > if (uffd_disable_huge_pmd_share(vma))
> > return false;
> > #endif
> > - return vma_shareable(vma, addr);
> > + return hugetlb_vma_shareable(vma, addr);
> > }
> >
> > /*
>
> In an earlier version of the patch, vma_shareable was renamed
> hugetlb_vma_shareable because it was going to be used outside hugetlb.c.
> Therefore the hugetlb_* name would provide some context. That is no
> longer the case. So, there really is no need to change the name. In
> fact, none of the remap code even calls this routine. Sugggest you
> drop the name change.
Fixed.
> --
> Mike Kravetz
prev parent reply other threads:[~2021-10-13 19:56 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-12 1:17 Mina Almasry
2021-10-12 1:17 ` [PATCH v6 2/2] mm, hugepages: Add hugetlb vma mremap() test Mina Almasry
2021-10-12 23:51 ` [PATCH v6 1/2] mm, hugepages: add mremap() support for hugepage backed vma Mike Kravetz
2021-10-13 19:55 ` Mina Almasry [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAHS8izOaywTt3wDD0i+PJiq+Hmpdr3RwMrB8HpiZ4gEVyO2Y3g@mail.gmail.com \
--to=almasrymina@google.com \
--cc=akpm@linux-foundation.org \
--cc=ckennelly@google.com \
--cc=kenchen@google.com \
--cc=kirill@shutemov.name \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=mike.kravetz@oracle.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox