From: James Houghton <jthoughton@google.com>
To: Jiaqi Yan <jiaqiyan@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Peter Xu <peterx@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Mina Almasry <almasrymina@google.com>,
	"Zach O'Keefe" <zokeefe@google.com>,
	Manish Mishra <manish.mishra@nutanix.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Yang Shi <shy828301@gmail.com>,
	Frank van der Linden <fvdl@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 05/46] rmap: hugetlb: switch from page_dup_file_rmap to page_add_file_rmap
Date: Thu, 2 Mar 2023 08:43:43 -0800
Message-ID: <CADrL8HWPJuFixYzh97nT_8XO2kaS6i+wY+T58HL3GTmq9u=yTw@mail.gmail.com>
In-Reply-To: <CADrL8HVMp9kA=c904pUCqa-J_1vY4UPtsL9up+ZVVDp4TZbG2w@mail.gmail.com>

On Thu, Mar 2, 2023 at 7:44 AM James Houghton <jthoughton@google.com> wrote:
>
> On Wed, Mar 1, 2023 at 5:06 PM Jiaqi Yan <jiaqiyan@google.com> wrote:
> >
> > On Fri, Feb 17, 2023 at 4:28 PM James Houghton <jthoughton@google.com> wrote:
> > >
> > > This only applies to file-backed HugeTLB, and it should be a no-op until
> > > high-granularity mapping is possible. Also update page_remove_rmap to
> > > support the eventual case where !compound && folio_test_hugetlb().
> > >
> > > HugeTLB doesn't use LRU or mlock, so we avoid those bits. This also
> > > means we don't need to use subpage_mapcount; if we did, it would
> > > overflow with only a few mappings.
>
> This is wrong; I guess I misunderstood the code when I wrote this
> commit. subpages_mapcount (now called _nr_pages_mapped) won't overflow
> (unless HugeTLB pages can be larger than 16G). It is indeed a bug
> not to update _nr_pages_mapped the same way THPs do.
>
> >
> > >
> > > There is still one caller of page_dup_file_rmap left: copy_present_pte,
> > > and it is always called with compound=false in this case.
> > >
> > > Signed-off-by: James Houghton <jthoughton@google.com>
> > >
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 08004371cfed..6c008c9de80e 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -5077,7 +5077,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> > >                          * sleep during the process.
> > >                          */
> > >                         if (!PageAnon(ptepage)) {
> > > -                               page_dup_file_rmap(ptepage, true);
> > > +                               page_add_file_rmap(ptepage, src_vma, true);
> > >                         } else if (page_try_dup_anon_rmap(ptepage, true,
> > >                                                           src_vma)) {
> > >                                 pte_t src_pte_old = entry;
> > > @@ -5910,7 +5910,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
> > >         if (anon_rmap)
> > >                 hugepage_add_new_anon_rmap(folio, vma, haddr);
> > >         else
> > > -               page_dup_file_rmap(&folio->page, true);
> > > +               page_add_file_rmap(&folio->page, vma, true);
> > >         new_pte = make_huge_pte(vma, &folio->page, ((vma->vm_flags & VM_WRITE)
> > >                                 && (vma->vm_flags & VM_SHARED)));
> > >         /*
> > > @@ -6301,7 +6301,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> > >                 goto out_release_unlock;
> > >
> > >         if (folio_in_pagecache)
> > > -               page_dup_file_rmap(&folio->page, true);
> > > +               page_add_file_rmap(&folio->page, dst_vma, true);
> > >         else
> > >                 hugepage_add_new_anon_rmap(folio, dst_vma, dst_addr);
> > >
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index d3964c414010..b0f87f19b536 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -254,7 +254,7 @@ static bool remove_migration_pte(struct folio *folio,
> > >                                 hugepage_add_anon_rmap(new, vma, pvmw.address,
> > >                                                        rmap_flags);
> > >                         else
> > > -                               page_dup_file_rmap(new, true);
> > > +                               page_add_file_rmap(new, vma, true);
> > >                         set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
> > >                 } else
> > >  #endif
> > > diff --git a/mm/rmap.c b/mm/rmap.c
> > > index 15ae24585fc4..c010d0af3a82 100644
> > > --- a/mm/rmap.c
> > > +++ b/mm/rmap.c
> >
> > Given you are making hugetlb's ref/mapcount mechanism consistent
> > with THP's, I think the special folio_test_hugetlb checks you added in
> > this commit will break page_mapped() and folio_mapped() if the
> > page/folio is HGMed. With these checks, folio->_nr_pages_mapped is not
> > properly increased/decreased.
>
> Thank you, Jiaqi! I didn't realize I broke
> folio_mapped()/page_mapped(). The end result is that page_mapped() may
> report that an HGMed page isn't mapped when it is. Not good!
>
> >
> > > @@ -1318,21 +1318,21 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
> > >         int nr = 0, nr_pmdmapped = 0;
> > >         bool first;
> > >
> > > -       VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page);
> > > +       VM_BUG_ON_PAGE(compound && !PageTransHuge(page)
> > > +                               && !folio_test_hugetlb(folio), page);
> > >
> > >         /* Is page being mapped by PTE? Is this its first map to be added? */
> > >         if (likely(!compound)) {
> > >                 first = atomic_inc_and_test(&page->_mapcount);
> > >                 nr = first;
> > > -               if (first && folio_test_large(folio)) {
> > > +               if (first && folio_test_large(folio)
> > > +                         && !folio_test_hugetlb(folio)) {
> >
> > So we should still increment _nr_pages_mapped for hugetlb case here,
> > and decrement in the corresponding place in page_remove_rmap.
> >
> > >                         nr = atomic_inc_return_relaxed(mapped);
> > >                         nr = (nr < COMPOUND_MAPPED);
> > >                 }
> > > -       } else if (folio_test_pmd_mappable(folio)) {
> > > -               /* That test is redundant: it's for safety or to optimize out */
> > > -
> > > +       } else {
> > >                 first = atomic_inc_and_test(&folio->_entire_mapcount);
> > > -               if (first) {
> > > +               if (first && !folio_test_hugetlb(folio)) {
> >
> > Same here: we should still increase _nr_pages_mapped by
> > COMPOUND_MAPPED and decrease by COMPOUND_MAPPED in the corresponding
> > place in page_remove_rmap.
> >
> > >                         nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
> > >                         if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) {
> > >                                 nr_pmdmapped = folio_nr_pages(folio);
> > > @@ -1347,6 +1347,9 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
> > >                 }
> > >         }
> > >
> > > +       if (folio_test_hugetlb(folio))
> > > +               return;
> > > +
> > >         if (nr_pmdmapped)
> > >                 __lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ?
> > >                         NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, nr_pmdmapped);
> > > @@ -1376,8 +1379,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> > >         VM_BUG_ON_PAGE(compound && !PageHead(page), page);
> > >
> > >         /* Hugetlb pages are not counted in NR_*MAPPED */
> > > -       if (unlikely(folio_test_hugetlb(folio))) {
> > > -               /* hugetlb pages are always mapped with pmds */
> > > +       if (unlikely(folio_test_hugetlb(folio)) && compound) {
> > >                 atomic_dec(&folio->_entire_mapcount);
> > >                 return;
> > >         }
> >
> > This entire if-block should be removed after you remove the
> > !folio_test_hugetlb checks in page_add_file_rmap.
>
> This is the not-so-obvious change that is needed. Thank you!
>
> >
> > > @@ -1386,15 +1388,14 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> > >         if (likely(!compound)) {
> > >                 last = atomic_add_negative(-1, &page->_mapcount);
> > >                 nr = last;
> > > -               if (last && folio_test_large(folio)) {
> > > +               if (last && folio_test_large(folio)
> > > +                        && !folio_test_hugetlb(folio)) {
> >
> > ditto.
> >
> > >                         nr = atomic_dec_return_relaxed(mapped);
> > >                         nr = (nr < COMPOUND_MAPPED);
> > >                 }
> > > -       } else if (folio_test_pmd_mappable(folio)) {
> > > -               /* That test is redundant: it's for safety or to optimize out */
> > > -
> > > +       } else {
> > >                 last = atomic_add_negative(-1, &folio->_entire_mapcount);
> > > -               if (last) {
> > > +               if (last && !folio_test_hugetlb(folio)) {
> >
> > ditto.
>
> I agree with all of your suggestions. Testing with the hugetlb-hgm
> selftest, nothing seems to break. :)
>
> Given that this is at least the third or fourth major bug in this
> version of the series, I'll go ahead and send a v3 sooner rather than
> later.

This solution limits the size of a HugeTLB page to 16G. I'm not sure
if any architectures support HugeTLB pages larger than 16G (it looks
like 16G is powerpc's maximum). If one does, I think we can simply
increase the value of COMPOUND_MAPPED. If that's not possible, we
would need a different solution instead of participating in
_nr_pages_mapped the way THPs do.
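
For reference, here is the arithmetic behind that 16G figure, as a
sketch against mm/internal.h at this series' base; treat the exact
constants as assumptions if they have moved since:

    /* mm/internal.h: _nr_pages_mapped packs two counts into one atomic_t */
    #define COMPOUND_MAPPED     0x800000    /* 1 << 23 */
    #define FOLIO_PAGES_MAPPED  (COMPOUND_MAPPED - 1)

    /*
     * Bits 0-22 count the subpages that have a PTE mapping;
     * COMPOUND_MAPPED is added once per entire (PMD-style) mapping.
     * For the subpage count never to carry into the COMPOUND_MAPPED
     * bits, a folio can have at most FOLIO_PAGES_MAPPED = 2^23 - 1
     * subpages. With 4K base pages that is just under 2^23 * 4K = 32G,
     * so 16G is the largest power-of-two HugeTLB page size that fits.
     */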

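Putting Jiaqi's suggestions together, I'd expect the add path to end
up looking roughly like this. This is only my sketch of the direction
discussed above, not the actual v3; page_remove_rmap() would mirror it
with the increments flipped to decrements and its hugetlb special case
dropped:

    void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
                    bool compound)
    {
        struct folio *folio = page_folio(page);
        atomic_t *mapped = &folio->_nr_pages_mapped;
        int nr = 0, nr_pmdmapped = 0;
        bool first;

        /* Is page being mapped by PTE? Is this its first map to be added? */
        if (likely(!compound)) {
            first = atomic_inc_and_test(&page->_mapcount);
            nr = first;
            if (first && folio_test_large(folio)) {
                /* No !folio_test_hugetlb() exception anymore. */
                nr = atomic_inc_return_relaxed(mapped);
                nr = (nr < COMPOUND_MAPPED);
            }
        } else {
            first = atomic_inc_and_test(&folio->_entire_mapcount);
            if (first) {
                /* Hugetlb now adds COMPOUND_MAPPED like a THP. */
                nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped);
                if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) {
                    nr_pmdmapped = folio_nr_pages(folio);
                    nr = nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED);
                    /* Raced ahead of a remove and another add? */
                    if (unlikely(nr < 0))
                        nr = 0;
                } else {
                    /* Raced ahead of a remove of COMPOUND_MAPPED */
                    nr = 0;
                }
            }
        }

        /* HugeTLB pages still stay out of the NR_*MAPPED stats. */
        if (folio_test_hugetlb(folio))
            return;

        if (nr_pmdmapped)
            __lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ?
                NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, nr_pmdmapped);
        if (nr)
            __lruvec_stat_mod_folio(folio, NR_FILE_MAPPED, nr);

        mlock_vma_folio(folio, vma, compound);
    }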
