From: Peter Xu <peterx@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org,
Michael Ellerman <mpe@ellerman.id.au>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Matthew Wilcox <willy@infradead.org>,
Rik van Riel <riel@surriel.com>,
Lorenzo Stoakes <lstoakes@gmail.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Yang Shi <shy828301@gmail.com>,
John Hubbard <jhubbard@nvidia.com>,
linux-arm-kernel@lists.infradead.org,
"Kirill A . Shutemov" <kirill@shutemov.name>,
Andrew Jones <andrew.jones@linux.dev>,
Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
Muchun Song <muchun.song@linux.dev>,
Christoph Hellwig <hch@infradead.org>,
linux-riscv@lists.infradead.org,
James Houghton <jthoughton@google.com>,
David Hildenbrand <david@redhat.com>,
Jason Gunthorpe <jgg@nvidia.com>,
Andrea Arcangeli <aarcange@redhat.com>,
"Aneesh Kumar K . V" <aneesh.kumar@kernel.org>,
Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH v3 12/12] mm/gup: Handle hugetlb in the generic follow_page_mask code
Date: Fri, 22 Mar 2024 22:15:20 -0400 [thread overview]
Message-ID: <Zf47OKjYjkUhs6AT@x1n> (raw)
In-Reply-To: <Zf4mR_OxE5Ft4VJg@x1n>
On Fri, Mar 22, 2024 at 08:45:59PM -0400, Peter Xu wrote:
> On Fri, Mar 22, 2024 at 01:48:18PM -0700, Andrew Morton wrote:
> > On Thu, 21 Mar 2024 18:08:02 -0400 peterx@redhat.com wrote:
> >
> > > From: Peter Xu <peterx@redhat.com>
> > >
> > > Now that follow_page() is ready to handle hugetlb pages in whatever form,
> > > and across all architectures, switch to the generic code path.
> > >
> > > Time to retire hugetlb_follow_page_mask(), following the previous
> > > retirement of follow_hugetlb_page() in 4849807114b8.
> > >
> > > There may be a slight difference in how the loops run when processing slow
> > > GUP over a large hugetlb range on archs that support cont_pte/cont_pmd:
> > > with the patch applied, each iteration of __get_user_pages() resolves one
> > > pgtable entry, rather than relying on the size of the hugetlb hstate, which
> > > may cover multiple entries in one iteration.
> > >
> > > A quick performance test on an aarch64 VM on an M1 chip shows a 15%
> > > degradation on a tight loop of slow gup after the switch. That shouldn't
> > > be a problem, because slow-gup is not a hot path for GUP in general: when
> > > the page is commonly present, fast-gup will already succeed, while when
> > > the page is indeed missing and requires a follow-up page fault, the
> > > slow-gup degradation will probably be buried in the fault paths anyway.
> > > It also explains why slow gup for THP used to be very slow before
> > > 57edfcfd3419 ("mm/gup: accelerate thp gup even for "pages != NULL"")
> > > landed; the latter was not driven by a performance analysis, but is a
> > > side benefit. If performance becomes a concern, we can consider handling
> > > CONT_PTE in follow_page().
> > >
> > > Until that is justified as necessary, keep everything clean and simple.
> > >
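To illustrate the loop behavior mentioned above: a simplified sketch (not
the verbatim mm/gup.c code) of how __get_user_pages() consumes the
page_mask reported back by follow_page_mask(); with the generic path, on
cont_pte/cont_pmd archs that mask now describes a single pgtable entry
rather than the whole hstate:

        /* simplified sketch of the __get_user_pages() inner loop */
        page = follow_page_mask(vma, start, gup_flags, &ctx);
        /*
         * Pages covered by the leaf just found, clamped by the
         * alignment of `start` within that leaf.
         */
        page_increm = 1 + (~(start >> PAGE_SHIFT) & ctx.page_mask);
        if (page_increm > nr_pages)
                page_increm = nr_pages;
        i += page_increm;
        start += page_increm * PAGE_SIZE;
        nr_pages -= page_increm;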
> >
> > mm/gup.c:33:21: warning: 'follow_hugepd' declared 'static' but never defined [-Wunused-function]
> > 33 | static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
> > | ^~~~~~~~~~~~~
> >
> > --- a/mm/gup.c~mm-gup-handle-hugepd-for-follow_page-fix
> > +++ a/mm/gup.c
> > @@ -30,10 +30,12 @@ struct follow_page_context {
> > unsigned int page_mask;
> > };
> >
> > +#ifdef CONFIG_HAVE_FAST_GUP
> > static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
> > unsigned long addr, unsigned int pdshift,
> > unsigned int flags,
> > struct follow_page_context *ctx);
> > +#endif
> >
> > static inline void sanity_check_pinned_pages(struct page **pages,
> > unsigned long npages)
> > _
> >
> >
> > This looks inelegant.
> >
> > That's two build issues so far. Please be more expansive in the
> > Kconfig variations when testing. Especially when mucking with pgtable
> > macros.
>
> Andrew,
>
> Apologies for that, and also for the slightly late response. Yeah, it's
> time I set myself up to do serious build tests, and I'll at least start
> covering a few error-prone configs/archs with that.
>
> I was relying on the build bot in many previous cases like this, as it was
> quite useful for catching build issues without investing in my own test
> setup, but for some reason it seems to have retired and stopped working
> for a while. Maybe I shouldn't have relied on it at all.
>
> For this specific issue, I'm not sure CONFIG_HAVE_FAST_GUP is the right
> guard, as follow_hugepd() is used in slow gup, not fast. So maybe we can
> put the declaration under the CONFIG_MMU section below that code (and I
> thought we could drop "static" too, as it didn't seem useful). My version
> of the fixup is attached at the end
The static is useful after all; the patch below passed on m68k but won't
on x86.. please ignore that part.
> of this email, and I verified it with an m68k build.
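(Side note, a hypothetical minimal example, not from the tree, of why the
static matters: gcc rejects a static definition that follows a non-static
declaration of the same symbol, so any config that compiles both lines in
will fail the build:

        /* non-static forward declaration ... */
        struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
                                   unsigned long addr, unsigned int pdshift,
                                   unsigned int flags,
                                   struct follow_page_context *ctx);

        /*
         * ... later, the definition stays static; gcc: "error: static
         * declaration of 'follow_hugepd' follows non-static declaration"
         */
        static struct page *follow_hugepd(struct vm_area_struct *vma, ...)

Which configs trip over it depends on which ones compile both in, which is
presumably why the m68k build passed while x86 won't.)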
>
> I do plan to post a small fixup series to address these issues (so far it
> should contain one formal patch to touch up vmstat_item_print_in_thp, plus
> two fixups whose subjects I'll mark with "fixup!" properly). You can
> either pick up the patch below or wait for my small patchset, which should
> be out today or tomorrow.
I changed plans here too: I found more users of HPAGE_PMD_NR that assume
it's defined even if !CONFIG_MMU. That's weird, as !CONFIG_MMU doesn't
even define PMD_SHIFT... To fix this I decided to use the old trick of
defining the macros with BUILD_BUG(), like it used to work before;
frankly I don't know how that avoided throwing warnings, but I'll make
sure it passes all known builds (ps: I still haven't got my build harness
ready, so coverage will still be limited, but it should solve the known
issues).
In short: please wait for my fixup series. Thanks.
>
> Thanks,
>
> ===8<===
> diff --git a/mm/gup.c b/mm/gup.c
> index 4cd349390477..a2ed8203495a 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -30,11 +30,6 @@ struct follow_page_context {
> unsigned int page_mask;
> };
>
> -static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
> - unsigned long addr, unsigned int pdshift,
> - unsigned int flags,
> - struct follow_page_context *ctx);
> -
> static inline void sanity_check_pinned_pages(struct page **pages,
> unsigned long npages)
> {
> @@ -505,6 +500,12 @@ static inline void mm_set_has_pinned_flag(unsigned long *mm_flags)
> }
>
> #ifdef CONFIG_MMU
> +
> +struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
> + unsigned long addr, unsigned int pdshift,
> + unsigned int flags,
> + struct follow_page_context *ctx);
> +
> static struct page *no_page_table(struct vm_area_struct *vma,
> unsigned int flags, unsigned long address)
> {
> ===8<===
>
> --
> Peter Xu
--
Peter Xu
Thread overview: 34+ messages
2024-03-21 22:07 [PATCH v3 00/12] mm/gup: Unify hugetlb, part 2 peterx
2024-03-21 22:07 ` [PATCH v3 01/12] mm/Kconfig: CONFIG_PGTABLE_HAS_HUGE_LEAVES peterx
2024-03-21 22:07 ` [PATCH v3 02/12] mm/hugetlb: Declare hugetlbfs_pagecache_present() non-static peterx
2024-03-21 22:07 ` [PATCH v3 03/12] mm: Make HPAGE_PXD_* macros even if !THP peterx
2024-03-22 17:14 ` SeongJae Park
2024-03-23 0:30 ` Peter Xu
2024-03-23 1:05 ` SeongJae Park
2024-03-21 22:07 ` [PATCH v3 04/12] mm: Introduce vma_pgtable_walk_{begin|end}() peterx
2024-03-22 12:27 ` Jason Gunthorpe
2024-03-21 22:07 ` [PATCH v3 06/12] mm/gup: Refactor record_subpages() to find 1st small page peterx
2024-03-21 22:07 ` [PATCH v3 07/12] mm/gup: Handle hugetlb for no_page_table() peterx
2024-03-21 22:07 ` [PATCH v3 08/12] mm/gup: Cache *pudp in follow_pud_mask() peterx
2024-03-21 22:07 ` [PATCH v3 09/12] mm/gup: Handle huge pud for follow_pud_mask() peterx
2024-03-21 22:08 ` [PATCH v3 11/12] mm/gup: Handle hugepd for follow_page() peterx
2024-03-21 22:08 ` [PATCH v3 12/12] mm/gup: Handle hugetlb in the generic follow_page_mask code peterx
2024-03-22 13:30 ` Jason Gunthorpe
2024-03-22 15:55 ` Peter Xu
2024-03-22 16:08 ` Jason Gunthorpe
2024-03-22 20:48 ` Andrew Morton
2024-03-23 0:45 ` Peter Xu
2024-03-23 2:15 ` Peter Xu [this message]
[not found] ` <20240321220802.679544-6-peterx@redhat.com>
2024-03-22 12:28 ` [PATCH v3 05/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing Jason Gunthorpe
2024-03-22 16:10 ` [PATCH v3 00/12] mm/gup: Unify hugetlb, part 2 Jason Gunthorpe
2024-03-25 18:58 ` Peter Xu
2024-03-26 14:02 ` Jason Gunthorpe
2024-04-04 21:48 ` Peter Xu
2024-04-05 18:16 ` Jason Gunthorpe
2024-04-05 21:42 ` Peter Xu
2024-04-09 23:43 ` Jason Gunthorpe
2024-04-10 15:28 ` Peter Xu
2024-04-10 16:30 ` Christophe Leroy
2024-04-10 19:58 ` Peter Xu
2024-04-12 14:27 ` Christophe Leroy
2024-03-25 14:56 ` Christophe Leroy