From: Jason Gunthorpe <jgg@nvidia.com>
To: Peter Xu <peterx@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org,
Michael Ellerman <mpe@ellerman.id.au>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Matthew Wilcox <willy@infradead.org>,
Rik van Riel <riel@surriel.com>,
Lorenzo Stoakes <lstoakes@gmail.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Yang Shi <shy828301@gmail.com>,
John Hubbard <jhubbard@nvidia.com>,
linux-arm-kernel@lists.infradead.org,
"Kirill A . Shutemov" <kirill@shutemov.name>,
Andrew Jones <andrew.jones@linux.dev>,
Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Muchun Song <muchun.song@linux.dev>,
Christoph Hellwig <hch@infradead.org>,
linux-riscv@lists.infradead.org,
James Houghton <jthoughton@google.com>,
David Hildenbrand <david@redhat.com>,
Andrea Arcangeli <aarcange@redhat.com>,
"Aneesh Kumar K . V" <aneesh.kumar@kernel.org>,
Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH v3 00/12] mm/gup: Unify hugetlb, part 2
Date: Tue, 9 Apr 2024 20:43:55 -0300
Message-ID: <20240409234355.GJ5383@nvidia.com>
In-Reply-To: <ZhBwVLyHr8WEKSx2@x1n>
On Fri, Apr 05, 2024 at 05:42:44PM -0400, Peter Xu wrote:
> In short, hugetlb mappings shouldn't be special compared to other huge pXd
> and large folio (cont-pXd) mappings for most of the walkers in my mind, if
> not all. I need to look at all the walkers and there can be some tricky
> ones, but I believe that applies in general. It's actually similar to what
> I did with slow gup here.
I think that is the big question. I also haven't done the research to
know the answer.
At this point focusing on moving what is reasonable to the pXX_* API
makes sense to me. Then we can review what remains and make a
decision.
> Like this series, for cont-pXd we'll need multiple walks compared to
> before (with hugetlb_entry()), but for that part I'll provide some
> performance tests too, and we also have a fallback plan, which is to detect
> cont-pXd existence, which will also work for large folios.
I think we can optimize this pretty easily.
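To make the "multiple walks" concern and the "detect cont-pXd" fallback concrete, here is a minimal user-space sketch. It is a toy model, not kernel code and not necessarily what either poster has in mind: `toy_pte`, `toy_walk()`, `TOY_CONT_NR`, and the `cont` flag are all hypothetical names standing in for a page table entry, a walker loop, the number of entries in one contiguous run, and the architecture's contiguous hint. It only shows that a walker which recognizes the hint can step over the rest of an aligned run in one go instead of visiting every entry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: number of entries covered by one contiguous run. */
#define TOY_CONT_NR 16

/* Toy stand-in for a page table entry; NOT a kernel type. */
struct toy_pte {
	bool present;	/* entry maps something */
	bool cont;	/* entry carries the "contiguous" hint */
};

/*
 * Walk tbl[0..n) and return how many times a per-entry callback would
 * have been invoked. With batch_cont set, a present entry carrying the
 * contiguous hint is visited once and the walker then skips to the end
 * of its aligned run.
 */
static size_t toy_walk(const struct toy_pte *tbl, size_t n, bool batch_cont)
{
	size_t visits = 0;
	size_t i = 0;

	while (i < n) {
		size_t step = 1;

		if (batch_cont && tbl[i].present && tbl[i].cont) {
			/* First index past the aligned run containing i. */
			size_t end = (i | (TOY_CONT_NR - 1)) + 1;

			if (end > n)
				end = n;
			step = end - i;
		}
		if (tbl[i].present)
			visits++;
		i += step;
	}
	return visits;
}
```

In this model, a table whose first 16 entries form one contiguous run followed by 16 ordinary present entries costs 32 visits without batching and 17 with it.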
> > I think if you do the easy places for pXX conversion you will have a
> > good idea about what is needed for the hard places.
>
> Here IMHO we don't need to understand "what is the size of this hugetlb
> vma"
Yeah, I never really understood why hugetlb was linked to the VMA. The
page table is self-describing, obviously.
> or "at which pgtable level this hugetlb vma's pages are located",
Ditto
> because we may not need that, e.g., when we only want to collect some smaps
> statistics. "whether it's hugetlb" may matter, though. E.g. when the mm
> walker sees a huge pmd, it can be a thp or it can be a hugetlb (once
> hugetlb_entry is removed); we may need an extra check later to put things
> into the right bucket, but the walker itself doesn't necessarily need
> hugetlb_entry().
Right, places may still need to know it is part of a huge VMA because we
have special stuff linked to that.
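The split being discussed here (the walker reads the level from the page table, and only the accounting step asks "is this hugetlb?") can be sketched as a small user-space model. Everything below is hypothetical and for illustration only: `toy_vma`, `toy_entry`, `toy_stats`, and `toy_account_leaf()` are not kernel types or functions, and the `is_hugetlb` field merely models a per-VMA check in the spirit of the kernel's `is_vm_hugetlb_page()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT kernel types: all names here are hypothetical. */
enum toy_level { TOY_PTE, TOY_PMD, TOY_PUD };

struct toy_vma {
	bool is_hugetlb;	/* models a "this VMA is hugetlb" check */
};

struct toy_entry {
	enum toy_level level;	/* level at which the walker found the entry */
	bool leaf;		/* entry maps memory rather than a lower table */
};

struct toy_stats {
	size_t small_pages;
	size_t thp_leaves;
	size_t hugetlb_leaves;
};

/*
 * One callback for every leaf, with no hugetlb-specific hook. The level
 * comes from the entry itself; the VMA is consulted only to choose the
 * accounting bucket.
 */
static void toy_account_leaf(const struct toy_vma *vma,
			     const struct toy_entry *e,
			     struct toy_stats *st)
{
	assert(e->leaf);	/* the walker only hands us leaf entries */

	if (vma->is_hugetlb)
		st->hugetlb_leaves++;
	else if (e->level == TOY_PTE)
		st->small_pages++;
	else
		st->thp_leaves++;
}
```

The same huge-pmd leaf lands in a different bucket depending only on the VMA it was found in, which is the "extra check later" from the quoted text; the walk itself is identical in both cases.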
> > But then again we come back to power and its big list of page sizes
> > and variety :( Looks like some there have huge sizes at the pgd level
> > at least.
>
> Yeah this is something I want to be super clear on, because I may be missing
> something: we don't have real pgd pages, right? Powerpc doesn't even
> define p4d_leaf(), AFAICT.
AFAICT it is because it hides it all in hugepd.
If the goal is to purge hugepd then some of the options might turn out
to convert hugepd into huge p4d/pgd, as I understand it. It would be
nice to have certainty on this at least.
We have effectively three APIs to parse a single page table and
currently none of the APIs can return 100% of the data for power.
Jason