From: Christophe Leroy <christophe.leroy@csgroup.eu>
To: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Matthew Wilcox <willy@infradead.org>,
	Rik van Riel <riel@surriel.com>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yang Shi <shy828301@gmail.com>,
	John Hubbard <jhubbard@nvidia.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Andrew Jones <andrew.jones@linux.dev>,
	Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Muchun Song <muchun.song@linux.dev>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	James Houghton <jthoughton@google.com>,
	David Hildenbrand <david@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@kernel.org>,
	Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH v3 00/12] mm/gup: Unify hugetlb, part 2
Date: Fri, 12 Apr 2024 14:27:22 +0000
Message-ID: <195306fe-bb13-47bc-b26a-e87b4a6383d9@csgroup.eu>
In-Reply-To: <Zhbvd9WZzWl3IA8Y@x1n>



On 10/04/2024 at 21:58, Peter Xu wrote:
>>
>> e500 has two modes: 32 bits and 64 bits.
>>
>> For 32 bits:
>>
>> 8xx is the only one handling it through HW-assisted pagetable walk hence
>> requiring a 2-level whatever the pagesize is.
> 
> Hmm I think maybe finally I get it..
> 
> I think the confusion came from when I saw there's always such level-2
> table described in Figure 8-5 of the manual:
> 
> https://www.nxp.com/docs/en/reference-manual/MPC860UM.pdf

Yes, indeed, that figure is confusing.

Table 8-1 gives a pretty good idea of what is required. We only use
MD_CTR[TWAM] = 1.

> 
> So I suppose you meant for 8M, the PowerPC 8xx system hardware will be
> aware of such 8M pgtable (from level-1's entry, where it has bit 28-29 set
> 011b), then it won't ever read anything starting from "Level-2 Descriptor
> 1" (but only read the only entry "Level-2 Descriptor 0"), so fundamentally
> hugepd format must look like such for 8xx?
> 
> But then perhaps it's still compatible with cont-pte because the rest
> entries (pte index 1+) will simply be ignored by the hardware?

Yes, still compatible with CONT-PTE, although things become tricky
because you need two page tables to get the full 8M, so it is a kind of
cont-PMD down to the PTE level, as you can see in my RFC series.
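
To make the "two page tables" point concrete, here is a rough sketch of
the arithmetic. It assumes 4k base pages and 1024 PTEs per page table,
so that one level-1 entry covers 4M; the EX8M_* macros and ex8m_*
helpers are made up for this example and are not kernel names.

```c
/* Assumed for illustration only: 4k pages, 1024 PTEs per page table. */
#define EX8M_PAGE_SHIFT   12
#define EX8M_PTE_BITS     10
#define EX8M_PGDIR_SHIFT  (EX8M_PAGE_SHIFT + EX8M_PTE_BITS) /* 22: 4M per level-1 entry */
#define EX8M_PGDIR_SIZE   (1UL << EX8M_PGDIR_SHIFT)
#define EX8M_HPAGE_SIZE   (8UL << 20)                       /* 8M huge page */

/* Number of level-1 entries (hence page tables) spanned by an aligned huge page. */
static unsigned long ex8m_tables_per_hpage(unsigned long hpage_size)
{
    return (hpage_size + EX8M_PGDIR_SIZE - 1) / EX8M_PGDIR_SIZE;
}

/* Index of the first level-1 entry used by the huge page mapping address ea. */
static unsigned long ex8m_first_l1_index(unsigned long ea)
{
    return (ea & ~(EX8M_HPAGE_SIZE - 1)) >> EX8M_PGDIR_SHIFT;
}
```

Under these assumptions ex8m_tables_per_hpage(EX8M_HPAGE_SIZE) is 2, so
an 8M page always occupies two consecutive level-1 entries, each with
its own page table.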

> 
>>
>> On e500 it is all software so pages 2M and larger should be cont-PGD (by
>> the way I'm a bit puzzled that on arches that have only 2 levels, ie PGD
>> and PTE, the PGD entries are populated by a function called PMD_populate()).
> 
> Yeah.. I am also wondering whether pgd_populate() could also work there
> (perhaps with some trivial changes, or maybe not even needed..), as when
> p4d/pud/pmd levels are missing, linux should just do something like an
> enforced cast from pgd_t* -> pmd_t* in this case.
> 
> I think currently they're already not pgd, as __find_linux_pte() already
> skipped pgd unconditionally:
> 
> 	pgdp = pgdir + pgd_index(ea);
> 	p4dp = p4d_offset(pgdp, ea);
> 

Yes, that is what is confusing: some parts of the code consider that we
have only a PGD and a PT, while other parts consider that we have only a
PMD and a PT.
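
A minimal sketch of why both views end up naming the same entry when the
intermediate levels are folded. The fold_* types and helpers are
hypothetical, as are the 22-bit shift and 1024-entry top level; they
only imitate the usual pattern where a folded level's offset helper
returns the pointer it was given, cast to the next type.

```c
#include <stddef.h>

/* Hypothetical two-level layout: only the top level and the PTE table are real. */
typedef struct { unsigned long val; } fold_pgd_t;
typedef struct { fold_pgd_t pgd; } fold_p4d_t;
typedef struct { fold_p4d_t p4d; } fold_pud_t;
typedef struct { fold_pud_t pud; } fold_pmd_t;

#define FOLD_PGDIR_SHIFT   22      /* assumed: 4M per top-level entry */
#define FOLD_PTRS_PER_PGD  1024    /* assumed: 32-bit address space */

static size_t fold_pgd_index(unsigned long ea)
{
    return (ea >> FOLD_PGDIR_SHIFT) & (FOLD_PTRS_PER_PGD - 1);
}

/* Folded levels: each helper ignores ea and reinterprets the pointer. */
static fold_p4d_t *fold_p4d_offset(fold_pgd_t *pgdp, unsigned long ea)
{
    (void)ea;
    return (fold_p4d_t *)pgdp;
}

static fold_pud_t *fold_pud_offset(fold_p4d_t *p4dp, unsigned long ea)
{
    (void)ea;
    return (fold_pud_t *)p4dp;
}

static fold_pmd_t *fold_pmd_offset(fold_pud_t *pudp, unsigned long ea)
{
    (void)ea;
    return (fold_pmd_t *)pudp;
}

/* Walk from the top-level directory down to the "PMD" for address ea. */
static fold_pmd_t *fold_walk_to_pmd(fold_pgd_t *pgdir, unsigned long ea)
{
    fold_pgd_t *pgdp = pgdir + fold_pgd_index(ea);
    fold_p4d_t *p4dp = fold_p4d_offset(pgdp, ea);
    fold_pud_t *pudp = fold_pud_offset(p4dp, ea);

    return fold_pmd_offset(pudp, ea);
}
```

In this sketch the pointer returned by fold_walk_to_pmd() is the address
of the top-level entry itself, so whether the code calls it a PGD entry
or a PMD entry is a matter of type, not of location.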

>>
>> Current situation for 8xx is illustrated here:
>> https://github.com/linuxppc/wiki/wiki/Huge-pages#8xx
>>
>> I also tried to better illustrate e500/32 here:
>> https://github.com/linuxppc/wiki/wiki/Huge-pages#e500
>>
>> For 64 bits:
>> We have PTE/PMD/PUD/PGD, no P4D
>>
>> See arch/powerpc/include/asm/nohash/64/pgtable-4k.h
> 
> We don't have anything that is above pud in this category, right?  That's
> what I read from your wiki (and thanks for providing that in the first
> place; helps a lot for me to understand how it works on PowerPC).

Yes, thanks to Michael and Aneesh, who initiated that wiki page.

> 
> I want to make sure if I can move on without caring on p4d/pgd leafs like
> what we do right now, even after if we can remove hugepd for good, in this
> case since p4d always missing, then it's about whether "pud|pmd|pte_leaf()"
> can also cover the pgd ones when that day comes, iiuc.

I guess so, but I'd like Aneesh and/or Michael to confirm, as I'm not an
expert on PPC64.

Christophe
