From: Christophe Leroy <christophe.leroy@csgroup.eu>
To: Andrew Morton <akpm@linux-foundation.org>,
Jason Gunthorpe <jgg@nvidia.com>, Peter Xu <peterx@redhat.com>,
Oscar Salvador <osalvador@suse.de>,
Michael Ellerman <mpe@ellerman.id.au>,
Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v5 00/18] Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)
Date: Mon, 10 Jun 2024 07:54:45 +0200
Message-ID: <cover.1717955558.git.christophe.leroy@csgroup.eu>
This series is now mature enough to be sent without the RFC tag.
Only the book3s/64 part may need more attention. Alternatively
we could simply disable huge pages on book3s/64 in hash-4k mode if
we want to be on the safe side.
Also see https://github.com/linuxppc/issues/issues/483
Unlike most architectures, powerpc 8xx HW requires a two-level
pagetable topology for all page sizes. So a leaf PMD-contig approach
is not feasible as such.
Possible sizes on 8xx are 4k, 16k, 512k and 8M.
First level (PGD/PMD) covers 4M per entry. For 8M pages, two PMD entries
must point to a single entry level-2 page table. Until now that was
done using hugepd. This series changes it to use standard page tables
where the entry is replicated 1024 times on each of the two pagetables
referred to by the two associated PMD entries for that 8M page.
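
The replication described above can be sketched in plain C. This is only
an illustration, not code from the series: it assumes 4k base pages (so
1024 PTEs per 4M PMD entry), and every name prefixed with sketch_ is
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4k base pages, so one PMD entry (4M) holds 1024 PTEs. */
#define SKETCH_PTRS_PER_PTE 1024u
/* An 8M page spans two 4M PMD entries. */
#define SKETCH_PMDS_PER_8M  2u

typedef uint32_t sketch_pte_t;

/* One level-2 page table, as referenced by a single PMD entry. */
struct sketch_pte_table {
	sketch_pte_t pte[SKETCH_PTRS_PER_PTE];
};

/*
 * Write the same PTE value into every slot of both level-2 tables
 * backing one 8M page. Returns the number of entries written.
 */
static size_t sketch_set_8m_pte(struct sketch_pte_table *tables[SKETCH_PMDS_PER_8M],
				sketch_pte_t val)
{
	size_t written = 0;

	for (size_t t = 0; t < SKETCH_PMDS_PER_8M; t++) {
		for (size_t i = 0; i < SKETCH_PTRS_PER_PTE; i++) {
			tables[t]->pte[i] = val;
			written++;
		}
	}
	return written;
}
```

With this layout, whichever level-2 slot is reached for an address
inside the 8M range, the same entry is found.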
For e500 and book3s/64 there are fewer constraints because they are not
tied to the HW assisted tablewalk like on 8xx, so it is easier to use
leaf PMDs (and PUDs).
On e500 the supported page sizes are 4M, 16M, 64M, 256M and 1G. All are at
PMD level on e500/32 (mpc85xx) and a mix of PMD and PUD on e500/64. We
encode page size with 4 available bits in PTE entries. On e500/32 the PGD
entry size is increased to 64 bits in order to allow leaf-PMD entries,
because PTEs are 64 bits on e500.
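
As a rough illustration of how a page size can fit in 4 PTE bits, here
is a minimal sketch. The field position and the "log2(size) minus 20"
encoding are assumptions made for this example only; they are not taken
from the patches, and all sketch_/SKETCH_ names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical position of the 4-bit size field inside a 64-bit PTE. */
#define SKETCH_HSIZE_SHIFT   8u
#define SKETCH_HSIZE_MASK    (0xfu << SKETCH_HSIZE_SHIFT)
/* Hypothetical encoding: the field stores log2(page size) - 20. */
#define SKETCH_SHIFT_OFFSET  20u

/* Store the page shift (22 for 4M ... 30 for 1G) in the 4-bit field. */
static uint64_t sketch_encode_hsize(uint64_t pte, unsigned int page_shift)
{
	uint64_t field = (uint64_t)(page_shift - SKETCH_SHIFT_OFFSET) & 0xfu;

	return (pte & ~(uint64_t)SKETCH_HSIZE_MASK) | (field << SKETCH_HSIZE_SHIFT);
}

/* Recover the page shift from the 4-bit field. */
static unsigned int sketch_decode_hshift(uint64_t pte)
{
	return (unsigned int)((pte & SKETCH_HSIZE_MASK) >> SKETCH_HSIZE_SHIFT)
		+ SKETCH_SHIFT_OFFSET;
}
```

Under this assumed encoding the five sizes listed above map to field
values 2, 4, 6, 8 and 10, which all fit in 4 bits.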
On book3s/64 only the hash-4k mode is concerned. It supports 16M pages
as cont-PMD and 16G pages as cont-PUD. In other modes (radix-4k, radix-64k
and hash-64k) the sizes match with PMD and PUD sizes so that's just leaf
entries. The hash processing makes things a bit more complex. To ease
things, __hash_page_huge() is modified to bail out when DIRTY or ACCESSED
bits are missing, leaving it to mm core to fix it.
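
The bail-out idea can be sketched as a small standalone check. This is
not the actual __hash_page_huge() change: the bit values and the
function name below are hypothetical, and the sketch assumes DIRTY only
matters for write accesses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this illustration. */
#define SKETCH_PAGE_ACCESSED 0x1u
#define SKETCH_PAGE_DIRTY    0x2u
#define SKETCH_ACCESS_WRITE  0x4u

/*
 * Return true when the hash fault path should give up and let the
 * generic mm fault handling set ACCESSED/DIRTY on all the contiguous
 * entries, instead of updating a single entry itself.
 */
static bool sketch_must_defer_to_mm(uint64_t old_pte, unsigned int access)
{
	if (!(old_pte & SKETCH_PAGE_ACCESSED))
		return true;
	if ((access & SKETCH_ACCESS_WRITE) && !(old_pte & SKETCH_PAGE_DIRTY))
		return true;
	return false;
}
```

Deferring avoids having the hash path touch only one of several
contiguous entries that together describe the same huge page.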
Global changes in v5:
- Now use the PAGE SIZE field in e500's PTE to store TSIZE instead of using U0-U3
- On e500/64, use the highest bit to discriminate leaf entries, because PUD entries are not guaranteed to be 4k aligned, so the PAGE SIZE field is not guaranteed to be 0 on a non-leaf entry.
Global changes in v4:
- Fixed a few issues reported privately by robots
- Rebased on top of v6.10-rc1
Global changes in v3:
- Removed patches 1 and 2
- Squashed patch 11 into patch 5
- Replaced patches 12 and 13 with a series from Michael
- Reordered patches a bit to have more general patches up front
For more details on changes, see each patch.
Christophe Leroy (17):
mm: Define __pte_leaf_size() to also take a PMD entry
mm: Provide mm_struct and address to huge_ptep_get()
powerpc/mm: Remove _PAGE_PSIZE
powerpc/mm: Fix __find_linux_pte() on 32 bits with PMD leaf entries
powerpc/mm: Allow hugepages without hugepd
powerpc/8xx: Fix size given to set_huge_pte_at()
powerpc/8xx: Rework support for 8M pages using contiguous PTE entries
powerpc/8xx: Simplify struct mmu_psize_def
powerpc/e500: Remove enc and ind fields from struct mmu_psize_def
powerpc/e500: Switch to 64 bits PGD on 85xx (32 bits)
powerpc/e500: Encode hugepage size in PTE bits
powerpc/e500: Don't pre-check write access on data TLB error
powerpc/e500: Free r10 for FIND_PTE
powerpc/e500: Use contiguous PMD instead of hugepd
powerpc/64s: Use contiguous PMD/PUD instead of HUGEPD
powerpc/mm: Remove hugepd leftovers
mm: Remove CONFIG_ARCH_HAS_HUGEPD
Michael Ellerman (1):
powerpc/64e: Remove unused IBM HTW code [SQUASHED]
arch/arm/include/asm/hugetlb-3level.h | 4 +-
arch/arm64/include/asm/hugetlb.h | 2 +-
arch/arm64/mm/hugetlbpage.c | 2 +-
arch/powerpc/Kconfig | 1 -
arch/powerpc/include/asm/book3s/32/pgalloc.h | 2 -
arch/powerpc/include/asm/book3s/64/hash-4k.h | 15 -
arch/powerpc/include/asm/book3s/64/hash.h | 38 +-
arch/powerpc/include/asm/book3s/64/hugetlb.h | 38 --
.../include/asm/book3s/64/pgtable-4k.h | 47 --
.../include/asm/book3s/64/pgtable-64k.h | 20 -
arch/powerpc/include/asm/book3s/64/pgtable.h | 22 +-
arch/powerpc/include/asm/hugetlb.h | 15 +-
.../include/asm/nohash/32/hugetlb-8xx.h | 38 +-
arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 9 +-
arch/powerpc/include/asm/nohash/32/pte-40x.h | 3 -
arch/powerpc/include/asm/nohash/32/pte-44x.h | 3 -
arch/powerpc/include/asm/nohash/32/pte-85xx.h | 3 -
arch/powerpc/include/asm/nohash/32/pte-8xx.h | 58 ++-
.../powerpc/include/asm/nohash/hugetlb-e500.h | 39 +-
arch/powerpc/include/asm/nohash/mmu-e500.h | 6 +-
arch/powerpc/include/asm/nohash/pgalloc.h | 2 -
arch/powerpc/include/asm/nohash/pgtable.h | 46 +-
arch/powerpc/include/asm/nohash/pte-e500.h | 63 ++-
arch/powerpc/include/asm/page.h | 32 --
arch/powerpc/include/asm/pgtable-be-types.h | 10 -
arch/powerpc/include/asm/pgtable-types.h | 13 +-
arch/powerpc/include/asm/pgtable.h | 3 +
arch/powerpc/kernel/exceptions-64e.S | 4 +-
arch/powerpc/kernel/head_85xx.S | 70 +--
arch/powerpc/kernel/head_8xx.S | 10 +-
arch/powerpc/kernel/setup_64.c | 6 +-
arch/powerpc/mm/book3s64/hash_utils.c | 11 +-
arch/powerpc/mm/book3s64/hugetlbpage.c | 10 +
arch/powerpc/mm/book3s64/pgtable.c | 12 -
arch/powerpc/mm/hugetlbpage.c | 455 +-----------------
arch/powerpc/mm/init-common.c | 8 +-
arch/powerpc/mm/kasan/8xx.c | 21 +-
arch/powerpc/mm/nohash/8xx.c | 43 +-
arch/powerpc/mm/nohash/Makefile | 2 +-
arch/powerpc/mm/nohash/book3e_pgtable.c | 4 +-
arch/powerpc/mm/nohash/tlb.c | 407 +---------------
arch/powerpc/mm/nohash/tlb_64e.c | 314 ++++++++++++
arch/powerpc/mm/nohash/tlb_low_64e.S | 428 +---------------
arch/powerpc/mm/pgtable.c | 94 ++--
arch/powerpc/mm/pgtable_32.c | 2 +-
arch/riscv/include/asm/hugetlb.h | 2 +-
arch/riscv/mm/hugetlbpage.c | 2 +-
arch/s390/include/asm/hugetlb.h | 4 +-
arch/s390/mm/hugetlbpage.c | 4 +-
fs/hugetlbfs/inode.c | 2 +-
fs/proc/task_mmu.c | 10 +-
fs/userfaultfd.c | 2 +-
include/asm-generic/hugetlb.h | 2 +-
include/linux/hugetlb.h | 6 -
include/linux/pgtable.h | 3 +
include/linux/swapops.h | 4 +-
kernel/events/core.c | 2 +-
mm/Kconfig | 10 -
mm/damon/vaddr.c | 6 +-
mm/gup.c | 183 +------
mm/hmm.c | 2 +-
mm/hugetlb.c | 44 +-
mm/memory-failure.c | 2 +-
mm/mempolicy.c | 2 +-
mm/migrate.c | 4 +-
mm/mincore.c | 2 +-
mm/pagewalk.c | 57 +--
mm/userfaultfd.c | 2 +-
68 files changed, 749 insertions(+), 2043 deletions(-)
delete mode 100644 arch/powerpc/include/asm/book3s/64/pgtable-4k.h
create mode 100644 arch/powerpc/mm/nohash/tlb_64e.c
--
2.44.0
Thread overview: 33+ messages
2024-06-10 5:54 Christophe Leroy [this message]
2024-06-10 5:54 ` [PATCH v5 01/18] powerpc/64e: Remove unused IBM HTW code [SQUASHED] Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 02/18] mm: Define __pte_leaf_size() to also take a PMD entry Christophe Leroy
2024-06-11 9:34 ` Oscar Salvador
2024-06-11 14:17 ` Peter Xu
2024-06-11 15:08 ` Oscar Salvador
2024-06-11 15:20 ` Peter Xu
2024-06-11 16:10 ` Oscar Salvador
2024-06-11 19:00 ` LEROY Christophe
2024-06-11 21:43 ` Peter Xu
2024-06-13 7:19 ` Oscar Salvador
2024-06-13 16:43 ` LEROY Christophe
2024-06-14 14:14 ` Oscar Salvador
2024-06-11 16:53 ` LEROY Christophe
2024-06-11 14:50 ` LEROY Christophe
2024-06-10 5:54 ` [PATCH v5 03/18] mm: Provide mm_struct and address to huge_ptep_get() Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 04/18] powerpc/mm: Remove _PAGE_PSIZE Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 05/18] powerpc/mm: Fix __find_linux_pte() on 32 bits with PMD leaf entries Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 06/18] powerpc/mm: Allow hugepages without hugepd Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 07/18] powerpc/8xx: Fix size given to set_huge_pte_at() Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 08/18] powerpc/8xx: Rework support for 8M pages using contiguous PTE entries Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 09/18] powerpc/8xx: Simplify struct mmu_psize_def Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 10/18] powerpc/e500: Remove enc and ind fields from " Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 11/18] powerpc/e500: Switch to 64 bits PGD on 85xx (32 bits) Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 12/18] powerpc/e500: Encode hugepage size in PTE bits Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 13/18] powerpc/e500: Don't pre-check write access on data TLB error Christophe Leroy
2024-06-10 5:54 ` [PATCH v5 14/18] powerpc/e500: Free r10 for FIND_PTE Christophe Leroy
2024-06-10 5:55 ` [PATCH v5 15/18] powerpc/e500: Use contiguous PMD instead of hugepd Christophe Leroy
2024-06-10 5:55 ` [PATCH v5 16/18] powerpc/64s: Use contiguous PMD/PUD instead of HUGEPD Christophe Leroy
2024-06-13 7:39 ` Oscar Salvador
2024-06-24 14:24 ` LEROY Christophe
2024-06-10 5:55 ` [PATCH v5 17/18] powerpc/mm: Remove hugepd leftovers Christophe Leroy
2024-06-10 5:55 ` [PATCH v5 18/18] mm: Remove CONFIG_ARCH_HAS_HUGEPD Christophe Leroy