From: Oscar Salvador <osalvador@suse.com>
To: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Jason Gunthorpe <jgg@nvidia.com>, Peter Xu <peterx@redhat.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Nicholas Piggin <npiggin@gmail.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [RFC PATCH v4 13/16] powerpc/e500: Use contiguous PMD instead of hugepd
Date: Wed, 29 May 2024 10:49:21 +0200
Message-ID: <ZlbsEb_T2eQYO-g4@localhost.localdomain>
In-Reply-To: <56cf925576285e2b97550f4f7317183d98d596c5.1716815901.git.christophe.leroy@csgroup.eu>
On Mon, May 27, 2024 at 03:30:11PM +0200, Christophe Leroy wrote:
> e500 supports many page sizes among which the following size are
> implemented in the kernel at the time being: 4M, 16M, 64M, 256M, 1G.
>
> On e500, TLB miss for hugepages is exclusively handled by SW even
> on e6500 which has HW assistance for 4k pages, so there are no
> constraints like on the 8xx.
>
> On e500/32, all are at PGD/PMD level and can be handled as
> cont-PMD.
>
> On e500/64, smaller ones are on PMD while bigger ones are on PUD.
> Again, they can easily be handled as cont-PMD and cont-PUD instead
> of hugepd.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
...
> diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
> index 90d6a0943b35..f7421d1a1693 100644
> --- a/arch/powerpc/include/asm/nohash/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/pgtable.h
> @@ -52,11 +52,36 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p
> {
> pte_basic_t old = pte_val(*p);
> pte_basic_t new = (old & ~(pte_basic_t)clr) | set;
> + unsigned long sz;
> + unsigned long pdsize;
> + int i;
>
> if (new == old)
> return old;
>
> - *p = __pte(new);
> +#ifdef CONFIG_PPC_E500
> + if (huge)
> + sz = 1UL << (((old & _PAGE_HSIZE_MSK) >> _PAGE_HSIZE_SHIFT) + 20);
> + else
I think this will not compile when CONFIG_PPC_85xx && !CONFIG_PTE_64BIT.
_PAGE_HSIZE_MSK and _PAGE_HSIZE_SHIFT are defined in
arch/powerpc/include/asm/nohash/hugetlb-e500.h, but that header is only
included if CONFIG_PPC_85xx && CONFIG_PTE_64BIT
(see arch/powerpc/include/asm/nohash/32/pgtable.h).
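To illustrate the concern, here is a small user-space model, not kernel code. It guards the size decode on the macro itself rather than on the platform symbol, so a build that never sees the header falls back to the base page size. All MODEL_* names and the mask/shift values are made up for the example.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE   4096UL

/*
 * Hypothetical stand-ins for the macros that, in the kernel, would come
 * from a header included only in some configurations. Delete these two
 * defines to model a build where that header is never included.
 */
#define MODEL_HSIZE_MSK   (0xfUL << 14)   /* assumed value */
#define MODEL_HSIZE_SHIFT 14              /* assumed value */

/* Size covered by one entry: decoded from the entry if possible. */
static unsigned long model_entry_size(unsigned long long old, int huge)
{
#ifdef MODEL_HSIZE_MSK
	/* Only reference the macros when they are actually defined. */
	if (huge)
		return 1UL << (((old & MODEL_HSIZE_MSK) >> MODEL_HSIZE_SHIFT) + 20);
#else
	(void)old;
	(void)huge;
#endif
	return MODEL_PAGE_SIZE;
}
```

With the defines present, model_entry_size(2ULL << 14, 1) decodes to 4M. With them removed, every call returns MODEL_PAGE_SIZE and the file still builds.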
> +#endif
> + sz = PAGE_SIZE;
> +
> + if (!huge || sz < PMD_SIZE)
> + pdsize = PAGE_SIZE;
> + else if (sz < PUD_SIZE)
> + pdsize = PMD_SIZE;
> + else if (sz < P4D_SIZE)
> + pdsize = PUD_SIZE;
> + else if (sz < PGDIR_SIZE)
> + pdsize = P4D_SIZE;
> + else
> + pdsize = PGDIR_SIZE;
> +
> + for (i = 0; i < sz / pdsize; i++, p++) {
> + *p = __pte(new);
> + if (new)
> + new += (unsigned long long)(pdsize / PAGE_SIZE) << PTE_RPN_SHIFT;
I guess 'new' can be 0 if pte_update() is called to clear the pte?
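The loop can be modelled in user space to see what the 'if (new)' check buys. This is a sketch under assumptions: the table is a plain array, and MODEL_RPN_SHIFT is an invented value rather than the real PTE_RPN_SHIFT. A zero value is replicated unchanged, while a non-zero value has its page number advanced for each following entry.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL
#define MODEL_RPN_SHIFT 24        /* assumed value, for illustration only */

/* Scratch table standing in for consecutive page-table entries. */
static uint64_t model_table[4];

/*
 * Write val into sz / pdsize consecutive entries. A non-zero value gets
 * its page number advanced by one pdsize worth of pages per entry; a
 * zero value (clearing) is written as-is to every entry.
 * Returns the number of entries written.
 */
static unsigned int model_fill(uint64_t *p, uint64_t val,
			       uint64_t sz, uint64_t pdsize)
{
	unsigned int i;

	for (i = 0; i < sz / pdsize; i++, p++) {
		*p = val;
		if (val)
			val += (pdsize / MODEL_PAGE_SIZE) << MODEL_RPN_SHIFT;
	}
	return i;
}
```

Without the check, clearing a 4M mapping made of two 2M entries would leave a stray page-number increment in the second entry instead of 0.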
> +static inline unsigned long pmd_leaf_size(pmd_t pmd)
> +{
> + return 1UL << (((pmd_val(pmd) & _PAGE_HSIZE_MSK) >> _PAGE_HSIZE_SHIFT) + 20);
Can we have the '20' defined somewhere, with a comment on top explaining
what it is, so it is not a magic number?
Otherwise people might look at this and wonder why 20.
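One possible shape for that, as a user-space sketch. The names MODEL_HSIZE_BASE_SHIFT, MODEL_HSIZE_MSK and MODEL_HSIZE_SHIFT and the mask/shift values are hypothetical. The only fact relied on is arithmetic: 1 << (n + 20) is 2^n MiB, so 20 is log2 of 1 MiB.

```c
#include <assert.h>

#define MODEL_HSIZE_MSK   (0xfULL << 14)  /* assumed value */
#define MODEL_HSIZE_SHIFT 14              /* assumed value */

/*
 * The size field holds log2 of the page size in MiB, so a field value
 * of n means a page of 2^n MiB = 1 << (n + 20) bytes.
 */
#define MODEL_HSIZE_BASE_SHIFT 20         /* log2(1 MiB) */

/* Decode the leaf size in bytes from a raw entry value. */
static unsigned long long model_leaf_size(unsigned long long val)
{
	unsigned int field = (val & MODEL_HSIZE_MSK) >> MODEL_HSIZE_SHIFT;

	return 1ULL << (field + MODEL_HSIZE_BASE_SHIFT);
}
```

For example, a field value of 4 decodes to 16M and a field value of 10 decodes to 1G.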
> --- a/arch/powerpc/mm/pgtable.c
> +++ b/arch/powerpc/mm/pgtable.c
> @@ -331,6 +331,37 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
> __set_huge_pte_at(pmdp, ptep, pte_val(pte));
> }
> }
> +#elif defined(CONFIG_PPC_E500)
> +void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
> + pte_t pte, unsigned long sz)
> +{
> + unsigned long pdsize;
> + int i;
> +
> + pte = set_pte_filter(pte, addr);
> +
> + /*
> + * Make sure hardware valid bit is not set. We don't do
> + * tlb flush for this update.
> + */
> + VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
> +
> + if (sz < PMD_SIZE)
> + pdsize = PAGE_SIZE;
> + else if (sz < PUD_SIZE)
> + pdsize = PMD_SIZE;
> + else if (sz < P4D_SIZE)
> + pdsize = PUD_SIZE;
> + else if (sz < PGDIR_SIZE)
> + pdsize = P4D_SIZE;
> + else
> + pdsize = PGDIR_SIZE;
> +
> + for (i = 0; i < sz / pdsize; i++, ptep++, addr += pdsize) {
> + __set_pte_at(mm, addr, ptep, pte, 0);
> + pte = __pte(pte_val(pte) + ((unsigned long long)pdsize / PAGE_SIZE << PFN_PTE_SHIFT));
Could you use pte_advance_pfn() here? It takes a number of pages and
applies PFN_PTE_SHIFT itself, so something like:
nr = pdsize / PAGE_SIZE;
pte = pte_advance_pfn(pte, nr);
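One thing worth checking with any such helper is the width of the shift, which is presumably why the patch casts to unsigned long long. The model below is a sketch under assumptions: uint32_t stands in for a 32-bit unsigned long, MODEL_PFN_SHIFT is an invented value, and neither function is a real kernel API.

```c
#include <stdint.h>

#define MODEL_PFN_SHIFT 24   /* assumed value, for illustration only */

/* Shift done in 32 bits: high bits of nr << shift are silently lost. */
static uint64_t model_advance_narrow(uint64_t pte, uint32_t nr)
{
	return pte + (uint32_t)(nr << MODEL_PFN_SHIFT);
}

/* Shift done in 64 bits: the increment survives intact. */
static uint64_t model_advance_wide(uint64_t pte, uint32_t nr)
{
	return pte + ((uint64_t)nr << MODEL_PFN_SHIFT);
}
```

With nr = 512 (2M worth of 4k pages) the narrow variant adds 0, because 512 << 24 is 2^33 and does not fit in 32 bits. The wide variant adds the full 2^33.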
Which 'sz' values can we have here? You mentioned that e500 supports
4M, 16M, 64M, 256M and 1G.
Which of these can be huge?
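To make the question concrete, the pdsize ladder can be modelled in user space to see how many consecutive entries each size needs. This is a sketch only: the level sizes in model_example are a hypothetical geometry (4k pages, 2M PMD, 1G PUD, upper levels folded), not the actual e500 layout.

```c
#include <assert.h>

/* Span covered by one entry at each page-table level, in bytes. */
struct model_levels {
	unsigned long long page, pmd, pud, p4d, pgdir;
};

/* Same ladder as in the patch: pick the level whose entries hold sz. */
static unsigned long long model_pdsize(const struct model_levels *lv,
				       unsigned long long sz)
{
	if (sz < lv->pmd)
		return lv->page;
	if (sz < lv->pud)
		return lv->pmd;
	if (sz < lv->p4d)
		return lv->pud;
	if (sz < lv->pgdir)
		return lv->p4d;
	return lv->pgdir;
}

/* Number of consecutive entries needed to map a huge page of size sz. */
static unsigned long long model_nr_entries(const struct model_levels *lv,
					   unsigned long long sz)
{
	unsigned long long pdsize = model_pdsize(lv, sz);

	assert(sz % pdsize == 0);	/* sz must be a multiple of the level span */
	return sz / pdsize;
}

/* Hypothetical geometry, for illustration only. */
static const struct model_levels model_example = {
	4096ULL, 2ULL << 20, 1ULL << 30, 1ULL << 39, 1ULL << 39
};
```

Under that assumed geometry, 4M takes 2 PMD-level entries, 256M takes 128, and 1G takes a single PUD-level entry.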
--
Oscar Salvador
SUSE Labs