From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Andrew Morton <akpm@linux-foundation.org>,
Uladzislau Rezki <urezki@gmail.com>,
Christoph Hellwig <hch@infradead.org>,
David Hildenbrand <david@redhat.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Mark Rutland <mark.rutland@arm.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Alexandre Ghiti <alexghiti@rivosinc.com>,
Kevin Brodsky <kevin.brodsky@arm.com>,
linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 10/14] arm64/mm: Support huge pte-mapped pages in vmap
Date: Tue, 25 Feb 2025 16:57:47 +0000
Message-ID: <601fa2ab-aa75-4a34-8754-65c1d0b9a8c7@arm.com>
In-Reply-To: <Z7xgKjhMZAY5mOwK@arm.com>

On 24/02/2025 12:03, Catalin Marinas wrote:
> On Mon, Feb 17, 2025 at 02:08:02PM +0000, Ryan Roberts wrote:
>> Implement the required arch functions to enable use of contpte in the
>> vmap when VM_ALLOW_HUGE_VMAP is specified. This speeds up vmap
>> operations due to only having to issue a DSB and ISB per contpte block
>> instead of per pte.
>
> For non-cont PTEs, do you happen to know how often vmap_pte_range() is
> called for multiple entries?
It's mostly when vmalloc_huge() is called, and for remapping IO regions, I
believe. I don't have numbers though.
> It might be worth changing that to use
> set_ptes() directly and we get some benefit as well.
I thought about that, but it's a bit of a pain, given the current
arch_vmap_pte_range_map_size() opt-in for set_huge_pte_at().
Ideally I think we would just set_ptes() the whole lot (or at least in chunks of
max_page_shift so we honour the "must belong to same folio" requirement). Then
let contpte decide when to use contiguous mappings. But we currently block
contpte operations for kernel mappings because of the potential need to
invalidate, which can cause faults. So that would need work. And we would need
to rework powerpc somehow.
The alternative would be to keep the arch_vmap_pte_range_map_size() opt-in and
somehow expose the alignment/size requirements to vmap_pte_range() so it can do
set_ptes() in the gaps. But that all felt quite messy.
In the end, I decided we would get the benefits via arch_make_huge_pte() for
>=64K folios, and chose not to worry about the <=32K folio cases.
>
>> But it also means that the TLB pressure reduces due
>> to only needing a single TLB entry for the whole contpte block.
>>
>> Since vmap uses set_huge_pte_at() to set the contpte, that API is now
>> used for kernel mappings for the first time. Although in the vmap case
>> we never expect it to be called to modify a valid mapping so
>> clear_flush() should never be called, it's still wise to make it robust
>> for the kernel case, so amend the tlb flush function if the mm is for
>> kernel space.
>>
>> Tested with vmalloc performance selftests:
>>
>> # kself/mm/test_vmalloc.sh \
>> run_test_mask=1
>> test_repeat_count=5
>> nr_pages=256
>> test_loop_count=100000
>> use_huge=1
>>
>> Duration reduced from 1274243 usec to 1083553 usec on Apple M2, a 15%
>> reduction in time taken.
>>
>> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>> arch/arm64/include/asm/vmalloc.h | 46 ++++++++++++++++++++++++++++++++
>> arch/arm64/mm/hugetlbpage.c | 5 +++-
>> 2 files changed, 50 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/include/asm/vmalloc.h b/arch/arm64/include/asm/vmalloc.h
>> index 38fafffe699f..40ebc664190b 100644
>> --- a/arch/arm64/include/asm/vmalloc.h
>> +++ b/arch/arm64/include/asm/vmalloc.h
>> @@ -23,6 +23,52 @@ static inline bool arch_vmap_pmd_supported(pgprot_t prot)
>> return !IS_ENABLED(CONFIG_PTDUMP_DEBUGFS);
>> }
>>
>> +#define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size
>> +static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr,
>> + unsigned long end, u64 pfn,
>> + unsigned int max_page_shift)
>> +{
>> + /*
>> + * If the block is at least CONT_PTE_SIZE in size, and is naturally
>> + * aligned in both virtual and physical space, then we can pte-map the
>> + * block using the PTE_CONT bit for more efficient use of the TLB.
>> + */
>> +
>
> Nit: unnecessary empty line.
Will fix.
>
>> + if (max_page_shift < CONT_PTE_SHIFT)
>> + return PAGE_SIZE;
>> +
>> + if (end - addr < CONT_PTE_SIZE)
>> + return PAGE_SIZE;
>> +
>> + if (!IS_ALIGNED(addr, CONT_PTE_SIZE))
>> + return PAGE_SIZE;
>> +
>> + if (!IS_ALIGNED(PFN_PHYS(pfn), CONT_PTE_SIZE))
>> + return PAGE_SIZE;
>> +
>> + return CONT_PTE_SIZE;
>> +}
>> +
>> +#define arch_vmap_pte_range_unmap_size arch_vmap_pte_range_unmap_size
>> +static inline unsigned long arch_vmap_pte_range_unmap_size(unsigned long addr,
>> + pte_t *ptep)
>> +{
>> + /*
>> + * The caller handles alignment so it's sufficient just to check
>> + * PTE_CONT.
>> + */
>> + return pte_valid_cont(__ptep_get(ptep)) ? CONT_PTE_SIZE : PAGE_SIZE;
>> +}
>> +
>> +#define arch_vmap_pte_supported_shift arch_vmap_pte_supported_shift
>> +static inline int arch_vmap_pte_supported_shift(unsigned long size)
>> +{
>> + if (size >= CONT_PTE_SIZE)
>> + return CONT_PTE_SHIFT;
>> +
>> + return PAGE_SHIFT;
>> +}
>> +
>> #endif
>>
>> #define arch_vmap_pgprot_tagged arch_vmap_pgprot_tagged
>> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
>> index 8ac86cd180b3..a29f347fea54 100644
>> --- a/arch/arm64/mm/hugetlbpage.c
>> +++ b/arch/arm64/mm/hugetlbpage.c
>> @@ -217,7 +217,10 @@ static void clear_flush(struct mm_struct *mm,
>> for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
>> ptep_get_and_clear_anysz(mm, ptep, pgsize);
>>
>> - __flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
>> + if (mm == &init_mm)
>> + flush_tlb_kernel_range(saddr, addr);
>> + else
>> + __flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
>> }
>>
>> void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>