From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Muchun Song <muchun.song@linux.dev>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Andrew Morton <akpm@linux-foundation.org>,
Uladzislau Rezki <urezki@gmail.com>,
Christoph Hellwig <hch@infradead.org>,
Mark Rutland <mark.rutland@arm.com>,
Ard Biesheuvel <ardb@kernel.org>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Dev Jain <dev.jain@arm.com>,
Alexandre Ghiti <alexghiti@rivosinc.com>,
Steve Capper <steve.capper@linaro.org>,
Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>,
linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: [PATCH v1 08/16] arm64/mm: Hoist barriers out of ___set_ptes() loop
Date: Wed, 5 Feb 2025 15:09:48 +0000 [thread overview]
Message-ID: <20250205151003.88959-9-ryan.roberts@arm.com> (raw)
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
___set_ptes() previously called __set_pte() for each PTE in the range,
which would conditionally issue a DSB and ISB to make the new PTE value
immediately visible to the table walker if the new PTE was valid and for
kernel space.
We can do better than this; let's hoist those barriers out of the loop
so that they are only issued once at the end of the loop. We then reduce
the cost by the number of PTEs in the range.
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/pgtable.h | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 3b55d9a15f05..1d428e9c0e5a 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -317,10 +317,8 @@ static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
WRITE_ONCE(*ptep, pte);
}
-static inline void __set_pte(pte_t *ptep, pte_t pte)
+static inline void __set_pte_complete(pte_t pte)
{
- __set_pte_nosync(ptep, pte);
-
/*
* Only if the new pte is valid and kernel, otherwise TLB maintenance
* or update_mmu_cache() have the necessary barriers.
@@ -331,6 +329,12 @@ static inline void __set_pte(pte_t *ptep, pte_t pte)
}
}
+static inline void __set_pte(pte_t *ptep, pte_t pte)
+{
+ __set_pte_nosync(ptep, pte);
+ __set_pte_complete(pte);
+}
+
static inline pte_t __ptep_get(pte_t *ptep)
{
return READ_ONCE(*ptep);
@@ -647,12 +651,14 @@ static inline void ___set_ptes(struct mm_struct *mm, pte_t *ptep, pte_t pte,
for (;;) {
__check_safe_pte_update(mm, ptep, pte);
- __set_pte(ptep, pte);
+ __set_pte_nosync(ptep, pte);
if (--nr == 0)
break;
ptep++;
pte = pte_advance_pfn(pte, stride);
}
+
+ __set_pte_complete(pte);
}
static inline void __set_ptes(struct mm_struct *mm,
--
2.43.0
next prev parent reply other threads:[~2025-02-05 15:10 UTC|newest]
Thread overview: 62+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-05 15:09 [PATCH v1 00/16] hugetlb and vmalloc fixes and perf improvements Ryan Roberts
2025-02-05 15:09 ` [PATCH v1 01/16] mm: hugetlb: Add huge page size param to huge_ptep_get_and_clear() Ryan Roberts
2025-02-06 5:03 ` Anshuman Khandual
2025-02-06 12:15 ` Ryan Roberts
2025-02-05 15:09 ` [PATCH v1 02/16] arm64: hugetlb: Fix huge_ptep_get_and_clear() for non-present ptes Ryan Roberts
2025-02-06 6:15 ` Anshuman Khandual
2025-02-06 12:55 ` Ryan Roberts
2025-02-12 14:44 ` Ryan Roberts
2025-02-05 15:09 ` [PATCH v1 03/16] arm64: hugetlb: Fix flush_hugetlb_tlb_range() invalidation level Ryan Roberts
2025-02-06 6:46 ` Anshuman Khandual
2025-02-06 13:04 ` Ryan Roberts
2025-02-13 4:57 ` Anshuman Khandual
2025-02-05 15:09 ` [PATCH v1 04/16] arm64: hugetlb: Refine tlb maintenance scope Ryan Roberts
2025-02-05 15:09 ` [PATCH v1 05/16] mm/page_table_check: Batch-check pmds/puds just like ptes Ryan Roberts
2025-02-06 10:55 ` Anshuman Khandual
2025-02-06 13:07 ` Ryan Roberts
2025-02-05 15:09 ` [PATCH v1 06/16] arm64/mm: Refactor __set_ptes() and __ptep_get_and_clear() Ryan Roberts
2025-02-06 11:48 ` Anshuman Khandual
2025-02-06 13:26 ` Ryan Roberts
2025-02-07 9:38 ` Ryan Roberts
2025-02-12 15:29 ` Ryan Roberts
2025-02-05 15:09 ` [PATCH v1 07/16] arm64: hugetlb: Use ___set_ptes() and ___ptep_get_and_clear() Ryan Roberts
2025-02-07 4:09 ` Anshuman Khandual
2025-02-07 10:00 ` Ryan Roberts
2025-02-05 15:09 ` Ryan Roberts [this message]
2025-02-07 5:35 ` [PATCH v1 08/16] arm64/mm: Hoist barriers out of ___set_ptes() loop Anshuman Khandual
2025-02-07 10:38 ` Ryan Roberts
2025-02-12 16:00 ` Ryan Roberts
2025-02-05 15:09 ` [PATCH v1 09/16] arm64/mm: Avoid barriers for invalid or userspace mappings Ryan Roberts
2025-02-07 8:11 ` Anshuman Khandual
2025-02-07 10:53 ` Ryan Roberts
2025-02-12 16:48 ` Ryan Roberts
2025-02-05 15:09 ` [PATCH v1 10/16] mm/vmalloc: Warn on improper use of vunmap_range() Ryan Roberts
2025-02-07 8:41 ` Anshuman Khandual
2025-02-07 10:59 ` Ryan Roberts
2025-02-13 6:36 ` Anshuman Khandual
2025-02-05 15:09 ` [PATCH v1 11/16] mm/vmalloc: Gracefully unmap huge ptes Ryan Roberts
2025-02-07 9:19 ` Anshuman Khandual
2025-02-05 15:09 ` [PATCH v1 12/16] arm64/mm: Support huge pte-mapped pages in vmap Ryan Roberts
2025-02-07 10:04 ` Anshuman Khandual
2025-02-07 11:20 ` Ryan Roberts
2025-02-13 6:32 ` Anshuman Khandual
2025-02-13 9:09 ` Ryan Roberts
2025-02-17 4:33 ` Anshuman Khandual
2025-02-05 15:09 ` [PATCH v1 13/16] mm: Don't skip arch_sync_kernel_mappings() in error paths Ryan Roberts
2025-02-07 10:21 ` Anshuman Khandual
2025-02-05 15:09 ` [PATCH v1 14/16] mm/vmalloc: Batch arch_sync_kernel_mappings() more efficiently Ryan Roberts
2025-02-10 7:11 ` Anshuman Khandual
2025-02-05 15:09 ` [PATCH v1 15/16] mm: Generalize arch_sync_kernel_mappings() Ryan Roberts
2025-02-10 7:45 ` Anshuman Khandual
2025-02-10 11:04 ` Ryan Roberts
2025-02-13 5:57 ` Anshuman Khandual
2025-02-13 9:17 ` Ryan Roberts
2025-02-05 15:09 ` [PATCH v1 16/16] arm64/mm: Defer barriers when updating kernel mappings Ryan Roberts
2025-02-10 8:03 ` Anshuman Khandual
2025-02-10 11:12 ` Ryan Roberts
2025-02-13 5:30 ` Anshuman Khandual
2025-02-13 9:38 ` Ryan Roberts
2025-02-17 4:48 ` Anshuman Khandual
2025-02-17 9:40 ` Ryan Roberts
2025-02-06 7:52 ` [PATCH v1 00/16] hugetlb and vmalloc fixes and perf improvements Andrew Morton
2025-02-06 11:59 ` Ryan Roberts
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250205151003.88959-9-ryan.roberts@arm.com \
--to=ryan.roberts@arm.com \
--cc=akpm@linux-foundation.org \
--cc=alexghiti@rivosinc.com \
--cc=anshuman.khandual@arm.com \
--cc=ardb@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=dev.jain@arm.com \
--cc=hch@infradead.org \
--cc=kevin.brodsky@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mark.rutland@arm.com \
--cc=muchun.song@linux.dev \
--cc=pasha.tatashin@soleen.com \
--cc=steve.capper@linaro.org \
--cc=urezki@gmail.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox