* [PATCH v2 00/11] riscv: ASID-related and UP-related TLB flush enhancements
@ 2023-10-28 23:11 Samuel Holland
2023-10-28 23:11 ` [PATCH v2 01/11] riscv: Improve tlb_flush() Samuel Holland
` (11 more replies)
0 siblings, 12 replies; 13+ messages in thread
From: Samuel Holland @ 2023-10-28 23:11 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Samuel Holland
While reviewing Alexandre Ghiti's "riscv: tlb flush improvements"
series[1], I noticed that most TLB flush functions end up as a call to
local_flush_tlb_all() when SMP is disabled. This series resolves that.
Along the way, I realized that we should be using single-ASID flushes
wherever possible, so I implemented that as well.
[1]: https://lore.kernel.org/linux-riscv/20231019140151.21629-1-alexghiti@rivosinc.com/
---
This series is based on v5 of Alexandre's changes, which I have included
here so the series can be built by the CI bots. I will rebase once his
series is merged.
Changes in v2:
- Rebase on Alexandre's "riscv: tlb flush improvements" series v5
- Move the SMP/UP merge earlier in the series to avoid build issues
- Make a copy of __flush_tlb_range() instead of adding ifdefs inside
- local_flush_tlb_all() is the only function used on !MMU (smpboot.c)
Alexandre Ghiti (4):
riscv: Improve tlb_flush()
riscv: Improve flush_tlb_range() for hugetlb pages
riscv: Make __flush_tlb_range() loop over pte instead of flushing the
whole tlb
riscv: Improve flush_tlb_kernel_range()
Samuel Holland (7):
riscv: mm: Combine the SMP and UP TLB flush code
riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma
riscv: mm: Introduce cntx2asid/cntx2version helper macros
riscv: mm: Use a fixed layout for the MM context ID
riscv: mm: Make asid_bits a local variable
riscv: mm: Preserve global TLB entries when switching contexts
riscv: mm: Always use ASID to flush MM contexts
arch/riscv/include/asm/errata_list.h | 12 +-
arch/riscv/include/asm/mmu.h | 3 +
arch/riscv/include/asm/mmu_context.h | 2 -
arch/riscv/include/asm/sbi.h | 3 -
arch/riscv/include/asm/tlb.h | 8 +-
arch/riscv/include/asm/tlbflush.h | 59 +++++----
arch/riscv/kernel/sbi.c | 32 ++---
arch/riscv/mm/Makefile | 5 +-
arch/riscv/mm/context.c | 26 ++--
arch/riscv/mm/tlbflush.c | 184 ++++++++++++++++-----------
10 files changed, 186 insertions(+), 148 deletions(-)
--
2.42.0
* [PATCH v2 01/11] riscv: Improve tlb_flush()
From: Samuel Holland @ 2023-10-28 23:11 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Andrew Jones, Lad Prabhakar, Samuel Holland
From: Alexandre Ghiti <alexghiti@rivosinc.com>
For now, tlb_flush() simply calls flush_tlb_mm() which results in a
flush of the whole TLB. So let's use mmu_gather fields to provide a more
fine-grained flush of the TLB.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/Five SMARC
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
Changes in v2:
- Rebase on Alexandre's "riscv: tlb flush improvements" series v5
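The choice described in the commit message can be modelled outside the kernel. The following is a minimal user-space sketch, not the kernel code: struct gather_model, enum flush_kind and choose_flush() are hypothetical names that stand in for the few mmu_gather fields the patch reads, and the page-size argument is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mmu_gather fields consulted by tlb_flush(). */
struct gather_model {
	bool fullmm;		/* the whole address space is going away */
	bool need_flush_all;	/* a full flush was explicitly requested */
	unsigned long start;	/* first address that was unmapped */
	unsigned long end;	/* one past the last unmapped address */
};

enum flush_kind { FLUSH_WHOLE_MM, FLUSH_RANGE };

/*
 * Mirror the branch added to tlb_flush(): fall back to a whole-mm flush
 * only when the gather state says so, otherwise flush [start, end).
 * The size that would be flushed is reported through *size.
 */
enum flush_kind choose_flush(const struct gather_model *tlb,
			     unsigned long *size)
{
	assert(tlb != NULL && size != NULL);

	if (tlb->fullmm || tlb->need_flush_all) {
		/* flush_tlb_mm() in this patch passes -1 as the size */
		*size = (unsigned long)-1;
		return FLUSH_WHOLE_MM;
	}

	assert(tlb->end >= tlb->start);
	*size = tlb->end - tlb->start;
	return FLUSH_RANGE;
}
```

With this model, a gather covering 0x1000..0x3000 and neither flag set yields FLUSH_RANGE with a size of 0x2000, while setting fullmm yields FLUSH_WHOLE_MM regardless of the range.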
arch/riscv/include/asm/tlb.h | 8 +++++++-
arch/riscv/include/asm/tlbflush.h | 3 +++
arch/riscv/mm/tlbflush.c | 7 +++++++
3 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/include/asm/tlb.h b/arch/riscv/include/asm/tlb.h
index 120bcf2ed8a8..1eb5682b2af6 100644
--- a/arch/riscv/include/asm/tlb.h
+++ b/arch/riscv/include/asm/tlb.h
@@ -15,7 +15,13 @@ static void tlb_flush(struct mmu_gather *tlb);
static inline void tlb_flush(struct mmu_gather *tlb)
{
- flush_tlb_mm(tlb->mm);
+#ifdef CONFIG_MMU
+ if (tlb->fullmm || tlb->need_flush_all)
+ flush_tlb_mm(tlb->mm);
+ else
+ flush_tlb_mm_range(tlb->mm, tlb->start, tlb->end,
+ tlb_get_unmap_size(tlb));
+#endif
}
#endif /* _ASM_RISCV_TLB_H */
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index a09196f8de68..f5c4fb0ae642 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -32,6 +32,8 @@ static inline void local_flush_tlb_page(unsigned long addr)
#if defined(CONFIG_SMP) && defined(CONFIG_MMU)
void flush_tlb_all(void);
void flush_tlb_mm(struct mm_struct *mm);
+void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
+ unsigned long end, unsigned int page_size);
void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr);
void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end);
@@ -52,6 +54,7 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
}
#define flush_tlb_mm(mm) flush_tlb_all()
+#define flush_tlb_mm_range(mm, start, end, page_size) flush_tlb_all()
#endif /* !CONFIG_SMP || !CONFIG_MMU */
/* Flush a range of kernel pages */
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 77be59aadc73..fa03289853d8 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -132,6 +132,13 @@ void flush_tlb_mm(struct mm_struct *mm)
__flush_tlb_range(mm, 0, -1, PAGE_SIZE);
}
+void flush_tlb_mm_range(struct mm_struct *mm,
+ unsigned long start, unsigned long end,
+ unsigned int page_size)
+{
+ __flush_tlb_range(mm, start, end - start, page_size);
+}
+
void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)
{
__flush_tlb_range(vma->vm_mm, addr, PAGE_SIZE, PAGE_SIZE);
--
2.42.0
* [PATCH v2 02/11] riscv: Improve flush_tlb_range() for hugetlb pages
From: Samuel Holland @ 2023-10-28 23:12 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Samuel Holland
From: Alexandre Ghiti <alexghiti@rivosinc.com>
flush_tlb_range() uses a fixed stride of PAGE_SIZE, so in its current form
it flushes the whole TLB whenever a hugetlb mapping needs to be flushed.
Set the stride to the size of the hugetlb mapping so that only that mapping
is flushed. However, if the hugepage is a NAPOT region, all PTEs that
constitute this mapping must be invalidated, so the stride size must
actually be the size of the PTE.
Note that THPs are directly handled by flush_pmd_tlb_range().
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
[Samuel: Removed CONFIG_RISCV_ISA_SVNAPOT check]
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
Changes in v2:
- Rebase on Alexandre's "riscv: tlb flush improvements" series v5
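The stride selection can be tried in isolation with a small user-space model. This is only a sketch under stated assumptions: the MODEL_* constants assume Sv57-style level sizes with 4K base pages and are not the kernel macros, and hugetlb_stride() is a hypothetical helper that takes the Svnapot availability as a plain argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed page-table level sizes (Sv57, 4K base pages); illustrative only. */
#define MODEL_PAGE_SIZE		(1ULL << 12)
#define MODEL_PMD_SIZE		(1ULL << 21)
#define MODEL_PUD_SIZE		(1ULL << 30)
#define MODEL_P4D_SIZE		(1ULL << 39)
#define MODEL_PGDIR_SIZE	(1ULL << 48)

/*
 * Pick the flush stride for a hugetlb mapping of huge_size bytes.
 * Without Svnapot the mapping size itself is the stride. With Svnapot
 * the stride is rounded down to the size of the PTE level backing the
 * mapping, so that every PTE of a NAPOT region gets invalidated.
 */
unsigned long long hugetlb_stride(unsigned long long huge_size, bool has_napot)
{
	assert(huge_size >= MODEL_PAGE_SIZE);

	if (!has_napot)
		return huge_size;

	if (huge_size >= MODEL_PGDIR_SIZE)
		return MODEL_PGDIR_SIZE;
	if (huge_size >= MODEL_P4D_SIZE)
		return MODEL_P4D_SIZE;
	if (huge_size >= MODEL_PUD_SIZE)
		return MODEL_PUD_SIZE;
	if (huge_size >= MODEL_PMD_SIZE)
		return MODEL_PMD_SIZE;
	return MODEL_PAGE_SIZE;
}
```

For example, a 64K NAPOT hugepage ends up with a 4K stride in this model, whereas a 2M hugepage keeps a 2M stride whether or not Svnapot is available.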
arch/riscv/mm/tlbflush.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index fa03289853d8..b6d712a82306 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -3,6 +3,7 @@
#include <linux/mm.h>
#include <linux/smp.h>
#include <linux/sched.h>
+#include <linux/hugetlb.h>
#include <asm/sbi.h>
#include <asm/mmu_context.h>
@@ -147,7 +148,33 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)
void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end)
{
- __flush_tlb_range(vma->vm_mm, start, end - start, PAGE_SIZE);
+ unsigned long stride_size;
+
+ if (!is_vm_hugetlb_page(vma)) {
+ stride_size = PAGE_SIZE;
+ } else {
+ stride_size = huge_page_size(hstate_vma(vma));
+
+ /*
+ * As stated in the privileged specification, every PTE in a
+ * NAPOT region must be invalidated, so reset the stride in that
+ * case.
+ */
+ if (has_svnapot()) {
+ if (stride_size >= PGDIR_SIZE)
+ stride_size = PGDIR_SIZE;
+ else if (stride_size >= P4D_SIZE)
+ stride_size = P4D_SIZE;
+ else if (stride_size >= PUD_SIZE)
+ stride_size = PUD_SIZE;
+ else if (stride_size >= PMD_SIZE)
+ stride_size = PMD_SIZE;
+ else
+ stride_size = PAGE_SIZE;
+ }
+ }
+
+ __flush_tlb_range(vma->vm_mm, start, end - start, stride_size);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
--
2.42.0
* [PATCH v2 03/11] riscv: Make __flush_tlb_range() loop over pte instead of flushing the whole tlb
From: Samuel Holland @ 2023-10-28 23:12 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Mayuresh Chitale, Andrew Jones,
Lad Prabhakar, Samuel Holland
From: Alexandre Ghiti <alexghiti@rivosinc.com>
Currently, when the range to flush covers more than one page (a 4K page or
a hugepage), __flush_tlb_range() flushes the whole TLB. Flushing the whole
TLB costs more than flushing a single entry, so flush single entries up to
a certain threshold, chosen so that:
threshold * cost of flushing a single entry < cost of flushing the whole
TLB.
Co-developed-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/Five SMARC
[Samuel: Fixed type of nr_ptes_in_range]
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
Changes in v2:
- Rebase on Alexandre's "riscv: tlb flush improvements" series v5
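The threshold rule can be checked with a short user-space sketch. It is a model under assumptions rather than the kernel function: entries_to_flush() and MODEL_FLUSH_ALL_THRESHOLD are hypothetical names, the threshold reuses the default of 64 from this patch, and a return value of 0 is an invented convention meaning "flush everything instead".

```c
#include <assert.h>

/* Same default as tlb_flush_all_threshold in this patch. */
#define MODEL_FLUSH_ALL_THRESHOLD	64UL

/*
 * Return how many single-entry flushes a range of size bytes needs
 * with the given stride, or 0 if the count exceeds the threshold and
 * one full flush should be issued instead.
 */
unsigned long entries_to_flush(unsigned long size, unsigned long stride)
{
	unsigned long nr_entries;

	assert(size != 0 && stride != 0);

	/* Round up, written so that size + stride cannot overflow. */
	nr_entries = size / stride + (size % stride != 0);

	if (nr_entries > MODEL_FLUSH_ALL_THRESHOLD)
		return 0;

	return nr_entries;
}
```

With a 4K stride, a 256K range is still flushed entry by entry (64 flushes) in this model, and a range one byte larger falls back to a full flush.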
arch/riscv/include/asm/sbi.h | 3 -
arch/riscv/include/asm/tlbflush.h | 3 +
arch/riscv/kernel/sbi.c | 32 +++------
arch/riscv/mm/tlbflush.c | 115 +++++++++++++++---------------
4 files changed, 72 insertions(+), 81 deletions(-)
diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index 5b4a1bf5f439..b79d0228144f 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -273,9 +273,6 @@ void sbi_set_timer(uint64_t stime_value);
void sbi_shutdown(void);
void sbi_send_ipi(unsigned int cpu);
int sbi_remote_fence_i(const struct cpumask *cpu_mask);
-int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
- unsigned long start,
- unsigned long size);
int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
unsigned long start,
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index f5c4fb0ae642..170a49c531c6 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -11,6 +11,9 @@
#include <asm/smp.h>
#include <asm/errata_list.h>
+#define FLUSH_TLB_MAX_SIZE ((unsigned long)-1)
+#define FLUSH_TLB_NO_ASID ((unsigned long)-1)
+
#ifdef CONFIG_MMU
extern unsigned long asid_mask;
diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
index c672c8ba9a2a..5a62ed1da453 100644
--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -11,6 +11,7 @@
#include <linux/reboot.h>
#include <asm/sbi.h>
#include <asm/smp.h>
+#include <asm/tlbflush.h>
/* default SBI version is 0.1 */
unsigned long sbi_spec_version __ro_after_init = SBI_SPEC_VERSION_DEFAULT;
@@ -376,32 +377,15 @@ int sbi_remote_fence_i(const struct cpumask *cpu_mask)
}
EXPORT_SYMBOL(sbi_remote_fence_i);
-/**
- * sbi_remote_sfence_vma() - Execute SFENCE.VMA instructions on given remote
- * harts for the specified virtual address range.
- * @cpu_mask: A cpu mask containing all the target harts.
- * @start: Start of the virtual address
- * @size: Total size of the virtual address range.
- *
- * Return: 0 on success, appropriate linux error code otherwise.
- */
-int sbi_remote_sfence_vma(const struct cpumask *cpu_mask,
- unsigned long start,
- unsigned long size)
-{
- return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA,
- cpu_mask, start, size, 0, 0);
-}
-EXPORT_SYMBOL(sbi_remote_sfence_vma);
-
/**
* sbi_remote_sfence_vma_asid() - Execute SFENCE.VMA instructions on given
- * remote harts for a virtual address range belonging to a specific ASID.
+ * remote harts for a virtual address range belonging to a specific ASID or not.
*
* @cpu_mask: A cpu mask containing all the target harts.
* @start: Start of the virtual address
* @size: Total size of the virtual address range.
- * @asid: The value of address space identifier (ASID).
+ * @asid: The value of address space identifier (ASID), or FLUSH_TLB_NO_ASID
+ * for flushing all address spaces.
*
* Return: 0 on success, appropriate linux error code otherwise.
*/
@@ -410,8 +394,12 @@ int sbi_remote_sfence_vma_asid(const struct cpumask *cpu_mask,
unsigned long size,
unsigned long asid)
{
- return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID,
- cpu_mask, start, size, asid, 0);
+ if (asid == FLUSH_TLB_NO_ASID)
+ return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA,
+ cpu_mask, start, size, 0, 0);
+ else
+ return __sbi_rfence(SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID,
+ cpu_mask, start, size, asid, 0);
}
EXPORT_SYMBOL(sbi_remote_sfence_vma_asid);
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index b6d712a82306..e46fefc70927 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -9,28 +9,50 @@
static inline void local_flush_tlb_all_asid(unsigned long asid)
{
- __asm__ __volatile__ ("sfence.vma x0, %0"
- :
- : "r" (asid)
- : "memory");
+ if (asid != FLUSH_TLB_NO_ASID)
+ __asm__ __volatile__ ("sfence.vma x0, %0"
+ :
+ : "r" (asid)
+ : "memory");
+ else
+ local_flush_tlb_all();
}
static inline void local_flush_tlb_page_asid(unsigned long addr,
unsigned long asid)
{
- __asm__ __volatile__ ("sfence.vma %0, %1"
- :
- : "r" (addr), "r" (asid)
- : "memory");
+ if (asid != FLUSH_TLB_NO_ASID)
+ __asm__ __volatile__ ("sfence.vma %0, %1"
+ :
+ : "r" (addr), "r" (asid)
+ : "memory");
+ else
+ local_flush_tlb_page(addr);
}
-static inline void local_flush_tlb_range(unsigned long start,
- unsigned long size, unsigned long stride)
+/*
+ * Flush entire TLB if number of entries to be flushed is greater
+ * than the threshold below.
+ */
+static unsigned long tlb_flush_all_threshold __read_mostly = 64;
+
+static void local_flush_tlb_range_threshold_asid(unsigned long start,
+ unsigned long size,
+ unsigned long stride,
+ unsigned long asid)
{
- if (size <= stride)
- local_flush_tlb_page(start);
- else
- local_flush_tlb_all();
+ unsigned long nr_ptes_in_range = DIV_ROUND_UP(size, stride);
+ int i;
+
+ if (nr_ptes_in_range > tlb_flush_all_threshold) {
+ local_flush_tlb_all_asid(asid);
+ return;
+ }
+
+ for (i = 0; i < nr_ptes_in_range; ++i) {
+ local_flush_tlb_page_asid(start, asid);
+ start += stride;
+ }
}
static inline void local_flush_tlb_range_asid(unsigned long start,
@@ -38,8 +60,10 @@ static inline void local_flush_tlb_range_asid(unsigned long start,
{
if (size <= stride)
local_flush_tlb_page_asid(start, asid);
- else
+ else if (size == FLUSH_TLB_MAX_SIZE)
local_flush_tlb_all_asid(asid);
+ else
+ local_flush_tlb_range_threshold_asid(start, size, stride, asid);
}
static void __ipi_flush_tlb_all(void *info)
@@ -52,7 +76,7 @@ void flush_tlb_all(void)
if (riscv_use_ipi_for_rfence())
on_each_cpu(__ipi_flush_tlb_all, NULL, 1);
else
- sbi_remote_sfence_vma(NULL, 0, -1);
+ sbi_remote_sfence_vma_asid(NULL, 0, FLUSH_TLB_MAX_SIZE, FLUSH_TLB_NO_ASID);
}
struct flush_tlb_range_data {
@@ -69,18 +93,12 @@ static void __ipi_flush_tlb_range_asid(void *info)
local_flush_tlb_range_asid(d->start, d->size, d->stride, d->asid);
}
-static void __ipi_flush_tlb_range(void *info)
-{
- struct flush_tlb_range_data *d = info;
-
- local_flush_tlb_range(d->start, d->size, d->stride);
-}
-
static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
unsigned long size, unsigned long stride)
{
struct flush_tlb_range_data ftd;
struct cpumask *cmask = mm_cpumask(mm);
+ unsigned long asid = FLUSH_TLB_NO_ASID;
unsigned int cpuid;
bool broadcast;
@@ -90,39 +108,24 @@ static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
cpuid = get_cpu();
/* check if the tlbflush needs to be sent to other CPUs */
broadcast = cpumask_any_but(cmask, cpuid) < nr_cpu_ids;
- if (static_branch_unlikely(&use_asid_allocator)) {
- unsigned long asid = atomic_long_read(&mm->context.id) & asid_mask;
-
- if (broadcast) {
- if (riscv_use_ipi_for_rfence()) {
- ftd.asid = asid;
- ftd.start = start;
- ftd.size = size;
- ftd.stride = stride;
- on_each_cpu_mask(cmask,
- __ipi_flush_tlb_range_asid,
- &ftd, 1);
- } else
- sbi_remote_sfence_vma_asid(cmask,
- start, size, asid);
- } else {
- local_flush_tlb_range_asid(start, size, stride, asid);
- }
+
+ if (static_branch_unlikely(&use_asid_allocator))
+ asid = atomic_long_read(&mm->context.id) & asid_mask;
+
+ if (broadcast) {
+ if (riscv_use_ipi_for_rfence()) {
+ ftd.asid = asid;
+ ftd.start = start;
+ ftd.size = size;
+ ftd.stride = stride;
+ on_each_cpu_mask(cmask,
+ __ipi_flush_tlb_range_asid,
+ &ftd, 1);
+ } else
+ sbi_remote_sfence_vma_asid(cmask,
+ start, size, asid);
} else {
- if (broadcast) {
- if (riscv_use_ipi_for_rfence()) {
- ftd.asid = 0;
- ftd.start = start;
- ftd.size = size;
- ftd.stride = stride;
- on_each_cpu_mask(cmask,
- __ipi_flush_tlb_range,
- &ftd, 1);
- } else
- sbi_remote_sfence_vma(cmask, start, size);
- } else {
- local_flush_tlb_range(start, size, stride);
- }
+ local_flush_tlb_range_asid(start, size, stride, asid);
}
put_cpu();
@@ -130,7 +133,7 @@ static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
void flush_tlb_mm(struct mm_struct *mm)
{
- __flush_tlb_range(mm, 0, -1, PAGE_SIZE);
+ __flush_tlb_range(mm, 0, FLUSH_TLB_MAX_SIZE, PAGE_SIZE);
}
void flush_tlb_mm_range(struct mm_struct *mm,
--
2.42.0
* [PATCH v2 04/11] riscv: Improve flush_tlb_kernel_range()
From: Samuel Holland @ 2023-10-28 23:12 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Andrew Jones, Lad Prabhakar, Samuel Holland
From: Alexandre Ghiti <alexghiti@rivosinc.com>
This function used to simply flush the whole TLB of all harts. Be more
subtle and try to flush only the requested range.
The problem is that we can only use PAGE_SIZE as the stride, since we don't
know the size of the underlying mapping, so this function improves things
only when the size of the region to flush is < threshold * PAGE_SIZE.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/Five SMARC
[Samuel: Use cpu_online_mask and merge if statements]
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
Changes in v2:
- Rebase on Alexandre's "riscv: tlb flush improvements" series v5
arch/riscv/include/asm/tlbflush.h | 11 +++++-----
arch/riscv/mm/tlbflush.c | 34 ++++++++++++++++++++++---------
2 files changed, 30 insertions(+), 15 deletions(-)
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 170a49c531c6..8f3418c5f172 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -40,6 +40,7 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr);
void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end);
+void flush_tlb_kernel_range(unsigned long start, unsigned long end);
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
@@ -56,15 +57,15 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
local_flush_tlb_all();
}
-#define flush_tlb_mm(mm) flush_tlb_all()
-#define flush_tlb_mm_range(mm, start, end, page_size) flush_tlb_all()
-#endif /* !CONFIG_SMP || !CONFIG_MMU */
-
/* Flush a range of kernel pages */
static inline void flush_tlb_kernel_range(unsigned long start,
unsigned long end)
{
- flush_tlb_all();
+ local_flush_tlb_all();
}
+#define flush_tlb_mm(mm) flush_tlb_all()
+#define flush_tlb_mm_range(mm, start, end, page_size) flush_tlb_all()
+#endif /* !CONFIG_SMP || !CONFIG_MMU */
+
#endif /* _ASM_RISCV_TLBFLUSH_H */
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index e46fefc70927..e6659d7368b3 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -97,20 +97,27 @@ static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
unsigned long size, unsigned long stride)
{
struct flush_tlb_range_data ftd;
- struct cpumask *cmask = mm_cpumask(mm);
+ const struct cpumask *cmask;
unsigned long asid = FLUSH_TLB_NO_ASID;
- unsigned int cpuid;
bool broadcast;
- if (cpumask_empty(cmask))
- return;
+ if (mm) {
+ unsigned int cpuid;
+
+ cmask = mm_cpumask(mm);
+ if (cpumask_empty(cmask))
+ return;
- cpuid = get_cpu();
- /* check if the tlbflush needs to be sent to other CPUs */
- broadcast = cpumask_any_but(cmask, cpuid) < nr_cpu_ids;
+ cpuid = get_cpu();
+ /* check if the tlbflush needs to be sent to other CPUs */
+ broadcast = cpumask_any_but(cmask, cpuid) < nr_cpu_ids;
- if (static_branch_unlikely(&use_asid_allocator))
- asid = atomic_long_read(&mm->context.id) & asid_mask;
+ if (static_branch_unlikely(&use_asid_allocator))
+ asid = atomic_long_read(&mm->context.id) & asid_mask;
+ } else {
+ cmask = cpu_online_mask;
+ broadcast = true;
+ }
if (broadcast) {
if (riscv_use_ipi_for_rfence()) {
@@ -128,7 +135,8 @@ static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
local_flush_tlb_range_asid(start, size, stride, asid);
}
- put_cpu();
+ if (mm)
+ put_cpu();
}
void flush_tlb_mm(struct mm_struct *mm)
@@ -179,6 +187,12 @@ void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
__flush_tlb_range(vma->vm_mm, start, end - start, stride_size);
}
+
+void flush_tlb_kernel_range(unsigned long start, unsigned long end)
+{
+ __flush_tlb_range(NULL, start, end - start, PAGE_SIZE);
+}
+
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end)
--
2.42.0
* [PATCH v2 05/11] riscv: mm: Combine the SMP and UP TLB flush code
From: Samuel Holland @ 2023-10-28 23:12 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Samuel Holland
In SMP configurations, all TLB flushing narrower than flush_tlb_all()
goes through __flush_tlb_range(). Do the same in UP configurations.
This allows UP configurations to take advantage of recent improvements
to the code in tlbflush.c, such as support for huge pages and flushing
multiple-page ranges.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
Changes in v2:
- Move the SMP/UP merge earlier in the series to avoid build issues
- Make a copy of __flush_tlb_range() instead of adding ifdefs inside
- local_flush_tlb_all() is the only function used on !MMU (smpboot.c)
arch/riscv/include/asm/tlbflush.h | 33 +++++++------------------------
arch/riscv/mm/Makefile | 5 +----
arch/riscv/mm/tlbflush.c | 13 ++++++++++++
3 files changed, 21 insertions(+), 30 deletions(-)
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 8f3418c5f172..317a1811aa51 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -27,13 +27,12 @@ static inline void local_flush_tlb_page(unsigned long addr)
{
ALT_FLUSH_TLB_PAGE(__asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory"));
}
-#else /* CONFIG_MMU */
-#define local_flush_tlb_all() do { } while (0)
-#define local_flush_tlb_page(addr) do { } while (0)
-#endif /* CONFIG_MMU */
-#if defined(CONFIG_SMP) && defined(CONFIG_MMU)
+#ifdef CONFIG_SMP
void flush_tlb_all(void);
+#else
+#define flush_tlb_all() local_flush_tlb_all()
+#endif
void flush_tlb_mm(struct mm_struct *mm);
void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned int page_size);
@@ -46,26 +45,8 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end);
void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end);
#endif
-#else /* CONFIG_SMP && CONFIG_MMU */
-
-#define flush_tlb_all() local_flush_tlb_all()
-#define flush_tlb_page(vma, addr) local_flush_tlb_page(addr)
-
-static inline void flush_tlb_range(struct vm_area_struct *vma,
- unsigned long start, unsigned long end)
-{
- local_flush_tlb_all();
-}
-
-/* Flush a range of kernel pages */
-static inline void flush_tlb_kernel_range(unsigned long start,
- unsigned long end)
-{
- local_flush_tlb_all();
-}
-
-#define flush_tlb_mm(mm) flush_tlb_all()
-#define flush_tlb_mm_range(mm, start, end, page_size) flush_tlb_all()
-#endif /* !CONFIG_SMP || !CONFIG_MMU */
+#else /* CONFIG_MMU */
+#define local_flush_tlb_all() do { } while (0)
+#endif /* CONFIG_MMU */
#endif /* _ASM_RISCV_TLBFLUSH_H */
diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
index 9c454f90fd3d..64f901674e35 100644
--- a/arch/riscv/mm/Makefile
+++ b/arch/riscv/mm/Makefile
@@ -13,15 +13,12 @@ endif
KCOV_INSTRUMENT_init.o := n
obj-y += init.o
-obj-$(CONFIG_MMU) += extable.o fault.o pageattr.o
+obj-$(CONFIG_MMU) += extable.o fault.o pageattr.o tlbflush.o
obj-y += cacheflush.o
obj-y += context.o
obj-y += pgtable.o
obj-y += pmem.o
-ifeq ($(CONFIG_MMU),y)
-obj-$(CONFIG_SMP) += tlbflush.o
-endif
obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
obj-$(CONFIG_PTDUMP_CORE) += ptdump.o
obj-$(CONFIG_KASAN) += kasan_init.o
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index e6659d7368b3..22d7ed5abf8e 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -66,6 +66,7 @@ static inline void local_flush_tlb_range_asid(unsigned long start,
local_flush_tlb_range_threshold_asid(start, size, stride, asid);
}
+#ifdef CONFIG_SMP
static void __ipi_flush_tlb_all(void *info)
{
local_flush_tlb_all();
@@ -138,6 +139,18 @@ static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
if (mm)
put_cpu();
}
+#else
+static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
+ unsigned long size, unsigned long stride)
+{
+ unsigned long asid = FLUSH_TLB_NO_ASID;
+
+ if (mm && static_branch_unlikely(&use_asid_allocator))
+ asid = atomic_long_read(&mm->context.id) & asid_mask;
+
+ local_flush_tlb_range_asid(start, size, stride, asid);
+}
+#endif
void flush_tlb_mm(struct mm_struct *mm)
{
--
2.42.0
* [PATCH v2 06/11] riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma
From: Samuel Holland @ 2023-10-28 23:12 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Samuel Holland
commit 3f1e782998cd ("riscv: add ASID-based tlbflushing methods") added
calls to the sfence.vma instruction with rs2 != x0. These single-ASID
instruction variants are also affected by SiFive errata CIP-1200.
Until now, the errata workaround was not needed for the single-ASID
sfence.vma variants, because they were only used when the ASID allocator
was enabled, and the affected SiFive platforms do not support multiple
ASIDs. However, we are going to start using those sfence.vma variants
regardless of ASID support, so now we need alternatives covering them.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
Changes in v2:
- Rebase on Alexandre's "riscv: tlb flush improvements" series v5
arch/riscv/include/asm/errata_list.h | 12 +++++++++++-
arch/riscv/include/asm/tlbflush.h | 19 ++++++++++++++++++-
arch/riscv/mm/tlbflush.c | 23 -----------------------
3 files changed, 29 insertions(+), 25 deletions(-)
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index b55b434f0059..d3f3c237adad 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -44,11 +44,21 @@ ALTERNATIVE(__stringify(RISCV_PTR do_page_fault), \
CONFIG_ERRATA_SIFIVE_CIP_453)
#else /* !__ASSEMBLY__ */
-#define ALT_FLUSH_TLB_PAGE(x) \
+#define ALT_SFENCE_VMA_ASID(asid) \
+asm(ALTERNATIVE("sfence.vma x0, %0", "sfence.vma", SIFIVE_VENDOR_ID, \
+ ERRATA_SIFIVE_CIP_1200, CONFIG_ERRATA_SIFIVE_CIP_1200) \
+ : : "r" (asid) : "memory")
+
+#define ALT_SFENCE_VMA_ADDR(addr) \
asm(ALTERNATIVE("sfence.vma %0", "sfence.vma", SIFIVE_VENDOR_ID, \
ERRATA_SIFIVE_CIP_1200, CONFIG_ERRATA_SIFIVE_CIP_1200) \
: : "r" (addr) : "memory")
+#define ALT_SFENCE_VMA_ADDR_ASID(addr, asid) \
+asm(ALTERNATIVE("sfence.vma %0, %1", "sfence.vma", SIFIVE_VENDOR_ID, \
+ ERRATA_SIFIVE_CIP_1200, CONFIG_ERRATA_SIFIVE_CIP_1200) \
+ : : "r" (addr), "r" (asid) : "memory")
+
/*
* _val is marked as "will be overwritten", so need to set it to 0
* in the default case.
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 317a1811aa51..e529a643be17 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -22,10 +22,27 @@ static inline void local_flush_tlb_all(void)
__asm__ __volatile__ ("sfence.vma" : : : "memory");
}
+static inline void local_flush_tlb_all_asid(unsigned long asid)
+{
+ if (asid != FLUSH_TLB_NO_ASID)
+ ALT_SFENCE_VMA_ASID(asid);
+ else
+ local_flush_tlb_all();
+}
+
/* Flush one page from local TLB */
static inline void local_flush_tlb_page(unsigned long addr)
{
- ALT_FLUSH_TLB_PAGE(__asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory"));
+ ALT_SFENCE_VMA_ADDR(addr);
+}
+
+static inline void local_flush_tlb_page_asid(unsigned long addr,
+ unsigned long asid)
+{
+ if (asid != FLUSH_TLB_NO_ASID)
+ ALT_SFENCE_VMA_ADDR_ASID(addr, asid);
+ else
+ local_flush_tlb_page(addr);
}
#ifdef CONFIG_SMP
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 22d7ed5abf8e..0feccb8932d2 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -7,29 +7,6 @@
#include <asm/sbi.h>
#include <asm/mmu_context.h>
-static inline void local_flush_tlb_all_asid(unsigned long asid)
-{
- if (asid != FLUSH_TLB_NO_ASID)
- __asm__ __volatile__ ("sfence.vma x0, %0"
- :
- : "r" (asid)
- : "memory");
- else
- local_flush_tlb_all();
-}
-
-static inline void local_flush_tlb_page_asid(unsigned long addr,
- unsigned long asid)
-{
- if (asid != FLUSH_TLB_NO_ASID)
- __asm__ __volatile__ ("sfence.vma %0, %1"
- :
- : "r" (addr), "r" (asid)
- : "memory");
- else
- local_flush_tlb_page(addr);
-}
-
/*
* Flush entire TLB if number of entries to be flushed is greater
* than the threshold below.
--
2.42.0
* [PATCH v2 07/11] riscv: mm: Introduce cntx2asid/cntx2version helper macros
2023-10-28 23:11 [PATCH v2 00/11] riscv: ASID-related and UP-related TLB flush enhancements Samuel Holland
` (5 preceding siblings ...)
2023-10-28 23:12 ` [PATCH v2 06/11] riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma Samuel Holland
@ 2023-10-28 23:12 ` Samuel Holland
2023-10-28 23:12 ` [PATCH v2 08/11] riscv: mm: Use a fixed layout for the MM context ID Samuel Holland
` (4 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Samuel Holland @ 2023-10-28 23:12 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Samuel Holland
When using the ASID allocator, the MM context ID contains two values:
the ASID in the lower bits, and the allocator version number in the
remaining bits. Use macros to make this separation more obvious.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
(no changes since v1)
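The split described above can be sketched in isolation as follows. This is only an illustrative model, not the kernel code: `demo_asid_mask`, `demo_cntx2asid()`, `demo_cntx2version()` and `demo_make_cntx()` are hypothetical names, and the 9-bit ASID width is an assumed example value standing in for whatever the hardware reports at boot.

```c
/*
 * Hypothetical stand-in for the kernel's runtime asid_mask.
 * Assumption: the hardware implements 9 ASID bits.
 */
static unsigned long demo_asid_mask = (1UL << 9) - 1;

/* Same shape as the cntx2asid()/cntx2version() macros in this patch. */
#define demo_cntx2asid(cntx)	((cntx) & demo_asid_mask)
#define demo_cntx2version(cntx)	((cntx) & ~demo_asid_mask)

/*
 * Build a context ID from a version (already positioned above the ASID
 * bits) and an ASID, mirroring "ver | cntx2asid(cntx)" in __new_context().
 */
static unsigned long demo_make_cntx(unsigned long version, unsigned long asid)
{
	return version | (asid & demo_asid_mask);
}
```

With these definitions, `demo_make_cntx(3UL << 9, 0x2a)` yields a context ID whose ASID part is 0x2a and whose version part is `3UL << 9`.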
arch/riscv/include/asm/mmu.h | 3 +++
arch/riscv/mm/context.c | 12 ++++++------
arch/riscv/mm/tlbflush.c | 4 ++--
3 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
index 355504b37f8e..a550fbf770be 100644
--- a/arch/riscv/include/asm/mmu.h
+++ b/arch/riscv/include/asm/mmu.h
@@ -26,6 +26,9 @@ typedef struct {
#endif
} mm_context_t;
+#define cntx2asid(cntx) ((cntx) & asid_mask)
+#define cntx2version(cntx) ((cntx) & ~asid_mask)
+
void __init create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t pa,
phys_addr_t sz, pgprot_t prot);
#endif /* __ASSEMBLY__ */
diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 217fd4de6134..43d005f63253 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -81,7 +81,7 @@ static void __flush_context(void)
if (cntx == 0)
cntx = per_cpu(reserved_context, i);
- __set_bit(cntx & asid_mask, context_asid_map);
+ __set_bit(cntx2asid(cntx), context_asid_map);
per_cpu(reserved_context, i) = cntx;
}
@@ -102,7 +102,7 @@ static unsigned long __new_context(struct mm_struct *mm)
lockdep_assert_held(&context_lock);
if (cntx != 0) {
- unsigned long newcntx = ver | (cntx & asid_mask);
+ unsigned long newcntx = ver | cntx2asid(cntx);
/*
* If our current CONTEXT was active during a rollover, we
@@ -115,7 +115,7 @@ static unsigned long __new_context(struct mm_struct *mm)
* We had a valid CONTEXT in a previous life, so try to
* re-use it if possible.
*/
- if (!__test_and_set_bit(cntx & asid_mask, context_asid_map))
+ if (!__test_and_set_bit(cntx2asid(cntx), context_asid_map))
return newcntx;
}
@@ -168,7 +168,7 @@ static void set_mm_asid(struct mm_struct *mm, unsigned int cpu)
*/
old_active_cntx = atomic_long_read(&per_cpu(active_context, cpu));
if (old_active_cntx &&
- ((cntx & ~asid_mask) == atomic_long_read(&current_version)) &&
+ (cntx2version(cntx) == atomic_long_read(&current_version)) &&
atomic_long_cmpxchg_relaxed(&per_cpu(active_context, cpu),
old_active_cntx, cntx))
goto switch_mm_fast;
@@ -177,7 +177,7 @@ static void set_mm_asid(struct mm_struct *mm, unsigned int cpu)
/* Check that our ASID belongs to the current_version. */
cntx = atomic_long_read(&mm->context.id);
- if ((cntx & ~asid_mask) != atomic_long_read(&current_version)) {
+ if (cntx2version(cntx) != atomic_long_read(&current_version)) {
cntx = __new_context(mm);
atomic_long_set(&mm->context.id, cntx);
}
@@ -191,7 +191,7 @@ static void set_mm_asid(struct mm_struct *mm, unsigned int cpu)
switch_mm_fast:
csr_write(CSR_SATP, virt_to_pfn(mm->pgd) |
- ((cntx & asid_mask) << SATP_ASID_SHIFT) |
+ (cntx2asid(cntx) << SATP_ASID_SHIFT) |
satp_mode);
if (need_flush_tlb)
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 0feccb8932d2..1cfac683bda4 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -91,7 +91,7 @@ static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
broadcast = cpumask_any_but(cmask, cpuid) < nr_cpu_ids;
if (static_branch_unlikely(&use_asid_allocator))
- asid = atomic_long_read(&mm->context.id) & asid_mask;
+ asid = cntx2asid(atomic_long_read(&mm->context.id));
} else {
cmask = cpu_online_mask;
broadcast = true;
@@ -123,7 +123,7 @@ static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
unsigned long asid = FLUSH_TLB_NO_ASID;
if (mm && static_branch_unlikely(&use_asid_allocator))
- asid = atomic_long_read(&mm->context.id) & asid_mask;
+ asid = cntx2asid(atomic_long_read(&mm->context.id));
local_flush_tlb_range_asid(start, size, stride, asid);
}
--
2.42.0
* [PATCH v2 08/11] riscv: mm: Use a fixed layout for the MM context ID
2023-10-28 23:11 [PATCH v2 00/11] riscv: ASID-related and UP-related TLB flush enhancements Samuel Holland
` (6 preceding siblings ...)
2023-10-28 23:12 ` [PATCH v2 07/11] riscv: mm: Introduce cntx2asid/cntx2version helper macros Samuel Holland
@ 2023-10-28 23:12 ` Samuel Holland
2023-10-28 23:12 ` [PATCH v2 09/11] riscv: mm: Make asid_bits a local variable Samuel Holland
` (3 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Samuel Holland @ 2023-10-28 23:12 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Samuel Holland
Currently, the size of the ASID field in the MM context ID dynamically
depends on the number of hardware-supported ASID bits. This requires
reading a global variable to extract either field from the context ID.
Instead, allocate the maximum possible number of bits to the ASID field,
so the layout of the context ID is known at compile-time.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
(no changes since v1)
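A minimal sketch of the fixed layout, under stated assumptions: the `DEMO_` and `demo_` names are hypothetical, and the 16-bit width is the value I believe `SATP_ASID_BITS` has on RV64 (it is smaller on RV32). The point is that the version always advances in steps of `1 << SATP_ASID_BITS`, regardless of how many ASID bits the hardware actually implements, so both fields can be extracted with compile-time constants.

```c
/* Assumed RV64 width; the kernel takes this from SATP_ASID_BITS. */
#define DEMO_SATP_ASID_BITS	16
#define DEMO_SATP_ASID_MASK	((1UL << DEMO_SATP_ASID_BITS) - 1)

/* Version increment, matching BIT(SATP_ASID_BITS) in this patch. */
#define DEMO_VERSION_STEP	(1UL << DEMO_SATP_ASID_BITS)

/* Extract the ASID field using only a compile-time constant. */
static unsigned long demo_asid(unsigned long cntx)
{
	return cntx & DEMO_SATP_ASID_MASK;
}

/* Extract the version field using only a compile-time constant. */
static unsigned long demo_version(unsigned long cntx)
{
	return cntx & ~DEMO_SATP_ASID_MASK;
}

/* Advance the allocator version; the ASID bits are left untouched. */
static unsigned long demo_next_version(unsigned long current_version)
{
	return current_version + DEMO_VERSION_STEP;
}
```

Because the step never overlaps the ASID field, a hardware implementation with fewer than 16 ASID bits simply leaves the upper bits of that field as zero.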
arch/riscv/include/asm/mmu.h | 4 ++--
arch/riscv/include/asm/tlbflush.h | 2 --
arch/riscv/mm/context.c | 6 ++----
3 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
index a550fbf770be..dc0273f7905f 100644
--- a/arch/riscv/include/asm/mmu.h
+++ b/arch/riscv/include/asm/mmu.h
@@ -26,8 +26,8 @@ typedef struct {
#endif
} mm_context_t;
-#define cntx2asid(cntx) ((cntx) & asid_mask)
-#define cntx2version(cntx) ((cntx) & ~asid_mask)
+#define cntx2asid(cntx) ((cntx) & SATP_ASID_MASK)
+#define cntx2version(cntx) ((cntx) & ~SATP_ASID_MASK)
void __init create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t pa,
phys_addr_t sz, pgprot_t prot);
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index e529a643be17..62d780037169 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -15,8 +15,6 @@
#define FLUSH_TLB_NO_ASID ((unsigned long)-1)
#ifdef CONFIG_MMU
-extern unsigned long asid_mask;
-
static inline void local_flush_tlb_all(void)
{
__asm__ __volatile__ ("sfence.vma" : : : "memory");
diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 43d005f63253..b5170ac1b742 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -22,7 +22,6 @@ DEFINE_STATIC_KEY_FALSE(use_asid_allocator);
static unsigned long asid_bits;
static unsigned long num_asids;
-unsigned long asid_mask;
static atomic_long_t current_version;
@@ -128,7 +127,7 @@ static unsigned long __new_context(struct mm_struct *mm)
goto set_asid;
/* We're out of ASIDs, so increment current_version */
- ver = atomic_long_add_return_relaxed(num_asids, &current_version);
+ ver = atomic_long_add_return_relaxed(BIT(SATP_ASID_BITS), &current_version);
/* Flush everything */
__flush_context();
@@ -247,7 +246,6 @@ static int __init asids_init(void)
/* Pre-compute ASID details */
if (asid_bits) {
num_asids = 1 << asid_bits;
- asid_mask = num_asids - 1;
}
/*
@@ -255,7 +253,7 @@ static int __init asids_init(void)
* at-least twice more than CPUs
*/
if (num_asids > (2 * num_possible_cpus())) {
- atomic_long_set(&current_version, num_asids);
+ atomic_long_set(&current_version, BIT(SATP_ASID_BITS));
context_asid_map = bitmap_zalloc(num_asids, GFP_KERNEL);
if (!context_asid_map)
--
2.42.0
* [PATCH v2 09/11] riscv: mm: Make asid_bits a local variable
2023-10-28 23:11 [PATCH v2 00/11] riscv: ASID-related and UP-related TLB flush enhancements Samuel Holland
` (7 preceding siblings ...)
2023-10-28 23:12 ` [PATCH v2 08/11] riscv: mm: Use a fixed layout for the MM context ID Samuel Holland
@ 2023-10-28 23:12 ` Samuel Holland
2023-10-28 23:12 ` [PATCH v2 10/11] riscv: mm: Preserve global TLB entries when switching contexts Samuel Holland
` (2 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Samuel Holland @ 2023-10-28 23:12 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Samuel Holland
This variable is only used inside asids_init().
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
(no changes since v1)
arch/riscv/mm/context.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index b5170ac1b742..43a8bc2d5af4 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -20,7 +20,6 @@
DEFINE_STATIC_KEY_FALSE(use_asid_allocator);
-static unsigned long asid_bits;
static unsigned long num_asids;
static atomic_long_t current_version;
@@ -226,7 +225,7 @@ static inline void set_mm(struct mm_struct *prev,
static int __init asids_init(void)
{
- unsigned long old;
+ unsigned long asid_bits, old;
/* Figure-out number of ASID bits in HW */
old = csr_read(CSR_SATP);
--
2.42.0
* [PATCH v2 10/11] riscv: mm: Preserve global TLB entries when switching contexts
2023-10-28 23:11 [PATCH v2 00/11] riscv: ASID-related and UP-related TLB flush enhancements Samuel Holland
` (8 preceding siblings ...)
2023-10-28 23:12 ` [PATCH v2 09/11] riscv: mm: Make asid_bits a local variable Samuel Holland
@ 2023-10-28 23:12 ` Samuel Holland
2023-10-28 23:12 ` [PATCH v2 11/11] riscv: mm: Always use ASID to flush MM contexts Samuel Holland
2023-11-07 6:50 ` [PATCH v2 00/11] riscv: ASID-related and UP-related TLB flush enhancements patchwork-bot+linux-riscv
11 siblings, 0 replies; 13+ messages in thread
From: Samuel Holland @ 2023-10-28 23:12 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Samuel Holland
If the CPU does not support multiple ASIDs, all MM contexts use ASID 0.
In this case, it is still beneficial to flush the TLB by ASID, as the
single-ASID variant of the sfence.vma instruction preserves TLB entries
for global (kernel) pages.
This optimization is recommended by the RISC-V privileged specification:
    If the implementation does not provide ASIDs, or software chooses
    to always use ASID 0, then after every satp write, software should
    execute SFENCE.VMA with rs1=x0. In the common case that no global
    translations have been modified, rs2 should be set to a register
    other than x0 but which contains the value zero, so that global
    translations are not flushed.
It is not possible to apply this optimization when using the ASID
allocator, because that code must flush the TLB for all ASIDs at once
when incrementing the version number.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
(no changes since v1)
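The difference between the two flush variants can be illustrated with a toy software model of a TLB. This is purely a sketch: real hardware behaviour is defined by the privileged specification, and `struct demo_tlb_entry` and the `demo_` functions are hypothetical names that do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy TLB entry; real TLBs also hold the virtual and physical page. */
struct demo_tlb_entry {
	bool valid;
	bool global;		/* PTE G bit: shared by all address spaces */
	unsigned long asid;
};

/* Models "sfence.vma x0, x0": every entry is dropped, global or not. */
static void demo_flush_all(struct demo_tlb_entry *tlb, size_t n)
{
	for (size_t i = 0; i < n; i++)
		tlb[i].valid = false;
}

/*
 * Models "sfence.vma x0, rs2" with rs2 holding an ASID: only non-global
 * entries tagged with that ASID are dropped.
 */
static void demo_flush_asid(struct demo_tlb_entry *tlb, size_t n,
			    unsigned long asid)
{
	for (size_t i = 0; i < n; i++)
		if (!tlb[i].global && tlb[i].asid == asid)
			tlb[i].valid = false;
}

/* Count the entries that survived a flush. */
static size_t demo_count_valid(const struct demo_tlb_entry *tlb, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (tlb[i].valid)
			count++;
	return count;
}
```

In this model, flushing ASID 0 on a TLB holding one global kernel entry and two user entries leaves the kernel entry in place, which is the saving this patch aims for on hardware without multiple ASIDs.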
arch/riscv/mm/context.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 43a8bc2d5af4..3ca9b653df7d 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -200,7 +200,7 @@ static void set_mm_noasid(struct mm_struct *mm)
{
/* Switch the page table and blindly nuke entire local TLB */
csr_write(CSR_SATP, virt_to_pfn(mm->pgd) | satp_mode);
- local_flush_tlb_all();
+ local_flush_tlb_all_asid(0);
}
static inline void set_mm(struct mm_struct *prev,
--
2.42.0
* [PATCH v2 11/11] riscv: mm: Always use ASID to flush MM contexts
2023-10-28 23:11 [PATCH v2 00/11] riscv: ASID-related and UP-related TLB flush enhancements Samuel Holland
` (9 preceding siblings ...)
2023-10-28 23:12 ` [PATCH v2 10/11] riscv: mm: Preserve global TLB entries when switching contexts Samuel Holland
@ 2023-10-28 23:12 ` Samuel Holland
2023-11-07 6:50 ` [PATCH v2 00/11] riscv: ASID-related and UP-related TLB flush enhancements patchwork-bot+linux-riscv
11 siblings, 0 replies; 13+ messages in thread
From: Samuel Holland @ 2023-10-28 23:12 UTC (permalink / raw)
To: Palmer Dabbelt, Alexandre Ghiti, linux-riscv
Cc: linux-kernel, linux-mm, Samuel Holland
Even if multiple ASIDs are not supported, using the single-ASID variant
of the sfence.vma instruction preserves TLB entries for global (kernel)
pages. So it is always most efficient to use the single-ASID code path.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
Changes in v2:
- Update both copies of __flush_tlb_range()
arch/riscv/include/asm/mmu_context.h | 2 --
arch/riscv/mm/context.c | 3 +--
arch/riscv/mm/tlbflush.c | 5 ++---
3 files changed, 3 insertions(+), 7 deletions(-)
diff --git a/arch/riscv/include/asm/mmu_context.h b/arch/riscv/include/asm/mmu_context.h
index 7030837adc1a..b0659413a080 100644
--- a/arch/riscv/include/asm/mmu_context.h
+++ b/arch/riscv/include/asm/mmu_context.h
@@ -33,8 +33,6 @@ static inline int init_new_context(struct task_struct *tsk,
return 0;
}
-DECLARE_STATIC_KEY_FALSE(use_asid_allocator);
-
#include <asm-generic/mmu_context.h>
#endif /* _ASM_RISCV_MMU_CONTEXT_H */
diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 3ca9b653df7d..20057085ab8a 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -18,8 +18,7 @@
#ifdef CONFIG_MMU
-DEFINE_STATIC_KEY_FALSE(use_asid_allocator);
-
+static DEFINE_STATIC_KEY_FALSE(use_asid_allocator);
static unsigned long num_asids;
static atomic_long_t current_version;
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 1cfac683bda4..9d06a3e9d330 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -90,8 +90,7 @@ static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
/* check if the tlbflush needs to be sent to other CPUs */
broadcast = cpumask_any_but(cmask, cpuid) < nr_cpu_ids;
- if (static_branch_unlikely(&use_asid_allocator))
- asid = cntx2asid(atomic_long_read(&mm->context.id));
+ asid = cntx2asid(atomic_long_read(&mm->context.id));
} else {
cmask = cpu_online_mask;
broadcast = true;
@@ -122,7 +121,7 @@ static void __flush_tlb_range(struct mm_struct *mm, unsigned long start,
{
unsigned long asid = FLUSH_TLB_NO_ASID;
- if (mm && static_branch_unlikely(&use_asid_allocator))
+ if (mm)
asid = cntx2asid(atomic_long_read(&mm->context.id));
local_flush_tlb_range_asid(start, size, stride, asid);
--
2.42.0
* Re: [PATCH v2 00/11] riscv: ASID-related and UP-related TLB flush enhancements
2023-10-28 23:11 [PATCH v2 00/11] riscv: ASID-related and UP-related TLB flush enhancements Samuel Holland
` (10 preceding siblings ...)
2023-10-28 23:12 ` [PATCH v2 11/11] riscv: mm: Always use ASID to flush MM contexts Samuel Holland
@ 2023-11-07 6:50 ` patchwork-bot+linux-riscv
11 siblings, 0 replies; 13+ messages in thread
From: patchwork-bot+linux-riscv @ 2023-11-07 6:50 UTC (permalink / raw)
To: Samuel Holland; +Cc: linux-riscv, palmer, alexghiti, linux-kernel, linux-mm
Hello:
This series was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <palmer@rivosinc.com>:
On Sat, 28 Oct 2023 16:11:58 -0700 you wrote:
> While reviewing Alexandre Ghiti's "riscv: tlb flush improvements"
> series[1], I noticed that most TLB flush functions end up as a call to
> local_flush_tlb_all() when SMP is disabled. This series resolves that.
> Along the way, I realized that we should be using single-ASID flushes
> wherever possible, so I implemented that as well.
>
> [1]: https://lore.kernel.org/linux-riscv/20231019140151.21629-1-alexghiti@rivosinc.com/
>
> [...]
Here is the summary with links:
- [v2,01/11] riscv: Improve tlb_flush()
https://git.kernel.org/riscv/c/c5e9b2c2ae82
- [v2,02/11] riscv: Improve flush_tlb_range() for hugetlb pages
https://git.kernel.org/riscv/c/c962a6e74639
- [v2,03/11] riscv: Make __flush_tlb_range() loop over pte instead of flushing the whole tlb
https://git.kernel.org/riscv/c/9d4e8d5fa7db
- [v2,04/11] riscv: Improve flush_tlb_kernel_range()
https://git.kernel.org/riscv/c/5e22bfd520ea
- [v2,05/11] riscv: mm: Combine the SMP and UP TLB flush code
(no matching commit)
- [v2,06/11] riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma
(no matching commit)
- [v2,07/11] riscv: mm: Introduce cntx2asid/cntx2version helper macros
(no matching commit)
- [v2,08/11] riscv: mm: Use a fixed layout for the MM context ID
(no matching commit)
- [v2,09/11] riscv: mm: Make asid_bits a local variable
(no matching commit)
- [v2,10/11] riscv: mm: Preserve global TLB entries when switching contexts
(no matching commit)
- [v2,11/11] riscv: mm: Always use ASID to flush MM contexts
(no matching commit)
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html