linux-mm.kvack.org archive mirror
* [PATCH v4 00/12] riscv: ASID-related and UP-related TLB flush enhancements
From: Samuel Holland @ 2024-01-02 22:00 UTC
  To: Palmer Dabbelt, linux-riscv
  Cc: linux-kernel, linux-mm, Alexandre Ghiti, Samuel Holland

While reviewing Alexandre Ghiti's "riscv: tlb flush improvements"
series[1], I noticed that most TLB flush functions end up as a call to
local_flush_tlb_all() when SMP is disabled. This series resolves that,
and also optimizes the scenario where SMP is enabled but only one CPU is
present or online. Along the way, I realized that we should be using
single-ASID flushes wherever possible, so I implemented that as well.
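
To illustrate the two ideas above, here is a minimal userspace model in C.
It is a sketch, not code from this series: pick_flush_scope(),
pick_flush_kind(), and the 64-bit CPU mask are hypothetical stand-ins for
the kernel's cpumask handling. The assumption is that a flush only needs to
be broadcast when the target mask contains some CPU other than the caller,
and that a flush can be limited to one ASID whenever one is known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not kernel code: bit N of a mask stands for CPU N. */
enum flush_scope {
	FLUSH_NONE,		/* no CPU is targeted, nothing to do */
	FLUSH_LOCAL,		/* only the calling CPU is targeted */
	FLUSH_BROADCAST,	/* at least one other CPU is targeted */
};

enum flush_kind {
	FLUSH_ALL_ASIDS,	/* flush entries for every address space */
	FLUSH_ONE_ASID,		/* flush entries for a single address space */
};

/* Decide whether a flush can stay on the calling CPU. */
static enum flush_scope pick_flush_scope(uint64_t target_mask,
					 unsigned int this_cpu)
{
	assert(this_cpu < 64);	/* the model only covers 64 CPUs */

	if (target_mask == 0)
		return FLUSH_NONE;
	/* Clear the caller's bit; anything left is a remote CPU. */
	if ((target_mask & ~(UINT64_C(1) << this_cpu)) == 0)
		return FLUSH_LOCAL;
	return FLUSH_BROADCAST;
}

/* Prefer the narrower single-ASID flush whenever an ASID is available. */
static enum flush_kind pick_flush_kind(bool have_asid)
{
	return have_asid ? FLUSH_ONE_ASID : FLUSH_ALL_ASIDS;
}
```

With only one CPU online, the target mask can contain at most the calling
CPU, so pick_flush_scope() never returns FLUSH_BROADCAST in this model.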

Here are some numbers from D1 (with SMP disabled) which show the
performance impact:

v6.7-rc8:
 System Benchmarks Partial Index              BASELINE       RESULT    INDEX
 Execl Throughput                                 43.0        207.4     48.2
 File Copy 1024 bufsize 2000 maxblocks          3960.0      52187.4    131.8
 File Copy 256 bufsize 500 maxblocks            1655.0      14872.6     89.9
 File Copy 4096 bufsize 8000 maxblocks          5800.0     146597.8    252.8
 Pipe Throughput                               12440.0     125318.4    100.7
 Pipe-based Context Switching                   4000.0      17804.2     44.5
 Process Creation                                126.0        479.2     38.0
 Shell Scripts (1 concurrent)                     42.4        564.5    133.1
 Shell Scripts (16 concurrent)                     ---         36.8      ---
 Shell Scripts (8 concurrent)                      6.0         74.3    123.9
 System Call Overhead                          15000.0     182050.7    121.4
                                                                    ========
 System Benchmarks Index Score (Partial Only)                           93.2

v6.7-rc8 plus this patch series:
 System Benchmarks Partial Index              BASELINE       RESULT    INDEX
 Execl Throughput                                 43.0        208.5     48.5
 File Copy 1024 bufsize 2000 maxblocks          3960.0      56847.0    143.6
 File Copy 256 bufsize 500 maxblocks            1655.0      17728.9    107.1
 File Copy 4096 bufsize 8000 maxblocks          5800.0     168016.2    289.7
 Pipe Throughput                               12440.0     133376.2    107.2
 Pipe-based Context Switching                   4000.0      19736.3     49.3
 Process Creation                                126.0        484.5     38.4
 Shell Scripts (1 concurrent)                     42.4        564.1    133.0
 Shell Scripts (16 concurrent)                     ---         36.6      ---
 Shell Scripts (8 concurrent)                      6.0         74.1    123.5
 System Call Overhead                          15000.0     210181.8    140.1
                                                                    ========
 System Benchmarks Index Score (Partial Only)                          100.1
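
The INDEX columns above can be reproduced from the raw numbers. The sketch
below assumes the usual UnixBench convention, which these tables appear to
follow: each index is RESULT / BASELINE * 10, and the overall score is the
geometric mean of the per-test indexes, skipping the one test that has no
baseline. The function and array names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_TESTS 10	/* "Shell Scripts (16 concurrent)" has no baseline */

/* BASELINE and RESULT columns, in table order, copied from the tables. */
static const double baseline[NUM_TESTS] = {
	43.0, 3960.0, 1655.0, 5800.0, 12440.0, 4000.0, 126.0, 42.4, 6.0, 15000.0,
};
static const double rc8[NUM_TESTS] = {
	207.4, 52187.4, 14872.6, 146597.8, 125318.4,
	17804.2, 479.2, 564.5, 74.3, 182050.7,
};
static const double patched[NUM_TESTS] = {
	208.5, 56847.0, 17728.9, 168016.2, 133376.2,
	19736.3, 484.5, 564.1, 74.1, 210181.8,
};

/* Index for one test: RESULT relative to BASELINE, scaled by 10. */
static double bench_index(double base, double result)
{
	assert(base > 0.0);
	return result / base * 10.0;
}

/* n-th root of x >= 0 by bisection, so the sketch builds without libm. */
static double nth_root(double x, unsigned int n)
{
	double lo = 0.0, hi = x > 1.0 ? x : 1.0;

	assert(n >= 1 && x >= 0.0);
	for (int iter = 0; iter < 200; iter++) {
		double mid = lo + (hi - lo) / 2.0;
		double p = 1.0;

		for (unsigned int i = 0; i < n; i++)
			p *= mid;
		if (p < x)
			lo = mid;
		else
			hi = mid;
	}
	return lo + (hi - lo) / 2.0;
}

/* Overall score: geometric mean of the per-test indexes. */
static double geomean_index(const double *base, const double *res, size_t n)
{
	double product = 1.0;

	assert(n >= 1);
	for (size_t i = 0; i < n; i++)
		product *= bench_index(base[i], res[i]);
	return nth_root(product, (unsigned int)n);
}
```

Under that assumption, geomean_index(baseline, rc8, NUM_TESTS) should land
close to the 93.2 reported for v6.7-rc8, and the same call with patched
close to the 100.1 reported with the series applied.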

[1]: https://lore.kernel.org/linux-riscv/20231030133027.19542-1-alexghiti@rivosinc.com/

Changes in v4:
 - Fix a possible race between flush_icache_*() and SMP bringup
 - Refactor riscv_use_ipi_for_rfence() to make later changes cleaner
 - Optimize kernel TLB flushes with only one CPU online
 - Optimize global cache/TLB flushes with only one CPU online
 - Merge the two copies of __flush_tlb_range() and rely on the compiler
   to optimize out the broadcast path (both clang and gcc do this)
 - Merge the two copies of flush_tlb_all() and rely on constant folding
 - Only set tlb_flush_all_threshold when CONFIG_MMU=y

Changes in v3:
 - Fixed a performance regression caused by executing sfence.vma in a
   loop on implementations affected by SiFive CIP-1200
 - Rebased on v6.7-rc1

Changes in v2:
 - Move the SMP/UP merge earlier in the series to avoid build issues
 - Make a copy of __flush_tlb_range() instead of adding ifdefs inside
 - local_flush_tlb_all() is the only function used on !MMU (smpboot.c)

Samuel Holland (12):
  riscv: Flush the instruction cache during SMP bringup
  riscv: Use IPIs for remote cache/TLB flushes by default
  riscv: mm: Broadcast kernel TLB flushes only when needed
  riscv: Only send remote fences when some other CPU is online
  riscv: mm: Combine the SMP and UP TLB flush code
  riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma
  riscv: Avoid TLB flush loops when affected by SiFive CIP-1200
  riscv: mm: Introduce cntx2asid/cntx2version helper macros
  riscv: mm: Use a fixed layout for the MM context ID
  riscv: mm: Make asid_bits a local variable
  riscv: mm: Preserve global TLB entries when switching contexts
  riscv: mm: Always use an ASID to flush mm contexts

 arch/riscv/errata/sifive/errata.c    |  5 ++
 arch/riscv/include/asm/errata_list.h | 12 ++++-
 arch/riscv/include/asm/mmu.h         |  3 ++
 arch/riscv/include/asm/mmu_context.h |  2 -
 arch/riscv/include/asm/sbi.h         |  4 ++
 arch/riscv/include/asm/smp.h         | 15 +-----
 arch/riscv/include/asm/tlbflush.h    | 50 ++++++++----------
 arch/riscv/kernel/sbi-ipi.c          | 11 +++-
 arch/riscv/kernel/smp.c              | 11 +---
 arch/riscv/kernel/smpboot.c          |  7 +--
 arch/riscv/mm/Makefile               |  5 +-
 arch/riscv/mm/cacheflush.c           |  7 +--
 arch/riscv/mm/context.c              | 26 ++++------
 arch/riscv/mm/tlbflush.c             | 76 +++++++++-------------------
 drivers/clocksource/timer-clint.c    |  2 +-
 15 files changed, 102 insertions(+), 134 deletions(-)

-- 
2.42.0



Thread overview: 31+ messages
2024-01-02 22:00 [PATCH v4 00/12] riscv: ASID-related and UP-related TLB flush enhancements Samuel Holland
2024-01-02 22:00 ` [PATCH v4 01/12] riscv: Flush the instruction cache during SMP bringup Samuel Holland
2024-01-04 11:58   ` Alexandre Ghiti
2024-01-02 22:00 ` [PATCH v4 02/12] riscv: Use IPIs for remote cache/TLB flushes by default Samuel Holland
2024-01-04 12:09   ` Alexandre Ghiti
2024-01-02 22:00 ` [PATCH v4 03/12] riscv: mm: Broadcast kernel TLB flushes only when needed Samuel Holland
2024-01-04 12:15   ` Alexandre Ghiti
2024-01-02 22:00 ` [PATCH v4 04/12] riscv: Only send remote fences when some other CPU is online Samuel Holland
2024-01-03 14:57   ` Jisheng Zhang
2024-01-03 15:04     ` Jisheng Zhang
2024-01-04 12:33   ` Alexandre Ghiti
2024-01-04 15:33     ` Samuel Holland
2024-01-02 22:00 ` [PATCH v4 05/12] riscv: mm: Combine the SMP and UP TLB flush code Samuel Holland
2024-01-04 12:36   ` Alexandre Ghiti
2024-01-02 22:00 ` [PATCH v4 06/12] riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma Samuel Holland
2024-01-02 22:00 ` [PATCH v4 07/12] riscv: Avoid TLB flush loops when affected by SiFive CIP-1200 Samuel Holland
2024-01-02 22:00 ` [PATCH v4 08/12] riscv: mm: Introduce cntx2asid/cntx2version helper macros Samuel Holland
2024-01-04 12:39   ` Alexandre Ghiti
2024-01-04 15:42     ` Samuel Holland
2024-01-02 22:00 ` [PATCH v4 09/12] riscv: mm: Use a fixed layout for the MM context ID Samuel Holland
2024-01-04 12:42   ` Alexandre Ghiti
2024-01-02 22:00 ` [PATCH v4 10/12] riscv: mm: Make asid_bits a local variable Samuel Holland
2024-01-03 15:00   ` Jisheng Zhang
2024-01-04 15:49     ` Samuel Holland
2024-01-04 12:47   ` Alexandre Ghiti
2024-01-02 22:00 ` [PATCH v4 11/12] riscv: mm: Preserve global TLB entries when switching contexts Samuel Holland
2024-01-04 12:55   ` Alexandre Ghiti
2024-01-02 22:00 ` [PATCH v4 12/12] riscv: mm: Always use an ASID to flush mm contexts Samuel Holland
2024-01-03 15:02   ` Jisheng Zhang
2024-01-04 15:50     ` Samuel Holland
2024-01-04 13:01   ` Alexandre Ghiti
