linux-mm.kvack.org archive mirror
* [PATCH v4 0/3] targeted TLB sync IPIs for lockless page table walkers
@ 2026-02-02  7:45 Lance Yang
  2026-02-02  7:45 ` [PATCH v4 1/3] mm: use targeted IPIs for TLB sync with " Lance Yang
                   ` (3 more replies)
  0 siblings, 4 replies; 35+ messages in thread
From: Lance Yang @ 2026-02-02  7:45 UTC (permalink / raw)
  To: akpm
  Cc: david, dave.hansen, dave.hansen, ypodemsk, hughd, will,
	aneesh.kumar, npiggin, peterz, tglx, mingo, bp, x86, hpa, arnd,
	lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett, npache,
	ryan.roberts, dev.jain, baohua, shy828301, riel, jannh, jgross,
	seanjc, pbonzini, boris.ostrovsky, virtualization, kvm,
	linux-arch, linux-mm, linux-kernel, ioworker0

When freeing or unsharing page tables we send an IPI to synchronize with
concurrent lockless page table walkers (e.g. GUP-fast). Today we broadcast
that IPI to all CPUs, which is costly on large machines and hurts RT
workloads[1].

This series makes those IPIs targeted. We track which CPUs are currently
doing a lockless page table walk for a given mm via a per-CPU variable
(active_lockless_pt_walk_mm). When we need to sync, we IPI only those
CPUs. GUP-fast and perf_get_page_size() set/clear the tracker around their
walk; tlb_remove_table_sync_mm() consults it and replaces the previous
broadcast in the free/unshare paths.

On x86, when the TLB flush path already sends IPIs (native without INVLPGB,
or KVM), the extra sync IPI is redundant. We add a property on pv_mmu_ops
so each backend can declare whether its flush_tlb_multi sends real IPIs; if
so, tlb_remove_table_sync_mm() is a no-op. We also have tlb_flush() pass
both freed_tables and unshared_tables so lazy-TLB CPUs get IPIs during
hugetlb unshare.
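
Again as a sketch only (the names tlb_flush_set_implies_ipi() and
tlb_flush_already_sends_ipi() are made up here, and the actual patches hang
this off pv_mmu_ops rather than a standalone boolean), the x86 opt-out can
be pictured as a flag latched at init and checked before sending any sync
IPIs:

#include <linux/cache.h>
#include <linux/init.h>
#include <linux/types.h>

/* Latched once during boot from the selected pv_mmu_ops backend. */
static bool tlb_flush_implies_ipi __ro_after_init;

void __init tlb_flush_set_implies_ipi(bool implies_ipi)
{
	/* Native without INVLPGB and KVM would pass true here. */
	tlb_flush_implies_ipi = implies_ipi;
}

/* Checked by tlb_remove_table_sync_mm() before sending sync IPIs. */
bool tlb_flush_already_sends_ipi(void)
{
	return tlb_flush_implies_ipi;
}

When the flag is set, tlb_remove_table_sync_mm() can return early: any CPU
that could have been in a lockless walk was already interrupted by the
flush IPIs themselves.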

David Hildenbrand did the initial implementation. I built on his work and
relied on off-list discussions to push it further - thanks a lot David!

[1] https://lore.kernel.org/linux-mm/1b27a3fa-359a-43d0-bdeb-c31341749367@kernel.org/

v3 -> v4:
- Rework based on David's two-step direction and per-CPU idea:
  1) Targeted IPIs: set/clear a per-CPU variable when entering/leaving a
     lockless page table walk; tlb_remove_table_sync_mm() IPIs only those
     CPUs.
  2) On x86, pv_mmu_ops property set at init to skip the extra sync when
     flush_tlb_multi() already sends IPIs.
  https://lore.kernel.org/linux-mm/bbfdf226-4660-4949-b17b-0d209ee4ef8c@kernel.org/
- https://lore.kernel.org/linux-mm/20260106120303.38124-1-lance.yang@linux.dev/

v2 -> v3:
- Complete rewrite: use dynamic IPI tracking instead of static checks
  (per Dave Hansen, thanks!)
- Track IPIs via mmu_gather: native_flush_tlb_multi() sets flag when
  actually sending IPIs
- Motivation for skipping redundant IPIs explained by David:
  https://lore.kernel.org/linux-mm/1b27a3fa-359a-43d0-bdeb-c31341749367@kernel.org/
- https://lore.kernel.org/linux-mm/20251229145245.85452-1-lance.yang@linux.dev/

v1 -> v2:
- Fix cover letter encoding to resolve send-email issues. Apologies for
  any email flood caused by the failed send attempts :(

RFC -> v1:
- Use a callback function in pv_mmu_ops instead of comparing function
  pointers (per David)
- Embed the check directly in tlb_remove_table_sync_one() instead of
  requiring every caller to check explicitly (per David)
- Move tlb_table_flush_implies_ipi_broadcast() outside of
  CONFIG_MMU_GATHER_RCU_TABLE_FREE to fix build error on architectures
  that don't enable this config.
  https://lore.kernel.org/oe-kbuild-all/202512142156.cShiu6PU-lkp@intel.com/
- https://lore.kernel.org/linux-mm/20251213080038.10917-1-lance.yang@linux.dev/

Lance Yang (3):
  mm: use targeted IPIs for TLB sync with lockless page table walkers
  mm: switch callers to tlb_remove_table_sync_mm()
  x86/tlb: add architecture-specific TLB IPI optimization support

 arch/x86/hyperv/mmu.c                 |  5 ++
 arch/x86/include/asm/paravirt.h       |  5 ++
 arch/x86/include/asm/paravirt_types.h |  6 +++
 arch/x86/include/asm/tlb.h            | 20 +++++++-
 arch/x86/kernel/kvm.c                 |  6 +++
 arch/x86/kernel/paravirt.c            | 18 +++++++
 arch/x86/kernel/smpboot.c             |  1 +
 arch/x86/xen/mmu_pv.c                 |  2 +
 include/asm-generic/tlb.h             | 28 +++++++++--
 include/linux/mm.h                    | 34 +++++++++++++
 kernel/events/core.c                  |  2 +
 mm/gup.c                              |  2 +
 mm/khugepaged.c                       |  2 +-
 mm/mmu_gather.c                       | 69 ++++++++++++++++++++++++---
 14 files changed, 187 insertions(+), 13 deletions(-)

-- 
2.49.0




Thread overview: 35+ messages
2026-02-02  7:45 [PATCH v4 0/3] targeted TLB sync IPIs for lockless page table walkers Lance Yang
2026-02-02  7:45 ` [PATCH v4 1/3] mm: use targeted IPIs for TLB sync with " Lance Yang
2026-02-02  9:42   ` Peter Zijlstra
2026-02-02 12:14     ` Lance Yang
2026-02-02 12:51       ` Peter Zijlstra
2026-02-02 13:23         ` Lance Yang
2026-02-02 13:42           ` Peter Zijlstra
2026-02-02 14:28             ` Lance Yang
2026-02-02 16:20       ` Dave Hansen
2026-02-02  7:45 ` [PATCH v4 2/3] mm: switch callers to tlb_remove_table_sync_mm() Lance Yang
2026-02-02  7:45 ` [PATCH v4 3/3] x86/tlb: add architecture-specific TLB IPI optimization support Lance Yang
2026-02-02  9:54 ` [PATCH v4 0/3] targeted TLB sync IPIs for lockless page table walkers Peter Zijlstra
2026-02-02 11:00   ` [PATCH v4 0/3] targeted TLB sync IPIs for lockless page table Lance Yang
2026-02-02 12:50     ` Peter Zijlstra
2026-02-02 12:58       ` Lance Yang
2026-02-02 13:07         ` Lance Yang
2026-02-02 13:37           ` Peter Zijlstra
2026-02-02 14:37             ` Lance Yang
2026-02-02 15:09               ` Peter Zijlstra
2026-02-02 15:52                 ` Lance Yang
2026-02-05 13:25                   ` David Hildenbrand (Arm)
2026-02-05 15:01                     ` Lance Yang
2026-02-05 15:05                       ` David Hildenbrand (Arm)
2026-02-05 15:28                         ` Lance Yang
2026-02-05 15:09                       ` Dave Hansen
2026-02-05 15:31                         ` Lance Yang
2026-02-05 15:41                           ` Dave Hansen
2026-02-05 16:30                             ` Lance Yang
2026-02-05 16:46                               ` David Hildenbrand (Arm)
2026-02-05 16:48                               ` Matthew Wilcox
2026-02-05 17:06                                 ` David Hildenbrand (Arm)
2026-02-05 18:36                                   ` Dave Hansen
2026-02-05 22:49                                     ` David Hildenbrand (Arm)
2026-02-05 21:30                                   ` David Hildenbrand (Arm)
2026-02-05 17:00                               ` Dave Hansen
