linux-mm.kvack.org archive mirror
* [RFC PATCH bpf-next 0/3] BPF-based NUMA balancing
@ 2026-01-13 12:12 Yafang Shao
  2026-01-13 12:12 ` [RFC PATCH bpf-next 1/3] sched: add helpers for numa balancing Yafang Shao
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Yafang Shao @ 2026-01-13 12:12 UTC
  To: roman.gushchin, inwardvessel, shakeel.butt, akpm, ast, daniel,
	andrii, mkoutny, yu.c.chen, zhao1.liu
  Cc: bpf, linux-mm, Yafang Shao

In our large fleet of Kubernetes-managed servers, NUMA balancing has
historically been disabled globally on each server. With increasing deployment
of AMD EPYC servers in our fleet, cross-NUMA access has become a critical
performance issue, prompting us to consider enabling NUMA balancing to
address it.

However, enabling NUMA balancing globally is not acceptable as it would
increase overall system overhead and potentially introduce latency spikes
for latency-sensitive workloads. Instead, we aim to enable it selectively
for workloads that can genuinely benefit from it. Even for such workloads,
we require fine-grained per-workload tuning capabilities.

To maximize cross-NUMA page migration while minimizing overhead, we
propose tuning NUMA balancing per workload using BPF.

This patchset introduces a new BPF hook ->numab_hook() as a memory-cgroup-based
struct_ops. This enables NUMA balancing for specific workloads
while keeping global NUMA balancing disabled. It also allows tuning
NUMA balancing parameters per workload. Patch #3 demonstrates how to
adjust the hot threshold per workload using BPF.

Since bpf_struct_ops and cgroups integration [0] is still under
development by Roman, this patchset temporarily embeds the cgroup ID
into the struct-ops for review purposes. We can migrate to the new
approach once it's available.
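
As a rough illustration of the dispatch described above, here is a
userspace model in plain C. It is only a sketch under stated assumptions,
not the code from this patchset: apart from the hook name numab_hook, every
identifier (numab_decision, numab_ops, numab_register, numab_decide), the
hook signature, and the 1000 ms default are hypothetical. It shows global
balancing staying off while a hook registered for one cgroup ID opts that
workload in and lowers its hot threshold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: only the name numab_hook comes from the cover letter. */
struct numab_decision {
	bool enable;                   /* run NUMA balancing for this workload? */
	unsigned int hot_threshold_ms; /* per-workload hot threshold */
};

struct numab_ops {
	uint64_t cgroup_id;            /* cgroup ID embedded in the ops (assumed layout) */
	void (*numab_hook)(struct numab_decision *d); /* assumed signature */
};

#define NUMAB_MAX_OPS 8
#define NUMAB_DEFAULT_HOT_THRESHOLD_MS 1000 /* illustrative default */

static const struct numab_ops *numab_registry[NUMAB_MAX_OPS];
static bool numab_global_enabled = false;   /* global balancing stays off */

/* Store the ops in the first free slot; return -1 when the table is full. */
static int numab_register(const struct numab_ops *ops)
{
	assert(ops != NULL && ops->numab_hook != NULL);
	for (size_t i = 0; i < NUMAB_MAX_OPS; i++) {
		if (numab_registry[i] == NULL) {
			numab_registry[i] = ops;
			return 0;
		}
	}
	return -1;
}

/* Start from the global defaults, then let a matching hook override them. */
static struct numab_decision numab_decide(uint64_t cgroup_id)
{
	struct numab_decision d = {
		.enable = numab_global_enabled,
		.hot_threshold_ms = NUMAB_DEFAULT_HOT_THRESHOLD_MS,
	};

	for (size_t i = 0; i < NUMAB_MAX_OPS; i++) {
		const struct numab_ops *ops = numab_registry[i];

		if (ops != NULL && ops->cgroup_id == cgroup_id) {
			ops->numab_hook(&d);
			break;
		}
	}
	return d;
}

/* Example policy: opt in and use a lower hot threshold for this workload. */
static void db_numab_hook(struct numab_decision *d)
{
	d->enable = true;
	d->hot_threshold_ms = 200;
}

static const struct numab_ops db_ops = {
	.cgroup_id = 42, /* made-up cgroup ID */
	.numab_hook = db_numab_hook,
};
```

After numab_register(&db_ops), numab_decide(42) returns the tuned settings,
while any other cgroup ID falls back to the global defaults (balancing off).
In the real patchset the policy would live in a BPF program attached through
struct_ops rather than in a C function pointer table.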

This is still an RFC with limited testing. Any feedback is welcome.

[0]. https://lore.kernel.org/bpf/CAADnVQJGiH_yF=AoFSRy4zh20uneJgBfqGshubLM6aVq069Fhg@mail.gmail.com/

Yafang Shao (3):
  sched: add helpers for numa balancing
  mm: add support for bpf based numa balancing
  mm: set numa balancing hot threshold with bpf

 MAINTAINERS                          |   1 +
 include/linux/memcontrol.h           |   6 +
 include/linux/sched/numa_balancing.h |  44 +++++
 kernel/sched/fair.c                  |  17 +-
 kernel/sched/sched.h                 |   2 -
 mm/Makefile                          |   5 +
 mm/bpf_numa_balancing.c              | 252 +++++++++++++++++++++++++++
 mm/memory-tiers.c                    |   3 +-
 mm/mempolicy.c                       |   3 +-
 mm/migrate.c                         |   7 +-
 mm/vmscan.c                          |   7 +-
 11 files changed, 326 insertions(+), 21 deletions(-)
 create mode 100644 mm/bpf_numa_balancing.c

-- 
2.43.5




Thread overview: 8+ messages
2026-01-13 12:12 [RFC PATCH bpf-next 0/3] BPF-based NUMA balancing Yafang Shao
2026-01-13 12:12 ` [RFC PATCH bpf-next 1/3] sched: add helpers for numa balancing Yafang Shao
2026-01-13 12:42   ` bot+bpf-ci
2026-01-13 12:48     ` Yafang Shao
2026-01-13 12:12 ` [RFC PATCH bpf-next 2/3] mm: add support for bpf based " Yafang Shao
2026-01-13 12:29   ` bot+bpf-ci
2026-01-13 12:46     ` Yafang Shao
2026-01-13 12:12 ` [RFC PATCH bpf-next 3/3] mm: set numa balancing hot threshold with bpf Yafang Shao
