On Tue, Jan 13, 2026 at 08:12:37PM +0800, Yafang Shao <laoar.shao@gmail.com> wrote:
> bpf_numab_ops enables NUMA balancing for tasks within a specific memcg,
> even when global NUMA balancing is disabled. This allows selective NUMA
> optimization for workloads that benefit from it, while avoiding potential
> latency spikes for other workloads.
> 
> The policy must be attached to a leaf memory cgroup.

Why this restriction?
Do you envision how these extensions would apply hierarchically?
Regardless of that, being a "leaf memcg" is not a stable condition
(it changes with mkdirs and writes to `cgroup.subtree_control`), so the
implementation should also be prepared for that status to change.
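
To illustrate what I mean, here is a toy user-space sketch (not kernel
code; struct node, node_add_child(), node_is_leaf() and effective_policy()
are made-up names for illustration only). It shows a node losing its leaf
status after a policy was attached, and one possible hierarchical rule
where the nearest ancestor with a policy wins:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a cgroup tree; every name here is hypothetical. */
struct node {
	struct node *parent;
	int nr_children;
	const char *policy;	/* NULL when nothing is attached */
};

/* Models mkdir: as a side effect the parent stops being a leaf. */
static void node_add_child(struct node *parent, struct node *child)
{
	assert(parent && child);
	child->parent = parent;
	child->nr_children = 0;
	child->policy = NULL;
	parent->nr_children++;
}

static bool node_is_leaf(const struct node *n)
{
	return n->nr_children == 0;
}

/* One possible hierarchical rule: nearest ancestor (or self) wins. */
static const char *effective_policy(const struct node *n)
{
	for (; n; n = n->parent)
		if (n->policy)
			return n->policy;
	return NULL;
}
```

With a rule like this, a policy attached to a node that later gains
children simply keeps applying to the new descendants, so no leaf check
is needed at attach time.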

Also, I think (please correct me) that NUMA balancing doesn't need the
memory controller (in contrast with OOM), so the attachment shouldn't be
through struct mem_cgroup but plain struct cgroup::bpf. If you could
consider this or add some details about this decision, it'd be great.


Thanks,
Michal

> To reduce lookup
> overhead, we can cache memcg::bpf_numab in the mm_struct of tasks within
> the memcg when it becomes a performance bottleneck.
> 
> The cgroup ID is embedded in bpf_numab_ops as a compile-time constant,
> which restricts each instance to a single cgroup and prevents attachment
> to multiple cgroups. Roman is working on a solution to remove this
> limitation, after which we can migrate to the new approach.
> 
> Currently only the normal mode is supported.
> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>