```
commit bce57b9b96e05731a14c7e10a364fdd9b733cda2
Author: Roman Gushchin

mm: introduce bpf_out_of_memory() BPF kfunc

This patch introduces bpf_out_of_memory() BPF kfunc to allow BPF programs
to declare OOM events and trigger the kernel OOM handling mechanism. The
function accepts a nullable memcg pointer for system-wide or memcg-scoped
OOMs.

> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index 721087952..3cbdcd013 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -21,6 +21,11 @@ enum oom_constraint {
>  	CONSTRAINT_MEMCG,
>  };
>
> +enum bpf_oom_flags {
> +	BPF_OOM_FLAGS_WAIT_ON_OOM_LOCK = 1 << 0,
> +	BPF_OOM_FLAGS_LAST = 1 << 1,
> +};
> +
>  /*
>   * Details of the page allocation that triggered the oom killer that are used to
>   * determine what should be killed.
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 3c86cd755..d7fca4bf5 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c

[ ... ]

> +/**
> + * bpf_out_of_memory - declare Out Of Memory state and invoke OOM killer
> + * @memcg__nullable: memcg or NULL for system-wide OOMs
> + * @order: order of page which wasn't allocated
> + * @flags: flags
> + * @constraint_text__nullable: custom constraint description for the OOM report
> + *
> + * Declares the Out Of Memory state and invokes the OOM killer.
> + *
> + * OOM handlers are synchronized using the oom_lock mutex. If wait_on_oom_lock
> + * is true, the function will wait on it. Otherwise it bails out with -EBUSY
> + * if oom_lock is contended.
> + *
> + * Generally it's advised to pass wait_on_oom_lock=false for global OOMs
> + * and wait_on_oom_lock=true for memcg-scoped OOMs.
> + *
> + * Returns 1 if the forward progress was achieved and some memory was freed.
> + * Returns a negative value if an error occurred.
> + */
> +__bpf_kfunc int bpf_out_of_memory(struct mem_cgroup *memcg__nullable,
> +				  int order, u64 flags)
> +{
> +	struct oom_control oc = {
> +		.memcg = memcg__nullable,
> +		.order = order,
> +	};

Can bpf_out_of_memory() work correctly for system-wide OOMs when gfp_mask
is never set? The designated initializer above leaves gfp_mask
zero-initialized.

Looking at out_of_memory():

    if (!(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
        return true;

With gfp_mask=0 and memcg=NULL (system-wide OOM), this evaluates to:
!(0 & __GFP_FS) && !false = true, causing an early return without
invoking the OOM killer.

The comment in out_of_memory() explains this check is about allocation
context: "The OOM killer does not compensate for IO-less reclaim."

All other callers of out_of_memory() initialize gfp_mask:
- mm/page_alloc.c: .gfp_mask = gfp_mask
- mm/memcontrol.c: .gfp_mask = gfp_mask

For memcg OOMs, the code works because is_memcg_oom(oc) is true,
bypassing the check. But for system-wide OOMs (NULL memcg), won't this
cause the function to return immediately without killing any process?

> +	int ret;
> +
> +	if (flags & ~(BPF_OOM_FLAGS_LAST - 1))
> +		return -EINVAL;
> +
> +	if (oc.order < 0 || oc.order > MAX_PAGE_ORDER)
> +		return -EINVAL;
> +
> +	if (flags & BPF_OOM_FLAGS_WAIT_ON_OOM_LOCK) {
> +		ret = mutex_lock_killable(&oom_lock);
> +		if (ret)
> +			return ret;
> +	} else if (!mutex_trylock(&oom_lock))
> +		return -EBUSY;
> +
> +	ret = out_of_memory(&oc);
> +
> +	mutex_unlock(&oom_lock);
> +	return ret;
> +}

[ ... ]
```

---
AI reviewed your patch. Please fix the bug or reply by email explaining why it is not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: `mm: introduce bpf_out_of_memory() BPF kfunc`
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/18859027430
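
The early-return concern raised in the review can be checked with a small user-space model. This is a minimal sketch, not kernel code: every `model_*` name, the reduced struct, and the `MODEL_GFP_FS` bit value are assumptions made for illustration, and only the single condition quoted from out_of_memory() is reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for __GFP_FS; the real bit value is kernel-specific. */
#define MODEL_GFP_FS (1u << 0)

/* Reduced stand-in for struct oom_control: only the fields the check reads. */
struct model_oom_control {
	const void *memcg;	/* NULL models a system-wide OOM */
	unsigned int gfp_mask;	/* zero when the caller never sets it */
	int order;
};

/* Models is_memcg_oom(): true when the OOM is scoped to a memcg. */
static bool model_is_memcg_oom(const struct model_oom_control *oc)
{
	return oc->memcg != NULL;
}

/*
 * Models only the early-return condition quoted in the review.
 * Returns true when the modeled out_of_memory() would return before
 * selecting any victim.
 */
static bool model_returns_early(const struct model_oom_control *oc)
{
	return !(oc->gfp_mask & MODEL_GFP_FS) && !model_is_memcg_oom(oc);
}

/* Walks through the three cases discussed in the review. */
void model_self_check(void)
{
	int dummy_memcg = 0;

	/* System-wide OOM, gfp_mask left at zero: early return. */
	struct model_oom_control global_unset = { .memcg = NULL, .order = 0 };
	assert(model_returns_early(&global_unset));

	/* Memcg-scoped OOM, gfp_mask left at zero: check is bypassed. */
	struct model_oom_control scoped = { .memcg = &dummy_memcg, .order = 0 };
	assert(!model_returns_early(&scoped));

	/* System-wide OOM with the FS bit set: check is bypassed. */
	struct model_oom_control global_fs = {
		.memcg = NULL, .gfp_mask = MODEL_GFP_FS, .order = 0,
	};
	assert(!model_returns_early(&global_fs));
}
```

Calling `model_self_check()` from a test harness exercises all three cases. Under these assumptions, the only combination that returns early is the one the kfunc builds for a NULL memcg. Because the quoted check returns true, the kfunc would presumably pass that on to the BPF caller as 1, the value the kerneldoc describes as forward progress.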