```
commit bce57b9b96e05731a14c7e10a364fdd9b733cda2
Author: Roman Gushchin <roman.gushchin@linux.dev>

mm: introduce bpf_out_of_memory() BPF kfunc

This patch introduces bpf_out_of_memory() BPF kfunc to allow BPF programs
to declare OOM events and trigger the kernel OOM handling mechanism. The
function accepts a nullable memcg pointer for system-wide or memcg-scoped
OOMs.

> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index 721087952..3cbdcd013 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -21,6 +21,11 @@ enum oom_constraint {
>  	CONSTRAINT_MEMCG,
>  };
>
> +enum bpf_oom_flags {
> +	BPF_OOM_FLAGS_WAIT_ON_OOM_LOCK = 1 << 0,
> +	BPF_OOM_FLAGS_LAST = 1 << 1,
> +};
> +
>  /*
>   * Details of the page allocation that triggered the oom killer that are used to
>   * determine what should be killed.
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 3c86cd755..d7fca4bf5 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c

[ ... ]

> +/**
> + * bpf_out_of_memory - declare Out Of Memory state and invoke OOM killer
> + * @memcg__nullable: memcg or NULL for system-wide OOMs
> + * @order: order of page which wasn't allocated
> + * @flags: flags
> + * @constraint_text__nullable: custom constraint description for the OOM report
> + *
> + * Declares the Out Of Memory state and invokes the OOM killer.
> + *
> + * OOM handlers are synchronized using the oom_lock mutex. If wait_on_oom_lock
> + * is true, the function will wait on it. Otherwise it bails out with -EBUSY
> + * if oom_lock is contended.
> + *
> + * Generally it's advised to pass wait_on_oom_lock=false for global OOMs
> + * and wait_on_oom_lock=true for memcg-scoped OOMs.
> + *
> + * Returns 1 if the forward progress was achieved and some memory was freed.
> + * Returns a negative value if an error occurred.
> + */
> +__bpf_kfunc int bpf_out_of_memory(struct mem_cgroup *memcg__nullable,
> +				  int order, u64 flags)
> +{
> +	struct oom_control oc = {
> +		.memcg = memcg__nullable,
> +		.order = order,
> +	};

Can bpf_out_of_memory() work correctly for system-wide OOMs when
gfp_mask is never set? The initializer above leaves oc.gfp_mask
implicitly zeroed.

Looking at out_of_memory():

    if (!(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
        return true;

With gfp_mask=0 and memcg=NULL (system-wide OOM), this evaluates to:
!(0 & __GFP_FS) && !false = true, causing an early return without
invoking the OOM killer. The comment in out_of_memory() explains this
check is about allocation context: "The OOM killer does not compensate
for IO-less reclaim."

All other callers of out_of_memory() initialize gfp_mask:
- mm/page_alloc.c: .gfp_mask = gfp_mask
- mm/memcontrol.c: .gfp_mask = gfp_mask

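For memcg OOMs, the code works because is_memcg_oom(oc) is true,
bypassing the check. But for system-wide OOMs (NULL memcg), won't this
cause the function to return immediately without killing any process?
Since that early return yields true, bpf_out_of_memory() would also
return 1, which the kernel-doc above describes as forward progress.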

> +	int ret;
> +
> +	if (flags & ~(BPF_OOM_FLAGS_LAST - 1))
> +		return -EINVAL;
> +
> +	if (oc.order < 0 || oc.order > MAX_PAGE_ORDER)
> +		return -EINVAL;
> +
> +	if (flags & BPF_OOM_FLAGS_WAIT_ON_OOM_LOCK) {
> +		ret = mutex_lock_killable(&oom_lock);
> +		if (ret)
> +			return ret;
> +	} else if (!mutex_trylock(&oom_lock))
> +		return -EBUSY;
> +
> +	ret = out_of_memory(&oc);
> +
> +	mutex_unlock(&oom_lock);
> +	return ret;
> +}

[ ... ]


```
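
The early-return condition discussed in the review can be reproduced outside the kernel. The sketch below is a userspace model only: `MODEL_GFP_FS`, `struct model_oom_control`, and the `model_*` helpers are hypothetical stand-ins, not kernel definitions, and the flag value is a placeholder. It mirrors just the quoted `if` statement from out_of_memory().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit standing in for __GFP_FS; NOT the kernel's value. */
#define MODEL_GFP_FS 0x80u

/* Hypothetical stand-in for the three oom_control fields that matter here. */
struct model_oom_control {
	const void *memcg;      /* NULL models a system-wide OOM */
	unsigned int gfp_mask;  /* zero when the caller never sets it */
	int order;
};

/* Models is_memcg_oom(): a memcg OOM is one with a non-NULL memcg. */
static bool model_is_memcg_oom(const struct model_oom_control *oc)
{
	return oc->memcg != NULL;
}

/*
 * Models the quoted check:
 *     if (!(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
 *         return true;
 * Returns true when that branch would be taken, i.e. when the
 * modeled out_of_memory() would return before selecting a victim.
 */
static bool model_bails_out_early(const struct model_oom_control *oc)
{
	return !(oc->gfp_mask & MODEL_GFP_FS) && !model_is_memcg_oom(oc);
}

/* Exercises the three cases discussed in the review. */
static void model_check_cases(void)
{
	int fake_memcg = 0;

	/* System-wide OOM with gfp_mask left at zero: bails out early. */
	struct model_oom_control global_zero = { .memcg = NULL, .order = 0 };
	assert(model_bails_out_early(&global_zero));

	/* memcg-scoped OOM with gfp_mask left at zero: check is bypassed. */
	struct model_oom_control memcg_zero = { .memcg = &fake_memcg };
	assert(!model_bails_out_early(&memcg_zero));

	/* System-wide OOM with the FS bit set: check is bypassed. */
	struct model_oom_control global_fs = { .gfp_mask = MODEL_GFP_FS };
	assert(!model_bails_out_early(&global_fs));
}
```

In the kernel, GFP_KERNEL includes __GFP_FS, so initializing `.gfp_mask = GFP_KERNEL` in bpf_out_of_memory() would correspond to the third case. Whether that is the right allocation context for a BPF-declared OOM is an assumption here and a decision for the patch author.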

---
AI reviewed your patch. Please fix the bug or reply by email explaining why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: `mm: introduce bpf_out_of_memory() BPF kfunc`
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/18859027430