From: bot+bpf-ci@kernel.org
To: roman.gushchin@linux.dev,akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org,ast@kernel.org,surenb@google.com,mhocko@kernel.org,shakeel.butt@linux.dev,hannes@cmpxchg.org,andrii@kernel.org,inwardvessel@gmail.com,linux-mm@kvack.org,cgroups@vger.kernel.org,bpf@vger.kernel.org,martin.lau@kernel.org,song@kernel.org,memxor@gmail.com,tj@kernel.org,roman.gushchin@linux.dev,ast@kernel.org,andrii@kernel.org,daniel@iogearbox.net,martin.lau@kernel.org,eddyz87@gmail.com,yonghong.song@linux.dev,clm@meta.com,ihor.solodrai@linux.dev
Subject: Re: [PATCH v2 20/23] sched: psi: implement bpf_psi struct ops
Date: Mon, 27 Oct 2025 23:48:07 +0000 (UTC)
Message-ID: <0bb955784824863853d84e4b2fa96c81e3a0dd034b1705bbbbc452f8e4dd27b2@mail.kernel.org>
In-Reply-To: <20251027232206.473085-10-roman.gushchin@linux.dev>
```
commit 264db34cd79e7c07b4b1e58e2efae3d632df9cb2
Author: Roman Gushchin <roman.gushchin@linux.dev>

sched: psi: implement bpf_psi struct ops

This patch implements a BPF struct ops-based mechanism to create PSI
triggers, attach them to cgroups or system wide and handle PSI events
in BPF.

> diff --git a/kernel/sched/bpf_psi.c b/kernel/sched/bpf_psi.c
> new file mode 100644
> index 000000000..c383a2011
> --- /dev/null
> +++ b/kernel/sched/bpf_psi.c
[ ... ]
> +void bpf_psi_cgroup_offline(struct cgroup *cgroup)
> +{
> +	struct psi_group *group = cgroup->psi;
> +	u64 cgrp_id = cgroup_id(cgroup);
> +	struct psi_trigger *t, *p;
> +	struct bpf_psi *bpf_psi;
> +	LIST_HEAD(to_destroy);
> +	int idx;
> +
> +	if (!group)
> +		return;
> +
> +	spin_lock(&group->bpf_triggers_lock);
> +	list_for_each_entry_safe(t, p, &group->bpf_triggers, bpf_group_node) {
> +		if (!t->pinned) {
> +			t->pinned = true;
> +			list_move(&t->bpf_group_node, &to_destroy);
> +		}
> +	}
> +	spin_unlock(&group->bpf_triggers_lock);
> +
> +	list_for_each_entry_safe(t, p, &to_destroy, bpf_group_node) {
> +		bpf_psi = READ_ONCE(t->bpf_psi);
> +
> +		idx = srcu_read_lock(&bpf_psi->srcu);
> +		if (bpf_psi->ops->handle_cgroup_offline)
> +			bpf_psi->ops->handle_cgroup_offline(bpf_psi, cgrp_id);
> +		srcu_read_unlock(&bpf_psi->srcu, idx);
> +
> +		spin_lock(&bpf_psi->lock);
> +		list_del(&t->bpf_psi_node);
> +		spin_unlock(&bpf_psi->lock);
> +
> +		WRITE_ONCE(t->bpf_psi, NULL);
> +		flush_workqueue(bpf_psi_wq);
> +		synchronize_srcu(&bpf_psi->srcu);
> +		psi_trigger_destroy(t);
> +	}
> +}

Can bpf_psi_cgroup_offline() race with bpf_psi_ops_unreg() and access
freed memory?

Consider this sequence:

Thread A (bpf_psi_cgroup_offline):
  - Pins triggers under group->bpf_triggers_lock
  - Reads bpf_psi = READ_ONCE(t->bpf_psi) outside any lock

Thread B (bpf_psi_ops_unreg):
  - Iterates bpf_psi->triggers, skips pinned triggers
  - Calls synchronize_srcu(&bpf_psi->srcu)
  - Calls kfree(bpf_psi)

Thread A:
  - Tries srcu_read_lock(&bpf_psi->srcu) on freed memory

The pinning mechanism prevents Thread B from NULLing t->bpf_psi when
Thread A has pinned the trigger. However, Thread B unconditionally
frees the bpf_psi object itself even when it didn't process any
triggers because they were all pinned by Thread A.

After Thread A pins all triggers and releases group->bpf_triggers_lock,
Thread B can complete bpf_psi_ops_unreg(), call synchronize_srcu() (no
readers yet), and free bpf_psi. Then Thread A accesses the freed
bpf_psi via srcu_read_lock() and spin_lock().

> +static void bpf_psi_ops_unreg(void *kdata, struct bpf_link *link)
> +{
> +	struct bpf_psi_ops *ops = kdata;
> +	struct bpf_psi *bpf_psi = ops->bpf_psi;
> +	struct psi_trigger *t, *p;
> +	LIST_HEAD(to_destroy);
> +
> +	spin_lock(&bpf_psi->lock);
> +	list_for_each_entry_safe(t, p, &bpf_psi->triggers, bpf_psi_node) {
> +		spin_lock(&t->group->bpf_triggers_lock);
> +		if (!t->pinned) {
> +			t->pinned = true;
> +			list_move(&t->bpf_group_node, &to_destroy);
> +			list_del(&t->bpf_psi_node);
> +
> +			WRITE_ONCE(t->bpf_psi, NULL);
> +		}
> +		spin_unlock(&t->group->bpf_triggers_lock);
> +	}
> +	spin_unlock(&bpf_psi->lock);
> +
> +	flush_workqueue(bpf_psi_wq);
> +	synchronize_srcu(&bpf_psi->srcu);
> +
> +	list_for_each_entry_safe(t, p, &to_destroy, bpf_group_node)
> +		psi_trigger_destroy(t);
> +
> +	if (!list_empty(&bpf_psi->node)) {
> +		mutex_lock(&bpf_psi_lock);
> +		list_del(&bpf_psi->node);
> +		mutex_unlock(&bpf_psi_lock);
> +		static_branch_dec(&bpf_psi_notify_key);
> +	}
> +
> +	cleanup_srcu_struct(&bpf_psi->srcu);
> +	kfree(bpf_psi);
> +}

When bpf_psi_ops_unreg() skips all triggers because they're pinned
(by bpf_psi_cgroup_offline()), does it still free bpf_psi too early?

The to_destroy list will be empty if all triggers were pinned, but
the function still proceeds to cleanup_srcu_struct() and kfree(bpf_psi)
while bpf_psi_cgroup_offline() may be using that bpf_psi pointer.
```
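
The interleaving described above can be replayed in a small userspace model. This is only a sketch of one possible direction, not the patch author's fix: it assumes a hypothetical reference count on the bpf_psi object that the offline path takes while it still holds group->bpf_triggers_lock, and that the unregister path drops instead of calling kfree() directly. All names below (model_bpf_psi, model_get, model_put, model_replay) are made up for illustration. In the kernel the equivalent would presumably be built on refcount_t or kref, and the SRCU cleanup would have to move into the release path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bpf_psi; only its lifetime is modeled. */
struct model_bpf_psi {
	atomic_int refs;	/* assumption: no such field exists in the patch */
	bool *freed_flag;	/* test hook: set when the object is released */
};

static struct model_bpf_psi *model_alloc(bool *freed_flag)
{
	struct model_bpf_psi *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	atomic_init(&p->refs, 1);	/* reference owned by the registration */
	p->freed_flag = freed_flag;
	*freed_flag = false;
	return p;
}

static void model_get(struct model_bpf_psi *p)
{
	/* A reference may only be taken while the object is known to be alive. */
	assert(atomic_load(&p->refs) > 0);
	atomic_fetch_add(&p->refs, 1);
}

static void model_put(struct model_bpf_psi *p)
{
	/* atomic_fetch_sub() returns the old value; 1 means this was the last ref. */
	if (atomic_fetch_sub(&p->refs, 1) == 1) {
		*p->freed_flag = true;
		free(p);
	}
}

/*
 * Replays the Thread A / Thread B ordering from the review in one thread.
 * Returns true if the object is still alive at the point where Thread A
 * would call srcu_read_lock() and spin_lock() on it.
 */
static bool model_replay(bool offline_takes_ref)
{
	bool freed;
	struct model_bpf_psi *p = model_alloc(&freed);
	bool alive_during_use;

	if (!p)
		return false;

	/* Thread A: pins the triggers; optionally takes a reference under the lock. */
	if (offline_takes_ref)
		model_get(p);

	/* Thread B: unreg finds only pinned triggers and drops its reference. */
	model_put(p);

	/* Thread A: this is where the object would be dereferenced. */
	alive_during_use = !freed;

	/* Thread A: finished with the object. */
	if (offline_takes_ref)
		model_put(p);

	return alive_during_use;
}
```

Calling model_replay(false) mirrors the current code, where nothing keeps the object alive across the window, and model_replay(true) shows the same ordering with the extra reference held. The model deliberately never touches the object after it has been released, so it reports the problem without triggering a real use-after-free.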
---
An AI reviewed your patch. Please fix the bug, or reply by email explaining why it is not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
In-Reply-To-Subject: `sched: psi: implement bpf_psi struct ops`
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/18859027430