linux-mm.kvack.org archive mirror
From: bot+bpf-ci@kernel.org
To: roman.gushchin@linux.dev,akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org,ast@kernel.org,surenb@google.com,mhocko@kernel.org,shakeel.butt@linux.dev,hannes@cmpxchg.org,andrii@kernel.org,inwardvessel@gmail.com,linux-mm@kvack.org,cgroups@vger.kernel.org,bpf@vger.kernel.org,martin.lau@kernel.org,song@kernel.org,memxor@gmail.com,tj@kernel.org,roman.gushchin@linux.dev,ast@kernel.org,andrii@kernel.org,daniel@iogearbox.net,martin.lau@kernel.org,eddyz87@gmail.com,yonghong.song@linux.dev,clm@meta.com,ihor.solodrai@linux.dev
Subject: Re: [PATCH v2 20/23] sched: psi: implement bpf_psi struct ops
Date: Mon, 27 Oct 2025 23:48:07 +0000 (UTC)
Message-ID: <0bb955784824863853d84e4b2fa96c81e3a0dd034b1705bbbbc452f8e4dd27b2@mail.kernel.org>
In-Reply-To: <20251027232206.473085-10-roman.gushchin@linux.dev>


```
commit 264db34cd79e7c07b4b1e58e2efae3d632df9cb2
Author: Roman Gushchin <roman.gushchin@linux.dev>

sched: psi: implement bpf_psi struct ops

This patch implements a BPF struct ops-based mechanism to create PSI
triggers, attach them to cgroups or system wide and handle PSI events
in BPF.

> diff --git a/kernel/sched/bpf_psi.c b/kernel/sched/bpf_psi.c
> new file mode 100644
> index 000000000..c383a2011
> --- /dev/null
> +++ b/kernel/sched/bpf_psi.c

[ ... ]

> +void bpf_psi_cgroup_offline(struct cgroup *cgroup)
> +{
> +	struct psi_group *group = cgroup->psi;
> +	u64 cgrp_id = cgroup_id(cgroup);
> +	struct psi_trigger *t, *p;
> +	struct bpf_psi *bpf_psi;
> +	LIST_HEAD(to_destroy);
> +	int idx;
> +
> +	if (!group)
> +		return;
> +
> +	spin_lock(&group->bpf_triggers_lock);
> +	list_for_each_entry_safe(t, p, &group->bpf_triggers, bpf_group_node) {
> +		if (!t->pinned) {
> +			t->pinned = true;
> +			list_move(&t->bpf_group_node, &to_destroy);
> +		}
> +	}
> +	spin_unlock(&group->bpf_triggers_lock);
> +
> +	list_for_each_entry_safe(t, p, &to_destroy, bpf_group_node) {
> +		bpf_psi = READ_ONCE(t->bpf_psi);
> +
> +		idx = srcu_read_lock(&bpf_psi->srcu);
> +		if (bpf_psi->ops->handle_cgroup_offline)
> +			bpf_psi->ops->handle_cgroup_offline(bpf_psi, cgrp_id);
> +		srcu_read_unlock(&bpf_psi->srcu, idx);
> +
> +		spin_lock(&bpf_psi->lock);
> +		list_del(&t->bpf_psi_node);
> +		spin_unlock(&bpf_psi->lock);
> +
> +		WRITE_ONCE(t->bpf_psi, NULL);
> +		flush_workqueue(bpf_psi_wq);
> +		synchronize_srcu(&bpf_psi->srcu);
> +		psi_trigger_destroy(t);
> +	}
> +}

Can bpf_psi_cgroup_offline() race with bpf_psi_ops_unreg() and access
freed memory?

Consider this sequence:

Thread A (bpf_psi_cgroup_offline):
  - Pins triggers under group->bpf_triggers_lock
  - Reads bpf_psi = READ_ONCE(t->bpf_psi) outside any lock

Thread B (bpf_psi_ops_unreg):
  - Iterates bpf_psi->triggers, skips pinned triggers
  - Calls synchronize_srcu(&bpf_psi->srcu)
  - Calls kfree(bpf_psi)

Thread A:
  - Tries srcu_read_lock(&bpf_psi->srcu) on freed memory

The pinning mechanism prevents Thread B from NULLing t->bpf_psi when
Thread A has pinned the trigger. However, Thread B unconditionally
frees the bpf_psi object itself even when it didn't process any
triggers because they were all pinned by Thread A.

After Thread A pins all triggers and releases group->bpf_triggers_lock,
Thread B can complete bpf_psi_ops_unreg(), call synchronize_srcu() (no
readers yet), and free bpf_psi. Then Thread A accesses the freed
bpf_psi via srcu_read_lock() and spin_lock().

> +static void bpf_psi_ops_unreg(void *kdata, struct bpf_link *link)
> +{
> +	struct bpf_psi_ops *ops = kdata;
> +	struct bpf_psi *bpf_psi = ops->bpf_psi;
> +	struct psi_trigger *t, *p;
> +	LIST_HEAD(to_destroy);
> +
> +	spin_lock(&bpf_psi->lock);
> +	list_for_each_entry_safe(t, p, &bpf_psi->triggers, bpf_psi_node) {
> +		spin_lock(&t->group->bpf_triggers_lock);
> +		if (!t->pinned) {
> +			t->pinned = true;
> +			list_move(&t->bpf_group_node, &to_destroy);
> +			list_del(&t->bpf_psi_node);
> +
> +			WRITE_ONCE(t->bpf_psi, NULL);
> +		}
> +		spin_unlock(&t->group->bpf_triggers_lock);
> +	}
> +	spin_unlock(&bpf_psi->lock);
> +
> +	flush_workqueue(bpf_psi_wq);
> +	synchronize_srcu(&bpf_psi->srcu);
> +
> +	list_for_each_entry_safe(t, p, &to_destroy, bpf_group_node)
> +		psi_trigger_destroy(t);
> +
> +	if (!list_empty(&bpf_psi->node)) {
> +		mutex_lock(&bpf_psi_lock);
> +		list_del(&bpf_psi->node);
> +		mutex_unlock(&bpf_psi_lock);
> +		static_branch_dec(&bpf_psi_notify_key);
> +	}
> +
> +	cleanup_srcu_struct(&bpf_psi->srcu);
> +	kfree(bpf_psi);
> +}

When bpf_psi_ops_unreg() skips all triggers because they're pinned
(by bpf_psi_cgroup_offline()), does it still free bpf_psi too early?

The to_destroy list will be empty if all triggers were pinned, but
the function still proceeds to cleanup_srcu_struct() and kfree(bpf_psi)
while bpf_psi_cgroup_offline() may be using that bpf_psi pointer.


```
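
One conventional way to close a window like the one described above is
to reference-count the shared object, so that whichever path finishes
last frees it. The following is a minimal userspace sketch of that
idea, not the patch author's fix. `struct psi_obj`, its `refcnt`
field, and every function name below are hypothetical stand-ins; in
the kernel the usual primitives would be `refcount_t` or `kref`.
Whether a reference can safely be taken under
`group->bpf_triggers_lock` in the real code is an assumption that
would need to be checked against the actual locking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bpf_psi; only its lifetime matters. */
struct psi_obj {
	atomic_int refcnt;	/* assumed field, not present in the patch */
	bool *freed_flag;	/* lets the caller observe the release */
};

static struct psi_obj *psi_obj_alloc(bool *freed_flag)
{
	struct psi_obj *obj = malloc(sizeof(*obj));

	if (!obj)
		return NULL;
	atomic_init(&obj->refcnt, 1);	/* reference owned by registration */
	obj->freed_flag = freed_flag;
	*freed_flag = false;
	return obj;
}

static void psi_obj_get(struct psi_obj *obj)
{
	atomic_fetch_add(&obj->refcnt, 1);
}

/* Returns true if this call dropped the last reference and freed obj. */
static bool psi_obj_put(struct psi_obj *obj)
{
	int old = atomic_fetch_sub(&obj->refcnt, 1);

	assert(old > 0);
	if (old == 1) {
		*obj->freed_flag = true;
		free(obj);
		return true;
	}
	return false;
}

/*
 * Replays the interleaving from the review in a single thread.
 * Returns 0 if the object outlives the offline path's use of it.
 */
static int simulate_interleaving(void)
{
	bool freed;
	struct psi_obj *obj = psi_obj_alloc(&freed);

	if (!obj)
		return -1;

	/* Thread A: pins triggers and takes a reference before unlocking. */
	psi_obj_get(obj);

	/* Thread B: unreg finds nothing to destroy, drops its reference. */
	if (psi_obj_put(obj) || freed)
		return 1;	/* freed too early: the bug under discussion */

	/* Thread A: done with obj (srcu/lock use in the patch), drops ref. */
	return (psi_obj_put(obj) && freed) ? 0 : 2;
}
```

Under these assumptions, simulate_interleaving() should return 0: the
put from the unreg path does not free the object, because the offline
path still holds a reference, and the final put releases it.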

---
AI reviewed your patch. Please fix the bug, or reply by email explaining why it is not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: `sched: psi: implement bpf_psi struct ops`
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/18859027430
