From: hui.zhu@linux.dev
To: "Roman Gushchin" <roman.gushchin@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Johannes  Weiner" <hannes@cmpxchg.org>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"JP  Kobryn" <inwardvessel@gmail.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org, bpf@vger.kernel.org,
	"Martin KaFai Lau" <martin.lau@kernel.org>,
	"Song Liu" <song@kernel.org>,
	"Kumar Kartikeya  Dwivedi" <memxor@gmail.com>,
	"Tejun Heo" <tj@kernel.org>,
	"Roman  Gushchin" <roman.gushchin@linux.dev>
Subject: Re: [PATCH v2 21/23] sched: psi: implement bpf_psi_create_trigger() kfunc
Date: Mon, 08 Dec 2025 08:49:34 +0000
Message-ID: <1d9a162605a3f32ac215430131f7745488deaa34@linux.dev>
In-Reply-To: <20251027232206.473085-11-roman.gushchin@linux.dev>

On 2025-10-28 07:22, "Roman Gushchin" <roman.gushchin@linux.dev> wrote:


> 
> Implement a new bpf_psi_create_trigger() BPF kfunc, which allows
> creating new PSI triggers and attaching them to a cgroup or making
> them system-wide.
> 
> Created triggers exist for as long as the struct ops is loaded and,
> if they are attached to a cgroup, for as long as the cgroup exists.
> 
> Due to a limitation of 5 arguments, the resource type and the "full"
> bit are squeezed into a single u32.
> 
> Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>

Hi Roman,

I wrote an eBPF program attempting to use bpf_psi struct ops and
bpf_psi_create_trigger to continuously receive memory-related PSI
events, but I only received one event.

Looking at the implementation, an event is delivered only when the
trigger's event flag changes from 0 to 1:

	if (cmpxchg(&t->event, 0, 1) == 0) {

However, eBPF appears to have no way to do the equivalent of this
code from psi_trigger_poll(), which resets the flag back to 0:

	if (cmpxchg(&t->event, 1, 0) == 1)

Would it be possible to add an additional BPF helper function to
handle this? Without a way to acknowledge/reset the event flag,
the trigger fires only once and cannot be reused for continuous
monitoring.
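
To make the one-shot behaviour concrete, here is a minimal user-space
sketch of the latch, using C11 atomics in place of the kernel's
cmpxchg(). It is only a model under stated assumptions: struct
model_trigger and the model_*() functions are hypothetical names made
up for illustration and are not part of the patch or of the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the trigger's event flag. */
struct model_trigger {
	atomic_int event;	/* 0 = idle, 1 = event pending */
};

/*
 * Producer side, modelled on cmpxchg(&t->event, 0, 1) == 0.
 * Returns true only when the flag moved from 0 to 1, i.e. when a
 * notification would actually be delivered.
 */
static bool model_raise(struct model_trigger *t)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&t->event, &expected, 1);
}

/*
 * Consumer side, modelled on cmpxchg(&t->event, 1, 0) == 1.
 * Returns true when a pending event was acknowledged and cleared.
 */
static bool model_ack(struct model_trigger *t)
{
	int expected = 1;

	return atomic_compare_exchange_strong(&t->event, &expected, 0);
}

/*
 * Simulate a number of threshold crossings and count how many of them
 * produce a notification, with or without an ack after each one.
 */
int model_count_notifications(int crossings, bool ack_each)
{
	struct model_trigger t;
	int delivered = 0;

	assert(crossings >= 0);
	atomic_init(&t.event, 0);

	for (int i = 0; i < crossings; i++) {
		if (model_raise(&t))
			delivered++;
		if (ack_each)
			model_ack(&t);
	}

	return delivered;
}
```

In this model, five crossings without an ack produce a single
notification, while five crossings with an ack after each one produce
five. That matches what I observe: the flag is set once and nothing
ever clears it.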

Best,
Hui



> ---
>  include/linux/cgroup.h | 4 ++
>  include/linux/psi.h | 6 +++
>  kernel/sched/bpf_psi.c | 94 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 104 insertions(+)
> 
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 6ed477338b16..1a99da44999e 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -707,6 +707,10 @@ static inline bool task_under_cgroup_hierarchy(struct task_struct *task,
>  
>  static inline void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen)
>  {}
> +static inline struct cgroup *cgroup_get_from_id(u64 id)
> +{
> + return NULL;
> +}
>  #endif /* !CONFIG_CGROUPS */
>  
>  #ifdef CONFIG_CGROUPS
> diff --git a/include/linux/psi.h b/include/linux/psi.h
> index 8178e998d94b..8ffe84cd8571 100644
> --- a/include/linux/psi.h
> +++ b/include/linux/psi.h
> @@ -50,6 +50,12 @@ int psi_cgroup_alloc(struct cgroup *cgrp);
>  void psi_cgroup_free(struct cgroup *cgrp);
>  void cgroup_move_task(struct task_struct *p, struct css_set *to);
>  void psi_cgroup_restart(struct psi_group *group);
> +
> +#else
> +static inline struct psi_group *cgroup_psi(struct cgroup *cgrp)
> +{
> + return &psi_system;
> +}
>  #endif
>  
>  #else /* CONFIG_PSI */
> diff --git a/kernel/sched/bpf_psi.c b/kernel/sched/bpf_psi.c
> index c383a20119a6..7974de56594f 100644
> --- a/kernel/sched/bpf_psi.c
> +++ b/kernel/sched/bpf_psi.c
> @@ -8,6 +8,7 @@
>  #include <linux/bpf_psi.h>
>  #include <linux/cgroup-defs.h>
>  
> +struct bpf_struct_ops bpf_psi_bpf_ops;
>  static struct workqueue_struct *bpf_psi_wq;
>  
>  static DEFINE_MUTEX(bpf_psi_lock);
> @@ -186,6 +187,92 @@ static const struct bpf_verifier_ops bpf_psi_verifier_ops = {
>  .is_valid_access = bpf_psi_ops_is_valid_access,
>  };
>  
> +__bpf_kfunc_start_defs();
> +
> +/**
> + * bpf_psi_create_trigger - Create a PSI trigger
> + * @bpf_psi: bpf_psi struct to attach the trigger to
> + * @cgroup_id: cgroup ID to attach the trigger to; 0 for system-wide scope
> + * @resource: resource to monitor (PSI_MEM, PSI_IO, etc) and the full bit.
> + * @threshold_us: threshold in us
> + * @window_us: window in us
> + *
> + * Creates a PSI trigger and attaches it to bpf_psi. The trigger will be
> + * active until the bpf struct ops is unloaded or the corresponding cgroup
> + * is deleted.
> + *
> + * Resource's most significant bit encodes whether "some" or "full"
> + * PSI state should be tracked.
> + *
> + * Returns 0 on success and the error code on failure.
> + */
> +__bpf_kfunc int bpf_psi_create_trigger(struct bpf_psi *bpf_psi,
> + u64 cgroup_id, u32 resource,
> + u32 threshold_us, u32 window_us)
> +{
> + enum psi_res res = resource & ~BPF_PSI_FULL;
> + bool full = resource & BPF_PSI_FULL;
> + struct psi_trigger_params params;
> + struct cgroup *cgroup __maybe_unused = NULL;
> + struct psi_group *group;
> + struct psi_trigger *t;
> + int ret = 0;
> +
> + if (res >= NR_PSI_RESOURCES)
> + return -EINVAL;
> +
> + if (IS_ENABLED(CONFIG_CGROUPS) && cgroup_id) {
> + cgroup = cgroup_get_from_id(cgroup_id);
> + if (IS_ERR_OR_NULL(cgroup))
> + return cgroup ? PTR_ERR(cgroup) : -ENOENT;
> +
> + group = cgroup_psi(cgroup);
> + } else {
> + group = &psi_system;
> + }
> +
> + params.type = PSI_BPF;
> + params.bpf_psi = bpf_psi;
> + params.privileged = capable(CAP_SYS_RESOURCE);
> + params.res = res;
> + params.full = full;
> + params.threshold_us = threshold_us;
> + params.window_us = window_us;
> +
> + t = psi_trigger_create(group, &params);
> + if (IS_ERR(t))
> + ret = PTR_ERR(t);
> + else
> + t->cgroup_id = cgroup_id;
> +
> +#ifdef CONFIG_CGROUPS
> + if (cgroup)
> + cgroup_put(cgroup);
> +#endif
> +
> + return ret;
> +}
> +__bpf_kfunc_end_defs();
> +
> +BTF_KFUNCS_START(bpf_psi_kfuncs)
> +BTF_ID_FLAGS(func, bpf_psi_create_trigger, KF_TRUSTED_ARGS)
> +BTF_KFUNCS_END(bpf_psi_kfuncs)
> +
> +static int bpf_psi_kfunc_filter(const struct bpf_prog *prog, u32 kfunc_id)
> +{
> + if (btf_id_set8_contains(&bpf_psi_kfuncs, kfunc_id) &&
> + prog->aux->st_ops != &bpf_psi_bpf_ops)
> + return -EACCES;
> +
> + return 0;
> +}
> +
> +static const struct btf_kfunc_id_set bpf_psi_kfunc_set = {
> + .owner = THIS_MODULE,
> + .set = &bpf_psi_kfuncs,
> + .filter = bpf_psi_kfunc_filter,
> +};
> +
>  static int bpf_psi_ops_reg(void *kdata, struct bpf_link *link)
>  {
>  struct bpf_psi_ops *ops = kdata;
> @@ -287,6 +374,13 @@ static int __init bpf_psi_struct_ops_init(void)
>  if (!bpf_psi_wq)
>  return -ENOMEM;
>  
> + err = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
> + &bpf_psi_kfunc_set);
> + if (err) {
> + pr_warn("error while registering bpf psi kfuncs: %d", err);
> + goto err;
> + }
> +
>  err = register_bpf_struct_ops(&bpf_psi_bpf_ops, bpf_psi_ops);
>  if (err) {
>  pr_warn("error while registering bpf psi struct ops: %d", err);
> -- 
> 2.51.0
>
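
The packing of the resource type and the "full" bit described in the
quoted kfunc comment can be sketched in user space as follows. This is
an illustration under assumptions, not code from the patch: the
MODEL_* names and their numeric values are hypothetical, and the only
thing taken from the quoted text is that the most significant bit of
the u32 carries the "full" flag and that out-of-range resources are
rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration: the "full" flag lives in the top bit. */
#define MODEL_PSI_FULL (UINT32_C(1) << 31)

/* Hypothetical resource list; the values are not taken from the patch. */
enum model_psi_res {
	MODEL_PSI_IO,
	MODEL_PSI_MEM,
	MODEL_PSI_CPU,
	MODEL_NR_PSI_RESOURCES,
};

struct model_decoded {
	uint32_t res;	/* resource index with the flag bit masked off */
	bool full;	/* true for "full", false for "some" */
};

/* Caller side: combine the resource and the "full" bit into one u32. */
uint32_t model_pack(enum model_psi_res res, bool full)
{
	return (uint32_t)res | (full ? MODEL_PSI_FULL : 0);
}

/*
 * Callee side: split the u32 again, following the masking shown in the
 * quoted kfunc. Returns 0 on success and -1 for an out-of-range
 * resource (the kfunc returns -EINVAL in that case).
 */
int model_unpack(uint32_t resource, struct model_decoded *out)
{
	uint32_t res = resource & ~MODEL_PSI_FULL;

	assert(out != NULL);

	if (res >= MODEL_NR_PSI_RESOURCES)
		return -1;

	out->res = res;
	out->full = (resource & MODEL_PSI_FULL) != 0;
	return 0;
}
```

Packing both values into one argument keeps the kfunc within the
five-argument limit mentioned in the commit message, at the cost of
the caller having to OR the flag in explicitly.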


