From: Yafang Shao <laoar.shao@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>, Martin Lau <kafai@fb.com>,
Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>,
john fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
jolsa@kernel.org, Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeelb@google.com>,
Muchun Song <songmuchun@bytedance.com>,
Andrew Morton <akpm@linux-foundation.org>,
netdev <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>
Subject: Re: [RFC PATCH bpf-next 15/15] bpf: Introduce selectable memcg for bpf map
Date: Tue, 2 Aug 2022 21:47:02 +0800
Message-ID: <CALOAHbDZq89ATv5pK4FAX9-DYOWZjFJPtJ8fAYL1zhuS2c1D9w@mail.gmail.com>
In-Reply-To: <20220802045531.6oi2pt3fyjhotmjo@macbook-pro-3.dhcp.thefacebook.com>
On Tue, Aug 2, 2022 at 12:55 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Jul 29, 2022 at 03:23:16PM +0000, Yafang Shao wrote:
> > A new member memcg_fd is introduced into bpf attr of BPF_MAP_CREATE
> > command, which is the fd of an opened cgroup directory. In this cgroup,
> > the memory subsystem must be enabled. This value is valid only when
> > BPF_F_SELECTABLE_MEMCG is set in map_flags. Once the kernel gets the
> > memory cgroup from this fd, it will set this memcg in the bpf map, and
> > all subsequent memory allocations of this map will be charged to that
> > memcg.
> >
> > The map creation paths in libbpf are also changed accordingly.
> >
> > Currently only a cgroup2 directory is supported.
> >
> > This new member can be used as follows:
> >     struct bpf_map_create_opts map_opts = {
> >         .sz = sizeof(map_opts),
> >         .map_flags = BPF_F_SELECTABLE_MEMCG,
> >     };
> >     int memcg_fd, map_fd;
> >     int key, value;
> >
> >     memcg_fd = open("/cgroup2", O_DIRECTORY);
> >     if (memcg_fd < 0) {
> >         perror("memcg dir open");
> >         return -1;
> >     }
> >
> >     map_opts.memcg_fd = memcg_fd;
> >     map_fd = bpf_map_create(BPF_MAP_TYPE_HASH, "map_for_memcg",
> >                             sizeof(key), sizeof(value),
> >                             1024, &map_opts);
> >     if (map_fd < 0) {
> >         perror("map create");
> >         return -1;
> >     }
>
> Overall the api extension makes sense.
> The flexibility of selecting memcg is useful.
>
Thanks!
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > ---
> > include/uapi/linux/bpf.h | 2 ++
> > kernel/bpf/syscall.c | 47 ++++++++++++++++++++++++++--------
> > tools/include/uapi/linux/bpf.h | 2 ++
> > tools/lib/bpf/bpf.c | 1 +
> > tools/lib/bpf/bpf.h | 3 ++-
> > tools/lib/bpf/libbpf.c | 2 ++
> > 6 files changed, 46 insertions(+), 11 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index d5fc1ea70b59..a6e02c8be924 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1296,6 +1296,8 @@ union bpf_attr {
> > * struct stored as the
> > * map value
> > */
> > + __s32 memcg_fd; /* selectable memcg */
> > + __s32 :32; /* hole */
>
> new fields cannot be inserted in the middle of uapi struct.
>
There's a "#define BPF_MAP_CREATE_LAST_FIELD map_extra" in
kernel/bpf/syscall.c, so I thought map_extra might have some special
meaning as the last field, and I put the new field above it.
Since it doesn't have any special meaning, I will change it as you suggested.
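To illustrate why the position matters, here is a minimal userspace sketch. The three structs are hypothetical, heavily simplified stand-ins for the BPF_MAP_CREATE part of union bpf_attr (the real one has many more fields), and the helper names are made up for this example. It only shows that inserting a field shifts the offset of map_extra, while appending leaves existing offsets untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified layout before the change. */
struct attr_old {
	uint32_t map_flags;
	uint32_t btf_vmlinux_value_type_id;
	uint64_t map_extra;
};

/* New field (plus hole) inserted before map_extra, as in this RFC. */
struct attr_inserted {
	uint32_t map_flags;
	uint32_t btf_vmlinux_value_type_id;
	int32_t  memcg_fd;
	int32_t  hole;
	uint64_t map_extra;
};

/* New field appended after map_extra. */
struct attr_appended {
	uint32_t map_flags;
	uint32_t btf_vmlinux_value_type_id;
	uint64_t map_extra;
	int32_t  memcg_fd;
};

/* Returns 1 if map_extra stays at its old offset after inserting. */
static int insert_keeps_old_offsets(void)
{
	return offsetof(struct attr_inserted, map_extra) ==
	       offsetof(struct attr_old, map_extra);
}

/* Returns 1 if map_extra stays at its old offset after appending. */
static int append_keeps_old_offsets(void)
{
	return offsetof(struct attr_appended, map_extra) ==
	       offsetof(struct attr_old, map_extra);
}
```

With the inserted layout, a binary built against the old header would write map_extra where the kernel now expects memcg_fd, which is why the field has to go at the end.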
> > /* Any per-map-type extra fields
> > *
> > * BPF_MAP_TYPE_BLOOM_FILTER - the lowest 4 bits indicate the
> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > index 6401cc417fa9..9900e2b87315 100644
> > --- a/kernel/bpf/syscall.c
> > +++ b/kernel/bpf/syscall.c
> > @@ -402,14 +402,30 @@ void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock)
> > }
> >
> > #ifdef CONFIG_MEMCG_KMEM
> > -static void bpf_map_save_memcg(struct bpf_map *map)
> > +static int bpf_map_save_memcg(struct bpf_map *map, union bpf_attr *attr)
> > {
> > - /* Currently if a map is created by a process belonging to the root
> > - * memory cgroup, get_obj_cgroup_from_current() will return NULL.
> > - * So we have to check map->objcg for being NULL each time it's
> > - * being used.
> > - */
> > - map->objcg = get_obj_cgroup_from_current();
> > + struct obj_cgroup *objcg;
> > + struct cgroup *cgrp;
> > +
> > + if (attr->map_flags & BPF_F_SELECTABLE_MEMCG) {
>
> The flag is unnecessary. Just add memcg_fd to the end of attr and use != 0
> as a condition that it should be used instead of get_obj_cgroup_from_current().
> There are other parts of bpf uapi that have similar fd handling logic.
>
Right. There's an ensure_good_fd() to make the fd a positive number.
I will change it.
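A rough sketch of the flag-less selection, written as plain userspace C so it can be compiled on its own. The enum and the function name pick_memcg_source() are hypothetical and only model the decision; in the kernel the two branches would call cgroup_get_from_fd() and get_obj_cgroup_from_current() respectively.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: where the map's memcg would come from. */
enum memcg_source {
	MEMCG_FROM_CURRENT,	/* charge to the creating task's memcg */
	MEMCG_FROM_FD,		/* charge to the memcg behind memcg_fd */
};

/* A zero memcg_fd means "not set", so no extra map flag is needed.
 * Any non-zero value is treated as a cgroup fd; validating that fd
 * is left to the lookup step, which is not modelled here.
 */
static enum memcg_source pick_memcg_source(int32_t memcg_fd)
{
	return memcg_fd != 0 ? MEMCG_FROM_FD : MEMCG_FROM_CURRENT;
}
```

This relies on userspace never passing fd 0 for the cgroup directory, which is what ensure_good_fd() is for on the libbpf side.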
> > + cgrp = cgroup_get_from_fd(attr->memcg_fd);
> > + if (IS_ERR(cgrp))
> > + return -EINVAL;
> > +
> > + objcg = get_obj_cgroup_from_cgroup(cgrp);
> > + if (IS_ERR(objcg))
> > + return PTR_ERR(objcg);
> > + } else {
> > + /* Currently if a map is created by a process belonging to the root
> > + * memory cgroup, get_obj_cgroup_from_current() will return NULL.
> > + * So we have to check map->objcg for being NULL each time it's
> > + * being used.
> > + */
> > + objcg = get_obj_cgroup_from_current();
> > + }
> > +
> > + map->objcg = objcg;
> > + return 0;
> > }
> >
> > static void bpf_map_release_memcg(struct bpf_map *map)
> > @@ -485,8 +501,9 @@ void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
> > }
> >
> > #else
> > -static void bpf_map_save_memcg(struct bpf_map *map)
> > +static int bpf_map_save_memcg(struct bpf_map *map, union bpf_attr *attr)
> > {
> > + return 0;
> > }
> >
> > static void bpf_map_release_memcg(struct bpf_map *map)
> > @@ -530,13 +547,18 @@ void *bpf_map_container_alloc(union bpf_attr *attr, u64 size, int numa_node)
>
> High level uapi struct should not be passed into low level helper like this.
> Pls pass memcg_fd instead.
>
Sure, I will do it.
--
Regards
Yafang