From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@kernel.org>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Eddy Z <eddyz87@gmail.com>, Tejun Heo <tj@kernel.org>,
Barret Rhoden <brho@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-mm <linux-mm@kvack.org>, Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next 11/16] libbpf: Add support for bpf_arena.
Date: Wed, 7 Feb 2024 17:38:47 -0800 [thread overview]
Message-ID: <CAADnVQKjkba_wiUJ9wps_k8+TYu_q3Ai5oQ1mnZQmpv+pnPfFw@mail.gmail.com> (raw)
In-Reply-To: <CAEf4Bza9gNXfGXuQnvWnoYNA08enBCkqn9uyHtBNdTpZRvn7og@mail.gmail.com>
On Wed, Feb 7, 2024 at 5:15 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Feb 6, 2024 at 2:05 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > mmap() bpf_arena right after creation, since the kernel needs to
> > remember the address returned from mmap. This is user_vm_start.
> > LLVM will generate bpf_arena_cast_user() instructions where
> > necessary and JIT will add upper 32-bit of user_vm_start
> > to such pointers.
> >
> > Use traditional map->value_size * map->max_entries to calculate mmap sz,
> > though it's not the best fit.
>
> We should probably make bpf_map_mmap_sz() aware of specific map type
> and do different calculations based on that. It makes sense to have
> round_up(PAGE_SIZE) for BPF map arena, and use just just value_size or
> max_entries to specify the size (fixing the other to be zero).
I went with value_size == key_size == 8 in order to be able to extend
it in the future and allow map_lookup/update/delete to do something
useful. Ex: lookup/delete can behave just like arena_alloc/free_pages.
Are you proposing to force key/value_size to zero ?
That was my first attempt.
key_size can be zero, but syscall side of lookup/update expects
a non-zero value_size for all maps regardless of type.
We can modify bpf/syscall.c, of course, but it feels arena would be
too different of a map if generic map handling code would need
to be specialized.
Then since value_size is > 0 then what sizes make sense?
When it's 8 it can be an indirection to anything.
key/value would be user pointers to other structs that
would be meaningful for an arena.
Right now it costs nothing to force both to 8 and pick any logic
when we decide what lookup/update should do.
But then when value_size == 8 than making max_entries to
mean the size of arena in bytes or pages.. starting to look odd
and different from all other maps.
We could go with max_entries==0 and value_size to mean the size of
arena in bytes, but it will prevent us from defining lookup/update
in the future, which doesn't feel right.
Considering all this I went with map->value_size * map->max_entries choice.
Though it's not pretty.
> > @@ -4908,6 +4910,22 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
> > if (map->fd == map_fd)
> > return 0;
> >
> > + if (def->type == BPF_MAP_TYPE_ARENA) {
> > + size_t mmap_sz;
> > +
> > + mmap_sz = bpf_map_mmap_sz(def->value_size, def->max_entries);
> > + map->mmaped = mmap((void *)map->map_extra, mmap_sz, PROT_READ | PROT_WRITE,
> > + map->map_extra ? MAP_SHARED | MAP_FIXED : MAP_SHARED,
> > + map_fd, 0);
> > + if (map->mmaped == MAP_FAILED) {
> > + err = -errno;
> > + map->mmaped = NULL;
> > + pr_warn("map '%s': failed to mmap bpf_arena: %d\n",
> > + bpf_map__name(map), err);
> > + return err;
>
> leaking map_fd here, you need to close(map_fd) before erroring out
ahh. good catch.
next prev parent reply other threads:[~2024-02-08 1:39 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-06 22:04 [PATCH bpf-next 00/16] bpf: Introduce BPF arena Alexei Starovoitov
2024-02-06 22:04 ` [PATCH bpf-next 01/16] bpf: Allow kfuncs return 'void *' Alexei Starovoitov
2024-02-08 19:40 ` Andrii Nakryiko
2024-02-09 0:09 ` Alexei Starovoitov
2024-02-09 19:09 ` Andrii Nakryiko
2024-02-10 2:32 ` Alexei Starovoitov
2024-02-09 16:06 ` David Vernet
2024-02-06 22:04 ` [PATCH bpf-next 02/16] bpf: Recognize '__map' suffix in kfunc arguments Alexei Starovoitov
2024-02-09 16:57 ` David Vernet
2024-02-09 17:46 ` Alexei Starovoitov
2024-02-09 18:11 ` David Vernet
2024-02-09 18:59 ` Alexei Starovoitov
2024-02-09 19:18 ` David Vernet
2024-02-06 22:04 ` [PATCH bpf-next 03/16] mm: Expose vmap_pages_range() to the rest of the kernel Alexei Starovoitov
2024-02-07 21:07 ` Lorenzo Stoakes
2024-02-07 22:56 ` Alexei Starovoitov
2024-02-08 5:44 ` Johannes Weiner
2024-02-08 23:55 ` Alexei Starovoitov
2024-02-09 6:36 ` Lorenzo Stoakes
2024-02-14 8:31 ` Christoph Hellwig
2024-02-06 22:04 ` [PATCH bpf-next 04/16] bpf: Introduce bpf_arena Alexei Starovoitov
2024-02-07 18:40 ` Barret Rhoden
2024-02-07 20:55 ` Alexei Starovoitov
2024-02-07 21:11 ` Barret Rhoden
2024-02-08 6:26 ` Alexei Starovoitov
2024-02-08 21:58 ` Barret Rhoden
2024-02-08 23:36 ` Alexei Starovoitov
2024-02-08 23:50 ` Barret Rhoden
2024-02-06 22:04 ` [PATCH bpf-next 05/16] bpf: Disasm support for cast_kern/user instructions Alexei Starovoitov
2024-02-06 22:04 ` [PATCH bpf-next 06/16] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions Alexei Starovoitov
2024-02-06 22:04 ` [PATCH bpf-next 07/16] bpf: Add x86-64 JIT support for bpf_cast_user instruction Alexei Starovoitov
2024-02-06 22:04 ` [PATCH bpf-next 08/16] bpf: Recognize cast_kern/user instructions in the verifier Alexei Starovoitov
2024-02-06 22:04 ` [PATCH bpf-next 09/16] bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA Alexei Starovoitov
2024-02-06 22:04 ` [PATCH bpf-next 10/16] libbpf: Add __arg_arena to bpf_helpers.h Alexei Starovoitov
2024-02-06 22:04 ` [PATCH bpf-next 11/16] libbpf: Add support for bpf_arena Alexei Starovoitov
2024-02-08 1:15 ` Andrii Nakryiko
2024-02-08 1:38 ` Alexei Starovoitov [this message]
2024-02-08 18:29 ` Andrii Nakryiko
2024-02-08 18:45 ` Alexei Starovoitov
2024-02-08 18:54 ` Andrii Nakryiko
2024-02-08 18:59 ` Alexei Starovoitov
2024-02-06 22:04 ` [PATCH bpf-next 12/16] libbpf: Allow specifying 64-bit integers in map BTF Alexei Starovoitov
2024-02-08 1:16 ` Andrii Nakryiko
2024-02-08 1:58 ` Alexei Starovoitov
2024-02-08 18:16 ` Andrii Nakryiko
2024-02-06 22:04 ` [PATCH bpf-next 13/16] bpf: Tell bpf programs kernel's PAGE_SIZE Alexei Starovoitov
2024-02-06 22:04 ` [PATCH bpf-next 14/16] bpf: Add helper macro bpf_arena_cast() Alexei Starovoitov
2024-02-06 22:04 ` [PATCH bpf-next 15/16] selftests/bpf: Add bpf_arena_list test Alexei Starovoitov
2024-02-07 17:04 ` Eduard Zingerman
2024-02-08 2:59 ` Alexei Starovoitov
2024-02-08 11:10 ` Jose E. Marchesi
2024-02-06 22:04 ` [PATCH bpf-next 16/16] selftests/bpf: Add bpf_arena_htab test Alexei Starovoitov
2024-02-07 12:34 ` [PATCH bpf-next 00/16] bpf: Introduce BPF arena Donald Hunter
2024-02-07 13:33 ` Barret Rhoden
2024-02-07 20:16 ` Alexei Starovoitov
2024-02-07 20:12 ` Alexei Starovoitov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAADnVQKjkba_wiUJ9wps_k8+TYu_q3Ai5oQ1mnZQmpv+pnPfFw@mail.gmail.com \
--to=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=brho@google.com \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=hannes@cmpxchg.org \
--cc=kernel-team@fb.com \
--cc=linux-mm@kvack.org \
--cc=martin.lau@kernel.org \
--cc=memxor@gmail.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox