From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
bpf <bpf@vger.kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Eddy Z <eddyz87@gmail.com>, Tejun Heo <tj@kernel.org>,
Barret Rhoden <brho@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Lorenzo Stoakes <lstoakes@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Uladzislau Rezki <urezki@gmail.com>,
linux-mm <linux-mm@kvack.org>, Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH v2 bpf-next 04/20] mm: Expose vmap_pages_range() to the rest of the kernel.
Date: Thu, 15 Feb 2024 12:50:55 -0800
Message-ID: <CAADnVQJ8azcUznU6KHhwEM99NUOx8oai8EOyay4dxLM6ho8mjw@mail.gmail.com>
In-Reply-To: <Zc22DluhMNk5_Zfn@infradead.org>
On Wed, Feb 14, 2024 at 10:58 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Feb 14, 2024 at 12:53:42PM -0800, Alexei Starovoitov wrote:
> > On Wed, Feb 14, 2024 at 12:36 AM Christoph Hellwig <hch@infradead.org> wrote:
> > >
> > > NAK. Please
> >
> > What is the alternative?
> > Remember, maintainers cannot tell developers "go away".
> > They must suggest a different path.
>
> That criterion is something you've made up.
I didn't invent it. I internalized it based on the feedback received.
> Telling that something
> is not ok is the most important job of not just maintainers but all
> developers.
I'm not saying that maintainers should not say "no",
I'm saying that maintainers should say "no", understand the problem
being solved, and suggest an alternative.
> Maybe start with a description of the problem you're
> solving and why you think it matters and needs different APIs.
bpf_arena doesn't need a different API. The 5 APIs listed below are enough.
I'm saying that vmap_pages_range() is equivalent to apply_to_page_range()
for all practical purposes.
Since apply_to_page_range() is already available to the rest of the kernel
(xen, gpu, kasan, etc), I see no reason why vmap_pages_range() shouldn't be
available as well, because it can be emulated like this:

struct vmap_ctx {
	struct page **pages;
	int idx;
};

static int __for_each_pte(pte_t *ptep, unsigned long addr, void *data)
{
	struct vmap_ctx *ctx = data;
	struct page *page = ctx->pages[ctx->idx++];

	/* TODO: sanity checks here */
	set_pte_at(&init_mm, addr, ptep, mk_pte(page, PAGE_KERNEL));
	return 0;
}

static int vmap_pages_range_hack(unsigned long addr, unsigned long end,
				 struct page **pages)
{
	struct vmap_ctx ctx = { .pages = pages };

	return apply_to_page_range(&init_mm, addr, end - addr,
				   __for_each_pte, &ctx);
}

Am I missing anything?
> > . get_vm_area - external
> > . free_vm_area - EXPORT_SYMBOL_GPL
> > . vunmap_range - external
> > . vmalloc_to_page - EXPORT_SYMBOL
> > . apply_to_page_range - EXPORT_SYMBOL_GPL
> >
> > and the last one is pretty much equivalent to vmap_pages_range,
> > hence I'm surprised by push back to make vmap_pages_range available to bpf.
>
> And the last we've been trying to get rid of for ages because we don't
> want random modules to
Get rid of EXPORT_SYMBOL from it? Fine by me.
Or are you saying that you have a plan to replace apply_to_page_range()
with something else? With what?
> > > > For example, there is the public ioremap_page_range(), which is used
> > > > to map device memory into addressable kernel space.
> > >
> > > It's not really public. It's a helper for the ioremap implementation
> > > which really should not be arch specific to start with and are in
> > > the process of being consolidated into common code.
> >
> > Any link to such consolidation of ioremap ? I couldn't find one.
>
> Second hit on google:
>
> https://lore.kernel.org/lkml/20230609075528.9390-1-bhe@redhat.com/T/
Thanks.
It sounded like you were referring to some future work.
The series that landed was a good cleanup.
No questions about it.
> > I surely don't want bpf_arena to cause headaches to mm folks.
> >
> > Anyway, ioremap_page_range() was just an example.
> > I could have used vmap() as an equivalent example.
> > vmap is EXPORT_SYMBOL, btw.
>
> vmap is a good well defined API. vmap_pages_range is not.
Since vmap() is nothing but get_vm_area() + vmap_pages_range()
and a few checks, I'm missing the point.
Please elaborate.
> > What bpf_arena needs is pretty much vmap(), but instead of
> > allocating all pages in advance, allocate them and insert on demand.
>
> So propose an API that does that instead of exposing random low-level
> details.
The generic_ioremap_prot() and vmap() APIs make sense for the cases
when physical memory of a known size already exists. It needs to be
vmap-ed once and not touched afterwards.
bpf_arena use case is similar to kasan which
reserves a giant virtual memory region, and then
does apply_to_page_range() to populate certain pte-s with pages in that region,
and later apply_to_existing_page_range() to free pages in kasan's region.
bpf_arena is very similar, except it currently calls get_vm_area()
to get a 4GB + guard pages region, and then vmap_pages_range() to
populate a page in it, and vunmap_range() to remove a page.
These existing api-s work, so not sure what you're requesting.
I can guess many different things, but pls clarify to reduce
this back and forth.
Are you worried about range checking? That vmap_pages_range()
can accidentally hit an unintended range?
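
To make the range-checking question concrete, here is a rough userspace
analogy of the reserve-then-populate pattern described above. It is only a
sketch under stated assumptions: it uses mmap(), mprotect() and madvise() on
Linux rather than any of the kernel functions named in this thread, and
struct arena, arena_reserve(), arena_range_ok(), arena_populate() and
arena_depopulate() are hypothetical names made up for illustration. They are
not part of bpf_arena or of the mm API.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical userspace stand-in for a reserved virtual range. */
struct arena {
	unsigned char *base;	/* start of the reserved range */
	size_t size;		/* length of the range in bytes */
	size_t page_size;	/* system page size in bytes */
};

/* Reserve address space only; no page in it is accessible yet. */
static int arena_reserve(struct arena *a, size_t npages)
{
	long ps = sysconf(_SC_PAGESIZE);
	void *p;

	if (ps <= 0 || npages == 0 || npages > SIZE_MAX / (size_t)ps)
		return -1;
	p = mmap(NULL, npages * (size_t)ps, PROT_NONE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	a->base = p;
	a->size = npages * (size_t)ps;
	a->page_size = (size_t)ps;
	return 0;
}

/* Range check: [first_page, first_page + npages) must lie inside the arena. */
static int arena_range_ok(const struct arena *a, size_t first_page,
			  size_t npages)
{
	size_t total;

	assert(a->page_size != 0);
	total = a->size / a->page_size;
	return npages != 0 && first_page < total &&
	       npages <= total - first_page;
}

/* Make pages usable on demand, but only inside the reserved range. */
static int arena_populate(struct arena *a, size_t first_page, size_t npages)
{
	if (!arena_range_ok(a, first_page, npages))
		return -1;
	return mprotect(a->base + first_page * a->page_size,
			npages * a->page_size, PROT_READ | PROT_WRITE);
}

/* Drop the backing memory and make the pages inaccessible again. */
static int arena_depopulate(struct arena *a, size_t first_page, size_t npages)
{
	unsigned char *p;
	size_t len;

	if (!arena_range_ok(a, first_page, npages))
		return -1;
	p = a->base + first_page * a->page_size;
	len = npages * a->page_size;
	if (madvise(p, len, MADV_DONTNEED))
		return -1;
	return mprotect(p, len, PROT_NONE);
}
```

In this sketch every populate and depopulate call goes through one check
against the bounds recorded at reservation time, so a request that starts or
ends outside the reserved range is rejected before any mapping is changed.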
btw the cover letter and patch 5 explain the higher level motivation
from bpf pov in detail.
There was a bunch of feedback on that patch, which was addressed,
and the latest version is here:
https://git.kernel.org/pub/scm/linux/kernel/git/ast/bpf.git/commit/?h=arena&id=a752b4122071adb5307d7ab3ae6736a9a0e45317