linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	bpf <bpf@vger.kernel.org>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	 Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Eddy Z <eddyz87@gmail.com>, Tejun Heo <tj@kernel.org>,
	 Barret Rhoden <brho@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	 Lorenzo Stoakes <lstoakes@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Uladzislau Rezki <urezki@gmail.com>,
	linux-mm <linux-mm@kvack.org>, Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH v2 bpf-next 04/20] mm: Expose vmap_pages_range() to the rest of the kernel.
Date: Thu, 15 Feb 2024 12:50:55 -0800	[thread overview]
Message-ID: <CAADnVQJ8azcUznU6KHhwEM99NUOx8oai8EOyay4dxLM6ho8mjw@mail.gmail.com> (raw)
In-Reply-To: <Zc22DluhMNk5_Zfn@infradead.org>

On Wed, Feb 14, 2024 at 10:58 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Feb 14, 2024 at 12:53:42PM -0800, Alexei Starovoitov wrote:
> > On Wed, Feb 14, 2024 at 12:36 AM Christoph Hellwig <hch@infradead.org> wrote:
> > >
> > > NAK.  Please
> >
> > What is the alternative?
> > Remember, maintainers cannot tell developers "go away".
> > They must suggest a different path.
>
> That criteria is something you've made up.

I didn't invent it. I internalized it based on the feedback received.

> Telling that something
> is not ok is the most important job of not just maintainers but all
> developers.

I'm not saying that maintainers should not say "no",
I'm saying that maintainers should say "no", understand the problem
being solved, and suggest an alternative.

> Maybe start with a description of the problem you're
> solving and why you think it matters and needs different APIs.

bpf_arena doesn't need a different api. These 5 api-s below are enough.
I'm saying that vmap_pages_range() is equivalent to apply_to_page_range()
for all practical purposes.
So, since apply_to_page_range() is available to the kernel
(xen, gpu, kasan, etc) then I see no reason why
vmap_pages_range() shouldn't be available as well, since:

struct vmap_ctx {
     struct page **pages;
     int idx;
};

static int __for_each_pte(pte_t *ptep, unsigned long addr, void *data)
{
     struct vmap_ctx *ctx = data;
     struct page *page = ctx->pages[ctx->idx++];

     /* TODO: sanity checks here */
     set_pte_at(&init_mm, addr, ptep, mk_pte(page, PAGE_KERNEL));
     return 0;
}

static int vmap_pages_range_hack(unsigned long addr, unsigned long end,
                                 struct page **pages)
{
    struct vmap_ctx ctx = { .pages = pages };

    return apply_to_page_range(&init_mm, addr, end - addr,
__for_each_pte, &ctx);
}

Anything I miss?

> > . get_vm_area - external
> > . free_vm_area - EXPORT_SYMBOL_GPL
> > . vunmap_range - external
> > . vmalloc_to_page - EXPORT_SYMBOL
> > . apply_to_page_range - EXPORT_SYMBOL_GPL
> >
> > and the last one is pretty much equivalent to vmap_pages_range,
> > hence I'm surprised by push back to make vmap_pages_range available to bpf.
>
> And the last we've been trying to get rid of by ages because we don't
> want random modules to

Get rid of EXPORT_SYMBOL from it? Fine by me.
Or you're saying that you have a plan to replace apply_to_page_range()
with something else ? With what ?

> > > > For example, there is the public ioremap_page_range(), which is used
> > > > to map device memory into addressable kernel space.
> > >
> > > It's not really public.  It's a helper for the ioremap implementation
> > > which really should not be arch specific to start with and are in
> > > the process of beeing consolidatd into common code.
> >
> > Any link to such consolidation of ioremap ? I couldn't find one.
>
> Second hit on google:
>
> https://lore.kernel.org/lkml/20230609075528.9390-1-bhe@redhat.com/T/

Thanks.
It sounded like you were referring to some future work.
The series that landed was a good cleanup.
No questions about it.

> > I surely don't want bpf_arena to cause headaches to mm folks.
> >
> > Anyway, ioremap_page_range() was just an example.
> > I could have used vmap() as an equivalent example.
> > vmap is EXPORT_SYMBOL, btw.
>
> vmap is a good well defined API.  vmap_pages_range is not.

since vmap() is nothing but get_vm_area() + vmap_pages_range()
and few checks... I'm missing the point.
Pls elaborate.

> > What bpf_arena needs is pretty much vmap(), but instead of
> > allocating all pages in advance, allocate them and insert on demand.
>
> So propose an API that does that instead of exposing random low-level
> details.

The generic_ioremap_prot() and vmap() APIs make sense for the cases
when phys memory exists with known size. It needs to vmap-ed and
not touched after.
bpf_arena use case is similar to kasan which
reserves a giant virtual memory region, and then
does apply_to_page_range() to populate certain pte-s with pages in that region,
and later apply_to_existing_page_range() to free pages in kasan's region.

bpf_arena is very similar, except it currently calls get_vm_area()
to get a 4Gb+guard_pages region, and then vmap_pages_range() to
populate a page in it, and vunmap_range() to remove a page.

These existing api-s work, so not sure what you're requesting.
I can guess many different things, but pls clarify to reduce
this back and forth.
Are you worried about range checking? That vmap_pages_range()
can accidently hit an unintended range?

btw the cover letter and patch 5 explain the higher level motivation
from bpf pov in detail.
There was a bunch of feedback on that patch, which was addressed,
and the latest version is here:
https://git.kernel.org/pub/scm/linux/kernel/git/ast/bpf.git/commit/?h=arena&id=a752b4122071adb5307d7ab3ae6736a9a0e45317


  reply	other threads:[~2024-02-15 20:51 UTC|newest]

Thread overview: 93+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-09  4:05 [PATCH v2 bpf-next 00/20] bpf: Introduce BPF arena Alexei Starovoitov
2024-02-09  4:05 ` [PATCH v2 bpf-next 01/20] bpf: Allow kfuncs return 'void *' Alexei Starovoitov
2024-02-10  6:49   ` Kumar Kartikeya Dwivedi
2024-02-09  4:05 ` [PATCH v2 bpf-next 02/20] bpf: Recognize '__map' suffix in kfunc arguments Alexei Starovoitov
2024-02-09  4:05 ` [PATCH v2 bpf-next 03/20] bpf: Plumb get_unmapped_area() callback into bpf_map_ops Alexei Starovoitov
2024-02-09  4:05 ` [PATCH v2 bpf-next 04/20] mm: Expose vmap_pages_range() to the rest of the kernel Alexei Starovoitov
2024-02-14  8:36   ` Christoph Hellwig
2024-02-14 20:53     ` Alexei Starovoitov
2024-02-15  6:58       ` Christoph Hellwig
2024-02-15 20:50         ` Alexei Starovoitov [this message]
2024-02-15 21:26           ` Linus Torvalds
2024-02-16  9:31           ` Christoph Hellwig
2024-02-16 16:54             ` Alexei Starovoitov
2024-02-16 17:18               ` Uladzislau Rezki
2024-02-18  2:06                 ` Alexei Starovoitov
2024-02-20  6:57               ` Christoph Hellwig
2024-02-09  4:05 ` [PATCH v2 bpf-next 05/20] bpf: Introduce bpf_arena Alexei Starovoitov
2024-02-09 20:36   ` David Vernet
2024-02-10  4:38     ` Alexei Starovoitov
2024-02-12 15:56   ` Barret Rhoden
2024-02-12 18:23     ` Alexei Starovoitov
     [not found]   ` <CAP01T75y-E8qjMpn_9E-k8H0QpPdjvYx9MMgx6cxGfmdVat+Xw@mail.gmail.com>
2024-02-12 18:21     ` Alexei Starovoitov
2024-02-13 23:14   ` Andrii Nakryiko
2024-02-13 23:29     ` Alexei Starovoitov
2024-02-14  0:03       ` Andrii Nakryiko
2024-02-14  0:14         ` Alexei Starovoitov
2024-02-09  4:05 ` [PATCH v2 bpf-next 06/20] bpf: Disasm support for cast_kern/user instructions Alexei Starovoitov
2024-02-09  4:05 ` [PATCH v2 bpf-next 07/20] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions Alexei Starovoitov
2024-02-09 17:20   ` Eduard Zingerman
2024-02-13 22:20     ` Alexei Starovoitov
     [not found]   ` <CAP01T75sq=G5pfYvsYuxfdoFGOqSGrNcamCyA0posFA9pxNWRA@mail.gmail.com>
2024-02-13 22:00     ` Alexei Starovoitov
2024-02-09  4:05 ` [PATCH v2 bpf-next 08/20] bpf: Add x86-64 JIT support for bpf_cast_user instruction Alexei Starovoitov
2024-02-10  1:15   ` Eduard Zingerman
     [not found]   ` <CAP01T76JMbnS3PSpontzWmtSZ9cs97yO772R8zpWH-eHXviLSA@mail.gmail.com>
2024-02-13 22:28     ` Alexei Starovoitov
2024-02-09  4:05 ` [PATCH v2 bpf-next 09/20] bpf: Recognize cast_kern/user instructions in the verifier Alexei Starovoitov
2024-02-10  1:13   ` Eduard Zingerman
2024-02-13  2:58     ` Alexei Starovoitov
2024-02-13 12:01       ` Eduard Zingerman
2024-02-09  4:05 ` [PATCH v2 bpf-next 10/20] bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA Alexei Starovoitov
2024-02-13 23:14   ` Andrii Nakryiko
2024-02-14  0:26     ` Alexei Starovoitov
2024-02-09  4:05 ` [PATCH v2 bpf-next 11/20] libbpf: Add __arg_arena to bpf_helpers.h Alexei Starovoitov
2024-02-13 23:14   ` Andrii Nakryiko
2024-02-09  4:06 ` [PATCH v2 bpf-next 12/20] libbpf: Add support for bpf_arena Alexei Starovoitov
2024-02-12 18:12   ` Eduard Zingerman
2024-02-12 20:14     ` Alexei Starovoitov
2024-02-12 20:21       ` Eduard Zingerman
     [not found]   ` <CAP01T761B1+paMwrQesjX+zqFwQp8iUzLORueTjTLSHPbJ+0fQ@mail.gmail.com>
2024-02-12 19:11     ` Andrii Nakryiko
2024-02-13 23:15   ` Andrii Nakryiko
2024-02-14  0:32     ` Alexei Starovoitov
2024-02-09  4:06 ` [PATCH v2 bpf-next 13/20] libbpf: Allow specifying 64-bit integers in map BTF Alexei Starovoitov
2024-02-12 18:58   ` Eduard Zingerman
2024-02-13 23:15   ` Andrii Nakryiko
2024-02-14  0:47     ` Alexei Starovoitov
2024-02-14  0:51       ` Andrii Nakryiko
2024-02-09  4:06 ` [PATCH v2 bpf-next 14/20] libbpf: Recognize __arena global varaibles Alexei Starovoitov
2024-02-13  0:34   ` Eduard Zingerman
2024-02-13  0:44     ` Alexei Starovoitov
2024-02-13  0:49       ` Eduard Zingerman
2024-02-13  2:08         ` Alexei Starovoitov
2024-02-13 12:48           ` Eduard Zingerman
2024-02-13 23:11   ` Eduard Zingerman
2024-02-13 23:17     ` Andrii Nakryiko
2024-02-13 23:36       ` Eduard Zingerman
2024-02-14  0:09         ` Andrii Nakryiko
2024-02-14  0:16           ` Eduard Zingerman
2024-02-14  0:29             ` Andrii Nakryiko
2024-02-14  1:24           ` Alexei Starovoitov
2024-02-14 17:24             ` Andrii Nakryiko
2024-02-15 23:22               ` Andrii Nakryiko
2024-02-16  2:45                 ` Alexei Starovoitov
2024-02-16  4:51                   ` Andrii Nakryiko
2024-02-14  1:02     ` Alexei Starovoitov
2024-02-14 15:10       ` Eduard Zingerman
2024-02-13 23:15   ` Andrii Nakryiko
2024-02-09  4:06 ` [PATCH v2 bpf-next 15/20] bpf: Tell bpf programs kernel's PAGE_SIZE Alexei Starovoitov
2024-02-09  4:06 ` [PATCH v2 bpf-next 16/20] bpf: Add helper macro bpf_arena_cast() Alexei Starovoitov
     [not found]   ` <CAP01T743Mzfi9+2yMjB5+m2jpBLvij_tLyLFptkOpCekUn=soA@mail.gmail.com>
2024-02-13 22:35     ` Alexei Starovoitov
2024-02-14 16:47       ` Eduard Zingerman
2024-02-14 17:45         ` Alexei Starovoitov
2024-02-09  4:06 ` [PATCH v2 bpf-next 17/20] selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages Alexei Starovoitov
2024-02-09 23:14   ` David Vernet
2024-02-10  4:35     ` Alexei Starovoitov
2024-02-12 16:48       ` David Vernet
     [not found]       ` <CAP01T75qCUabu4-18nYwRDnSyTTgeAgNN3kePY5PXdnoTKt+Cg@mail.gmail.com>
2024-02-13 23:19         ` Alexei Starovoitov
2024-02-09  4:06 ` [PATCH v2 bpf-next 18/20] selftests/bpf: Add bpf_arena_list test Alexei Starovoitov
2024-02-09  4:06 ` [PATCH v2 bpf-next 19/20] selftests/bpf: Add bpf_arena_htab test Alexei Starovoitov
2024-02-09  4:06 ` [PATCH v2 bpf-next 20/20] selftests/bpf: Convert simple page_frag allocator to per-cpu Alexei Starovoitov
     [not found]   ` <CAP01T74x-N71rbS+jZ2z+3MPMe5WDeWKV_gWJmDCikV0YOpPFQ@mail.gmail.com>
2024-02-14  1:37     ` Alexei Starovoitov
2024-02-12 14:14 ` [PATCH v2 bpf-next 00/20] bpf: Introduce BPF arena David Hildenbrand
2024-02-12 18:14   ` Alexei Starovoitov
2024-02-13 10:35     ` David Hildenbrand
2024-02-12 17:36 ` Barret Rhoden

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAADnVQJ8azcUznU6KHhwEM99NUOx8oai8EOyay4dxLM6ho8mjw@mail.gmail.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brho@google.com \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=hannes@cmpxchg.org \
    --cc=hch@infradead.org \
    --cc=kernel-team@fb.com \
    --cc=linux-mm@kvack.org \
    --cc=lstoakes@gmail.com \
    --cc=memxor@gmail.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox