From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: davem@davemloft.net, daniel@iogearbox.net, andrii@kernel.org,
tj@kernel.org, delyank@fb.com, linux-mm@kvack.org,
bpf@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v2 bpf-next 01/12] bpf: Introduce any context BPF specific memory allocator.
Date: Fri, 19 Aug 2022 10:51:55 -0700 [thread overview]
Message-ID: <20220819175155.deyd62m6tscv63td@MacBook-Pro-3.local> (raw)
In-Reply-To: <CAP01T76rn=FrSC9VA=mbEK7QKMu-88uZiiekpn6TqQ_Ea0-u5Q@mail.gmail.com>
On Fri, Aug 19, 2022 at 04:31:11PM +0200, Kumar Kartikeya Dwivedi wrote:
> On Fri, 19 Aug 2022 at 00:30, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > Right. We cannot fail in unit_free().
> > With a per-cpu counter both unit_alloc() and free_bulk_nmi() would
> > potentially fail in such an unlikely scenario.
> > Not a big deal for free_bulk_nmi(). It would pick up the element later.
> > For unit_alloc() returning NULL is normal.
> > Especially since it's so unlikely for an NMI to hit right in the middle
> > of llist_del_first().
> >
> > Since we'll add this per-cpu counter to solve the interrupted llist_del_first(),
> > it feels that the same counter can be used to protect unit_alloc/free/irq_work.
> > Then we don't need free_llist_nmi. A single free_llist would be fine,
> > but unit_free() must not fail. If free_llist cannot be accessed
> > because the per-cpu counter is busy, we have to push somewhere,
> > so it seems two lists are necessary. Maybe it's still better?
> > Roughly I'm thinking of the following:
> > Roughly I'm thinking of the following:
> > unit_alloc()
> > {
> >         struct llist_node *llnode = NULL;
> >         unsigned long flags;
> >
> >         local_irq_save(flags);
> >         if (__this_cpu_inc_return(c->alloc_active) != 1)
> >                 goto out;
> >         llnode = __llist_del_first(&c->free_llist);
> >         if (llnode)
> >                 cnt = --c->free_cnt;
> > out:
> >         __this_cpu_dec(c->alloc_active);
> >         local_irq_restore(flags);
> >         return llnode;
> > }
> > unit_free()
> > {
> >         local_irq_save(flags);
> >         if (__this_cpu_inc_return(c->alloc_active) != 1) {
> >                 /* interrupted llist_del_first(); use the NMI-safe list */
> >                 llist_add(llnode, &c->free_llist_nmi);
> >                 goto out;
> >         }
> >         __llist_add(llnode, &c->free_llist);
> >         cnt = ++c->free_cnt;
> > out:
> >         __this_cpu_dec(c->alloc_active);
> >         local_irq_restore(flags);
> > }
> > alloc_bulk() and free_bulk() would be protected by alloc_active as well.
> > alloc_bulk_nmi() is gone.
> > free_bulk_nmi() is still there to drain unlucky unit_free() calls,
> > but it's now the only one doing llist_del_first() and it just frees
> > anything that is on free_llist_nmi.
> > The main advantage is that free_llist_nmi doesn't need to be prefilled.
> > It will be empty most of the time.
> > wdyt?
>
> Looks great! The other option would be to not have the overflow
> free_llist_nmi list and just allow llist_add to free_llist from the
> NMI case even if we interrupt llist_del_first, but then the non-NMI
> user needs to use the atomic llist_add version as well (since we may
> interrupt it),
Not only llist_add: unit_alloc would have to use the atomic llist_del_first too.
So any operation on the list would have to use cmpxchg.
> which won't be great for performance.
exactly.
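To make the cost concrete, here's a user-space sketch of the two push flavors.
This is not the kernel's llist.h, just an approximation with C11 atomics
(llist_add_atomic and __llist_add_plain are made-up names for this sketch):
the always-NMI-safe push needs a cmpxchg retry loop on every call, while the
guarded variant is a plain load + store.

```c
#include <stdatomic.h>
#include <stddef.h>

struct llist_node { struct llist_node *next; };
struct llist_head { _Atomic(struct llist_node *) first; };

/* NMI-safe push: cmpxchg loop, in the spirit of the kernel's llist_add() */
static void llist_add_atomic(struct llist_node *n, struct llist_head *h)
{
	struct llist_node *first = atomic_load(&h->first);

	do {
		n->next = first;
	} while (!atomic_compare_exchange_weak(&h->first, &first, n));
}

/*
 * Non-atomic push: plain load + store. Only safe when nothing can touch
 * the list concurrently, which is exactly what the alloc_active guard
 * (plus irqs disabled) buys us.
 */
static void __llist_add_plain(struct llist_node *n, struct llist_head *h)
{
	n->next = atomic_load_explicit(&h->first, memory_order_relaxed);
	atomic_store_explicit(&h->first, n, memory_order_relaxed);
}
```

Both produce the same LIFO list; the difference is that without the guard
every producer and the consumer (llist_del_first) pay for the cmpxchg.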
> So having the
> extra list is much better.
yep. same thinking.
I'll refactor the patches and send v3 with this approach.
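fwiw, a quick user-space mock-up of the control flow above (field names
borrowed from the pseudocode; the per-cpu ops, irq disabling and the NMI
itself are simulated, so this only illustrates the logic, not kernel code):

```c
#include <stddef.h>

struct llist_node { struct llist_node *next; };

/* single-"cpu" mock-up of the proposed bpf_mem_cache fields */
struct cache_sim {
	struct llist_node *free_llist;     /* main list, guarded by alloc_active */
	struct llist_node *free_llist_nmi; /* overflow list, empty most of the time */
	int free_cnt;
	int alloc_active;                  /* stands in for the per-cpu counter */
};

static struct llist_node *unit_alloc(struct cache_sim *c)
{
	struct llist_node *llnode = NULL;

	/* local_irq_save() would go here in the real thing */
	if (++c->alloc_active == 1 && c->free_llist) {
		llnode = c->free_llist;
		c->free_llist = llnode->next;
		c->free_cnt--;
	}
	c->alloc_active--;
	/* local_irq_restore() */
	return llnode;          /* NULL on contention or empty list is fine */
}

static void unit_free(struct cache_sim *c, struct llist_node *llnode)
{
	if (++c->alloc_active != 1) {
		/*
		 * We interrupted someone mid llist_del_first():
		 * never fail, push to the overflow list instead,
		 * for free_bulk_nmi to drain later.
		 */
		llnode->next = c->free_llist_nmi;
		c->free_llist_nmi = llnode;
	} else {
		llnode->next = c->free_llist;
		c->free_llist = llnode;
		c->free_cnt++;
	}
	c->alloc_active--;
}
```

Setting alloc_active by hand before calling unit_free() simulates an NMI
landing inside the guarded section and shows the element diverting to
free_llist_nmi instead of failing.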
Thread overview: 19+ messages
2022-08-17 21:04 [PATCH v2 bpf-next 00/12] bpf: " Alexei Starovoitov
2022-08-17 21:04 ` [PATCH v2 bpf-next 01/12] bpf: Introduce any context " Alexei Starovoitov
2022-08-17 23:51 ` Kumar Kartikeya Dwivedi
2022-08-18 0:39 ` Alexei Starovoitov
2022-08-18 12:38 ` Kumar Kartikeya Dwivedi
2022-08-18 22:30 ` Alexei Starovoitov
2022-08-19 14:31 ` Kumar Kartikeya Dwivedi
2022-08-19 17:51 ` Alexei Starovoitov [this message]
2022-08-17 21:04 ` [PATCH v2 bpf-next 02/12] bpf: Convert hash map to bpf_mem_alloc Alexei Starovoitov
2022-08-17 21:04 ` [PATCH v2 bpf-next 03/12] selftests/bpf: Improve test coverage of test_maps Alexei Starovoitov
2022-08-17 21:04 ` [PATCH v2 bpf-next 04/12] samples/bpf: Reduce syscall overhead in map_perf_test Alexei Starovoitov
2022-08-17 21:04 ` [PATCH v2 bpf-next 05/12] bpf: Relax the requirement to use preallocated hash maps in tracing progs Alexei Starovoitov
2022-08-17 21:04 ` [PATCH v2 bpf-next 06/12] bpf: Optimize element count in non-preallocated hash map Alexei Starovoitov
2022-08-17 21:04 ` [PATCH v2 bpf-next 07/12] bpf: Optimize call_rcu " Alexei Starovoitov
2022-08-17 21:04 ` [PATCH v2 bpf-next 08/12] bpf: Adjust low/high watermarks in bpf_mem_cache Alexei Starovoitov
2022-08-17 21:04 ` [PATCH v2 bpf-next 09/12] bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU Alexei Starovoitov
2022-08-17 21:04 ` [PATCH v2 bpf-next 10/12] bpf: Add percpu allocation support to bpf_mem_alloc Alexei Starovoitov
2022-08-17 21:04 ` [PATCH v2 bpf-next 11/12] bpf: Convert percpu hash map to per-cpu bpf_mem_alloc Alexei Starovoitov
2022-08-17 21:04 ` [PATCH v2 bpf-next 12/12] bpf: Remove tracing program restriction on map types Alexei Starovoitov