From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Harry Yoo <harry.yoo@oracle.com>, bpf <bpf@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
Shakeel Butt <shakeel.butt@linux.dev>,
Michal Hocko <mhocko@suse.com>,
Sebastian Sewior <bigeasy@linutronix.de>,
Andrii Nakryiko <andrii@kernel.org>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Steven Rostedt <rostedt@goodmis.org>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH slab] slab: Disallow kprobes in ___slab_alloc()
Date: Tue, 16 Sep 2025 13:26:53 -0700
Message-ID: <CAADnVQL6xGz8=NTDs=3wPfaEqxUjfQE98h5Q2ex-iyRs4yemiw@mail.gmail.com>
In-Reply-To: <c370486e-cb8f-4201-b70e-2bdddab9e642@suse.cz>
On Tue, Sep 16, 2025 at 12:06 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> >>
> >> Hm I see. I wrongly reasoned as if NOKPROBE_SYMBOL(___slab_alloc) covers the
> >> whole scope of ___slab_alloc() but that's not the case. Thanks for clearing
> >> that up.
> >
> > hmm. NOKPROBE_SYMBOL(___slab_alloc) covers the whole function.
> > It disallows kprobes anywhere within the body,
> > but it doesn't make it 'notrace', so tracing the first nop5
> > is still ok.
>
> Yeah by "scope" I meant also whatever that function calls, i.e. the spinlock
> operations you mentioned (local_lock_irqsave()). That's not part of the
> ___slab_alloc() body so you're right we have not eliminated it.
Ahh. Yes. The functions that ___slab_alloc() calls
are not affected by NOKPROBE_SYMBOL() and that's ok.
There are no calls in the middle of the freelist update.
> >>
> >> But with nmi that's variant of #1 of that comment.
> >>
> >> Like for ___slab_alloc() we need to prevent #2 with no nmi?
> >> example on !RT:
> >>
> >> kmalloc() -> ___slab_alloc() -> irqsave -> tracepoint/kprobe -> bpf ->
> >> kfree_nolock() -> do_slab_free()
> >>
> >> in_nmi() || !USE_LOCKLESS_FAST_PATH()
> >> false || false, we proceed, no checking of local_lock_is_locked()
> >>
> >> if (USE_LOCKLESS_FAST_PATH()) { - true (!RT)
> >> -> __update_cpu_freelist_fast()
> >>
> >> Am I missing something?
> >
> > It's ok to call __update_cpu_freelist_fast(). It won't break anything.
> > Because only nmi can make this cpu to be in the middle of freelist update.
>
> You're right, freeing uses the "slowpath" (local_lock protected instead of
> cmpxchg16b) c->freelist manipulation only on RT. So we can't preempt it with
> a kprobe on !RT because it doesn't exist there at all. The only one is in
> ___slab_alloc() and that's covered.
yep.
> I do really hope we'll get to the point where sheaves will be able to
> replace the other percpu slab caching completely... it should be much simpler.
+1.
Since we're talking about long term plans...
it would be really cool to have per-numa allocators.
Like per-cpu alloc, but per-numa, with a corresponding this_numa_add()
that uses atomic_add underneath.
Regular atomics shared across all nodes are becoming quite slow
on modern cpus, while per-cpu counters are too expensive in terms of memory.
per-numa could be a middle ground with fast enough operations
and good memory usage.
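To make the per-numa idea a bit more concrete, here is a very rough
userspace C11 sketch. Nothing like this exists in the kernel today.
MAX_NODES, CACHELINE, struct numa_counter, numa_counter_init() and
numa_counter_sum() are made-up names for illustration, and
this_numa_add() takes the node id as an explicit argument here, where
a kernel version would presumably derive it from numa_node_id().

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>

#define MAX_NODES 4	/* hypothetical node count for this sketch */
#define CACHELINE 64	/* assumed cache line size */

/* One slot per NUMA node, padded so that nodes never share a cache line. */
struct numa_slot {
	alignas(CACHELINE) atomic_long val;
};

struct numa_counter {
	struct numa_slot node[MAX_NODES];
};

static inline void numa_counter_init(struct numa_counter *c)
{
	for (int i = 0; i < MAX_NODES; i++)
		atomic_init(&c->node[i].val, 0);
}

/*
 * Hypothetical this_numa_add(): an atomic add that only touches the
 * slot of the given node, so the cache line stays within that node.
 */
static inline void this_numa_add(struct numa_counter *c, int nid, long delta)
{
	assert(nid >= 0 && nid < MAX_NODES);
	atomic_fetch_add_explicit(&c->node[nid].val, delta,
				  memory_order_relaxed);
}

/* Readers sum all slots; the result is approximate under concurrent updates. */
static inline long numa_counter_sum(struct numa_counter *c)
{
	long sum = 0;

	for (int i = 0; i < MAX_NODES; i++)
		sum += atomic_load_explicit(&c->node[i].val,
					    memory_order_relaxed);
	return sum;
}
```

Memory cost is one cache line per node instead of one slot per cpu,
and the update is an atomic whose contention is limited to cpus of
the same node.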