linux-mm.kvack.org archive mirror
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Harry Yoo <harry.yoo@oracle.com>, bpf <bpf@vger.kernel.org>,
	 linux-mm <linux-mm@kvack.org>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	 Michal Hocko <mhocko@suse.com>,
	Sebastian Sewior <bigeasy@linutronix.de>,
	 Andrii Nakryiko <andrii@kernel.org>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	 Steven Rostedt <rostedt@goodmis.org>,
	Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH slab] slab: Disallow kprobes in ___slab_alloc()
Date: Tue, 16 Sep 2025 13:26:53 -0700
Message-ID: <CAADnVQL6xGz8=NTDs=3wPfaEqxUjfQE98h5Q2ex-iyRs4yemiw@mail.gmail.com>
In-Reply-To: <c370486e-cb8f-4201-b70e-2bdddab9e642@suse.cz>

On Tue, Sep 16, 2025 at 12:06 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> >>
> >> Hm I see. I wrongly reasoned as if NOKPROBE_SYMBOL(___slab_alloc) covers the
> >> whole scope of ___slab_alloc() but that's not the case. Thanks for clearing
> >> that up.
> >
> > hmm. NOKPROBE_SYMBOL(___slab_alloc) covers the whole function.
> > It disallows kprobes anywhere within the body,
> > but it doesn't make it 'notrace', so tracing the first nop5
> > is still ok.
>
> Yeah by "scope" I meant also whatever that function calls, i.e. the spinlock
> operations you mentioned (local_lock_irqsave()). That's not part of the
> ___slab_alloc() body so you're right we have not eliminated it.

Ahh. Yes. None of the functions that ___slab_alloc() calls
are affected, and that's ok:
there are no calls in the middle of the freelist update.

> >>
> >> But with nmi that's a variant of #1 of that comment.
> >>
> >> Like for ___slab_alloc() we need to prevent #2 with no nmi?
> >> example on !RT:
> >>
> >> kmalloc() -> ___slab_alloc() -> irqsave -> tracepoint/kprobe -> bpf ->
> >> kfree_nolock() -> do_slab_free()
> >>
> >> in_nmi() || !USE_LOCKLESS_FAST_PATH()
> >> false || false, we proceed, no checking of local_lock_is_locked()
> >>
> >> if (USE_LOCKLESS_FAST_PATH()) { - true (!RT)
> >> -> __update_cpu_freelist_fast()
> >>
> >> Am I missing something?
> >
> > It's ok to call __update_cpu_freelist_fast(). It won't break anything,
> > because only an nmi can catch this cpu in the middle of a freelist update.
>
> You're right, freeing uses the "slowpath" (local_lock protected instead of
> cmpxchg16b) c->freelist manipulation only on RT. So we can't preempt it with
> a kprobe on !RT because it doesn't exist there at all. The only one is in
> ___slab_alloc() and that's covered.

yep.
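
To make the case analysis above concrete, here is a small userspace
model of the checks as they are quoted in this thread. It is a sketch,
not the kernel source: struct ctx, enum free_path and pick_free_path()
are hypothetical names, and what the real code does when the lock is
found held is not spelled out here, so the model only labels that case
as a fallback.

```c
#include <stdbool.h>

/* Hypothetical model of the quoted checks; not the actual slub code. */
enum free_path {
	FREE_FAST_CMPXCHG,	/* lockless update, as in __update_cpu_freelist_fast() */
	FREE_LOCKED_SLOWPATH,	/* local_lock protected c->freelist update */
	FREE_FALLBACK,		/* lock already held: fast/slow update must not run */
};

struct ctx {
	bool in_nmi;			/* models in_nmi() */
	bool lockless_fast_path;	/* models USE_LOCKLESS_FAST_PATH(): true on !RT */
	bool local_lock_is_locked;	/* models local_lock_is_locked() */
};

static enum free_path pick_free_path(struct ctx c)
{
	/* The lock state is only consulted in nmi or when there is no fast path. */
	if ((c.in_nmi || !c.lockless_fast_path) && c.local_lock_is_locked)
		return FREE_FALLBACK;

	if (c.lockless_fast_path)
		return FREE_FAST_CMPXCHG;

	return FREE_LOCKED_SLOWPATH;
}
```

In this model the !RT, non-nmi case goes to the cmpxchg path even when
the lock is held, which is the "false || false, we proceed" case from
the quoted example.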

> I do really hope we'll get to the point where sheaves will be able to
> replace the other percpu slab caching completely... it should be much simpler.

+1.
Since we're talking about long-term plans...
it would be really cool to have per-numa allocators.
Like per-cpu alloc, but per-numa, with a corresponding this_numa_add()
that would use atomic_add underneath.
Regular atomics shared across all nodes are becoming quite slow
on modern cpus, while per-cpu counters are too expensive from a memory pov.
per-numa could be a middle ground with fast enough operations
and good memory usage.
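
A rough userspace sketch of what such a counter could look like, using
C11 atomics. Nothing like this exists today: struct numa_counter,
numa_counter_add() and numa_counter_sum() are made-up names, NR_NODES
and the 64-byte cache line are assumptions, and the caller passes the
node id explicitly, where a kernel this_numa_add() would presumably
look it up with something like numa_node_id().

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_NODES	4	/* assumption: fixed node count for the sketch */
#define CACHELINE	64	/* assumption: cache line size in bytes */

/* One atomic slot per node, each on its own cache line. */
struct numa_counter {
	struct {
		_Alignas(CACHELINE) atomic_long val;
	} node[NR_NODES];
};

static void numa_counter_init(struct numa_counter *c)
{
	for (int i = 0; i < NR_NODES; i++)
		atomic_init(&c->node[i].val, 0);
}

/* Update only the slot of the given node, so the atomic stays node-local. */
static void numa_counter_add(struct numa_counter *c, int node, long delta)
{
	assert(node >= 0 && node < NR_NODES);
	atomic_fetch_add_explicit(&c->node[node].val, delta,
				  memory_order_relaxed);
}

/* Readers pay for the cross-node traffic: sum all slots. */
static long numa_counter_sum(struct numa_counter *c)
{
	long sum = 0;

	for (int i = 0; i < NR_NODES; i++)
		sum += atomic_load_explicit(&c->node[i].val,
					    memory_order_relaxed);
	return sum;
}
```

Memory cost is one cache line per node instead of one slot per cpu, and
the update is an atomic because several cpus on a node share the slot.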

