From: Harry Yoo <harry.yoo@oracle.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
bpf@vger.kernel.org, linux-mm@kvack.org, shakeel.butt@linux.dev,
mhocko@suse.com, bigeasy@linutronix.de, andrii@kernel.org,
memxor@gmail.com, akpm@linux-foundation.org,
peterz@infradead.org, rostedt@goodmis.org, hannes@cmpxchg.org
Subject: Re: [PATCH slab] slab: Disallow kprobes in ___slab_alloc()
Date: Tue, 16 Sep 2025 21:58:04 +0900
Message-ID: <aMlZ8Au2MBikJgta@hyeyoo>
In-Reply-To: <47aca3ca-a65b-4c0b-aaff-3a7bb6e484fe@suse.cz>

On Tue, Sep 16, 2025 at 12:40:12PM +0200, Vlastimil Babka wrote:
> On 9/16/25 04:21, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Disallow kprobes in ___slab_alloc() to prevent reentrance:
> > kmalloc() -> ___slab_alloc() -> local_lock_irqsave() ->
> > kprobe -> bpf -> kmalloc_nolock().
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>
> I wanted to fold this in "slab: Introduce kmalloc_nolock() and kfree_nolock()."
> and update comments to explain the NOKPROBE_SYMBOL(___slab_alloc);
>
> But now I'm not sure we still need to invent the lockdep classes for PREEMPT_RT:
>
> > /*
> > * ___slab_alloc()'s caller is supposed to check if kmem_cache::kmem_cache_cpu::lock
> > * can be acquired without a deadlock before invoking the function.
> > *
> > * Without LOCKDEP we trust the code to be correct. kmalloc_nolock() is
> > * using local_lock_is_locked() properly before calling local_lock_cpu_slab(),
> > * and kmalloc() is not used in an unsupported context.
> > *
> > * With LOCKDEP, on PREEMPT_RT lockdep does its checking in local_lock_irqsave().
> > * On !PREEMPT_RT we use trylock to avoid false positives in NMI, but
> > * lockdep_assert() will catch a bug in case:
> > * #1
> > * kmalloc() -> ___slab_alloc() -> irqsave -> NMI -> bpf -> kmalloc_nolock()
> > * or
> > * #2
> > * kmalloc() -> ___slab_alloc() -> irqsave -> tracepoint/kprobe -> bpf -> kmalloc_nolock()
>
> AFAICS we have now eliminated this possibility.
Right.
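
IOW the fix boils down to the blacklist annotation; a minimal sketch of
what the patch adds to mm/slub.c (exact placement may differ):

  #include <linux/kprobes.h>

  /*
   * Blacklist ___slab_alloc() for kprobes so that a kprobe handler
   * (and any bpf program attached to it) can never fire inside the
   * slow path while the cpu_slab lock is held.
   */
  NOKPROBE_SYMBOL(___slab_alloc);

With that, the kprobe -> bpf -> kmalloc_nolock() reentry above can no
longer start from inside ___slab_alloc().
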
> > * On PREEMPT_RT an invocation is not possible from IRQ-off or
> > * preempt-disabled context. The lock will always be acquired and,
> > * if needed, it will block and sleep until the lock is available.
> > * #1 is possible in !PREEMPT_RT only.
>
> Yes because this in kmalloc_nolock_noprof()
>
> if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
> /* kmalloc_nolock() in PREEMPT_RT is not supported from irq */
> return NULL;
>
>
> > * #2 is possible in both with a twist that irqsave is replaced with rt_spinlock:
> > * kmalloc() -> ___slab_alloc() -> rt_spin_lock(kmem_cache_A) ->
> > * tracepoint/kprobe -> bpf -> kmalloc_nolock() -> rt_spin_lock(kmem_cache_B)
> And this is no longer possible, so can we just remove these comments and drop
> "slab: Make slub local_(try)lock more precise for LOCKDEP" now?
Makes sense and sounds good to me.
Also, the commit message should be adjusted too:
> kmalloc_nolock() can be called from any context and can re-enter
> into ___slab_alloc():
> kmalloc() -> ___slab_alloc(cache_A) -> irqsave -> NMI -> bpf ->
> kmalloc_nolock() -> ___slab_alloc(cache_B)
> or
> kmalloc() -> ___slab_alloc(cache_A) -> irqsave -> tracepoint/kprobe -> bpf ->
> kmalloc_nolock() -> ___slab_alloc(cache_B)
The latter path is not possible anymore, now that kprobes are
disallowed in ___slab_alloc().
> Similarly, in PREEMPT_RT local_lock_is_locked() returns true when per-cpu
> rt_spin_lock is locked by current _task_. In this case re-entrance into
> the same kmalloc bucket is unsafe, and kmalloc_nolock() tries a different
> bucket that is most likely not locked by the current task.
> Though it may be locked by a different task, it's safe to rt_spin_lock() and
> sleep on it.
and this paragraph is no longer valid either?
> > * local_lock_is_locked() prevents the case kmem_cache_A == kmem_cache_B
> > */
>
> However, what about the freeing path?
> Shouldn't we do the same with __slab_free() to prevent the fast path from
> messing up an interrupted slow path?
Hmm right, but we already have the
  (in_nmi() || !USE_LOCKLESS_FAST_PATH()) && local_lock_is_locked()
check in do_slab_free() for the !allow_spin case?
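
For reference, a rough sketch of that guard (from memory of the
kmalloc_nolock() series; the deferral helper name below is a
placeholder):

  /* in do_slab_free(), before touching the cpu slab: */
  if (!allow_spin &&
      (in_nmi() || !USE_LOCKLESS_FAST_PATH()) &&
      local_lock_is_locked(&s->cpu_slab->lock)) {
          /*
           * The interrupted context already owns cpu_slab->lock;
           * taking it again would deadlock, so defer the free
           * instead of going through the locked slow path.
           */
          defer_free(s, head);    /* placeholder name */
          return;
  }

so a free that would reenter the locked slow path is already deferred
rather than performed in place?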
--
Cheers,
Harry / Hyeonggon