From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: bpf <bpf@vger.kernel.org>, linux-mm <linux-mm@kvack.org>,
Vlastimil Babka <vbabka@suse.cz>,
Shakeel Butt <shakeel.butt@linux.dev>,
Michal Hocko <mhocko@suse.com>,
Sebastian Sewior <bigeasy@linutronix.de>,
Andrii Nakryiko <andrii@kernel.org>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Steven Rostedt <rostedt@goodmis.org>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH slab v5 6/6] slab: Introduce kmalloc_nolock() and kfree_nolock().
Date: Wed, 24 Sep 2025 09:43:45 +0200
Message-ID: <CAADnVQ+7W9MBG5i-r1Bh+ya=xN13LTVLN+EYwzP9dhVo4cUnjw@mail.gmail.com>
In-Reply-To: <aNM-Esr0v_95qmEa@hyeyoo>

On Wed, Sep 24, 2025 at 2:41 AM Harry Yoo <harry.yoo@oracle.com> wrote:
>
> On Mon, Sep 08, 2025 at 06:00:07PM -0700, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > kmalloc_nolock() relies on the ability of local_trylock_t to detect
> > the situation when the per-cpu kmem_cache is locked.
> >
> > In !PREEMPT_RT local_(try)lock_irqsave(&s->cpu_slab->lock, flags)
> > disables IRQs and marks s->cpu_slab->lock as acquired.
> > local_lock_is_locked(&s->cpu_slab->lock) returns true when
> > slab is in the middle of manipulating per-cpu cache
> > of that specific kmem_cache.
> >
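For readers following along, a minimal userspace sketch of the detection described above, assuming a single CPU and modelling the lock as a plain flag. The names model_trylock_t, model_trylock(), model_unlock() and model_is_locked() are hypothetical; they only mirror the roles of local_trylock_t, local_trylock_irqsave(), the matching unlock, and local_lock_is_locked(). The real primitives also disable IRQs (or take an rt_spin_lock in PREEMPT_RT), which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-cpu local_trylock_t: one "acquired" flag. */
typedef struct {
	bool acquired;
} model_trylock_t;

/* Take the lock unless this CPU already holds it; never spins or sleeps. */
static bool model_trylock(model_trylock_t *l)
{
	if (l->acquired)
		return false;	/* re-entered while the owner is mid-update */
	l->acquired = true;
	return true;
}

/* Release the lock; unlocking a lock that is not held is a bug. */
static void model_unlock(model_trylock_t *l)
{
	assert(l->acquired);
	l->acquired = false;
}

/* Query only: reports whether the per-cpu cache is being manipulated. */
static bool model_is_locked(const model_trylock_t *l)
{
	return l->acquired;
}
```

The point of the query form is that a re-entrant caller can back off before it touches the per-cpu state at all, instead of discovering the conflict halfway through.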
> > kmalloc_nolock() can be called from any context and can re-enter
> > into ___slab_alloc():
> > kmalloc() -> ___slab_alloc(cache_A) -> irqsave -> NMI -> bpf ->
> > kmalloc_nolock() -> ___slab_alloc(cache_B)
> > or
> > kmalloc() -> ___slab_alloc(cache_A) -> irqsave -> tracepoint/kprobe -> bpf ->
> > kmalloc_nolock() -> ___slab_alloc(cache_B)
> >
> > Hence the caller of ___slab_alloc() checks if &s->cpu_slab->lock
> > can be acquired without a deadlock before invoking the function.
> > If that specific per-cpu kmem_cache is busy, kmalloc_nolock()
> > retries in a different kmalloc bucket. The second attempt will
> > likely succeed, since this cpu locked a different kmem_cache.
> >
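The retry described above can be sketched in userspace roughly as follows. This is a model under stated assumptions, not the code from the patch: the bucket sizes, bucket_locked[], bucket_for() and alloc_nolock_model() are all hypothetical, and "a different bucket" is assumed here to mean the next larger size class, tried exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_BUCKETS 4

/* Hypothetical kmalloc-like size classes, smallest first. */
static const size_t bucket_size[NR_BUCKETS] = { 8, 16, 32, 64 };

/* Models local_lock_is_locked() per bucket on the current CPU. */
static bool bucket_locked[NR_BUCKETS];

/* Index of the smallest bucket that fits size, or -1 if none does. */
static int bucket_for(size_t size)
{
	for (int i = 0; i < NR_BUCKETS; i++)
		if (size <= bucket_size[i])
			return i;
	return -1;
}

/*
 * Returns the bucket index an allocation would be served from, or -1
 * for failure (the model's NULL). A busy bucket is never waited on:
 * the request is bumped once into the next larger bucket.
 */
static int alloc_nolock_model(size_t size)
{
	bool can_retry = true;

	assert(size > 0);	/* zero-size requests are outside this model */

	for (;;) {
		int b = bucket_for(size);

		if (b < 0)
			return -1;		/* too large for any bucket */
		if (!bucket_locked[b])
			return b;		/* safe to take this bucket's lock */
		if (!can_retry)
			return -1;		/* second bucket busy too: give up */
		size = bucket_size[b] + 1;	/* force the next larger bucket */
		can_retry = false;
	}
}
```

In this model a 10-byte request normally lands in the 16-byte bucket; if that bucket is marked locked, the same request is served from the 32-byte bucket instead, and it fails only when both are busy.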
> > Similarly, in PREEMPT_RT local_lock_is_locked() returns true when
> > the per-cpu rt_spin_lock is locked by the current _task_. In this case
> > re-entrance into the same kmalloc bucket is unsafe, and
> > kmalloc_nolock() tries a different bucket that is most likely
> > not locked by the current task. Though it may be locked by a
> > different task, it's safe to rt_spin_lock() and sleep on it.
> >
> > Similar to alloc_pages_nolock(), kmalloc_nolock() returns NULL
> > immediately if called from hard irq or NMI in PREEMPT_RT.
> >
> > kfree_nolock() defers freeing to irq_work when local_lock_is_locked()
> > and (in_nmi() or in PREEMPT_RT).
> >
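Spelled out as a predicate, the deferral rule in the paragraph above looks roughly like this. It is a sketch of the sentence as written, not the kernel implementation: struct free_ctx and must_defer_free() are hypothetical names, and the three booleans stand in for local_lock_is_locked(&s->cpu_slab->lock), in_nmi() and a PREEMPT_RT build.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state kfree_nolock() is described as checking. */
struct free_ctx {
	bool cpu_slab_locked;	/* models local_lock_is_locked(&s->cpu_slab->lock) */
	bool in_nmi;		/* models in_nmi() */
	bool preempt_rt;	/* models a CONFIG_PREEMPT_RT kernel */
};

/* True when the free should be queued to irq_work instead of done inline. */
static bool must_defer_free(const struct free_ctx *c)
{
	assert(c);
	return c->cpu_slab_locked && (c->in_nmi || c->preempt_rt);
}
```

Note that an unlocked per-cpu cache never forces a deferral in this model, regardless of context.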
> > SLUB_TINY config doesn't use local_lock_is_locked() and relies on
> > spin_trylock_irqsave(&n->list_lock) to allocate,
> > while kfree_nolock() always defers to irq_work.
> >
> > Note, kfree_nolock() must be called _only_ for objects allocated
> > with kmalloc_nolock(). Debug checks (like kmemleak and kfence)
> > were skipped on allocation, hence obj = kmalloc(); kfree_nolock(obj);
> > will miss kmemleak/kfence bookkeeping and will cause false positives.
> > large_kmalloc is not supported by either kmalloc_nolock()
> > or kfree_nolock().
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
>
> On the up-to-date version [1] of this patch,
> I tried my best to find flaws in the code, but came up empty this time.

Here's hoping :)

I much appreciate all the feedback and reviews during
this long journey (v1 was back in April).