* [PATCH slab v5 0/6] slab: Re-entrant kmalloc_nolock()
@ 2025-09-09  1:00 Alexei Starovoitov
From: Alexei Starovoitov @ 2025-09-09  1:00 UTC
  To: bpf, linux-mm
  Cc: vbabka, harry.yoo, shakeel.butt, mhocko, bigeasy, andrii, memxor,
	akpm, peterz, rostedt, hannes

From: Alexei Starovoitov <ast@kernel.org>

Overview:

This patch set introduces kmalloc_nolock(), the next logical step
towards any-context allocation, which is necessary to remove
bpf_mem_alloc and get rid of the preallocation requirement in the BPF
infrastructure. In production, BPF maps have grown to gigabytes in
size, and preallocation wastes memory. Allocating from any context
addresses this issue for BPF and for other subsystems that are forced
to preallocate as well.
This long-running effort started with the introduction of
alloc_pages_nolock(); then memcg and objcg were converted to operate
from any context, including NMI. This set completes the task with
kmalloc_nolock(), which builds on top of alloc_pages_nolock() and the
memcg changes. After that, the BPF subsystem will gradually adopt it
everywhere.

The patch set is on top of slab/for-next, which already has the
pre-patch "locking/local_lock: Expose dep_map in local_trylock_t." applied.
I think the patch set should be routed via vbabka/slab.git.

v4->v5:
- New patch "Reuse first bit for OBJEXTS_ALLOC_FAIL" to free up a bit
  and use it to mark a slabobj_ext vector allocated with kmalloc_nolock(),
  so that freeing of the vector can be done with kfree_nolock()
  (see the sketch after this list)
- Call kasan_slab_free() directly from kfree_nolock() instead of deferring to
  do_slab_free() to avoid double poisoning
- Addressed other minor issues spotted by Harry
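
For illustration, the general technique looks roughly like this
(hypothetical names, not the actual mm/slab.h definitions): the
slabobj_ext vector is at least word-aligned, so a low bit of the
stored value can record that it came from kmalloc_nolock(), and the
free path can then pick kfree_nolock():

#define OBJEXTS_NOLOCK_ALLOC	1UL	/* hypothetical flag bit */

static unsigned long encode_obj_exts(struct slabobj_ext *vec, bool nolock)
{
	/* vec is word-aligned, so bit 0 is free to carry the flag */
	return (unsigned long)vec | (nolock ? OBJEXTS_NOLOCK_ALLOC : 0);
}

static void free_obj_exts(unsigned long obj_exts)
{
	void *vec = (void *)(obj_exts & ~OBJEXTS_NOLOCK_ALLOC);

	if (obj_exts & OBJEXTS_NOLOCK_ALLOC)
		kfree_nolock(vec);	/* vector was allocated with kmalloc_nolock() */
	else
		kfree(vec);
}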

v4:
https://lore.kernel.org/all/20250718021646.73353-1-alexei.starovoitov@gmail.com/

v3->v4:
- Converted local_lock_cpu_slab() to macro
- Reordered patches 5 and 6
- Emphasized that kfree_nolock() shouldn't be used on kmalloc()-ed objects
- Addressed other comments and improved commit logs
- Fixed build issues reported by bots

v3:
https://lore.kernel.org/bpf/20250716022950.69330-1-alexei.starovoitov@gmail.com/

v2->v3:
- Adopted Sebastian's local_lock_cpu_slab(), but dropped gfpflags
  to avoid an extra branch for performance reasons,
  and added local_unlock_cpu_slab() for symmetry.
- Dropped the local_lock_lockdep_start/end() pair and switched to a
  per-kmem_cache lockdep class on PREEMPT_RT to silence a false positive
  when the same cpu/task acquires two local_lock-s.
- Refactored defer_free per Sebastian's suggestion
- Fixed a slab leak when a slab needs to be deactivated via irq_work and
  llist, as Vlastimil proposed, including defer_free_barrier().
- Use kmem_cache->offset for the llist_node pointer when linking objects
  instead of offset zero, since the whole object could be in use for slabs
  with ctors and in other cases (see the sketch after this list).
- Fixed "cnt = 1; goto redo;" issue.
- Fixed slab leak in alloc_single_from_new_slab().
- Retested with slab_debug, RT, !RT, lockdep, kasan, slab_tiny
- Added acks to patches 1-4 that should be good to go.
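
The linking mentioned above looks roughly like this (an illustrative
sketch with a hypothetical helper, not the exact mm/slub.c code):
instead of overlaying the llist_node at offset zero, the object is
linked through the cache's free pointer offset, which SLUB already
places where it will not clobber object contents:

/* hypothetical helper for illustration */
static struct llist_node *nolock_free_node(struct kmem_cache *s, void *object)
{
	/* s->offset is SLUB's free pointer location within the object */
	return (struct llist_node *)((char *)object + s->offset);
}

static void defer_free_one(struct llist_head *list, struct kmem_cache *s,
			   void *object)
{
	llist_add(nolock_free_node(s, object), list);
}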

v2:
https://lore.kernel.org/bpf/20250709015303.8107-1-alexei.starovoitov@gmail.com/

v1->v2:
Added more comments for this non-trivial logic and addressed earlier comments.
In particular:
- Introduce alloc_frozen_pages_nolock() to avoid refcnt race
- alloc_pages_nolock() defaults to GFP_COMP
- Support SLUB_TINY
- Added more variants to the stress tester and discovered that
  kfree_nolock() can OOM, because the deferred per-slab llist won't be
  serviced if kfree_nolock() gets unlucky for long enough. Scrapped the
  previous approach and switched to a global per-cpu llist with immediate
  irq_work_queue() to process all object sizes (see the sketch after
  this list).
- Reentrant kmalloc cannot deactivate_slab(). In v1 the node hint was
  downgraded to NUMA_NO_NODE before calling slab_alloc(). Realized that's
  not good enough: there are odd cases that can still trigger deactivation.
  Rewrote this part.
- Struggled with SLAB_NO_CMPXCHG. Thankfully Harry had a great suggestion:
  https://lore.kernel.org/bpf/aFvfr1KiNrLofavW@hyeyoo/
  which was adopted. So slab_debug works now.
- In v1 I had to s/local_lock_irqsave/local_lock_irqsave_check/ in a bunch
  of places in mm/slub.c to avoid lockdep false positives.
  Came up with a much cleaner approach that silences invalid lockdep reports
  without sacrificing lockdep coverage; see local_lock_lockdep_start/end().
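
In sketch form, the deferred-free pattern mentioned above works like
this (hypothetical names, not the actual mm/slub.c implementation):
objects that cannot be freed in the current context go onto a per-cpu
llist, and an irq_work is queued right away so the list is drained as
soon as interrupts allow:

struct defer_free_pcpu {
	struct llist_head head;
	struct irq_work work;	/* init_irq_work(&work, defer_free_drain) at boot */
};
static DEFINE_PER_CPU(struct defer_free_pcpu, defer_free_pcpu);

static void defer_free_drain(struct irq_work *work)
{
	struct defer_free_pcpu *df = container_of(work, struct defer_free_pcpu, work);
	struct llist_node *pos, *t;

	llist_for_each_safe(pos, t, llist_del_all(&df->head))
		free_deferred_object(pos);	/* hypothetical: recovers cache/object from the node */
}

static void defer_free(struct llist_node *node)
{
	struct defer_free_pcpu *df = this_cpu_ptr(&defer_free_pcpu);

	llist_add(node, &df->head);
	irq_work_queue(&df->work);	/* cheap no-op if already pending */
}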

v1:
https://lore.kernel.org/bpf/20250501032718.65476-1-alexei.starovoitov@gmail.com/

Alexei Starovoitov (6):
  locking/local_lock: Introduce local_lock_is_locked().
  mm: Allow GFP_ACCOUNT to be used in alloc_pages_nolock().
  mm: Introduce alloc_frozen_pages_nolock()
  slab: Make slub local_(try)lock more precise for LOCKDEP
  slab: Reuse first bit for OBJEXTS_ALLOC_FAIL
  slab: Introduce kmalloc_nolock() and kfree_nolock().

 include/linux/gfp.h                 |   2 +-
 include/linux/kasan.h               |  13 +-
 include/linux/local_lock.h          |   2 +
 include/linux/local_lock_internal.h |   7 +
 include/linux/memcontrol.h          |  12 +-
 include/linux/rtmutex.h             |  10 +
 include/linux/slab.h                |   4 +
 kernel/bpf/stream.c                 |   2 +-
 kernel/bpf/syscall.c                |   2 +-
 kernel/locking/rtmutex_common.h     |   9 -
 mm/Kconfig                          |   1 +
 mm/internal.h                       |   4 +
 mm/kasan/common.c                   |   5 +-
 mm/page_alloc.c                     |  55 ++--
 mm/slab.h                           |   7 +
 mm/slab_common.c                    |   3 +
 mm/slub.c                           | 495 +++++++++++++++++++++++++---
 17 files changed, 541 insertions(+), 92 deletions(-)

-- 
2.47.3


