linux-mm.kvack.org archive mirror
* [PATCH bpf-next v2 0/6] bpf, mm: Introduce __GFP_TRYLOCK
@ 2024-12-10  2:39 Alexei Starovoitov
From: Alexei Starovoitov @ 2024-12-10  2:39 UTC
  To: bpf
  Cc: andrii, memxor, akpm, peterz, vbabka, bigeasy, rostedt, houtao1,
	hannes, shakeel.butt, mhocko, willy, tglx, tj, linux-mm,
	kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Hi All,

This is a more complete patch set that introduces __GFP_TRYLOCK
for opportunistic page allocation and lockless page freeing.
It's usable for bpf as-is.
The main motivation is to remove bpf_mem_alloc and make
page allocation and slab reentrant.
This patch set is a first step. Once try_alloc_pages() is available,
new_slab() can be converted to it, followed by the rest of kmalloc/slab_alloc.

I started hacking kmalloc() to replace bpf_mem_alloc() completely,
but ___slab_alloc() is quite complex to convert to trylock,
mainly the deactivate_slab part. It cannot fail, but when only trylock
is available I'm running out of ideas.
So far I'm thinking of limiting it to:
- USE_LOCKLESS_FAST_PATH
  Which would mean that we would need to keep bpf_mem_alloc only for RT :(
- slab->flags & __CMPXCHG_DOUBLE, because various debug options cannot work in
  trylock mode. The bit-spinlock slab_lock() cannot be made to work with trylock either.
- simple kasan poison/unpoison, since kasan_kmalloc and kasan_slab_free are
  too fancy with their own locks.

v1->v2:
- fixed buggy try_alloc_pages_noprof() in PREEMPT_RT. Thanks Peter.
- optimized all paths by doing spin_trylock_irqsave() first
  and only then checking for gfp_flags & __GFP_TRYLOCK,
  falling back to spin_lock_irqsave() if it's a regular mode.
  So the new gfp flag will not add performance overhead.
- patches 2-5 are new. They introduce lockless and/or trylock free_pages_nolock()
  and memcg support, so the series is in usable shape for bpf in patch 6.
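
To illustrate the "trylock first, check the flag second" ordering described
in the v1->v2 notes, here is a minimal userspace sketch. It is not the kernel
code: the kernel paths use spin_trylock_irqsave()/spin_lock_irqsave() on the
real allocator locks, while this sketch assumes a toy pool guarded by a C11
atomic_flag. All names in it (demo_pool, demo_alloc, DEMO_TRYLOCK and the
helpers) are hypothetical stand-ins, with DEMO_TRYLOCK playing the role of
__GFP_TRYLOCK.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for __GFP_TRYLOCK; not a kernel flag value. */
#define DEMO_TRYLOCK 0x1u

/* Toy pool: a counter of free pages guarded by a spinning flag lock. */
struct demo_pool {
	atomic_flag locked;
	int free_pages;
};

static void demo_pool_init(struct demo_pool *pool, int pages)
{
	atomic_flag_clear(&pool->locked);
	pool->free_pages = pages;
}

/* Non-blocking attempt; returns true if the lock was acquired. */
static bool demo_trylock(struct demo_pool *pool)
{
	return !atomic_flag_test_and_set_explicit(&pool->locked,
						  memory_order_acquire);
}

/* Blocking acquire: spin until the holder releases the lock. */
static void demo_lock(struct demo_pool *pool)
{
	while (!demo_trylock(pool))
		;
}

static void demo_unlock(struct demo_pool *pool)
{
	atomic_flag_clear_explicit(&pool->locked, memory_order_release);
}

/*
 * Take one page from the pool.
 * Returns false if the pool is empty, or if the lock is contended and
 * the caller passed DEMO_TRYLOCK (opportunistic mode).
 */
static bool demo_alloc(struct demo_pool *pool, unsigned int flags)
{
	bool ok;

	/* Every caller tries the lock first, without looking at flags. */
	if (!demo_trylock(pool)) {
		/* Only the contended path pays for the flag check. */
		if (flags & DEMO_TRYLOCK)
			return false;	/* opportunistic caller gives up */
		demo_lock(pool);	/* regular caller waits */
	}

	ok = pool->free_pages > 0;
	if (ok)
		pool->free_pages--;

	demo_unlock(pool);
	return ok;
}
```

In the uncontended case both kinds of callers take the same path, and the
flag is only tested once the trylock has already failed, which is why the
extra mode should not cost regular callers anything. A caller in
opportunistic mode has to treat a false return as an ordinary outcome and
have a fallback ready.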

v1:
https://lore.kernel.org/bpf/20241116014854.55141-1-alexei.starovoitov@gmail.com/

Alexei Starovoitov (6):
  mm, bpf: Introduce __GFP_TRYLOCK for opportunistic page allocation
  mm, bpf: Introduce free_pages_nolock()
  locking/local_lock: Introduce local_trylock_irqsave()
  memcg: Add __GFP_TRYLOCK support.
  mm, bpf: Use __GFP_ACCOUNT in try_alloc_pages().
  bpf: Use try_alloc_pages() to allocate pages for bpf needs.

 include/linux/gfp.h                 | 25 ++++++++
 include/linux/gfp_types.h           |  3 +
 include/linux/local_lock.h          |  9 +++
 include/linux/local_lock_internal.h | 23 +++++++
 include/linux/mm_types.h            |  4 ++
 include/linux/mmzone.h              |  3 +
 include/trace/events/mmflags.h      |  1 +
 kernel/bpf/syscall.c                |  4 +-
 mm/fail_page_alloc.c                |  6 ++
 mm/internal.h                       |  1 +
 mm/memcontrol.c                     | 21 +++++--
 mm/page_alloc.c                     | 94 +++++++++++++++++++++++++----
 tools/perf/builtin-kmem.c           |  1 +
 13 files changed, 177 insertions(+), 18 deletions(-)

-- 
2.43.5





Thread overview: 50+ messages
2024-12-10  2:39 [PATCH bpf-next v2 0/6] bpf, mm: Introduce __GFP_TRYLOCK Alexei Starovoitov
2024-12-10  2:39 ` [PATCH bpf-next v2 1/6] mm, bpf: Introduce __GFP_TRYLOCK for opportunistic page allocation Alexei Starovoitov
2024-12-10  5:31   ` Matthew Wilcox
2024-12-10  9:05     ` Michal Hocko
2024-12-10 20:25       ` Shakeel Butt
2024-12-11 10:08         ` Michal Hocko
2024-12-10 22:06       ` Alexei Starovoitov
2024-12-11 10:19         ` Michal Hocko
2024-12-12 15:07         ` Sebastian Sewior
2024-12-12 15:21           ` Michal Hocko
2024-12-12 15:35             ` Sebastian Sewior
2024-12-12 15:48               ` Steven Rostedt
2024-12-12 16:00                 ` Sebastian Sewior
2024-12-13 17:44                   ` Steven Rostedt
2024-12-13 18:44                     ` Alexei Starovoitov
2024-12-13 18:57                       ` Alexei Starovoitov
2024-12-13 20:09                       ` Steven Rostedt
2024-12-13 21:00                         ` Steven Rostedt
2024-12-13 22:02                           ` Alexei Starovoitov
2024-12-12 21:57               ` Alexei Starovoitov
2024-12-10 21:42     ` Alexei Starovoitov
2024-12-10  9:01   ` Sebastian Andrzej Siewior
2024-12-10 21:53     ` Alexei Starovoitov
2024-12-11  8:38       ` Vlastimil Babka
2024-12-12  2:14         ` Alexei Starovoitov
2024-12-12  8:54           ` Vlastimil Babka
2024-12-10 18:39   ` Vlastimil Babka
2024-12-10 22:42     ` Alexei Starovoitov
2024-12-11  8:48       ` Vlastimil Babka
2024-12-10  2:39 ` [PATCH bpf-next v2 2/6] mm, bpf: Introduce free_pages_nolock() Alexei Starovoitov
2024-12-10  8:35   ` Sebastian Andrzej Siewior
2024-12-10 22:49     ` Alexei Starovoitov
2024-12-12 14:44       ` Sebastian Andrzej Siewior
2024-12-12 19:57         ` Alexei Starovoitov
2024-12-11 10:11   ` Vlastimil Babka
2024-12-12  1:43     ` Alexei Starovoitov
2024-12-10  2:39 ` [PATCH bpf-next v2 3/6] locking/local_lock: Introduce local_trylock_irqsave() Alexei Starovoitov
2024-12-11 10:53   ` Vlastimil Babka
2024-12-11 11:55     ` Vlastimil Babka
2024-12-12  2:49       ` Alexei Starovoitov
2024-12-12  9:15         ` Vlastimil Babka
2024-12-13 14:02           ` Vlastimil Babka
2024-12-12 15:15   ` Sebastian Andrzej Siewior
2024-12-12 19:59     ` Alexei Starovoitov
2024-12-10  2:39 ` [PATCH bpf-next v2 4/6] memcg: Add __GFP_TRYLOCK support Alexei Starovoitov
2024-12-11 23:47   ` kernel test robot
2024-12-10  2:39 ` [PATCH bpf-next v2 5/6] mm, bpf: Use __GFP_ACCOUNT in try_alloc_pages() Alexei Starovoitov
2024-12-11 12:05   ` Vlastimil Babka
2024-12-12  2:54     ` Alexei Starovoitov
2024-12-10  2:39 ` [PATCH bpf-next v2 6/6] bpf: Use try_alloc_pages() to allocate pages for bpf needs Alexei Starovoitov
