linux-mm.kvack.org archive mirror
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Christoph Lameter <cl@gentwo.org>,
	David Rientjes <rientjes@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Michal Hocko <mhocko@kernel.org>, Hao Li <hao.li@linux.dev>,
	Alexei Starovoitov <ast@kernel.org>,
	Puranjay Mohan <puranjay@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Amery Hung <ameryhung@gmail.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	Uladzislau Rezki <urezki@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Zqiang <qiang.zhang@linux.dev>,
	Dave Chinner <david@fromorbit.com>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	Muchun Song <muchun.song@linux.dev>,
	rcu@vger.kernel.org, linux-mm@kvack.org, bpf@vger.kernel.org
Subject: Re: [RFC PATCH 0/7] k[v]free_rcu() improvements
Date: Fri, 6 Feb 2026 16:16:46 -0800
Message-ID: <3069e76d-5c7a-4c3f-9b83-43ed1700b95f@paulmck-laptop>
In-Reply-To: <20260206093410.160622-1-harry.yoo@oracle.com>

On Fri, Feb 06, 2026 at 06:34:03PM +0900, Harry Yoo wrote:
> These are a few improvements for the k[v]free_rcu() API, which were
> suggested by Alexei Starovoitov.
> 
> [ To kmemleak folks: I'm going to teach delete_object_full() and
>   paint_ptr() to ignore cases when the object does not exist.
>   Could you please let me know if the way it's done in patch 3
>   looks good? Only part 2 is relevant to you. ]

On what commit should I apply this series?  I get conflicts on top of -rcu
(no surprise there) and build errors on top of next-20260205.

							Thanx, Paul

> Although I've put some effort into providing a decent quality
> implementation, I'd like you to consider this as a proof-of-concept
> and let's discuss how best we could tackle those problems:
> 
>   1) Allow an 8-byte field to be used as an alternative to
>      struct rcu_head (16-byte) for 2-argument kvfree_rcu()
>   2) kmalloc_nolock() -> kfree[_rcu]() support
>   3) Add kfree_rcu_nolock() for NMI context
> 
> # Part 1. Allow an 8-byte field to be used as an alternative to
>   struct rcu_head for 2-argument kvfree_rcu()
>   
>   Technically, objects that are freed with k[v]free_rcu() need
>   only one pointer to link objects, because we already know that
>   the callback function is always kvfree(). For this purpose,
>   struct rcu_head is unnecessarily large (16 bytes on 64-bit).
> 
>   Allow a smaller, 8-byte field (of struct rcu_ptr type) to be used
>   with k[v]free_rcu(). Let's save one pointer per slab object.
>   
>   I have to admit that my naming skill isn't great; hopefully
>   we'll come up with a better name than `struct rcu_ptr`.
> 
>   With this feature, either a struct rcu_ptr or rcu_head field
>   can be used as the second argument of the k[v]free_rcu() API.
> 
>   Users that only use k[v]free_rcu() are highly encouraged to use
>   struct rcu_ptr; otherwise you're wasting memory. However, some users,
>   such as maple tree, may use call_rcu() or k[v]free_rcu() depending on
>   the situation for objects of the same type. For such users,
>   struct rcu_head remains the only option.
> 
>   Patch 1 implements this feature, and patch 2 adds a few users in mm/.
> 
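
[ Illustration of Part 1, not part of the series: a minimal userspace
  sketch of why one pointer is enough to link objects whose callback is
  always the same free routine. The model_* names and both struct layouts
  are assumptions made for this example; they are not the kernel's
  definitions, and the cover letter does not say how the real patch
  recovers the start of an object from its embedded field. ]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rcu_head: a link plus a callback. */
struct model_rcu_head {
	struct model_rcu_head *next;
	void (*func)(struct model_rcu_head *head);
};

/* Hypothetical stand-in for the proposed struct rcu_ptr: a link only. */
struct model_rcu_ptr {
	struct model_rcu_ptr *next;
};

struct obj_with_head { long payload; struct model_rcu_head rcu; };
struct obj_with_ptr  { long payload; struct model_rcu_ptr rcu; };

/* Recover the enclosing object from its embedded link field. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Bytes saved per object by embedding the smaller field. */
static size_t bytes_saved(void)
{
	return sizeof(struct obj_with_head) - sizeof(struct obj_with_ptr);
}

/* Push an object onto a singly linked list of objects awaiting free. */
static void model_queue(struct model_rcu_ptr **list, struct model_rcu_ptr *p)
{
	assert(p != NULL);
	p->next = *list;
	*list = p;
}

/*
 * Free every queued object and return how many were freed. No callback
 * pointer is needed because the free routine is fixed. A real
 * implementation would first wait for a grace period; this model does not.
 */
static size_t model_drain(struct model_rcu_ptr **list)
{
	size_t freed = 0;
	struct model_rcu_ptr *p = *list;

	*list = NULL;
	while (p) {
		struct model_rcu_ptr *next = p->next;

		free(model_container_of(p, struct obj_with_ptr, rcu));
		p = next;
		freed++;
	}
	return freed;
}
```

On a typical 64-bit build, bytes_saved() in this model is one pointer (8
bytes) per object, which matches the saving described above.
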
> # Part 2. kmalloc_nolock() -> kfree() or kfree_rcu() path support
>   
>   Allow objects allocated with kmalloc_nolock() to be freed with
>   kfree[_rcu](). Without this support, users are forced to call
>   call_rcu() with kfree_nolock() to free objects after a grace period.
>   This is not efficient and can create unnecessarily many grace periods
>   by bypassing the kfree_rcu batching layer.
> 
>   It was not supported before because some alloc hooks are not called
>   in kmalloc_nolock(), while all free hooks are called in kfree().
> 
>   Patch 3 adds support for this by teaching kmemleak to ignore cases
>   when free hooks are called without prior alloc hooks. Patch 4 frees
>   a bit in enum objexts_flags, since we no longer have to remember
>   whether the array was allocated using kmalloc_nolock() or kmalloc().
> 
>   Note that the free hooks fall into these categories:
> 
>   - Its alloc hook is called in kmalloc_nolock(), no problem!
>     (kmsan_slab_alloc(), kasan_slab_alloc(),
>      memcg_slab_post_alloc_hook(), alloc_tagging_slab_alloc_hook())
> 
>   - Its alloc hook isn't called in kmalloc_nolock(); free hooks
>     must handle asymmetric hook calls. (kfence_free(),
>     kmemleak_free_recursive())
> 
>   - There is no matching alloc hook for the free hook; it's safe to
>     call. (debug_check_no_{locks,obj}_freed, __kcsan_check_access())
> 
>   Note that kmalloc() -> kfree_nolock() or kfree_rcu_nolock() is still
>   not supported! That's much trickier :)
> 
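
[ Illustration of Part 2, not part of the series: a minimal userspace
  sketch of a free hook that tolerates asymmetric calls, that is, a free
  without a prior alloc hook. The table and the model_* functions are
  hypothetical and only model the idea; they are not kmemleak's data
  structures or API. ]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_TRACKED 16

/* Hypothetical tracking table: NULL marks an empty slot. */
static const void *model_tracked[MODEL_MAX_TRACKED];

/* Alloc hook: remember ptr. Returns false if the table is full. */
static bool model_track_alloc(const void *ptr)
{
	assert(ptr != NULL);
	for (size_t i = 0; i < MODEL_MAX_TRACKED; i++) {
		if (!model_tracked[i]) {
			model_tracked[i] = ptr;
			return true;
		}
	}
	return false;
}

/*
 * Free hook: returns true if ptr was tracked and is now forgotten.
 * An untracked ptr is not an error here: the object may have come from
 * an allocation path that skipped the alloc hook, so just return false.
 */
static bool model_track_free(const void *ptr)
{
	if (!ptr)
		return false;
	for (size_t i = 0; i < MODEL_MAX_TRACKED; i++) {
		if (model_tracked[i] == ptr) {
			model_tracked[i] = NULL;
			return true;
		}
	}
	return false;
}
```

The design point is that the "not found" case is reported to the caller
rather than treated as a bug, so hooked and unhooked allocations can share
one free path.
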
> # Part 3. Add kfree_rcu_nolock() for NMI context
> 
>   Add a new 2-argument kfree_rcu_nolock() variant that is safe to be
>   called in NMI context. In NMI context, calling kfree_rcu() or
>   call_rcu() is not legal, and thus users are forced to implement some
>   sort of deferred freeing. Let's make users' lives easier with the new
>   variant.
> 
>   Note that 1-argument kfree_rcu_nolock() is not supported, since there
>   is not much we can do when both trylock and memory allocation fail.
>   (You can't call synchronize_rcu() in NMI context!)
> 
>   When spinning on a lock is not allowed, try to acquire the spinlock
>   with trylock. If that succeeds, do one of the following:
> 
>   1) Use the rcu sheaf to free the object. Note that call_rcu() cannot
>      be called in NMI context! When the rcu sheaf becomes full by
>      freeing the object, it cannot free to the sheaf and has to fall back.
>   
>   2) Use struct rcu_ptr field to link objects. Consuming a bnode
>      (of struct kvfree_rcu_bulk_data) and queueing work to maintain
>      a number of cached bnodes is avoided in NMI context.
> 
>   Note that scheduling delayed monitor work to drain objects after
>   KFREE_DRAIN_JIFFIES is done using a lazy irq_work to avoid raising
>   self-IPIs. That means scheduling delayed monitor work can be delayed
>   up to the length of a time slice.
> 
>   In rare cases where trylock fails, a non-lazy irq_work is used to
>   defer calling kvfree_rcu_call().
> 
>   When certain debug features (kmemleak, debugobjects) are enabled,
>   freeing in NMI context is always deferred because they use spinlocks.
> 
>   Patch 6 implements kfree_rcu_nolock() support, patch 7 adds sheaves
>   support for the new API.
> 
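
[ Illustration of Part 3, not part of the series: a minimal userspace
  sketch of the "trylock, otherwise defer" pattern using C11 atomics. The
  atomic_flag stands in for the spinlock and the lock-free list stands in
  for work deferred via irq_work; all model_* names are hypothetical, and
  the sheaf and bnode paths are not modeled. ]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct model_node { struct model_node *next; };

struct model_krc {
	atomic_flag lock;                      /* stands in for the spinlock */
	struct model_node *pending;            /* protected by lock */
	_Atomic(struct model_node *) deferred; /* lock-free fallback list */
};

enum model_path { MODEL_QUEUED, MODEL_DEFERRED };

static bool model_trylock(struct model_krc *k)
{
	return !atomic_flag_test_and_set_explicit(&k->lock, memory_order_acquire);
}

static void model_unlock(struct model_krc *k)
{
	atomic_flag_clear_explicit(&k->lock, memory_order_release);
}

/* Never spins: link the object under the lock, or push it for later. */
static enum model_path model_free_nolock(struct model_krc *k,
					 struct model_node *n)
{
	struct model_node *old;

	assert(n != NULL);
	if (model_trylock(k)) {
		n->next = k->pending;
		k->pending = n;
		model_unlock(k);
		return MODEL_QUEUED;
	}

	/* Lock is busy: lock-free push onto the deferred list instead. */
	old = atomic_load_explicit(&k->deferred, memory_order_relaxed);
	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak_explicit(&k->deferred, &old, n,
			memory_order_release, memory_order_relaxed));
	return MODEL_DEFERRED;
}

/* Run later from a context that may spin; returns objects moved. */
static size_t model_flush_deferred(struct model_krc *k)
{
	struct model_node *n =
		atomic_exchange_explicit(&k->deferred, NULL, memory_order_acquire);
	size_t moved = 0;

	while (!model_trylock(k))
		; /* spinning is allowed in this context */
	while (n) {
		struct model_node *next = n->next;

		n->next = k->pending;
		k->pending = n;
		n = next;
		moved++;
	}
	model_unlock(k);
	return moved;
}
```

In this model the caller never waits: a failed trylock costs one
compare-and-swap loop, and the deferred objects join the pending list only
when model_flush_deferred() runs from a context where spinning is safe.
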
> Harry Yoo (7):
>   mm/slab: introduce k[v]free_rcu() with struct rcu_ptr
>   mm: use rcu_ptr instead of rcu_head
>   mm/slab: allow freeing kmalloc_nolock()'d objects using kfree[_rcu]()
>   mm/slab: free a bit in enum objexts_flags
>   mm/slab: move kfree_rcu_cpu[_work] definitions
>   mm/slab: introduce kfree_rcu_nolock()
>   mm/slab: make kfree_rcu_nolock() work with sheaves
> 
>  include/linux/list_lru.h   |   2 +-
>  include/linux/memcontrol.h |   3 +-
>  include/linux/rcupdate.h   |  68 +++++---
>  include/linux/shrinker.h   |   2 +-
>  include/linux/types.h      |   9 ++
>  mm/kmemleak.c              |  11 +-
>  mm/slab.h                  |   2 +-
>  mm/slab_common.c           | 309 +++++++++++++++++++++++++------------
>  mm/slub.c                  |  47 ++++--
>  mm/vmalloc.c               |   4 +-
>  10 files changed, 310 insertions(+), 147 deletions(-)
> 
> -- 
> 2.43.0
> 