linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB
@ 2025-02-03  9:28 Vlastimil Babka
  2025-02-03  9:28 ` [PATCH 1/4] slab, rcu: move TINY_RCU variant of " Vlastimil Babka
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Vlastimil Babka @ 2025-02-03  9:28 UTC (permalink / raw)
  To: Christoph Lameter, David Rientjes, Paul E. McKenney,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo, linux-mm,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu,
	Vlastimil Babka

Following the move of the TREE_RCU batching kvfree_rcu() implementation
to slab, we still have the simple non-batching implementation in tiny
RCU, and RCU implementation specific ifdefs in slab code.

Finish the move and integration into slab. Allow using the simple
call_rcu() based implementation also with tree RCU when SLUB_TINY is
enabled, as its goal is also to limit memory footprint with less concern
for top performance.

In order to avoid RCU having to recognize the fake callback function
pointers (__is_kvfree_rcu_offset()) when handling call_rcu(), implement
a callback that can calculate the object's address from the embedded
rcu_head pointer without knowing the specific offset (previously SLOB
would not have made it possible, but it's gone now).

After this series, AFAIK only the following kvfree_rcu specific code
remains in RCU:

- a call to kfree_rcu_scheduler_running() from rcu_set_runtime_mode()

  - probably necessary and a generic registration interface would be
    unnecessary bloat?

- declarations of kfree_rcu() API in include/linux/rcupdate.h

  - could be moved to slab.h after checking for/fixing up potential
    missing includes

git tree:
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slub-tiny-kfree_rcu

Changes since RFC [1]:
- Rebased to v6.14-rc1
- Fixed build errors reported by bots.
- Changed where kvfree_call_rcu() is moved to in patch 1 to avoid another
  move in patch 4.
- R-b's by Ulad and Joel
- Fix a memory leak in kvfree_rcu_list() thanks to Ulad
- Various comments' improvements and fixes (Joel, Ulad)
- Rename config to CONFIG_KFREE_RCU_BATCHED (Ulad)

Due to changes, didn't keep Paul's Tested-by (but thanks!)

Will include in slab/for-next if no objection.

[1] https://lore.kernel.org/all/20250123-slub-tiny-kfree_rcu-v1-0-0e386ef1541a@suse.cz/

To: Christoph Lameter <cl@linux.com>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: linux-mm@kvack.org
To: "Paul E. McKenney" <paulmck@kernel.org>
To: Joel Fernandes <joel@joelfernandes.org>
To: Josh Triplett <josh@joshtriplett.org>
To: Boqun Feng <boqun.feng@gmail.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Zqiang <qiang.zhang1211@gmail.com>
Cc: rcu@vger.kernel.org

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
Vlastimil Babka (4):
      slab, rcu: move TINY_RCU variant of kvfree_rcu() to SLAB
      rcu: remove trace_rcu_kvfree_callback
      rcu, slab: use a regular callback function for kvfree_rcu
      slab: don't batch kvfree_rcu() with SLUB_TINY

 include/linux/rcupdate.h   | 33 ++++++++++++++++++---------------
 include/linux/rcutiny.h    | 36 ------------------------------------
 include/linux/rcutree.h    |  3 ---
 include/linux/slab.h       | 14 ++++++++++++++
 include/trace/events/rcu.h | 34 ----------------------------------
 kernel/rcu/tiny.c          | 25 -------------------------
 kernel/rcu/tree.c          |  9 ++-------
 mm/Kconfig                 |  4 ++++
 mm/slab.h                  |  2 ++
 mm/slab_common.c           | 33 +++++++++++++++++++++++++++------
 mm/slub.c                  | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 11 files changed, 113 insertions(+), 126 deletions(-)
---
base-commit: 2014c95afecee3e76ca4a56956a936e23283f05b
change-id: 20250123-slub-tiny-kfree_rcu-bd65bfe222f2

Best regards,
-- 
Vlastimil Babka <vbabka@suse.cz>



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/4] slab, rcu: move TINY_RCU variant of kvfree_rcu() to SLAB
  2025-02-03  9:28 [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB Vlastimil Babka
@ 2025-02-03  9:28 ` Vlastimil Babka
  2025-02-04  9:50   ` Harry (Hyeonggon) Yoo
  2025-02-03  9:28 ` [PATCH 2/4] rcu: remove trace_rcu_kvfree_callback Vlastimil Babka
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Vlastimil Babka @ 2025-02-03  9:28 UTC (permalink / raw)
  To: Christoph Lameter, David Rientjes, Paul E. McKenney,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo, linux-mm,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu,
	Vlastimil Babka

Following the move of TREE_RCU implementation, let's move also the
TINY_RCU one for consistency and subsequent refactoring.

For simplicity, remove the separate inline __kvfree_call_rcu() as
TINY_RCU is not meant for high-performance hardware anyway.

Declare kvfree_call_rcu() in rcupdate.h to avoid header dependency
issues.

Also move the kvfree_rcu_barrier() declaration to slab.h

Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/rcupdate.h |  5 +++++
 include/linux/rcutiny.h  | 36 ------------------------------------
 include/linux/rcutree.h  |  3 ---
 include/linux/slab.h     | 14 ++++++++++++++
 kernel/rcu/tiny.c        | 11 -----------
 mm/slab_common.c         | 19 +++++++++++++++++++
 6 files changed, 38 insertions(+), 50 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 48e5c03df1dd83c246a61d0fcc8aa638adcd7654..3f70d1c8144426f40553c8c589f07097ece8a706 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -1082,6 +1082,11 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
 #define kfree_rcu_mightsleep(ptr) kvfree_rcu_arg_1(ptr)
 #define kvfree_rcu_mightsleep(ptr) kvfree_rcu_arg_1(ptr)
 
+/*
+ * In mm/slab_common.c, no suitable header to include here.
+ */
+void kvfree_call_rcu(struct rcu_head *head, void *ptr);
+
 #define kvfree_rcu_arg_2(ptr, rhf)					\
 do {									\
 	typeof (ptr) ___p = (ptr);					\
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index fe42315f667fc5be7f2ed8eae6ea0c7193030846..f519cd6802286710bdd56588b5ff3d07bcd30b92 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -90,41 +90,6 @@ static inline void synchronize_rcu_expedited(void)
 	synchronize_rcu();
 }
 
-/*
- * Add one more declaration of kvfree() here. It is
- * not so straight forward to just include <linux/mm.h>
- * where it is defined due to getting many compile
- * errors caused by that include.
- */
-extern void kvfree(const void *addr);
-
-static inline void __kvfree_call_rcu(struct rcu_head *head, void *ptr)
-{
-	if (head) {
-		call_rcu(head, (rcu_callback_t) ((void *) head - ptr));
-		return;
-	}
-
-	// kvfree_rcu(one_arg) call.
-	might_sleep();
-	synchronize_rcu();
-	kvfree(ptr);
-}
-
-static inline void kvfree_rcu_barrier(void)
-{
-	rcu_barrier();
-}
-
-#ifdef CONFIG_KASAN_GENERIC
-void kvfree_call_rcu(struct rcu_head *head, void *ptr);
-#else
-static inline void kvfree_call_rcu(struct rcu_head *head, void *ptr)
-{
-	__kvfree_call_rcu(head, ptr);
-}
-#endif
-
 void rcu_qs(void);
 
 static inline void rcu_softirq_qs(void)
@@ -164,7 +129,6 @@ static inline void rcu_end_inkernel_boot(void) { }
 static inline bool rcu_inkernel_boot_has_ended(void) { return true; }
 static inline bool rcu_is_watching(void) { return true; }
 static inline void rcu_momentary_eqs(void) { }
-static inline void kfree_rcu_scheduler_running(void) { }
 
 /* Avoid RCU read-side critical sections leaking across. */
 static inline void rcu_all_qs(void) { barrier(); }
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 27d86d9127817e50f8d4dd79e1990d70a02435bb..dbe77b5fe06ec89a393b5444d6c479ced346a37b 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -34,12 +34,9 @@ static inline void rcu_virt_note_context_switch(void)
 }
 
 void synchronize_rcu_expedited(void);
-void kvfree_call_rcu(struct rcu_head *head, void *ptr);
-void kvfree_rcu_barrier(void);
 
 void rcu_barrier(void);
 void rcu_momentary_eqs(void);
-void kfree_rcu_scheduler_running(void);
 
 struct rcu_gp_oldstate {
 	unsigned long rgos_norm;
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 09eedaecf1205672bb2e7c8cd57ae8fccebc2737..bcc62e5656c35c6a3f4caf26fb33d7447dead39a 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -16,6 +16,7 @@
 #include <linux/gfp.h>
 #include <linux/overflow.h>
 #include <linux/types.h>
+#include <linux/rcupdate.h>
 #include <linux/workqueue.h>
 #include <linux/percpu-refcount.h>
 #include <linux/cleanup.h>
@@ -1082,6 +1083,19 @@ extern void kvfree_sensitive(const void *addr, size_t len);
 
 unsigned int kmem_cache_size(struct kmem_cache *s);
 
+#ifdef CONFIG_TINY_RCU
+static inline void kvfree_rcu_barrier(void)
+{
+	rcu_barrier();
+}
+
+static inline void kfree_rcu_scheduler_running(void) { }
+#else
+void kvfree_rcu_barrier(void);
+
+void kfree_rcu_scheduler_running(void);
+#endif
+
 /**
  * kmalloc_size_roundup - Report allocation bucket size for the given size
  *
diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index 4b3f319114650d372d490919669a2a9e248e731d..0ec27093d0e14a4b1060ea08932c4ac13f9b0f26 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -246,17 +246,6 @@ bool poll_state_synchronize_rcu(unsigned long oldstate)
 }
 EXPORT_SYMBOL_GPL(poll_state_synchronize_rcu);
 
-#ifdef CONFIG_KASAN_GENERIC
-void kvfree_call_rcu(struct rcu_head *head, void *ptr)
-{
-	if (head)
-		kasan_record_aux_stack(ptr);
-
-	__kvfree_call_rcu(head, ptr);
-}
-EXPORT_SYMBOL_GPL(kvfree_call_rcu);
-#endif
-
 void __init rcu_init(void)
 {
 	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 4030907b6b7d89adcf62c70e82e95c8deb41adba..81a0ce77b11c2ef8db9164cdca5853069402f161 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1284,6 +1284,25 @@ EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc);
 EXPORT_TRACEPOINT_SYMBOL(kfree);
 EXPORT_TRACEPOINT_SYMBOL(kmem_cache_free);
 
+#ifdef CONFIG_TINY_RCU
+
+void kvfree_call_rcu(struct rcu_head *head, void *ptr)
+{
+	if (head) {
+		kasan_record_aux_stack(ptr);
+		call_rcu(head, (rcu_callback_t) ((void *) head - ptr));
+		return;
+	}
+
+	// kvfree_rcu(one_arg) call.
+	might_sleep();
+	synchronize_rcu();
+	kvfree(ptr);
+}
+EXPORT_SYMBOL_GPL(kvfree_call_rcu);
+
+#endif
+
 /*
  * This rcu parameter is runtime-read-only. It reflects
  * a minimum allowed number of objects which can be cached

-- 
2.48.1



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/4] rcu: remove trace_rcu_kvfree_callback
  2025-02-03  9:28 [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB Vlastimil Babka
  2025-02-03  9:28 ` [PATCH 1/4] slab, rcu: move TINY_RCU variant of " Vlastimil Babka
@ 2025-02-03  9:28 ` Vlastimil Babka
  2025-02-05  9:28   ` Harry (Hyeonggon) Yoo
  2025-02-03  9:28 ` [PATCH 3/4] rcu, slab: use a regular callback function for kvfree_rcu Vlastimil Babka
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Vlastimil Babka @ 2025-02-03  9:28 UTC (permalink / raw)
  To: Christoph Lameter, David Rientjes, Paul E. McKenney,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo, linux-mm,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu,
	Vlastimil Babka

Tree RCU does not handle kvfree_rcu() by queueing individual objects by
call_rcu() anymore, thus the tracepoint and associated
__is_kvfree_rcu_offset() check is dead code now. Remove it.

Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/trace/events/rcu.h | 34 ----------------------------------
 kernel/rcu/tree.c          |  9 ++-------
 2 files changed, 2 insertions(+), 41 deletions(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index e81431deaa50b21725b08ad7c1a2992a0074065a..ac3b28b8939b78843aecb44bed8cb76c451ab060 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -560,40 +560,6 @@ TRACE_EVENT_RCU(rcu_segcb_stats,
 
 );
 
-/*
- * Tracepoint for the registration of a single RCU callback of the special
- * kvfree() form.  The first argument is the RCU type, the second argument
- * is a pointer to the RCU callback, the third argument is the offset
- * of the callback within the enclosing RCU-protected data structure,
- * the fourth argument is the number of lazy callbacks queued, and the
- * fifth argument is the total number of callbacks queued.
- */
-TRACE_EVENT_RCU(rcu_kvfree_callback,
-
-	TP_PROTO(const char *rcuname, struct rcu_head *rhp, unsigned long offset,
-		 long qlen),
-
-	TP_ARGS(rcuname, rhp, offset, qlen),
-
-	TP_STRUCT__entry(
-		__field(const char *, rcuname)
-		__field(void *, rhp)
-		__field(unsigned long, offset)
-		__field(long, qlen)
-	),
-
-	TP_fast_assign(
-		__entry->rcuname = rcuname;
-		__entry->rhp = rhp;
-		__entry->offset = offset;
-		__entry->qlen = qlen;
-	),
-
-	TP_printk("%s rhp=%p func=%ld %ld",
-		  __entry->rcuname, __entry->rhp, __entry->offset,
-		  __entry->qlen)
-);
-
 /*
  * Tracepoint for marking the beginning rcu_do_batch, performed to start
  * RCU callback invocation.  The first argument is the RCU flavor,
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 475f31deed1418d73d097bb4d39fe714814fefa3..5dbc4189037c501b8c671598131ed5ea6d4afd71 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2931,13 +2931,8 @@ static int __init rcu_spawn_core_kthreads(void)
 static void rcutree_enqueue(struct rcu_data *rdp, struct rcu_head *head, rcu_callback_t func)
 {
 	rcu_segcblist_enqueue(&rdp->cblist, head);
-	if (__is_kvfree_rcu_offset((unsigned long)func))
-		trace_rcu_kvfree_callback(rcu_state.name, head,
-					 (unsigned long)func,
-					 rcu_segcblist_n_cbs(&rdp->cblist));
-	else
-		trace_rcu_callback(rcu_state.name, head,
-				   rcu_segcblist_n_cbs(&rdp->cblist));
+	trace_rcu_callback(rcu_state.name, head,
+			   rcu_segcblist_n_cbs(&rdp->cblist));
 	trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
 }
 

-- 
2.48.1



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/4] rcu, slab: use a regular callback function for kvfree_rcu
  2025-02-03  9:28 [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB Vlastimil Babka
  2025-02-03  9:28 ` [PATCH 1/4] slab, rcu: move TINY_RCU variant of " Vlastimil Babka
  2025-02-03  9:28 ` [PATCH 2/4] rcu: remove trace_rcu_kvfree_callback Vlastimil Babka
@ 2025-02-03  9:28 ` Vlastimil Babka
  2025-02-05  9:13   ` Harry (Hyeonggon) Yoo
  2025-02-03  9:28 ` [PATCH 4/4] slab: don't batch kvfree_rcu() with SLUB_TINY Vlastimil Babka
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Vlastimil Babka @ 2025-02-03  9:28 UTC (permalink / raw)
  To: Christoph Lameter, David Rientjes, Paul E. McKenney,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo, linux-mm,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu,
	Vlastimil Babka

RCU has been special-casing callback function pointers that are integers
lower than 4096 as offsets of rcu_head for kvfree() instead. The tree
RCU implementation no longer does that as the batched kvfree_rcu() is
not a simple call_rcu(). The tiny RCU still does, and the plan is also
to make tree RCU use call_rcu() for SLUB_TINY configurations.

Instead of teaching tree RCU again to special case the offsets, let's
remove the special casing completely. Since there's no SLOB anymore, it
is possible to create a callback function that can take a pointer to a
middle of slab object with unknown offset and determine the object's
pointer before freeing it, so implement that as kvfree_rcu_cb().

Large kmalloc and vmalloc allocations are handled simply by aligning
down to page size. For that we retain the requirement that the offset is
smaller than 4096. But we can remove __is_kvfree_rcu_offset() completely
and instead just opencode the condition in the BUILD_BUG_ON() check.

Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/rcupdate.h | 28 +++++++++++++---------------
 kernel/rcu/tiny.c        | 14 --------------
 mm/slab.h                |  2 ++
 mm/slab_common.c         |  5 ++---
 mm/slub.c                | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 63 insertions(+), 32 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 3f70d1c8144426f40553c8c589f07097ece8a706..23bcf71ffb06ecf60d42690803ffc5adb7d1aedd 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -1025,12 +1025,6 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
 #define RCU_POINTER_INITIALIZER(p, v) \
 		.p = RCU_INITIALIZER(v)
 
-/*
- * Does the specified offset indicate that the corresponding rcu_head
- * structure can be handled by kvfree_rcu()?
- */
-#define __is_kvfree_rcu_offset(offset) ((offset) < 4096)
-
 /**
  * kfree_rcu() - kfree an object after a grace period.
  * @ptr: pointer to kfree for double-argument invocations.
@@ -1041,11 +1035,11 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
  * when they are used in a kernel module, that module must invoke the
  * high-latency rcu_barrier() function at module-unload time.
  *
- * The kfree_rcu() function handles this issue.  Rather than encoding a
- * function address in the embedded rcu_head structure, kfree_rcu() instead
- * encodes the offset of the rcu_head structure within the base structure.
- * Because the functions are not allowed in the low-order 4096 bytes of
- * kernel virtual memory, offsets up to 4095 bytes can be accommodated.
+ * The kfree_rcu() function handles this issue. In order to have a universal
+ * callback function handling different offsets of rcu_head, the callback needs
+ * to determine the starting address of the freed object, which can be a large
+ * kmalloc or vmalloc allocation. To allow simply aligning the pointer down to
+ * page boundary for those, only offsets up to 4095 bytes can be accommodated.
  * If the offset is larger than 4095 bytes, a compile-time error will
  * be generated in kvfree_rcu_arg_2(). If this error is triggered, you can
  * either fall back to use of call_rcu() or rearrange the structure to
@@ -1087,14 +1081,18 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
  */
 void kvfree_call_rcu(struct rcu_head *head, void *ptr);
 
+/*
+ * The BUILD_BUG_ON() makes sure the rcu_head offset can be handled. See the
+ * comment of kfree_rcu() for details.
+ */
 #define kvfree_rcu_arg_2(ptr, rhf)					\
 do {									\
 	typeof (ptr) ___p = (ptr);					\
 									\
-	if (___p) {									\
-		BUILD_BUG_ON(!__is_kvfree_rcu_offset(offsetof(typeof(*(ptr)), rhf)));	\
-		kvfree_call_rcu(&((___p)->rhf), (void *) (___p));			\
-	}										\
+	if (___p) {							\
+		BUILD_BUG_ON(offsetof(typeof(*(ptr)), rhf) >= 4096);	\
+		kvfree_call_rcu(&((___p)->rhf), (void *) (___p));	\
+	}								\
 } while (0)
 
 #define kvfree_rcu_arg_1(ptr)					\
diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index 0ec27093d0e14a4b1060ea08932c4ac13f9b0f26..7a34a99d4664f3fc050ec22378782fe26e8d1e95 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -85,15 +85,8 @@ void rcu_sched_clock_irq(int user)
 static inline bool rcu_reclaim_tiny(struct rcu_head *head)
 {
 	rcu_callback_t f;
-	unsigned long offset = (unsigned long)head->func;
 
 	rcu_lock_acquire(&rcu_callback_map);
-	if (__is_kvfree_rcu_offset(offset)) {
-		trace_rcu_invoke_kvfree_callback("", head, offset);
-		kvfree((void *)head - offset);
-		rcu_lock_release(&rcu_callback_map);
-		return true;
-	}
 
 	trace_rcu_invoke_callback("", head);
 	f = head->func;
@@ -159,10 +152,6 @@ void synchronize_rcu(void)
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu);
 
-static void tiny_rcu_leak_callback(struct rcu_head *rhp)
-{
-}
-
 /*
  * Post an RCU callback to be invoked after the end of an RCU grace
  * period.  But since we have but one CPU, that would be after any
@@ -178,9 +167,6 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
 			pr_err("%s(): Double-freed CB %p->%pS()!!!  ", __func__, head, head->func);
 			mem_dump_obj(head);
 		}
-
-		if (!__is_kvfree_rcu_offset((unsigned long)head->func))
-			WRITE_ONCE(head->func, tiny_rcu_leak_callback);
 		return;
 	}
 
diff --git a/mm/slab.h b/mm/slab.h
index e9fd9bf0bfa65b343a4ae0ecd5b4c2a325b04883..2f01c7317988ce036f0b22807403226a59f0f708 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -604,6 +604,8 @@ void __memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
 			    void **p, int objects, struct slabobj_ext *obj_exts);
 #endif
 
+void kvfree_rcu_cb(struct rcu_head *head);
+
 size_t __ksize(const void *objp);
 
 static inline size_t slab_ksize(const struct kmem_cache *s)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 81a0ce77b11c2ef8db9164cdca5853069402f161..6438a38aa5dc2ede0b5afa04bc3fbff5a4697d87 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1290,7 +1290,7 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr)
 {
 	if (head) {
 		kasan_record_aux_stack(ptr);
-		call_rcu(head, (rcu_callback_t) ((void *) head - ptr));
+		call_rcu(head, kvfree_rcu_cb);
 		return;
 	}
 
@@ -1551,8 +1551,7 @@ kvfree_rcu_list(struct rcu_head *head)
 		rcu_lock_acquire(&rcu_callback_map);
 		trace_rcu_invoke_kvfree_callback("slab", head, offset);
 
-		if (!WARN_ON_ONCE(!__is_kvfree_rcu_offset(offset)))
-			kvfree(ptr);
+		kvfree(ptr);
 
 		rcu_lock_release(&rcu_callback_map);
 		cond_resched_tasks_rcu_qs();
diff --git a/mm/slub.c b/mm/slub.c
index 1f50129dcfb3cd1fc76ac9398fa7718cedb42385..e8273f28656936c05d015c53923f8fe69cd161b2 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -19,6 +19,7 @@
 #include <linux/bitops.h>
 #include <linux/slab.h>
 #include "slab.h"
+#include <linux/vmalloc.h>
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
 #include <linux/kasan.h>
@@ -4728,6 +4729,51 @@ static void free_large_kmalloc(struct folio *folio, void *object)
 	folio_put(folio);
 }
 
+/*
+ * Given an rcu_head embedded within an object obtained from kvmalloc at an
+ * offset < 4k, free the object in question.
+ */
+void kvfree_rcu_cb(struct rcu_head *head)
+{
+	void *obj = head;
+	struct folio *folio;
+	struct slab *slab;
+	struct kmem_cache *s;
+	void *slab_addr;
+
+	if (is_vmalloc_addr(obj)) {
+		obj = (void *) PAGE_ALIGN_DOWN((unsigned long)obj);
+		vfree(obj);
+		return;
+	}
+
+	folio = virt_to_folio(obj);
+	if (!folio_test_slab(folio)) {
+		/*
+		 * rcu_head offset can be only less than page size so no need to
+		 * consider folio order
+		 */
+		obj = (void *) PAGE_ALIGN_DOWN((unsigned long)obj);
+		free_large_kmalloc(folio, obj);
+		return;
+	}
+
+	slab = folio_slab(folio);
+	s = slab->slab_cache;
+	slab_addr = folio_address(folio);
+
+	if (is_kfence_address(obj)) {
+		obj = kfence_object_start(obj);
+	} else {
+		unsigned int idx = __obj_to_index(s, slab_addr, obj);
+
+		obj = slab_addr + s->size * idx;
+		obj = fixup_red_left(s, obj);
+	}
+
+	slab_free(s, slab, obj, _RET_IP_);
+}
+
 /**
  * kfree - free previously allocated memory
  * @object: pointer returned by kmalloc() or kmem_cache_alloc()

-- 
2.48.1



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 4/4] slab: don't batch kvfree_rcu() with SLUB_TINY
  2025-02-03  9:28 [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB Vlastimil Babka
                   ` (2 preceding siblings ...)
  2025-02-03  9:28 ` [PATCH 3/4] rcu, slab: use a regular callback function for kvfree_rcu Vlastimil Babka
@ 2025-02-03  9:28 ` Vlastimil Babka
  2025-02-05  9:40   ` Harry (Hyeonggon) Yoo
  2025-02-03 19:07 ` [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB Paul E. McKenney
  2025-02-05  9:54 ` Vlastimil Babka
  5 siblings, 1 reply; 13+ messages in thread
From: Vlastimil Babka @ 2025-02-03  9:28 UTC (permalink / raw)
  To: Christoph Lameter, David Rientjes, Paul E. McKenney,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo, linux-mm,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu,
	Vlastimil Babka

kvfree_rcu() is batched for better performance except on TINY_RCU, which
is a simple implementation for small UP systems. Similarly SLUB_TINY is
an option intended for small systems, whether or not used together with
TINY_RCU. In case SLUB_TINY is used with !TINY_RCU, it makes arguably
sense to not do the batching and limit the memory footprint. It's also
suboptimal to have RCU-specific #ifdefs in slab code.

With that, add CONFIG_KVFREE_RCU_BATCHED to determine whether batching
kvfree_rcu() implementation is used. It is not set by a user prompt, but
enabled by default and disabled in case TINY_RCU or SLUB_TINY are
enabled.

Use the new config for #ifdef's in slab code and extend their scope to
cover all code used by the batched kvfree_rcu(). For example there's no
need to perform kvfree_rcu_init() if the batching is disabled.

Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/slab.h |  2 +-
 mm/Kconfig           |  4 ++++
 mm/slab_common.c     | 15 +++++++++------
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index bcc62e5656c35c6a3f4caf26fb33d7447dead39a..7686054dd494cc65def7f58748718e03eb78e481 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -1083,7 +1083,7 @@ extern void kvfree_sensitive(const void *addr, size_t len);
 
 unsigned int kmem_cache_size(struct kmem_cache *s);
 
-#ifdef CONFIG_TINY_RCU
+#ifndef CONFIG_KVFREE_RCU_BATCHED
 static inline void kvfree_rcu_barrier(void)
 {
 	rcu_barrier();
diff --git a/mm/Kconfig b/mm/Kconfig
index 1b501db064172cf54f1b1259893dfba676473c56..0b7f4bb5cb801c110aaa484ceae3063aa7204d29 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -242,6 +242,10 @@ menu "Slab allocator options"
 config SLUB
 	def_bool y
 
+config KVFREE_RCU_BATCHED
+	def_bool y
+	depends on !SLUB_TINY && !TINY_RCU
+
 config SLUB_TINY
 	bool "Configure for minimal memory footprint"
 	depends on EXPERT
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 6438a38aa5dc2ede0b5afa04bc3fbff5a4697d87..46d0a4cd33b5982fd79c307d572f231fdea9514a 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1284,7 +1284,7 @@ EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc);
 EXPORT_TRACEPOINT_SYMBOL(kfree);
 EXPORT_TRACEPOINT_SYMBOL(kmem_cache_free);
 
-#ifdef CONFIG_TINY_RCU
+#ifndef CONFIG_KVFREE_RCU_BATCHED
 
 void kvfree_call_rcu(struct rcu_head *head, void *ptr)
 {
@@ -1301,7 +1301,11 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr)
 }
 EXPORT_SYMBOL_GPL(kvfree_call_rcu);
 
-#endif
+void __init kvfree_rcu_init(void)
+{
+}
+
+#else /* CONFIG_KVFREE_RCU_BATCHED */
 
 /*
  * This rcu parameter is runtime-read-only. It reflects
@@ -1879,8 +1883,6 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
 	return true;
 }
 
-#if !defined(CONFIG_TINY_RCU)
-
 static enum hrtimer_restart
 schedule_page_work_fn(struct hrtimer *t)
 {
@@ -2089,8 +2091,6 @@ void kvfree_rcu_barrier(void)
 }
 EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
 
-#endif /* #if !defined(CONFIG_TINY_RCU) */
-
 static unsigned long
 kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 {
@@ -2180,3 +2180,6 @@ void __init kvfree_rcu_init(void)
 
 	shrinker_register(kfree_rcu_shrinker);
 }
+
+#endif /* CONFIG_KVFREE_RCU_BATCHED */
+

-- 
2.48.1



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB
  2025-02-03  9:28 [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB Vlastimil Babka
                   ` (3 preceding siblings ...)
  2025-02-03  9:28 ` [PATCH 4/4] slab: don't batch kvfree_rcu() with SLUB_TINY Vlastimil Babka
@ 2025-02-03 19:07 ` Paul E. McKenney
  2025-02-04 10:02   ` Vlastimil Babka
  2025-02-05  9:54 ` Vlastimil Babka
  5 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2025-02-03 19:07 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Christoph Lameter, David Rientjes, Joel Fernandes, Josh Triplett,
	Boqun Feng, Uladzislau Rezki, Andrew Morton, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Steven Rostedt, Mathieu Desnoyers,
	Lai Jiangshan, Zqiang, rcu

On Mon, Feb 03, 2025 at 10:28:46AM +0100, Vlastimil Babka wrote:
> Following the move of the TREE_RCU batching kvfree_rcu() implementation
> to slab, we still have the simple non-batching implementation in tiny
> RCU, and RCU implementation specific ifdefs in slab code.
> 
> Finish the move and integration into slab. Allow using the simple
> call_rcu() based implementation also with tree RCU when SLUB_TINY is
> enabled, as its goal is also to limit memory footprint with less concern
> for top performance.
> 
> In order to avoid RCU having to recognize the fake callback function
> pointers (__is_kvfree_rcu_offset()) when handling call_rcu(), implement
> a callback that can calculate the object's address from the embedded
> rcu_head pointer without knowing the specific offset (previously SLOB
> would not have made it possible, but it's gone now).
> 
> After this series, AFAIK only the following kvfree_rcu specific code
> remains in RCU:
> 
> - a call to kfree_rcu_scheduler_running() from rcu_set_runtime_mode()
> 
>   - probably necessary and a generic registration interface would be
>     unnecessary bloat?

I am OK with this calling directly.

Alternatively, you could create a core_initcall() that did nothing
but invoke kfree_rcu_scheduler_running(), and then drop that call from
rcu_set_runtime_mode().  Up to you guys!

> - declarations of kfree_rcu() API in include/linux/rcupdate.h
> 
>   - could be moved to slab.h after checking for/fixing up potential
>     missing includes

It would be good to move them, but no real hurry from this end.

> git tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slub-tiny-kfree_rcu
> 
> Changes since RFC [1]:
> - Rebased to v6.14-rc1
> - Fixed build errors reported by bots.
> - Changed where kvfree_call_rcu() is moved to in patch 1 to avoid another
>   move in patch 4.
> - R-b's by Ulad and Joel
> - Fix a memory leak in kvfree_rcu_list() thanks to Ulad
> - Various comments' improvements and fixes (Joel, Ulad)
> - Rename config to CONFIG_KFREE_RCU_BATCHED (Ulad)
> 
> Due to changes, didn't keep Paul's Tested-by (but thanks!)

I re-ran the tests, so feel free to add it back in.  ;-)

							Thanx, Paul

> Will include in slab/for-next if no objection.
> 
> [1] https://lore.kernel.org/all/20250123-slub-tiny-kfree_rcu-v1-0-0e386ef1541a@suse.cz/
> 
> To: Christoph Lameter <cl@linux.com>
> To: David Rientjes <rientjes@google.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> Cc: linux-mm@kvack.org
> To: "Paul E. McKenney" <paulmck@kernel.org>
> To: Joel Fernandes <joel@joelfernandes.org>
> To: Josh Triplett <josh@joshtriplett.org>
> To: Boqun Feng <boqun.feng@gmail.com>
> To: Uladzislau Rezki <urezki@gmail.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Zqiang <qiang.zhang1211@gmail.com>
> Cc: rcu@vger.kernel.org
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> Vlastimil Babka (4):
>       slab, rcu: move TINY_RCU variant of kvfree_rcu() to SLAB
>       rcu: remove trace_rcu_kvfree_callback
>       rcu, slab: use a regular callback function for kvfree_rcu
>       slab: don't batch kvfree_rcu() with SLUB_TINY
> 
>  include/linux/rcupdate.h   | 33 ++++++++++++++++++---------------
>  include/linux/rcutiny.h    | 36 ------------------------------------
>  include/linux/rcutree.h    |  3 ---
>  include/linux/slab.h       | 14 ++++++++++++++
>  include/trace/events/rcu.h | 34 ----------------------------------
>  kernel/rcu/tiny.c          | 25 -------------------------
>  kernel/rcu/tree.c          |  9 ++-------
>  mm/Kconfig                 |  4 ++++
>  mm/slab.h                  |  2 ++
>  mm/slab_common.c           | 33 +++++++++++++++++++++++++++------
>  mm/slub.c                  | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>  11 files changed, 113 insertions(+), 126 deletions(-)
> ---
> base-commit: 2014c95afecee3e76ca4a56956a936e23283f05b
> change-id: 20250123-slub-tiny-kfree_rcu-bd65bfe222f2
> 
> Best regards,
> -- 
> Vlastimil Babka <vbabka@suse.cz>
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/4] slab, rcu: move TINY_RCU variant of kvfree_rcu() to SLAB
  2025-02-03  9:28 ` [PATCH 1/4] slab, rcu: move TINY_RCU variant of " Vlastimil Babka
@ 2025-02-04  9:50   ` Harry (Hyeonggon) Yoo
  0 siblings, 0 replies; 13+ messages in thread
From: Harry (Hyeonggon) Yoo @ 2025-02-04  9:50 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Christoph Lameter, David Rientjes, Paul E. McKenney,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
	Andrew Morton, Roman Gushchin, linux-mm, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu

On Mon, Feb 03, 2025 at 10:28:47AM +0100, Vlastimil Babka wrote:
> Following the move of TREE_RCU implementation, let's move also the
> TINY_RCU one for consistency and subsequent refactoring.
> 
> For simplicity, remove the separate inline __kvfree_call_rcu() as
> TINY_RCU is not meant for high-performance hardware anyway.
> 
> Declare kvfree_call_rcu() in rcupdate.h to avoid header dependency
> issues.
> 
> Also move the kvfree_rcu_barrier() declaration to slab.h
> 
> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Looks good to me,
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

-- 
Harry


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB
  2025-02-03 19:07 ` [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB Paul E. McKenney
@ 2025-02-04 10:02   ` Vlastimil Babka
  2025-02-04 11:41     ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Vlastimil Babka @ 2025-02-04 10:02 UTC (permalink / raw)
  To: paulmck
  Cc: Christoph Lameter, David Rientjes, Joel Fernandes, Josh Triplett,
	Boqun Feng, Uladzislau Rezki, Andrew Morton, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Steven Rostedt, Mathieu Desnoyers,
	Lai Jiangshan, Zqiang, rcu

On 2/3/25 20:07, Paul E. McKenney wrote:
> On Mon, Feb 03, 2025 at 10:28:46AM +0100, Vlastimil Babka wrote:
>> Following the move of the TREE_RCU batching kvfree_rcu() implementation
>> to slab, we still have the simple non-batching implementation in tiny
>> RCU, and RCU implementation specific ifdefs in slab code.
>> 
>> Finish the move and integration into slab. Allow using the simple
>> call_rcu() based implementation also with tree RCU when SLUB_TINY is
>> enabled, as its goal is also to limit memory footprint with less concern
>> for top performance.
>> 
>> In order to avoid RCU having to recognize the fake callback function
>> pointers (__is_kvfree_rcu_offset()) when handling call_rcu(), implement
>> a callback that can calculate the object's address from the embedded
>> rcu_head pointer without knowing the specific offset (previously SLOB
>> would not have made it possible, but it's gone now).
>> 
>> After this series, AFAIK only the following kvfree_rcu specific code
>> remains in RCU:
>> 
>> - a call to kfree_rcu_scheduler_running() from rcu_set_runtime_mode()
>> 
>>   - probably necessary and a generic registration interface would be
>>     unnecessary bloat?
> 
> I am OK with this calling directly.
> 
> Alternatively, you could create a core_initcall() that did nothing
> but invoke kfree_rcu_scheduler_running(), and then drop that call from
> rcu_set_runtime_mode().  Up to you guys!

I was afraid there's some ordering necessary with the rest of what
rcu_set_runtime_mode() does, but perhaps there's actually not.

>> - declarations of kfree_rcu() API in include/linux/rcupdate.h
>> 
>>   - could be moved to slab.h after checking for/fixing up potential
>>     missing includes
> 
> It would be good to move them, but no real hurry from this end.
> 
>> git tree:
>> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slub-tiny-kfree_rcu
>> 
>> Changes since RFC [1]:
>> - Rebased to v6.14-rc1
>> - Fixed build errors reported by bots.
>> - Changed where kvfree_call_rcu() is moved to in patch 1 to avoid another
>>   move in patch 4.
>> - R-b's by Ulad and Joel
>> - Fix a memory leak in kvfree_rcu_list() thanks to Ulad
>> - Various comments' improvements and fixes (Joel, Ulad)
>> - Rename config to CONFIG_KFREE_RCU_BATCHED (Ulad)
>> 
>> Due to changes, didn't keep Paul's Tested-by (but thanks!)
> 
> I re-ran the tests, so feel free to add it back in.  ;-)

Thanks, will do :)




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB
  2025-02-04 10:02   ` Vlastimil Babka
@ 2025-02-04 11:41     ` Paul E. McKenney
  0 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2025-02-04 11:41 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Christoph Lameter, David Rientjes, Joel Fernandes, Josh Triplett,
	Boqun Feng, Uladzislau Rezki, Andrew Morton, Roman Gushchin,
	Hyeonggon Yoo, linux-mm, Steven Rostedt, Mathieu Desnoyers,
	Lai Jiangshan, Zqiang, rcu

On Tue, Feb 04, 2025 at 11:02:47AM +0100, Vlastimil Babka wrote:
> On 2/3/25 20:07, Paul E. McKenney wrote:
> > On Mon, Feb 03, 2025 at 10:28:46AM +0100, Vlastimil Babka wrote:
> >> Following the move of the TREE_RCU batching kvfree_rcu() implementation
> >> to slab, we still have the simple non-batching implementation in tiny
> >> RCU, and RCU implementation specific ifdefs in slab code.
> >> 
> >> Finish the move and integration into slab. Allow using the simple
> >> call_rcu() based implementation also with tree RCU when SLUB_TINY is
> >> enabled, as its goal is also to limit memory footprint with less concern
> >> for top performance.
> >> 
> >> In order to avoid RCU having to recognize the fake callback function
> >> pointers (__is_kvfree_rcu_offset()) when handling call_rcu(), implement
> >> a callback that can calculate the object's address from the embedded
> >> rcu_head pointer without knowing the specific offset (previously SLOB
> >> would not have made it possible, but it's gone now).
> >> 
> >> After this series, AFAIK only the following kvfree_rcu specific code
> >> remains in RCU:
> >> 
> >> - a call to kfree_rcu_scheduler_running() from rcu_set_runtime_mode()
> >> 
> >>   - probably necessary and a generic registration interface would be
> >>     unnecessary bloat?
> > 
> > I am OK with this calling directly.
> > 
> > Alternatively, you could create a core_initcall() that did nothing
> > but invoke kfree_rcu_scheduler_running(), and then drop that call from
> > rcu_set_runtime_mode().  Up to you guys!
> 
> I was afraid there's some ordering necessary with the rest of what
> rcu_set_runtime_mode() does, but perhaps there's actually not.

I *think* that they are independent, but I agree that it would be good
to carefully double-check.

The basis for my belief is that it is OK to invoke call_rcu() long
before rcu_init() is invoked, let alone the scheduler being ready.

							Thanx, Paul

> >> - declarations of kfree_rcu() API in include/linux/rcupdate.h
> >> 
> >>   - could be moved to slab.h after checking for/fixing up potential
> >>     missing includes
> > 
> > It would be good to move them, but no real hurry from this end.
> > 
> >> git tree:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slub-tiny-kfree_rcu
> >> 
> >> Changes since RFC [1]:
> >> - Rebased to v6.14-rc1
> >> - Fixed build errors reported by bots.
> >> - Changed where kvfree_call_rcu() is moved to in patch 1 to avoid another
> >>   move in patch 4.
> >> - R-b's by Ulad and Joel
> >> - Fix a memory leak in kvfree_rcu_list() thanks to Ulad
> >> - Various comments' improvements and fixes (Joel, Ulad)
> >> - Rename config to CONFIG_KFREE_RCU_BATCHED (Ulad)
> >> 
> >> Due to changes, didn't keep Paul's Tested-by (but thanks!)
> > 
> > I re-ran the tests, so feel free to add it back in.  ;-)
> 
> Thanks, will do :)
> 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] rcu, slab: use a regular callback function for kvfree_rcu
  2025-02-03  9:28 ` [PATCH 3/4] rcu, slab: use a regular callback function for kvfree_rcu Vlastimil Babka
@ 2025-02-05  9:13   ` Harry (Hyeonggon) Yoo
  0 siblings, 0 replies; 13+ messages in thread
From: Harry (Hyeonggon) Yoo @ 2025-02-05  9:13 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Christoph Lameter, David Rientjes, Paul E. McKenney,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
	Andrew Morton, Roman Gushchin, linux-mm, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu

On Mon, Feb 03, 2025 at 10:28:49AM +0100, Vlastimil Babka wrote:
> RCU has been special-casing callback function pointers that are integers
> lower than 4096 as offsets of rcu_head for kvfree() instead. The tree
> RCU implementation no longer does that as the batched kvfree_rcu() is
> not a simple call_rcu(). The tiny RCU still does, and the plan is also
> to make tree RCU use call_rcu() for SLUB_TINY configurations.
> 
> Instead of teaching tree RCU again to special case the offsets, let's
> remove the special casing completely. Since there's no SLOB anymore, it
> is possible to create a callback function that can take a pointer to a
> middle of slab object with unknown offset and determine the object's
> pointer before freeing it, so implement that as kvfree_rcu_cb().
> 
> Large kmalloc and vmalloc allocations are handled simply by aligning
> down to page size. For that we retain the requirement that the offset is
> smaller than 4096. But we can remove __is_kvfree_rcu_offset() completely
> and instead just opencode the condition in the BUILD_BUG_ON() check.
> 
> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Looks good to me,
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

-- 
Harry


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/4] rcu: remove trace_rcu_kvfree_callback
  2025-02-03  9:28 ` [PATCH 2/4] rcu: remove trace_rcu_kvfree_callback Vlastimil Babka
@ 2025-02-05  9:28   ` Harry (Hyeonggon) Yoo
  0 siblings, 0 replies; 13+ messages in thread
From: Harry (Hyeonggon) Yoo @ 2025-02-05  9:28 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Christoph Lameter, David Rientjes, Paul E. McKenney,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
	Andrew Morton, Roman Gushchin, linux-mm, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu

On Mon, Feb 03, 2025 at 10:28:48AM +0100, Vlastimil Babka wrote:
> Tree RCU does not handle kvfree_rcu() by queueing individual objects by
> call_rcu() anymore, thus the tracepoint and associated
> __is_kvfree_rcu_offset() check is dead code now. Remove it.
> 
> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Looks good to me,
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

-- 
Harry


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/4] slab: don't batch kvfree_rcu() with SLUB_TINY
  2025-02-03  9:28 ` [PATCH 4/4] slab: don't batch kvfree_rcu() with SLUB_TINY Vlastimil Babka
@ 2025-02-05  9:40   ` Harry (Hyeonggon) Yoo
  0 siblings, 0 replies; 13+ messages in thread
From: Harry (Hyeonggon) Yoo @ 2025-02-05  9:40 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Christoph Lameter, David Rientjes, Paul E. McKenney,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
	Andrew Morton, Roman Gushchin, linux-mm, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu

On Mon, Feb 03, 2025 at 10:28:50AM +0100, Vlastimil Babka wrote:
> kvfree_rcu() is batched for better performance except on TINY_RCU, which
> is a simple implementation for small UP systems. Similarly SLUB_TINY is
> an option intended for small systems, whether or not used together with
> TINY_RCU. In case SLUB_TINY is used with !TINY_RCU, it makes arguably
> sense to not do the batching and limit the memory footprint. It's also
> suboptimal to have RCU-specific #ifdefs in slab code.
> 
> With that, add CONFIG_KVFREE_RCU_BATCHED to determine whether batching
> kvfree_rcu() implementation is used. It is not set by a user prompt, but
> enabled by default and disabled in case TINY_RCU or SLUB_TINY are
> enabled.
> 
> Use the new config for #ifdef's in slab code and extend their scope to
> cover all code used by the batched kvfree_rcu(). For example there's no
> need to perform kvfree_rcu_init() if the batching is disabled.
> 
> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---

Looks good to me,
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

-- 
Harry


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB
  2025-02-03  9:28 [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB Vlastimil Babka
                   ` (4 preceding siblings ...)
  2025-02-03 19:07 ` [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB Paul E. McKenney
@ 2025-02-05  9:54 ` Vlastimil Babka
  5 siblings, 0 replies; 13+ messages in thread
From: Vlastimil Babka @ 2025-02-05  9:54 UTC (permalink / raw)
  To: Christoph Lameter, David Rientjes, Paul E. McKenney,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo, linux-mm,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu

On 2/3/25 10:28, Vlastimil Babka wrote:
> Following the move of the TREE_RCU batching kvfree_rcu() implementation
> to slab, we still have the simple non-batching implementation in tiny
> RCU, and RCU implementation specific ifdefs in slab code.
> 
> Finish the move and integration into slab. Allow using the simple
> call_rcu() based implementation also with tree RCU when SLUB_TINY is
> enabled, as its goal is also to limit memory footprint with less concern
> for top performance.
> 
> In order to avoid RCU having to recognize the fake callback function
> pointers (__is_kvfree_rcu_offset()) when handling call_rcu(), implement
> a callback that can calculate the object's address from the embedded
> rcu_head pointer without knowing the specific offset (previously SLOB
> would not have made it possible, but it's gone now).
> 
> After this series, AFAIK only the following kvfree_rcu specific code
> remains in RCU:
> 
> - a call to kfree_rcu_scheduler_running() from rcu_set_runtime_mode()
> 
>   - probably necessary and a generic registration interface would be
>     unnecessary bloat?
> 
> - declarations of kfree_rcu() API in include/linux/rcupdate.h
> 
>   - could be moved to slab.h after checking for/fixing up potential
>     missing includes
> 
> git tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slub-tiny-kfree_rcu
> 
> Changes since RFC [1]:
> - Rebased to v6.14-rc1
> - Fixed build errors reported by bots.
> - Changed where kvfree_call_rcu() is moved to in patch 1 to avoid another
>   move in patch 4.
> - R-b's by Ulad and Joel
> - Fix a memory leak in kvfree_rcu_list() thanks to Ulad
> - Various comments' improvements and fixes (Joel, Ulad)
> - Rename config to CONFIG_KFREE_RCU_BATCHED (Ulad)
> 
> Due to changes, didn't keep Paul's Tested-by (but thanks!)
> 
> Will include in slab/for-next if no objection.

Now done, thanks to Paul for testing and Harry for reviews!



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-02-05  9:56 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-03  9:28 [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB Vlastimil Babka
2025-02-03  9:28 ` [PATCH 1/4] slab, rcu: move TINY_RCU variant of " Vlastimil Babka
2025-02-04  9:50   ` Harry (Hyeonggon) Yoo
2025-02-03  9:28 ` [PATCH 2/4] rcu: remove trace_rcu_kvfree_callback Vlastimil Babka
2025-02-05  9:28   ` Harry (Hyeonggon) Yoo
2025-02-03  9:28 ` [PATCH 3/4] rcu, slab: use a regular callback function for kvfree_rcu Vlastimil Babka
2025-02-05  9:13   ` Harry (Hyeonggon) Yoo
2025-02-03  9:28 ` [PATCH 4/4] slab: don't batch kvfree_rcu() with SLUB_TINY Vlastimil Babka
2025-02-05  9:40   ` Harry (Hyeonggon) Yoo
2025-02-03 19:07 ` [PATCH 0/4] slab, rcu: move and consolidate TINY_RCU kvfree_rcu() to SLAB Paul E. McKenney
2025-02-04 10:02   ` Vlastimil Babka
2025-02-04 11:41     ` Paul E. McKenney
2025-02-05  9:54 ` Vlastimil Babka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox