* [PATCH v2 0/2] allow KASAN to detect UAF in SLAB_TYPESAFE_BY_RCU slabs
From: Jann Horn @ 2024-07-24 16:34 UTC
  To: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
	Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Vlastimil Babka, Roman Gushchin, Hyeonggon Yoo
  Cc: Marco Elver, kasan-dev, linux-kernel, linux-mm, Jann Horn

Hi!

This is v2 of a series that I started, uuuh, almost a year ago.
(Sorry...)
v1 was at:
https://lore.kernel.org/lkml/20230825211426.3798691-1-jannh@google.com/

The purpose of the series is to allow KASAN to detect use-after-free
accesses in SLAB_TYPESAFE_BY_RCU slab caches, essentially by making such
a cache behave as if it were not SLAB_TYPESAFE_BY_RCU and every kfree()
on it were a kfree_rcu() instead.
This is gated behind a config flag that is intended to be enabled only
in fuzzing/testing builds, where the performance impact doesn't matter.
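
For context, a rough sketch of the usage pattern that SLAB_TYPESAFE_BY_RCU
permits; "struct foo", foo_cache and foo_lookup() are invented for this
illustration, only the slab/RCU/refcount calls are real kernel API:

#include <linux/slab.h>
#include <linux/rcupdate.h>
#include <linux/refcount.h>

struct foo {
        refcount_t ref;
};

static struct kmem_cache *foo_cache;    /* created with SLAB_TYPESAFE_BY_RCU */
struct foo *foo_lookup(void);           /* hypothetical lockless lookup */

struct foo *foo_get(void)
{
        struct foo *f;

        rcu_read_lock();
        f = foo_lookup();
        /*
         * Even if another CPU has already done kmem_cache_free(foo_cache, f),
         * reading *f here is legal: the memory stays a struct foo until an
         * RCU grace period has elapsed, and the reader revalidates via the
         * refcount before trusting the object.
         */
        if (f && !refcount_inc_not_zero(&f->ref))
                f = NULL;
        rcu_read_unlock();
        return f;
}

Accessing *f after the grace period that follows the free is a real bug;
with this series, KASAN can detect it, as in the kunit output below.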

Patch 1/2 is new; it is necessary prep work for the main patch, though
the KASAN integration is maybe a bit ugly.
Patch 2/2 is a rebased version of the old patch, with some changes to
how the config is wired up, with poison/unpoison logic added as
suggested by dvyukov@ back then, with cache destruction fixed using
rcu_barrier() as pointed out by dvyukov@ and the test robot, and with a
test added as suggested by elver@.

Output of the new kunit testcase I added to the KASAN test suite:
==================================================================
BUG: KASAN: slab-use-after-free in kmem_cache_rcu_uaf+0x3ae/0x4d0
Read of size 1 at addr ffff88810d3c8000 by task kunit_try_catch/225

CPU: 7 PID: 225 Comm: kunit_try_catch Tainted: G    B            N 6.10.0-00003-gf0fc688e25ed #422
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 print_report+0xce/0x670
[...]
 kasan_report+0xa5/0xe0
[...]
 kmem_cache_rcu_uaf+0x3ae/0x4d0
[...]
 kunit_try_run_case+0x1b3/0x490
[...]
 kunit_generic_run_threadfn_adapter+0x80/0xe0
 kthread+0x2a5/0x370
[...]
 ret_from_fork+0x34/0x70
[...]
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 225:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_slab_alloc+0x6e/0x70
 kmem_cache_alloc_noprof+0xef/0x2b0
 kmem_cache_rcu_uaf+0x10d/0x4d0
 kunit_try_run_case+0x1b3/0x490
 kunit_generic_run_threadfn_adapter+0x80/0xe0
 kthread+0x2a5/0x370
 ret_from_fork+0x34/0x70
 ret_from_fork_asm+0x1a/0x30

Freed by task 0:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x47/0x70
 slab_free_after_rcu_debug+0xee/0x240
 rcu_core+0x676/0x15b0
 handle_softirqs+0x22f/0x690
 irq_exit_rcu+0x84/0xb0
 sysvec_apic_timer_interrupt+0x6a/0x80
 asm_sysvec_apic_timer_interrupt+0x1a/0x20

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0x8e/0xa0
 __call_rcu_common.constprop.0+0x70/0xa70
 kmem_cache_rcu_uaf+0x16e/0x4d0
 kunit_try_run_case+0x1b3/0x490
 kunit_generic_run_threadfn_adapter+0x80/0xe0
 kthread+0x2a5/0x370
 ret_from_fork+0x34/0x70
 ret_from_fork_asm+0x1a/0x30

The buggy address belongs to the object at ffff88810d3c8000
 which belongs to the cache test_cache of size 200
The buggy address is located 0 bytes inside of
 freed 200-byte region [ffff88810d3c8000, ffff88810d3c80c8)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10d3c8
head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x200000000000040(head|node=0|zone=2)
page_type: 0xffffefff(slab)
raw: 0200000000000040 ffff88810d3c2000 dead000000000122 0000000000000000
raw: 0000000000000000 00000000801f001f 00000001ffffefff 0000000000000000
head: 0200000000000040 ffff88810d3c2000 dead000000000122 0000000000000000
head: 0000000000000000 00000000801f001f 00000001ffffefff 0000000000000000
head: 0200000000000001 ffffea000434f201 ffffffffffffffff 0000000000000000
head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88810d3c7f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88810d3c7f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88810d3c8000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff88810d3c8080: fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc
 ffff88810d3c8100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
    ok 38 kmem_cache_rcu_uaf

Signed-off-by: Jann Horn <jannh@google.com>
---
Jann Horn (2):
      kasan: catch invalid free before SLUB reinitializes the object
      slub: Introduce CONFIG_SLUB_RCU_DEBUG

 include/linux/kasan.h | 20 +++++++++++++
 mm/Kconfig.debug      | 25 ++++++++++++++++
 mm/kasan/common.c     | 63 +++++++++++++++++++++++++++++++--------
 mm/kasan/kasan_test.c | 44 +++++++++++++++++++++++++++
 mm/slab.h             |  3 ++
 mm/slab_common.c      | 12 ++++++++
 mm/slub.c             | 82 ++++++++++++++++++++++++++++++++++++++++++++++-----
 7 files changed, 229 insertions(+), 20 deletions(-)
---
base-commit: 0c3836482481200ead7b416ca80c68a29cfdaabd
change-id: 20240723-kasan-tsbrcu-b715a901f776
-- 
Jann Horn <jannh@google.com>




* [PATCH v2 1/2] kasan: catch invalid free before SLUB reinitializes the object
From: Jann Horn @ 2024-07-24 16:34 UTC
  To: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
	Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Vlastimil Babka, Roman Gushchin, Hyeonggon Yoo
  Cc: Marco Elver, kasan-dev, linux-kernel, linux-mm, Jann Horn

Currently, when KASAN is combined with init-on-free behavior, the
initialization happens before KASAN's "invalid free" checks.

More importantly, a subsequent commit will want to use the object metadata
region to store an rcu_head, and we should let KASAN check that the object
pointer is valid before that. (Otherwise that change will make the existing
testcase kmem_cache_invalid_free fail.)
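
For reference, the core of that testcase is roughly the following; this is
sketched from memory, not the literal kunit code:

        p = kmem_cache_alloc(cache, GFP_KERNEL);
        /* not the start of an object: must be reported as an invalid free */
        kmem_cache_free(cache, p + 1);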

So add a new KASAN hook that allows KASAN to pre-validate a
kmem_cache_free() operation before SLUB actually starts modifying the
object or its metadata.

Signed-off-by: Jann Horn <jannh@google.com>
---
 include/linux/kasan.h | 10 ++++++++++
 mm/kasan/common.c     | 51 +++++++++++++++++++++++++++++++++++++++------------
 mm/slub.c             |  7 +++++++
 3 files changed, 56 insertions(+), 12 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 70d6a8f6e25d..eee8ca1dcb40 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -175,6 +175,16 @@ static __always_inline void * __must_check kasan_init_slab_obj(
 	return (void *)object;
 }
 
+bool __kasan_slab_pre_free(struct kmem_cache *s, void *object,
+			unsigned long ip);
+static __always_inline bool kasan_slab_pre_free(struct kmem_cache *s,
+						void *object)
+{
+	if (kasan_enabled())
+		return __kasan_slab_pre_free(s, object, _RET_IP_);
+	return false;
+}
+
 bool __kasan_slab_free(struct kmem_cache *s, void *object,
 			unsigned long ip, bool init);
 static __always_inline bool kasan_slab_free(struct kmem_cache *s,
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 85e7c6b4575c..7c7fc6ce7eb7 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -208,31 +208,52 @@ void * __must_check __kasan_init_slab_obj(struct kmem_cache *cache,
 	return (void *)object;
 }
 
-static inline bool poison_slab_object(struct kmem_cache *cache, void *object,
-				      unsigned long ip, bool init)
+enum free_validation_result {
+	KASAN_FREE_IS_IGNORED,
+	KASAN_FREE_IS_VALID,
+	KASAN_FREE_IS_INVALID
+};
+
+static enum free_validation_result check_slab_free(struct kmem_cache *cache,
+						void *object, unsigned long ip)
 {
-	void *tagged_object;
+	void *tagged_object = object;
 
-	if (!kasan_arch_is_ready())
-		return false;
+	if (is_kfence_address(object) || !kasan_arch_is_ready())
+		return KASAN_FREE_IS_IGNORED;
 
-	tagged_object = object;
 	object = kasan_reset_tag(object);
 
 	if (unlikely(nearest_obj(cache, virt_to_slab(object), object) != object)) {
 		kasan_report_invalid_free(tagged_object, ip, KASAN_REPORT_INVALID_FREE);
-		return true;
+		return KASAN_FREE_IS_INVALID;
 	}
 
-	/* RCU slabs could be legally used after free within the RCU period. */
-	if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU))
-		return false;
-
 	if (!kasan_byte_accessible(tagged_object)) {
 		kasan_report_invalid_free(tagged_object, ip, KASAN_REPORT_DOUBLE_FREE);
-		return true;
+		return KASAN_FREE_IS_INVALID;
 	}
 
+	return KASAN_FREE_IS_VALID;
+}
+
+static inline bool poison_slab_object(struct kmem_cache *cache, void *object,
+				      unsigned long ip, bool init)
+{
+	void *tagged_object = object;
+	enum free_validation_result valid = check_slab_free(cache, object, ip);
+
+	if (valid == KASAN_FREE_IS_IGNORED)
+		return false;
+	if (valid == KASAN_FREE_IS_INVALID)
+		return true;
+
+	object = kasan_reset_tag(object);
+
+	/* RCU slabs could be legally used after free within the RCU period. */
+	if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU))
+		return false;
+
 	kasan_poison(object, round_up(cache->object_size, KASAN_GRANULE_SIZE),
 			KASAN_SLAB_FREE, init);
 
@@ -242,6 +263,12 @@ static inline bool poison_slab_object(struct kmem_cache *cache, void *object,
 	return false;
 }
 
+bool __kasan_slab_pre_free(struct kmem_cache *cache, void *object,
+				unsigned long ip)
+{
+	return check_slab_free(cache, object, ip) == KASAN_FREE_IS_INVALID;
+}
+
 bool __kasan_slab_free(struct kmem_cache *cache, void *object,
 				unsigned long ip, bool init)
 {
diff --git a/mm/slub.c b/mm/slub.c
index 4927edec6a8c..34724704c52d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2170,6 +2170,13 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
 	if (kfence_free(x))
 		return false;
 
+	/*
+	 * Give KASAN a chance to notice an invalid free operation before we
+	 * modify the object.
+	 */
+	if (kasan_slab_pre_free(s, x))
+		return false;
+
 	/*
 	 * As memory initialization might be integrated into KASAN,
 	 * kasan_slab_free and initialization memset's must be

-- 
2.45.2.1089.g2a221341d9-goog




* [PATCH v2 2/2] slub: Introduce CONFIG_SLUB_RCU_DEBUG
From: Jann Horn @ 2024-07-24 16:34 UTC
  To: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
	Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Vlastimil Babka, Roman Gushchin, Hyeonggon Yoo
  Cc: Marco Elver, kasan-dev, linux-kernel, linux-mm, Jann Horn

Currently, KASAN is unable to catch use-after-free in SLAB_TYPESAFE_BY_RCU
slabs because use-after-free is allowed within the RCU grace period by
design.

Add a SLUB debugging feature which RCU-delays every individual
kmem_cache_free() before either actually freeing the object or handing it
off to KASAN, and change KASAN to poison freed objects as normal when this
option is enabled.

Note that this creates an aligned 16-byte area in the middle of the slab
metadata area, which kinda sucks but seems to be necessary in order to be
able to store an rcu_head in there that can be unpoisoned while the RCU
callback is pending.
(metadata_access_enable/disable doesn't work here because while the RCU
callback is pending, it will be accessed by asynchronous RCU processing.)
To be able to re-poison the area after the RCU callback is done executing,
a new helper kasan_poison_range_as_redzone() is necessary.
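
To make that concrete, the approximate object layout for a
SLAB_TYPESAFE_BY_RCU cache with this option looks as follows; this is
pieced together from the check_pad_bytes() comment and the
calculate_sizes() changes below, and the exact offsets depend on the
cache's flags and alignment, so treat it as a sketch only:

        object data (object_size)
        free pointer                     (if it cannot live inside the object)
        SLAB_STORE_USER tracking data    (if enabled)
        original kmalloc request size    (if SLAB_STORE_USER && SLAB_KMALLOC)
        padding to KASAN granule / rcu_head alignment
        struct rcu_head                  (at s->debug_rcu_head_offset)
        padding to KASAN granule
        KASAN metadata (kasan_metadata_size())
        padding up to s->size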

For now I've configured Kconfig.debug to default-enable this feature in the
KASAN GENERIC and SW_TAGS modes; I'm not enabling it by default in HW_TAGS
mode because I'm not sure if it might have unwanted performance degradation
effects there.

Note that this is mostly useful with KASAN in the quarantine-based GENERIC
mode; SLAB_TYPESAFE_BY_RCU slabs are basically always also slabs with a
->ctor, and KASAN's assign_tag() currently has to assign fixed tags for
those, reducing the effectiveness of SW_TAGS/HW_TAGS mode.
(A possible future extension of this work would be to also let SLUB call
the ->ctor() on every allocation instead of only when the slab page is
allocated; then tag-based modes would be able to assign new tags on every
reallocation.)

Signed-off-by: Jann Horn <jannh@google.com>
---
 include/linux/kasan.h | 10 +++++++
 mm/Kconfig.debug      | 25 +++++++++++++++++
 mm/kasan/common.c     | 14 +++++++++-
 mm/kasan/kasan_test.c | 44 ++++++++++++++++++++++++++++++
 mm/slab.h             |  3 +++
 mm/slab_common.c      | 12 +++++++++
 mm/slub.c             | 75 +++++++++++++++++++++++++++++++++++++++++++++------
 7 files changed, 174 insertions(+), 9 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index eee8ca1dcb40..876ebd4241fe 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -349,6 +349,8 @@ static __always_inline void kasan_mempool_unpoison_object(void *ptr,
 		__kasan_mempool_unpoison_object(ptr, size, _RET_IP_);
 }
 
+void kasan_poison_range_as_redzone(void *ptr, size_t size);
+
 /*
  * Unlike kasan_check_read/write(), kasan_check_byte() is performed even for
  * the hardware tag-based mode that doesn't rely on compiler instrumentation.
@@ -361,6 +363,8 @@ static __always_inline bool kasan_check_byte(const void *addr)
 	return true;
 }
 
+size_t kasan_align(size_t size);
+
 #else /* CONFIG_KASAN */
 
 static inline void kasan_unpoison_range(const void *address, size_t size) {}
@@ -416,10 +420,16 @@ static inline bool kasan_mempool_poison_object(void *ptr)
 }
 static inline void kasan_mempool_unpoison_object(void *ptr, size_t size) {}
 
+static inline void kasan_poison_range_as_redzone(void *ptr, size_t size) {}
+
 static inline bool kasan_check_byte(const void *address)
 {
 	return true;
 }
+static inline size_t kasan_align(size_t size)
+{
+	return size;
+}
 
 #endif /* CONFIG_KASAN */
 
diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index afc72fde0f03..4eee5aa2de11 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -70,6 +70,31 @@ config SLUB_DEBUG_ON
 	  off in a kernel built with CONFIG_SLUB_DEBUG_ON by specifying
 	  "slab_debug=-".
 
+config SLUB_RCU_DEBUG
+	bool "Make use-after-free detection possible in TYPESAFE_BY_RCU caches"
+	depends on SLUB_DEBUG
+	default KASAN_GENERIC || KASAN_SW_TAGS
+	help
+	  Make SLAB_TYPESAFE_BY_RCU caches behave approximately as if the cache
+	  was not marked as SLAB_TYPESAFE_BY_RCU and every caller used
+	  kfree_rcu() instead.
+
+	  This is intended for use in combination with KASAN, to enable KASAN to
+	  detect use-after-free accesses in such caches.
+	  (KFENCE is able to do that independent of this flag.)
+
+	  This might degrade performance.
+	  Unfortunately this also prevents a very specific bug pattern from
+	  triggering (insufficient checks against an object being recycled
+	  within the RCU grace period); so this option can be turned off even on
+	  KASAN builds, in case you want to test for such a bug.
+
+	  If you're using this for testing bugs / fuzzing and care about
+	  catching all the bugs WAY more than performance, you might want to
+	  also turn on CONFIG_RCU_STRICT_GRACE_PERIOD.
+
+	  If unsure, say N.
+
 config PAGE_OWNER
 	bool "Track page owner"
 	depends on DEBUG_KERNEL && STACKTRACE_SUPPORT
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 7c7fc6ce7eb7..ff8843cc973d 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -251,7 +251,8 @@ static inline bool poison_slab_object(struct kmem_cache *cache, void *object,
 	object = kasan_reset_tag(object);
 
 	/* RCU slabs could be legally used after free within the RCU period. */
-	if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU))
+	if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU) &&
+	    !IS_ENABLED(CONFIG_SLUB_RCU_DEBUG))
 		return false;
 
 	kasan_poison(object, round_up(cache->object_size, KASAN_GRANULE_SIZE),
@@ -566,6 +567,12 @@ void __kasan_mempool_unpoison_object(void *ptr, size_t size, unsigned long ip)
 		poison_kmalloc_redzone(slab->slab_cache, ptr, size, flags);
 }
 
+void kasan_poison_range_as_redzone(void *ptr, size_t size)
+{
+	if (kasan_enabled())
+		kasan_poison(ptr, size, KASAN_SLAB_REDZONE, false);
+}
+
 bool __kasan_check_byte(const void *address, unsigned long ip)
 {
 	if (!kasan_byte_accessible(address)) {
@@ -574,3 +581,8 @@ bool __kasan_check_byte(const void *address, unsigned long ip)
 	}
 	return true;
 }
+
+size_t kasan_align(size_t size)
+{
+	return round_up(size, KASAN_GRANULE_SIZE);
+}
diff --git a/mm/kasan/kasan_test.c b/mm/kasan/kasan_test.c
index 7b32be2a3cf0..cba782a4b072 100644
--- a/mm/kasan/kasan_test.c
+++ b/mm/kasan/kasan_test.c
@@ -996,6 +996,49 @@ static void kmem_cache_invalid_free(struct kunit *test)
 	kmem_cache_destroy(cache);
 }
 
+static void kmem_cache_rcu_uaf(struct kunit *test)
+{
+	char *p;
+	size_t size = 200;
+	struct kmem_cache *cache;
+
+	KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_SLUB_RCU_DEBUG);
+
+	cache = kmem_cache_create("test_cache", size, 0, SLAB_TYPESAFE_BY_RCU,
+				  NULL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, cache);
+
+	p = kmem_cache_alloc(cache, GFP_KERNEL);
+	if (!p) {
+		kunit_err(test, "Allocation failed: %s\n", __func__);
+		kmem_cache_destroy(cache);
+		return;
+	}
+	*p = 1;
+
+	rcu_read_lock();
+
+	/* Free the object - this will internally schedule an RCU callback. */
+	kmem_cache_free(cache, p);
+
+	/* We should still be allowed to access the object at this point because
+	 * the cache is SLAB_TYPESAFE_BY_RCU and we've been in an RCU read-side
+	 * critical section since before the kmem_cache_free().
+	 */
+	READ_ONCE(*p);
+
+	rcu_read_unlock();
+
+	/* Wait for the RCU callback to execute; after this, the object should
+	 * have actually been freed from KASAN's perspective.
+	 */
+	rcu_barrier();
+
+	KUNIT_EXPECT_KASAN_FAIL(test, READ_ONCE(*p));
+
+	kmem_cache_destroy(cache);
+}
+
 static void empty_cache_ctor(void *object) { }
 
 static void kmem_cache_double_destroy(struct kunit *test)
@@ -1937,6 +1980,7 @@ static struct kunit_case kasan_kunit_test_cases[] = {
 	KUNIT_CASE(kmem_cache_oob),
 	KUNIT_CASE(kmem_cache_double_free),
 	KUNIT_CASE(kmem_cache_invalid_free),
+	KUNIT_CASE(kmem_cache_rcu_uaf),
 	KUNIT_CASE(kmem_cache_double_destroy),
 	KUNIT_CASE(kmem_cache_accounted),
 	KUNIT_CASE(kmem_cache_bulk),
diff --git a/mm/slab.h b/mm/slab.h
index 5f8f47c5bee0..77a8f28afafe 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -273,6 +273,9 @@ struct kmem_cache {
 	int refcount;			/* Refcount for slab cache destroy */
 	void (*ctor)(void *object);	/* Object constructor */
 	unsigned int inuse;		/* Offset to metadata */
+#ifdef CONFIG_SLUB_RCU_DEBUG
+	unsigned int debug_rcu_head_offset;
+#endif
 	unsigned int align;		/* Alignment */
 	unsigned int red_left_pad;	/* Left redzone padding size */
 	const char *name;		/* Name (only for display!) */
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 1560a1546bb1..19511e34017b 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -450,6 +450,18 @@ static void slab_caches_to_rcu_destroy_workfn(struct work_struct *work)
 
 static int shutdown_cache(struct kmem_cache *s)
 {
+	if (IS_ENABLED(CONFIG_SLUB_RCU_DEBUG) &&
+	    (s->flags & SLAB_TYPESAFE_BY_RCU)) {
+		/*
+		 * Under CONFIG_SLUB_RCU_DEBUG, when objects in a
+		 * SLAB_TYPESAFE_BY_RCU slab are freed, SLUB will internally
+		 * defer their freeing with call_rcu().
+		 * Wait for such call_rcu() invocations here before actually
+		 * destroying the cache.
+		 */
+		rcu_barrier();
+	}
+
 	/* free asan quarantined objects */
 	kasan_cache_shutdown(s);
 
diff --git a/mm/slub.c b/mm/slub.c
index 34724704c52d..999afdc1cffb 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1225,7 +1225,8 @@ static int check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
  * 	A. Free pointer (if we cannot overwrite object on free)
  * 	B. Tracking data for SLAB_STORE_USER
  *	C. Original request size for kmalloc object (SLAB_STORE_USER enabled)
- *	D. Padding to reach required alignment boundary or at minimum
+ *	D. RCU head for CONFIG_SLUB_RCU_DEBUG (with padding around it)
+ *	E. Padding to reach required alignment boundary or at minimum
  * 		one word if debugging is on to be able to detect writes
  * 		before the word boundary.
  *
@@ -1251,6 +1252,11 @@ static int check_pad_bytes(struct kmem_cache *s, struct slab *slab, u8 *p)
 			off += sizeof(unsigned int);
 	}
 
+#ifdef CONFIG_SLUB_RCU_DEBUG
+	if (s->flags & SLAB_TYPESAFE_BY_RCU)
+		off = kasan_align(s->debug_rcu_head_offset + sizeof(struct rcu_head));
+#endif /* CONFIG_SLUB_RCU_DEBUG */
+
 	off += kasan_metadata_size(s, false);
 
 	if (size_from_object(s) == off)
@@ -2144,15 +2150,21 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
 }
 #endif /* CONFIG_MEMCG_KMEM */
 
+#ifdef CONFIG_SLUB_RCU_DEBUG
+static void slab_free_after_rcu_debug(struct rcu_head *rcu_head);
+#endif
+
 /*
  * Hooks for other subsystems that check memory allocations. In a typical
  * production configuration these hooks all should produce no code at all.
  *
  * Returns true if freeing of the object can proceed, false if its reuse
- * was delayed by KASAN quarantine, or it was returned to KFENCE.
+ * was delayed by CONFIG_SLUB_RCU_DEBUG or KASAN quarantine, or it was returned
+ * to KFENCE.
  */
 static __always_inline
-bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
+bool slab_free_hook(struct kmem_cache *s, void *x, bool init,
+		    bool after_rcu_delay)
 {
 	kmemleak_free_recursive(x, s->flags);
 	kmsan_slab_free(s, x);
@@ -2163,7 +2175,7 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
 		debug_check_no_obj_freed(x, s->object_size);
 
 	/* Use KCSAN to help debug racy use-after-free. */
-	if (!(s->flags & SLAB_TYPESAFE_BY_RCU))
+	if (!(s->flags & SLAB_TYPESAFE_BY_RCU) || after_rcu_delay)
 		__kcsan_check_access(x, s->object_size,
 				     KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
 
@@ -2177,6 +2189,17 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
 	if (kasan_slab_pre_free(s, x))
 		return false;
 
+#ifdef CONFIG_SLUB_RCU_DEBUG
+	if ((s->flags & SLAB_TYPESAFE_BY_RCU) && !after_rcu_delay) {
+		struct rcu_head *rcu_head;
+
+		rcu_head = kasan_reset_tag(x) + s->debug_rcu_head_offset;
+		kasan_unpoison_range(rcu_head, sizeof(*rcu_head));
+		call_rcu(rcu_head, slab_free_after_rcu_debug);
+		return false;
+	}
+#endif /* CONFIG_SLUB_RCU_DEBUG */
+
 	/*
 	 * As memory initialization might be integrated into KASAN,
 	 * kasan_slab_free and initialization memset's must be
@@ -2214,7 +2237,7 @@ bool slab_free_freelist_hook(struct kmem_cache *s, void **head, void **tail,
 	bool init;
 
 	if (is_kfence_address(next)) {
-		slab_free_hook(s, next, false);
+		slab_free_hook(s, next, false, false);
 		return false;
 	}
 
@@ -2229,7 +2252,7 @@ bool slab_free_freelist_hook(struct kmem_cache *s, void **head, void **tail,
 		next = get_freepointer(s, object);
 
 		/* If object's reuse doesn't have to be delayed */
-		if (likely(slab_free_hook(s, object, init))) {
+		if (likely(slab_free_hook(s, object, init, false))) {
 			/* Move object to the new freelist */
 			set_freepointer(s, object, *head);
 			*head = object;
@@ -4442,7 +4465,7 @@ void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
 	memcg_slab_free_hook(s, slab, &object, 1);
 	alloc_tagging_slab_free_hook(s, slab, &object, 1);
 
-	if (likely(slab_free_hook(s, object, slab_want_init_on_free(s))))
+	if (likely(slab_free_hook(s, object, slab_want_init_on_free(s), false)))
 		do_slab_free(s, slab, object, object, 1, addr);
 }
 
@@ -4451,7 +4474,7 @@ void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
 static noinline
 void memcg_alloc_abort_single(struct kmem_cache *s, void *object)
 {
-	if (likely(slab_free_hook(s, object, slab_want_init_on_free(s))))
+	if (likely(slab_free_hook(s, object, slab_want_init_on_free(s), false)))
 		do_slab_free(s, virt_to_slab(object), object, object, 1, _RET_IP_);
 }
 #endif
@@ -4470,6 +4493,32 @@ void slab_free_bulk(struct kmem_cache *s, struct slab *slab, void *head,
 		do_slab_free(s, slab, head, tail, cnt, addr);
 }
 
+#ifdef CONFIG_SLUB_RCU_DEBUG
+static void slab_free_after_rcu_debug(struct rcu_head *rcu_head)
+{
+	struct slab *slab = virt_to_slab(rcu_head);
+	struct kmem_cache *s;
+	void *object;
+
+	if (WARN_ON(is_kfence_address(rcu_head)))
+		return;
+
+	/* find the object and the cache again */
+	if (WARN_ON(!slab))
+		return;
+	s = slab->slab_cache;
+	if (WARN_ON(!(s->flags & SLAB_TYPESAFE_BY_RCU)))
+		return;
+	object = (void *)rcu_head - s->debug_rcu_head_offset;
+	kasan_poison_range_as_redzone(rcu_head, kasan_align(sizeof(*rcu_head)));
+
+	/* resume freeing */
+	if (!slab_free_hook(s, object, slab_want_init_on_free(s), true))
+		return;
+	do_slab_free(s, slab, object, NULL, 1, _THIS_IP_);
+}
+#endif /* CONFIG_SLUB_RCU_DEBUG */
+
 #ifdef CONFIG_KASAN_GENERIC
 void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr)
 {
@@ -5199,6 +5248,16 @@ static int calculate_sizes(struct kmem_cache *s)
 		if (flags & SLAB_KMALLOC)
 			size += sizeof(unsigned int);
 	}
+
+#ifdef CONFIG_SLUB_RCU_DEBUG
+	if (flags & SLAB_TYPESAFE_BY_RCU) {
+		size = kasan_align(size);
+		size = ALIGN(size, __alignof__(struct rcu_head));
+		s->debug_rcu_head_offset = size;
+		size += sizeof(struct rcu_head);
+		size = kasan_align(size);
+	}
+#endif /* CONFIG_SLUB_RCU_DEBUG */
 #endif
 
 	kasan_cache_create(s, &size, &s->flags);

-- 
2.45.2.1089.g2a221341d9-goog




* Re: [PATCH v2 1/2] kasan: catch invalid free before SLUB reinitializes the object
From: Andrew Morton @ 2024-07-24 21:17 UTC
  To: Jann Horn
  Cc: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
	Dmitry Vyukov, Vincenzo Frascino, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Vlastimil Babka,
	Roman Gushchin, Hyeonggon Yoo, Marco Elver, kasan-dev,
	linux-kernel, linux-mm

On Wed, 24 Jul 2024 18:34:12 +0200 Jann Horn <jannh@google.com> wrote:

> Currently, when KASAN is combined with init-on-free behavior, the
> initialization happens before KASAN's "invalid free" checks.
> 
> More importantly, a subsequent commit will want to use the object metadata
> region to store an rcu_head, and we should let KASAN check that the object
> pointer is valid before that. (Otherwise that change will make the existing
> testcase kmem_cache_invalid_free fail.)
> 
> So add a new KASAN hook that allows KASAN to pre-validate a
> kmem_cache_free() operation before SLUB actually starts modifying the
> object or its metadata.

I added this, to fix the CONFIG_KASAN=n build

--- a/include/linux/kasan.h~kasan-catch-invalid-free-before-slub-reinitializes-the-object-fix
+++ a/include/linux/kasan.h
@@ -381,6 +381,12 @@ static inline void *kasan_init_slab_obj(
 {
 	return (void *)object;
 }
+
+static inline bool kasan_slab_pre_free(struct kmem_cache *s, void *object)
+{
+	return false;
+}
+
 static inline bool kasan_slab_free(struct kmem_cache *s, void *object, bool init)
 {
 	return false;
_




* Re: [PATCH v2 1/2] kasan: catch invalid free before SLUB reinitializes the object
From: Jann Horn @ 2024-07-25 10:54 UTC
  To: Andrew Morton
  Cc: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
	Dmitry Vyukov, Vincenzo Frascino, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Vlastimil Babka,
	Roman Gushchin, Hyeonggon Yoo, Marco Elver, kasan-dev,
	linux-kernel, linux-mm

On Wed, Jul 24, 2024 at 11:17 PM Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Wed, 24 Jul 2024 18:34:12 +0200 Jann Horn <jannh@google.com> wrote:
>
> > Currently, when KASAN is combined with init-on-free behavior, the
> > initialization happens before KASAN's "invalid free" checks.
> >
> > More importantly, a subsequent commit will want to use the object metadata
> > region to store an rcu_head, and we should let KASAN check that the object
> > pointer is valid before that. (Otherwise that change will make the existing
> > testcase kmem_cache_invalid_free fail.)
> >
> > So add a new KASAN hook that allows KASAN to pre-validate a
> > kmem_cache_free() operation before SLUB actually starts modifying the
> > object or its metadata.
>
> I added this, to fix the CONFIG_KASAN=n build

Whoops, thanks for fixing that up.



* Re: [PATCH v2 1/2] kasan: catch invalid free before SLUB reinitializes the object
From: Vlastimil Babka @ 2024-07-25 13:20 UTC
  To: Jann Horn, Andrey Ryabinin, Alexander Potapenko,
	Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino,
	Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo
  Cc: Marco Elver, kasan-dev, linux-kernel, linux-mm

On 7/24/24 6:34 PM, Jann Horn wrote:
> Currently, when KASAN is combined with init-on-free behavior, the
> initialization happens before KASAN's "invalid free" checks.
> 
> More importantly, a subsequent commit will want to use the object metadata
> region to store an rcu_head, and we should let KASAN check that the object
> pointer is valid before that. (Otherwise that change will make the existing
> testcase kmem_cache_invalid_free fail.)
> 
> So add a new KASAN hook that allows KASAN to pre-validate a
> kmem_cache_free() operation before SLUB actually starts modifying the
> object or its metadata.
> 
> Signed-off-by: Jann Horn <jannh@google.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz> #slub

> ---
>  include/linux/kasan.h | 10 ++++++++++
>  mm/kasan/common.c     | 51 +++++++++++++++++++++++++++++++++++++++------------
>  mm/slub.c             |  7 +++++++
>  3 files changed, 56 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 70d6a8f6e25d..eee8ca1dcb40 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -175,6 +175,16 @@ static __always_inline void * __must_check kasan_init_slab_obj(
>  	return (void *)object;
>  }
>  
> +bool __kasan_slab_pre_free(struct kmem_cache *s, void *object,
> +			unsigned long ip);
> +static __always_inline bool kasan_slab_pre_free(struct kmem_cache *s,
> +						void *object)
> +{
> +	if (kasan_enabled())
> +		return __kasan_slab_pre_free(s, object, _RET_IP_);
> +	return false;
> +}
> +
>  bool __kasan_slab_free(struct kmem_cache *s, void *object,
>  			unsigned long ip, bool init);
>  static __always_inline bool kasan_slab_free(struct kmem_cache *s,
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index 85e7c6b4575c..7c7fc6ce7eb7 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -208,31 +208,52 @@ void * __must_check __kasan_init_slab_obj(struct kmem_cache *cache,
>  	return (void *)object;
>  }
>  
> -static inline bool poison_slab_object(struct kmem_cache *cache, void *object,
> -				      unsigned long ip, bool init)
> +enum free_validation_result {
> +	KASAN_FREE_IS_IGNORED,
> +	KASAN_FREE_IS_VALID,
> +	KASAN_FREE_IS_INVALID
> +};
> +
> +static enum free_validation_result check_slab_free(struct kmem_cache *cache,
> +						void *object, unsigned long ip)
>  {
> -	void *tagged_object;
> +	void *tagged_object = object;
>  
> -	if (!kasan_arch_is_ready())
> -		return false;
> +	if (is_kfence_address(object) || !kasan_arch_is_ready())
> +		return KASAN_FREE_IS_IGNORED;
>  
> -	tagged_object = object;
>  	object = kasan_reset_tag(object);
>  
>  	if (unlikely(nearest_obj(cache, virt_to_slab(object), object) != object)) {
>  		kasan_report_invalid_free(tagged_object, ip, KASAN_REPORT_INVALID_FREE);
> -		return true;
> +		return KASAN_FREE_IS_INVALID;
>  	}
>  
> -	/* RCU slabs could be legally used after free within the RCU period. */
> -	if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU))
> -		return false;
> -
>  	if (!kasan_byte_accessible(tagged_object)) {
>  		kasan_report_invalid_free(tagged_object, ip, KASAN_REPORT_DOUBLE_FREE);
> -		return true;
> +		return KASAN_FREE_IS_INVALID;
>  	}
>  
> +	return KASAN_FREE_IS_VALID;
> +}
> +
> +static inline bool poison_slab_object(struct kmem_cache *cache, void *object,
> +				      unsigned long ip, bool init)
> +{
> +	void *tagged_object = object;
> +	enum free_validation_result valid = check_slab_free(cache, object, ip);
> +
> +	if (valid == KASAN_FREE_IS_IGNORED)
> +		return false;
> +	if (valid == KASAN_FREE_IS_INVALID)
> +		return true;
> +
> +	object = kasan_reset_tag(object);
> +
> +	/* RCU slabs could be legally used after free within the RCU period. */
> +	if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU))
> +		return false;
> +
>  	kasan_poison(object, round_up(cache->object_size, KASAN_GRANULE_SIZE),
>  			KASAN_SLAB_FREE, init);
>  
> @@ -242,6 +263,12 @@ static inline bool poison_slab_object(struct kmem_cache *cache, void *object,
>  	return false;
>  }
>  
> +bool __kasan_slab_pre_free(struct kmem_cache *cache, void *object,
> +				unsigned long ip)
> +{
> +	return check_slab_free(cache, object, ip) == KASAN_FREE_IS_INVALID;
> +}
> +
>  bool __kasan_slab_free(struct kmem_cache *cache, void *object,
>  				unsigned long ip, bool init)
>  {
> diff --git a/mm/slub.c b/mm/slub.c
> index 4927edec6a8c..34724704c52d 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2170,6 +2170,13 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>  	if (kfence_free(x))
>  		return false;
>  
> +	/*
> +	 * Give KASAN a chance to notice an invalid free operation before we
> +	 * modify the object.
> +	 */
> +	if (kasan_slab_pre_free(s, x))
> +		return false;
> +
>  	/*
>  	 * As memory initialization might be integrated into KASAN,
>  	 * kasan_slab_free and initialization memset's must be
> 




* Re: [PATCH v2 2/2] slub: Introduce CONFIG_SLUB_RCU_DEBUG
From: Vlastimil Babka @ 2024-07-25 13:28 UTC
  To: Jann Horn, Andrey Ryabinin, Alexander Potapenko,
	Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino,
	Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo
  Cc: Marco Elver, kasan-dev, linux-kernel, linux-mm

On 7/24/24 6:34 PM, Jann Horn wrote:
> Currently, KASAN is unable to catch use-after-free in SLAB_TYPESAFE_BY_RCU
> slabs because use-after-free is allowed within the RCU grace period by
> design.
> 
> Add a SLUB debugging feature which RCU-delays every individual
> kmem_cache_free() before either actually freeing the object or handing it
> off to KASAN, and change KASAN to poison freed objects as normal when this
> option is enabled.
> 
> Note that this creates an aligned 16-byte area in the middle of the slab
> metadata area, which kinda sucks but seems to be necessary in order to be
> able to store an rcu_head in there that can be unpoisoned while the RCU
> callback is pending.

An alternative could be a head-less variant of kfree_rcu_mightsleep() that
would fail instead of going into reclaim if it can't allocate, and upon
failure we would fall back to the old behavior and give up on checking that
object?
But maybe it's just too complicated and we just pay the overhead. At least
this doesn't concern kmalloc caches with their power-of-two alignment
guarantees where extra metadata blows things up more.

> (metadata_access_enable/disable doesn't work here because while the RCU
> callback is pending, it will be accessed by asynchronous RCU processing.)
> To be able to re-poison the area after the RCU callback is done executing,
> a new helper kasan_poison_range_as_redzone() is necessary.
> 
> For now I've configured Kconfig.debug to default-enable this feature in the
> KASAN GENERIC and SW_TAGS modes; I'm not enabling it by default in HW_TAGS
> mode because I'm not sure if it might have unwanted performance degradation
> effects there.
> 
> Note that this is mostly useful with KASAN in the quarantine-based GENERIC
> mode; SLAB_TYPESAFE_BY_RCU slabs are basically always also slabs with a
> ->ctor, and KASAN's assign_tag() currently has to assign fixed tags for
> those, reducing the effectiveness of SW_TAGS/HW_TAGS mode.
> (A possible future extension of this work would be to also let SLUB call
> the ->ctor() on every allocation instead of only when the slab page is
> allocated; then tag-based modes would be able to assign new tags on every
> reallocation.)
> 
> Signed-off-by: Jann Horn <jannh@google.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz> #slab

...

> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -450,6 +450,18 @@ static void slab_caches_to_rcu_destroy_workfn(struct work_struct *work)
>  
>  static int shutdown_cache(struct kmem_cache *s)
>  {
> +	if (IS_ENABLED(CONFIG_SLUB_RCU_DEBUG) &&
> +	    (s->flags & SLAB_TYPESAFE_BY_RCU)) {
> +		/*
> +		 * Under CONFIG_SLUB_RCU_DEBUG, when objects in a
> +		 * SLAB_TYPESAFE_BY_RCU slab are freed, SLUB will internally
> +		 * defer their freeing with call_rcu().
> +		 * Wait for such call_rcu() invocations here before actually
> +		 * destroying the cache.
> +		 */
> +		rcu_barrier();
> +	}

I think once we have the series [1] settled (patch 5/6 specifically), the
delayed destruction could handle this case too?

[1]
https://lore.kernel.org/linux-mm/20240715-b4-slab-kfree_rcu-destroy-v1-0-46b2984c2205@suse.cz/

> +
>  	/* free asan quarantined objects */
>  	kasan_cache_shutdown(s);
>  
> diff --git a/mm/slub.c b/mm/slub.c
> index 34724704c52d..999afdc1cffb 100644





* Re: [PATCH v2 2/2] slub: Introduce CONFIG_SLUB_RCU_DEBUG
From: Jann Horn @ 2024-07-25 14:34 UTC
  To: Vlastimil Babka
  Cc: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
	Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Roman Gushchin, Hyeonggon Yoo, Marco Elver, kasan-dev,
	linux-kernel, linux-mm

On Thu, Jul 25, 2024 at 3:28 PM Vlastimil Babka <vbabka@suse.cz> wrote:
> On 7/24/24 6:34 PM, Jann Horn wrote:
> > Currently, KASAN is unable to catch use-after-free in SLAB_TYPESAFE_BY_RCU
> > slabs because use-after-free is allowed within the RCU grace period by
> > design.
> >
> > Add a SLUB debugging feature which RCU-delays every individual
> > kmem_cache_free() before either actually freeing the object or handing it
> > off to KASAN, and change KASAN to poison freed objects as normal when this
> > option is enabled.
> >
> > Note that this creates an aligned 16-byte area in the middle of the slab
> > metadata area, which kinda sucks but seems to be necessary in order to be
> > able to store an rcu_head in there that can be unpoisoned while the RCU
> > callback is pending.
>
> An alternative could be a head-less variant of kfree_rcu_mightsleep() that
> would fail instead of going into reclaim if it can't allocate, and upon
> failure we would fall back to the old behavior and give up on checking that
> object?

Yes, true, that would be an option... behaving differently under
memory pressure seems a little weird to me, but it would probably do
the job...

I've now tried implementing it roughly as you suggested; the diffstat
for that (on top of the existing series) looks like this:

 include/linux/kasan.h | 24 +++++++++---------------
 mm/kasan/common.c     | 23 +++++++----------------
 mm/slab.h             |  3 ---
 mm/slub.c             | 46 +++++++++++++++++++---------------------------
 4 files changed, 35 insertions(+), 61 deletions(-)

Basically it gets rid of all the plumbing I added to stuff more things
into the metadata area, but it has to add a flag to kasan_slab_free()
to tell it whether the call is happening after RCU delay or not.

I'm changing slab_free_hook() to allocate an instance of the struct

struct rcu_delayed_free {
  struct rcu_head head;
  void *object;
};

with kmalloc(sizeof(*delayed_free), GFP_NOWAIT), and then if that
works, I use that to RCU-delay the freeing.
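
As a rough sketch of what that looks like inside slab_free_hook() (the
struct is the one quoted above, the other names follow the existing
patch; the actual v3 code may differ in details):

#ifdef CONFIG_SLUB_RCU_DEBUG
        if ((s->flags & SLAB_TYPESAFE_BY_RCU) && !after_rcu_delay) {
                struct rcu_delayed_free *delayed_free;

                delayed_free = kmalloc(sizeof(*delayed_free), GFP_NOWAIT);
                if (delayed_free) {
                        /*
                         * Hand the object over to the RCU callback, which
                         * resumes the free after the grace period (recovering
                         * the object from delayed_free->object).
                         */
                        delayed_free->object = x;
                        call_rcu(&delayed_free->head, slab_free_after_rcu_debug);
                        return false;
                }
                /*
                 * GFP_NOWAIT allocation failed: fall back to freeing the
                 * object immediately, giving up on the delayed-poisoning
                 * check for this one object.
                 */
        }
#endif /* CONFIG_SLUB_RCU_DEBUG */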


I think this looks a bit nicer than my original version; I'll go run
the test suite and then send it out as v3.


> But maybe it's just too complicated and we just pay the overhead. At least
> this doesn't concern kmalloc caches with their power-of-two alignment
> guarantees where extra metadata blows things up more.

If we wanted to compress the slab metadata for this down a bit, we
could probably also overlap the out-of-line freepointer with the
rcu_head, since the freepointer can't be in use while the rcu_head is
active... but I figured that since this is a debug feature mainly
intended for ASAN builds, keeping things simple is more important.

> > (metadata_access_enable/disable doesn't work here because while the RCU
> > callback is pending, it will be accessed by asynchronous RCU processing.)
> > To be able to re-poison the area after the RCU callback is done executing,
> > a new helper kasan_poison_range_as_redzone() is necessary.
> >
> > For now I've configured Kconfig.debug to default-enable this feature in the
> > KASAN GENERIC and SW_TAGS modes; I'm not enabling it by default in HW_TAGS
> > mode because I'm not sure if it might have unwanted performance degradation
> > effects there.
> >
> > Note that this is mostly useful with KASAN in the quarantine-based GENERIC
> > mode; SLAB_TYPESAFE_BY_RCU slabs are basically always also slabs with a
> > ->ctor, and KASAN's assign_tag() currently has to assign fixed tags for
> > those, reducing the effectiveness of SW_TAGS/HW_TAGS mode.
> > (A possible future extension of this work would be to also let SLUB call
> > the ->ctor() on every allocation instead of only when the slab page is
> > allocated; then tag-based modes would be able to assign new tags on every
> > reallocation.)
> >
> > Signed-off-by: Jann Horn <jannh@google.com>
>
> Acked-by: Vlastimil Babka <vbabka@suse.cz> #slab
>
> ...
>
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -450,6 +450,18 @@ static void slab_caches_to_rcu_destroy_workfn(struct work_struct *work)
> >
> >  static int shutdown_cache(struct kmem_cache *s)
> >  {
> > +     if (IS_ENABLED(CONFIG_SLUB_RCU_DEBUG) &&
> > +         (s->flags & SLAB_TYPESAFE_BY_RCU)) {
> > +             /*
> > +              * Under CONFIG_SLUB_RCU_DEBUG, when objects in a
> > +              * SLAB_TYPESAFE_BY_RCU slab are freed, SLUB will internally
> > +              * defer their freeing with call_rcu().
> > +              * Wait for such call_rcu() invocations here before actually
> > +              * destroying the cache.
> > +              */
> > +             rcu_barrier();
> > +     }
>
> I think once we have the series [1] settled (patch 5/6 specifically), the
> delayed destruction could handle this case too?
>
> [1]
> https://lore.kernel.org/linux-mm/20240715-b4-slab-kfree_rcu-destroy-v1-0-46b2984c2205@suse.cz/

Ah, thanks for the pointer, I hadn't seen that one.


> > +
> >       /* free asan quarantined objects */
> >       kasan_cache_shutdown(s);
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 34724704c52d..999afdc1cffb 100644
>
>

