linux-mm.kvack.org archive mirror
* [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
@ 2025-02-04 17:33 Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 01/15] kasan: Allocation enhancement for dense tag-based mode Maciej Wieczor-Retman
                   ` (16 more replies)
  0 siblings, 17 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

======= Introduction
The patchset aims to add a KASAN tag-based mode for the x86 architecture
with the help of the new CPU feature called Linear Address Masking
(LAM). The main improvement introduced by the series is 4x lower memory
usage compared to KASAN's generic mode, which is currently the only
mode available on x86.

There are two logical parts to this series. The first one attempts to
add a new memory saving mechanism called "dense mode" to the generic
part of the tag-based KASAN code. The second one focuses on implementing
and enabling the tag-based mode for the x86 architecture by using LAM.

======= How does the KASAN tag-based mode work?
When enabled, memory allocations and accesses are instrumented by the
compiler during kernel compilation. Instrumentation functions are added
to each memory allocation and each pointer dereference.

The allocation-related functions generate a random tag and save it in
two places: in the shadow memory that maps to the allocated memory, and
in the top bits of the pointer that points to the allocated memory.
Storing the tag in the top bits of the pointer is possible thanks to
Top-Byte Ignore (TBI) on the arm64 architecture and LAM on x86.

The access-related functions compare the tag stored in the pointer with
the one stored in the shadow memory. If the tags don't match, an
out-of-bounds access must have occurred and an error report is
generated.
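
Conceptually (a minimal, purely illustrative sketch in user-space C; the
names and the layout are made up and not the kernel's actual API):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_SHIFT	56	/* arm64 TBI keeps the tag in the top byte */
#define GRANULE		16	/* bytes of memory covered by one shadow tag */

/* Tag carried by the pointer itself, taken from its top bits. */
static uint8_t pointer_tag(uint64_t tagged_addr)
{
	return tagged_addr >> TAG_SHIFT;
}

/*
 * Access check: every shadow tag covering the accessed range has to
 * match the pointer's tag, otherwise a report would be printed.
 */
static bool access_ok(uint64_t tagged_addr, size_t size, const uint8_t *shadow)
{
	uint8_t tag = pointer_tag(tagged_addr);
	size_t granules = (size + GRANULE - 1) / GRANULE;

	for (size_t i = 0; i < granules; i++)
		if (shadow[i] != tag)
			return false;
	return true;
}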

The general idea of the tag-based mode is explained very well in the
cover letter of the original implementation series [1].

[1] https://lore.kernel.org/all/cover.1544099024.git.andreyknvl@google.com/

======= What is the new "dense mode"?
To further save memory, the dense mode is introduced. Normally one
shadow byte stores one tag, and that one tag covers one granule of
allocated memory, which is 16 bytes. In the dense mode, one tag still
covers 16 bytes of allocated memory but is shortened from 8 bits to 4
bits, which makes it possible to store two tags in one shadow memory
byte.

=== Example:
The example below shows what the shadow memory looks like after
allocating 48 bytes of memory in both the normal tag-based mode and the
dense mode. The contents of shadow memory are overlaid onto address
offsets that they relate to in the allocated kernel memory. Each cell
|        | symbolizes one byte of shadow memory.

= The regular tag based mode:
- Randomly generated 8-bit tag equals 0xAB.
- 0xFE is the tag that symbolizes unallocated memory.

Shadow memory contents:           |  0xAB  |  0xAB  |  0xAB  |  0xFE  |
Shadow memory address offsets:    0        1        2        3        4
Allocated memory address offsets: 0        16       32       48       64

= The dense tag based mode:
- Randomly generated 4-bit tag equals 0xC.
- 0xE is the tag that symbolizes unallocated memory.

Shadow memory contents:           |0xC 0xC |0xC 0xE |0xE 0xE |0xE 0xE |
Shadow memory address offsets:    0        1        2        3        4
Allocated memory address offsets: 0        32       64       96       128

=== Dense mode benefits summary
For the small price of a couple of extra bit shifts, the dense mode uses
only half the shadow memory of the current arm64 tag-based mode, while
still preserving the 16-byte tag granularity, which allows catching
out-of-bounds accesses at smaller offsets.
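
A back-of-the-envelope comparison of the shadow overhead (illustrative
only, assuming the ratios that follow from the modes described above:
one shadow byte per 8 bytes of memory for generic KASAN, per 16 for the
arm64 tag-based mode, per 32 for the dense mode):

#include <stdio.h>

int main(void)
{
	unsigned long long ram_gb = 512;	/* the benchmark machine below */

	printf("generic shadow: %llu GB (1/8 of RAM)\n",  ram_gb / 8);
	printf("sw-tags shadow: %llu GB (1/16 of RAM)\n", ram_gb / 16);
	printf("dense shadow:   %llu GB (1/32 of RAM)\n", ram_gb / 32);
	return 0;
}

This covers the shadow memory alone; the totals measured in the
benchmark section also include stack traces and other metadata.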

======= Differences summary compared to the arm64 tag-based mode
- Tag width:
	- The tag width influences the chance that two tags from
	  different allocations have the same value. The bigger the
	  possible range of tag values, the lower the chance of that
	  happening.
	- Shortening the tag width from 8 bits to 4, while helping with
	  memory usage, also increases the chance of not reporting an
	  error. 4-bit tags have a ~7% chance of such a collision
	  (roughly 1 in 14, since two of the 16 values are reserved).

- TBI and LAM:
	- TBI on arm64 allows storing metadata in the top 8 bits of the
	  virtual address.
	- LAM on x86 allows storing tags in bits [62:57] of the pointer.
	  To maximize memory savings, the tag occupies only four of
	  those bits: [60:57].

======= Testing
All the KASAN kunit tests were run for both the software tag-based and
the generic mode after making the changes.

In generic mode the results were:

kasan: pass:59 fail:0 skip:13 total:72
Totals: pass:59 fail:0 skip:13 total:72
ok 1 kasan

and for software tags:

kasan: pass:63 fail:0 skip:9 total:72
Totals: pass:63 fail:0 skip:9 total:72
ok 1 kasan

======= Benchmarks
All tests were run on a Sierra Forest server platform with 512GB of
memory. The only differences between the tests were the kernel options:
	- CONFIG_KASAN
	- CONFIG_KASAN_GENERIC
	- CONFIG_KASAN_SW_TAGS
	- CONFIG_KASAN_INLINE [1]
	- CONFIG_KASAN_OUTLINE [1]

Used memory in GBs after boot [2][3]:
* 14 for clean kernel
* 91 / 90 for generic KASAN (inline/outline)
* 31 for tag-based KASAN

Boot time (until login prompt):
* 03:48 for clean kernel
* 08:02 / 09:45 for generic KASAN (inline/outline)
* 08:50 for dense tag-based KASAN
* 04:50 for dense tag-based KASAN with stacktrace disabled [4]

Compilation time comparison (10 cores):
* 7:27 for clean kernel
* 8:21/7:44 for generic KASAN (inline/outline)
* 7:41 for tag-based KASAN

Network performance [5]:
* 13.7 Gbits/sec for clean kernel
* 2.25 Gbits/sec for generic KASAN inline
* 1.50 Gbits/sec for generic KASAN outline
* 1.55 Gbits/sec for dense tag-based KASAN
* 2.86 Gbits/sec for dense tag-based KASAN with stacktrace disabled

[1] Based on the hwasan and asan compiler parameters used in
scripts/Makefile.kasan, the inline/outline choice has a bigger impact on
the generic mode than on the tag-based mode. In the former, inlining
actually increases the kernel image size and improves performance. In
the latter, choosing the outline mode un-inlines some code portions for
debugging purposes, but no real difference is visible in performance or
kernel image size.

[2] The value reported by "cat /proc/meminfo | grep MemAvailable" was
subtracted from the total memory of the system. The initial plan was to
use "grep Slab", as in the cover letter of the arm64 tag-based series,
but because the tests were run on a system with 512GB of RAM and memory
usage was split up between more categories, MemAvailable shows the
memory savings better.

[3] Subtracting the 14 GB of the clean kernel from the KASAN
measurements shows that the tag-based mode uses about 4x less additional
memory than the generic mode.

[4] Memory allocation and freeing performance suffers heavily from saving
stacktraces that can be later displayed in error reports.

[5] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`.

======= Compilation
Clang was used to compile the series (make LLVM=1) since gcc doesn't
seem to have support for KASAN tag-based compiler instrumentation on
x86.

======= Dependencies
The series is based on the risc-v series [1] that's currently in
review. Because of this, for the time being it only applies cleanly on
top of the 6.12 mainline kernel. It will be rebased on the newest kernel
once the risc-v series is also rebased.

[1] https://lore.kernel.org/all/20241022015913.3524425-1-samuel.holland@sifive.com/

Maciej Wieczor-Retman (15):
  kasan: Allocation enhancement for dense tag-based mode
  kasan: Tag checking with dense tag-based mode
  kasan: Vmalloc dense tag-based mode support
  kasan: arm64: x86: risc-v: Make special tags arch specific
  x86: Add arch specific kasan functions
  x86: Reset tag for virtual to physical address conversions
  mm: Pcpu chunk address tag reset
  x86: Physical address comparisons in fill_p*d/pte
  x86: Physical address comparison in current_mm pgd check
  x86: KASAN raw shadow memory PTE init
  x86: LAM initialization
  x86: Minimal SLAB alignment
  x86: runtime_const used for KASAN_SHADOW_END
  x86: Make software tag-based kasan available
  kasan: Add mitigation and debug modes

 Documentation/arch/x86/x86_64/mm.rst |  6 +-
 MAINTAINERS                          |  2 +-
 arch/arm64/include/asm/kasan-tags.h  |  9 +++
 arch/riscv/include/asm/kasan-tags.h  | 12 ++++
 arch/riscv/include/asm/kasan.h       |  4 --
 arch/x86/Kconfig                     | 11 +++-
 arch/x86/boot/compressed/misc.h      |  2 +
 arch/x86/include/asm/kasan-tags.h    |  9 +++
 arch/x86/include/asm/kasan.h         | 50 +++++++++++++--
 arch/x86/include/asm/page.h          | 17 +++--
 arch/x86/include/asm/page_64.h       |  2 +-
 arch/x86/kernel/head_64.S            |  3 +
 arch/x86/kernel/setup.c              |  2 +
 arch/x86/kernel/vmlinux.lds.S        |  1 +
 arch/x86/mm/init.c                   |  3 +
 arch/x86/mm/init_64.c                |  8 +--
 arch/x86/mm/kasan_init_64.c          | 24 +++++--
 arch/x86/mm/physaddr.c               |  1 +
 arch/x86/mm/tlb.c                    |  2 +-
 include/linux/kasan-tags.h           | 12 +++-
 include/linux/kasan.h                | 94 +++++++++++++++++++++++-----
 include/linux/mm.h                   |  6 +-
 include/linux/page-flags-layout.h    |  7 +--
 lib/Kconfig.kasan                    | 49 +++++++++++++++
 mm/kasan/Makefile                    |  3 +
 mm/kasan/dense.c                     | 83 ++++++++++++++++++++++++
 mm/kasan/kasan.h                     | 27 +-------
 mm/kasan/report.c                    |  6 +-
 mm/kasan/report_sw_tags.c            | 12 ++--
 mm/kasan/shadow.c                    | 47 ++++++++++----
 mm/kasan/sw_tags.c                   |  8 +++
 mm/kasan/tags.c                      |  5 ++
 mm/percpu-vm.c                       |  2 +-
 33 files changed, 432 insertions(+), 97 deletions(-)
 create mode 100644 arch/arm64/include/asm/kasan-tags.h
 create mode 100644 arch/riscv/include/asm/kasan-tags.h
 create mode 100644 arch/x86/include/asm/kasan-tags.h
 create mode 100644 mm/kasan/dense.c

-- 
2.47.1




* [PATCH 01/15] kasan: Allocation enhancement for dense tag-based mode
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-05 23:43   ` Andrey Konovalov
  2025-02-04 17:33 ` [PATCH 02/15] kasan: Tag checking with " Maciej Wieczor-Retman
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

Tag-based KASAN (on arm64) works by generating a random 8-bit tag and
putting it both in the top byte of the pointer (that points to the
allocated memory) and in all bytes of shadow memory that correspond to
the chunk of allocated regular memory. Each byte of shadow memory covers
a 16-byte chunk of allocated memory - a size called the KASAN
granularity. This means that out-of-bounds memory accesses that happen
inside those 16 bytes can't be caught.

The dense mode reduces the tag width from 8 to 4 bits and stores two
tags in one byte of shadow memory - one in the upper 4 bits of the byte
and one in the lower 4. This way one byte of shadow memory can cover 32
bytes of allocated memory while still keeping the "16 bytes per one tag"
granularity. The lower 4 bits of each shadow byte map the bytes of
memory with offsets 0-15 and the upper 4 bits map offsets 16-31.

Example:
The example below shows what the shadow memory looks like after
allocating 48 bytes of memory in both normal tag-based mode and the
dense mode. The contents of shadow memory are overlaid onto address
offsets that they relate to in the allocated kernel memory. Each cell
|    | symbolizes one byte of shadow memory.

= The regular tag based mode:
- Randomly generated 8-bit tag equals 0xAB.
- 0xFE is the tag that symbolizes unallocated memory.

Shadow memory contents:           |  0xAB  |  0xAB  |  0xAB  |  0xFE  |
Shadow memory address offsets:    0        1        2        3        4
Allocated memory address offsets: 0        16       32       48       64

= The dense tag based mode:
- Randomly generated 4-bit tag equals 0xC.
- 0xE is the tag that symbolizes unallocated memory.

Shadow memory contents:           |0xC 0xC |0xC 0xE |0xE 0xE |0xE 0xE |
Shadow memory address offsets:    0        1        2        3        4
Allocated memory address offsets: 0        32       64       96       128
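
To illustrate the mapping (a minimal sketch with made-up names; the
series itself goes through kasan_mem_to_shadow() and the KASAN_GRANULE_*
/ KASAN_SHADOW_SCALE_* macros instead):

#include <stdint.h>

/*
 * Return the 4-bit tag covering the 16-byte granule that 'addr' falls
 * into, given the shadow byte that covers the surrounding 32 bytes.
 * Bit 4 of the address selects the granule and therefore the nibble.
 */
static uint8_t dense_shadow_tag(uintptr_t addr, uint8_t shadow_byte)
{
	if (addr & 0x10)
		return shadow_byte >> 4;	/* offsets 16-31: upper nibble */
	return shadow_byte & 0x0f;		/* offsets 0-15: lower nibble */
}

The kasan_poison() changes below do the inverse when writing tags: the
first and last partially covered shadow bytes have only the matching
nibble updated, while the fully covered middle part is memset in one go.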

Add a new config option and defines that can override the standard
scheme of one tag per one shadow byte.

Add an alternative version of kasan_poison() that deals with tags not
being aligned to a full byte of shadow memory.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 include/linux/kasan.h | 18 ++++++++++++++++++
 lib/Kconfig.kasan     | 21 +++++++++++++++++++++
 mm/kasan/kasan.h      |  4 +---
 mm/kasan/shadow.c     | 33 ++++++++++++++++++++++++++++++---
 4 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 03b440658817..ea0f5acd875b 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -35,6 +35,24 @@ typedef unsigned int __bitwise kasan_vmalloc_flags_t;
 
 /* Software KASAN implementations use shadow memory. */
 
+#ifdef CONFIG_KASAN_SW_TAGS_DENSE
+#define KASAN_GRANULE_SHIFT	(KASAN_SHADOW_SCALE_SHIFT - 1)
+#define KASAN_SHADOW_SCALE_SIZE	(1UL << KASAN_SHADOW_SCALE_SHIFT)
+static inline u8 kasan_dense_tag(u8 tag)
+{
+	return (tag << KASAN_TAG_WIDTH | tag);
+}
+#else
+#define KASAN_GRANULE_SHIFT	KASAN_SHADOW_SCALE_SHIFT
+#define KASAN_SHADOW_SCALE_SIZE	(1UL << KASAN_GRANULE_SHIFT)
+static inline u8 kasan_dense_tag(u8 tag)
+{
+	return tag;
+}
+#endif
+
+#define KASAN_GRANULE_SIZE	(1UL << KASAN_GRANULE_SHIFT)
+
 #ifdef CONFIG_KASAN_SW_TAGS
 /* This matches KASAN_TAG_INVALID. */
 #define KASAN_SHADOW_INIT 0xFE
diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
index 98016e137b7f..d08b4e9bf477 100644
--- a/lib/Kconfig.kasan
+++ b/lib/Kconfig.kasan
@@ -19,6 +19,13 @@ config ARCH_DISABLE_KASAN_INLINE
 	  Disables both inline and stack instrumentation. Selected by
 	  architectures that do not support these instrumentation types.
 
+config ARCH_HAS_KASAN_SW_TAGS_DENSE
+	bool
+	help
+	  Enables option to compile tag-based KASAN with densely packed tags -
+	  two 4-bit tags per one byte of shadow memory. Set on architectures
+	  that have 4-bit tag macros.
+
 config CC_HAS_KASAN_GENERIC
 	def_bool $(cc-option, -fsanitize=kernel-address)
 
@@ -223,4 +230,18 @@ config KASAN_EXTRA_INFO
 	  boot parameter, it will add 8 * stack_ring_size bytes of additional
 	  memory consumption.
 
+config KASAN_SW_TAGS_DENSE
+	bool "Two 4-bit tags in one shadow memory byte"
+	depends on KASAN_SW_TAGS
+	depends on ARCH_HAS_KASAN_SW_TAGS_DENSE
+	help
+	  Enables packing two tags into one shadow byte to halve the memory usage
+	  compared to normal tag-based mode.
+
+	  After setting this option, tag width macro is set to 4 and size macros
+	  are adjusted based on used KASAN_SHADOW_SCALE_SHIFT.
+
+	  ARCH_HAS_KASAN_SW_TAGS_DENSE is needed for this option since the
+	  special tag macros need to be properly set for 4-bit wide tags.
+
 endif # KASAN
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 72da5ddcceaa..0e04c5e2c405 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -128,9 +128,7 @@ static inline bool kasan_requires_meta(void)
 
 #endif /* CONFIG_KASAN_GENERIC */
 
-#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
-#define KASAN_GRANULE_SIZE	(1UL << KASAN_SHADOW_SCALE_SHIFT)
-#else
+#ifdef CONFIG_KASAN_HW_TAGS
 #include <asm/mte-kasan.h>
 #define KASAN_GRANULE_SIZE	MTE_GRANULE_SIZE
 #endif
diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index d6210ca48dda..368503f54b87 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -123,7 +123,8 @@ EXPORT_SYMBOL(__hwasan_memcpy);
 
 void kasan_poison(const void *addr, size_t size, u8 value, bool init)
 {
-	void *shadow_start, *shadow_end;
+	u8 *shadow_start, *shadow_end, *shadow_start_aligned, *shadow_end_aligned, tag;
+	u64 addr64, addr_start_aligned, addr_end_aligned;
 
 	if (!kasan_arch_is_ready())
 		return;
@@ -134,16 +135,42 @@ void kasan_poison(const void *addr, size_t size, u8 value, bool init)
 	 * addresses to this function.
 	 */
 	addr = kasan_reset_tag(addr);
+	addr64 = (u64)addr;
 
-	if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
+	if (WARN_ON(addr64 & KASAN_GRANULE_MASK))
 		return;
 	if (WARN_ON(size & KASAN_GRANULE_MASK))
 		return;
 
 	shadow_start = kasan_mem_to_shadow(addr);
 	shadow_end = kasan_mem_to_shadow(addr + size);
+	addr_start_aligned = round_up(addr64, KASAN_SHADOW_SCALE_SIZE);
+	addr_end_aligned = round_down(addr64 + size, KASAN_SHADOW_SCALE_SIZE);
+	shadow_start_aligned = kasan_mem_to_shadow((void *)addr_start_aligned);
+	shadow_end_aligned = kasan_mem_to_shadow((void *)addr_end_aligned);
+
+	/* If size is empty just return. */
+	if (!size)
+		return;
 
-	__memset(shadow_start, value, shadow_end - shadow_start);
+	/* Memset the first unaligned tag in shadow memory. */
+	if (addr64 % KASAN_SHADOW_SCALE_SIZE) {
+		tag = *shadow_start & KASAN_TAG_MASK;
+		tag |= value << KASAN_TAG_WIDTH;
+		*shadow_start = tag;
+	}
+
+	/* Memset the middle aligned part in shadow memory. */
+	tag = kasan_dense_tag(value);
+	__memset(shadow_start_aligned, tag, shadow_end_aligned - shadow_start_aligned);
+
+	/* Memset the last unaligned tag in shadow memory. */
+	if ((addr64 + size) % KASAN_SHADOW_SCALE_SIZE) {
+		tag = KASAN_TAG_MASK << KASAN_TAG_WIDTH;
+		tag &= *shadow_end;
+		tag |= value;
+		*shadow_end = tag;
+	}
 }
 EXPORT_SYMBOL_GPL(kasan_poison);
 
-- 
2.47.1




* [PATCH 02/15] kasan: Tag checking with dense tag-based mode
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 01/15] kasan: Allocation enhancement for dense tag-based mode Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-05 23:45   ` Andrey Konovalov
  2025-02-04 17:33 ` [PATCH 03/15] kasan: Vmalloc dense tag-based mode support Maciej Wieczor-Retman
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

In KASAN's tag-based mode (on arm64), when a memory access occurs, the
tag stored in the top 8 bits of the pointer is compared with the tags
saved in the region of shadow memory that maps to the accessed memory.
If any of the tags in that shadow memory region does not match the one
stored in the pointer, an error report is generated.

With the introduction of the dense mode, tags won't necessarily occupy
whole bytes of shadow memory if the previously allocated memory wasn't
aligned to 32 bytes - which is the coverage of one shadow byte.

Add an alternative implementation of kasan_check_range() that performs
special checks on the first and last bytes of a shadow memory range if
the originally allocated memory wasn't aligned to 32 bytes.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 include/linux/kasan.h     | 47 +++++++++++++++-------
 mm/kasan/Makefile         |  3 ++
 mm/kasan/dense.c          | 83 +++++++++++++++++++++++++++++++++++++++
 mm/kasan/kasan.h          |  2 +-
 mm/kasan/report.c         |  2 +-
 mm/kasan/report_sw_tags.c | 12 ++----
 mm/kasan/sw_tags.c        |  8 ++++
 7 files changed, 133 insertions(+), 24 deletions(-)
 create mode 100644 mm/kasan/dense.c

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index ea0f5acd875b..5a3e9bec21c2 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -33,6 +33,20 @@ typedef unsigned int __bitwise kasan_vmalloc_flags_t;
 
 #include <linux/pgtable.h>
 
+#ifndef kasan_mem_to_shadow
+static inline void *kasan_mem_to_shadow(const void *addr)
+{
+	void *scaled;
+
+	if (IS_ENABLED(CONFIG_KASAN_GENERIC))
+		scaled = (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT);
+	else
+		scaled = (void *)((long)addr >> KASAN_SHADOW_SCALE_SHIFT);
+
+	return KASAN_SHADOW_OFFSET + scaled;
+}
+#endif
+
 /* Software KASAN implementations use shadow memory. */
 
 #ifdef CONFIG_KASAN_SW_TAGS_DENSE
@@ -53,6 +67,25 @@ static inline u8 kasan_dense_tag(u8 tag)
 
 #define KASAN_GRANULE_SIZE	(1UL << KASAN_GRANULE_SHIFT)
 
+#ifdef CONFIG_KASAN_SW_TAGS_DENSE
+static inline u8 kasan_get_shadow_tag(const void *ptr)
+{
+	u8 shadow_byte = *(u8 *)kasan_mem_to_shadow(ptr);
+	unsigned long addr = (unsigned long)ptr;
+	int shift;
+
+	shift = !!(addr & KASAN_GRANULE_SIZE) * KASAN_TAG_WIDTH;
+	shadow_byte >>= shift;
+
+	return shadow_byte & KASAN_TAG_KERNEL;
+}
+#else
+static inline u8 kasan_get_shadow_tag(const void *addr)
+{
+	return (*(u8 *)kasan_mem_to_shadow(addr));
+}
+#endif
+
 #ifdef CONFIG_KASAN_SW_TAGS
 /* This matches KASAN_TAG_INVALID. */
 #define KASAN_SHADOW_INIT 0xFE
@@ -73,20 +106,6 @@ extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
 int kasan_populate_early_shadow(const void *shadow_start,
 				const void *shadow_end);
 
-#ifndef kasan_mem_to_shadow
-static inline void *kasan_mem_to_shadow(const void *addr)
-{
-	void *scaled;
-
-	if (IS_ENABLED(CONFIG_KASAN_GENERIC))
-		scaled = (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT);
-	else
-		scaled = (void *)((long)addr >> KASAN_SHADOW_SCALE_SHIFT);
-
-	return KASAN_SHADOW_OFFSET + scaled;
-}
-#endif
-
 int kasan_add_zero_shadow(void *start, unsigned long size);
 void kasan_remove_zero_shadow(void *start, unsigned long size);
 
diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
index b88543e5c0cc..3a460abd4c18 100644
--- a/mm/kasan/Makefile
+++ b/mm/kasan/Makefile
@@ -5,6 +5,7 @@ KCOV_INSTRUMENT := n
 
 # Disable ftrace to avoid recursion.
 CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_dense.o = $(CC_FLAGS_FTRACE)
 CFLAGS_REMOVE_generic.o = $(CC_FLAGS_FTRACE)
 CFLAGS_REMOVE_init.o = $(CC_FLAGS_FTRACE)
 CFLAGS_REMOVE_quarantine.o = $(CC_FLAGS_FTRACE)
@@ -24,6 +25,7 @@ CC_FLAGS_KASAN_RUNTIME += -fno-stack-protector
 CC_FLAGS_KASAN_RUNTIME += -DDISABLE_BRANCH_PROFILING
 
 CFLAGS_common.o := $(CC_FLAGS_KASAN_RUNTIME)
+CFLAGS_dense.o := $(CC_FLAGS_KASAN_RUNTIME)
 CFLAGS_generic.o := $(CC_FLAGS_KASAN_RUNTIME)
 CFLAGS_init.o := $(CC_FLAGS_KASAN_RUNTIME)
 CFLAGS_quarantine.o := $(CC_FLAGS_KASAN_RUNTIME)
@@ -49,6 +51,7 @@ RUSTFLAGS_kasan_test_rust.o := $(RUSTFLAGS_KASAN)
 CFLAGS_kasan_test_module.o := $(CFLAGS_KASAN_TEST)
 
 obj-y := common.o report.o
+obj-$(CONFIG_KASAN_SW_TAGS_DENSE) += dense.o
 obj-$(CONFIG_KASAN_GENERIC) += init.o generic.o report_generic.o shadow.o quarantine.o
 obj-$(CONFIG_KASAN_HW_TAGS) += hw_tags.o report_hw_tags.o tags.o report_tags.o
 obj-$(CONFIG_KASAN_SW_TAGS) += init.o report_sw_tags.o shadow.o sw_tags.o tags.o report_tags.o
diff --git a/mm/kasan/dense.c b/mm/kasan/dense.c
new file mode 100644
index 000000000000..306bbbfdce29
--- /dev/null
+++ b/mm/kasan/dense.c
@@ -0,0 +1,83 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "kasan.h"
+
+static __always_inline bool kasan_check_range_inline(const void *addr,
+						     size_t size, bool write,
+						     unsigned long ret_ip)
+{
+	u8 *shadow_first, *shadow_last, *shadow, *shadow_first_aligned, *shadow_last_aligned;
+	u64 addr_start_aligned, addr_end_aligned;
+	u8 tag, kasan_granule_offset;
+	size_t aligned_size;
+	void *untagged_addr;
+
+	if (unlikely(size == 0))
+		return true;
+
+	if (unlikely(addr + size < addr))
+		return !kasan_report(addr, size, write, ret_ip);
+
+	tag = get_tag((const void *)addr);
+
+	/*
+	 * Ignore accesses for pointers tagged with native kernel
+	 * pointer tag to suppress false positives caused by kmap.
+	 *
+	 * Some kernel code was written to account for archs that don't keep
+	 * high memory mapped all the time, but rather map and unmap particular
+	 * pages when needed. Instead of storing a pointer to the kernel memory,
+	 * this code saves the address of the page structure and offset within
+	 * that page for later use. Those pages are then mapped and unmapped
+	 * with kmap/kunmap when necessary and virt_to_page is used to get the
+	 * virtual address of the page. For arm64 (that keeps the high memory
+	 * mapped all the time), kmap is turned into a page_address call.
+	 *
+	 * The issue is that with use of the page_address + virt_to_page
+	 * sequence the top byte value of the original pointer gets lost (gets
+	 * set to KASAN_TAG_KERNEL).
+	 */
+	if (tag == KASAN_TAG_KERNEL)
+		return true;
+
+	untagged_addr = kasan_reset_tag((void *)round_down((u64)addr, KASAN_GRANULE_SIZE));
+	if (unlikely(!addr_has_metadata(untagged_addr)))
+		return !kasan_report(addr, size, write, ret_ip);
+
+	kasan_granule_offset = ((u64)addr & KASAN_GRANULE_MASK);
+	aligned_size = round_up(size + kasan_granule_offset, KASAN_GRANULE_SIZE);
+	shadow_first = kasan_mem_to_shadow(untagged_addr);
+	shadow_last = kasan_mem_to_shadow(untagged_addr + aligned_size);
+	addr_start_aligned = round_up((u64)untagged_addr, KASAN_SHADOW_SCALE_SIZE);
+	addr_end_aligned = round_down((u64)untagged_addr + aligned_size, KASAN_SHADOW_SCALE_SIZE);
+	shadow_first_aligned = kasan_mem_to_shadow((void *)addr_start_aligned);
+	shadow_last_aligned = kasan_mem_to_shadow((void *)addr_end_aligned);
+
+	/* Check the first unaligned tag in shadow memory. */
+	if ((u64)untagged_addr % KASAN_SHADOW_SCALE_SIZE) {
+		if (unlikely((*shadow_first >> KASAN_TAG_WIDTH) != tag))
+			return !kasan_report(addr, size, write, ret_ip);
+	}
+
+	/* Check the middle aligned part in shadow memory. */
+	for (shadow = shadow_first_aligned; shadow < shadow_last_aligned; shadow++) {
+		if (unlikely(*shadow != ((tag << KASAN_TAG_WIDTH) | tag)))
+			return !kasan_report(addr, size, write, ret_ip);
+	}
+
+	/* Check the last unaligned tag in shadow memory. */
+	if (((u64)untagged_addr + aligned_size) % KASAN_SHADOW_SCALE_SIZE) {
+		if (unlikely((*shadow_last & KASAN_TAG_MASK) != tag))
+			return !kasan_report(addr, size, write, ret_ip);
+	}
+
+	return true;
+}
+
+#if IS_ENABLED(CONFIG_KASAN_SW_TAGS_DENSE)
+bool kasan_check_range(const void *addr, size_t size, bool write,
+		       unsigned long ret_ip)
+{
+	return kasan_check_range_inline(addr, size, write, ret_ip);
+}
+#endif
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 0e04c5e2c405..d29bd0e65020 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -183,7 +183,7 @@ static inline bool kasan_requires_meta(void)
 #define META_BYTES_PER_BLOCK 1
 #define META_BLOCKS_PER_ROW 16
 #define META_BYTES_PER_ROW (META_BLOCKS_PER_ROW * META_BYTES_PER_BLOCK)
-#define META_MEM_BYTES_PER_ROW (META_BYTES_PER_ROW * KASAN_GRANULE_SIZE)
+#define META_MEM_BYTES_PER_ROW (META_BYTES_PER_ROW * KASAN_SHADOW_SCALE_SIZE)
 #define META_ROWS_AROUND_ADDR 2
 
 #define KASAN_STACK_DEPTH 64
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index c08097715686..ee9e406b0cdb 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -436,7 +436,7 @@ static int meta_pointer_offset(const void *row, const void *addr)
 	 *    plus 1 byte for space.
 	 */
 	return 3 + (BITS_PER_LONG / 8) * 2 +
-		(addr - row) / KASAN_GRANULE_SIZE * 3 + 1;
+		(addr - row) / KASAN_SHADOW_SCALE_SIZE * 3 + 1;
 }
 
 static void print_memory_metadata(const void *addr)
diff --git a/mm/kasan/report_sw_tags.c b/mm/kasan/report_sw_tags.c
index 689e94f9fe3c..1ac5c7a9011d 100644
--- a/mm/kasan/report_sw_tags.c
+++ b/mm/kasan/report_sw_tags.c
@@ -39,7 +39,7 @@ const void *kasan_find_first_bad_addr(const void *addr, size_t size)
 	if (!addr_has_metadata(p))
 		return p;
 
-	while (p < end && tag == *(u8 *)kasan_mem_to_shadow(p))
+	while (p < end && tag == kasan_get_shadow_tag(p))
 		p += KASAN_GRANULE_SIZE;
 
 	return p;
@@ -48,7 +48,6 @@ const void *kasan_find_first_bad_addr(const void *addr, size_t size)
 size_t kasan_get_alloc_size(void *object, struct kmem_cache *cache)
 {
 	size_t size = 0;
-	u8 *shadow;
 
 	/*
 	 * Skip the addr_has_metadata check, as this function only operates on
@@ -59,13 +58,11 @@ size_t kasan_get_alloc_size(void *object, struct kmem_cache *cache)
 	 * The loop below returns 0 for freed objects, for which KASAN cannot
 	 * calculate the allocation size based on the metadata.
 	 */
-	shadow = (u8 *)kasan_mem_to_shadow(object);
 	while (size < cache->object_size) {
-		if (*shadow != KASAN_TAG_INVALID)
+		if (kasan_get_shadow_tag(object + size) != KASAN_TAG_INVALID)
 			size += KASAN_GRANULE_SIZE;
 		else
 			return size;
-		shadow++;
 	}
 
 	return cache->object_size;
@@ -78,9 +75,8 @@ void kasan_metadata_fetch_row(char *buffer, void *row)
 
 void kasan_print_tags(u8 addr_tag, const void *addr)
 {
-	u8 *shadow = (u8 *)kasan_mem_to_shadow(addr);
-
-	pr_err("Pointer tag: [%02x], memory tag: [%02x]\n", addr_tag, *shadow);
+	pr_err("Pointer tag: [%02x], memory tag: [%02x]\n", addr_tag,
+	       kasan_get_shadow_tag(addr));
 }
 
 #ifdef CONFIG_KASAN_STACK
diff --git a/mm/kasan/sw_tags.c b/mm/kasan/sw_tags.c
index 32435d33583a..7a6b8ea9bf78 100644
--- a/mm/kasan/sw_tags.c
+++ b/mm/kasan/sw_tags.c
@@ -79,6 +79,7 @@ u8 __hwasan_generate_tag(void)
 }
 EXPORT_SYMBOL(__hwasan_generate_tag);
 
+#if !IS_ENABLED(CONFIG_KASAN_SW_TAGS_DENSE)
 bool kasan_check_range(const void *addr, size_t size, bool write,
 			unsigned long ret_ip)
 {
@@ -127,17 +128,24 @@ bool kasan_check_range(const void *addr, size_t size, bool write,
 
 	return true;
 }
+#endif
 
 bool kasan_byte_accessible(const void *addr)
 {
 	u8 tag = get_tag(addr);
 	void *untagged_addr = kasan_reset_tag(addr);
 	u8 shadow_byte;
+	int shift;
 
 	if (!addr_has_metadata(untagged_addr))
 		return false;
 
 	shadow_byte = READ_ONCE(*(u8 *)kasan_mem_to_shadow(untagged_addr));
+	if (IS_ENABLED(CONFIG_KASAN_SW_TAGS_DENSE)) {
+		shift = !!((u64)addr & BIT(KASAN_TAG_WIDTH)) * KASAN_TAG_WIDTH;
+		shadow_byte = (shadow_byte >> shift) & KASAN_TAG_KERNEL;
+	}
+
 	return tag == KASAN_TAG_KERNEL || tag == shadow_byte;
 }
 
-- 
2.47.1




* [PATCH 03/15] kasan: Vmalloc dense tag-based mode support
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 01/15] kasan: Allocation enhancement for dense tag-based mode Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 02/15] kasan: Tag checking with " Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 04/15] kasan: arm64: x86: risc-v: Make special tags arch specific Maciej Wieczor-Retman
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

To use KASAN with the vmalloc allocator, multiple functions are
implemented that deal with full pages of memory. Many of these functions
are hardcoded to operate on byte-aligned shadow memory regions by using
__memset().

With the introduction of the dense mode, tags won't necessarily occupy
whole bytes of shadow memory if the previously allocated memory wasn't
aligned to 32 bytes - which is the coverage of one shadow byte.

Change the __memset() calls to kasan_poison(). With the dense tag-based
mode enabled, it will take care of any unaligned tags in shadow memory.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 mm/kasan/kasan.h  |  2 +-
 mm/kasan/shadow.c | 14 ++++++--------
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index d29bd0e65020..a56aadd51485 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -135,7 +135,7 @@ static inline bool kasan_requires_meta(void)
 
 #define KASAN_GRANULE_MASK	(KASAN_GRANULE_SIZE - 1)
 
-#define KASAN_MEMORY_PER_SHADOW_PAGE	(KASAN_GRANULE_SIZE << PAGE_SHIFT)
+#define KASAN_MEMORY_PER_SHADOW_PAGE	(KASAN_SHADOW_SCALE_SIZE << PAGE_SHIFT)
 
 #ifdef CONFIG_KASAN_GENERIC
 #define KASAN_PAGE_FREE		0xFF  /* freed page */
diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index 368503f54b87..94f51046e6ae 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -332,7 +332,7 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
 	if (!page)
 		return -ENOMEM;
 
-	__memset((void *)page, KASAN_VMALLOC_INVALID, PAGE_SIZE);
+	kasan_poison((void *)page, PAGE_SIZE, KASAN_VMALLOC_INVALID, false);
 	pte = pfn_pte(PFN_DOWN(__pa(page)), PAGE_KERNEL);
 
 	spin_lock(&init_mm.page_table_lock);
@@ -357,9 +357,6 @@ int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
 	if (!is_vmalloc_or_module_addr((void *)addr))
 		return 0;
 
-	shadow_start = (unsigned long)kasan_mem_to_shadow((void *)addr);
-	shadow_end = (unsigned long)kasan_mem_to_shadow((void *)addr + size);
-
 	/*
 	 * User Mode Linux maps enough shadow memory for all of virtual memory
 	 * at boot, so doesn't need to allocate more on vmalloc, just clear it.
@@ -368,12 +365,12 @@ int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
 	 * reason.
 	 */
 	if (IS_ENABLED(CONFIG_UML)) {
-		__memset((void *)shadow_start, KASAN_VMALLOC_INVALID, shadow_end - shadow_start);
+		kasan_poison((void *)addr, size, KASAN_VMALLOC_INVALID, false);
 		return 0;
 	}
 
-	shadow_start = PAGE_ALIGN_DOWN(shadow_start);
-	shadow_end = PAGE_ALIGN(shadow_end);
+	shadow_start = PAGE_ALIGN_DOWN((unsigned long)kasan_mem_to_shadow((void *)addr));
+	shadow_end = PAGE_ALIGN((unsigned long)kasan_mem_to_shadow((void *)addr + size));
 
 	ret = apply_to_page_range(&init_mm, shadow_start,
 				  shadow_end - shadow_start,
@@ -546,7 +543,8 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,
 	if (shadow_end > shadow_start) {
 		size = shadow_end - shadow_start;
 		if (IS_ENABLED(CONFIG_UML)) {
-			__memset(shadow_start, KASAN_SHADOW_INIT, shadow_end - shadow_start);
+			kasan_poison((void *)region_start, region_end - region_start,
+				     KASAN_VMALLOC_INVALID, false);
 			return;
 		}
 		apply_to_existing_page_range(&init_mm,
-- 
2.47.1




* [PATCH 04/15] kasan: arm64: x86: risc-v: Make special tags arch specific
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (2 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 03/15] kasan: Vmalloc dense tag-based mode support Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-05 20:20   ` Palmer Dabbelt
  2025-02-04 17:33 ` [PATCH 05/15] x86: Add arch specific kasan functions Maciej Wieczor-Retman
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

KASAN's tag-based mode defines multiple special tag values. They're
reserved for:
- Native kernel value. On arm64 it's 0xFF and it causes an early return
  in the tag checking function.
- Invalid value. 0xFE marks an area as freed / unallocated. It's also
  the value that is used to initialize regions of shadow memory.
- Max value. 0xFD is the highest value that can be randomly generated
  for a new tag.

A metadata macro is also defined:
- Tag width equal to 8.

The tag-based mode on x86 is going to use 4-bit wide tags, so all the
above values need to be changed accordingly.

Make the tags arch specific for x86, risc-v and arm64. On x86 the values
just lose the top 4 bits.

Replace hardcoded kernel tag value and tag width with macros in KASAN's
non-arch specific code.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 MAINTAINERS                         |  2 +-
 arch/arm64/include/asm/kasan-tags.h |  9 +++++++++
 arch/riscv/include/asm/kasan-tags.h | 12 ++++++++++++
 arch/riscv/include/asm/kasan.h      |  4 ----
 arch/x86/include/asm/kasan-tags.h   |  9 +++++++++
 include/linux/kasan-tags.h          | 12 +++++++++++-
 include/linux/kasan.h               |  4 +++-
 include/linux/mm.h                  |  6 +++---
 include/linux/page-flags-layout.h   |  7 +------
 9 files changed, 49 insertions(+), 16 deletions(-)
 create mode 100644 arch/arm64/include/asm/kasan-tags.h
 create mode 100644 arch/riscv/include/asm/kasan-tags.h
 create mode 100644 arch/x86/include/asm/kasan-tags.h

diff --git a/MAINTAINERS b/MAINTAINERS
index b878ddc99f94..45671faa3b6f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12227,7 +12227,7 @@ L:	kasan-dev@googlegroups.com
 S:	Maintained
 B:	https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management
 F:	Documentation/dev-tools/kasan.rst
-F:	arch/*/include/asm/*kasan.h
+F:	arch/*/include/asm/*kasan*.h
 F:	arch/*/mm/kasan_init*
 F:	include/linux/kasan*.h
 F:	lib/Kconfig.kasan
diff --git a/arch/arm64/include/asm/kasan-tags.h b/arch/arm64/include/asm/kasan-tags.h
new file mode 100644
index 000000000000..9e835da95f6b
--- /dev/null
+++ b/arch/arm64/include/asm/kasan-tags.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_KASAN_TAGS_H
+#define __ASM_KASAN_TAGS_H
+
+#define KASAN_TAG_KERNEL	0xFF /* native kernel pointers tag */
+
+#define KASAN_TAG_WIDTH 8
+
+#endif /* ASM_KASAN_TAGS_H */
diff --git a/arch/riscv/include/asm/kasan-tags.h b/arch/riscv/include/asm/kasan-tags.h
new file mode 100644
index 000000000000..83d7dcc8af74
--- /dev/null
+++ b/arch/riscv/include/asm/kasan-tags.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_KASAN_TAGS_H
+#define __ASM_KASAN_TAGS_H
+
+#ifdef CONFIG_KASAN_SW_TAGS
+#define KASAN_TAG_KERNEL	0x7f /* native kernel pointers tag */
+#endif
+
+#define KASAN_TAG_WIDTH 8
+
+#endif /* ASM_KASAN_TAGS_H */
+
diff --git a/arch/riscv/include/asm/kasan.h b/arch/riscv/include/asm/kasan.h
index f6b378ba936d..27938e0d5233 100644
--- a/arch/riscv/include/asm/kasan.h
+++ b/arch/riscv/include/asm/kasan.h
@@ -41,10 +41,6 @@
 
 #define KASAN_SHADOW_OFFSET	_AC(CONFIG_KASAN_SHADOW_OFFSET, UL)
 
-#ifdef CONFIG_KASAN_SW_TAGS
-#define KASAN_TAG_KERNEL	0x7f /* native kernel pointers tag */
-#endif
-
 #define arch_kasan_set_tag(addr, tag)	__tag_set(addr, tag)
 #define arch_kasan_reset_tag(addr)	__tag_reset(addr)
 #define arch_kasan_get_tag(addr)	__tag_get(addr)
diff --git a/arch/x86/include/asm/kasan-tags.h b/arch/x86/include/asm/kasan-tags.h
new file mode 100644
index 000000000000..68ba385bc75c
--- /dev/null
+++ b/arch/x86/include/asm/kasan-tags.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_KASAN_TAGS_H
+#define __ASM_KASAN_TAGS_H
+
+#define KASAN_TAG_KERNEL	0xF /* native kernel pointers tag */
+
+#define KASAN_TAG_WIDTH		4
+
+#endif /* ASM_KASAN_TAGS_H */
diff --git a/include/linux/kasan-tags.h b/include/linux/kasan-tags.h
index e07c896f95d3..b4aacfa8709b 100644
--- a/include/linux/kasan-tags.h
+++ b/include/linux/kasan-tags.h
@@ -2,7 +2,17 @@
 #ifndef _LINUX_KASAN_TAGS_H
 #define _LINUX_KASAN_TAGS_H
 
-#include <asm/kasan.h>
+#if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS)
+#include <asm/kasan-tags.h>
+#endif
+
+#ifdef CONFIG_KASAN_SW_TAGS_DENSE
+#define KASAN_TAG_WIDTH		4
+#endif
+
+#ifndef KASAN_TAG_WIDTH
+#define KASAN_TAG_WIDTH		0
+#endif
 
 #ifndef KASAN_TAG_KERNEL
 #define KASAN_TAG_KERNEL	0xFF /* native kernel pointers tag */
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 5a3e9bec21c2..83146367170a 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -88,7 +88,9 @@ static inline u8 kasan_get_shadow_tag(const void *addr)
 
 #ifdef CONFIG_KASAN_SW_TAGS
 /* This matches KASAN_TAG_INVALID. */
-#define KASAN_SHADOW_INIT 0xFE
+#ifndef KASAN_SHADOW_INIT
+#define KASAN_SHADOW_INIT KASAN_TAG_INVALID
+#endif
 #else
 #define KASAN_SHADOW_INIT 0
 #endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 61fff5d34ed5..ddca2f63a5f6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1813,7 +1813,7 @@ static inline u8 page_kasan_tag(const struct page *page)
 
 	if (kasan_enabled()) {
 		tag = (page->flags >> KASAN_TAG_PGSHIFT) & KASAN_TAG_MASK;
-		tag ^= 0xff;
+		tag ^= KASAN_TAG_KERNEL;
 	}
 
 	return tag;
@@ -1826,7 +1826,7 @@ static inline void page_kasan_tag_set(struct page *page, u8 tag)
 	if (!kasan_enabled())
 		return;
 
-	tag ^= 0xff;
+	tag ^= KASAN_TAG_KERNEL;
 	old_flags = READ_ONCE(page->flags);
 	do {
 		flags = old_flags;
@@ -1845,7 +1845,7 @@ static inline void page_kasan_tag_reset(struct page *page)
 
 static inline u8 page_kasan_tag(const struct page *page)
 {
-	return 0xff;
+	return KASAN_TAG_KERNEL;
 }
 
 static inline void page_kasan_tag_set(struct page *page, u8 tag) { }
diff --git a/include/linux/page-flags-layout.h b/include/linux/page-flags-layout.h
index 7d79818dc065..ac3576f409ad 100644
--- a/include/linux/page-flags-layout.h
+++ b/include/linux/page-flags-layout.h
@@ -3,6 +3,7 @@
 #define PAGE_FLAGS_LAYOUT_H
 
 #include <linux/numa.h>
+#include <linux/kasan-tags.h>
 #include <generated/bounds.h>
 
 /*
@@ -72,12 +73,6 @@
 #define NODE_NOT_IN_PAGE_FLAGS	1
 #endif
 
-#if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS)
-#define KASAN_TAG_WIDTH 8
-#else
-#define KASAN_TAG_WIDTH 0
-#endif
-
 #ifdef CONFIG_NUMA_BALANCING
 #define LAST__PID_SHIFT 8
 #define LAST__PID_MASK  ((1 << LAST__PID_SHIFT)-1)
-- 
2.47.1




* [PATCH 05/15] x86: Add arch specific kasan functions
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (3 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 04/15] kasan: arm64: x86: risc-v: Make special tags arch specific Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 06/15] x86: Reset tag for virtual to physical address conversions Maciej Wieczor-Retman
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

KASAN's software tag-based mode needs multiple macros/functions to
handle tag and pointer interactions - mainly to set and retrieve tags
from the top bits of a pointer.

Mimic functions currently used by arm64 but change the tag's position to
bits [60:57] in the pointer.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 arch/x86/include/asm/kasan.h | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kasan.h b/arch/x86/include/asm/kasan.h
index de75306b932e..8829337a75fa 100644
--- a/arch/x86/include/asm/kasan.h
+++ b/arch/x86/include/asm/kasan.h
@@ -3,6 +3,8 @@
 #define _ASM_X86_KASAN_H
 
 #include <linux/const.h>
+#include <linux/kasan-tags.h>
+#include <linux/types.h>
 #define KASAN_SHADOW_OFFSET _AC(CONFIG_KASAN_SHADOW_OFFSET, UL)
 #define KASAN_SHADOW_SCALE_SHIFT 3
 
@@ -24,8 +26,33 @@
 						  KASAN_SHADOW_SCALE_SHIFT)))
 
 #ifndef __ASSEMBLY__
+#include <linux/bitops.h>
+#include <linux/bitfield.h>
+#include <linux/bits.h>
+
+#define arch_kasan_set_tag(addr, tag)	__tag_set(addr, tag)
+#define arch_kasan_reset_tag(addr)	__tag_reset(addr)
+#define arch_kasan_get_tag(addr)	__tag_get(addr)
+
+#ifdef CONFIG_KASAN_SW_TAGS
+
+#define __tag_shifted(tag)		FIELD_PREP(GENMASK_ULL(60, 57), tag)
+#define __tag_reset(addr)		(sign_extend64((u64)(addr), 56))
+#define __tag_get(addr)			((u8)FIELD_GET(GENMASK_ULL(60, 57), (u64)addr))
+#else
+#define __tag_shifted(tag)		0UL
+#define __tag_reset(addr)		(addr)
+#define __tag_get(addr)			0
+#endif /* CONFIG_KASAN_SW_TAGS */
 
 #ifdef CONFIG_KASAN
+
+static inline const void *__tag_set(const void *addr, u8 tag)
+{
+	u64 __addr = (u64)addr & ~__tag_shifted(KASAN_TAG_KERNEL);
+	return (const void *)(__addr | __tag_shifted(tag));
+}
+
 void __init kasan_early_init(void);
 void __init kasan_init(void);
 void __init kasan_populate_shadow_for_vaddr(void *va, size_t size, int nid);
@@ -34,8 +61,9 @@ static inline void kasan_early_init(void) { }
 static inline void kasan_init(void) { }
 static inline void kasan_populate_shadow_for_vaddr(void *va, size_t size,
 						   int nid) { }
-#endif
 
-#endif
+#endif /* CONFIG_KASAN */
+
+#endif /* __ASSEMBLY__ */
 
 #endif
-- 
2.47.1




* [PATCH 06/15] x86: Reset tag for virtual to physical address conversions
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (4 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 05/15] x86: Add arch specific kasan functions Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 07/15] mm: Pcpu chunk address tag reset Maciej Wieczor-Retman
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

Any place where pointer arithmetic is used to convert a virtual address
into a physical one can raise errors if the virtual address is tagged.

Reset the pointer's tag by sign extending the tag bits in macros that do
pointer arithmetic in address conversions. There will be no change in
compiled code with KASAN disabled since the compiler will optimize the
__tag_reset() out.
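
For illustration, a minimal user-space sketch of the same trick with a
made-up example address (the kernel side uses sign_extend64() for this):

#include <stdint.h>
#include <stdio.h>

/*
 * Sign extend from bit 56: bits 63:57 become copies of bit 56, which for
 * a canonical kernel address is 1, so any tag in bits 60:57 is wiped.
 */
static uint64_t tag_reset(uint64_t addr)
{
	return (uint64_t)((int64_t)(addr << 7) >> 7);
}

int main(void)
{
	/* 0xffffc90000000000 with an example tag 0x5 placed in bits 60:57. */
	uint64_t tagged = 0xebffc90000000000ULL;

	printf("%#llx\n", (unsigned long long)tag_reset(tagged));
	/* prints 0xffffc90000000000 */
	return 0;
}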

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 arch/x86/include/asm/page.h    | 17 +++++++++++++----
 arch/x86/include/asm/page_64.h |  2 +-
 arch/x86/mm/physaddr.c         |  1 +
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index 1b93ff80b43b..09c3914d8ce4 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -7,6 +7,7 @@
 #ifdef __KERNEL__
 
 #include <asm/page_types.h>
+#include <asm/kasan.h>
 
 #ifdef CONFIG_X86_64
 #include <asm/page_64.h>
@@ -41,7 +42,7 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
 #define __pa(x)		__phys_addr((unsigned long)(x))
 #endif
 
-#define __pa_nodebug(x)	__phys_addr_nodebug((unsigned long)(x))
+#define __pa_nodebug(x)	__phys_addr_nodebug((unsigned long)(__tag_reset(x)))
 /* __pa_symbol should be used for C visible symbols.
    This seems to be the official gcc blessed way to do such arithmetic. */
 /*
@@ -65,9 +66,17 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
  * virt_to_page(kaddr) returns a valid pointer if and only if
  * virt_addr_valid(kaddr) returns true.
  */
-#define virt_to_page(kaddr)	pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
+
+#ifdef CONFIG_KASAN_SW_TAGS
+#define page_to_virt(x)	({									\
+	__typeof__(x) __page = x;								\
+	void *__addr = __va(page_to_pfn((__typeof__(x))__tag_reset(__page)) << PAGE_SHIFT);	\
+	(void *)__tag_set((const void *)__addr, page_kasan_tag(__page));			\
+})
+#endif
+#define virt_to_page(kaddr)	pfn_to_page(__pa((void *)__tag_reset(kaddr)) >> PAGE_SHIFT)
 extern bool __virt_addr_valid(unsigned long kaddr);
-#define virt_addr_valid(kaddr)	__virt_addr_valid((unsigned long) (kaddr))
+#define virt_addr_valid(kaddr)	__virt_addr_valid((unsigned long)(__tag_reset(kaddr)))
 
 static __always_inline void *pfn_to_kaddr(unsigned long pfn)
 {
@@ -81,7 +90,7 @@ static __always_inline u64 __canonical_address(u64 vaddr, u8 vaddr_bits)
 
 static __always_inline u64 __is_canonical_address(u64 vaddr, u8 vaddr_bits)
 {
-	return __canonical_address(vaddr, vaddr_bits) == vaddr;
+	return __canonical_address(vaddr, vaddr_bits) == __tag_reset(vaddr);
 }
 
 #endif	/* __ASSEMBLY__ */
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index f3d257c45225..6e24aeff36eb 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -33,7 +33,7 @@ static __always_inline unsigned long __phys_addr_nodebug(unsigned long x)
 extern unsigned long __phys_addr(unsigned long);
 extern unsigned long __phys_addr_symbol(unsigned long);
 #else
-#define __phys_addr(x)		__phys_addr_nodebug(x)
+#define __phys_addr(x)		__phys_addr_nodebug(__tag_reset(x))
 #define __phys_addr_symbol(x) \
 	((unsigned long)(x) - __START_KERNEL_map + phys_base)
 #endif
diff --git a/arch/x86/mm/physaddr.c b/arch/x86/mm/physaddr.c
index fc3f3d3e2ef2..7f2b11308245 100644
--- a/arch/x86/mm/physaddr.c
+++ b/arch/x86/mm/physaddr.c
@@ -14,6 +14,7 @@
 #ifdef CONFIG_DEBUG_VIRTUAL
 unsigned long __phys_addr(unsigned long x)
 {
+	x = __tag_reset(x);
 	unsigned long y = x - __START_KERNEL_map;
 
 	/* use the carry flag to determine if x was < __START_KERNEL_map */
-- 
2.47.1




* [PATCH 07/15] mm: Pcpu chunk address tag reset
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (5 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 06/15] x86: Reset tag for virtual to physical address conversions Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 08/15] x86: Physical address comparisons in fill_p*d/pte Maciej Wieczor-Retman
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

The problem presented here is related to NUMA systems and the tag-based
KASAN mode. It can be explained in the following points:

	1. A new chunk is created with pcpu_create_chunk() and
	   vm_structs are allocated. On systems with one NUMA node only
	   one is allocated, but with more NUMA nodes at least a second
	   one will be allocated too.

	2. chunk->base_addr is assigned the modified value of
	   vms[0]->addr and thus inherits the tag of this allocated
	   structure.

	3. In pcpu_alloc() for each possible cpu pcpu_chunk_addr() is
	   executed which calculates per cpu pointers that correspond to
	   the vms structure addresses. The calculations are based on
	   adding an offset from a table to chunk->base_addr.

Here the problem presents itself, since for addresses based on vms[1]
and up, the tag will be different from the ones based on vms[0]
(base_addr). A tag mismatch happens and an error is reported.

Reset the tag of base_addr, since that disables tag checks for pointers
derived arithmetically from base_addr, which would otherwise inherit its
tag.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 mm/percpu-vm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c
index cd69caf6aa8d..e13750d804f7 100644
--- a/mm/percpu-vm.c
+++ b/mm/percpu-vm.c
@@ -347,7 +347,7 @@ static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp)
 	}
 
 	chunk->data = vms;
-	chunk->base_addr = vms[0]->addr - pcpu_group_offsets[0];
+	chunk->base_addr = kasan_reset_tag(vms[0]->addr) - pcpu_group_offsets[0];
 
 	pcpu_stats_chunk_alloc();
 	trace_percpu_create_chunk(chunk->base_addr);
-- 
2.47.1




* [PATCH 08/15] x86: Physical address comparisons in fill_p*d/pte
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (6 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 07/15] mm: Pcpu chunk address tag reset Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-06  0:57   ` Dave Hansen
  2025-02-04 17:33 ` [PATCH 09/15] x86: Physical address comparison in current_mm pgd check Maciej Wieczor-Retman
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

Calculating a page table offset returns an untagged pointer. Comparing
that calculated pointer to a tagged page table pointer then fails even
though both refer to the same page, simply because the tag bits differ.

Change the pointer comparisons to physical address comparisons to avoid
the false mismatches that KASAN's pointer tags would otherwise create.
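
A minimal sketch of the idea, using fill_p4d() as the example (the same
pattern applies to all four helpers changed below):

	p4d_t *p4d = (p4d_t *)spp_getpage();	/* tagged allocation  */
	p4d_t *off = p4d_offset(pgd, 0);	/* computed, untagged */
	/*
	 * p4d != off even when both point at the same page, purely
	 * because of the tag bits.  Physical addresses carry no tags,
	 * so comparing __pa(p4d) with __pa(off) checks what the sanity
	 * check actually cares about.
	 */
	if (__pa(p4d) != __pa(off))
		printk(KERN_ERR "PAGETABLE BUG #00! %p <-> %p\n", p4d, off);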

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 arch/x86/mm/init_64.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index ff253648706f..bb101412424a 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -251,7 +251,7 @@ static p4d_t *fill_p4d(pgd_t *pgd, unsigned long vaddr)
 	if (pgd_none(*pgd)) {
 		p4d_t *p4d = (p4d_t *)spp_getpage();
 		pgd_populate(&init_mm, pgd, p4d);
-		if (p4d != p4d_offset(pgd, 0))
+		if (__pa(p4d) != __pa(p4d_offset(pgd, 0)))
 			printk(KERN_ERR "PAGETABLE BUG #00! %p <-> %p\n",
 			       p4d, p4d_offset(pgd, 0));
 	}
@@ -263,7 +263,7 @@ static pud_t *fill_pud(p4d_t *p4d, unsigned long vaddr)
 	if (p4d_none(*p4d)) {
 		pud_t *pud = (pud_t *)spp_getpage();
 		p4d_populate(&init_mm, p4d, pud);
-		if (pud != pud_offset(p4d, 0))
+		if (__pa(pud) != __pa(pud_offset(p4d, 0)))
 			printk(KERN_ERR "PAGETABLE BUG #01! %p <-> %p\n",
 			       pud, pud_offset(p4d, 0));
 	}
@@ -275,7 +275,7 @@ static pmd_t *fill_pmd(pud_t *pud, unsigned long vaddr)
 	if (pud_none(*pud)) {
 		pmd_t *pmd = (pmd_t *) spp_getpage();
 		pud_populate(&init_mm, pud, pmd);
-		if (pmd != pmd_offset(pud, 0))
+		if (__pa(pmd) != __pa(pmd_offset(pud, 0)))
 			printk(KERN_ERR "PAGETABLE BUG #02! %p <-> %p\n",
 			       pmd, pmd_offset(pud, 0));
 	}
@@ -287,7 +287,7 @@ static pte_t *fill_pte(pmd_t *pmd, unsigned long vaddr)
 	if (pmd_none(*pmd)) {
 		pte_t *pte = (pte_t *) spp_getpage();
 		pmd_populate_kernel(&init_mm, pmd, pte);
-		if (pte != pte_offset_kernel(pmd, 0))
+		if (__pa(pte) != __pa(pte_offset_kernel(pmd, 0)))
 			printk(KERN_ERR "PAGETABLE BUG #03!\n");
 	}
 	return pte_offset_kernel(pmd, vaddr);
-- 
2.47.1



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 09/15] x86: Physical address comparison in current_mm pgd check
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (7 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 08/15] x86: Physical address comparisons in fill_p*d/pte Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 10/15] x86: KASAN raw shadow memory PTE init Maciej Wieczor-Retman
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

With the KASAN software tag-based mode enabled, the PGD pointer stored
in the current_mm structure is tagged, while the same pointer computed
through __va(read_cr3_pa()) ends up with the tag bits set to all ones.

Use the physical address of current_mm->pgd and drop the __va() so the
VM_WARN_ON_ONCE() works properly and does not report false positives
while KASAN is enabled.
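
An illustrative sketch of the false positive (not part of the diff
below):

	pgd_t *pgd = current_mm->pgd;		/* tagged pointer        */
	void *cr3 = __va(read_cr3_pa());	/* top bits are all ones */
	/*
	 * pgd != cr3 although both name the same page, so the warning
	 * fires.  Comparing __pa(pgd) against read_cr3_pa() takes the
	 * tag bits out of the equation entirely.
	 */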

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 arch/x86/mm/tlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 86593d1b787d..95e3dc1fb766 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1295,7 +1295,7 @@ bool nmi_uaccess_okay(void)
 	if (loaded_mm != current_mm)
 		return false;
 
-	VM_WARN_ON_ONCE(current_mm->pgd != __va(read_cr3_pa()));
+	VM_WARN_ON_ONCE(__pa(current_mm->pgd) != read_cr3_pa());
 
 	return true;
 }
-- 
2.47.1



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 10/15] x86: KASAN raw shadow memory PTE init
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (8 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 09/15] x86: Physical address comparison in current_mm pgd check Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-05 23:45   ` Andrey Konovalov
  2025-02-04 17:33 ` [PATCH 11/15] x86: LAM initialization Maciej Wieczor-Retman
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

In KASAN's generic mode the default value in shadow memory is zero.
During initialization of shadow memory pages they are allocated and
zeroed.

In KASAN's tag-based mode the default tag on the arm64 architecture is
0xFE, which marks memory that should not be accessed. On x86, where tags
are 4 bits wide instead of 8, that tag is 0xE, so during initialization
all bytes of the shadow memory pages should be filled with 0xE, or with
0xEE when two tags are packed into one shadow byte.

Use memblock_alloc_try_nid_raw() instead of memblock_alloc_try_nid() to
avoid zeroing out the memory so it can be set with the KASAN invalid
tag.
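
An illustrative sketch of the intended fill (kasan_dense_tag() is
assumed, as elsewhere in this series, to replicate the 4-bit tag into
both halves of a byte; 'page' stands for the freshly allocated shadow
page, called p in the diff):

	u8 fill = kasan_dense_tag(KASAN_SHADOW_INIT);	/* 0xE -> 0xEE        */
	memset(page, fill, PAGE_SIZE);			/* raw, unzeroed page */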

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 arch/x86/mm/kasan_init_64.c | 19 ++++++++++++++++---
 include/linux/kasan.h       | 25 +++++++++++++++++++++++++
 mm/kasan/kasan.h            | 19 -------------------
 3 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 9dddf19a5571..55d468d83682 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -35,6 +35,18 @@ static __init void *early_alloc(size_t size, int nid, bool should_panic)
 	return ptr;
 }
 
+static __init void *early_raw_alloc(size_t size, int nid, bool should_panic)
+{
+	void *ptr = memblock_alloc_try_nid_raw(size, size,
+			__pa(MAX_DMA_ADDRESS), MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+
+	if (!ptr && should_panic)
+		panic("%pS: Failed to allocate page, nid=%d from=%lx\n",
+		      (void *)_RET_IP_, nid, __pa(MAX_DMA_ADDRESS));
+
+	return ptr;
+}
+
 static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
 				      unsigned long end, int nid)
 {
@@ -64,8 +76,9 @@ static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
 		if (!pte_none(*pte))
 			continue;
 
-		p = early_alloc(PAGE_SIZE, nid, true);
-		entry = pfn_pte(PFN_DOWN(__pa(p)), PAGE_KERNEL);
+		p = early_raw_alloc(PAGE_SIZE, nid, true);
+		memset(p, kasan_dense_tag(KASAN_SHADOW_INIT), PAGE_SIZE);
+		entry = pfn_pte(PFN_DOWN(__pa_nodebug(p)), PAGE_KERNEL);
 		set_pte_at(&init_mm, addr, pte, entry);
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 }
@@ -437,7 +450,7 @@ void __init kasan_init(void)
 	 * it may contain some garbage. Now we can clear and write protect it,
 	 * since after the TLB flush no one should write to it.
 	 */
-	memset(kasan_early_shadow_page, 0, PAGE_SIZE);
+	kasan_poison(kasan_early_shadow_page, PAGE_SIZE, KASAN_SHADOW_INIT, false);
 	for (i = 0; i < PTRS_PER_PTE; i++) {
 		pte_t pte;
 		pgprot_t prot;
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 83146367170a..af8272c74409 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -151,6 +151,31 @@ static __always_inline void kasan_unpoison_range(const void *addr, size_t size)
 		__kasan_unpoison_range(addr, size);
 }
 
+#ifdef CONFIG_KASAN_HW_TAGS
+
+static inline void kasan_poison(const void *addr, size_t size, u8 value, bool init)
+{
+	if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
+		return;
+	if (WARN_ON(size & KASAN_GRANULE_MASK))
+		return;
+
+	hw_set_mem_tag_range(kasan_reset_tag(addr), size, value, init);
+}
+
+#else /* CONFIG_KASAN_HW_TAGS */
+
+/**
+ * kasan_poison - mark the memory range as inaccessible
+ * @addr - range start address, must be aligned to KASAN_GRANULE_SIZE
+ * @size - range size, must be aligned to KASAN_GRANULE_SIZE
+ * @value - value that's written to metadata for the range
+ * @init - whether to initialize the memory range (only for hardware tag-based)
+ */
+void kasan_poison(const void *addr, size_t size, u8 value, bool init);
+
+#endif /* CONFIG_KASAN_HW_TAGS */
+
 void __kasan_poison_pages(struct page *page, unsigned int order, bool init);
 static __always_inline void kasan_poison_pages(struct page *page,
 						unsigned int order, bool init)
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index a56aadd51485..2405477c5899 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -466,16 +466,6 @@ static inline u8 kasan_random_tag(void) { return 0; }
 
 #ifdef CONFIG_KASAN_HW_TAGS
 
-static inline void kasan_poison(const void *addr, size_t size, u8 value, bool init)
-{
-	if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
-		return;
-	if (WARN_ON(size & KASAN_GRANULE_MASK))
-		return;
-
-	hw_set_mem_tag_range(kasan_reset_tag(addr), size, value, init);
-}
-
 static inline void kasan_unpoison(const void *addr, size_t size, bool init)
 {
 	u8 tag = get_tag(addr);
@@ -497,15 +487,6 @@ static inline bool kasan_byte_accessible(const void *addr)
 
 #else /* CONFIG_KASAN_HW_TAGS */
 
-/**
- * kasan_poison - mark the memory range as inaccessible
- * @addr - range start address, must be aligned to KASAN_GRANULE_SIZE
- * @size - range size, must be aligned to KASAN_GRANULE_SIZE
- * @value - value that's written to metadata for the range
- * @init - whether to initialize the memory range (only for hardware tag-based)
- */
-void kasan_poison(const void *addr, size_t size, u8 value, bool init);
-
 /**
  * kasan_unpoison - mark the memory range as accessible
  * @addr - range start address, must be aligned to KASAN_GRANULE_SIZE
-- 
2.47.1



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 11/15] x86: LAM initialization
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (9 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 10/15] x86: KASAN raw shadow memory PTE init Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 12/15] x86: Minimal SLAB alignment Maciej Wieczor-Retman
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

To make use of KASAN's tag-based mode on x86, Linear Address Masking
(LAM) needs to be enabled, which means setting bit 28 (LAM_SUP) in CR4.

Set the bit during early memory initialization.

When secondary CPUs are launched the LAM bit would get lost, so it also
needs to be added to the bit mask in head_64.S. That mask lists the CR4
bits which are allowed to pass from the primary CPU to the secondary
CPUs without being cleared.
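
A rough sketch of what LAM buys KASAN here (illustrative only;
__tag_shifted() is the helper used elsewhere in this series and the tag
field sits in bits 60:57):

	int x;
	unsigned long addr = (unsigned long)&x;	/* some kernel pointer */

	addr &= ~GENMASK_ULL(60, 57);		/* clear the tag field */
	addr |= __tag_shifted(0xa);		/* insert tag 0xa      */
	/* Fine with CR4.LAM_SUP set; a non-canonical #GP without it. */
	WRITE_ONCE(*(int *)addr, 1);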

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 arch/x86/kernel/head_64.S | 3 +++
 arch/x86/mm/init.c        | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 16752b8dfa89..7cdafcedbc70 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -199,6 +199,9 @@ SYM_INNER_LABEL(common_startup_64, SYM_L_LOCAL)
 	 *  there will be no global TLB entries after the execution."
 	 */
 	movl	$(X86_CR4_PAE | X86_CR4_LA57), %edx
+#ifdef CONFIG_ADDRESS_MASKING
+	orl	$X86_CR4_LAM_SUP, %edx
+#endif
 #ifdef CONFIG_X86_MCE
 	/*
 	 * Preserve CR4.MCE if the kernel will enable #MC support.
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index eb503f53c319..4dc3679fedd1 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -756,6 +756,9 @@ void __init init_mem_mapping(void)
 	probe_page_size_mask();
 	setup_pcid();
 
+	if (boot_cpu_has(X86_FEATURE_LAM) && IS_ENABLED(CONFIG_KASAN_SW_TAGS))
+		cr4_set_bits_and_update_boot(X86_CR4_LAM_SUP);
+
 #ifdef CONFIG_X86_64
 	end = max_pfn << PAGE_SHIFT;
 #else
-- 
2.47.1



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 12/15] x86: Minimal SLAB alignment
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (10 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 11/15] x86: LAM initialization Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 13/15] x86: runtime_const used for KASAN_SHADOW_END Maciej Wieczor-Retman
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

Adjust the minimal x86 SLAB alignment to match the KASAN granule size.
In tag-based mode the granule is 16 bytes, so the shift used for the
alignment needs to be 4.
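
For reference, the macro added below then evaluates as follows
(illustrative; KASAN_GRANULE_SHIFT is assumed to be 4 in this mode):

	/*
	 * ARCH_SLAB_MINALIGN == 1ULL << 4 == 16 bytes, i.e. every slab
	 * object starts on a fresh KASAN granule and can get its own tag.
	 */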

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 arch/x86/include/asm/kasan.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/kasan.h b/arch/x86/include/asm/kasan.h
index 8829337a75fa..f7a8d3763615 100644
--- a/arch/x86/include/asm/kasan.h
+++ b/arch/x86/include/asm/kasan.h
@@ -36,6 +36,8 @@
 
 #ifdef CONFIG_KASAN_SW_TAGS
 
+#define ARCH_SLAB_MINALIGN (1ULL << KASAN_GRANULE_SHIFT)
+
 #define __tag_shifted(tag)		FIELD_PREP(GENMASK_ULL(60, 57), tag)
 #define __tag_reset(addr)		(sign_extend64((u64)(addr), 56))
 #define __tag_get(addr)			((u8)FIELD_GET(GENMASK_ULL(60, 57), (u64)addr))
-- 
2.47.1



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 13/15] x86: runtime_const used for KASAN_SHADOW_END
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (11 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 12/15] x86: Minimal SLAB alignment Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 14/15] x86: Make software tag-based kasan available Maciej Wieczor-Retman
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

On x86, generic KASAN is setup in a way that needs a single
KASAN_SHADOW_OFFSET value for both 4 and 5 level paging. It's required
to facilitate boot time switching and it's a compiler ABI so it can't be
changed during runtime.

Software tag-based mode doesn't tie the shadow start and end to any
linear addresses as part of the compiler ABI, so they can be changed at
runtime. For KASAN purposes this makes it possible to optimize out
checks such as pgtable_l5_enabled() which would otherwise be used in
every single KASAN related function.

Use the runtime_const infrastructure, driven by pgtable_l5_enabled(), to
initialize the end address of KASAN's shadow address space. It's a good
fit since in software tag-based mode KASAN_SHADOW_OFFSET and
KASAN_SHADOW_END refer to the same value, and the offset used in
kasan_mem_to_shadow() is a signed negative value.

Set up the KASAN_SHADOW_END values so that they're aligned to 4TB in
4-level paging mode and to 2PB in 5-level paging mode. Also update the
x86 memory map documentation.
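
Pulling the two halves of the change together for readability
(illustrative; the values and helpers match the hunks below):

	/* Header side: users read a constant patched once at boot. */
	extern unsigned long KASAN_SHADOW_END_RC;
	#define KASAN_SHADOW_END	runtime_const_ptr(KASAN_SHADOW_END_RC)

	/* Boot side: pick the end based on the paging mode, then patch. */
	KASAN_SHADOW_END_RC = pgtable_l5_enabled() ? 0xfff0000000000000UL :
						     0xfffffc0000000000UL;
	runtime_const_init(ptr, KASAN_SHADOW_END_RC);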

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 Documentation/arch/x86/x86_64/mm.rst |  6 ++++--
 arch/x86/Kconfig                     |  3 +--
 arch/x86/include/asm/kasan.h         | 14 +++++++++++++-
 arch/x86/kernel/vmlinux.lds.S        |  1 +
 arch/x86/mm/kasan_init_64.c          |  5 ++++-
 5 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/Documentation/arch/x86/x86_64/mm.rst b/Documentation/arch/x86/x86_64/mm.rst
index 35e5e18c83d0..4e8c04d71a13 100644
--- a/Documentation/arch/x86/x86_64/mm.rst
+++ b/Documentation/arch/x86/x86_64/mm.rst
@@ -48,7 +48,8 @@ Complete virtual memory map with 4-level page tables
    ffffe90000000000 |  -23    TB | ffffe9ffffffffff |    1 TB | ... unused hole
    ffffea0000000000 |  -22    TB | ffffeaffffffffff |    1 TB | virtual memory map (vmemmap_base)
    ffffeb0000000000 |  -21    TB | ffffebffffffffff |    1 TB | ... unused hole
-   ffffec0000000000 |  -20    TB | fffffbffffffffff |   16 TB | KASAN shadow memory
+   ffffec0000000000 |  -20    TB | fffffbffffffffff |   16 TB | KASAN shadow memory (generic mode)
+   fffff80000000000 |   -8    TB | fffffc0000000000 |    4 TB | KASAN shadow memory (software tag-based mode)
   __________________|____________|__________________|_________|____________________________________________________________
                                                               |
                                                               | Identical layout to the 56-bit one from here on:
@@ -107,7 +108,8 @@ Complete virtual memory map with 5-level page tables
    ffd2000000000000 |  -11.5  PB | ffd3ffffffffffff |  0.5 PB | ... unused hole
    ffd4000000000000 |  -11    PB | ffd5ffffffffffff |  0.5 PB | virtual memory map (vmemmap_base)
    ffd6000000000000 |  -10.5  PB | ffdeffffffffffff | 2.25 PB | ... unused hole
-   ffdf000000000000 |   -8.25 PB | fffffbffffffffff |   ~8 PB | KASAN shadow memory
+   ffdf000000000000 |   -8.25 PB | fffffbffffffffff |   ~8 PB | KASAN shadow memory (generic mode)
+   ffe8000000000000 |   -6    PB | fff0000000000000 |    2 PB | KASAN shadow memory (software tag-based mode)
   __________________|____________|__________________|_________|____________________________________________________________
                                                               |
                                                               | Identical layout to the 47-bit one from here on:
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7b9a7e8f39ac..dfec7bc692d4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -392,8 +392,7 @@ config AUDIT_ARCH
 
 config KASAN_SHADOW_OFFSET
 	hex
-	depends on KASAN
-	default 0xdffffc0000000000
+	default 0xdffffc0000000000 if KASAN_GENERIC
 
 config HAVE_INTEL_TXT
 	def_bool y
diff --git a/arch/x86/include/asm/kasan.h b/arch/x86/include/asm/kasan.h
index f7a8d3763615..79151356d5f2 100644
--- a/arch/x86/include/asm/kasan.h
+++ b/arch/x86/include/asm/kasan.h
@@ -5,7 +5,7 @@
 #include <linux/const.h>
 #include <linux/kasan-tags.h>
 #include <linux/types.h>
-#define KASAN_SHADOW_OFFSET _AC(CONFIG_KASAN_SHADOW_OFFSET, UL)
+
 #define KASAN_SHADOW_SCALE_SHIFT 3
 
 /*
@@ -14,6 +14,8 @@
  * for kernel really starts from compiler's shadow offset +
  * 'kernel address space start' >> KASAN_SHADOW_SCALE_SHIFT
  */
+#ifdef CONFIG_KASAN_GENERIC
+#define KASAN_SHADOW_OFFSET _AC(CONFIG_KASAN_SHADOW_OFFSET, UL)
 #define KASAN_SHADOW_START      (KASAN_SHADOW_OFFSET + \
 					((-1UL << __VIRTUAL_MASK_SHIFT) >> \
 						KASAN_SHADOW_SCALE_SHIFT))
@@ -24,12 +26,22 @@
 #define KASAN_SHADOW_END        (KASAN_SHADOW_START + \
 					(1ULL << (__VIRTUAL_MASK_SHIFT - \
 						  KASAN_SHADOW_SCALE_SHIFT)))
+#endif
+
 
 #ifndef __ASSEMBLY__
+#include <asm/runtime-const.h>
 #include <linux/bitops.h>
 #include <linux/bitfield.h>
 #include <linux/bits.h>
 
+#ifdef CONFIG_KASAN_SW_TAGS
+extern unsigned long KASAN_SHADOW_END_RC;
+#define KASAN_SHADOW_END	runtime_const_ptr(KASAN_SHADOW_END_RC)
+#define KASAN_SHADOW_OFFSET	KASAN_SHADOW_END
+#define KASAN_SHADOW_START	(KASAN_SHADOW_END - ((UL(1)) << (__VIRTUAL_MASK_SHIFT - KASAN_SHADOW_SCALE_SHIFT)))
+#endif
+
 #define arch_kasan_set_tag(addr, tag)	__tag_set(addr, tag)
 #define arch_kasan_reset_tag(addr)	__tag_reset(addr)
 #define arch_kasan_get_tag(addr)	__tag_get(addr)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index feb8102a9ca7..46183f7439c9 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -359,6 +359,7 @@ SECTIONS
 
 	RUNTIME_CONST_VARIABLES
 	RUNTIME_CONST(ptr, USER_PTR_MAX)
+	RUNTIME_CONST(ptr, KASAN_SHADOW_END_RC)
 
 	. = ALIGN(PAGE_SIZE);
 
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 55d468d83682..0f8190e0e5f6 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -358,6 +358,9 @@ void __init kasan_init(void)
 	int i;
 
 	memcpy(early_top_pgt, init_top_pgt, sizeof(early_top_pgt));
+	unsigned long KASAN_SHADOW_END_RC = pgtable_l5_enabled() ? 0xfff0000000000000 : 0xfffffc0000000000;
+
+	runtime_const_init(ptr, KASAN_SHADOW_END_RC);
 
 	/*
 	 * We use the same shadow offset for 4- and 5-level paging to
@@ -372,7 +375,7 @@ void __init kasan_init(void)
 	 * bunch of things like kernel code, modules, EFI mapping, etc.
 	 * We need to take extra steps to not overwrite them.
 	 */
-	if (pgtable_l5_enabled()) {
+	if (pgtable_l5_enabled() && !IS_ENABLED(CONFIG_KASAN_SW_TAGS)) {
 		void *ptr;
 
 		ptr = (void *)pgd_page_vaddr(*pgd_offset_k(KASAN_SHADOW_END));
-- 
2.47.1



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 14/15] x86: Make software tag-based kasan available
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (12 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 13/15] x86: runtime_const used for KASAN_SHADOW_END Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-04 17:33 ` [PATCH 15/15] kasan: Add mitigation and debug modes Maciej Wieczor-Retman
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

Make CONFIG_KASAN_SW_TAGS available for x86 machines if they have
ADDRESS_MASKING (LAM) enabled, since LAM works similarly to the Top-Byte
Ignore (TBI) feature that enables the software tag-based mode on arm64.

Set the shadow scale macro based on the KASAN mode: in dense software
tag-based mode 32 bytes of memory map to one shadow byte, in regular
software tag-based mode 16 bytes, and in generic mode 8 bytes.
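
For orientation, the resulting shadow sizes for a 47-bit kernel address
range (matching the Kconfig defaults added below):

	/*
	 * generic        (shift 3): (1UL << 47) / 8  = 16 TB of shadow
	 * sw tags        (shift 4): (1UL << 47) / 16 =  8 TB of shadow
	 * sw tags dense  (shift 5): (1UL << 47) / 32 =  4 TB of shadow
	 */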

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 arch/x86/Kconfig                | 8 ++++++++
 arch/x86/boot/compressed/misc.h | 2 ++
 arch/x86/include/asm/kasan.h    | 2 +-
 arch/x86/kernel/setup.c         | 2 ++
 4 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index dfec7bc692d4..afbcf27ad278 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -36,6 +36,7 @@ config X86_64
 	select ARCH_HAS_ELFCORE_COMPAT
 	select ZONE_DMA32
 	select EXECMEM if DYNAMIC_FTRACE
+	select ARCH_HAS_KASAN_SW_TAGS_DENSE
 
 config FORCE_DYNAMIC_FTRACE
 	def_bool y
@@ -190,6 +191,7 @@ config X86
 	select HAVE_ARCH_JUMP_LABEL_RELATIVE
 	select HAVE_ARCH_KASAN			if X86_64
 	select HAVE_ARCH_KASAN_VMALLOC		if X86_64
+	select HAVE_ARCH_KASAN_SW_TAGS		if ADDRESS_MASKING
 	select HAVE_ARCH_KFENCE
 	select HAVE_ARCH_KMSAN			if X86_64
 	select HAVE_ARCH_KGDB
@@ -394,6 +396,12 @@ config KASAN_SHADOW_OFFSET
 	hex
 	default 0xdffffc0000000000 if KASAN_GENERIC
 
+config KASAN_SHADOW_SCALE_SHIFT
+	int
+	default 5 if KASAN_SW_TAGS_DENSE
+	default 4 if KASAN_SW_TAGS
+	default 3
+
 config HAVE_INTEL_TXT
 	def_bool y
 	depends on INTEL_IOMMU && ACPI
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index dd8d1a85f671..397a70558ffa 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -13,6 +13,8 @@
 #undef CONFIG_PARAVIRT_SPINLOCKS
 #undef CONFIG_KASAN
 #undef CONFIG_KASAN_GENERIC
+#undef CONFIG_KASAN_SW_TAGS
+#undef CONFIG_KASAN_SW_TAGS_DENSE
 
 #define __NO_FORTIFY
 
diff --git a/arch/x86/include/asm/kasan.h b/arch/x86/include/asm/kasan.h
index 79151356d5f2..99ff4ae83bf7 100644
--- a/arch/x86/include/asm/kasan.h
+++ b/arch/x86/include/asm/kasan.h
@@ -6,7 +6,7 @@
 #include <linux/kasan-tags.h>
 #include <linux/types.h>
 
-#define KASAN_SHADOW_SCALE_SHIFT 3
+#define KASAN_SHADOW_SCALE_SHIFT CONFIG_KASAN_SHADOW_SCALE_SHIFT
 
 /*
  * Compiler uses shadow offset assuming that addresses start
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f1fea506e20f..c300274e205a 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1121,6 +1121,8 @@ void __init setup_arch(char **cmdline_p)
 
 	kasan_init();
 
+	kasan_init_sw_tags();
+
 	/*
 	 * Sync back kernel address range.
 	 *
-- 
2.47.1



^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 15/15] kasan: Add mitigation and debug modes
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (13 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 14/15] x86: Make software tag-based kasan available Maciej Wieczor-Retman
@ 2025-02-04 17:33 ` Maciej Wieczor-Retman
  2025-02-05 23:46   ` Andrey Konovalov
  2025-02-04 18:58 ` [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Christoph Lameter (Ampere)
  2025-02-05 23:40 ` Andrey Konovalov
  16 siblings, 1 reply; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC (permalink / raw)
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

With a smaller memory footprint KASAN could be used in production
systems. One problem is that saving stacktraces slows memory allocation
substantially - with KASAN enabled, up to 90% of the time spent in
kmalloc() goes to saving the stacktrace.

Add a mitigation mode so KASAN can optionally be run with a focus on
performance and security. In mitigation mode, disable stacktrace saving
and set the fault mode to always panic on a KASAN error as a security
mechanism.
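
For illustration, the combined effect of selecting the new mode (based
on the hunks below):

	/*
	 * With CONFIG_KASAN_OPERATION_MITIGATION=y:
	 *  - kasan_init_tags() disables the stacktrace static branch, so
	 *    allocations skip saving stacktraces entirely;
	 *  - kasan_arg_fault defaults to KASAN_ARG_FAULT_PANIC, so the
	 *    first detected error panics the kernel instead of only
	 *    printing a report.
	 */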

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 lib/Kconfig.kasan | 28 ++++++++++++++++++++++++++++
 mm/kasan/report.c |  4 ++++
 mm/kasan/tags.c   |  5 +++++
 3 files changed, 37 insertions(+)

diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
index d08b4e9bf477..6daa62b40dea 100644
--- a/lib/Kconfig.kasan
+++ b/lib/Kconfig.kasan
@@ -244,4 +244,32 @@ config KASAN_SW_TAGS_DENSE
 	  ARCH_HAS_KASAN_SW_TAGS_DENSE is needed for this option since the
 	  special tag macros need to be properly set for 4-bit wide tags.
 
+choice
+	prompt "KASAN operation mode"
+	default KASAN_OPERATION_DEBUG
+	help
+	  Choose between the mitigation or debug operation modes.
+
+	  The first one disables stacktrace saving and enables panic on error.
+	  Faster memory allocation but less information. The second one is the
+	  default where KASAN operates with full functionality.
+
+config KASAN_OPERATION_DEBUG
+	bool "Debug operation mode"
+	depends on KASAN
+	help
+	  The default mode. Full functionality and all boot parameters
+	  available.
+
+config KASAN_OPERATION_MITIGATION
+	bool "Mitigation operation mode"
+	depends on KASAN
+	help
+	  Operation mode dedicated to faster operation at the cost of less
+	  information collection. Disables stacktrace saving for faster
+	  allocations and forces panic on KASAN error to mitigate malicious
+	  attacks.
+
+endchoice
+
 endif # KASAN
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index ee9e406b0cdb..ae989d3bd919 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -47,7 +47,11 @@ enum kasan_arg_fault {
 	KASAN_ARG_FAULT_PANIC_ON_WRITE,
 };
 
+#ifdef CONFIG_KASAN_OPERATION_MITIGATION
+static enum kasan_arg_fault kasan_arg_fault __ro_after_init = KASAN_ARG_FAULT_PANIC;
+#else
 static enum kasan_arg_fault kasan_arg_fault __ro_after_init = KASAN_ARG_FAULT_DEFAULT;
+#endif
 
 /* kasan.fault=report/panic */
 static int __init early_kasan_fault(char *arg)
diff --git a/mm/kasan/tags.c b/mm/kasan/tags.c
index c111d98961ed..2414cddeaaf3 100644
--- a/mm/kasan/tags.c
+++ b/mm/kasan/tags.c
@@ -78,6 +78,11 @@ early_param("kasan.stack_ring_size", early_kasan_flag_stack_ring_size);
 
 void __init kasan_init_tags(void)
 {
+	if (IS_ENABLED(CONFIG_KASAN_OPERATION_MITIGATION)) {
+		static_branch_disable(&kasan_flag_stacktrace);
+		return;
+	}
+
 	switch (kasan_arg_stacktrace) {
 	case KASAN_ARG_STACKTRACE_DEFAULT:
 		/* Default is specified by kasan_flag_stacktrace definition. */
-- 
2.47.1



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (14 preceding siblings ...)
  2025-02-04 17:33 ` [PATCH 15/15] kasan: Add mitigation and debug modes Maciej Wieczor-Retman
@ 2025-02-04 18:58 ` Christoph Lameter (Ampere)
  2025-02-04 21:05   ` Dave Hansen
                     ` (2 more replies)
  2025-02-05 23:40 ` Andrey Konovalov
  16 siblings, 3 replies; 45+ messages in thread
From: Christoph Lameter (Ampere) @ 2025-02-04 18:58 UTC (permalink / raw)
  To: Maciej Wieczor-Retman
  Cc: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

ARM64 supports MTE which is hardware support for tagging 16 byte granules
and verification of tags in pointers all in hardware and on some platforms
with *no* performance penalty since the tag is stored in the ECC areas of
DRAM and verified at the same time as the ECC.

Could we get support for that? This would allow us to enable tag checking
in production systems without performance penalty and no memory overhead.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-04 18:58 ` [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Christoph Lameter (Ampere)
@ 2025-02-04 21:05   ` Dave Hansen
  2025-02-05 18:59     ` Christoph Lameter (Ampere)
  2025-02-04 23:36   ` Jessica Clarke
  2 siblings, 1 reply; 45+ messages in thread
From: Dave Hansen @ 2025-02-04 21:05 UTC (permalink / raw)
  To: Christoph Lameter (Ampere), Maciej Wieczor-Retman
  Cc: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On 2/4/25 10:58, Christoph Lameter (Ampere) wrote:
> ARM64 supports MTE which is hardware support for tagging 16 byte granules
> and verification of tags in pointers all in hardware and on some platforms
> with *no* performance penalty since the tag is stored in the ECC areas of
> DRAM and verified at the same time as the ECC.
> 
> Could we get support for that? This would allow us to enable tag checking
> in production systems without performance penalty and no memory overhead.

At least on the Intel side, there's no trajectory for doing something
like the MTE architecture for memory tagging. The DRAM "ECC" area is in
very high demand and if anything things are moving away from using ECC
"bits" for anything other than actual ECC. Even the MKTME+integrity
(used for TDX) metadata is probably going to find a new home at some point.

This shouldn't be a surprise to anyone on cc here. If it is, you should
probably be reaching out to Intel over your normal channels.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-04 18:58 ` [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Christoph Lameter (Ampere)
  2025-02-04 21:05   ` Dave Hansen
@ 2025-02-04 23:36   ` Jessica Clarke
  2025-02-05 18:51     ` Christoph Lameter (Ampere)
  2 siblings, 1 reply; 45+ messages in thread
From: Jessica Clarke @ 2025-02-04 23:36 UTC (permalink / raw)
  To: Christoph Lameter (Ampere)
  Cc: Maciej Wieczor-Retman, luto, xin, kirill.shutemov, palmer, tj,
	andreyknvl, brgerst, ardb, dave.hansen, jgross, will, akpm, arnd,
	corbet, dvyukov, richard.weiyang, ytcoode, tglx, hpa, seanjc,
	paul.walmsley, aou, justinstitt, jason.andryuk, glider, ubizjak,
	jannh, bhe, vincenzo.frascino, rafael.j.wysocki, ndesaulniers,
	mingo, catalin.marinas, junichi.nomura, nathan, ryabinin.a.a,
	dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, kees, kasan-dev, x86, linux-arm-kernel,
	linux-riscv, linux-kernel, linux-mm, llvm, linux-doc

On 4 Feb 2025, at 18:58, Christoph Lameter (Ampere) <cl@gentwo.org> wrote:
> ARM64 supports MTE which is hardware support for tagging 16 byte granules
> and verification of tags in pointers all in hardware and on some platforms
> with *no* performance penalty since the tag is stored in the ECC areas of
> DRAM and verified at the same time as the ECC.
> 
> Could we get support for that? This would allow us to enable tag checking
> in production systems without performance penalty and no memory overhead.

It’s not “no performance penalty”, there is a cost to tracking the MTE
tags for checking. In asynchronous (or asymmetric) mode that’s not too
bad, but in synchronous mode there is a significant overhead even with
ECC. Normally on a store, once you’ve translated it and have the data,
you can buffer it up and defer the actual write until some time later.
If you hit in the L1 cache then that will probably be quite soon, but
if you miss then you have to wait for the data to come back from lower
levels of the hierarchy, potentially all the way out to DRAM. Or if you
have a write-around cache then you just send it out to the next level
when it’s ready. But now, if you have synchronous MTE, you cannot
retire your store instruction until you know what the tag for the
location you’re storing to is; effectively you have to wait until you
can do the full cache lookup, and potentially miss, until it can
retire. This puts pressure on the various microarchitectural structures
that track instructions as they get executed, as instructions are now
in flight for longer. Yes, it may well be that it is quicker for the
memory controller to get the tags from ECC bits than via some other
means, but you’re already paying many many cycles at that point, with
the relevant store being stuck unable to retire (and thus every
instruction after it in the instruction stream) that whole time, and no
write allocate or write around schemes can help you, because you
fundamentally have to wait for the tags to be read before you know if
the instruction is going to trap.

Now, you can choose to not use synchronous mode due to that overhead,
but that’s nuance that isn’t considered by your reply here and has some
consequences.

Jess



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-04 23:36   ` Jessica Clarke
@ 2025-02-05 18:51     ` Christoph Lameter (Ampere)
  2025-02-06  1:05       ` Jessica Clarke
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter (Ampere) @ 2025-02-05 18:51 UTC (permalink / raw)
  To: Jessica Clarke
  Cc: Maciej Wieczor-Retman, luto, xin, kirill.shutemov, palmer, tj,
	andreyknvl, brgerst, ardb, dave.hansen, jgross, will, akpm, arnd,
	corbet, dvyukov, richard.weiyang, ytcoode, tglx, hpa, seanjc,
	paul.walmsley, aou, justinstitt, jason.andryuk, glider, ubizjak,
	jannh, bhe, vincenzo.frascino, rafael.j.wysocki, ndesaulniers,
	mingo, catalin.marinas, junichi.nomura, nathan, ryabinin.a.a,
	dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, kees, kasan-dev, x86, linux-arm-kernel,
	linux-riscv, linux-kernel, linux-mm, llvm, linux-doc

On Tue, 4 Feb 2025, Jessica Clarke wrote:

> It’s not “no performance penalty”, there is a cost to tracking the MTE
> tags for checking. In asynchronous (or asymmetric) mode that’s not too


On Ampere Processor hardware there is no penalty since the logic is built
into the usual read/write paths. This is by design. There may be one on
other platforms that cannot do this.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-04 21:05   ` Dave Hansen
@ 2025-02-05 18:59     ` Christoph Lameter (Ampere)
  2025-02-05 23:04       ` Ard Biesheuvel
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter (Ampere) @ 2025-02-05 18:59 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Maciej Wieczor-Retman, luto, xin, kirill.shutemov, palmer, tj,
	andreyknvl, brgerst, ardb, dave.hansen, jgross, will, akpm, arnd,
	corbet, dvyukov, richard.weiyang, ytcoode, tglx, hpa, seanjc,
	paul.walmsley, aou, justinstitt, jason.andryuk, glider, ubizjak,
	jannh, bhe, vincenzo.frascino, rafael.j.wysocki, ndesaulniers,
	mingo, catalin.marinas, junichi.nomura, nathan, ryabinin.a.a,
	dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, kees, kasan-dev, x86, linux-arm-kernel,
	linux-riscv, linux-kernel, linux-mm, llvm, linux-doc

On Tue, 4 Feb 2025, Dave Hansen wrote:

> > Could we get support for that? This would allow us to enable tag checking
> > in production systems without performance penalty and no memory overhead.
>
> At least on the Intel side, there's no trajectory for doing something
> like the MTE architecture for memory tagging. The DRAM "ECC" area is in
> very high demand and if anything things are moving away from using ECC
> "bits" for anything other than actual ECC. Even the MKTME+integrity
> (used for TDX) metadata is probably going to find a new home at some point.
>
> This shouldn't be a surprise to anyone on cc here. If it is, you should
> probably be reaching out to Intel over your normal channels.

Intel was a competitor for our company and AFAICT has issues all over
the place with performance given its conservative stands on technology. But
we do not test against Intel anymore. Can someone from AMD say something?

MTE tagging is part of the processor standard for ARM64 and Linux will
need to support the 16 byte tagging feature one way or another even if
Intel does not like it. And AFAICT hardware tagging support is a critical
security feature for the future.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 04/15] kasan: arm64: x86: risc-v: Make special tags arch specific
  2025-02-04 17:33 ` [PATCH 04/15] kasan: arm64: x86: risc-v: Make special tags arch specific Maciej Wieczor-Retman
@ 2025-02-05 20:20   ` Palmer Dabbelt
  2025-02-06 11:22     ` Maciej Wieczor-Retman
  0 siblings, 1 reply; 45+ messages in thread
From: Palmer Dabbelt @ 2025-02-05 20:20 UTC (permalink / raw)
  To: maciej.wieczor-retman
  Cc: luto, xin, kirill.shutemov, tj, andreyknvl, brgerst,
	Ard Biesheuvel, dave.hansen, jgross, Will Deacon, akpm,
	Arnd Bergmann, corbet, maciej.wieczor-retman, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, Paul Walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	Catalin Marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On Tue, 04 Feb 2025 09:33:45 PST (-0800), maciej.wieczor-retman@intel.com wrote:
> KASAN's tag-based mode defines multiple special tag values. They're
> reserved for:
> - Native kernel value. On arm64 it's 0xFF and it causes an early return
>   in the tag checking function.
> - Invalid value. 0xFE marks an area as freed / unallocated. It's also
>   the value that is used to initialize regions of shadow memory.
> - Max value. 0xFD is the highest value that can be randomly generated
>   for a new tag.
>
> Metadata macro is also defined:
> - Tag width equal to 8.
>
> Tag-based mode on x86 is going to use 4 bit wide tags so all the above
> values need to be changed accordingly.
>
> Make tags arch specific for x86, risc-v and arm64. On x86 the values
> just lose the top 4 bits.
>
> Replace hardcoded kernel tag value and tag width with macros in KASAN's
> non-arch specific code.
>
> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
> ---
>  MAINTAINERS                         |  2 +-
>  arch/arm64/include/asm/kasan-tags.h |  9 +++++++++
>  arch/riscv/include/asm/kasan-tags.h | 12 ++++++++++++
>  arch/riscv/include/asm/kasan.h      |  4 ----
>  arch/x86/include/asm/kasan-tags.h   |  9 +++++++++
>  include/linux/kasan-tags.h          | 12 +++++++++++-
>  include/linux/kasan.h               |  4 +++-
>  include/linux/mm.h                  |  6 +++---
>  include/linux/page-flags-layout.h   |  7 +------
>  9 files changed, 49 insertions(+), 16 deletions(-)
>  create mode 100644 arch/arm64/include/asm/kasan-tags.h
>  create mode 100644 arch/riscv/include/asm/kasan-tags.h
>  create mode 100644 arch/x86/include/asm/kasan-tags.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index b878ddc99f94..45671faa3b6f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -12227,7 +12227,7 @@ L:	kasan-dev@googlegroups.com
>  S:	Maintained
>  B:	https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management
>  F:	Documentation/dev-tools/kasan.rst
> -F:	arch/*/include/asm/*kasan.h
> +F:	arch/*/include/asm/*kasan*.h
>  F:	arch/*/mm/kasan_init*
>  F:	include/linux/kasan*.h
>  F:	lib/Kconfig.kasan
> diff --git a/arch/arm64/include/asm/kasan-tags.h b/arch/arm64/include/asm/kasan-tags.h
> new file mode 100644
> index 000000000000..9e835da95f6b
> --- /dev/null
> +++ b/arch/arm64/include/asm/kasan-tags.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_KASAN_TAGS_H
> +#define __ASM_KASAN_TAGS_H
> +
> +#define KASAN_TAG_KERNEL	0xFF /* native kernel pointers tag */
> +
> +#define KASAN_TAG_WIDTH 8
> +
> +#endif /* ASM_KASAN_TAGS_H */
> diff --git a/arch/riscv/include/asm/kasan-tags.h b/arch/riscv/include/asm/kasan-tags.h
> new file mode 100644
> index 000000000000..83d7dcc8af74
> --- /dev/null
> +++ b/arch/riscv/include/asm/kasan-tags.h
> @@ -0,0 +1,12 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_KASAN_TAGS_H
> +#define __ASM_KASAN_TAGS_H
> +
> +#ifdef CONFIG_KASAN_SW_TAGS
> +#define KASAN_TAG_KERNEL	0x7f /* native kernel pointers tag */
> +#endif
> +
> +#define KASAN_TAG_WIDTH 8
> +
> +#endif /* ASM_KASAN_TAGS_H */
> +
> diff --git a/arch/riscv/include/asm/kasan.h b/arch/riscv/include/asm/kasan.h
> index f6b378ba936d..27938e0d5233 100644
> --- a/arch/riscv/include/asm/kasan.h
> +++ b/arch/riscv/include/asm/kasan.h
> @@ -41,10 +41,6 @@
>
>  #define KASAN_SHADOW_OFFSET	_AC(CONFIG_KASAN_SHADOW_OFFSET, UL)
>
> -#ifdef CONFIG_KASAN_SW_TAGS
> -#define KASAN_TAG_KERNEL	0x7f /* native kernel pointers tag */
> -#endif
> -
>  #define arch_kasan_set_tag(addr, tag)	__tag_set(addr, tag)
>  #define arch_kasan_reset_tag(addr)	__tag_reset(addr)
>  #define arch_kasan_get_tag(addr)	__tag_get(addr)
> diff --git a/arch/x86/include/asm/kasan-tags.h b/arch/x86/include/asm/kasan-tags.h
> new file mode 100644
> index 000000000000..68ba385bc75c
> --- /dev/null
> +++ b/arch/x86/include/asm/kasan-tags.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_KASAN_TAGS_H
> +#define __ASM_KASAN_TAGS_H
> +
> +#define KASAN_TAG_KERNEL	0xF /* native kernel pointers tag */
> +
> +#define KASAN_TAG_WIDTH		4
> +
> +#endif /* ASM_KASAN_TAGS_H */
> diff --git a/include/linux/kasan-tags.h b/include/linux/kasan-tags.h
> index e07c896f95d3..b4aacfa8709b 100644
> --- a/include/linux/kasan-tags.h
> +++ b/include/linux/kasan-tags.h
> @@ -2,7 +2,17 @@
>  #ifndef _LINUX_KASAN_TAGS_H
>  #define _LINUX_KASAN_TAGS_H
>
> -#include <asm/kasan.h>
> +#if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS)
> +#include <asm/kasan-tags.h>
> +#endif
> +
> +#ifdef CONFIG_KASAN_SW_TAGS_DENSE
> +#define KASAN_TAG_WIDTH		4
> +#endif
> +
> +#ifndef KASAN_TAG_WIDTH
> +#define KASAN_TAG_WIDTH		0
> +#endif
>
>  #ifndef KASAN_TAG_KERNEL
>  #define KASAN_TAG_KERNEL	0xFF /* native kernel pointers tag */
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 5a3e9bec21c2..83146367170a 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -88,7 +88,9 @@ static inline u8 kasan_get_shadow_tag(const void *addr)
>
>  #ifdef CONFIG_KASAN_SW_TAGS
>  /* This matches KASAN_TAG_INVALID. */
> -#define KASAN_SHADOW_INIT 0xFE
> +#ifndef KASAN_SHADOW_INIT
> +#define KASAN_SHADOW_INIT KASAN_TAG_INVALID
> +#endif
>  #else
>  #define KASAN_SHADOW_INIT 0
>  #endif
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 61fff5d34ed5..ddca2f63a5f6 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1813,7 +1813,7 @@ static inline u8 page_kasan_tag(const struct page *page)
>
>  	if (kasan_enabled()) {
>  		tag = (page->flags >> KASAN_TAG_PGSHIFT) & KASAN_TAG_MASK;
> -		tag ^= 0xff;
> +		tag ^= KASAN_TAG_KERNEL;
>  	}
>
>  	return tag;
> @@ -1826,7 +1826,7 @@ static inline void page_kasan_tag_set(struct page *page, u8 tag)
>  	if (!kasan_enabled())
>  		return;
>
> -	tag ^= 0xff;
> +	tag ^= KASAN_TAG_KERNEL;
>  	old_flags = READ_ONCE(page->flags);
>  	do {
>  		flags = old_flags;
> @@ -1845,7 +1845,7 @@ static inline void page_kasan_tag_reset(struct page *page)
>
>  static inline u8 page_kasan_tag(const struct page *page)
>  {
> -	return 0xff;
> +	return KASAN_TAG_KERNEL;
>  }
>
>  static inline void page_kasan_tag_set(struct page *page, u8 tag) { }
> diff --git a/include/linux/page-flags-layout.h b/include/linux/page-flags-layout.h
> index 7d79818dc065..ac3576f409ad 100644
> --- a/include/linux/page-flags-layout.h
> +++ b/include/linux/page-flags-layout.h
> @@ -3,6 +3,7 @@
>  #define PAGE_FLAGS_LAYOUT_H
>
>  #include <linux/numa.h>
> +#include <linux/kasan-tags.h>
>  #include <generated/bounds.h>
>
>  /*
> @@ -72,12 +73,6 @@
>  #define NODE_NOT_IN_PAGE_FLAGS	1
>  #endif
>
> -#if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS)
> -#define KASAN_TAG_WIDTH 8
> -#else
> -#define KASAN_TAG_WIDTH 0
> -#endif
> -
>  #ifdef CONFIG_NUMA_BALANCING
>  #define LAST__PID_SHIFT 8
>  #define LAST__PID_MASK  ((1 << LAST__PID_SHIFT)-1)

Acked-by: Palmer Dabbelt <palmer@rivosinc.com> # RISC-V

Probably best to keep this along with the rest of the patches, but LMK 
if you want me to point something at the RISC-V tree.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-05 18:59     ` Christoph Lameter (Ampere)
@ 2025-02-05 23:04       ` Ard Biesheuvel
  0 siblings, 0 replies; 45+ messages in thread
From: Ard Biesheuvel @ 2025-02-05 23:04 UTC (permalink / raw)
  To: Christoph Lameter (Ampere)
  Cc: Dave Hansen, Maciej Wieczor-Retman, luto, xin, kirill.shutemov,
	palmer, tj, andreyknvl, brgerst, dave.hansen, jgross, will, akpm,
	arnd, corbet, dvyukov, richard.weiyang, ytcoode, tglx, hpa,
	seanjc, paul.walmsley, aou, justinstitt, jason.andryuk, glider,
	ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, kees, kasan-dev, x86, linux-arm-kernel,
	linux-riscv, linux-kernel, linux-mm, llvm, linux-doc

On Wed, 5 Feb 2025 at 20:31, Christoph Lameter (Ampere) <cl@gentwo.org> wrote:
>
> MTE tagging is part of the processor standard for ARM64 and Linux will
> need to support the 16 byte tagging feature one way or another even if
> Intel does not like it. And AFAICT hardware tagging support is a critical
> security feature for the future.
>

Can you explain what you feel is lacking in the existing MTE support
in KAsan (enabled when selecting CONFIG_KASAN_HW_TAGS)?


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
                   ` (15 preceding siblings ...)
  2025-02-04 18:58 ` [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Christoph Lameter (Ampere)
@ 2025-02-05 23:40 ` Andrey Konovalov
  2025-02-06 10:40   ` Maciej Wieczor-Retman
  16 siblings, 1 reply; 45+ messages in thread
From: Andrey Konovalov @ 2025-02-05 23:40 UTC (permalink / raw)
  To: Maciej Wieczor-Retman
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On Tue, Feb 4, 2025 at 6:34 PM Maciej Wieczor-Retman
<maciej.wieczor-retman@intel.com> wrote:
>
> ======= Introduction
> The patchset aims to add a KASAN tag-based mode for the x86 architecture
> with the help of the new CPU feature called Linear Address Masking
> (LAM). Main improvement introduced by the series is 4x lower memory
> usage compared to KASAN's generic mode, the only currently available
> mode on x86.
>
> There are two logical parts to this series. The first one attempts to
> add a new memory saving mechanism called "dense mode" to the generic
> part of the tag-based KASAN code. The second one focuses on implementing
> and enabling the tag-based mode for the x86 architecture by using LAM.

Hi Maciej,

Awesome work! Great to see SW_TAGS mode supported on x86!

I started reviewing the patches, but this is somewhat complicated, as
the dense mode changes are squashed together with the generic ones for
x86 support. Could you please split this series into 2? Or at least
reorder the patches so that everything needed for basic x86 support
comes first and can be reviewed and tested separately.

I will post the comments for things I noted so far, including for the
dense mode changes, but I'll take a closer look after the split.

Also feel free to drop the dependency on that risc-v series, as it
doesn't get updated very often. But up to you.

And please also update all affected parts of Documentation/dev-tools/kasan.rst.

Thank you!


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 01/15] kasan: Allocation enhancement for dense tag-based mode
  2025-02-04 17:33 ` [PATCH 01/15] kasan: Allocation enhancement for dense tag-based mode Maciej Wieczor-Retman
@ 2025-02-05 23:43   ` Andrey Konovalov
  2025-02-06 12:57     ` Maciej Wieczor-Retman
  0 siblings, 1 reply; 45+ messages in thread
From: Andrey Konovalov @ 2025-02-05 23:43 UTC (permalink / raw)
  To: Maciej Wieczor-Retman
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On Tue, Feb 4, 2025 at 6:34 PM Maciej Wieczor-Retman
<maciej.wieczor-retman@intel.com> wrote:
>
> Tag-based KASAN (on arm64) works by generating a random 8-bit tag and
> putting it in both the top byte of the pointer (that points to the
> allocated memory) and into all bytes of shadow memory that correspond to
> the chunk of allocated regular memory. Each byte of shadow memory covers
> a 16 byte chunk of allocated memory - a value called KASAN granularity.
> This means that out-of-bounds memory accesses that happen inside the 16
> bytes can't be caught.
>
> The dense mode reduces the tag width from 8 to 4 bits and stores
> two tags in one byte of shadow memory - one in the upper 4 bits
> of the byte and one in the lower 4. This way one byte of shadow memory
> can cover 32 bytes of allocated memory while still keeping the "16 bytes
> per one tag" granularity. The lower 4 bits of each shadow byte map bytes
> of memory with offsets 0-15 and the upper 4 bits map offsets 16-31.
>
> Example:
> The example below shows how the shadow memory looks like after
> allocating 48 bytes of memory in both normal tag-based mode and the
> dense mode. The contents of shadow memory are overlaid onto address
> offsets that they relate to in the allocated kernel memory. Each cell
> |    | symbolizes one byte of shadow memory.
>
> = The regular tag based mode:
> - Randomly generated 8-bit tag equals 0xAB.
> - 0xFE is the tag that symbolizes unallocated memory.
>
> Shadow memory contents:           |  0xAB  |  0xAB  |  0xAB  |  0xFE  |
> Shadow memory address offsets:    0        1        2        3        4
> Allocated memory address offsets: 0        16       32       48       64
>
> = The dense tag based mode:
> - Randomly generated 4-bit tag equals 0xC.
> - 0xE is the tag that symbolizes unallocated memory.
>
> Shadow memory contents:           |0xC 0xC |0xC 0xE |0xE 0xE |0xE 0xE |
> Shadow memory address offsets:    0        1        2        3        4
> Allocated memory address offsets: 0        32       64       96       128
>
> Add a new config option and defines that can override the standard
> system of one tag per one shadow byte.
>
> Add alternative version of the kasan_poison() that deals with tags not
> being aligned to byte size in shadow memory.
>
> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
> ---
>  include/linux/kasan.h | 18 ++++++++++++++++++
>  lib/Kconfig.kasan     | 21 +++++++++++++++++++++
>  mm/kasan/kasan.h      |  4 +---
>  mm/kasan/shadow.c     | 33 ++++++++++++++++++++++++++++++---
>  4 files changed, 70 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 03b440658817..ea0f5acd875b 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -35,6 +35,24 @@ typedef unsigned int __bitwise kasan_vmalloc_flags_t;
>
>  /* Software KASAN implementations use shadow memory. */
>
> +#ifdef CONFIG_KASAN_SW_TAGS_DENSE
> +#define KASAN_GRANULE_SHIFT    (KASAN_SHADOW_SCALE_SHIFT - 1)
> +#define KASAN_SHADOW_SCALE_SIZE        (1UL << KASAN_SHADOW_SCALE_SHIFT)
> +static inline u8 kasan_dense_tag(u8 tag)
> +{
> +       return (tag << KASAN_TAG_WIDTH | tag);
> +}
> +#else
> +#define KASAN_GRANULE_SHIFT    KASAN_SHADOW_SCALE_SHIFT
> +#define KASAN_SHADOW_SCALE_SIZE        (1UL << KASAN_GRANULE_SHIFT)
> +static inline u8 kasan_dense_tag(u8 tag)
> +{
> +       return tag;
> +}
> +#endif
> +
> +#define KASAN_GRANULE_SIZE     (1UL << KASAN_GRANULE_SHIFT)
> +

Is there a reason these definitions are added to
include/linux/kasan.h? At least within this patch, they are only used
within mm/kasan, so let's keep them in mm/kasan/kasan.h.

>  #ifdef CONFIG_KASAN_SW_TAGS
>  /* This matches KASAN_TAG_INVALID. */
>  #define KASAN_SHADOW_INIT 0xFE
> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
> index 98016e137b7f..d08b4e9bf477 100644
> --- a/lib/Kconfig.kasan
> +++ b/lib/Kconfig.kasan
> @@ -19,6 +19,13 @@ config ARCH_DISABLE_KASAN_INLINE
>           Disables both inline and stack instrumentation. Selected by
>           architectures that do not support these instrumentation types.
>
> +config ARCH_HAS_KASAN_SW_TAGS_DENSE
> +       bool
> +       help
> +         Enables option to compile tag-based KASAN with densely packed tags -
> +         two 4-bit tags per one byte of shadow memory. Set on architectures
> +         that have 4-bit tag macros.
> +
>  config CC_HAS_KASAN_GENERIC
>         def_bool $(cc-option, -fsanitize=kernel-address)
>
> @@ -223,4 +230,18 @@ config KASAN_EXTRA_INFO
>           boot parameter, it will add 8 * stack_ring_size bytes of additional
>           memory consumption.
>
> +config KASAN_SW_TAGS_DENSE
> +       bool "Two 4-bit tags in one shadow memory byte"
> +       depends on KASAN_SW_TAGS
> +       depends on ARCH_HAS_KASAN_SW_TAGS_DENSE

I think this should also depend on KASAN_OUTLINE: Clang/GCC aren't
aware of the dense mode.
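
Something like this, I suppose (just sketching the extra dependency):

	config KASAN_SW_TAGS_DENSE
		bool "Two 4-bit tags in one shadow memory byte"
		depends on KASAN_SW_TAGS
		depends on KASAN_OUTLINE
		depends on ARCH_HAS_KASAN_SW_TAGS_DENSE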

> +       help
> +         Enables packing two tags into one shadow byte to half the memory usage
> +         compared to normal tag-based mode.

But adds some performance impact?

> +
> +         After setting this option, tag width macro is set to 4 and size macros
> +         are adjusted based on used KASAN_SHADOW_SCALE_SHIFT.

I think this paragraph is an implementation detail and we can drop it.

> +
> +         ARCH_HAS_KASAN_SW_TAGS_DENSE is needed for this option since the
> +         special tag macros need to be properly set for 4-bit wide tags.
> +
>  endif # KASAN
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index 72da5ddcceaa..0e04c5e2c405 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -128,9 +128,7 @@ static inline bool kasan_requires_meta(void)
>
>  #endif /* CONFIG_KASAN_GENERIC */
>
> -#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
> -#define KASAN_GRANULE_SIZE     (1UL << KASAN_SHADOW_SCALE_SHIFT)
> -#else
> +#ifdef CONFIG_KASAN_HW_TAGS
>  #include <asm/mte-kasan.h>
>  #define KASAN_GRANULE_SIZE     MTE_GRANULE_SIZE
>  #endif
> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> index d6210ca48dda..368503f54b87 100644
> --- a/mm/kasan/shadow.c
> +++ b/mm/kasan/shadow.c
> @@ -123,7 +123,8 @@ EXPORT_SYMBOL(__hwasan_memcpy);
>
>  void kasan_poison(const void *addr, size_t size, u8 value, bool init)
>  {
> -       void *shadow_start, *shadow_end;
> +       u8 *shadow_start, *shadow_end, *shadow_start_aligned, *shadow_end_aligned, tag;
> +       u64 addr64, addr_start_aligned, addr_end_aligned;
>
>         if (!kasan_arch_is_ready())
>                 return;
> @@ -134,16 +135,42 @@ void kasan_poison(const void *addr, size_t size, u8 value, bool init)
>          * addresses to this function.
>          */
>         addr = kasan_reset_tag(addr);
> +       addr64 = (u64)addr;
>
> -       if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
> +       if (WARN_ON(addr64 & KASAN_GRANULE_MASK))
>                 return;
>         if (WARN_ON(size & KASAN_GRANULE_MASK))
>                 return;
>
>         shadow_start = kasan_mem_to_shadow(addr);
>         shadow_end = kasan_mem_to_shadow(addr + size);
> +       addr_start_aligned = round_up(addr64, KASAN_SHADOW_SCALE_SIZE);
> +       addr_end_aligned = round_down(addr64 + size, KASAN_SHADOW_SCALE_SIZE);
> +       shadow_start_aligned = kasan_mem_to_shadow((void *)addr_start_aligned);
> +       shadow_end_aligned = kasan_mem_to_shadow((void *)addr_end_aligned);
> +
> +       /* If size is empty just return. */
> +       if (!size)
> +               return;
>
> -       __memset(shadow_start, value, shadow_end - shadow_start);
> +       /* Memset the first unaligned tag in shadow memory. */
> +       if (addr64 % KASAN_SHADOW_SCALE_SIZE) {

So this is required, because KASAN_SHADOW_SCALE_SIZE is 32 but minimal
slab alignment is still KASAN_GRANULE_SIZE == 16... We should at least
hide this check under IS_ENABLED(KASAN_SW_TAGS_DENSE).
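
I.e. roughly (sketch only, reusing the code from this patch):

	/* Only needed when two tags share one shadow byte. */
	if (IS_ENABLED(CONFIG_KASAN_SW_TAGS_DENSE) &&
	    (addr64 % KASAN_SHADOW_SCALE_SIZE)) {
		tag = *shadow_start & KASAN_TAG_MASK;
		tag |= value << KASAN_TAG_WIDTH;
		*shadow_start = tag;
	}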

> +               tag = *shadow_start & KASAN_TAG_MASK;
> +               tag |= value << KASAN_TAG_WIDTH;
> +               *shadow_start = tag;
> +       }
> +
> +       /* Memset the middle aligned part in shadow memory. */
> +       tag = kasan_dense_tag(value);
> +       __memset(shadow_start_aligned, tag, shadow_end_aligned - shadow_start_aligned);
> +
> +       /* Memset the last unaligned tag in shadow memory. */
> +       if ((addr64 + size) % KASAN_SHADOW_SCALE_SIZE) {

Would it be possible to move this part to kasan_poison_last_granule()?
That function seems to serve a similar purpose but for the
Generic mode.

It might also be cleaner to add a kasan_poison_first_granule() that
contains the if (addr64 % KASAN_SHADOW_SCALE_SIZE) check.
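
Something along these lines, perhaps (rough sketch only, reusing the
logic from this patch, mirroring the existing kasan_poison_last_granule()
and assuming addr is already untagged as in kasan_poison()):

	static inline void kasan_poison_first_granule(const void *addr, u8 value)
	{
		u8 *shadow = kasan_mem_to_shadow(addr);

		if (!IS_ENABLED(CONFIG_KASAN_SW_TAGS_DENSE))
			return;

		/* Keep the low nibble, write the new tag into the high one. */
		if ((u64)addr % KASAN_SHADOW_SCALE_SIZE)
			*shadow = (*shadow & KASAN_TAG_MASK) | (value << KASAN_TAG_WIDTH);
	}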

> +               tag = KASAN_TAG_MASK << KASAN_TAG_WIDTH;
> +               tag &= *shadow_end;
> +               tag |= value;
> +               *shadow_end = tag;
> +       }
>  }
>  EXPORT_SYMBOL_GPL(kasan_poison);
>
> --
> 2.47.1
>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/15] kasan: Tag checking with dense tag-based mode
  2025-02-04 17:33 ` [PATCH 02/15] kasan: Tag checking with " Maciej Wieczor-Retman
@ 2025-02-05 23:45   ` Andrey Konovalov
  2025-02-06 14:55     ` Maciej Wieczor-Retman
  0 siblings, 1 reply; 45+ messages in thread
From: Andrey Konovalov @ 2025-02-05 23:45 UTC (permalink / raw)
  To: Maciej Wieczor-Retman
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On Tue, Feb 4, 2025 at 6:35 PM Maciej Wieczor-Retman
<maciej.wieczor-retman@intel.com> wrote:
>
> In KASAN's tag-based mode (arm64) when a memory access occurs, the tag
> stored in the top 8 bits of the pointer is compared with tags saved in
> the region of the shadow memory that maps to memory the pointer points
> to. If any of the tags in the shadow memory region do not match the one
> stored in the pointer an error report is generated.
>
> With the introduction of the dense mode, tags won't necessarily occupy
> whole bytes of shadow memory if the previously allocated memory wasn't
> aligned to 32 bytes - which is the coverage of one shadow byte.
>
> Add an alternative implementation of kasan_check_range() that performs
> special checks on first and last bytes of shadow memory ranges if the
> originally allocated memory wasn't aligned to 32 bytes.
>
> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
> ---
>  include/linux/kasan.h     | 47 +++++++++++++++-------
>  mm/kasan/Makefile         |  3 ++
>  mm/kasan/dense.c          | 83 +++++++++++++++++++++++++++++++++++++++
>  mm/kasan/kasan.h          |  2 +-
>  mm/kasan/report.c         |  2 +-
>  mm/kasan/report_sw_tags.c | 12 ++----
>  mm/kasan/sw_tags.c        |  8 ++++
>  7 files changed, 133 insertions(+), 24 deletions(-)
>  create mode 100644 mm/kasan/dense.c
>
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index ea0f5acd875b..5a3e9bec21c2 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -33,6 +33,20 @@ typedef unsigned int __bitwise kasan_vmalloc_flags_t;
>
>  #include <linux/pgtable.h>
>
> +#ifndef kasan_mem_to_shadow
> +static inline void *kasan_mem_to_shadow(const void *addr)
> +{
> +       void *scaled;
> +
> +       if (IS_ENABLED(CONFIG_KASAN_GENERIC))
> +               scaled = (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT);
> +       else
> +               scaled = (void *)((long)addr >> KASAN_SHADOW_SCALE_SHIFT);
> +
> +       return KASAN_SHADOW_OFFSET + scaled;
> +}
> +#endif

Any reason this is moved up here?


> +
>  /* Software KASAN implementations use shadow memory. */
>
>  #ifdef CONFIG_KASAN_SW_TAGS_DENSE
> @@ -53,6 +67,25 @@ static inline u8 kasan_dense_tag(u8 tag)
>
>  #define KASAN_GRANULE_SIZE     (1UL << KASAN_GRANULE_SHIFT)
>
> +#ifdef CONFIG_KASAN_SW_TAGS_DENSE
> +static inline u8 kasan_get_shadow_tag(const void *ptr)
> +{
> +       u8 shadow_byte = *(u8 *)kasan_mem_to_shadow(ptr);
> +       unsigned long addr = (unsigned long)ptr;
> +       int shift;
> +
> +       shift = !!(addr & KASAN_GRANULE_SIZE) * KASAN_TAG_WIDTH;
> +       shadow_byte >>= shift;
> +
> +       return shadow_byte & KASAN_TAG_KERNEL;
> +}
> +#else
> +static inline u8 kasan_get_shadow_tag(const void *addr)
> +{
> +       return (*(u8 *)kasan_mem_to_shadow(addr));
> +}
> +#endif
> +
>  #ifdef CONFIG_KASAN_SW_TAGS
>  /* This matches KASAN_TAG_INVALID. */
>  #define KASAN_SHADOW_INIT 0xFE
> @@ -73,20 +106,6 @@ extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
>  int kasan_populate_early_shadow(const void *shadow_start,
>                                 const void *shadow_end);
>
> -#ifndef kasan_mem_to_shadow
> -static inline void *kasan_mem_to_shadow(const void *addr)
> -{
> -       void *scaled;
> -
> -       if (IS_ENABLED(CONFIG_KASAN_GENERIC))
> -               scaled = (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT);
> -       else
> -               scaled = (void *)((long)addr >> KASAN_SHADOW_SCALE_SHIFT);
> -
> -       return KASAN_SHADOW_OFFSET + scaled;
> -}
> -#endif
> -
>  int kasan_add_zero_shadow(void *start, unsigned long size);
>  void kasan_remove_zero_shadow(void *start, unsigned long size);
>
> diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
> index b88543e5c0cc..3a460abd4c18 100644
> --- a/mm/kasan/Makefile
> +++ b/mm/kasan/Makefile
> @@ -5,6 +5,7 @@ KCOV_INSTRUMENT := n
>
>  # Disable ftrace to avoid recursion.
>  CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE)
> +CFLAGS_REMOVE_dense.o = $(CC_FLAGS_FTRACE)
>  CFLAGS_REMOVE_generic.o = $(CC_FLAGS_FTRACE)
>  CFLAGS_REMOVE_init.o = $(CC_FLAGS_FTRACE)
>  CFLAGS_REMOVE_quarantine.o = $(CC_FLAGS_FTRACE)
> @@ -24,6 +25,7 @@ CC_FLAGS_KASAN_RUNTIME += -fno-stack-protector
>  CC_FLAGS_KASAN_RUNTIME += -DDISABLE_BRANCH_PROFILING
>
>  CFLAGS_common.o := $(CC_FLAGS_KASAN_RUNTIME)
> +CFLAGS_dense.o := $(CC_FLAGS_KASAN_RUNTIME)
>  CFLAGS_generic.o := $(CC_FLAGS_KASAN_RUNTIME)
>  CFLAGS_init.o := $(CC_FLAGS_KASAN_RUNTIME)
>  CFLAGS_quarantine.o := $(CC_FLAGS_KASAN_RUNTIME)
> @@ -49,6 +51,7 @@ RUSTFLAGS_kasan_test_rust.o := $(RUSTFLAGS_KASAN)
>  CFLAGS_kasan_test_module.o := $(CFLAGS_KASAN_TEST)
>
>  obj-y := common.o report.o
> +obj-$(CONFIG_KASAN_SW_TAGS_DENSE) += dense.o
>  obj-$(CONFIG_KASAN_GENERIC) += init.o generic.o report_generic.o shadow.o quarantine.o
>  obj-$(CONFIG_KASAN_HW_TAGS) += hw_tags.o report_hw_tags.o tags.o report_tags.o
>  obj-$(CONFIG_KASAN_SW_TAGS) += init.o report_sw_tags.o shadow.o sw_tags.o tags.o report_tags.o
> diff --git a/mm/kasan/dense.c b/mm/kasan/dense.c
> new file mode 100644
> index 000000000000..306bbbfdce29
> --- /dev/null
> +++ b/mm/kasan/dense.c
> @@ -0,0 +1,83 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include "kasan.h"
> +
> +static __always_inline bool kasan_check_range_inline(const void *addr,
> +                                                    size_t size, bool write,
> +                                                    unsigned long ret_ip)
> +{
> +       u8 *shadow_first, *shadow_last, *shadow, *shadow_first_aligned, *shadow_last_aligned;
> +       u64 addr_start_aligned, addr_end_aligned;
> +       u8 tag, kasan_granule_offset;
> +       size_t aligned_size;
> +       void *untagged_addr;
> +
> +       if (unlikely(size == 0))
> +               return true;
> +
> +       if (unlikely(addr + size < addr))
> +               return !kasan_report(addr, size, write, ret_ip);
> +
> +       tag = get_tag((const void *)addr);
> +
> +       /*
> +        * Ignore accesses for pointers tagged with native kernel
> +        * pointer tag to suppress false positives caused by kmap.
> +        *
> +        * Some kernel code was written to account for archs that don't keep
> +        * high memory mapped all the time, but rather map and unmap particular
> +        * pages when needed. Instead of storing a pointer to the kernel memory,
> +        * this code saves the address of the page structure and offset within
> +        * that page for later use. Those pages are then mapped and unmapped
> +        * with kmap/kunmap when necessary and virt_to_page is used to get the
> +        * virtual address of the page. For arm64 (that keeps the high memory
> +        * mapped all the time), kmap is turned into a page_address call.
> +
> +        * The issue is that with use of the page_address + virt_to_page
> +        * sequence the top byte value of the original pointer gets lost (gets
> +        * set to KASAN_TAG_KERNEL).
> +        */
> +       if (tag == KASAN_TAG_KERNEL)
> +               return true;
> +
> +       untagged_addr = kasan_reset_tag((void *)round_down((u64)addr, KASAN_GRANULE_SIZE));
> +       if (unlikely(!addr_has_metadata(untagged_addr)))
> +               return !kasan_report(addr, size, write, ret_ip);
> +
> +       kasan_granule_offset = ((u64)addr & KASAN_GRANULE_MASK);
> +       aligned_size = round_up(size + kasan_granule_offset, KASAN_GRANULE_SIZE);
> +       shadow_first = kasan_mem_to_shadow(untagged_addr);
> +       shadow_last = kasan_mem_to_shadow(untagged_addr + aligned_size);
> +       addr_start_aligned = round_up((u64)untagged_addr, KASAN_SHADOW_SCALE_SIZE);
> +       addr_end_aligned = round_down((u64)untagged_addr + aligned_size, KASAN_SHADOW_SCALE_SIZE);
> +       shadow_first_aligned = kasan_mem_to_shadow((void *)addr_start_aligned);
> +       shadow_last_aligned = kasan_mem_to_shadow((void *)addr_end_aligned);
> +
> +       /* Check the first unaligned tag in shadow memory. */
> +       if ((u64)untagged_addr % KASAN_SHADOW_SCALE_SIZE) {
> +               if (unlikely((*shadow_first >> KASAN_TAG_WIDTH) != tag))
> +                       return !kasan_report(addr, size, write, ret_ip);
> +       }
> +
> +       /* Check the middle aligned part in shadow memory. */
> +       for (shadow = shadow_first_aligned; shadow < shadow_last_aligned; shadow++) {
> +               if (unlikely(*shadow != ((tag << KASAN_TAG_WIDTH) | tag)))
> +                       return !kasan_report(addr, size, write, ret_ip);
> +       }
> +
> +       /* Check the last unaligned tag in shadow memory. */
> +       if (((u64)untagged_addr + aligned_size) % KASAN_SHADOW_SCALE_SIZE) {
> +               if (unlikely((*shadow_last & KASAN_TAG_MASK) != tag))
> +                       return !kasan_report(addr, size, write, ret_ip);
> +       }
> +
> +       return true;
> +}
> +
> +#if IS_ENABLED(CONFIG_KASAN_SW_TAGS_DENSE)
> +bool kasan_check_range(const void *addr, size_t size, bool write,
> +                      unsigned long ret_ip)
> +{
> +       return kasan_check_range_inline(addr, size, write, ret_ip);
> +}
> +#endif
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index 0e04c5e2c405..d29bd0e65020 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -183,7 +183,7 @@ static inline bool kasan_requires_meta(void)
>  #define META_BYTES_PER_BLOCK 1
>  #define META_BLOCKS_PER_ROW 16
>  #define META_BYTES_PER_ROW (META_BLOCKS_PER_ROW * META_BYTES_PER_BLOCK)
> -#define META_MEM_BYTES_PER_ROW (META_BYTES_PER_ROW * KASAN_GRANULE_SIZE)
> +#define META_MEM_BYTES_PER_ROW (META_BYTES_PER_ROW * KASAN_SHADOW_SCALE_SIZE)
>  #define META_ROWS_AROUND_ADDR 2
>
>  #define KASAN_STACK_DEPTH 64
> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> index c08097715686..ee9e406b0cdb 100644
> --- a/mm/kasan/report.c
> +++ b/mm/kasan/report.c
> @@ -436,7 +436,7 @@ static int meta_pointer_offset(const void *row, const void *addr)
>          *    plus 1 byte for space.
>          */
>         return 3 + (BITS_PER_LONG / 8) * 2 +
> -               (addr - row) / KASAN_GRANULE_SIZE * 3 + 1;
> +               (addr - row) / KASAN_SHADOW_SCALE_SIZE * 3 + 1;
>  }
>
>  static void print_memory_metadata(const void *addr)
> diff --git a/mm/kasan/report_sw_tags.c b/mm/kasan/report_sw_tags.c
> index 689e94f9fe3c..1ac5c7a9011d 100644
> --- a/mm/kasan/report_sw_tags.c
> +++ b/mm/kasan/report_sw_tags.c
> @@ -39,7 +39,7 @@ const void *kasan_find_first_bad_addr(const void *addr, size_t size)
>         if (!addr_has_metadata(p))
>                 return p;
>
> -       while (p < end && tag == *(u8 *)kasan_mem_to_shadow(p))
> +       while (p < end && tag == kasan_get_shadow_tag(p))
>                 p += KASAN_GRANULE_SIZE;
>
>         return p;
> @@ -48,7 +48,6 @@ const void *kasan_find_first_bad_addr(const void *addr, size_t size)
>  size_t kasan_get_alloc_size(void *object, struct kmem_cache *cache)
>  {
>         size_t size = 0;
> -       u8 *shadow;
>
>         /*
>          * Skip the addr_has_metadata check, as this function only operates on
> @@ -59,13 +58,11 @@ size_t kasan_get_alloc_size(void *object, struct kmem_cache *cache)
>          * The loop below returns 0 for freed objects, for which KASAN cannot
>          * calculate the allocation size based on the metadata.
>          */
> -       shadow = (u8 *)kasan_mem_to_shadow(object);
>         while (size < cache->object_size) {
> -               if (*shadow != KASAN_TAG_INVALID)
> +               if (kasan_get_shadow_tag(object + size) != KASAN_TAG_INVALID)
>                         size += KASAN_GRANULE_SIZE;
>                 else
>                         return size;
> -               shadow++;
>         }
>
>         return cache->object_size;
> @@ -78,9 +75,8 @@ void kasan_metadata_fetch_row(char *buffer, void *row)
>
>  void kasan_print_tags(u8 addr_tag, const void *addr)
>  {
> -       u8 *shadow = (u8 *)kasan_mem_to_shadow(addr);
> -
> -       pr_err("Pointer tag: [%02x], memory tag: [%02x]\n", addr_tag, *shadow);
> +       pr_err("Pointer tag: [%02x], memory tag: [%02x]\n", addr_tag,
> +              kasan_get_shadow_tag(addr));
>  }
>
>  #ifdef CONFIG_KASAN_STACK
> diff --git a/mm/kasan/sw_tags.c b/mm/kasan/sw_tags.c
> index 32435d33583a..7a6b8ea9bf78 100644
> --- a/mm/kasan/sw_tags.c
> +++ b/mm/kasan/sw_tags.c
> @@ -79,6 +79,7 @@ u8 __hwasan_generate_tag(void)
>  }
>  EXPORT_SYMBOL(__hwasan_generate_tag);
>
> +#if !IS_ENABLED(CONFIG_KASAN_SW_TAGS_DENSE)
>  bool kasan_check_range(const void *addr, size_t size, bool write,
>                         unsigned long ret_ip)
>  {
> @@ -127,17 +128,24 @@ bool kasan_check_range(const void *addr, size_t size, bool write,
>
>         return true;
>  }
> +#endif
>
>  bool kasan_byte_accessible(const void *addr)
>  {
>         u8 tag = get_tag(addr);
>         void *untagged_addr = kasan_reset_tag(addr);
>         u8 shadow_byte;
> +       int shift;
>
>         if (!addr_has_metadata(untagged_addr))
>                 return false;
>
>         shadow_byte = READ_ONCE(*(u8 *)kasan_mem_to_shadow(untagged_addr));
> +       if (IS_ENABLED(CONFIG_KASAN_SW_TAGS_DENSE)) {
> +               shift = !!((u64)addr & BIT(KASAN_TAG_WIDTH)) * KASAN_TAG_WIDTH;
> +               shadow_byte = (shadow_byte >> shift) & KASAN_TAG_KERNEL;
> +       }
> +
>         return tag == KASAN_TAG_KERNEL || tag == shadow_byte;
>  }
>
> --
> 2.47.1
>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/15] x86: KASAN raw shadow memory PTE init
  2025-02-04 17:33 ` [PATCH 10/15] x86: KASAN raw shadow memory PTE init Maciej Wieczor-Retman
@ 2025-02-05 23:45   ` Andrey Konovalov
  2025-02-06 15:39     ` Maciej Wieczor-Retman
  0 siblings, 1 reply; 45+ messages in thread
From: Andrey Konovalov @ 2025-02-05 23:45 UTC (permalink / raw)
  To: Maciej Wieczor-Retman
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On Tue, Feb 4, 2025 at 6:36 PM Maciej Wieczor-Retman
<maciej.wieczor-retman@intel.com> wrote:
>
> In KASAN's generic mode the default value in shadow memory is zero.
> During initialization of shadow memory pages they are allocated and
> zeroed.
>
> In KASAN's tag-based mode the default tag for the arm64 architecture is
> 0xFE which corresponds to any memory that should not be accessed. On x86
> (where tags are 4-bit wide instead of 8-bit wide) that tag is 0xE so
> during the initializations all the bytes in shadow memory pages should
> be filled with 0xE or 0xEE if two tags should be packed in one shadow
> byte.
>
> Use memblock_alloc_try_nid_raw() instead of memblock_alloc_try_nid() to
> avoid zeroing out the memory so it can be set with the KASAN invalid
> tag.
>
> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
> ---
>  arch/x86/mm/kasan_init_64.c | 19 ++++++++++++++++---
>  include/linux/kasan.h       | 25 +++++++++++++++++++++++++
>  mm/kasan/kasan.h            | 19 -------------------
>  3 files changed, 41 insertions(+), 22 deletions(-)
>
> diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
> index 9dddf19a5571..55d468d83682 100644
> --- a/arch/x86/mm/kasan_init_64.c
> +++ b/arch/x86/mm/kasan_init_64.c
> @@ -35,6 +35,18 @@ static __init void *early_alloc(size_t size, int nid, bool should_panic)
>         return ptr;
>  }
>
> +static __init void *early_raw_alloc(size_t size, int nid, bool should_panic)
> +{
> +       void *ptr = memblock_alloc_try_nid_raw(size, size,
> +                       __pa(MAX_DMA_ADDRESS), MEMBLOCK_ALLOC_ACCESSIBLE, nid);
> +
> +       if (!ptr && should_panic)
> +               panic("%pS: Failed to allocate page, nid=%d from=%lx\n",
> +                     (void *)_RET_IP_, nid, __pa(MAX_DMA_ADDRESS));
> +
> +       return ptr;
> +}
> +
>  static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
>                                       unsigned long end, int nid)
>  {
> @@ -64,8 +76,9 @@ static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
>                 if (!pte_none(*pte))
>                         continue;
>
> -               p = early_alloc(PAGE_SIZE, nid, true);
> -               entry = pfn_pte(PFN_DOWN(__pa(p)), PAGE_KERNEL);
> +               p = early_raw_alloc(PAGE_SIZE, nid, true);
> +               memset(p, PAGE_SIZE, kasan_dense_tag(KASAN_SHADOW_INIT));
> +               entry = pfn_pte(PFN_DOWN(__pa_nodebug(p)), PAGE_KERNEL);
>                 set_pte_at(&init_mm, addr, pte, entry);
>         } while (pte++, addr += PAGE_SIZE, addr != end);
>  }
> @@ -437,7 +450,7 @@ void __init kasan_init(void)
>          * it may contain some garbage. Now we can clear and write protect it,
>          * since after the TLB flush no one should write to it.
>          */
> -       memset(kasan_early_shadow_page, 0, PAGE_SIZE);
> +       kasan_poison(kasan_early_shadow_page, PAGE_SIZE, KASAN_SHADOW_INIT, false);
>         for (i = 0; i < PTRS_PER_PTE; i++) {
>                 pte_t pte;
>                 pgprot_t prot;
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 83146367170a..af8272c74409 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -151,6 +151,31 @@ static __always_inline void kasan_unpoison_range(const void *addr, size_t size)
>                 __kasan_unpoison_range(addr, size);
>  }
>
> +#ifdef CONFIG_KASAN_HW_TAGS
> +
> +static inline void kasan_poison(const void *addr, size_t size, u8 value, bool init)
> +{
> +       if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
> +               return;
> +       if (WARN_ON(size & KASAN_GRANULE_MASK))
> +               return;
> +
> +       hw_set_mem_tag_range(kasan_reset_tag(addr), size, value, init);
> +}
> +
> +#else /* CONFIG_KASAN_HW_TAGS */
> +
> +/**
> + * kasan_poison - mark the memory range as inaccessible
> + * @addr - range start address, must be aligned to KASAN_GRANULE_SIZE
> + * @size - range size, must be aligned to KASAN_GRANULE_SIZE
> + * @value - value that's written to metadata for the range
> + * @init - whether to initialize the memory range (only for hardware tag-based)
> + */
> +void kasan_poison(const void *addr, size_t size, u8 value, bool init);
> +
> +#endif /* CONFIG_KASAN_HW_TAGS */

Please keep kasan_poison() and kasan_unpoison() in mm/kasan/kasan.h:
these are intended as internal-only functions (perhaps we should note
this in their comments). Instead, add a purpose-specific wrapper
similar to the ones in include/linux/kasan.h.
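
For example (just a sketch, the wrapper name below is only a
placeholder):

	/* include/linux/kasan.h */
	#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
	void kasan_poison_early_shadow_page(const void *page);
	#else
	static inline void kasan_poison_early_shadow_page(const void *page) { }
	#endif

	/* mm/kasan/shadow.c */
	void kasan_poison_early_shadow_page(const void *page)
	{
		kasan_poison(page, PAGE_SIZE, KASAN_SHADOW_INIT, false);
	}

Then the x86 kasan_init() can call the wrapper and kasan_poison() stays
internal to mm/kasan.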


> +
>  void __kasan_poison_pages(struct page *page, unsigned int order, bool init);
>  static __always_inline void kasan_poison_pages(struct page *page,
>                                                 unsigned int order, bool init)
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index a56aadd51485..2405477c5899 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -466,16 +466,6 @@ static inline u8 kasan_random_tag(void) { return 0; }
>
>  #ifdef CONFIG_KASAN_HW_TAGS
>
> -static inline void kasan_poison(const void *addr, size_t size, u8 value, bool init)
> -{
> -       if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
> -               return;
> -       if (WARN_ON(size & KASAN_GRANULE_MASK))
> -               return;
> -
> -       hw_set_mem_tag_range(kasan_reset_tag(addr), size, value, init);
> -}
> -
>  static inline void kasan_unpoison(const void *addr, size_t size, bool init)
>  {
>         u8 tag = get_tag(addr);
> @@ -497,15 +487,6 @@ static inline bool kasan_byte_accessible(const void *addr)
>
>  #else /* CONFIG_KASAN_HW_TAGS */
>
> -/**
> - * kasan_poison - mark the memory range as inaccessible
> - * @addr - range start address, must be aligned to KASAN_GRANULE_SIZE
> - * @size - range size, must be aligned to KASAN_GRANULE_SIZE
> - * @value - value that's written to metadata for the range
> - * @init - whether to initialize the memory range (only for hardware tag-based)
> - */
> -void kasan_poison(const void *addr, size_t size, u8 value, bool init);
> -
>  /**
>   * kasan_unpoison - mark the memory range as accessible
>   * @addr - range start address, must be aligned to KASAN_GRANULE_SIZE
> --
> 2.47.1
>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 15/15] kasan: Add mititgation and debug modes
  2025-02-04 17:33 ` [PATCH 15/15] kasan: Add mititgation and debug modes Maciej Wieczor-Retman
@ 2025-02-05 23:46   ` Andrey Konovalov
  2025-02-07  9:08     ` Maciej Wieczor-Retman
  0 siblings, 1 reply; 45+ messages in thread
From: Andrey Konovalov @ 2025-02-05 23:46 UTC (permalink / raw)
  To: Maciej Wieczor-Retman
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On Tue, Feb 4, 2025 at 6:37 PM Maciej Wieczor-Retman
<maciej.wieczor-retman@intel.com> wrote:
>
> With a smaller memory footprint KASAN could be used in production systems.
> One problem is that saving stacktraces slows memory allocation
> substantially - with KASAN enabled, up to 90% of the time spent in kmalloc()
> is spent on saving the stacktrace.
>
> Add a mitigation mode to allow running KASAN focused on
> performance and security. In mitigation mode, disable saving stacktraces
> and set the fault mode to always panic on a KASAN error as a security
> mechanism.
>
> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
> ---
>  lib/Kconfig.kasan | 28 ++++++++++++++++++++++++++++
>  mm/kasan/report.c |  4 ++++
>  mm/kasan/tags.c   |  5 +++++
>  3 files changed, 37 insertions(+)
>
> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
> index d08b4e9bf477..6daa62b40dea 100644
> --- a/lib/Kconfig.kasan
> +++ b/lib/Kconfig.kasan
> @@ -244,4 +244,32 @@ config KASAN_SW_TAGS_DENSE
>           ARCH_HAS_KASAN_SW_TAGS_DENSE is needed for this option since the
>           special tag macros need to be properly set for 4-bit wide tags.
>
> +choice
> +       prompt "KASAN operation mode"
> +       default KASAN_OPERATION_DEBUG
> +       help
> +         Choose between the mitigation or debug operation modes.
> +
> +         The first one disables stacktrace saving and enables panic on error.
> +         Faster memory allocation but less information. The second one is the
> +         default where KASAN operates with full functionality.

This is something that I thought about before and I think we should
_not_ add configuration options like these. The distinction between
debug and mitigation modes is something that's specific to a
particular user of the feature. Some might prefer to take the impact
of having stack traces enabled in a production environment to allow
debugging in-the-wild exploitation attempts. Also at some point in the
future, we will hopefully have production-grade stack traces [1], and
this would thus change the desired behavior of
KASAN_OPERATION_MITIGATION.

We already have the kasan.stacktrace command-line parameter for
disabling stack trace collection. On top of that, if you prefer, we
could add a configuration option that changes the default value of
kasan_flag_stacktrace (but can still be overridden via the
kasan.stacktrace command-line parameter). Note though that by default,
stack traces should be turned on.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=211785
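
Something like this, for example (sketch only, the option name is made
up):

	config KASAN_STACKTRACE_DEFAULT_OFF
		bool "Disable stack trace collection by default"
		depends on KASAN_SW_TAGS || KASAN_HW_TAGS

and then in kasan_init_tags():

	case KASAN_ARG_STACKTRACE_DEFAULT:
		if (IS_ENABLED(CONFIG_KASAN_STACKTRACE_DEFAULT_OFF))
			static_branch_disable(&kasan_flag_stacktrace);
		break;

so that kasan.stacktrace=on/off can still override the default.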


> +
> +config KASAN_OPERATION_DEBUG
> +       bool "Debug operation mode"
> +       depends on KASAN
> +       help
> +         The default mode. Full functionality and all boot parameters
> +         available.
> +
> +config KASAN_OPERATION_MITIGATION
> +       bool "Mitigation operation mode"
> +       depends on KASAN
> +       help
> +         Operation mode dedicated to faster operation at the cost of less
> +         information collection. Disables stacktrace saving for faster
> +         allocations and forces panic on KASAN error to mitigate malicious
> +         attacks.
> +
> +endchoice
> +
>  endif # KASAN
> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> index ee9e406b0cdb..ae989d3bd919 100644
> --- a/mm/kasan/report.c
> +++ b/mm/kasan/report.c
> @@ -47,7 +47,11 @@ enum kasan_arg_fault {
>         KASAN_ARG_FAULT_PANIC_ON_WRITE,
>  };
>
> +#ifdef CONFIG_KASAN_OPERATION_MITIGATION
> +static enum kasan_arg_fault kasan_arg_fault __ro_after_init = KASAN_ARG_FAULT_PANIC;
> +#else
>  static enum kasan_arg_fault kasan_arg_fault __ro_after_init = KASAN_ARG_FAULT_DEFAULT;
> +#endif
>
>  /* kasan.fault=report/panic */
>  static int __init early_kasan_fault(char *arg)
> diff --git a/mm/kasan/tags.c b/mm/kasan/tags.c
> index c111d98961ed..2414cddeaaf3 100644
> --- a/mm/kasan/tags.c
> +++ b/mm/kasan/tags.c
> @@ -78,6 +78,11 @@ early_param("kasan.stack_ring_size", early_kasan_flag_stack_ring_size);
>
>  void __init kasan_init_tags(void)
>  {
> +       if (IS_ENABLED(CONFIG_KASAN_OPERATION_MITIGATION)) {
> +               static_branch_disable(&kasan_flag_stacktrace);
> +               return;
> +       }
> +
>         switch (kasan_arg_stacktrace) {
>         case KASAN_ARG_STACKTRACE_DEFAULT:
>                 /* Default is specified by kasan_flag_stacktrace definition. */
> --
> 2.47.1
>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 08/15] x86: Physical address comparisons in fill_p*d/pte
  2025-02-04 17:33 ` [PATCH 08/15] x86: Physical address comparisons in fill_p*d/pte Maciej Wieczor-Retman
@ 2025-02-06  0:57   ` Dave Hansen
  2025-02-07 16:37     ` Maciej Wieczor-Retman
  0 siblings, 1 reply; 45+ messages in thread
From: Dave Hansen @ 2025-02-06  0:57 UTC (permalink / raw)
  To: Maciej Wieczor-Retman, luto, xin, kirill.shutemov, palmer, tj,
	andreyknvl, brgerst, ardb, dave.hansen, jgross, will, akpm, arnd,
	corbet, dvyukov, richard.weiyang, ytcoode, tglx, hpa, seanjc,
	paul.walmsley, aou, justinstitt, jason.andryuk, glider, ubizjak,
	jannh, bhe, vincenzo.frascino, rafael.j.wysocki, ndesaulniers,
	mingo, catalin.marinas, junichi.nomura, nathan, ryabinin.a.a,
	dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

On 2/4/25 09:33, Maciej Wieczor-Retman wrote:
> @@ -287,7 +287,7 @@ static pte_t *fill_pte(pmd_t *pmd, unsigned long vaddr)
>  	if (pmd_none(*pmd)) {
>  		pte_t *pte = (pte_t *) spp_getpage();
>  		pmd_populate_kernel(&init_mm, pmd, pte);
> -		if (pte != pte_offset_kernel(pmd, 0))
> +		if (__pa(pte) != __pa(pte_offset_kernel(pmd, 0)))
>  			printk(KERN_ERR "PAGETABLE BUG #03!\n");
>  	}
>  	return pte_offset_kernel(pmd, vaddr);

Maciej, could you do a quick check on this and make sure that it doesn't
hurt code generation on current kernels?

pte_offset_kernel() has an internal __va() so this ends up logically
being something like:

-	if (     pte  !=      __va(pmd))
+	if (__pa(pte) != __pa(__va(pmd)))

The __pa() and __va() obviously logically cancel each other out in the
new version. But if the compiler for whatever reason can't figure this
out we might end up with worse code.

If it generates crummy code we might want to do this differently like
avoiding pte_offset_kernel() and adding some other helper that's more
direct and to the point.
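
For example, something along these lines would avoid the __va()/__pa()
round trip entirely (untested sketch):

	if (__pa(pte) != PFN_PHYS(pmd_pfn(*pmd)))
		printk(KERN_ERR "PAGETABLE BUG #03!\n");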


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-05 18:51     ` Christoph Lameter (Ampere)
@ 2025-02-06  1:05       ` Jessica Clarke
  2025-02-06 19:11         ` Christoph Lameter (Ampere)
  0 siblings, 1 reply; 45+ messages in thread
From: Jessica Clarke @ 2025-02-06  1:05 UTC (permalink / raw)
  To: Christoph Lameter (Ampere)
  Cc: Maciej Wieczor-Retman, luto, xin, kirill.shutemov, palmer, tj,
	andreyknvl, brgerst, ardb, dave.hansen, jgross, will, akpm, arnd,
	corbet, dvyukov, richard.weiyang, ytcoode, tglx, hpa, seanjc,
	paul.walmsley, aou, justinstitt, jason.andryuk, glider, ubizjak,
	jannh, bhe, vincenzo.frascino, rafael.j.wysocki, ndesaulniers,
	mingo, catalin.marinas, junichi.nomura, nathan, ryabinin.a.a,
	dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, kees, kasan-dev, x86, linux-arm-kernel,
	linux-riscv, linux-kernel, linux-mm, llvm, linux-doc

On 5 Feb 2025, at 18:51, Christoph Lameter (Ampere) <cl@gentwo.org> wrote:
> 
> On Tue, 4 Feb 2025, Jessica Clarke wrote:
> 
>> It’s not “no performance penalty”, there is a cost to tracking the MTE
>> tags for checking. In asynchronous (or asymmetric) mode that’s not too
> 
> 
> On Ampere Processor hardware there is no penalty since the logic is built
> into the usual read/write paths. This is by design. There may be a penalty
> on other platforms that cannot do this.

You helpfully cut out all the explanation of where the performance
penalty comes from. But if it’s as you say, I can only assume your
design chooses to stall all stores until they have actually been
written, in which case you have a performance cost compared with
hardware that omits MTE or optimises for non-synchronous MTE. The
literature on MTE agrees that it is not zero-penalty (though it can be
low-penalty). I don’t really want to have some big debate here about
the ins and outs of MTE, it’s not the place for it, but I will stand
up and point out that claiming MTE to be “no performance penalty”
misrepresents the truth.

Jess



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-05 23:40 ` Andrey Konovalov
@ 2025-02-06 10:40   ` Maciej Wieczor-Retman
  2025-02-06 18:10     ` Andrey Konovalov
  0 siblings, 1 reply; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-06 10:40 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

Hello Andrey!

On 2025-02-06 at 00:40:59 +0100, Andrey Konovalov wrote:
>On Tue, Feb 4, 2025 at 6:34 PM Maciej Wieczor-Retman
><maciej.wieczor-retman@intel.com> wrote:
>>
>> ======= Introduction
>> The patchset aims to add a KASAN tag-based mode for the x86 architecture
>> with the help of the new CPU feature called Linear Address Masking
>> (LAM). Main improvement introduced by the series is 4x lower memory
>> usage compared to KASAN's generic mode, the only currently available
>> mode on x86.
>>
>> There are two logical parts to this series. The first one attempts to
>> add a new memory saving mechanism called "dense mode" to the generic
>> part of the tag-based KASAN code. The second one focuses on implementing
>> and enabling the tag-based mode for the x86 architecture by using LAM.
>
>Hi Maciej,
>
>Awesome work! Great to see SW_TAGS mode supported on x86!

Glad to hear that, it was a lot of fun to work on :)

>
>I started reviewing the patches, but this is somewhat complicated, as
>the dense mode changes are squashed together with the generic ones for
>x86 support. Could you please split this series into 2? Or at least
>reorder the patches so that everything needed for basic x86 support
>comes first and can be reviewed and tested separately.

I'll try reordering first and see if it looks nice. Since the dense mode would
make some parts arch specific I think it's better to have the two parts in one
series for easier reference. But if it turns out more convoluted I'll just split
it as you suggested.

>
>I will post the comments for things I noted so far, including for the
>dense mode changes, but I'll take a closer look after the split.
>
>Also feel free to drop the dependency on that risc-v series, as it
>doesn't get updated very often. But up to you.

Okay, I was mostly interested in the patch that redefines KASAN_SHADOW_END as
KASAN_SHADOW_OFFSET and then gets shadow addresses by using a signed offset. But
I suppose I can just take that patch and prepend my series with that? (after
applying your comments from that series)

>
>And please also update all affected parts of Documentation/dev-tools/kasan.rst.

Right, thanks for the reminder :)

>
>Thank you!

-- 
Kind regards
Maciej Wieczór-Retman


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 04/15] kasan: arm64: x86: risc-v: Make special tags arch specific
  2025-02-05 20:20   ` Palmer Dabbelt
@ 2025-02-06 11:22     ` Maciej Wieczor-Retman
  0 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-06 11:22 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: luto, xin, kirill.shutemov, tj, andreyknvl, brgerst,
	Ard Biesheuvel, dave.hansen, jgross, Will Deacon, akpm,
	Arnd Bergmann, corbet, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, Paul Walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, Catalin Marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees, kasan-dev, x86,
	linux-arm-kernel, linux-riscv, linux-kernel, linux-mm, llvm,
	linux-doc, Samuel Holland

Hi!

On 2025-02-05 at 12:20:10 -0800, Palmer Dabbelt wrote:
>On Tue, 04 Feb 2025 09:33:45 PST (-0800), maciej.wieczor-retman@intel.com wrote:
...
>
>Acked-by: Palmer Dabbelt <palmer@rivosinc.com> # RISC-V
>
>Probably best to keep this along with the rest of the patches, but LMK if you
>want me to point something at the RISC-V tree.

Thanks for looking at the patches! As Andrey suggested, since the risc-v KASAN
series doesn't get updated much, I'll try not to base this series on the risc-v
one. I hope it's okay if I pick up the first patch (or the first few patches)
that are not risc-v related and try to upstream them along with this series?
They were really helpful to my efforts here.

>_______________________________________________
>linux-riscv mailing list
>linux-riscv@lists.infradead.org
>http://lists.infradead.org/mailman/listinfo/linux-riscv

-- 
Kind regards
Maciej Wieczór-Retman


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 01/15] kasan: Allocation enhancement for dense tag-based mode
  2025-02-05 23:43   ` Andrey Konovalov
@ 2025-02-06 12:57     ` Maciej Wieczor-Retman
  2025-02-06 18:14       ` Andrey Konovalov
  0 siblings, 1 reply; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-06 12:57 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On 2025-02-06 at 00:43:46 +0100, Andrey Konovalov wrote:
>On Tue, Feb 4, 2025 at 6:34 PM Maciej Wieczor-Retman
><maciej.wieczor-retman@intel.com> wrote:
>>
>> Tag-based KASAN (on arm64) works by generating a random 8-bit tag and
>> putting it in both the top byte of the pointer (that points to the
>> allocated memory) and into all bytes of shadow memory that correspond to
>> the chunk of allocated regular memory. Each byte of shadow memory covers
>> a 16 byte chunk of allocated memory - a value called KASAN granularity.
>> This means that out-of-bounds memory accesses that happen inside the 16
>> bytes can't be caught.
>>
>> The dense mode reduces the tag width from 8 to 4 bits and stores
>> two tags in one byte of shadow memory - one in the upper 4 bits
>> of the byte and one in the lower 4. This way one byte of shadow memory
>> can cover 32 bytes of allocated memory while still keeping the "16 bytes
>> per one tag" granularity. The lower 4 bits of each shadow byte map bytes
>> of memory with offsets 0-15 and the upper 4 bits map offsets 16-31.
>>
>> Example:
>> The example below shows how the shadow memory looks like after
>> allocating 48 bytes of memory in both normal tag-based mode and the
>> dense mode. The contents of shadow memory are overlaid onto address
>> offsets that they relate to in the allocated kernel memory. Each cell
>> |    | symbolizes one byte of shadow memory.
>>
>> = The regular tag based mode:
>> - Randomly generated 8-bit tag equals 0xAB.
>> - 0xFE is the tag that symbolizes unallocated memory.
>>
>> Shadow memory contents:           |  0xAB  |  0xAB  |  0xAB  |  0xFE  |
>> Shadow memory address offsets:    0        1        2        3        4
>> Allocated memory address offsets: 0        16       32       48       64
>>
>> = The dense tag based mode:
>> - Randomly generated 4-bit tag equals 0xC.
>> - 0xE is the tag that symbolizes unallocated memory.
>>
>> Shadow memory contents:           |0xC 0xC |0xC 0xE |0xE 0xE |0xE 0xE |
>> Shadow memory address offsets:    0        1        2        3        4
>> Allocated memory address offsets: 0        32       64       96       128
>>
>> Add a new config option and defines that can override the standard
>> system of one tag per one shadow byte.
>>
>> Add alternative version of the kasan_poison() that deals with tags not
>> being aligned to byte size in shadow memory.
>>
>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
>> ---
>>  include/linux/kasan.h | 18 ++++++++++++++++++
>>  lib/Kconfig.kasan     | 21 +++++++++++++++++++++
>>  mm/kasan/kasan.h      |  4 +---
>>  mm/kasan/shadow.c     | 33 ++++++++++++++++++++++++++++++---
>>  4 files changed, 70 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
>> index 03b440658817..ea0f5acd875b 100644
>> --- a/include/linux/kasan.h
>> +++ b/include/linux/kasan.h
>> @@ -35,6 +35,24 @@ typedef unsigned int __bitwise kasan_vmalloc_flags_t;
>>
>>  /* Software KASAN implementations use shadow memory. */
>>
>> +#ifdef CONFIG_KASAN_SW_TAGS_DENSE
>> +#define KASAN_GRANULE_SHIFT    (KASAN_SHADOW_SCALE_SHIFT - 1)
>> +#define KASAN_SHADOW_SCALE_SIZE        (1UL << KASAN_SHADOW_SCALE_SHIFT)
>> +static inline u8 kasan_dense_tag(u8 tag)
>> +{
>> +       return (tag << KASAN_TAG_WIDTH | tag);
>> +}
>> +#else
>> +#define KASAN_GRANULE_SHIFT    KASAN_SHADOW_SCALE_SHIFT
>> +#define KASAN_SHADOW_SCALE_SIZE        (1UL << KASAN_GRANULE_SHIFT)
>> +static inline u8 kasan_dense_tag(u8 tag)
>> +{
>> +       return tag;
>> +}
>> +#endif
>> +
>> +#define KASAN_GRANULE_SIZE     (1UL << KASAN_GRANULE_SHIFT)
>> +
>
>Is there a reason these definitions are added to
>include/linux/kasan.h? At least within this patch, they are only used
>within mm/kasan, so let's keep them in mm/kasan/kasan.h.

Parts of the x86 arch code use these later (minimal slab alignment, KASAN shadow
start address), so I thought it was convenient to already have them in place here?

Since I'll be reordering patches I can just move these changes together.

>
>>  #ifdef CONFIG_KASAN_SW_TAGS
>>  /* This matches KASAN_TAG_INVALID. */
>>  #define KASAN_SHADOW_INIT 0xFE
>> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
>> index 98016e137b7f..d08b4e9bf477 100644
>> --- a/lib/Kconfig.kasan
>> +++ b/lib/Kconfig.kasan
>> @@ -19,6 +19,13 @@ config ARCH_DISABLE_KASAN_INLINE
>>           Disables both inline and stack instrumentation. Selected by
>>           architectures that do not support these instrumentation types.
>>
>> +config ARCH_HAS_KASAN_SW_TAGS_DENSE
>> +       bool
>> +       help
>> +         Enables option to compile tag-based KASAN with densely packed tags -
>> +         two 4-bit tags per one byte of shadow memory. Set on architectures
>> +         that have 4-bit tag macros.
>> +
>>  config CC_HAS_KASAN_GENERIC
>>         def_bool $(cc-option, -fsanitize=kernel-address)
>>
>> @@ -223,4 +230,18 @@ config KASAN_EXTRA_INFO
>>           boot parameter, it will add 8 * stack_ring_size bytes of additional
>>           memory consumption.
>>
>> +config KASAN_SW_TAGS_DENSE
>> +       bool "Two 4-bit tags in one shadow memory byte"
>> +       depends on KASAN_SW_TAGS
>> +       depends on ARCH_HAS_KASAN_SW_TAGS_DENSE
>
>I think this should also depend on KASAN_OUTLINE: Clang/GCC aren't
>aware of the dense mode.

I wasn't sure I fully understood how inline/outline interacts with clang/gcc on
x86 (especially since I think some parts are still missing in x86 clang for
tag-based KASAN). So do I understand correctly that compiling with inline
doesn't do anything? If so, is it not doing anything because of missing
compiler code or because of something in the kernel?

>
>> +       help
>> +         Enables packing two tags into one shadow byte to half the memory usage
>> +         compared to normal tag-based mode.
>
>But adds some performance impact?

I tried to measure the performance impact of dense/non-dense but didn't see much
more than noise in my tests. But I'll mention that there is some small
performance impact due to more bit shifts.

>
>> +
>> +         After setting this option, tag width macro is set to 4 and size macros
>> +         are adjusted based on used KASAN_SHADOW_SCALE_SHIFT.
>
>I think this paragraph is an implementation detail and we can drop it.

Okay, will do.

>
>> +
>> +         ARCH_HAS_KASAN_SW_TAGS_DENSE is needed for this option since the
>> +         special tag macros need to be properly set for 4-bit wide tags.
>> +
>>  endif # KASAN
>> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
>> index 72da5ddcceaa..0e04c5e2c405 100644
>> --- a/mm/kasan/kasan.h
>> +++ b/mm/kasan/kasan.h
>> @@ -128,9 +128,7 @@ static inline bool kasan_requires_meta(void)
>>
>>  #endif /* CONFIG_KASAN_GENERIC */
>>
>> -#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
>> -#define KASAN_GRANULE_SIZE     (1UL << KASAN_SHADOW_SCALE_SHIFT)
>> -#else
>> +#ifdef CONFIG_KASAN_HW_TAGS
>>  #include <asm/mte-kasan.h>
>>  #define KASAN_GRANULE_SIZE     MTE_GRANULE_SIZE
>>  #endif
>> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
>> index d6210ca48dda..368503f54b87 100644
>> --- a/mm/kasan/shadow.c
>> +++ b/mm/kasan/shadow.c
>> @@ -123,7 +123,8 @@ EXPORT_SYMBOL(__hwasan_memcpy);
>>
>>  void kasan_poison(const void *addr, size_t size, u8 value, bool init)
>>  {
>> -       void *shadow_start, *shadow_end;
>> +       u8 *shadow_start, *shadow_end, *shadow_start_aligned, *shadow_end_aligned, tag;
>> +       u64 addr64, addr_start_aligned, addr_end_aligned;
>>
>>         if (!kasan_arch_is_ready())
>>                 return;
>> @@ -134,16 +135,42 @@ void kasan_poison(const void *addr, size_t size, u8 value, bool init)
>>          * addresses to this function.
>>          */
>>         addr = kasan_reset_tag(addr);
>> +       addr64 = (u64)addr;
>>
>> -       if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
>> +       if (WARN_ON(addr64 & KASAN_GRANULE_MASK))
>>                 return;
>>         if (WARN_ON(size & KASAN_GRANULE_MASK))
>>                 return;
>>
>>         shadow_start = kasan_mem_to_shadow(addr);
>>         shadow_end = kasan_mem_to_shadow(addr + size);
>> +       addr_start_aligned = round_up(addr64, KASAN_SHADOW_SCALE_SIZE);
>> +       addr_end_aligned = round_down(addr64 + size, KASAN_SHADOW_SCALE_SIZE);
>> +       shadow_start_aligned = kasan_mem_to_shadow((void *)addr_start_aligned);
>> +       shadow_end_aligned = kasan_mem_to_shadow((void *)addr_end_aligned);
>> +
>> +       /* If size is empty just return. */
>> +       if (!size)
>> +               return;
>>
>> -       __memset(shadow_start, value, shadow_end - shadow_start);
>> +       /* Memset the first unaligned tag in shadow memory. */
>> +       if (addr64 % KASAN_SHADOW_SCALE_SIZE) {
>
>So this is required, because KASAN_SHADOW_SCALE_SIZE is 32 but minimal
>slab alignment is still KASAN_GRANULE_SIZE == 16... We should at least
>hide this check under IS_ENABLED(KASAN_SW_TAGS_DENSE).

...
>
>> +               tag = *shadow_start & KASAN_TAG_MASK;
>> +               tag |= value << KASAN_TAG_WIDTH;
>> +               *shadow_start = tag;
>> +       }
>> +
>> +       /* Memset the middle aligned part in shadow memory. */
>> +       tag = kasan_dense_tag(value);
>> +       __memset(shadow_start_aligned, tag, shadow_end_aligned - shadow_start_aligned);
>> +
>> +       /* Memset the last unaligned tag in shadow memory. */
>> +       if ((addr64 + size) % KASAN_SHADOW_SCALE_SIZE) {
>
>Would it be possible to move this part to kasan_poison_last_granule()?
>That functions seems to be serving a similar purpose but for the
>Generic mode.
>
>It might also be cleaner to add a kasan_poison_first_granule() that
>contains the if (addr64 % KASAN_SHADOW_SCALE_SIZE) check.
...
sure, I'll try to move these checks to kasan_poison_first/last_granule.
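
Something along these lines maybe - a sketch only, hidden behind the dense-mode
check you mentioned and mirroring the nibble handling from the hunk above (the
name just follows your suggestion, nothing final):

static void kasan_poison_first_granule(const void *addr, u8 value)
{
        u8 *shadow = kasan_mem_to_shadow(addr);

        if (!IS_ENABLED(CONFIG_KASAN_SW_TAGS_DENSE))
                return;

        /* Keep the low nibble, poison only the high one. */
        if ((u64)addr % KASAN_SHADOW_SCALE_SIZE)
                *shadow = (*shadow & KASAN_TAG_MASK) | (value << KASAN_TAG_WIDTH);
}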

>
>> +               tag = KASAN_TAG_MASK << KASAN_TAG_WIDTH;
>> +               tag &= *shadow_end;
>> +               tag |= value;
>> +               *shadow_end = tag;
>> +       }
>>  }
>>  EXPORT_SYMBOL_GPL(kasan_poison);
>>
>> --
>> 2.47.1
>>

-- 
Kind regards
Maciej Wieczór-Retman



* Re: [PATCH 02/15] kasan: Tag checking with dense tag-based mode
  2025-02-05 23:45   ` Andrey Konovalov
@ 2025-02-06 14:55     ` Maciej Wieczor-Retman
  0 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-06 14:55 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On 2025-02-06 at 00:45:01 +0100, Andrey Konovalov wrote:
>On Tue, Feb 4, 2025 at 6:35 PM Maciej Wieczor-Retman
><maciej.wieczor-retman@intel.com> wrote:
>>
>> In KASAN's tag-based mode (arm64) when a memory access occurs, the tag
>> stored in the top 8 bits of the pointer is compared with tags saved in
>> the region of the shadow memory that maps to memory the pointer points
>> to. If any of the tags in the shadow memory region do not match the one
>> stored in the pointer an error report is generated.
>>
>> With the introduction of the dense mode, tags won't necessarily occupy
>> whole bytes of shadow memory if the previously allocated memory wasn't
>> aligned to 32 bytes - which is the coverage of one shadow byte.
>>
>> Add an alternative implementation of kasan_check_range() that performs
>> special checks on first and last bytes of shadow memory ranges if the
>> originally allocated memory wasn't aligned to 32 bytes.
>>
>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
>> ---
>>  include/linux/kasan.h     | 47 +++++++++++++++-------
>>  mm/kasan/Makefile         |  3 ++
>>  mm/kasan/dense.c          | 83 +++++++++++++++++++++++++++++++++++++++
>>  mm/kasan/kasan.h          |  2 +-
>>  mm/kasan/report.c         |  2 +-
>>  mm/kasan/report_sw_tags.c | 12 ++----
>>  mm/kasan/sw_tags.c        |  8 ++++
>>  7 files changed, 133 insertions(+), 24 deletions(-)
>>  create mode 100644 mm/kasan/dense.c
>>
>> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
>> index ea0f5acd875b..5a3e9bec21c2 100644
>> --- a/include/linux/kasan.h
>> +++ b/include/linux/kasan.h
>> @@ -33,6 +33,20 @@ typedef unsigned int __bitwise kasan_vmalloc_flags_t;
>>
>>  #include <linux/pgtable.h>
>>
>> +#ifndef kasan_mem_to_shadow
>> +static inline void *kasan_mem_to_shadow(const void *addr)
>> +{
>> +       void *scaled;
>> +
>> +       if (IS_ENABLED(CONFIG_KASAN_GENERIC))
>> +               scaled = (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT);
>> +       else
>> +               scaled = (void *)((long)addr >> KASAN_SHADOW_SCALE_SHIFT);
>> +
>> +       return KASAN_SHADOW_OFFSET + scaled;
>> +}
>> +#endif
>
>Any reason this is moved up here?

I think it was necessary for something I added and later removed, and then I
didn't notice it was no longer needed. I'll move it back.

-- 
Kind regards
Maciej Wieczór-Retman



* Re: [PATCH 10/15] x86: KASAN raw shadow memory PTE init
  2025-02-05 23:45   ` Andrey Konovalov
@ 2025-02-06 15:39     ` Maciej Wieczor-Retman
  0 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-06 15:39 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On 2025-02-06 at 00:45:49 +0100, Andrey Konovalov wrote:
>On Tue, Feb 4, 2025 at 6:36 PM Maciej Wieczor-Retman
><maciej.wieczor-retman@intel.com> wrote:
>>
>> In KASAN's generic mode the default value in shadow memory is zero.
>> During initialization, the shadow memory pages are allocated and zeroed.
>>
>> In KASAN's tag-based mode the default tag for the arm64 architecture is
>> 0xFE which corresponds to any memory that should not be accessed. On x86
>> (where tags are 4-bit wide instead of 8-bit wide) that tag is 0xE so
>> during the initializations all the bytes in shadow memory pages should
>> be filled with 0xE or 0xEE if two tags should be packed in one shadow
>> byte.
>>
>> Use memblock_alloc_try_nid_raw() instead of memblock_alloc_try_nid() to
>> avoid zeroing out the memory so it can be set with the KASAN invalid
>> tag.
>>
>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
>> ---
>>  arch/x86/mm/kasan_init_64.c | 19 ++++++++++++++++---
>>  include/linux/kasan.h       | 25 +++++++++++++++++++++++++
>>  mm/kasan/kasan.h            | 19 -------------------
>>  3 files changed, 41 insertions(+), 22 deletions(-)
>>
>> diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
>> index 9dddf19a5571..55d468d83682 100644
>> --- a/arch/x86/mm/kasan_init_64.c
>> +++ b/arch/x86/mm/kasan_init_64.c
>> @@ -35,6 +35,18 @@ static __init void *early_alloc(size_t size, int nid, bool should_panic)
>>         return ptr;
>>  }
>>
>> +static __init void *early_raw_alloc(size_t size, int nid, bool should_panic)
>> +{
>> +       void *ptr = memblock_alloc_try_nid_raw(size, size,
>> +                       __pa(MAX_DMA_ADDRESS), MEMBLOCK_ALLOC_ACCESSIBLE, nid);
>> +
>> +       if (!ptr && should_panic)
>> +               panic("%pS: Failed to allocate page, nid=%d from=%lx\n",
>> +                     (void *)_RET_IP_, nid, __pa(MAX_DMA_ADDRESS));
>> +
>> +       return ptr;
>> +}
>> +
>>  static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
>>                                       unsigned long end, int nid)
>>  {
>> @@ -64,8 +76,9 @@ static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
>>                 if (!pte_none(*pte))
>>                         continue;
>>
>> -               p = early_alloc(PAGE_SIZE, nid, true);
>> -               entry = pfn_pte(PFN_DOWN(__pa(p)), PAGE_KERNEL);
>> +               p = early_raw_alloc(PAGE_SIZE, nid, true);
>> +               memset(p, kasan_dense_tag(KASAN_SHADOW_INIT), PAGE_SIZE);
>> +               entry = pfn_pte(PFN_DOWN(__pa_nodebug(p)), PAGE_KERNEL);
>>                 set_pte_at(&init_mm, addr, pte, entry);
>>         } while (pte++, addr += PAGE_SIZE, addr != end);
>>  }
>> @@ -437,7 +450,7 @@ void __init kasan_init(void)
>>          * it may contain some garbage. Now we can clear and write protect it,
>>          * since after the TLB flush no one should write to it.
>>          */
>> -       memset(kasan_early_shadow_page, 0, PAGE_SIZE);
>> +       kasan_poison(kasan_early_shadow_page, PAGE_SIZE, KASAN_SHADOW_INIT, false);
>>         for (i = 0; i < PTRS_PER_PTE; i++) {
>>                 pte_t pte;
>>                 pgprot_t prot;
>> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
>> index 83146367170a..af8272c74409 100644
>> --- a/include/linux/kasan.h
>> +++ b/include/linux/kasan.h
>> @@ -151,6 +151,31 @@ static __always_inline void kasan_unpoison_range(const void *addr, size_t size)
>>                 __kasan_unpoison_range(addr, size);
>>  }
>>
>> +#ifdef CONFIG_KASAN_HW_TAGS
>> +
>> +static inline void kasan_poison(const void *addr, size_t size, u8 value, bool init)
>> +{
>> +       if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
>> +               return;
>> +       if (WARN_ON(size & KASAN_GRANULE_MASK))
>> +               return;
>> +
>> +       hw_set_mem_tag_range(kasan_reset_tag(addr), size, value, init);
>> +}
>> +
>> +#else /* CONFIG_KASAN_HW_TAGS */
>> +
>> +/**
>> + * kasan_poison - mark the memory range as inaccessible
>> + * @addr - range start address, must be aligned to KASAN_GRANULE_SIZE
>> + * @size - range size, must be aligned to KASAN_GRANULE_SIZE
>> + * @value - value that's written to metadata for the range
>> + * @init - whether to initialize the memory range (only for hardware tag-based)
>> + */
>> +void kasan_poison(const void *addr, size_t size, u8 value, bool init);
>> +
>> +#endif /* CONFIG_KASAN_HW_TAGS */
>
>Please keep kasan_poison() and kasan_unpoison() in mm/kasan/kasan.h:
>these are intended as internal-only functions (perhaps, we should add
>this into the comment). Instead, add a purpose-specific wrapper
>similar to the ones in include/linux/kasan.h.
>

Okay, got it, I'll pass it through a wrapper.
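
Something like this, perhaps (sketch only; kasan_init_shadow_page() is a made-up
name, not an existing API):

/* include/linux/kasan.h */
void __kasan_init_shadow_page(const void *page);
static __always_inline void kasan_init_shadow_page(const void *page)
{
        if (kasan_enabled())
                __kasan_init_shadow_page(page);
}

/* mm/kasan/shadow.c - kasan_poison() stays internal to mm/kasan */
void __kasan_init_shadow_page(const void *page)
{
        kasan_poison(page, PAGE_SIZE, KASAN_SHADOW_INIT, false);
}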


-- 
Kind regards
Maciej Wieczór-Retman



* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-06 10:40   ` Maciej Wieczor-Retman
@ 2025-02-06 18:10     ` Andrey Konovalov
  0 siblings, 0 replies; 45+ messages in thread
From: Andrey Konovalov @ 2025-02-06 18:10 UTC (permalink / raw)
  To: Maciej Wieczor-Retman
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On Thu, Feb 6, 2025 at 11:41 AM Maciej Wieczor-Retman
<maciej.wieczor-retman@intel.com> wrote:
>
> >I started reviewing the patches, but this is somewhat complicated, as
> >the dense mode changes are squashed together with the generic ones for
> >x86 support. Could you please split this series into 2? Or at least
> >reorder the patches so that everything needed for basic x86 support
> >comes first and can be reviewed and tested separately.
>
> I'll try reordering first and see if it looks nice. Since the dense mode would
> make some parts arch specific I think it's better to have the two parts in one
> series for easier reference. But if it turns out more convoluted I'll just split
> it as you suggested.

Yes, please do. I also think if you split the series, we can land the
basic x86 support fairly quickly, or at least I can do the review and
give the ack from the KASAN side. For the dense mode part, I'd like to
also hear the opinion of other KASAN developers wrt the overall
design.

> >Also feel free to drop the dependency on that risc-v series, as it
> >doesn't get updated very often. But up to you.
>
> Okay, I was mostly interested in the patch that redefines KASAN_SHADOW_END as
> KASAN_SHADOW_OFFSET and then gets shadow addresses by using a signed offset. But
> I suppose I can just take that patch and prepend my series with that? (after
> applying your comments from that series)

Sounds good to me!



* Re: [PATCH 01/15] kasan: Allocation enhancement for dense tag-based mode
  2025-02-06 12:57     ` Maciej Wieczor-Retman
@ 2025-02-06 18:14       ` Andrey Konovalov
  0 siblings, 0 replies; 45+ messages in thread
From: Andrey Konovalov @ 2025-02-06 18:14 UTC (permalink / raw)
  To: Maciej Wieczor-Retman
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On Thu, Feb 6, 2025 at 1:58 PM Maciej Wieczor-Retman
<maciej.wieczor-retman@intel.com> wrote:
>
> >Is there a reason these definitions are added to
> >include/linux/kasan.h? At least within this patch, they are only used
> >within mm/kasan, so let's keep them in mm/kasan/kasan.h.
>
> Parts of the x86 arch code use these later (minimal slab alignment, kasan shadow
> start address) so I thought it was convenient to already have them in place here?

AFAICT, KASAN_SHADOW_START only relies on KASAN_SHADOW_SCALE_SHIFT,
which is defined in arch/x86/include/asm/kasan.h anyway.

And ARCH_SLAB_MINALIGN is defined in asm headers, so the definitions
from include/linux/kasan.h shouldn't be visible to it?

I think what we need to do is define KASAN_GRANULE_SHIFT next to
KASAN_SHADOW_SCALE_SHIFT for x86 and then use it in mm/kasan/kasan.h
to define KASAN_GRANULE_SIZE for SW_TAGS. (Similarly as with arm64,
where ARCH_SLAB_MINALIGN depends on either KASAN_SHADOW_SCALE_SHIFT or
MTE_GRANULE_SIZE, both of which are defined in arm64 asm headers.)
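
Roughly like this, I suppose - a sketch only, with values illustrative for the
dense configuration (32-byte shadow coverage, 16-byte granule) and the cache.h
hookup hand-waved:

/* arch/x86/include/asm/kasan.h */
#define KASAN_SHADOW_SCALE_SHIFT        5
#define KASAN_GRANULE_SHIFT             4

/* arch/x86/include/asm/cache.h (assuming the KASAN header is visible here) */
#define ARCH_SLAB_MINALIGN              (1 << KASAN_GRANULE_SHIFT)

/* mm/kasan/kasan.h, for CONFIG_KASAN_SW_TAGS */
#ifndef KASAN_GRANULE_SHIFT
#define KASAN_GRANULE_SHIFT             KASAN_SHADOW_SCALE_SHIFT
#endif
#define KASAN_GRANULE_SIZE              (1UL << KASAN_GRANULE_SHIFT)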

Btw, I think ARCH_SLAB_MINALIGN needs to be defined in
include/asm/cache.h: at least all other architectures have it there.

> Since I'll be reordering patches I can just move these changes together.

Otherwise, if you need to expose something new in
include/linux/kasan.h, please do it together with the change that uses
it. Or you can even put it into a separate patch with an explanation
of why it's required - at least from the review perspective having
separate smaller patches is often better.

In general, if something doesn't need to get exposed to the rest of
the kernel, keep it in mm/kasan/kasan.h.

> >I think this should also depend on KASAN_OUTLINE: Clang/GCC aren't
> >aware of the dense mode.
>
> I wasn't sure I fully understood how inline/outline interacts with clang/gcc on
> x86 (especially since I think some parts are still missing in x86 clang for
> tag-based KASAN). So I understand that compiling with inline doesn't do
> anything? If so, is it not doing anything because of missing compiler code or
> something in the kernel?

With inline instrumentation, the compiler directly embeds the
instructions to calculate the shadow address and check the shadow
value. Since the compiler assumes that one shadow byte corresponds to
16 bytes of memory and not 32, the generated instructions won't be
compatible with the dense mode. With outline instrumentation, the
compiler just adds function calls and thus all the shadow calculations
are performed by the C code.
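
As a rough illustration (hand-written pseudo-C, not actual compiler output), the
inline case bakes the granule into every access, while the outline case only
emits a call:

/*
 * Pseudo-C of what inline instrumentation embeds for an 8-byte access;
 * the >> 4 shift is the 16-byte granule assumption that clashes with a
 * 32-byte-per-shadow-byte dense layout.
 */
static __always_inline void inline_style_check(const void *p)
{
        u8 ptr_tag = (u64)p >> 56;
        s8 *shadow = (s8 *)(((s64)(u64)p >> 4) + KASAN_SHADOW_OFFSET);

        if (unlikely(*shadow != (s8)ptr_tag))
                kasan_report(p, 8, false, _RET_IP_);
}

/*
 * Outline instrumentation only emits something like
 * __hwasan_load8_noabort(p), so the check lives in C code that can be
 * taught about 4-bit tags and 32-byte shadow coverage.
 */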

Or did the dense mode work for you with KASAN_INLINE enabled? I would
expect this not to work. Or maybe the inline instrumentation somehow
got auto-disabled...

> >Would it be possible to move this part to kasan_poison_last_granule()?
> >That functions seems to be serving a similar purpose but for the
> >Generic mode.
> >
> >It might also be cleaner to add a kasan_poison_first_granule() that
> >contains the if (addr64 % KASAN_SHADOW_SCALE_SIZE) check.
> ...
> sure, I'll try to move these checks to kasan_poison_first/last_granule.

For kasan_poison_last_granule(), I think the change makes sense. For
kasan_poison_first_granule(), please check whether it gives any
readability benefit - if kasan_poison() is the only caller, maybe
adding another function is not worth it.



* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-06  1:05       ` Jessica Clarke
@ 2025-02-06 19:11         ` Christoph Lameter (Ampere)
  2025-02-06 21:41           ` Dave Hansen
  2025-02-06 22:56           ` Andrey Konovalov
  0 siblings, 2 replies; 45+ messages in thread
From: Christoph Lameter (Ampere) @ 2025-02-06 19:11 UTC (permalink / raw)
  To: Jessica Clarke
  Cc: Maciej Wieczor-Retman, luto, xin, kirill.shutemov, palmer, tj,
	andreyknvl, brgerst, ardb, dave.hansen, jgross, will, akpm, arnd,
	corbet, dvyukov, richard.weiyang, ytcoode, tglx, hpa, seanjc,
	paul.walmsley, aou, justinstitt, jason.andryuk, glider, ubizjak,
	jannh, bhe, vincenzo.frascino, rafael.j.wysocki, ndesaulniers,
	mingo, catalin.marinas, junichi.nomura, nathan, ryabinin.a.a,
	dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, kees, kasan-dev, x86, linux-arm-kernel,
	linux-riscv, linux-kernel, linux-mm, llvm, linux-doc

On Thu, 6 Feb 2025, Jessica Clarke wrote:

> On 5 Feb 2025, at 18:51, Christoph Lameter (Ampere) <cl@gentwo.org> wrote:
> > On Ampere Processor hardware there is no penalty since the logic is built
> > into the usual read/write paths. This is by design. There may be a penalty
> > on other platforms that cannot do this.
>
> You helpfully cut out all the explanation of where the performance
> penalty comes from. But if it’s as you say I can only assume your
> design chooses to stall all stores until they have actually been written, in
> which case you have a performance cost compared with hardware that
> omitted MTE or optimises for non-synchronous MTE. The literature on MTE
> agrees that it is not no penalty (but can be low penalty). I don’t
> really want to have some big debate here about the ins and outs of MTE,
> it’s not the place for it, but I will stand up and point out that
> claiming MTE to be “no performance penalty” is misrepresentative of the
> truth

I cannot share details since this information has not been released to the
public yet. I hear that a whitepaper will be coming soon to explain this
feature. The AmpereOne processors were released a couple of months ago.

I also see that KASAN_HW_TAGS exists but this means that the tags can only
be used with CONFIG_KASAN, which is a kernel configuration for debug
purposes.

What we are interested in is a *production* implementation with minimal
software overhead that will be the default on ARM64 if the appropriate
hardware is detected. That in turn will hopefully allow other software
instrumentation that is currently used to keep small objects secure and in
turn creates overhead.


* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-06 19:11         ` Christoph Lameter (Ampere)
@ 2025-02-06 21:41           ` Dave Hansen
  2025-02-07  7:41             ` Maciej Wieczor-Retman
  2025-02-06 22:56           ` Andrey Konovalov
  1 sibling, 1 reply; 45+ messages in thread
From: Dave Hansen @ 2025-02-06 21:41 UTC (permalink / raw)
  To: Christoph Lameter (Ampere), Jessica Clarke
  Cc: Maciej Wieczor-Retman, luto, xin, kirill.shutemov, palmer, tj,
	andreyknvl, brgerst, ardb, dave.hansen, jgross, will, akpm, arnd,
	corbet, dvyukov, richard.weiyang, ytcoode, tglx, hpa, seanjc,
	paul.walmsley, aou, justinstitt, jason.andryuk, glider, ubizjak,
	jannh, bhe, vincenzo.frascino, rafael.j.wysocki, ndesaulniers,
	mingo, catalin.marinas, junichi.nomura, nathan, ryabinin.a.a,
	dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, kees, kasan-dev, x86, linux-arm-kernel,
	linux-riscv, linux-kernel, linux-mm, llvm, linux-doc, Shutemov,
	Kirill

On 2/6/25 11:11, Christoph Lameter (Ampere) wrote:
> I also see that KASAN_HW_TAGS exist but this means that the tags can only
> be used with CONFIG_KASAN which is a kernel configuration for debug
> purposes.
> 
> What we are interested in is a *production* implementation with minimal
> software overhead that will be the default on ARM64 if the appropriate
> hardware is detected. 

Ahh, interesting. I'd assumed that once folks had in-hardware tag checks
that they'd just turn on CONFIG_KASAN and be happy.  Guess not!

> That in turn will hopefully allow other software instrumentation
> that is currently used to keep small objects secure and in turn
> creates overhead.
OK, so KASAN as-is is too broad. Are you saying that the kernel
_currently_ has "software instrumentation" like SLAB
redzoning/poisoning and you'd like to see MTE used to replace those?

Are you just interested in small objects?  What counts as small?  I
assume it's anything roughly <PAGE_SIZE.



* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-06 19:11         ` Christoph Lameter (Ampere)
  2025-02-06 21:41           ` Dave Hansen
@ 2025-02-06 22:56           ` Andrey Konovalov
  1 sibling, 0 replies; 45+ messages in thread
From: Andrey Konovalov @ 2025-02-06 22:56 UTC (permalink / raw)
  To: Christoph Lameter (Ampere)
  Cc: Jessica Clarke, Maciej Wieczor-Retman, luto, xin,
	kirill.shutemov, palmer, tj, brgerst, ardb, dave.hansen, jgross,
	will, akpm, arnd, corbet, dvyukov, richard.weiyang, ytcoode,
	tglx, hpa, seanjc, paul.walmsley, aou, justinstitt,
	jason.andryuk, glider, ubizjak, jannh, bhe, vincenzo.frascino,
	rafael.j.wysocki, ndesaulniers, mingo, catalin.marinas,
	junichi.nomura, nathan, ryabinin.a.a, dennis, bp, kevinloughlin,
	morbo, dan.j.williams, julian.stecklina, peterz, kees, kasan-dev,
	x86, linux-arm-kernel, linux-riscv, linux-kernel, linux-mm, llvm,
	linux-doc

On Thu, Feb 6, 2025 at 8:21 PM 'Christoph Lameter (Ampere)' via
kasan-dev <kasan-dev@googlegroups.com> wrote:
>
> I cannot share details since this information has not been released to the
> public yet. I hear that a whitepaper will be coming soon to explain this
> feature. The AmpereOne processors were released a couple of months ago.
>
> I also see that KASAN_HW_TAGS exists but this means that the tags can only
> be used with CONFIG_KASAN, which is a kernel configuration for debug
> purposes.
>
> What we are interested in is a *production* implementation with minimal
> software overhead that will be the default on ARM64 if the appropriate
> hardware is detected. That in turn will hopefully allow other software
> instrumentation that is currently used to keep small objects secure and in
> turn creates overhead.

Is there anything specific CONFIG_KASAN + CONFIG_KASAN_HW_TAGS do that
is not good enough for a production environment?

The last time I did some perf tests (a year+ ago on Pixel 8, I
believe), the two expensive parts of CONFIG_KASAN_HW_TAGS were:

1. Collecting stack traces. This can now be disabled via
kasan.stacktrace=off. And there's a tracking bug to add a
production-grade implementation [1];

2. Assigning memory tags to large allocations, specifically page_alloc
allocations with large orders (AFAIR it was specifically assigning
the tags, not checking them). This can now be controlled via
kasan.page_alloc.sample(.order) - see the example below.
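
For example, a boot command line that keeps tag checking but drops most of that
cost could look like this (sampling values picked arbitrarily):

    kasan.stacktrace=off kasan.page_alloc.sample=16 kasan.page_alloc.sample.order=3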

There's definitely room for optimization and additional config options
that cut down KASAN checks (for example, disabling tag checking of
mempool allocations; although arguably, people might want to have this
in a production environment).

Otherwise, it's unclear to me what a new production-grade MTE
implementation would do differently compared to KASAN_HW_TAGS. But if
there's something, we can just adjust KASAN_HW_TAGS instead.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=211785



* Re: [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
  2025-02-06 21:41           ` Dave Hansen
@ 2025-02-07  7:41             ` Maciej Wieczor-Retman
  0 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-07  7:41 UTC (permalink / raw)
  To: Dave Hansen, Christoph Lameter (Ampere), andreyknvl
  Cc: Jessica Clarke, luto, xin, kirill.shutemov, palmer, tj, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc, Shutemov, Kirill

On 2025-02-06 at 13:41:29 -0800, Dave Hansen wrote:
>On 2/6/25 11:11, Christoph Lameter (Ampere) wrote:
>> I also see that KASAN_HW_TAGS exists but this means that the tags can only
>> be used with CONFIG_KASAN, which is a kernel configuration for debug
>> purposes.
>> 
>> What we are interested in is a *production* implementation with minimal
>> software overhead that will be the default on ARM64 if the appropriate
>> hardware is detected. 
>
>Ahh, interesting. I'd assumed that once folks had in-hardware tag checks
>that they'd just turn on CONFIG_KASAN and be happy.  Guess not!
>
>> That in turn will hopefully allow other software instrumentation
>> that is currently used to keep small objects secure and in turn
>> creates overhead.
>OK, so KASAN as-is is too broad. Are you saying that the kernel
>_currently_ has "software instrumentation" like SLAB
>redzoning/poisoning and you'd like to see MTE used to replace those?

I share Andrey's opinion that in the hardware KASAN mode (with MTE on arm64),
after disabling stacktraces (which in my tests in the software tag-based mode
took up ~90% of the allocation time for small kmalloc() calls) and tweaking the
bigger allocations, there doesn't seem to be anything left in KASAN that'd be
slowing things down.

Obviously this series deals with the software tag-based mode, which will suffer
from all the performance penalties of software instrumentation. So while it's
still a debugging feature, at least it gains 2x-4x memory savings over the
generic mode already present on x86.

>
>Are you just interested in small objects?  What counts as small?  I
>assume it's anything roughly <PAGE_SIZE.

Would disabling vmalloc instrumentation achieve something like this? That is
tweakable during compilation.


-- 
Kind regards
Maciej Wieczór-Retman



* Re: [PATCH 15/15] kasan: Add mitigation and debug modes
  2025-02-05 23:46   ` Andrey Konovalov
@ 2025-02-07  9:08     ` Maciej Wieczor-Retman
  0 siblings, 0 replies; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-07  9:08 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: luto, xin, kirill.shutemov, palmer, tj, brgerst, ardb,
	dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On 2025-02-06 at 00:46:21 +0100, Andrey Konovalov wrote:
>On Tue, Feb 4, 2025 at 6:37 PM Maciej Wieczor-Retman
><maciej.wieczor-retman@intel.com> wrote:
...
>> +choice
>> +       prompt "KASAN operation mode"
>> +       default KASAN_OPERATION_DEBUG
>> +       help
>> +         Choose between the mitigation and debug operation modes.
>> +
>> +         The first one disables stacktrace saving and enables panic on error.
>> +         Faster memory allocation but less information. The second one is the
>> +         default where KASAN operates with full functionality.
>
>This is something that I thought about before and I think we should
>_not_ add configuration options like these. The distinction between
>debug and mitigation modes is something that's specific to a
>particular user of the feature. Some might prefer to take the impact
>of having stack traces enabled in a production environment to allow
>debugging in-the-wild exploitation attempts. Also at some point in the
>future, we will hopefully have production-grade stack traces [1], and
>this would thus change the desired behavior of
>KASAN_OPERATION_MITIGATION.
>
>We already have the kasan.stacktrace command-line parameter for
>disabling stack trace collection. On top of that, if you prefer, we
>could add a configuration option that changes the default value of
>kasan_flag_stacktrace (but can still be overridden via the
>kasan.stacktrace command-line parameter). Note though that by default,
>stack traces should be turned on.
>
>[1] https://bugzilla.kernel.org/show_bug.cgi?id=211785
>

Okay, I see your point. I'll drop the patch for now and rethink whether messing with
how stacktraces are enabled/disabled is worth it.
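
If it does turn out to be worth it, maybe just making the boot-time default
configurable is enough - a rough sketch (CONFIG_KASAN_STACKTRACE_DEFAULT is a
made-up option name, and the real flag handling in mm/kasan is more involved):

/* Default comes from Kconfig; kasan.stacktrace= on the command line overrides it. */
static bool kasan_flag_stacktrace __ro_after_init =
                IS_ENABLED(CONFIG_KASAN_STACKTRACE_DEFAULT);

static int __init early_kasan_flag_stacktrace(char *arg)
{
        return kstrtobool(arg, &kasan_flag_stacktrace);
}
early_param("kasan.stacktrace", early_kasan_flag_stacktrace);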


-- 
Kind regards
Maciej Wieczór-Retman



* Re: [PATCH 08/15] x86: Physical address comparisons in fill_p*d/pte
  2025-02-06  0:57   ` Dave Hansen
@ 2025-02-07 16:37     ` Maciej Wieczor-Retman
  2025-02-11 19:59       ` Dave Hansen
  0 siblings, 1 reply; 45+ messages in thread
From: Maciej Wieczor-Retman @ 2025-02-07 16:37 UTC (permalink / raw)
  To: Dave Hansen
  Cc: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On 2025-02-05 at 16:57:15 -0800, Dave Hansen wrote:
>On 2/4/25 09:33, Maciej Wieczor-Retman wrote:
>> @@ -287,7 +287,7 @@ static pte_t *fill_pte(pmd_t *pmd, unsigned long vaddr)
>>  	if (pmd_none(*pmd)) {
>>  		pte_t *pte = (pte_t *) spp_getpage();
>>  		pmd_populate_kernel(&init_mm, pmd, pte);
>> -		if (pte != pte_offset_kernel(pmd, 0))
>> +		if (__pa(pte) != __pa(pte_offset_kernel(pmd, 0)))
>>  			printk(KERN_ERR "PAGETABLE BUG #03!\n");
>>  	}
>>  	return pte_offset_kernel(pmd, vaddr);
>
>Maciej, could you do a quick check on this and make sure that it doesn't
>hurt code generation on current kernels?
>
>pte_offset_kernel() has an internal __va() so this ends up logically
>being something like:
>
>-	if (     pte  !=      __va(pmd))
>+	if (__pa(pte) != __pa(__va(pmd)))
>
>The __pa() and __va() obviously logically cancel each other out in the
>new version. But if the compiler for whatever reason can't figure this
>out we might end up with worse code.

I browsed through the assembly and indeed the __pa(__va()) version generates
longer code than just __va() or kasan_reset_tag(__va()).

How about we just open code the *_offset()? What do you think about the patch
below? We can lose the calls to *_index() because they are all zero, so we're
only left with the inside of the internal __va(). It didn't report any issues in
QEMU at least. The p4d_offset() part isn't very pretty here but I think I can
make it better if you like the idea.

----------------------------------------

x86: Physical address comparisons in fill_p*d/pte

Calculating a page offset returns a pointer without a tag. When comparing
the calculated offset to a tagged page pointer, an error is raised because
they are not equal.

Change the pointer comparisons to physical address comparisons to avoid
issues in KASAN that pointer arithmetic would create. Open code parts of
p*d_offset() to avoid the internal __va(), which complicates the generated
assembly.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
 arch/x86/mm/init_64.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index ff253648706f..89a86ac34d95 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -251,7 +251,10 @@ static p4d_t *fill_p4d(pgd_t *pgd, unsigned long vaddr)
 	if (pgd_none(*pgd)) {
 		p4d_t *p4d = (p4d_t *)spp_getpage();
 		pgd_populate(&init_mm, pgd, p4d);
-		if (p4d != p4d_offset(pgd, 0))
+
+		if (__pa(p4d) != (pgtable_l5_enabled() ?
+				  __pa(pgd) :
+				  (unsigned long)pgd_val(*pgd) & PTE_PFN_MASK))
 			printk(KERN_ERR "PAGETABLE BUG #00! %p <-> %p\n",
 			       p4d, p4d_offset(pgd, 0));
 	}
@@ -263,7 +266,7 @@ static pud_t *fill_pud(p4d_t *p4d, unsigned long vaddr)
 	if (p4d_none(*p4d)) {
 		pud_t *pud = (pud_t *)spp_getpage();
 		p4d_populate(&init_mm, p4d, pud);
-		if (pud != pud_offset(p4d, 0))
+		if (__pa(pud) != (p4d_val(*p4d) & p4d_pfn_mask(*p4d)))
 			printk(KERN_ERR "PAGETABLE BUG #01! %p <-> %p\n",
 			       pud, pud_offset(p4d, 0));
 	}
@@ -275,7 +278,7 @@ static pmd_t *fill_pmd(pud_t *pud, unsigned long vaddr)
 	if (pud_none(*pud)) {
 		pmd_t *pmd = (pmd_t *) spp_getpage();
 		pud_populate(&init_mm, pud, pmd);
-		if (pmd != pmd_offset(pud, 0))
+		if (__pa(pmd) != (pud_val(*pud) & pud_pfn_mask(*pud)))
 			printk(KERN_ERR "PAGETABLE BUG #02! %p <-> %p\n",
 			       pmd, pmd_offset(pud, 0));
 	}
@@ -287,7 +290,7 @@ static pte_t *fill_pte(pmd_t *pmd, unsigned long vaddr)
 	if (pmd_none(*pmd)) {
 		pte_t *pte = (pte_t *) spp_getpage();
 		pmd_populate_kernel(&init_mm, pmd, pte);
-		if (pte != pte_offset_kernel(pmd, 0))
+		if (__pa(pte) != (pmd_val(*pmd) & pmd_pfn_mask(*pmd)))
 			printk(KERN_ERR "PAGETABLE BUG #03!\n");
 	}
 	return pte_offset_kernel(pmd, vaddr);


>
>If it generates crummy code we might want to do this differently like
>avoiding pte_offset_kernel() and adding some other helper that's more
>direct and to the point.

-- 
Kind regards
Maciej Wieczór-Retman



* Re: [PATCH 08/15] x86: Physical address comparisons in fill_p*d/pte
  2025-02-07 16:37     ` Maciej Wieczor-Retman
@ 2025-02-11 19:59       ` Dave Hansen
  0 siblings, 0 replies; 45+ messages in thread
From: Dave Hansen @ 2025-02-11 19:59 UTC (permalink / raw)
  To: Maciej Wieczor-Retman
  Cc: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet, dvyukov,
	richard.weiyang, ytcoode, tglx, hpa, seanjc, paul.walmsley, aou,
	justinstitt, jason.andryuk, glider, ubizjak, jannh, bhe,
	vincenzo.frascino, rafael.j.wysocki, ndesaulniers, mingo,
	catalin.marinas, junichi.nomura, nathan, ryabinin.a.a, dennis,
	bp, kevinloughlin, morbo, dan.j.williams, julian.stecklina,
	peterz, cl, kees, kasan-dev, x86, linux-arm-kernel, linux-riscv,
	linux-kernel, linux-mm, llvm, linux-doc

On 2/7/25 08:37, Maciej Wieczor-Retman wrote:
> @@ -287,7 +290,7 @@ static pte_t *fill_pte(pmd_t *pmd, unsigned long vaddr)
>  	if (pmd_none(*pmd)) {
>  		pte_t *pte = (pte_t *) spp_getpage();
>  		pmd_populate_kernel(&init_mm, pmd, pte);
> -		if (pte != pte_offset_kernel(pmd, 0))
> +		if (__pa(pte) != (pmd_val(*pmd) & pmd_pfn_mask(*pmd)))
>  			printk(KERN_ERR "PAGETABLE BUG #03!\n");
>  	}
>  	return pte_offset_kernel(pmd, vaddr);

Open coding it like this is fine with me.  The p*_offset_kernel(p*,0)
thing is arguably even harder to parse.



End of thread (newest message: 2025-02-11 19:59 UTC)

Thread overview: 45+ messages
2025-02-04 17:33 [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 01/15] kasan: Allocation enhancement for dense tag-based mode Maciej Wieczor-Retman
2025-02-05 23:43   ` Andrey Konovalov
2025-02-06 12:57     ` Maciej Wieczor-Retman
2025-02-06 18:14       ` Andrey Konovalov
2025-02-04 17:33 ` [PATCH 02/15] kasan: Tag checking with " Maciej Wieczor-Retman
2025-02-05 23:45   ` Andrey Konovalov
2025-02-06 14:55     ` Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 03/15] kasan: Vmalloc dense tag-based mode support Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 04/15] kasan: arm64: x86: risc-v: Make special tags arch specific Maciej Wieczor-Retman
2025-02-05 20:20   ` Palmer Dabbelt
2025-02-06 11:22     ` Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 05/15] x86: Add arch specific kasan functions Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 06/15] x86: Reset tag for virtual to physical address conversions Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 07/15] mm: Pcpu chunk address tag reset Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 08/15] x86: Physical address comparisons in fill_p*d/pte Maciej Wieczor-Retman
2025-02-06  0:57   ` Dave Hansen
2025-02-07 16:37     ` Maciej Wieczor-Retman
2025-02-11 19:59       ` Dave Hansen
2025-02-04 17:33 ` [PATCH 09/15] x86: Physical address comparison in current_mm pgd check Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 10/15] x86: KASAN raw shadow memory PTE init Maciej Wieczor-Retman
2025-02-05 23:45   ` Andrey Konovalov
2025-02-06 15:39     ` Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 11/15] x86: LAM initialization Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 12/15] x86: Minimal SLAB alignment Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 13/15] x86: runtime_const used for KASAN_SHADOW_END Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 14/15] x86: Make software tag-based kasan available Maciej Wieczor-Retman
2025-02-04 17:33 ` [PATCH 15/15] kasan: Add mitigation and debug modes Maciej Wieczor-Retman
2025-02-05 23:46   ` Andrey Konovalov
2025-02-07  9:08     ` Maciej Wieczor-Retman
2025-02-04 18:58 ` [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86 Christoph Lameter (Ampere)
2025-02-04 21:05   ` Dave Hansen
2025-02-05 18:59     ` Christoph Lameter (Ampere)
2025-02-05 23:04       ` Ard Biesheuvel
2025-02-04 23:36   ` Jessica Clarke
2025-02-05 18:51     ` Christoph Lameter (Ampere)
2025-02-06  1:05       ` Jessica Clarke
2025-02-06 19:11         ` Christoph Lameter (Ampere)
2025-02-06 21:41           ` Dave Hansen
2025-02-07  7:41             ` Maciej Wieczor-Retman
2025-02-06 22:56           ` Andrey Konovalov
2025-02-04 23:36   ` Jessica Clarke
2025-02-05 23:40 ` Andrey Konovalov
2025-02-06 10:40   ` Maciej Wieczor-Retman
2025-02-06 18:10     ` Andrey Konovalov
