* [PATCH 00/15] kasan: x86: arm64: risc-v: KASAN tag-based mode for x86
From: Maciej Wieczor-Retman @ 2025-02-04 17:33 UTC
  To: luto, xin, kirill.shutemov, palmer, tj, andreyknvl, brgerst,
	ardb, dave.hansen, jgross, will, akpm, arnd, corbet,
	maciej.wieczor-retman, dvyukov, richard.weiyang, ytcoode, tglx,
	hpa, seanjc, paul.walmsley, aou, justinstitt, jason.andryuk,
	glider, ubizjak, jannh, bhe, vincenzo.frascino, rafael.j.wysocki,
	ndesaulniers, mingo, catalin.marinas, junichi.nomura, nathan,
	ryabinin.a.a, dennis, bp, kevinloughlin, morbo, dan.j.williams,
	julian.stecklina, peterz, cl, kees
  Cc: kasan-dev, x86, linux-arm-kernel, linux-riscv, linux-kernel,
	linux-mm, llvm, linux-doc

======= Introduction
The patchset adds a KASAN tag-based mode for the x86 architecture using
the new CPU feature called Linear Address Masking (LAM). The main
improvement introduced by the series is about 4x lower memory usage
compared to KASAN's generic mode, the only mode currently available on
x86.

There are two logical parts to this series. The first one attempts to
add a new memory saving mechanism called "dense mode" to the generic
part of the tag-based KASAN code. The second one focuses on implementing
and enabling the tag-based mode for the x86 architecture by using LAM.

======= How does KASAN tag-based mode work?
When enabled, memory accesses and allocations are augmented by the
compiler during kernel compilation. Instrumentation functions are added
to each memory allocation and each pointer dereference.

The allocation related functions generate a random tag and save it in
two places: in shadow memory that maps to the allocated memory, and in
the top bits of the pointer that points to the allocated memory. Storing
the tag in the top of the pointer is possible because of Top-Byte Ignore
(TBI) on arm64 architecture and LAM on x86.

The access related functions compare the tag stored in the pointer with
the one stored in shadow memory. If the tags don't match, an invalid
access (for example an out-of-bounds access) must have occurred, and so
an error report is generated.
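
The allocation and access steps above can be sketched in user space.
This is only an illustration under stated assumptions, not the kernel
implementation: the arena size, the function names and the use of the
top byte for the tag (as with arm64 TBI) are all hypothetical choices
made for the example. Addresses are modeled as plain offsets into a toy
arena.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE_SIZE 16u    /* bytes of memory covered by one tag */
#define ARENA_SIZE   256u   /* hypothetical toy "kernel memory" size */
#define TAG_SHIFT    56     /* assumption: tag lives in the top byte */
#define TAG_INVALID  0xFEu  /* marks unallocated memory */

/* One shadow byte per 16-byte granule of the arena. */
static uint8_t shadow[ARENA_SIZE / GRANULE_SIZE];

static void shadow_init(void)
{
	memset(shadow, TAG_INVALID, sizeof(shadow));
}

/* Allocation side: store the tag in shadow memory and in the pointer. */
static uint64_t tag_allocation(uint64_t offset, size_t size, uint8_t tag)
{
	assert(offset % GRANULE_SIZE == 0 && offset + size <= ARENA_SIZE);
	size_t first = offset / GRANULE_SIZE;
	size_t count = (size + GRANULE_SIZE - 1) / GRANULE_SIZE;

	memset(&shadow[first], tag, count);
	return offset | ((uint64_t)tag << TAG_SHIFT);
}

/* Access side: compare the pointer tag with the shadow tag. */
static bool access_ok(uint64_t tagged_ptr)
{
	uint8_t ptr_tag = (uint8_t)(tagged_ptr >> TAG_SHIFT);
	uint64_t offset = tagged_ptr & ~(UINT64_C(0xFF) << TAG_SHIFT);

	assert(offset < ARENA_SIZE);
	return shadow[offset / GRANULE_SIZE] == ptr_tag;
}
```

After shadow_init(), a pointer returned by tag_allocation(32, 48, 0xAB)
passes access_ok() for every byte of the 48-byte object, while the first
byte past the end lands in a granule still marked 0xFE and fails the
check.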

The general idea for the tag-based mode is very well explained in the
series with the original implementation [1].

[1] https://lore.kernel.org/all/cover.1544099024.git.andreyknvl@google.com/

======= What is the new "dense mode"?
Dense mode is introduced to save further memory. Normally, one shadow
byte stores one tag, and that tag covers one 16-byte granule of
allocated memory. In dense mode, one tag still covers 16 bytes of
allocated memory but is shortened from 8 bits to 4 bits, which makes it
possible to store two tags in one shadow memory byte.

=== Example:
The example below shows what the shadow memory looks like after
allocating 48 bytes of memory in both the normal tag-based mode and the
dense mode. The contents of shadow memory are overlaid onto the address
offsets that they relate to in the allocated kernel memory. Each cell
|        | symbolizes one byte of shadow memory.

= The regular tag-based mode:
- Randomly generated 8-bit tag equals 0xAB.
- 0xFE is the tag that symbolizes unallocated memory.

Shadow memory contents:           |  0xAB  |  0xAB  |  0xAB  |  0xFE  |
Shadow memory address offsets:    0        1        2        3        4
Allocated memory address offsets: 0        16       32       48       64

= The dense tag-based mode:
- Randomly generated 4-bit tag equals 0xC.
- 0xE is the tag that symbolizes unallocated memory.

Shadow memory contents:           |0xC 0xC |0xC 0xE |0xE 0xE |0xE 0xE |
Shadow memory address offsets:    0        1        2        3        4
Allocated memory address offsets: 0        32       64       96       128
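
The dense layout from the example can be sketched as nibble packing. The
helper names are hypothetical, and the nibble order is an assumption:
the diagram above does not say which half of a shadow byte belongs to
the lower-addressed granule, so this sketch puts even-numbered granules
in the low nibble.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE   16u   /* bytes of memory covered by one tag */
#define DENSE_TAG_MASK 0xFu  /* 4-bit tags */

/* Assumption: even-numbered granules use the low nibble of a byte. */
static void dense_set_tag(uint8_t *shadow, size_t granule, uint8_t tag)
{
	unsigned shift = (granule & 1) ? 4 : 0;
	uint8_t *byte = &shadow[granule / 2];

	assert(tag <= DENSE_TAG_MASK);
	*byte = (uint8_t)((*byte & ~(DENSE_TAG_MASK << shift)) | (tag << shift));
}

static uint8_t dense_get_tag(const uint8_t *shadow, size_t granule)
{
	unsigned shift = (granule & 1) ? 4 : 0;

	return (shadow[granule / 2] >> shift) & DENSE_TAG_MASK;
}

/* Tag every granule touched by [offset, offset + size). */
static void dense_tag_range(uint8_t *shadow, size_t offset, size_t size,
			    uint8_t tag)
{
	size_t first = offset / GRANULE_SIZE;
	size_t count = (size + GRANULE_SIZE - 1) / GRANULE_SIZE;

	for (size_t i = 0; i < count; i++)
		dense_set_tag(shadow, first + i, tag);
}
```

Starting from four shadow bytes filled with 0xEE, tagging 48 bytes at
offset 0 with 0xC touches three granules: the first shadow byte holds
two 0xC tags, the second holds one 0xC and one 0xE, and the rest stay
untouched, which matches the dense mode diagram.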

=== Dense mode benefits summary
For the small price of a couple of bit shifts, the dense mode uses only
half the shadow memory of the current arm64 tag-based mode, while still
preserving the 16 byte tag granularity, which allows catching
out-of-bounds errors at small offsets.
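
The shadow memory cost of each mode can be estimated with simple
arithmetic. The sketch below is an illustration only: the function name
is hypothetical, it counts shadow memory alone (not stack traces or
other KASAN metadata), and the 8-byte granule used for generic KASAN
comes from the generic mode's one-shadow-byte-per-8-bytes mapping rather
than from this cover letter.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Shadow memory needed to cover mem_bytes when one tag covers
 * `granule` bytes and `tags_per_byte` tags share one shadow byte.
 */
static uint64_t shadow_bytes(uint64_t mem_bytes, unsigned granule,
			     unsigned tags_per_byte)
{
	assert(granule > 0 && tags_per_byte > 0);
	return mem_bytes / ((uint64_t)granule * tags_per_byte);
}
```

For 512 GiB of memory this gives 64 GiB of shadow for generic KASAN
(granule 8, one value per byte), 32 GiB for the regular tag-based mode
(granule 16, one tag per byte) and 16 GiB for the dense mode (granule
16, two tags per byte).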

======= Differences summary compared to the arm64 tag-based mode
- Tag width:
	- Tag width determines the chance of missing a bug because two
	  tags from different allocations happen to have the same
	  value. The bigger the range of possible tag values, the lower
	  the chance of that happening.
	- Shortening the tag width from 8 bits to 4, while helping with
	  memory usage, also increases the chance of not reporting an
	  error. 4-bit tags have a ~7% chance of such a tag collision.

- TBI and LAM:
	- TBI in arm64 allows for storing metadata in the top 8 bits of
	  the virtual address.
	- LAM in x86 allows storing tags in bits [62:57] of the pointer.
	  To maximize memory savings the tag width is reduced to bits
	  [60:57].
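
The tag placement and the collision estimate from the list above can be
sketched as follows. The macro and function names are hypothetical. The
collision estimate assumes uniformly random, independent tags with two
of the sixteen values reserved; that assumption is not stated in this
cover letter, it is simply one way to arrive at a figure close to ~7%.

```c
#include <assert.h>
#include <stdint.h>

#define LAM_TAG_SHIFT 57  /* lowest tag bit, from bits [60:57] */
#define LAM_TAG_WIDTH 4
#define LAM_TAG_MASK \
	(((UINT64_C(1) << LAM_TAG_WIDTH) - 1) << LAM_TAG_SHIFT)

/* Place a 4-bit tag into bits [60:57] of an address. */
static uint64_t lam_set_tag(uint64_t addr, uint8_t tag)
{
	assert(tag < (1u << LAM_TAG_WIDTH));
	return (addr & ~LAM_TAG_MASK) | ((uint64_t)tag << LAM_TAG_SHIFT);
}

/* Read the 4-bit tag back from bits [60:57]. */
static uint8_t lam_get_tag(uint64_t addr)
{
	return (uint8_t)((addr & LAM_TAG_MASK) >> LAM_TAG_SHIFT);
}

/* Chance that two independent, uniformly random tags are equal. */
static double tag_collision_chance(unsigned tag_bits, unsigned reserved)
{
	assert(tag_bits < 32 && reserved < (1u << tag_bits));
	return 1.0 / (double)((1u << tag_bits) - reserved);
}
```

Under these assumptions tag_collision_chance(4, 2) is 1/14, a little
over 7%, while tag_collision_chance(8, 2) is 1/254, or about 0.4%.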

======= Testing
Ran all the KASAN KUnit tests for both software tags and generic KASAN
after making the changes.

In generic mode the results were:

kasan: pass:59 fail:0 skip:13 total:72
Totals: pass:59 fail:0 skip:13 total:72
ok 1 kasan

and for software tags:

kasan: pass:63 fail:0 skip:9 total:72
Totals: pass:63 fail:0 skip:9 total:72
ok 1 kasan

======= Benchmarks
All tests were run on a Sierra Forest server platform with 512GB of
memory. The only differences between the tests were kernel options:
	- CONFIG_KASAN
	- CONFIG_KASAN_GENERIC
	- CONFIG_KASAN_SW_TAGS
	- CONFIG_KASAN_INLINE [1]
	- CONFIG_KASAN_OUTLINE [1]

Used memory in GBs after boot [2][3]:
* 14 for clean kernel
* 91 / 90 for generic KASAN (inline/outline)
* 31 for tag-based KASAN

Boot time (until login prompt):
* 03:48 for clean kernel
* 08:02 / 09:45 for generic KASAN (inline/outline)
* 08:50 for dense tag-based KASAN
* 04:50 for dense tag-based KASAN with stacktrace disabled [4]

Compilation time comparison (10 cores):
* 7:27 for clean kernel
* 8:21 / 7:44 for generic KASAN (inline/outline)
* 7:41 for tag-based KASAN

Network performance [5]:
* 13.7 Gbits/sec for clean kernel
* 2.25 Gbits/sec for generic KASAN inline
* 1.50 Gbits/sec for generic KASAN outline
* 1.55 Gbits/sec for dense tag-based KASAN
* 2.86 Gbits/sec for dense tag-based KASAN with stacktrace disabled

[1] Based on the hwasan and asan compiler parameters used in
scripts/Makefile.kasan, it looks like the inline/outline modes have a
bigger impact on generic mode than on the tag-based mode. In the former,
inlining actually increases the kernel image size and improves
performance. In the latter, choosing the outline mode un-inlines some
code portions for debugging purposes, but no real difference is visible
in performance or kernel image size.

[2] Used "cat /proc/meminfo | grep MemAvailable" and then subtracted
that from the total memory of the system. Initially wanted to use "grep
Slab" similarly to the cover letter for the arm64 tag-based series, but
because the tests were run on a system with 512GB of RAM and memory
usage was more split up between different categories, this better shows
the memory savings.

[3] If the 14 GBs used by the clean kernel are subtracted from the KASAN
measurements, one can see that the tag-based mode uses about 4x less
additional memory than the generic mode.

[4] Memory allocation and freeing performance suffers heavily from saving
stacktraces that can be later displayed in error reports.

[5] Measured as `iperf -s & iperf -c 127.0.0.1 -t 30`.
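
The memory figures from footnotes [2] and [3] can be derived along these
lines. This is a sketch, not the script used for the measurements: the
function names are hypothetical, and taking "total memory of the system"
to mean the MemTotal field is an assumption. The parser works on text in
the /proc/meminfo format so that it can be tried on a saved copy.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Value in kB of a /proc/meminfo-style field, or -1 if it is absent. */
static long long meminfo_field_kb(const char *text, const char *key)
{
	size_t key_len = strlen(key);

	for (const char *line = text; line && *line; ) {
		long long value;

		if (strncmp(line, key, key_len) == 0 && line[key_len] == ':' &&
		    sscanf(line + key_len + 1, "%lld", &value) == 1)
			return value;
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Footnote [2]: used memory = total - MemAvailable (MemTotal assumed). */
static long long meminfo_used_kb(const char *text)
{
	long long total = meminfo_field_kb(text, "MemTotal");
	long long avail = meminfo_field_kb(text, "MemAvailable");

	return (total < 0 || avail < 0) ? -1 : total - avail;
}

/* Footnote [3]: ratio of the extra memory used by the two KASAN modes. */
static double extra_memory_ratio(double clean_gb, double generic_gb,
				 double tagged_gb)
{
	assert(tagged_gb > clean_gb);
	return (generic_gb - clean_gb) / (tagged_gb - clean_gb);
}
```

With the numbers reported above (14 GB clean, 91 GB generic inline, 31
GB tag-based), extra_memory_ratio(14, 91, 31) is 77 / 17, or roughly
4.5.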

======= Compilation
Clang was used to compile the series (make LLVM=1) since gcc doesn't
seem to have support for KASAN tag-based compiler instrumentation on
x86.

======= Dependencies
The series is based on the risc-v series [1] that's currently in review.
Because of this, for the time being it only applies cleanly on top of
the 6.12 mainline kernel. Will rebase on the newest kernel once the
risc-v series is also rebased.

[1] https://lore.kernel.org/all/20241022015913.3524425-1-samuel.holland@sifive.com/

Maciej Wieczor-Retman (15):
  kasan: Allocation enhancement for dense tag-based mode
  kasan: Tag checking with dense tag-based mode
  kasan: Vmalloc dense tag-based mode support
  kasan: arm64: x86: risc-v: Make special tags arch specific
  x86: Add arch specific kasan functions
  x86: Reset tag for virtual to physical address conversions
  mm: Pcpu chunk address tag reset
  x86: Physical address comparisons in fill_p*d/pte
  x86: Physical address comparison in current_mm pgd check
  x86: KASAN raw shadow memory PTE init
  x86: LAM initialization
  x86: Minimal SLAB alignment
  x86: runtime_const used for KASAN_SHADOW_END
  x86: Make software tag-based kasan available
  kasan: Add mitigation and debug modes

 Documentation/arch/x86/x86_64/mm.rst |  6 +-
 MAINTAINERS                          |  2 +-
 arch/arm64/include/asm/kasan-tags.h  |  9 +++
 arch/riscv/include/asm/kasan-tags.h  | 12 ++++
 arch/riscv/include/asm/kasan.h       |  4 --
 arch/x86/Kconfig                     | 11 +++-
 arch/x86/boot/compressed/misc.h      |  2 +
 arch/x86/include/asm/kasan-tags.h    |  9 +++
 arch/x86/include/asm/kasan.h         | 50 +++++++++++++--
 arch/x86/include/asm/page.h          | 17 +++--
 arch/x86/include/asm/page_64.h       |  2 +-
 arch/x86/kernel/head_64.S            |  3 +
 arch/x86/kernel/setup.c              |  2 +
 arch/x86/kernel/vmlinux.lds.S        |  1 +
 arch/x86/mm/init.c                   |  3 +
 arch/x86/mm/init_64.c                |  8 +--
 arch/x86/mm/kasan_init_64.c          | 24 +++++--
 arch/x86/mm/physaddr.c               |  1 +
 arch/x86/mm/tlb.c                    |  2 +-
 include/linux/kasan-tags.h           | 12 +++-
 include/linux/kasan.h                | 94 +++++++++++++++++++++++-----
 include/linux/mm.h                   |  6 +-
 include/linux/page-flags-layout.h    |  7 +--
 lib/Kconfig.kasan                    | 49 +++++++++++++++
 mm/kasan/Makefile                    |  3 +
 mm/kasan/dense.c                     | 83 ++++++++++++++++++++++++
 mm/kasan/kasan.h                     | 27 +-------
 mm/kasan/report.c                    |  6 +-
 mm/kasan/report_sw_tags.c            | 12 ++--
 mm/kasan/shadow.c                    | 47 ++++++++++----
 mm/kasan/sw_tags.c                   |  8 +++
 mm/kasan/tags.c                      |  5 ++
 mm/percpu-vm.c                       |  2 +-
 33 files changed, 432 insertions(+), 97 deletions(-)
 create mode 100644 arch/arm64/include/asm/kasan-tags.h
 create mode 100644 arch/riscv/include/asm/kasan-tags.h
 create mode 100644 arch/x86/include/asm/kasan-tags.h
 create mode 100644 mm/kasan/dense.c

-- 
2.47.1


