linux-mm.kvack.org archive mirror
From: Harry Yoo <harry.yoo@oracle.com>
To: Marco Elver <elver@google.com>
Cc: linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Alexander Potapenko <glider@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	David Hildenbrand <david@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Florent Revest <revest@google.com>,
	GONG Ruiqi <gongruiqi@huaweicloud.com>,
	Jann Horn <jannh@google.com>, Kees Cook <kees@kernel.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Matteo Rizzo <matteorizzo@google.com>,
	Michal Hocko <mhocko@suse.com>, Mike Rapoport <rppt@kernel.org>,
	Nathan Chancellor <nathan@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Suren Baghdasaryan <surenb@google.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-hardening@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH RFC] slab: support for compiler-assisted type-based slab cache partitioning
Date: Tue, 26 Aug 2025 01:48:25 +0900
Message-ID: <aKyT2UKmlznvN2jv@hyeyoo>
In-Reply-To: <20250825154505.1558444-1-elver@google.com>

On Mon, Aug 25, 2025 at 05:44:40PM +0200, Marco Elver wrote:
> [ Beware, this is an early RFC for an in-development Clang feature, and
>   requires the following Clang/LLVM development tree:
>    https://github.com/melver/llvm-project/tree/alloc-token
>   The corresponding LLVM RFC and discussion can be found here:
>    https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434  ]

Whoa, a cutting-edge feature!

> Rework the general infrastructure around RANDOM_KMALLOC_CACHES into more
> flexible PARTITION_KMALLOC_CACHES, with the former being a partitioning
> mode of the latter.
> 
> Introduce a new mode, TYPED_KMALLOC_CACHES, which leverages Clang's
> "allocation tokens" via __builtin_alloc_token_infer [1].
> 
> This mechanism allows the compiler to pass a token ID derived from the
> allocation's type to the allocator. The compiler performs best-effort
> type inference, and recognizes idioms such as kmalloc(sizeof(T), ...).
> Unlike RANDOM_KMALLOC_CACHES, this mode deterministically assigns a slab
> cache to an allocation of type T, regardless of allocation site.

I don't think either TYPED_KMALLOC_CACHES or RANDOM_KMALLOC_CACHES is
strictly superior to the other (or am I wrong?). Would it be reasonable
to do some run-time randomization for TYPED_KMALLOC_CACHES too?
(i.e., randomize index within top/bottom half based on allocation site and
random seed)
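
To make the idea concrete, here is a rough userspace sketch of what I have
in mind. It is only an illustration under assumptions: 16 partitions (as in
the histogram below), a token whose top half means "contains pointers", and
hypothetical names (mix64, randomize_in_half) that do not exist in the patch
or in the kernel. The hash is a generic 64-bit mixer, not what slab would
actually use.

```c
#include <assert.h>
#include <stdint.h>

#define NR_PARTITIONS 16u                 /* assumption: 16 kmalloc partitions */
#define HALF          (NR_PARTITIONS / 2) /* bottom: no pointers, top: pointers */

/* Hypothetical mixer (splitmix64 finalizer); stands in for a real hash. */
static uint64_t mix64(uint64_t x)
{
	x ^= x >> 30;
	x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27;
	x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

/*
 * token:  compiler-provided ID in [0, NR_PARTITIONS)
 * caller: allocation site, e.g. a return address
 * seed:   per-boot random value
 *
 * Keeps the pointer/pointerless half chosen by the compiler and picks the
 * slot inside that half from the allocation site and the seed.
 */
static unsigned int randomize_in_half(unsigned int token, uint64_t caller,
				      uint64_t seed)
{
	unsigned int base, off;

	assert(token < NR_PARTITIONS);
	base = (token >= HALF) ? HALF : 0;
	off = (unsigned int)(mix64(caller ^ seed) % HALF);
	return base + off;
}
```

Note that in this form the slot within a half no longer depends on the type,
only on the call site and the seed, so it trades the "same type, same cache"
property for unpredictability across boots. Mixing the token into the hash
input would be another option.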

> Clang's default token ID calculation is described as [1]:
> 
>    TypeHashPointerSplit: This mode assigns a token ID based on the hash
>    of the allocated type's name, where the top half ID-space is reserved
>    for types that contain pointers and the bottom half for types that do
>    not contain pointers.
> 
> Separating pointer-containing objects from pointerless objects and data
> allocations can help mitigate certain classes of memory corruption
> exploits [2]: attackers who gain a buffer overflow on a primitive
> buffer cannot use it to directly corrupt pointers or other critical
> metadata in an object residing in a different, isolated heap region.
>
> It is important to note that heap isolation strategies offer a
> best-effort approach, and do not provide a 100% security guarantee,
> albeit achievable at relatively low performance cost. Note that this
> also does not prevent cross-cache attacks, and SLAB_VIRTUAL [3] should
> be used as a complementary mitigation.

Not relevant to this patch, but just wondering if there are
any plans for SLAB_VIRTUAL?

> With all that, my kernel (x86 defconfig) shows me a histogram of slab
> cache object distribution per /proc/slabinfo (after boot):
> 
>   <slab cache>      <objs> <hist>
>   kmalloc-part-15     619  ++++++
>   kmalloc-part-14    1412  ++++++++++++++
>   kmalloc-part-13    1063  ++++++++++
>   kmalloc-part-12    1745  +++++++++++++++++
>   kmalloc-part-11     891  ++++++++
>   kmalloc-part-10     610  ++++++
>   kmalloc-part-09     792  +++++++
>   kmalloc-part-08    3054  ++++++++++++++++++++++++++++++
>   kmalloc-part-07     245  ++
>   kmalloc-part-06     182  +
>   kmalloc-part-05     122  +
>   kmalloc-part-04     295  ++
>   kmalloc-part-03     241  ++
>   kmalloc-part-02     107  +
>   kmalloc-part-01     124  +
>   kmalloc            6231  ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 
> The above /proc/slabinfo snapshot shows me there are 7547 allocated
> objects (slabs 00 - 07) that the compiler claims contain no pointers or
> it was unable to infer the type of, and 10186 objects that contain
> pointers (slabs 08 - 15). On the whole, this looks relatively sane.
> 
> Additionally, when I compile my kernel with -Rpass=alloc-token, which
> provides diagnostics where (after dead-code elimination) type inference
> failed, I see 966 allocation sites where the compiler failed to identify
> a type. Some initial review confirms these are mostly variable sized
> buffers, but also include structs with trailing flexible length arrays
> (the latter could be recognized by the compiler by teaching it to look
> more deeply into complex expressions such as those generated by
> struct_size).

When the compiler fails to identify a type, does it go to top half or
bottom half, or perhaps it doesn't matter?

> Link: https://github.com/melver/llvm-project/blob/alloc-token/clang/docs/AllocToken.rst [1]
> Link: https://blog.dfsec.com/ios/2025/05/30/blasting-past-ios-18 [2]
> Link: https://lwn.net/Articles/944647/ [3]
> Signed-off-by: Marco Elver <elver@google.com>
> ---

I didn't go too deep into the implementation details, but I'm happy with
it since the change looks quite simple ;)


