Date: Mon, 25 Mar 2024 17:49:49 -0400
From: Kent Overstreet
To: Kees Cook
Cc: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
    Joonsoo Kim, Andrew Morton, Roman Gushchin,
    Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org,
    "GONG, Ruiqi", Xiu Jianfeng, Suren Baghdasaryan, Jann Horn,
    Matteo Rizzo, linux-kernel@vger.kernel.org,
    linux-hardening@vger.kernel.org
Subject: Re: [PATCH v2 4/9] slab: Introduce kmem_buckets_create()
Message-ID: <67tgebii42rwneeyqekmxxqo2bzgyysdqggciuew27bc3gbrkg@5ceqjmiaxvyu>
References: <20240305100933.it.923-kees@kernel.org>
    <20240305101026.694758-4-keescook@chromium.org>
    <202403251327.C15C1E61A@keescook>
In-Reply-To: <202403251327.C15C1E61A@keescook>

On Mon, Mar 25, 2024 at 01:40:34PM -0700, Kees Cook wrote:
> On Mon, Mar 25, 2024 at 03:40:51PM -0400, Kent Overstreet wrote:
> > On Tue, Mar 05, 2024 at 02:10:20AM -0800, Kees Cook wrote:
> > > Dedicated caches are available for fixed size allocations via
> > > kmem_cache_alloc(), but for dynamically sized allocations there is only
> > > the global kmalloc API's set of buckets available. This means it isn't
> > > possible to separate specific sets of dynamically sized allocations into
> > > a separate collection of caches.
> > >
> > > This leads to a use-after-free exploitation weakness in the Linux
> > > kernel since many heap memory spraying/grooming attacks depend on using
> > > userspace-controllable dynamically sized allocations to collide with
> > > fixed size allocations that end up in same cache.
> > >
> > > While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense
> > > against these kinds of "type confusion" attacks, including for fixed
> > > same-size heap objects, we can create a complementary deterministic
> > > defense for dynamically sized allocations.
> > >
> > > In order to isolate user-controllable sized allocations from system
> > > allocations, introduce kmem_buckets_create(), which behaves like
> > > kmem_cache_create(). (The next patch will introduce kmem_buckets_alloc(),
> > > which behaves like kmem_cache_alloc().)
> > >
> > > Allows for confining allocations to a dedicated set of sized caches
> > > (which have the same layout as the kmalloc caches).
> > >
> > > This can also be used in the future once codetag allocation annotations
> > > exist to implement per-caller allocation cache isolation[1] even for
> > > dynamic allocations.
> > >
> > > Link: https://lore.kernel.org/lkml/202402211449.401382D2AF@keescook [1]
> > > Signed-off-by: Kees Cook
> > > ---
> > > Cc: Vlastimil Babka
> > > Cc: Christoph Lameter
> > > Cc: Pekka Enberg
> > > Cc: David Rientjes
> > > Cc: Joonsoo Kim
> > > Cc: Andrew Morton
> > > Cc: Roman Gushchin
> > > Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > > Cc: linux-mm@kvack.org
> > > ---
> > >  include/linux/slab.h |  5 +++
> > >  mm/slab_common.c     | 72 ++++++++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 77 insertions(+)
> > >
> > > diff --git a/include/linux/slab.h b/include/linux/slab.h
> > > index f26ac9a6ef9f..058d0e3cd181 100644
> > > --- a/include/linux/slab.h
> > > +++ b/include/linux/slab.h
> > > @@ -493,6 +493,11 @@ void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
> > >                             gfp_t gfpflags) __assume_slab_alignment __malloc;
> > >  void kmem_cache_free(struct kmem_cache *s, void *objp);
> > >
> > > +kmem_buckets *kmem_buckets_create(const char *name, unsigned int align,
> > > +                                  slab_flags_t flags,
> > > +                                  unsigned int useroffset, unsigned int usersize,
> > > +                                  void (*ctor)(void *));
> >
> > I'd prefer an API that initialized an object over one that allocates it
> > - that is, prefer
> >
> > kmem_buckets_init(kmem_buckets *buckets, ...)
>
> Sure, that can work. kmem_cache_init() would need to exist for the same
> reason though.

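The two API shapes under discussion can be sketched in user-space C. This is only an illustration under stated assumptions, not kernel code: `struct buckets`, `buckets_init()`, `buckets_create()`, `bucket_size_for()`, `NR_BUCKETS`, and the two `site_*` structs are hypothetical stand-ins, and `malloc()` takes the place of whatever allocation the kernel would do. An init-style function lets the caller embed the object in a larger structure; a create-style function returns a pointer the caller must store and later follow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_BUCKETS 4 /* hypothetical; chosen only to keep the sketch small */

/* Hypothetical stand-in for kmem_buckets: one entry per size class. */
struct buckets {
    size_t object_size[NR_BUCKETS];
};

/* Init style: the caller owns the storage, so the object can be embedded. */
static void buckets_init(struct buckets *b, size_t smallest)
{
    for (int i = 0; i < NR_BUCKETS; i++)
        b->object_size[i] = smallest << i;
}

/* Create style: the API allocates the object and hands back a pointer. */
static struct buckets *buckets_create(size_t smallest)
{
    struct buckets *b = malloc(sizeof(*b));

    if (b)
        buckets_init(b, smallest);
    return b;
}

/* Per-call-site record with the buckets stored inline. */
struct site_embedded {
    const char *name;
    struct buckets buckets;
};

/* Per-call-site record with the buckets behind a pointer. */
struct site_indirect {
    const char *name;
    struct buckets *buckets;
};

/* Return the smallest bucket size that fits, or 0 if none does. */
static size_t bucket_size_for(const struct buckets *b, size_t size)
{
    for (int i = 0; i < NR_BUCKETS; i++)
        if (size <= b->object_size[i])
            return b->object_size[i];
    return 0;
}
```

With the embedded form, `&site.buckets` is computed from the address of the record itself; with the indirect form, `site.buckets` has to be loaded from the record before the bucket data can be touched.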
That'll be a very worthwhile addition too; IPC running kernel code is
always crap and dependent loads are a big part of that.

I did mempool_init() and bioset_init() a while back, so it's someone
else's turn for this one :)

> Sure, I think it'll depend on how the per-site allocations got wired up.
> I think you're meaning to include a full copy of the kmem cache/bucket
> struct with the codetag instead of just a pointer? I don't think that'll
> work well to make it runtime selectable, and I don't see it using an
> extra deref -- allocations already get the struct from somewhere and
> deref it. The only change is where to find the struct.

The codetags are in their own dedicated elf sections already, so if you
put the kmem_buckets in the codetag the entire elf section can be
discarded if it's not in use.

Also, the issue isn't derefs - it's dependent loads and locality. Taking
the address of the kmem_buckets to pass it is fine; the data referred to
will still get pulled into cache when we touch the codetag. If it's
behind a pointer we have to pull the codetag into cache, wait for that
so we can get the kmem_buckets pointer - then start to pull in the
kmem_buckets itself.

If it's a cache miss you just slowed the entire allocation down by
around 30 ns.

> > I'm curious what all the arguments to kmem_buckets_create() are needed
> > for, if this is supposed to be a replacement for kmalloc() users.
>
> Are you confusing kmem_buckets_create() with kmem_buckets_alloc()? These
> args are needed to initialize the per-bucket caches, just like is
> already done for the global kmalloc per-bucket caches. This mirrors
> kmem_cache_create(). (Or more specifically, calls kmem_cache_create()
> for each bucket size, so the args need to be passed through.)
>
> If you mean "why expose these arguments because they can just use the
> existing defaults already used by the global kmalloc caches" then I
> would say, it's to gain the benefit here of narrowing the scope of the
> usercopy offsets.
> Right now kmalloc is forced to allow the full usercopy
> window into an allocation, but we don't have to do this any more. For
> example, see patch 8, where struct msg_msg doesn't need to expose the
> header to userspace:

"usercopy window"? You're now annotating which data can be copied to
userspace?

I'm skeptical; this looks like defensive programming gone amuck to me.

>       msg_buckets = kmem_buckets_create("msg_msg", 0, SLAB_ACCOUNT,
>                                         sizeof(struct msg_msg),
>                                         DATALEN_MSG, NULL);
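
For readers unfamiliar with the term, the "usercopy window" is the (useroffset, usersize) pair passed in the call above: the byte range of each object that copies to or from userspace are permitted to touch. A rough user-space sketch of that bounds check follows. It is an assumption-laden illustration, not the kernel's implementation: `struct usercopy_window` and `copy_allowed()` are hypothetical names, and the numbers in the usage note are placeholders rather than the real values of sizeof(struct msg_msg) or DATALEN_MSG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cache's permitted usercopy region. */
struct usercopy_window {
    size_t useroffset; /* first byte of the object userspace copies may touch */
    size_t usersize;   /* length of the permitted region in bytes */
};

/* True when [offset, offset + len) lies entirely inside the window. */
static bool copy_allowed(const struct usercopy_window *w,
                         size_t offset, size_t len)
{
    /* Copy starts before the window, e.g. inside a private header. */
    if (offset < w->useroffset)
        return false;
    /* Copy starts past the end of the window. */
    if (offset - w->useroffset > w->usersize)
        return false;
    /* Copy must not run past the end of the window. */
    return len <= w->usersize - (offset - w->useroffset);
}
```

With a placeholder window of `{ .useroffset = 48, .usersize = 4048 }`, a copy starting at offset 0 (the header) is rejected, while a copy of up to 4048 bytes starting at offset 48 is accepted.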