Date: Wed, 21 Feb 2024 19:34:44 -0500
From: Kent Overstreet
To: Kees Cook
Cc: Suren Baghdasaryan, akpm@linux-foundation.org, mhocko@suse.com,
	vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev,
	mgorman@suse.de, dave@stgolabs.net, willy@infradead.org,
	liam.howlett@oracle.com, penguin-kernel@i-love.sakura.ne.jp,
	corbet@lwn.net, void@manifault.com, peterz@infradead.org,
	juri.lelli@redhat.com, catalin.marinas@arm.com, will@kernel.org,
	arnd@arndb.de, tglx@linutronix.de, mingo@redhat.com,
	dave.hansen@linux.intel.com, x86@kernel.org, peterx@redhat.com,
	david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org,
	masahiroy@kernel.org, nathan@kernel.org, dennis@kernel.org,
	tj@kernel.org, muchun.song@linux.dev, rppt@kernel.org,
	paulmck@kernel.org, pasha.tatashin@soleen.com, yosryahmed@google.com,
	yuzhao@google.com, dhowells@redhat.com, hughd@google.com,
	andreyknvl@gmail.com, ndesaulniers@google.com, vvvvvv@google.com,
	gregkh@linuxfoundation.org, ebiggers@google.com, ytcoode@gmail.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com,
	vschneid@redhat.com, cl@linux.com, penberg@kernel.org,
	iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com,
	elver@google.com, dvyukov@google.com, shakeelb@google.com,
	songmuchun@bytedance.com, jbaron@akamai.com, rientjes@google.com,
	minchan@google.com, kaleshsingh@google.com, kernel-team@android.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	iommu@lists.linux.dev, linux-arch@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-modules@vger.kernel.org, kasan-dev@googlegroups.com,
	cgroups@vger.kernel.org
Subject: Re: [PATCH v4 14/36] lib: add allocation tagging support for memory allocation profiling
References: <20240221194052.927623-1-surenb@google.com>
	<20240221194052.927623-15-surenb@google.com>
	<202402211449.401382D2AF@keescook>
	<4vwiwgsemga7vmahgwsikbsawjq5xfskdsssmjsfe5hn7k2alk@b6ig5v2pxe5i>
	<202402211608.41AD94094@keescook>
In-Reply-To: <202402211608.41AD94094@keescook>

On Wed, Feb 21, 2024 at 04:25:02PM -0800, Kees Cook wrote:
> On Wed, Feb 21, 2024 at 06:29:17PM -0500, Kent Overstreet wrote:
> > On Wed, Feb 21, 2024 at 03:05:32PM -0800, Kees Cook wrote:
> > > On Wed, Feb 21, 2024 at 11:40:27AM -0800, Suren Baghdasaryan wrote:
> > > > [...]
> > > > +struct alloc_tag {
> > > > +	struct codetag ct;
> > > > +	struct alloc_tag_counters __percpu *counters;
> > > > +} __aligned(8);
> > > > [...]
> > > > +#define DEFINE_ALLOC_TAG(_alloc_tag)					\
> > > > +	static DEFINE_PER_CPU(struct alloc_tag_counters, _alloc_tag_cntr); \
> > > > +	static struct alloc_tag _alloc_tag __used __aligned(8)		\
> > > > +	__section("alloc_tags") = {					\
> > > > +		.ct = CODE_TAG_INIT,					\
> > > > +		.counters = &_alloc_tag_cntr };
> > > > [...]
> > > > +static inline struct alloc_tag *alloc_tag_save(struct alloc_tag *tag)
> > > > +{
> > > > +	swap(current->alloc_tag, tag);
> > > > +	return tag;
> > > > +}
> > >
> > > Future security hardening improvement idea based on this infrastructure:
> > > it should be possible to implement per-allocation-site kmem caches. For
> > > example, we could create:
> > >
> > > struct alloc_details {
> > > 	u32 flags;
> > > 	union {
> > > 		u32 size; /* not valid after __init completes */
> > > 		struct kmem_cache *cache;
> > > 	};
> > > };
> > >
> > > - add struct alloc_details to struct alloc_tag
> > > - move the tags section into .ro_after_init
> > > - extend alloc_hooks() to populate flags and size:
> > > 	.flags = __builtin_constant_p(size) ? KMALLOC_ALLOCATE_FIXED
> > > 					    : KMALLOC_ALLOCATE_BUCKETS;
> > > 	.size = __builtin_constant_p(size) ? size : SIZE_MAX;
> > > - during kernel start or module init, walk the alloc_tag list
> > >   and create either a fixed-size kmem_cache or to allocate a
> > >   full set of kmalloc-buckets, and update the "cache" member.
> > > - adjust kmalloc core routines to use current->alloc_tag->cache instead
> > >   of using the global buckets.
> > >
> > > This would get us fully separated allocations, producing better than
> > > type-based levels of granularity, exceeding what we have currently with
> > > CONFIG_RANDOM_KMALLOC_CACHES.
> > >
> > > Does this look possible, or am I misunderstanding something in the
> > > infrastructure being created here?
> >
> > Definitely possible, but... would we want this?
>
> Yes, very very much. One of the worst and mostly unaddressed weaknesses
> with the kernel right now is use-after-free based type confusion[0], which
> depends on merged caches (or cache reuse).
>
> This doesn't solve cross-allocator (kmalloc/page_alloc) type confusion
> (as terrifyingly demonstrated[1] by Jann Horn), but it does help with
> what has been a very common case of "use msg_msg to impersonate your
> target object"[2] exploitation.

We have a ton of code that references PAGE_SIZE and uses the page
allocator completely unnecessarily - that's something worth harping
about at conferences; if we could motivate people to clean that stuff up
it'd have a lot of positive effects.

> > That would produce a _lot_ of kmem caches
>
> Fewer than you'd expect, but yes, there is some overhead. However,
> out-of-tree forks of Linux have successfully experimented with this
> already and seen good results[3].

So in that case - I don't think there's any need for a separate
alloc_details; we'd just add a kmem_cache * to alloc_tag and then hook
into the codetag init/unload path to create and destroy the kmem caches.

No need to adjust the slab code either; alloc_hooks() itself could
dispatch to kmem_cache_alloc() instead of kmalloc() if this is in use.
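
The last two paragraphs can be modelled in userspace roughly as follows.
This is only a sketch under stated assumptions, not kernel code:
toy_cache stands in for struct kmem_cache, tag_cache_create() and
tag_cache_destroy() stand in for whatever the codetag init/unload hooks
would call, and tagged_alloc() stands in for the dispatch alloc_hooks()
could do. All of those names, and the file/line/size fields, are
hypothetical.

```c
/* Userspace model only; every name here is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct kmem_cache: fixed object size plus a hit counter. */
struct toy_cache {
	size_t object_size;
	unsigned long nr_allocs;
};

/* Simplified alloc_tag: one per allocation site. */
struct alloc_tag {
	const char *file;
	int line;
	size_t size;			/* object size used at this site */
	struct toy_cache *cache;	/* the extra pointer from the mail */
};

/* Stand-in for the codetag init hook: give the site its own cache. */
static int tag_cache_create(struct alloc_tag *tag)
{
	tag->cache = calloc(1, sizeof(*tag->cache));
	if (!tag->cache)
		return -1;
	tag->cache->object_size = tag->size;
	return 0;
}

/* Stand-in for the codetag unload hook: tear the cache down again. */
static void tag_cache_destroy(struct alloc_tag *tag)
{
	free(tag->cache);
	tag->cache = NULL;
}

/* Models kmem_cache_alloc(): objects always have the cache's size. */
static void *toy_cache_alloc(struct toy_cache *c)
{
	assert(c != NULL);
	c->nr_allocs++;
	return malloc(c->object_size);
}

/*
 * Models the dispatch: use the per-site cache when one exists and the
 * request matches its object size, otherwise fall back to the generic
 * allocator (malloc() here, kmalloc() in the mail).
 */
static void *tagged_alloc(struct alloc_tag *tag, size_t size)
{
	if (tag->cache && size == tag->cache->object_size)
		return toy_cache_alloc(tag->cache);
	return malloc(size);
}
```

With two tags of the same object size, each site is served from its own
toy_cache, and a request whose size does not match the site's cache
falls through to the generic path. The model only records which cache
served a request; it does not model the slab-page separation that makes
per-site caches useful for hardening.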