From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <83f6cfbd-d081-5a76-7c7f-5e0b90b4ac74@huawei.com>
Date: Wed, 31 May 2023 15:59:47 +0800
Subject: Re: [PATCH RFC v2] Randomized slab caches for kmalloc()
From: Gong Ruiqi <gongruiqi1@huawei.com>
To: Kees Cook, Jann Horn
Cc: Vlastimil Babka, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Alexander
 Lobakin, Wang Weiyang, Xiu Jianfeng, Christoph Lameter, David Rientjes,
 Roman Gushchin, Joonsoo Kim, Andrew Morton, Pekka Enberg, Paul Moore,
 James Morris, "Serge E. Hallyn", "Gustavo A. R. Silva", "GONG, Ruiqi",
 linux-mm@kvack.org
References: <20230508075507.1720950-1-gongruiqi1@huawei.com>
 <202305161204.CB4A87C13@keescook>
In-Reply-To: <202305161204.CB4A87C13@keescook>
Content-Type: text/plain; charset="UTF-8"

Sorry for the late reply. I was tied up with other in-house kernel
issues these days.

On 2023/05/17 3:34, Kees Cook wrote:
> For new CCs, the start of this thread is here[0].
>
> On Mon, May 08, 2023 at 03:55:07PM +0800, GONG, Ruiqi wrote:
>> When exploiting memory vulnerabilities, "heap spraying" is a common
>> technique targeting those related to dynamic memory allocation (i.e.
>> the "heap"), and it plays an important role in successful
>> exploitation. Basically, it overwrites the memory area of a
>> vulnerable object by triggering allocations in other subsystems or
>> modules, thereby getting a reference to the targeted memory location.
>> It is usable against various types of vulnerability, including
>> use-after-free (UAF), heap out-of-bounds write, etc.
>
> I heartily agree we need some better approaches to deal with UAF, and
> by extension, heap spraying.

Thanks Kees :) Good to hear that!

>
>> There are (at least) two reasons why the heap can be sprayed:
>> 1) generic slab caches are shared among different subsystems and
>> modules, and 2) dedicated slab caches could be merged with the
>> generic ones. Currently these two factors cannot be prevented at a
>> low cost: the first is a widely used memory allocation mechanism, and
>> shutting down slab merging completely via `slub_nomerge` would be
>> overkill.
>>
>> To efficiently prevent heap spraying, we propose the following
>> approach: create multiple copies of the generic slab caches that will
>> never be merged, and use a random one of them at allocation. The
>> random selection is based on the address of the code that calls
>> `kmalloc()`, which means it is static at runtime (rather than being
>> determined dynamically on each allocation, which could be bypassed by
>> repeated brute-force spraying). In this way, the vulnerable object
>> and memory allocated in other subsystems and modules will (most
>> probably) be in different slab caches, which prevents the object from
>> being sprayed.
>
> This is a nice balance between the best option we have now
> ("slub_nomerge") and the most invasive changes (type-based allocation
> segregation, which requires at least extensive compiler support),
> forcing some caches to be "out of reach".

Yes it is, and it's also cost-effective: a quite satisfactory
mitigation achieved with a small amount of code (only ~130 lines). I
say this also because (believe it or not) we did try to implement
something like the latter approach you mention, and it was super
complex and the workload was really huge ...

>
>>
>> The performance overhead has been tested on a 40-core x86 server by
>> comparing the results of `perf bench all` between kernels with and
>> without this patch, based on the latest linux-next kernel, which
>> shows only a minor difference.
>> A subset of the benchmarks is listed below:
>>
>>                          control    experiment (avg of 3 samples)
>> sched/messaging (sec)    0.019      0.019
>> sched/pipe (sec)         5.253      5.340
>> syscall/basic (sec)      0.741      0.742
>> mem/memcpy (GB/sec)      15.258789  14.860495
>> mem/memset (GB/sec)      48.828125  50.431069
>>
>> The memory usage overhead was measured by executing `free` after boot
>> on a QEMU VM with 1GB total memory, and as expected, it is positively
>> correlated with the number of cache copies:
>>
>>            control  4 copies  8 copies  16 copies
>> total      969.8M   968.2M    968.2M    968.2M
>> used       20.0M    21.9M     24.1M     26.7M
>> free       936.9M   933.6M    931.4M    928.6M
>> available  932.2M   928.8M    926.6M    923.9M
>
> Great to see the impact: it's relatively tiny. Nice!
>
> Back when we looked at cache quarantines, Jann pointed out that it
> was still possible to perform heap spraying -- it just needed more
> allocations. In this case, I think that's addressed
> (probabilistically) by making it less likely that a cache where a UAF
> is reachable is merged with something with strong exploitation
> primitives (e.g. msgsnd).
>
> In light of all the UAF attack/defense breakdowns in Jann's blog
> post[1], I'm curious where this defense lands. It seems the
> primitives described there (i.e. "upgrading" the heap spray into a
> page table "type confusion") would be addressed probabilistically
> just like any other style of attack. Jann, what do you think, and how
> does it compare to the KCTF work[2] you've been doing?

A kind ping to Jann ;)

> In addition to this work, I'd like to see something like the kmalloc
> caches, but for kmem_cache_alloc(), where a dedicated cache of
> variably-sized allocations can be managed. With that, we can split
> off _dedicated_ caches where we know there are strong exploitation
> primitives (i.e. msgsnd, etc). Then we can carve off known weak heap
> allocation caches as well as make merging probabilistically harder.
Could you explain more about why a similar mitigation mechanism is
needed for dedicated caches? As far as I know, dedicated caches are
usually considered more secure, although it is still possible to spray
them, e.g. with the technique of allocating and freeing large numbers
of slab objects to manipulate the heap at the page level. Nevertheless,
in most cases they are still fine, since such spraying is (considered
to be) hard to pull off. Meanwhile, that page-level spraying technique
can hardly be mitigated within the slab allocator, and our
randomization idea cannot protect against it either, which also makes
me inclined to believe it is not meaningful to apply randomization to
dedicated caches.

> I imagine it would be possible to then split this series into two
> halves: one that creates the "make arbitrary-sized caches" API, and
> the second that applies that to kmalloc globally (as done here).
>
>>
>> Signed-off-by: GONG, Ruiqi
>> ---
>>
>> v2:
>> - Use hash_64() and a per-boot random seed to select kmalloc()
>>   caches.
>
> This is good: I was hoping there would be something to make it
> per-boot randomized beyond just compile-time.
>
> So, yes, I think this is worth it, but I'd like to see what design
> holes Jann can poke in it first. :)

Thanks again! I'm looking forward to receiving more comments from mm
and hardening developers.

>
> -Kees
>
> [0] https://lore.kernel.org/lkml/20230508075507.1720950-1-gongruiqi1@huawei.com/
> [1] https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html
> [2] https://github.com/thejh/linux/commit/a87ad16046f6f7fd61080ebfb93753366466b761
>