From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <83f6cfbd-d081-5a76-7c7f-5e0b90b4ac74@huawei.com>
Date: Wed, 31 May 2023 15:59:47 +0800
Subject: Re: [PATCH RFC v2] Randomized slab caches for kmalloc()
From: Gong Ruiqi <gongruiqi1@huawei.com>
To: Kees Cook, Jann Horn
Cc: Vlastimil Babka, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Alexander
 Lobakin, Wang Weiyang, Xiu Jianfeng, Christoph Lameter, David Rientjes,
 Roman Gushchin, Joonsoo Kim, Andrew Morton, Pekka Enberg, Paul Moore,
 James Morris, "Serge E. Hallyn", "Gustavo A. R. Silva", "GONG, Ruiqi",
 linux-mm@kvack.org
References: <20230508075507.1720950-1-gongruiqi1@huawei.com>
 <202305161204.CB4A87C13@keescook>
In-Reply-To: <202305161204.CB4A87C13@keescook>
Content-Type: text/plain; charset="UTF-8"

Sorry for the late reply. I was tied up with other in-house kernel
issues these days.

On 2023/05/17 3:34, Kees Cook wrote:
> For new CCs, the start of this thread is here[0].
>
> On Mon, May 08, 2023 at 03:55:07PM +0800, GONG, Ruiqi wrote:
>> When exploiting memory vulnerabilities, "heap spraying" is a common
>> technique targeting those related to dynamic memory allocation (i.e.
>> the "heap"), and it plays an important role in successful
>> exploitation. Basically, it overwrites the memory area of a
>> vulnerable object by triggering allocations in other subsystems or
>> modules, thereby getting a reference to the targeted memory location.
>> It is usable against various types of vulnerability, including
>> use-after-free (UAF), heap out-of-bounds write, etc.
>
> I heartily agree we need some better approaches to deal with UAF, and
> by extension, heap spraying.

Thanks Kees :) Good to hear that!

>
>> There are (at least) two reasons why the heap can be sprayed:
>> 1) generic slab caches are shared among different subsystems and
>> modules, and 2) dedicated slab caches could be merged with the
>> generic ones. Currently these two factors cannot be prevented at a
>> low cost: the first is a widely used memory allocation mechanism, and
>> shutting down slab merging completely via `slub_nomerge` would be
>> overkill.
>>
>> To efficiently prevent heap spraying, we propose the following
>> approach: create multiple copies of the generic slab caches that will
>> never be merged, and use a random one of them at allocation. The
>> random selection is based on the address of the code that calls
>> `kmalloc()`, which means it is static at runtime (rather than being
>> determined dynamically on each allocation, which could be bypassed by
>> repeated brute-force spraying). In this way, the vulnerable object
>> and memory allocated in other subsystems and modules will (most
>> probably) be in different slab caches, which prevents the object from
>> being sprayed.
>
> This is a nice balance between the best option we have now
> ("slub_nomerge") and the most invasive changes (type-based allocation
> segregation, which requires at least extensive compiler support),
> forcing some caches to be "out of reach".

Yes it is, and it's also cost-effective: a quite satisfactory
mitigation achieved with a small amount of code (only ~130 lines). I
say this also because (believe it or not) we did try to implement
something like the latter approach you mention, and it was super
complex and the workload was really huge ...

>
>>
>> The performance overhead has been tested on a 40-core x86 server by
>> comparing the results of `perf bench all` between kernels with and
>> without this patch, based on the latest linux-next kernel, which
>> shows only a minor difference.
>> A subset of the benchmarks is listed below:
>>
>>                          control    experiment (avg of 3 samples)
>> sched/messaging (sec)    0.019      0.019
>> sched/pipe (sec)         5.253      5.340
>> syscall/basic (sec)      0.741      0.742
>> mem/memcpy (GB/sec)      15.258789  14.860495
>> mem/memset (GB/sec)      48.828125  50.431069
>>
>> The memory usage overhead was measured by executing `free` after boot
>> on a QEMU VM with 1GB total memory, and as expected, it is positively
>> correlated with the number of cache copies:
>>
>>            control  4 copies  8 copies  16 copies
>> total      969.8M   968.2M    968.2M    968.2M
>> used       20.0M    21.9M     24.1M     26.7M
>> free       936.9M   933.6M    931.4M    928.6M
>> available  932.2M   928.8M    926.6M    923.9M
>
> Great to see the impact: it's relatively tiny. Nice!
>
> Back when we looked at cache quarantines, Jann pointed out that it
> was still possible to perform heap spraying -- it just needed more
> allocations. In this case, I think that's addressed
> (probabilistically) by making it less likely that a cache where a UAF
> is reachable is merged with something with strong exploitation
> primitives (e.g. msgsnd).
>
> In light of all the UAF attack/defense breakdowns in Jann's blog
> post[1], I'm curious where this defense lands. It seems the
> primitives described there (i.e. "upgrading" the heap spray into a
> page table "type confusion") would be addressed probabilistically
> just like any other style of attack. Jann, what do you think, and how
> does it compare to the KCTF work[2] you've been doing?

A kind ping to Jann ;)

> In addition to this work, I'd like to see something like the kmalloc
> caches, but for kmem_cache_alloc(), where a dedicated cache of
> variably-sized allocations can be managed. With that, we can split
> off _dedicated_ caches where we know there are strong exploitation
> primitives (i.e. msgsnd, etc). Then we can carve off known weak heap
> allocation caches as well as make merging probabilistically harder.
Could you explain more about why a similar mitigation mechanism is
needed for dedicated caches? As far as I know, dedicated caches are
usually considered more secure, although it is still possible to spray
them, e.g. with the technique of allocating and freeing large numbers
of slab objects to manipulate the heap at the page level. Nevertheless,
in most cases they are still fine, since such spraying is (considered
to be) hard to pull off. Meanwhile, that page-level spraying technique
can hardly be mitigated within the slab allocator, and our
randomization idea cannot protect against it either, which also makes
me inclined to believe it is not meaningful to apply randomization to
dedicated caches.

> I imagine it would be possible to then split this series into two
> halves: one that creates the "make arbitrary-sized caches" API, and
> the second that applies that to kmalloc globally (as done here).
>
>>
>> Signed-off-by: GONG, Ruiqi
>> ---
>>
>> v2:
>> - Use hash_64() and a per-boot random seed to select kmalloc()
>>   caches.
>
> This is good: I was hoping there would be something to make it
> per-boot randomized beyond just compile-time.
>
> So, yes, I think this is worth it, but I'd like to see what design
> holes Jann can poke in it first. :)

Thanks again! I'm looking forward to receiving more comments from mm
and hardening developers.

>
> -Kees
>
> [0] https://lore.kernel.org/lkml/20230508075507.1720950-1-gongruiqi1@huawei.com/
> [1] https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html
> [2] https://github.com/thejh/linux/commit/a87ad16046f6f7fd61080ebfb93753366466b761
>