Date: Wed, 20 Nov 2024 08:56:03 -0700
From: Keith Busch
To: Christoph Hellwig
Cc: Brian Johannesmeyer, Andrew Morton, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
	Raphael Isemann, Cristiano Giuffrida, Herbert Bos, Greg KH
Subject: Re: [RFC v2 0/2] dmapool: Mitigate device-controllable mem. corruption
References: <20241119205529.3871048-1-bjohannesmeyer@gmail.com>

On Wed, Nov 20, 2024 at 01:29:19AM -0800, Christoph Hellwig wrote:
> On Tue, Nov 19, 2024 at 09:55:27PM +0100, Brian Johannesmeyer wrote:
> > We
> > discovered a security-related issue in the DMA pool allocator.
> >
> > V1 of our RFC was submitted to the Linux kernel security team. They
> > recommended submitting it to the relevant subsystem maintainers and the
> > hardening mailing list instead, as they did not consider this an explicit
> > security issue. Their rationale was that Linux implicitly assumes hardware
> > can be trusted.
>
> You should probably Cc Keith as the person who most recently did major
> work on the dmapool code and might still remember how it works.

Thanks.

The intrusive list was overlaid on the freed blocks as a space
optimization. If you're moving these fields out of the blocks (I'll have
to review the patch on lore), you can probably relax the minimum DMA
block size too, since the blocks no longer need to hold the list data
structure.

> > **Threat Model**: While Linux drivers typically trust their hardware, there
> > may be specific drivers that do not operate under this assumption. Hence,
> > this threat model assumes a malicious peripheral device capable of
> > corrupting DMA data to exploit the kernel. In this scenario, the device
> > manipulates kernel-initialized data (similar to the attack described in the
> > Thunderclap paper [0]) to achieve arbitrary kernel memory corruption.
> >
> > **DMA pool background**. A DMA pool aims to reduce the overhead of DMA
> > allocations by creating a large DMA buffer --- the "pool" --- from which
> > smaller buffers are allocated as needed. Fundamentally, a DMA pool
> > functions like a heap: it is a structure composed of linked memory
> > "blocks", which, in this context, are DMA buffers. When a driver employs a
> > DMA pool, it grants the device access not only to these blocks but also to
> > the pointers linking them.
> >
> > **Vulnerability**. Similar to traditional heap corruption vulnerabilities
> > --- where a malicious program corrupts heap metadata to e.g., hijack
> > control flow --- a malicious device may corrupt DMA pool metadata.
> > This corruption can trivially lead to arbitrary kernel memory corruption from
> > any driver that uses it. Indeed, because the DMA pool API is extensively
> > used, this vulnerability is not confined to a single instance. In fact,
> > every usage of the DMA pool API is potentially vulnerable. An exploit
> > proceeds with the following steps:
> >
> > 1. The DMA `pool` initializes its list of blocks, then points to the first
> > block.
> > 2. The malicious device overwrites the first 8 bytes of the first block ---
> > which contain its `next_block` pointer --- to an arbitrary kernel address,
> > `kernel_addr`.
> > 3. The driver makes its first call to `dma_pool_alloc()`, after which, the
> > pool should point to the second block. However, it instead points to
> > `kernel_addr`.
> > 4. The driver again calls `dma_pool_alloc()`, which incorrectly returns
> > `kernel_addr`. Therefore, anytime the driver writes to this "block", it may
> > corrupt sensitive kernel data.
> >
> > I have a PDF document that illustrates how these steps work. Please let me
> > know if you would like me to share it with you.
> >
> > **Proposed mitigation**. To mitigate the corruption of DMA pool metadata
> > (i.e., the pointers linking the blocks), the metadata should be moved into
> > non-DMA memory, ensuring it cannot be altered by a device. I have included
> > a patch series that implements this change. Since I am not deeply familiar
> > with the DMA pool internals, I would appreciate any feedback on the
> > patches. I have tested the patches with the `DMAPOOL_TEST` test and my own
> > basic unit tests that ensure the DMA pool allocator is not vulnerable.
> >
> > **Performance**. I evaluated the patch set's performance by running the
> > `DMAPOOL_TEST` test with `DMAPOOL_DEBUG` enabled and with/without the
> > patches applied.
> > Here is its output *without* the patches applied:
> > ```
> > dmapool test: size:16 align:16 blocks:8192 time:3194110
> > dmapool test: size:64 align:64 blocks:8192 time:4730440
> > dmapool test: size:256 align:256 blocks:8192 time:5489630
> > dmapool test: size:1024 align:1024 blocks:2048 time:517150
> > dmapool test: size:4096 align:4096 blocks:1024 time:399616
> > dmapool test: size:68 align:32 blocks:8192 time:6156527
> > ```
> >
> > And here is its output *with* the patches applied:
> > ```
> > dmapool test: size:16 align:16 blocks:8192 time:3541031
> > dmapool test: size:64 align:64 blocks:8192 time:4227262
> > dmapool test: size:256 align:256 blocks:8192 time:4890273
> > dmapool test: size:1024 align:1024 blocks:2048 time:515775
> > dmapool test: size:4096 align:4096 blocks:1024 time:523096
> > dmapool test: size:68 align:32 blocks:8192 time:3450830
> > ```
> >
> > Based on my interpretation of the output, the patch set does not appear to
> > negatively impact performance. In fact, it shows slight performance
> > improvements in some tests (i.e., for sizes 64, 256, 1024, and 68).
> >
> > I speculate that these performance gains may be due to improved spatial
> > locality of the `next_block` pointers. With the patches applied, the
> > `next_block` pointers are consistently spaced 24 bytes apart, matching the
> > new size of `struct dma_block`. Previously, the spacing between
> > `next_block` pointers depended on the block size, so for 1024-byte blocks,
> > the pointers were spaced 1024 bytes apart. However, I am still unsure why
> > the performance improvement for 68-byte blocks is so significant.
> >
> > [0] Link: https://www.csl.sri.com/~neumann/ndss-iommu.pdf
> >
> > Brian Johannesmeyer (2):
> >   dmapool: Move pool metadata into non-DMA memory
> >   dmapool: Use pool_find_block() in pool_block_err()
> >
> >  mm/dmapool.c | 96 ++++++++++++++++++++++++++++++++++------------------
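
For illustration, the four exploit steps quoted above can be modelled in
userspace C. This is only a minimal sketch under stated assumptions, not
the mm/dmapool.c code: every toy_* name, the slot size, and the block
count are hypothetical, an ordinary array stands in for device-visible
DMA memory, and a memcpy() stands in for the device's DMA write.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model; sizes and names are assumptions. */
#define TOY_BLOCK_SIZE 64
#define TOY_NR_BLOCKS  4

/* A free slot reuses its first bytes as the free-list link. */
union toy_slot {
	union toy_slot *next_block;	/* meaningful only while the slot is free */
	unsigned char payload[TOY_BLOCK_SIZE];
};

struct toy_pool {
	union toy_slot *next_block;		/* head of the free list */
	union toy_slot mem[TOY_NR_BLOCKS];	/* stands in for DMA memory */
};

/* Step 1: thread every slot onto the free list; the head is slot 0. */
static void toy_pool_init(struct toy_pool *pool)
{
	pool->next_block = NULL;
	for (int i = TOY_NR_BLOCKS - 1; i >= 0; i--) {
		pool->mem[i].next_block = pool->next_block;
		pool->next_block = &pool->mem[i];
	}
}

/* Pop the head, trusting the link stored inside the slot itself. */
static void *toy_pool_alloc(struct toy_pool *pool)
{
	union toy_slot *slot = pool->next_block;

	if (!slot)
		return NULL;
	pool->next_block = slot->next_block;
	return slot;
}

/* Returns 1 if the second allocation aliases memory outside the pool. */
static int toy_demo_corruption(void)
{
	static struct toy_pool pool;
	static union toy_slot kernel_data;	/* stands in for kernel_addr */
	union toy_slot *forged = &kernel_data;
	void *first, *second;

	toy_pool_init(&pool);

	/* Step 2: the "device" overwrites the link held in the first slot. */
	memcpy(&pool.mem[0], &forged, sizeof(forged));

	/* Step 3: slot 0 is returned, and the head now follows the forged link. */
	first = toy_pool_alloc(&pool);
	assert(first == (void *)&pool.mem[0]);

	/* Step 4: the allocator hands out the forged address as a "block". */
	second = toy_pool_alloc(&pool);
	return second == (void *)forged;
}
```

In this model toy_demo_corruption() returns 1, because the allocator
never checks that a link still points into the pool. Keeping the links
in memory the device cannot write, as the quoted series proposes, takes
the forged pointer out of the allocation path.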