From: Brian Johannesmeyer
Date: Thu, 21 Nov 2024 10:31:11 -0700
Subject: Re: [RFC v2 0/2] dmapool: Mitigate device-controllable mem. corruption
To: Keith Busch
Cc: Christoph Hellwig, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org, Raphael Isemann, Cristiano Giuffrida, Herbert Bos, Greg KH
References: <20241119205529.3871048-1-bjohannesmeyer@gmail.com>

> Here's my quick thoughts at
this late hour, though I might have
> something else tomorrow. If the links are external to the dma addr
> being freed, then I think you need to change the entire alloc/free API
> to replace the dma_addr_t handle with a new type, like a tuple of
>
>   { dma_addr_t, priv_list_link }
>
> That should make it possible to preserve the low constant time to alloc
> and free in the critical section, which I think is a good thing to have.
> I found 170 uses of dma_pool_alloc, and 360 dma_pool_free in the kernel,
> so changing the API is no small task. :(

Indeed, an API change like this might be the only way to move metadata
out of the DMA blocks while maintaining its current performance.

Regarding the performance hit of the submitted patches:
- *Allocations* remain constant time (`O(1)`), as they simply check the
  `pool->next_block` pointer.
- *Frees* are no longer constant time. Previously, `dma_pool_free()`
  converted a `vaddr` to its corresponding `block` by type casting, but
  now it calls `pool_find_block()`, which iterates over all pages
  (`O(n)`).

Therefore, optimizing `dma_pool_free()` is key. While restructuring the
API is the "best" solution, I implemented a compromise: introducing a
`struct maple_tree block_map` field in `struct dma_pool` to save
mappings of a `vaddr` to its corresponding `block`. A maple tree isn't
constant time, but it offers `O(log n)` performance, which is a
significant improvement over the current `O(n)` iteration.

Here are the performance results.
I've already reported the first two sets of numbers, but I'll repeat
them here:

**With no patches applied:**
```
dmapool test: size:16   align:16   blocks:8192 time:11860
dmapool test: size:64   align:64   blocks:8192 time:11951
dmapool test: size:256  align:256  blocks:8192 time:12287
dmapool test: size:1024 align:1024 blocks:2048 time:3134
dmapool test: size:4096 align:4096 blocks:1024 time:1686
dmapool test: size:68   align:32   blocks:8192 time:12050
```

**With the submitted patches applied:**
```
dmapool test: size:16   align:16   blocks:8192 time:34432
dmapool test: size:64   align:64   blocks:8192 time:62262
dmapool test: size:256  align:256  blocks:8192 time:238137
dmapool test: size:1024 align:1024 blocks:2048 time:61386
dmapool test: size:4096 align:4096 blocks:1024 time:75342
dmapool test: size:68   align:32   blocks:8192 time:88243
```

**With the submitted patches applied AND using a maple tree to improve
the performance of vaddr-to-block translations:**
```
dmapool test: size:16   align:16   blocks:8192 time:43668
dmapool test: size:64   align:64   blocks:8192 time:44746
dmapool test: size:256  align:256  blocks:8192 time:45434
dmapool test: size:1024 align:1024 blocks:2048 time:11013
dmapool test: size:4096 align:4096 blocks:1024 time:5250
dmapool test: size:68   align:32   blocks:8192 time:45900
```

The maple tree optimization reduces the performance hit for most block
sizes (all but the size:16 case), especially for larger blocks. While
the performance is not fully back to baseline, it gives a reasonable
trade-off between protection, runtime performance, and ease of
deployment (i.e., not requiring an API change).

If this approach looks acceptable, I can submit it as a v3 patch series
for further review and discussion.

Thanks,
Brian Johannesmeyer
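
For readers comparing the three result sets, the relative cost of each variant against the unpatched baseline can be worked out directly from the reported timings. The following is a minimal Python sketch of that arithmetic. The dictionary names and the helpers `slowdown` and `regressions` are illustrative assumptions, not part of the patches or of the dmapool test module; only the numbers are taken from the results above.

```python
# Timings copied from the three result sets above, keyed by (size, align).
# All names in this sketch are hypothetical and exist only for illustration.
BASELINE = {
    (16, 16): 11860,
    (64, 64): 11951,
    (256, 256): 12287,
    (1024, 1024): 3134,
    (4096, 4096): 1686,
    (68, 32): 12050,
}
PATCHED = {  # submitted patches, linear vaddr-to-block lookup
    (16, 16): 34432,
    (64, 64): 62262,
    (256, 256): 238137,
    (1024, 1024): 61386,
    (4096, 4096): 75342,
    (68, 32): 88243,
}
MAPLE = {  # submitted patches plus the maple tree lookup
    (16, 16): 43668,
    (64, 64): 44746,
    (256, 256): 45434,
    (1024, 1024): 11013,
    (4096, 4096): 5250,
    (68, 32): 45900,
}


def slowdown(variant, baseline=BASELINE):
    """Return time(variant) / time(baseline) for every configuration."""
    return {cfg: variant[cfg] / baseline[cfg] for cfg in baseline}


def regressions(candidate, reference):
    """Return the configurations where `candidate` is slower than `reference`."""
    return [cfg for cfg in reference if candidate[cfg] > reference[cfg]]


if __name__ == "__main__":
    patched = slowdown(PATCHED)
    maple = slowdown(MAPLE)
    for size, align in BASELINE:
        cfg = (size, align)
        print(f"size:{size} align:{align} "
              f"patched:{patched[cfg]:.2f}x maple:{maple[cfg]:.2f}x")
    print("maple slower than linear lookup for:", regressions(MAPLE, PATCHED))
```

On these numbers the maple tree variant stays between roughly 3x and 4x the baseline in every configuration, while the linear-lookup variant ranges from about 3x to about 45x. The only configuration where the maple tree variant is slower than the linear lookup is size:16.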