From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0A2F0C433FE for ; Tue, 1 Nov 2022 18:15:12 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 685608E0001; Tue, 1 Nov 2022 14:15:12 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 634EC6B0073; Tue, 1 Nov 2022 14:15:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4FCA88E0001; Tue, 1 Nov 2022 14:15:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 3FFF16B0071 for ; Tue, 1 Nov 2022 14:15:12 -0400 (EDT) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id DCD37140483 for ; Tue, 1 Nov 2022 18:15:11 +0000 (UTC) X-FDA: 80085675222.18.E7FF00F Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf06.hostedemail.com (Postfix) with ESMTP id CCE88180046 for ; Tue, 1 Nov 2022 18:15:10 +0000 (UTC) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 146D01FB; Tue, 1 Nov 2022 11:15:16 -0700 (PDT) Received: from [10.57.38.25] (unknown [10.57.38.25]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 263FF3F5A1; Tue, 1 Nov 2022 11:15:06 -0700 (PDT) Message-ID: <0d01426e-dd16-5370-2ff4-d11205d4d20d@arm.com> Date: Tue, 1 Nov 2022 18:14:58 +0000 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:102.0) Gecko/20100101 Thunderbird/102.4.1 Subject: Re: [PATCH v2 2/2] treewide: Add the __GFP_PACKED flag to several non-DMA kmalloc() allocations Content-Language: en-GB To: Catalin Marinas , Christoph Hellwig Cc: Greg Kroah-Hartman , Linus Torvalds , Arnd Bergmann , Will Deacon , Marc Zyngier , Andrew Morton , Herbert Xu , Ard Biesheuvel , Isaac Manjarres , Saravana Kannan , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org References: <20221030084718.GC5278@lst.de> <20221030091349.GA5600@lst.de> <20221101105919.GA13872@lst.de> From: Robin Murphy In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1667326511; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Gw8bGWktPPAaqRuSYfzMY0QVCrO1gBlKEhysOnZ8eqY=; b=w8+isqpCnS9T8NsT3WffFaa0cAYnepkMH2/M7MCBquOkDMnkBAOzijC2Nf1loeF0VKLmrb KAjaUnHIsmr/j4eKipwXeM3Bop0I7GnO/x4BmMKeTFJceUMqZ3tQMKFd2kovGLoshI02iE Ru8DRfzdVgGxszWjIagxub40tCh3R2c= ARC-Authentication-Results: i=1; imf06.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf06.hostedemail.com: domain of robin.murphy@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=robin.murphy@arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1667326511; a=rsa-sha256; cv=none; b=bvR9CLrvbQUR+xO/tU1JKgvrZDgTTBTKfGzpdJr8T5T3QS0NVGPHNcVVzyFhjXBXAi1JPP Ozj62xEe7yH4OSvonQxpJHuDggXc3qvCI2y26zf65qfcjRwm/35MGeO/zEwpLR+Qhz+Tc6 81xIKVTQ22Df+SjkVw1vXXXe6fDlAM0= Authentication-Results: imf06.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf06.hostedemail.com: domain of robin.murphy@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=robin.murphy@arm.com X-Rspam-User: X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: CCE88180046 X-Stat-Signature: fyojc4z5zxcwe11hp8if793ne6r4aawr X-HE-Tag: 1667326510-785443 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 2022-11-01 17:19, Catalin Marinas wrote: > On Tue, Nov 01, 2022 at 11:59:19AM +0100, Christoph Hellwig wrote: >> On Sun, Oct 30, 2022 at 04:43:21PM +0000, Catalin Marinas wrote: >>> The alternative is for dma_map_*() to (dynamically) verify at the point >>> of DMA setup rather than at the point of allocation (and bounce). >> >> One issue with dma_map_* is that is can't report errors very well. >> So I'd really prefer to avoid adding that kind of checking there. > > By checking I meant bouncing on condition fail rather than reporting > errors. > >>> We have a few options here: >>> >>> a) Require that the dma_map_*() functions (including sg) get a size >>> multiple of cache_line_size() or the static ARCH_DMA_MINALIGN. >> >> I don't think that is going to work, there are just too many instances >> all over the tree that pass smaller sizes. >> >>> b) Always bounce small sizes as they may have come from a small kmalloc >>> cache. The alignment of both ptr and length is ignored. This assumes >>> that in-struct alignments (crypto) use ARCH_DMA_MINALIGN and we can >>> reduce ARCH_KMALLOC_MINALIGN independently. >> >> I think the start must be very much aligned. So what is the problem >> with just checking the size? >> >> Otherwise I think this is the way to go. These tiny maps tend to be >> various aux setup path thing and not performance critical (and caring >> about performance for not DMA coherent devices isn't something we do >> all that much anyway). > > The check is a bit weird as it needs some awareness of the kmalloc > caches to avoid unnecessary bouncing. We can't just check for smaller > than 128 since 192 is another kmalloc cache that is not aligned to an > ARCH_DMA_MINALIGN of 128: > > https://git.kernel.org/arm64/c/3508d0d4d5e458e43916bf4f61b29d2a6f15c2bb > > I can probably make it a bit nicer by checking the alignment of > kmalloc_size_roundup(size) (this function doesn't seem to have any > callers). > > The main downside of bouncing is mobile phones where those vendors are > in the habit of passing noswiotlb on the kernel command line or they > want a very small bounce buffer (hard to pick a universally small size). > I guess we can go ahead with this and, depending on how bad the problem > is, we can look at optimising swiotlb to use a kmem_cache (or aligned > kmalloc) as fall-back for bouncing. > >>> (b) is what I attempted on one of my branches (until I got stuck on >>> bouncing for iommu). A slight overhead on dma_map_*() to check the >>> length and we may keep a check on the start alignment (not length >>> though). >> >> When was that? These days dma-iommu already has bouncing code for >> the untrusted device case, so handling the bouncing there for unaligned >> request on non-coherent devices shouldn't be all that horrible. > > The bouncing currently is all or nothing with iommu_dma_map_sg(), unlike > dma_direct_map_sg() which ends up calling dma_direct_map_page() and we > can do the bouncing per element. So I was looking to untangle > iommu_dma_map_sg() in a similar way but postponed it as too complicated. > > As a less than optimal solution, we can force bouncing for the whole > list if any of the sg elements is below the alignment size. Hopefully we > won't have many such mixed size cases. Sounds like you may have got the wrong impression - the main difference with iommu_dma_map_sg_swiotlb() is that it avoids trying to do any of the clever concatenation stuff, and simply maps each segment individually with iommu_dma_map_page(), exactly like dma-direct; only segments which need bouncing actually get bounced. The reason for tying it to the up-front dev_use_swiotlb() check is because sync and unmap have to do the right thing depending on how the list was initially mapped, and that's the easiest way to be consistent. However, since the P2P stuff landed what I'd like to do now is use the new scatterlist->dma_flags to indicate bounced segments in a similar fashion to bus-address segments, and so streamline the SWIOTLB bits into the main flow in the equivalent manner. What sadly wouldn't work is just adding extra conditions to dev_use_swiotlb() to go down the existing bounce-if-necessary path for all non-coherent devices, since there are non-coherent users of dma-buf and v4l2 which (for better or worse) depend on the clever concatenation stuff happening. Thanks, Robin.