From: Robin Murphy <robin.murphy@arm.com>
Date: Mon, 7 Nov 2022 13:26:21 +0000
Subject: Re: [PATCH v3 03/13] iommu/dma: Force bouncing if the size is not cacheline-aligned
To: Catalin Marinas, Christoph Hellwig
Cc: Linus Torvalds, Arnd Bergmann, Greg Kroah-Hartman, Will Deacon,
    Marc Zyngier, Andrew Morton, Herbert Xu, Ard Biesheuvel,
    Isaac Manjarres, Saravana Kannan, Alasdair Kergon, Daniel Vetter,
    Joerg Roedel, Mark Brown, Mike Snitzer, "Rafael J. Wysocki",
Wysocki" , linux-mm@kvack.org, iommu@lists.linux.dev, linux-arm-kernel@lists.infradead.org References: <20221106220143.2129263-1-catalin.marinas@arm.com> <20221106220143.2129263-4-catalin.marinas@arm.com> <20221107094603.GB6055@lst.de> From: Robin Murphy In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1667827590; a=rsa-sha256; cv=none; b=kPgBG1+U7Ga3hnFZbNKQQD/tdt8vvNiSPQOW5Le7vheXkgOyrycWyie2BN2A6aDVLQM6KQ DkoQRnjDd+29T7/AIan74rZdsJ12hJtqvFjqSAP19A34R6NbccRngFj0m78QyQfomzsk5M ilknPBaOZR7BTW6PAT7YuMEeQqEUW0k= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=none; spf=pass (imf01.hostedemail.com: domain of robin.murphy@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=robin.murphy@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1667827590; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zfOUpCLIyz+Jy/ZeulgsoS01lQ4nnFfYOR8Glm0owoU=; b=bFFVSnv5gtDqqvZUSBfBb+ratVZLxRfS/2zN4SqvLLUsthTsqUGheIA1irsq2ve+7Jb8YU pm/q2EszP0+wF5EDfF/Dyq6kG50nHzi3r9r7RrEg/W8aZ/byhjyA1X439GyKwYLkOuxVAl h83KPky5QWTE+0s9UGOYe1tvQ+mM5Qw= X-Stat-Signature: swagbj5uc9mcowiafzhyfh5nwbg46ewg X-Rspamd-Queue-Id: 4A75C4000E Authentication-Results: imf01.hostedemail.com; dkim=none; spf=pass (imf01.hostedemail.com: domain of robin.murphy@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=robin.murphy@arm.com; dmarc=pass (policy=none) header.from=arm.com X-Rspam-User: X-Rspamd-Server: rspam11 X-HE-Tag: 1667827590-646911 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 2022-11-07 10:54, Catalin Marinas wrote: > On Mon, Nov 07, 2022 at 10:46:03AM +0100, Christoph Hellwig wrote: >>> +static inline bool dma_sg_kmalloc_needs_bounce(struct device *dev, >>> + struct scatterlist *sg, int nents, >>> + enum dma_data_direction dir) >>> +{ >>> + struct scatterlist *s; >>> + int i; >>> + >>> + if (!IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) || >>> + dir == DMA_TO_DEVICE || dev_is_dma_coherent(dev)) >>> + return false; >> >> This part should be shared with dma-direct in a well documented helper. >> >>> + for_each_sg(sg, s, nents, i) { >>> + if (dma_kmalloc_needs_bounce(dev, s->length, dir)) >>> + return true; >>> + } >> >> And for this loop iteration I'd much prefer it to be out of line, and >> also not available in a global helper. >> >> But maybe someone can come up with a nice tweak to the dma-iommu >> code to not require the extra sglist walk anyway. > > An idea: we could add another member to struct scatterlist to track the > bounced address. We can then do the bouncing in a similar way to > iommu_dma_map_sg_swiotlb() but without the iova allocation. The latter > would be a common path for both the bounced and non-bounced cases. FWIW I spent a little time looking at this as well; I'm pretty confident it can be done without the extra walk if the iommu-dma bouncing is completely refactored (and it might want a SWIOTLB helper to retrieve the original page from a bounced address). 
The one thing I did get as far as writing up is the patch below, which
I'll share as an indirect review comment on this patch - feel free to
pick it up or squash it in if you think it's worthwhile.

Thanks,
Robin.

----->8-----
From: Robin Murphy <robin.murphy@arm.com>
Date: Wed, 2 Nov 2022 17:35:09 +0000
Subject: [PATCH] scatterlist: Add dedicated config for DMA flags

The DMA flags field will be useful for users beyond PCI P2P, so upgrade
to its own dedicated config option.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/pci/Kconfig         | 1 +
 include/linux/scatterlist.h | 4 ++--
 kernel/dma/Kconfig          | 3 +++
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 55c028af4bd9..0303604d9de9 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -173,6 +173,7 @@ config PCI_P2PDMA
 	#
 	depends on 64BIT
 	select GENERIC_ALLOCATOR
+	select NEED_SG_DMA_FLAGS
 	help
 	  Enableѕ drivers to do PCI peer-to-peer transactions to and from
 	  BARs that are exposed in other devices that are the part of
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 375a5e90d86a..87aaf8b5cdb4 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -16,7 +16,7 @@ struct scatterlist {
 #ifdef CONFIG_NEED_SG_DMA_LENGTH
 	unsigned int	dma_length;
 #endif
-#ifdef CONFIG_PCI_P2PDMA
+#ifdef CONFIG_NEED_SG_DMA_FLAGS
 	unsigned int	dma_flags;
 #endif
 };
@@ -249,7 +249,7 @@ static inline void sg_unmark_end(struct scatterlist *sg)
 }
 
 /*
- * CONFGI_PCI_P2PDMA depends on CONFIG_64BIT which means there is 4 bytes
+ * CONFIG_PCI_P2PDMA depends on CONFIG_64BIT which means there is 4 bytes
  * in struct scatterlist (assuming also CONFIG_NEED_SG_DMA_LENGTH is set).
  * Use this padding for DMA flags bits to indicate when a specific
  * dma address is a bus address.
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 56866aaa2ae1..48016c4f67ac 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -24,6 +24,9 @@ config DMA_OPS_BYPASS
 config ARCH_HAS_DMA_MAP_DIRECT
 	bool
 
+config NEED_SG_DMA_FLAGS
+	bool
+
 config NEED_SG_DMA_LENGTH
 	bool
 
--
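As a usage note, the point of the dedicated symbol is that a non-P2P user
can then select NEED_SG_DMA_FLAGS from its own Kconfig and claim further
bits in sg->dma_flags. A purely hypothetical sketch of what a consumer-side
accessor could look like (the SG_DMA_SWIOTLB name is invented for
illustration, and it assumes the existing P2PDMA bus-address flag keeps
bit 0):

#ifdef CONFIG_NEED_SG_DMA_FLAGS
/* Hypothetical extra flag bit; bit 0 assumed taken by the P2PDMA flag. */
#define SG_DMA_SWIOTLB		(1 << 1)

static inline bool sg_dma_is_swiotlb(struct scatterlist *sg)
{
	return sg->dma_flags & SG_DMA_SWIOTLB;
}

static inline void sg_dma_mark_swiotlb(struct scatterlist *sg)
{
	sg->dma_flags |= SG_DMA_SWIOTLB;
}
#else
static inline bool sg_dma_is_swiotlb(struct scatterlist *sg)
{
	return false;
}

static inline void sg_dma_mark_swiotlb(struct scatterlist *sg)
{
}
#endif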