From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 8 Nov 2022 11:40:16 +0000
Subject: Re: [PATCH v3 03/13] iommu/dma: Force bouncing if the size is not cacheline-aligned
From: Robin Murphy
To: Catalin Marinas
Cc: Christoph Hellwig, Linus Torvalds, Arnd Bergmann, Greg Kroah-Hartman,
 Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu, Ard Biesheuvel,
 Isaac Manjarres, Saravana Kannan, Alasdair Kergon, Daniel Vetter,
 Joerg Roedel, Mark Brown, Mike Snitzer, "Rafael J. Wysocki",
 linux-mm@kvack.org, iommu@lists.linux.dev,
 linux-arm-kernel@lists.infradead.org
References: <20221106220143.2129263-1-catalin.marinas@arm.com>
 <20221106220143.2129263-4-catalin.marinas@arm.com>
 <20221107094603.GB6055@lst.de>

On 2022-11-08 10:51, Catalin Marinas wrote:
> On Mon, Nov 07, 2022 at 01:26:21PM +0000, Robin Murphy wrote:
>> On 2022-11-07 10:54, Catalin Marinas wrote:
>>> On Mon, Nov 07, 2022 at 10:46:03AM +0100, Christoph Hellwig wrote:
>>>>> +static inline bool dma_sg_kmalloc_needs_bounce(struct device *dev,
>>>>> +						  struct scatterlist *sg, int nents,
>>>>> +						  enum dma_data_direction dir)
>>>>> +{
>>>>> +	struct scatterlist *s;
>>>>> +	int i;
>>>>> +
>>>>> +	if (!IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) ||
>>>>> +	    dir == DMA_TO_DEVICE || dev_is_dma_coherent(dev))
>>>>> +		return false;
>>>>
>>>> This part should be shared with dma-direct in a well documented helper.
>>>>
>>>>> +	for_each_sg(sg, s, nents, i) {
>>>>> +		if (dma_kmalloc_needs_bounce(dev, s->length, dir))
>>>>> +			return true;
>>>>> +	}
>>>>
>>>> And for this loop iteration I'd much prefer it to be out of line, and
>>>> also not available in a global helper.
>>>>
>>>> But maybe someone can come up with a nice tweak to the dma-iommu
>>>> code to not require the extra sglist walk anyway.
>>>
>>> An idea: we could add another member to struct scatterlist to track the
>>> bounced address. We can then do the bouncing in a similar way to
>>> iommu_dma_map_sg_swiotlb() but without the iova allocation. The latter
>>> would be a common path for both the bounced and non-bounced cases.
>>
>> FWIW I spent a little time looking at this as well; I'm pretty confident
>> it can be done without the extra walk if the iommu-dma bouncing is
>> completely refactored (and it might want a SWIOTLB helper to retrieve
>> the original page from a bounced address).
>
> Doesn't sg_page() provide the original page already? Either way, the
> swiotlb knows it as it needs to do the copying between buffers.

For the part where we temporarily rewrite the offsets and lengths to
pass to iommu_map_sg(), we'd also have to swizzle any relevant page
pointers so that the mapping picks up the physical addresses of the
bounce buffer slots rather than the original pages, but then we need to
put them back straight afterwards. Since SWIOTLB keeps track of that
internally, it'll be a lot neater and more efficient to simply ask for
it than to allocate more temporary storage to remember it independently
(like I did for that horrible erratum thing to keep it self-contained).

>> That's going to be a bigger job than I'll be able to finish this
>> cycle, and I concluded that this in-between approach wouldn't be worth
>> posting for its own sake, but as part of this series I think it's a
>> reasonable compromise.
>
> I'll drop my hack once you have something. Happy to carry it as part of
> this series.

Cool, I can't promise how soon I'll get there, but like I said, if all
the other objections are worked out in the meantime I have no issue
with landing this approach and improving on it later.

Thanks,
Robin.

>> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
>> index 375a5e90d86a..87aaf8b5cdb4 100644
>> --- a/include/linux/scatterlist.h
>> +++ b/include/linux/scatterlist.h
>> @@ -16,7 +16,7 @@ struct scatterlist {
>>  #ifdef CONFIG_NEED_SG_DMA_LENGTH
>>  	unsigned int dma_length;
>>  #endif
>> -#ifdef CONFIG_PCI_P2PDMA
>> +#ifdef CONFIG_NEED_SG_DMA_FLAGS
>>  	unsigned int dma_flags;
>>  #endif
>
> I initially had something similar but I decided it's overkill for a
> patch that I expected to be NAK'ed.
>
> I'll include your patch in my series in the meantime.
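(For completeness, the way I'd picture dma_flags eventually being used
for the bouncing case is something like the below - entirely
illustrative, the flag value and helper names are invented here,
mirroring the existing SG_DMA_BUS_ADDRESS flag and
sg_dma_mark_bus_address() that CONFIG_PCI_P2PDMA already guards.)

/* Hypothetical flag bit and accessors in include/linux/scatterlist.h */
#define SG_DMA_SWIOTLB		(1 << 1)

static inline bool sg_dma_is_swiotlb(struct scatterlist *sg)
{
	return sg->dma_flags & SG_DMA_SWIOTLB;
}

static inline void sg_dma_mark_swiotlb(struct scatterlist *sg)
{
	sg->dma_flags |= SG_DMA_SWIOTLB;
}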