From: Leon Romanovsky <leon@kernel.org>
Cc: Leon Romanovsky, Jens Axboe, Christoph Hellwig, Keith Busch, Jake Edge,
	Jonathan Corbet, Jason Gunthorpe, Zhu Yanjun, Robin Murphy,
	Joerg Roedel, Will Deacon, Sagi Grimberg, Bjorn Helgaas,
	Logan Gunthorpe, Yishai Hadas, Shameer Kolothum, Kevin Tian,
	Alex Williamson, Jérôme Glisse, Andrew Morton,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
	linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
	Niklas Schnelle, Chuck Lever, Luis Chamberlain, Matthew Wilcox,
	Dan Williams, Kanchan Joshi, Chaitanya Kulkarni
Subject: [PATCH v11 7/9] dma-mapping: Implement link/unlink ranges API
Date: Mon, 5 May 2025 10:01:44 +0300
Message-ID: <41f0281051375512df1304abed642dcea2ae1e6b.1746424934.git.leon@kernel.org>
X-Mailer: git-send-email 2.49.0

From: Leon Romanovsky

Introduce new DMA APIs to perform DMA linkage of buffers in layers
higher than the DMA API. With the proposed API, callers perform the
following steps.

In the map path:

	if (dma_can_use_iova(...))
		dma_iova_alloc()
		for (page in range)
			dma_iova_link_next(...)
		dma_iova_sync(...)
	else
		/* Fall back to the legacy page-mapping path */
		for (all pages)
			dma_map_page(...)

In the unmap path:

	if (dma_can_use_iova(...))
		dma_iova_destroy()
	else
		for (all pages)
			dma_unmap_page(...)
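For concreteness, a minimal caller sketch against the functions this
patch actually exports (dma_iova_try_alloc(), dma_use_iova(),
dma_iova_link(), dma_iova_sync(), dma_iova_destroy()). The
map_my_pages()/unmap_my_pages() helpers, the page-array layout, and the
per-page link granularity are illustrative assumptions, not part of the
patch; the dma_iova_state is assumed to be zero-initialized by the
caller so that dma_use_iova() can tell the two paths apart:

	/*
	 * Illustrative example only, not part of this patch. Maps nr_pages
	 * individual pages into one contiguous IOVA range when possible,
	 * with a single IOTLB sync at the end, and falls back to per-page
	 * dma_map_page() otherwise.
	 */
	#include <linux/dma-mapping.h>
	#include <linux/mm.h>

	static int map_my_pages(struct device *dev, struct page **pages,
				unsigned int nr_pages, dma_addr_t *dma_addrs,
				struct dma_iova_state *state,
				enum dma_data_direction dir)
	{
		size_t size = (size_t)nr_pages << PAGE_SHIFT;
		unsigned int i;
		int ret;

		if (dma_iova_try_alloc(dev, state, page_to_phys(pages[0]), size)) {
			/* Link each page, deferring the costly IOTLB sync. */
			for (i = 0; i < nr_pages; i++) {
				ret = dma_iova_link(dev, state,
						    page_to_phys(pages[i]),
						    (size_t)i << PAGE_SHIFT,
						    PAGE_SIZE, dir, 0);
				if (ret)
					goto err_destroy;
			}
			/* One sync for the whole IOVA-contiguous range. */
			ret = dma_iova_sync(dev, state, 0, size);
			if (ret)
				goto err_destroy;
			return 0;
		}

		/* Fallback: legacy per-page mapping. */
		for (i = 0; i < nr_pages; i++) {
			dma_addrs[i] = dma_map_page(dev, pages[i], 0,
						    PAGE_SIZE, dir);
			if (dma_mapping_error(dev, dma_addrs[i])) {
				while (i--)
					dma_unmap_page(dev, dma_addrs[i],
						       PAGE_SIZE, dir);
				return -ENOMEM;
			}
		}
		return 0;

	err_destroy:
		/* Unlink whatever was linked so far and free the IOVA space. */
		dma_iova_destroy(dev, state, (size_t)i << PAGE_SHIFT, dir, 0);
		return ret;
	}

	static void unmap_my_pages(struct device *dev,
				   struct dma_iova_state *state,
				   dma_addr_t *dma_addrs,
				   unsigned int nr_pages,
				   enum dma_data_direction dir)
	{
		unsigned int i;

		if (dma_use_iova(state))
			dma_iova_destroy(dev, state,
					 (size_t)nr_pages << PAGE_SHIFT,
					 dir, 0);
		else
			for (i = 0; i < nr_pages; i++)
				dma_unmap_page(dev, dma_addrs[i],
					       PAGE_SIZE, dir);
	}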
Reviewed-by: Christoph Hellwig
Tested-by: Jens Axboe
Reviewed-by: Luis Chamberlain
Signed-off-by: Leon Romanovsky
---
 drivers/iommu/dma-iommu.c   | 275 +++++++++++++++++++++++++++++++++++-
 include/linux/dma-mapping.h |  32 +++++
 2 files changed, 306 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d7684024c439..98f7205ec8fb 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1175,6 +1175,17 @@ static phys_addr_t iommu_dma_map_swiotlb(struct device *dev, phys_addr_t phys,
 	return phys;
 }
 
+/*
+ * Checks if a physical buffer has unaligned boundaries with respect to
+ * the IOMMU granule. Returns non-zero if either the start or end
+ * address is not aligned to the granule boundary.
+ */
+static inline size_t iova_unaligned(struct iova_domain *iovad, phys_addr_t phys,
+				    size_t size)
+{
+	return iova_offset(iovad, phys | size);
+}
+
 dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 	      unsigned long offset, size_t size, enum dma_data_direction dir,
 	      unsigned long attrs)
@@ -1192,7 +1203,7 @@ dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 	 * we don't need to use a bounce page.
 	 */
 	if (dev_use_swiotlb(dev, size, dir) &&
-	    iova_offset(iovad, phys | size)) {
+	    iova_unaligned(iovad, phys, size)) {
 		phys = iommu_dma_map_swiotlb(dev, phys, size, dir, attrs);
 		if (phys == (phys_addr_t)DMA_MAPPING_ERROR)
 			return DMA_MAPPING_ERROR;
@@ -1818,6 +1829,268 @@ void dma_iova_free(struct device *dev, struct dma_iova_state *state)
 }
 EXPORT_SYMBOL_GPL(dma_iova_free);
 
+static int __dma_iova_link(struct device *dev, dma_addr_t addr,
+		phys_addr_t phys, size_t size, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	bool coherent = dev_is_dma_coherent(dev);
+
+	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		arch_sync_dma_for_device(phys, size, dir);
+
+	return iommu_map_nosync(iommu_get_dma_domain(dev), addr, phys, size,
+			dma_info_to_prot(dir, coherent, attrs), GFP_ATOMIC);
+}
+
+static int iommu_dma_iova_bounce_and_link(struct device *dev, dma_addr_t addr,
+		phys_addr_t phys, size_t bounce_len,
+		enum dma_data_direction dir, unsigned long attrs,
+		size_t iova_start_pad)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iova_domain *iovad = &domain->iova_cookie->iovad;
+	phys_addr_t bounce_phys;
+	int error;
+
+	bounce_phys = iommu_dma_map_swiotlb(dev, phys, bounce_len, dir, attrs);
+	if (bounce_phys == DMA_MAPPING_ERROR)
+		return -ENOMEM;
+
+	error = __dma_iova_link(dev, addr - iova_start_pad,
+			bounce_phys - iova_start_pad,
+			iova_align(iovad, bounce_len), dir, attrs);
+	if (error)
+		swiotlb_tbl_unmap_single(dev, bounce_phys, bounce_len, dir,
+				attrs);
+	return error;
+}
+
+static int iommu_dma_iova_link_swiotlb(struct device *dev,
+		struct dma_iova_state *state, phys_addr_t phys, size_t offset,
+		size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	size_t iova_start_pad = iova_offset(iovad, phys);
+	size_t iova_end_pad = iova_offset(iovad, phys + size);
+	dma_addr_t addr = state->addr + offset;
+	size_t mapped = 0;
+	int error;
+
+	if (iova_start_pad) {
+		size_t bounce_len = min(size, iovad->granule - iova_start_pad);
+
+		error = iommu_dma_iova_bounce_and_link(dev, addr, phys,
+				bounce_len, dir, attrs, iova_start_pad);
+		if (error)
+			return error;
+		state->__size |= DMA_IOVA_USE_SWIOTLB;
+
+		mapped += bounce_len;
+		size -= bounce_len;
+		if (!size)
+			return 0;
+	}
+
+	size -= iova_end_pad;
+	error = __dma_iova_link(dev, addr + mapped, phys + mapped, size, dir,
+			attrs);
+	if (error)
+		goto out_unmap;
+	mapped += size;
+
+	if (iova_end_pad) {
+		error = iommu_dma_iova_bounce_and_link(dev, addr + mapped,
+				phys + mapped, iova_end_pad, dir, attrs, 0);
+		if (error)
+			goto out_unmap;
+		state->__size |= DMA_IOVA_USE_SWIOTLB;
+	}
+
+	return 0;
+
+out_unmap:
+	dma_iova_unlink(dev, state, 0, mapped, dir, attrs);
+	return error;
+}
+
+/**
+ * dma_iova_link - Link a range of IOVA space
+ * @dev: DMA device
+ * @state: IOVA state
+ * @phys: physical address to link
+ * @offset: offset into the IOVA state to map into
+ * @size: size of the buffer
+ * @dir: DMA direction
+ * @attrs: attributes of mapping properties
+ *
+ * Link a range of IOVA space for the given IOVA state without IOTLB sync.
+ * This function is used to link multiple physical addresses in contiguous
+ * IOVA space without performing costly IOTLB sync.
+ *
+ * The caller is responsible for calling dma_iova_sync() to sync the IOTLB at
+ * the end of linkage.
+ */
+int dma_iova_link(struct device *dev, struct dma_iova_state *state,
+		phys_addr_t phys, size_t offset, size_t size,
+		enum dma_data_direction dir, unsigned long attrs)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	size_t iova_start_pad = iova_offset(iovad, phys);
+
+	if (WARN_ON_ONCE(iova_start_pad && offset > 0))
+		return -EIO;
+
+	if (dev_use_swiotlb(dev, size, dir) &&
+	    iova_unaligned(iovad, phys, size))
+		return iommu_dma_iova_link_swiotlb(dev, state, phys, offset,
+				size, dir, attrs);
+
+	return __dma_iova_link(dev, state->addr + offset - iova_start_pad,
+			phys - iova_start_pad,
+			iova_align(iovad, size + iova_start_pad), dir, attrs);
+}
+EXPORT_SYMBOL_GPL(dma_iova_link);
+
+/**
+ * dma_iova_sync - Sync IOTLB
+ * @dev: DMA device
+ * @state: IOVA state
+ * @offset: offset into the IOVA state to sync
+ * @size: size of the buffer
+ *
+ * Sync IOTLB for the given IOVA state. This function should be called on
+ * the IOVA-contiguous range created by one or more dma_iova_link() calls
+ * to sync the IOTLB.
+ */
+int dma_iova_sync(struct device *dev, struct dma_iova_state *state,
+		size_t offset, size_t size)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	dma_addr_t addr = state->addr + offset;
+	size_t iova_start_pad = iova_offset(iovad, addr);
+
+	return iommu_sync_map(domain, addr - iova_start_pad,
+			iova_align(iovad, size + iova_start_pad));
+}
+EXPORT_SYMBOL_GPL(dma_iova_sync);
+
+static void iommu_dma_iova_unlink_range_slow(struct device *dev,
+		dma_addr_t addr, size_t size, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	size_t iova_start_pad = iova_offset(iovad, addr);
+	dma_addr_t end = addr + size;
+
+	do {
+		phys_addr_t phys;
+		size_t len;
+
+		phys = iommu_iova_to_phys(domain, addr);
+		if (WARN_ON(!phys))
+			/* Something very horrible happened here */
+			return;
+
+		len = min_t(size_t,
+			end - addr, iovad->granule - iova_start_pad);
+
+		if (!dev_is_dma_coherent(dev) &&
+		    !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+			arch_sync_dma_for_cpu(phys, len, dir);
+
+		swiotlb_tbl_unmap_single(dev, phys, len, dir, attrs);
+
+		addr += len;
+		iova_start_pad = 0;
+	} while (addr < end);
+}
+
+static void __iommu_dma_iova_unlink(struct device *dev,
+		struct dma_iova_state *state, size_t offset, size_t size,
+		enum dma_data_direction dir, unsigned long attrs,
+		bool free_iova)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	dma_addr_t addr = state->addr + offset;
+	size_t iova_start_pad = iova_offset(iovad, addr);
+	struct iommu_iotlb_gather iotlb_gather;
+	size_t unmapped;
+
+	if ((state->__size & DMA_IOVA_USE_SWIOTLB) ||
+	    (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)))
+		iommu_dma_iova_unlink_range_slow(dev, addr, size, dir, attrs);
+
+	iommu_iotlb_gather_init(&iotlb_gather);
+	iotlb_gather.queued = free_iova && READ_ONCE(cookie->fq_domain);
+
+	size = iova_align(iovad, size + iova_start_pad);
+	addr -= iova_start_pad;
+	unmapped = iommu_unmap_fast(domain, addr, size, &iotlb_gather);
+	WARN_ON(unmapped != size);
+
+	if (!iotlb_gather.queued)
+		iommu_iotlb_sync(domain, &iotlb_gather);
+	if (free_iova)
+		iommu_dma_free_iova(domain, addr, size, &iotlb_gather);
+}
+
+/**
+ * dma_iova_unlink - Unlink a range of IOVA space
+ * @dev: DMA device
+ * @state: IOVA state
+ * @offset: offset into the IOVA state to unlink
+ * @size: size of the buffer
+ * @dir: DMA direction
+ * @attrs: attributes of mapping properties
+ *
+ * Unlink a range of IOVA space for the given IOVA state.
+ */
+void dma_iova_unlink(struct device *dev, struct dma_iova_state *state,
+		size_t offset, size_t size, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	__iommu_dma_iova_unlink(dev, state, offset, size, dir, attrs, false);
+}
+EXPORT_SYMBOL_GPL(dma_iova_unlink);
+
+/**
+ * dma_iova_destroy - Finish a DMA mapping transaction
+ * @dev: DMA device
+ * @state: IOVA state
+ * @mapped_len: number of bytes to unmap
+ * @dir: DMA direction
+ * @attrs: attributes of mapping properties
+ *
+ * Unlink the IOVA range up to @mapped_len and free the entire IOVA space. The
+ * range of IOVA from dma_addr to @mapped_len must all be linked, and be the
+ * only linked IOVA in state.
+ */
+void dma_iova_destroy(struct device *dev, struct dma_iova_state *state,
+		size_t mapped_len, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	if (mapped_len)
+		__iommu_dma_iova_unlink(dev, state, 0, mapped_len, dir, attrs,
+				true);
+	else
+		/*
+		 * We can be here if first call to dma_iova_link() failed and
+		 * there is nothing to unlink, so let's be more clear.
+		 */
+		dma_iova_free(dev, state);
+}
+EXPORT_SYMBOL_GPL(dma_iova_destroy);
+
 void iommu_setup_dma_ops(struct device *dev)
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index de7f73810d54..a71e110f1e9d 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -309,6 +309,17 @@ static inline bool dma_use_iova(struct dma_iova_state *state)
 bool dma_iova_try_alloc(struct device *dev, struct dma_iova_state *state,
 		phys_addr_t phys, size_t size);
 void dma_iova_free(struct device *dev, struct dma_iova_state *state);
+void dma_iova_destroy(struct device *dev, struct dma_iova_state *state,
+		size_t mapped_len, enum dma_data_direction dir,
+		unsigned long attrs);
+int dma_iova_sync(struct device *dev, struct dma_iova_state *state,
+		size_t offset, size_t size);
+int dma_iova_link(struct device *dev, struct dma_iova_state *state,
+		phys_addr_t phys, size_t offset, size_t size,
+		enum dma_data_direction dir, unsigned long attrs);
+void dma_iova_unlink(struct device *dev, struct dma_iova_state *state,
+		size_t offset, size_t size, enum dma_data_direction dir,
+		unsigned long attrs);
 #else /* CONFIG_IOMMU_DMA */
 static inline bool dma_use_iova(struct dma_iova_state *state)
 {
@@ -323,6 +334,27 @@ static inline void dma_iova_free(struct device *dev,
 		struct dma_iova_state *state)
 {
 }
+static inline void dma_iova_destroy(struct device *dev,
+		struct dma_iova_state *state, size_t mapped_len,
+		enum dma_data_direction dir, unsigned long attrs)
+{
+}
+static inline int dma_iova_sync(struct device *dev,
+		struct dma_iova_state *state, size_t offset, size_t size)
+{
+	return -EOPNOTSUPP;
+}
+static inline int dma_iova_link(struct device *dev,
+		struct dma_iova_state *state, phys_addr_t phys, size_t offset,
+		size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+	return -EOPNOTSUPP;
+}
+static inline void dma_iova_unlink(struct device *dev,
+		struct dma_iova_state *state, size_t offset, size_t size,
+		enum dma_data_direction dir, unsigned long attrs)
+{
+}
 #endif /* CONFIG_IOMMU_DMA */
 
 #if defined(CONFIG_HAS_DMA) && defined(CONFIG_DMA_NEED_SYNC)
-- 
2.49.0