Date: Wed, 12 Mar 2025 21:32:49 +0200
From: Leon Romanovsky
To: Marek Szyprowski
Cc: Robin Murphy, Christoph Hellwig, Jason Gunthorpe, Jens Axboe,
 Joerg Roedel, Will Deacon, Sagi Grimberg, Keith Busch, Bjorn Helgaas,
 Logan Gunthorpe, Yishai Hadas, Shameer Kolothum, Kevin Tian,
 Alex Williamson, Jérôme Glisse, Andrew Morton, Jonathan Corbet,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
 iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
 linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 Randy Dunlap
Subject: Re: [PATCH v7 00/17] Provide a new two step DMA mapping API
Message-ID: <20250312193249.GI1322339@unreal>
References: <20250220124827.GR53094@unreal>
 <1166a5f5-23cc-4cce-ba40-5e10ad2606de@arm.com>

On Wed, Mar 12, 2025 at 10:28:32AM +0100, Marek Szyprowski wrote:
> Hi Robin
>
> On 28.02.2025 20:54, Robin Murphy wrote:
> > On 20/02/2025 12:48 pm, Leon Romanovsky wrote:
> >> On Wed, Feb 05, 2025 at 04:40:20PM +0200, Leon Romanovsky wrote:
> >>> From: Leon Romanovsky
> >>>
> >>> Changelog:
> >>> v7:
> >>>   * Rebased to v6.14-rc1
> >>
> >> <...>
> >>
> >>> Christoph Hellwig (6):
> >>>    PCI/P2PDMA: Refactor the p2pdma mapping helpers
> >>>    dma-mapping: move the PCI P2PDMA mapping helpers to pci-p2pdma.h
> >>>    iommu: generalize the batched sync after map interface
> >>>    iommu/dma: Factor out a iommu_dma_map_swiotlb helper
> >>>    dma-mapping: add a dma_need_unmap helper
> >>>    docs: core-api: document the IOVA-based API
> >>>
> >>> Leon Romanovsky (11):
> >>>    iommu: add kernel-doc for iommu_unmap and iommu_unmap_fast
> >>>    dma-mapping: Provide an interface to allow allocate IOVA
> >>>    dma-mapping: Implement link/unlink ranges API
> >>>    mm/hmm: let users to tag specific PFN with DMA mapped bit
> >>>    mm/hmm: provide generic DMA managing logic
> >>>    RDMA/umem: Store ODP access mask information in PFN
> >>>    RDMA/core: Convert UMEM ODP DMA mapping to caching IOVA and page
> >>>      linkage
> >>>    RDMA/umem: Separate implicit ODP initialization from explicit ODP
> >>>    vfio/mlx5: Explicitly use number of pages instead of allocated
> >>>      length
> >>>    vfio/mlx5: Rewrite create mkey flow
> >>>      to allow better code reuse
> >>>    vfio/mlx5: Enable the DMA link API
> >>>
> >>>   Documentation/core-api/dma-api.rst   |  70 ++++
> >>>   drivers/infiniband/core/umem_odp.c   | 250 +++++---------
> >>>   drivers/infiniband/hw/mlx5/mlx5_ib.h |  12 +-
> >>>   drivers/infiniband/hw/mlx5/odp.c     |  65 ++--
> >>>   drivers/infiniband/hw/mlx5/umr.c     |  12 +-
> >>>   drivers/iommu/dma-iommu.c            | 468 +++++++++++++++++++++++---
> >>>   drivers/iommu/iommu.c                |  84 ++---
> >>>   drivers/pci/p2pdma.c                 |  38 +--
> >>>   drivers/vfio/pci/mlx5/cmd.c          | 375 +++++++++++---------
> >>>   drivers/vfio/pci/mlx5/cmd.h          |  35 +-
> >>>   drivers/vfio/pci/mlx5/main.c         |  87 +++--
> >>>   include/linux/dma-map-ops.h          |  54 ----
> >>>   include/linux/dma-mapping.h          |  85 +++++
> >>>   include/linux/hmm-dma.h              |  33 ++
> >>>   include/linux/hmm.h                  |  21 ++
> >>>   include/linux/iommu.h                |   4 +
> >>>   include/linux/pci-p2pdma.h           |  84 +++++
> >>>   include/rdma/ib_umem_odp.h           |  25 +-
> >>>   kernel/dma/direct.c                  |  44 +--
> >>>   kernel/dma/mapping.c                 |  18 ++
> >>>   mm/hmm.c                             | 264 +++++++++++++--
> >>>   21 files changed, 1435 insertions(+), 693 deletions(-)
> >>>   create mode 100644 include/linux/hmm-dma.h
> >>
> >> Kind reminder.

<...>

> Removing the need for scatterlists was advertised as the main goal of
> this new API, but it looks that similar effects can be achieved with
> just iterating over the pages and calling page-based DMA API directly.

Such iteration can't be enough because P2P pages don't have struct
pages, so you can't reliably and efficiently use dma_map_page_attrs().
The only way to do so is to use dma_map_sg_attrs(), which relies on SG
(the one that we want to remove) to map P2P pages.

> Maybe I missed something.
> I still see some advantages in this DMA API
> extension, but I would also like to see the clear benefits from
> introducing it, like perf logs or other benchmark summary.

We haven't focused on performance yet; however, Christoph mentioned in
his block RFC [1] that even a simple conversion should improve
performance, as we perform one P2P lookup per bio rather than per SG
entry as before [2]. In addition, it decreases memory usage [3].

[1] https://lore.kernel.org/all/cover.1730037261.git.leon@kernel.org/
[2] https://lore.kernel.org/all/34d44537a65aba6ede215a8ad882aeee028b423a.1730037261.git.leon@kernel.org/
[3] https://lore.kernel.org/all/383557d0fa1aa393dbab4e1daec94b6cced384ab.1730037261.git.leon@kernel.org/

So the clear benefits are:
1. Ability to use the subsystem's native structure, e.g. bio for block,
   umem for RDMA, dmabuf for DRM, etc. This removes the current wasteful
   conversions to and from SG in order to work with the DMA API.
2. Batched request and IOTLB sync optimizations (performed only once).
3. Avoiding the very expensive lookups of the pgmap pointer.
4. Exposing MMIO over VFIO without hacks (a PCI BAR doesn't have struct
   pages). See this series for such a hack:
   https://lore.kernel.org/all/20250307052248.405803-1-vivek.kasireddy@intel.com/

Thanks

>
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>
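
To make benefits 2 and 3 in the list above concrete, here is a minimal
userspace sketch. It is a toy model, not kernel code: `toy_stats`,
`toy_map_per_entry()` and `toy_map_batched()` are hypothetical names, and
the assumption that the per-entry path pays one lookup and one sync for
every entry only illustrates the per-page iteration discussed in the
quoted text; it is not a measurement. The code merely counts how often
the expensive steps would run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters standing in for the expensive operations. */
struct toy_stats {
	unsigned int p2p_lookups; /* P2P/pgmap-style provider lookups */
	unsigned int iotlb_syncs; /* IOTLB sync operations */
	unsigned int linked;      /* ranges linked into the mapping */
};

/*
 * Assumed per-entry model: every entry is mapped on its own, so each
 * one pays for a lookup and for a sync.
 */
void toy_map_per_entry(struct toy_stats *st, size_t nr_entries)
{
	assert(st != NULL);
	for (size_t i = 0; i < nr_entries; i++) {
		st->p2p_lookups++;
		st->linked++;
		st->iotlb_syncs++;
	}
}

/*
 * Assumed two-step model: one lookup for the whole request, then every
 * range is linked, and a single sync follows the last link.
 */
void toy_map_batched(struct toy_stats *st, size_t nr_entries)
{
	assert(st != NULL);
	st->p2p_lookups++;                 /* once per request */
	for (size_t i = 0; i < nr_entries; i++)
		st->linked++;              /* link each range */
	st->iotlb_syncs++;                 /* sync once at the end */
}
```

Calling both functions on zero-initialized `toy_stats` with 8 entries
leaves the per-entry counters at 8 lookups and 8 syncs, while the
batched path records 1 lookup and 1 sync for the same 8 linked ranges.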