From: Christoph Hellwig <hch@lst.de>
To: Robin Murphy <robin.murphy@arm.com>
Cc: "Leon Romanovsky" <leon@kernel.org>,
	"Jens Axboe" <axboe@kernel.dk>, "Jason Gunthorpe" <jgg@ziepe.ca>,
	"Joerg Roedel" <joro@8bytes.org>, "Will Deacon" <will@kernel.org>,
	"Christoph Hellwig" <hch@lst.de>,
	"Sagi Grimberg" <sagi@grimberg.me>,
	"Leon Romanovsky" <leonro@nvidia.com>,
	"Keith Busch" <kbusch@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Logan Gunthorpe" <logang@deltatee.com>,
	"Yishai Hadas" <yishaih@nvidia.com>,
	"Shameer Kolothum" <shameerali.kolothum.thodi@huawei.com>,
	"Kevin Tian" <kevin.tian@intel.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Marek Szyprowski" <m.szyprowski@samsung.com>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
	linux-pci@vger.kernel.org, kvm@vger.kernel.org,
	linux-mm@kvack.org, "Randy Dunlap" <rdunlap@infradead.org>
Subject: Re: [PATCH v5 07/17] dma-mapping: Implement link/unlink ranges API
Date: Wed, 15 Jan 2025 07:26:28 +0100
Message-ID: <20250115062628.GA29782@lst.de>
In-Reply-To: <ad2312e0-10d5-467a-be5e-75e80805b311@arm.com>

On Tue, Jan 14, 2025 at 08:50:35PM +0000, Robin Murphy wrote:
>>  EXPORT_SYMBOL_GPL(dma_iova_free);
>> +
>> +static int __dma_iova_link(struct device *dev, dma_addr_t addr,
>> +		phys_addr_t phys, size_t size, enum dma_data_direction dir,
>> +		unsigned long attrs)
>> +{
>> +	bool coherent = dev_is_dma_coherent(dev);
>> +
>> +	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
>> +		arch_sync_dma_for_device(phys, size, dir);
>
> Again, if we're going to pretend to support non-coherent devices, where are 
> the dma_sync_for_{device,cpu} calls that work for a dma_iova_state? It 
> can't be the existing dma_sync_single ops since that would require the user 
> to keep track of every mapping to sync them individually, and the whole 
> premise is to avoid doing that (not to mention dma-debug wouldn't like it). 
> Same for anything coherent but SWIOTLB-bounced.

That assumes you actually need to sync them.  Many, if not most, DMA
mappings are one-shots: map and unmap, no sync.  And these will work
fine here.

But I guess the documentation needs to spell that out.  While I don't
have a good non-coherent system to test on, swiotlb was actually
tested with nvme when I implemented this part.

>> +{
>> +	struct iommu_domain *domain = iommu_get_dma_domain(dev);
>> +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>> +	struct iova_domain *iovad = &cookie->iovad;
>> +	size_t iova_start_pad = iova_offset(iovad, phys);
>> +	size_t iova_end_pad = iova_offset(iovad, phys + size);
>
> "end_pad" implies a length of padding from the unaligned end address to 
> reach the *next* granule boundary, but it seems this is actually the 
> unaligned tail length of the data itself. That's what confused me last 
> time, since in the map path that post-data padding region does matter in 
> its own right.

Yeah.  Do you have a suggestion for a better name?
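
To make the naming question concrete, here is a minimal userspace sketch
of the two quantities being confused.  It assumes that iova_offset()
reduces to masking an address with (granule - 1); granule_offset(),
tail_len() and pad_to_next_granule() are made-up names for illustration,
not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for iova_offset(): offset of addr within its granule.
 * Assumes the granule is a power of two. */
static size_t granule_offset(size_t granule, uint64_t addr)
{
	assert(granule && (granule & (granule - 1)) == 0);
	return (size_t)(addr & (granule - 1));
}

/* What the quoted code stores in iova_end_pad: the unaligned tail
 * length of the data, i.e. how far the end address sits past the
 * previous granule boundary. */
static size_t tail_len(size_t granule, uint64_t phys, size_t size)
{
	return granule_offset(granule, phys + size);
}

/* What the name "end pad" suggests: the number of bytes from the end
 * address up to the next granule boundary (0 if already aligned). */
static size_t pad_to_next_granule(size_t granule, uint64_t phys, size_t size)
{
	size_t tail = tail_len(granule, phys, size);

	return tail ? granule - tail : 0;
}
```

With a 4096-byte granule, phys = 0x1000 and size = 0x1400, the tail
length is 0x400 while the padding to the next boundary is 0xc00, so the
two only coincide when the end lands exactly mid-granule or on a
boundary.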

>> +		phys_addr_t phys, size_t offset, size_t size,
>> +		enum dma_data_direction dir, unsigned long attrs)
>> +{
>> +	struct iommu_domain *domain = iommu_get_dma_domain(dev);
>> +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>> +	struct iova_domain *iovad = &cookie->iovad;
>> +	size_t iova_start_pad = iova_offset(iovad, phys);
>> +
>> +	if (WARN_ON_ONCE(iova_start_pad && offset > 0))
>
> "iova_start_pad == 0" still doesn't guarantee that "phys" and "offset" are 
> appropriately aligned to each other.

>> +	if (dev_use_swiotlb(dev, size, dir) && iova_offset(iovad, phys | size))
>
> Again, why are we supporting non-granule-aligned mappings in the middle of 
> a range when the documentation explicitly says not to?

It's not trying to support that, but checking that this is guaranteed
to be the last one is harder than handling it like this.  If you have
a suggestion for better checks that would be very welcome.

>> +		if (!dev_is_dma_coherent(dev) &&
>> +		    !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
>> +			arch_sync_dma_for_cpu(phys, len, dir);
>
> Hmm, how do attrs even work for a bulk unlink/destroy when the individual 
> mappings could have been linked with different values?

They shouldn't.  Just like randomly mixing flags doesn't work for the
existing APIs.

> (So no, irrespective of how conceptually horrid it is, clearly it's not 
> even functionally viable to open-code abuse of DMA_ATTR_SKIP_CPU_SYNC in 
> callers to attempt to work around P2P mappings...)

What do you mean by "work around"?  I guess Leon added it to the hmm
code based on previous feedback, but I still don't think any of our P2P
infrastructure works reliably with non-coherent devices, as
iommu_dma_map_sg() gets this wrong.  So despite the earlier comments I
suspect this should stick to the state of the art even if that is broken.
