Date: Mon, 4 Nov 2024 13:39:20 +0200
From: Leon Romanovsky
To: Christoph Hellwig
Cc: Robin Murphy, Jens Axboe, Jason Gunthorpe, Joerg Roedel, Will Deacon,
 Sagi Grimberg, Keith Busch, Bjorn Helgaas, Logan Gunthorpe, Yishai Hadas,
 Shameer Kolothum, Kevin Tian, Alex Williamson, Marek Szyprowski,
 Jérôme Glisse, Andrew Morton, Jonathan Corbet, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
 linux-rdma@vger.kernel.org, iommu@lists.linux.dev,
 linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org,
 kvm@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 00/17] Provide a new two step DMA mapping API
Message-ID: <20241104113920.GD99170@unreal>
References: <3567312e-5942-4037-93dc-587f25f0778c@arm.com> <20241104095831.GA28751@lst.de>
In-Reply-To: <20241104095831.GA28751@lst.de>

On Mon, Nov 04, 2024 at 10:58:31AM +0100, Christoph Hellwig wrote:
> On Thu, Oct 31, 2024 at 09:17:45PM +0000, Robin Murphy wrote:

<...>

> >> 2. VFIO PCI live migration code is building a very large "page list"
> >> for the device. Instead of allocating a scatter list entry per allocated
> >> page it can just allocate an array of 'struct page *', saving a large
> >> amount of memory.
> >
> > VFIO already assumes a coherent device with (realistically) an IOMMU which
> > it explicitly manages - why is it even pretending to need a generic DMA
> > API?
>
> AFAIK that does isn't really vfio as we know it but the control device
> for live migration. But Leon or Jason might fill in more.

Yes, you are right, as it is written above: "VFIO PCI live migration ...".
That piece of code is directly connected to the underlying real HW
device and uses the DMA API to provide live migration functionality
to/from that device.

>
> The point is that quite a few devices have these page list based APIs
> (RDMA where mlx5 comes from, NVMe with PRPs, AHCI, GPUs).
>
> >
> >> 3. NVMe PCI demonstrates how a BIO can be converted to a HW scatter
> >> list without having to allocate then populate an intermediate SG table.
> >
> > As above, given that a bio_vec still deals in struct pages, that could
> > seemingly already be done by just mapping the pages, so how is it proving
> > any benefit of a fragile new interface?
>
> Because we only need to preallocate the tiny constant sized dma_iova_state
> as part of the request instead of an additional scatterlist that requires
> sizeof(struct page *) + sizeof(dma_addr_t) + 3 * sizeof(unsigned int)
> per segment, including a memory allocation per I/O for that.
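
To put the per-segment cost quoted above into numbers, here is a minimal
userspace sketch of that arithmetic. It is not kernel code: the type
dma_addr_sketch_t and both helper names are hypothetical, and the 64-bit
DMA address width is an assumption (in the kernel the width of dma_addr_t
depends on the configuration).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-bit DMA addresses. Hypothetical stand-in for dma_addr_t. */
typedef uint64_t dma_addr_sketch_t;

/*
 * Per-segment cost as quoted in the thread:
 *   sizeof(struct page *) + sizeof(dma_addr_t) + 3 * sizeof(unsigned int)
 * A void pointer stands in for 'struct page *' here.
 */
static size_t sg_bytes_per_segment(void)
{
	return sizeof(void *) + sizeof(dma_addr_sketch_t) +
	       3 * sizeof(unsigned int);
}

/* Intermediate scatterlist memory needed for one I/O with nr_segments. */
static size_t sg_bytes_per_io(size_t nr_segments)
{
	return nr_segments * sg_bytes_per_segment();
}
```

With 8-byte pointers and 4-byte unsigned int this comes to 28 bytes per
segment before any padding, and it grows linearly with the segment count,
whereas the dma_iova_state mentioned above is described as constant sized.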
>
> > My big concern here is that a thin and vaguely-defined wrapper around the
> > IOMMU API is itself a step which smells strongly of "abuse and design
> > mistake", given that the basic notion of allocating DMA addresses in
> > advance clearly cannot generalise. Thus it really demands some considered
> > justification beyond "We must do something; This is something; Therefore we
> > must do this." to be convincing.
>
> At least for the block code we have a nice little core wrapper that is
> very easy to use, and provides a great reduction of memory use and
> allocations. The HMM use case I'll let others talk about.

I'm not sure which wrappers Robin is talking about, but if we are
talking about the HMM wrappers, they gave us a perfect combination of
usability, performance and maintainability. All HMM users use the same
pattern and the same structures, and don't need to worry about internal
DMA/IOMMU details.

Thanks