Date: Thu, 24 Apr 2025 10:55:32 +0300
From: Leon Romanovsky
To: Jason Gunthorpe
Cc: Marek Szyprowski, Jens Axboe, Christoph Hellwig, Keith Busch,
    Jake Edge, Jonathan Corbet, Zhu Yanjun, Robin Murphy, Joerg Roedel,
    Will Deacon, Sagi Grimberg, Bjorn Helgaas, Logan Gunthorpe,
    Yishai Hadas, Shameer Kolothum, Kevin Tian, Alex Williamson,
    Jérôme Glisse, Andrew Morton, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
    linux-rdma@vger.kernel.org, iommu@lists.linux.dev,
    linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org,
    kvm@vger.kernel.org, linux-mm@kvack.org, Niklas Schnelle,
    Chuck Lever, Luis Chamberlain, Matthew Wilcox, Dan Williams,
    Kanchan Joshi, Chaitanya Kulkarni
Subject: Re: [PATCH v9 17/24] vfio/mlx5: Enable the DMA link API
Message-ID: <20250424075532.GO48485@unreal>
References: <20250423180941.GS1213339@ziepe.ca>
In-Reply-To: <20250423180941.GS1213339@ziepe.ca>

On Wed, Apr 23, 2025 at 03:09:41PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 23, 2025 at 11:13:08AM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky
> >
> > Remove intermediate scatter-gather table completely and
> > enable new DMA link API.
> >
> > Tested-by: Jens Axboe
> > Signed-off-by: Leon Romanovsky
> > ---
> >  drivers/vfio/pci/mlx5/cmd.c  | 298 ++++++++++++++++-------------------
> >  drivers/vfio/pci/mlx5/cmd.h  |  21 ++-
> >  drivers/vfio/pci/mlx5/main.c |  31 ----
> >  3 files changed, 147 insertions(+), 203 deletions(-)
>
> Reviewed-by: Jason Gunthorpe
>
> > +static int register_dma_pages(struct mlx5_core_dev *mdev, u32 npages,
> > +                              struct page **page_list, u32 *mkey_in,
> > +                              struct dma_iova_state *state,
> > +                              enum dma_data_direction dir)
> > +{
> > +        dma_addr_t addr;
> > +        size_t mapped = 0;
> > +        __be64 *mtt;
> > +        int i, err;
> >
> > -        return mlx5_core_create_mkey(mdev, mkey, mkey_in, inlen);
> > +        WARN_ON_ONCE(dir == DMA_NONE);
> > +
> > +        mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt);
> > +
> > +        if (dma_iova_try_alloc(mdev->device, state, 0, npages * PAGE_SIZE)) {
> > +                addr = state->addr;
> > +                for (i = 0; i < npages; i++) {
> > +                        err = dma_iova_link(mdev->device, state,
> > +                                            page_to_phys(page_list[i]), mapped,
> > +                                            PAGE_SIZE, dir, 0);
> > +                        if (err)
> > +                                goto error;
> > +                        *mtt++ = cpu_to_be64(addr);
> > +                        addr += PAGE_SIZE;
> > +                        mapped += PAGE_SIZE;
> > +                }
>
> This is an area I'd like to see improvement on as a follow up.
>
> Given we know we are allocating contiguous IOVA we should be able to
> request a certain alignment so we can know that it can be put into the
> mkey as single mtt. That would eliminate the double translation cost in
> the HW.
>
> The RDMA mkey builder is able to do this from the scatterlist but the
> logic to do that was too complex to copy into vfio. This is close to
> being simple enough, just the alignment is the only problem.

I saw this improvement as well, but the
"if (dma_iova_try_alloc) ... else ..." code needs to be generalized
first, as it will be used by all vfio HW drivers.

So the plan is:
1. Merge the code as is.
2. Convert a second vfio HW driver to the new API.
3. Propose something like dma_map_pages(..., struct page **page_list, ...)
   to map an array of pages.
4. Optimize mlx5 vfio MTT creation.

Thanks

>
> Jason
>
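
To make the alignment point above concrete, here is a small userspace
sketch of the arithmetic only. It is not mlx5 or DMA API code. It assumes
(an assumption of this sketch, not something stated in the thread) that a
contiguous IOVA range can be described by a single translation entry only
when the range lies inside one naturally aligned power-of-two block. The
helper names roundup_pow2(), single_entry_align() and fits_single_entry()
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Smallest power of two that is >= v; v must be non-zero and <= 2^63. */
static uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	assert(v != 0 && v <= (UINT64_C(1) << 63));
	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Alignment (in bytes) that an IOVA allocation of len bytes would need
 * so that the whole range sits inside one naturally aligned
 * power-of-two block. Assumption of this sketch, see the text above.
 */
static uint64_t single_entry_align(uint64_t len)
{
	return roundup_pow2(len);
}

/* True if [iova, iova + len) starts on the boundary of such a block. */
static bool fits_single_entry(uint64_t iova, uint64_t len)
{
	return (iova & (single_entry_align(len) - 1)) == 0;
}
```

For example, under this assumption a 2 MiB range starting at IOVA
0x200000 passes fits_single_entry(), while the same range starting at
0x201000 does not and would still need one entry per page, as in the
loop quoted above.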