Date: Tue, 9 Jul 2024 15:53:15 -0300
From: Jason Gunthorpe
To: Christoph Hellwig
Cc: Leon Romanovsky, Jens Axboe, Robin Murphy, Joerg Roedel, Will Deacon,
 Keith Busch, "Zeng, Oak", Chaitanya Kulkarni, Sagi Grimberg,
 Bjorn Helgaas, Logan Gunthorpe, Yishai Hadas, Shameer Kolothum,
 Kevin Tian, Alex Williamson, Marek Szyprowski, Jérôme Glisse,
 Andrew Morton, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-rdma@vger.kernel.org, iommu@lists.linux.dev,
 linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org,
 kvm@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH v1 00/18] Provide a new two step DMA API mapping API
Message-ID: <20240709185315.GM14050@ziepe.ca>
References: <20240703054238.GA25366@lst.de> <20240703105253.GA95824@unreal>
 <20240703143530.GA30857@lst.de> <20240703155114.GB95824@unreal>
 <20240704074855.GA26913@lst.de> <20240708165238.GE14050@ziepe.ca>
 <20240709061721.GA16180@lst.de>
In-Reply-To: <20240709061721.GA16180@lst.de>

On Tue, Jul 09, 2024 at 08:17:21AM +0200, Christoph Hellwig wrote:
> On Mon, Jul 08, 2024 at 01:52:38PM -0300, Jason Gunthorpe wrote:
> > Ideally we'd have some template code that consolidates these loops to
> > common code with driver provided hooks - there are a few ways to get
> > that efficiently in C.
> >
> > I think it will be clearer when we get to RDMA and there we have the
> > same SGL/PRP kind of split up and we can see what is sharable.
>
> I really would not want to build common code for PRPs - this is a concept
> very specific to RDMA and NVMe.

I think DRM has it too. If you are populating a GPU page table then it
is basically a convoluted PRP. It probably requires different splitting
logic than what RDMA does, but I've never looked.

> OTOH more common code SGLs would be nice.  If you look at e.g. SCSI
> drivers most of them have a simple loop of mapping the SG table and
> then copying the fields into the hardware SGL.  This would be a very
> common case for a helper.

Yes, I believe this is very common.

> That whole thing of course opens the question if we want a pure
> in-memory version of the dma_addr_t/len tuple.  IMHO that is the best
> way to migrate and allows to share code easily.  We can look into ways
> to avoid that more for drivers that care, but most drivers are
> probably best served with it to keep the code simple and make the
> conversion easier.

My feeling has been that this RFC is the low level interface and we
can bring our own data structure on top.

It would probably make sense to build a scatterlist v2 on top of this
that has an in-memory dma_addr_t/len list close to what we have today.
Yes, it costs a memory allocation, or a larger initial allocation, but
many places may not really care. Block drivers have always allocated an
SGL, for instance.

Then the verbosity of this API is less important as we may only use it
in a few places.

My main takeaway was that we should make the dma_ops interface
simpler and more general so we can have this choice instead of welding
a single data structure through everything.

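To make the in-memory dma_addr_t/len list and the "simple loop plus driver hook" helper a bit more concrete, here is a rough sketch of how the two could fit together. This is plain userspace C for illustration only, not code from this RFC: dma_addr_t is a stand-in typedef, and every name (struct dma_range_list, dma_range_list_add(), dma_range_list_fill_hw(), the toy_* device) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel type so the sketch builds in userspace. */
typedef uint64_t dma_addr_t;

/* One contiguous DMA range: the dma_addr_t/len tuple. */
struct dma_range {
	dma_addr_t addr;
	size_t len;
};

/* Hypothetical fixed-capacity in-memory list of DMA ranges. */
#define DMA_RANGE_LIST_MAX 16

struct dma_range_list {
	struct dma_range ranges[DMA_RANGE_LIST_MAX];
	size_t nr;
};

/*
 * Append a range, merging it into the previous entry when the two are
 * contiguous. Returns 0 on success, -1 when the list is full.
 */
static int dma_range_list_add(struct dma_range_list *list,
			      dma_addr_t addr, size_t len)
{
	assert(list != NULL);

	if (len == 0)
		return 0;

	if (list->nr > 0) {
		struct dma_range *last = &list->ranges[list->nr - 1];

		if (last->addr + last->len == addr) {
			last->len += len;
			return 0;
		}
	}

	if (list->nr == DMA_RANGE_LIST_MAX)
		return -1;

	list->ranges[list->nr].addr = addr;
	list->ranges[list->nr].len = len;
	list->nr++;
	return 0;
}

/* Driver-provided hook: fill hardware entry 'idx' from one range. */
typedef int (*dma_range_fill_fn)(void *hw, size_t idx,
				 dma_addr_t addr, size_t len);

/*
 * Common loop: walk the list and let the driver copy each range into its
 * own hardware SGL format. Returns the number of entries written, or -1
 * if the hook rejects a range.
 */
static int dma_range_list_fill_hw(const struct dma_range_list *list,
				  dma_range_fill_fn fill, void *hw)
{
	assert(list != NULL && fill != NULL);

	for (size_t i = 0; i < list->nr; i++) {
		if (fill(hw, i, list->ranges[i].addr, list->ranges[i].len))
			return -1;
	}
	return (int)list->nr;
}

/* Hardware SGL entry layout of an imaginary device. */
struct toy_hw_sge {
	uint64_t addr;
	uint32_t len;
};

/* Hook for the imaginary device: its length field is only 32 bits. */
static int toy_fill_sge(void *hw, size_t idx, dma_addr_t addr, size_t len)
{
	struct toy_hw_sge *sgl = hw;

	if (len > UINT32_MAX)
		return -1;

	sgl[idx].addr = addr;
	sgl[idx].len = (uint32_t)len;
	return 0;
}
```

A driver would build the list while mapping and then call dma_range_list_fill_hw() with its own hook. The fixed-size array is only to keep the sketch short; a real version would presumably size the list from the request, which is where the extra memory allocation mentioned above comes from.
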
> > I'm also cooking something that should let us build a way to iommu map
> > a bio_vec very efficiently, which should transform this into a single
> > indirect call into the iommu driver per bio_vec, and a single radix
> > walk/etc.
>
> I assume you mean an array of bio_vecs here.  That would indeed be nice.
> We'd still potentially need a few calls for block drivers as
> requests can have multiple bios and thus bio_vec arrays, but it would
> still be a nice reduction of calls.

Yes. iommufd has performance needs here. I'm not sure what it will turn
into, but bio_vec[] direct to optimized radix manipulation is
something I'd be keen to see.

Jason