Date: Wed, 6 Mar 2024 13:44:56 -0400
From: Jason Gunthorpe
To: Christoph Hellwig
Cc: Leon Romanovsky, Robin Murphy, Marek Szyprowski, Joerg Roedel,
    Will Deacon, Chaitanya Kulkarni, Jonathan Corbet, Jens Axboe,
    Keith Busch, Sagi Grimberg, Yishai Hadas, Shameer Kolothum,
    Kevin Tian, Alex Williamson, Jérôme Glisse, Andrew Morton,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
    iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
    kvm@vger.kernel.org, linux-mm@kvack.org, Bart Van Assche,
    Damien Le Moal, Amir Goldstein, "josef@toxicpanda.com",
    "Martin K. Petersen", "daniel@iogearbox.net", Dan Williams,
    "jack@suse.com", Zhu Yanjun
Subject: Re: [RFC RESEND 00/16] Split IOMMU DMA mapping operation to two steps
Message-ID: <20240306174456.GO9225@ziepe.ca>
References: <47afacda-3023-4eb7-b227-5f725c3187c2@arm.com>
    <20240305122935.GB36868@unreal> <20240306144416.GB19711@lst.de>
    <20240306154328.GM9225@ziepe.ca> <20240306162022.GB28427@lst.de>
In-Reply-To: <20240306162022.GB28427@lst.de>

On Wed, Mar 06, 2024 at 05:20:22PM +0100, Christoph Hellwig wrote:
> On Wed, Mar 06, 2024 at 11:43:28AM -0400, Jason Gunthorpe wrote:
> > I don't think they are so fundamentally different, at least in our
> > past conversations I never came out with the idea we should burden the
> > driver with two different flows based on what kind of alignment the
> > transfer happens to have.
>
> Then we talked past each other..

Well, we never talked in such detail.

> > > So if we want to efficiently be able to handle these cases we need
> > > two APIs in the driver and a good framework to switch between them.
> >
> > But, what does the non-page-aligned version look like? Doesn't it
> > still look basically like this?
>
> I'd just rather have the non-aligned case for those who really need
> it be the loop over map single region that is needed for the direct
> mapping anyway.

There is a list of interesting cases this has to cover:

 1. Direct map. No dma_addr_t at unmap, multiple HW SGLs
 2. IOMMU aligned map, no P2P. Only IOVA range at unmap, single HW SGL
 3. IOMMU aligned map, P2P. Only IOVA range at unmap, multiple HW SGLs
 4. swiotlb single range. Only IOVA range at unmap, single HW SGL
 5. swiotlb multi-range. All dma_addr_t's at unmap, multiple HW SGLs
 6. Unaligned IOMMU. Only IOVA range at unmap, multiple HW SGLs

I think we agree that 1 and 2 should be optimized highly as they are
the common case. That mainly means no dma_addr_t storage in either.

5 is the slowest and has the most overhead.

4 is basically the same as 2 from the driver's viewpoint.

3 is quite similar to 1, but it has the IOVA range at unmap.

6 doesn't have to be optimal; from the driver perspective it can be
like 5.

That is three basic driver flows: 1/3, 2/4 and 5/6.

So are you thinking something more like a driver flow of:

  .. extent IO and get # aligned pages and know if there is P2P ..
  dma_init_io(state, num_pages, p2p_flag)
  if (dma_io_single_range(state)) {
       // #2, #4
       for each io()
            dma_link_aligned_pages(state, io range)
       hw_sgl = (state->iova, state->len)
  } else {
       // #1, #3, #5, #6
       hw_sgls = alloc_hw_sgls(num_ios)
       if (dma_io_needs_dma_addr_unmap(state))
            dma_addr_storage = alloc_num_ios(); // #5 only
       for each io() {
            hw_sgls[i] = dma_map_single(state, io range)
            if (dma_addr_storage)
                 dma_addr_storage[i] = hw_sgls[i]; // #5 only
       }
  }

?

This is not quite what you said: here we split the driver flow based on
needing one HW SGL vs. needing many HW SGLs.

> > So are they really so different to want different APIs? That strikes
> > me as a big driver cost.
>
> To not have to store a dma_address range per CPU range that doesn't
> actually get used at all.

Right, that is a nice optimization we should reach for.

Jason
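
For reference, the six cases and the three driver flows discussed in this
message can be written down as a small table. The following is only a
sketch in plain C: every enum, struct and function name in it is
hypothetical and made up for illustration, none of it is an existing or
proposed kernel API, and the grouping simply follows the "1/3, 2/4 and
5/6" split stated in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical; none is an existing kernel API. */

/* The six mapping cases, numbered as in the list in the message. */
enum map_case {
	MAP_DIRECT = 1,        /* 1. Direct map */
	MAP_IOMMU_ALIGNED,     /* 2. IOMMU aligned map, no P2P */
	MAP_IOMMU_ALIGNED_P2P, /* 3. IOMMU aligned map, P2P */
	MAP_SWIOTLB_SINGLE,    /* 4. swiotlb single range */
	MAP_SWIOTLB_MULTI,     /* 5. swiotlb multi-range */
	MAP_IOMMU_UNALIGNED,   /* 6. Unaligned IOMMU */
};

/* What the driver has to keep around in order to unmap. */
enum unmap_state {
	UNMAP_NOTHING,         /* no dma_addr_t needed at unmap */
	UNMAP_IOVA_RANGE,      /* only the IOVA range */
	UNMAP_ALL_DMA_ADDRS,   /* every dma_addr_t */
};

/* The three basic driver flows: 1/3, 2/4 and 5/6. */
enum driver_flow {
	FLOW_MULTI_SGL,        /* cases 1 and 3 */
	FLOW_SINGLE_SGL,       /* cases 2 and 4 */
	FLOW_SLOW_PATH,        /* cases 5 and 6 */
};

struct case_props {
	enum unmap_state unmap;
	bool single_hw_sgl;    /* one HW SGL covers the whole IO */
};

/* Look up the properties of a case; the table mirrors the list. */
static struct case_props props_of(enum map_case c)
{
	/* Indexed by case number; slot 0 is unused. */
	static const struct case_props table[] = {
		[MAP_DIRECT]            = { UNMAP_NOTHING,       false },
		[MAP_IOMMU_ALIGNED]     = { UNMAP_IOVA_RANGE,    true  },
		[MAP_IOMMU_ALIGNED_P2P] = { UNMAP_IOVA_RANGE,    false },
		[MAP_SWIOTLB_SINGLE]    = { UNMAP_IOVA_RANGE,    true  },
		[MAP_SWIOTLB_MULTI]     = { UNMAP_ALL_DMA_ADDRS, false },
		[MAP_IOMMU_UNALIGNED]   = { UNMAP_IOVA_RANGE,    false },
	};

	assert(c >= MAP_DIRECT && c <= MAP_IOMMU_UNALIGNED);
	return table[c];
}

/* Map a case onto one of the three driver flows. */
static enum driver_flow flow_of(enum map_case c)
{
	if (props_of(c).single_hw_sgl)
		return FLOW_SINGLE_SGL;  /* 2, 4 */
	if (c == MAP_SWIOTLB_MULTI || c == MAP_IOMMU_UNALIGNED)
		return FLOW_SLOW_PATH;   /* 5, and 6 handled like 5 */
	return FLOW_MULTI_SGL;           /* 1, 3 */
}
```

Note that only case 5 reports UNMAP_ALL_DMA_ADDRS, matching the "#5 only"
comments in the pseudocode, while case 6 shares the slow flow without
needing per-entry dma_addr_t storage.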