Date: Fri, 20 Oct 2023 06:58:49 +0200
From: Christoph Hellwig
To: Matthew Wilcox
Cc: Chuck Lever, Marek Szyprowski, Chuck Lever, Robin Murphy, Alexander Potapenko, linux-mm@kvack.org, linux-rdma@vger.kernel.org, Jens Axboe, kasan-dev@googlegroups.com, David Howells, iommu@lists.linux.dev, Christoph Hellwig
Subject: Re: [PATCH RFC 0/9] Exploring biovec support in (R)DMA API
Message-ID: <20231020045849.GA12269@lst.de>
References: <169772852492.5232.17148564580779995849.stgit@klimt.1015granger.net>

On Thu, Oct 19, 2023 at 04:53:43PM +0100, Matthew Wilcox wrote:
> > RDMA core API could support struct biovec array arguments. The
> > series compiles on x86, but I haven't tested it further. I'm posting
> > early in hopes of starting further discussion.
>
> Good call, because I think patch 2/9 is a complete non-starter.
>
> The fundamental problem with scatterlist is that it is both input
> and output for the mapping operation. You're replicating this mistake
> in a different data structure.

Agreed.

> My vision for the future is that we have phyr as our input structure.
> That looks something like:
>
> struct phyr {
> 	phys_addr_t start;
> 	size_t len;
> };

So my plan was always to turn the bio_vec into that structure, since
before you came up with the phyr name.
But that's really a separate discussion, as we might as well support
multiple input formats if we really have to.

> Our output structure can continue being called the scatterlist, but
> it needs to go on a diet and look more like:
>
> struct scatterlist {
> 	dma_addr_t dma_address;
> 	size_t dma_length;
> };

I called it a dma_vec in my years-old proposal that I can't find any more.

> Getting to this point is going to be a huge amount of work, and I need
> to finish folios first. Or somebody else can work on it ;-)

Well, we can stage this. I wish I could find my old proposal about the
dma_batch API (I remember Robin commented on it, maybe he is better at
finding it than me). I think that mostly still stands, independent of
the transformation of the input structure. The basic idea is that we
add a dma batching API, where you start a batch with one call, then
add new physically discontiguous vectors to it until it is full, and
then finalize it. Very similar to how the iommu API works internally.
We'd then only use this API if we actually have an iommu (or, if we
want to be fancy, swiotlb, which could do the same linearization); for
the direct map we'd still do the equivalent of dma_map_page for each
element, as we need one output vector per input vector anyway.

As Jason pointed out, the only fancy implementation we need for now is
the IOMMU API. arm32 and powerpc will need to do the work to convert
to it or do their own work.
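
To make the start/add/finalize flow above concrete, here is a rough userspace sketch. It is not the old proposal and not an existing kernel interface: struct dma_batch, dma_batch_start(), dma_batch_add(), dma_batch_finalize() and dma_map_direct() are all hypothetical names, the typedefs stand in for the kernel's types, and the sketch assumes a pre-reserved IOVA window, ignores IOMMU page granularity, and treats the direct map as having a zero offset between physical and DMA addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types (assumption of this sketch). */
typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

/* Input: one physically contiguous range, as quoted in the thread. */
struct phyr {
	phys_addr_t start;
	size_t len;
};

/* Output: one DMA-contiguous range (the "dma_vec" mentioned above). */
struct dma_vec {
	dma_addr_t dma_address;
	size_t dma_length;
};

#define DMA_BATCH_MAX_RANGES 8	/* arbitrary limit for this sketch */

/* Hypothetical batch state; models a reserved, contiguous IOVA window. */
struct dma_batch {
	dma_addr_t iova_base;	/* start of the IOVA window */
	size_t capacity;	/* size of the IOVA window in bytes */
	size_t used;		/* bytes placed in the window so far */
	size_t nr_ranges;	/* number of recorded input ranges */
	struct phyr ranges[DMA_BATCH_MAX_RANGES];
};

/* Start a batch over a given IOVA window. */
static void dma_batch_start(struct dma_batch *b, dma_addr_t iova_base,
			    size_t capacity)
{
	assert(b != NULL);
	b->iova_base = iova_base;
	b->capacity = capacity;
	b->used = 0;
	b->nr_ranges = 0;
}

/*
 * Append one physical range at the current end of the window.
 * Returns 0 on success, -1 if the batch is full.
 */
static int dma_batch_add(struct dma_batch *b, struct phyr r)
{
	if (b->nr_ranges == DMA_BATCH_MAX_RANGES ||
	    r.len > b->capacity - b->used)
		return -1;
	b->ranges[b->nr_ranges++] = r;
	b->used += r.len;
	return 0;
}

/* Finalize: the whole batch is described by a single output vector. */
static struct dma_vec dma_batch_finalize(const struct dma_batch *b)
{
	struct dma_vec out = {
		.dma_address = b->iova_base,
		.dma_length = b->used,
	};
	return out;
}

/* Direct map case: one output vector per input vector, no linearization. */
static size_t dma_map_direct(const struct phyr *in, size_t n,
			     struct dma_vec *out)
{
	for (size_t i = 0; i < n; i++) {
		out[i].dma_address = (dma_addr_t)in[i].start;
		out[i].dma_length = in[i].len;
	}
	return n;
}
```

In this model a caller with an IOMMU would call dma_batch_start() once, dma_batch_add() for each discontiguous range until it returns -1, and then dma_batch_finalize() to get one linearized dma_vec, while the direct map path hands back one dma_vec per phyr.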