Date: Thu, 26 Jan 2023 15:38:16 -0400
From: Jason Gunthorpe
To: Dan Williams
Cc: Matthew Wilcox, nvdimm@lists.linux.dev, lsf-pc@lists.linuxfoundation.org,
    linux-rdma@vger.kernel.org, John Hubbard, dri-devel@lists.freedesktop.org,
    Ming Lei, linux-block@vger.kernel.org, linux-mm@kvack.org,
    iommu@lists.linux.dev, netdev@vger.kernel.org, Joao Martins,
    Jason Gunthorpe via Lsf-pc, Logan Gunthorpe, Christoph Hellwig
Subject: Re: [Lsf-pc] [LSF/MM/BPF proposal]: Physr discussion
References: <63cee1d3eaaef_3a36e529488@dwillia2-xfh.jf.intel.com.notmuch>
    <63cef32cbafc3_3a36e529465@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <63cef32cbafc3_3a36e529465@dwillia2-xfh.jf.intel.com.notmuch>

On Mon, Jan 23, 2023 at 12:50:52PM -0800, Dan Williams wrote:
> Matthew Wilcox wrote:
> > On Mon, Jan 23, 2023 at 11:36:51AM -0800, Dan Williams wrote:
> > > Jason Gunthorpe via Lsf-pc wrote:
> > > > I would like to have a session at LSF to talk about Matthew's
> > > > physr discussion starter:
> > > >
> > > > https://lore.kernel.org/linux-mm/YdyKWeU0HTv8m7wD@casper.infradead.org/
> > > >
> > > > I have become interested in this with some immediacy because of
> > > > IOMMUFD and this other discussion with Christoph:
> > > >
> > > > https://lore.kernel.org/kvm/4-v2-472615b3877e+28f7-vfio_dma_buf_jgg@nvidia.com/
> > >
> > > I think this is a worthwhile discussion. My main hangup with 'struct
> > > page' elimination in general is that if anything needs to be allocated
> >
> > You're the first one to bring up struct page elimination. Neither Jason
> > nor I have that as our motivation.
>
> Oh, ok, then maybe I misread the concern in the vfio discussion. I
> thought the summary there is debating the ongoing requirement for
> 'struct page' for P2PDMA?

The VFIO problem is that we need a unique pgmap at 4k granules (or maybe
smaller, technically), tightly packed, because VFIO exposes PCI BAR space
that can be sized in such small amounts.

So, using struct page means some kind of adventure in the memory hotplug
code to allow tightly packed 4k pgmaps.

And that is assuming that every architecture that wants to support VFIO
supports pgmap and memory hotplug. I was just told that s390 doesn't,
which is kind of important.

If there is a straightforward way to get a pgmap into VFIO then I'd do
that and give up this quest :)

I've never been looking at this from the angle of eliminating struct
page, but from the perspective of allowing the DMA API to correctly do
scatter/gather IO to non-struct page P2P memory, because I *can't* get a
struct page for it. I.e. make dma_map_resource() better. Make P2P DMABUF
work properly.

This has to come along with a different way to store address ranges,
because the basic datum that needs to cross all the functional
boundaries we have is an address range list.
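
To make "address range list" a bit more concrete, here is a minimal
userspace sketch of what such a datum could look like: a flat array of
(address, length) pairs with no struct page behind it. Every name in it
(phys_range, phys_range_list, phys_range_list_add) is hypothetical and
made up for illustration. It is not the physr proposal's layout and not
an existing kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is an existing kernel API. */
struct phys_range {
	uint64_t addr;	/* start physical (or bus) address */
	uint64_t len;	/* length in bytes */
};

struct phys_range_list {
	struct phys_range *ranges;	/* caller-provided storage */
	size_t nr;			/* entries in use */
	size_t cap;			/* entries available */
};

/*
 * Append [addr, addr + len) to the list. If the new range starts exactly
 * where the last entry ends, the last entry is extended instead, so a
 * physically contiguous run stays a single entry.
 *
 * Returns 0 on success, -1 if len is zero, the range wraps around the
 * address space, or the storage is full.
 */
static int phys_range_list_add(struct phys_range_list *l,
			       uint64_t addr, uint64_t len)
{
	if (len == 0 || addr + len < addr)
		return -1;

	if (l->nr) {
		struct phys_range *last = &l->ranges[l->nr - 1];

		if (last->addr + last->len == addr) {
			last->len += len;
			return 0;
		}
	}

	if (l->nr == l->cap)
		return -1;

	l->ranges[l->nr].addr = addr;
	l->ranges[l->nr].len = len;
	l->nr++;
	return 0;
}

/* Total number of bytes described by the list. */
static uint64_t phys_range_list_bytes(const struct phys_range_list *l)
{
	uint64_t total = 0;

	for (size_t i = 0; i < l->nr; i++)
		total += l->ranges[i].len;
	return total;
}
```

Adding 0x1000 bytes at 0x1000 and then 0x1000 bytes at 0x2000 leaves one
entry of length 0x2000, while a later add at 0x8000 starts a second
entry. Nothing in the list depends on the memory having a struct page,
which is the property that matters for BAR-backed P2P memory.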

My general current sketch is that we'd allocate some 'DMA P2P provider'
structure analogous to the MEMORY_DEVICE_PCI_P2PDMA pgmap, and a single
provider would cover the entire MMIO aperture, e.g. the providing
device's MMIO BAR. This is enough information for the DMA API to do its
job.

We get this back either by searching an interval-tree-like thing on the
physical address or by storing it directly in the address range list.

Jason
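
The two ways of getting the provider back can be sketched in userspace C
as follows. This is only an illustration under stated assumptions: the
names (p2p_provider, p2p_provider_find, p2p_range, p2p_range_resolve) are
hypothetical, and a binary search over a sorted, non-overlapping array
stands in for the interval tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one provider per MMIO aperture, e.g. one device BAR. */
struct p2p_provider {
	uint64_t start;	/* first physical address of the aperture */
	uint64_t len;	/* aperture size in bytes */
	int owner_id;	/* made-up handle for the providing device */
};

/*
 * Option 1: search by physical address. The table must be sorted by
 * start and must not contain overlapping apertures. Returns the provider
 * covering phys, or NULL if none does.
 */
static const struct p2p_provider *
p2p_provider_find(const struct p2p_provider *tbl, size_t nr, uint64_t phys)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct p2p_provider *p = &tbl[mid];

		if (phys < p->start)
			hi = mid;
		else if (phys - p->start >= p->len)
			lo = mid + 1;
		else
			return p;
	}
	return NULL;
}

/* Option 2: each address range entry carries its provider directly. */
struct p2p_range {
	uint64_t addr;
	uint64_t len;
	const struct p2p_provider *provider;	/* NULL until resolved */
};

/*
 * Fill in r->provider by searching once. Fails with -1 when no provider
 * covers r->addr or when the range runs past the end of the aperture.
 */
static int p2p_range_resolve(struct p2p_range *r,
			     const struct p2p_provider *tbl, size_t nr)
{
	const struct p2p_provider *p = p2p_provider_find(tbl, nr, r->addr);

	if (!p || r->len > p->len - (r->addr - p->start))
		return -1;
	r->provider = p;
	return 0;
}
```

With a table holding two adjacent 4k apertures at 0x10000000 and
0x10001000, a lookup of 0x10000fff returns the first provider and a
lookup of 0x10001000 returns the second, which is the tightly packed
case from the VFIO discussion. Storing the pointer in the range entry
trades a larger entry for not having to search again at map time.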