linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Jason Gunthorpe via Lsf-pc <lsf-pc@lists.linux-foundation.org>,
	<lsf-pc@lists.linuxfoundation.org>, <linux-mm@kvack.org>,
	<iommu@lists.linux.dev>, <linux-rdma@vger.kernel.org>
Cc: <nvdimm@lists.linux.dev>, <linux-rdma@vger.kernel.org>,
	John Hubbard <jhubbard@nvidia.com>,
	Matthew Wilcox <willy@infradead.org>,
	Ming Lei <ming.lei@redhat.com>, <linux-block@vger.kernel.org>,
	<linux-mm@kvack.org>, <dri-devel@lists.freedesktop.org>,
	<netdev@vger.kernel.org>,
	Joao Martins <joao.m.martins@oracle.com>,
	Logan Gunthorpe <logang@deltatee.com>,
	Christoph Hellwig <hch@lst.de>
Subject: RE: [Lsf-pc] [LSF/MM/BPF proposal]: Physr discussion
Date: Mon, 23 Jan 2023 11:36:51 -0800	[thread overview]
Message-ID: <63cee1d3eaaef_3a36e529488@dwillia2-xfh.jf.intel.com.notmuch> (raw)
In-Reply-To: <Y8v+qVZ8OmodOCQ9@nvidia.com>

Jason Gunthorpe via Lsf-pc wrote:
> I would like to have a session at LSF to talk about Matthew's
> physr discussion starter:
> 
>  https://lore.kernel.org/linux-mm/YdyKWeU0HTv8m7wD@casper.infradead.org/
> 
> I have become interested in this with some immediacy because of
> IOMMUFD and this other discussion with Christoph:
> 
>  https://lore.kernel.org/kvm/4-v2-472615b3877e+28f7-vfio_dma_buf_jgg@nvidia.com/

I think this is a worthwhile discussion. My main hangup with 'struct
page' elimination in general is that if anything needs to be allocated
to describe a physical address for other parts of the kernel to operate
on it, why not a 'struct page'? There are of course several difficulties
allocating a 'struct page' array, but I look at subsection support and
the tail page space optimization work as evidence that some of the pain
can be mitigated, what more needs to be done? I also think this is
somewhat of a separate consideration than replacing a bio_vec with phyr
where that has value independent of the mechanism used to manage
phys_addr_t => dma_addr_t.

> Which results in, more or less, we have no way to do P2P DMA
> operations without struct page - and from the RDMA side solving this
> well at the DMA API means advancing at least some part of the physr
> idea.
> 
> So - my objective is to enable to DMA API to "DMA map" something that
> is not a scatterlist, may or may not contain struct pages, but can
> still contain P2P DMA data. From there I would move RDMA MR's to use
> this new API, modify DMABUF to export it, complete the above VFIO
> series, and finally, use all of this to add back P2P support to VFIO
> when working with IOMMUFD by allowing IOMMUFD to obtain a safe
> reference to the VFIO memory using DMABUF. From there we'd want to see
> pin_user_pages optimized, and that also will need some discussion how
> best to structure it.
> 
> I also have several ideas on how something like physr can optimize the
> iommu driver ops when working with dma-iommu.c and IOMMUFD.
> 
> I've been working on an implementation and hope to have something
> draft to show on the lists in a few weeks. It is pretty clear there
> are several interesting decisions to make that I think will benefit
> from a live discussion.
> 
> Providing a kernel-wide alternative to scatterlist is something that
> has general interest across all the driver subsystems. I've started to
> view the general problem rather like xarray where the main focus is to
> create the appropriate abstraction and then go about transforming
> users to take advatange of the cleaner abstraction. scatterlist
> suffers here because it has an incredibly leaky API, a huge number of
> (often sketchy driver) users, and has historically been very difficult
> to improve.

When I read "general interest across all the driver subsystems" it is
hard not to ask "have all possible avenues to enable 'struct page' been
exhausted?"

> The session would quickly go over the current state of whatever the
> mailing list discussion evolves into and an open discussion around the
> different ideas.

Sounds good to me.


  parent reply	other threads:[~2023-01-23 19:37 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-21 15:03 Jason Gunthorpe
2023-01-23  4:36 ` Matthew Wilcox
2023-01-23 13:44   ` Jason Gunthorpe
2023-01-23 19:47     ` Bart Van Assche
2023-01-24  6:15       ` Chaitanya Kulkarni
2023-01-26  9:39   ` Mike Rapoport
2023-01-23 19:36 ` Dan Williams [this message]
2023-01-23 20:11   ` [Lsf-pc] " Matthew Wilcox
2023-01-23 20:50     ` Dan Williams
2023-01-23 22:46       ` Matthew Wilcox
2023-01-26 19:38       ` Jason Gunthorpe
2023-01-26  1:45 ` Zhu Yanjun
2023-02-28 20:59 ` T.J. Mercier
2023-04-17 19:59   ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=63cee1d3eaaef_3a36e529488@dwillia2-xfh.jf.intel.com.notmuch \
    --to=dan.j.williams@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux.dev \
    --cc=jhubbard@nvidia.com \
    --cc=joao.m.martins@oracle.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=lsf-pc@lists.linux-foundation.org \
    --cc=lsf-pc@lists.linuxfoundation.org \
    --cc=ming.lei@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=nvdimm@lists.linux.dev \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox