From: Yongji Xie <xieyongji@bytedance.com>
To: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	 virtualization@lists.linux-foundation.org
Subject: Re: [External] Re: [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace
Date: Fri, 23 Oct 2020 10:55:44 +0800
Message-ID: <CACycT3s2GZ3yKP+Xn2V83_-=tXg342J4n91ZAb0c-+UD_+sFnA@mail.gmail.com>
In-Reply-To: <6cff5900-42ee-a0f5-0d5f-9383646c27d9@redhat.com>

On Tue, Oct 20, 2020 at 5:13 PM Jason Wang <jasowang@redhat.com> wrote:

>
> On 2020/10/20 4:35 PM, Yongji Xie wrote:
> >
> >
> > On Tue, Oct 20, 2020 at 4:01 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> >     On 2020/10/20 3:39 PM, Yongji Xie wrote:
> >     >
> >     >
> >     > On Tue, Oct 20, 2020 at 11:20 AM Jason Wang <jasowang@redhat.com> wrote:
> >     >
> >     >
> >     >     On 2020/10/19 10:56 PM, Xie Yongji wrote:
> >     >     > This series introduces a framework that can be used to
> >     >     > implement vDPA devices in a userspace program. The work
> >     >     > consists of two parts: control path emulation and data
> >     >     > path offloading.
> >     >     >
> >     >     > In the control path, the VDUSE driver uses a message
> >     >     > mechanism to forward actions (get/set features, get/set
> >     >     > status, get/set config space, and set virtqueue states)
> >     >     > from the virtio-vdpa driver to userspace. Userspace can
> >     >     > use read()/write() to receive and reply to those control
> >     >     > messages.
> >     >     >
> >     >     > In the data path, the VDUSE driver implements an MMU-based
> >     >     > on-chip IOMMU driver that supports both direct mapping and
> >     >     > indirect mapping with a bounce buffer. Userspace can then
> >     >     > access the IOVA space via mmap(). In addition, an eventfd
> >     >     > mechanism is used to trigger interrupts and forward
> >     >     > virtqueue kicks.
> >     >
> >     >
> >     >     This is pretty interesting!
> >     >
> >     >     For vhost-vdpa, it should work, but for virtio-vdpa, I think we
> >     >     should deal carefully with the IOMMU/DMA ops.
> >     >
> >     >
> >     >     I notice that neither dma_map nor set_map is implemented in
> >     >     vduse_vdpa_config_ops, which means you want to let vhost-vDPA
> >     >     deal with the IOMMU domains. Any reason for doing that?
> >     >
> >     > Actually, this series only focuses on the virtio-vdpa case for now.
> >     > To support vhost-vdpa, as you said, we need to implement
> >     > dma_map/dma_unmap. But there is a limitation: the VM's memory can't
> >     > be anonymous pages, which are forbidden in vm_insert_page(). Maybe
> >     > we need to add some limits on vhost-vdpa?
> >
> >
> >     I'm not sure I get this. Is there any reason you want to use
> >     vm_insert_page() on the VM's memory? Or do you mean you want to
> >     implement some kind of zero-copy?
> >
> >
> >
> > If my understanding is right, we will have a QEMU (VM) process and a
> > device emulation process in the vhost-vdpa case, right? When I/O
> > happens, the virtio driver in the VM will put the IOVA into the vring
> > and the device emulation process will get the IOVA from the vring. Then
> > the device emulation process will translate the IOVA to its own VA to
> > access the DMA buffer, which resides in the VM's memory. That means the
> > device emulation process needs to access the VM's memory, so we should
> > use vm_insert_page() to build the page table of the device emulation
> > process.
>
>
> OK, I get you now. So it looks to me that the real issue is not the
> limitation on anonymous pages; rather, see the comment above
> vm_insert_page():
>
> "
>
>   * The page has to be a nice clean _individual_ kernel allocation.
> "
>
> So I suspect that using vm_insert_page() to share pages between
> processes is not legal. We need input from MM experts.
>
>
Yes, vm_insert_page() can't be used in this case. So could we add the
shmfd to the vhost IOTLB message and pass it to the device emulation
process as a new iova_domain, just as vhost-user does?

Thanks,
Yongji
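
As a rough illustration of the fd-based sharing suggested above (this is
not code from the series): the sketch below assumes the guest memory is
backed by a shared memory file, created here with memfd_create(), so that
a second process could mmap() the same pages once it has received the fd.
In vhost-user the fd travels over a Unix socket as SCM_RIGHTS ancillary
data; to stay self-contained, the sketch simply maps the fd twice in one
process. The helper names shm_region_create(), shm_region_map() and
shm_region_roundtrip() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create a shared-memory file of `size` bytes.
 * Returns the fd, or -1 on failure. */
int shm_region_create(size_t size)
{
    int fd = memfd_create("guest-ram-sketch", MFD_CLOEXEC);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map the region MAP_SHARED. Every mapping of the same fd, in this or
 * any other process, refers to the same underlying pages. */
void *shm_region_map(int fd, size_t size)
{
    assert(size > 0);
    void *va = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return va == MAP_FAILED ? NULL : va;
}

/* Write through one mapping (standing in for the VM) and read through a
 * second one (standing in for the device emulation process).
 * Returns 1 if the data is visible through the second mapping. */
int shm_region_roundtrip(void)
{
    const size_t size = 4096;
    int fd = shm_region_create(size);
    if (fd < 0)
        return 0;

    char *vm_view = shm_region_map(fd, size);
    char *dev_view = shm_region_map(fd, size);
    int ok = 0;

    if (vm_view && dev_view) {
        strcpy(vm_view, "dma-buffer");
        ok = strcmp(dev_view, "dma-buffer") == 0;
    }

    if (vm_view)
        munmap(vm_view, size);
    if (dev_view)
        munmap(dev_view, size);
    close(fd);
    return ok;
}
```

Because the pages belong to a file rather than to anonymous memory, no
vm_insert_page() call is involved: each process builds its own page table
entries through an ordinary mmap() of the fd.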

>
> >
> >     I guess the software device implementation in user space only
> >     needs to receive IOVA ranges and map them in its own address
> >     space.
> >
> >
> > How do we map them in its own address space if we don't use
> > vm_insert_page()?
>
>
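
To make the "receive IOVA ranges and map them" step concrete, here is a
minimal sketch of the lookup a userspace device might do once each IOVA
range has been mapped somewhere in its own address space. It is only an
illustration under assumptions: struct iova_region and iova_to_va() are
hypothetical names, not part of VDUSE, vhost-vdpa or vhost-user, and a
real implementation would likely use a tree rather than a linear scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one mapped IOVA range. */
struct iova_region {
    uint64_t iova;  /* start of the range in I/O virtual address space */
    uint64_t size;  /* length of the range in bytes */
    void    *va;    /* where the range is mapped in this process */
};

/* Translate [iova, iova + len) to a local pointer.
 * Returns NULL unless the whole buffer lies inside a single region. */
void *iova_to_va(const struct iova_region *regions, size_t n,
                 uint64_t iova, uint64_t len)
{
    assert(regions != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct iova_region *r = &regions[i];

        if (iova < r->iova)
            continue;

        uint64_t off = iova - r->iova;

        /* Containment check written so that it cannot overflow. */
        if (off < r->size && len <= r->size - off)
            return (char *)r->va + off;
    }
    return NULL;
}
```

A descriptor whose buffer crosses a region boundary is rejected here; a
fuller version would split it into one chunk per region instead.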

Thread overview: 28+ messages
2020-10-19 14:56 Xie Yongji
2020-10-19 14:56 ` [RFC 1/4] mm: export zap_page_range() for driver use Xie Yongji
2020-10-19 15:14   ` Matthew Wilcox
2020-10-19 15:36     ` [External] " 谢永吉
2020-10-19 14:56 ` [RFC 2/4] vduse: Introduce VDUSE - vDPA Device in Userspace Xie Yongji
2020-10-19 15:08   ` Michael S. Tsirkin
2020-10-19 15:24     ` Randy Dunlap
2020-10-19 15:46       ` [External] " 谢永吉
2020-10-19 15:48     ` 谢永吉
2020-10-19 14:56 ` [RFC 3/4] vduse: grab the module's references until there is no vduse device Xie Yongji
2020-10-19 15:05   ` Michael S. Tsirkin
2020-10-19 15:44     ` [External] " 谢永吉
2020-10-19 15:47       ` Michael S. Tsirkin
2020-10-19 15:56         ` 谢永吉
2020-10-19 16:41           ` Michael S. Tsirkin
2020-10-20  7:42             ` Yongji Xie
2020-10-19 14:56 ` [RFC 4/4] vduse: Add memory shrinker to reclaim bounce pages Xie Yongji
2020-10-19 17:16 ` [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace Michael S. Tsirkin
2020-10-20  2:18   ` [External] " 谢永吉
2020-10-20  2:20     ` Jason Wang
2020-10-20  2:28       ` 谢永吉
2020-10-20  3:20 ` Jason Wang
2020-10-20  7:39   ` [External] " Yongji Xie
2020-10-20  8:01     ` Jason Wang
2020-10-20  8:35       ` Yongji Xie
2020-10-20  9:12         ` Jason Wang
2020-10-23  2:55           ` Yongji Xie [this message]
2020-10-23  8:44             ` Jason Wang
