Subject: Re: [External] Re: [RFC 0/4] Introduce VDUSE - vDPA Device in Userspace
To: Yongji Xie
Cc: "Michael S. Tsirkin", akpm@linux-foundation.org, linux-mm@kvack.org,
    virtualization@lists.linux-foundation.org
References: <20201019145623.671-1-xieyongji@bytedance.com>
From: Jason Wang
Message-ID: <6cff5900-42ee-a0f5-0d5f-9383646c27d9@redhat.com>
Date: Tue, 20 Oct 2020 17:12:46 +0800

On 2020/10/20 4:35 PM, Yongji Xie wrote:
> On Tue, Oct 20, 2020 at 4:01 PM Jason Wang wrote:
> > On 2020/10/20 3:39 PM, Yongji Xie wrote:
> > > On Tue, Oct 20, 2020 at 11:20 AM Jason Wang wrote:
> > > > On 2020/10/19 10:56 PM, Xie Yongji wrote:
> > > > > This series introduces a framework, which can be used to implement
> > > > > vDPA Devices in a userspace program. To implement it, the work
> > > > > consists of two parts: control path emulating and data path
> > > > > offloading.
> > > > >
> > > > > In the control path, the VDUSE driver will make use of a message
> > > > > mechanism to forward the actions (get/set features, get/set status,
> > > > > get/set config space and set virtqueue states) from virtio-vdpa
> > > > > driver to userspace. Userspace can use read()/write() to
> > > > > receive/reply to those control messages.
> > > > >
> > > > > In the data path, the VDUSE driver implements an MMU-based
> > > > > on-chip IOMMU driver which supports both direct mapping and
> > > > > indirect mapping with bounce buffer. Then userspace can access
> > > > > those iova space via mmap(). Besides, eventfd mechanism is used to
> > > > > trigger interrupts and forward virtqueue kicks.
> > > >
> > > > This is pretty interesting!
> > > >
> > > > For vhost-vdpa, it should work, but for virtio-vdpa, I think we
> > > > should carefully deal with the IOMMU/DMA ops stuff.
> > > >
> > > > I notice that neither dma_map nor set_map is implemented in
> > > > vduse_vdpa_config_ops, this means you want to let vhost-vDPA deal
> > > > with IOMMU domains stuff. Any reason for doing that?
> > >
> > > Actually, this series only focuses on the virtio-vdpa case now. To
> > > support vhost-vdpa, as you said, we need to implement
> > > dma_map/dma_unmap. But there is a limit that vm's memory can't be
> > > anonymous pages which are forbidden in vm_insert_page(). Maybe we
> > > need to add some limits on vhost-vdpa?
> >
> > I'm not sure I get this, any reason that you want to use
> > vm_insert_page() to VM's memory? Or do you mean you want to implement
> > some kind of zero-copy?
>
> If my understanding is right, we will have a QEMU (VM) process and a
> device emulation process in the vhost-vdpa case, right? When I/O
> happens, the virtio driver in VM will put the IOVA to vring and device
> emulation process will get the IOVA from vring. Then the device
> emulation process will translate the IOVA to its VA to access the dma
> buffer which resides in VM's memory.
> That means the device emulation process needs to access VM's memory,
> so we should use vm_insert_page() to build the page table of the
> device emulation process.

Ok, I get you now. So it looks to me that the real issue is not the
limitation to anonymous pages, but see the comments above vm_insert_page():

"
 * The page has to be a nice clean _individual_ kernel allocation.
"

So I suspect that using vm_insert_page() to share pages between
processes is legal. We need inputs from MM experts.

Thanks

> > I guess from the software device implementation in user space it
> > only needs to receive IOVA ranges and map them in its own address
> > space.
>
> How to map them in its own address space if we don't use
> vm_insert_page()?
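
For illustration, the translation step discussed in the thread (the device
emulation process reads an IOVA from the vring and turns it into a VA in its
own address space) can be sketched as a lookup over a table of IOVA ranges.
This is only a minimal sketch under assumptions: IovaRegion, IovaTable and
their methods are hypothetical names and are not part of VDUSE or any kernel
API. It assumes the ranges are already mapped into the process by some means,
so it says nothing about the open question of whether vm_insert_page() can be
used to establish those mappings.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class IovaRegion(NamedTuple):
    """One contiguous IOVA range and its local mapping (hypothetical type)."""
    iova: int  # start of the range in the I/O virtual address space
    size: int  # length of the range in bytes
    va: int    # start of the corresponding mapping in this process


class IovaTable:
    """Non-overlapping IOVA regions kept sorted by start address."""

    def __init__(self) -> None:
        self._regions: List[IovaRegion] = []

    def add(self, region: IovaRegion) -> None:
        """Insert a region, rejecting empty or overlapping ranges."""
        if region.size <= 0:
            raise ValueError("region size must be positive")
        starts = [r.iova for r in self._regions]
        idx = bisect_right(starts, region.iova)
        # Check the neighbour that starts at or before the new region.
        if idx > 0:
            prev = self._regions[idx - 1]
            if prev.iova + prev.size > region.iova:
                raise ValueError("overlaps previous region")
        # Check the neighbour that starts after the new region.
        if idx < len(self._regions):
            if region.iova + region.size > self._regions[idx].iova:
                raise ValueError("overlaps next region")
        self._regions.insert(idx, region)

    def translate(self, iova: int, length: int) -> Optional[int]:
        """Return the local VA for [iova, iova + length), or None if the
        buffer is not fully covered by a single known region."""
        if length <= 0:
            return None
        starts = [r.iova for r in self._regions]
        idx = bisect_right(starts, iova) - 1
        if idx < 0:
            return None
        region = self._regions[idx]
        if iova + length > region.iova + region.size:
            return None
        return region.va + (iova - region.iova)


if __name__ == "__main__":
    table = IovaTable()
    # Addresses below are made up for the example.
    table.add(IovaRegion(iova=0x1000, size=0x2000, va=0x7F0000000000))
    va = table.translate(0x1800, 0x100)
    print(hex(va) if va is not None else "unmapped")
```

A buffer that crosses the end of a region is reported as unmapped rather than
partially translated, so a descriptor pointing outside the ranges the process
was given is never dereferenced.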