From: Yongji Xie
Date: Thu, 31 Dec 2020 13:15:48 +0800
Subject: Re: Re: [RFC v2 09/13] vduse: Add support for processing vhost iotlb message
To: Jason Wang
Cc: "Michael S. Tsirkin", Stefan Hajnoczi, sgarzare@redhat.com, Parav Pandit, akpm@linux-foundation.org, Randy Dunlap, Matthew Wilcox, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org

On Thu, Dec 31, 2020 at 10:49 AM Jason Wang wrote:
>
>
> On 2020/12/30 6:12 PM, Yongji Xie wrote:
> > On Wed, Dec 30, 2020 at 4:41 PM Jason Wang wrote:
> >>
> >> On 2020/12/30 3:09 PM, Yongji Xie wrote:
> >>> On Wed, Dec 30, 2020 at 2:11 PM Jason Wang wrote:
> >>>> On 2020/12/29 6:26 PM, Yongji Xie wrote:
> >>>>> On Tue, Dec 29, 2020 at 5:11 PM Jason Wang wrote:
> >>>>>> ----- Original Message -----
> >>>>>>> On Mon, Dec 28, 2020 at 4:43 PM Jason Wang wrote:
> >>>>>>>> On 2020/12/28 4:14 PM, Yongji Xie wrote:
> >>>>>>>>>> I see. So all the above two questions are because VHOST_IOTLB_INVALIDATE
> >>>>>>>>>> is expected to be synchronous. This needs to be solved by tweaking the
> >>>>>>>>>> current VDUSE API or we can re-visit to go with descriptors relaying
> >>>>>>>>>> first.
> >>>>>>>>>>
> >>>>>>>>> Actually all vdpa related operations are synchronous in current
> >>>>>>>>> implementation.
> >>>>>>>>> The ops.set_map/dma_map/dma_unmap should not return
> >>>>>>>>> until the VDUSE_UPDATE_IOTLB/VDUSE_INVALIDATE_IOTLB message is replied
> >>>>>>>>> by userspace. Could it solve this problem?
> >>>>>>>> I was thinking whether or not we need to generate IOTLB_INVALIDATE
> >>>>>>>> message to VDUSE during dma_unmap (vduse_dev_unmap_page).
> >>>>>>>>
> >>>>>>>> If we don't, we're probably fine.
> >>>>>>>>
> >>>>>>> It seems not feasible. This message will be also used in the
> >>>>>>> virtio-vdpa case to notify userspace to unmap some pages during
> >>>>>>> consistent dma unmapping. Maybe we can document it to make sure the
> >>>>>>> users can handle the message correctly.
> >>>>>> Just to make sure I understand your point.
> >>>>>>
> >>>>>> Do you mean you plan to notify the unmap of 1) streaming DMA or 2)
> >>>>>> coherent DMA?
> >>>>>>
> >>>>>> For 1) you probably need a workqueue to do that since dma unmap can
> >>>>>> be done in irq or bh context. And if userspace doesn't do the unmap, it
> >>>>>> can still access the bounce buffer (if you don't zap pte)?
> >>>>>>
> >>>>> I plan to do it in the coherent DMA case.
> >>>> Any reason for treating coherent DMA differently?
> >>>>
> >>> Now the memory of the bounce buffer is allocated page by page in the
> >>> page fault handler. So it can't be used in coherent DMA mapping case
> >>> which needs some memory with contiguous virtual addresses. I can use
> >>> vmalloc() to do allocation for the bounce buffer instead. But it might
> >>> cause some memory waste. Any suggestion?
> >>
> >> I may miss something. But I don't see a relationship between the
> >> IOTLB_UNMAP and vmalloc().
> >>
> > In the vmalloc() case, the coherent DMA page will be taken from the
> > memory allocated by vmalloc(). So IOTLB_UNMAP is not needed anymore
> > during coherent DMA unmapping because the vmalloc'ed memory, which
> > has been mapped into userspace address space during initialization, can
> > be reused.
> > And userspace should not unmap the region until we destroy
> > the device.
>
>
> Just to make sure I understand. My understanding is that IOTLB_UNMAP is
> only needed when there's a change in the mapping from IOVA to page.
>

Yes, that's true.

> So if we stick to the mapping, e.g. during dma_unmap, we just put IOVA to
> free list to be used by the next IOVA allocating. IOTLB_UNMAP could be
> avoided.
>
> So we are not limited by how the pages are actually allocated?
>

In coherent DMA cases, we need to return some memory with contiguous
kernel virtual addresses. That is the reason why we need vmalloc()
here. If we allocate the memory page by page, the corresponding kernel
virtual addresses in a contiguous IOVA range might not be contiguous.
And in streaming DMA cases, there is no limit. So another choice is
using vmalloc'ed memory only for coherent DMA cases.

Not sure if this is clear to you.

Thanks,
Yongji
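
Illustration (not part of the original message, and not code from the
patch series): the free-list idea quoted above can be modelled in plain
userspace C. Every name here (iova_pool, pool_map, pool_unmap,
pool_page, IOVA_BASE, SLOT_SIZE) is hypothetical, and the backing array
only stands in for memory that would have been mapped into userspace
once at initialization. The sketch assumes fixed-size slots; the point
is only that unmap returns the IOVA to a free list and never changes
which page the IOVA refers to, so no invalidate message is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in VDUSE. */
#define POOL_SLOTS 4
#define SLOT_SIZE  4096u
#define IOVA_BASE  0x10000u
#define IOVA_NONE  UINT64_MAX

struct iova_pool {
    /* backing[i] stands in for the page permanently bound to slot i */
    unsigned char backing[POOL_SLOTS][SLOT_SIZE];
    int free_list[POOL_SLOTS];   /* stack of free slot indices */
    int nr_free;
    /* Would be bumped only if an IOVA were rebound to another page;
     * nothing in this model ever does that. */
    unsigned int nr_invalidate;
};

void pool_init(struct iova_pool *p)
{
    p->nr_free = 0;
    p->nr_invalidate = 0;
    /* Push slots in reverse so that slot 0 is handed out first. */
    for (int i = POOL_SLOTS - 1; i >= 0; i--)
        p->free_list[p->nr_free++] = i;
}

/* Take an IOVA from the free list; IOVA_NONE when the pool is empty. */
uint64_t pool_map(struct iova_pool *p)
{
    if (p->nr_free == 0)
        return IOVA_NONE;
    int slot = p->free_list[--p->nr_free];
    return IOVA_BASE + (uint64_t)slot * SLOT_SIZE;
}

/* Return the IOVA to the free list. The IOVA-to-page binding is left
 * alone, which is why no invalidate is counted here. */
void pool_unmap(struct iova_pool *p, uint64_t iova)
{
    uint64_t slot = (iova - IOVA_BASE) / SLOT_SIZE;
    assert(slot < POOL_SLOTS && p->nr_free < POOL_SLOTS);
    p->free_list[p->nr_free++] = (int)slot;
}

/* Look up the page behind an IOVA; the answer never changes. */
void *pool_page(struct iova_pool *p, uint64_t iova)
{
    uint64_t slot = (iova - IOVA_BASE) / SLOT_SIZE;
    assert(slot < POOL_SLOTS);
    return p->backing[slot];
}
```

In this model a map/unmap/map sequence hands back the same IOVA with
the same backing page, and nr_invalidate stays at zero. It says nothing
about the coherent case, where the contiguity of kernel virtual
addresses is the separate constraint discussed above.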