From: "Song Bao Hua (Barry Song)"
To: "Tian, Kevin", Jason Gunthorpe
CC: "chensihang (A)", Arnd Bergmann, Greg Kroah-Hartman,
 "linux-kernel@vger.kernel.org", "iommu@lists.linux-foundation.org",
 "linux-mm@kvack.org", Zhangfei Gao, "Liguozhu (Kenneth)",
 "linux-accelerators@lists.ozlabs.org"
Subject: RE: [RFC PATCH v2] uacce: Add uacce_ctrl misc device
Date: Fri, 29 Jan 2021 10:33:28 +0000
Message-ID: <234b8c25afc440ce8245aca9081652fb@hisilicon.com>
References: <1611563696-235269-1-git-send-email-wangzhou1@hisilicon.com>
 <20210125154717.GW4605@ziepe.ca>
 <96b655ade2534a65974a378bb68383ee@hisilicon.com>
 <20210125231619.GY4605@ziepe.ca>
 <5f64a68042c64f37b5cba74028bd2189@hisilicon.com>
 <20210126011304.GZ4605@ziepe.ca>

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: Friday, January 29, 2021 11:09 PM
> To: Song Bao Hua (Barry Song); Jason Gunthorpe
> Cc: chensihang (A); Arnd Bergmann; Greg Kroah-Hartman;
> linux-kernel@vger.kernel.org; iommu@lists.linux-foundation.org;
> linux-mm@kvack.org; Zhangfei Gao; Liguozhu (Kenneth);
> linux-accelerators@lists.ozlabs.org
> Subject: RE: [RFC PATCH v2] uacce: Add uacce_ctrl misc device
>
> > From: Song Bao Hua (Barry Song)
> > Sent: Tuesday, January 26, 2021 9:27 AM
> >
> > > -----Original Message-----
> > > From: Jason Gunthorpe [mailto:jgg@ziepe.ca]
> > > Sent: Tuesday, January 26, 2021 2:13 PM
> > > To: Song Bao Hua (Barry Song)
> > > Cc: Wangzhou (B); Greg Kroah-Hartman; Arnd Bergmann; Zhangfei Gao;
> > > linux-accelerators@lists.ozlabs.org; linux-kernel@vger.kernel.org;
> > > iommu@lists.linux-foundation.org; linux-mm@kvack.org;
> > > Liguozhu (Kenneth); chensihang (A)
> > > Subject: Re: [RFC PATCH v2] uacce: Add uacce_ctrl misc device
> > >
> > > On Mon, Jan 25, 2021 at 11:35:22PM +0000, Song Bao Hua (Barry Song) wrote:
> > > >
> > > > > On Mon, Jan 25, 2021 at 10:21:14PM +0000, Song Bao Hua (Barry Song) wrote:
> > > > > > mlock, while certainly be able to prevent swapping out, it won't
> > > > > > be able to stop page moving due to:
> > > > > > * memory compaction in alloc_pages()
> > > > > > * making huge pages
> > > > > > * numa balance
> > > > > > * memory compaction in CMA
> > > > >
> > > > > Enabling those things is a major reason to have SVA device in the
> > > > > first place, providing a SW API to turn it all off seems like the
> > > > > wrong direction.
> > > >
> > > > I wouldn't say this is a major reason to have SVA. If we read the
> > > > history of SVA and papers, people would think easy programming due
> > > > to data struct sharing between cpu and device, and process space
> > > > isolation in device would be the major reasons for SVA. SVA also
> > > > declares it supports zero-copy while zero-copy doesn't necessarily
> > > > depend on SVA.
> > >
> > > Once you have to explicitly make system calls to declare memory under
> > > IO, you loose all of that.
> > >
> > > Since you've asked the app to be explicit about the DMAs it intends to
> > > do, there is not really much reason to use SVA for those DMAs anymore.
> >
> > Let's see a non-SVA case. We are not using SVA, we can have
> > a memory pool by hugetlb or pin, and app can allocate memory
> > from this pool, and get stable I/O performance on the memory
> > from the pool. But device has its separate page table which
> > is not bound with this process, thus lacking the protection
> > of process space isolation. Plus, CPU and device are using
> > different address.
> >
> > And then we move to SVA case, we can still have a memory pool
> > by hugetlb or pin, and app can allocate memory from this pool
> > since this pool is mapped to the address space of the process,
> > and we are able to get stable I/O performance since it is always
> > there. But in this case, device is using the page table of
> > process with the full permission control.
> > And they are using same address and can possibly enjoy the easy
> > programming if HW supports.
> >
> > SVA is not doom to work with IO page fault only. If we have SVA+pin,
> > we would get both sharing address and stable I/O latency.
> >
>
> Isn't it like a traditional MAP_DMA API (imply pinning) plus specifying
> cpu_va of the memory pool as the iova?

I think it enjoys the advantage of stable I/O latency of
traditional MAP_DMA, and also uses the process page table
which SVA can provide.

The major difference is that in the SVA case, the iova belongs entirely
to the process and is as normal as any other heap/stack/data address:

    p = mmap(.....MAP_ANON....);
    ioctl(/dev/acc, p, PIN);

SVA by itself provides the ability to guarantee address space isolation
between multiple processes. If the device can directly access data
structures such as lists and trees, users can further enjoy the
programming convenience that SVA gives.

So we are looking for a combination of the stable I/O latency of the
traditional DMA map and the abilities of SVA.

>
> Thanks
> Kevin

Thanks
Barry
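
For illustration only, the two-step mmap + PIN flow above could be
sketched in userspace C roughly as follows. The device path, the
struct acc_pin_range layout, the ACC_IOC_PIN request number and the rule
that a pin lives as long as the fd stays open are all assumptions made
up for this sketch; none of them is the ABI of the RFC patch or of any
merged kernel interface.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/* HYPOTHETICAL pin request layout and ioctl number; not a real kernel ABI. */
struct acc_pin_range {
    uint64_t addr;  /* start of the range, a normal process virtual address */
    uint64_t size;  /* length of the range in bytes */
};
#define ACC_IOC_PIN _IOW('W', 0x10, struct acc_pin_range)

/* Step 1: plain anonymous memory, like any other heap/stack/data. */
void *acc_alloc(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Step 2: ask the (hypothetical) driver to pin the range.
 * Returns 0 on success or a negative errno value. The pin is assumed to
 * last while fd stays open, so the caller keeps fd around.
 */
int acc_pin(int fd, void *p, size_t len)
{
    struct acc_pin_range r = {
        .addr = (uint64_t)(uintptr_t)p,
        .size = (uint64_t)len,
    };
    return ioctl(fd, ACC_IOC_PIN, &r) < 0 ? -errno : 0;
}

/*
 * Both steps together. On success *fd_out holds the open device and the
 * returned pointer is meant to be used by CPU and device at the same
 * address. If the device cannot be opened or the pin fails, the memory
 * is still returned, unpinned, and *fd_out is set to -1.
 */
void *acc_alloc_pinned(const char *dev_path, size_t len, int *fd_out)
{
    void *p = acc_alloc(len);
    *fd_out = -1;
    if (p == NULL)
        return NULL;

    int fd = open(dev_path, O_RDWR);
    if (fd < 0)
        return p;       /* no accelerator: ordinary pageable memory */
    if (acc_pin(fd, p, len) < 0) {
        close(fd);
        return p;       /* pin refused: still ordinary pageable memory */
    }
    *fd_out = fd;
    return p;
}
```

A caller would use acc_alloc_pinned("/dev/acc", len, &fd) and treat
fd == -1 as "not pinned, expect I/O page faults". The pointer handed to
the device is the same one the CPU uses, which is the sharing-address
half of SVA+pin; the pin supplies the stable-latency half.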