From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.1 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4EC69C433DB for ; Mon, 22 Mar 2021 08:13:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 82A3360233 for ; Mon, 22 Mar 2021 08:13:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 82A3360233 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=shipmail.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 45E438D0003; Mon, 22 Mar 2021 03:54:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 40F718D0001; Mon, 22 Mar 2021 03:54:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2754A8D0003; Mon, 22 Mar 2021 03:54:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0125.hostedemail.com [216.40.44.125]) by kanga.kvack.org (Postfix) with ESMTP id 0842E8D0001 for ; Mon, 22 Mar 2021 03:54:32 -0400 (EDT) Received: from smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 6407B18025582 for ; Mon, 22 Mar 2021 08:13:22 +0000 (UTC) X-FDA: 77946795444.26.917917B Received: from pio-pvt-msa3.bahnhof.se (pio-pvt-msa3.bahnhof.se [79.136.2.42]) by imf07.hostedemail.com (Postfix) with ESMTP id 3F8D1A0009CA for ; Mon, 22 Mar 2021 08:13:21 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by pio-pvt-msa3.bahnhof.se (Postfix) with ESMTP id 6D39E3FE5D; Mon, 22 Mar 2021 09:13:19 +0100 (CET) Authentication-Results: pio-pvt-msa3.bahnhof.se; dkim=pass (1024-bit key; unprotected) header.d=shipmail.org header.i=@shipmail.org header.b=HcLH/7tT; dkim-atps=neutral X-Virus-Scanned: Debian amavisd-new at bahnhof.se Received: from pio-pvt-msa3.bahnhof.se ([127.0.0.1]) by localhost (pio-pvt-msa3.bahnhof.se [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 0eW8CeNzc491; Mon, 22 Mar 2021 09:13:18 +0100 (CET) Received: by pio-pvt-msa3.bahnhof.se (Postfix) with ESMTPA id BCAB73F39B; Mon, 22 Mar 2021 09:13:15 +0100 (CET) Received: from [192.168.1.108] (2.71.25.58.mobile.tre.se [2.71.25.58]) by mail1.shipmail.org (Postfix) with ESMTPSA id A54013600BA; Mon, 22 Mar 2021 09:13:14 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=shipmail.org; s=mail; t=1616400795; bh=ftxuHrqVt3LcJ6pfzDEqMV30xdNNqBuZ31nTRzOF8Y8=; h=Subject:To:Cc:References:From:Date:In-Reply-To:From; b=HcLH/7tTFUKA4+aZAskEI+go7IUsEJe86sYoHi5kr+7X8iPzvcQcUkemZ2QBhkDLV jCwml0Mbl0nrlqdl/eA3iI5Q/UDFxXvboarpFOPkmTzQK4tOKXlgLD3LbC/ZboU4hx 7bL+/izXvJaNoW1fOESN7615jEVMnPKnpj2WNDj0= Subject: Re: [RFC PATCH 2/2] mm,drm/ttm: Use VM_PFNMAP for TTM vmas To: =?UTF-8?Q?Christian_K=c3=b6nig?= , dri-devel@lists.freedesktop.org Cc: David Airlie , Daniel Vetter , Andrew Morton , Jason Gunthorpe , linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <20210321184529.59006-1-thomas_os@shipmail.org> <20210321184529.59006-3-thomas_os@shipmail.org> From: =?UTF-8?Q?Thomas_Hellstr=c3=b6m_=28Intel=29?= Message-ID: <5578ccaa-9751-3a01-1846-330c09bf9ce7@shipmail.org> Date: Mon, 22 Mar 2021 09:13:13 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US X-Stat-Signature: ryggn3h6keh7ske51izimrsd68ujg9c7 X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 3F8D1A0009CA Received-SPF: none (shipmail.org>: No applicable sender policy available) receiver=imf07; identity=mailfrom; envelope-from=""; helo=pio-pvt-msa3.bahnhof.se; client-ip=79.136.2.42 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1616400801-453815 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hi! On 3/22/21 8:47 AM, Christian K=C3=B6nig wrote: > Am 21.03.21 um 19:45 schrieb Thomas Hellstr=C3=B6m (Intel): >> To block fast gup we need to make sure TTM ptes are always special. >> With MIXEDMAP we, on architectures that don't support pte_special, >> insert normal ptes, but OTOH on those architectures, fast is not >> supported. >> At the same time, the function documentation to vm_normal_page()=20 >> suggests >> that ptes pointing to system memory pages of MIXEDMAP vmas are always >> normal, but that doesn't seem consistent with what's implemented in >> vmf_insert_mixed(). I'm thus not entirely sure this patch is actually >> needed. >> >> But to make sure and to avoid also normal (non-fast) gup, make all >> TTM vmas PFNMAP. With PFNMAP we can't allow COW mappings >> anymore so make is_cow_mapping() available and use it to reject >> COW mappigs at mmap time. > > I would separate the disallowing of COW mapping from the PFN change.=20 > I'm pretty sure that COW mappings never worked on TTM BOs in the first=20 > place. COW doesn't work with PFNMAP together with non-linear maps, so as a=20 consequence from moving from MIXEDMAP to PFNMAP we must disallow COW, so=20 it seems logical to me to do it in one patch. And working COW was one of the tests I used for huge PMDs/PUDs, so it=20 has indeed been working, but I can't think of any relevant use-cases. Did you, BTW, have a chance to test this with WC mappings? Thanks, /Thomas > > But either way this patch is Reviewed-by: Christian K=C3=B6nig=20 > . > > Thanks, > Christian. > >> >> There was previously a comment in the code that WC mappings together >> with x86 PAT + PFNMAP was bad for performance. However from looking at >> vmf_insert_mixed() it looks like in the current code PFNMAP and MIXEDM= AP >> are handled the same for architectures that support pte_special. This >> means there should not be a performance difference anymore, but this >> needs to be verified. >> >> Cc: Christian Koenig >> Cc: David Airlie >> Cc: Daniel Vetter >> Cc: Andrew Morton >> Cc: Jason Gunthorpe >> Cc: linux-mm@kvack.org >> Cc: dri-devel@lists.freedesktop.org >> Cc: linux-kernel@vger.kernel.org >> Signed-off-by: Thomas Hellstr=C3=B6m (Intel) >> --- >> =C2=A0 drivers/gpu/drm/ttm/ttm_bo_vm.c | 22 ++++++++-------------- >> =C2=A0 include/linux/mm.h=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0 5 +++++ >> =C2=A0 mm/internal.h=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0 5 ----- >> =C2=A0 3 files changed, 13 insertions(+), 19 deletions(-) >> >> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c=20 >> b/drivers/gpu/drm/ttm/ttm_bo_vm.c >> index 1c34983480e5..708c6fb9be81 100644 >> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c >> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c >> @@ -372,12 +372,7 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct=20 >> vm_fault *vmf, >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * at arbi= trary times while the data is mmap'ed. >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * See vmf= _insert_mixed_prot() for a discussion. >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ >> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (vma->vm_flags & VM_MIX= EDMAP) >> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 re= t =3D vmf_insert_mixed_prot(vma, address, >> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 __pfn_to_pfn_t(pfn, PFN_DEV), >> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 prot); >> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 else >> -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 re= t =3D vmf_insert_pfn_prot(vma, address, pfn, prot); >> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ret =3D vmf_insert_pfn_pro= t(vma, address, pfn, prot); >> =C2=A0 =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* Never= error on prefaulted PTEs */ >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (unlikely((r= et & VM_FAULT_ERROR))) { >> @@ -555,18 +550,14 @@ static void ttm_bo_mmap_vma_setup(struct=20 >> ttm_buffer_object *bo, struct vm_area_s >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * Note: We're transferring the bo= reference to >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * vma->vm_private_data here. >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ >> - >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 vma->vm_private_data =3D bo; >> =C2=A0 =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* >> -=C2=A0=C2=A0=C2=A0=C2=A0 * We'd like to use VM_PFNMAP on shared mappi= ngs, where >> -=C2=A0=C2=A0=C2=A0=C2=A0 * (vma->vm_flags & VM_SHARED) !=3D 0, for pe= rformance reasons, >> -=C2=A0=C2=A0=C2=A0=C2=A0 * but for some reason VM_PFNMAP + x86 PAT + = write-combine is very >> -=C2=A0=C2=A0=C2=A0=C2=A0 * bad for performance. Until that has been s= orted out, use >> -=C2=A0=C2=A0=C2=A0=C2=A0 * VM_MIXEDMAP on all mappings. See freedeskt= op.org bug #75719 >> +=C2=A0=C2=A0=C2=A0=C2=A0 * PFNMAP forces us to block COW mappings in = mmap(), >> +=C2=A0=C2=A0=C2=A0=C2=A0 * and with MIXEDMAP we would incorrectly all= ow fast gup >> +=C2=A0=C2=A0=C2=A0=C2=A0 * on TTM memory on architectures that don't = have pte_special. >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ >> -=C2=A0=C2=A0=C2=A0 vma->vm_flags |=3D VM_MIXEDMAP; >> -=C2=A0=C2=A0=C2=A0 vma->vm_flags |=3D VM_IO | VM_DONTEXPAND | VM_DONT= DUMP; >> +=C2=A0=C2=A0=C2=A0 vma->vm_flags |=3D VM_PFNMAP | VM_IO | VM_DONTEXPA= ND | VM_DONTDUMP; >> =C2=A0 } >> =C2=A0 =C2=A0 int ttm_bo_mmap(struct file *filp, struct vm_area_struct= *vma, >> @@ -579,6 +570,9 @@ int ttm_bo_mmap(struct file *filp, struct=20 >> vm_area_struct *vma, >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (unlikely(vma->vm_pgoff < DRM_FILE_P= AGE_OFFSET_START)) >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return -EINVAL; >> =C2=A0 +=C2=A0=C2=A0=C2=A0 if (unlikely(is_cow_mapping(vma->vm_flags))= ) >> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return -EINVAL; >> + >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 bo =3D ttm_bo_vm_lookup(bdev, vma->vm_p= goff, vma_pages(vma)); >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (unlikely(!bo)) >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return -EINVAL; >> diff --git a/include/linux/mm.h b/include/linux/mm.h >> index 77e64e3eac80..c6ebf7f9ddbb 100644 >> --- a/include/linux/mm.h >> +++ b/include/linux/mm.h >> @@ -686,6 +686,11 @@ static inline bool vma_is_accessible(struct=20 >> vm_area_struct *vma) >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return vma->vm_flags & VM_ACCESS_FLAGS; >> =C2=A0 } >> =C2=A0 +static inline bool is_cow_mapping(vm_flags_t flags) >> +{ >> +=C2=A0=C2=A0=C2=A0 return (flags & (VM_SHARED | VM_MAYWRITE)) =3D=3D = VM_MAYWRITE; >> +} >> + >> =C2=A0 #ifdef CONFIG_SHMEM >> =C2=A0 /* >> =C2=A0=C2=A0 * The vma_is_shmem is not inline because it is used only = by slow >> diff --git a/mm/internal.h b/mm/internal.h >> index 9902648f2206..1432feec62df 100644 >> --- a/mm/internal.h >> +++ b/mm/internal.h >> @@ -296,11 +296,6 @@ static inline unsigned int buddy_order(struct=20 >> page *page) >> =C2=A0=C2=A0 */ >> =C2=A0 #define buddy_order_unsafe(page) READ_ONCE(page_private(page)) >> =C2=A0 -static inline bool is_cow_mapping(vm_flags_t flags) >> -{ >> -=C2=A0=C2=A0=C2=A0 return (flags & (VM_SHARED | VM_MAYWRITE)) =3D=3D = VM_MAYWRITE; >> -} >> - >> =C2=A0 /* >> =C2=A0=C2=A0 * These three helpers classifies VMAs for virtual memory = accounting. >> =C2=A0=C2=A0 */