Subject: Re: [RFC PATCH 6/7] drm/ttm: Introduce a huge page aligning TTM range manager.
From: Thomas Hellström (VMware)
Organization: VMware Inc.
To: Christian König, dri-devel@lists.freedesktop.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-graphics-maintainer@vmware.com
Cc: Thomas Hellstrom, Andrew Morton, Michal Hocko, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", Ralph Campbell, Jérôme Glisse
Date: Wed, 27 Nov 2019 13:30:14 +0100
Message-ID: <1f356be5-2535-8f76-f33f-540feb3a72ea@shipmail.org>
References: <20191127083120.34611-1-thomas_os@shipmail.org> <20191127083120.34611-7-thomas_os@shipmail.org>

On 11/27/19 11:05 AM, Christian König wrote:
> I don't see the advantage over just increasing the alignment
> requirements on the driver side?

The advantage is that we don't fail space allocation if we can't match
the alignment. We instead fall back to a lower alignment if it's
compatible with the GPU-required alignment.

Thanks,
/Thomas

>
> That would be a one-liner if I'm not completely mistaken.
>
> Regards,
> Christian.
>
> Am 27.11.19 um 09:31 schrieb Thomas Hellström (VMware):
>> From: Thomas Hellstrom
>>
>> Using huge page-table entries requires that the start of a buffer object
>> is huge page size aligned. So introduce a ttm_bo_man_get_node_huge()
>> function that attempts to accomplish this for allocations that are larger
>> than the huge page size, and provide a new range-manager instance that
>> uses that function.
>>
>> Cc: Andrew Morton
>> Cc: Michal Hocko
>> Cc: "Matthew Wilcox (Oracle)"
>> Cc: "Kirill A. Shutemov"
>> Cc: Ralph Campbell
>> Cc: "Jérôme Glisse"
>> Cc: "Christian König"
>> Signed-off-by: Thomas Hellstrom
>> ---
>>  drivers/gpu/drm/ttm/ttm_bo_manager.c | 92 ++++++++++++++++++++++++++++
>>  include/drm/ttm/ttm_bo_driver.h      |  1 +
>>  2 files changed, 93 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_manager.c b/drivers/gpu/drm/ttm/ttm_bo_manager.c
>> index 18d3debcc949..26aa1a2ae7f1 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_manager.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_manager.c
>> @@ -89,6 +89,89 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man,
>>      return 0;
>>  }
>>
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +static int ttm_bo_insert_aligned(struct drm_mm *mm, struct drm_mm_node *node,
>> +                                 unsigned long align_pages,
>> +                                 const struct ttm_place *place,
>> +                                 struct ttm_mem_reg *mem,
>> +                                 unsigned long lpfn,
>> +                                 enum drm_mm_insert_mode mode)
>> +{
>> +    if (align_pages >= mem->page_alignment &&
>> +        (!mem->page_alignment || align_pages % mem->page_alignment == 0)) {
>> +        return drm_mm_insert_node_in_range(mm, node,
>> +                                           mem->num_pages,
>> +                                           align_pages, 0,
>> +                                           place->fpfn, lpfn, mode);
>> +    }
>> +
>> +    return -ENOSPC;
>> +}
>> +
>> +static int ttm_bo_man_get_node_huge(struct ttm_mem_type_manager *man,
>> +                                    struct ttm_buffer_object *bo,
>> +                                    const struct ttm_place *place,
>> +                                    struct ttm_mem_reg *mem)
>> +{
>> +    struct ttm_range_manager *rman = (struct ttm_range_manager *) man->priv;
>> +    struct drm_mm *mm = &rman->mm;
>> +    struct drm_mm_node *node;
>> +    unsigned long align_pages;
>> +    unsigned long lpfn;
>> +    enum drm_mm_insert_mode mode = DRM_MM_INSERT_BEST;
>> +    int ret;
>> +
>> +    node = kzalloc(sizeof(*node), GFP_KERNEL);
>> +    if (!node)
>> +        return -ENOMEM;
>> +
>> +    lpfn = place->lpfn;
>> +    if (!lpfn)
>> +        lpfn = man->size;
>> +
>> +    mode = DRM_MM_INSERT_BEST;
>> +    if (place->flags & TTM_PL_FLAG_TOPDOWN)
>> +        mode = DRM_MM_INSERT_HIGH;
>> +
>> +    spin_lock(&rman->lock);
>> +    if (IS_ENABLED(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)) {
>> +        align_pages = (HPAGE_PUD_SIZE >> PAGE_SHIFT);
>> +        if (mem->num_pages >= align_pages) {
>> +            ret = ttm_bo_insert_aligned(mm, node, align_pages,
>> +                                        place, mem, lpfn, mode);
>> +            if (!ret)
>> +                goto found_unlock;
>> +        }
>> +    }
>> +
>> +    align_pages = (HPAGE_PMD_SIZE >> PAGE_SHIFT);
>> +    if (mem->num_pages >= align_pages) {
>> +        ret = ttm_bo_insert_aligned(mm, node, align_pages, place, mem,
>> +                                    lpfn, mode);
>> +        if (!ret)
>> +            goto found_unlock;
>> +    }
>> +
>> +    ret = drm_mm_insert_node_in_range(mm, node, mem->num_pages,
>> +                                      mem->page_alignment, 0,
>> +                                      place->fpfn, lpfn, mode);
>> +found_unlock:
>> +    spin_unlock(&rman->lock);
>> +
>> +    if (unlikely(ret)) {
>> +        kfree(node);
>> +    } else {
>> +        mem->mm_node = node;
>> +        mem->start = node->start;
>> +    }
>> +
>> +    return 0;
>> +}
>> +#else
>> +#define ttm_bo_man_get_node_huge ttm_bo_man_get_node
>> +#endif
>> +
>> +
>>  static void ttm_bo_man_put_node(struct ttm_mem_type_manager *man,
>>                                  struct ttm_mem_reg *mem)
>>  {
>> @@ -154,3 +237,12 @@ const struct ttm_mem_type_manager_func ttm_bo_manager_func = {
>>      .debug = ttm_bo_man_debug
>>  };
>>  EXPORT_SYMBOL(ttm_bo_manager_func);
>> +
>> +const struct ttm_mem_type_manager_func ttm_bo_manager_huge_func = {
>> +    .init = ttm_bo_man_init,
>> +    .takedown = ttm_bo_man_takedown,
>> +    .get_node = ttm_bo_man_get_node_huge,
>> +    .put_node = ttm_bo_man_put_node,
>> +    .debug = ttm_bo_man_debug
>> +};
>> +EXPORT_SYMBOL(ttm_bo_manager_huge_func);
>> diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
>> index cac7a8a0825a..868bd0d4be6a 100644
>> --- a/include/drm/ttm/ttm_bo_driver.h
>> +++ b/include/drm/ttm/ttm_bo_driver.h
>> @@ -888,5 +888,6 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo);
>>  pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp);
>>
>>  extern const struct ttm_mem_type_manager_func ttm_bo_manager_func;
>> +extern const struct ttm_mem_type_manager_func ttm_bo_manager_huge_func;
>>
>>  #endif