Subject: Re: [RFC PATCH 1/2] mm,drm/ttm: Block fast GUP to TTM huge pages
From: Thomas Hellström (Intel)
To: Jason Gunthorpe
Cc: Christian König, David Airlie, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-mm@kvack.org, Andrew Morton
Date: Thu, 25 Mar 2021 12:53:15 +0100
In-Reply-To: <20210325113023.GT2356281@nvidia.com>

On 3/25/21 12:30 PM, Jason Gunthorpe wrote:
> On Thu, Mar 25, 2021 at 10:51:35AM +0100, Thomas Hellström (Intel) wrote:
>
>>> Please explain that further. Why do we need the mmap lock to insert PMDs
>>> but not when insert PTEs?
>> We don't. But once you've inserted a PMD directory you can't remove it
>> unless you have the mmap lock (and probably also the i_mmap_lock in write
>> mode). That for example means that if you have a VRAM region mapped with
>> huge PMDs, and then it gets evicted, and you happen to read a byte from it
>> when it's evicted and therefore populate the full region with PTEs pointing
>> to system pages, you can't go back to huge PMDs again without a munmap() in
>> between.
> This is all basically magic to me still, but THP does this
> transformation and I think what it does could work here too. We
> probably wouldn't be able to upgrade while handling fault, but at the
> same time, this should be quite rare as it would require the driver to
> have supplied a small page for this VMA at some point.

IIRC THP handles this using khugepaged, grabbing the lock in write mode
when coalescing, and yeah, I don't think anything prevents anyone from
extending khugepaged to do that also for special huge page table entries.

>
>>> Apart from that I still don't fully get why we need this in the first
>>> place.
>> Because virtual huge page address boundaries need to be aligned with
>> physical huge page address boundaries, and mmap can happen before bos are
>> populated so you have no way of knowing how physical huge page
>> address
> But this is a mmap-time problem, fault can't fix mmap using the wrong VA.

Nope. The point here was that in this case, to make sure mmap uses the
correct VA to give us a reasonable chance of alignment, the driver
might need to be aware of and do trickery with the huge page-table-entry
sizes anyway, although I think in most cases a standard helper for this
can be supplied.

/Thomas

>
>>> I really don't see that either. When a buffer is accessed by the CPU it
>>> is in > 90% of all cases completely accessed. Not faulting in full
>>> ranges is just optimizing for a really unlikely case here.
>> It might be that you're right, but are all drivers wanting to use this like
>> drm in this respect? Using the interface to fault in a 1G range in the hope
>> it could map it to a huge pud may unexpectedly consume and populate some 16+
>> MB of page tables.
> If the underlying device block size is so big then sure, why not? The
> "unexpectedly" should be quite rare/non existant anyhow.
>
> Jason
>
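
To make the alignment point above concrete, here is a rough userspace sketch
of the condition a "standard helper" would have to satisfy: a virtual address
can only be backed by a huge entry if it has the same offset within the huge
page as the physical address. The names huge_alignment_matches() and
pick_aligned_va() are hypothetical and not existing kernel interfaces, and
the 2 MiB size is just the x86-64 PMD size used for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: 2 MiB is the x86-64 PMD mapping size; other
 * architectures and the PUD level use different sizes. */
#define HUGE_PMD_SIZE (UINT64_C(1) << 21)

/*
 * Hypothetical helper: true if va and pa share the same offset within a
 * huge page of huge_size bytes, which is the precondition for mapping
 * them with a single huge page-table entry.
 */
static bool huge_alignment_matches(uint64_t va, uint64_t pa,
                                   uint64_t huge_size)
{
    /* huge_size must be a non-zero power of two for the mask to work. */
    assert(huge_size != 0 && (huge_size & (huge_size - 1)) == 0);
    return ((va ^ pa) & (huge_size - 1)) == 0;
}

/*
 * Hypothetical helper: lowest address >= hint whose offset within a huge
 * page equals that of pa. A driver that does not yet know pa at mmap time
 * could pass pa = 0, i.e. align the mapping start to huge_size and hope
 * the backing memory is later allocated huge-page aligned as well.
 */
static uint64_t pick_aligned_va(uint64_t hint, uint64_t pa,
                                uint64_t huge_size)
{
    uint64_t mask = huge_size - 1;
    uint64_t va = (hint & ~mask) | (pa & mask);

    if (va < hint)
        va += huge_size;
    return va;
}
```

This only captures the arithmetic. It says nothing about locking or about
how the fault handler picks between PTE, PMD and PUD entries, which is the
harder part of the discussion.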