From: Marc Zyngier <maz@kernel.org>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: ankita@nvidia.com, alex.williamson@redhat.com,
	naoya.horiguchi@nec.com, oliver.upton@linux.dev, aniketa@nvidia.com,
	cjia@nvidia.com, kwankhede@nvidia.com, targupta@nvidia.com,
	vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com,
	jhubbard@nvidia.com, danw@nvidia.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-mm@kvack.org
Subject: Re: [PATCH v3 0/6] Expose GPU memory as coherently CPU accessible
Date: Thu, 13 Apr 2023 10:52:10 +0100
Message-ID: <86ile0kt2t.wl-maz@kernel.org>
References: <20230405180134.16932-1-ankita@nvidia.com>
	<86sfd5l1yf.wl-maz@kernel.org>

On Wed, 12 Apr 2023 13:53:07 +0100,
Jason Gunthorpe wrote:
>
> On Wed, Apr 12, 2023 at 01:28:08PM +0100, Marc Zyngier wrote:
> > On Wed, 05 Apr 2023 19:01:28 +0100,
> > wrote:
> > >
> > > From: Ankit Agrawal
> > >
> > > NVIDIA's upcoming Grace Hopper Superchip provides a PCI-like device
> > > for the on-chip GPU that is the logical OS representation of the
> > > internal proprietary cache coherent interconnect.
> > >
> > > This representation has a number of limitations compared to a real PCI
> > > device, in particular, it does not model the coherent GPU memory
> > > aperture as a PCI config space BAR, and PCI doesn't know anything
> > > about cacheable memory types.
> > >
> > > Provide a VFIO PCI variant driver that adapts the unique PCI
> > > representation into a more standard PCI representation facing
> > > userspace. The GPU memory aperture is obtained from ACPI, according to
> > > the FW specification, and exported to userspace as the VFIO_REGION
> > > that covers the first PCI BAR. qemu will naturally generate a PCI
> > > device in the VM where the cacheable aperture is reported in BAR1.
> > >
> > > Since this memory region is actually cache coherent with the CPU, the
> > > VFIO variant driver will mmap it into VMA using a cacheable mapping.
> > >
> > > As this is the first time an ARM environment has placed cacheable
> > > non-struct page backed memory (eg from remap_pfn_range) into a KVM
> > > page table, fix a bug in ARM KVM where it does not copy the cacheable
> > > memory attributes from non-struct page backed PTEs to ensure the guest
> > > also gets a cacheable mapping.
> >
> > This is not a bug, but a conscious design decision. As you pointed out
> > above, nothing needed this until now, and a device mapping is the only
> > safe thing to do as we know exactly *nothing* about the memory that
> > gets mapped.
>
> IMHO, from the mm perspective, the bug is using pfn_is_map_memory() to
> determine the cachability or device memory status of a PFN in a
> VMA. That is not what that API is for.

It is the right API for what KVM/arm64 has been designed for. RAM gets
a normal memory mapping, and everything else gets device. That may not
suit your *new* use case, but that doesn't make it broken.

>
> The cachability should be determined by the pgprot bits in the VMA.
>
> VM_IO is the flag that says the VMA maps memory with side-effects.
>
> I understand in ARM KVM it is not allowed for the VM and host to have
> different cachability, so mis-detecting host cachable memory and
> making it forced non-cachable in the VM is not a safe thing to do?

Only if you insist on not losing coherency between the two aliases
used at the same time (something that would seem pretty improbable).
And said coherency can be restored by using CMOs, as documented in
B2.8.

	M.

-- 
Without deviation from the norm, progress is not possible.