Date: Wed, 20 Dec 2023 14:16:40 +0000
Message-ID: <87v88tt0vr.wl-maz@kernel.org>
From: Marc Zyngier
Subject: Re: [PATCH v4 2/3] kvm: arm64: set io memory s2 pte as normalnc for vfio pci devices
In-Reply-To: <20231218090719.22250-3-ankita@nvidia.com>
References: <20231218090719.22250-1-ankita@nvidia.com> <20231218090719.22250-3-ankita@nvidia.com>

On Mon, 18 Dec 2023 09:07:18 +0000, wrote:
>
> From: Ankit Agrawal
>
> To provide VM with the ability to get device IO memory with NormalNC
> property, map device MMIO in KVM for ARM64 at stage2 as NormalNC.
> Having NormalNC S2 default puts guests in control (based on [1],
> "Combining stage 1 and stage 2 memory type attributes") of device
> MMIO regions memory mappings.
> The rules are summarized below:
> ([(S1) - stage1], [(S2) - stage 2])
>
> S1         |  S2         | Result
> NORMAL-WB  |  NORMAL-NC  | NORMAL-NC
> NORMAL-WT  |  NORMAL-NC  | NORMAL-NC
> NORMAL-NC  |  NORMAL-NC  | NORMAL-NC
> DEVICE     |  NORMAL-NC  | DEVICE
>
> Generalizing this to non PCI devices may be problematic. E.g. GICv2
> vCPU interface, which is effectively a shared peripheral, can allow
> a guest to affect another guest's interrupt distribution. The issue
> may be solved by limiting the relaxation to mappings that have a user
> VMA. Still, there is insufficient information and uncertainty in the
> behavior of non PCI drivers. Hence caution is maintained and the change
> is restricted to the VFIO-PCI devices. PCIe on the other hand is safe
> because the PCI bridge does not generate errors, and thus does not cause
> uncontained failures.
>
> A new flag VM_VFIO_ALLOW_WC indicates to KVM that the device is WC capable.
> KVM uses this flag to activate the code.
>
> This could be extended to other devices in the future once that
> is deemed safe.
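
For reference, the four rows quoted above are consistent with a "more
restrictive type wins" reading of the combining rules. Below is a minimal
C sketch of that reading, assuming the non-FWB combining rules and
ignoring the Device sub-attributes. The enum, its ordering and
combine_s1_s2() are made up for illustration; they are not kernel
definitions and not part of the patch.

```c
#include <assert.h>

/*
 * Hypothetical memory types, ordered from least to most restrictive.
 * Illustration only: these are not the kernel's MAIR/PROT definitions.
 */
enum mem_type {
	MEM_NORMAL_WB,
	MEM_NORMAL_WT,
	MEM_NORMAL_NC,
	MEM_DEVICE,
};

/*
 * Combine the stage 1 (guest) and stage 2 (hypervisor) memory types by
 * keeping the more restrictive of the two. With S2 fixed to NormalNC this
 * reproduces the four rows of the table in the commit message.
 */
static enum mem_type combine_s1_s2(enum mem_type s1, enum mem_type s2)
{
	/* Reject values outside the illustrative enum. */
	assert(s1 <= MEM_DEVICE && s2 <= MEM_DEVICE);

	return s1 > s2 ? s1 : s2;
}
```

With S2 fixed to MEM_NORMAL_NC, every Normal S1 type combines to
MEM_NORMAL_NC and a Device S1 type stays MEM_DEVICE, which is the sense in
which the guest's stage 1 attributes pick between the two.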
>
> [1] section D8.5.5 of DDI0487J_a_a-profile_architecture_reference_manual.pdf
>
> Signed-off-by: Ankit Agrawal
> Suggested-by: Catalin Marinas
> Acked-by: Jason Gunthorpe
> Tested-by: Ankit Agrawal
> ---
>  arch/arm64/kvm/mmu.c | 18 ++++++++++++++----
>  include/linux/mm.h   | 13 +++++++++++++
>  2 files changed, 27 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index d14504821b79..e1e6847a793b 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1381,7 +1381,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	int ret = 0;
>  	bool write_fault, writable, force_pte = false;
>  	bool exec_fault, mte_allowed;
> -	bool device = false;
> +	bool device = false, vfio_allow_wc = false;
>  	unsigned long mmu_seq;
>  	struct kvm *kvm = vcpu->kvm;
>  	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
> @@ -1472,6 +1472,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	gfn = fault_ipa >> PAGE_SHIFT;
>  	mte_allowed = kvm_vma_mte_allowed(vma);
>
> +	vfio_allow_wc = (vma->vm_flags & VM_VFIO_ALLOW_WC);
> +
>  	/* Don't use the VMA after the unlock -- it may have vanished */
>  	vma = NULL;
>
> @@ -1557,10 +1559,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	if (exec_fault)
>  		prot |= KVM_PGTABLE_PROT_X;
>
> -	if (device)
> -		prot |= KVM_PGTABLE_PROT_DEVICE;
> -	else if (cpus_have_final_cap(ARM64_HAS_CACHE_DIC))
> +	if (device) {
> +		/*
> +		 * To provide VM with the ability to get device IO memory
> +		 * with NormalNC property, map device MMIO as NormalNC in S2.
> +		 */
> +		if (vfio_allow_wc)
> +			prot |= KVM_PGTABLE_PROT_NORMAL_NC;
> +		else
> +			prot |= KVM_PGTABLE_PROT_DEVICE;
> +	} else if (cpus_have_final_cap(ARM64_HAS_CACHE_DIC)) {
>  		prot |= KVM_PGTABLE_PROT_X;
> +	}
>
>  	/*
>  	 * Under the premise of getting a FSC_PERM fault, we just need to relax
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 2bea89dc0bdf..d2f0f969875c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -391,6 +391,19 @@ extern unsigned int kobjsize(const void *objp);
>  # define VM_UFFD_MINOR		VM_NONE
>  #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
>
> +/* This flag is used to connect VFIO to arch specific KVM code. It
> + * indicates that the memory under this VMA is safe for use with any
> + * non-cachable memory type inside KVM. Some VFIO devices, on some
> + * platforms, are thought to be unsafe and can cause machine crashes if
> + * KVM does not lock down the memory type.
> + */

Comment format.

> +#ifdef CONFIG_64BIT
> +#define VM_VFIO_ALLOW_WC_BIT	39
> +#define VM_VFIO_ALLOW_WC	BIT(VM_VFIO_ALLOW_WC_BIT)
> +#else
> +#define VM_VFIO_ALLOW_WC	VM_NONE
> +#endif
> +
>  /* Bits set in the VMA until the stack is in its final location */
>  #define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ | VM_STACK_EARLY)

The mm.h change should be standalone, separate from the KVM stuff.

	M.

-- 
Without deviation from the norm, progress is not possible.