From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0059C4332F for ; Tue, 12 Dec 2023 17:46:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 558446B031C; Tue, 12 Dec 2023 12:46:49 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 508276B031D; Tue, 12 Dec 2023 12:46:49 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3F7136B031E; Tue, 12 Dec 2023 12:46:49 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 314346B031C for ; Tue, 12 Dec 2023 12:46:49 -0500 (EST) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id EFFA680AA7 for ; Tue, 12 Dec 2023 17:46:48 +0000 (UTC) X-FDA: 81558896496.09.D5B46D0 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by imf08.hostedemail.com (Postfix) with ESMTP id AD802160017 for ; Tue, 12 Dec 2023 17:46:46 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=none; spf=pass (imf08.hostedemail.com: domain of cmarinas@kernel.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=cmarinas@kernel.org; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=arm.com (policy=none) ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1702403207; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wiNjlUGFThPC9Ue9cHngMXX8jYRGUAqxP5oKJPff5cg=; b=YQWjJ2OUKozS5j8COUYralMqGLkGiV1Prh4mKkVmhvUG3/9c6Vjs1cmV8fM9v9H8dQSRox WXtlKpHb4HNqtacp1Kxwl87mpalT6XmZI31mLkZC2R6tI0lWpIae2W1ffqESSOPY1aW8b4 Ope6Vqk6TL1N2xwIbrDhP4+cAI/Rdr0= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1702403207; a=rsa-sha256; cv=none; b=KUK3KailPpA2EJpyrKG7VLg+nq7oxyN5BkvsoYAHd65xFCS+BHdepZbq2GOBE2XwO8w69f UZreDsNqD5imro9TVPoyyrq5mjMa/Bm7NlRLS1G7ixPBkcA65tojszdAr0ydKRjow4XonJ LA+LwHmvKc7Wr0EjUcsrZYU+qJxBrQ4= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=none; spf=pass (imf08.hostedemail.com: domain of cmarinas@kernel.org designates 145.40.73.55 as permitted sender) smtp.mailfrom=cmarinas@kernel.org; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=arm.com (policy=none) Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by sin.source.kernel.org (Postfix) with ESMTP id EB669CE1926; Tue, 12 Dec 2023 17:46:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 60BECC433C8; Tue, 12 Dec 2023 17:46:36 +0000 (UTC) Date: Tue, 12 Dec 2023 17:46:34 +0000 From: Catalin Marinas To: ankita@nvidia.com Cc: jgg@nvidia.com, maz@kernel.org, oliver.upton@linux.dev, suzuki.poulose@arm.com, yuzenghui@huawei.com, will@kernel.org, alex.williamson@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, ardb@kernel.org, akpm@linux-foundation.org, gshan@redhat.com, linux-mm@kvack.org, lpieralisi@kernel.org, aniketa@nvidia.com, cjia@nvidia.com, kwankhede@nvidia.com, targupta@nvidia.com, vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com, jhubbard@nvidia.com, danw@nvidia.com, mochs@nvidia.com, kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: Re: [PATCH v3 2/2] kvm: arm64: set io memory s2 pte as normalnc for vfio pci devices Message-ID: References: <20231208164709.23101-1-ankita@nvidia.com> <20231208164709.23101-3-ankita@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20231208164709.23101-3-ankita@nvidia.com> X-Rspamd-Queue-Id: AD802160017 X-Rspam-User: X-Rspamd-Server: rspam11 X-Stat-Signature: b66nauw58p6hpg7rs6fwwsakyxfsigdk X-HE-Tag: 1702403206-348805 X-HE-Meta: U2FsdGVkX18Pyccxo0PMO3QpyRfVbA++uUg97iG4cIMiyWmnI7wfSeSobsnanXQgX1gY6KwS1ZxgK3SNkgyj7y4kcfcvUkZ1yMA3DL+6smoEhdPvL+tL+gYXcvUOc8cnWjvkM42ML+tGmYHU3WNoB4TfGkWT1QsnCqfCuAFWiwwHZ/VF6gTdJUvJQ/MpvF6QZ0arUllCGzYpjCXD2qA0q5NRmU1JyFXZM71DJKvDFya0coslR1NnrtxFsljjv3RmvyD5Q8XIwEIuueH2sRmSByBYpbv6mT52L4jKsMyKKOSPMwJRL5O1UuYywEQYFJNq6UdLtN2samjDGhNPxxMKfyUuJGpLYWaUMb6av7ySVtJ1whmCFcQX4RRrj+cIp9y+fNsFUtN9i/qKwDTOwDf5UdF6c3sMJgVKTKGNJLSdnqDvVlDW4uYDsWIMLY37YMJCSG5cKIwXP/xoq+vPw7922Lf8uy7IcTvzulDSfqU+Siyxn5NuxWnW2ojayoPTOsujdh5/FUyNmMANa5Nd3iccu0R0/Z9xxnNcfLRqcTudX+cK0rGoLrKn6aaMz9fB5nZ/rktsa0V5BfSTmue6+gyN9QpVKRu2XvTul6jw6XREqMQM26+RVzdSnrqtu4P1vh2BPFtrD96ujssUvVFmufEkfrkbMQqKBHkSX6zS5Xdc86uFuVBtPuTi5VvSKHeklwa3BIJCY1NOelpzQxHShDcAU0hGBQuqFHMEXi3LGBMeiOsKUxUX0jdO6B7VowNCVnh4xzX9puRuEVAuk7/bUep/xCbTPtJeJIO8+TE3EPFL/FNhhZOmByTvB26W0Q6Yc+2OY9xrITOnVyB6jZdENDpyAAtZNQzFwdVmJd3DcEotyRA7pKuh31dysJ7GxFER1n3Vazzs/VoTvnBQ7XOg3CX/vit7JI0i7RsueXvLySObu2kaYM/8iE9SWbmHLGM0Q5LfqZ28w6b+qNE30m8l330 TCGf1cWw Uv8BTNJDCaDlsjCKQ4w+OW6Im3IQ/b+RmOJt2eJ384GnVJmdyc0EhS47NcfycMvXEoKbbHLpcHINQVjXhPVaXHqbZQuNIYRVO6tFz0d0+SmIExae5F7jvpC2/S9alORJCxkNBYdhW9c/Y7JwGrDPiLKKAKKLU9IH4gt3QgbTUj7Bo4Lhqkmm2UwNhPrkclhUmAMDlz14o8jjsjTMwY07JQJxStngAUhdVSffiQ8lLQejX4IkkP83rMTr8xfUoLi+FtY13vKmZsEHxLm6IpQn96ulz32aN+w2k+3AIdpPQQ6SrkAtTzlw4hPoto9KJAFegDV1XchMPcd80N+7dzk5U0RdiXg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Dec 08, 2023 at 10:17:09PM +0530, ankita@nvidia.com wrote: > arch/arm64/kvm/hyp/pgtable.c | 3 +++ > arch/arm64/kvm/mmu.c | 16 +++++++++++++--- > drivers/vfio/pci/vfio_pci_core.c | 3 ++- > include/linux/mm.h | 7 +++++++ > 4 files changed, 25 insertions(+), 4 deletions(-) It might be worth factoring out the vfio bits into a separate patch together with a bit of documentation around this new vma flag (up to Alex really). > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c > index d4835d553c61..c8696c9e7a60 100644 > --- a/arch/arm64/kvm/hyp/pgtable.c > +++ b/arch/arm64/kvm/hyp/pgtable.c > @@ -722,6 +722,9 @@ static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot p > kvm_pte_t attr; > u32 sh = KVM_PTE_LEAF_ATTR_LO_S2_SH_IS; > > + if (device && normal_nc) > + return -EINVAL; Ah, the comment Will and I made on patch 1 is handled here. Add a WARN_ON_ONCE() and please move this hunk to the first patch, it makes more sense there. > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > index d14504821b79..1ce1b6d89bf9 100644 > --- a/arch/arm64/kvm/mmu.c > +++ b/arch/arm64/kvm/mmu.c > @@ -1381,7 +1381,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, > int ret = 0; > bool write_fault, writable, force_pte = false; > bool exec_fault, mte_allowed; > - bool device = false; > + bool device = false, vfio_pci_device = false; I don't think the variable here should be named vfio_pci_device, the VM_* flag doesn't mention PCI. So just something like "vfio_allow_wc". > unsigned long mmu_seq; > struct kvm *kvm = vcpu->kvm; > struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache; > @@ -1472,6 +1472,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, > gfn = fault_ipa >> PAGE_SHIFT; > mte_allowed = kvm_vma_mte_allowed(vma); > > + vfio_pci_device = !!(vma->vm_flags & VM_VFIO_ALLOW_WC); Nitpick: no need for !!, you are assigning to a bool variable already. > diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c > index 1cbc990d42e0..c3f95ec7fc3a 100644 > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -1863,7 +1863,8 @@ int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma > * See remap_pfn_range(), called from vfio_pci_fault() but we can't > * change vm_flags within the fault handler. Set them now. > */ > - vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); > + vm_flags_set(vma, VM_VFIO_ALLOW_WC | VM_IO | VM_PFNMAP | > + VM_DONTEXPAND | VM_DONTDUMP); Please add a comment here that write-combining is allowed to be enabled by the arch (KVM) code but the default user mmap() will still use pgprot_noncached(). > vma->vm_ops = &vfio_pci_mmap_ops; > > return 0; > diff --git a/include/linux/mm.h b/include/linux/mm.h > index a422cc123a2d..8d3c4820c492 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -391,6 +391,13 @@ extern unsigned int kobjsize(const void *objp); > # define VM_UFFD_MINOR VM_NONE > #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */ > > +#ifdef CONFIG_64BIT > +#define VM_VFIO_ALLOW_WC_BIT 39 /* Convey KVM to map S2 NORMAL_NC */ This comment shouldn't be in the core header file. It knows nothing about S2 and Normal-NC, that's arm64 terminology. You can mention something like VFIO can use this flag hint that write-combining is allowed. > +#define VM_VFIO_ALLOW_WC BIT(VM_VFIO_ALLOW_WC_BIT) > +#else > +#define VM_VFIO_ALLOW_WC VM_NONE > +#endif And I think we need to add some documentation (is there any VFIO-specific doc) that describes what this flag actually means, what is permitted. For example, arm64 doesn't have write-combining without speculative fetches. So if one adds this flag to a new driver, they should know the implications. There's also an expectation that the actual driver (KVM guests) or maybe later DPDK can choose the safe non-cacheable or write-combine (Linux terminology) attributes for the BAR. -- Catalin