Date: Tue, 29 Apr 2025 14:27:02 +0100
From: Catalin Marinas
To: Ankit Agrawal
Cc: Jason Gunthorpe, Oliver Upton, Sean Christopherson, Marc Zyngier,
 "joey.gouly@arm.com", "suzuki.poulose@arm.com", "yuzenghui@huawei.com",
 "will@kernel.org", "ryan.roberts@arm.com", "shahuang@redhat.com",
 "lpieralisi@kernel.org", "david@redhat.com", Aniket Agashe, Neo Jia,
 Kirti Wankhede, "Tarun Gupta (SW-GPU)", Vikram Sethi, Andy Currid,
 Alistair Popple, John Hubbard, Dan Williams, Zhi Wang, Matt Ochs,
 Uday Dhoke, Dheeraj Nigam, Krishnakant Jaju, "alex.williamson@redhat.com",
 "sebastianene@google.com", "coltonlewis@google.com", "kevin.tian@intel.com",
 "yi.l.liu@intel.com", "ardb@kernel.org", "akpm@linux-foundation.org",
 "gshan@redhat.com", "linux-mm@kvack.org", "ddutile@redhat.com",
 "tabba@google.com", "qperret@google.com", "kvmarm@lists.linux.dev",
 "linux-kernel@vger.kernel.org", "linux-arm-kernel@lists.infradead.org"
Subject: Re: [PATCH v3 1/1] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags
References: <20250422135452.GL823903@nvidia.com>
 <20250422170324.GB1645809@nvidia.com>
 <20250422233556.GB1648741@nvidia.com>
 <20250423120243.GD1648741@nvidia.com>
 <20250423130323.GE1648741@nvidia.com>

On Tue, Apr 29, 2025 at 10:47:58AM +0000, Ankit Agrawal wrote:
> >> In this case KVM still needs to know the properties of the device. Not
> >> sure how it could best do this without a vma.
> >
> > Well, the idea I hope to succeed with would annotate this kind of
> > information inside the page list that would be exchanged through the
> > FD.
> >
> > There is an monstrous thread here about this topic:
> >
> > https://lore.kernel.org/all/20250107142719.179636-1-yilun.xu@linux.intel.com/
> >
> > I can't find it in the huge thread but I did explain some thoughts on
> > how it could work
> >
> > Jason
>
> Hi,
>
> Based on the recent discussions, I gather that having a KVM_CAP
> to expose the support for cacheable PFNMAP to the VMM would be
> useful from VM migration point of view.
>
> However it appears that the memslot flag isn't a must-have. The memslot
> flag cannot influence the KVM code anyways. For FWB, the PFNMAP would
> be cacheable and userspace should just assume S2FWB behavior; it would
> be a security bug otherwise as Jason pointed earlier (S1 cacheable,
> S2 noncacheable). For !FWB, a cacheable PFNMAP could not be allowed
> and VMM shouldn't attempt to create memslot at all by referring the cap.
>
> Also, can we take the fd based communication path between VFIO
> and KVM separately?
>
> I am planning to send out the series with the following implementation.
> Please let me know if there are any disagreements or concerns.
>
> 1. Block cacheable PFN map in memslot creation (kvm_arch_prepare_memory_region)
>    and during fault handling (user_mem_abort()).

I forgot the details here. I think it makes sense in general but as the
first patch, we currently block cacheable PFNMAP anyway. Probably what
you meant already but - this patch should block the PFNMAP slot if
there's a cacheable S1 alias.

> 2. Enable support for cacheable PFN maps if S2FWB is enabled by following
>    the vma pgprot (this patch).
> 3. Add and expose the new KVM cap to expose cacheable PFNMAP (set to false
>    for !FWB).

I'll defer the memslot flag decision to the KVM maintainers. If we had
one, it will enforce (2) or reject it as per (1) depending on the S1
attributes. Without the memslot flag, I assume at least the VMM will
have to enable KVM_CAP_ARM_WB_PFNMAP (or whatever it will be called) to
get the new behaviour.

BTW, we should reject exec mappings as well (they probably fail for S1
VFIO since set_pte_at() will try to do cache maintenance).

--
Catalin
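
To make the combined effect of steps (1) to (3) and the exec restriction
easier to review, here is a rough userspace model of the decision, not
kernel code. Every name in it (pfnmap_req, pfnmap_s2_policy, the enum
values) is made up for illustration, and applying the exec check only to
the cacheable case is one reading of the comment above rather than
something settled in the thread.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the stage 2 policy discussed in this thread.
 * None of these identifiers exist in the kernel.
 */
enum s2_decision {
	S2_REJECT,            /* refuse the memslot / fail the fault */
	S2_MAP_NONCACHEABLE,  /* today's behaviour for PFNMAP regions */
	S2_MAP_CACHEABLE,     /* new behaviour, follows the vma pgprot */
};

struct pfnmap_req {
	bool s1_cacheable;  /* vma pgprot (stage 1) is Normal cacheable */
	bool has_s2fwb;     /* CPU implements FEAT_S2FWB */
	bool cap_enabled;   /* VMM opted in via the proposed capability */
	bool exec;          /* executable mapping requested */
};

static enum s2_decision pfnmap_s2_policy(const struct pfnmap_req *r)
{
	/* No cacheable S1 alias: nothing changes. */
	if (!r->s1_cacheable)
		return S2_MAP_NONCACHEABLE;

	/* Assumption: exec is rejected for cacheable PFNMAP only. */
	if (r->exec)
		return S2_REJECT;

	/*
	 * Step (1): a cacheable S1 alias with a non-cacheable S2 is the
	 * unsafe combination, so block it unless step (2) applies.
	 */
	if (!r->has_s2fwb || !r->cap_enabled)
		return S2_REJECT;

	/* Step (2): S2FWB present and the VMM opted in. */
	return S2_MAP_CACHEABLE;
}
```

In this model the capability acts as the opt-in, so a VMM that never
enables it keeps seeing cacheable PFNMAP slots rejected as in step (1).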