Date: Mon, 17 Mar 2025 19:54:25 +0000
From: Catalin Marinas
To: Marc Zyngier
Cc: Ankit Agrawal, Jason Gunthorpe, "oliver.upton@linux.dev",
 "joey.gouly@arm.com", "suzuki.poulose@arm.com", "yuzenghui@huawei.com",
 "will@kernel.org", "ryan.roberts@arm.com", "shahuang@redhat.com",
 "lpieralisi@kernel.org", "david@redhat.com", Aniket Agashe, Neo Jia,
 Kirti Wankhede, "Tarun Gupta (SW-GPU)", Vikram Sethi, Andy Currid,
 Alistair Popple, John Hubbard, Dan Williams, Zhi Wang, Matt Ochs,
 Uday Dhoke, Dheeraj Nigam, Krishnakant Jaju,
 "alex.williamson@redhat.com", "sebastianene@google.com",
 "coltonlewis@google.com", "kevin.tian@intel.com", "yi.l.liu@intel.com",
 "ardb@kernel.org", "akpm@linux-foundation.org", "gshan@redhat.com",
 "linux-mm@kvack.org", "ddutile@redhat.com", "tabba@google.com",
 "qperret@google.com", "seanjc@google.com", "kvmarm@lists.linux.dev",
 "linux-kernel@vger.kernel.org", "linux-arm-kernel@lists.infradead.org"
Subject: Re: [PATCH v3 1/1] KVM: arm64: Allow cacheable stage 2 mapping
 using VMA flags
References: <20250310103008.3471-1-ankita@nvidia.com>
 <20250310103008.3471-2-ankita@nvidia.com>
 <861pv5p0c3.wl-maz@kernel.org>
 <86r033olwv.wl-maz@kernel.org>
 <87tt7y7j6r.wl-maz@kernel.org>
 <8634fcnh0n.wl-maz@kernel.org>
In-Reply-To: <8634fcnh0n.wl-maz@kernel.org>

On Mon, Mar 17, 2025 at 09:27:52AM +0000, Marc Zyngier wrote:
> On Mon, 17 Mar 2025 05:55:55
> +0000, Ankit Agrawal wrote:
> >
> > >> For my education, what is an accepted way to communicate this? Please let
> > >> me know if there are any relevant examples that you may be aware of.
> > >
> > > A KVM capability is what is usually needed.
> >
> > I see. If IIUC, this would involve a corresponding Qemu (usermode) change
> > to fetch the new KVM cap. Then it could fail in case the FWB is not
> > supported with some additional conditions (so that the currently supported
> > configs with !FWB won't break on usermode).
> >
> > The proposed code change is to map in S2 as NORMAL when vma flags
> > has VM_PFNMAP. However, Qemu cannot know that driver is mapping
> > with PFNMAP or not. So how may Qemu decide whether it is okay to
> > fail for !FWB or not?
>
> This is not about FWB as far as userspace is concerned. This is about
> PFNMAP as non-device memory. If the host doesn't have FWB, then the
> "PFNMAP as non-device memory" capability doesn't exist, and userspace
> fails early.
>
> Userspace must also have some knowledge of what device it obtains the
> mapping from, and whether that device requires some extra host
> capability to be assigned to the guest.
>
> You can then check whether the VMA associated with the memslot is
> PFNMAP or not, if the memslot has been enabled for PFNMAP mappings
> (either globally or on a per-memslot basis, I don't really care).

Trying to page this back in, I think there are three stages:

1. A KVM cap that the VMM can use to check for non-device PFNMAP (or
   rather cacheable PFNMAP since we already support Normal NC).

2. Memslot registration - we need a way for the VMM to require such
   cacheable PFNMAP and for KVM to check. The current patch relies on
   (a) the stage 1 vma attributes, which I'm not a fan of. An
   alternative I suggested was (b) a VM_FORCE_CACHEABLE vma flag, on
   the assumption that the vfio driver knows if it supports cacheable
   (it's a bit of a stretch trying to make this generic). Yet another
   option is (c) a KVM_MEM_CACHEABLE flag that the VMM passes at
   memslot registration.

3. user_mem_abort() - follows the above logic (whatever we decide),
   maybe with some extra check and WARN in case we got the logic wrong.

The problems in (2) are that we need to know that the device supports
cacheable mappings and that we don't introduce additional issues or end
up with FWB on a PFNMAP that does not support cacheable. Without any
vma flag like the current VM_ALLOW_ANY_UNCACHED, the next best thing is
relying on the stage 1 attributes. But we don't know them at memslot
registration, only later in step (3) after a GUP on the VMM address
space.

So in (2), when !FWB, we only want to reject VM_PFNMAP slots if we know
they are going to be mapped as cacheable. We therefore need this
information somehow, either from vma->vm_flags or from slot->flags.

-- 
Catalin
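
To make the three stages above concrete, here is a rough userspace
model of the decision logic. It is only a sketch and not kernel code:
the sk_* names, the struct and the bit values are all made up for
illustration, and VM_FORCE_CACHEABLE / KVM_MEM_CACHEABLE are only the
proposals (b) and (c) from this thread, not existing flags.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit values only; they do not match any kernel header. */
#define SK_VM_PFNMAP          (1u << 0) /* stands in for VM_PFNMAP */
#define SK_VM_FORCE_CACHEABLE (1u << 1) /* option (b): proposed vma flag */
#define SK_MEM_CACHEABLE      (1u << 0) /* option (c): proposed memslot flag */

/* Hypothetical stand-in for the memslot plus the vma backing it. */
struct sk_slot {
	unsigned int vm_flags;   /* flags of the vma backing the slot */
	unsigned int slot_flags; /* flags the VMM passed at registration */
};

/* True if either the vma (b) or the VMM (c) asks for a cacheable mapping. */
static bool sk_wants_cacheable(const struct sk_slot *slot)
{
	assert(slot != NULL);
	return (slot->vm_flags & SK_VM_FORCE_CACHEABLE) ||
	       (slot->slot_flags & SK_MEM_CACHEABLE);
}

/* Stage 1: the capability is only advertised when the host has FWB. */
static bool sk_cap_cacheable_pfnmap(bool has_fwb)
{
	return has_fwb;
}

/* Stage 2: at registration, reject only cacheable PFNMAP slots on !FWB. */
static int sk_check_memslot(const struct sk_slot *slot, bool has_fwb)
{
	bool pfnmap = slot->vm_flags & SK_VM_PFNMAP;

	if (pfnmap && sk_wants_cacheable(slot) && !has_fwb)
		return -EINVAL;
	return 0;
}

/*
 * Stage 3: at fault time, decide whether a PFNMAP slot is mapped
 * cacheable. *warn is set if stage 2 should have rejected the slot.
 */
static bool sk_fault_pfnmap_cacheable(const struct sk_slot *slot,
				      bool has_fwb, bool *warn)
{
	bool cacheable = (slot->vm_flags & SK_VM_PFNMAP) &&
			 sk_wants_cacheable(slot);

	*warn = cacheable && !has_fwb;
	return cacheable && has_fwb;
}
```

The point of the model is that existing !FWB configurations with plain
VM_PFNMAP slots keep working: only slots that explicitly ask for
cacheable are rejected, and that requires the information to be
available at registration time, from either the vma or the slot flags.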