Date: Tue, 22 Apr 2025 07:53:02 -0700
References: <86y0wrlrxt.wl-maz@kernel.org> <86wmcbllg2.wl-maz@kernel.org>
 <20250331145643.GF10839@nvidia.com> <20250407161540.GG1557073@nvidia.com>
Subject: Re: [PATCH v3 1/1] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags
From: Sean Christopherson
To: Oliver Upton
Cc: Ankit Agrawal, Jason Gunthorpe, Marc Zyngier, Catalin Marinas,
 "joey.gouly@arm.com", "suzuki.poulose@arm.com", "yuzenghui@huawei.com",
 "will@kernel.org", "ryan.roberts@arm.com", "shahuang@redhat.com",
 "lpieralisi@kernel.org", "david@redhat.com", Aniket Agashe, Neo Jia,
 Kirti Wankhede, "Tarun Gupta (SW-GPU)", Vikram Sethi, Andy Currid,
 Alistair Popple, John Hubbard, Dan Williams, Zhi Wang, Matt Ochs,
 Uday Dhoke, Dheeraj Nigam, Krishnakant Jaju,
 "alex.williamson@redhat.com", "sebastianene@google.com",
 "coltonlewis@google.com", "kevin.tian@intel.com", "yi.l.liu@intel.com",
 "ardb@kernel.org", "akpm@linux-foundation.org", "gshan@redhat.com",
 "linux-mm@kvack.org", "ddutile@redhat.com", "tabba@google.com",
 "qperret@google.com", "kvmarm@lists.linux.dev",
 "linux-kernel@vger.kernel.org", "linux-arm-kernel@lists.infradead.org"

On Tue, Apr 22, 2025, Oliver Upton wrote:
> Hi,
>
> On Wed, Apr 16, 2025 at 08:51:05AM +0000, Ankit Agrawal wrote:
> > Hi, summarizing the discussion so far and outlining the next steps. The
> > key points are as follows:
> >
> > 1. KVM cap to expose whether the kernel supports mapping cacheable PFNMAP:
> > If the host doesn't have FWB, then the capability doesn't exist. Jason,
> > Oliver, Catalin and Sean point out that this may not be required, as
> > userspace does not have much choice anyway: KVM has to follow the PTEs
> > and userspace cannot ask for something different. However, Marc points
> > out that enumerating FWB support would allow userspace to discover the
> > support and prevent live migration across FWB and non-FWB hosts. Jason
> > suggested that this may still be fine, as we have already built in
> > VFIO-side protection where a live migration can be attempted and then
> > fail because of late-detected HW incompatibilities.
> >
> > 2. New memslot flag that the VMM passes at memslot registration:
> > The discussion suggests that this is not necessary and KVM should just
> > follow the VMA pgprot.
> >
> > 3. Fallback path handling for PFNMAP when FWB is not set:
> > The discussion suggests that there shouldn't be any fallback path and
> > the memslot should just fail, i.e. KVM should not allow degrading
> > cacheable to non-cacheable when it can't do flushing. This is to prevent
> > the potential security issue pointed out by Jason (S1 cacheable, S2
> > non-cacheable).
> >
> >
> > So AIUI, the next step is to send out the updated series with the
> > following patches:
> > 1. Block cacheable PFN map in memslot creation
> > (kvm_arch_prepare_memory_region()) and during fault handling
> > (user_mem_abort()).
>
> Yes, we need to prevent the creation of stage-2 mappings to PFNMAP memory
> that uses cacheable attributes in the host stage-1. I believe we have
> alignment that this is a bugfix.
>
> > 2. Enable support for cacheable PFN maps if S2FWB is enabled by following
> > the vma pgprot (this patch).
> >
> > 3. Add and expose the new KVM cap to expose cacheable PFNMAP (set to false
> > for !FWB), pending maintainers' feedback on the necessity of this
> > capability.
>
> Regarding UAPI: I'm still convinced that we need the VMM to buy in to this
> behavior. And no, it doesn't matter if this is some VFIO-based mapping
> or kernel-managed memory.
>
> The reality is that userspace is an equal participant in remaining coherent
> with the guest. Whether or not FWB is employed for a particular region of
> IPA space is useful information for userspace deciding what it needs to do
> to access guest memory. Ignoring the Nvidia widget for a second, userspace
> also needs to know this for 'normal', kernel-managed memory so it
> understands what CMOs may be necessary when (for example) doing live
> migration of the VM.
>
> So this KVM CAP needs to be paired with a memslot flag.
>
> - The capability says KVM is able to enforce Write-Back at stage-2
>
> - The memslot flag says userspace expects a particular GFN range to
>   guarantee Write-Back semantics. This can be applied to 'normal',
>   kernel-managed memory and PFNMAP thingies that have cacheable
>   attributes at host stage-1.

I am very strongly opposed to adding a memslot flag. IMO, it sets a terrible
precedent, and I am struggling to understand why a per-VM CAP isn't sufficient
protection for the VMM.

> - Under no situation do we allow userspace to create non-cacheable
>   mappings at stage-2 for something PFNMAP cacheable at stage-1.
>
> No matter what, my understanding is that we all agree the driver which
> provided the host stage-1 mapping is the authoritative source for memory
> attributes compatible with a given device. The accompanying UAPI is
> necessary for the VMM to understand how to handle arbitrary cacheable
> mappings provided to the VM.
>
> Thanks,
> Oliver
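
For anyone skimming the thread, the one rule nobody disputes above (never
create a non-cacheable stage-2 mapping for PFNMAP memory that is cacheable at
host stage-1, and only map it cacheable when FWB is available) could be
sketched roughly as below. This is only an illustration of the decision table,
not code from the patch: the enum, its values and s2_policy_for_pfnmap() are
hypothetical names, and the non-cacheable branch simply assumes such mappings
stay non-cacheable at stage-2.

```c
#include <stdbool.h>

/* Hypothetical outcome of a stage-2 mapping request; not a KVM type. */
enum s2_policy {
	S2_REJECT,		/* fail memslot creation or the fault */
	S2_MAP_NONCACHEABLE,	/* assumed: non-cacheable S1 stays non-cacheable */
	S2_MAP_CACHEABLE,	/* follow the cacheable VMA pgprot */
};

/*
 * Decide how a PFNMAP range may be mapped at stage-2.
 *
 * s1_cacheable: the host stage-1 (VMA pgprot) attributes are cacheable.
 * has_fwb:      the host can force Write-Back at stage-2 (S2FWB).
 */
static enum s2_policy s2_policy_for_pfnmap(bool s1_cacheable, bool has_fwb)
{
	if (!s1_cacheable)
		return S2_MAP_NONCACHEABLE;

	/* Cacheable at S1: never degrade to non-cacheable, so reject without FWB. */
	return has_fwb ? S2_MAP_CACHEABLE : S2_REJECT;
}
```

In this sketch the same check would apply both at memslot creation and at
fault time; whether a CAP or a memslot flag sits on top of it is exactly the
open UAPI question in this thread.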