From: Fuad Tabba
Date: Mon, 30 Jun 2025 09:07:23 +0100
Subject: Re: [PATCH v12 10/18] KVM: x86/mmu: Handle guest page faults for guest_memfd with shared memory
To: Ackerley Tng
Cc: Sean Christopherson, kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org,
 linux-mm@kvack.org, kvmarm@lists.linux.dev, pbonzini@redhat.com,
 chenhuacai@kernel.org, mpe@ellerman.id.au, anup@brainfault.org,
 paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
 viro@zeniv.linux.org.uk, brauner@kernel.org, willy@infradead.org,
 akpm@linux-foundation.org, xiaoyao.li@intel.com, yilun.xu@intel.com,
 chao.p.peng@linux.intel.com, jarkko@kernel.org, amoorthy@google.com,
 dmatlack@google.com, isaku.yamahata@intel.com, mic@digikod.net,
 vbabka@suse.cz, vannapurve@google.com, mail@maciej.szmigiero.name,
 david@redhat.com, michael.roth@amd.com, wei.w.wang@intel.com,
 liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, catalin.marinas@arm.com,
 james.morse@arm.com, yuzenghui@huawei.com, oliver.upton@linux.dev,
 maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, peterx@redhat.com, pankaj.gupta@amd.com,
 ira.weiny@intel.com
References: <20250611133330.1514028-1-tabba@google.com> <20250611133330.1514028-11-tabba@google.com>

Hi Ackerley,

On Fri, 27 Jun 2025 at 16:01, Ackerley Tng wrote:
>
> Ackerley Tng writes:
>
> > [...]
>
> >>> +/*
> >>> + * Returns true if the given gfn's private/shared status (in the CoCo sense) is
> >>> + * private.
> >>> + *
> >>> + * A return value of false indicates that the gfn is explicitly or implicitly
> >>> + * shared (i.e., non-CoCo VMs).
> >>> + */
> >>>  static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> >>>  {
> >>> -	return IS_ENABLED(CONFIG_KVM_GMEM) &&
> >>> -	       kvm_get_memory_attributes(kvm, gfn) & KVM_MEMORY_ATTRIBUTE_PRIVATE;
> >>> +	struct kvm_memory_slot *slot;
> >>> +
> >>> +	if (!IS_ENABLED(CONFIG_KVM_GMEM))
> >>> +		return false;
> >>> +
> >>> +	slot = gfn_to_memslot(kvm, gfn);
> >>> +	if (kvm_slot_has_gmem(slot) && kvm_gmem_memslot_supports_shared(slot)) {
> >>> +		/*
> >>> +		 * Without in-place conversion support, if a guest_memfd memslot
> >>> +		 * supports shared memory, then all the slot's memory is
> >>> +		 * considered not private, i.e., implicitly shared.
> >>> +		 */
> >>> +		return false;
> >>
> >> Why!?!? Just make sure KVM_MEMORY_ATTRIBUTE_PRIVATE is mutually exclusive with
> >> mappable guest_memfd. You need to do that no matter what.
> >
> > Thanks, I agree that setting KVM_MEMORY_ATTRIBUTE_PRIVATE should be
> > disallowed for gfn ranges whose slot is guest_memfd-only. Missed that
> > out. Where do people think we should check the mutual exclusivity?
> >
> > In kvm_supported_mem_attributes() I'm thinking that we should still allow
> > the use of KVM_MEMORY_ATTRIBUTE_PRIVATE for other non-guest_memfd-only
> > gfn ranges. Or do people think we should just disallow
> > KVM_MEMORY_ATTRIBUTE_PRIVATE for the entire VM as long as one memslot is
> > a guest_memfd-only memslot?
> >
> > If we check mutual exclusivity when handling
> > kvm_vm_set_memory_attributes(), as long as part of the range where
> > KVM_MEMORY_ATTRIBUTE_PRIVATE is requested to be set intersects a range
> > whose slot is guest_memfd-only, the ioctl will return EINVAL.
> >
>
> At yesterday's (2025-06-26) guest_memfd upstream call discussion,
>
> * Fuad brought up a possible use case where within the *same* VM, we
>   want to allow both memslots that support and do not support mmap in
>   guest_memfd.
> * Shivank suggested a concrete use case for this: the user wants a
>   guest_memfd memslot that supports mmap just so userspace addresses can
>   be used as references for specifying memory policy.
> * Sean then added on that allowing both types of guest_memfd memslots
>   (supporting and not supporting mmap) will allow the user to have a second
>   layer of protection and ensure that for some memslots, the user
>   expects never to be able to mmap from the memslot.
>
> I agree it will be useful to allow both guest_memfd memslots that
> support and do not support mmap in a single VM.
>
> I think I found an issue with flags, which is that GUEST_MEMFD_FLAG_MMAP
> should not imply that the guest_memfd will provide memory for all guest
> faults within the memslot's gfn range (KVM_MEMSLOT_GMEM_ONLY).
>
> For the use case Shivank raised, if the user wants a guest_memfd memslot
> that supports mmap just so userspace addresses can be used as references
> for specifying memory policy for legacy CoCo VMs where shared memory
> should still come from other sources, GUEST_MEMFD_FLAG_MMAP will be set,
> but KVM can't fault shared memory from guest_memfd. Hence,
> GUEST_MEMFD_FLAG_MMAP should not imply KVM_MEMSLOT_GMEM_ONLY.
>
> Thinking forward, if we want guest_memfd to provide (no-mmap) protection
> even for non-CoCo VMs (such that perhaps initial VM image is populated
> and then VM memory should never be mmap-ed at all), we will want
> guest_memfd to be the source of memory even if GUEST_MEMFD_FLAG_MMAP is
> not set.
>
> I propose that we should have a single VM-level flag to solve this (in
> line with Sean's guideline that we should just move towards what we want
> and not support non-existent use cases): something like
> KVM_CAP_PREFER_GMEM.
>
> If KVM_CAP_PREFER_GMEM_MEMORY is set,
>
> * memory for any gfn range in a guest_memfd memslot will be requested
>   from guest_memfd
> * any privacy status queries will also be directed to guest_memfd
> * KVM_MEMORY_ATTRIBUTE_PRIVATE will not be a valid attribute
>
> KVM_CAP_PREFER_GMEM_MEMORY will be orthogonal with no validation on
> GUEST_MEMFD_FLAG_MMAP, which should just purely guard mmap support in
> guest_memfd.
>
> Here's a table that I set up [1]. I believe the proposed
> KVM_CAP_PREFER_GMEM_MEMORY (column 7) lines up with requirements
> (columns 1 to 4) correctly.
>
> [1] https://lpc.events/event/18/contributions/1764/attachments/1409/3710/guest_memfd%20use%20cases%20vs%20guest_memfd%20flags%20and%20privacy%20tracking.pdf

I'm not sure this naming helps. What does "prefer" imply here? If the
caller from user space does not prefer, does it mean that they
mind/oppose?

Regarding the use case Shivank mentioned, mmapping for policy: while the
use case is a valid one, the raison d'être of mmap is to map into user
space (i.e., fault it in). I would argue that if you opt into mmap, you
are doing it to be able to access it. To me, that seems like something
that merits its own flag, rather than mmap. Also, I recall that we said
that later on, with in-place conversion, that won't even be necessary.
In other words, this would also be trying to solve a problem that we
haven't yet encountered and that we have a solution for anyway.

I think the way forward, unless anyone disagrees, is to go ahead with
the names we discussed in the last meeting. They seem to be the ones
that make the most sense for the upcoming use cases.

Cheers,
/fuad

> > [...]
>
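
For reference, the mutual-exclusivity check discussed in the quoted text
(reject a request to set KVM_MEMORY_ATTRIBUTE_PRIVATE when any part of the
range intersects a guest_memfd-only memslot, returning EINVAL) could look
roughly like the user-space model below. This is only a sketch, not kernel
code: struct toy_memslot, toy_slot_intersects() and toy_check_set_private()
are hypothetical names invented for illustration, the slot array stands in
for KVM's real memslot lookup, and rejecting an empty range is an assumption
of the sketch rather than something stated in the thread.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical, simplified stand-in for struct kvm_memory_slot. */
struct toy_memslot {
	gfn_t base_gfn;
	uint64_t npages;
	bool gmem_only;	/* models a guest_memfd-only (KVM_MEMSLOT_GMEM_ONLY) slot */
};

/* True if the half-open range [start, end) overlaps the slot's gfn range. */
static bool toy_slot_intersects(const struct toy_memslot *slot,
				gfn_t start, gfn_t end)
{
	return start < slot->base_gfn + slot->npages && slot->base_gfn < end;
}

/*
 * Models the check proposed for the attribute-setting path: a request to
 * mark [start, end) private fails with -EINVAL if any part of the range
 * falls within a guest_memfd-only slot. Other slots are unaffected.
 */
static int toy_check_set_private(const struct toy_memslot *slots, size_t nr,
				 gfn_t start, gfn_t end)
{
	size_t i;

	/* Assumption of this sketch: an empty or inverted range is invalid. */
	if (start >= end)
		return -EINVAL;

	for (i = 0; i < nr; i++) {
		if (slots[i].gmem_only &&
		    toy_slot_intersects(&slots[i], start, end))
			return -EINVAL;
	}
	return 0;
}
```

With one guest_memfd-only slot and one conventional slot in the same VM,
this per-range variant still allows the private attribute on the
conventional slot, which is the first of the two options raised in the
quoted text (as opposed to disallowing the attribute for the entire VM).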