Re: [PATCH v12 10/18] KVM: x86/mmu: Handle guest page faults for guest_memfd with shared memory

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Fuad Tabba <tabba@google.com>
To: Ackerley Tng <ackerleytng@google.com>
Cc: Sean Christopherson <seanjc@google.com>,
	kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	 linux-mm@kvack.org, kvmarm@lists.linux.dev, pbonzini@redhat.com,
	 chenhuacai@kernel.org, mpe@ellerman.id.au, anup@brainfault.org,
	 paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu,  viro@zeniv.linux.org.uk,
	brauner@kernel.org, willy@infradead.org,
	 akpm@linux-foundation.org, xiaoyao.li@intel.com,
	yilun.xu@intel.com,  chao.p.peng@linux.intel.com,
	jarkko@kernel.org, amoorthy@google.com,  dmatlack@google.com,
	isaku.yamahata@intel.com, mic@digikod.net,  vbabka@suse.cz,
	vannapurve@google.com, mail@maciej.szmigiero.name,
	 david@redhat.com, michael.roth@amd.com, wei.w.wang@intel.com,
	 liam.merwick@oracle.com, isaku.yamahata@gmail.com,
	 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
	steven.price@arm.com,  quic_eberman@quicinc.com,
	quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
	 quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
	 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
	catalin.marinas@arm.com,  james.morse@arm.com,
	yuzenghui@huawei.com, oliver.upton@linux.dev,  maz@kernel.org,
	will@kernel.org, qperret@google.com, keirf@google.com,
	 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org,
	jgg@nvidia.com,  rientjes@google.com, jhubbard@nvidia.com,
	fvdl@google.com, hughd@google.com,  jthoughton@google.com,
	peterx@redhat.com, pankaj.gupta@amd.com,  ira.weiny@intel.com
Subject: Re: [PATCH v12 10/18] KVM: x86/mmu: Handle guest page faults for guest_memfd with shared memory
Date: Mon, 30 Jun 2025 09:07:23 +0100	[thread overview]
Message-ID: <CA+EHjTxECJ3=ywbAPvpdA1-pm=stXWqU75mgG1epWaXiUr0raw@mail.gmail.com> (raw)
In-Reply-To: <diqza55tjkk1.fsf@ackerleytng-ctop.c.googlers.com>

Hi Ackerley,

On Fri, 27 Jun 2025 at 16:01, Ackerley Tng <ackerleytng@google.com> wrote:
>
> Ackerley Tng <ackerleytng@google.com> writes:
>
> > [...]
>
> >>> +/*
> >>> + * Returns true if the given gfn's private/shared status (in the CoCo sense) is
> >>> + * private.
> >>> + *
> >>> + * A return value of false indicates that the gfn is explicitly or implicitly
> >>> + * shared (i.e., non-CoCo VMs).
> >>> + */
> >>>  static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> >>>  {
> >>> -   return IS_ENABLED(CONFIG_KVM_GMEM) &&
> >>> -          kvm_get_memory_attributes(kvm, gfn) & KVM_MEMORY_ATTRIBUTE_PRIVATE;
> >>> +   struct kvm_memory_slot *slot;
> >>> +
> >>> +   if (!IS_ENABLED(CONFIG_KVM_GMEM))
> >>> +           return false;
> >>> +
> >>> +   slot = gfn_to_memslot(kvm, gfn);
> >>> +   if (kvm_slot_has_gmem(slot) && kvm_gmem_memslot_supports_shared(slot)) {
> >>> +           /*
> >>> +            * Without in-place conversion support, if a guest_memfd memslot
> >>> +            * supports shared memory, then all the slot's memory is
> >>> +            * considered not private, i.e., implicitly shared.
> >>> +            */
> >>> +           return false;
> >>
> >> Why!?!?  Just make sure KVM_MEMORY_ATTRIBUTE_PRIVATE is mutually exclusive with
> >> mappable guest_memfd.  You need to do that no matter what.
> >
> > Thanks, I agree that setting KVM_MEMORY_ATTRIBUTE_PRIVATE should be
> > disallowed for gfn ranges whose slot is guest_memfd-only. Missed that
> > out. Where do people think we should check the mutual exclusivity?
> >
> > In kvm_supported_mem_attributes() I'm thiking that we should still allow
> > the use of KVM_MEMORY_ATTRIBUTE_PRIVATE for other non-guest_memfd-only
> > gfn ranges. Or do people think we should just disallow
> > KVM_MEMORY_ATTRIBUTE_PRIVATE for the entire VM as long as one memslot is
> > a guest_memfd-only memslot?
> >
> > If we check mutually exclusivity when handling
> > kvm_vm_set_memory_attributes(), as long as part of the range where
> > KVM_MEMORY_ATTRIBUTE_PRIVATE is requested to be set intersects a range
> > whose slot is guest_memfd-only, the ioctl will return EINVAL.
> >
>
> At yesterday's (2025-06-26) guest_memfd upstream call discussion,
>
> * Fuad brought up a possible use case where within the *same* VM, we
>   want to allow both memslots that supports and does not support mmap in
>   guest_memfd.
> * Shivank suggested a concrete use case for this: the user wants a
>   guest_memfd memslot that supports mmap just so userspace addresses can
>   be used as references for specifying memory policy.
> * Sean then added on that allowing both types of guest_memfd memslots
>   (support and not supporting mmap) will allow the user to have a second
>   layer of protection and ensure that for some memslots, the user
>   expects never to be able to mmap from the memslot.
>
> I agree it will be useful to allow both guest_memfd memslots that
> support and do not support mmap in a single VM.
>
> I think I found an issue with flags, which is that GUEST_MEMFD_FLAG_MMAP
> should not imply that the guest_memfd will provide memory for all guest
> faults within the memslot's gfn range (KVM_MEMSLOT_GMEM_ONLY).
>
> For the use case Shivank raised, if the user wants a guest_memfd memslot
> that supports mmap just so userspace addresses can be used as references
> for specifying memory policy for legacy Coco VMs where shared memory
> should still come from other sources, GUEST_MEMFD_FLAG_MMAP will be set,
> but KVM can't fault shared memory from guest_memfd. Hence,
> GUEST_MEMFD_FLAG_MMAP should not imply KVM_MEMSLOT_GMEM_ONLY.
>
> Thinking forward, if we want guest_memfd to provide (no-mmap) protection
> even for non-CoCo VMs (such that perhaps initial VM image is populated
> and then VM memory should never be mmap-ed at all), we will want
> guest_memfd to be the source of memory even if GUEST_MEMFD_FLAG_MMAP is
> not set.
>
> I propose that we should have a single VM-level flag to solve this (in
> line with Sean's guideline that we should just move towards what we want
> and not support non-existent use cases): something like
> KVM_CAP_PREFER_GMEM.
>
> If KVM_CAP_PREFER_GMEM_MEMORY is set,
>
> * memory for any gfn range in a guest_memfd memslot will be requested
>   from guest_memfd
> * any privacy status queries will also be directed to guest_memfd
> * KVM_MEMORY_ATTRIBUTE_PRIVATE will not be a valid attribute
>
> KVM_CAP_PREFER_GMEM_MEMORY will be orthogonal with no validation on
> GUEST_MEMFD_FLAG_MMAP, which should just purely guard mmap support in
> guest_memfd.
>
> Here's a table that I set up [1]. I believe the proposed
> KVM_CAP_PREFER_GMEM_MEMORY (column 7) lines up with requirements
> (columns 1 to 4) correctly.
>
> [1] https://lpc.events/event/18/contributions/1764/attachments/1409/3710/guest_memfd%20use%20cases%20vs%20guest_memfd%20flags%20and%20privacy%20tracking.pdf

I'm not sure this naming helps. What does "prefer" imply here? If the
caller from user space does not prefer, does it mean that they
mind/oppose?

Regarding the use case Shivank mentioned, mmaping for policy, while
the use case is a valid one, the raison d'être of mmap is to map into
user space (i.e., fault it in). I would argue that if you opt into
mmap, you are doing it to be able to access it. To me, that seems like
something that merits its own flag, rather than mmap. Also, I recall
that we said that later on, with inplace conversion, that won't be
even necessary. In other words, this would also be trying to solve a
problem that we haven't yet encountered and that we have a solution
for anyway.

I think that, unless anyone disagrees, is to go ahead with the names
we discussed in the last meeting. They seem to be the ones that make
the most sense for the upcoming use cases.

Cheers,
/fuad



> > [...]
>

next prev parent reply	other threads:[~2025-06-30  8:08 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-11 13:33 [PATCH v12 00/18] KVM: Mapping guest_memfd backed memory at the host for software protected VMs Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 01/18] KVM: Rename CONFIG_KVM_PRIVATE_MEM to CONFIG_KVM_GMEM Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 02/18] KVM: Rename CONFIG_KVM_GENERIC_PRIVATE_MEM to CONFIG_KVM_GENERIC_GMEM_POPULATE Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 03/18] KVM: Rename kvm_arch_has_private_mem() to kvm_arch_supports_gmem() Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 04/18] KVM: x86: Rename kvm->arch.has_private_mem to kvm->arch.supports_gmem Fuad Tabba
2025-06-13 13:57   ` Ackerley Tng
2025-06-13 20:35   ` Sean Christopherson
2025-06-16  7:13     ` Fuad Tabba
2025-06-16 14:20       ` David Hildenbrand
2025-06-24 20:51     ` Ackerley Tng
2025-06-25  6:33       ` Roy, Patrick
2025-06-11 13:33 ` [PATCH v12 05/18] KVM: Rename kvm_slot_can_be_private() to kvm_slot_has_gmem() Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 06/18] KVM: Fix comments that refer to slots_lock Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 07/18] KVM: Fix comment that refers to kvm uapi header path Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 08/18] KVM: guest_memfd: Allow host to map guest_memfd pages Fuad Tabba
2025-06-12 16:16   ` Shivank Garg
2025-06-13 21:03   ` Sean Christopherson
2025-06-13 21:18     ` David Hildenbrand
2025-06-13 22:48     ` Sean Christopherson
2025-06-16  6:52     ` Fuad Tabba
2025-06-16 14:16       ` David Hildenbrand
2025-06-17 23:04       ` Sean Christopherson
2025-06-18 11:18         ` Fuad Tabba
2025-06-16 13:44     ` Ira Weiny
2025-06-16 14:03       ` David Hildenbrand
2025-06-16 14:16         ` Fuad Tabba
2025-06-16 14:25           ` David Hildenbrand
2025-06-18  0:40             ` Sean Christopherson
2025-06-18  8:15               ` David Hildenbrand
2025-06-18  9:20                 ` Xiaoyao Li
2025-06-18  9:27                   ` David Hildenbrand
2025-06-18  9:44                     ` Xiaoyao Li
2025-06-18  9:59                       ` David Hildenbrand
2025-06-18 10:42                         ` Xiaoyao Li
2025-06-18 11:14                           ` David Hildenbrand
2025-06-18 12:17                             ` Xiaoyao Li
2025-06-18 13:16                               ` David Hildenbrand
2025-06-19  1:48                 ` Sean Christopherson
2025-06-19  1:50                   ` Sean Christopherson
2025-06-18  9:25     ` David Hildenbrand
2025-06-25 21:47   ` Ackerley Tng
2025-06-11 13:33 ` [PATCH v12 09/18] KVM: guest_memfd: Track shared memory support in memslot Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 10/18] KVM: x86/mmu: Handle guest page faults for guest_memfd with shared memory Fuad Tabba
2025-06-13 22:08   ` Sean Christopherson
2025-06-24 23:40     ` Ackerley Tng
2025-06-27 15:01       ` Ackerley Tng
2025-06-30  8:07         ` Fuad Tabba [this message]
2025-06-30 14:44           ` Ackerley Tng
2025-06-30 15:08             ` Fuad Tabba
2025-06-30 19:26               ` Shivank Garg
2025-06-30 20:03                 ` David Hildenbrand
2025-07-01 14:15                   ` Ackerley Tng
2025-07-01 14:44                     ` David Hildenbrand
2025-07-08  0:05                       ` Sean Christopherson
2025-07-08 13:44                         ` Ackerley Tng
2025-06-11 13:33 ` [PATCH v12 11/18] KVM: x86: Consult guest_memfd when computing max_mapping_level Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 12/18] KVM: x86: Enable guest_memfd shared memory for non-CoCo VMs Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 13/18] KVM: arm64: Refactor user_mem_abort() Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 14/18] KVM: arm64: Handle guest_memfd-backed guest page faults Fuad Tabba
2025-06-12 17:33   ` James Houghton
2025-06-11 13:33 ` [PATCH v12 15/18] KVM: arm64: Enable host mapping of shared guest_memfd memory Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 16/18] KVM: Introduce the KVM capability KVM_CAP_GMEM_SHARED_MEM Fuad Tabba
2025-06-11 13:33 ` [PATCH v12 17/18] KVM: selftests: Don't use hardcoded page sizes in guest_memfd test Fuad Tabba
2025-06-12 16:24   ` Shivank Garg
2025-06-11 13:33 ` [PATCH v12 18/18] KVM: selftests: guest_memfd mmap() test when mapping is allowed Fuad Tabba
2025-06-12 16:23   ` Shivank Garg
2025-06-12 17:38 ` [PATCH v12 00/18] KVM: Mapping guest_memfd backed memory at the host for software protected VMs David Hildenbrand
2025-06-24 10:02   ` Fuad Tabba
2025-06-24 10:16     ` David Hildenbrand
2025-06-24 10:25       ` Fuad Tabba
2025-06-24 11:44         ` David Hildenbrand
2025-06-24 11:58           ` Fuad Tabba
2025-06-24 17:50             ` Sean Christopherson
2025-06-25  8:00               ` Fuad Tabba
2025-06-25 14:07                 ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+EHjTxECJ3=ywbAPvpdA1-pm=stXWqU75mgG1epWaXiUr0raw@mail.gmail.com' \
    --to=tabba@google.com \
    --cc=ackerleytng@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=amoorthy@google.com \
    --cc=anup@brainfault.org \
    --cc=aou@eecs.berkeley.edu \
    --cc=brauner@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=chao.p.peng@linux.intel.com \
    --cc=chenhuacai@kernel.org \
    --cc=david@redhat.com \
    --cc=dmatlack@google.com \
    --cc=fvdl@google.com \
    --cc=hch@infradead.org \
    --cc=hughd@google.com \
    --cc=ira.weiny@intel.com \
    --cc=isaku.yamahata@gmail.com \
    --cc=isaku.yamahata@intel.com \
    --cc=james.morse@arm.com \
    --cc=jarkko@kernel.org \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=jthoughton@google.com \
    --cc=keirf@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=liam.merwick@oracle.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mail@maciej.szmigiero.name \
    --cc=maz@kernel.org \
    --cc=mic@digikod.net \
    --cc=michael.roth@amd.com \
    --cc=mpe@ellerman.id.au \
    --cc=oliver.upton@linux.dev \
    --cc=palmer@dabbelt.com \
    --cc=pankaj.gupta@amd.com \
    --cc=paul.walmsley@sifive.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qperret@google.com \
    --cc=quic_cvanscha@quicinc.com \
    --cc=quic_eberman@quicinc.com \
    --cc=quic_mnalajal@quicinc.com \
    --cc=quic_pderrin@quicinc.com \
    --cc=quic_pheragu@quicinc.com \
    --cc=quic_svaddagi@quicinc.com \
    --cc=quic_tsoni@quicinc.com \
    --cc=rientjes@google.com \
    --cc=roypat@amazon.co.uk \
    --cc=seanjc@google.com \
    --cc=shuah@kernel.org \
    --cc=steven.price@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=vannapurve@google.com \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    --cc=wei.w.wang@intel.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=xiaoyao.li@intel.com \
    --cc=yilun.xu@intel.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox