linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Patrick Roy <roypat@amazon.co.uk>, Fuad Tabba <tabba@google.com>
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	linux-mm@kvack.org, pbonzini@redhat.com, chenhuacai@kernel.org,
	mpe@ellerman.id.au, anup@brainfault.org,
	paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, seanjc@google.com,
	viro@zeniv.linux.org.uk, brauner@kernel.org, willy@infradead.org,
	akpm@linux-foundation.org, xiaoyao.li@intel.com,
	yilun.xu@intel.com, chao.p.peng@linux.intel.com,
	jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
	yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com,
	mic@digikod.net, vbabka@suse.cz, vannapurve@google.com,
	ackerleytng@google.com, mail@maciej.szmigiero.name,
	michael.roth@amd.com, wei.w.wang@intel.com,
	liam.merwick@oracle.com, isaku.yamahata@gmail.com,
	kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
	steven.price@arm.com, quic_eberman@quicinc.com,
	quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
	quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
	quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
	catalin.marinas@arm.com, james.morse@arm.com,
	yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
	will@kernel.org, qperret@google.com, keirf@google.com,
	shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
	rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
	hughd@google.com, jthoughton@google.com
Subject: Re: [RFC PATCH v1 2/9] KVM: guest_memfd: Add guest_memfd support to kvm_(read|/write)_guest_page()
Date: Thu, 23 Jan 2025 15:21:07 +0100	[thread overview]
Message-ID: <164e9d74-2f1f-4557-afda-06712e8415b0@redhat.com> (raw)
In-Reply-To: <bc59a2ec-7467-4a4e-8d73-9c4126b1c98b@amazon.co.uk>

On 23.01.25 14:57, Patrick Roy wrote:
> 
> 
> On Thu, 2025-01-23 at 12:28 +0000, Fuad Tabba wrote:
>> Hi Patrick,
>>
>> On Thu, 23 Jan 2025 at 11:57, Patrick Roy <roypat@amazon.co.uk> wrote:
>>>
>>>
>>>
>>> On Thu, 2025-01-23 at 11:39 +0000, David Hildenbrand wrote:
>>>> On 23.01.25 10:48, Fuad Tabba wrote:
>>>>> On Wed, 22 Jan 2025 at 22:10, David Hildenbrand <david@redhat.com> wrote:
>>>>>>
>>>>>> On 22.01.25 16:27, Fuad Tabba wrote:
>>>>>>> Make kvm_(read|/write)_guest_page() capable of accessing guest
>>>>>>> memory for slots that don't have a userspace address, but only if
>>>>>>> the memory is mappable, which also indicates that it is
>>>>>>> accessible by the host.
>>>>>>
>>>>>> Interesting. So far my assumption was that, for shared memory, user
>>>>>> space would simply mmap() guest_memfd and pass it as userspace address
>>>>>> to the same memslot that has this guest_memfd for private memory.
>>>>>>
>>>>>> Wouldn't that be easier in the first shot? (IOW, not require this patch
>>>>>> with the cost of faulting the shared page into the page table on access)
>>>>>
>>>>
>>>> In light of:
>>>>
>>>> https://lkml.kernel.org/r/20250117190938.93793-4-imbrenda@linux.ibm.com
>>>>
>>>> there can, in theory, be memslots that start at address 0 and have a
>>>> "valid" mapping. This case is done from the kernel (and on special s390x
>>>> hardware), though, so it does not apply here at all so far.
>>>>
>>>> In practice, getting address 0 as a valid address is unlikely, because
>>>> the default:
>>>>
>>>> $ sysctl  vm.mmap_min_addr
>>>> vm.mmap_min_addr = 65536
>>>>
>>>> usually prohibits it for good reason.
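
To make the quoted vm.mmap_min_addr point concrete, here is a minimal
sketch of how user space could read that limit and check whether a fixed
mapping address falls below it. The helper names (parse_mmap_min_addr,
read_mmap_min_addr, addr_below_min) are made up for illustration; only
the /proc/sys/vm/mmap_min_addr path and the sysctl itself come from the
kernel. It assumes an unprivileged process; a sufficiently privileged
one is not bound by the limit.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse the sysctl text; returns -1 on malformed input. */
long parse_mmap_min_addr(const char *text)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(text, &end, 10);
	if (errno || end == text || val < 0)
		return -1;
	/* Accept only an optional trailing newline after the number. */
	if (*end != '\0' && *end != '\n')
		return -1;
	return val;
}

/* Hypothetical helper: read the limit via procfs; returns -1 if unavailable. */
long read_mmap_min_addr(void)
{
	char buf[32];
	long val = -1;
	FILE *f = fopen("/proc/sys/vm/mmap_min_addr", "r");

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		val = parse_mmap_min_addr(buf);
	fclose(f);
	return val;
}

/*
 * Returns 1 if addr lies below the limit, i.e. an unprivileged fixed
 * mapping at addr would be expected to be refused; 0 otherwise.
 */
int addr_below_min(unsigned long addr, long min_addr)
{
	return min_addr > 0 && addr < (unsigned long)min_addr;
}
```

With the default shown above, addr_below_min(0, 65536) is true, which is
why a user-created memslot with userspace_addr == 0 is unlikely to refer
to a real mapping.
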
>>>>
>>>>> This has to do more with the ABI I had for pkvm and shared memory
>>>>> implementations, in which you don't need to specify the userspace
>>>>> address for memory in a guestmem memslot. The issue is there is no
>>>>> obvious address to map it to. This would be the case in kvm:arm64 for
>>>>> tracking paravirtualized time, which the userspace doesn't necessarily
>>>>> need to interact with, but kvm does.
>>>>
>>>> So, if I understand correctly: userspace wouldn't have to mmap it because it
>>>> is not interested in accessing it, but there is nothing speaking against
>>>> mmaping it, at least in the first shot.
>>>>
>>>> I assume it would not be a private memslot (so far, my understanding is
>>>> that internal memslots never have a guest_memfd attached).
>>>> kvm_gmem_create() is only called via KVM_CREATE_GUEST_MEMFD, to be set
>>>> on user-created memslots.
>>>>
>>>>>
>>>>> That said, we could always have a userspace address dedicated to
>>>>> mapping shared locations, and use that address when the necessity
>>>>> arises. Or we could always require that memslots have a userspace
>>>>> address, even if not used. I don't really have a strong preference.
>>>>
>>>> So, the simpler version where user space would simply mmap guest_memfd
>>>> to provide the address via userspace_addr would at least work for the
>>>> use case of paravirtualized time?
>>>
>>> fwiw, I'm currently prototyping something like this for x86 (although
>>> not by putting the gmem address into userspace_addr, but by adding a new
>>> field to memslots, so that memory attributes continue working), based on
>>> what we talked about at the last guest_memfd sync meeting (the whole
>>> "how to get MMIO emulation working for non-CoCo VMs in guest_memfd"
>>> story). So I guess if we're going down this route for x86, maybe it
>>> makes sense to do the same on ARM, for consistency?
>>>
>>>> It would get rid of the immediate need for this patch and patch #4 to
>>>> get it flying.
>>>>
>>>>
>>>> One interesting question is: when would you want shared memory in
>>>> guest_memfd and *not* provide it as part of the same memslot.
>>>
>>> In my testing of non-CoCo gmem VMs on ARM, I've been able to get quite
>>> far without giving KVM a way to internally access shared parts of gmem -
>>> it's why I was probing Fuad for this simplified series, because
>>> KVM_SW_PROTECTED_VM + mmap (for loading guest kernel) is enough to get a
>>> working non-CoCo VM on ARM (although I admittedly never looked at clocks
>>> inside the guest - maybe that's one thing that breaks if KVM can't
>>> access gmem. How do guest and host agree on the guest memory range
>>> used to exchange paravirtual timekeeping information? Could that exchange
>>> be intercepted in userspace, and set to shared via memory attributes (e.g.
>>> placed outside gmem)? That's the route I'm going down for paravirtual
>>> time on x86).
>>
>> For an idea of what it looks like on arm64, here's how kvmtool handles it:
>> https://github.com/kvmtool/kvmtool/blob/master/arm/aarch64/pvtime.c
>>
>> Cheers,
>> /fuad
>   
> Thanks! In that example, kvmtool actually allocates a separate memslot for
> the pvclock stuff, so I guess it's always possible to simply put it into
> a non-gmem memslot, which indeed sidesteps this issue as you mention in
> your reply to David :D

Does that work on CC, where all memory defaults to private and the VM
has to explicitly opt in to marking it shared? Or how exactly would the
flow of operations look in the case of a non-gmem ("good old") memslot?

-- 
Cheers,

David / dhildenb


