From: David Hildenbrand <david@redhat.com>
To: Patrick Roy <roypat@amazon.co.uk>, rppt@kernel.org, seanjc@google.com
Cc: pbonzini@redhat.com, corbet@lwn.net, willy@infradead.org,
	akpm@linux-foundation.org, song@kernel.org, jolsa@kernel.org,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	martin.lau@linux.dev, eddyz87@gmail.com, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, Liam.Howlett@oracle.com,
	lorenzo.stoakes@oracle.com, vbabka@suse.cz, jannh@google.com,
	shuah@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, bpf@vger.kernel.org,
	linux-kselftest@vger.kernel.org, tabba@google.com,
	jgowans@amazon.com, graf@amazon.com, kalyazin@amazon.com,
	xmarcalx@amazon.com, derekmn@amazon.com, jthoughton@google.com
Subject: Re: [PATCH v4 03/12] KVM: guest_memfd: Add flag to remove from direct map
Date: Wed, 26 Feb 2025 10:08:40 +0100
Message-ID: <bfe43591-66b6-4fb9-bf6c-df79ddeffb17@redhat.com>
In-Reply-To: <8642de57-553a-47ec-81af-803280a360ec@amazon.co.uk>

On 26.02.25 09:48, Patrick Roy wrote:
> 
> 
> On Tue, 2025-02-25 at 16:54 +0000, David Hildenbrand wrote:
>> On 21.02.25 17:07, Patrick Roy wrote:
>>> Add KVM_GMEM_NO_DIRECT_MAP flag for KVM_CREATE_GUEST_MEMFD() ioctl. When
>>> set, guest_memfd folios will be removed from the direct map after
>>> preparation, with direct map entries only restored when the folios are
>>> freed.
>>>
>>> To ensure these folios do not end up in places where the kernel cannot
>>> deal with them, set AS_NO_DIRECT_MAP on the guest_memfd's struct
>>> address_space if KVM_GMEM_NO_DIRECT_MAP is requested.
>>>
>>> Note that this flag causes removal of direct map entries for all
>>> guest_memfd folios independent of whether they are "shared" or "private"
>>> (although current guest_memfd only supports either all folios in the
>>> "shared" state, or all folios in the "private" state if
>>> !IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM)). The use case for also removing
>>> direct map entries of the shared parts of guest_memfd is a special
>>> type of non-CoCo VM where host userspace is trusted to have access to
>>> all of guest memory, but where Spectre-style transient execution attacks
>>> through the host kernel's direct map should still be mitigated.
>>>
>>> Note that KVM retains access to guest memory via userspace
>>> mappings of guest_memfd, which are reflected back into KVM's memslots
>>> via userspace_addr. This is needed for things like MMIO emulation on
>>> x86_64 to work. Previous iterations attempted to instead have KVM
>>> temporarily restore direct map entries whenever such an access to guest
>>> memory was needed, but this turned out to have a significant performance
>>> impact, as well as additional complexity due to needing to refcount
>>> direct map reinsertion operations and making them play nicely with gmem
>>> truncations.
>>>
>>> This iteration also doesn't have KVM perform TLB flushes after direct
>>> map manipulations. This is because TLB flushes resulted in an up to 40x
>>> elongation of page faults in guest_memfd (scaling with the number of CPU
>>> cores), or a 5x elongation of memory population. On the one hand, TLB
>>> flushes are not needed for functional correctness (the virt->phys
>>> mapping technically stays "correct", the kernel should simply not use it
>>> for a while), so this is a correct optimization to make. On the other
>>> hand, it means that the desired protection from Spectre-style attacks is
>>> not perfect, as an attacker could try to prevent a stale TLB entry from
>>> getting evicted, keeping it alive until the page it refers to is used by
>>> the guest for some sensitive data, and then targeting it using a
>>> spectre-gadget.
>>>
>>> Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
>>
>> ...
>>
>>>
>>> +static bool kvm_gmem_test_no_direct_map(struct inode *inode)
>>> +{
>>> +     return ((unsigned long) inode->i_private) & KVM_GMEM_NO_DIRECT_MAP;
>>> +}
>>> +
>>>    static inline void kvm_gmem_mark_prepared(struct folio *folio)
>>>    {
>>> +     struct inode *inode = folio_inode(folio);
>>> +
>>> +     if (kvm_gmem_test_no_direct_map(inode)) {
>>> +             int r = set_direct_map_valid_noflush(folio_page(folio, 0), folio_nr_pages(folio),
>>> +                                                  false);
>>
>> Will this work if KVM is built as a module, or is this another good
>> reason why we might want guest_memfd core part of core-mm?
> 
> Hm, I'm admittedly not too familiar with the differences that would come
> from building KVM as a module vs not. I do remember something about the
> direct map accessors not being available for modules, so this would
> indeed not work. Does that mean moving gmem into core-mm will be a
> prerequisite for the direct map removal stuff?

Likely, we'd need some shim.

Maybe for the time being it could be fenced using #if IS_BUILTIN() ...
but that sure won't win a beauty contest.
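
To make the fencing idea a bit more concrete, here is a rough userspace
sketch, not a proposal for the actual patch. It assumes a stand-in for the
IS_BUILTIN() macro modeled on include/linux/kconfig.h (with SK_-prefixed
helper names), a stub in place of set_direct_map_valid_noflush(), an
assumed bit value for KVM_GMEM_NO_DIRECT_MAP, and hypothetical sketch_*
helpers. The point is only that the direct map call gets compiled in when
KVM is built-in, and that the flag would be rejected otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Userspace stand-in modeled on the IS_BUILTIN() preprocessor trick from
 * include/linux/kconfig.h. Helper names are SK_-prefixed for this sketch.
 */
#define SK_ARG_PLACEHOLDER_1 0,
#define sk_take_second_arg(ignored, val, ...) val
#define sk_is_defined(x) sk_is_defined2(x)
#define sk_is_defined2(val) sk_is_defined3(SK_ARG_PLACEHOLDER_##val)
#define sk_is_defined3(arg1_or_junk) sk_take_second_arg(arg1_or_junk 1, 0, 0)
#define IS_BUILTIN(option) sk_is_defined(option)

/* Simulate CONFIG_KVM=y. A =m build would define CONFIG_KVM_MODULE instead. */
#define CONFIG_KVM 1

/* Assumed bit value, for illustration only. */
#define KVM_GMEM_NO_DIRECT_MAP (1UL << 0)

/* Counts how often the stubbed direct map helper was reached. */
static int direct_map_calls;

/* Stub standing in for the kernel's set_direct_map_valid_noflush(). */
static int stub_set_direct_map_valid_noflush(unsigned long pfn,
					     unsigned int nr, bool valid)
{
	assert(nr > 0);
	(void)pfn;
	(void)valid;
	direct_map_calls++;
	return 0;
}

#if IS_BUILTIN(CONFIG_KVM)
/* Built-in: the direct map helper is reachable, so call it. */
static int sketch_gmem_zap_direct_map(unsigned long pfn, unsigned int nr)
{
	return stub_set_direct_map_valid_noflush(pfn, nr, false);
}
#else
/* Module build: the helper is not available, so refuse. */
static int sketch_gmem_zap_direct_map(unsigned long pfn, unsigned int nr)
{
	(void)pfn;
	(void)nr;
	return -EOPNOTSUPP;
}
#endif

/* Reject the flag at creation time when the fenced path is compiled out. */
static int sketch_gmem_check_flags(unsigned long flags)
{
	if ((flags & KVM_GMEM_NO_DIRECT_MAP) && !IS_BUILTIN(CONFIG_KVM))
		return -EINVAL;
	return 0;
}
```

With CONFIG_KVM defined as above, sketch_gmem_check_flags() accepts the
flag and sketch_gmem_zap_direct_map() reaches the stub. Dropping that
define flips both to the error paths, which is roughly what a module
build would have to do.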

-- 
Cheers,

David / dhildenb



