* [PATCH 1/9] mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory
[not found] <20240507180729.3975856-1-pbonzini@redhat.com>
@ 2024-05-07 18:07 ` Paolo Bonzini
2024-05-07 18:07 ` [PATCH 2/9] KVM: guest_memfd: Use AS_INACCESSIBLE when creating guest_memfd inode Paolo Bonzini
1 sibling, 0 replies; 2+ messages in thread
From: Paolo Bonzini @ 2024-05-07 18:07 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: vbabka, isaku.yamahata, xiaoyao.li, binbin.wu, seanjc,
rick.p.edgecombe, michael.roth, yilun.xu, Matthew Wilcox,
linux-mm
From: Michael Roth <michael.roth@amd.com>
filemap users like guest_memfd may use page cache pages to
allocate/manage memory that is only intended to be accessed by guests
via hardware protections like encryption. Writes to memory of this sort
in common paths like truncation may cause unexpected behavior such as
writing garbage instead of zeros when attempting to zero pages, or
worse, triggering hardware protections that are considered fatal as far
as the kernel is concerned.
Introduce a new address_space flag, AS_INACCESSIBLE, and use this
initially to prevent zero'ing of pages during truncation, with the
understanding that it is up to the owner of the mapping to handle this
specially if needed.
This is admittedly a rather blunt solution, but it seems like
there are no other places that should take into account the
flag to keep its promise.
Link: https://lore.kernel.org/lkml/ZR9LYhpxTaTk6PJX@google.com/
Cc: Matthew Wilcox <willy@infradead.org>
Cc: linux-mm@kvack.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240329212444.395559-5-michael.roth@amd.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
include/linux/pagemap.h | 1 +
mm/truncate.c | 3 ++-
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 2df35e65557d..f879c1d54da7 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -207,6 +207,7 @@ enum mapping_flags {
AS_STABLE_WRITES, /* must wait for writeback before modifying
folio contents */
AS_UNMOVABLE, /* The mapping cannot be moved, ever */
+ AS_INACCESSIBLE, /* Do not attempt direct R/W access to the mapping */
};
/**
diff --git a/mm/truncate.c b/mm/truncate.c
index 725b150e47ac..c501338c7ebd 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -233,7 +233,8 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
* doing a complex calculation here, and then doing the zeroing
* anyway if the page split fails.
*/
- folio_zero_range(folio, offset, length);
+ if (!(folio->mapping->flags & AS_INACCESSIBLE))
+ folio_zero_range(folio, offset, length);
if (folio_has_private(folio))
folio_invalidate(folio, offset, length);
--
2.43.0
^ permalink raw reply [flat|nested] 2+ messages in thread
* [PATCH 2/9] KVM: guest_memfd: Use AS_INACCESSIBLE when creating guest_memfd inode
[not found] <20240507180729.3975856-1-pbonzini@redhat.com>
2024-05-07 18:07 ` [PATCH 1/9] mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory Paolo Bonzini
@ 2024-05-07 18:07 ` Paolo Bonzini
1 sibling, 0 replies; 2+ messages in thread
From: Paolo Bonzini @ 2024-05-07 18:07 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: vbabka, isaku.yamahata, xiaoyao.li, binbin.wu, seanjc,
rick.p.edgecombe, michael.roth, yilun.xu, linux-mm
From: Michael Roth <michael.roth@amd.com>
truncate_inode_pages_range() may attempt to zero pages before truncating
them, and this will occur before arch-specific invalidations can be
triggered via .invalidate_folio/.free_folio hooks via kvm_gmem_aops. For
AMD SEV-SNP this would result in an RMP #PF being generated by the
hardware, which is currently treated as fatal (and even if specifically
allowed for, would not result in anything other than garbage being
written to guest pages due to encryption). On Intel TDX this would also
result in undesirable behavior.
Set the AS_INACCESSIBLE flag to prevent the MM from attempting
unexpected accesses of this sort during operations like truncation.
This may also in some cases yield a decent performance improvement for
guest_memfd userspace implementations that hole-punch ranges immediately
after private->shared conversions via KVM_SET_MEMORY_ATTRIBUTES, since
the current implementation of truncate_inode_pages_range() always ends
up zero'ing an entire 4K range if it is backing by a 2M folio.
Link: https://lore.kernel.org/lkml/ZR9LYhpxTaTk6PJX@google.com/
Cc: linux-mm@kvack.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240329212444.395559-6-michael.roth@amd.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
virt/kvm/guest_memfd.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 0f4e0cf4f158..5a929536ecf2 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -357,6 +357,7 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
inode->i_private = (void *)(unsigned long)flags;
inode->i_op = &kvm_gmem_iops;
inode->i_mapping->a_ops = &kvm_gmem_aops;
+ inode->i_mapping->flags |= AS_INACCESSIBLE;
inode->i_mode |= S_IFREG;
inode->i_size = size;
mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
--
2.43.0
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2024-05-07 18:07 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20240507180729.3975856-1-pbonzini@redhat.com>
2024-05-07 18:07 ` [PATCH 1/9] mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory Paolo Bonzini
2024-05-07 18:07 ` [PATCH 2/9] KVM: guest_memfd: Use AS_INACCESSIBLE when creating guest_memfd inode Paolo Bonzini
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox