linux-mm.kvack.org archive mirror
From: Quentin Perret <qperret@google.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Sean Christopherson <seanjc@google.com>,
	 Ackerley Tng <ackerleytng@google.com>,
	Alexey Kardashevskiy <aik@amd.com>,
	cgroups@vger.kernel.org,  kvm@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-mm@kvack.org,  linux-trace-kernel@vger.kernel.org,
	x86@kernel.org, akpm@linux-foundation.org,
	 binbin.wu@linux.intel.com, bp@alien8.de, brauner@kernel.org,
	chao.p.peng@intel.com,  chenhuacai@kernel.org, corbet@lwn.net,
	dave.hansen@intel.com,  dave.hansen@linux.intel.com,
	david@redhat.com, dmatlack@google.com, erdemaktas@google.com,
	 fan.du@intel.com, fvdl@google.com, haibo1.xu@intel.com,
	hannes@cmpxchg.org,  hch@infradead.org, hpa@zytor.com,
	hughd@google.com, ira.weiny@intel.com,  isaku.yamahata@intel.com,
	jack@suse.cz, james.morse@arm.com, jarkko@kernel.org,
	 jgowans@amazon.com, jhubbard@nvidia.com, jroedel@suse.de,
	jthoughton@google.com,  jun.miao@intel.com, kai.huang@intel.com,
	keirf@google.com, kent.overstreet@linux.dev,
	 liam.merwick@oracle.com, maciej.wieczor-retman@intel.com,
	mail@maciej.szmigiero.name,  maobibo@loongson.cn,
	mathieu.desnoyers@efficios.com, maz@kernel.org,
	 mhiramat@kernel.org, mhocko@kernel.org, mic@digikod.net,
	michael.roth@amd.com,  mingo@redhat.com, mlevitsk@redhat.com,
	mpe@ellerman.id.au, muchun.song@linux.dev,  nikunj@amd.com,
	nsaenz@amazon.es, oliver.upton@linux.dev, palmer@dabbelt.com,
	 pankaj.gupta@amd.com, paul.walmsley@sifive.com,
	pbonzini@redhat.com, peterx@redhat.com,  pgonda@google.com,
	prsampat@amd.com, pvorel@suse.cz, richard.weiyang@gmail.com,
	 rick.p.edgecombe@intel.com, rientjes@google.com,
	rostedt@goodmis.org, roypat@amazon.co.uk,  rppt@kernel.org,
	shakeel.butt@linux.dev, shuah@kernel.org, steven.price@arm.com,
	 steven.sistare@oracle.com, suzuki.poulose@arm.com,
	tabba@google.com, tglx@linutronix.de,  thomas.lendacky@amd.com,
	vannapurve@google.com, vbabka@suse.cz, viro@zeniv.linux.org.uk,
	 vkuznets@redhat.com, wei.w.wang@intel.com, will@kernel.org,
	willy@infradead.org,  wyihan@google.com, xiaoyao.li@intel.com,
	yan.y.zhao@intel.com, yilun.xu@intel.com,  yuzenghui@huawei.com,
	zhiquan1.li@intel.com
Subject: Re: [RFC PATCH v1 05/37] KVM: guest_memfd: Wire up kvm_get_memory_attributes() to per-gmem attributes
Date: Thu, 29 Jan 2026 14:36:14 +0000
Message-ID: <od4dx6snqsl2qiocgf3jxm4dndxhrlvsfr22eveuno6nskgfdj@mxsywvku2jk5>
In-Reply-To: <20260129134245.GD2307128@ziepe.ca>

On Thursday 29 Jan 2026 at 09:42:45 (-0400), Jason Gunthorpe wrote:
> On Thu, Jan 29, 2026 at 11:10:12AM +0000, Quentin Perret wrote:
> 
> > A not-fully-thought-through-and-possibly-ridiculous idea that crossed
> > my mind some time ago was to make KVM itself a proper dmabuf
> > importer. 
> 
> AFAIK this is already the plan. Since Intel cannot tolerate having the
> private MMIO mapped into a VMA *at all* there is no other choice.
> 
> Since Intel has to build it I figured everyone would want to use it
> because it is probably going to be much faster than reading VMAs.

Ack.

> Especially in the modern world of MMIO BARs in the 512GB range.
> 
> > You'd essentially see a guest as a 'device' (probably with an
> > actual struct dev representing it), and the stage-2 MMU in front of it
> > as its IOMMU. That could potentially allow KVM to implement dma_map_ops
> > for that guest 'device' by mapping/unmapping pages into its stage-2 and
> > such. 
> 
> The plan isn't something so wild..

I'll take that as a compliment ;-)

Not dying on that hill, but it didn't feel _that_ horrible after
thinking about it for a little while. From the host's PoV, a guest is
just another thing that can address memory, which has its own address
space and a page-table that we control in front. If you squint hard
enough it doesn't look _that_ different from a device from that angle.
Oh well.

> https://github.com/jgunthorpe/linux/commits/dmabuf_map_type/
> 
> The "Physical Address List" mapping type will let KVM just get a
> normal phys_addr_t list and do its normal stuff with it. No need for
> hacky DMA API things.

Thanks, I'll read up.

> Probably what will be hard for KVM is that it gets the entire 512GB in
> one shot and will have to chop it up to install the whole thing into
> the PTE sizes available in the S2. I don't think it even has logic
> like that right now??

The closest thing I can think of is the KVM_PRE_FAULT_MEMORY stuff in
the KVM API that forces it to fault in an arbitrary range of guest
IPA space. There should at least be bits of infrastructure that can be
re-used for that I guess.
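
For illustration only, the chopping step discussed above could look
roughly like the sketch below. This is not taken from Jason's branch or
from existing KVM code: it assumes a 4KiB granule where the stage-2 can
map 4KiB, 2MiB and 1GiB blocks, and every name in it (s2_block_sizes,
s2_pick_block, s2_chop_range, s2_map_fn) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stage-2 mapping sizes, largest first: 1GiB, 2MiB, 4KiB. */
static const uint64_t s2_block_sizes[] = {
	1ULL << 30, 1ULL << 21, 1ULL << 12,
};

/* Hypothetical callback that would install one stage-2 mapping. */
typedef void (*s2_map_fn)(uint64_t ipa, uint64_t pa, uint64_t size, void *ctx);

/*
 * Pick the largest block such that both the IPA and the PA are aligned
 * to it and at least that many bytes remain. Returns 0 if none fits.
 */
static uint64_t s2_pick_block(uint64_t ipa, uint64_t pa, uint64_t len)
{
	size_t n = sizeof(s2_block_sizes) / sizeof(s2_block_sizes[0]);

	for (size_t i = 0; i < n; i++) {
		uint64_t sz = s2_block_sizes[i];

		if (!((ipa | pa) & (sz - 1)) && len >= sz)
			return sz;
	}
	return 0;
}

/*
 * Walk one physically contiguous range and hand it to 'map' (may be
 * NULL) in the largest blocks that fit. Returns the number of blocks.
 */
static uint64_t s2_chop_range(uint64_t ipa, uint64_t pa, uint64_t len,
			      s2_map_fn map, void *ctx)
{
	uint64_t nr = 0;

	/* Everything must be page aligned, so a 4KiB block always fits. */
	assert(!((ipa | pa | len) & ((1ULL << 12) - 1)));

	while (len) {
		uint64_t sz = s2_pick_block(ipa, pa, len);

		if (map)
			map(ipa, pa, sz, ctx);
		ipa += sz;
		pa += sz;
		len -= sz;
		nr++;
	}
	return nr;
}
```

With those assumptions, a 1GiB-aligned 512GB range comes out as 512
blocks of 1GiB, while a range that starts 2MiB below a 1GiB boundary
gets a 2MiB block first and only then moves up to 1GiB blocks. A real
implementation would also have to deal with a list of ranges, locking
and page-table memory, none of which is shown here.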

> > It gets really funny when a CoCo guest decides to share back a subset of
> > that dmabuf with the host, and I'm still wrapping my head around how
> > we'd make that work, but at this point I'm ready to be told how all the
> > above already doesn't work and that I should go back to the peanut
> > gallery :-)
> 
> Oh, I don't actually know how that ends up working but I suppose it
> could be meaningfully done :\

For mobile/pKVM we'll want to use dmabufs for more than just passing
MMIO to guests FWIW, it'll likely be used for memory in certain cases
too. There are examples in the KVM Forum talk I linked in the previous
email, but being able to feed guests with dmabuf-backed memory regions
is very helpful. That's useful to e.g. get physically contiguous memory
allocated from a CMA-backed dmabuf heap on systems that don't tolerate
scattered private memory well for example (either for functional or
performance reasons). I certainly wish we could ignore this type of
hardware, but we don't have that luxury sadly.

In cases like that, we certainly expect that the guest will be sharing
back parts of memory it's been given (at least a swiotlb bounce buffer
so it can do virtio etc), and that may very well be in the middle of a
dmabuf-backed memslot. In fact the guest has no clue what is backing
its memory region, so we can't really expect it _not_ to do that :/

