From: Ackerley Tng <ackerleytng@google.com>
To: Michael Roth <michael.roth@amd.com>
Cc: aik@amd.com, andrew.jones@linux.dev, binbin.wu@linux.intel.com,
brauner@kernel.org, chao.p.peng@linux.intel.com,
david@kernel.org, ira.weiny@intel.com, jmattson@google.com,
jroedel@suse.de, jthoughton@google.com, oupton@kernel.org,
pankaj.gupta@amd.com, qperret@google.com,
rick.p.edgecombe@intel.com, rientjes@google.com,
shivankg@amd.com, steven.price@arm.com, tabba@google.com,
willy@infradead.org, wyihan@google.com, yan.y.zhao@intel.com,
forkloop@google.com, pratyush@kernel.org,
suzuki.poulose@arm.com, aneesh.kumar@kernel.org,
Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Thomas Gleixner <tglx@kernel.org>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Shuah Khan <shuah@kernel.org>,
Vishal Annapurve <vannapurve@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
Barry Song <baohua@kernel.org>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
Jason Gunthorpe <jgg@ziepe.ca>,
Vlastimil Babka <vbabka@kernel.org>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kselftest@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH RFC v4 10/44] KVM: guest_memfd: Add support for KVM_SET_MEMORY_ATTRIBUTES2
Date: Wed, 1 Apr 2026 15:46:43 -0700 [thread overview]
Message-ID: <CAEvNRgFgOJj1_WKGrmFfXe8Kg9BrrX7Shmu_cjsEfiq2RP79zQ@mail.gmail.com> (raw)
In-Reply-To: <cqkyz4zxosmev6lnbasa32ed5jjfljrx3gr6plyjfhrtbysuwr@rg5noy6y6ayu>
Michael Roth <michael.roth@amd.com> writes:
> On Thu, Mar 26, 2026 at 03:24:19PM -0700, Ackerley Tng wrote:
>> For shared to private conversions, if refcounts on any of the folios
>> within the range are elevated, fail the conversion with -EAGAIN.
>>
>> At the point of shared to private conversion, all folios in range are
>> also unmapped. The filemap_invalidate_lock() is held, so no faulting
>> can occur. Hence, from that point on, only transient refcounts can be
>> taken on the folios associated with that guest_memfd.
>>
>> Hence, it is safe to do the conversion from shared to private.
>>
>> After conversion is complete, refcounts may become elevated, but that
>> is fine since users of transient refcounts don't actually access
>> memory.
>>
>> For private to shared conversions, there are no refcount checks, since
>> the guest is the only user of private pages, and guest_memfd will be the
>> only holder of refcounts on private pages.
>
> I think KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES deserves some mention in
> the commit log.
>
Will update this in the next revision. Thanks!
>>
>>
>> [...snip...]
>>
>>
>> +Set attributes for a range of offsets within a guest_memfd to
>> +KVM_MEMORY_ATTRIBUTE_PRIVATE to limit the specified guest_memfd backed
>> +memory range for guest_use. Even if KVM_CAP_GUEST_MEMFD_MMAP is
>> +supported, after a successful call to set
>> +KVM_MEMORY_ATTRIBUTE_PRIVATE, the requested range will not be mappable
>> +into host userspace and will only be mappable by the guest.
>> +
>> +To allow the range to be mappable into host userspace again, call
>> +KVM_SET_MEMORY_ATTRIBUTES2 on the guest_memfd again with
>> +KVM_MEMORY_ATTRIBUTE_PRIVATE unset.
>> +
>> +If this ioctl returns -EAGAIN, the offset of the page with unexpected
>> +refcounts will be returned in `error_offset`. This can occur if there
>> +are transient refcounts on the pages, taken by other parts of the
>> +kernel.
>
> That's only true for the guest_memfd ioctl, for KVM ioctl these new
> fields and r/w behavior are basically ignored. So you might need to be
> clearer on which fields/behavior are specific to guest_memfd like in
> the preceeding paragraphs..
>
Yes, will update in the next revision, thanks!
> ..or maybe it's better to do the opposite and just have a blanket 'for
> now, all newly-described behavior pertains only to usage via a
> guest_memfd ioctl, and for KVM ioctls only the fields/behaviors common
> with KVM_SET_MEMORY_ATTRIBUTES are applicable.', since it doesn't seem
> like vm_memory_attributes=1 is long for this world and that's the only
> case where KVM memory attribute ioctls seem relevant.
>
> But then it makes me wonder, if we adopt the semantics I mentioned
> earlier and have KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES advertise both
> the gmem ioctl support as well as the struct kvm_memory_attributes2
> support, if we should even advertise KVM_CAP_MEMORY_ATTRIBUTES2 at all
> as part of this series.
>
Read your other email as well, thanks for reviewing!
It makes sense, hope this captures what you suggested. In v5,
If vm_memory_attributes == 1:
(KVM_CAP_MEMORY_ATTRIBUTES2 will be removed (will return 0))
If vm_memory_attributes == 0 aka attributes are tracked by guest_memfd:
KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES2 will return valid attributes
(KVM_CAP_MEMORY_ATTRIBUTES2 will be removed (will return 0))
So yup, KVM_CAP_MEMORY_ATTRIBUTES2 will not even be #defined at all.
>> +
>> +Userspace is expected to figure out how to remove all known refcounts
>> +on the shared pages, such as refcounts taken by get_user_pages(), and
>> +try the ioctl again. A possible source of these long term refcounts is
>> +if the guest_memfd memory was pinned in IOMMU page tables.
>
> One might read this to mean error_offset is used purely for the EAGAIN
> case, so it might be worth touching on the other cases as well.
>
Will update this in the next revision.
> -Mike
>
>>
>> [...snip...]
>>
next prev parent reply other threads:[~2026-04-01 22:46 UTC|newest]
Thread overview: 74+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-26 22:24 [PATCH RFC v4 00/44] guest_memfd: In-place conversion support Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 01/44] KVM: guest_memfd: Introduce per-gmem attributes, use to guard user mappings Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 02/44] KVM: Rename KVM_GENERIC_MEMORY_ATTRIBUTES to KVM_VM_MEMORY_ATTRIBUTES Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 03/44] KVM: Enumerate support for PRIVATE memory iff kvm_arch_has_private_mem is defined Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 04/44] KVM: Stub in ability to disable per-VM memory attribute tracking Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 05/44] KVM: guest_memfd: Wire up kvm_get_memory_attributes() to per-gmem attributes Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 06/44] KVM: guest_memfd: Update kvm_gmem_populate() to use gmem attributes Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 07/44] KVM: guest_memfd: Only prepare folios for private pages Ackerley Tng
2026-04-01 14:05 ` Ackerley Tng
2026-04-01 15:16 ` Michael Roth
2026-04-01 21:43 ` Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 08/44] KVM: Introduce KVM_SET_MEMORY_ATTRIBUTES2 Ackerley Tng
2026-03-31 22:53 ` Michael Roth
2026-04-01 21:04 ` Sean Christopherson
2026-03-26 22:24 ` [PATCH RFC v4 09/44] KVM: guest_memfd: Enable INIT_SHARED on guest_memfd for x86 Coco VMs Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 10/44] KVM: guest_memfd: Add support for KVM_SET_MEMORY_ATTRIBUTES2 Ackerley Tng
2026-03-31 23:31 ` Michael Roth
2026-04-01 22:46 ` Ackerley Tng [this message]
2026-04-01 15:35 ` Michael Roth
2026-04-01 21:12 ` Sean Christopherson
2026-04-01 22:38 ` Ackerley Tng
2026-04-02 16:20 ` Ackerley Tng
2026-04-03 14:50 ` Ackerley Tng
2026-04-07 21:09 ` Michael Roth
2026-04-07 21:50 ` Vishal Annapurve
2026-04-07 22:09 ` Michael Roth
2026-04-08 0:33 ` Sean Christopherson
2026-04-08 16:54 ` Ackerley Tng
2026-04-08 19:48 ` Sean Christopherson
2026-04-08 11:01 ` Steven Price
2026-04-08 0:30 ` Sean Christopherson
2026-03-26 22:24 ` [PATCH RFC v4 11/44] KVM: guest_memfd: Handle lru_add fbatch refcounts during conversion safety check Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 12/44] KVM: guest_memfd: Introduce default handlers for content modes Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 13/44] KVM: guest_memfd: Apply content modes while setting memory attributes Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 14/44] KVM: x86: Add support for applying content modes Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 15/44] KVM: Add CAP to enumerate supported SET_MEMORY_ATTRIBUTES2 flags Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 16/44] KVM: Move KVM_VM_MEMORY_ATTRIBUTES config definition to x86 Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 17/44] KVM: Let userspace disable per-VM mem attributes, enable per-gmem attributes Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 18/44] KVM: selftests: Create gmem fd before "regular" fd when adding memslot Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 19/44] KVM: selftests: Rename guest_memfd{,_offset} to gmem_{fd,offset} Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 20/44] KVM: selftests: Add support for mmap() on guest_memfd in core library Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 21/44] KVM: selftests: Add selftests global for guest memory attributes capability Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 22/44] KVM: selftests: Update framework to use KVM_SET_MEMORY_ATTRIBUTES2 Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 23/44] KVM: selftests: Add helpers for calling ioctls on guest_memfd Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 24/44] KVM: selftests: Test using guest_memfd for guest private memory Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 25/44] KVM: selftests: Test basic single-page conversion flow Ackerley Tng
2026-03-31 22:33 ` Ackerley Tng
2026-04-01 21:08 ` Sean Christopherson
2026-03-26 22:24 ` [PATCH RFC v4 26/44] KVM: selftests: Test conversion flow when INIT_SHARED Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 27/44] KVM: selftests: Test conversion precision in guest_memfd Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 28/44] KVM: selftests: Test conversion before allocation Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 29/44] KVM: selftests: Convert with allocated folios in different layouts Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 30/44] KVM: selftests: Test that truncation does not change shared/private status Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 31/44] KVM: selftests: Test that shared/private status is consistent across processes Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 32/44] KVM: selftests: Test conversion with elevated page refcount Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 33/44] KVM: selftests: Test that conversion to private does not support ZERO Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 34/44] KVM: selftests: Support checking that data not equal expected Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 35/44] KVM: selftests: Test that not specifying a conversion flag scrambles memory contents Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 36/44] KVM: selftests: Reset shared memory after hole-punching Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 37/44] KVM: selftests: Provide function to look up guest_memfd details from gpa Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 38/44] KVM: selftests: Provide common function to set memory attributes Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 39/44] KVM: selftests: Check fd/flags provided to mmap() when setting up memslot Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 40/44] KVM: selftests: Make TEST_EXPECT_SIGBUS thread-safe Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 41/44] KVM: selftests: Update private_mem_conversions_test to mmap() guest_memfd Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 42/44] KVM: selftests: Add script to exercise private_mem_conversions_test Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 43/44] KVM: selftests: Update pre-fault test to work with per-guest_memfd attributes Ackerley Tng
2026-03-26 22:24 ` [PATCH RFC v4 44/44] KVM: selftests: Update private memory exits test to work with per-gmem attributes Ackerley Tng
2026-03-26 23:36 ` [POC PATCH 0/6] guest_memfd in-place conversion selftests for SNP Ackerley Tng
2026-03-26 23:36 ` [POC PATCH 1/6] KVM: selftests: Initialize guest_memfd with INIT_SHARED Ackerley Tng
2026-03-26 23:36 ` [POC PATCH 2/6] KVM: selftests: Call snp_launch_update_data() providing copy of memory Ackerley Tng
2026-03-26 23:36 ` [POC PATCH 3/6] KVM: selftests: Make guest_code_xsave more friendly Ackerley Tng
2026-03-26 23:36 ` [POC PATCH 4/6] KVM: selftests: Allow specifying CoCo-privateness while mapping a page Ackerley Tng
2026-03-26 23:36 ` [POC PATCH 5/6] KVM: selftests: Test conversions for SNP Ackerley Tng
2026-03-26 23:36 ` [POC PATCH 6/6] KVM: selftests: Test content modes ZERO and PRESERVE " Ackerley Tng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAEvNRgFgOJj1_WKGrmFfXe8Kg9BrrX7Shmu_cjsEfiq2RP79zQ@mail.gmail.com \
--to=ackerleytng@google.com \
--cc=aik@amd.com \
--cc=akpm@linux-foundation.org \
--cc=andrew.jones@linux.dev \
--cc=aneesh.kumar@kernel.org \
--cc=axelrasmussen@google.com \
--cc=baohua@kernel.org \
--cc=bhe@redhat.com \
--cc=binbin.wu@linux.intel.com \
--cc=bp@alien8.de \
--cc=brauner@kernel.org \
--cc=chao.p.peng@linux.intel.com \
--cc=chrisl@kernel.org \
--cc=corbet@lwn.net \
--cc=dave.hansen@linux.intel.com \
--cc=david@kernel.org \
--cc=forkloop@google.com \
--cc=hpa@zytor.com \
--cc=ira.weiny@intel.com \
--cc=jgg@ziepe.ca \
--cc=jmattson@google.com \
--cc=jroedel@suse.de \
--cc=jthoughton@google.com \
--cc=kasong@tencent.com \
--cc=kvm@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=michael.roth@amd.com \
--cc=mingo@redhat.com \
--cc=nphamcs@gmail.com \
--cc=oupton@kernel.org \
--cc=pankaj.gupta@amd.com \
--cc=pbonzini@redhat.com \
--cc=pratyush@kernel.org \
--cc=qperret@google.com \
--cc=rick.p.edgecombe@intel.com \
--cc=rientjes@google.com \
--cc=rostedt@goodmis.org \
--cc=seanjc@google.com \
--cc=shikemeng@huaweicloud.com \
--cc=shivankg@amd.com \
--cc=shuah@kernel.org \
--cc=skhan@linuxfoundation.org \
--cc=steven.price@arm.com \
--cc=suzuki.poulose@arm.com \
--cc=tabba@google.com \
--cc=tglx@kernel.org \
--cc=vannapurve@google.com \
--cc=vbabka@kernel.org \
--cc=weixugc@google.com \
--cc=willy@infradead.org \
--cc=wyihan@google.com \
--cc=x86@kernel.org \
--cc=yan.y.zhao@intel.com \
--cc=yuanchu@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox