From: Ackerley Tng <ackerleytng@google.com>
To: kalyazin@amazon.com, "Edgecombe,
Rick P" <rick.p.edgecombe@intel.com>,
"linux-riscv@lists.infradead.org"
<linux-riscv@lists.infradead.org>,
"kalyazin@amazon.co.uk" <kalyazin@amazon.co.uk>,
"kernel@xen0n.name" <kernel@xen0n.name>,
"linux-kselftest@vger.kernel.org"
<linux-kselftest@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
"kvmarm@lists.linux.dev" <kvmarm@lists.linux.dev>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"loongarch@lists.linux.dev" <loongarch@lists.linux.dev>
Cc: "david@kernel.org" <david@kernel.org>,
"palmer@dabbelt.com" <palmer@dabbelt.com>,
"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
"svens@linux.ibm.com" <svens@linux.ibm.com>,
"jgross@suse.com" <jgross@suse.com>,
"surenb@google.com" <surenb@google.com>,
"riel@surriel.com" <riel@surriel.com>,
"pfalcato@suse.de" <pfalcato@suse.de>,
"peterx@redhat.com" <peterx@redhat.com>,
"x86@kernel.org" <x86@kernel.org>,
"rppt@kernel.org" <rppt@kernel.org>,
"thuth@redhat.com" <thuth@redhat.com>,
"maz@kernel.org" <maz@kernel.org>,
"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
"ast@kernel.org" <ast@kernel.org>,
"vbabka@suse.cz" <vbabka@suse.cz>,
"Annapurve, Vishal" <vannapurve@google.com>,
"borntraeger@linux.ibm.com" <borntraeger@linux.ibm.com>,
"alex@ghiti.fr" <alex@ghiti.fr>,
"pjw@kernel.org" <pjw@kernel.org>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"willy@infradead.org" <willy@infradead.org>,
"hca@linux.ibm.com" <hca@linux.ibm.com>,
"wyihan@google.com" <wyihan@google.com>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"jolsa@kernel.org" <jolsa@kernel.org>,
"yang@os.amperecomputing.com" <yang@os.amperecomputing.com>,
"jmattson@google.com" <jmattson@google.com>,
"luto@kernel.org" <luto@kernel.org>,
"aneesh.kumar@kernel.org" <aneesh.kumar@kernel.org>,
"haoluo@google.com" <haoluo@google.com>,
"patrick.roy@linux.dev" <patrick.roy@linux.dev>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"coxu@redhat.com" <coxu@redhat.com>,
"mhocko@suse.com" <mhocko@suse.com>,
"mlevitsk@redhat.com" <mlevitsk@redhat.com>,
"jgg@ziepe.ca" <jgg@ziepe.ca>, "hpa@zytor.com" <hpa@zytor.com>,
"song@kernel.org" <song@kernel.org>,
"oupton@kernel.org" <oupton@kernel.org>,
"peterz@infradead.org" <peterz@infradead.org>,
"maobibo@loongson.cn" <maobibo@loongson.cn>,
"lorenzo.stoakes@oracle.com" <lorenzo.stoakes@oracle.com>,
"Liam.Howlett@oracle.com" <Liam.Howlett@oracle.com>,
"jthoughton@google.com" <jthoughton@google.com>,
"martin.lau@linux.dev" <martin.lau@linux.dev>,
"jhubbard@nvidia.com" <jhubbard@nvidia.com>,
"Yu, Yu-cheng" <yu-cheng.yu@intel.com>,
"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
"eddyz87@gmail.com" <eddyz87@gmail.com>,
"yonghong.song@linux.dev" <yonghong.song@linux.dev>,
"chenhuacai@kernel.org" <chenhuacai@kernel.org>,
"shuah@kernel.org" <shuah@kernel.org>,
"prsampat@amd.com" <prsampat@amd.com>,
"kevin.brodsky@arm.com" <kevin.brodsky@arm.com>,
"shijie@os.amperecomputing.com" <shijie@os.amperecomputing.com>,
"suzuki.poulose@arm.com" <suzuki.poulose@arm.com>,
"itazur@amazon.co.uk" <itazur@amazon.co.uk>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"yuzenghui@huawei.com" <yuzenghui@huawei.com>,
"dev.jain@arm.com" <dev.jain@arm.com>,
"gor@linux.ibm.com" <gor@linux.ibm.com>,
"jackabt@amazon.co.uk" <jackabt@amazon.co.uk>,
"daniel@iogearbox.net" <daniel@iogearbox.net>,
"agordeev@linux.ibm.com" <agordeev@linux.ibm.com>,
"andrii@kernel.org" <andrii@kernel.org>,
"mingo@redhat.com" <mingo@redhat.com>,
"aou@eecs.berkeley.edu" <aou@eecs.berkeley.edu>,
"joey.gouly@arm.com" <joey.gouly@arm.com>,
"derekmn@amazon.com" <derekmn@amazon.com>,
"xmarcalx@amazon.co.uk" <xmarcalx@amazon.co.uk>,
"kpsingh@kernel.org" <kpsingh@kernel.org>,
"sdf@fomichev.me" <sdf@fomichev.me>,
"jackmanb@google.com" <jackmanb@google.com>,
"bp@alien8.de" <bp@alien8.de>, "corbet@lwn.net" <corbet@lwn.net>,
"jannh@google.com" <jannh@google.com>,
"john.fastabend@gmail.com" <john.fastabend@gmail.com>,
"kas@kernel.org" <kas@kernel.org>,
"will@kernel.org" <will@kernel.org>,
"seanjc@google.com" <seanjc@google.com>
Subject: Re: [PATCH v9 07/13] KVM: guest_memfd: Add flag to remove from direct map
Date: Thu, 22 Jan 2026 10:37:37 -0800
Message-ID: <CAEvNRgEvd9tSwrkaYrQyibO2DP99vgVj6_zr=jBH5+zMnJwYbA@mail.gmail.com>
In-Reply-To: <294bca75-2f3e-46db-bb24-7c471a779cc1@amazon.com>
Nikita Kalyazin <kalyazin@amazon.com> writes:
> On 16/01/2026 00:00, Edgecombe, Rick P wrote:
>> On Wed, 2026-01-14 at 13:46 +0000, Kalyazin, Nikita wrote:
>>> +static void kvm_gmem_folio_restore_direct_map(struct folio *folio)
>>> +{
>>> + /*
>>> + * Direct map restoration cannot fail, as the only error condition
>>> + * for direct map manipulation is failure to allocate page tables
>>> + * when splitting huge pages, but this split would have already
>>> + * happened in folio_zap_direct_map() in kvm_gmem_folio_zap_direct_map().

Do you know if folio_restore_direct_map() will also end up merging page
table entries to a higher level?

>>> + * Thus folio_restore_direct_map() here only updates prot bits.
>>> + */
>>> + if (kvm_gmem_folio_no_direct_map(folio)) {
>>> + WARN_ON_ONCE(folio_restore_direct_map(folio));
>>> + folio->private = (void *)((u64)folio->private & ~KVM_GMEM_FOLIO_NO_DIRECT_MAP);
>>> + }
>>> +}
>>> +
>>
>> Does this assume the folio would not have been split after it was zapped? As in,
>> if it was zapped at 2MB granularity (no 4KB direct map split required) but then
>> restored at 4KB (split required)? Or it gets merged somehow before this?

I agree with the rest of the discussion that this will probably land
before huge page support, so I will have to figure out the intersection
of the two later.

>
> AFAIK it can't be zapped at 2MB granularity as the zapping code will
> inevitably cause splitting because guest_memfd faults occur at the base
> page granularity as of now.

Here's what I'm thinking for now:

[HugeTLB, no conversions]

With initial HugeTLB support (no conversions), host userspace
guest_memfd faults will be:

+ For guest_memfd with PUD-sized pages
  + At PUD level or PTE level
+ For guest_memfd with PMD-sized pages
  + At PMD level or PTE level

Since this guest_memfd doesn't support conversions, the folio is never
split/merged, so the direct map is restored at whatever level it was
zapped. I think this works out well.
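
As a rough userspace sketch of the fault levels listed above (every
name here is hypothetical and only for illustration, none of this is a
kernel API), the rule could be modelled as:

```c
#include <stdbool.h>

/* Hypothetical model of mapping levels; not kernel definitions. */
enum model_level { MODEL_PTE, MODEL_PMD, MODEL_PUD };

/*
 * Without conversions, a host userspace fault is served either at the
 * guest_memfd's own page-size level or at PTE level, as listed above.
 * gmem_level is the page size the guest_memfd was created with.
 */
static bool model_fault_level_ok(enum model_level gmem_level,
				 enum model_level fault_level)
{
	return fault_level == gmem_level || fault_level == MODEL_PTE;
}
```

So a PUD-sized guest_memfd accepts PUD or PTE faults but, in this
model, not PMD faults.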

[HugeTLB + conversions]

For a guest_memfd with HugeTLB support and conversions, host userspace
guest_memfd faults will always be at PTE level, so the direct map will
be split and the faulted pages have the direct map zapped in 4K chunks
as they are faulted.

On conversion back to private, put those back into the direct map
(putting aside whether to merge the direct map PTEs for now).

Unfortunately there's no unmapping callback for guest_memfd to use, so
perhaps the principle should be to put the folios back into the direct
map ASAP - at unmapping if guest_memfd is doing the unmapping, otherwise
at freeing time?
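
To make the "restore ASAP" idea concrete, here is a minimal userspace
sketch. It assumes a single flag bit that tracks whether a folio is
out of the direct map, similar in spirit to KVM_GMEM_FOLIO_NO_DIRECT_MAP
in the quoted patch. All names below are hypothetical and this is not
the proposed implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_NO_DIRECT_MAP (UINT64_C(1) << 0)

struct model_folio {
	uint64_t private;	/* flag word, standing in for folio->private */
	int restore_calls;	/* counts actual direct map restorations */
};

static bool model_no_direct_map(const struct model_folio *f)
{
	return (f->private & MODEL_NO_DIRECT_MAP) != 0;
}

/* Zap: mark the folio as removed from the direct map. */
static void model_zap(struct model_folio *f)
{
	f->private |= MODEL_NO_DIRECT_MAP;
}

/* Idempotent restore: does work only if the folio is currently zapped. */
static void model_restore(struct model_folio *f)
{
	if (model_no_direct_map(f)) {
		f->restore_calls++;
		f->private &= ~MODEL_NO_DIRECT_MAP;
	}
}

/* Unmap path: restore right away only if guest_memfd did the unmapping. */
static void model_unmap(struct model_folio *f, bool by_guest_memfd)
{
	if (by_guest_memfd)
		model_restore(f);
}

/* Free path: fallback that catches folios not restored at unmap time. */
static void model_free(struct model_folio *f)
{
	model_restore(f);
	assert(!model_no_direct_map(f));
}
```

Because the restore is keyed off the flag, calling it from both the
unmapping path and the freeing path restores each folio exactly once,
whichever path gets there first.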