From: Michael Roth <michael.roth@amd.com>
To: <kvm@vger.kernel.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>, <x86@kernel.org>,
<linux-fsdevel@vger.kernel.org>
Cc: <david@kernel.org>, <fvdl@google.com>, <ira.weiny@intel.com>,
<jthoughton@google.com>, <pankaj.gupta@amd.com>,
<rick.p.edgecombe@intel.com>, <seanjc@google.com>,
<vannapurve@google.com>, <yan.y.zhao@intel.com>,
<kalyazin@amazon.co.uk>
Subject: Re: [RFC PATCH v2 00/51] 1G page support for guest_memfd
Date: Mon, 16 Feb 2026 17:07:33 -0600
Message-ID: <20260216230733.ejxtppfrbjaarftb@amd.com>

I'm not sure I'm hitting the same issue you were, but in order to fix
the race I was hitting I needed to grab the range lock outside of the
kvm_gmem_get_folio() path so that it could provide mutual exclusion for
the allocation as well as the subsequent splitting of newly-allocated
hugepages.

Here's the patch I needed on top:
https://github.com/mdroth/linux/commit/240e09e68fe61bb0dfad6a8e054a6aa9316a3660

I think this same issue exists for the THP implementation[1]; a range
lock built around filemap indices instead of physical addresses could
maybe address both, though I'm not sure it's worthwhile since THP has
been deemed non-upstreamable until general memory migration support is
added to gmem.

I'll dump the code below for reference, though, since I know some folks
on Cc have been asking about it. It isn't yet in a state where it's
worth posting separately, but it is at least relevant to this particular
discussion. For now, I've just piggy-backed off the filemap invalidate
write lock to serialize all allocations. Note that I've only hit the
race condition once with 2MB pages; it's much easier to trigger with 1GB
pages via hugetlb.

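In the allocation path that amounts to roughly the following
(simplified relative to what's actually in the tree below):

        /*
         * Serialize all gmem allocations by taking the filemap
         * invalidate lock in write mode across the allocation.
         * Truncate/invalidate paths already take
         * mapping->invalidate_lock in write mode, so this excludes
         * those as well.
         */
        filemap_invalidate_lock(inode->i_mapping);
        folio = kvm_gmem_get_folio(inode, index);
        filemap_invalidate_unlock(inode->i_mapping);
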
[1]
The THP patches are currently on top of a snapshot of Ackerley's hugetlb
dev tree. I'd originally planned to rebase onto just the common
dependencies and post upstream, but based on the latest guest_memfd/PUCK
calls there is no chance of THP going upstream without first
implementing memory migration support for guest_memfd to deal with
system-wide/cumulative fragmentation. So I'm tabling that work; for now
it's just these 3 patches on top:

2ae099ef6977 KVM: guest_memfd: Serialize allocations when THP is enabled
733f7a111699 [WIP] KVM: guest_memfd: Enable/fix hugepages for in-place conversion
349aa261ac65 KVM: Add hugepage support for dedicated guest memory

The initial patch adds THP support for legacy/non-in-place conversion;
the remaining 2 enable it for in-place conversion. There are various
warnings/TODOs/debug bits, and I'm only posting this for reference: I
don't know when I'll get to a cleaned-up version, since it's not clear
it'll be useful in the near term.

Kernel:
https://github.com/mdroth/linux/commits/snp-thp-rfc2-wip0
QEMU:
https://github.com/mdroth/qemu/commits/snp-hugetlb-v3wip0b

To run QEMU with in-place conversion enabled you need the following
option (SNP will default to legacy/non-in-place conversion otherwise):

qemu ... -object sev-snp-guest,...,convert-in-place=true

To enable hugepages with either convert-in-place=false/true, a kvm
module parameter turns it on for now (flipping it on/off rapidly may
help with simulating/testing low-memory situations):

echo 1 >/sys/module/kvm/parameters/gmem_2m_enabled

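(The knob is just meant to be a regular writable module parameter,
which is why it shows up under /sys/module/kvm/parameters/. Something
along these lines; the exact definition in the tree may differ:)

#include <linux/moduleparam.h>

/*
 * Sketch: a standard KVM module parameter; mode 0644 makes it
 * runtime-writable so it can be flipped on/off while guests are
 * running.
 */
static bool gmem_2m_enabled;
module_param(gmem_2m_enabled, bool, 0644);
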
This tree also supports SNP+hugetlbfs with the following, in case you
need it for comparison:

For 2MB hugetlb:
qemu ... \
-object sev-snp-guest,...,convert-in-place=true,gmem-allocator=hugetlb,gmem-page-size=2097152
For 1GB hugetlb:
qemu ... \
-object sev-snp-guest,...,convert-in-place=true,gmem-allocator=hugetlb,gmem-page-size=1073741824