From: "Li Zhe" <lizhe.67@bytedance.com>
To: <david@kernel.org>
Cc: <akpm@linux-foundation.org>, <ankur.a.arora@oracle.com>,
<fvdl@google.com>, <joao.m.martins@oracle.com>,
<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
<lizhe.67@bytedance.com>, <mhocko@suse.com>, <mjguzik@gmail.com>,
<muchun.song@linux.dev>, <osalvador@suse.de>,
<raghavendra.kt@amd.com>
Subject: Re: [PATCH v2 0/8] Introduce a huge-page pre-zeroing mechanism
Date: Tue, 13 Jan 2026 20:41:47 +0800 [thread overview]
Message-ID: <20260113124147.48460-1-lizhe.67@bytedance.com> (raw)
In-Reply-To: <7963534f-cce8-4330-8a67-3f31bd6b2166@kernel.org>
On Tue, 13 Jan 2026 11:15:29 +0100, david@kernel.org wrote:
> On 1/13/26 07:37, Li Zhe wrote:
> > On Mon, 12 Jan 2026 20:52:12 +0100, david@kernel.org wrote:
> >
> >>> As for concern (4), I believe it is orthogonal to this patchset, and
> >>> the cover letter already contains a performance comparison that
> >>> demonstrates the additional benefit.
> >>>
> >>>> I did see some comments in [1] about QEMU supporting user-mode
> >>>> parallel zero-page operations; I'm just not sure what the current
> >>>> state of that support looks like, or what the corresponding benchmark
> >>>> numbers are.
> >>>
> >>> As noted above, QEMU already employs a parallel page-touch mechanism,
> >>> yet the elapsed time remains noticeable. I am not deeply familiar with
> >>> QEMU; please correct me if I am mistaken.
> >>
> >> I implemented some part of the parallel preallocation support in QEMU.
> >>
> >> With QEMU, you can specify the number of threads and even specify the
> >> NUMA-placement of these threads. So you can pretty much fine-tune that
> >> for an environment.
> >>
> >> You still pre-zero all hugetlb pages at VM startup time, just in
> >> parallel though. So you pay some price at APP startup time.
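
If I am reading the QEMU documentation correctly, the fine-tuning you
mention maps to something like the following command line (only a
sketch; it assumes QEMU 7.2+ for thread-context, and the mount point,
sizes and thread count are made up for illustration):

  # preallocate with 16 threads whose placement is pinned to NUMA node 0
  qemu-system-x86_64 ... \
      -object thread-context,id=tc0,node-affinity=0 \
      -object memory-backend-file,id=mem0,size=2T,mem-path=/dev/hugepages-1G,share=on,prealloc=on,prealloc-threads=16,prealloc-context=tc0 \
      -numa node,memdev=mem0
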
> >
> > Hi David,
> >
> > Thank you for the comprehensive explanation.
> >
> > You are absolutely correct: QEMU's parallel preallocation is performed
> > only during VM start-up. We submitted this patch series mainly because
> > we observed that, even with the existing parallel mechanism, launching
> > large VMs still incurs prohibitive delays (bringing up a 2 TB VM still
> > takes more than 40 seconds just for zeroing).
> >
> >> If you know that you will run such a VM (or something else) later, you
> >> could pre-zero the memory from user space by using a hugetlb-backed file
> >> and supplying that to QEMU as memory backend for the VM. Then, you can
> >> start your VM without any pre-zeroing.
> >>
> >> I guess that approach should work universally. Of course, there are
> >> limitations, as you would have to know how much memory an app needs, and
> >> have a way to supply that memory in form of a file to that app.
> >
> > Regarding user-space pre-zeroing, I agree that it is feasible once the
> > VM's memory footprint is known. We evaluated this approach internally;
> > however, in production environments, it is almost impossible to predict
> > the exact amount of memory a VM will require.
>
> Of course, you could preallocate to the expected maximum and then
> truncate the file to the size you need :)
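
If I am reading this suggestion correctly, it would look roughly like
the following (only a sketch; the file name is made up, and it assumes
hugetlbfs zeroes the pages at fallocate time rather than at first
fault; please correct me if that is not the case):

  # ahead of time: preallocate (and thereby zero) the expected maximum
  fallocate -l 2T /dev/hugepages-1G/vm0.mem

  # at VM creation: shrink the file to what this VM actually needs ...
  truncate -s 1T /dev/hugepages-1G/vm0.mem

  # ... and hand the already-populated file to QEMU; share=on so the
  # pre-populated pages are actually reused instead of COWed
  qemu-system-x86_64 ... \
      -object memory-backend-file,id=mem0,size=1T,mem-path=/dev/hugepages-1G/vm0.mem,share=on,prealloc=off \
      -numa node,memdev=mem0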

The solution you described seems similar to delegating hugepage
management to a userspace daemon. I haven't explored this approach
before, but it appears quite complex. Beyond ensuring secure memory
isolation between VMs, we would also need to handle scenarios where
the management daemon or the QEMU process crashes, which implies
implementing robust recovery and memory reclamation mechanisms. Do
you happen to have any documentation or references regarding
userspace hugepage management that I could look into?

Compared to the userspace approach, I wonder if implementing hugepage
pre-zeroing directly within the kernel would be a simpler and more
direct way to accelerate VM creation.

Thanks,
Zhe

Thread overview: 21+ messages
2026-01-07 11:31 Li Zhe
2026-01-07 11:31 ` [PATCH v2 1/8] mm/hugetlb: add pre-zeroed framework Li Zhe
2026-01-07 11:31 ` [PATCH v2 2/8] mm/hugetlb: convert to prep_account_new_hugetlb_folio() Li Zhe
2026-01-07 11:31 ` [PATCH v2 3/8] mm/hugetlb: move the huge folio to the end of the list during enqueue Li Zhe
2026-01-07 11:31 ` [PATCH v2 4/8] mm/hugetlb: introduce per-node sysfs interface "zeroable_hugepages" Li Zhe
2026-01-07 11:31 ` [PATCH v2 5/8] mm/hugetlb: simplify function hugetlb_sysfs_add_hstate() Li Zhe
2026-01-07 11:31 ` [PATCH v2 6/8] mm/hugetlb: relocate the per-hstate struct kobject pointer Li Zhe
2026-01-07 11:31 ` [PATCH v2 7/8] mm/hugetlb: add epoll support for interface "zeroable_hugepages" Li Zhe
2026-01-07 11:31 ` [PATCH v2 8/8] mm/hugetlb: limit event generation frequency of function do_zero_free_notify() Li Zhe
2026-01-07 16:19 ` [PATCH v2 0/8] Introduce a huge-page pre-zeroing mechanism Andrew Morton
2026-01-12 11:25 ` Li Zhe
2026-01-09 6:05 ` Muchun Song
2026-01-12 11:27 ` Li Zhe
2026-01-12 19:52 ` David Hildenbrand (Red Hat)
2026-01-13 6:37 ` Li Zhe
2026-01-13 10:15 ` David Hildenbrand (Red Hat)
2026-01-13 12:41 ` Li Zhe [this message]
2026-01-12 22:00 ` Ankur Arora
2026-01-13 6:39 ` Li Zhe
2026-01-12 22:01 ` Ankur Arora
2026-01-13 6:41 ` Li Zhe