From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Gregory Price <gourry@gourry.net>, Li Zhe <lizhe.67@bytedance.com>
Cc: david.laight.linux@gmail.com, akpm@linux-foundation.org,
ankur.a.arora@oracle.com, dan.j.williams@intel.com,
dave@stgolabs.net, fvdl@google.com, joao.m.martins@oracle.com,
jonathan.cameron@huawei.com, linux-cxl@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
mhocko@suse.com, mjguzik@gmail.com, muchun.song@linux.dev,
osalvador@suse.de, raghavendra.kt@amd.com,
wangzhou1@hisilicon.com, zhanjie9@hisilicon.com
Subject: Re: [PATCH v2 0/8] Introduce a huge-page pre-zeroing mechanism
Date: Wed, 21 Jan 2026 13:41:33 +0100
Message-ID: <871f2a76-8ccb-4870-8a87-417371feb0b0@kernel.org>
In-Reply-To: <aW_G66HeWLbyiPHs@gourry-fedora-PF4VCD3F>

On 1/20/26 19:18, Gregory Price wrote:
> On Tue, Jan 20, 2026 at 06:39:48PM +0800, Li Zhe wrote:
>> On Tue, 20 Jan 2026 09:47:44 +0000, david.laight.linux@gmail.com wrote:
>>
>>> On Tue, 20 Jan 2026 14:27:06 +0800
>>> "Li Zhe" <lizhe.67@bytedance.com> wrote:
>>>
>>>
>>> Am I missing something?
>>> If userspace does:
>>> $ program_a; program_b
>>> and pages used by program_a are zeroed when it exits you get the delay
>>> for zeroing all the pages it used before program_b starts.
>>> OTOH if the zeroing is deferred program_b only needs to zero the pages
>>> it needs to start (and there may be some lurking).
>>
>> Under the init_on_free approach, improving the speed of zeroing may
>> indeed prove necessary.
>>
>> However, I believe we should first reach consensus on adopting
>> “init_on_free” as the solution to slow application startup before
>> turning to performance tuning.
>>
>
> His point was init_on_free may not actually reduce any delays on serial
> applications, and can actually introduce additional delays.
>
> Example
> -------
> program_a: alloc_hugepages(10);
>            exit();
>
> program_b: alloc_hugepages(5);
>            exit();
>
> /* Run programs in serial */
> sh: program_a && program_b
>
> in zero_on_alloc():
> 	program_a eats zero(10) cost on startup
> 	program_b eats zero(5) cost on startup
> 	Overall zero(15) cost to start program_b
>
> in zero_on_free():
> 	program_a eats zero(10) cost on startup
> 	program_a eats zero(10) cost on exit
> 	program_b eats zero(0) cost on startup
> 	Overall zero(20) cost to start program_b
>
> zero_on_free is worse by zero(5)
> -------
>
> This is a trivial example, but it's unclear zero_on_free actually
> provides a benefit. You have to know ahead of time what the runtime
> behavior, pre-zeroed count, and allocation pattern (0->10->5->...) would
> be to determine whether there's an actual reduction in startup time.

For VMs with hugetlb, people usually have some spare pages lying around.
VM startup time is more important for cloud providers than VM shutdown time.

I'm sure there are examples where it is the other way around, but having
mixed workloads on the system is likely not the highest priority right now.

>
> But just trivially, starting from the base case of no pages being
> zeroed, you're just injecting an additional zero(X) cost if program_a()
> consumes more hugepages than program_b().

And whatever you do,

	program_a()
	program_b()

will have to zero the pages. No asynchronous mechanism will really help.

>
> Long way of saying the shift from alloc to free seems heuristic-y and
> you need stronger analysis / better data to show this change is actually
> beneficial in the general case.

I think the principle of "the allocator already contains zeroed pages"
is quite universal and simple.

Whether you actually zero the pages when the last reference is gone
(like we do in the buddy), or have that happen from some asynchronous
context, is rather an internal optimization.
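
To make the accounting in this exchange concrete, here is a minimal Python
sketch of the serial case. It is only a toy model under stated assumptions:
costs are counted in units of "zero one huge page", every program frees all
of its pages at exit, zeroing on free is synchronous, and the function name
serial_zeroing_cost and the parameter prezeroed (spare pages that are
already zeroed in the pool) are hypothetical names chosen for illustration,
not anything from the patch series.

```python
def serial_zeroing_cost(requests, prezeroed=0, zero_on_free=False):
    """Toy model of zeroing cost for programs run one after another.

    requests     -- huge pages each program allocates, in run order
    prezeroed    -- (assumed) spare pages already zeroed in the pool
    zero_on_free -- zero pages synchronously at exit instead of at alloc

    Returns (per-program startup costs, total cost paid up to the point
    where the last program has finished starting).
    """
    zeroed = prezeroed
    startup = []
    total = 0
    for i, n in enumerate(requests):
        # Take already-zeroed pages first; zero the remainder at alloc time.
        from_pool = min(n, zeroed)
        zeroed -= from_pool
        cost = n - from_pool
        startup.append(cost)
        total += cost

        is_last = i == len(requests) - 1
        if zero_on_free and not is_last:
            # Exit of this program delays the next one in a serial run.
            total += n
            zeroed += n
        # Without zero_on_free, freed pages return to the pool dirty.
    return startup, total


# Gregory's example: program_a uses 10 pages, program_b uses 5.
print(serial_zeroing_cost([10, 5]))                      # ([10, 5], 15)
print(serial_zeroing_cost([10, 5], zero_on_free=True))   # ([10, 0], 20)
# Same run, but with 10 spare pre-zeroed pages lying around.
print(serial_zeroing_cost([10, 5], prezeroed=10, zero_on_free=True))
# ([0, 0], 10)
```

In this model the total amount of zeroing never shrinks by moving it from
alloc to free; what changes is where it is paid. With spare zeroed pages in
the pool, the startup cost drops to zero and the cost moves to exit.
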
--
Cheers
David