Re: [RFC PATCH v1 0/7] Open HugeTLB allocation routine for more generic use

linux-mm.kvack.org archive mirror

From: Ackerley Tng <ackerleytng@google.com>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: akpm@linux-foundation.org, dan.j.williams@intel.com,
	david@kernel.org,  fvdl@google.com, hannes@cmpxchg.org,
	jgg@nvidia.com, jiaqiyan@google.com,  jthoughton@google.com,
	kalyazin@amazon.com, mhocko@kernel.org,  michael.roth@amd.com,
	muchun.song@linux.dev, osalvador@suse.de,
	 pasha.tatashin@soleen.com, pbonzini@redhat.com,
	peterx@redhat.com,  pratyush@kernel.org,
	rick.p.edgecombe@intel.com, rientjes@google.com,
	 roman.gushchin@linux.dev, seanjc@google.com,
	shakeel.butt@linux.dev,  shivankg@amd.com, vannapurve@google.com,
	yan.y.zhao@intel.com,  cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH v1 0/7] Open HugeTLB allocation routine for more generic use
Date: Sun, 8 Mar 2026 23:58:57 -0700
Message-ID: <CAEvNRgFpD0jD8QdmBPz-T=jhGn+Rb8MjTq4aycAUkAx54fMhWg@mail.gmail.com>
In-Reply-To: <20260226180821.2218448-1-joshua.hahnjy@gmail.com>

Joshua Hahn <joshua.hahnjy@gmail.com> writes:

> On Wed, 25 Feb 2026 19:37:04 -0800 Ackerley Tng <ackerleytng@google.com> wrote:
>
>> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>>
>> > On Wed, 11 Feb 2026 16:37:11 -0800 Ackerley Tng <ackerleytng@google.com> wrote:
>> >
>> > Hi Ackerley, I hope you're doing well!
>> >
>> > [...snip...]
>> >
>> >> I would like to get feedback on:
>> >>
>> >> 1. Opening up HugeTLB's allocation for more generic use
>> >
>> > I'm not entirely familiar with guest_memfd, so please excuse my ignorance
>> > if I'm missing anything obvious.
>>
>> Happy to take questions! Thank you for your thoughts and reviews!
>
> Of course, thank you for your work, Ackerley!
>
>> > But I'm wondering what hugeTLB offers
>> > that other hugepage solutions cannot offer for guest_memfd, if the
>> > goal of this series is to decouple it from hugeTLBfs.
>> >
>>
>> The one other huge page source that we've explored is THP pages from the
>> buddy allocator. Compared to HugeTLB, huge pages from the buddy
>> allocator
>>
>> + Has a maximum size of 2M
>> + Does not guarantee huge pages the way HugeTLB does - HugeTLB pages are
>>   allocated at boot, and guest_memfd can reserve pages at guest_memfd
>>   creation time.
>> + Allocation of HugeTLB pages is also really fast, it's just dequeuing
>>   from a preallocated pool
>
> All of these make sense. Just wanted to know if guest_memfd had any
> unique usecases for hugeTLB that normal hugetlbfs didn't have.
>

IIUC HugeTLB was meant to make huge pages available to userspace for
performance reasons. guest_memfd wants HugeTLB for the same reason, just
for virtualization use cases. So nope, I don't think there are any
use cases unique to guest_memfd.

These are the differences I can think of between guest_memfd and
HugeTLBfs's usage of HugeTLB:

+ guest_memfd may split HugeTLB pages to individual struct pages during
  guest_memfd's ownership of the HugeTLB page. (The pages will be merged
  before returning them to HugeTLB)

+ guest_memfd will provide an option to remove memory in guest_memfd
  ownership from the kernel direct map - I think HugeTLB pages are
  always in the direct map (?)

+ guest_memfd doesn't want to use HugeTLB surplus pages, for now

+ guest_memfd will reserve pages at fd creation time instead of at mmap
  time. Reservation is done by creating a subpool, so guest_memfd
  doesn't use resv_map.
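
To make the last two points concrete, here is a toy userspace model of
reserving at fd creation time with no surplus fallback. It is only a
sketch: it is not kernel code, and every type and function name in it
(toy_pool, toy_subpool, toy_subpool_create() and so on) is made up for
illustration and does not correspond to the real subpool interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not kernel APIs. */
struct toy_pool {
	long free_pages;	/* preallocated pages not yet reserved by anyone */
};

struct toy_subpool {
	struct toy_pool *pool;
	long reserved;		/* pages set aside for this fd, not handed out */
};

/* Reserve nr pages up front (think: fd creation). No side effects on failure. */
static bool toy_subpool_create(struct toy_subpool *sp, struct toy_pool *pool,
			       long nr)
{
	if (nr < 0 || pool->free_pages < nr)
		return false;
	pool->free_pages -= nr;
	sp->pool = pool;
	sp->reserved = nr;
	return true;
}

/* Hand out one reserved page. There is no surplus fallback in this model. */
static bool toy_subpool_get_page(struct toy_subpool *sp)
{
	if (sp->reserved == 0)
		return false;
	sp->reserved--;
	return true;
}

/* Give a page back to the reservation it came from. */
static void toy_subpool_put_page(struct toy_subpool *sp)
{
	sp->reserved++;
}

/* Return the remaining reservation to the pool (think: fd release). */
static void toy_subpool_destroy(struct toy_subpool *sp)
{
	sp->pool->free_pages += sp->reserved;
	sp->reserved = 0;
}
```

In this model, a creation request that the pool cannot back fails
immediately instead of failing later at fault time, and a subpool that
has handed out its whole reservation simply refuses further requests.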

>> The last reason to use HugeTLB is not because of any inherent advantage
>> of using HugeTLB over other sources of huge pages, but for
>> administrative/scheduling purposes:
>>
>>   Given that existing non-guest_memfd workloads are already using
>>   HugeTLB, for optimal scheduling, machine memory is already carved up
>>   in HugeTLB pages for these workloads. Workloads that require using
>>   guest_memfd (like Confidential VMs) must also use HugeTLB to
>>   participate in optimal workload scheduling across machines.
>>
>>
>> [...snip...]
>>
>> On the other hand, reintroducing the charging protocol has the benefit
>> of avoiding allocations (not just dequeuing, if surplus HugeTLB pages
>> are required) if the memcg limit is hit. Also, if the original reason
>> for removing the protocol was to simplify the code, refactoring out
>> hugetlb_alloc_folio() also simplifies the code, and I think it's
>> actually nice that memcg charging is done the same way as the other two
>> (h_cg and h_cg_rsvd charging). After hugetlb_alloc_folio() is refactored
>> out, the gotos make all three charging systems consistent and symmetric,
>> which I think is nice to have :)
>>
>> I hope the consistent/symmetric charging among all 3 systems is welcome,
>> what do you think?
>
> For the hugetlbfs case, the path to allocate a hugeTLB page on demand
> makes sense, so I definitely see the argument for avoiding allocations.
> Does guest_memfd also have a path to allocate a hugeTLB page outside of
> the boottime reservations? In that case I think it would be nice to
> clarify that the allocation failure case optimization is also for
> guest_memfd, not only for hugetlbfs.
>

For now, guest_memfd actually doesn't want to use surplus pages, so
guest_memfd won't be allocating pages outside of boottime
reservations.

> Symmetric charging is definitely welcome :-) All of your reasons make
> sense to me, I just wanted to ask and make sure.
>

This change is mostly for (an alternate form of) simplicity :)
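
To illustrate what I mean by consistent/symmetric, here is a toy
userspace sketch of three charges taken in sequence and unwound with
gotos in reverse order. It is not the code from the series: toy_counter,
toy_try_charge(), toy_cancel_charge() and toy_alloc_folio() are
hypothetical names, the order of the charges is arbitrary, and the
conditions under which each charge is really taken are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy counter standing in for one charging system (hypothetical). */
struct toy_counter {
	long usage;
	long limit;
};

/* "try": charge nr units if that stays within the limit. */
static bool toy_try_charge(struct toy_counter *c, long nr)
{
	if (c->usage + nr > c->limit)
		return false;
	c->usage += nr;
	return true;
}

/* "cancel": undo a charge taken by toy_try_charge(). */
static void toy_cancel_charge(struct toy_counter *c, long nr)
{
	c->usage -= nr;
}

/*
 * Charge all three counters before touching the pool. Returns 0 on
 * success and -1 on failure; on failure every counter and the pool are
 * left exactly as they were.
 */
static int toy_alloc_folio(struct toy_counter *memcg,
			   struct toy_counter *h_cg,
			   struct toy_counter *h_cg_rsvd,
			   long *pool_free, long nr)
{
	if (!toy_try_charge(memcg, nr))
		goto out;
	if (!toy_try_charge(h_cg, nr))
		goto out_cancel_memcg;
	if (!toy_try_charge(h_cg_rsvd, nr))
		goto out_cancel_h_cg;

	/* Only now take a folio from the (toy) pool. */
	if (*pool_free < 1)
		goto out_cancel_h_cg_rsvd;
	(*pool_free)--;

	/* "commit": in this toy the charges simply stay in place. */
	return 0;

out_cancel_h_cg_rsvd:
	toy_cancel_charge(h_cg_rsvd, nr);
out_cancel_h_cg:
	toy_cancel_charge(h_cg, nr);
out_cancel_memcg:
	toy_cancel_charge(memcg, nr);
out:
	return -1;
}
```

Each charge has exactly one matching cancel label, and a limit hit on
the first counter returns before the pool is touched at all.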

> Thanks for your thoughts! I hope you have a great day!!
> Joshua


Thread overview: 15+ messages
2026-02-12  0:37 Ackerley Tng
2026-02-12  0:37 ` [RFC PATCH v1 1/7] mm: hugetlb: Consolidate interpretation of gbl_chg within alloc_hugetlb_folio() Ackerley Tng
2026-02-25 20:27   ` Joshua Hahn
2026-02-12  0:37 ` [RFC PATCH v1 2/7] mm: hugetlb: Move mpol interpretation out of alloc_buddy_hugetlb_folio_with_mpol() Ackerley Tng
2026-02-25 18:51   ` James Houghton
2026-02-12  0:37 ` [RFC PATCH v1 3/7] mm: hugetlb: Move mpol interpretation out of dequeue_hugetlb_folio_vma() Ackerley Tng
2026-02-25 19:57   ` James Houghton
2026-02-12  0:37 ` [RFC PATCH v1 4/7] Revert "memcg/hugetlb: remove memcg hugetlb try-commit-cancel protocol" Ackerley Tng
2026-02-12  0:37 ` [RFC PATCH v1 5/7] mm: hugetlb: Adopt memcg try-commit-cancel protocol Ackerley Tng
2026-02-12  0:37 ` [RFC PATCH v1 6/7] mm: memcontrol: Remove now-unused function mem_cgroup_charge_hugetlb Ackerley Tng
2026-02-12  0:37 ` [RFC PATCH v1 7/7] mm: hugetlb: Refactor out hugetlb_alloc_folio() Ackerley Tng
2026-02-25 20:24 ` [RFC PATCH v1 0/7] Open HugeTLB allocation routine for more generic use Joshua Hahn
2026-02-26  3:37   ` Ackerley Tng
2026-02-26 18:08     ` Joshua Hahn
2026-03-09  6:58       ` Ackerley Tng [this message]
