* [RFC PATCH 0/3] Reduce dependence on vmas deep in hugetlb allocation code
From: Ackerley Tng @ 2024-10-11 23:22 UTC (permalink / raw)
  To: muchun.song, peterx, akpm, rientjes, fvdl, jthoughton, david
  Cc: isaku.yamahata, zhiquan1.li, fan.du, jun.miao, tabba,
	quic_eberman, roypat, jgg, jhubbard, seanjc, pbonzini,
	erdemaktas, vannapurve, ackerleytng, pgonda, linux-kernel,
	linux-mm

I hope these 3 patches can start a discussion on eventually removing
the need to pass a struct vm_area_struct (vma) pointer when taking a
folio from the global pool (i.e. in dequeue_hugetlb_folio_vma()).

Why eliminate passing the struct vma pointer?

VMAs are more related to mapping into userspace, and it would be cleaner if the
HugeTLB folio allocation process could just focus on returning a folio.

Currently, the vma is a convenient container for the pieces of information
required in the allocation process, but dequeuing should not depend on the VMA
concept.

If the vma is needed deep in the allocation process, then allocation could
become awkward, such as in HugeTLBfs's fallocate, where there is no vma (yet)
and a pseudo-vma has to be created.

Separation will also help with HugeTLB unification. Taking the buddy allocator
as a reference, __alloc_pages_noprof() is conceptually separate from VMAs.
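
To make the coupling concrete, here is a minimal userspace sketch of the idea.
It is a toy model, not kernel code: struct toy_pool, struct toy_vma,
toy_dequeue() and toy_dequeue_vma() are hypothetical names invented for this
illustration, and the accounting is deliberately simplified compared with
mm/hugetlb.c. It only shows that once the "may this caller use a reservation?"
decision is passed in explicitly, the dequeue step no longer needs a vma.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only, NOT the kernel's definitions. */
struct toy_pool {
	long free_folios;	/* folios currently in the global pool */
	long resv_folios;	/* how many of those are promised to reservations */
};

struct toy_vma {
	bool has_reservation;	/* assumed to be derived from vma state */
};

/*
 * Decoupled dequeue: the caller states whether it may consume a reserved
 * folio, so nothing here needs to know what a vma is.
 */
static bool toy_dequeue(struct toy_pool *pool, bool use_resv)
{
	if (pool->free_folios == 0)
		return false;

	/* Without a reservation, only unreserved folios may be taken. */
	if (!use_resv && pool->free_folios - pool->resv_folios <= 0)
		return false;

	pool->free_folios--;
	if (use_resv && pool->resv_folios > 0)
		pool->resv_folios--;
	return true;
}

/*
 * vma-coupled wrapper: callers that have a vma resolve the policy here,
 * at the edge, and then call the vma-free helper.
 */
static bool toy_dequeue_vma(struct toy_pool *pool, const struct toy_vma *vma)
{
	assert(vma != NULL);	/* a vma-less caller would have to fake one */
	return toy_dequeue(pool, vma->has_reservation);
}
```

In this model, a caller that has no vma (such as the fallocate case mentioned
above) would call toy_dequeue() directly instead of building a pseudo-vma just
to reach the pool.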

I started looking into this because we want to use HugeTLB folios in guest_memfd
[1], and then I found that the HugeTLB folio allocation process is tightly
coupled with VMAs. This makes it hard to use HugeTLB folios in guest_memfd,
which does not have VMAs for private pages.

Then, I watched Peter Xu's talk at LSFMM [2] about HugeTLB unification and
thought that these patches could also contribute to the unification effort.

As discussed at LPC 2024 [3], the general preference is for guest_memfd to use
HugeTLB folios. While that is being worked out, I hope these patches can be
considered and merged separately. I believe the patches are still useful in
making the resv_map/subpool/hstate reservation system in HugeTLB easier to
understand, and no functional changes are intended.

---

Why use HugeTLB folios in guest_memfd?

HugeTLB is *the* source of 1G pages in the kernel today and it would be best for
all 1G page users (HugeTLB, HugeTLBfs, or guest_memfd) on a host to draw from
the same pool of 1G pages.

This allows central tracking of all 1G pages, a precious resource on a machine.

Having a separate 1G page allocator would not only require rebuilding
of features that HugeTLB has, but also cause a split 1G pool. If both
allocators are used on a machine, it would be complicated to

(a) predetermine how many pages to put in each allocator's pool or
(b) transfer pages between the pools at runtime.

---

[1] https://lore.kernel.org/all/cover.1726009989.git.ackerleytng@google.com/T/
[2] https://youtu.be/7k-m2gTDu2k?si=ghWZ6qa1GAdaHOFP
[3] https://youtu.be/PVTjLLEpozE?si=HvdDlUc_4ElVXu5R

Ackerley Tng (3):
  mm: hugetlb: Simplify logic in dequeue_hugetlb_folio_vma()
  mm: hugetlb: Refactor vma_has_reserves() to should_use_hstate_resv()
  mm: hugetlb: Remove unnecessary check for avoid_reserve

 mm/hugetlb.c | 57 +++++++++++++++++++++-------------------------------
 1 file changed, 23 insertions(+), 34 deletions(-)

--
2.47.0.rc1.288.g06298d1525-goog



Thread overview: 9+ messages
2024-10-11 23:22 [RFC PATCH 0/3] Reduce dependence on vmas deep in hugetlb allocation code Ackerley Tng
2024-10-11 23:22 ` [RFC PATCH 1/3] mm: hugetlb: Simplify logic in dequeue_hugetlb_folio_vma() Ackerley Tng
2024-10-30 14:31   ` Sean Christopherson
2024-11-05 17:10   ` Peter Xu
2024-11-06 10:13   ` Oscar Salvador
2024-10-11 23:22 ` [RFC PATCH 2/3] mm: hugetlb: Refactor vma_has_reserves() to should_use_hstate_resv() Ackerley Tng
2024-11-05 18:46   ` Peter Xu
2024-11-11  9:19     ` Oscar Salvador
2024-10-11 23:22 ` [RFC PATCH 3/3] mm: hugetlb: Remove unnecessary check for avoid_reserve Ackerley Tng
