From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Usama Arif <usamaarif642@gmail.com>
Cc: ziy@nvidia.com, Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
linux-mm@kvack.org, hannes@cmpxchg.org, riel@surriel.com,
shakeel.butt@linux.dev, kas@kernel.org, baohua@kernel.org,
dev.jain@arm.com, baolin.wang@linux.alibaba.com,
npache@redhat.com, Liam.Howlett@oracle.com, ryan.roberts@arm.com,
vbabka@suse.cz, lance.yang@linux.dev,
linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [RFC 00/12] mm: PUD (1GB) THP implementation
Date: Wed, 4 Feb 2026 11:08:51 +0000
Message-ID: <b07a8d54-75f2-4d77-838a-7454ad559cd2@lucifer.local>
In-Reply-To: <2efaa5ed-bd09-41f0-9c07-5cd6cccc4595@gmail.com>

On Tue, Feb 03, 2026 at 05:00:10PM -0800, Usama Arif wrote:
>
>
> On 02/02/2026 03:20, Lorenzo Stoakes wrote:
> > OK so this is somewhat unexpected :)
> >
> > It would have been nice to discuss it in the THP cabal or at a conference
> > etc. so we could discuss approaches ahead of time. Communication is important,
> > especially with major changes like this.
>
> Makes sense!
>
> >
> > And PUD THP is especially problematic in that it requires pages that the page
> > allocator can't give us, presumably you're doing something with CMA and... it's
> > a whole kettle of fish.
>
> So we don't need CMA. It helps of course, but we don't *need* it.
> It's summarized in the first reply I gave to Zi in [1]:
>
> >
> > It's also complicated by the fact we _already_ support it in the DAX, VFIO cases
> > but it's kinda a weird sorta special case that we need to keep supporting.
> >
> > There's questions about how this will interact with khugepaged, MADV_COLLAPSE,
> > mTHP (and really I want to see Nico's series land before we really consider
> > this).
>
>
> So I have numbers and experiments for page faults which are in the cover letter,
> but not for khugepaged. I would be very surprised (although pleasantly :)) if
> khugepaged by some magic finds 262144 pages that meet all the khugepaged requirements
> to collapse the page. In the basic infrastructure support which this series is adding,
> I want to keep khugepaged collapse disabled for 1G pages. This is also the initial
> approach that was taken for other mTHP sizes. We should go slow with 1G THPs.

Yes, we definitely want to limit this to page faults for now.

But keep in mind that for that to be viable you'd surely need to update who gets
appropriate alignment in __get_unmapped_area()... I've not read far enough
through the series to see whether you update that, though!

I guess that'd be the sanest place to start: if an allocation _size_ is aligned
to 1 GB, then align the unmapped area _address_ to 1 GB for maximum chance of a
1 GB fault-in.
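
Very roughly, the rule I have in mind looks like the sketch below. This is
plain userspace C for illustration only, not kernel code: PUD_SIZE_SKETCH,
size_is_pud_aligned() and pud_align_candidate() are made-up names, the 1 GB
value assumes 4 KB base pages with a 1 GB PUD, and the real change would have
to live in the __get_unmapped_area() path and respect its existing limits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 1 GB PUD mappings (e.g. 4 KB base pages). Hypothetical name. */
#define PUD_SIZE_SKETCH (1ULL << 30)

/* True if the requested length is a non-zero multiple of 1 GB. */
static bool size_is_pud_aligned(uint64_t len)
{
	return len != 0 && (len & (PUD_SIZE_SKETCH - 1)) == 0;
}

/*
 * Given a candidate start address for a new mapping, round it up to a
 * 1 GB boundary, but only when the length is itself 1 GB aligned.
 * Otherwise return the candidate unchanged.
 *
 * The sketch ignores overflow near the top of the address space and
 * does not check that the rounded-up range is actually free.
 */
static uint64_t pud_align_candidate(uint64_t candidate, uint64_t len)
{
	if (!size_is_pud_aligned(len))
		return candidate;

	return (candidate + PUD_SIZE_SKETCH - 1) & ~(PUD_SIZE_SKETCH - 1);
}
```

So a 1 GB request with a candidate of 0x7f0000001000 would be moved up to
0x7f0040000000, while a 2 MB request would keep its candidate address.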

Oh, by the way, I made some rough THP notes at
https://publish.obsidian.md/mm/Transparent+Huge+Pages+(THP) which are helpful
for reminding me what does what where, and useful for a top-down view of how
things are now.

>
> >
> > So overall, I want to be very cautious and SLOW here. So let's please not drop
> > the RFC tag until David and I are ok with that?
> >
> > Also the THP code base is in _dire_ need of rework, and I don't really want to
> > add major new features without us paying down some technical debt, to be honest.
> >
> > So let's proceed with caution, and treat this as a very early bit of
> > experimental code.
> >
> > Thanks, Lorenzo
>
> Ack, yeah, so this is mainly an RFC to discuss what the major design choices will be.
> I got a kernel with selftests for allocation, memory integrity, fork, partial munmap,
> mprotect, reclaim and migration passing, and am running them with DEBUG_VM to make sure
> we don't get VM bugs/warnings and the numbers are good, so just wanted to share it
> upstream and get your opinions! Basically try and trigger a discussion similar to what
> Zi asked in [2]! And also if someone could point out if there is something fundamental
> we are missing in this series.

Well, that's fair enough :)

But do come to a THP cabal so we can chat face-to-face (ok, digital face to
digital face ;). I find it's usually a force-multiplier, esp. if multiple people
have input, which I think is the case here. We're friendly :)

In any case, conversations are already kicking off, so that's definitely positive!

I think we will definitely get there with this at _some point_, but I would urge
patience, and I also really want to underline my desire for us in THP to start
paying down some of this technical debt.

I know people are already making efforts (Vernon, Luiz), and sorry that I've not
been great at review recently (should be gradually increasing over time), but I
feel that for large features like this to be added now, we really do require
some refactoring work before we take it.

We definitely need to rebase this once Nico's series lands (it should do next
cycle) and think about how the two play together. I'm not sure if arm64 supports
mTHP between PMD and PUD size (Dev? Do you know?), so maybe that one is moot, but
in general I want to make sure it plays nice.

>
> Thanks for the reviews! Really do appreciate it!

No worries! :)

>
> [1] https://lore.kernel.org/all/20f92576-e932-435f-bb7b-de49eb84b012@gmail.com/#t
> [2] https://lore.kernel.org/all/3561FD10-664D-42AA-8351-DE7D8D49D42E@nvidia.com/

Cheers, Lorenzo