From: Dev Jain <dev.jain@arm.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Usama Arif <usamaarif642@gmail.com>
Cc: ziy@nvidia.com, Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	linux-mm@kvack.org, hannes@cmpxchg.org, riel@surriel.com,
	shakeel.butt@linux.dev, kas@kernel.org, baohua@kernel.org,
	baolin.wang@linux.alibaba.com, npache@redhat.com,
	Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
	lance.yang@linux.dev, linux-kernel@vger.kernel.org,
	kernel-team@meta.com
Subject: Re: [RFC 00/12] mm: PUD (1GB) THP implementation
Date: Wed, 4 Feb 2026 17:31:03 +0530
Message-ID: <0dff358a-9308-4ef4-b3d8-aa5f9ab3dcd9@arm.com>
In-Reply-To: <a9e0a6bd-46b5-41f8-be50-03300f938bf2@arm.com>


On 04/02/26 5:20 pm, Dev Jain wrote:
> On 04/02/26 4:38 pm, Lorenzo Stoakes wrote:
>> On Tue, Feb 03, 2026 at 05:00:10PM -0800, Usama Arif wrote:
>>> On 02/02/2026 03:20, Lorenzo Stoakes wrote:
>>>> OK so this is somewhat unexpected :)
>>>>
>>>> It would have been nice to discuss it in the THP cabal or at a conference
>>>> etc. so we could discuss approaches ahead of time. Communication is important,
>>>> especially with major changes like this.
>>> Makes sense!
>>>
>>>> And PUD THP is especially problematic in that it requires pages that the page
>>>> allocator can't give us, presumably you're doing something with CMA and... it's
>>>> a whole kettle of fish.
>>> So we don't need CMA. It helps of course, but we don't *need* it.
>>> It's summarized in the first reply I gave to Zi in [1]:
>>>
>>>> It's also complicated by the fact we _already_ support it in the DAX, VFIO cases
>>>> but it's kinda a weird sorta special case that we need to keep supporting.
>>>>
>>>> There's questions about how this will interact with khugepaged, MADV_COLLAPSE,
>>>> mTHP (and really I want to see Nico's series land before we really consider
>>>> this).
>>> So I have numbers and experiments for page faults which are in the cover letter,
>>> but not for khugepaged. I would be very surprised (although pleasantly :)) if
>>> khugepaged by some magic finds 262144 pages that meet all the khugepaged requirements
>>> to collapse the page. In the basic infrastructure support which this series is adding,
>>> I want to keep khugepaged collapse disabled for 1G pages. This is also the initial
>>> approach that was taken in other mTHP sizes. We should go slow with 1G THPs.
>> Yes we definitely want to limit to page faults for now.
>>
>> But keep in mind for that to be viable you'd surely need to update who gets
>> appropriate alignment in __get_unmapped_area()... not read through series far
>> enough to see so not sure if you update that though!
>>
>> I guess that'd be the sanest place to start, if an allocation _size_ is aligned
>> 1 GB, then align the unmapped area _address_ to 1 GB for maximum chance of 1 GB
>> fault-in.
>>
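
To make the alignment idea above concrete, here is a minimal userspace sketch,
not kernel code. The helper names (len_wants_pud_align, align_up, pick_addr)
are hypothetical and only illustrate the rule "if the length is a multiple of
1 GB, round the candidate address up to a 1 GB boundary". It assumes a 1 GB
PUD size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_1G (1ULL << 30)   /* assumed PUD size */

/* Hypothetical helper: true if the mapping length is a non-zero multiple of 1 GB. */
static bool len_wants_pud_align(uint64_t len)
{
	return len >= SZ_1G && (len & (SZ_1G - 1)) == 0;
}

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical policy: given a candidate start address from some
 * unmapped-area search, align it to 1 GB only when the length qualifies.
 */
static uint64_t pick_addr(uint64_t candidate, uint64_t len)
{
	return len_wants_pud_align(len) ? align_up(candidate, SZ_1G) : candidate;
}
```

A real implementation would also have to make sure the aligned range is
actually free, for example by searching for a padded length before rounding
up. This sketch only shows the address arithmetic.
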
>> Oh by the way I made some rough THP notes at
>> https://publish.obsidian.md/mm/Transparent+Huge+Pages+(THP) which are helpful
>> for reminding me about what does what where, useful for a top-down view of how
>> things are now.
>>
>>>> So overall, I want to be very cautious and SLOW here. So let's please not drop
>>>> the RFC tag until David and I are ok with that?
>>>>
>>>> Also the THP code base is in _dire_ need of rework, and I don't really want to
>>>> add major new features without us paying down some technical debt, to be honest.
>>>>
>>>> So let's proceed with caution, and treat this as a very early bit of
>>>> experimental code.
>>>>
>>>> Thanks, Lorenzo
>>> Ack, yeah so this is mainly an RFC to discuss what the major design choices will be.
>>> I got a kernel with selftests for allocation, memory integrity, fork, partial munmap,
>>> mprotect, reclaim and migration passing and am running them with DEBUG_VM to make sure
>>> we don't get the VM bugs/warnings and the numbers are good, so just wanted to share it
>>> upstream and get your opinions! Basically try and trigger a discussion similar to what
>>> Zi asked in [2]! And also if someone could point out if there is something fundamental
>>> we are missing in this series.
>> Well that's fair enough :)
>>
>> But do come to a THP cabal so we can chat, face-to-face (ok, digital face to
>> digital face ;). It's usually a force-multiplier I find, esp. if multiple people
>> have input which I think is the case here. We're friendly :)
>>
>> In any case, conversations are already kicking off so that's definitely positive!
>>
>> I think we will definitely get there with this at _some point_ but I would urge
>> patience and also I really want to underline my desire for us in THP to start
>> paying down some of this technical debt.
>>
>> I know people are already making efforts (Vernon, Luiz), and sorry that I've not
>> been great at review recently (should be gradually increasing over time), but I
>> feel that for large features to be added like this now we really do require some
>> refactoring work before we take it.
>>
>> We definitely need to rebase this once Nico's series lands (should do next
>> cycle) and think about how it plays with this, I'm not sure if arm64 supports
>> mTHP between PMD and PUD size (Dev? Do you know?) so maybe that one is moot, but
> arm64 does support cont mappings at the PMD level. Currently, they are supported
> for kernel pagetables, and hugetlbpages. You may search around for "CONT_PMD" in
> the codebase. Hence it only supports cont PMD in the "static" case; there is
> no dynamic folding/unfolding of the cont bit at the PMD level, which mTHP requires.
>
> I see that this patchset splits PUD all the way down to PTEs. If we were to split
> it down to PMD, and add arm64 support for dynamic cont mappings at the PMD level,
> it will be nicer. But I guess there is some mapcount/rmap stuff involved
> here stopping us from doing that :(

Hmm, this won't make a difference w.r.t. cont PMD. If we were to split the PUD
folio down to PMD folios, we won't get cont PMD. But yes, in general PMD
mappings are nicer.
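
To put rough numbers on the split granularity discussed here, a small sketch
of the page table arithmetic. It assumes 4K base pages with 2M PMD and 1G PUD
mappings, and it assumes a cont PMD run of 16 entries (my understanding of the
arm64 4K granule case; ASSUMED_CONT_PMDS is an illustrative name, not the
kernel macro). The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12          /* assumed 4K base pages */
#define PMD_SHIFT  21          /* 2M per PMD entry */
#define PUD_SHIFT  30          /* 1G per PUD entry */
#define ASSUMED_CONT_PMDS 16   /* assumption: arm64, 4K granule */

/* Number of lower-level entries needed to cover one higher-level mapping. */
static uint64_t entries_between(unsigned int hi_shift, unsigned int lo_shift)
{
	assert(hi_shift >= lo_shift);
	return 1ULL << (hi_shift - lo_shift);
}

/* Bytes covered by one contiguous run of PMD entries under the assumption above. */
static uint64_t cont_pmd_bytes(void)
{
	return (uint64_t)ASSUMED_CONT_PMDS << PMD_SHIFT;
}
```

Under these assumptions, splitting one PUD mapping to PTEs yields 262144
entries (the same figure quoted earlier for khugepaged), while splitting to
PMDs would yield 512. A cont PMD run would cover 32M, which sits between the
2M and 1G sizes.
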

>
>> in general want to make sure it plays nice.
>>
>>> Thanks for the reviews! Really do appreciate it!
>> No worries! :)
>>
>>> [1] https://lore.kernel.org/all/20f92576-e932-435f-bb7b-de49eb84b012@gmail.com/#t
>>> [2] https://lore.kernel.org/all/3561FD10-664D-42AA-8351-DE7D8D49D42E@nvidia.com/
>> Cheers, Lorenzo
>>

