From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Kiryl Shutsemau <kas@kernel.org>, Matthew Wilcox <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>,
Oscar Salvador <osalvador@suse.de>,
Mike Rapoport <rppt@kernel.org>, Vlastimil Babka <vbabka@suse.cz>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Zi Yan <ziy@nvidia.com>, Baoquan He <bhe@redhat.com>,
Michal Hocko <mhocko@suse.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Jonathan Corbet <corbet@lwn.net>,
kernel-team@meta.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Usama Arif <usamaarif642@gmail.com>,
Frank van der Linden <fvdl@google.com>
Subject: Re: [PATCHv2 02/14] mm/sparse: Check memmap alignment
Date: Thu, 8 Jan 2026 00:08:35 +0100
Message-ID: <2ace6fc2-6891-4d6c-98de-c027da03d516@kernel.org>
In-Reply-To: <glu3noshgeh7ktwwqofk7xcwkvhek2x3hrbdmyyo56gmctdx3t@adsfih557p7g>
>> "Then we make page->compound_head point to the dynamically allocated memdesc
>> rather than the first page. Then we can transition to the above layout. "
>
Sorry for the late reply, it's been a bit crazy over here.
> I am not sure I understand how it is going to work.
>
I don't recall all the details that Willy shared over the last years
while working on folios, but I will try to answer as best as I can off
the top of my head (there are plenty of resources on the list, on the
web, in his presentations, etc.).
> 32-byte layout indicates that flags will stay in the statically
> allocated part, but most (all?) flags are in the head page and we would
> need a way to redirect from tail to head in the statically allocated
> pages.
When working with folios we will never go through the head page flags.
That's why Willy has incrementally converted most folio code that worked
on pages to work on folios.
For example, PageUptodate() does a
folio_test_uptodate(page_folio(page));
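As a rough illustration of that pattern, here is a userspace toy model in C.
It is not the kernel implementation: all toy_* names and the flag bit are made
up for the sketch, and it only assumes today's layout, where a tail page
stores the head page's address with bit 0 set and the folio overlays the head
page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for illustration only, NOT the kernel definitions. */
struct toy_page {
	unsigned long flags;     /* only meaningful on the head page */
	uintptr_t compound_head; /* tail pages: address of head page | 1 */
};

/* In this model the folio simply overlays the head page. */
struct toy_folio {
	struct toy_page page;
};

#define TOY_PG_UPTODATE (1UL << 0) /* hypothetical bit position */

/* Resolve any page (head or tail) to the folio it belongs to. */
static struct toy_folio *toy_page_folio(struct toy_page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1) /* tail page: follow the link to the head */
		return (struct toy_folio *)(head - 1);
	return (struct toy_folio *)page;
}

static bool toy_folio_test_uptodate(const struct toy_folio *folio)
{
	return (folio->page.flags & TOY_PG_UPTODATE) != 0;
}

/* Page-based wrapper: never reads the flags of the page it was handed. */
static bool toy_page_uptodate(struct toy_page *page)
{
	return toy_folio_test_uptodate(toy_page_folio(page));
}

/* Build a 4-page compound page and mark it uptodate via the head. */
static void toy_demo_setup(struct toy_page pages[4])
{
	for (int i = 0; i < 4; i++) {
		pages[i].flags = 0;
		pages[i].compound_head = i ? ((uintptr_t)&pages[0] | 1) : 0;
	}
	pages[0].flags |= TOY_PG_UPTODATE;
}
```

Calling toy_page_uptodate() on any of the four pages gives the same answer,
because the tail pages' own flags words are never consulted.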
The flags in the 32-byte layout will be used by some non-folio things
for which we won't allocate memdescs (just yet) (e.g., free pages in the
buddy and other things that do not require a lot of metadata). Some of
these flags will be moved into the memdesc pointer in the future as the
conversion proceeds.
>
>> The "memdesc" could be a pointer to a "struct folio" that is allocated from
>> the slab.
>>
>> So in the new memdesc world, all pages part of a folio will point at the
>> allocated "struct folio", not the head page where "struct folio" currently
>> overlays "struct page".
>>
>> That would mean that the proposal in this patch set will have to be reverted
>> again.
>>
>>
>> At LPC, Willy said that he wants to have something out there in the first
>> half of 2026.
>
> Okay, seems ambitious to me.
When the program was called "2025" I considered it very ambitious :) Now
I consider it ambitious. I think Willy already shared early versions of
the "struct slab" split and the "struct ptdesc" split recently on the list.
>
> Last time I asked, we had no idea how much performance would additional
> indirection cost us. Do we have a clue?
I raised that in the past, and I think the answer I got was that
(a) We always had this indirection cost when going from tail page to
head page / folio.
(b) We must convert the code to do as little page_folio() as possible.
That's why we saw so much code conversion to stop working on pages
and only work on folios.
There are certainly cases where we cannot currently avoid the
indirection, like when we traverse a page table and go
pfn -> page -> folio
and cannot simply go
pfn -> folio
On the bright side, we'll lose the head-page checks and can simply
dereference the pointer.
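To make that last point a bit more concrete, a minimal userspace sketch that
contrasts the two lookups. This is a toy model under loud assumptions: the
toy_* types, a plain "memdesc" pointer without any type bits encoded in it,
and both fields living side by side in one struct are illustration only, not
Willy's actual design.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, NOT kernel code. */
struct toy_folio_desc {
	unsigned long flags;
};

struct toy_page {
	/* Today's style: tail pages hold (head address | 1), head pages 0. */
	uintptr_t compound_head;
	/* Memdesc style (assumed): every page points at the descriptor. */
	struct toy_folio_desc *memdesc;
};

/* Today's style: test for "am I a tail?" before following the link. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct toy_page *)(head - 1);
	return page;
}

/* Memdesc style: head and tail pages are handled identically, no check. */
static struct toy_folio_desc *toy_page_memdesc(const struct toy_page *page)
{
	return page->memdesc;
}

/* Set up nr pages as one compound page sharing a single descriptor. */
static void toy_init_compound(struct toy_page *pages, int nr,
			      struct toy_folio_desc *desc)
{
	for (int i = 0; i < nr; i++) {
		pages[i].compound_head = i ? ((uintptr_t)&pages[0] | 1) : 0;
		pages[i].memdesc = desc;
	}
}
```

Both lookups still cost one load from the page; the difference in this model
is only that the second one needs no test on the loaded value.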
I don't know whether Willy has more information yet, but I would assume
that in most cases this will be similar to the performance summary in
your cover letter: "... has shown either no change or only a slight
improvement within the noise.", just that it will be "only a slight
degradation within the noise". :)
We'll learn I guess, in particular which other page -> folio conversions
cannot be optimized out by caching the folio.
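What "caching the folio" could look like can be sketched in userspace C as
well. Everything here is a toy assumption: the toy_* names, the per-page folio
pointer, and a counter that stands in for the cost of each page -> folio
lookup. The cached variant is only valid if the caller already knows that all
pages in the batch belong to the same folio.

```c
#include <assert.h>

/* Toy model, NOT kernel code. */
struct toy_folio {
	unsigned long touched; /* some per-folio state we update */
};

struct toy_page {
	struct toy_folio *folio; /* assumed per-page pointer to the folio */
};

static unsigned long toy_lookups; /* counts page -> folio indirections */

static struct toy_folio *toy_page_folio(const struct toy_page *page)
{
	toy_lookups++;
	return page->folio;
}

/* Naive: resolve page -> folio again for every page in the batch. */
static void toy_touch_naive(const struct toy_page *pages, unsigned long nr)
{
	for (unsigned long i = 0; i < nr; i++)
		toy_page_folio(&pages[i])->touched++;
}

/*
 * Cached: resolve the folio once and reuse it.
 * The caller must guarantee that all nr pages belong to one folio.
 */
static void toy_touch_cached(const struct toy_page *pages, unsigned long nr)
{
	struct toy_folio *folio;

	if (!nr)
		return;
	folio = toy_page_folio(&pages[0]);
	folio->touched += nr;
}
```

For a batch of 16 pages in one folio, the naive variant performs 16 lookups
in this model and the cached variant performs one. The cases that cannot be
rewritten like this are the ones where the indirection cost stays.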
For quite some time there will be a magical config option that will
switch between both layouts. I'd assume that things will get more
complicated if we suddenly have a "compound_head/folio" pointer and a
"compound_info" pointer at the same time.
But it's really Willy who has the concept in mind, as he is very likely
busy writing some of that code right now.
I'm just the messenger.
:)
[I would hope that Willy could share his thoughts]
--
Cheers
David