From: Zi Yan <ziy@nvidia.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Rik van Riel <riel@surriel.com>,
linux-kernel@vger.kernel.org, Johannes Weiner <jweiner@meta.com>
Subject: Re: [RFC 1/2] mm: page_alloc: replace pageblock_flags bitmap with struct pageblock_data
Date: Sun, 19 Apr 2026 21:40:23 -0400
Message-ID: <20AA822C-F702-4EC4-B0A3-5C5CEB1D1952@nvidia.com>
In-Reply-To: <20260403194526.477775-2-hannes@cmpxchg.org>
On 3 Apr 2026, at 15:40, Johannes Weiner wrote:
> From: Johannes Weiner <jweiner@meta.com>
>
> Replace the packed pageblock_flags bitmap with a per-pageblock struct
> containing its own flags word. This changes the storage from
> NR_PAGEBLOCK_BITS bits per pageblock packed into shared unsigned longs,
> to a dedicated unsigned long per pageblock.
>
> The free path looks up migratetype (from pageblock flags) immediately
> followed by looking up pageblock ownership. Colocating them in a struct
> means this hot path touches one cache line instead of two.
>
> The per-pageblock struct also eliminates all the bit-packing indexing
> (pfn_to_bitidx, word selection, intra-word shifts), simplifying the
> accessor code.
>
> Memory overhead: 8 bytes per pageblock (one unsigned long). With 2MB
> pageblocks on x86_64, that's 4KB per GB -- up from ~0.5-1 bytes per
> pageblock with the packed bitmap, but still negligible in absolute terms.
>
> No functional change.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> include/linux/mmzone.h | 15 ++++----
> mm/internal.h | 17 +++++++++
> mm/mm_init.c | 25 ++++++-------
> mm/page_alloc.c | 81 ++++++------------------------------------
> mm/sparse.c | 3 +-
> 5 files changed, 48 insertions(+), 93 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 3e51190a55e4..2f202bda5ec6 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -916,7 +916,7 @@ struct zone {
> * Flags for a pageblock_nr_pages block. See pageblock-flags.h.
> * In SPARSEMEM, this map is stored in struct mem_section
> */
> - unsigned long *pageblock_flags;
> + struct pageblock_data *pageblock_data;
> #endif /* CONFIG_SPARSEMEM */
>
> /* zone_start_pfn == zone_start_paddr >> PAGE_SHIFT */
> @@ -1866,9 +1866,6 @@ static inline bool movable_only_nodes(nodemask_t *nodes)
> #define PAGES_PER_SECTION (1UL << PFN_SECTION_SHIFT)
> #define PAGE_SECTION_MASK (~(PAGES_PER_SECTION-1))
>
> -#define SECTION_BLOCKFLAGS_BITS \
> - ((1UL << (PFN_SECTION_SHIFT - pageblock_order)) * NR_PAGEBLOCK_BITS)
> -
> #if (MAX_PAGE_ORDER + PAGE_SHIFT) > SECTION_SIZE_BITS
> #error Allocator MAX_PAGE_ORDER exceeds SECTION_SIZE
> #endif
> @@ -1901,13 +1898,17 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
> #define SUBSECTION_ALIGN_UP(pfn) ALIGN((pfn), PAGES_PER_SUBSECTION)
> #define SUBSECTION_ALIGN_DOWN(pfn) ((pfn) & PAGE_SUBSECTION_MASK)
>
> +struct pageblock_data {
> + unsigned long flags;
Would it be better to make this uint32_t if !CONFIG_MEMORY_ISOLATION
and uint64_t otherwise? MIGRATE_ISOLATE is the only reason to have
an 8-byte pageblock flags word.
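
A minimal userspace sketch of the sizing idea above, not kernel code.
It assumes CONFIG_MEMORY_ISOLATION can be emulated with a -D compiler
flag; pageblock_flags_t and pageblock_bytes_per_gib() are hypothetical
names used only for illustration. The 2MB pageblock size is the x86_64
figure from the commit message.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/*
 * Assumption: stand-in for the kernel config option. Build with
 * -DCONFIG_MEMORY_ISOLATION to emulate an isolation-enabled kernel.
 */
#ifdef CONFIG_MEMORY_ISOLATION
typedef uint64_t pageblock_flags_t;	/* hypothetical typedef */
#else
typedef uint32_t pageblock_flags_t;
#endif

struct pageblock_data {
	pageblock_flags_t flags;
};

/* With a single member the struct should carry no padding. */
static_assert(sizeof(struct pageblock_data) == sizeof(pageblock_flags_t),
	      "unexpected padding in struct pageblock_data");

/* Metadata bytes needed per GiB of memory, assuming 2MB pageblocks. */
static unsigned long pageblock_bytes_per_gib(void)
{
	const unsigned long pageblocks_per_gib = (1UL << 30) / (2UL << 20);

	return pageblocks_per_gib * sizeof(struct pageblock_data);
}
```

With 512 pageblocks per GiB, the 4-byte variant would need 2KB per GiB
and the 8-byte variant 4KB per GiB. The saving disappears once the
struct gains another pointer-aligned member, since padding would then
restore the 8-byte footprint.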
<snip>
> -#ifdef CONFIG_MEMORY_ISOLATION
> - BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 8);
> -#else
> - BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 4);
> -#endif
We probably still need
BUILD_BUG_ON(NR_PAGEBLOCK_BITS > BITS_PER_BYTE * sizeof_field(struct pageblock_data, flags));
just in case we add too many pageblock bits in the future.
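
A userspace sketch of such a guard, assuming the intent is that all
pageblock bits must fit in the flags word. The comparison is done in
bits, since NR_PAGEBLOCK_BITS counts bits while sizeof() counts bytes.
C11 static_assert stands in for BUILD_BUG_ON here; the
NR_PAGEBLOCK_BITS values are the ones the removed checks expected, and
PAGEBLOCK_FLAGS_WIDTH and pageblock_bits_fit() are hypothetical names.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */

/*
 * Assumption: mirrors the values the removed BUILD_BUG_ON()s checked,
 * 8 with CONFIG_MEMORY_ISOLATION and 4 without.
 */
#ifdef CONFIG_MEMORY_ISOLATION
#define NR_PAGEBLOCK_BITS 8
#else
#define NR_PAGEBLOCK_BITS 4
#endif

struct pageblock_data {
	unsigned long flags;
};

/* Width of the flags member in bits (hypothetical helper macro). */
#define PAGEBLOCK_FLAGS_WIDTH \
	(CHAR_BIT * sizeof(((struct pageblock_data *)0)->flags))

/* Fails the build if the flag bits outgrow the flags member. */
static_assert(NR_PAGEBLOCK_BITS <= PAGEBLOCK_FLAGS_WIDTH,
	      "pageblock bits no longer fit in pageblock_data.flags");

/* Runtime twin of the check, for experimenting with other bit counts. */
static int pageblock_bits_fit(unsigned int nr_bits)
{
	return nr_bits <= PAGEBLOCK_FLAGS_WIDTH;
}
```

Checking against the flags member rather than the whole struct keeps
the guard meaningful if further fields are added to the struct later.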
Otherwise, this patch can be sent and merged separately.
--
Best Regards,
Yan, Zi