From: Lance Yang <lance.yang@linux.dev>
To: david@kernel.org, dev.jain@arm.com
Cc: catalin.marinas@arm.com, will@kernel.org,
akpm@linux-foundation.org, ljs@kernel.org,
Liam.Howlett@oracle.com, vbabka@kernel.org, rppt@kernel.org,
surenb@google.com, mhocko@suse.com, jackmanb@google.com,
hannes@cmpxchg.org, ziy@nvidia.com, lance.yang@linux.dev,
ryan.roberts@arm.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
stable@vger.kernel.org
Subject: Re: [PATCH] mm/page_alloc: fix initialization of tags of the huge zero folio with init_on_free
Date: Tue, 21 Apr 2026 17:03:06 +0800
Message-ID: <20260421090306.45979-1-lance.yang@linux.dev>
In-Reply-To: <94fdc376-9c43-4334-b293-20a54acbdc3a@kernel.org>

On Tue, Apr 21, 2026 at 10:06:54AM +0200, David Hildenbrand (Arm) wrote:
>On 4/21/26 09:06, Dev Jain wrote:
>>
>>
>> On 21/04/26 2:46 am, David Hildenbrand (Arm) wrote:
>>> __GFP_ZEROTAGS semantics are currently a bit weird, but effectively this
>>> flag is only ever set alongside __GFP_ZERO and __GFP_SKIP_KASAN.
>>>
>>> If we run with init_on_free, we will zero out pages during
>>> __free_pages_prepare(), to skip zeroing on the allocation path.
>>>
>>> However, when allocating with __GFP_ZEROTAGS set, post_alloc_hook() will
>>> consequently not only skip clearing page content, but also skip
>>> clearing tag memory.
>>>
>>> Not clearing tags through __GFP_ZEROTAGS is irrelevant for most pages that
>>> will get mapped to user space through set_pte_at() later: set_pte_at() and
>>> friends will detect that the tags have not been initialized yet
>>> (PG_mte_tagged not set), and initialize them.
>>>
>>> However, for the huge zero folio, which will be mapped through a PMD
>>> marked as special, this initialization will not be performed, ending up
>>> exposing whatever tags were still set for the pages.
>>>
>>> The docs (Documentation/arch/arm64/memory-tagging-extension.rst) state
>>> that allocation tags are set to 0 when a page is first mapped to user
>>> space. That no longer holds with the huge zero folio when init_on_free
>>> is enabled.
>>>
>>> Fix it by decoupling __GFP_ZEROTAGS from __GFP_ZERO, passing to
>>> tag_clear_highpages() whether we want to also clear page content.
>>>
>>> As we are touching the interface either way, just clean it up by
>>> only calling it when HW tags are enabled, dropping the return value, and
>>> dropping the common code stub.
>>>
>>> Reproduced with the huge zero folio by modifying the check_buffer_fill
>>> arm64/mte selftest to use a 2 MiB area, after making sure that pages have
>>> a non-0 tag set when freeing (note that, during boot, we will not
>>> actually initialize tags, but only set KASAN_TAG_KERNEL in the page
>>> flags).
>>>
>>> $ ./check_buffer_fill
>>> 1..20
>>> ...
>>> not ok 17 Check initial tags with private mapping, sync error mode and mmap memory
>>> not ok 18 Check initial tags with private mapping, sync error mode and mmap/mprotect memory
>>> ...
>>>
>>> This code needs more cleanups; we'll tackle that next, like
>>> decoupling __GFP_ZEROTAGS from __GFP_SKIP_KASAN, moving all the
>>> KASAN magic into a separate helper, and consolidating HW-tag handling.
>>>
>>> Fixes: adfb6609c680 ("mm/huge_memory: initialise the tags of the huge zero folio")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
>>> ---
>>> arch/arm64/include/asm/page.h | 3 ---
>>> arch/arm64/mm/fault.c | 16 +++++-----------
>>> include/linux/gfp_types.h | 10 +++++-----
>>> include/linux/highmem.h | 10 +---------
>>> mm/page_alloc.c | 12 +++++++-----
>>> 5 files changed, 18 insertions(+), 33 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>>> index e25d0d18f6d7..5c6cbfbbd34c 100644
>>> --- a/arch/arm64/include/asm/page.h
>>> +++ b/arch/arm64/include/asm/page.h
>>> @@ -33,9 +33,6 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>>> unsigned long vaddr);
>>> #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio
>>>
>>> -bool tag_clear_highpages(struct page *to, int numpages);
>>> -#define __HAVE_ARCH_TAG_CLEAR_HIGHPAGES
>>> -
>>> #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
>>>
>>> typedef struct page *pgtable_t;
>>> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
>>> index 0f3c5c7ca054..32a3723f2d34 100644
>>> --- a/arch/arm64/mm/fault.c
>>> +++ b/arch/arm64/mm/fault.c
>>> @@ -1018,21 +1018,15 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>>> return vma_alloc_folio(flags, 0, vma, vaddr);
>>> }
>>>
>>> -bool tag_clear_highpages(struct page *page, int numpages)
>>> +void tag_clear_highpages(struct page *page, int numpages, bool clear_pages)
>>> {
>>> - /*
>>> - * Check if MTE is supported and fall back to clear_highpage().
>>> - * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
>>> - * post_alloc_hook() will invoke tag_clear_highpages().
>>> - */
>>> - if (!system_supports_mte())
>>> - return false;
>>> -
>>> /* Newly allocated pages, shouldn't have been tagged yet */
>>> for (int i = 0; i < numpages; i++, page++) {
>>> WARN_ON_ONCE(!try_page_mte_tagging(page));
>>> - mte_zero_clear_page_tags(page_address(page));
>>> + if (clear_pages)
>>> + mte_zero_clear_page_tags(page_address(page));
>>> + else
>>> + mte_clear_page_tags(page_address(page));
>>> set_page_mte_tagged(page);
>>> }
>>> - return true;
>>> }
>>> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
>>> index 6c75df30a281..fd53a6fba33f 100644
>>> --- a/include/linux/gfp_types.h
>>> +++ b/include/linux/gfp_types.h
>>> @@ -273,11 +273,11 @@ enum {
>>> *
>>> * %__GFP_ZERO returns a zeroed page on success.
>>> *
>>> - * %__GFP_ZEROTAGS zeroes memory tags at allocation time if the memory itself
>>> - * is being zeroed (either via __GFP_ZERO or via init_on_alloc, provided that
>>> - * __GFP_SKIP_ZERO is not set). This flag is intended for optimization: setting
>>> - * memory tags at the same time as zeroing memory has minimal additional
>>> - * performance impact.
>>> + * %__GFP_ZEROTAGS zeroes memory tags at allocation time. This flag is intended
>>> + * for optimization: setting memory tags at the same time as zeroing memory
>>> + * (e.g., with __GFP_ZERO) has minimal additional performance impact. However,
>>> + * __GFP_ZEROTAGS also zeroes the tags even if memory is not getting zeroed at
>>> + * allocation time (e.g., with init_on_free).
>>> *
>>> * %__GFP_SKIP_KASAN makes KASAN skip unpoisoning on page allocation.
>>> * Used for userspace and vmalloc pages; the latter are unpoisoned by
>>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>>> index af03db851a1d..62f589baa343 100644
>>> --- a/include/linux/highmem.h
>>> +++ b/include/linux/highmem.h
>>> @@ -345,15 +345,7 @@ static inline void clear_highpage_kasan_tagged(struct page *page)
>>> kunmap_local(kaddr);
>>> }
>>>
>>> -#ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGES
>>> -
>>> -/* Return false to let people know we did not initialize the pages */
>>> -static inline bool tag_clear_highpages(struct page *page, int numpages)
>>> -{
>>> - return false;
>>> -}
>>> -
>>> -#endif
>>> +void tag_clear_highpages(struct page *to, int numpages, bool clear_pages);
>>>
>>> /*
>>> * If we pass in a base or tail page, we can zero up to PAGE_SIZE.
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 65e205111553..8c6821d25a00 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -1808,9 +1808,9 @@ static inline bool should_skip_init(gfp_t flags)
>>> inline void post_alloc_hook(struct page *page, unsigned int order,
>>> gfp_t gfp_flags)
>>> {
>>> + const bool zero_tags = kasan_hw_tags_enabled() && (gfp_flags & __GFP_ZEROTAGS);
>>
>> Sashiko:
>>
>> https://sashiko.dev/#/patchset/20260420-zerotags-v1-1-3edc93e95bb4%40kernel.org
>>
>> PROT_MTE works without KASAN_HW_TAGS, so probably just retain the
>> system_supports_mte() check in tag_clear_highpages(), and document
>> that GFP_ZEROTAGS is only for MTE?
>
>Right, we have to clear tags here even without kasan. God, what an ugly
>mess people created here with these GFP flags.

Yeah, with kasan=off, kasan_init_hw_tags() returns early, so
kasan_hw_tags_enabled() stays false and tag_clear_highpages() is still
skipped.

With the small debug change below, it still reproduces reliably:

---8<---
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 970e077019b7..d5b6e2474f47 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -225,8 +225,7 @@ static bool get_huge_zero_folio(void)
 	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
 		return true;
 
-	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS) &
-				 ~__GFP_MOVABLE,
+	zero_folio = folio_alloc(GFP_TRANSHUGE | __GFP_ZERO | __GFP_ZEROTAGS,
 			HPAGE_PMD_ORDER);
 	if (!zero_folio) {
 		count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED);
---
Cheers,
Lance

Thread overview: 5+ messages
2026-04-20 21:16 David Hildenbrand (Arm)
2026-04-21 7:06 ` Dev Jain
2026-04-21 8:06 ` David Hildenbrand (Arm)
2026-04-21 9:03 ` Lance Yang [this message]
2026-04-21 7:46 ` Lance Yang