From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Balbir Singh <balbirs@nvidia.com>,
Catalin Marinas <catalin.marinas@arm.com>,
"David Hildenbrand (Red Hat)" <davidhildenbrandkernel@gmail.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Cc: Jan Polensky <japo@linux.ibm.com>,
akpm@linux-foundation.org, linux-arm-kernel@lists.infradead.org,
linux-mm@kvack.org, will@kernel.org
Subject: Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
Date: Mon, 24 Nov 2025 11:57:09 +0100
Message-ID: <e83945be-056a-443c-b140-e3301c2109c5@kernel.org>
In-Reply-To: <9ab7f5c8-a2fd-497b-a32e-4def84e0be26@nvidia.com>

On 11/22/25 13:04, Balbir Singh wrote:
> On 11/11/25 02:28, Catalin Marinas wrote:
>> On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
>>> On 10.11.25 10:48, Jan Polensky wrote:
>>>> On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
>>>>> On 09.11.25 01:36, Jan Polensky wrote:
>>>>>> The previous change added __GFP_ZEROTAGS when allocating the huge zero
>>>>>> folio to ensure tag initialization for arm64 with MTE enabled. However,
>>>>>> on s390 this flag is unnecessary and triggers a regression
>>>>>> (observed as a crash during repeated 'dnf makecache').
>> [...]
>>>>> I think the problem is that post_alloc_hook() does
>>>>>
>>>>> 	if (zero_tags) {
>>>>> 		/* Initialize both memory and memory tags. */
>>>>> 		for (i = 0; i != 1 << order; ++i)
>>>>> 			tag_clear_highpage(page + i);
>>>>>
>>>>> 		/* Take note that memory was initialized by the loop above. */
>>>>> 		init = false;
>>>>> 	}
>>>>>
>>>>> And tag_clear_highpage() is a NOP on other architectures.
>>
>> Hmm, another thing I missed. Sorry about this.
>>
>>>> Which works by the way for our arch (s390).
>>>>
>>>> include/linux/gfp_types.h | 4 ++++
>>>> 1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
>>>> index 65db9349f905..c12d8a601bb3 100644
>>>> --- a/include/linux/gfp_types.h
>>>> +++ b/include/linux/gfp_types.h
>>>> @@ -85,7 +85,11 @@ enum {
>>>> #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
>>>> #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
>>>> #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
>>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>> #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
>>>> +#else
>>>> +#define ___GFP_ZEROTAGS 0
>>>> +#endif
>>>> #ifdef CONFIG_KASAN_HW_TAGS
>>>> #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
>>>> #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)
>>>>
>>>> This solution would be sufficient from my side, and I would appreciate a
>>>> quick application if there are no objections.
>>>
>>> As raised, to be sure that __HAVE_ARCH_TAG_CLEAR_HIGHPAGE is always seen
>>> early in that file, it should likely become a CONFIG_ thing.
>>
>> I'm fine with either option above but I'll throw one more in the mix:
>>
>> --------------------8<--------------------------------
>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>> index 2312e6ee595f..dcff91533590 100644
>> --- a/arch/arm64/include/asm/page.h
>> +++ b/arch/arm64/include/asm/page.h
>> @@ -33,6 +33,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>> unsigned long vaddr);
>> #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio
>>
>> +bool arch_has_tag_clear_highpage(void);
>> void tag_clear_highpage(struct page *to);
>> #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>
>> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
>> index 125dfa6c613b..318d091db843 100644
>> --- a/arch/arm64/mm/fault.c
>> +++ b/arch/arm64/mm/fault.c
>> @@ -967,18 +967,13 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>> return vma_alloc_folio(flags, 0, vma, vaddr);
>> }
>>
>> +bool arch_has_tag_clear_highpage(void)
>> +{
>> +	return system_supports_mte();
>> +}
>> +
>>  void tag_clear_highpage(struct page *page)
>>  {
>> -	/*
>> -	 * Check if MTE is supported and fall back to clear_highpage().
>> -	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
>> -	 * post_alloc_hook() will invoke tag_clear_highpage().
>> -	 */
>> -	if (!system_supports_mte()) {
>> -		clear_highpage(page);
>> -		return;
>> -	}
>> -
>>  	/* Newly allocated page, shouldn't have been tagged yet */
>>  	WARN_ON_ONCE(!try_page_mte_tagging(page));
>>  	mte_zero_clear_page_tags(page_address(page));
>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>> index 105cc4c00cc3..7aa56179ccef 100644
>> --- a/include/linux/highmem.h
>> +++ b/include/linux/highmem.h
>> @@ -251,6 +251,11 @@ static inline void clear_highpage_kasan_tagged(struct page *page)
>>
>> #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>
>> +static inline bool arch_has_tag_clear_highpage(void)
>> +{
>> +	return false;
>> +}
>> +
>>  static inline void tag_clear_highpage(struct page *page)
>>  {
>>  }
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index e4efda1158b2..5ab15431bc06 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1798,7 +1798,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>>  {
>>  	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
>>  			!should_skip_init(gfp_flags);
>> -	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS);
>> +	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS) &&
>> +			 arch_has_tag_clear_highpage();
>>  	int i;
>>
>> set_page_private(page, 0);
>> --------------------8<--------------------------------
>>
>> Reasoning: with MTE on arm64, you can't have kasan-tagged pages in the
>> kernel which are also exposed to user because the tags are shared (same
>> physical location). The 'zero_tags' initialisation in post_alloc_hook()
>> makes sense for this behaviour. With virtual tagging (briefly announced
>> in [1], full specs not public yet), both the user and the kernel can
>> have their own tags - more like KASAN_SW_TAGS but without the compiler
>> instrumentation. The kernel won't be able to zero the tags for the user
>> since they are in virtual space. It can, however, continue to use Kasan
>> tags even if the pages are mapped in user space. In this case, I'd
>> rather use the kernel_init_pages() call further down in
>> post_alloc_hook() than replicating it in tag_clear_highpage(). When we
>> get to upstreaming virtual tagging (informally vMTE, sometime next
>> year), I'd like to have a kernel image that supports both, so the
>> decision on whether to call tag_clear_highpage() will need to be
>> dynamic.
>>
>> [1] https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/future-architecture-technologies-poe2-and-vmte
>>
>
> I've run into the issue where due to init being set to false if zero_tags was set,
> the system does not clear the zero_folio. I just spent a lot of time debugging it :)
>
> Catalin, were you going to send out this patch as a fix to be included in mm-unstable?
> I've for now reverted your __GFP_ZEROTAGS change to get_huge_zero_folio() for my testing
>
> I am on the current mm-new branch.

We have a fix upstream now:

commit 5bebe8de19264946d398ead4e6c20c229454a552
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Tue Nov 18 08:21:27 2025 -0800

    mm/huge_memory: Fix initialization of huge zero folio

Andrew could consider picking it up as well temporarily to fix the issue
until we rebase on top of the new kernel.

--
Cheers
David
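
For readers following the thread, the failure mode discussed in the quoted
text can be reproduced in a small userspace model. This is only a sketch
under stated assumptions, not kernel code: every model_* function and MODEL_*
constant is hypothetical, init-on-alloc is assumed to be requested, and the
architecture is assumed to have no hardware tagging (as on s390), so the tag
clear is a no-op. It mirrors the quoted post_alloc_hook() logic and the
arch_has_tag_clear_highpage() gate from Catalin's proposal; it makes no claim
about what the upstream commit does.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE    4096
#define MODEL_MAX_ORDER    2
#define MODEL_GFP_ZEROTAGS 0x1u /* hypothetical stand-in for __GFP_ZEROTAGS */

struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Models the generic fallback: does nothing without HW tagging. */
static void model_tag_clear_highpage(struct model_page *page)
{
	(void)page;
}

/* Models the proposed helper; assumed false (no MTE) in this sketch. */
static bool model_arch_has_tag_clear_highpage(void)
{
	return false;
}

/* Simplified model of the zero_tags/init decision in post_alloc_hook(). */
static void model_post_alloc_hook(struct model_page *page, unsigned int order,
				  unsigned int gfp_flags, bool gate_on_arch)
{
	bool init = true; /* assumption: init-on-alloc was requested */
	bool zero_tags = init && (gfp_flags & MODEL_GFP_ZEROTAGS);
	unsigned int i;

	assert(order <= MODEL_MAX_ORDER);

	if (gate_on_arch)
		zero_tags = zero_tags && model_arch_has_tag_clear_highpage();

	if (zero_tags) {
		for (i = 0; i != 1u << order; ++i)
			model_tag_clear_highpage(page + i);
		/* The real hook assumes the loop above zeroed the memory. */
		init = false;
	}

	/* Stand-in for the regular zeroing path further down the hook. */
	if (init)
		for (i = 0; i != 1u << order; ++i)
			memset(page[i].data, 0, MODEL_PAGE_SIZE);
}

/* Dirties the pages, runs the hook, and reports whether all bytes are zero. */
static bool model_alloc_is_zeroed(unsigned int order, bool gate_on_arch)
{
	static struct model_page pages[1u << MODEL_MAX_ORDER];
	unsigned int i, j;

	memset(pages, 0xAA, sizeof(pages));
	model_post_alloc_hook(pages, order, MODEL_GFP_ZEROTAGS, gate_on_arch);

	for (i = 0; i != 1u << order; ++i)
		for (j = 0; j != MODEL_PAGE_SIZE; ++j)
			if (pages[i].data[j])
				return false;
	return true;
}
```

Calling model_alloc_is_zeroed(order, false) reports stale bytes, because the
no-op tag clear also cancels the regular zeroing. With the gate enabled,
zero_tags stays false on such an architecture and the fallback zeroing runs.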