linux-mm.kvack.org archive mirror
From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Balbir Singh <balbirs@nvidia.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	"David Hildenbrand (Red Hat)" <davidhildenbrandkernel@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Cc: Jan Polensky <japo@linux.ibm.com>,
	akpm@linux-foundation.org, linux-arm-kernel@lists.infradead.org,
	linux-mm@kvack.org, will@kernel.org
Subject: Re: [PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
Date: Mon, 24 Nov 2025 11:57:09 +0100
Message-ID: <e83945be-056a-443c-b140-e3301c2109c5@kernel.org>
In-Reply-To: <9ab7f5c8-a2fd-497b-a32e-4def84e0be26@nvidia.com>

On 11/22/25 13:04, Balbir Singh wrote:
> On 11/11/25 02:28, Catalin Marinas wrote:
>> On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
>>> On 10.11.25 10:48, Jan Polensky wrote:
>>>> On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
>>>>> On 09.11.25 01:36, Jan Polensky wrote:
>>>>>> The previous change added __GFP_ZEROTAGS when allocating the huge zero
>>>>>> folio to ensure tag initialization for arm64 with MTE enabled. However,
>>>>>> on s390 this flag is unnecessary and triggers a regression
>>>>>> (observed as a crash during repeated 'dnf makecache').
>> [...]
>>>>> I think the problem is that post_alloc_hook() does
>>>>>
>>>>> if (zero_tags) {
>>>>> 	/* Initialize both memory and memory tags. */
>>>>> 	for (i = 0; i != 1 << order; ++i)
>>>>> 		tag_clear_highpage(page + i);
>>>>>
>>>>> 	/* Take note that memory was initialized by the loop above. */
>>>>> 	init = false;
>>>>> }
>>>>>
>>>>> And tag_clear_highpage() is a NOP on other architectures.
>>
>> Hmm, another thing I missed. Sorry about this.
>>
>>>> Which works by the way for our arch (s390).
>>>>
>>>>    include/linux/gfp_types.h | 4 ++++
>>>>    1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
>>>> index 65db9349f905..c12d8a601bb3 100644
>>>> --- a/include/linux/gfp_types.h
>>>> +++ b/include/linux/gfp_types.h
>>>> @@ -85,7 +85,11 @@ enum {
>>>>    #define ___GFP_HARDWALL        BIT(___GFP_HARDWALL_BIT)
>>>>    #define ___GFP_THISNODE        BIT(___GFP_THISNODE_BIT)
>>>>    #define ___GFP_ACCOUNT     BIT(___GFP_ACCOUNT_BIT)
>>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>>    #define ___GFP_ZEROTAGS        BIT(___GFP_ZEROTAGS_BIT)
>>>> +#else
>>>> +#define ___GFP_ZEROTAGS        0
>>>> +#endif
>>>>    #ifdef CONFIG_KASAN_HW_TAGS
>>>>    #define ___GFP_SKIP_ZERO   BIT(___GFP_SKIP_ZERO_BIT)
>>>>    #define ___GFP_SKIP_KASAN  BIT(___GFP_SKIP_KASAN_BIT)
>>>>
>>>> This solution would be sufficient from my side, and I would appreciate a
>>>> quick application if there are no objections.
>>>
>>> As raised, to be sure that __HAVE_ARCH_TAG_CLEAR_HIGHPAGE is always seen
>>> early in that file, it should likely become a CONFIG_ thing.
>>
>> I'm fine with either option above but I'll throw one more in the mix:
>>
>> --------------------8<--------------------------------
>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>> index 2312e6ee595f..dcff91533590 100644
>> --- a/arch/arm64/include/asm/page.h
>> +++ b/arch/arm64/include/asm/page.h
>> @@ -33,6 +33,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>>   						unsigned long vaddr);
>>   #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio
>>   
>> +bool arch_has_tag_clear_highpage(void);
>>   void tag_clear_highpage(struct page *to);
>>   #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>   
>> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
>> index 125dfa6c613b..318d091db843 100644
>> --- a/arch/arm64/mm/fault.c
>> +++ b/arch/arm64/mm/fault.c
>> @@ -967,18 +967,13 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>>   	return vma_alloc_folio(flags, 0, vma, vaddr);
>>   }
>>   
>> +bool arch_has_tag_clear_highpage(void)
>> +{
>> +	return system_supports_mte();
>> +}
>> +
>>   void tag_clear_highpage(struct page *page)
>>   {
>> -	/*
>> -	 * Check if MTE is supported and fall back to clear_highpage().
>> -	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
>> -	 * post_alloc_hook() will invoke tag_clear_highpage().
>> -	 */
>> -	if (!system_supports_mte()) {
>> -		clear_highpage(page);
>> -		return;
>> -	}
>> -
>>   	/* Newly allocated page, shouldn't have been tagged yet */
>>   	WARN_ON_ONCE(!try_page_mte_tagging(page));
>>   	mte_zero_clear_page_tags(page_address(page));
>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>> index 105cc4c00cc3..7aa56179ccef 100644
>> --- a/include/linux/highmem.h
>> +++ b/include/linux/highmem.h
>> @@ -251,6 +251,11 @@ static inline void clear_highpage_kasan_tagged(struct page *page)
>>   
>>   #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>   
>> +static inline bool arch_has_tag_clear_highpage(void)
>> +{
>> +	return false;
>> +}
>> +
>>   static inline void tag_clear_highpage(struct page *page)
>>   {
>>   }
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index e4efda1158b2..5ab15431bc06 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1798,7 +1798,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>>   {
>>   	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
>>   			!should_skip_init(gfp_flags);
>> -	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS);
>> +	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS) &&
>> +		arch_has_tag_clear_highpage();
>>   	int i;
>>   
>>   	set_page_private(page, 0);
>> --------------------8<--------------------------------
>>
>> Reasoning: with MTE on arm64, you can't have kasan-tagged pages in the
>> kernel which are also exposed to user space because the tags are shared (same
>> physical location). The 'zero_tags' initialisation in post_alloc_hook()
>> makes sense for this behaviour. With virtual tagging (briefly announced
>> in [1], full specs not public yet), both the user and the kernel can
>> have their own tags - more like KASAN_SW_TAGS but without the compiler
>> instrumentation. The kernel won't be able to zero the tags for the user
>> since they are in virtual space. It can, however, continue to use Kasan
>> tags even if the pages are mapped in user space. In this case, I'd
>> rather use the kernel_init_pages() call further down in
>> post_alloc_hook() than replicating it in tag_clear_highpage(). When we
>> get to upstreaming virtual tagging (informally vMTE, sometime next
>> year), I'd like to have a kernel image that supports both, so the
>> decision on whether to call tag_clear_highpage() will need to be
>> dynamic.
>>
>> [1] https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/future-architecture-technologies-poe2-and-vmte
>>
> 
> I've run into this issue: because init is set to false whenever zero_tags is set,
> the system never clears the zero_folio. I just spent a lot of time debugging it :)
> 
> Catalin, were you going to send out this patch as a fix to be included in mm-unstable?
> For now, I've reverted your __GFP_ZEROTAGS change to get_huge_zero_folio() for my testing.
> 
> I am on the current mm-new branch.

We have a fix upstream now:

commit 5bebe8de19264946d398ead4e6c20c229454a552
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Tue Nov 18 08:21:27 2025 -0800

     mm/huge_memory: Fix initialization of huge zero folio


Andrew could consider picking it up temporarily as well, to fix the issue
until we rebase on top of the new kernel.

-- 
Cheers

David

