* [v2 PATCH] mm: hugetlb: avoid soft lockup when mprotect to large memory area
@ 2025-09-29 20:24 Yang Shi
  2025-09-30  5:26 ` Dev Jain
  0 siblings, 1 reply; 6+ messages in thread
From: Yang Shi @ 2025-09-29 20:24 UTC (permalink / raw)
  To: muchun.song, osalvador, david, akpm, catalin.marinas, will,
	anshuman.khandual, carl, cl
  Cc: yang, linux-mm, linux-arm-kernel, linux-kernel

When calling mprotect() on a large hugetlb memory area in our customer's
workload (~300GB hugetlb memory), a soft lockup was observed:

watchdog: BUG: soft lockup - CPU#98 stuck for 23s! [t2_new_sysv:126916]

CPU: 98 PID: 126916 Comm: t2_new_sysv Kdump: loaded Not tainted 6.17-rc7
Hardware name: GIGACOMPUTING R2A3-T40-AAV1/Jefferson CIO, BIOS 5.4.4.1 07/15/2025
pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mte_clear_page_tags+0x14/0x24
lr : mte_sync_tags+0x1c0/0x240
sp : ffff80003150bb80
x29: ffff80003150bb80 x28: ffff00739e9705a8 x27: 0000ffd2d6a00000
x26: 0000ff8e4bc00000 x25: 00e80046cde00f45 x24: 0000000000022458
x23: 0000000000000000 x22: 0000000000000004 x21: 000000011b380000
x20: ffff000000000000 x19: 000000011b379f40 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc875e0aa5e2c
x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
x5 : fffffc01ce7a5c00 x4 : 00000000046cde00 x3 : fffffc0000000000
x2 : 0000000000000004 x1 : 0000000000000040 x0 : ffff0046cde7c000

Call trace:
  mte_clear_page_tags+0x14/0x24
  set_huge_pte_at+0x25c/0x280
  hugetlb_change_protection+0x220/0x430
  change_protection+0x5c/0x8c
  mprotect_fixup+0x10c/0x294
  do_mprotect_pkey.constprop.0+0x2e0/0x3d4
  __arm64_sys_mprotect+0x24/0x44
  invoke_syscall+0x50/0x160
  el0_svc_common+0x48/0x144
  do_el0_svc+0x30/0xe0
  el0_svc+0x30/0xf0
  el0t_64_sync_handler+0xc4/0x148
  el0t_64_sync+0x1a4/0x1a8

The soft lockup is not triggered with THP or base pages because
cond_resched() is called for each PMD-sized range.

Although the soft lockup was triggered by MTE, it should not be MTE
specific. Any other processing that takes a long time in the loop may
trigger a soft lockup too.

So add cond_resched() for hugetlb to avoid the soft lockup.

Fixes: 8f860591ffb2 ("[PATCH] Enable mprotect on huge pages")
Tested-by: Carl Worth <carl@os.amperecomputing.com>
Reviewed-by: Christoph Lameter (Ampere) <cl@gentwo.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
---
v2: - Made the subject and commit message less MTE specific and fixed
      the fixes tag.
    - Collected all R-bs and A-bs.

 mm/hugetlb.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index cb5c4e79e0b8..fe6606d91b31 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7242,6 +7242,8 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
 						psize);
 		}
 		spin_unlock(ptl);
+
+		cond_resched();
 	}
 	/*
 	 * Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare
-- 
2.47.0




* Re: [v2 PATCH] mm: hugetlb: avoid soft lockup when mprotect to large memory area
  2025-09-29 20:24 [v2 PATCH] mm: hugetlb: avoid soft lockup when mprotect to large memory area Yang Shi
@ 2025-09-30  5:26 ` Dev Jain
  2025-09-30 18:08   ` Yang Shi
  0 siblings, 1 reply; 6+ messages in thread
From: Dev Jain @ 2025-09-30  5:26 UTC (permalink / raw)
  To: Yang Shi, muchun.song, osalvador, david, akpm, catalin.marinas,
	will, anshuman.khandual, carl, cl
  Cc: linux-mm, linux-arm-kernel, linux-kernel


On 30/09/25 1:54 am, Yang Shi wrote:
> @@ -7242,6 +7242,8 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
>   						psize);
>   		}
>   		spin_unlock(ptl);
> +
> +		cond_resched();
>   	}
>   	/*
>   	 * Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare

Reviewed-by: Dev Jain <dev.jain@arm.com>

Does it make sense to also do cond_resched() in the huge_pmd_unshare() branch?
That also amounts to clearing a page. And I can see for example, zap_huge_pmd()
and change_huge_pmd() consume a cond_resched().




* Re: [v2 PATCH] mm: hugetlb: avoid soft lockup when mprotect to large memory area
  2025-09-30  5:26 ` Dev Jain
@ 2025-09-30 18:08   ` Yang Shi
  2025-09-30 18:43     ` Christoph Lameter (Ampere)
  2025-10-01  4:23     ` Dev Jain
  0 siblings, 2 replies; 6+ messages in thread
From: Yang Shi @ 2025-09-30 18:08 UTC (permalink / raw)
  To: Dev Jain, muchun.song, osalvador, david, akpm, catalin.marinas,
	will, anshuman.khandual, carl, cl
  Cc: linux-mm, linux-arm-kernel, linux-kernel



On 9/29/25 10:26 PM, Dev Jain wrote:
> Reviewed-by: Dev Jain <dev.jain@arm.com>

Thank you.

>
> Does it make sense to also do cond_resched() in the huge_pmd_unshare() branch?
> That also amounts to clearing a page. And I can see for example, zap_huge_pmd()
> and change_huge_pmd() consume a cond_resched().

Thanks for raising this. I did think about it, but I wasn't convinced,
because shared PMDs should not be that common IMHO (if I'm wrong, please
feel free to correct me). At least, a PMD can't be shared if the memory
is tagged, IIRC. So I'd like to keep the patch minimal for now and defer
adding cond_resched() there until some real-life workload hits it.

Yang





* Re: [v2 PATCH] mm: hugetlb: avoid soft lockup when mprotect to large memory area
  2025-09-30 18:08   ` Yang Shi
@ 2025-09-30 18:43     ` Christoph Lameter (Ampere)
  2025-10-01  4:23     ` Dev Jain
  1 sibling, 0 replies; 6+ messages in thread
From: Christoph Lameter (Ampere) @ 2025-09-30 18:43 UTC (permalink / raw)
  To: Yang Shi
  Cc: Dev Jain, muchun.song, osalvador, david, akpm, catalin.marinas,
	will, anshuman.khandual, carl, linux-mm, linux-arm-kernel,
	linux-kernel

On Tue, 30 Sep 2025, Yang Shi wrote:

> > Does it make sense to also do cond_resched() in the huge_pmd_unshare() branch?
> > That also amounts to clearing a page. And I can see for example, zap_huge_pmd()
> > and change_huge_pmd() consume a cond_resched().
>
> Thanks for raising this. I did think about it. But I didn't convince myself
> because shared pmd should be not that common IMHO (If I'm wrong, please feel
> free to correct me). At least PMD can't be shared if the memory is tagged
> IIRC. So I'd like to keep the patch minimal for now and defer adding
> cond_resched() until it is hit by some real life workload.

It would be good to send out a second patch that covers the other cases
for discussion.




* Re: [v2 PATCH] mm: hugetlb: avoid soft lockup when mprotect to large memory area
  2025-09-30 18:08   ` Yang Shi
  2025-09-30 18:43     ` Christoph Lameter (Ampere)
@ 2025-10-01  4:23     ` Dev Jain
  2025-10-01  8:32       ` David Hildenbrand
  1 sibling, 1 reply; 6+ messages in thread
From: Dev Jain @ 2025-10-01  4:23 UTC (permalink / raw)
  To: Yang Shi, muchun.song, osalvador, david, akpm, catalin.marinas,
	will, anshuman.khandual, carl, cl
  Cc: linux-mm, linux-arm-kernel, linux-kernel


On 30/09/25 11:38 pm, Yang Shi wrote:
>
>
> On 9/29/25 10:26 PM, Dev Jain wrote:
>> Reviewed-by: Dev Jain <dev.jain@arm.com>
>
> Thank you.
>
>>
>> Does it make sense to also do cond_resched() in the huge_pmd_unshare() branch?
>> That also amounts to clearing a page. And I can see for example, zap_huge_pmd()
>> and change_huge_pmd() consume a cond_resched().
>
> Thanks for raising this. I did think about it. But I didn't convince 
> myself because shared pmd should be not that common IMHO (If I'm 
> wrong, please feel free to correct me). At least PMD can't be shared 
> if the memory is tagged IIRC. So I'd like to keep the patch minimal 
> for now and defer adding cond_resched() until it is hit by some real 
> life workload.

If we have large swathes of hugetlb memory like in your workload, and it
is MAP_SHARED, then there should be a high chance of sharing the PMD.
However, I incorrectly observed that we are clearing a page there; we are
only clearing the PUD entry, which is 8 bytes. So yes, a soft lockup
should be highly unlikely. But since cond_resched() is cheap (I assume
this is the case since it is liberally sprinkled all over the codebase),
I think we should be consistent. Probably not an immediate concern and
not a matter for this patch.


>
> Yang
>
>



* Re: [v2 PATCH] mm: hugetlb: avoid soft lockup when mprotect to large memory area
  2025-10-01  4:23     ` Dev Jain
@ 2025-10-01  8:32       ` David Hildenbrand
  0 siblings, 0 replies; 6+ messages in thread
From: David Hildenbrand @ 2025-10-01  8:32 UTC (permalink / raw)
  To: Dev Jain, Yang Shi, muchun.song, osalvador, akpm,
	catalin.marinas, will, anshuman.khandual, carl, cl
  Cc: linux-mm, linux-arm-kernel, linux-kernel

On 01.10.25 06:23, Dev Jain wrote:
> 
> On 30/09/25 11:38 pm, Yang Shi wrote:
>>
>>
>> On 9/29/25 10:26 PM, Dev Jain wrote:
>>> Reviewed-by: Dev Jain <dev.jain@arm.com>
>>
>> Thank you.
>>
>>>
>>> Does it make sense to also do cond_resched() in the huge_pmd_unshare() branch?
>>> That also amounts to clearing a page. And I can see for example, zap_huge_pmd()
>>> and change_huge_pmd() consume a cond_resched().
>>
>> Thanks for raising this. I did think about it. But I didn't convince
>> myself because shared pmd should be not that common IMHO (If I'm
>> wrong, please feel free to correct me). At least PMD can't be shared
>> if the memory is tagged IIRC. So I'd like to keep the patch minimal
>> for now and defer adding cond_resched() until it is hit by some real
>> life workload.
> 
> If we have large swathes of hugetlb memory like in your workload, and it
> is MAP_SHARED, then there should be high chances of sharing the PMD.
> Although, I incorrectly observed that we are clearing a page there - we
> are only clearing the pud entry which is 8 bytes. So yes a soft lockup
> should be highly unlikely. But since cond_resched() is cheap (I assume
> this is the case since it is liberally sprinkled all over the codebase)
> I think we should be consistent. Probably not an immediate concern and
> not a matter

Right, that's one of the cases where we might just want to wait either
until it is reported or until hugetlb is finally removed in a couple of
decades ;)

-- 
Cheers

David / dhildenb




Thread overview: 6+ messages
2025-09-29 20:24 [v2 PATCH] mm: hugetlb: avoid soft lockup when mprotect to large memory area Yang Shi
2025-09-30  5:26 ` Dev Jain
2025-09-30 18:08   ` Yang Shi
2025-09-30 18:43     ` Christoph Lameter (Ampere)
2025-10-01  4:23     ` Dev Jain
2025-10-01  8:32       ` David Hildenbrand
