linux-mm.kvack.org archive mirror
* [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
@ 2025-12-19  6:28 Jane Chu
  2025-12-19  8:01 ` Miaohe Lin
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Jane Chu @ 2025-12-19  6:28 UTC (permalink / raw)
  To: muchun.song, osalvador, david, linmiaohe, jiaqiyan,
	william.roche, rientjes, akpm, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mhocko, linux-mm, linux-kernel

When a hugetlb folio is poisoned again, try_memory_failure_hugetlb()
passes the head pfn to kill_accessing_process(), which is not right.
The precise pfn of the poisoned page should be used in order to
determine the precise vaddr for the SIGBUS payload.

This issue has already been taken care of in the normal path, that is,
hwpoison_user_mappings(), see [1][2].  Furthermore, for [3] to work
correctly in the hugetlb repoisoning case, it's essential to inform
the VM of the precise poisoned page, not the head page.

[1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
[2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
[3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
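
The address fix-up described above can be modelled in user space. The
following is only a sketch, not kernel code: SKETCH_PAGE_SHIFT (4 KiB base
pages are assumed) and sketch_hwpoison_vaddr() are hypothetical names that
mirror the expression
hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT) in the diff.

```c
#include <stdint.h>

/* Assumption: 4 KiB base pages. The kernel's PAGE_SHIFT is arch-specific. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical model of the fix-up: entry_vaddr and entry_pfn describe the
 * start of the mapping entry (for hugetlb, the head page), and the result
 * is the virtual address that maps poisoned_pfn within that entry.
 * The caller must ensure poisoned_pfn >= entry_pfn.
 */
uint64_t sketch_hwpoison_vaddr(uint64_t entry_vaddr, uint64_t entry_pfn,
			       uint64_t poisoned_pfn)
{
	return entry_vaddr + ((poisoned_pfn - entry_pfn) << SKETCH_PAGE_SHIFT);
}
```

For a head page (poisoned_pfn == entry_pfn) the offset is zero, so the
reported address is unchanged; for the fourth page of the folio the result
is entry_vaddr + 0x3000.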

Cc: <stable@vger.kernel.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 mm/memory-failure.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 3edebb0cda30..c9d87811b1ea 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
 }
 
 static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
-				unsigned long poisoned_pfn, struct to_kill *tk)
+				unsigned long poisoned_pfn, struct to_kill *tk,
+				int pte_nr)
 {
 	unsigned long pfn = 0;
+	unsigned long hwpoison_vaddr;
 
 	if (pte_present(pte)) {
 		pfn = pte_pfn(pte);
@@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
 			pfn = swp_offset_pfn(swp);
 	}
 
-	if (!pfn || pfn != poisoned_pfn)
+	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
 		return 0;
 
-	set_to_kill(tk, addr, shift);
+	hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
+	set_to_kill(tk, hwpoison_vaddr, shift);
 	return 1;
 }
 
@@ -749,7 +752,7 @@ static int hwpoison_pte_range(pmd_t *pmdp, unsigned long addr,
 
 	for (; addr != end; ptep++, addr += PAGE_SIZE) {
 		ret = check_hwpoisoned_entry(ptep_get(ptep), addr, PAGE_SHIFT,
-					     hwp->pfn, &hwp->tk);
+					     hwp->pfn, &hwp->tk, 1);
 		if (ret == 1)
 			break;
 	}
@@ -772,8 +775,8 @@ static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask,
 
 	ptl = huge_pte_lock(h, walk->mm, ptep);
 	pte = huge_ptep_get(walk->mm, addr, ptep);
-	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h),
-					hwp->pfn, &hwp->tk);
+	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h), hwp->pfn,
+				&hwp->tk, pages_per_huge_page(h));
 	spin_unlock(ptl);
 	return ret;
 }
@@ -2023,10 +2026,8 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 		*hugetlb = 0;
 		return 0;
 	} else if (res == -EHWPOISON) {
-		if (flags & MF_ACTION_REQUIRED) {
-			folio = page_folio(p);
-			res = kill_accessing_process(current, folio_pfn(folio), flags);
-		}
+		if (flags & MF_ACTION_REQUIRED)
+			res = kill_accessing_process(current, pfn, flags);
 		action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
 		return res;
 	} else if (res == -EBUSY) {
@@ -2037,6 +2038,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 		return action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
 	}
 
+
 	folio = page_folio(p);
 	folio_lock(folio);
 
-- 
2.43.5




* Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-19  6:28 [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn Jane Chu
@ 2025-12-19  8:01 ` Miaohe Lin
  2025-12-19  8:06   ` jane.chu
  2025-12-19 17:27 ` Liam R. Howlett
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Miaohe Lin @ 2025-12-19  8:01 UTC (permalink / raw)
  To: Jane Chu
  Cc: muchun.song, osalvador, david, jiaqiyan, william.roche, rientjes,
	akpm, lorenzo.stoakes, Liam.Howlett, rppt, surenb, mhocko,
	linux-mm, linux-kernel

On 2025/12/19 14:28, Jane Chu wrote:
> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
> passed head pfn to kill_accessing_process(), that is not right.
> The precise pfn of the poisoned page should be used in order to
> determine the precise vaddr as the SIGBUS payload.
> 
> This issue has already been taken care of in the normal path, that is,
> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
> correctly in the hugetlb repoisoning case, it's essential to inform
> VM the precise poisoned page, not the head page.
> 
> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
> 

Thanks for your patch.

> Cc: <stable@vger.kernel.org>
> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> ---
>  mm/memory-failure.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 3edebb0cda30..c9d87811b1ea 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>  }
>  
>  static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
> -				unsigned long poisoned_pfn, struct to_kill *tk)
> +				unsigned long poisoned_pfn, struct to_kill *tk,
> +				int pte_nr)
>  {
>  	unsigned long pfn = 0;
> +	unsigned long hwpoison_vaddr;
>  
>  	if (pte_present(pte)) {
>  		pfn = pte_pfn(pte);
> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>  			pfn = swp_offset_pfn(swp);
>  	}
>  
> -	if (!pfn || pfn != poisoned_pfn)
> +	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>  		return 0;

Can we get pte_nr from @shift? I.e. something like "pte_nr = 1UL << (shift - PAGE_SHIFT);"?

Thanks.
.



* Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-19  8:01 ` Miaohe Lin
@ 2025-12-19  8:06   ` jane.chu
  2025-12-22  3:01     ` Miaohe Lin
  0 siblings, 1 reply; 12+ messages in thread
From: jane.chu @ 2025-12-19  8:06 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: muchun.song, osalvador, david, jiaqiyan, william.roche, rientjes,
	akpm, lorenzo.stoakes, Liam.Howlett, rppt, surenb, mhocko,
	linux-mm, linux-kernel



On 12/19/2025 12:01 AM, Miaohe Lin wrote:
> On 2025/12/19 14:28, Jane Chu wrote:
>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>> passed head pfn to kill_accessing_process(), that is not right.
>> The precise pfn of the poisoned page should be used in order to
>> determine the precise vaddr as the SIGBUS payload.
>>
>> This issue has already been taken care of in the normal path, that is,
>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>> correctly in the hugetlb repoisoning case, it's essential to inform
>> VM the precise poisoned page, not the head page.
>>
>> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
>> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
>> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
>>
> 
> Thanks for your patch.
> 
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>> ---
>>   mm/memory-failure.c | 22 ++++++++++++----------
>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 3edebb0cda30..c9d87811b1ea 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>>   }
>>   
>>   static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>> -				unsigned long poisoned_pfn, struct to_kill *tk)
>> +				unsigned long poisoned_pfn, struct to_kill *tk,
>> +				int pte_nr)
>>   {
>>   	unsigned long pfn = 0;
>> +	unsigned long hwpoison_vaddr;
>>   
>>   	if (pte_present(pte)) {
>>   		pfn = pte_pfn(pte);
>> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>   			pfn = swp_offset_pfn(swp);
>>   	}
>>   
>> -	if (!pfn || pfn != poisoned_pfn)
>> +	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>>   		return 0;
> 
> Can we get pte_nr from @shift? I.e. something like "pte_nr = 1UL << (shift - PAGE_SHIFT);"?

Why?  Is there any concern with using the macro pages_per_huge_page(h) ?

thanks!
-jane
> 
> Thanks.
> .




* Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-19  6:28 [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn Jane Chu
  2025-12-19  8:01 ` Miaohe Lin
@ 2025-12-19 17:27 ` Liam R. Howlett
  2025-12-19 17:29   ` jane.chu
  2025-12-20 23:13 ` Andrew Morton
  2025-12-21  8:49 ` David Hildenbrand (Red Hat)
  3 siblings, 1 reply; 12+ messages in thread
From: Liam R. Howlett @ 2025-12-19 17:27 UTC (permalink / raw)
  To: Jane Chu
  Cc: muchun.song, osalvador, david, linmiaohe, jiaqiyan,
	william.roche, rientjes, akpm, lorenzo.stoakes, rppt, surenb,
	mhocko, linux-mm, linux-kernel

* Jane Chu <jane.chu@oracle.com> [251219 01:28]:
> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
> passed head pfn to kill_accessing_process(), that is not right.
> The precise pfn of the poisoned page should be used in order to
> determine the precise vaddr as the SIGBUS payload.
> 
> This issue has already been taken care of in the normal path, that is,
> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
> correctly in the hugetlb repoisoning case, it's essential to inform
> VM the precise poisoned page, not the head page.
> 
> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
> 
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Jane Chu <jane.chu@oracle.com>

I don't see stable in the Cc list, did you miss it?

Looks good, small nit below.

Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

> ---
>  mm/memory-failure.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 3edebb0cda30..c9d87811b1ea 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>  }
>  
>  static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
> -				unsigned long poisoned_pfn, struct to_kill *tk)
> +				unsigned long poisoned_pfn, struct to_kill *tk,
> +				int pte_nr)
>  {
>  	unsigned long pfn = 0;
> +	unsigned long hwpoison_vaddr;
>  
>  	if (pte_present(pte)) {
>  		pfn = pte_pfn(pte);
> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>  			pfn = swp_offset_pfn(swp);
>  	}
>  
> -	if (!pfn || pfn != poisoned_pfn)
> +	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>  		return 0;
>  
> -	set_to_kill(tk, addr, shift);
> +	hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
> +	set_to_kill(tk, hwpoison_vaddr, shift);
>  	return 1;
>  }
>  
> @@ -749,7 +752,7 @@ static int hwpoison_pte_range(pmd_t *pmdp, unsigned long addr,
>  
>  	for (; addr != end; ptep++, addr += PAGE_SIZE) {
>  		ret = check_hwpoisoned_entry(ptep_get(ptep), addr, PAGE_SHIFT,
> -					     hwp->pfn, &hwp->tk);
> +					     hwp->pfn, &hwp->tk, 1);
>  		if (ret == 1)
>  			break;
>  	}
> @@ -772,8 +775,8 @@ static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask,
>  
>  	ptl = huge_pte_lock(h, walk->mm, ptep);
>  	pte = huge_ptep_get(walk->mm, addr, ptep);
> -	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h),
> -					hwp->pfn, &hwp->tk);
> +	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h), hwp->pfn,
> +				&hwp->tk, pages_per_huge_page(h));
>  	spin_unlock(ptl);
>  	return ret;
>  }
> @@ -2023,10 +2026,8 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
>  		*hugetlb = 0;
>  		return 0;
>  	} else if (res == -EHWPOISON) {
> -		if (flags & MF_ACTION_REQUIRED) {
> -			folio = page_folio(p);
> -			res = kill_accessing_process(current, folio_pfn(folio), flags);
> -		}
> +		if (flags & MF_ACTION_REQUIRED)
> +			res = kill_accessing_process(current, pfn, flags);
>  		action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
>  		return res;
>  	} else if (res == -EBUSY) {
> @@ -2037,6 +2038,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
>  		return action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
>  	}
>  
> +

nit: extra whitespace added.

>  	folio = page_folio(p);
>  	folio_lock(folio);
>  
> -- 
> 2.43.5
> 



* Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-19 17:27 ` Liam R. Howlett
@ 2025-12-19 17:29   ` jane.chu
  0 siblings, 0 replies; 12+ messages in thread
From: jane.chu @ 2025-12-19 17:29 UTC (permalink / raw)
  To: Liam R. Howlett, muchun.song, osalvador, david, linmiaohe,
	jiaqiyan, william.roche, rientjes, akpm, lorenzo.stoakes, rppt,
	surenb, mhocko, linux-mm, linux-kernel



On 12/19/2025 9:27 AM, Liam R. Howlett wrote:
> * Jane Chu <jane.chu@oracle.com> [251219 01:28]:
>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>> passed head pfn to kill_accessing_process(), that is not right.
>> The precise pfn of the poisoned page should be used in order to
>> determine the precise vaddr as the SIGBUS payload.
>>
>> This issue has already been taken care of in the normal path, that is,
>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>> correctly in the hugetlb repoisoning case, it's essential to inform
>> VM the precise poisoned page, not the head page.
>>
>> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
>> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
>> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
>>
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> 
> I don't see stable in the Cc list, did you miss it?

Good catch, thank you!
> 
> Looks good, small nit below.
> 
> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

Thanks!
-jane

> 
>> ---
>>   mm/memory-failure.c | 22 ++++++++++++----------
>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 3edebb0cda30..c9d87811b1ea 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>>   }
>>   
>>   static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>> -				unsigned long poisoned_pfn, struct to_kill *tk)
>> +				unsigned long poisoned_pfn, struct to_kill *tk,
>> +				int pte_nr)
>>   {
>>   	unsigned long pfn = 0;
>> +	unsigned long hwpoison_vaddr;
>>   
>>   	if (pte_present(pte)) {
>>   		pfn = pte_pfn(pte);
>> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>   			pfn = swp_offset_pfn(swp);
>>   	}
>>   
>> -	if (!pfn || pfn != poisoned_pfn)
>> +	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>>   		return 0;
>>   
>> -	set_to_kill(tk, addr, shift);
>> +	hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
>> +	set_to_kill(tk, hwpoison_vaddr, shift);
>>   	return 1;
>>   }
>>   
>> @@ -749,7 +752,7 @@ static int hwpoison_pte_range(pmd_t *pmdp, unsigned long addr,
>>   
>>   	for (; addr != end; ptep++, addr += PAGE_SIZE) {
>>   		ret = check_hwpoisoned_entry(ptep_get(ptep), addr, PAGE_SHIFT,
>> -					     hwp->pfn, &hwp->tk);
>> +					     hwp->pfn, &hwp->tk, 1);
>>   		if (ret == 1)
>>   			break;
>>   	}
>> @@ -772,8 +775,8 @@ static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask,
>>   
>>   	ptl = huge_pte_lock(h, walk->mm, ptep);
>>   	pte = huge_ptep_get(walk->mm, addr, ptep);
>> -	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h),
>> -					hwp->pfn, &hwp->tk);
>> +	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h), hwp->pfn,
>> +				&hwp->tk, pages_per_huge_page(h));
>>   	spin_unlock(ptl);
>>   	return ret;
>>   }
>> @@ -2023,10 +2026,8 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
>>   		*hugetlb = 0;
>>   		return 0;
>>   	} else if (res == -EHWPOISON) {
>> -		if (flags & MF_ACTION_REQUIRED) {
>> -			folio = page_folio(p);
>> -			res = kill_accessing_process(current, folio_pfn(folio), flags);
>> -		}
>> +		if (flags & MF_ACTION_REQUIRED)
>> +			res = kill_accessing_process(current, pfn, flags);
>>   		action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
>>   		return res;
>>   	} else if (res == -EBUSY) {
>> @@ -2037,6 +2038,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
>>   		return action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
>>   	}
>>   
>> +
> 
> nit: extra whitespace added.
> 
>>   	folio = page_folio(p);
>>   	folio_lock(folio);
>>   
>> -- 
>> 2.43.5
>>




* Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-19  6:28 [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn Jane Chu
  2025-12-19  8:01 ` Miaohe Lin
  2025-12-19 17:27 ` Liam R. Howlett
@ 2025-12-20 23:13 ` Andrew Morton
  2025-12-22 20:32   ` jane.chu
  2025-12-23  0:36   ` jane.chu
  2025-12-21  8:49 ` David Hildenbrand (Red Hat)
  3 siblings, 2 replies; 12+ messages in thread
From: Andrew Morton @ 2025-12-20 23:13 UTC (permalink / raw)
  To: Jane Chu
  Cc: muchun.song, osalvador, david, linmiaohe, jiaqiyan,
	william.roche, rientjes, lorenzo.stoakes, Liam.Howlett, rppt,
	surenb, mhocko, linux-mm, linux-kernel

On Thu, 18 Dec 2025 23:28:19 -0700 Jane Chu <jane.chu@oracle.com> wrote:

> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
> passed head pfn to kill_accessing_process(), that is not right.
> The precise pfn of the poisoned page should be used in order to
> determine the precise vaddr as the SIGBUS payload.
> 
> This issue has already been taken care of in the normal path, that is,
> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
> correctly in the hugetlb repoisoning case, it's essential to inform
> VM the precise poisoned page, not the head page.

This conflicts with your "mm/memory-failure: fix missing ->mf_stats
count in hugetlb poison".

Also conflicts a bit with "mm: fixup pfnmap memory failure handling to
use pgoff" but that one isn't cc:stable, so this patch (which *is*
cc:stable) takes priority.

Help?



* Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-19  6:28 [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn Jane Chu
                   ` (2 preceding siblings ...)
  2025-12-20 23:13 ` Andrew Morton
@ 2025-12-21  8:49 ` David Hildenbrand (Red Hat)
  2025-12-22 18:42   ` jane.chu
  3 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-12-21  8:49 UTC (permalink / raw)
  To: Jane Chu, muchun.song, osalvador, linmiaohe, jiaqiyan,
	william.roche, rientjes, akpm, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mhocko, linux-mm, linux-kernel

On 12/19/25 07:28, Jane Chu wrote:
> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
> passed head pfn to kill_accessing_process(), that is not right.
> The precise pfn of the poisoned page should be used in order to
> determine the precise vaddr as the SIGBUS payload.

I don't think so? IIRC, for hugetlb folios we always reported the head 
PFN. And user space must assume that the whole thing is poisoned and 
will go away.

I recall that older QEMU even depended on that behavior, for example.
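
To make the user-space side of this concrete, here is a hedged sketch in C
of how a SIGBUS consumer could recover the whole affected range regardless
of whether the kernel reports the head or a tail address. It assumes a
64-bit address space and that the mapping shift passed to set_to_kill() is
what arrives as si_addr_lsb in siginfo (the field documented in
sigaction(2)). struct sketch_range and sketch_poison_range() are
hypothetical names, and nothing here claims to be what QEMU actually does.

```c
#include <stdint.h>

/* Hypothetical result type: the naturally aligned region to treat as lost. */
struct sketch_range {
	uint64_t start;	/* first byte of the affected region */
	uint64_t len;	/* size of the region in bytes */
};

/*
 * Given the si_addr and si_addr_lsb values from a BUS_MCEERR_AR/AO siginfo,
 * round the address down to the 2^si_addr_lsb boundary.
 * Assumes 0 <= si_addr_lsb < 64.
 */
struct sketch_range sketch_poison_range(uint64_t si_addr, short si_addr_lsb)
{
	uint64_t len = UINT64_C(1) << si_addr_lsb;
	struct sketch_range r = { si_addr & ~(len - 1), len };

	return r;
}
```

With si_addr_lsb == 21, a head address and a tail address inside the same
2 MiB mapping both yield the same start and length, so a consumer written
this way would not depend on which of the two the kernel reports.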

-- 
Cheers

David



* Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-19  8:06   ` jane.chu
@ 2025-12-22  3:01     ` Miaohe Lin
  2025-12-22 20:29       ` jane.chu
  0 siblings, 1 reply; 12+ messages in thread
From: Miaohe Lin @ 2025-12-22  3:01 UTC (permalink / raw)
  To: jane.chu
  Cc: muchun.song, osalvador, david, jiaqiyan, william.roche, rientjes,
	akpm, lorenzo.stoakes, Liam.Howlett, rppt, surenb, mhocko,
	linux-mm, linux-kernel

On 2025/12/19 16:06, jane.chu@oracle.com wrote:
> 
> 
> On 12/19/2025 12:01 AM, Miaohe Lin wrote:
>> On 2025/12/19 14:28, Jane Chu wrote:
>>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>>> passed head pfn to kill_accessing_process(), that is not right.
>>> The precise pfn of the poisoned page should be used in order to
>>> determine the precise vaddr as the SIGBUS payload.
>>>
>>> This issue has already been taken care of in the normal path, that is,
>>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>>> correctly in the hugetlb repoisoning case, it's essential to inform
>>> VM the precise poisoned page, not the head page.
>>>
>>> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
>>> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
>>> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
>>>
>>
>> Thanks for your patch.
>>
>>> Cc: <stable@vger.kernel.org>
>>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>>> ---
>>>   mm/memory-failure.c | 22 ++++++++++++----------
>>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>> index 3edebb0cda30..c9d87811b1ea 100644
>>> --- a/mm/memory-failure.c
>>> +++ b/mm/memory-failure.c
>>> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>>>   }
>>>   
>>>   static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>> -				unsigned long poisoned_pfn, struct to_kill *tk)
>>> +				unsigned long poisoned_pfn, struct to_kill *tk,
>>> +				int pte_nr)
>>>   {
>>>   	unsigned long pfn = 0;
>>> +	unsigned long hwpoison_vaddr;
>>>   
>>>   	if (pte_present(pte)) {
>>>   		pfn = pte_pfn(pte);
>>> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>>   			pfn = swp_offset_pfn(swp);
>>>   	}
>>>   
>>> -	if (!pfn || pfn != poisoned_pfn)
>>> +	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>>>   		return 0;
>>
>> Can we get pte_nr from @shift? I.e. something like "pte_nr = 1UL << (shift - PAGE_SHIFT);"?
> 
> Why?  Is there any concern with using the macro pages_per_huge_page(h) ?

No, I was trying to get rid of the new @pte_nr parameter. Something like below:

 static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
-                               unsigned long poisoned_pfn, struct to_kill *tk,
-                               int pte_nr)
+                               unsigned long poisoned_pfn, struct to_kill *tk)
 {
        unsigned long pfn = 0;
        unsigned long hwpoison_vaddr;
+       int pte_nr;

        if (pte_present(pte)) {
                pfn = pte_pfn(pte);
@@ -701,7 +701,8 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
                        pfn = softleaf_to_pfn(entry);
        }

-       if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
+       pte_nr = 1UL << (shift - PAGE_SHIFT);
+       if (!pfn || (pfn > poisoned_pfn || (pfn +  pte_nr - 1) < poisoned_pfn))
                return 0;

        hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);

So we don't have to pass in pte_nr from all callers. But that's trivial.
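
A minimal user-space sketch of that idea, assuming 4 KiB base pages.
SKETCH_PAGE_SHIFT and the sketch_* helpers are hypothetical names, not
kernel symbols: the number of base pages an entry covers is derived from
its shift, and the range check from the patch is applied with that count.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages. The kernel's PAGE_SHIFT is arch-specific. */
#define SKETCH_PAGE_SHIFT 12

/* Base pages covered by one entry; requires shift >= SKETCH_PAGE_SHIFT. */
uint64_t sketch_nr_pages(unsigned int shift)
{
	return UINT64_C(1) << (shift - SKETCH_PAGE_SHIFT);
}

/*
 * Model of the check: true when poisoned_pfn lies inside
 * [entry_pfn, entry_pfn + nr - 1]. A zero entry_pfn means "no pfn found",
 * matching the !pfn test in check_hwpoisoned_entry().
 */
bool sketch_entry_covers_pfn(uint64_t entry_pfn, unsigned int shift,
			     uint64_t poisoned_pfn)
{
	uint64_t nr = sketch_nr_pages(shift);

	if (!entry_pfn)
		return false;
	return poisoned_pfn >= entry_pfn && poisoned_pfn <= entry_pfn + nr - 1;
}
```

With shift == SKETCH_PAGE_SHIFT the count is 1 and the check degenerates to
the old pfn == poisoned_pfn test; with shift == 21 the entry covers 512
base pages, so any tail pfn of a 2 MiB folio matches.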

Thanks.
.



* Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-21  8:49 ` David Hildenbrand (Red Hat)
@ 2025-12-22 18:42   ` jane.chu
  0 siblings, 0 replies; 12+ messages in thread
From: jane.chu @ 2025-12-22 18:42 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat),
	muchun.song, osalvador, linmiaohe, jiaqiyan, william.roche,
	rientjes, akpm, lorenzo.stoakes, Liam.Howlett, rppt, surenb,
	mhocko, linux-mm, linux-kernel


On 12/21/2025 12:49 AM, David Hildenbrand (Red Hat) wrote:
> On 12/19/25 07:28, Jane Chu wrote:
>> When a hugetlb folio is being poisoned again, 
>> try_memory_failure_hugetlb()
>> passed head pfn to kill_accessing_process(), that is not right.
>> The precise pfn of the poisoned page should be used in order to
>> determine the precise vaddr as the SIGBUS payload.
> 
> I don't think so? IIRC, for hugetlb folios we always reported the head 
> PFN. And user space must assume that the whole thing is poisoned and 
> will go away.
> 
> I recall that older QEMU even depended on that behavior, for example.
> 

What happens if a non-head PFN of a hugetlb folio is indicated in a SIGBUS to 
QEMU?  I ask because the regular path, the path via hwpoison_user_mappings(), 
already behaves this way.

I'm not familiar with QEMU. AFAIK, the need for this patch came from our 
VM/QEMU team.

thanks,
-jane




* Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-22  3:01     ` Miaohe Lin
@ 2025-12-22 20:29       ` jane.chu
  0 siblings, 0 replies; 12+ messages in thread
From: jane.chu @ 2025-12-22 20:29 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: muchun.song, osalvador, david, jiaqiyan, william.roche, rientjes,
	akpm, lorenzo.stoakes, Liam.Howlett, rppt, surenb, mhocko,
	linux-mm, linux-kernel



On 12/21/2025 7:01 PM, Miaohe Lin wrote:
> On 2025/12/19 16:06, jane.chu@oracle.com wrote:
>>
>>
>> On 12/19/2025 12:01 AM, Miaohe Lin wrote:
>>> On 2025/12/19 14:28, Jane Chu wrote:
>>>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>>>> passed head pfn to kill_accessing_process(), that is not right.
>>>> The precise pfn of the poisoned page should be used in order to
>>>> determine the precise vaddr as the SIGBUS payload.
>>>>
>>>> This issue has already been taken care of in the normal path, that is,
>>>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>>>> correctly in the hugetlb repoisoning case, it's essential to inform
>>>> VM the precise poisoned page, not the head page.
>>>>
>>>> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
>>>> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
>>>> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
>>>>
>>>
>>> Thanks for your patch.
>>>
>>>> Cc: <stable@vger.kernel.org>
>>>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>>>> ---
>>>>    mm/memory-failure.c | 22 ++++++++++++----------
>>>>    1 file changed, 12 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>>> index 3edebb0cda30..c9d87811b1ea 100644
>>>> --- a/mm/memory-failure.c
>>>> +++ b/mm/memory-failure.c
>>>> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>>>>    }
>>>>    
>>>>    static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>>> -				unsigned long poisoned_pfn, struct to_kill *tk)
>>>> +				unsigned long poisoned_pfn, struct to_kill *tk,
>>>> +				int pte_nr)
>>>>    {
>>>>    	unsigned long pfn = 0;
>>>> +	unsigned long hwpoison_vaddr;
>>>>    
>>>>    	if (pte_present(pte)) {
>>>>    		pfn = pte_pfn(pte);
>>>> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>>>    			pfn = swp_offset_pfn(swp);
>>>>    	}
>>>>    
>>>> -	if (!pfn || pfn != poisoned_pfn)
>>>> +	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>>>>    		return 0;
>>>
>>> Can we get pte_nr from @shift? I.e. something like "pte_nr = 1UL << (shift - PAGE_SHIFT);"?
>>
>> Why?  Is there any concern with using the macro pages_per_huge_page(h) ?
> 
> No, I was trying to get rid of the new @pte_nr parameter. Something like below:
> 
>   static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
> -                               unsigned long poisoned_pfn, struct to_kill *tk,
> -                               int pte_nr)
> +                               unsigned long poisoned_pfn, struct to_kill *tk)
>   {
>          unsigned long pfn = 0;
>          unsigned long hwpoison_vaddr;
> +       int pte_nr;
> 
>          if (pte_present(pte)) {
>                  pfn = pte_pfn(pte);
> @@ -701,7 +701,8 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>                          pfn = softleaf_to_pfn(entry);
>          }
> 
> -       if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
> +       pte_nr = 1UL << (shift - PAGE_SHIFT);
> +       if (!pfn || (pfn > poisoned_pfn || (pfn +  pte_nr - 1) < poisoned_pfn))
>                  return 0;
> 
>          hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
> 
> So we don't have to pass in pte_nr from all callers. But that's trivial.

Got it, that's better. I will combine yours and Matthew's suggestion in v3.

Thanks a lot!
-jane

> 
> Thanks.
> .
> 




* Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-20 23:13 ` Andrew Morton
@ 2025-12-22 20:32   ` jane.chu
  2025-12-23  0:36   ` jane.chu
  1 sibling, 0 replies; 12+ messages in thread
From: jane.chu @ 2025-12-22 20:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: muchun.song, osalvador, david, linmiaohe, jiaqiyan,
	william.roche, rientjes, lorenzo.stoakes, Liam.Howlett, rppt,
	surenb, mhocko, linux-mm, linux-kernel



On 12/20/2025 3:13 PM, Andrew Morton wrote:
> On Thu, 18 Dec 2025 23:28:19 -0700 Jane Chu <jane.chu@oracle.com> wrote:
> 
>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>> passed head pfn to kill_accessing_process(), that is not right.
>> The precise pfn of the poisoned page should be used in order to
>> determine the precise vaddr as the SIGBUS payload.
>>
>> This issue has already been taken care of in the normal path, that is,
>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>> correctly in the hugetlb repoisoning case, it's essential to inform
>> VM the precise poisoned page, not the head page.
> 
> This conflicts with your "mm/memory-failure: fix missing ->mf_stats
> count in hugetlb poison".
> 
> Also conflicts a bit with "mm: fixup pfnmap memory failure handling to
> use pgoff" but that one isn't cc:stable, so this patch (which *is*
> cc:stable) takes priority.
> 
> Help?
> 

Sorry Andrew.  Let me try to rebase v3 on top of my other patch.  I will also 
take a look at this other conflict.

thanks,
-jane




* Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-20 23:13 ` Andrew Morton
  2025-12-22 20:32   ` jane.chu
@ 2025-12-23  0:36   ` jane.chu
  1 sibling, 0 replies; 12+ messages in thread
From: jane.chu @ 2025-12-23  0:36 UTC (permalink / raw)
  To: Andrew Morton
  Cc: muchun.song, osalvador, david, linmiaohe, jiaqiyan,
	william.roche, rientjes, lorenzo.stoakes, Liam.Howlett, rppt,
	surenb, mhocko, linux-mm, linux-kernel

Hi, Andrew,

On 12/20/2025 3:13 PM, Andrew Morton wrote:
> On Thu, 18 Dec 2025 23:28:19 -0700 Jane Chu <jane.chu@oracle.com> wrote:
> 
>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>> passed head pfn to kill_accessing_process(), that is not right.
>> The precise pfn of the poisoned page should be used in order to
>> determine the precise vaddr as the SIGBUS payload.
>>
>> This issue has already been taken care of in the normal path, that is,
>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>> correctly in the hugetlb repoisoning case, it's essential to inform
>> VM the precise poisoned page, not the head page.
> 
> This conflicts with your "mm/memory-failure: fix missing ->mf_stats
> count in hugetlb poison".
> 
> Also conflicts a bit with "mm: fixup pfnmap memory failure handling to
> use pgoff" but that one isn't cc:stable, so this patch (which *is*
> cc:stable) takes priority.

I looked at
   https://lore.kernel.org/lkml/20251213044708.3610-2-ankita@nvidia.com/
and it looks like we're changing different functions, so perhaps the
conflict is peripheral?

thanks,
-jane

> 
> Help?
> 




end of thread, other threads:[~2025-12-23  0:37 UTC | newest]

Thread overview: 12+ messages
2025-12-19  6:28 [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn Jane Chu
2025-12-19  8:01 ` Miaohe Lin
2025-12-19  8:06   ` jane.chu
2025-12-22  3:01     ` Miaohe Lin
2025-12-22 20:29       ` jane.chu
2025-12-19 17:27 ` Liam R. Howlett
2025-12-19 17:29   ` jane.chu
2025-12-20 23:13 ` Andrew Morton
2025-12-22 20:32   ` jane.chu
2025-12-23  0:36   ` jane.chu
2025-12-21  8:49 ` David Hildenbrand (Red Hat)
2025-12-22 18:42   ` jane.chu
