linux-mm.kvack.org archive mirror
* [PATCH v3 1/2] mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
@ 2025-12-23  1:21 Jane Chu
  2025-12-23  1:21 ` [PATCH v3 2/2] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn Jane Chu
  2025-12-23  8:50 ` [PATCH v3 1/2] mm/memory-failure: fix missing ->mf_stats count in hugetlb poison David Hildenbrand (Red Hat)
  0 siblings, 2 replies; 6+ messages in thread
From: Jane Chu @ 2025-12-23  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, stable, muchun.song, osalvador, david, linmiaohe,
	jiaqiyan, william.roche, rientjes, akpm, lorenzo.stoakes,
	Liam.Howlett, rppt, surenb, mhocko, willy

When a newly poisoned subpage ends up in an already poisoned hugetlb
folio, 'num_poisoned_pages' is incremented, but the per-node ->mf_stats
is not. Fix the inconsistency by designating action_result() to update
them both.
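
For context, a rough sketch of the accounting path this relies on,
reconstructed from mm/memory-failure.c around the Fixes: commit; treat
the helper and trace names below as assumptions rather than code from
this series:

static int action_result(unsigned long pfn, enum mf_action_page_type type,
			 int result)
{
	trace_memory_failure_event(pfn, type, result);

	num_poisoned_pages_inc(pfn);		/* global counter */
	update_per_node_mf_stats(pfn, result);	/* per-node ->mf_stats */

	pr_err("%#lx: recovery action for %s: %s\n",
	       pfn, action_page_types[type], action_name[result]);

	return result == MF_RECOVERED ? 0 : -EBUSY;
}

Funneling every hugetlb poison event, including repeat hits on an
already poisoned folio, through this one function keeps both counters
in sync.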

While at it, define the __get_huge_page_for_hwpoison() return values in
terms of symbol names for better readability. Also rename
folio_set_hugetlb_hwpoison() to hugetlb_update_hwpoison(), since the
function does more than the conventional bit setting and is expected to
have three possible return values.

Fixes: 18f41fa616ee4 ("mm: memory-failure: bump memory failure stats to pglist_data")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
v2 -> v3:
  No change.
v1 -> v2:
  Adapted David's and Liam's comments: define the __get_huge_page_for_hwpoison()
return values in terms of symbol names instead of naked integers for better
readability.  #define is used instead of enum since the function has a footprint
outside MF; this keeps the MF specifics local.
  Also renamed folio_set_hugetlb_hwpoison() to hugetlb_update_hwpoison(),
since the function does more than the conventional bit setting and is
expected to have three possible return values.

---
 mm/memory-failure.c | 56 ++++++++++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 23 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index fbc5a01260c8..8b47e8a1b12d 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1883,12 +1883,18 @@ static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 	return count;
 }
 
-static int folio_set_hugetlb_hwpoison(struct folio *folio, struct page *page)
+#define	MF_HUGETLB_ALREADY_POISONED	3  /* already poisoned */
+#define	MF_HUGETLB_ACC_EXISTING_POISON	4  /* accessed existing poisoned page */
+/*
+ * Set hugetlb folio as hwpoisoned, update folio private raw hwpoison list
+ * to keep track of the poisoned pages.
+ */
+static int hugetlb_update_hwpoison(struct folio *folio, struct page *page)
 {
 	struct llist_head *head;
 	struct raw_hwp_page *raw_hwp;
 	struct raw_hwp_page *p;
-	int ret = folio_test_set_hwpoison(folio) ? -EHWPOISON : 0;
+	int ret = folio_test_set_hwpoison(folio) ? MF_HUGETLB_ALREADY_POISONED : 0;
 
 	/*
 	 * Once the hwpoison hugepage has lost reliable raw error info,
@@ -1896,20 +1902,18 @@ static int folio_set_hugetlb_hwpoison(struct folio *folio, struct page *page)
 	 * so skip to add additional raw error info.
 	 */
 	if (folio_test_hugetlb_raw_hwp_unreliable(folio))
-		return -EHWPOISON;
+		return MF_HUGETLB_ALREADY_POISONED;
+
 	head = raw_hwp_list_head(folio);
 	llist_for_each_entry(p, head->first, node) {
 		if (p->page == page)
-			return -EHWPOISON;
+			return MF_HUGETLB_ACC_EXISTING_POISON;
 	}
 
 	raw_hwp = kmalloc(sizeof(struct raw_hwp_page), GFP_ATOMIC);
 	if (raw_hwp) {
 		raw_hwp->page = page;
 		llist_add(&raw_hwp->node, head);
-		/* the first error event will be counted in action_result(). */
-		if (ret)
-			num_poisoned_pages_inc(page_to_pfn(page));
 	} else {
 		/*
 		 * Failed to save raw error info.  We no longer trace all
@@ -1955,32 +1959,30 @@ void folio_clear_hugetlb_hwpoison(struct folio *folio)
 	folio_free_raw_hwp(folio, true);
 }
 
+#define	MF_HUGETLB_FREED			0	/* freed hugepage */
+#define	MF_HUGETLB_IN_USED			1	/* in-use hugepage */
+#define	MF_NOT_HUGETLB				2	/* not a hugepage */
+
 /*
  * Called from hugetlb code with hugetlb_lock held.
- *
- * Return values:
- *   0             - free hugepage
- *   1             - in-use hugepage
- *   2             - not a hugepage
- *   -EBUSY        - the hugepage is busy (try to retry)
- *   -EHWPOISON    - the hugepage is already hwpoisoned
  */
 int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
 				 bool *migratable_cleared)
 {
 	struct page *page = pfn_to_page(pfn);
 	struct folio *folio = page_folio(page);
-	int ret = 2;	/* fallback to normal page handling */
+	int ret = MF_NOT_HUGETLB;
 	bool count_increased = false;
+	int rc;
 
 	if (!folio_test_hugetlb(folio))
 		goto out;
 
 	if (flags & MF_COUNT_INCREASED) {
-		ret = 1;
+		ret = MF_HUGETLB_IN_USED;
 		count_increased = true;
 	} else if (folio_test_hugetlb_freed(folio)) {
-		ret = 0;
+		ret = MF_HUGETLB_FREED;
 	} else if (folio_test_hugetlb_migratable(folio)) {
 		ret = folio_try_get(folio);
 		if (ret)
@@ -1991,8 +1993,9 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
 			goto out;
 	}
 
-	if (folio_set_hugetlb_hwpoison(folio, page)) {
-		ret = -EHWPOISON;
+	rc = hugetlb_update_hwpoison(folio, page);
+	if (rc >= MF_HUGETLB_ALREADY_POISONED) {
+		ret = rc;
 		goto out;
 	}
 
@@ -2029,22 +2032,29 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 	*hugetlb = 1;
 retry:
 	res = get_huge_page_for_hwpoison(pfn, flags, &migratable_cleared);
-	if (res == 2) { /* fallback to normal page handling */
+	switch (res) {
+	case MF_NOT_HUGETLB:	/* fallback to normal page handling */
 		*hugetlb = 0;
 		return 0;
-	} else if (res == -EHWPOISON) {
+	case MF_HUGETLB_ALREADY_POISONED:
+	case MF_HUGETLB_ACC_EXISTING_POISON:
 		if (flags & MF_ACTION_REQUIRED) {
 			folio = page_folio(p);
 			res = kill_accessing_process(current, folio_pfn(folio), flags);
 		}
-		action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
+		if (res == MF_HUGETLB_ALREADY_POISONED)
+			action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
+		else
+			action_result(pfn, MF_MSG_HUGE, MF_FAILED);
 		return res;
-	} else if (res == -EBUSY) {
+	case -EBUSY:
 		if (!(flags & MF_NO_RETRY)) {
 			flags |= MF_NO_RETRY;
 			goto retry;
 		}
 		return action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
+	default:
+		break;
 	}
 
 	folio = page_folio(p);
-- 
2.43.5




* [PATCH v3 2/2] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-23  1:21 [PATCH v3 1/2] mm/memory-failure: fix missing ->mf_stats count in hugetlb poison Jane Chu
@ 2025-12-23  1:21 ` Jane Chu
  2025-12-23  9:13   ` David Hildenbrand (Red Hat)
  2025-12-23  8:50 ` [PATCH v3 1/2] mm/memory-failure: fix missing ->mf_stats count in hugetlb poison David Hildenbrand (Red Hat)
  1 sibling, 1 reply; 6+ messages in thread
From: Jane Chu @ 2025-12-23  1:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, stable, muchun.song, osalvador, david, linmiaohe,
	jiaqiyan, william.roche, rientjes, akpm, lorenzo.stoakes,
	Liam.Howlett, rppt, surenb, mhocko, willy

When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
passes the head pfn to kill_accessing_process(), which is not right.
The precise pfn of the poisoned page should be used in order to
determine the precise vaddr for the SIGBUS payload.

This issue has already been taken care of in the normal path, that is,
hwpoison_user_mappings(), see [1][2].  Furthermore, for [3] to work
correctly in the hugetlb repoisoning case, it's essential to inform the
VM of the precise poisoned page, not the head page.

[1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
[2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
[3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
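
To make the vaddr arithmetic concrete, a worked example with made-up
numbers (the real computation is in check_hwpoisoned_entry() below):
suppose a 2MB hugetlb folio with head pfn 0x1000 is mapped at vaddr
0x7f0000000000 and the poisoned page is tail pfn 0x1003. With 4K base
pages (PAGE_SHIFT = 12):

  hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT)
                 = 0x7f0000000000 + ((0x1003 - 0x1000) << 12)
                 = 0x7f0000003000

Passing the head pfn instead always yields the start of the mapping, so
the SIGBUS payload would point at the wrong 4K page.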

Cc: <stable@vger.kernel.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
v2 -> v3:
  incorporated suggestions from Miaohe and Matthew.
v1 -> v2:
  pickup R-B, add stable to cc list.
---
 mm/memory-failure.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 8b47e8a1b12d..98612ac961b0 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -692,6 +692,8 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
 				unsigned long poisoned_pfn, struct to_kill *tk)
 {
 	unsigned long pfn = 0;
+	unsigned long hwpoison_vaddr;
+	unsigned long mask;
 
 	if (pte_present(pte)) {
 		pfn = pte_pfn(pte);
@@ -702,10 +704,12 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
 			pfn = softleaf_to_pfn(entry);
 	}
 
-	if (!pfn || pfn != poisoned_pfn)
+	mask = ~((1UL << (shift - PAGE_SHIFT)) - 1);
+	if (!pfn || ((pfn & mask) != (poisoned_pfn & mask)))
 		return 0;
 
-	set_to_kill(tk, addr, shift);
+	hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
+	set_to_kill(tk, hwpoison_vaddr, shift);
 	return 1;
 }
 
@@ -2038,10 +2042,8 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 		return 0;
 	case MF_HUGETLB_ALREADY_POISONED:
 	case MF_HUGETLB_ACC_EXISTING_POISON:
-		if (flags & MF_ACTION_REQUIRED) {
-			folio = page_folio(p);
-			res = kill_accessing_process(current, folio_pfn(folio), flags);
-		}
+		if (flags & MF_ACTION_REQUIRED)
+			res = kill_accessing_process(current, pfn, flags);
 		if (res == MF_HUGETLB_ALREADY_POISONED)
 			action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
 		else
-- 
2.43.5




* Re: [PATCH v3 1/2] mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
  2025-12-23  1:21 [PATCH v3 1/2] mm/memory-failure: fix missing ->mf_stats count in hugetlb poison Jane Chu
  2025-12-23  1:21 ` [PATCH v3 2/2] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn Jane Chu
@ 2025-12-23  8:50 ` David Hildenbrand (Red Hat)
  2025-12-23 18:28   ` jane.chu
  1 sibling, 1 reply; 6+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-12-23  8:50 UTC (permalink / raw)
  To: Jane Chu, linux-kernel
  Cc: linux-mm, stable, muchun.song, osalvador, linmiaohe, jiaqiyan,
	william.roche, rientjes, akpm, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mhocko, willy

On 12/23/25 02:21, Jane Chu wrote:
> When a newly poisoned subpage ends up in an already poisoned hugetlb
> folio, 'num_poisoned_pages' is incremented, but the per node ->mf_stats
> is not. Fix the inconsistency by designating action_result() to update
> them both.
> 
> While at it, define __get_huge_page_for_hwpoison() return values in terms
> of symbol names for better readibility. Also rename
> folio_set_hugetlb_hwpoison() to hugetlb_update_hwpoison() since the
> function does more than the conventional bit setting and the fact
> three possible return values are expected.
> 
> Fixes: 18f41fa616ee4 ("mm: memory-failure: bump memory failure stats to pglist_data")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> ---
> v2 -> v3:
>    No change.
> v1 -> v2:
>    adapted David and Liam's comment, define __get_huge_page_for_hwpoison()
> return values in terms of symbol names instead of naked integers for better
> readibility.  #define instead of enum is used since the function has footprint
> outside MF, just try to limit the MF specifics local.
>    also renamed folio_set_hugetlb_hwpoison() to hugetlb_update_hwpoison()
> since the function does more than the conventional bit setting and the
> fact three possible return values are expected.
> 
> ---
>   mm/memory-failure.c | 56 ++++++++++++++++++++++++++-------------------
>   1 file changed, 33 insertions(+), 23 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index fbc5a01260c8..8b47e8a1b12d 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1883,12 +1883,18 @@ static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
>   	return count;
>   }
>   
> -static int folio_set_hugetlb_hwpoison(struct folio *folio, struct page *page)
> +#define	MF_HUGETLB_ALREADY_POISONED	3  /* already poisoned */
> +#define	MF_HUGETLB_ACC_EXISTING_POISON	4  /* accessed existing poisoned page */

What happened to the idea of using an enum?


> +/*
> + * Set hugetlb folio as hwpoisoned, update folio private raw hwpoison list
> + * to keep track of the poisoned pages.
> + */
> +static int hugetlb_update_hwpoison(struct folio *folio, struct page *page)
>   {
>   	struct llist_head *head;
>   	struct raw_hwp_page *raw_hwp;
>   	struct raw_hwp_page *p;
> -	int ret = folio_test_set_hwpoison(folio) ? -EHWPOISON : 0;
> +	int ret = folio_test_set_hwpoison(folio) ? MF_HUGETLB_ALREADY_POISONED : 0;
>   
>   	/*
>   	 * Once the hwpoison hugepage has lost reliable raw error info,
> @@ -1896,20 +1902,18 @@ static int folio_set_hugetlb_hwpoison(struct folio *folio, struct page *page)
>   	 * so skip to add additional raw error info.
>   	 */
>   	if (folio_test_hugetlb_raw_hwp_unreliable(folio))
> -		return -EHWPOISON;
> +		return MF_HUGETLB_ALREADY_POISONED;
> +
>   	head = raw_hwp_list_head(folio);
>   	llist_for_each_entry(p, head->first, node) {
>   		if (p->page == page)
> -			return -EHWPOISON;
> +			return MF_HUGETLB_ACC_EXISTING_POISON;
>   	}
>   
>   	raw_hwp = kmalloc(sizeof(struct raw_hwp_page), GFP_ATOMIC);
>   	if (raw_hwp) {
>   		raw_hwp->page = page;
>   		llist_add(&raw_hwp->node, head);
> -		/* the first error event will be counted in action_result(). */
> -		if (ret)
> -			num_poisoned_pages_inc(page_to_pfn(page));
>   	} else {
>   		/*
>   		 * Failed to save raw error info.  We no longer trace all
> @@ -1955,32 +1959,30 @@ void folio_clear_hugetlb_hwpoison(struct folio *folio)
>   	folio_free_raw_hwp(folio, true);
>   }
>   
> +#define	MF_HUGETLB_FREED			0	/* freed hugepage */
> +#define	MF_HUGETLB_IN_USED			1	/* in-use hugepage */
> +#define	MF_NOT_HUGETLB				2	/* not a hugepage */

If you're already dealing with negative error codes, "MF_NOT_HUGETLB" nicely
translates to -EINVAL.

But I wonder if it would be cleaner to just define all values in an enum and return
that enum instead of an int from the functions.

enum mf_hugetlb_status {
	MF_HUGETLB_INVALID,		/* not a hugetlb folio */
	MF_HUGETLB_BUSY,		/* busy, retry later */
	MF_HUGETLB_FREED,		/* hugetlb folio was freed */
	MF_HUGETLB_IN_USED,		/* ??? no idea what that really means */
	MF_HUGETLB_FOLIO_PRE_POISONED,	/* folio already poisoned, per-page information unclear */
	MF_HUGETLB_PAGE_PRE_POISONED,	/* exact page already poisoned */
};

-- 
Cheers

David



* Re: [PATCH v3 2/2] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-23  1:21 ` [PATCH v3 2/2] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn Jane Chu
@ 2025-12-23  9:13   ` David Hildenbrand (Red Hat)
  2025-12-23 18:17     ` jane.chu
  0 siblings, 1 reply; 6+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-12-23  9:13 UTC (permalink / raw)
  To: Jane Chu, linux-kernel
  Cc: linux-mm, stable, muchun.song, osalvador, linmiaohe, jiaqiyan,
	william.roche, rientjes, akpm, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mhocko, willy

On 12/23/25 02:21, Jane Chu wrote:
> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
> passed head pfn to kill_accessing_process(), that is not right.
> The precise pfn of the poisoned page should be used in order to
> determine the precise vaddr as the SIGBUS payload.
> 
> This issue has already been taken care of in the normal path, that is,
> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
> correctly in the hugetlb repoisoning case, it's essential to inform
> VM the precise poisoned page, not the head page.
> 
> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
> 
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> ---
> v2 -> v3:
>    incorporated suggestions from Miaohe and Matthew.
> v1 -> v2:
>    pickup R-B, add stable to cc list.

Please don't send new versions when the discussions on your old 
submissions are still going on. Makes the whole discussion hard to follow.

You asked in the old version:

"
What happens if a non-head PFN of hugetlb is indicated in a SIGBUS to
QEMU?  Because the regular path, the path via hwpoison_user_mappings(),
already behaves this way.

I'm not familiar with QEMU. AFAIK, the need for this patch came from our
VM/QEMU team.
"

I just took a look and I think it's ok. I remembered a discussion around 
[1] where we concluded that the kernel would always give us the first 
PFN, but essentially the whole hugetlb folio will vanish.

But in QEMU we work completely on the given vaddr, and are able to 
identify that it's a hugetlb folio through our information on memory 
mappings.

QEMU stores a list of poisoned vaddrs, to remap them (e.g., 
fallocate(PUNCH_HOLE)) when restarting the VM. If we get various vaddrs 
for the same hugetlb folio we will simply try to remap a hugetlb folio 
several times, which is not a real problem. I think we discussed that 
that could get optimized as part of [1] (or follow-up versions) if ever 
required.

[1] 
https://lore.kernel.org/qemu-devel/20240910090747.2741475-1-william.roche@oracle.com/
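
For readers less familiar with that QEMU step, a minimal userspace
sketch of the discard-and-refault idea (illustration only, not QEMU
code; the fd/offset bookkeeping and names are assumptions):

#define _GNU_SOURCE
#include <stddef.h>
#include <fcntl.h>
#include <linux/falloc.h>

/*
 * Drop the hugetlb page containing a poisoned vaddr from the hugetlbfs
 * file backing the mapping.  'fd' is that file, 'map_start' the mapping
 * base, 'huge_sz' the (power-of-two) hugetlb page size.
 */
static int discard_poisoned_hugepage(int fd, char *map_start, char *vaddr,
				     size_t huge_sz)
{
	off_t off = (off_t)(((size_t)(vaddr - map_start)) & ~(huge_sz - 1));

	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 off, huge_sz);
}

The range is then re-touched (or re-mapped) before the VM resumes so a
fresh hugetlb page backs it again.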


-- 
Cheers

David



* Re: [PATCH v3 2/2] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
  2025-12-23  9:13   ` David Hildenbrand (Red Hat)
@ 2025-12-23 18:17     ` jane.chu
  0 siblings, 0 replies; 6+ messages in thread
From: jane.chu @ 2025-12-23 18:17 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat), linux-kernel
  Cc: linux-mm, stable, muchun.song, osalvador, linmiaohe, jiaqiyan,
	william.roche, rientjes, akpm, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mhocko, willy

On 12/23/2025 1:13 AM, David Hildenbrand (Red Hat) wrote:
> On 12/23/25 02:21, Jane Chu wrote:
>> When a hugetlb folio is being poisoned again, 
>> try_memory_failure_hugetlb()
>> passed head pfn to kill_accessing_process(), that is not right.
>> The precise pfn of the poisoned page should be used in order to
>> determine the precise vaddr as the SIGBUS payload.
>>
>> This issue has already been taken care of in the normal path, that is,
>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>> correctly in the hugetlb repoisoning case, it's essential to inform
>> VM the precise poisoned page, not the head page.
>>
>> [1] https://lkml.kernel.org/r/20231218135837.3310403-1- 
>> willy@infradead.org
>> [2] https://lkml.kernel.org/r/20250224211445.2663312-1- 
>> jane.chu@oracle.com
>> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1- 
>> jiaqiyan@google.com/
>>
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
>> ---
>> v2 -> v3:
>>    incorporated suggestions from Miaohe and Matthew.
>> v1 -> v2:
>>    pickup R-B, add stable to cc list.
> 
> Please don't send new versions when the discussions on your old 
> submissions are still going on. Makes the whole discussion hard to follow.

Got it, thanks.

> 
> You asked in the old version:
> 
> "
> What happens if non-head PFN of hugetlb is indicated in a SIGBUG to
> QEMU?  Because, the regular path, the path via hwpoison_user_mappings()
> already behave this way.
> 
> I'm not familiar with QEMU. AFAIK, the need for this patch came from our
> VM/QEMU team.
> "
> 
> I just took a look and I think it's ok. I remembered a discussion around 
> [1] where we concluded that the kernel would always give us the first 
> PFN, but essentially the whole hugetlb folio will vanish.
> 
> But in QEMU we work completely on the given vaddr, and are able to 
> identify that it's a hugetlb folio through our information on memory 
> mappings.
> 
> QEMU stores a list of positioned vaddrs, to remap them (e.g., 
> fallocate(PUNCH_HOLE)) when restarting the VM. If we get various vaddrs 
> for the same hugetlb folio we will simply try to remap a hugetlb folio 
> several times, which is not a real problem. I think we discussed that 
> that could get optimized as part of [1] (or follow-up versions) if ever 
> required.
> 
> [1] https://lore.kernel.org/qemu-devel/20240910090747.2741475-1- 
> william.roche@oracle.com/
> 

Thanks a lot!

-jane
> 




* Re: [PATCH v3 1/2] mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
  2025-12-23  8:50 ` [PATCH v3 1/2] mm/memory-failure: fix missing ->mf_stats count in hugetlb poison David Hildenbrand (Red Hat)
@ 2025-12-23 18:28   ` jane.chu
  0 siblings, 0 replies; 6+ messages in thread
From: jane.chu @ 2025-12-23 18:28 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat), linux-kernel
  Cc: linux-mm, stable, muchun.song, osalvador, linmiaohe, jiaqiyan,
	william.roche, rientjes, akpm, lorenzo.stoakes, Liam.Howlett,
	rppt, surenb, mhocko, willy


On 12/23/2025 12:50 AM, David Hildenbrand (Red Hat) wrote:
> On 12/23/25 02:21, Jane Chu wrote:
>> When a newly poisoned subpage ends up in an already poisoned hugetlb
>> folio, 'num_poisoned_pages' is incremented, but the per node ->mf_stats
>> is not. Fix the inconsistency by designating action_result() to update
>> them both.
>>
>> While at it, define __get_huge_page_for_hwpoison() return values in terms
>> of symbol names for better readibility. Also rename
>> folio_set_hugetlb_hwpoison() to hugetlb_update_hwpoison() since the
>> function does more than the conventional bit setting and the fact
>> three possible return values are expected.
>>
>> Fixes: 18f41fa616ee4 ("mm: memory-failure: bump memory failure stats 
>> to pglist_data")
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>> ---
>> v2 -> v3:
>>    No change.
>> v1 -> v2:
>>    adapted David and Liam's comment, define 
>> __get_huge_page_for_hwpoison()
>> return values in terms of symbol names instead of naked integers for 
>> better
>> readibility.  #define instead of enum is used since the function has 
>> footprint
>> outside MF, just try to limit the MF specifics local.
>>    also renamed folio_set_hugetlb_hwpoison() to hugetlb_update_hwpoison()
>> since the function does more than the conventional bit setting and the
>> fact three possible return values are expected.
>>
>> ---
>>   mm/memory-failure.c | 56 ++++++++++++++++++++++++++-------------------
>>   1 file changed, 33 insertions(+), 23 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index fbc5a01260c8..8b47e8a1b12d 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -1883,12 +1883,18 @@ static unsigned long 
>> __folio_free_raw_hwp(struct folio *folio, bool move_flag)
>>       return count;
>>   }
>> -static int folio_set_hugetlb_hwpoison(struct folio *folio, struct 
>> page *page)
>> +#define    MF_HUGETLB_ALREADY_POISONED    3  /* already poisoned */
>> +#define    MF_HUGETLB_ACC_EXISTING_POISON    4  /* accessed existing 
>> poisoned page */
> 
> What happened to the idea of using an enum?

I briefly mentioned the reason for using #define instead of enum in the
v1 -> v2 comment; more below.

> 
> 
>> +/*
>> + * Set hugetlb folio as hwpoisoned, update folio private raw hwpoison 
>> list
>> + * to keep track of the poisoned pages.
>> + */
>> +static int hugetlb_update_hwpoison(struct folio *folio, struct page 
>> *page)
>>   {
>>       struct llist_head *head;
>>       struct raw_hwp_page *raw_hwp;
>>       struct raw_hwp_page *p;
>> -    int ret = folio_test_set_hwpoison(folio) ? -EHWPOISON : 0;
>> +    int ret = folio_test_set_hwpoison(folio) ? 
>> MF_HUGETLB_ALREADY_POISONED : 0;
>>       /*
>>        * Once the hwpoison hugepage has lost reliable raw error info,
>> @@ -1896,20 +1902,18 @@ static int folio_set_hugetlb_hwpoison(struct 
>> folio *folio, struct page *page)
>>        * so skip to add additional raw error info.
>>        */
>>       if (folio_test_hugetlb_raw_hwp_unreliable(folio))
>> -        return -EHWPOISON;
>> +        return MF_HUGETLB_ALREADY_POISONED;
>> +
>>       head = raw_hwp_list_head(folio);
>>       llist_for_each_entry(p, head->first, node) {
>>           if (p->page == page)
>> -            return -EHWPOISON;
>> +            return MF_HUGETLB_ACC_EXISTING_POISON;
>>       }
>>       raw_hwp = kmalloc(sizeof(struct raw_hwp_page), GFP_ATOMIC);
>>       if (raw_hwp) {
>>           raw_hwp->page = page;
>>           llist_add(&raw_hwp->node, head);
>> -        /* the first error event will be counted in action_result(). */
>> -        if (ret)
>> -            num_poisoned_pages_inc(page_to_pfn(page));
>>       } else {
>>           /*
>>            * Failed to save raw error info.  We no longer trace all
>> @@ -1955,32 +1959,30 @@ void folio_clear_hugetlb_hwpoison(struct folio 
>> *folio)
>>       folio_free_raw_hwp(folio, true);
>>   }
>> +#define    MF_HUGETLB_FREED            0    /* freed hugepage */
>> +#define    MF_HUGETLB_IN_USED            1    /* in-use hugepage */
>> +#define    MF_NOT_HUGETLB                2    /* not a hugepage */
> 
> If you're already dealing with negative error codes, "MF_NOT_HUGETLB" 
> nicely
> translated to -EINVAL.

Agreed, thanks, will make the change in next round.

> 
> But I wonder if it would be cleaner to just define all values in an enum 
> and return
> that enum instead of an int from the functions.
> 
> enum md_hugetlb_status {
>      MF_HUGETLB_INVALID,        /* not a hugetlb folio */
>      MF_HUGETLB_BUSY,        /* busy, retry later */
>      MF_HUGETLB_FREED,        /* hugetlb folio was freed */
>      MF_HUGETLB_IN_USED,        /* ??? no idea what that really means */
>      MF_HUGETLB_FOLIO_PRE_POISONED,    /* folio already poisoned, per- 
> page information unclear */
>      MF_HUGETLB_PAGE_PRE_POISONED,    /* exact page already poisoned */
> }

enum is nicer than #define for sure; the only concern I have is that
this enum, as the return type of __get_huge_page_for_hwpoison()/
get_huge_page_for_hwpoison(), will leave a footprint in
   include/linux/mm.h, which declares __get_huge_page_for_hwpoison(),
   include/linux/hugetlb.h, which declares/defines
get_huge_page_for_hwpoison(),
   and mm/hugetlb.c.
A minimal sketch of that footprint follows below.
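
(Simplified prototypes; placement is hypothetical, only the return type
changes, and the enum values follow the suggestion above.)

/* include/linux/mm.h */
enum mf_hugetlb_status {
	MF_HUGETLB_INVALID,
	MF_HUGETLB_BUSY,
	MF_HUGETLB_FREED,
	MF_HUGETLB_IN_USED,
	MF_HUGETLB_FOLIO_PRE_POISONED,
	MF_HUGETLB_PAGE_PRE_POISONED,
};
enum mf_hugetlb_status __get_huge_page_for_hwpoison(unsigned long pfn,
		int flags, bool *migratable_cleared);

/* include/linux/hugetlb.h */
enum mf_hugetlb_status get_huge_page_for_hwpoison(unsigned long pfn,
		int flags, bool *migratable_cleared);

so the enum definition would in practice have to live in (or be pulled
into) a header visible to both.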

Is there a neat way? or, am I nitpicking? :)

thanks,
-jane




> 




