On 2025/8/14 14:05, jane.chu@oracle.com wrote:

On 8/10/2025 9:33 PM, Jinjiang Tu wrote:
When memory_failure() is called for an already hwpoisoned pfn backed with
struct page, kill_accessing_process() will conditionally send a SIGBUS to
the current (triggering) process if it maps the page.

However, in case the page is not ordinarily mapped, but was mapped through
remap_pfn_range(), kill_accessing_process() wouldn't identify it as mapped
even though hwpoison_pte_range() would be prepared to handle it, because
walk_page_range() skips VM_PFNMAP VMAs by default in walk_page_test(). As
a result, walk_page_range() returns 0, which is taken as "not mapped", and
SIGBUS is skipped. The user task then triggers the UCE (uncorrectable error)
indefinitely because it never receives a SIGBUS on access and simply retries.

Before commit aaf99ac2ceb7 ("mm/hwpoison: do not send SIGBUS to processes
with recovered clean pages"), kill_accessing_process() returned EFAULT.
On x86, the current task was then killed in kill_me_maybe().

To fix it, add .test_walk callback for hwpoison_walk_ops to process
VM_PFNMAP VMAs too.
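
For readers unfamiliar with the page walker, the decision described above can
be modelled in user space roughly as follows. This is only a sketch: the
model_* names and the MODEL_VM_PFNMAP value are hypothetical stand-ins, the
real walk_page_test() takes (start, end, struct mm_walk *) and also consults
->pte_hole, which is left out here.

```c
#include <stddef.h>

/* Stand-in flag for this model only; not the kernel's VM_PFNMAP value. */
#define MODEL_VM_PFNMAP 0x1u

/* Hypothetical, heavily reduced stand-in for struct vm_area_struct. */
struct model_vma {
	unsigned int flags;
};

/* Hypothetical, reduced stand-in for struct mm_walk_ops. */
struct model_walk_ops {
	/* Return 0 to walk the VMA, a positive value to skip it. */
	int (*test_walk)(const struct model_vma *vma);
};

/*
 * Simplified model of the walker's per-VMA test:
 * 0 means "walk this VMA", 1 means "skip this VMA".
 */
static int model_walk_page_test(const struct model_vma *vma,
				const struct model_walk_ops *ops)
{
	/* A caller-supplied test_walk overrides the default policy. */
	if (ops->test_walk)
		return ops->test_walk(vma);

	/* Default policy: VM_PFNMAP VMAs are skipped. */
	if (vma->flags & MODEL_VM_PFNMAP)
		return 1;

	return 0;
}

/* Mirrors the idea of the patch: accept every VMA, including PFNMAP ones. */
static int model_accept_all(const struct model_vma *vma)
{
	(void)vma;
	return 0;
}
```

With no test_walk set, a VMA carrying MODEL_VM_PFNMAP is skipped, so the entry
callback never sees it. With model_accept_all installed, the same VMA is
walked, which is the effect the patch aims for with hwpoison_test_walk().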

Fixes: aaf99ac2ceb7 ("mm/hwpoison: do not send SIGBUS to processes with recovered clean pages")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
Changelog since v1:
  * update patch description, suggested by David Hildenbrand

  mm/memory-failure.c | 7 +++++++
  1 file changed, 7 insertions(+)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index e2e685b971bb..fa6a8f2cdebc 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -853,9 +853,16 @@ static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask,
  #define hwpoison_hugetlb_range    NULL
  #endif
+static int hwpoison_test_walk(unsigned long start, unsigned long end,
+                 struct mm_walk *walk)
+{
+    return 0;
+}
+
  static const struct mm_walk_ops hwpoison_walk_ops = {
      .pmd_entry = hwpoison_pte_range,
      .hugetlb_entry = hwpoison_hugetlb_range,
+    .test_walk = hwpoison_test_walk,
      .walk_lock = PGWALK_RDLOCK,
  };
 

Looks good. Could you add this to stable?
Yes, I will.

Reviewed-by: Jane Chu <jane.chu@oracle.com>

thanks,
-jane