On 2025/8/6 11:05, Miaohe Lin wrote:

On 2025/8/6 10:05, Jinjiang Tu wrote:

When memory_failure() is called for an already hwpoisoned pfn,
kill_accessing_process() will be called to kill current task. However, if

Thanks for your patch.

the vma of the accessing vaddr is VM_PFNMAP, walk_page_range() will skip
the vma in walk_page_test() and return 0.

Before commit aaf99ac2ceb7 ("mm/hwpoison: do not send SIGBUS to processes
with recovered clean pages"), kill_accessing_process() will return EFAULT.

I'm not sure but pfn_to_online_page should return NULL for VM_PFNMAP pages?
So memory_failure_dev_pagemap should handle these pages?

remap_pfn_range() can also be called for pfns that do have a struct page. IIUC, VM_PFNMAP
means we should assume the pfn doesn't have a struct page, but it may have one.

For x86, the current task will be killed in kill_me_maybe().

However, after this commit, kill_accessing_process() simply returns 0,
which means the UCE is treated as handled properly, although it actually
isn't. In such a case, the user task will trigger the UCE infinitely.
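
To make the return-value change above concrete, here is a toy userspace model, not the kernel code. The names toy_vma, toy_walk, toy_kill_before and toy_kill_after are hypothetical. Returning -EHWPOISON when the walk finds the pfn is an assumption that is not stated in this thread. The model also ignores the clean-page case that the commit was written for.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value; fallback for other libcs */
#endif

/* Hypothetical stand-in for a VMA: only the two facts the model needs. */
struct toy_vma {
	bool pfnmap;		/* models VM_PFNMAP */
	bool maps_poisoned_pfn;	/* this VMA maps the hwpoisoned pfn */
};

/*
 * Toy stand-in for the page walk: returns 1 if a visited VMA maps the
 * poisoned pfn, 0 otherwise. VM_PFNMAP VMAs are skipped, as described
 * for walk_page_test() above.
 */
static int toy_walk(const struct toy_vma *vmas, int n)
{
	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		if (vmas[i].pfnmap)
			continue;	/* skipped: never inspected */
		if (vmas[i].maps_poisoned_pfn)
			return 1;
	}
	return 0;
}

/* Model of the behaviour before the commit: "not found" is an error. */
static int toy_kill_before(const struct toy_vma *vmas, int n)
{
	return toy_walk(vmas, n) > 0 ? -EHWPOISON : -EFAULT;
}

/* Model of the behaviour after the commit: "not found" looks like success. */
static int toy_kill_after(const struct toy_vma *vmas, int n)
{
	return toy_walk(vmas, n) > 0 ? -EHWPOISON : 0;
}
```

With a single VM_PFNMAP VMA that maps the poisoned pfn, toy_kill_before() yields -EFAULT while toy_kill_after() yields 0, which is the case where the caller believes the UCE was handled.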

Did you ever trigger this loop?

Yes. Our test steps are as follows:
1) create a user task that allocates a clean anonymous page, without accessing it.
2) use einj to inject a UCE into the page
3) create a task devmem that uses /dev/mem to map the pfn and keeps accessing it.

/dev/mem uses remap_pfn_range() to map the pfn.
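
A minimal sketch of what the devmem task in step 3 could look like is below. It is not our actual test program. The helper names pfn_to_offset, map_pfn and touch_page are hypothetical, and the pfn has to be obtained separately from steps 1) and 2).

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Byte offset into the memory device for a given page frame number. */
static off_t pfn_to_offset(uint64_t pfn, long page_size)
{
	assert(page_size > 0);
	return (off_t)(pfn * (uint64_t)page_size);
}

/* Map one read-only page at 'pfn' from 'dev_path'; NULL on failure. */
static volatile unsigned char *map_pfn(const char *dev_path, uint64_t pfn,
					long page_size)
{
	int fd = open(dev_path, O_RDONLY);
	if (fd < 0)
		return NULL;

	void *p = mmap(NULL, (size_t)page_size, PROT_READ, MAP_SHARED, fd,
		       pfn_to_offset(pfn, page_size));
	close(fd);	/* the mapping stays valid after close() */
	return p == MAP_FAILED ? NULL : (volatile unsigned char *)p;
}

/* Read the first byte of the page 'rounds' times and sum the values. */
static unsigned long touch_page(volatile unsigned char *page,
				unsigned long rounds)
{
	unsigned long sum = 0;

	for (unsigned long i = 0; i < rounds; i++)
		sum += page[0];	/* volatile: each iteration is a real load */
	return sum;
}
```

A caller would pass "/dev/mem", the injected pfn and sysconf(_SC_PAGESIZE) to map_pfn(), then call touch_page() in a loop. This needs root, and presumably a kernel that allows /dev/mem access to RAM (for example, without CONFIG_STRICT_DEVMEM).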

When task devmem first accesses the pfn, a UCE is triggered, and memory_failure()
succeeds in isolating the page because it is a clean user page. But the task devmem isn't killed.

When task devmem accesses the pfn again, since the pfn is already hwpoisoned, kill_accessing_process() is called.
But it fails to kill the accessing task.


Theoretically, if we have several tasks that share the pfn range mapped by remap_pfn_range(), the above issue exists too.


Thanks.