在 2025/8/8 16:21, David Hildenbrand 写道: > On 07.08.25 13:13, Jinjiang Tu wrote: >> >> 在 2025/8/6 20:41, David Hildenbrand 写道: >>> On 06.08.25 05:24, Jinjiang Tu wrote: >>>> >>>> 在 2025/8/6 11:05, Miaohe Lin 写道: >>>>> On 2025/8/6 10:05, Jinjiang Tu wrote: >>>>>> When memory_failure() is called for a already hwpoisoned pfn, >>>>>> kill_accessing_process() will be called to kill current task. >>>>>> However, if >>>>> Thanks for your patch. >>>>> >>>>>> the vma of the accessing vaddr is VM_PFNMAP, walk_page_range() >>>>>> will skip >>>>>> the vma in walk_page_test() and return 0. >>>>>> >>>>>> Before commit aaf99ac2ceb7 ("mm/hwpoison: do not send SIGBUS to >>>>>> processes >>>>>> with recovered clean pages"), kill_accessing_process() will >>>>>> return EFAULT. >>>>> I'm not sure but pfn_to_online_page should return NULL for >>>>> VM_PFNMAP pages? >>>>> So memory_failure_dev_pagemap should handle these pages? >>>> >>>> We could call remap_pfn_range() for those pfns with struct page. >>>> IIUC, VM_PFNMAP >>>> means we should assume the pfn doesn't have struct page, but it can >>>> have. >>>> >>>>>> For x86, the current task will be killed in kill_me_maybe(). >>>>>> >>>>>> However, after this commit, kill_accessing_process() simplies >>>>>> return 0, >>>>>> that means UCE is handled properly, but it doesn't actually. In >>>>>> such case, >>>>>> the user task will trigger UCE infinitely. >>>>> Did you ever trigger this loop? >>>> >>>> Yes. Our test is as follow steps: >>>> 1) create a user task allocates a clean anonymous page, wihout >>>> accessing it. >>>> 2) use einj to inject UCE for the page >>>> 3) create task devmem to use /dev/mem to map the pfn and keep >>>> accessing it. >>> >>> What is the use case for that? It sounds extremely questionable. >>> >> This case is only for test, and is strange indeed. >> >> But considering another case, a driver may map same RAM pfn to >> several processes with remap_pfn_range(). >> If the first task triggers UCE when accessing the pfn, the task will >> be killed. But the other tasks couldn't be killed >> and triggers UCE infinitely. > > Yes, the "anon page" example is confusing though. We really just want > to test here if the PFN is mapped. And I would agree that your patch > is correct in that case. > > For memory poisoning handling you really need a "struct page". > struct-less memory is only handled in special ways for DAX (see > pfn_to_online_page() logic in memory_failure()). > > So what you describe here really only works when a process uses > remap_pfn_range() to VM_PFNMAP a struct-page-backed PFN. > > > Likely your patch description should be: > > " > mm/memory-failure: fix infinite UCE for VM_PFNMAP'ed page > > When memory_failure() is called for an already hardware poisoned page, > kill_accessing_process() will conditionally send a SIGBUS to the > current (triggering) process if it still maps the page. > > However, in case the page is not ordinarily mapped, but was mapped > through remap_pfn_range(), kill_accessing_process() would not identify > it as mapped even though hwpoison_pte_range() would be prepared to > handle it, because walk_page_range() will skip VM_PFNMAP as default in > walk_page_test(). > > walk_page_range() will return 0, assuming "not mapped" and the SIGBUS > will be skipped. In this case, the user task will trigger UCE > infinitely because it will not receive a SIGBUS on access and simply > retry. > > > Before commit aaf99ac2ceb7 ("mm/hwpoison: do not send SIGBUS to > processes with recovered clean pages"), kill_accessing_process() would > return EFAULT in that case, and on x86, the current task would be > killed in kill_me_maybe(). > > Let's fix it by adding our custom .test_walk callback that will also > process VM_PFNMAP VMAs. > " > Thanks, I will update the patch description to emphasize the pfn is backed with struct page.