* [PATCH 0/2] mm/vmscan: don't try to reclaim hwpoison folio
  From: Jinjiang Tu @ 2025-03-18  8:39 UTC
  To: akpm, linmiaohe, nao.horiguchi, david
  Cc: linux-mm, wangkefeng.wang, sunnanyong, tujinjiang

Fix a bug during memory reclaim if the folio is hwpoisoned.

Jinjiang Tu (2):
  mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
  mm/vmscan: don't try to reclaim hwpoison folio

 include/linux/page-flags.h | 6 ++++++
 mm/memory_hotplug.c        | 3 +--
 mm/shmem.c                 | 3 +--
 mm/vmscan.c                | 7 +++++++
 4 files changed, 15 insertions(+), 4 deletions(-)

-- 
2.43.0
* [PATCH 1/2] mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
  From: Jinjiang Tu @ 2025-03-18  8:39 UTC
  To: akpm, linmiaohe, nao.horiguchi, david
  Cc: linux-mm, wangkefeng.wang, sunnanyong, tujinjiang

Introduce helper folio_contain_hwpoisoned_page() to check if the entire
folio is hwpoisoned or it contains hwpoisoned pages.

Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
 include/linux/page-flags.h | 6 ++++++
 mm/memory_hotplug.c        | 3 +--
 mm/shmem.c                 | 3 +--
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 36d283552f80..be2f0017a667 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -1104,6 +1104,12 @@ static inline bool is_page_hwpoison(const struct page *page)
 	return folio_test_hugetlb(folio) && PageHWPoison(&folio->page);
 }
 
+static inline bool folio_contain_hwpoisoned_page(struct folio *folio)
+{
+	return folio_test_hwpoison(folio) ||
+		(folio_test_large(folio) && folio_test_has_hwpoisoned(folio));
+}
+
 bool is_free_buddy_page(const struct page *page);
 
 PAGEFLAG(Isolated, isolated, PF_ANY);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 16cf9e17077e..75401866fb76 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1828,8 +1828,7 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		if (unlikely(page_folio(page) != folio))
 			goto put_folio;
 
-		if (folio_test_hwpoison(folio) ||
-		    (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
+		if (folio_contain_hwpoisoned_page(folio)) {
 			if (WARN_ON(folio_test_lru(folio)))
 				folio_isolate_lru(folio);
 			if (folio_mapped(folio)) {
diff --git a/mm/shmem.c b/mm/shmem.c
index 1ede0800e846..1dd513d82332 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3302,8 +3302,7 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
 	if (ret)
 		return ret;
 
-	if (folio_test_hwpoison(folio) ||
-	    (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
+	if (folio_contain_hwpoisoned_page(folio)) {
 		folio_unlock(folio);
 		folio_put(folio);
 		return -EIO;
-- 
2.43.0
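[For context, a minimal sketch of how a caller can use the new helper; it is
not part of the patch. folio_test_hwpoison() checks the HWPoison flag of the
folio itself, while folio_test_has_hwpoisoned() only applies to large folios
and means at least one subpage is poisoned. check_folio_usable() below is a
hypothetical caller invented purely for illustration; the callers actually
converted by this patch are do_migrate_range() and shmem_write_begin().]

/* Hypothetical caller: bail out before touching a poisoned folio. */
static int check_folio_usable(struct folio *folio)
{
	/*
	 * Covers both a fully hwpoisoned folio and a large folio with
	 * at least one hwpoisoned subpage.
	 */
	if (folio_contain_hwpoisoned_page(folio)) {
		folio_unlock(folio);
		folio_put(folio);
		return -EIO;	/* mirrors the shmem_write_begin() behaviour */
	}
	return 0;
}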
* Re: [PATCH 1/2] mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
  From: Miaohe Lin @ 2025-03-20  2:36 UTC
  To: Jinjiang Tu
  Cc: linux-mm, wangkefeng.wang, sunnanyong, akpm, nao.horiguchi, david

On 2025/3/18 16:39, Jinjiang Tu wrote:
> Introduce helper folio_contain_hwpoisoned_page() to check if the entire
> folio is hwpoisoned or it contains hwpoisoned pages.
>
> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>

LGTM.

Acked-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks.
* Re: [PATCH 1/2] mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
  From: David Hildenbrand @ 2025-04-01 16:28 UTC
  To: Jinjiang Tu, akpm, linmiaohe, nao.horiguchi
  Cc: linux-mm, wangkefeng.wang, sunnanyong

On 18.03.25 09:39, Jinjiang Tu wrote:
> Introduce helper folio_contain_hwpoisoned_page() to check if the entire
> folio is hwpoisoned or it contains hwpoisoned pages.
>
> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
> ---
[...]
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 36d283552f80..be2f0017a667 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -1104,6 +1104,12 @@ static inline bool is_page_hwpoison(const struct page *page)
>  	return folio_test_hugetlb(folio) && PageHWPoison(&folio->page);
>  }
>  
> +static inline bool folio_contain_hwpoisoned_page(struct folio *folio)

"folio_contains_hwpoisoned_page"

Also make sure to indent

	return folio_test_hwpoison(folio) ||
	       (folio_test_large(folio) && folio_test_has_hwpoisoned(folio));
	       ^ this way

With that

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb
* [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio
  From: Jinjiang Tu @ 2025-03-18  8:39 UTC
  To: akpm, linmiaohe, nao.horiguchi, david
  Cc: linux-mm, wangkefeng.wang, sunnanyong, tujinjiang

Syzkaller reports a bug as follows:

Injecting memory failure for pfn 0x18b00e at process virtual address 0x20ffd000
Memory failure: 0x18b00e: dirty swapcache page still referenced by 2 users
Memory failure: 0x18b00e: recovery action for dirty swapcache page: Failed
page: refcount:2 mapcount:0 mapping:0000000000000000 index:0x20ffd pfn:0x18b00e
memcg:ffff0000dd6d9000
anon flags: 0x5ffffe00482011(locked|dirty|arch_1|swapbacked|hwpoison|node=0|zone=2|lastcpupid=0xfffff)
raw: 005ffffe00482011 dead000000000100 dead000000000122 ffff0000e232a7c9
raw: 0000000000020ffd 0000000000000000 00000002ffffffff ffff0000dd6d9000
page dumped because: VM_BUG_ON_FOLIO(!folio_test_uptodate(folio))
------------[ cut here ]------------
kernel BUG at mm/swap_state.c:184!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
Modules linked in:
CPU: 0 PID: 60 Comm: kswapd0 Not tainted 6.6.0-gcb097e7de84e #3
Hardware name: linux,dummy-virt (DT)
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : add_to_swap+0xbc/0x158
lr : add_to_swap+0xbc/0x158
sp : ffff800087f37340
x29: ffff800087f37340 x28: fffffc00052c0380 x27: ffff800087f37780
x26: ffff800087f37490 x25: ffff800087f37c78 x24: ffff800087f377a0
x23: ffff800087f37c50 x22: 0000000000000000 x21: fffffc00052c03b4
x20: 0000000000000000 x19: fffffc00052c0380 x18: 0000000000000000
x17: 296f696c6f662865 x16: 7461646f7470755f x15: 747365745f6f696c
x14: 6f6621284f494c4f x13: 0000000000000001 x12: ffff600036d8b97b
x11: 1fffe00036d8b97a x10: ffff600036d8b97a x9 : dfff800000000000
x8 : 00009fffc9274686 x7 : ffff0001b6c5cbd3 x6 : 0000000000000001
x5 : ffff0000c25896c0 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000000000000000 x1 : ffff0000c25896c0 x0 : 0000000000000000
Call trace:
 add_to_swap+0xbc/0x158
 shrink_folio_list+0x12ac/0x2648
 shrink_inactive_list+0x318/0x948
 shrink_lruvec+0x450/0x720
 shrink_node_memcgs+0x280/0x4a8
 shrink_node+0x128/0x978
 balance_pgdat+0x4f0/0xb20
 kswapd+0x228/0x438
 kthread+0x214/0x230
 ret_from_fork+0x10/0x20

I can reproduce this issue with the following steps:

1) When a dirty swapcache page is isolated by the reclaim process and the
page isn't locked, inject memory failure for the page.
me_swapcache_dirty() clears the uptodate flag and tries to delete the page
from the LRU, but fails. The reclaim process then puts the hwpoisoned page
back on the LRU.

2) The process that maps the hwpoisoned page exits and the page is
deleted; the page will never be freed and will stay on the LRU forever.

3) If we trigger a reclaim again and try to reclaim the page,
add_to_swap() will trigger VM_BUG_ON_FOLIO because the uptodate flag has
been cleared.

To fix it, skip the hwpoisoned page in shrink_folio_list(). Besides, the
hwpoison folio may not have been unmapped by hwpoison_user_mappings() yet;
unmap it in shrink_folio_list(), otherwise the folio will fail to be
unmapped by hwpoison_user_mappings() since the folio isn't on the LRU
list.

Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
 mm/vmscan.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2d73d497bdd5..ca3757b137d9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1112,6 +1112,13 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 		if (!folio_trylock(folio))
 			goto keep;
 
+		if (folio_contain_hwpoisoned_page(folio)) {
+			unmap_poisoned_folio(folio, folio_pfn(folio), false);
+			folio_unlock(folio);
+			folio_put(folio);
+			continue;
+		}
+
 		VM_BUG_ON_FOLIO(folio_test_active(folio), folio);
 
 		nr_pages = folio_nr_pages(folio);
-- 
2.43.0
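[As a concrete illustration of the injection in step 1, the sketch below is
a rough userspace skeleton, not something posted in this thread: it maps and
dirties an anonymous page, then injects a memory failure on it with
madvise(MADV_HWPOISON), which requires CONFIG_MEMORY_FAILURE and
CAP_SYS_ADMIN. Hitting the actual race additionally needs the page to be in
the swap cache and already isolated, but not yet locked, by kswapd; that
timing and memory-pressure part is why the report came from syzkaller and is
not shown here.]

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#ifndef MADV_HWPOISON
#define MADV_HWPOISON 100	/* poison a page for testing */
#endif

int main(void)
{
	size_t len = sysconf(_SC_PAGESIZE);
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	memset(p, 0xaa, len);		/* dirty the anonymous page */

	/* Ask the kernel to treat this page as hardware-poisoned. */
	if (madvise(p, len, MADV_HWPOISON) != 0) {
		perror("madvise(MADV_HWPOISON)");
		return 1;
	}
	return 0;
}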
* Re: [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio
  From: Miaohe Lin @ 2025-03-20  2:50 UTC
  To: Jinjiang Tu
  Cc: linux-mm, wangkefeng.wang, sunnanyong, akpm, nao.horiguchi, david

On 2025/3/18 16:39, Jinjiang Tu wrote:
> Syzkaller reports a bug as follows:

Thanks for your fix.

> Injecting memory failure for pfn 0x18b00e at process virtual address 0x20ffd000
> Memory failure: 0x18b00e: dirty swapcache page still referenced by 2 users
> Memory failure: 0x18b00e: recovery action for dirty swapcache page: Failed
[...]
> Call trace:
>  add_to_swap+0xbc/0x158
>  shrink_folio_list+0x12ac/0x2648
[...]
>  kthread+0x214/0x230
>  ret_from_fork+0x10/0x20

There are too many races in memory_failure to handle...

> I can reproduce this issue with the following steps:
>
> 1) When a dirty swapcache page is isolated by the reclaim process and the
> page isn't locked, inject memory failure for the page.
> me_swapcache_dirty() clears the uptodate flag and tries to delete the page
> from the LRU, but fails. The reclaim process then puts the hwpoisoned page
> back on the LRU.

The hwpoisoned page is put back to the lru list due to memory_failure
holding the extra page refcnt?

> 2) The process that maps the hwpoisoned page exits and the page is
> deleted; the page will never be freed and will stay on the LRU forever.

Again, memory_failure holds the extra page refcnt so...

> 3) If we trigger a reclaim again and try to reclaim the page,
> add_to_swap() will trigger VM_BUG_ON_FOLIO because the uptodate flag has
> been cleared.
>
> To fix it, skip the hwpoisoned page in shrink_folio_list(). Besides, the
> hwpoison folio may not have been unmapped by hwpoison_user_mappings() yet;
> unmap it in shrink_folio_list(), otherwise the folio will fail to be
> unmapped by hwpoison_user_mappings() since the folio isn't on the LRU
> list.
>
> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>

Acked-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks.
* Re: [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio
  From: Jinjiang Tu @ 2025-03-20  3:37 UTC
  To: Miaohe Lin
  Cc: linux-mm, wangkefeng.wang, sunnanyong, akpm, nao.horiguchi, david

On 2025/3/20 10:50, Miaohe Lin wrote:
> On 2025/3/18 16:39, Jinjiang Tu wrote:
>> Syzkaller reports a bug as follows:
>
> Thanks for your fix.
>
[...]
>> I can reproduce this issue with the following steps:
>>
>> 1) When a dirty swapcache page is isolated by the reclaim process and the
>> page isn't locked, inject memory failure for the page.
>> me_swapcache_dirty() clears the uptodate flag and tries to delete the page
>> from the LRU, but fails. The reclaim process then puts the hwpoisoned page
>> back on the LRU.
>
> The hwpoisoned page is put back to the lru list due to memory_failure
> holding the extra page refcnt?

Yes.

>> 2) The process that maps the hwpoisoned page exits and the page is
>> deleted; the page will never be freed and will stay on the LRU forever.
>
> Again, memory_failure holds the extra page refcnt so...
>
>> 3) If we trigger a reclaim again and try to reclaim the page,
>> add_to_swap() will trigger VM_BUG_ON_FOLIO because the uptodate flag has
>> been cleared.
>>
>> To fix it, skip the hwpoisoned page in shrink_folio_list(). Besides, the
>> hwpoison folio may not have been unmapped by hwpoison_user_mappings() yet;
>> unmap it in shrink_folio_list(), otherwise the folio will fail to be
>> unmapped by hwpoison_user_mappings() since the folio isn't on the LRU
>> list.
>>
>> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
>
> Acked-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks for your review.
* Re: [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio
  From: David Hildenbrand @ 2025-04-01 16:36 UTC
  To: Jinjiang Tu, akpm, linmiaohe, nao.horiguchi
  Cc: linux-mm, wangkefeng.wang, sunnanyong

On 18.03.25 09:39, Jinjiang Tu wrote:
> Syzkaller reports a bug as follows:
>
> Injecting memory failure for pfn 0x18b00e at process virtual address 0x20ffd000
> Memory failure: 0x18b00e: dirty swapcache page still referenced by 2 users
> Memory failure: 0x18b00e: recovery action for dirty swapcache page: Failed
[...]
> To fix it, skip the hwpoisoned page in shrink_folio_list(). Besides, the
> hwpoison folio may not have been unmapped by hwpoison_user_mappings() yet;
> unmap it in shrink_folio_list(), otherwise the folio will fail to be
> unmapped by hwpoison_user_mappings() since the folio isn't on the LRU
> list.
>
> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
> ---
>  mm/vmscan.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 2d73d497bdd5..ca3757b137d9 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1112,6 +1112,13 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>  		if (!folio_trylock(folio))
>  			goto keep;
>  
> +		if (folio_contain_hwpoisoned_page(folio)) {
> +			unmap_poisoned_folio(folio, folio_pfn(folio), false);
> +			folio_unlock(folio);
> +			folio_put(folio);
> +			continue;
> +		}
> +

I was briefly concerned about large folios (if only a single page is bad,
why unmap all of them?), but memory_failure() will already kill_procs_now()
in case splitting the large folio failed. So we should rarely run into
large folios here.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb