linux-mm.kvack.org archive mirror
* [PATCH 0/2] mm/vmscan: don't try to reclaim hwpoison folio
@ 2025-03-18  8:39 Jinjiang Tu
  2025-03-18  8:39 ` [PATCH 1/2] mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper Jinjiang Tu
  2025-03-18  8:39 ` [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio Jinjiang Tu
  0 siblings, 2 replies; 8+ messages in thread
From: Jinjiang Tu @ 2025-03-18  8:39 UTC (permalink / raw)
  To: akpm, linmiaohe, nao.horiguchi, david
  Cc: linux-mm, wangkefeng.wang, sunnanyong, tujinjiang

Fix a bug during memory reclaim when the folio is hwpoisoned.

Jinjiang Tu (2):
  mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
  mm/vmscan: don't try to reclaim hwpoison folio

 include/linux/page-flags.h | 6 ++++++
 mm/memory_hotplug.c        | 3 +--
 mm/shmem.c                 | 3 +--
 mm/vmscan.c                | 7 +++++++
 4 files changed, 15 insertions(+), 4 deletions(-)

-- 
2.43.0




* [PATCH 1/2] mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
  2025-03-18  8:39 [PATCH 0/2] mm/vmscan: don't try to reclaim hwpoison folio Jinjiang Tu
@ 2025-03-18  8:39 ` Jinjiang Tu
  2025-03-20  2:36   ` Miaohe Lin
  2025-04-01 16:28   ` David Hildenbrand
  2025-03-18  8:39 ` [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio Jinjiang Tu
  1 sibling, 2 replies; 8+ messages in thread
From: Jinjiang Tu @ 2025-03-18  8:39 UTC (permalink / raw)
  To: akpm, linmiaohe, nao.horiguchi, david
  Cc: linux-mm, wangkefeng.wang, sunnanyong, tujinjiang

Introduce the helper folio_contain_hwpoisoned_page() to check whether the
entire folio is hwpoisoned or whether it contains hwpoisoned pages.

Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
 include/linux/page-flags.h | 6 ++++++
 mm/memory_hotplug.c        | 3 +--
 mm/shmem.c                 | 3 +--
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 36d283552f80..be2f0017a667 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -1104,6 +1104,12 @@ static inline bool is_page_hwpoison(const struct page *page)
 	return folio_test_hugetlb(folio) && PageHWPoison(&folio->page);
 }
 
+static inline bool folio_contain_hwpoisoned_page(struct folio *folio)
+{
+	return folio_test_hwpoison(folio) ||
+	    (folio_test_large(folio) && folio_test_has_hwpoisoned(folio));
+}
+
 bool is_free_buddy_page(const struct page *page);
 
 PAGEFLAG(Isolated, isolated, PF_ANY);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 16cf9e17077e..75401866fb76 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1828,8 +1828,7 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		if (unlikely(page_folio(page) != folio))
 			goto put_folio;
 
-		if (folio_test_hwpoison(folio) ||
-		    (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
+		if (folio_contain_hwpoisoned_page(folio)) {
 			if (WARN_ON(folio_test_lru(folio)))
 				folio_isolate_lru(folio);
 			if (folio_mapped(folio)) {
diff --git a/mm/shmem.c b/mm/shmem.c
index 1ede0800e846..1dd513d82332 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3302,8 +3302,7 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
 	if (ret)
 		return ret;
 
-	if (folio_test_hwpoison(folio) ||
-	    (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
+	if (folio_contain_hwpoisoned_page(folio)) {
 		folio_unlock(folio);
 		folio_put(folio);
 		return -EIO;
-- 
2.43.0




* [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio
  2025-03-18  8:39 [PATCH 0/2] mm/vmscan: don't try to reclaim hwpoison folio Jinjiang Tu
  2025-03-18  8:39 ` [PATCH 1/2] mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper Jinjiang Tu
@ 2025-03-18  8:39 ` Jinjiang Tu
  2025-03-20  2:50   ` Miaohe Lin
  2025-04-01 16:36   ` David Hildenbrand
  1 sibling, 2 replies; 8+ messages in thread
From: Jinjiang Tu @ 2025-03-18  8:39 UTC (permalink / raw)
  To: akpm, linmiaohe, nao.horiguchi, david
  Cc: linux-mm, wangkefeng.wang, sunnanyong, tujinjiang

Syzkaller reports a bug as follows:

Injecting memory failure for pfn 0x18b00e at process virtual address 0x20ffd000
Memory failure: 0x18b00e: dirty swapcache page still referenced by 2 users
Memory failure: 0x18b00e: recovery action for dirty swapcache page: Failed
page: refcount:2 mapcount:0 mapping:0000000000000000 index:0x20ffd pfn:0x18b00e
memcg:ffff0000dd6d9000
anon flags: 0x5ffffe00482011(locked|dirty|arch_1|swapbacked|hwpoison|node=0|zone=2|lastcpupid=0xfffff)
raw: 005ffffe00482011 dead000000000100 dead000000000122 ffff0000e232a7c9
raw: 0000000000020ffd 0000000000000000 00000002ffffffff ffff0000dd6d9000
page dumped because: VM_BUG_ON_FOLIO(!folio_test_uptodate(folio))
------------[ cut here ]------------
kernel BUG at mm/swap_state.c:184!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
Modules linked in:
CPU: 0 PID: 60 Comm: kswapd0 Not tainted 6.6.0-gcb097e7de84e #3
Hardware name: linux,dummy-virt (DT)
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : add_to_swap+0xbc/0x158
lr : add_to_swap+0xbc/0x158
sp : ffff800087f37340
x29: ffff800087f37340 x28: fffffc00052c0380 x27: ffff800087f37780
x26: ffff800087f37490 x25: ffff800087f37c78 x24: ffff800087f377a0
x23: ffff800087f37c50 x22: 0000000000000000 x21: fffffc00052c03b4
x20: 0000000000000000 x19: fffffc00052c0380 x18: 0000000000000000
x17: 296f696c6f662865 x16: 7461646f7470755f x15: 747365745f6f696c
x14: 6f6621284f494c4f x13: 0000000000000001 x12: ffff600036d8b97b
x11: 1fffe00036d8b97a x10: ffff600036d8b97a x9 : dfff800000000000
x8 : 00009fffc9274686 x7 : ffff0001b6c5cbd3 x6 : 0000000000000001
x5 : ffff0000c25896c0 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000000000000000 x1 : ffff0000c25896c0 x0 : 0000000000000000
Call trace:
 add_to_swap+0xbc/0x158
 shrink_folio_list+0x12ac/0x2648
 shrink_inactive_list+0x318/0x948
 shrink_lruvec+0x450/0x720
 shrink_node_memcgs+0x280/0x4a8
 shrink_node+0x128/0x978
 balance_pgdat+0x4f0/0xb20
 kswapd+0x228/0x438
 kthread+0x214/0x230
 ret_from_fork+0x10/0x20

I can reproduce this issue with the following steps:
1) While a dirty swapcache page is isolated by the reclaim process and
the page isn't locked, inject memory failure for the page.
me_swapcache_dirty() clears the uptodate flag and tries to delete the
page from the LRU, but fails. The reclaim process then puts the
hwpoisoned page back on the LRU.
2) The process that maps the hwpoisoned page exits and the page is
deleted from the swapcache, but the page will never be freed and will
stay on the LRU forever.
3) If reclaim is triggered again and tries to reclaim the page,
add_to_swap() triggers VM_BUG_ON_FOLIO because the uptodate flag has
been cleared (see the sketch below).
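
For reference, a paraphrased sketch (not the verbatim kernel source;
details may differ) of the two pieces involved:

	/* Step 1: me_swapcache_dirty() (mm/memory-failure.c), abridged.
	 * The uptodate flag is cleared; removing the page from the LRU
	 * fails because reclaim has already isolated it, so reclaim
	 * later puts the poisoned page back on the LRU. */
	folio_clear_dirty(folio);
	folio_clear_uptodate(folio);	/* add_to_swap() trips over this */

	/* Step 3: the assertion in add_to_swap() (mm/swap_state.c:184
	 * per the oops above) that then fires: */
	VM_BUG_ON_FOLIO(!folio_test_uptodate(folio), folio);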

To fix it, skip the hwpoisoned page in shrink_folio_list(). Besides, the
hwpoisoned folio may not have been unmapped by hwpoison_user_mappings()
yet; unmap it in shrink_folio_list(), otherwise hwpoison_user_mappings()
will fail to unmap the folio since it isn't on the LRU list.

Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
 mm/vmscan.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2d73d497bdd5..ca3757b137d9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1112,6 +1112,13 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 		if (!folio_trylock(folio))
 			goto keep;
 
+		if (folio_contain_hwpoisoned_page(folio)) {
+			unmap_poisoned_folio(folio, folio_pfn(folio), false);
+			folio_unlock(folio);
+			folio_put(folio);
+			continue;
+		}
+
 		VM_BUG_ON_FOLIO(folio_test_active(folio), folio);
 
 		nr_pages = folio_nr_pages(folio);
-- 
2.43.0




* Re: [PATCH 1/2] mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
  2025-03-18  8:39 ` [PATCH 1/2] mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper Jinjiang Tu
@ 2025-03-20  2:36   ` Miaohe Lin
  2025-04-01 16:28   ` David Hildenbrand
  1 sibling, 0 replies; 8+ messages in thread
From: Miaohe Lin @ 2025-03-20  2:36 UTC (permalink / raw)
  To: Jinjiang Tu
  Cc: linux-mm, wangkefeng.wang, sunnanyong, akpm, nao.horiguchi, david

On 2025/3/18 16:39, Jinjiang Tu wrote:
> Introduce the helper folio_contain_hwpoisoned_page() to check whether the
> entire folio is hwpoisoned or whether it contains hwpoisoned pages.
> 
> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>

LGTM.

Acked-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks.
.



* Re: [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio
  2025-03-18  8:39 ` [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio Jinjiang Tu
@ 2025-03-20  2:50   ` Miaohe Lin
  2025-03-20  3:37     ` Jinjiang Tu
  2025-04-01 16:36   ` David Hildenbrand
  1 sibling, 1 reply; 8+ messages in thread
From: Miaohe Lin @ 2025-03-20  2:50 UTC (permalink / raw)
  To: Jinjiang Tu
  Cc: linux-mm, wangkefeng.wang, sunnanyong, akpm, nao.horiguchi, david

On 2025/3/18 16:39, Jinjiang Tu wrote:
> Syzkaller reports a bug as follows:

Thanks for your fix.

> 
> Injecting memory failure for pfn 0x18b00e at process virtual address 0x20ffd000
> Memory failure: 0x18b00e: dirty swapcache page still referenced by 2 users
> Memory failure: 0x18b00e: recovery action for dirty swapcache page: Failed
> page: refcount:2 mapcount:0 mapping:0000000000000000 index:0x20ffd pfn:0x18b00e
> memcg:ffff0000dd6d9000
> anon flags: 0x5ffffe00482011(locked|dirty|arch_1|swapbacked|hwpoison|node=0|zone=2|lastcpupid=0xfffff)
> raw: 005ffffe00482011 dead000000000100 dead000000000122 ffff0000e232a7c9
> raw: 0000000000020ffd 0000000000000000 00000002ffffffff ffff0000dd6d9000
> page dumped because: VM_BUG_ON_FOLIO(!folio_test_uptodate(folio))
> ------------[ cut here ]------------
> kernel BUG at mm/swap_state.c:184!
> Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
> Modules linked in:
> CPU: 0 PID: 60 Comm: kswapd0 Not tainted 6.6.0-gcb097e7de84e #3
> Hardware name: linux,dummy-virt (DT)
> pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : add_to_swap+0xbc/0x158
> lr : add_to_swap+0xbc/0x158
> sp : ffff800087f37340
> x29: ffff800087f37340 x28: fffffc00052c0380 x27: ffff800087f37780
> x26: ffff800087f37490 x25: ffff800087f37c78 x24: ffff800087f377a0
> x23: ffff800087f37c50 x22: 0000000000000000 x21: fffffc00052c03b4
> x20: 0000000000000000 x19: fffffc00052c0380 x18: 0000000000000000
> x17: 296f696c6f662865 x16: 7461646f7470755f x15: 747365745f6f696c
> x14: 6f6621284f494c4f x13: 0000000000000001 x12: ffff600036d8b97b
> x11: 1fffe00036d8b97a x10: ffff600036d8b97a x9 : dfff800000000000
> x8 : 00009fffc9274686 x7 : ffff0001b6c5cbd3 x6 : 0000000000000001
> x5 : ffff0000c25896c0 x4 : 0000000000000000 x3 : 0000000000000000
> x2 : 0000000000000000 x1 : ffff0000c25896c0 x0 : 0000000000000000
> Call trace:
>  add_to_swap+0xbc/0x158
>  shrink_folio_list+0x12ac/0x2648
>  shrink_inactive_list+0x318/0x948
>  shrink_lruvec+0x450/0x720
>  shrink_node_memcgs+0x280/0x4a8
>  shrink_node+0x128/0x978
>  balance_pgdat+0x4f0/0xb20
>  kswapd+0x228/0x438
>  kthread+0x214/0x230
>  ret_from_fork+0x10/0x20
> 

There are too many races in memory_failure to handle...

> I can reproduce this issue with the following steps:
> 1) While a dirty swapcache page is isolated by the reclaim process and
> the page isn't locked, inject memory failure for the page.
> me_swapcache_dirty() clears the uptodate flag and tries to delete the
> page from the LRU, but fails. The reclaim process then puts the
> hwpoisoned page back on the LRU.

The hwpoisoned page is put back on the LRU list because memory_failure() holds an extra page refcount?

> 2) The process that maps the hwpoisoned page exits and the page is
> deleted from the swapcache, but the page will never be freed and will
> stay on the LRU forever.

Again, memory_failure holds the extra page refcnt so...

> 3) If reclaim is triggered again and tries to reclaim the page,
> add_to_swap() triggers VM_BUG_ON_FOLIO because the uptodate flag has
> been cleared.
> 
> To fix it, skip the hwpoisoned page in shrink_folio_list(). Besides, the
> hwpoisoned folio may not have been unmapped by hwpoison_user_mappings()
> yet; unmap it in shrink_folio_list(), otherwise hwpoison_user_mappings()
> will fail to unmap the folio since it isn't on the LRU list.
> 
> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>

Acked-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks.
.



* Re: [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio
  2025-03-20  2:50   ` Miaohe Lin
@ 2025-03-20  3:37     ` Jinjiang Tu
  0 siblings, 0 replies; 8+ messages in thread
From: Jinjiang Tu @ 2025-03-20  3:37 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: linux-mm, wangkefeng.wang, sunnanyong, akpm, nao.horiguchi, david


On 2025/3/20 10:50, Miaohe Lin wrote:
> On 2025/3/18 16:39, Jinjiang Tu wrote:
>> Syzkaller reports a bug as follows:
> Thanks for your fix.
>
>> Injecting memory failure for pfn 0x18b00e at process virtual address 0x20ffd000
>> Memory failure: 0x18b00e: dirty swapcache page still referenced by 2 users
>> Memory failure: 0x18b00e: recovery action for dirty swapcache page: Failed
>> page: refcount:2 mapcount:0 mapping:0000000000000000 index:0x20ffd pfn:0x18b00e
>> memcg:ffff0000dd6d9000
>> anon flags: 0x5ffffe00482011(locked|dirty|arch_1|swapbacked|hwpoison|node=0|zone=2|lastcpupid=0xfffff)
>> raw: 005ffffe00482011 dead000000000100 dead000000000122 ffff0000e232a7c9
>> raw: 0000000000020ffd 0000000000000000 00000002ffffffff ffff0000dd6d9000
>> page dumped because: VM_BUG_ON_FOLIO(!folio_test_uptodate(folio))
>> ------------[ cut here ]------------
>> kernel BUG at mm/swap_state.c:184!
>> Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
>> Modules linked in:
>> CPU: 0 PID: 60 Comm: kswapd0 Not tainted 6.6.0-gcb097e7de84e #3
>> Hardware name: linux,dummy-virt (DT)
>> pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> pc : add_to_swap+0xbc/0x158
>> lr : add_to_swap+0xbc/0x158
>> sp : ffff800087f37340
>> x29: ffff800087f37340 x28: fffffc00052c0380 x27: ffff800087f37780
>> x26: ffff800087f37490 x25: ffff800087f37c78 x24: ffff800087f377a0
>> x23: ffff800087f37c50 x22: 0000000000000000 x21: fffffc00052c03b4
>> x20: 0000000000000000 x19: fffffc00052c0380 x18: 0000000000000000
>> x17: 296f696c6f662865 x16: 7461646f7470755f x15: 747365745f6f696c
>> x14: 6f6621284f494c4f x13: 0000000000000001 x12: ffff600036d8b97b
>> x11: 1fffe00036d8b97a x10: ffff600036d8b97a x9 : dfff800000000000
>> x8 : 00009fffc9274686 x7 : ffff0001b6c5cbd3 x6 : 0000000000000001
>> x5 : ffff0000c25896c0 x4 : 0000000000000000 x3 : 0000000000000000
>> x2 : 0000000000000000 x1 : ffff0000c25896c0 x0 : 0000000000000000
>> Call trace:
>>   add_to_swap+0xbc/0x158
>>   shrink_folio_list+0x12ac/0x2648
>>   shrink_inactive_list+0x318/0x948
>>   shrink_lruvec+0x450/0x720
>>   shrink_node_memcgs+0x280/0x4a8
>>   shrink_node+0x128/0x978
>>   balance_pgdat+0x4f0/0xb20
>>   kswapd+0x228/0x438
>>   kthread+0x214/0x230
>>   ret_from_fork+0x10/0x20
>>
> There are too many races in memory_failure to handle...
>
>> I can reproduce this issue with the following steps:
>> 1) While a dirty swapcache page is isolated by the reclaim process and
>> the page isn't locked, inject memory failure for the page.
>> me_swapcache_dirty() clears the uptodate flag and tries to delete the
>> page from the LRU, but fails. The reclaim process then puts the
>> hwpoisoned page back on the LRU.
> The hwpoisoned page is put back on the LRU list because memory_failure() holds an extra page refcount?

Yes

>
>> 2) The process that maps the hwpoisoned page exits and the page is
>> deleted from the swapcache, but the page will never be freed and will
>> stay on the LRU forever.
> Again, memory_failure holds the extra page refcnt so...
>
>> 3) If reclaim is triggered again and tries to reclaim the page,
>> add_to_swap() triggers VM_BUG_ON_FOLIO because the uptodate flag has
>> been cleared.
>>
>> To fix it, skip the hwpoisoned page in shrink_folio_list(). Besides, the
>> hwpoisoned folio may not have been unmapped by hwpoison_user_mappings()
>> yet; unmap it in shrink_folio_list(), otherwise hwpoison_user_mappings()
>> will fail to unmap the folio since it isn't on the LRU list.
>>
>> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
> Acked-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks for your review.

>
> Thanks.
> .



* Re: [PATCH 1/2] mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
  2025-03-18  8:39 ` [PATCH 1/2] mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper Jinjiang Tu
  2025-03-20  2:36   ` Miaohe Lin
@ 2025-04-01 16:28   ` David Hildenbrand
  1 sibling, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2025-04-01 16:28 UTC (permalink / raw)
  To: Jinjiang Tu, akpm, linmiaohe, nao.horiguchi
  Cc: linux-mm, wangkefeng.wang, sunnanyong

On 18.03.25 09:39, Jinjiang Tu wrote:
> Introduce the helper folio_contain_hwpoisoned_page() to check whether the
> entire folio is hwpoisoned or whether it contains hwpoisoned pages.
> 
> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
> ---
>   include/linux/page-flags.h | 6 ++++++
>   mm/memory_hotplug.c        | 3 +--
>   mm/shmem.c                 | 3 +--
>   3 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 36d283552f80..be2f0017a667 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -1104,6 +1104,12 @@ static inline bool is_page_hwpoison(const struct page *page)
>   	return folio_test_hugetlb(folio) && PageHWPoison(&folio->page);
>   }
>   
> +static inline bool folio_contain_hwpoisoned_page(struct folio *folio)

"folio_contains_hwpoisoned_page"

Also make sure to indent

	return folio_test_hwpoison(folio) ||
	       (folio_test_large(folio) && folio_test_has_hwpoisoned(folio));
	       ^ this way
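
Taken together, i.e. with the rename and the indentation applied, the
helper would read (sketch):

	static inline bool folio_contains_hwpoisoned_page(struct folio *folio)
	{
		return folio_test_hwpoison(folio) ||
		       (folio_test_large(folio) && folio_test_has_hwpoisoned(folio));
	}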

With that

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb




* Re: [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio
  2025-03-18  8:39 ` [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio Jinjiang Tu
  2025-03-20  2:50   ` Miaohe Lin
@ 2025-04-01 16:36   ` David Hildenbrand
  1 sibling, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2025-04-01 16:36 UTC (permalink / raw)
  To: Jinjiang Tu, akpm, linmiaohe, nao.horiguchi
  Cc: linux-mm, wangkefeng.wang, sunnanyong

On 18.03.25 09:39, Jinjiang Tu wrote:
> Syzkaller reports a bug as follows:
> 
> Injecting memory failure for pfn 0x18b00e at process virtual address 0x20ffd000
> Memory failure: 0x18b00e: dirty swapcache page still referenced by 2 users
> Memory failure: 0x18b00e: recovery action for dirty swapcache page: Failed
> page: refcount:2 mapcount:0 mapping:0000000000000000 index:0x20ffd pfn:0x18b00e
> memcg:ffff0000dd6d9000
> anon flags: 0x5ffffe00482011(locked|dirty|arch_1|swapbacked|hwpoison|node=0|zone=2|lastcpupid=0xfffff)
> raw: 005ffffe00482011 dead000000000100 dead000000000122 ffff0000e232a7c9
> raw: 0000000000020ffd 0000000000000000 00000002ffffffff ffff0000dd6d9000
> page dumped because: VM_BUG_ON_FOLIO(!folio_test_uptodate(folio))
> ------------[ cut here ]------------
> kernel BUG at mm/swap_state.c:184!
> Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
> Modules linked in:
> CPU: 0 PID: 60 Comm: kswapd0 Not tainted 6.6.0-gcb097e7de84e #3
> Hardware name: linux,dummy-virt (DT)
> pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : add_to_swap+0xbc/0x158
> lr : add_to_swap+0xbc/0x158
> sp : ffff800087f37340
> x29: ffff800087f37340 x28: fffffc00052c0380 x27: ffff800087f37780
> x26: ffff800087f37490 x25: ffff800087f37c78 x24: ffff800087f377a0
> x23: ffff800087f37c50 x22: 0000000000000000 x21: fffffc00052c03b4
> x20: 0000000000000000 x19: fffffc00052c0380 x18: 0000000000000000
> x17: 296f696c6f662865 x16: 7461646f7470755f x15: 747365745f6f696c
> x14: 6f6621284f494c4f x13: 0000000000000001 x12: ffff600036d8b97b
> x11: 1fffe00036d8b97a x10: ffff600036d8b97a x9 : dfff800000000000
> x8 : 00009fffc9274686 x7 : ffff0001b6c5cbd3 x6 : 0000000000000001
> x5 : ffff0000c25896c0 x4 : 0000000000000000 x3 : 0000000000000000
> x2 : 0000000000000000 x1 : ffff0000c25896c0 x0 : 0000000000000000
> Call trace:
>   add_to_swap+0xbc/0x158
>   shrink_folio_list+0x12ac/0x2648
>   shrink_inactive_list+0x318/0x948
>   shrink_lruvec+0x450/0x720
>   shrink_node_memcgs+0x280/0x4a8
>   shrink_node+0x128/0x978
>   balance_pgdat+0x4f0/0xb20
>   kswapd+0x228/0x438
>   kthread+0x214/0x230
>   ret_from_fork+0x10/0x20
> 
> I can reproduce this issue with the following steps:
> 1) While a dirty swapcache page is isolated by the reclaim process and
> the page isn't locked, inject memory failure for the page.
> me_swapcache_dirty() clears the uptodate flag and tries to delete the
> page from the LRU, but fails. The reclaim process then puts the
> hwpoisoned page back on the LRU.
> 2) The process that maps the hwpoisoned page exits and the page is
> deleted from the swapcache, but the page will never be freed and will
> stay on the LRU forever.
> 3) If reclaim is triggered again and tries to reclaim the page,
> add_to_swap() triggers VM_BUG_ON_FOLIO because the uptodate flag has
> been cleared.
> 
> To fix it, skip the hwpoisoned page in shrink_folio_list(). Besides, the
> hwpoisoned folio may not have been unmapped by hwpoison_user_mappings()
> yet; unmap it in shrink_folio_list(), otherwise hwpoison_user_mappings()
> will fail to unmap the folio since it isn't on the LRU list.
> 
> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
> ---
>   mm/vmscan.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 2d73d497bdd5..ca3757b137d9 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1112,6 +1112,13 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>   		if (!folio_trylock(folio))
>   			goto keep;
>   
> +		if (folio_contain_hwpoisoned_page(folio)) {
> +			unmap_poisoned_folio(folio, folio_pfn(folio), false);
> +			folio_unlock(folio);
> +			folio_put(folio);
> +			continue;
> +		}
> +

I was briefly concerned about large folios (if only a single page is 
bad, why unmap all of them?), but memory_failure() will already 
kill_procs_now() in case splitting the large folio failed. So we should 
rarely run into large folios here.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb


