linux-mm.kvack.org archive mirror
* [PATCH v2 0/3] mm: memory_failure: unmap poisoned filio during migrate properly
@ 2025-01-16  6:16 Wupeng Ma
  2025-01-16  6:16 ` [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio Wupeng Ma
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Wupeng Ma @ 2025-01-16  6:16 UTC (permalink / raw)
  To: akpm, david, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: mawupeng1, linux-mm, linux-kernel

From: Ma Wupeng <mawupeng1@huawei.com>

Fix two bugs that occur when migrating a poisoned folio.

Changelog since v1:
- update ttu flag inside unmap_poisoned_folio.
- check folio ref count before unmapping the HWpoisoned folio.

Ma Wupeng (3):
  mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio
  mm: memory-hotplug: check folio ref count first in do_migrate_rang

 mm/internal.h       |  5 ++--
 mm/memory-failure.c | 61 +++++++++++++++++++++++----------------------
 mm/memory_hotplug.c | 22 ++++++++--------
 3 files changed, 44 insertions(+), 44 deletions(-)

-- 
2.43.0




* [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-16  6:16 [PATCH v2 0/3] mm: memory_failure: unmap poisoned filio during migrate properly Wupeng Ma
@ 2025-01-16  6:16 ` Wupeng Ma
  2025-01-17  3:57   ` kernel test robot
                     ` (4 more replies)
  2025-01-16  6:16 ` [PATCH v2 2/3] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio Wupeng Ma
  2025-01-16  6:16 ` [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang Wupeng Ma
  2 siblings, 5 replies; 22+ messages in thread
From: Wupeng Ma @ 2025-01-16  6:16 UTC (permalink / raw)
  To: akpm, david, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: mawupeng1, linux-mm, linux-kernel

From: Ma Wupeng <mawupeng1@huawei.com>

Commit 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to
TTU_HWPOISON") introduced TTU_HWPOISON to replace TTU_IGNORE_HWPOISON
in order to stop sending a SIGBUS signal when an error page is accessed
after a memory error on a clean folio. However, during page migration,
anon folios must be unmapped with TTU_HWPOISON set in unmap_*(). For
pagecache folios we need a policy like the one in
hwpoison_user_mappings() to decide whether to set this flag. So move
this policy from hwpoison_user_mappings() to unmap_poisoned_folio() to
handle this warning properly.
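
For readers who want the policy in isolation, here is a rough userspace
sketch of how the flag is chosen. It is not the kernel code: the
model_* names, the flag values and the model_folio_state struct are
made up for illustration, and each bool merely stands in for the kernel
helper named in its comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for enum ttu_flags; the values are arbitrary. */
enum model_ttu_flags {
	MODEL_TTU_SYNC         = 1 << 0,
	MODEL_TTU_HWPOISON     = 1 << 1,
	MODEL_TTU_IGNORE_MLOCK = 1 << 2,
};

/* Hypothetical snapshot of the folio state the policy looks at. */
struct model_folio_state {
	bool in_swapcache;   /* models folio_test_swapcache() */
	bool dirty;          /* models folio_test_dirty() */
	bool has_wb_mapping; /* models mapping && mapping_can_writeback() */
	bool pte_dirty;      /* models folio_mkclean() returning nonzero */
};

/* Start with TTU_HWPOISON set and drop it only where the patch does. */
static unsigned int model_select_ttu(struct model_folio_state *f,
				     bool must_kill)
{
	unsigned int ttu = MODEL_TTU_IGNORE_MLOCK | MODEL_TTU_SYNC |
			   MODEL_TTU_HWPOISON;

	/* Poisoned page is kept in the swap cache. */
	if (f->in_swapcache)
		ttu &= ~MODEL_TTU_HWPOISON;

	if (!must_kill && !f->dirty && f->has_wb_mapping) {
		if (f->pte_dirty)
			f->dirty = true; /* propagate the PTE dirty bit */
		else
			ttu &= ~MODEL_TTU_HWPOISON; /* clean: just drop it */
	}

	return ttu;
}
```

In this model an anon folio (no writeback-capable mapping, not in the
swap cache) always keeps the HWPOISON flag, which is the case the
warning below is about.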

The following warning is produced when unmapping a poisoned folio:

  ------------[ cut here ]------------
  WARNING: CPU: 1 PID: 365 at mm/rmap.c:1847 try_to_unmap_one+0x8fc/0xd3c
  Modules linked in:
  CPU: 1 UID: 0 PID: 365 Comm: bash Tainted: G        W          6.13.0-rc1-00018-gacdb4bbda7ab #42
  Tainted: [W]=WARN
  Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
  pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : try_to_unmap_one+0x8fc/0xd3c
  lr : try_to_unmap_one+0x3dc/0xd3c
  Call trace:
   try_to_unmap_one+0x8fc/0xd3c (P)
   try_to_unmap_one+0x3dc/0xd3c (L)
   rmap_walk_anon+0xdc/0x1f8
   rmap_walk+0x3c/0x58
   try_to_unmap+0x88/0x90
   unmap_poisoned_folio+0x30/0xa8
   do_migrate_range+0x4a0/0x568
   offline_pages+0x5a4/0x670
   memory_block_action+0x17c/0x374
   memory_subsys_offline+0x3c/0x78
   device_offline+0xa4/0xd0
   state_store+0x8c/0xf0
   dev_attr_store+0x18/0x2c
   sysfs_kf_write+0x44/0x54
   kernfs_fop_write_iter+0x118/0x1a8
   vfs_write+0x3a8/0x4bc
   ksys_write+0x6c/0xf8
   __arm64_sys_write+0x1c/0x28
   invoke_syscall+0x44/0x100
   el0_svc_common.constprop.0+0x40/0xe0
   do_el0_svc+0x1c/0x28
   el0_svc+0x30/0xd0
   el0t_64_sync_handler+0xc8/0xcc
   el0t_64_sync+0x198/0x19c
  ---[ end trace 0000000000000000 ]---

Fixes: 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON")
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Suggested-by: David Hildenbrand <david@redhat.com>
---
 mm/internal.h       |  5 ++--
 mm/memory-failure.c | 61 +++++++++++++++++++++++----------------------
 mm/memory_hotplug.c |  3 ++-
 3 files changed, 36 insertions(+), 33 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 9826f7dce607..3caee67c0abd 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1102,7 +1102,7 @@ static inline int find_next_best_node(int node, nodemask_t *used_node_mask)
  * mm/memory-failure.c
  */
 #ifdef CONFIG_MEMORY_FAILURE
-void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu);
+int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
 void shake_folio(struct folio *folio);
 extern int hwpoison_filter(struct page *p);
 
@@ -1125,8 +1125,9 @@ unsigned long page_mapped_in_vma(const struct page *page,
 		struct vm_area_struct *vma);
 
 #else
-static inline void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
+static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
 {
+	return -EBUSY;
 }
 #endif
 
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index a7b8ccd29b6f..b5212b6e330a 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1556,8 +1556,34 @@ static int get_hwpoison_page(struct page *p, unsigned long flags)
 	return ret;
 }
 
-void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
+int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill)
 {
+	enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_SYNC | TTU_HWPOISON;
+	struct address_space *mapping;
+
+	if (folio_test_swapcache(folio)) {
+		pr_err("%#lx: keeping poisoned page in swap cache\n", pfn);
+		ttu &= ~TTU_HWPOISON;
+	}
+
+	/*
+	 * Propagate the dirty bit from PTEs to struct page first, because we
+	 * need this to decide if we should kill or just drop the page.
+	 * XXX: the dirty test could be racy: set_page_dirty() may not always
+	 * be called inside page lock (it's recommended but not enforced).
+	 */
+	mapping = folio_mapping(folio);
+	if (!must_kill && !folio_test_dirty(folio) && mapping &&
+	    mapping_can_writeback(mapping)) {
+		if (folio_mkclean(folio)) {
+			folio_set_dirty(folio);
+		} else {
+			ttu &= ~TTU_HWPOISON;
+			pr_info("%#lx: corrupted page was clean: dropped without side effects\n",
+				pfn);
+		}
+	}
+
 	if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
 		struct address_space *mapping;
 
@@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
 		if (!mapping) {
 			pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
 				folio_pfn(folio));
-			return;
+			return -EBUSY;
 		}
 
 		try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
@@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
 	} else {
 		try_to_unmap(folio, ttu);
 	}
+
+	return folio_mapped(folio) ? -EBUSY : 0;
 }
 
 /*
@@ -1589,8 +1617,6 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
 static bool hwpoison_user_mappings(struct folio *folio, struct page *p,
 		unsigned long pfn, int flags)
 {
-	enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_SYNC | TTU_HWPOISON;
-	struct address_space *mapping;
 	LIST_HEAD(tokill);
 	bool unmap_success;
 	int forcekill;
@@ -1613,29 +1639,6 @@ static bool hwpoison_user_mappings(struct folio *folio, struct page *p,
 	if (!folio_mapped(folio))
 		return true;
 
-	if (folio_test_swapcache(folio)) {
-		pr_err("%#lx: keeping poisoned page in swap cache\n", pfn);
-		ttu &= ~TTU_HWPOISON;
-	}
-
-	/*
-	 * Propagate the dirty bit from PTEs to struct page first, because we
-	 * need this to decide if we should kill or just drop the page.
-	 * XXX: the dirty test could be racy: set_page_dirty() may not always
-	 * be called inside page lock (it's recommended but not enforced).
-	 */
-	mapping = folio_mapping(folio);
-	if (!(flags & MF_MUST_KILL) && !folio_test_dirty(folio) && mapping &&
-	    mapping_can_writeback(mapping)) {
-		if (folio_mkclean(folio)) {
-			folio_set_dirty(folio);
-		} else {
-			ttu &= ~TTU_HWPOISON;
-			pr_info("%#lx: corrupted page was clean: dropped without side effects\n",
-				pfn);
-		}
-	}
-
 	/*
 	 * First collect all the processes that have the page
 	 * mapped in dirty form.  This has to be done before try_to_unmap,
@@ -1643,9 +1646,7 @@ static bool hwpoison_user_mappings(struct folio *folio, struct page *p,
 	 */
 	collect_procs(folio, p, &tokill, flags & MF_ACTION_REQUIRED);
 
-	unmap_poisoned_folio(folio, ttu);
-
-	unmap_success = !folio_mapped(folio);
+	unmap_success = !unmap_poisoned_folio(folio, pfn, flags & MF_MUST_KILL);
 	if (!unmap_success)
 		pr_err("%#lx: failed to unmap page (folio mapcount=%d)\n",
 		       pfn, folio_mapcount(folio));
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c43b4e7fb298..3de661e57e92 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1806,7 +1806,8 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 			if (WARN_ON(folio_test_lru(folio)))
 				folio_isolate_lru(folio);
 			if (folio_mapped(folio))
-				unmap_poisoned_folio(folio, TTU_IGNORE_MLOCK);
+				unmap_poisoned_folio(folio, pfn, false);
+
 			continue;
 		}
 
-- 
2.43.0




* [PATCH v2 2/3] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio
  2025-01-16  6:16 [PATCH v2 0/3] mm: memory_failure: unmap poisoned filio during migrate properly Wupeng Ma
  2025-01-16  6:16 ` [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio Wupeng Ma
@ 2025-01-16  6:16 ` Wupeng Ma
  2025-01-20  9:25   ` David Hildenbrand
  2025-01-16  6:16 ` [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang Wupeng Ma
  2 siblings, 1 reply; 22+ messages in thread
From: Wupeng Ma @ 2025-01-16  6:16 UTC (permalink / raw)
  To: akpm, david, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: mawupeng1, linux-mm, linux-kernel

From: Ma Wupeng <mawupeng1@huawei.com>

Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to
be offlined") added page poison checks to do_migrate_range() to make
offlining hwpoisoned pages possible, by introducing isolate_lru_page() and
try_to_unmap() for hwpoisoned pages. However, the folio lock must be held
before calling try_to_unmap(). Take the lock to fix this problem.
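
As a toy illustration of the ordering only (not kernel code), the
single-threaded model below treats the folio lock as a plain bool and
makes the unmap step refuse to run without it. All model_* names are
hypothetical, and the -1 return is just a stand-in for the BUG the real
kernel hits in the log below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, single-threaded model of a folio. */
struct model_folio {
	bool locked;
	int mapcount;
};

static void model_folio_lock(struct model_folio *f)
{
	assert(!f->locked);
	f->locked = true;
}

static void model_folio_unlock(struct model_folio *f)
{
	assert(f->locked);
	f->locked = false;
}

/* Unmapping requires the lock; -1 models the failure seen without it. */
static int model_try_to_unmap(struct model_folio *f)
{
	if (!f->locked)
		return -1;
	f->mapcount = 0;
	return 0;
}

/* The shape of the fix: lock, unmap, unlock, and only if still mapped. */
static int model_unmap_poisoned(struct model_folio *f)
{
	int ret = 0;

	if (f->mapcount > 0) {
		model_folio_lock(f);
		ret = model_try_to_unmap(f);
		model_folio_unlock(f);
	}

	return ret;
}
```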

The following BUG is triggered if the folio is not locked during unmap:

  ------------[ cut here ]------------
  kernel BUG at ./include/linux/swapops.h:400!
  Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 4 UID: 0 PID: 411 Comm: bash Tainted: G        W          6.13.0-rc1-00016-g3c434c7ee82a-dirty #41
  Tainted: [W]=WARN
  Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
  pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : try_to_unmap_one+0xb08/0xd3c
  lr : try_to_unmap_one+0x3dc/0xd3c
  Call trace:
   try_to_unmap_one+0xb08/0xd3c (P)
   try_to_unmap_one+0x3dc/0xd3c (L)
   rmap_walk_anon+0xdc/0x1f8
   rmap_walk+0x3c/0x58
   try_to_unmap+0x88/0x90
   unmap_poisoned_folio+0x30/0xa8
   do_migrate_range+0x4a0/0x568
   offline_pages+0x5a4/0x670
   memory_block_action+0x17c/0x374
   memory_subsys_offline+0x3c/0x78
   device_offline+0xa4/0xd0
   state_store+0x8c/0xf0
   dev_attr_store+0x18/0x2c
   sysfs_kf_write+0x44/0x54
   kernfs_fop_write_iter+0x118/0x1a8
   vfs_write+0x3a8/0x4bc
   ksys_write+0x6c/0xf8
   __arm64_sys_write+0x1c/0x28
   invoke_syscall+0x44/0x100
   el0_svc_common.constprop.0+0x40/0xe0
   do_el0_svc+0x1c/0x28
   el0_svc+0x30/0xd0
   el0t_64_sync_handler+0xc8/0xcc
   el0t_64_sync+0x198/0x19c
  Code: f9407be0 b5fff320 d4210000 17ffff97 (d4210000)
  ---[ end trace 0000000000000000 ]---

Fixes: b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined")
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
---
 mm/memory_hotplug.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 3de661e57e92..2815bd4ea483 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1805,8 +1805,11 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		    (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
 			if (WARN_ON(folio_test_lru(folio)))
 				folio_isolate_lru(folio);
-			if (folio_mapped(folio))
+			if (folio_mapped(folio)) {
+				folio_lock(folio);
 				unmap_poisoned_folio(folio, pfn, false);
+				folio_unlock(folio);
+			}
 
 			continue;
 		}
-- 
2.43.0




* [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang
  2025-01-16  6:16 [PATCH v2 0/3] mm: memory_failure: unmap poisoned filio during migrate properly Wupeng Ma
  2025-01-16  6:16 ` [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio Wupeng Ma
  2025-01-16  6:16 ` [PATCH v2 2/3] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio Wupeng Ma
@ 2025-01-16  6:16 ` Wupeng Ma
  2025-01-20  6:32   ` Miaohe Lin
  2025-01-20  8:01   ` David Hildenbrand
  2 siblings, 2 replies; 22+ messages in thread
From: Wupeng Ma @ 2025-01-16  6:16 UTC (permalink / raw)
  To: akpm, david, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: mawupeng1, linux-mm, linux-kernel

From: Ma Wupeng <mawupeng1@huawei.com>

If a folio has a nonzero reference count, folio_try_get() acquires a
reference on it; the caller then performs the necessary operations and
releases the reference. A poisoned folio without an elevated reference
count (which is unlikely for memory-failure) is simply skipped, because
folio_try_get() fails on it.

Therefore, move the folio_try_get() call, which checks and acquires this
reference, to the beginning of the loop.
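
The resulting control flow can be sketched in userspace roughly as
follows. This is an assumption-laden model, not the kernel loop: the
model_* names are invented, the refcount is a plain int instead of an
atomic, and "queued" only stands in for isolating a folio for
migration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the fields the loop cares about. */
struct model_folio {
	int refcount;
	bool hwpoison;
	bool mapped;
};

/* Models folio_try_get(): take a reference unless the count is zero. */
static bool model_folio_try_get(struct model_folio *f)
{
	if (f->refcount == 0)
		return false;
	f->refcount++;
	return true;
}

static void model_folio_put(struct model_folio *f)
{
	f->refcount--;
}

/* Returns how many folios would be queued for migration. */
static int model_scan(struct model_folio *folios, int n)
{
	int queued = 0;

	for (int i = 0; i < n; i++) {
		struct model_folio *f = &folios[i];

		/* Reference first: folios with a zero count are skipped. */
		if (!model_folio_try_get(f))
			continue;

		if (f->hwpoison) {
			/* Stand-in for lock + unmap_poisoned_folio() + unlock. */
			f->mapped = false;
			goto put_folio;
		}

		queued++;
put_folio:
		model_folio_put(f);
	}

	return queued;
}
```

Every path that took a reference now leaves through put_folio, so the
poisoned case can no longer use a bare continue.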

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
---
 mm/memory_hotplug.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 2815bd4ea483..3fb75ee185c6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1786,6 +1786,9 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		page = pfn_to_page(pfn);
 		folio = page_folio(page);
 
+		if (!folio_try_get(folio))
+			continue;
+
 		/*
 		 * No reference or lock is held on the folio, so it might
 		 * be modified concurrently (e.g. split).  As such,
@@ -1795,12 +1798,6 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		if (folio_test_large(folio))
 			pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
 
-		/*
-		 * HWPoison pages have elevated reference counts so the migration would
-		 * fail on them. It also doesn't make any sense to migrate them in the
-		 * first place. Still try to unmap such a page in case it is still mapped
-		 * (keep the unmap as the catch all safety net).
-		 */
 		if (folio_test_hwpoison(folio) ||
 		    (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
 			if (WARN_ON(folio_test_lru(folio)))
@@ -1811,12 +1808,9 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 				folio_unlock(folio);
 			}
 
-			continue;
+			goto put_folio;
 		}
 
-		if (!folio_try_get(folio))
-			continue;
-
 		if (unlikely(page_folio(page) != folio))
 			goto put_folio;
 
-- 
2.43.0




* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-16  6:16 ` [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio Wupeng Ma
@ 2025-01-17  3:57   ` kernel test robot
  2025-01-17  4:16     ` mawupeng
  2025-01-17  4:39   ` kernel test robot
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: kernel test robot @ 2025-01-17  3:57 UTC (permalink / raw)
  To: Wupeng Ma, akpm, david, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: oe-kbuild-all, mawupeng1, linux-mm, linux-kernel

Hi Wupeng,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Wupeng-Ma/mm-memory-failure-update-ttu-flag-inside-unmap_poisoned_folio/20250116-142614
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20250116061657.227027-2-mawupeng1%40huawei.com
patch subject: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
config: s390-randconfig-001-20250117 (https://download.01.org/0day-ci/archive/20250117/202501171300.SYkofFQ6-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250117/202501171300.SYkofFQ6-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501171300.SYkofFQ6-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   In file included from mm/page_isolation.c:13:
>> mm/internal.h:1142:1: error: expected identifier or '(' before '{' token
    1142 | {
         | ^
>> mm/internal.h:1141:19: warning: 'unmap_poisoned_folio' declared 'static' but never defined [-Wunused-function]
    1141 | static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
         |                   ^~~~~~~~~~~~~~~~~~~~
--
   In file included from mm/damon/paddr.c:19:
>> mm/damon/../internal.h:1142:1: error: expected identifier or '(' before '{' token
    1142 | {
         | ^
>> mm/damon/../internal.h:1141:19: warning: 'unmap_poisoned_folio' declared 'static' but never defined [-Wunused-function]
    1141 | static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
         |                   ^~~~~~~~~~~~~~~~~~~~


vim +1142 mm/internal.h

31d3d3484f9bd2 Wu Fengguang            2009-12-16  1121  
7c116f2b0dbac4 Wu Fengguang            2009-12-16  1122  extern u32 hwpoison_filter_dev_major;
7c116f2b0dbac4 Wu Fengguang            2009-12-16  1123  extern u32 hwpoison_filter_dev_minor;
478c5ffc0b5052 Wu Fengguang            2009-12-16  1124  extern u64 hwpoison_filter_flags_mask;
478c5ffc0b5052 Wu Fengguang            2009-12-16  1125  extern u64 hwpoison_filter_flags_value;
4fd466eb46a6a9 Andi Kleen              2009-12-16  1126  extern u64 hwpoison_filter_memcg;
1bfe5febe34d2b Haicheng Li             2009-12-16  1127  extern u32 hwpoison_filter_enable;
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1128  #define MAGIC_HWPOISON	0x48575053U	/* HWPS */
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1129  void SetPageHWPoisonTakenOff(struct page *page);
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1130  void ClearPageHWPoisonTakenOff(struct page *page);
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1131  bool take_page_off_buddy(struct page *page);
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1132  bool put_page_back_buddy(struct page *page);
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1133  struct task_struct *task_early_kill(struct task_struct *tsk, int force_early);
68158bfa3dbd4a Matthew Wilcox (Oracle  2024-10-05  1134) void add_to_kill_ksm(struct task_struct *tsk, const struct page *p,
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1135  		     struct vm_area_struct *vma, struct list_head *to_kill,
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1136  		     unsigned long ksm_addr);
68158bfa3dbd4a Matthew Wilcox (Oracle  2024-10-05  1137) unsigned long page_mapped_in_vma(const struct page *page,
68158bfa3dbd4a Matthew Wilcox (Oracle  2024-10-05  1138) 		struct vm_area_struct *vma);
eb36c5873b96e8 Al Viro                 2012-05-30  1139  
16038c4fffd802 Kefeng Wang             2024-08-27  1140  #else
2b5df10a15dc74 Ma Wupeng               2025-01-16 @1141  static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
16038c4fffd802 Kefeng Wang             2024-08-27 @1142  {
2b5df10a15dc74 Ma Wupeng               2025-01-16  1143  	return -EBUSY;
16038c4fffd802 Kefeng Wang             2024-08-27  1144  }
16038c4fffd802 Kefeng Wang             2024-08-27  1145  #endif
16038c4fffd802 Kefeng Wang             2024-08-27  1146  
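
Both diagnostics above point at the ';' that ends line 1141: it turns
the stub into a bare prototype, leaving the '{' on line 1142 without a
function to belong to. A minimal standalone sketch of the stub without
that semicolon is shown here. The forward declaration of struct folio
and the use of <errno.h> for EBUSY are assumptions made so the snippet
compiles outside the kernel tree; it is not necessarily what a later
version of the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct folio;	/* opaque here; the kernel defines it in its own headers */

/* !CONFIG_MEMORY_FAILURE stub: no ';' between the declarator and the body. */
static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn,
				       bool must_kill)
{
	return -EBUSY;
}
```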

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-17  3:57   ` kernel test robot
@ 2025-01-17  4:16     ` mawupeng
  0 siblings, 0 replies; 22+ messages in thread
From: mawupeng @ 2025-01-17  4:16 UTC (permalink / raw)
  To: lkp, akpm, david, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: mawupeng1, oe-kbuild-all, linux-mm, linux-kernel



On 2025/1/17 11:57, kernel test robot wrote:
> Hi Wupeng,
> 
> kernel test robot noticed the following build errors:

Thanks.

Will be fixed later.

> 
> [auto build test ERROR on akpm-mm/mm-everything]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Wupeng-Ma/mm-memory-failure-update-ttu-flag-inside-unmap_poisoned_folio/20250116-142614
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link:    https://lore.kernel.org/r/20250116061657.227027-2-mawupeng1%40huawei.com
> patch subject: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
> config: s390-randconfig-001-20250117 (https://download.01.org/0day-ci/archive/20250117/202501171300.SYkofFQ6-lkp@intel.com/config)
> compiler: s390-linux-gcc (GCC) 14.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250117/202501171300.SYkofFQ6-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202501171300.SYkofFQ6-lkp@intel.com/
> 
> All error/warnings (new ones prefixed by >>):
> 
>    In file included from mm/page_isolation.c:13:
>>> mm/internal.h:1142:1: error: expected identifier or '(' before '{' token
>     1142 | {
>          | ^
>>> mm/internal.h:1141:19: warning: 'unmap_poisoned_folio' declared 'static' but never defined [-Wunused-function]
>     1141 | static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
>          |                   ^~~~~~~~~~~~~~~~~~~~
> --
>    In file included from mm/damon/paddr.c:19:
>>> mm/damon/../internal.h:1142:1: error: expected identifier or '(' before '{' token
>     1142 | {
>          | ^
>>> mm/damon/../internal.h:1141:19: warning: 'unmap_poisoned_folio' declared 'static' but never defined [-Wunused-function]
>     1141 | static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
>          |                   ^~~~~~~~~~~~~~~~~~~~
> 
> 
> vim +1142 mm/internal.h
> 
> 31d3d3484f9bd2 Wu Fengguang            2009-12-16  1121  
> 7c116f2b0dbac4 Wu Fengguang            2009-12-16  1122  extern u32 hwpoison_filter_dev_major;
> 7c116f2b0dbac4 Wu Fengguang            2009-12-16  1123  extern u32 hwpoison_filter_dev_minor;
> 478c5ffc0b5052 Wu Fengguang            2009-12-16  1124  extern u64 hwpoison_filter_flags_mask;
> 478c5ffc0b5052 Wu Fengguang            2009-12-16  1125  extern u64 hwpoison_filter_flags_value;
> 4fd466eb46a6a9 Andi Kleen              2009-12-16  1126  extern u64 hwpoison_filter_memcg;
> 1bfe5febe34d2b Haicheng Li             2009-12-16  1127  extern u32 hwpoison_filter_enable;
> 3a78f77fd1fb82 Miaohe Lin              2024-06-12  1128  #define MAGIC_HWPOISON	0x48575053U	/* HWPS */
> 3a78f77fd1fb82 Miaohe Lin              2024-06-12  1129  void SetPageHWPoisonTakenOff(struct page *page);
> 3a78f77fd1fb82 Miaohe Lin              2024-06-12  1130  void ClearPageHWPoisonTakenOff(struct page *page);
> 3a78f77fd1fb82 Miaohe Lin              2024-06-12  1131  bool take_page_off_buddy(struct page *page);
> 3a78f77fd1fb82 Miaohe Lin              2024-06-12  1132  bool put_page_back_buddy(struct page *page);
> 3a78f77fd1fb82 Miaohe Lin              2024-06-12  1133  struct task_struct *task_early_kill(struct task_struct *tsk, int force_early);
> 68158bfa3dbd4a Matthew Wilcox (Oracle  2024-10-05  1134) void add_to_kill_ksm(struct task_struct *tsk, const struct page *p,
> 3a78f77fd1fb82 Miaohe Lin              2024-06-12  1135  		     struct vm_area_struct *vma, struct list_head *to_kill,
> 3a78f77fd1fb82 Miaohe Lin              2024-06-12  1136  		     unsigned long ksm_addr);
> 68158bfa3dbd4a Matthew Wilcox (Oracle  2024-10-05  1137) unsigned long page_mapped_in_vma(const struct page *page,
> 68158bfa3dbd4a Matthew Wilcox (Oracle  2024-10-05  1138) 		struct vm_area_struct *vma);
> eb36c5873b96e8 Al Viro                 2012-05-30  1139  
> 16038c4fffd802 Kefeng Wang             2024-08-27  1140  #else
> 2b5df10a15dc74 Ma Wupeng               2025-01-16 @1141  static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
> 16038c4fffd802 Kefeng Wang             2024-08-27 @1142  {
> 2b5df10a15dc74 Ma Wupeng               2025-01-16  1143  	return -EBUSY;
> 16038c4fffd802 Kefeng Wang             2024-08-27  1144  }
> 16038c4fffd802 Kefeng Wang             2024-08-27  1145  #endif
> 16038c4fffd802 Kefeng Wang             2024-08-27  1146  
> 




* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-16  6:16 ` [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio Wupeng Ma
  2025-01-17  3:57   ` kernel test robot
@ 2025-01-17  4:39   ` kernel test robot
  2025-01-17  4:49   ` kernel test robot
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 22+ messages in thread
From: kernel test robot @ 2025-01-17  4:39 UTC (permalink / raw)
  To: Wupeng Ma, akpm, david, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: llvm, oe-kbuild-all, mawupeng1, linux-mm, linux-kernel

Hi Wupeng,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Wupeng-Ma/mm-memory-failure-update-ttu-flag-inside-unmap_poisoned_folio/20250116-142614
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20250116061657.227027-2-mawupeng1%40huawei.com
patch subject: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
config: s390-randconfig-002-20250117 (https://download.01.org/0day-ci/archive/20250117/202501171201.GNMkmauO-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project c23f2417dc5f6dc371afb07af5627ec2a9d373a0)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250117/202501171201.GNMkmauO-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501171201.GNMkmauO-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from mm/mprotect.c:12:
   In file included from include/linux/pagewalk.h:5:
   In file included from include/linux/mm.h:2224:
   include/linux/vmstat.h:504:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     504 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     505 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:511:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     511 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     512 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:524:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     524 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     525 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from mm/mprotect.c:30:
   include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      47 |         __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
         |                                    ~~~~~~~~~~~ ^ ~~~
   include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      49 |                                 NR_ZONE_LRU_BASE + lru, nr_pages);
         |                                 ~~~~~~~~~~~~~~~~ ^ ~~~
   In file included from mm/mprotect.c:41:
>> mm/internal.h:1142:1: error: expected identifier or '('
    1142 | {
         | ^
   5 warnings and 1 error generated.
--
   In file included from mm/vmscan.c:15:
   In file included from include/linux/mm.h:2224:
   include/linux/vmstat.h:504:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     504 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     505 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:511:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     511 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     512 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:524:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     524 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     525 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from mm/vmscan.c:30:
   include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      47 |         __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
         |                                    ~~~~~~~~~~~ ^ ~~~
   include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      49 |                                 NR_ZONE_LRU_BASE + lru, nr_pages);
         |                                 ~~~~~~~~~~~~~~~~ ^ ~~~
   In file included from mm/vmscan.c:68:
>> mm/internal.h:1142:1: error: expected identifier or '('
    1142 | {
         | ^
   mm/vmscan.c:409:51: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     409 |                         size += zone_page_state(zone, NR_ZONE_LRU_BASE + lru);
         |                                                       ~~~~~~~~~~~~~~~~ ^ ~~~
   mm/vmscan.c:1770:4: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
    1770 |                         __count_zid_vm_events(PGSCAN_SKIP, zid, nr_skipped[zid]);
         |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
     139 |         __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
         |                           ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
   mm/vmscan.c:2276:51: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
    2276 |         inactive = lruvec_page_state(lruvec, NR_LRU_BASE + inactive_lru);
         |                                              ~~~~~~~~~~~ ^ ~~~~~~~~~~~~
   mm/vmscan.c:2277:49: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
    2277 |         active = lruvec_page_state(lruvec, NR_LRU_BASE + active_lru);
         |                                            ~~~~~~~~~~~ ^ ~~~~~~~~~~
   mm/vmscan.c:6292:3: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
    6292 |                 __count_zid_vm_events(ALLOCSTALL, sc->reclaim_idx, 1);
         |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
     139 |         __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
         |                           ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
   10 warnings and 1 error generated.
--
   In file included from mm/page_alloc.c:19:
   In file included from include/linux/mm.h:2224:
   include/linux/vmstat.h:504:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     504 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     505 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:511:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     511 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     512 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:524:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     524 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     525 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from mm/page_alloc.c:44:
   include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      47 |         __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
         |                                    ~~~~~~~~~~~ ^ ~~~
   include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      49 |                                 NR_ZONE_LRU_BASE + lru, nr_pages);
         |                                 ~~~~~~~~~~~~~~~~ ^ ~~~
   In file included from mm/page_alloc.c:59:
>> mm/internal.h:1142:1: error: expected identifier or '('
    1142 | {
         | ^
   mm/page_alloc.c:2933:2: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
    2933 |         __count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
         |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
     139 |         __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
         |                           ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
   mm/page_alloc.c:3050:3: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
    3050 |                 __count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
         |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
     139 |         __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
         |                           ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
   mm/page_alloc.c:4683:2: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
    4683 |         __count_zid_vm_events(PGALLOC, zone_idx(zone), nr_account);
         |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
     139 |         __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
         |                           ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
   8 warnings and 1 error generated.


vim +1142 mm/internal.h

31d3d3484f9bd2 Wu Fengguang            2009-12-16  1121  
7c116f2b0dbac4 Wu Fengguang            2009-12-16  1122  extern u32 hwpoison_filter_dev_major;
7c116f2b0dbac4 Wu Fengguang            2009-12-16  1123  extern u32 hwpoison_filter_dev_minor;
478c5ffc0b5052 Wu Fengguang            2009-12-16  1124  extern u64 hwpoison_filter_flags_mask;
478c5ffc0b5052 Wu Fengguang            2009-12-16  1125  extern u64 hwpoison_filter_flags_value;
4fd466eb46a6a9 Andi Kleen              2009-12-16  1126  extern u64 hwpoison_filter_memcg;
1bfe5febe34d2b Haicheng Li             2009-12-16  1127  extern u32 hwpoison_filter_enable;
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1128  #define MAGIC_HWPOISON	0x48575053U	/* HWPS */
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1129  void SetPageHWPoisonTakenOff(struct page *page);
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1130  void ClearPageHWPoisonTakenOff(struct page *page);
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1131  bool take_page_off_buddy(struct page *page);
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1132  bool put_page_back_buddy(struct page *page);
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1133  struct task_struct *task_early_kill(struct task_struct *tsk, int force_early);
68158bfa3dbd4a Matthew Wilcox (Oracle  2024-10-05  1134) void add_to_kill_ksm(struct task_struct *tsk, const struct page *p,
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1135  		     struct vm_area_struct *vma, struct list_head *to_kill,
3a78f77fd1fb82 Miaohe Lin              2024-06-12  1136  		     unsigned long ksm_addr);
68158bfa3dbd4a Matthew Wilcox (Oracle  2024-10-05  1137) unsigned long page_mapped_in_vma(const struct page *page,
68158bfa3dbd4a Matthew Wilcox (Oracle  2024-10-05  1138) 		struct vm_area_struct *vma);
eb36c5873b96e8 Al Viro                 2012-05-30  1139  
16038c4fffd802 Kefeng Wang             2024-08-27  1140  #else
2b5df10a15dc74 Ma Wupeng               2025-01-16  1141  static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
16038c4fffd802 Kefeng Wang             2024-08-27 @1142  {
2b5df10a15dc74 Ma Wupeng               2025-01-16  1143  	return -EBUSY;
16038c4fffd802 Kefeng Wang             2024-08-27  1144  }
16038c4fffd802 Kefeng Wang             2024-08-27  1145  #endif
16038c4fffd802 Kefeng Wang             2024-08-27  1146  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-16  6:16 ` [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio Wupeng Ma
  2025-01-17  3:57   ` kernel test robot
  2025-01-17  4:39   ` kernel test robot
@ 2025-01-17  4:49   ` kernel test robot
  2025-01-20  6:24   ` Miaohe Lin
  2025-01-20  7:55   ` David Hildenbrand
  4 siblings, 0 replies; 22+ messages in thread
From: kernel test robot @ 2025-01-17  4:49 UTC (permalink / raw)
  To: Wupeng Ma, akpm, david, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: oe-kbuild-all, mawupeng1, linux-mm, linux-kernel

Hi Wupeng,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Wupeng-Ma/mm-memory-failure-update-ttu-flag-inside-unmap_poisoned_folio/20250116-142614
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20250116061657.227027-2-mawupeng1%40huawei.com
patch subject: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
config: x86_64-buildonly-randconfig-002-20250117 (https://download.01.org/0day-ci/archive/20250117/202501171215.KLa8VDaS-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250117/202501171215.KLa8VDaS-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501171215.KLa8VDaS-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from mm/memory_hotplug.c:41:
   mm/internal.h:1142:1: error: expected identifier or '(' before '{' token
    1142 | {
         | ^
>> mm/internal.h:1141:19: warning: 'unmap_poisoned_folio' used but never defined
    1141 | static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
         |                   ^~~~~~~~~~~~~~~~~~~~


vim +/unmap_poisoned_folio +1141 mm/internal.h

  1121	
  1122	extern u32 hwpoison_filter_dev_major;
  1123	extern u32 hwpoison_filter_dev_minor;
  1124	extern u64 hwpoison_filter_flags_mask;
  1125	extern u64 hwpoison_filter_flags_value;
  1126	extern u64 hwpoison_filter_memcg;
  1127	extern u32 hwpoison_filter_enable;
  1128	#define MAGIC_HWPOISON	0x48575053U	/* HWPS */
  1129	void SetPageHWPoisonTakenOff(struct page *page);
  1130	void ClearPageHWPoisonTakenOff(struct page *page);
  1131	bool take_page_off_buddy(struct page *page);
  1132	bool put_page_back_buddy(struct page *page);
  1133	struct task_struct *task_early_kill(struct task_struct *tsk, int force_early);
  1134	void add_to_kill_ksm(struct task_struct *tsk, const struct page *p,
  1135			     struct vm_area_struct *vma, struct list_head *to_kill,
  1136			     unsigned long ksm_addr);
  1137	unsigned long page_mapped_in_vma(const struct page *page,
  1138			struct vm_area_struct *vma);
  1139	
  1140	#else
> 1141	static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
  1142	{
  1143		return -EBUSY;
  1144	}
  1145	#endif
  1146	
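
The error at mm/internal.h:1142 follows from the ';' that ends line 1141:
it turns the stub into a bare declaration, so the '{' on the next line has
nothing to attach to, and the function ends up "used but never defined".
A minimal sketch of the corrected !CONFIG_MEMORY_FAILURE stub, written as
standalone C so it compiles outside the kernel tree (the opaque struct
folio is a stand-in, not the kernel definition):

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct folio;	/* opaque stand-in for the kernel type */

/*
 * No ';' after the parameter list: the braces that follow are the
 * function body, which makes this a definition rather than a prototype.
 */
static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn,
				       bool must_kill)
{
	(void)folio;
	(void)pfn;
	(void)must_kill;
	return -EBUSY;	/* nothing can be unmapped without memory-failure support */
}
```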

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-16  6:16 ` [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio Wupeng Ma
                     ` (2 preceding siblings ...)
  2025-01-17  4:49   ` kernel test robot
@ 2025-01-20  6:24   ` Miaohe Lin
  2025-01-20  7:49     ` David Hildenbrand
  2025-01-20  9:06     ` mawupeng
  2025-01-20  7:55   ` David Hildenbrand
  4 siblings, 2 replies; 22+ messages in thread
From: Miaohe Lin @ 2025-01-20  6:24 UTC (permalink / raw)
  To: Wupeng Ma
  Cc: linux-mm, linux-kernel, akpm, david, osalvador, nao.horiguchi, mhocko

On 2025/1/16 14:16, Wupeng Ma wrote:
> From: Ma Wupeng <mawupeng1@huawei.com>

Thanks for your patch. Some nits below.

> 
> Commit 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to
> TTU_HWPOISON") introduce TTU_HWPOISON to replace TTU_IGNORE_HWPOISON
> in order to stop send SIGBUS signal when accessing an error page after
> a memory error on a clean folio. However during page migration, anon
> folio must be set with TTU_HWPOISON during unmap_*(). For pagecache
> we need some policy just like the one in hwpoison_user_mappings to
> set this flag. So move this policy from hwpoison_user_mappings to
> unmap_poisoned_folio to handle this waring properly.

s/waring/warning/g

> 
> Waring will be produced during unamp poison folio with the following log:

s/Waring/Warning/g

> 
>   ------------[ cut here ]------------
>   WARNING: CPU: 1 PID: 365 at mm/rmap.c:1847 try_to_unmap_one+0x8fc/0xd3c
>   Modules linked in:
>   CPU: 1 UID: 0 PID: 365 Comm: bash Tainted: G        W          6.13.0-rc1-00018-gacdb4bbda7ab #42
>   Tainted: [W]=WARN
>   Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>   pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>   pc : try_to_unmap_one+0x8fc/0xd3c
>   lr : try_to_unmap_one+0x3dc/0xd3c
>   Call trace:
>    try_to_unmap_one+0x8fc/0xd3c (P)
>    try_to_unmap_one+0x3dc/0xd3c (L)
>    rmap_walk_anon+0xdc/0x1f8
>    rmap_walk+0x3c/0x58
>    try_to_unmap+0x88/0x90
>    unmap_poisoned_folio+0x30/0xa8
>    do_migrate_range+0x4a0/0x568
>    offline_pages+0x5a4/0x670
>    memory_block_action+0x17c/0x374
>    memory_subsys_offline+0x3c/0x78
>    device_offline+0xa4/0xd0
>    state_store+0x8c/0xf0
>    dev_attr_store+0x18/0x2c
>    sysfs_kf_write+0x44/0x54
>    kernfs_fop_write_iter+0x118/0x1a8
>    vfs_write+0x3a8/0x4bc
>    ksys_write+0x6c/0xf8
>    __arm64_sys_write+0x1c/0x28
>    invoke_syscall+0x44/0x100
>    el0_svc_common.constprop.0+0x40/0xe0
>    do_el0_svc+0x1c/0x28
>    el0_svc+0x30/0xd0
>    el0t_64_sync_handler+0xc8/0xcc
>    el0t_64_sync+0x198/0x19c
>   ---[ end trace 0000000000000000 ]---
> 
> Fixes: 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON")
> Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
> Suggested-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/internal.h       |  5 ++--
>  mm/memory-failure.c | 61 +++++++++++++++++++++++----------------------
>  mm/memory_hotplug.c |  3 ++-
>  3 files changed, 36 insertions(+), 33 deletions(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index 9826f7dce607..3caee67c0abd 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1102,7 +1102,7 @@ static inline int find_next_best_node(int node, nodemask_t *used_node_mask)
>   * mm/memory-failure.c
>   */
>  #ifdef CONFIG_MEMORY_FAILURE
> -void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu);
> +int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
>  void shake_folio(struct folio *folio);
>  extern int hwpoison_filter(struct page *p);
>  
> @@ -1125,8 +1125,9 @@ unsigned long page_mapped_in_vma(const struct page *page,
>  		struct vm_area_struct *vma);
>  
>  #else
> -static inline void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
> +static inline int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill);
>  {
> +	return -EBUSY;
>  }
>  #endif
>  
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index a7b8ccd29b6f..b5212b6e330a 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1556,8 +1556,34 @@ static int get_hwpoison_page(struct page *p, unsigned long flags)
>  	return ret;
>  }
>  
> -void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
> +int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill)
>  {
> +	enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_SYNC | TTU_HWPOISON;
> +	struct address_space *mapping;
> +
> +	if (folio_test_swapcache(folio)) {
> +		pr_err("%#lx: keeping poisoned page in swap cache\n", pfn);
> +		ttu &= ~TTU_HWPOISON;
> +	}
> +
> +	/*
> +	 * Propagate the dirty bit from PTEs to struct page first, because we
> +	 * need this to decide if we should kill or just drop the page.
> +	 * XXX: the dirty test could be racy: set_page_dirty() may not always
> +	 * be called inside page lock (it's recommended but not enforced).
> +	 */
> +	mapping = folio_mapping(folio);
> +	if (!must_kill && !folio_test_dirty(folio) && mapping &&
> +	    mapping_can_writeback(mapping)) {
> +		if (folio_mkclean(folio)) {
> +			folio_set_dirty(folio);
> +		} else {
> +			ttu &= ~TTU_HWPOISON;
> +			pr_info("%#lx: corrupted page was clean: dropped without side effects\n",
> +				pfn);
> +		}
> +	}
> +
>  	if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
>  		struct address_space *mapping;
>  
> @@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>  		if (!mapping) {
>  			pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
>  				folio_pfn(folio));
> -			return;
> +			return -EBUSY;
>  		}
>  
>  		try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
> @@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>  	} else {
>  		try_to_unmap(folio, ttu);
>  	}
> +
> +	return folio_mapped(folio) ? -EBUSY : 0;

Do we really need this return value? It's unused in do_migrate_range().

Thanks.
.



* Re: [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang
  2025-01-16  6:16 ` [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang Wupeng Ma
@ 2025-01-20  6:32   ` Miaohe Lin
  2025-01-21  2:17     ` mawupeng
  2025-01-20  8:01   ` David Hildenbrand
  1 sibling, 1 reply; 22+ messages in thread
From: Miaohe Lin @ 2025-01-20  6:32 UTC (permalink / raw)
  To: Wupeng Ma
  Cc: linux-mm, linux-kernel, akpm, david, osalvador, nao.horiguchi, mhocko

On 2025/1/16 14:16, Wupeng Ma wrote:
> From: Ma Wupeng <mawupeng1@huawei.com>
> 
> If a folio has an increased reference count, folio_try_get() will acquire
> it, perform necessary operations, and then release it. In the case of a
> poisoned folio without an elevated reference count (which is unlikely for
> memory-failure), folio_try_get() will simply bypass it.
> 
> Therefore, relocate the folio_try_get() function, responsible for checking
> and acquiring this reference count at first.
> 
> Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
> ---
>  mm/memory_hotplug.c | 14 ++++----------
>  1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 2815bd4ea483..3fb75ee185c6 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1786,6 +1786,9 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  		page = pfn_to_page(pfn);
>  		folio = page_folio(page);
>  
> +		if (!folio_try_get(folio))
> +			continue;
> +
>  		/*
>  		 * No reference or lock is held on the folio, so it might
>  		 * be modified concurrently (e.g. split).  As such,
> @@ -1795,12 +1798,6 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  		if (folio_test_large(folio))
>  			pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>  
> -		/*
> -		 * HWPoison pages have elevated reference counts so the migration would
> -		 * fail on them. It also doesn't make any sense to migrate them in the
> -		 * first place. Still try to unmap such a page in case it is still mapped
> -		 * (keep the unmap as the catch all safety net).
> -		 */
>  		if (folio_test_hwpoison(folio) ||
>  		    (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
>  			if (WARN_ON(folio_test_lru(folio)))
> @@ -1811,12 +1808,9 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  				folio_unlock(folio);
>  			}
>  
> -			continue;
> +			goto put_folio;
>  		}
>  
> -		if (!folio_try_get(folio))
> -			continue;
> -
>  		if (unlikely(page_folio(page) != folio))
>  			goto put_folio;

Would it be necessary to move this check above the folio_test_hwpoison() block too?

Thanks.
.



* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-20  6:24   ` Miaohe Lin
@ 2025-01-20  7:49     ` David Hildenbrand
  2025-01-20  8:46       ` David Hildenbrand
  2025-01-21  2:46       ` Miaohe Lin
  2025-01-20  9:06     ` mawupeng
  1 sibling, 2 replies; 22+ messages in thread
From: David Hildenbrand @ 2025-01-20  7:49 UTC (permalink / raw)
  To: Miaohe Lin, Wupeng Ma
  Cc: linux-mm, linux-kernel, akpm, osalvador, nao.horiguchi, mhocko


>>   	if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
>>   		struct address_space *mapping;
>>   
>> @@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>   		if (!mapping) {
>>   			pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
>>   				folio_pfn(folio));
>> -			return;
>> +			return -EBUSY;
>>   		}
>>   
>>   		try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
>> @@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>   	} else {
>>   		try_to_unmap(folio, ttu);
>>   	}
>> +
>> +	return folio_mapped(folio) ? -EBUSY : 0;
> 
> Do we really need this return value? It's unused in do_migrate_range().

I suggested it because folio_mapped() is nowadays extremely cheap.
It cleans up hwpoison_user_mappings() quite nicely.

Any particular reason we shouldn't be doing that?

-- 
Cheers,

David / dhildenb




* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-16  6:16 ` [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio Wupeng Ma
                     ` (3 preceding siblings ...)
  2025-01-20  6:24   ` Miaohe Lin
@ 2025-01-20  7:55   ` David Hildenbrand
  4 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2025-01-20  7:55 UTC (permalink / raw)
  To: Wupeng Ma, akpm, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: linux-mm, linux-kernel

On 16.01.25 07:16, Wupeng Ma wrote:
> From: Ma Wupeng <mawupeng1@huawei.com>
> 
> Commit 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to
> TTU_HWPOISON") introduce TTU_HWPOISON to replace TTU_IGNORE_HWPOISON
> in order to stop send SIGBUS signal when accessing an error page after
> a memory error on a clean folio. However during page migration, anon
> folio must be set with TTU_HWPOISON during unmap_*(). For pagecache
> we need some policy just like the one in hwpoison_user_mappings to
> set this flag. So move this policy from hwpoison_user_mappings to
> unmap_poisoned_folio to handle this waring properly.
> 
> Waring will be produced during unamp poison folio with the following log:
> 
>    ------------[ cut here ]------------
>    WARNING: CPU: 1 PID: 365 at mm/rmap.c:1847 try_to_unmap_one+0x8fc/0xd3c
>    Modules linked in:
>    CPU: 1 UID: 0 PID: 365 Comm: bash Tainted: G        W          6.13.0-rc1-00018-gacdb4bbda7ab #42
>    Tainted: [W]=WARN
>    Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>    pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>    pc : try_to_unmap_one+0x8fc/0xd3c
>    lr : try_to_unmap_one+0x3dc/0xd3c
>    Call trace:
>     try_to_unmap_one+0x8fc/0xd3c (P)
>     try_to_unmap_one+0x3dc/0xd3c (L)
>     rmap_walk_anon+0xdc/0x1f8
>     rmap_walk+0x3c/0x58
>     try_to_unmap+0x88/0x90
>     unmap_poisoned_folio+0x30/0xa8
>     do_migrate_range+0x4a0/0x568
>     offline_pages+0x5a4/0x670
>     memory_block_action+0x17c/0x374
>     memory_subsys_offline+0x3c/0x78
>     device_offline+0xa4/0xd0
>     state_store+0x8c/0xf0
>     dev_attr_store+0x18/0x2c
>     sysfs_kf_write+0x44/0x54
>     kernfs_fop_write_iter+0x118/0x1a8
>     vfs_write+0x3a8/0x4bc
>     ksys_write+0x6c/0xf8
>     __arm64_sys_write+0x1c/0x28
>     invoke_syscall+0x44/0x100
>     el0_svc_common.constprop.0+0x40/0xe0
>     do_el0_svc+0x1c/0x28
>     el0_svc+0x30/0xd0
>     el0t_64_sync_handler+0xc8/0xcc
>     el0t_64_sync+0x198/0x19c
>    ---[ end trace 0000000000000000 ]---
> 
> Fixes: 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON")
> Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
> Suggested-by: David Hildenbrand <david@redhat.com>

Your SOB should probably come last.

With or without moving folio_mapped() into unmap_poisoned_folio():

Acked-by: David Hildenbrand <david@redhat.com>
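
For reference, the flag policy that the quoted hunk moves into
unmap_poisoned_folio() can be modelled as a pure function. This is only a
userspace sketch: the struct, the field names and the bit values below are
illustrative assumptions, not the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real flags are defined by the kernel. */
enum ttu_flags_model {
	TTU_IGNORE_MLOCK_M = 1 << 0,
	TTU_SYNC_M         = 1 << 1,
	TTU_HWPOISON_M     = 1 << 2,
};

/* Hypothetical snapshot of the folio state the policy looks at. */
struct folio_state {
	bool swapcache;			/* models folio_test_swapcache() */
	bool dirty;			/* models folio_test_dirty() */
	bool has_writeback_mapping;	/* mapping && mapping_can_writeback() */
	bool pte_was_dirty;		/* models folio_mkclean() returning non-zero */
};

static unsigned int pick_ttu_flags(struct folio_state *s, bool must_kill)
{
	unsigned int ttu = TTU_IGNORE_MLOCK_M | TTU_SYNC_M | TTU_HWPOISON_M;

	/* Poisoned page stays in the swap cache: no hwpoison entries. */
	if (s->swapcache)
		ttu &= ~(unsigned int)TTU_HWPOISON_M;

	if (!must_kill && !s->dirty && s->has_writeback_mapping) {
		if (s->pte_was_dirty)
			s->dirty = true;	/* propagate the PTE dirty bit */
		else
			/* clean page cache: drop without side effects */
			ttu &= ~(unsigned int)TTU_HWPOISON_M;
	}
	return ttu;
}
```

In this model an anonymous folio (no mapping, not in the swap cache) always
keeps the hwpoison flag, which is the case the commit message says
do_migrate_range() has to cover.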

-- 
Cheers,

David / dhildenb




* Re: [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang
  2025-01-16  6:16 ` [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang Wupeng Ma
  2025-01-20  6:32   ` Miaohe Lin
@ 2025-01-20  8:01   ` David Hildenbrand
  2025-01-20  9:11     ` mawupeng
  1 sibling, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2025-01-20  8:01 UTC (permalink / raw)
  To: Wupeng Ma, akpm, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: linux-mm, linux-kernel

On 16.01.25 07:16, Wupeng Ma wrote:
> From: Ma Wupeng <mawupeng1@huawei.com>
> 
> If a folio has an increased reference count, folio_try_get() will acquire
> it, perform necessary operations, and then release it. In the case of a
> poisoned folio without an elevated reference count (which is unlikely for
> memory-failure), folio_try_get() will simply bypass it.
> 
> Therefore, relocate the folio_try_get() function, responsible for checking
> and acquiring this reference count at first.
> 
> Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
> ---
>   mm/memory_hotplug.c | 14 ++++----------
>   1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 2815bd4ea483..3fb75ee185c6 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1786,6 +1786,9 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>   		page = pfn_to_page(pfn);
>   		folio = page_folio(page);
>   
> +		if (!folio_try_get(folio))
> +			continue;
> +

I would only move it in front of the folio_test_hwpoison() check for
now. Note that with this patch as is, the comment below would be wrong:

>   		/*
>   		 * No reference or lock is held on the folio, so it might

^

I would move this patch before the current #2, so the folio_lock() looks 
less weird.
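
The ordering suggested here can be sketched as a small userspace model:
take the reference only right before the hwpoison check, and drop it on
every path that got that far. The types and helper names are hypothetical
stand-ins, not the kernel's API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the folio fields this sketch needs. */
struct folio_model {
	int refcount;
	bool hwpoison;
};

enum step_result { SKIPPED_NO_REF, HANDLED_HWPOISON, MIGRATION_CANDIDATE };

/* Models folio_try_get(): fails on a folio whose refcount is zero. */
static bool folio_try_get_model(struct folio_model *f)
{
	if (f->refcount == 0)
		return false;
	f->refcount++;
	return true;
}

static void folio_put_model(struct folio_model *f)
{
	f->refcount--;
}

static enum step_result scan_one(struct folio_model *f)
{
	/*
	 * The large-folio pfn adjustment would sit here, still without a
	 * reference, so the existing "no reference or lock is held"
	 * comment stays accurate.
	 */
	if (!folio_try_get_model(f))
		return SKIPPED_NO_REF;

	if (f->hwpoison) {
		/* lock, unmap and unlock would go here */
		folio_put_model(f);	/* the "goto put_folio" path */
		return HANDLED_HWPOISON;
	}

	/* isolation for migration would go here */
	folio_put_model(f);
	return MIGRATION_CANDIDATE;
}
```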

-- 
Cheers,

David / dhildenb




* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-20  7:49     ` David Hildenbrand
@ 2025-01-20  8:46       ` David Hildenbrand
  2025-01-21  3:20         ` Miaohe Lin
  2025-01-21  2:46       ` Miaohe Lin
  1 sibling, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2025-01-20  8:46 UTC (permalink / raw)
  To: Miaohe Lin, Wupeng Ma
  Cc: linux-mm, linux-kernel, akpm, osalvador, nao.horiguchi, mhocko

On 20.01.25 08:49, David Hildenbrand wrote:
> 
>>>    	if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
>>>    		struct address_space *mapping;
>>>    
>>> @@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>    		if (!mapping) {
>>>    			pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
>>>    				folio_pfn(folio));
>>> -			return;
>>> +			return -EBUSY;
>>>    		}
>>>    
>>>    		try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
>>> @@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>    	} else {
>>>    		try_to_unmap(folio, ttu);
>>>    	}
>>> +
>>> +	return folio_mapped(folio) ? -EBUSY : 0;
>>
>> Do we really need this return value? It's unused in do_migrate_range().
> 
> I suggested it, because the folio_mapped() is nowadays extremely cheap.
> It cleans up hwpoison_user_mappings() quite nicely.

I'm also wondering whether, in do_migrate_range(), we want to
pr_warn_ratelimited() in case the folio is still mapped after the call.
IIUC, we don't really expect this to happen with TTU_SYNC set.
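
A rough userspace sketch of that idea, assuming the return convention from
the quoted hunk (0 when fully unmapped, -EBUSY otherwise). The struct and
both helpers are hypothetical models; in the kernel the message would go
through the rate-limited warning helper (pr_warn_ratelimited()) rather than
fprintf().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for struct folio; only the state this sketch needs. */
struct folio_model {
	bool mapped;	/* models folio_mapped() after the unmap attempt */
};

/* Models the proposed return value of unmap_poisoned_folio(). */
static int unmap_poisoned_folio_model(const struct folio_model *folio)
{
	return folio->mapped ? -EBUSY : 0;
}

/*
 * Models the caller side in do_migrate_range(): complain when the
 * poisoned folio is still mapped. Returns true if a warning was printed.
 */
static bool warn_if_still_mapped(const struct folio_model *folio,
				 unsigned long pfn)
{
	if (unmap_poisoned_folio_model(folio) == 0)
		return false;

	fprintf(stderr, "%#lx: failed to unmap poisoned folio\n", pfn);
	return true;
}
```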

-- 
Cheers,

David / dhildenb




* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-20  6:24   ` Miaohe Lin
  2025-01-20  7:49     ` David Hildenbrand
@ 2025-01-20  9:06     ` mawupeng
  1 sibling, 0 replies; 22+ messages in thread
From: mawupeng @ 2025-01-20  9:06 UTC (permalink / raw)
  To: linmiaohe
  Cc: mawupeng1, linux-mm, linux-kernel, akpm, david, osalvador,
	nao.horiguchi, mhocko



On 2025/1/20 14:24, Miaohe Lin wrote:
> On 2025/1/16 14:16, Wupeng Ma wrote:
>> From: Ma Wupeng <mawupeng1@huawei.com>
> 
> Thanks for your patch. Some nits below.
> 
>>
>> Commit 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to
>> TTU_HWPOISON") introduce TTU_HWPOISON to replace TTU_IGNORE_HWPOISON
>> in order to stop send SIGBUS signal when accessing an error page after
>> a memory error on a clean folio. However during page migration, anon
>> folio must be set with TTU_HWPOISON during unmap_*(). For pagecache
>> we need some policy just like the one in hwpoison_user_mappings to
>> set this flag. So move this policy from hwpoison_user_mappings to
>> unmap_poisoned_folio to handle this waring properly.
> 
> s/waring/warning/g

Thanks for your reply.

Will be fixed later.

> 
>>
>> Waring will be produced during unamp poison folio with the following log:
> 
> s/Waring/Warning/g

Thanks for your reply.

Will be fixed later.






* Re: [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang
  2025-01-20  8:01   ` David Hildenbrand
@ 2025-01-20  9:11     ` mawupeng
  0 siblings, 0 replies; 22+ messages in thread
From: mawupeng @ 2025-01-20  9:11 UTC (permalink / raw)
  To: david, akpm, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: mawupeng1, linux-mm, linux-kernel



On 2025/1/20 16:01, David Hildenbrand wrote:
> On 16.01.25 07:16, Wupeng Ma wrote:
>> From: Ma Wupeng <mawupeng1@huawei.com>
>>
>> If a folio has an increased reference count, folio_try_get() will acquire
>> it, perform necessary operations, and then release it. In the case of a
>> poisoned folio without an elevated reference count (which is unlikely for
>> memory-failure), folio_try_get() will simply bypass it.
>>
>> Therefore, relocate the folio_try_get() function, responsible for checking
>> and acquiring this reference count at first.
>>
>> Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
>> ---
>>   mm/memory_hotplug.c | 14 ++++----------
>>   1 file changed, 4 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 2815bd4ea483..3fb75ee185c6 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1786,6 +1786,9 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>           page = pfn_to_page(pfn);
>>           folio = page_folio(page);
>>   +        if (!folio_try_get(folio))
>> +            continue;
>> +
> 
> I would only move it in front of the folio_test_hwpoison() check for now. Note that with this patch as is the comment below would be wrong

Thanks for noticing this.

Moving it in front of the folio_test_hwpoison() check does seem better.

> 
>>           /*
>>            * No reference or lock is held on the folio, so it might
> 
> ^
> 
> I would move this patch before the current #2, so the folio_lock() looks less weird.
> 

Ok, will be done.

Thanks.



* Re: [PATCH v2 2/3] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio
  2025-01-16  6:16 ` [PATCH v2 2/3] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio Wupeng Ma
@ 2025-01-20  9:25   ` David Hildenbrand
  0 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2025-01-20  9:25 UTC (permalink / raw)
  To: Wupeng Ma, akpm, osalvador, nao.horiguchi, linmiaohe, mhocko
  Cc: linux-mm, linux-kernel

On 16.01.25 07:16, Wupeng Ma wrote:
> From: Ma Wupeng <mawupeng1@huawei.com>
> 
> Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to
> be offlined") added page poison checks in do_migrate_range() to make
> offlining hwpoisoned pages possible by introducing isolate_lru_page() and
> try_to_unmap() for hwpoisoned pages. However, the folio lock must be held
> before calling try_to_unmap(). Take it to fix this problem.
> 
> A warning will be produced if the folio is not locked during unmap:
> 
>    ------------[ cut here ]------------
>    kernel BUG at ./include/linux/swapops.h:400!
>    Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>    Modules linked in:
>    CPU: 4 UID: 0 PID: 411 Comm: bash Tainted: G        W          6.13.0-rc1-00016-g3c434c7ee82a-dirty #41
>    Tainted: [W]=WARN
>    Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>    pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>    pc : try_to_unmap_one+0xb08/0xd3c
>    lr : try_to_unmap_one+0x3dc/0xd3c
>    Call trace:
>     try_to_unmap_one+0xb08/0xd3c (P)
>     try_to_unmap_one+0x3dc/0xd3c (L)
>     rmap_walk_anon+0xdc/0x1f8
>     rmap_walk+0x3c/0x58
>     try_to_unmap+0x88/0x90
>     unmap_poisoned_folio+0x30/0xa8
>     do_migrate_range+0x4a0/0x568
>     offline_pages+0x5a4/0x670
>     memory_block_action+0x17c/0x374
>     memory_subsys_offline+0x3c/0x78
>     device_offline+0xa4/0xd0
>     state_store+0x8c/0xf0
>     dev_attr_store+0x18/0x2c
>     sysfs_kf_write+0x44/0x54
>     kernfs_fop_write_iter+0x118/0x1a8
>     vfs_write+0x3a8/0x4bc
>     ksys_write+0x6c/0xf8
>     __arm64_sys_write+0x1c/0x28
>     invoke_syscall+0x44/0x100
>     el0_svc_common.constprop.0+0x40/0xe0
>     do_el0_svc+0x1c/0x28
>     el0_svc+0x30/0xd0
>     el0t_64_sync_handler+0xc8/0xcc
>     el0t_64_sync+0x198/0x19c
>    Code: f9407be0 b5fff320 d4210000 17ffff97 (d4210000)
>    ---[ end trace 0000000000000000 ]---
> 
> Fixes: b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined")
> Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>

With patch #3 coming first it looks good to me:

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang
  2025-01-20  6:32   ` Miaohe Lin
@ 2025-01-21  2:17     ` mawupeng
  0 siblings, 0 replies; 22+ messages in thread
From: mawupeng @ 2025-01-21  2:17 UTC (permalink / raw)
  To: linmiaohe
  Cc: mawupeng1, linux-mm, linux-kernel, akpm, david, osalvador,
	nao.horiguchi, mhocko



On 2025/1/20 14:32, Miaohe Lin wrote:
> On 2025/1/16 14:16, Wupeng Ma wrote:
>> From: Ma Wupeng <mawupeng1@huawei.com>
>>
>> If a folio has an increased reference count, folio_try_get() will acquire
>> it, perform necessary operations, and then release it. In the case of a
>> poisoned folio without an elevated reference count (which is unlikely for
>> memory-failure), folio_try_get() will simply bypass it.
>>
>> Therefore, relocate the folio_try_get() function, responsible for checking
>> and acquiring this reference count at first.
>>
>> Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
>> ---
>>  mm/memory_hotplug.c | 14 ++++----------
>>  1 file changed, 4 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 2815bd4ea483..3fb75ee185c6 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1786,6 +1786,9 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>  		page = pfn_to_page(pfn);
>>  		folio = page_folio(page);
>>  
>> +		if (!folio_try_get(folio))
>> +			continue;
>> +
>>  		/*
>>  		 * No reference or lock is held on the folio, so it might
>>  		 * be modified concurrently (e.g. split).  As such,
>> @@ -1795,12 +1798,6 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>  		if (folio_test_large(folio))
>>  			pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>>  
>> -		/*
>> -		 * HWPoison pages have elevated reference counts so the migration would
>> -		 * fail on them. It also doesn't make any sense to migrate them in the
>> -		 * first place. Still try to unmap such a page in case it is still mapped
>> -		 * (keep the unmap as the catch all safety net).
>> -		 */
>>  		if (folio_test_hwpoison(folio) ||
>>  		    (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
>>  			if (WARN_ON(folio_test_lru(folio)))
>> @@ -1811,12 +1808,9 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>  				folio_unlock(folio);
>>  			}
>>  
>> -			continue;
>> +			goto put_folio;
>>  		}
>>  
>> -		if (!folio_try_get(folio))
>> -			continue;
>> -
>>  		if (unlikely(page_folio(page) != folio))
>>  			goto put_folio;
> 
> Would it be necessary to move this check above the folio_test_hwpoison() block too?

Thanks.

AFAICT we can do this; I'll move it in the next version of the patch. There is no need to
handle this page if the state of this folio has changed.

> 
> Thanks.
> .



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-20  7:49     ` David Hildenbrand
  2025-01-20  8:46       ` David Hildenbrand
@ 2025-01-21  2:46       ` Miaohe Lin
  1 sibling, 0 replies; 22+ messages in thread
From: Miaohe Lin @ 2025-01-21  2:46 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, linux-kernel, akpm, osalvador, nao.horiguchi, mhocko,
	Wupeng Ma

On 2025/1/20 15:49, David Hildenbrand wrote:
> 
>>>       if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
>>>           struct address_space *mapping;
>>>   @@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>           if (!mapping) {
>>>               pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
>>>                   folio_pfn(folio));
>>> -            return;
>>> +            return -EBUSY;
>>>           }
>>>             try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
>>> @@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>       } else {
>>>           try_to_unmap(folio, ttu);
>>>       }
>>> +
>>> +    return folio_mapped(folio) ? -EBUSY : 0;
>>
>> Do we really need this return value? It's unused in do_migrate_range().
> 
> I suggested it, because the folio_mapped() is nowadays extremely cheap. It cleans up hwpoison_user_mappings() quite nicely.
> 
> Any particular reason we shouldn't be doing that?

I was trying to keep the code cleaner (IMO), but I have no strong opinion.

Thanks.
.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-20  8:46       ` David Hildenbrand
@ 2025-01-21  3:20         ` Miaohe Lin
  2025-01-21  7:58           ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: Miaohe Lin @ 2025-01-21  3:20 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, linux-kernel, akpm, osalvador, nao.horiguchi, mhocko,
	Wupeng Ma

On 2025/1/20 16:46, David Hildenbrand wrote:
> On 20.01.25 08:49, David Hildenbrand wrote:
>>
>>>>        if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
>>>>            struct address_space *mapping;
>>>>    @@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>            if (!mapping) {
>>>>                pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
>>>>                    folio_pfn(folio));
>>>> -            return;
>>>> +            return -EBUSY;
>>>>            }
>>>>               try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
>>>> @@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>        } else {
>>>>            try_to_unmap(folio, ttu);
>>>>        }
>>>> +
>>>> +    return folio_mapped(folio) ? -EBUSY : 0;
>>>
>>> Do we really need this return value? It's unused in do_migrate_range().
>>
>> I suggested it, because the folio_mapped() is nowadays extremely cheap.
>> It cleans up hwpoison_user_mappings() quite nicely.
> 
> I'm also wondering, if in do_migrate_range(), we want to pr_warn_ratelimit() in case still mapped after the call. IIUC, we don't really expect this to happen with SYNC set.

Do you mean TTU_SYNC? It seems it's not set.

There might be a race that will hit the proposed pr_warn_ratelimit():

/* Assume the folio is isolated for reclaim, so memory_failure failed to handle it the first time. Then it's put back on the LRU. */
do_migrate_range
 folio_test_hwpoison
  folio_mapped
  <folio is isolated for reclaim again.>
   unmap_poisoned_folio
  <folio is put back.>
    pr_warn_ratelimit(folio_mapped)

But I might be missing something. And even if this race is possible, it should be really hard to hit.

Thanks.
.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-21  3:20         ` Miaohe Lin
@ 2025-01-21  7:58           ` David Hildenbrand
  2025-01-22  7:38             ` Miaohe Lin
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2025-01-21  7:58 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: linux-mm, linux-kernel, akpm, osalvador, nao.horiguchi, mhocko,
	Wupeng Ma

On 21.01.25 04:20, Miaohe Lin wrote:
> On 2025/1/20 16:46, David Hildenbrand wrote:
>> On 20.01.25 08:49, David Hildenbrand wrote:
>>>
>>>>>         if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
>>>>>             struct address_space *mapping;
>>>>>     @@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>>             if (!mapping) {
>>>>>                 pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
>>>>>                     folio_pfn(folio));
>>>>> -            return;
>>>>> +            return -EBUSY;
>>>>>             }
>>>>>                try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
>>>>> @@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>>         } else {
>>>>>             try_to_unmap(folio, ttu);
>>>>>         }
>>>>> +
>>>>> +    return folio_mapped(folio) ? -EBUSY : 0;
>>>>
>>>> Do we really need this return value? It's unused in do_migrate_range().
>>>
>>> I suggested it, because the folio_mapped() is nowadays extremely cheap.
>>> It cleans up hwpoison_user_mappings() quite nicely.
>>
>> I'm also wondering, if in do_migrate_range(), we want to pr_warn_ratelimit() in case still mapped after the call. IIUC, we don't really expect this to happen with SYNC set.
> 
> Do you mean TTU_SYNC? It seems it's not set.

With your patch it will be now, which is the right thing to do I think.

> 
> There might be a race that will hit the proposed pr_warn_ratelimit():
> 
> /* Assume the folio is isolated for reclaim, so memory_failure failed to handle it the first time. Then it's put back on the LRU. */
> do_migrate_range
>   folio_test_hwpoison
>    folio_mapped
>    <folio is isolated for reclaim again.>
>     unmap_poisoned_folio
>    <folio is put back.>
>      pr_warn_ratelimit(folio_mapped)
> 
> But I might be missing something. And even if this race is possible, it should be really hard to hit.

Does try_to_unmap() care about isolation? Skimming over the code, I 
don't think so. I assume once we take the folio lock, races with reclaim 
are impossible.

In any case, the race is unexpected, so pr_warn_() would be helpful and 
not harmful.

Memory offlining code will later simply skip all PageHWPoison() pages, 
independent of the refcount as it seems. Failing to unmap might not be 
handled correctly at all ... I think this might be problematic in other 
regard (e.g., GUP references), but failing to unmap is "obviously" bad I 
think :)

-- 
Cheers,

David / dhildenb
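
As an illustration of the return-value idea discussed in this subthread, here is a minimal userspace sketch, not kernel code. It assumes a helper that reports -EBUSY when the folio is still mapped after the unmap attempt, and a caller that takes the lock around the call and prints a warning on failure. All sketch_* names, the struct fields, the -EINVAL result for an unlocked folio, and the message text are hypothetical; the real unmap_poisoned_folio() and the kernel's rate-limited printk helpers are only modeled here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a folio; fields exist only for this sketch. */
struct sketch_folio {
	bool locked;
	bool mapped;
	bool unmap_fails;	/* forces the "still mapped" outcome */
	unsigned long pfn;
};

/*
 * Models an unmap helper with a return value: 0 when the folio ends up
 * unmapped, -EBUSY when it is still mapped. An unlocked folio is rejected
 * with -EINVAL here; the trace in patch 2/3 shows a BUG in that situation.
 */
static int sketch_unmap_poisoned_folio(struct sketch_folio *f)
{
	if (!f->locked)
		return -EINVAL;

	if (!f->unmap_fails)
		f->mapped = false;	/* stands in for try_to_unmap() */

	return f->mapped ? -EBUSY : 0;
}

/* Models the caller: lock, unmap, unlock, then warn if still mapped. */
static int sketch_handle_poisoned(struct sketch_folio *f)
{
	int ret = 0;

	if (f->mapped) {
		f->locked = true;
		ret = sketch_unmap_poisoned_folio(f);
		f->locked = false;

		if (ret)
			fprintf(stderr, "%#lx: failed to unmap poisoned folio\n",
				f->pfn);
	}
	return ret;
}
```

The design point being modeled is that the caller learns about a failed unmap from the return value instead of re-checking the mapping state itself.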



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio
  2025-01-21  7:58           ` David Hildenbrand
@ 2025-01-22  7:38             ` Miaohe Lin
  0 siblings, 0 replies; 22+ messages in thread
From: Miaohe Lin @ 2025-01-22  7:38 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, linux-kernel, akpm, osalvador, nao.horiguchi, mhocko,
	Wupeng Ma

On 2025/1/21 15:58, David Hildenbrand wrote:
> On 21.01.25 04:20, Miaohe Lin wrote:
>> On 2025/1/20 16:46, David Hildenbrand wrote:
>>> On 20.01.25 08:49, David Hildenbrand wrote:
>>>>
>>>>>>         if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
>>>>>>             struct address_space *mapping;
>>>>>>     @@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>>>             if (!mapping) {
>>>>>>                 pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
>>>>>>                     folio_pfn(folio));
>>>>>> -            return;
>>>>>> +            return -EBUSY;
>>>>>>             }
>>>>>>                try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
>>>>>> @@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>>>         } else {
>>>>>>             try_to_unmap(folio, ttu);
>>>>>>         }
>>>>>> +
>>>>>> +    return folio_mapped(folio) ? -EBUSY : 0;
>>>>>
>>>>> Do we really need this return value? It's unused in do_migrate_range().
>>>>
>>>> I suggested it, because the folio_mapped() is nowadays extremely cheap.
>>>> It cleans up hwpoison_user_mappings() quite nicely.
>>>
>>> I'm also wondering, if in do_migrate_range(), we want to pr_warn_ratelimit() in case still mapped after the call. IIUC, we don't really expect this to happen with SYNC set.
>>
>> Do you mean TTU_SYNC? It seems it's not set.
> 
> With your patch it will be now, which is the right thing to do I think.
> 
>>
>> There might be a race that will hit the proposed pr_warn_ratelimit():
>>
>> /* Assume the folio is isolated for reclaim, so memory_failure failed to handle it the first time. Then it's put back on the LRU. */
>> do_migrate_range
>>   folio_test_hwpoison
>>    folio_mapped
>>    <folio is isolated for reclaim again.>
>>     unmap_poisoned_folio
>>    <folio is put back.>
>>      pr_warn_ratelimit(folio_mapped)
>>
>> But I might be missing something. And even if this race is possible, it should be really hard to hit.
> 
> Does try_to_unmap() care about isolation? Skimming over the code, I don't think so. I assume once we take the folio lock, races with reclaim are impossible.

I think you're right. I missed the folio lock in the above race.

> 
> In any case, the race is unexpected, so pr_warn_() would be helpful and not harmful.
> 
> Memory offlining code will later simply skip all PageHWPoison() pages, independent of the refcount as it seems. Failing to unmap might not be handled correctly at all ... I think this might be problematic in other regard (e.g., GUP references), but failing to unmap is "obviously" bad I think :)

Agree with you.

Thanks.
.



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2025-01-22  7:39 UTC | newest]

Thread overview: 22+ messages
2025-01-16  6:16 [PATCH v2 0/3] mm: memory_failure: unmap poisoned filio during migrate properly Wupeng Ma
2025-01-16  6:16 ` [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio Wupeng Ma
2025-01-17  3:57   ` kernel test robot
2025-01-17  4:16     ` mawupeng
2025-01-17  4:39   ` kernel test robot
2025-01-17  4:49   ` kernel test robot
2025-01-20  6:24   ` Miaohe Lin
2025-01-20  7:49     ` David Hildenbrand
2025-01-20  8:46       ` David Hildenbrand
2025-01-21  3:20         ` Miaohe Lin
2025-01-21  7:58           ` David Hildenbrand
2025-01-22  7:38             ` Miaohe Lin
2025-01-21  2:46       ` Miaohe Lin
2025-01-20  9:06     ` mawupeng
2025-01-20  7:55   ` David Hildenbrand
2025-01-16  6:16 ` [PATCH v2 2/3] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio Wupeng Ma
2025-01-20  9:25   ` David Hildenbrand
2025-01-16  6:16 ` [PATCH v2 3/3] mm: memory-hotplug: check folio ref count first in do_migrate_rang Wupeng Ma
2025-01-20  6:32   ` Miaohe Lin
2025-01-21  2:17     ` mawupeng
2025-01-20  8:01   ` David Hildenbrand
2025-01-20  9:11     ` mawupeng
