* [PATCH -next 0/7] mm: convert page cpupid functions to folios
@ 2023-10-10  6:45 Kefeng Wang
  2023-10-10  6:45 ` [PATCH -next 1/7] mm_types: add _last_cpupid into folio Kefeng Wang
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Kefeng Wang @ 2023-10-10  6:45 UTC
  To: Andrew Morton
  Cc: willy, linux-mm, linux-kernel, ying.huang, david, Zi Yan, Kefeng Wang

The cpupid (or access time) used by NUMA balancing is stored in the
page flags or, when LAST_CPUPID_NOT_IN_PAGE_FLAGS is defined, in
page->_last_cpupid. This series converts the page cpupid functions
to folios: a new _last_cpupid field is added to struct folio, which
lets us use folio->_last_cpupid directly, and page_cpupid_xchg_last(),
xchg_page_access_time() and page_cpupid_last() are converted to folio
equivalents.
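
As a quick illustration of the conversion pattern (a simplified
sketch based on patch 4 below):

	/* before: the helper takes a page, so callers pass &folio->page */
	last_time = xchg_page_access_time(&folio->page,
					  jiffies_to_msecs(jiffies));

	/* after: the helper takes the folio directly */
	last_time = xchg_folio_access_time(folio, jiffies_to_msecs(jiffies));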

v1:
- drop inappropriate page_cpupid_reset_last conversion from RFC
- rebased on next-20231009

Kefeng Wang (7):
  mm_types: add _last_cpupid into folio
  mm: mprotect: use a folio in change_pte_range()
  mm: huge_memory: use a folio in change_huge_pmd()
  mm: convert xchg_page_access_time to xchg_folio_access_time()
  mm: convert page_cpupid_last() to folio_cpupid_last()
  mm: make wp_page_reuse() and finish_mkwrite_fault() to take a folio
  mm: convert page_cpupid_xchg_last() to folio_cpupid_xchg_last()

 include/linux/mm.h       | 30 +++++++++++++++---------------
 include/linux/mm_types.h | 13 +++++++++----
 kernel/sched/fair.c      |  4 ++--
 mm/huge_memory.c         | 17 +++++++++--------
 mm/memory.c              | 39 +++++++++++++++++++++------------------
 mm/migrate.c             |  4 ++--
 mm/mmzone.c              |  6 +++---
 mm/mprotect.c            | 16 +++++++++-------
 8 files changed, 70 insertions(+), 59 deletions(-)

-- 
2.27.0




* [PATCH -next 1/7] mm_types: add _last_cpupid into folio
  2023-10-10  6:45 [PATCH -next 0/7] mm: convert page cpupid functions to folios Kefeng Wang
@ 2023-10-10  6:45 ` Kefeng Wang
  2023-10-10  8:17   ` Huang, Ying
  2023-10-10 12:33   ` Matthew Wilcox
  2023-10-10  6:45 ` [PATCH -next 2/7] mm: mprotect: use a folio in change_pte_range() Kefeng Wang
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 16+ messages in thread
From: Kefeng Wang @ 2023-10-10  6:45 UTC
  To: Andrew Morton
  Cc: willy, linux-mm, linux-kernel, ying.huang, david, Zi Yan, Kefeng Wang

At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, and none
of them support numa balancing; since the page struct is aligned
to _struct_page_alignment, it is safe to move _last_cpupid before
'virtual' in struct page. Meanwhile, add it into the folio, which
lets us use folio->_last_cpupid directly.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/mm_types.h | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 36c5b43999e6..32af41160109 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -183,6 +183,9 @@ struct page {
 #ifdef CONFIG_MEMCG
 	unsigned long memcg_data;
 #endif
+#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
+	int _last_cpupid;
+#endif
 
 	/*
 	 * On machines where all RAM is mapped into kernel address space,
@@ -210,10 +213,6 @@ struct page {
 	struct page *kmsan_shadow;
 	struct page *kmsan_origin;
 #endif
-
-#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
-	int _last_cpupid;
-#endif
 } _struct_page_alignment;
 
 /*
@@ -317,6 +316,9 @@ struct folio {
 			atomic_t _refcount;
 #ifdef CONFIG_MEMCG
 			unsigned long memcg_data;
+#endif
+#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
+			int _last_cpupid;
 #endif
 	/* private: the union with struct page is transitional */
 		};
@@ -373,6 +375,9 @@ FOLIO_MATCH(_refcount, _refcount);
 #ifdef CONFIG_MEMCG
 FOLIO_MATCH(memcg_data, memcg_data);
 #endif
+#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
+FOLIO_MATCH(_last_cpupid, _last_cpupid);
+#endif
 #undef FOLIO_MATCH
 #define FOLIO_MATCH(pg, fl)						\
 	static_assert(offsetof(struct folio, fl) ==			\
-- 
2.27.0




* [PATCH -next 2/7] mm: mprotect: use a folio in change_pte_range()
  2023-10-10  6:45 [PATCH -next 0/7] mm: convert page cpupid functions to folios Kefeng Wang
  2023-10-10  6:45 ` [PATCH -next 1/7] mm_types: add _last_cpupid into folio Kefeng Wang
@ 2023-10-10  6:45 ` Kefeng Wang
  2023-10-10  6:45 ` [PATCH -next 3/7] mm: huge_memory: use a folio in change_huge_pmd() Kefeng Wang
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Kefeng Wang @ 2023-10-10  6:45 UTC
  To: Andrew Morton
  Cc: willy, linux-mm, linux-kernel, ying.huang, david, Zi Yan, Kefeng Wang

Use a folio in change_pte_range() to save three compound_head() calls.
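
Each of those page-based helpers hides a compound_head() lookup; a
simplified sketch of the pattern (not the exact macro expansion):

	/* PageDirty(page) expands to roughly: */
	test_bit(PG_dirty, &compound_head(page)->flags);

	/* folio_test_dirty(folio) tests folio->flags directly,
	 * with no compound_head() call: */
	test_bit(PG_dirty, &folio->flags);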

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/mprotect.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index b94fbb45d5c7..459daa987131 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -114,7 +114,7 @@ static long change_pte_range(struct mmu_gather *tlb,
 			 * pages. See similar comment in change_huge_pmd.
 			 */
 			if (prot_numa) {
-				struct page *page;
+				struct folio *folio;
 				int nid;
 				bool toptier;
 
@@ -122,13 +122,14 @@ static long change_pte_range(struct mmu_gather *tlb,
 				if (pte_protnone(oldpte))
 					continue;
 
-				page = vm_normal_page(vma, addr, oldpte);
-				if (!page || is_zone_device_page(page) || PageKsm(page))
+				folio = vm_normal_folio(vma, addr, oldpte);
+				if (!folio || folio_is_zone_device(folio) ||
+				    folio_test_ksm(folio))
 					continue;
 
 				/* Also skip shared copy-on-write pages */
 				if (is_cow_mapping(vma->vm_flags) &&
-				    page_count(page) != 1)
+				    folio_ref_count(folio) != 1)
 					continue;
 
 				/*
@@ -136,14 +137,15 @@ static long change_pte_range(struct mmu_gather *tlb,
 				 * it cannot move them all from MIGRATE_ASYNC
 				 * context.
 				 */
-				if (page_is_file_lru(page) && PageDirty(page))
+				if (folio_is_file_lru(folio) &&
+				    folio_test_dirty(folio))
 					continue;
 
 				/*
 				 * Don't mess with PTEs if page is already on the node
 				 * a single-threaded process is running on.
 				 */
-				nid = page_to_nid(page);
+				nid = folio_nid(folio);
 				if (target_node == nid)
 					continue;
 				toptier = node_is_toptier(nid);
@@ -157,7 +159,7 @@ static long change_pte_range(struct mmu_gather *tlb,
 					continue;
 				if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING &&
 				    !toptier)
-					xchg_page_access_time(page,
+					xchg_page_access_time(&folio->page,
 						jiffies_to_msecs(jiffies));
 			}
 
-- 
2.27.0




* [PATCH -next 3/7] mm: huge_memory: use a folio in change_huge_pmd()
  2023-10-10  6:45 [PATCH -next 0/7] mm: convert page cpupid functions to folios Kefeng Wang
  2023-10-10  6:45 ` [PATCH -next 1/7] mm_types: add _last_cpupid into folio Kefeng Wang
  2023-10-10  6:45 ` [PATCH -next 2/7] mm: mprotect: use a folio in change_pte_range() Kefeng Wang
@ 2023-10-10  6:45 ` Kefeng Wang
  2023-10-10  6:45 ` [PATCH -next 4/7] mm: convert xchg_page_access_time to xchg_folio_access_time() Kefeng Wang
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Kefeng Wang @ 2023-10-10  6:45 UTC
  To: Andrew Morton
  Cc: willy, linux-mm, linux-kernel, ying.huang, david, Zi Yan, Kefeng Wang

Use a folio in change_huge_pmd(); this is in preparation for
converting xchg_page_access_time() to take a folio.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/huge_memory.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c9cbcbf6697e..344c8db904e1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1856,7 +1856,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 	if (is_swap_pmd(*pmd)) {
 		swp_entry_t entry = pmd_to_swp_entry(*pmd);
-		struct page *page = pfn_swap_entry_to_page(entry);
+		struct folio *folio = page_folio(pfn_swap_entry_to_page(entry));
 		pmd_t newpmd;
 
 		VM_BUG_ON(!is_pmd_migration_entry(*pmd));
@@ -1865,7 +1865,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 			 * A protection check is difficult so
 			 * just be safe and disable write
 			 */
-			if (PageAnon(page))
+			if (folio_test_anon(folio))
 				entry = make_readable_exclusive_migration_entry(swp_offset(entry));
 			else
 				entry = make_readable_migration_entry(swp_offset(entry));
@@ -1887,7 +1887,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 #endif
 
 	if (prot_numa) {
-		struct page *page;
+		struct folio *folio;
 		bool toptier;
 		/*
 		 * Avoid trapping faults against the zero page. The read-only
@@ -1900,8 +1900,8 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		if (pmd_protnone(*pmd))
 			goto unlock;
 
-		page = pmd_page(*pmd);
-		toptier = node_is_toptier(page_to_nid(page));
+		folio = page_folio(pmd_page(*pmd));
+		toptier = node_is_toptier(folio_nid(folio));
 		/*
 		 * Skip scanning top tier node if normal numa
 		 * balancing is disabled
@@ -1912,7 +1912,8 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 
 		if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING &&
 		    !toptier)
-			xchg_page_access_time(page, jiffies_to_msecs(jiffies));
+			xchg_page_access_time(&folio->page,
+					      jiffies_to_msecs(jiffies));
 	}
 	/*
 	 * In case prot_numa, we are under mmap_read_lock(mm). It's critical
-- 
2.27.0




* [PATCH -next 4/7] mm: convert xchg_page_access_time to xchg_folio_access_time()
  2023-10-10  6:45 [PATCH -next 0/7] mm: convert page cpupid functions to folios Kefeng Wang
                   ` (2 preceding siblings ...)
  2023-10-10  6:45 ` [PATCH -next 3/7] mm: huge_memory: use a folio in change_huge_pmd() Kefeng Wang
@ 2023-10-10  6:45 ` Kefeng Wang
  2023-10-10 12:27   ` Matthew Wilcox
  2023-10-10  6:45 ` [PATCH -next 5/7] mm: convert page_cpupid_last() to folio_cpupid_last() Kefeng Wang
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Kefeng Wang @ 2023-10-10  6:45 UTC
  To: Andrew Morton
  Cc: willy, linux-mm, linux-kernel, ying.huang, david, Zi Yan, Kefeng Wang

Make xchg_page_access_time() take a folio, and rename it to
xchg_folio_access_time(), since all callers now have a folio.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/mm.h  | 7 ++++---
 kernel/sched/fair.c | 2 +-
 mm/huge_memory.c    | 4 ++--
 mm/mprotect.c       | 2 +-
 4 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a10b8774cc6f..13ca63efacf7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1711,11 +1711,12 @@ static inline void page_cpupid_reset_last(struct page *page)
 }
 #endif /* LAST_CPUPID_NOT_IN_PAGE_FLAGS */
 
-static inline int xchg_page_access_time(struct page *page, int time)
+static inline int xchg_folio_access_time(struct folio *folio, int time)
 {
 	int last_time;
 
-	last_time = page_cpupid_xchg_last(page, time >> PAGE_ACCESS_TIME_BUCKETS);
+	last_time = page_cpupid_xchg_last(&folio->page,
+					  time >> PAGE_ACCESS_TIME_BUCKETS);
 	return last_time << PAGE_ACCESS_TIME_BUCKETS;
 }
 
@@ -1734,7 +1735,7 @@ static inline int page_cpupid_xchg_last(struct page *page, int cpupid)
 	return page_to_nid(page); /* XXX */
 }
 
-static inline int xchg_page_access_time(struct page *page, int time)
+static inline int xchg_folio_access_time(struct folio *folio, int time)
 {
 	return 0;
 }
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 682067c545d1..50b9f63099fb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1722,7 +1722,7 @@ static int numa_hint_fault_latency(struct folio *folio)
 	int last_time, time;
 
 	time = jiffies_to_msecs(jiffies);
-	last_time = xchg_page_access_time(&folio->page, time);
+	last_time = xchg_folio_access_time(folio, time);
 
 	return (time - last_time) & PAGE_ACCESS_TIME_MASK;
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 344c8db904e1..e85238ac1d5c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1912,8 +1912,8 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 
 		if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING &&
 		    !toptier)
-			xchg_page_access_time(&folio->page,
-					      jiffies_to_msecs(jiffies));
+			xchg_folio_access_time(folio,
+					       jiffies_to_msecs(jiffies));
 	}
 	/*
 	 * In case prot_numa, we are under mmap_read_lock(mm). It's critical
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 459daa987131..1c556651888a 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -159,7 +159,7 @@ static long change_pte_range(struct mmu_gather *tlb,
 					continue;
 				if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING &&
 				    !toptier)
-					xchg_page_access_time(&folio->page,
+					xchg_folio_access_time(folio,
 						jiffies_to_msecs(jiffies));
 			}
 
-- 
2.27.0




* [PATCH -next 5/7] mm: convert page_cpupid_last() to folio_cpupid_last()
  2023-10-10  6:45 [PATCH -next 0/7] mm: convert page cpupid functions to folios Kefeng Wang
                   ` (3 preceding siblings ...)
  2023-10-10  6:45 ` [PATCH -next 4/7] mm: convert xchg_page_access_time to xchg_folio_access_time() Kefeng Wang
@ 2023-10-10  6:45 ` Kefeng Wang
  2023-10-10  6:45 ` [PATCH -next 6/7] mm: make wp_page_reuse() and finish_mkwrite_fault() to take a folio Kefeng Wang
  2023-10-10  6:45 ` [PATCH -next 7/7] mm: convert page_cpupid_xchg_last() to folio_cpupid_xchg_last() Kefeng Wang
  6 siblings, 0 replies; 16+ messages in thread
From: Kefeng Wang @ 2023-10-10  6:45 UTC
  To: Andrew Morton
  Cc: willy, linux-mm, linux-kernel, ying.huang, david, Zi Yan, Kefeng Wang

Make page_cpupid_last() take a folio, and rename it to
folio_cpupid_last(), since all callers now have a folio.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/mm.h | 12 ++++++------
 mm/huge_memory.c   |  4 ++--
 mm/memory.c        |  2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 13ca63efacf7..e0bd8abae6c6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1689,18 +1689,18 @@ static inline int page_cpupid_xchg_last(struct page *page, int cpupid)
 	return xchg(&page->_last_cpupid, cpupid & LAST_CPUPID_MASK);
 }
 
-static inline int page_cpupid_last(struct page *page)
+static inline int folio_cpupid_last(struct folio *folio)
 {
-	return page->_last_cpupid;
+	return folio->_last_cpupid;
 }
 static inline void page_cpupid_reset_last(struct page *page)
 {
 	page->_last_cpupid = -1 & LAST_CPUPID_MASK;
 }
 #else
-static inline int page_cpupid_last(struct page *page)
+static inline int folio_cpupid_last(struct folio *folio)
 {
-	return (page->flags >> LAST_CPUPID_PGSHIFT) & LAST_CPUPID_MASK;
+	return (folio->flags >> LAST_CPUPID_PGSHIFT) & LAST_CPUPID_MASK;
 }
 
 extern int page_cpupid_xchg_last(struct page *page, int cpupid);
@@ -1740,9 +1740,9 @@ static inline int xchg_folio_access_time(struct folio *folio, int time)
 	return 0;
 }
 
-static inline int page_cpupid_last(struct page *page)
+static inline int folio_cpupid_last(struct folio *folio)
 {
-	return page_to_nid(page); /* XXX */
+	return folio_nid(folio); /* XXX */
 }
 
 static inline int cpupid_to_nid(int cpupid)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e85238ac1d5c..3b37367eaeff 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1562,7 +1562,7 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 	 * to record page access time.  So use default value.
 	 */
 	if (node_is_toptier(nid))
-		last_cpupid = page_cpupid_last(&folio->page);
+		last_cpupid = folio_cpupid_last(folio);
 	target_nid = numa_migrate_prep(folio, vma, haddr, nid, &flags);
 	if (target_nid == NUMA_NO_NODE) {
 		folio_put(folio);
@@ -2515,7 +2515,7 @@ static void __split_huge_page_tail(struct folio *folio, int tail,
 	if (page_is_idle(head))
 		set_page_idle(page_tail);
 
-	page_cpupid_xchg_last(page_tail, page_cpupid_last(head));
+	page_cpupid_xchg_last(page_tail, folio_cpupid_last(folio));
 
 	/*
 	 * always add to the tail because some iterators expect new
diff --git a/mm/memory.c b/mm/memory.c
index c4b4aa4c1180..7566955d88e3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4861,7 +4861,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 	    !node_is_toptier(nid))
 		last_cpupid = (-1 & LAST_CPUPID_MASK);
 	else
-		last_cpupid = page_cpupid_last(&folio->page);
+		last_cpupid = folio_cpupid_last(folio);
 	target_nid = numa_migrate_prep(folio, vma, vmf->address, nid, &flags);
 	if (target_nid == NUMA_NO_NODE) {
 		folio_put(folio);
-- 
2.27.0




* [PATCH -next 6/7] mm: make wp_page_reuse() and finish_mkwrite_fault() to take a folio
  2023-10-10  6:45 [PATCH -next 0/7] mm: convert page cpupid functions to folios Kefeng Wang
                   ` (4 preceding siblings ...)
  2023-10-10  6:45 ` [PATCH -next 5/7] mm: convert page_cpupid_last() to folio_cpupid_last() Kefeng Wang
@ 2023-10-10  6:45 ` Kefeng Wang
  2023-10-10  6:45 ` [PATCH -next 7/7] mm: convert page_cpupid_xchg_last() to folio_cpupid_xchg_last() Kefeng Wang
  6 siblings, 0 replies; 16+ messages in thread
From: Kefeng Wang @ 2023-10-10  6:45 UTC
  To: Andrew Morton
  Cc: willy, linux-mm, linux-kernel, ying.huang, david, Zi Yan, Kefeng Wang

Make finish_mkwrite_fault() a static function, and convert
wp_page_reuse() and finish_mkwrite_fault() to take a folio, in
preparation for converting page_cpupid_xchg_last() to take a folio.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/mm.h |  1 -
 mm/memory.c        | 37 ++++++++++++++++++++-----------------
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index e0bd8abae6c6..3d59455626fa 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1335,7 +1335,6 @@ void set_pte_range(struct vm_fault *vmf, struct folio *folio,
 		struct page *page, unsigned int nr, unsigned long addr);
 
 vm_fault_t finish_fault(struct vm_fault *vmf);
-vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf);
 #endif
 
 /*
diff --git a/mm/memory.c b/mm/memory.c
index 7566955d88e3..1a1a6a6ccd58 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3018,23 +3018,24 @@ static vm_fault_t fault_dirty_shared_page(struct vm_fault *vmf)
  * case, all we need to do here is to mark the page as writable and update
  * any related book-keeping.
  */
-static inline void wp_page_reuse(struct vm_fault *vmf)
+static inline void wp_page_reuse(struct vm_fault *vmf, struct folio *folio)
 	__releases(vmf->ptl)
 {
 	struct vm_area_struct *vma = vmf->vma;
-	struct page *page = vmf->page;
 	pte_t entry;
 
 	VM_BUG_ON(!(vmf->flags & FAULT_FLAG_WRITE));
-	VM_BUG_ON(page && PageAnon(page) && !PageAnonExclusive(page));
+	if (folio) {
+		VM_BUG_ON(folio_test_anon(folio) &&
+			  !PageAnonExclusive(vmf->page));
 
-	/*
-	 * Clear the pages cpupid information as the existing
-	 * information potentially belongs to a now completely
-	 * unrelated process.
-	 */
-	if (page)
-		page_cpupid_xchg_last(page, (1 << LAST_CPUPID_SHIFT) - 1);
+		/*
+		 * Clear the pages cpupid information as the existing
+		 * information potentially belongs to a now completely
+		 * unrelated process.
+		 */
+		page_cpupid_xchg_last(vmf->page, (1 << LAST_CPUPID_SHIFT) - 1);
+	}
 
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 	entry = pte_mkyoung(vmf->orig_pte);
@@ -3261,6 +3262,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
  *			  writeable once the page is prepared
  *
  * @vmf: structure describing the fault
+ * @folio: the folio of vmf->page
  *
  * This function handles all that is needed to finish a write page fault in a
  * shared mapping due to PTE being read-only once the mapped page is prepared.
@@ -3272,7 +3274,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
  * Return: %0 on success, %VM_FAULT_NOPAGE when PTE got changed before
  * we acquired PTE lock.
  */
-vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf)
+static vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf,
+				       struct folio *folio)
 {
 	WARN_ON_ONCE(!(vmf->vma->vm_flags & VM_SHARED));
 	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd, vmf->address,
@@ -3288,7 +3291,7 @@ vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf)
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 		return VM_FAULT_NOPAGE;
 	}
-	wp_page_reuse(vmf);
+	wp_page_reuse(vmf, folio);
 	return 0;
 }
 
@@ -3312,9 +3315,9 @@ static vm_fault_t wp_pfn_shared(struct vm_fault *vmf)
 		ret = vma->vm_ops->pfn_mkwrite(vmf);
 		if (ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE))
 			return ret;
-		return finish_mkwrite_fault(vmf);
+		return finish_mkwrite_fault(vmf, NULL);
 	}
-	wp_page_reuse(vmf);
+	wp_page_reuse(vmf, NULL);
 	return 0;
 }
 
@@ -3342,14 +3345,14 @@ static vm_fault_t wp_page_shared(struct vm_fault *vmf, struct folio *folio)
 			folio_put(folio);
 			return tmp;
 		}
-		tmp = finish_mkwrite_fault(vmf);
+		tmp = finish_mkwrite_fault(vmf, folio);
 		if (unlikely(tmp & (VM_FAULT_ERROR | VM_FAULT_NOPAGE))) {
 			folio_unlock(folio);
 			folio_put(folio);
 			return tmp;
 		}
 	} else {
-		wp_page_reuse(vmf);
+		wp_page_reuse(vmf, folio);
 		folio_lock(folio);
 	}
 	ret |= fault_dirty_shared_page(vmf);
@@ -3494,7 +3497,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 			pte_unmap_unlock(vmf->pte, vmf->ptl);
 			return 0;
 		}
-		wp_page_reuse(vmf);
+		wp_page_reuse(vmf, folio);
 		return 0;
 	}
 	/*
-- 
2.27.0




* [PATCH -next 7/7] mm: convert page_cpupid_xchg_last() to folio_cpupid_xchg_last()
  2023-10-10  6:45 [PATCH -next 0/7] mm: convert page cpupid functions to folios Kefeng Wang
                   ` (5 preceding siblings ...)
  2023-10-10  6:45 ` [PATCH -next 6/7] mm: make wp_page_reuse() and finish_mkwrite_fault() to take a folio Kefeng Wang
@ 2023-10-10  6:45 ` Kefeng Wang
  6 siblings, 0 replies; 16+ messages in thread
From: Kefeng Wang @ 2023-10-10  6:45 UTC
  To: Andrew Morton
  Cc: willy, linux-mm, linux-kernel, ying.huang, david, Zi Yan, Kefeng Wang

Make page_cpupid_xchg_last() take a folio, and rename it to
folio_cpupid_xchg_last(), since all callers now have a folio.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/mm.h  | 14 +++++++-------
 kernel/sched/fair.c |  2 +-
 mm/huge_memory.c    |  2 +-
 mm/memory.c         |  2 +-
 mm/migrate.c        |  4 ++--
 mm/mmzone.c         |  6 +++---
 6 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3d59455626fa..e761642e1c00 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1683,9 +1683,9 @@ static inline bool __cpupid_match_pid(pid_t task_pid, int cpupid)
 
 #define cpupid_match_pid(task, cpupid) __cpupid_match_pid(task->pid, cpupid)
 #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
-static inline int page_cpupid_xchg_last(struct page *page, int cpupid)
+static inline int folio_cpupid_xchg_last(struct folio *folio, int cpupid)
 {
-	return xchg(&page->_last_cpupid, cpupid & LAST_CPUPID_MASK);
+	return xchg(&folio->_last_cpupid, cpupid & LAST_CPUPID_MASK);
 }
 
 static inline int folio_cpupid_last(struct folio *folio)
@@ -1702,7 +1702,7 @@ static inline int folio_cpupid_last(struct folio *folio)
 	return (folio->flags >> LAST_CPUPID_PGSHIFT) & LAST_CPUPID_MASK;
 }
 
-extern int page_cpupid_xchg_last(struct page *page, int cpupid);
+extern int folio_cpupid_xchg_last(struct folio *folio, int cpupid);
 
 static inline void page_cpupid_reset_last(struct page *page)
 {
@@ -1714,8 +1714,8 @@ static inline int xchg_folio_access_time(struct folio *folio, int time)
 {
 	int last_time;
 
-	last_time = page_cpupid_xchg_last(&folio->page,
-					  time >> PAGE_ACCESS_TIME_BUCKETS);
+	last_time = folio_cpupid_xchg_last(folio,
+					   time >> PAGE_ACCESS_TIME_BUCKETS);
 	return last_time << PAGE_ACCESS_TIME_BUCKETS;
 }
 
@@ -1729,9 +1729,9 @@ static inline void vma_set_access_pid_bit(struct vm_area_struct *vma)
 	}
 }
 #else /* !CONFIG_NUMA_BALANCING */
-static inline int page_cpupid_xchg_last(struct page *page, int cpupid)
+static inline int folio_cpupid_xchg_last(struct folio *folio, int cpupid)
 {
-	return page_to_nid(page); /* XXX */
+	return folio_nid(folio); /* XXX */
 }
 
 static inline int xchg_folio_access_time(struct folio *folio, int time)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 50b9f63099fb..5d4c7cedc6d1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1818,7 +1818,7 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
 	}
 
 	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
-	last_cpupid = page_cpupid_xchg_last(&folio->page, this_cpupid);
+	last_cpupid = folio_cpupid_xchg_last(folio, this_cpupid);
 
 	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
 	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3b37367eaeff..2163b1d0dad5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2515,7 +2515,7 @@ static void __split_huge_page_tail(struct folio *folio, int tail,
 	if (page_is_idle(head))
 		set_page_idle(page_tail);
 
-	page_cpupid_xchg_last(page_tail, folio_cpupid_last(folio));
+	folio_cpupid_xchg_last(new_folio, folio_cpupid_last(folio));
 
 	/*
 	 * always add to the tail because some iterators expect new
diff --git a/mm/memory.c b/mm/memory.c
index 1a1a6a6ccd58..9f3b359b46db 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3034,7 +3034,7 @@ static inline void wp_page_reuse(struct vm_fault *vmf, struct folio *folio)
 		 * information potentially belongs to a now completely
 		 * unrelated process.
 		 */
-		page_cpupid_xchg_last(vmf->page, (1 << LAST_CPUPID_SHIFT) - 1);
+		folio_cpupid_xchg_last(folio, (1 << LAST_CPUPID_SHIFT) - 1);
 	}
 
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
diff --git a/mm/migrate.c b/mm/migrate.c
index c602bf6dec97..5642e9572d80 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -588,7 +588,7 @@ void folio_migrate_flags(struct folio *newfolio, struct folio *folio)
 	 * Copy NUMA information to the new page, to prevent over-eager
 	 * future migrations of this same page.
 	 */
-	cpupid = page_cpupid_xchg_last(&folio->page, -1);
+	cpupid = folio_cpupid_xchg_last(folio, -1);
 	/*
 	 * For memory tiering mode, when migrate between slow and fast
 	 * memory node, reset cpupid, because that is used to record
@@ -601,7 +601,7 @@ void folio_migrate_flags(struct folio *newfolio, struct folio *folio)
 		if (f_toptier != t_toptier)
 			cpupid = -1;
 	}
-	page_cpupid_xchg_last(&newfolio->page, cpupid);
+	folio_cpupid_xchg_last(newfolio, cpupid);
 
 	folio_migrate_ksm(newfolio, folio);
 	/*
diff --git a/mm/mmzone.c b/mm/mmzone.c
index 68e1511be12d..cd473f82b647 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -93,19 +93,19 @@ void lruvec_init(struct lruvec *lruvec)
 }
 
 #if defined(CONFIG_NUMA_BALANCING) && !defined(LAST_CPUPID_NOT_IN_PAGE_FLAGS)
-int page_cpupid_xchg_last(struct page *page, int cpupid)
+int folio_cpupid_xchg_last(struct folio *folio, int cpupid)
 {
 	unsigned long old_flags, flags;
 	int last_cpupid;
 
-	old_flags = READ_ONCE(page->flags);
+	old_flags = READ_ONCE(folio->flags);
 	do {
 		flags = old_flags;
 		last_cpupid = (flags >> LAST_CPUPID_PGSHIFT) & LAST_CPUPID_MASK;
 
 		flags &= ~(LAST_CPUPID_MASK << LAST_CPUPID_PGSHIFT);
 		flags |= (cpupid & LAST_CPUPID_MASK) << LAST_CPUPID_PGSHIFT;
-	} while (unlikely(!try_cmpxchg(&page->flags, &old_flags, flags)));
+	} while (unlikely(!try_cmpxchg(&folio->flags, &old_flags, flags)));
 
 	return last_cpupid;
 }
-- 
2.27.0




* Re: [PATCH -next 1/7] mm_types: add _last_cpupid into folio
  2023-10-10  6:45 ` [PATCH -next 1/7] mm_types: add _last_cpupid into folio Kefeng Wang
@ 2023-10-10  8:17   ` Huang, Ying
  2023-10-10 11:10     ` Kefeng Wang
  2023-10-10 12:33   ` Matthew Wilcox
  1 sibling, 1 reply; 16+ messages in thread
From: Huang, Ying @ 2023-10-10  8:17 UTC
  To: Kefeng Wang; +Cc: Andrew Morton, willy, linux-mm, linux-kernel, david, Zi Yan

Kefeng Wang <wangkefeng.wang@huawei.com> writes:

> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, and none
> of them support numa balancing; since the page struct is aligned
> to _struct_page_alignment, it is safe to move _last_cpupid before
> 'virtual' in struct page. Meanwhile, add it into the folio, which
> lets us use folio->_last_cpupid directly.

Add BUILD_BUG_ON() to check this automatically?
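
For example, something along these lines (just a sketch; since both
symbols are plain defines rather than Kconfig options, a preprocessor
check may be simpler than BUILD_BUG_ON()):

#if defined(WANT_PAGE_VIRTUAL) && defined(LAST_CPUPID_NOT_IN_PAGE_FLAGS)
#error "_last_cpupid placement assumes these are mutually exclusive"
#endif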

--
Best Regards,
Huang, Ying



* Re: [PATCH -next 1/7] mm_types: add _last_cpupid into folio
  2023-10-10  8:17   ` Huang, Ying
@ 2023-10-10 11:10     ` Kefeng Wang
  0 siblings, 0 replies; 16+ messages in thread
From: Kefeng Wang @ 2023-10-10 11:10 UTC
  To: Huang, Ying; +Cc: Andrew Morton, willy, linux-mm, linux-kernel, david, Zi Yan



On 2023/10/10 16:17, Huang, Ying wrote:
> Kefeng Wang <wangkefeng.wang@huawei.com> writes:
> 
>> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, and none
>> of them support numa balancing; since the page struct is aligned
>> to _struct_page_alignment, it is safe to move _last_cpupid before
>> 'virtual' in struct page. Meanwhile, add it into the folio, which
>> lets us use folio->_last_cpupid directly.
> 
> Add BUILD_BUG_ON() to check this automatically?

WANT_PAGE_VIRTUAL and LAST_CPUPID_NOT_IN_PAGE_FLAGS do not
conflict; the check is only to make sure that reordering 'virtual'
and '_last_cpupid' has minimal impact, and there is already a build
warning in mm/memory.c when LAST_CPUPID_NOT_IN_PAGE_FLAGS is
enabled, so I don't think we need a new BUILD_BUG_ON() here.

Thanks.

> 
> --
> Best Regards,
> Huang, Ying



* Re: [PATCH -next 4/7] mm: convert xchg_page_access_time to xchg_folio_access_time()
  2023-10-10  6:45 ` [PATCH -next 4/7] mm: convert xchg_page_access_time to xchg_folio_access_time() Kefeng Wang
@ 2023-10-10 12:27   ` Matthew Wilcox
  2023-10-11  3:03     ` Kefeng Wang
  0 siblings, 1 reply; 16+ messages in thread
From: Matthew Wilcox @ 2023-10-10 12:27 UTC
  To: Kefeng Wang
  Cc: Andrew Morton, linux-mm, linux-kernel, ying.huang, david, Zi Yan

On Tue, Oct 10, 2023 at 02:45:41PM +0800, Kefeng Wang wrote:
> Make xchg_page_access_time() take a folio, and rename it to
> xchg_folio_access_time(), since all callers now have a folio.

You're doing this the hard way, which makes life hard for the reviewers.

Patch 1: Introduce folio->_last_cpupid
Patch 2: Add

static inline int folio_xchg_access_time(struct folio *folio, int time)
{
	return xchg_page_access_time(&folio->page, time);
}

Patches 3-n: Convert callers
Patch n+1: Remove xchg_page_access_time(), folding it into
folio_xchg_access_time().

Similarly for page_cpupid_xchg_last / folio_cpupid_xchg_last().
(why is this not called folio_xchg_last_cpupid?)
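
With that naming, the end state after the final fold would look
roughly like this (a sketch; the body matches the existing
xchg_page_access_time()):

static inline int folio_xchg_access_time(struct folio *folio, int time)
{
	int last_time;

	last_time = folio_xchg_last_cpupid(folio,
					   time >> PAGE_ACCESS_TIME_BUCKETS);
	return last_time << PAGE_ACCESS_TIME_BUCKETS;
}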




* Re: [PATCH -next 1/7] mm_types: add _last_cpupid into folio
  2023-10-10  6:45 ` [PATCH -next 1/7] mm_types: add _last_cpupid into folio Kefeng Wang
  2023-10-10  8:17   ` Huang, Ying
@ 2023-10-10 12:33   ` Matthew Wilcox
  2023-10-11  3:02     ` Kefeng Wang
  1 sibling, 1 reply; 16+ messages in thread
From: Matthew Wilcox @ 2023-10-10 12:33 UTC
  To: Kefeng Wang
  Cc: Andrew Morton, linux-mm, linux-kernel, ying.huang, david, Zi Yan

On Tue, Oct 10, 2023 at 02:45:38PM +0800, Kefeng Wang wrote:
> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, and none
> of them support numa balancing; since the page struct is aligned
> to _struct_page_alignment, it is safe to move _last_cpupid before
> 'virtual' in struct page. Meanwhile, add it into the folio, which
> lets us use folio->_last_cpupid directly.

What do you mean by "safe"?  I think you mean "Does not increase the
size of struct page", but if that is what you mean, why not just say so?
If there's something else you mean, please explain.

In any event, I'd like to see some reasoning that _last_cpupid is actually
information which is logically maintained on a per-allocation basis,
not a per-page basis (I think this is true, but I honestly don't know).

And looking at all this, I think it makes sense to move _last_cpupid
before the kmsan garbage, then add both 'virtual' and '_last_cpupid'
to folio.
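
That layout would look roughly like this (a sketch of the tail of
struct page; surrounding fields elided):

struct page {
	/* ... */
#if defined(WANT_PAGE_VIRTUAL)
	void *virtual;			/* kernel virtual address */
#endif
#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
	int _last_cpupid;
#endif
#ifdef CONFIG_KMSAN
	struct page *kmsan_shadow;
	struct page *kmsan_origin;
#endif
} _struct_page_alignment;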




* Re: [PATCH -next 1/7] mm_types: add _last_cpupid into folio
  2023-10-10 12:33   ` Matthew Wilcox
@ 2023-10-11  3:02     ` Kefeng Wang
  2023-10-11  5:55       ` Huang, Ying
  0 siblings, 1 reply; 16+ messages in thread
From: Kefeng Wang @ 2023-10-11  3:02 UTC
  To: Matthew Wilcox
  Cc: Andrew Morton, linux-mm, linux-kernel, ying.huang, david, Zi Yan



On 2023/10/10 20:33, Matthew Wilcox wrote:
> On Tue, Oct 10, 2023 at 02:45:38PM +0800, Kefeng Wang wrote:
>> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, and none
>> of them support numa balancing; since the page struct is aligned
>> to _struct_page_alignment, it is safe to move _last_cpupid before
>> 'virtual' in struct page. Meanwhile, add it into the folio, which
>> lets us use folio->_last_cpupid directly.
> 
> What do you mean by "safe"?  I think you mean "Does not increase the
> size of struct page", but if that is what you mean, why not just say so?
> If there's something else you mean, please explain.

It doesn't increase the size of struct page, and it doesn't change
the actual layout of struct page for the three archs above, since
they don't support numa balancing.

> 
> In any event, I'd like to see some reasoning that _last_cpupid is actually
> information which is logically maintained on a per-allocation basis,
> not a per-page basis (I think this is true, but I honestly don't know).

The _last_cpupid is updated in should_numa_migrate_memory() from
the numa fault paths (do_numa_page() and do_huge_pmd_numa_page()),
so it is per-page (both normal pages and PMD-mapped pages). Maybe I
misunderstand your meaning; please correct me.

> 
> And looking at all this, I think it makes sense to move _last_cpupid
> before the kmsan garbage, then add both 'virtual' and '_last_cpupid'
> to folio.

Sure, I will add both of them to the folio and not reorder
'virtual' and '_last_cpupid'.
> 
> 



* Re: [PATCH -next 4/7] mm: convert xchg_page_access_time to xchg_folio_access_time()
  2023-10-10 12:27   ` Matthew Wilcox
@ 2023-10-11  3:03     ` Kefeng Wang
  0 siblings, 0 replies; 16+ messages in thread
From: Kefeng Wang @ 2023-10-11  3:03 UTC
  To: Matthew Wilcox
  Cc: Andrew Morton, linux-mm, linux-kernel, ying.huang, david, Zi Yan



On 2023/10/10 20:27, Matthew Wilcox wrote:
> On Tue, Oct 10, 2023 at 02:45:41PM +0800, Kefeng Wang wrote:
>> Make xchg_page_access_time() take a folio, and rename it to
>> xchg_folio_access_time(), since all callers now have a folio.
> 
> You're doing this the hard way, which makes life hard for the reviewrs.
> 
> patch 1. Introduce folio->_last_cpupid
> patch 2: Add
> 
> static inline int folio_xchg_access_time(struct folio *folio, int time)
> {
> 	return xchg_page_access_time(&folio->page, time);
> }
> 
> patch 3-n: Convert callers
> Patch n+1: Remove xchg_page_access_time(), folding it into
> folio_xchg_access_time().

OK, I will follow this approach; thanks for your advice.
> 
> Similarly for page_cpupid_xchg_last / folio_cpupid_xchg_last().
> (why is this not called folio_xchg_last_cpupid?)

Fine with me, will update.

Thanks.

> 
> 



* Re: [PATCH -next 1/7] mm_types: add _last_cpupid into folio
  2023-10-11  3:02     ` Kefeng Wang
@ 2023-10-11  5:55       ` Huang, Ying
  2023-10-11  8:05         ` Kefeng Wang
  0 siblings, 1 reply; 16+ messages in thread
From: Huang, Ying @ 2023-10-11  5:55 UTC
  To: Kefeng Wang, Matthew Wilcox
  Cc: Andrew Morton, linux-mm, linux-kernel, david, Zi Yan, Mel Gorman

Kefeng Wang <wangkefeng.wang@huawei.com> writes:

> On 2023/10/10 20:33, Matthew Wilcox wrote:
>> On Tue, Oct 10, 2023 at 02:45:38PM +0800, Kefeng Wang wrote:
>>> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, and none
>>> of them support numa balancing; since the page struct is aligned
>>> to _struct_page_alignment, it is safe to move _last_cpupid before
>>> 'virtual' in struct page. Meanwhile, add it into the folio, which
>>> lets us use folio->_last_cpupid directly.
>> What do you mean by "safe"?  I think you mean "Does not increase the
>> size of struct page", but if that is what you mean, why not just say so?
>> If there's something else you mean, please explain.
>
> It doesn't increase the size of struct page, and it doesn't change
> the actual layout of struct page for the three archs above, since
> they don't support numa balancing.
>
>> In any event, I'd like to see some reasoning that _last_cpupid is actually
>> information which is logically maintained on a per-allocation basis,
>> not a per-page basis (I think this is true, but I honestly don't know).
>
> The _last_cpupid is updated in should_numa_migrate_memory() from
> the numa fault paths (do_numa_page() and do_huge_pmd_numa_page()),
> so it is per-page (both normal pages and PMD-mapped pages). Maybe I
> misunderstand your meaning; please correct me.

Because PTE-mapped THPs will not be migrated, according to the
comments and the folio_test_large() test in do_numa_page(), only
the _last_cpupid of the head page will be used (that is, on a
per-allocation basis).  Although the _last_cpupid of tail pages may
be changed in change_pte_range() in mprotect.c, those values are
never actually used.  All in all, _last_cpupid is maintained on a
per-allocation basis for now.
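
The relevant guard is roughly (simplified from do_numa_page()):

	/* TODO: handle PTE-mapped THP */
	if (folio_test_large(folio))
		goto out_map;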

It's hard to say about the future.  PTE-mapped THPs or large folios
give us an opportunity to check whether different parts of a folio
are accessed by multiple sockets, in which case we should split the
folio.  But this is just a possibility for the future.

--
Best Regards,
Huang, Ying



* Re: [PATCH -next 1/7] mm_types: add _last_cpupid into folio
  2023-10-11  5:55       ` Huang, Ying
@ 2023-10-11  8:05         ` Kefeng Wang
  0 siblings, 0 replies; 16+ messages in thread
From: Kefeng Wang @ 2023-10-11  8:05 UTC
  To: Huang, Ying, Matthew Wilcox
  Cc: Andrew Morton, linux-mm, linux-kernel, david, Zi Yan, Mel Gorman



On 2023/10/11 13:55, Huang, Ying wrote:
> Kefeng Wang <wangkefeng.wang@huawei.com> writes:
> 
>> On 2023/10/10 20:33, Matthew Wilcox wrote:
>>> On Tue, Oct 10, 2023 at 02:45:38PM +0800, Kefeng Wang wrote:
>>>> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, and none
>>>> of them support numa balancing; since the page struct is aligned
>>>> to _struct_page_alignment, it is safe to move _last_cpupid before
>>>> 'virtual' in struct page. Meanwhile, add it into the folio, which
>>>> lets us use folio->_last_cpupid directly.
>>> What do you mean by "safe"?  I think you mean "Does not increase the
>>> size of struct page", but if that is what you mean, why not just say so?
>>> If there's something else you mean, please explain.
>>
>> It doesn't increase the size of struct page, and it doesn't change
>> the actual layout of struct page for the three archs above, since
>> they don't support numa balancing.
>>
>>> In any event, I'd like to see some reasoning that _last_cpupid is actually
>>> information which is logically maintained on a per-allocation basis,
>>> not a per-page basis (I think this is true, but I honestly don't know).
>>
>> The _last_cpupid is updated in should_numa_migrate_memory() from
>> the numa fault paths (do_numa_page() and do_huge_pmd_numa_page()),
>> so it is per-page (both normal pages and PMD-mapped pages). Maybe I
>> misunderstand your meaning; please correct me.
> 
> Because PTE-mapped THPs will not be migrated, according to the
> comments and the folio_test_large() test in do_numa_page(), only
> the _last_cpupid of the head page will be used (that is, on a
> per-allocation basis).  Although the _last_cpupid of tail pages may
> be changed in change_pte_range() in mprotect.c, those values are
> never actually used.  All in all, _last_cpupid is maintained on a
> per-allocation basis for now.

Thanks for the clarification; yes, that's what I meant too.
> 
> It's hard to say about the future.  PTE-mapped THPs or large folios
> give us an opportunity to check whether different parts of a folio
> are accessed by multiple sockets, in which case we should split the
> folio.  But this is just a possibility for the future.

It depends on the memory access behavior of the application: if
multiple sockets access a large folio/PTE-mapped THP frequently,
splitting may be better; otherwise it is enough to just migrate the
entire folio.


> 
> --
> Best Regards,
> Huang, Ying
> 



Thread overview: 16+ messages
2023-10-10  6:45 [PATCH -next 0/7] mm: convert page cpupid functions to folios Kefeng Wang
2023-10-10  6:45 ` [PATCH -next 1/7] mm_types: add _last_cpupid into folio Kefeng Wang
2023-10-10  8:17   ` Huang, Ying
2023-10-10 11:10     ` Kefeng Wang
2023-10-10 12:33   ` Matthew Wilcox
2023-10-11  3:02     ` Kefeng Wang
2023-10-11  5:55       ` Huang, Ying
2023-10-11  8:05         ` Kefeng Wang
2023-10-10  6:45 ` [PATCH -next 2/7] mm: mprotect: use a folio in change_pte_range() Kefeng Wang
2023-10-10  6:45 ` [PATCH -next 3/7] mm: huge_memory: use a folio in change_huge_pmd() Kefeng Wang
2023-10-10  6:45 ` [PATCH -next 4/7] mm: convert xchg_page_access_time to xchg_folio_access_time() Kefeng Wang
2023-10-10 12:27   ` Matthew Wilcox
2023-10-11  3:03     ` Kefeng Wang
2023-10-10  6:45 ` [PATCH -next 5/7] mm: convert page_cpupid_last() to folio_cpupid_last() Kefeng Wang
2023-10-10  6:45 ` [PATCH -next 6/7] mm: make wp_page_reuse() and finish_mkwrite_fault() to take a folio Kefeng Wang
2023-10-10  6:45 ` [PATCH -next 7/7] mm: convert page_cpupid_xchg_last() to folio_cpupid_xchg_last() Kefeng Wang
