linux-mm.kvack.org archive mirror
* [PATCH 0/7] khugepaged folio conversions
@ 2024-04-03 17:18 Matthew Wilcox (Oracle)
  2024-04-03 17:18 ` [PATCH 1/7] khugepaged: Inline hpage_collapse_alloc_folio() Matthew Wilcox (Oracle)
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-04-03 17:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm

We have been converting khugepaged from compound pages to folios
piecemeal, so this patchset is a little larger than it would otherwise
be, as it also undoes some of our earlier missteps.
In particular, collapse_file() now consistently uses 'new_folio' for the
freshly allocated folio and 'folio' for the one that's currently in use.

Matthew Wilcox (Oracle) (7):
  khugepaged: Inline hpage_collapse_alloc_folio()
  khugepaged: Convert alloc_charge_hpage to alloc_charge_folio
  khugepaged: Remove hpage from collapse_huge_page()
  khugepaged: Pass a folio to __collapse_huge_page_copy()
  khugepaged: Remove hpage from collapse_file()
  khugepaged: Use a folio throughout collapse_file()
  khugepaged: Use a folio throughout hpage_collapse_scan_file()

 include/trace/events/huge_memory.h |  12 +-
 mm/khugepaged.c                    | 293 +++++++++++++----------------
 2 files changed, 142 insertions(+), 163 deletions(-)

-- 
2.43.0




* [PATCH 1/7] khugepaged: Inline hpage_collapse_alloc_folio()
  2024-04-03 17:18 [PATCH 0/7] khugepaged folio conversions Matthew Wilcox (Oracle)
@ 2024-04-03 17:18 ` Matthew Wilcox (Oracle)
  2024-04-05 21:13   ` Vishal Moola
  2024-04-03 17:18 ` [PATCH 2/7] khugepaged: Convert alloc_charge_hpage to alloc_charge_folio Matthew Wilcox (Oracle)
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-04-03 17:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm

This function has one caller, and the combined function is simpler to
read, reason about and modify.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/khugepaged.c | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 38830174608f..ad16dd8b26a8 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -891,20 +891,6 @@ static int hpage_collapse_find_target_node(struct collapse_control *cc)
 }
 #endif
 
-static bool hpage_collapse_alloc_folio(struct folio **folio, gfp_t gfp, int node,
-				      nodemask_t *nmask)
-{
-	*folio = __folio_alloc(gfp, HPAGE_PMD_ORDER, node, nmask);
-
-	if (unlikely(!*folio)) {
-		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
-		return false;
-	}
-
-	count_vm_event(THP_COLLAPSE_ALLOC);
-	return true;
-}
-
 /*
  * If mmap_lock temporarily dropped, revalidate vma
  * before taking mmap_lock.
@@ -1067,11 +1053,14 @@ static int alloc_charge_hpage(struct page **hpage, struct mm_struct *mm,
 	int node = hpage_collapse_find_target_node(cc);
 	struct folio *folio;
 
-	if (!hpage_collapse_alloc_folio(&folio, gfp, node, &cc->alloc_nmask)) {
+	folio = __folio_alloc(gfp, HPAGE_PMD_ORDER, node, &cc->alloc_nmask);
+	if (!folio) {
 		*hpage = NULL;
+		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
 		return SCAN_ALLOC_HUGE_PAGE_FAIL;
 	}
 
+	count_vm_event(THP_COLLAPSE_ALLOC);
 	if (unlikely(mem_cgroup_charge(folio, mm, gfp))) {
 		folio_put(folio);
 		*hpage = NULL;
-- 
2.43.0




* [PATCH 2/7] khugepaged: Convert alloc_charge_hpage to alloc_charge_folio
  2024-04-03 17:18 [PATCH 0/7] khugepaged folio conversions Matthew Wilcox (Oracle)
  2024-04-03 17:18 ` [PATCH 1/7] khugepaged: Inline hpage_collapse_alloc_folio() Matthew Wilcox (Oracle)
@ 2024-04-03 17:18 ` Matthew Wilcox (Oracle)
  2024-04-05 21:14   ` Vishal Moola
  2024-04-03 17:18 ` [PATCH 3/7] khugepaged: Remove hpage from collapse_huge_page() Matthew Wilcox (Oracle)
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-04-03 17:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm

Both callers want to deal with a folio, so return a folio from
this function.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/khugepaged.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index ad16dd8b26a8..2f1dacd65d12 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1045,7 +1045,7 @@ static int __collapse_huge_page_swapin(struct mm_struct *mm,
 	return result;
 }
 
-static int alloc_charge_hpage(struct page **hpage, struct mm_struct *mm,
+static int alloc_charge_folio(struct folio **foliop, struct mm_struct *mm,
 			      struct collapse_control *cc)
 {
 	gfp_t gfp = (cc->is_khugepaged ? alloc_hugepage_khugepaged_gfpmask() :
@@ -1055,7 +1055,7 @@ static int alloc_charge_hpage(struct page **hpage, struct mm_struct *mm,
 
 	folio = __folio_alloc(gfp, HPAGE_PMD_ORDER, node, &cc->alloc_nmask);
 	if (!folio) {
-		*hpage = NULL;
+		*foliop = NULL;
 		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
 		return SCAN_ALLOC_HUGE_PAGE_FAIL;
 	}
@@ -1063,13 +1063,13 @@ static int alloc_charge_hpage(struct page **hpage, struct mm_struct *mm,
 	count_vm_event(THP_COLLAPSE_ALLOC);
 	if (unlikely(mem_cgroup_charge(folio, mm, gfp))) {
 		folio_put(folio);
-		*hpage = NULL;
+		*foliop = NULL;
 		return SCAN_CGROUP_CHARGE_FAIL;
 	}
 
 	count_memcg_folio_events(folio, THP_COLLAPSE_ALLOC, 1);
 
-	*hpage = folio_page(folio, 0);
+	*foliop = folio;
 	return SCAN_SUCCEED;
 }
 
@@ -1098,7 +1098,8 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	 */
 	mmap_read_unlock(mm);
 
-	result = alloc_charge_hpage(&hpage, mm, cc);
+	result = alloc_charge_folio(&folio, mm, cc);
+	hpage = &folio->page;
 	if (result != SCAN_SUCCEED)
 		goto out_nolock;
 
@@ -1204,7 +1205,6 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	if (unlikely(result != SCAN_SUCCEED))
 		goto out_up_write;
 
-	folio = page_folio(hpage);
 	/*
 	 * The smp_wmb() inside __folio_mark_uptodate() ensures the
 	 * copy_huge_page writes become visible before the set_pmd_at()
@@ -1789,7 +1789,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	struct page *hpage;
 	struct page *page;
 	struct page *tmp;
-	struct folio *folio;
+	struct folio *folio, *new_folio;
 	pgoff_t index = 0, end = start + HPAGE_PMD_NR;
 	LIST_HEAD(pagelist);
 	XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
@@ -1800,7 +1800,8 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);
 	VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
 
-	result = alloc_charge_hpage(&hpage, mm, cc);
+	result = alloc_charge_folio(&new_folio, mm, cc);
+	hpage = &new_folio->page;
 	if (result != SCAN_SUCCEED)
 		goto out;
 
-- 
2.43.0




* [PATCH 3/7] khugepaged: Remove hpage from collapse_huge_page()
  2024-04-03 17:18 [PATCH 0/7] khugepaged folio conversions Matthew Wilcox (Oracle)
  2024-04-03 17:18 ` [PATCH 1/7] khugepaged: Inline hpage_collapse_alloc_folio() Matthew Wilcox (Oracle)
  2024-04-03 17:18 ` [PATCH 2/7] khugepaged: Convert alloc_charge_hpage to alloc_charge_folio Matthew Wilcox (Oracle)
@ 2024-04-03 17:18 ` Matthew Wilcox (Oracle)
  2024-04-05 21:14   ` Vishal Moola
  2024-04-03 17:18 ` [PATCH 4/7] khugepaged: Pass a folio to __collapse_huge_page_copy() Matthew Wilcox (Oracle)
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-04-03 17:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm

Work purely in terms of the folio.  Removes a call to compound_head()
in put_page().
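
For reference, this is roughly where that call comes from (a simplified
sketch paraphrased from include/linux/mm.h of this era, not part of the
patch): put_page() must look up the head page before dropping the
reference, while folio_put() already has the folio in hand.

	static inline void put_page(struct page *page)
	{
		/* page_folio() performs the compound_head() lookup */
		struct folio *folio = page_folio(page);

		folio_put(folio);
	}

	static inline void folio_put(struct folio *folio)
	{
		/* no head-page lookup needed here */
		if (folio_put_testzero(folio))
			__folio_put(folio);
	}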

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/khugepaged.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 2f1dacd65d12..1c99d18602e5 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1082,7 +1082,6 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	pte_t *pte;
 	pgtable_t pgtable;
 	struct folio *folio;
-	struct page *hpage;
 	spinlock_t *pmd_ptl, *pte_ptl;
 	int result = SCAN_FAIL;
 	struct vm_area_struct *vma;
@@ -1099,7 +1098,6 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	mmap_read_unlock(mm);
 
 	result = alloc_charge_folio(&folio, mm, cc);
-	hpage = &folio->page;
 	if (result != SCAN_SUCCEED)
 		goto out_nolock;
 
@@ -1198,7 +1196,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	 */
 	anon_vma_unlock_write(vma->anon_vma);
 
-	result = __collapse_huge_page_copy(pte, hpage, pmd, _pmd,
+	result = __collapse_huge_page_copy(pte, &folio->page, pmd, _pmd,
 					   vma, address, pte_ptl,
 					   &compound_pagelist);
 	pte_unmap(pte);
@@ -1213,7 +1211,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	__folio_mark_uptodate(folio);
 	pgtable = pmd_pgtable(_pmd);
 
-	_pmd = mk_huge_pmd(hpage, vma->vm_page_prot);
+	_pmd = mk_huge_pmd(&folio->page, vma->vm_page_prot);
 	_pmd = maybe_pmd_mkwrite(pmd_mkdirty(_pmd), vma);
 
 	spin_lock(pmd_ptl);
@@ -1225,14 +1223,14 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	update_mmu_cache_pmd(vma, address, pmd);
 	spin_unlock(pmd_ptl);
 
-	hpage = NULL;
+	folio = NULL;
 
 	result = SCAN_SUCCEED;
 out_up_write:
 	mmap_write_unlock(mm);
 out_nolock:
-	if (hpage)
-		put_page(hpage);
+	if (folio)
+		folio_put(folio);
 	trace_mm_collapse_huge_page(mm, result == SCAN_SUCCEED, result);
 	return result;
 }
-- 
2.43.0




* [PATCH 4/7] khugepaged: Pass a folio to __collapse_huge_page_copy()
  2024-04-03 17:18 [PATCH 0/7] khugepaged folio conversions Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2024-04-03 17:18 ` [PATCH 3/7] khugepaged: Remove hpage from collapse_huge_page() Matthew Wilcox (Oracle)
@ 2024-04-03 17:18 ` Matthew Wilcox (Oracle)
  2024-04-05 21:19   ` Vishal Moola
  2024-04-03 17:18 ` [PATCH 5/7] khugepaged: Remove hpage from collapse_file() Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-04-03 17:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm

Pass the new folio to __collapse_huge_page_copy() instead of its head
page, and simplify the body of the function while I'm looking at it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/khugepaged.c | 34 +++++++++++++++-------------------
 1 file changed, 15 insertions(+), 19 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 1c99d18602e5..71a4119ce3a8 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -767,7 +767,7 @@ static void __collapse_huge_page_copy_failed(pte_t *pte,
  * Returns SCAN_SUCCEED if copying succeeds, otherwise returns SCAN_COPY_MC.
  *
  * @pte: starting of the PTEs to copy from
- * @page: the new hugepage to copy contents to
+ * @folio: the new hugepage to copy contents to
  * @pmd: pointer to the new hugepage's PMD
  * @orig_pmd: the original raw pages' PMD
  * @vma: the original raw pages' virtual memory area
@@ -775,33 +775,29 @@ static void __collapse_huge_page_copy_failed(pte_t *pte,
  * @ptl: lock on raw pages' PTEs
  * @compound_pagelist: list that stores compound pages
  */
-static int __collapse_huge_page_copy(pte_t *pte,
-				     struct page *page,
-				     pmd_t *pmd,
-				     pmd_t orig_pmd,
-				     struct vm_area_struct *vma,
-				     unsigned long address,
-				     spinlock_t *ptl,
-				     struct list_head *compound_pagelist)
+static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio,
+		pmd_t *pmd, pmd_t orig_pmd, struct vm_area_struct *vma,
+		unsigned long address, spinlock_t *ptl,
+		struct list_head *compound_pagelist)
 {
-	struct page *src_page;
-	pte_t *_pte;
-	pte_t pteval;
-	unsigned long _address;
+	unsigned int i;
 	int result = SCAN_SUCCEED;
 
 	/*
 	 * Copying pages' contents is subject to memory poison at any iteration.
 	 */
-	for (_pte = pte, _address = address; _pte < pte + HPAGE_PMD_NR;
-	     _pte++, page++, _address += PAGE_SIZE) {
-		pteval = ptep_get(_pte);
+	for (i = 0; i < HPAGE_PMD_NR; i++) {
+		pte_t pteval = ptep_get(pte + i);
+		struct page *page = folio_page(folio, i);
+		unsigned long src_addr = address + i * PAGE_SIZE;
+		struct page *src_page;
+
 		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
-			clear_user_highpage(page, _address);
+			clear_user_highpage(page, src_addr);
 			continue;
 		}
 		src_page = pte_page(pteval);
-		if (copy_mc_user_highpage(page, src_page, _address, vma) > 0) {
+		if (copy_mc_user_highpage(page, src_page, src_addr, vma) > 0) {
 			result = SCAN_COPY_MC;
 			break;
 		}
@@ -1196,7 +1192,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	 */
 	anon_vma_unlock_write(vma->anon_vma);
 
-	result = __collapse_huge_page_copy(pte, &folio->page, pmd, _pmd,
+	result = __collapse_huge_page_copy(pte, folio, pmd, _pmd,
 					   vma, address, pte_ptl,
 					   &compound_pagelist);
 	pte_unmap(pte);
-- 
2.43.0




* [PATCH 5/7] khugepaged: Remove hpage from collapse_file()
  2024-04-03 17:18 [PATCH 0/7] khugepaged folio conversions Matthew Wilcox (Oracle)
                   ` (3 preceding siblings ...)
  2024-04-03 17:18 ` [PATCH 4/7] khugepaged: Pass a folio to __collapse_huge_page_copy() Matthew Wilcox (Oracle)
@ 2024-04-03 17:18 ` Matthew Wilcox (Oracle)
  2024-04-05 21:19   ` Vishal Moola
  2024-04-03 17:18 ` [PATCH 6/7] khugepaged: Use a folio throughout collapse_file() Matthew Wilcox (Oracle)
  2024-04-03 17:18 ` [PATCH 7/7] khugepaged: Use a folio throughout hpage_collapse_scan_file() Matthew Wilcox (Oracle)
  6 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-04-03 17:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm

Use new_folio throughout where we had been using hpage.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/trace/events/huge_memory.h |  6 +--
 mm/khugepaged.c                    | 77 +++++++++++++++---------------
 2 files changed, 42 insertions(+), 41 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index 6e2ef1d4b002..dc6eeef2d3da 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -207,10 +207,10 @@ TRACE_EVENT(mm_khugepaged_scan_file,
 );
 
 TRACE_EVENT(mm_khugepaged_collapse_file,
-	TP_PROTO(struct mm_struct *mm, struct page *hpage, pgoff_t index,
+	TP_PROTO(struct mm_struct *mm, struct folio *new_folio, pgoff_t index,
 			bool is_shmem, unsigned long addr, struct file *file,
 			int nr, int result),
-	TP_ARGS(mm, hpage, index, addr, is_shmem, file, nr, result),
+	TP_ARGS(mm, new_folio, index, addr, is_shmem, file, nr, result),
 	TP_STRUCT__entry(
 		__field(struct mm_struct *, mm)
 		__field(unsigned long, hpfn)
@@ -224,7 +224,7 @@ TRACE_EVENT(mm_khugepaged_collapse_file,
 
 	TP_fast_assign(
 		__entry->mm = mm;
-		__entry->hpfn = hpage ? page_to_pfn(hpage) : -1;
+		__entry->hpfn = new_folio ? folio_pfn(new_folio) : -1;
 		__entry->index = index;
 		__entry->addr = addr;
 		__entry->is_shmem = is_shmem;
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 71a4119ce3a8..d44584b5e004 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1780,30 +1780,27 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 			 struct collapse_control *cc)
 {
 	struct address_space *mapping = file->f_mapping;
-	struct page *hpage;
 	struct page *page;
-	struct page *tmp;
+	struct page *tmp, *dst;
 	struct folio *folio, *new_folio;
 	pgoff_t index = 0, end = start + HPAGE_PMD_NR;
 	LIST_HEAD(pagelist);
 	XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
 	int nr_none = 0, result = SCAN_SUCCEED;
 	bool is_shmem = shmem_file(file);
-	int nr = 0;
 
 	VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);
 	VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
 
 	result = alloc_charge_folio(&new_folio, mm, cc);
-	hpage = &new_folio->page;
 	if (result != SCAN_SUCCEED)
 		goto out;
 
-	__SetPageLocked(hpage);
+	__folio_set_locked(new_folio);
 	if (is_shmem)
-		__SetPageSwapBacked(hpage);
-	hpage->index = start;
-	hpage->mapping = mapping;
+		__folio_set_swapbacked(new_folio);
+	new_folio->index = start;
+	new_folio->mapping = mapping;
 
 	/*
 	 * Ensure we have slots for all the pages in the range.  This is
@@ -2036,20 +2033,24 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	 * The old pages are locked, so they won't change anymore.
 	 */
 	index = start;
+	dst = folio_page(new_folio, 0);
 	list_for_each_entry(page, &pagelist, lru) {
 		while (index < page->index) {
-			clear_highpage(hpage + (index % HPAGE_PMD_NR));
+			clear_highpage(dst);
 			index++;
+			dst++;
 		}
-		if (copy_mc_highpage(hpage + (page->index % HPAGE_PMD_NR), page) > 0) {
+		if (copy_mc_highpage(dst, page) > 0) {
 			result = SCAN_COPY_MC;
 			goto rollback;
 		}
 		index++;
+		dst++;
 	}
 	while (index < end) {
-		clear_highpage(hpage + (index % HPAGE_PMD_NR));
+		clear_highpage(dst);
 		index++;
+		dst++;
 	}
 
 	if (nr_none) {
@@ -2077,16 +2078,17 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 		}
 
 		/*
-		 * If userspace observed a missing page in a VMA with a MODE_MISSING
-		 * userfaultfd, then it might expect a UFFD_EVENT_PAGEFAULT for that
-		 * page. If so, we need to roll back to avoid suppressing such an
-		 * event. Since wp/minor userfaultfds don't give userspace any
-		 * guarantees that the kernel doesn't fill a missing page with a zero
-		 * page, so they don't matter here.
+		 * If userspace observed a missing page in a VMA with
+		 * a MODE_MISSING userfaultfd, then it might expect a
+		 * UFFD_EVENT_PAGEFAULT for that page. If so, we need to
+		 * roll back to avoid suppressing such an event. Since
+		 * wp/minor userfaultfds don't give userspace any
+		 * guarantees that the kernel doesn't fill a missing
+		 * page with a zero page, so they don't matter here.
 		 *
-		 * Any userfaultfds registered after this point will not be able to
-		 * observe any missing pages due to the previously inserted retry
-		 * entries.
+		 * Any userfaultfds registered after this point will
+		 * not be able to observe any missing pages due to the
+		 * previously inserted retry entries.
 		 */
 		vma_interval_tree_foreach(vma, &mapping->i_mmap, start, end) {
 			if (userfaultfd_missing(vma)) {
@@ -2111,33 +2113,32 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 		xas_lock_irq(&xas);
 	}
 
-	folio = page_folio(hpage);
-	nr = folio_nr_pages(folio);
 	if (is_shmem)
-		__lruvec_stat_mod_folio(folio, NR_SHMEM_THPS, nr);
+		__lruvec_stat_mod_folio(new_folio, NR_SHMEM_THPS, HPAGE_PMD_NR);
 	else
-		__lruvec_stat_mod_folio(folio, NR_FILE_THPS, nr);
+		__lruvec_stat_mod_folio(new_folio, NR_FILE_THPS, HPAGE_PMD_NR);
 
 	if (nr_none) {
-		__lruvec_stat_mod_folio(folio, NR_FILE_PAGES, nr_none);
+		__lruvec_stat_mod_folio(new_folio, NR_FILE_PAGES, nr_none);
 		/* nr_none is always 0 for non-shmem. */
-		__lruvec_stat_mod_folio(folio, NR_SHMEM, nr_none);
+		__lruvec_stat_mod_folio(new_folio, NR_SHMEM, nr_none);
 	}
 
 	/*
-	 * Mark hpage as uptodate before inserting it into the page cache so
-	 * that it isn't mistaken for an fallocated but unwritten page.
+	 * Mark new_folio as uptodate before inserting it into the
+	 * page cache so that it isn't mistaken for an fallocated but
+	 * unwritten page.
 	 */
-	folio_mark_uptodate(folio);
-	folio_ref_add(folio, HPAGE_PMD_NR - 1);
+	folio_mark_uptodate(new_folio);
+	folio_ref_add(new_folio, HPAGE_PMD_NR - 1);
 
 	if (is_shmem)
-		folio_mark_dirty(folio);
-	folio_add_lru(folio);
+		folio_mark_dirty(new_folio);
+	folio_add_lru(new_folio);
 
 	/* Join all the small entries into a single multi-index entry. */
 	xas_set_order(&xas, start, HPAGE_PMD_ORDER);
-	xas_store(&xas, folio);
+	xas_store(&xas, new_folio);
 	WARN_ON_ONCE(xas_error(&xas));
 	xas_unlock_irq(&xas);
 
@@ -2148,7 +2149,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	retract_page_tables(mapping, start);
 	if (cc && !cc->is_khugepaged)
 		result = SCAN_PTE_MAPPED_HUGEPAGE;
-	folio_unlock(folio);
+	folio_unlock(new_folio);
 
 	/*
 	 * The collapse has succeeded, so free the old pages.
@@ -2193,13 +2194,13 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 		smp_mb();
 	}
 
-	hpage->mapping = NULL;
+	new_folio->mapping = NULL;
 
-	unlock_page(hpage);
-	put_page(hpage);
+	folio_unlock(new_folio);
+	folio_put(new_folio);
 out:
 	VM_BUG_ON(!list_empty(&pagelist));
-	trace_mm_khugepaged_collapse_file(mm, hpage, index, is_shmem, addr, file, nr, result);
+	trace_mm_khugepaged_collapse_file(mm, new_folio, index, is_shmem, addr, file, HPAGE_PMD_NR, result);
 	return result;
 }
 
-- 
2.43.0




* [PATCH 6/7] khugepaged: Use a folio throughout collapse_file()
  2024-04-03 17:18 [PATCH 0/7] khugepaged folio conversions Matthew Wilcox (Oracle)
                   ` (4 preceding siblings ...)
  2024-04-03 17:18 ` [PATCH 5/7] khugepaged: Remove hpage from collapse_file() Matthew Wilcox (Oracle)
@ 2024-04-03 17:18 ` Matthew Wilcox (Oracle)
  2024-04-05 21:20   ` Vishal Moola
  2024-04-03 17:18 ` [PATCH 7/7] khugepaged: Use a folio throughout hpage_collapse_scan_file() Matthew Wilcox (Oracle)
  6 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-04-03 17:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm

Pull folios from the page cache instead of pages.  Half of this work
had been done already, but we were still operating on pages for a large
chunk of this function.  There is no attempt in this patch to handle
large folios that are smaller than a THP; that will have to wait for a
future patch.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/khugepaged.c | 113 +++++++++++++++++++++++-------------------------
 1 file changed, 54 insertions(+), 59 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index d44584b5e004..0b0053fb30c0 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1780,9 +1780,8 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 			 struct collapse_control *cc)
 {
 	struct address_space *mapping = file->f_mapping;
-	struct page *page;
-	struct page *tmp, *dst;
-	struct folio *folio, *new_folio;
+	struct page *dst;
+	struct folio *folio, *tmp, *new_folio;
 	pgoff_t index = 0, end = start + HPAGE_PMD_NR;
 	LIST_HEAD(pagelist);
 	XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
@@ -1820,11 +1819,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 
 	for (index = start; index < end; index++) {
 		xas_set(&xas, index);
-		page = xas_load(&xas);
+		folio = xas_load(&xas);
 
 		VM_BUG_ON(index != xas.xa_index);
 		if (is_shmem) {
-			if (!page) {
+			if (!folio) {
 				/*
 				 * Stop if extent has been truncated or
 				 * hole-punched, and is now completely
@@ -1840,7 +1839,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 				continue;
 			}
 
-			if (xa_is_value(page) || !PageUptodate(page)) {
+			if (xa_is_value(folio) || !folio_test_uptodate(folio)) {
 				xas_unlock_irq(&xas);
 				/* swap in or instantiate fallocated page */
 				if (shmem_get_folio(mapping->host, index,
@@ -1850,28 +1849,27 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 				}
 				/* drain lru cache to help isolate_lru_page() */
 				lru_add_drain();
-				page = folio_file_page(folio, index);
-			} else if (trylock_page(page)) {
-				get_page(page);
+			} else if (folio_trylock(folio)) {
+				folio_get(folio);
 				xas_unlock_irq(&xas);
 			} else {
 				result = SCAN_PAGE_LOCK;
 				goto xa_locked;
 			}
 		} else {	/* !is_shmem */
-			if (!page || xa_is_value(page)) {
+			if (!folio || xa_is_value(folio)) {
 				xas_unlock_irq(&xas);
 				page_cache_sync_readahead(mapping, &file->f_ra,
 							  file, index,
 							  end - index);
 				/* drain lru cache to help isolate_lru_page() */
 				lru_add_drain();
-				page = find_lock_page(mapping, index);
-				if (unlikely(page == NULL)) {
+				folio = filemap_lock_folio(mapping, index);
+				if (unlikely(folio == NULL)) {
 					result = SCAN_FAIL;
 					goto xa_unlocked;
 				}
-			} else if (PageDirty(page)) {
+			} else if (folio_test_dirty(folio)) {
 				/*
 				 * khugepaged only works on read-only fd,
 				 * so this page is dirty because it hasn't
@@ -1889,12 +1887,12 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 				filemap_flush(mapping);
 				result = SCAN_FAIL;
 				goto xa_unlocked;
-			} else if (PageWriteback(page)) {
+			} else if (folio_test_writeback(folio)) {
 				xas_unlock_irq(&xas);
 				result = SCAN_FAIL;
 				goto xa_unlocked;
-			} else if (trylock_page(page)) {
-				get_page(page);
+			} else if (folio_trylock(folio)) {
+				folio_get(folio);
 				xas_unlock_irq(&xas);
 			} else {
 				result = SCAN_PAGE_LOCK;
@@ -1903,35 +1901,31 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 		}
 
 		/*
-		 * The page must be locked, so we can drop the i_pages lock
+		 * The folio must be locked, so we can drop the i_pages lock
 		 * without racing with truncate.
 		 */
-		VM_BUG_ON_PAGE(!PageLocked(page), page);
+		VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 
-		/* make sure the page is up to date */
-		if (unlikely(!PageUptodate(page))) {
+		/* make sure the folio is up to date */
+		if (unlikely(!folio_test_uptodate(folio))) {
 			result = SCAN_FAIL;
 			goto out_unlock;
 		}
 
 		/*
 		 * If file was truncated then extended, or hole-punched, before
-		 * we locked the first page, then a THP might be there already.
+		 * we locked the first folio, then a THP might be there already.
 		 * This will be discovered on the first iteration.
 		 */
-		if (PageTransCompound(page)) {
-			struct page *head = compound_head(page);
-
-			result = compound_order(head) == HPAGE_PMD_ORDER &&
-					head->index == start
+		if (folio_test_large(folio)) {
+			result = folio_order(folio) == HPAGE_PMD_ORDER &&
+					folio->index == start
 					/* Maybe PMD-mapped */
 					? SCAN_PTE_MAPPED_HUGEPAGE
 					: SCAN_PAGE_COMPOUND;
 			goto out_unlock;
 		}
 
-		folio = page_folio(page);
-
 		if (folio_mapping(folio) != mapping) {
 			result = SCAN_TRUNCATED;
 			goto out_unlock;
@@ -1941,7 +1935,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 				  folio_test_writeback(folio))) {
 			/*
 			 * khugepaged only works on read-only fd, so this
-			 * page is dirty because it hasn't been flushed
+			 * folio is dirty because it hasn't been flushed
 			 * since first write.
 			 */
 			result = SCAN_FAIL;
@@ -1965,33 +1959,34 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 
 		xas_lock_irq(&xas);
 
-		VM_BUG_ON_PAGE(page != xa_load(xas.xa, index), page);
+		VM_BUG_ON_FOLIO(folio != xa_load(xas.xa, index), folio);
 
 		/*
-		 * We control three references to the page:
+		 * We control three references to the folio:
 		 *  - we hold a pin on it;
 		 *  - one reference from page cache;
-		 *  - one from isolate_lru_page;
-		 * If those are the only references, then any new usage of the
-		 * page will have to fetch it from the page cache. That requires
-		 * locking the page to handle truncate, so any new usage will be
-		 * blocked until we unlock page after collapse/during rollback.
+		 *  - one from lru_isolate_folio;
+		 * If those are the only references, then any new usage
+		 * of the folio will have to fetch it from the page
+		 * cache. That requires locking the folio to handle
+		 * truncate, so any new usage will be blocked until we
+		 * unlock folio after collapse/during rollback.
 		 */
-		if (page_count(page) != 3) {
+		if (folio_ref_count(folio) != 3) {
 			result = SCAN_PAGE_COUNT;
 			xas_unlock_irq(&xas);
-			putback_lru_page(page);
+			folio_putback_lru(folio);
 			goto out_unlock;
 		}
 
 		/*
-		 * Accumulate the pages that are being collapsed.
+		 * Accumulate the folios that are being collapsed.
 		 */
-		list_add_tail(&page->lru, &pagelist);
+		list_add_tail(&folio->lru, &pagelist);
 		continue;
 out_unlock:
-		unlock_page(page);
-		put_page(page);
+		folio_unlock(folio);
+		folio_put(folio);
 		goto xa_unlocked;
 	}
 
@@ -2030,17 +2025,17 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	}
 
 	/*
-	 * The old pages are locked, so they won't change anymore.
+	 * The old folios are locked, so they won't change anymore.
 	 */
 	index = start;
 	dst = folio_page(new_folio, 0);
-	list_for_each_entry(page, &pagelist, lru) {
-		while (index < page->index) {
+	list_for_each_entry(folio, &pagelist, lru) {
+		while (index < folio->index) {
 			clear_highpage(dst);
 			index++;
 			dst++;
 		}
-		if (copy_mc_highpage(dst, page) > 0) {
+		if (copy_mc_highpage(dst, folio_page(folio, 0)) > 0) {
 			result = SCAN_COPY_MC;
 			goto rollback;
 		}
@@ -2152,15 +2147,15 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	folio_unlock(new_folio);
 
 	/*
-	 * The collapse has succeeded, so free the old pages.
+	 * The collapse has succeeded, so free the old folios.
 	 */
-	list_for_each_entry_safe(page, tmp, &pagelist, lru) {
-		list_del(&page->lru);
-		page->mapping = NULL;
-		ClearPageActive(page);
-		ClearPageUnevictable(page);
-		unlock_page(page);
-		folio_put_refs(page_folio(page), 3);
+	list_for_each_entry_safe(folio, tmp, &pagelist, lru) {
+		list_del(&folio->lru);
+		folio->mapping = NULL;
+		folio_clear_active(folio);
+		folio_clear_unevictable(folio);
+		folio_unlock(folio);
+		folio_put_refs(folio, 3);
 	}
 
 	goto out;
@@ -2174,11 +2169,11 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 		shmem_uncharge(mapping->host, nr_none);
 	}
 
-	list_for_each_entry_safe(page, tmp, &pagelist, lru) {
-		list_del(&page->lru);
-		unlock_page(page);
-		putback_lru_page(page);
-		put_page(page);
+	list_for_each_entry_safe(folio, tmp, &pagelist, lru) {
+		list_del(&folio->lru);
+		folio_unlock(folio);
+		folio_putback_lru(folio);
+		folio_put(folio);
 	}
 	/*
 	 * Undo the updates of filemap_nr_thps_inc for non-SHMEM
-- 
2.43.0




* [PATCH 7/7] khugepaged: Use a folio throughout hpage_collapse_scan_file()
  2024-04-03 17:18 [PATCH 0/7] khugepaged folio conversions Matthew Wilcox (Oracle)
                   ` (5 preceding siblings ...)
  2024-04-03 17:18 ` [PATCH 6/7] khugepaged: Use a folio throughout collapse_file() Matthew Wilcox (Oracle)
@ 2024-04-03 17:18 ` Matthew Wilcox (Oracle)
  2024-04-05 21:21   ` Vishal Moola
  2024-04-08 20:07   ` David Hildenbrand
  6 siblings, 2 replies; 20+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-04-03 17:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Matthew Wilcox (Oracle), linux-mm

Replace the use of pages with folios.  Saves a few calls to
compound_head() and removes some uses of obsolete functions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/trace/events/huge_memory.h |  6 +++---
 mm/khugepaged.c                    | 33 +++++++++++++++---------------
 2 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index dc6eeef2d3da..ab576898a126 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -174,10 +174,10 @@ TRACE_EVENT(mm_collapse_huge_page_swapin,
 
 TRACE_EVENT(mm_khugepaged_scan_file,
 
-	TP_PROTO(struct mm_struct *mm, struct page *page, struct file *file,
+	TP_PROTO(struct mm_struct *mm, struct folio *folio, struct file *file,
 		 int present, int swap, int result),
 
-	TP_ARGS(mm, page, file, present, swap, result),
+	TP_ARGS(mm, folio, file, present, swap, result),
 
 	TP_STRUCT__entry(
 		__field(struct mm_struct *, mm)
@@ -190,7 +190,7 @@ TRACE_EVENT(mm_khugepaged_scan_file,
 
 	TP_fast_assign(
 		__entry->mm = mm;
-		__entry->pfn = page ? page_to_pfn(page) : -1;
+		__entry->pfn = folio ? folio_pfn(folio) : -1;
 		__assign_str(filename, file->f_path.dentry->d_iname);
 		__entry->present = present;
 		__entry->swap = swap;
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 0b0053fb30c0..ef2871aaeb43 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2203,7 +2203,7 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
 				    struct file *file, pgoff_t start,
 				    struct collapse_control *cc)
 {
-	struct page *page = NULL;
+	struct folio *folio = NULL;
 	struct address_space *mapping = file->f_mapping;
 	XA_STATE(xas, &mapping->i_pages, start);
 	int present, swap;
@@ -2215,11 +2215,11 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
 	memset(cc->node_load, 0, sizeof(cc->node_load));
 	nodes_clear(cc->alloc_nmask);
 	rcu_read_lock();
-	xas_for_each(&xas, page, start + HPAGE_PMD_NR - 1) {
-		if (xas_retry(&xas, page))
+	xas_for_each(&xas, folio, start + HPAGE_PMD_NR - 1) {
+		if (xas_retry(&xas, folio))
 			continue;
 
-		if (xa_is_value(page)) {
+		if (xa_is_value(folio)) {
 			++swap;
 			if (cc->is_khugepaged &&
 			    swap > khugepaged_max_ptes_swap) {
@@ -2234,11 +2234,9 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
 		 * TODO: khugepaged should compact smaller compound pages
 		 * into a PMD sized page
 		 */
-		if (PageTransCompound(page)) {
-			struct page *head = compound_head(page);
-
-			result = compound_order(head) == HPAGE_PMD_ORDER &&
-					head->index == start
+		if (folio_test_large(folio)) {
+			result = folio_order(folio) == HPAGE_PMD_ORDER &&
+					folio->index == start
 					/* Maybe PMD-mapped */
 					? SCAN_PTE_MAPPED_HUGEPAGE
 					: SCAN_PAGE_COMPOUND;
@@ -2251,28 +2249,29 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
 			break;
 		}
 
-		node = page_to_nid(page);
+		node = folio_nid(folio);
 		if (hpage_collapse_scan_abort(node, cc)) {
 			result = SCAN_SCAN_ABORT;
 			break;
 		}
 		cc->node_load[node]++;
 
-		if (!PageLRU(page)) {
+		if (!folio_test_lru(folio)) {
 			result = SCAN_PAGE_LRU;
 			break;
 		}
 
-		if (page_count(page) !=
-		    1 + page_mapcount(page) + page_has_private(page)) {
+		if (folio_ref_count(folio) !=
+		    1 + folio_mapcount(folio) + folio_test_private(folio)) {
 			result = SCAN_PAGE_COUNT;
 			break;
 		}
 
 		/*
-		 * We probably should check if the page is referenced here, but
-		 * nobody would transfer pte_young() to PageReferenced() for us.
-		 * And rmap walk here is just too costly...
+		 * We probably should check if the folio is referenced
+		 * here, but nobody would transfer pte_young() to
+		 * folio_test_referenced() for us.  And rmap walk here
+		 * is just too costly...
 		 */
 
 		present++;
@@ -2294,7 +2293,7 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
 		}
 	}
 
-	trace_mm_khugepaged_scan_file(mm, page, file, present, swap, result);
+	trace_mm_khugepaged_scan_file(mm, folio, file, present, swap, result);
 	return result;
 }
 #else
-- 
2.43.0




* Re: [PATCH 1/7] khugepaged: Inline hpage_collapse_alloc_folio()
  2024-04-03 17:18 ` [PATCH 1/7] khugepaged: Inline hpage_collapse_alloc_folio() Matthew Wilcox (Oracle)
@ 2024-04-05 21:13   ` Vishal Moola
  0 siblings, 0 replies; 20+ messages in thread
From: Vishal Moola @ 2024-04-05 21:13 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andrew Morton, linux-mm

On Wed, Apr 03, 2024 at 06:18:30PM +0100, Matthew Wilcox (Oracle) wrote:
> This function has one caller, and the combined function is simpler to
> read, reason about and modify.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> 



* Re: [PATCH 2/7] khugepaged: Convert alloc_charge_hpage to alloc_charge_folio
  2024-04-03 17:18 ` [PATCH 2/7] khugepaged: Convert alloc_charge_hpage to alloc_charge_folio Matthew Wilcox (Oracle)
@ 2024-04-05 21:14   ` Vishal Moola
  2024-04-07  3:44     ` Matthew Wilcox
  0 siblings, 1 reply; 20+ messages in thread
From: Vishal Moola @ 2024-04-05 21:14 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andrew Morton, linux-mm

On Wed, Apr 03, 2024 at 06:18:31PM +0100, Matthew Wilcox (Oracle) wrote:  
> @@ -1789,7 +1789,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
>  	struct page *hpage;
>  	struct page *page;
>  	struct page *tmp;
> -	struct folio *folio;
> +	struct folio *folio, *new_folio;

Would it make more sense to introduce new_folio in patch 5 where
you convert the rest of the function to use new_folio? I think it
might make the commit history easier to read. 



* Re: [PATCH 3/7] khugepaged: Remove hpage from collapse_huge_page()
  2024-04-03 17:18 ` [PATCH 3/7] khugepaged: Remove hpage from collapse_huge_page() Matthew Wilcox (Oracle)
@ 2024-04-05 21:14   ` Vishal Moola
  0 siblings, 0 replies; 20+ messages in thread
From: Vishal Moola @ 2024-04-05 21:14 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andrew Morton, linux-mm

On Wed, Apr 03, 2024 at 06:18:32PM +0100, Matthew Wilcox (Oracle) wrote:
> Work purely in terms of the folio.  Removes a call to compound_head()
> in put_page().
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>



* Re: [PATCH 4/7] khugepaged: Pass a folio to __collapse_huge_page_copy()
  2024-04-03 17:18 ` [PATCH 4/7] khugepaged: Pass a folio to __collapse_huge_page_copy() Matthew Wilcox (Oracle)
@ 2024-04-05 21:19   ` Vishal Moola
  2024-04-07  3:46     ` Matthew Wilcox
  0 siblings, 1 reply; 20+ messages in thread
From: Vishal Moola @ 2024-04-05 21:19 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andrew Morton, linux-mm

On Wed, Apr 03, 2024 at 06:18:33PM +0100, Matthew Wilcox (Oracle) wrote:
> Simplify the body of __collapse_huge_page_copy() while I'm looking at
> it.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

This makes things so much easier to read!

Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>

>  		}
> @@ -1196,7 +1192,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
>  	 */
>  	anon_vma_unlock_write(vma->anon_vma);
>  
> -	result = __collapse_huge_page_copy(pte, &folio->page, pmd, _pmd,
> +	result = __collapse_huge_page_copy(pte, folio, pmd, _pmd,
>  					   vma, address, pte_ptl,
>  					   &compound_pagelist);

Additionally, it looks like moving this patch to after patch 2 and
before patch 3 would avoid modifying this line twice.



* Re: [PATCH 5/7] khugepaged: Remove hpage from collapse_file()
  2024-04-03 17:18 ` [PATCH 5/7] khugepaged: Remove hpage from collapse_file() Matthew Wilcox (Oracle)
@ 2024-04-05 21:19   ` Vishal Moola
  0 siblings, 0 replies; 20+ messages in thread
From: Vishal Moola @ 2024-04-05 21:19 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andrew Morton, linux-mm

On Wed, Apr 03, 2024 at 06:18:34PM +0100, Matthew Wilcox (Oracle) wrote:
> Use new_folio throughout where we had been using hpage.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>



* Re: [PATCH 6/7] khugepaged: Use a folio throughout collapse_file()
  2024-04-03 17:18 ` [PATCH 6/7] khugepaged: Use a folio throughout collapse_file() Matthew Wilcox (Oracle)
@ 2024-04-05 21:20   ` Vishal Moola
  2024-04-07  3:43     ` Matthew Wilcox
  0 siblings, 1 reply; 20+ messages in thread
From: Vishal Moola @ 2024-04-05 21:20 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andrew Morton, linux-mm

On Wed, Apr 03, 2024 at 06:18:35PM +0100, Matthew Wilcox (Oracle) wrote:
> @@ -1850,28 +1849,27 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
>  				}
>  				/* drain lru cache to help isolate_lru_page() */
>  				lru_add_drain();
> -				page = folio_file_page(folio, index);
> -			} else if (trylock_page(page)) {
> -				get_page(page);
> +			} else if (folio_trylock(folio)) {
> +				folio_get(folio);
>  				xas_unlock_irq(&xas);
>  			} else {
>  				result = SCAN_PAGE_LOCK;
>  				goto xa_locked;
>  			}
>  		} else {	/* !is_shmem */
> -			if (!page || xa_is_value(page)) {
> +			if (!folio || xa_is_value(folio)) {
>  				xas_unlock_irq(&xas);
>  				page_cache_sync_readahead(mapping, &file->f_ra,
>  							  file, index,
>  							  end - index);
>  				/* drain lru cache to help isolate_lru_page() */
>  				lru_add_drain();
> -				page = find_lock_page(mapping, index);
> -				if (unlikely(page == NULL)) {
> +				folio = filemap_lock_folio(mapping, index);
> +				if (unlikely(folio == NULL)) {

filemap_lock_folio() can return an ERR_PTR(); find_lock_page() handles
that internally and returns NULL instead.
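
For anyone following along, the difference is roughly this (paraphrased
from include/linux/pagemap.h and mm/filemap.c around this kernel
version; check the exact tree for details):

	/* Returns ERR_PTR(-ENOENT) when nothing is found at @index. */
	static inline struct folio *filemap_lock_folio(struct address_space *mapping,
						       pgoff_t index)
	{
		return __filemap_get_folio(mapping, index, FGP_LOCK, 0);
	}

	static inline struct page *find_lock_page(struct address_space *mapping,
						  pgoff_t index)
	{
		return pagecache_get_page(mapping, index, FGP_LOCK, 0);
	}

	/* The old page-based path converts the error into NULL. */
	struct page *pagecache_get_page(struct address_space *mapping,
			pgoff_t index, fgf_t fgp_flags, gfp_t gfp)
	{
		struct folio *folio;

		folio = __filemap_get_folio(mapping, index, fgp_flags, gfp);
		if (IS_ERR(folio))
			return NULL;
		return folio_file_page(folio, index);
	}

So the folio version of the code needs an IS_ERR() check where the page
version checked for NULL.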

 



* Re: [PATCH 7/7] khugepaged: Use a folio throughout hpage_collapse_scan_file()
  2024-04-03 17:18 ` [PATCH 7/7] khugepaged: Use a folio throughout hpage_collapse_scan_file() Matthew Wilcox (Oracle)
@ 2024-04-05 21:21   ` Vishal Moola
  2024-04-08 20:07   ` David Hildenbrand
  1 sibling, 0 replies; 20+ messages in thread
From: Vishal Moola @ 2024-04-05 21:21 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andrew Morton, linux-mm

On Wed, Apr 03, 2024 at 06:18:36PM +0100, Matthew Wilcox (Oracle) wrote:
> Replace the use of pages with folios.  Saves a few calls to
> compound_head() and removes some uses of obsolete functions.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>



* Re: [PATCH 6/7] khugepaged: Use a folio throughout collapse_file()
  2024-04-05 21:20   ` Vishal Moola
@ 2024-04-07  3:43     ` Matthew Wilcox
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew Wilcox @ 2024-04-07  3:43 UTC (permalink / raw)
  To: Vishal Moola; +Cc: Andrew Morton, linux-mm

On Fri, Apr 05, 2024 at 02:20:28PM -0700, Vishal Moola wrote:
> On Wed, Apr 03, 2024 at 06:18:35PM +0100, Matthew Wilcox (Oracle) wrote:
> > -				page = find_lock_page(mapping, index);
> > -				if (unlikely(page == NULL)) {
> > +				folio = filemap_lock_folio(mapping, index);
> > +				if (unlikely(folio == NULL)) {
> 
> filemap_lock_folio() can return an ERR_PTR(), find_lock_page() handles
> it internally.

Argh.  I keep forgetting that when doing conversions.  Thanks.

Andrew, if you could attach this fix ...

(the unlikely() is embedded in IS_ERR() so no need to keep it here)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6a96a85de56b..89e2624fb3ff 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1865,7 +1865,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 				/* drain lru cache to help isolate_lru_page() */
 				lru_add_drain();
 				folio = filemap_lock_folio(mapping, index);
-				if (unlikely(folio == NULL)) {
+				if (IS_ERR(folio)) {
 					result = SCAN_FAIL;
 					goto xa_unlocked;
 				}



* Re: [PATCH 2/7] khugepaged: Convert alloc_charge_hpage to alloc_charge_folio
  2024-04-05 21:14   ` Vishal Moola
@ 2024-04-07  3:44     ` Matthew Wilcox
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew Wilcox @ 2024-04-07  3:44 UTC (permalink / raw)
  To: Vishal Moola; +Cc: Andrew Morton, linux-mm

On Fri, Apr 05, 2024 at 02:14:10PM -0700, Vishal Moola wrote:
> On Wed, Apr 03, 2024 at 06:18:31PM +0100, Matthew Wilcox (Oracle) wrote:  
> > @@ -1789,7 +1789,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
> >  	struct page *hpage;
> >  	struct page *page;
> >  	struct page *tmp;
> > -	struct folio *folio;
> > +	struct folio *folio, *new_folio;
> 
> Would it make more sense to introduce new_folio in patch 5 where
> you convert the rest of the function to use new_folio? I think it
> might make the commit history easier to read. 

I went back and forth on that a few times.  I ended up deciding that
it didn't really help.



* Re: [PATCH 4/7] khugepaged: Pass a folio to __collapse_huge_page_copy()
  2024-04-05 21:19   ` Vishal Moola
@ 2024-04-07  3:46     ` Matthew Wilcox
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew Wilcox @ 2024-04-07  3:46 UTC (permalink / raw)
  To: Vishal Moola; +Cc: Andrew Morton, linux-mm

On Fri, Apr 05, 2024 at 02:19:19PM -0700, Vishal Moola wrote:
> Additionally, it looks like if you put this after patch 2 and before 3,
> you could avoid modifying this line twice.

Ah, yeah, good point.  If Andrew hadn't already taken the patches, I'd
make that change.



* Re: [PATCH 7/7] khugepaged: Use a folio throughout hpage_collapse_scan_file()
  2024-04-03 17:18 ` [PATCH 7/7] khugepaged: Use a folio throughout hpage_collapse_scan_file() Matthew Wilcox (Oracle)
  2024-04-05 21:21   ` Vishal Moola
@ 2024-04-08 20:07   ` David Hildenbrand
  2024-04-08 20:28     ` Matthew Wilcox
  1 sibling, 1 reply; 20+ messages in thread
From: David Hildenbrand @ 2024-04-08 20:07 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), Andrew Morton; +Cc: linux-mm

On 03.04.24 19:18, Matthew Wilcox (Oracle) wrote:
> Replace the use of pages with folios.  Saves a few calls to
> compound_head() and removes some uses of obsolete functions.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---

[...]

>   
> -		if (page_count(page) !=
> -		    1 + page_mapcount(page) + page_has_private(page)) {
> +		if (folio_ref_count(folio) !=
> +		    1 + folio_mapcount(folio) + folio_test_private(folio)) {
>   			result = SCAN_PAGE_COUNT;

Just stumbled over that: page_has_private() would have checked 
PG_private and PG_private_2. The corresponding replacement would have 
been folio_has_private(). folio_test_private() only checks PG_private.
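
For reference, the helpers differ roughly like this (paraphrased from
include/linux/page-flags.h; check the exact tree for details):

	#define PAGE_FLAGS_PRIVATE				\
		(1UL << PG_private | 1UL << PG_private_2)

	/* True if either PG_private or PG_private_2 is set. */
	static inline int page_has_private(struct page *page)
	{
		return !!(page->flags & PAGE_FLAGS_PRIVATE);
	}

folio_test_private()/folio_has_private() follow the same split: the
former tests PG_private alone, the latter tests both flags.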

Are we sure that we no longer have to check PG_private_2 here? pagecache 
... so I have no clue :)

-- 
Cheers,

David / dhildenb




* Re: [PATCH 7/7] khugepaged: Use a folio throughout hpage_collapse_scan_file()
  2024-04-08 20:07   ` David Hildenbrand
@ 2024-04-08 20:28     ` Matthew Wilcox
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew Wilcox @ 2024-04-08 20:28 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Andrew Morton, linux-mm, Vishal Moola (Oracle)

On Mon, Apr 08, 2024 at 10:07:01PM +0200, David Hildenbrand wrote:
> On 03.04.24 19:18, Matthew Wilcox (Oracle) wrote:
> > Replace the use of pages with folios.  Saves a few calls to
> > compound_head() and removes some uses of obsolete functions.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> 
> [...]
> 
> > -		if (page_count(page) !=
> > -		    1 + page_mapcount(page) + page_has_private(page)) {
> > +		if (folio_ref_count(folio) !=
> > +		    1 + folio_mapcount(folio) + folio_test_private(folio)) {
> >   			result = SCAN_PAGE_COUNT;
> 
> Just stumbled over that: page_has_private() would have checked PG_private
> and PG_private_2. The corresponding replacement would have been
> folio_has_private(). folio_test_private() only checks PG_private.
> 
> Are we sure that we no longer have to check PG_private_2 here? pagecache ...
> so I have no clue :)

Oh man.  Vishal just asked me about this in our meeting and I struggled
to explain the whole horrid history.  The short answer is that this
almost certainly fixes a bug.

We have a rule (which most filesystem developers don't know about) that
attaching private data to the folio should increase its refcount by one.
This is handled by folio_attach_private().  But there's no corresponding
rule that setting PG_private_2 should also increment the refcount.
So checking PG_private_2 is wrong here because a folio with PG_private_2
set and PG_private clear will not have its refcount increased.
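
Concretely, the rule is what folio_attach_private() enforces (roughly as
in include/linux/pagemap.h; shown here only to illustrate the point):

	static inline void folio_attach_private(struct folio *folio, void *data)
	{
		/* PG_private comes with an extra reference ... */
		folio_get(folio);
		folio->private = data;
		folio_set_private(folio);
	}

	/*
	 * ... but nothing takes an equivalent reference when a filesystem
	 * sets PG_private_2, so a folio with only PG_private_2 set has no
	 * extra refcount for the check above to account for.
	 */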

There are longer versions of this answer, and if you keep asking, I'll
give them ;-P


