linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Dev Jain <dev.jain@arm.com>
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
	kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com,
	anshuman.khandual@arm.com, catalin.marinas@arm.com,
	cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
	apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org,
	baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu,
	haowenchao22@gmail.com, hughd@google.com,
	aneesh.kumar@kernel.org, yang@os.amperecomputing.com,
	peterx@redhat.com, ioworker0@gmail.com,
	wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
	surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
	zhengqi.arch@bytedance.com, jhubbard@nvidia.com,
	21cnbao@gmail.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Dev Jain <dev.jain@arm.com>
Subject: [PATCH v2 13/17] khugepaged: Lock all VMAs mapping the PTE table
Date: Tue, 11 Feb 2025 16:43:22 +0530	[thread overview]
Message-ID: <20250211111326.14295-14-dev.jain@arm.com> (raw)
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>

After enabling khugepaged to handle VMAs of any size, it may happen that
the process faults on a VMA other than the VMA under collapse, and both
these VMAs span the same PTE table. As a result, the fault handler will
install a new PTE table after khugepaged isolates the PTE table. Therefore,
scan the PTE table, retrieve all VMAs, and write lock them. Note that,
rmap can still reach the PTE table from folios not under collapse; this is
fine since it does not interfere with the PTEs under collapse, nor the folios
under collapse, nor can rmap fill the PMD.

Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/khugepaged.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 048f990d8507..e1c2c5b89f6d 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1139,6 +1139,23 @@ static int alloc_charge_folio(struct folio **foliop, struct mm_struct *mm,
 	return SCAN_SUCCEED;
 }
 
+static void take_vma_locks_per_pte(struct mm_struct *mm, unsigned long haddress)
+{
+	struct vm_area_struct *vma;
+	unsigned long start = haddress;
+	unsigned long end = haddress + HPAGE_PMD_SIZE;
+
+	while (start < end) {
+		vma = vma_lookup(mm, start);
+		if (!vma) {
+			start += PAGE_SIZE;
+			continue;
+		}
+		vma_start_write(vma);
+		start = vma->vm_end;
+	}
+}
+
 static int vma_collapse_anon_folio_pmd(struct mm_struct *mm, unsigned long address,
 		struct vm_area_struct *vma, struct collapse_control *cc, pmd_t *pmd,
 		struct folio *folio)
@@ -1270,7 +1287,9 @@ static int vma_collapse_anon_folio(struct mm_struct *mm, unsigned long address,
 	if (result != SCAN_SUCCEED)
 		goto out;
 
-	vma_start_write(vma);
+	/* Faulting may fill the PMD after flush; lock all VMAs mapping this PTE */
+	take_vma_locks_per_pte(mm, haddress);
+
 	anon_vma_lock_write(vma->anon_vma);
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, haddress,
-- 
2.30.2



  parent reply	other threads:[~2025-02-11 11:16 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-11 11:13 [PATCH v2 00/17] khugepaged: Asynchronous mTHP collapse Dev Jain
2025-02-11 11:13 ` [PATCH v2 01/17] khugepaged: Generalize alloc_charge_folio() Dev Jain
2025-02-11 11:13 ` [PATCH v2 02/17] khugepaged: Generalize hugepage_vma_revalidate() Dev Jain
2025-02-11 11:13 ` [PATCH v2 03/17] khugepaged: Generalize __collapse_huge_page_swapin() Dev Jain
2025-02-11 11:13 ` [PATCH v2 04/17] khugepaged: Generalize __collapse_huge_page_isolate() Dev Jain
2025-02-11 11:13 ` [PATCH v2 05/17] khugepaged: Generalize __collapse_huge_page_copy() Dev Jain
2025-02-11 11:13 ` [PATCH v2 06/17] khugepaged: Abstract PMD-THP collapse Dev Jain
2025-02-11 11:13 ` [PATCH v2 07/17] khugepaged: Scan PTEs order-wise Dev Jain
2025-02-11 11:13 ` [PATCH v2 08/17] khugepaged: Introduce vma_collapse_anon_folio() Dev Jain
2025-02-11 11:13 ` [PATCH v2 09/17] khugepaged: Define collapse policy if a larger folio is already mapped Dev Jain
2025-02-11 11:13 ` [PATCH v2 10/17] khugepaged: Exit early on fully-mapped aligned mTHP Dev Jain
2025-02-11 11:13 ` [PATCH v2 11/17] khugepaged: Enable sysfs to control order of collapse Dev Jain
2025-02-11 11:13 ` [PATCH v2 12/17] khugepaged: Enable variable-sized VMA collapse Dev Jain
2025-02-11 11:13 ` Dev Jain [this message]
2025-02-11 11:13 ` [PATCH v2 14/17] khugepaged: Reset scan address to correct alignment Dev Jain
2025-02-11 11:13 ` [PATCH v2 15/17] khugepaged: Delay cond_resched() Dev Jain
2025-02-11 11:13 ` [PATCH v2 16/17] khugepaged: Implement strict policy for mTHP collapse Dev Jain
2025-02-11 11:13 ` [PATCH v2 17/17] Documentation: transhuge: Define khugepaged mTHP collapse policy Dev Jain
2025-02-11 23:23 ` [PATCH v2 00/17] khugepaged: Asynchronous mTHP collapse Andrew Morton
2025-02-12  4:18   ` Dev Jain
2025-02-15  1:47 ` Nico Pache
2025-02-15  7:36   ` Dev Jain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250211111326.14295-14-dev.jain@arm.com \
    --to=dev.jain@arm.com \
    --cc=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@kernel.org \
    --cc=anshuman.khandual@arm.com \
    --cc=apopple@nvidia.com \
    --cc=baohua@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=cl@gentwo.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=haowenchao22@gmail.com \
    --cc=hughd@google.com \
    --cc=ioworker0@gmail.com \
    --cc=jack@suse.cz \
    --cc=jglisse@google.com \
    --cc=jhubbard@nvidia.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=npache@redhat.com \
    --cc=peterx@redhat.com \
    --cc=ryan.roberts@arm.com \
    --cc=srivatsa@csail.mit.edu \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=vishal.moola@gmail.com \
    --cc=wangkefeng.wang@huawei.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=yang@os.amperecomputing.com \
    --cc=zhengqi.arch@bytedance.com \
    --cc=ziy@nvidia.com \
    --cc=zokeefe@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox