linux-mm.kvack.org archive mirror
* [PATCH v2 0/3] Use killable vma write locking in most places
@ 2026-02-17 16:32 Suren Baghdasaryan
  2026-02-17 16:32 ` [PATCH v2 1/3] mm/vma: cleanup error handling path in vma_expand() Suren Baghdasaryan
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Suren Baghdasaryan @ 2026-02-17 16:32 UTC (permalink / raw)
  To: akpm
  Cc: willy, david, ziy, matthew.brost, joshua.hahnjy, rakie.kim,
	byungchul, gourry, ying.huang, apopple, lorenzo.stoakes,
	baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
	baohua, lance.yang, vbabka, jannh, rppt, mhocko, pfalcato, kees,
	maddy, npiggin, mpe, chleroy, borntraeger, frankja, imbrenda,
	hca, gor, agordeev, svens, gerald.schaefer, linux-mm,
	linuxppc-dev, kvm, linux-kernel, linux-s390, surenb

Now that we have vma_start_write_killable() we can replace most of the
vma_start_write() calls with it, improving reaction time to the kill
signal.
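
The conversion itself is mechanical. A minimal sketch of the pattern
(example_modify_vma() is just a made-up caller for illustration;
vma_start_write_killable() is assumed to return 0 on success and
-EINTR when a fatal signal is pending, which is how the callers in
this series treat it):

	static int example_modify_vma(struct vm_area_struct *vma)
	{
		/* Previously: vma_start_write(vma); waited unconditionally. */
		if (vma_start_write_killable(vma))
			return -EINTR;	/* fatal signal pending, back out */

		/* ... modify the VMA under the per-VMA write lock ... */
		return 0;
	}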

There are several places which are left untouched by this patchset:

1. free_pgtables(), because the function should free page tables even if
a fatal signal is pending.

2. userfaultfd code, where some paths calling vma_start_write() can
handle EINTR and some can't without deeper code refactoring.

3. vm_flags_{set|mod|clear}, which require refactoring that moves
vma_start_write() out of these functions and replaces it with
vma_assert_write_locked(); callers of these functions should then
lock the vma themselves using vma_start_write_killable() whenever
possible.

A cleanup patch is added at the beginning to make later changes more
readable. The second patch contains most of the changes and the last
patch contains the changes associated with process_vma_walk_lock()
error handling.

Changes since v1 [1]:
- Moved vma_start_write_killable() inside set_mempolicy_home_node()
to be done before mpol_dup(new), per Jann Horn
- Added error propagation for the missing PGWALK_WRLOCK users and
split it into a separate patch, per Jann Horn
- Moved vma_start_write_killable() inside __split_vma() to be done
before new->vm_ops->open(), per Jann Horn
- Added a separate patch to change flow control in vma_expand(),
per Jann Horn
- Brought back signal_pending() in mm_take_all_locks, per Jann Horn
- Moved vma_start_write_killable() inside __mmap_new_vma() to be done
before __mmap_new_file_vma(), per Jann Horn
- Added Reviewed-by for powerpc, per Ritesh Harjani
- Added s390 reviewers and the list due to changes in the last patch

[1] https://lore.kernel.org/all/20260209220849.2126486-1-surenb@google.com/

Suren Baghdasaryan (3):
  mm/vma: cleanup error handling path in vma_expand()
  mm: replace vma_start_write() with vma_start_write_killable()
  mm: use vma_start_write_killable() in process_vma_walk_lock()

 arch/powerpc/kvm/book3s_hv_uvmem.c |   5 +-
 arch/s390/kvm/kvm-s390.c           |   5 +-
 arch/s390/mm/gmap.c                |  13 +++-
 fs/proc/task_mmu.c                 |   7 +-
 include/linux/mempolicy.h          |   5 +-
 mm/khugepaged.c                    |   5 +-
 mm/madvise.c                       |   4 +-
 mm/memory.c                        |   2 +
 mm/mempolicy.c                     |  23 +++++--
 mm/mlock.c                         |  20 ++++--
 mm/mprotect.c                      |   4 +-
 mm/mremap.c                        |   4 +-
 mm/pagewalk.c                      |  20 ++++--
 mm/vma.c                           | 105 ++++++++++++++++++++---------
 mm/vma_exec.c                      |   6 +-
 15 files changed, 164 insertions(+), 64 deletions(-)


base-commit: b08472d036a36893ecf68296d87beb58d21f4357
-- 
2.53.0.273.g2a3d683680-goog




* [PATCH v2 1/3] mm/vma: cleanup error handling path in vma_expand()
  2026-02-17 16:32 [PATCH v2 0/3] Use killable vma write locking in most places Suren Baghdasaryan
@ 2026-02-17 16:32 ` Suren Baghdasaryan
  2026-02-17 18:26   ` Liam R. Howlett
  2026-02-17 16:32 ` [PATCH v2 2/3] mm: replace vma_start_write() with vma_start_write_killable() Suren Baghdasaryan
  2026-02-17 16:32 ` [PATCH v2 3/3] mm: use vma_start_write_killable() in process_vma_walk_lock() Suren Baghdasaryan
  2 siblings, 1 reply; 14+ messages in thread
From: Suren Baghdasaryan @ 2026-02-17 16:32 UTC (permalink / raw)
  To: akpm
  Cc: willy, david, ziy, matthew.brost, joshua.hahnjy, rakie.kim,
	byungchul, gourry, ying.huang, apopple, lorenzo.stoakes,
	baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
	baohua, lance.yang, vbabka, jannh, rppt, mhocko, pfalcato, kees,
	maddy, npiggin, mpe, chleroy, borntraeger, frankja, imbrenda,
	hca, gor, agordeev, svens, gerald.schaefer, linux-mm,
	linuxppc-dev, kvm, linux-kernel, linux-s390, surenb

vma_expand() error handling is a bit confusing with "if (ret) return ret;"
mixed with "if (!ret && ...) ret = ...;". Simplify the code to check
for errors and return immediately after an operation that might fail.
This also makes later changes to this function more readable.

No functional change intended.

Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 mm/vma.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/mm/vma.c b/mm/vma.c
index be64f781a3aa..bb4d0326fecb 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -1186,12 +1186,16 @@ int vma_expand(struct vma_merge_struct *vmg)
 	 * Note that, by convention, callers ignore OOM for this case, so
 	 * we don't need to account for vmg->give_up_on_mm here.
 	 */
-	if (remove_next)
+	if (remove_next) {
 		ret = dup_anon_vma(target, next, &anon_dup);
-	if (!ret && vmg->copied_from)
+		if (ret)
+			return ret;
+	}
+	if (vmg->copied_from) {
 		ret = dup_anon_vma(target, vmg->copied_from, &anon_dup);
-	if (ret)
-		return ret;
+		if (ret)
+			return ret;
+	}
 
 	if (remove_next) {
 		vma_start_write(next);
-- 
2.53.0.273.g2a3d683680-goog




* [PATCH v2 2/3] mm: replace vma_start_write() with vma_start_write_killable()
  2026-02-17 16:32 [PATCH v2 0/3] Use killable vma write locking in most places Suren Baghdasaryan
  2026-02-17 16:32 ` [PATCH v2 1/3] mm/vma: cleanup error handling path in vma_expand() Suren Baghdasaryan
@ 2026-02-17 16:32 ` Suren Baghdasaryan
  2026-02-17 19:19   ` Liam R. Howlett
  2026-02-17 16:32 ` [PATCH v2 3/3] mm: use vma_start_write_killable() in process_vma_walk_lock() Suren Baghdasaryan
  2 siblings, 1 reply; 14+ messages in thread
From: Suren Baghdasaryan @ 2026-02-17 16:32 UTC (permalink / raw)
  To: akpm
  Cc: willy, david, ziy, matthew.brost, joshua.hahnjy, rakie.kim,
	byungchul, gourry, ying.huang, apopple, lorenzo.stoakes,
	baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
	baohua, lance.yang, vbabka, jannh, rppt, mhocko, pfalcato, kees,
	maddy, npiggin, mpe, chleroy, borntraeger, frankja, imbrenda,
	hca, gor, agordeev, svens, gerald.schaefer, linux-mm,
	linuxppc-dev, kvm, linux-kernel, linux-s390, surenb,
	Ritesh Harjani (IBM)

Now that we have vma_start_write_killable() we can replace most of the
vma_start_write() calls with it, improving reaction time to the kill
signal.

There are several places which are left untouched by this patch:

1. free_pgtables(), because the function should free page tables even if
a fatal signal is pending.

2. process_vma_walk_lock(), which requires changes in its callers and
will be handled in the next patch.

3. userfaultfd code, where some paths calling vma_start_write() can
handle EINTR and some can't without deeper code refactoring.

4. vm_flags_{set|mod|clear}, which require refactoring that moves
vma_start_write() out of these functions and replaces it with
vma_assert_write_locked(); callers of these functions should then
lock the vma themselves using vma_start_write_killable() whenever
possible.
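
For illustration, the direction of the refactoring mentioned in item 4
might look roughly like this (a sketch only, based on the current
vm_flags_set() helper; not part of this patch):

	/* Today the helper takes the per-VMA write lock itself: */
	static inline void vm_flags_set(struct vm_area_struct *vma,
					vm_flags_t flags)
	{
		vma_start_write(vma);
		ACCESS_PRIVATE(vma, __vm_flags) |= flags;
	}

	/* After the refactoring it would only assert the lock: */
	static inline void vm_flags_set(struct vm_area_struct *vma,
					vm_flags_t flags)
	{
		vma_assert_write_locked(vma);
		ACCESS_PRIVATE(vma, __vm_flags) |= flags;
	}

	/* ...and callers would lock killably where they can: */
	if (vma_start_write_killable(vma))
		return -EINTR;
	vm_flags_set(vma, new_flags);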

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> # powerpc
---
 arch/powerpc/kvm/book3s_hv_uvmem.c |  5 +-
 include/linux/mempolicy.h          |  5 +-
 mm/khugepaged.c                    |  5 +-
 mm/madvise.c                       |  4 +-
 mm/memory.c                        |  2 +
 mm/mempolicy.c                     | 23 ++++++--
 mm/mlock.c                         | 20 +++++--
 mm/mprotect.c                      |  4 +-
 mm/mremap.c                        |  4 +-
 mm/vma.c                           | 93 +++++++++++++++++++++---------
 mm/vma_exec.c                      |  6 +-
 11 files changed, 123 insertions(+), 48 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 7cf9310de0ec..69750edcf8d5 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -410,7 +410,10 @@ static int kvmppc_memslot_page_merge(struct kvm *kvm,
 			ret = H_STATE;
 			break;
 		}
-		vma_start_write(vma);
+		if (vma_start_write_killable(vma)) {
+			ret = H_STATE;
+			break;
+		}
 		/* Copy vm_flags to avoid partial modifications in ksm_madvise */
 		vm_flags = vma->vm_flags;
 		ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 0fe96f3ab3ef..762930edde5a 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -137,7 +137,7 @@ bool vma_policy_mof(struct vm_area_struct *vma);
 extern void numa_default_policy(void);
 extern void numa_policy_init(void);
 extern void mpol_rebind_task(struct task_struct *tsk, const nodemask_t *new);
-extern void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new);
+extern int mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new);
 
 extern int huge_node(struct vm_area_struct *vma,
 				unsigned long addr, gfp_t gfp_flags,
@@ -251,8 +251,9 @@ static inline void mpol_rebind_task(struct task_struct *tsk,
 {
 }
 
-static inline void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
+static inline int mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
 {
+	return 0;
 }
 
 static inline int huge_node(struct vm_area_struct *vma,
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index fa1e57fd2c46..392dde66fa86 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1150,7 +1150,10 @@ static enum scan_result collapse_huge_page(struct mm_struct *mm, unsigned long a
 	if (result != SCAN_SUCCEED)
 		goto out_up_write;
 	/* check if the pmd is still valid */
-	vma_start_write(vma);
+	if (vma_start_write_killable(vma)) {
+		result = SCAN_FAIL;
+		goto out_up_write;
+	}
 	result = check_pmd_still_valid(mm, address, pmd);
 	if (result != SCAN_SUCCEED)
 		goto out_up_write;
diff --git a/mm/madvise.c b/mm/madvise.c
index 8debb2d434aa..b41e64231c31 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -173,7 +173,9 @@ static int madvise_update_vma(vm_flags_t new_flags,
 	madv_behavior->vma = vma;
 
 	/* vm_flags is protected by the mmap_lock held in write mode. */
-	vma_start_write(vma);
+	if (vma_start_write_killable(vma))
+		return -EINTR;
+
 	vm_flags_reset(vma, new_flags);
 	if (set_new_anon_name)
 		return replace_anon_vma_name(vma, anon_name);
diff --git a/mm/memory.c b/mm/memory.c
index dc0e5da70cdc..29e12f063c7b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -379,6 +379,8 @@ void free_pgd_range(struct mmu_gather *tlb,
  * page tables that should be removed.  This can differ from the vma mappings on
  * some archs that may have mappings that need to be removed outside the vmas.
  * Note that the prev->vm_end and next->vm_start are often used.
+ * We don't use vma_start_write_killable() because page tables should be freed
+ * even if the task is being killed.
  *
  * The vma_end differs from the pg_end when a dup_mmap() failed and the tree has
  * unrelated data to the mm_struct being torn down.
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index dbd48502ac24..5f6302d227f5 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -556,17 +556,25 @@ void mpol_rebind_task(struct task_struct *tsk, const nodemask_t *new)
  *
  * Call holding a reference to mm.  Takes mm->mmap_lock during call.
  */
-void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
+int mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
 {
 	struct vm_area_struct *vma;
 	VMA_ITERATOR(vmi, mm, 0);
+	int ret = 0;
+
+	if (mmap_write_lock_killable(mm))
+		return -EINTR;
 
-	mmap_write_lock(mm);
 	for_each_vma(vmi, vma) {
-		vma_start_write(vma);
+		if (vma_start_write_killable(vma)) {
+			ret = -EINTR;
+			break;
+		}
 		mpol_rebind_policy(vma->vm_policy, new);
 	}
 	mmap_write_unlock(mm);
+
+	return ret;
 }
 
 static const struct mempolicy_operations mpol_ops[MPOL_MAX] = {
@@ -1785,9 +1793,15 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
 		return -EINVAL;
 	if (end == start)
 		return 0;
-	mmap_write_lock(mm);
+	if (mmap_write_lock_killable(mm))
+		return -EINTR;
 	prev = vma_prev(&vmi);
 	for_each_vma_range(vmi, vma, end) {
+		if (vma_start_write_killable(vma)) {
+			err = -EINTR;
+			break;
+		}
+
 		/*
 		 * If any vma in the range got policy other than MPOL_BIND
 		 * or MPOL_PREFERRED_MANY we return error. We don't reset
@@ -1808,7 +1822,6 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
 			break;
 		}
 
-		vma_start_write(vma);
 		new->home_node = home_node;
 		err = mbind_range(&vmi, vma, &prev, start, end, new);
 		mpol_put(new);
diff --git a/mm/mlock.c b/mm/mlock.c
index 2f699c3497a5..2885b858aa0f 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -420,7 +420,7 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
  * Called for mlock(), mlock2() and mlockall(), to set @vma VM_LOCKED;
  * called for munlock() and munlockall(), to clear VM_LOCKED from @vma.
  */
-static void mlock_vma_pages_range(struct vm_area_struct *vma,
+static int mlock_vma_pages_range(struct vm_area_struct *vma,
 	unsigned long start, unsigned long end, vm_flags_t newflags)
 {
 	static const struct mm_walk_ops mlock_walk_ops = {
@@ -441,7 +441,9 @@ static void mlock_vma_pages_range(struct vm_area_struct *vma,
 	 */
 	if (newflags & VM_LOCKED)
 		newflags |= VM_IO;
-	vma_start_write(vma);
+	if (vma_start_write_killable(vma))
+		return -EINTR;
+
 	vm_flags_reset_once(vma, newflags);
 
 	lru_add_drain();
@@ -452,6 +454,7 @@ static void mlock_vma_pages_range(struct vm_area_struct *vma,
 		newflags &= ~VM_IO;
 		vm_flags_reset_once(vma, newflags);
 	}
+	return 0;
 }
 
 /*
@@ -501,10 +504,12 @@ static int mlock_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	 */
 	if ((newflags & VM_LOCKED) && (oldflags & VM_LOCKED)) {
 		/* No work to do, and mlocking twice would be wrong */
-		vma_start_write(vma);
+		ret = vma_start_write_killable(vma);
+		if (ret)
+			goto out;
 		vm_flags_reset(vma, newflags);
 	} else {
-		mlock_vma_pages_range(vma, start, end, newflags);
+		ret = mlock_vma_pages_range(vma, start, end, newflags);
 	}
 out:
 	*prev = vma;
@@ -733,9 +738,12 @@ static int apply_mlockall_flags(int flags)
 
 		error = mlock_fixup(&vmi, vma, &prev, vma->vm_start, vma->vm_end,
 				    newflags);
-		/* Ignore errors, but prev needs fixing up. */
-		if (error)
+		/* Ignore errors except EINTR, but prev needs fixing up. */
+		if (error) {
+			if (error == -EINTR)
+				break;
 			prev = vma;
+		}
 		cond_resched();
 	}
 out:
diff --git a/mm/mprotect.c b/mm/mprotect.c
index c0571445bef7..49dbb7156936 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -765,7 +765,9 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
 	 * vm_flags and vm_page_prot are protected by the mmap_lock
 	 * held in write mode.
 	 */
-	vma_start_write(vma);
+	error = vma_start_write_killable(vma);
+	if (error < 0)
+		goto fail;
 	vm_flags_reset_once(vma, newflags);
 	if (vma_wants_manual_pte_write_upgrade(vma))
 		mm_cp_flags |= MM_CP_TRY_CHANGE_WRITABLE;
diff --git a/mm/mremap.c b/mm/mremap.c
index 2be876a70cc0..aef1e5f373c7 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -1286,7 +1286,9 @@ static unsigned long move_vma(struct vma_remap_struct *vrm)
 		return -ENOMEM;
 
 	/* We don't want racing faults. */
-	vma_start_write(vrm->vma);
+	err = vma_start_write_killable(vrm->vma);
+	if (err)
+		return err;
 
 	/* Perform copy step. */
 	err = copy_vma_and_data(vrm, &new_vma);
diff --git a/mm/vma.c b/mm/vma.c
index bb4d0326fecb..1d21351282cf 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -530,6 +530,13 @@ __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	if (err)
 		goto out_free_vmi;
 
+	err = vma_start_write_killable(vma);
+	if (err)
+		goto out_free_mpol;
+	err = vma_start_write_killable(new);
+	if (err)
+		goto out_free_mpol;
+
 	err = anon_vma_clone(new, vma, VMA_OP_SPLIT);
 	if (err)
 		goto out_free_mpol;
@@ -540,9 +547,6 @@ __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	if (new->vm_ops && new->vm_ops->open)
 		new->vm_ops->open(new);
 
-	vma_start_write(vma);
-	vma_start_write(new);
-
 	init_vma_prep(&vp, vma);
 	vp.insert = new;
 	vma_prepare(&vp);
@@ -895,16 +899,22 @@ static __must_check struct vm_area_struct *vma_merge_existing_range(
 	}
 
 	/* No matter what happens, we will be adjusting middle. */
-	vma_start_write(middle);
+	err = vma_start_write_killable(middle);
+	if (err)
+		goto abort;
 
 	if (merge_right) {
-		vma_start_write(next);
+		err = vma_start_write_killable(next);
+		if (err)
+			goto abort;
 		vmg->target = next;
 		sticky_flags |= (next->vm_flags & VM_STICKY);
 	}
 
 	if (merge_left) {
-		vma_start_write(prev);
+		err = vma_start_write_killable(prev);
+		if (err)
+			goto abort;
 		vmg->target = prev;
 		sticky_flags |= (prev->vm_flags & VM_STICKY);
 	}
@@ -1155,10 +1165,12 @@ int vma_expand(struct vma_merge_struct *vmg)
 	struct vm_area_struct *next = vmg->next;
 	bool remove_next = false;
 	vm_flags_t sticky_flags;
-	int ret = 0;
+	int ret;
 
 	mmap_assert_write_locked(vmg->mm);
-	vma_start_write(target);
+	ret = vma_start_write_killable(target);
+	if (ret)
+		return ret;
 
 	if (next && target != next && vmg->end == next->vm_end)
 		remove_next = true;
@@ -1187,6 +1199,9 @@ int vma_expand(struct vma_merge_struct *vmg)
 	 * we don't need to account for vmg->give_up_on_mm here.
 	 */
 	if (remove_next) {
+		ret = vma_start_write_killable(next);
+		if (ret)
+			return ret;
 		ret = dup_anon_vma(target, next, &anon_dup);
 		if (ret)
 			return ret;
@@ -1197,10 +1212,8 @@ int vma_expand(struct vma_merge_struct *vmg)
 			return ret;
 	}
 
-	if (remove_next) {
-		vma_start_write(next);
+	if (remove_next)
 		vmg->__remove_next = true;
-	}
 	if (commit_merge(vmg))
 		goto nomem;
 
@@ -1233,6 +1246,7 @@ int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	       unsigned long start, unsigned long end, pgoff_t pgoff)
 {
 	struct vma_prepare vp;
+	int err;
 
 	WARN_ON((vma->vm_start != start) && (vma->vm_end != end));
 
@@ -1244,7 +1258,11 @@ int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	if (vma_iter_prealloc(vmi, NULL))
 		return -ENOMEM;
 
-	vma_start_write(vma);
+	err = vma_start_write_killable(vma);
+	if (err) {
+		vma_iter_free(vmi);
+		return err;
+	}
 
 	init_vma_prep(&vp, vma);
 	vma_prepare(&vp);
@@ -1434,7 +1452,9 @@ static int vms_gather_munmap_vmas(struct vma_munmap_struct *vms,
 			if (error)
 				goto end_split_failed;
 		}
-		vma_start_write(next);
+		error = vma_start_write_killable(next);
+		if (error)
+			goto munmap_gather_failed;
 		mas_set(mas_detach, vms->vma_count++);
 		error = mas_store_gfp(mas_detach, next, GFP_KERNEL);
 		if (error)
@@ -1828,12 +1848,17 @@ static void vma_link_file(struct vm_area_struct *vma, bool hold_rmap_lock)
 static int vma_link(struct mm_struct *mm, struct vm_area_struct *vma)
 {
 	VMA_ITERATOR(vmi, mm, 0);
+	int err;
 
 	vma_iter_config(&vmi, vma->vm_start, vma->vm_end);
 	if (vma_iter_prealloc(&vmi, vma))
 		return -ENOMEM;
 
-	vma_start_write(vma);
+	err = vma_start_write_killable(vma);
+	if (err) {
+		vma_iter_free(&vmi);
+		return err;
+	}
 	vma_iter_store_new(&vmi, vma);
 	vma_link_file(vma, /* hold_rmap_lock= */false);
 	mm->map_count++;
@@ -2215,9 +2240,8 @@ int mm_take_all_locks(struct mm_struct *mm)
 	 * is reached.
 	 */
 	for_each_vma(vmi, vma) {
-		if (signal_pending(current))
+		if (signal_pending(current) || vma_start_write_killable(vma))
 			goto out_unlock;
-		vma_start_write(vma);
 	}
 
 	vma_iter_init(&vmi, mm, 0);
@@ -2532,6 +2556,11 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
 		goto free_vma;
 	}
 
+	/* Lock the VMA since it is modified after insertion into VMA tree */
+	error = vma_start_write_killable(vma);
+	if (error)
+		goto free_iter_vma;
+
 	if (map->file)
 		error = __mmap_new_file_vma(map, vma);
 	else if (map->vm_flags & VM_SHARED)
@@ -2552,8 +2581,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
 	WARN_ON_ONCE(!arch_validate_flags(map->vm_flags));
 #endif
 
-	/* Lock the VMA since it is modified after insertion into VMA tree */
-	vma_start_write(vma);
 	vma_iter_store_new(vmi, vma);
 	map->mm->map_count++;
 	vma_link_file(vma, map->hold_file_rmap_lock);
@@ -2864,6 +2891,7 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma,
 		 unsigned long addr, unsigned long len, vm_flags_t vm_flags)
 {
 	struct mm_struct *mm = current->mm;
+	int err = -ENOMEM;
 
 	/*
 	 * Check against address space limits by the changed size
@@ -2908,7 +2936,10 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	vma_set_range(vma, addr, addr + len, addr >> PAGE_SHIFT);
 	vm_flags_init(vma, vm_flags);
 	vma->vm_page_prot = vm_get_page_prot(vm_flags);
-	vma_start_write(vma);
+	if (vma_start_write_killable(vma)) {
+		err = -EINTR;
+		goto mas_store_fail;
+	}
 	if (vma_iter_store_gfp(vmi, vma, GFP_KERNEL))
 		goto mas_store_fail;
 
@@ -2928,7 +2959,7 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	vm_area_free(vma);
 unacct_fail:
 	vm_unacct_memory(len >> PAGE_SHIFT);
-	return -ENOMEM;
+	return err;
 }
 
 /**
@@ -3089,7 +3120,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *next;
 	unsigned long gap_addr;
-	int error = 0;
+	int error;
 	VMA_ITERATOR(vmi, mm, vma->vm_start);
 
 	if (!(vma->vm_flags & VM_GROWSUP))
@@ -3126,12 +3157,14 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 
 	/* We must make sure the anon_vma is allocated. */
 	if (unlikely(anon_vma_prepare(vma))) {
-		vma_iter_free(&vmi);
-		return -ENOMEM;
+		error = -ENOMEM;
+		goto free;
 	}
 
 	/* Lock the VMA before expanding to prevent concurrent page faults */
-	vma_start_write(vma);
+	error = vma_start_write_killable(vma);
+	if (error)
+		goto free;
 	/* We update the anon VMA tree. */
 	anon_vma_lock_write(vma->anon_vma);
 
@@ -3160,6 +3193,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 		}
 	}
 	anon_vma_unlock_write(vma->anon_vma);
+free:
 	vma_iter_free(&vmi);
 	validate_mm(mm);
 	return error;
@@ -3174,7 +3208,7 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *prev;
-	int error = 0;
+	int error;
 	VMA_ITERATOR(vmi, mm, vma->vm_start);
 
 	if (!(vma->vm_flags & VM_GROWSDOWN))
@@ -3205,12 +3239,14 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
 
 	/* We must make sure the anon_vma is allocated. */
 	if (unlikely(anon_vma_prepare(vma))) {
-		vma_iter_free(&vmi);
-		return -ENOMEM;
+		error = -ENOMEM;
+		goto free;
 	}
 
 	/* Lock the VMA before expanding to prevent concurrent page faults */
-	vma_start_write(vma);
+	error = vma_start_write_killable(vma);
+	if (error)
+		goto free;
 	/* We update the anon VMA tree. */
 	anon_vma_lock_write(vma->anon_vma);
 
@@ -3240,6 +3276,7 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
 		}
 	}
 	anon_vma_unlock_write(vma->anon_vma);
+free:
 	vma_iter_free(&vmi);
 	validate_mm(mm);
 	return error;
diff --git a/mm/vma_exec.c b/mm/vma_exec.c
index 8134e1afca68..a4addc2a8480 100644
--- a/mm/vma_exec.c
+++ b/mm/vma_exec.c
@@ -40,6 +40,7 @@ int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift)
 	struct vm_area_struct *next;
 	struct mmu_gather tlb;
 	PAGETABLE_MOVE(pmc, vma, vma, old_start, new_start, length);
+	int err;
 
 	BUG_ON(new_start > new_end);
 
@@ -55,8 +56,9 @@ int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift)
 	 * cover the whole range: [new_start, old_end)
 	 */
 	vmg.target = vma;
-	if (vma_expand(&vmg))
-		return -ENOMEM;
+	err = vma_expand(&vmg);
+	if (err)
+		return err;
 
 	/*
 	 * move the page tables downwards, on failure we rely on
-- 
2.53.0.273.g2a3d683680-goog




* [PATCH v2 3/3] mm: use vma_start_write_killable() in process_vma_walk_lock()
  2026-02-17 16:32 [PATCH v2 0/3] Use killable vma write locking in most places Suren Baghdasaryan
  2026-02-17 16:32 ` [PATCH v2 1/3] mm/vma: cleanup error handling path in vma_expand() Suren Baghdasaryan
  2026-02-17 16:32 ` [PATCH v2 2/3] mm: replace vma_start_write() with vma_start_write_killable() Suren Baghdasaryan
@ 2026-02-17 16:32 ` Suren Baghdasaryan
  2026-02-17 19:15   ` Heiko Carstens
  2 siblings, 1 reply; 14+ messages in thread
From: Suren Baghdasaryan @ 2026-02-17 16:32 UTC (permalink / raw)
  To: akpm
  Cc: willy, david, ziy, matthew.brost, joshua.hahnjy, rakie.kim,
	byungchul, gourry, ying.huang, apopple, lorenzo.stoakes,
	baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
	baohua, lance.yang, vbabka, jannh, rppt, mhocko, pfalcato, kees,
	maddy, npiggin, mpe, chleroy, borntraeger, frankja, imbrenda,
	hca, gor, agordeev, svens, gerald.schaefer, linux-mm,
	linuxppc-dev, kvm, linux-kernel, linux-s390, surenb

Replace vma_start_write() with vma_start_write_killable() when
process_vma_walk_lock() is used with the PGWALK_WRLOCK option.
Adjust its direct and indirect users to check for a possible error
and handle it.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 arch/s390/kvm/kvm-s390.c |  5 +++--
 arch/s390/mm/gmap.c      | 13 ++++++++++---
 fs/proc/task_mmu.c       |  7 ++++++-
 mm/pagewalk.c            | 20 ++++++++++++++------
 4 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 56a50524b3ee..75aef9c66e03 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -958,6 +958,7 @@ static int kvm_s390_get_mem_control(struct kvm *kvm, struct kvm_device_attr *att
 static int kvm_s390_set_mem_control(struct kvm *kvm, struct kvm_device_attr *attr)
 {
 	int ret;
+	int err;
 	unsigned int idx;
 	switch (attr->attr) {
 	case KVM_S390_VM_MEM_ENABLE_CMMA:
@@ -990,10 +991,10 @@ static int kvm_s390_set_mem_control(struct kvm *kvm, struct kvm_device_attr *att
 		VM_EVENT(kvm, 3, "%s", "RESET: CMMA states");
 		mutex_lock(&kvm->lock);
 		idx = srcu_read_lock(&kvm->srcu);
-		s390_reset_cmma(kvm->arch.gmap->mm);
+		err = s390_reset_cmma(kvm->arch.gmap->mm);
 		srcu_read_unlock(&kvm->srcu, idx);
 		mutex_unlock(&kvm->lock);
-		ret = 0;
+		ret = (err < 0) ? err : 0;
 		break;
 	case KVM_S390_VM_MEM_LIMIT_SIZE: {
 		unsigned long new_limit;
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index dd85bcca817d..96054b124db5 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2271,6 +2271,7 @@ int s390_enable_skey(void)
 {
 	struct mm_struct *mm = current->mm;
 	int rc = 0;
+	int err;
 
 	mmap_write_lock(mm);
 	if (mm_uses_skeys(mm))
@@ -2282,7 +2283,9 @@ int s390_enable_skey(void)
 		mm->context.uses_skeys = 0;
 		goto out_up;
 	}
-	walk_page_range(mm, 0, TASK_SIZE, &enable_skey_walk_ops, NULL);
+	err = walk_page_range(mm, 0, TASK_SIZE, &enable_skey_walk_ops, NULL);
+	if (err < 0)
+		rc = err;
 
 out_up:
 	mmap_write_unlock(mm);
@@ -2305,11 +2308,15 @@ static const struct mm_walk_ops reset_cmma_walk_ops = {
 	.walk_lock		= PGWALK_WRLOCK,
 };
 
-void s390_reset_cmma(struct mm_struct *mm)
+int s390_reset_cmma(struct mm_struct *mm)
 {
+	int err;
+
 	mmap_write_lock(mm);
-	walk_page_range(mm, 0, TASK_SIZE, &reset_cmma_walk_ops, NULL);
+	err = walk_page_range(mm, 0, TASK_SIZE, &reset_cmma_walk_ops, NULL);
 	mmap_write_unlock(mm);
+
+	return (err < 0) ? err : 0;
 }
 EXPORT_SYMBOL_GPL(s390_reset_cmma);
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index d7d52e259055..91e806d65bd9 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1797,6 +1797,7 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
 		struct clear_refs_private cp = {
 			.type = type,
 		};
+		int err;
 
 		if (mmap_write_lock_killable(mm)) {
 			count = -EINTR;
@@ -1824,7 +1825,11 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
 						0, mm, 0, -1UL);
 			mmu_notifier_invalidate_range_start(&range);
 		}
-		walk_page_range(mm, 0, -1, &clear_refs_walk_ops, &cp);
+		err = walk_page_range(mm, 0, -1, &clear_refs_walk_ops, &cp);
+		if (err < 0) {
+			count = err;
+			goto out_unlock;
+		}
 		if (type == CLEAR_REFS_SOFT_DIRTY) {
 			mmu_notifier_invalidate_range_end(&range);
 			flush_tlb_mm(mm);
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index a94c401ab2cf..dc9f7a7709c6 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -425,14 +425,13 @@ static inline void process_mm_walk_lock(struct mm_struct *mm,
 		mmap_assert_write_locked(mm);
 }
 
-static inline void process_vma_walk_lock(struct vm_area_struct *vma,
+static inline int process_vma_walk_lock(struct vm_area_struct *vma,
 					 enum page_walk_lock walk_lock)
 {
 #ifdef CONFIG_PER_VMA_LOCK
 	switch (walk_lock) {
 	case PGWALK_WRLOCK:
-		vma_start_write(vma);
-		break;
+		return vma_start_write_killable(vma);
 	case PGWALK_WRLOCK_VERIFY:
 		vma_assert_write_locked(vma);
 		break;
@@ -444,6 +443,7 @@ static inline void process_vma_walk_lock(struct vm_area_struct *vma,
 		break;
 	}
 #endif
+	return 0;
 }
 
 /*
@@ -487,7 +487,9 @@ int walk_page_range_mm_unsafe(struct mm_struct *mm, unsigned long start,
 			if (ops->pte_hole)
 				err = ops->pte_hole(start, next, -1, &walk);
 		} else { /* inside vma */
-			process_vma_walk_lock(vma, ops->walk_lock);
+			err = process_vma_walk_lock(vma, ops->walk_lock);
+			if (err)
+				break;
 			walk.vma = vma;
 			next = min(end, vma->vm_end);
 			vma = find_vma(mm, vma->vm_end);
@@ -704,6 +706,7 @@ int walk_page_range_vma_unsafe(struct vm_area_struct *vma, unsigned long start,
 		.vma		= vma,
 		.private	= private,
 	};
+	int err;
 
 	if (start >= end || !walk.mm)
 		return -EINVAL;
@@ -711,7 +714,9 @@ int walk_page_range_vma_unsafe(struct vm_area_struct *vma, unsigned long start,
 		return -EINVAL;
 
 	process_mm_walk_lock(walk.mm, ops->walk_lock);
-	process_vma_walk_lock(vma, ops->walk_lock);
+	err = process_vma_walk_lock(vma, ops->walk_lock);
+	if (err)
+		return err;
 	return __walk_page_range(start, end, &walk);
 }
 
@@ -734,6 +739,7 @@ int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *ops,
 		.vma		= vma,
 		.private	= private,
 	};
+	int err;
 
 	if (!walk.mm)
 		return -EINVAL;
@@ -741,7 +747,9 @@ int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *ops,
 		return -EINVAL;
 
 	process_mm_walk_lock(walk.mm, ops->walk_lock);
-	process_vma_walk_lock(vma, ops->walk_lock);
+	err = process_vma_walk_lock(vma, ops->walk_lock);
+	if (err)
+		return err;
 	return __walk_page_range(vma->vm_start, vma->vm_end, &walk);
 }
 
-- 
2.53.0.273.g2a3d683680-goog




* Re: [PATCH v2 1/3] mm/vma: cleanup error handling path in vma_expand()
  2026-02-17 16:32 ` [PATCH v2 1/3] mm/vma: cleanup error handling path in vma_expand() Suren Baghdasaryan
@ 2026-02-17 18:26   ` Liam R. Howlett
  0 siblings, 0 replies; 14+ messages in thread
From: Liam R. Howlett @ 2026-02-17 18:26 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: akpm, willy, david, ziy, matthew.brost, joshua.hahnjy, rakie.kim,
	byungchul, gourry, ying.huang, apopple, lorenzo.stoakes,
	baolin.wang, npache, ryan.roberts, dev.jain, baohua, lance.yang,
	vbabka, jannh, rppt, mhocko, pfalcato, kees, maddy, npiggin, mpe,
	chleroy, borntraeger, frankja, imbrenda, hca, gor, agordeev,
	svens, gerald.schaefer, linux-mm, linuxppc-dev, kvm,
	linux-kernel, linux-s390

* Suren Baghdasaryan <surenb@google.com> [260217 11:33]:
> vma_expand() error handling is a bit confusing with "if (ret) return ret;"
> mixed with "if (!ret && ...) ret = ...;". Simplify the code to check
> for errors and return immediately after an operation that might fail.
> This also makes later changes to this function more readable.
> 
> No functional change intended.
> 
> Suggested-by: Jann Horn <jannh@google.com>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>

Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

> ---
>  mm/vma.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/vma.c b/mm/vma.c
> index be64f781a3aa..bb4d0326fecb 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -1186,12 +1186,16 @@ int vma_expand(struct vma_merge_struct *vmg)
>  	 * Note that, by convention, callers ignore OOM for this case, so
>  	 * we don't need to account for vmg->give_up_on_mm here.
>  	 */
> -	if (remove_next)
> +	if (remove_next) {
>  		ret = dup_anon_vma(target, next, &anon_dup);
> -	if (!ret && vmg->copied_from)
> +		if (ret)
> +			return ret;
> +	}
> +	if (vmg->copied_from) {
>  		ret = dup_anon_vma(target, vmg->copied_from, &anon_dup);
> -	if (ret)
> -		return ret;
> +		if (ret)
> +			return ret;
> +	}
>  
>  	if (remove_next) {
>  		vma_start_write(next);
> -- 
> 2.53.0.273.g2a3d683680-goog
> 



* Re: [PATCH v2 3/3] mm: use vma_start_write_killable() in process_vma_walk_lock()
  2026-02-17 16:32 ` [PATCH v2 3/3] mm: use vma_start_write_killable() in process_vma_walk_lock() Suren Baghdasaryan
@ 2026-02-17 19:15   ` Heiko Carstens
  2026-02-17 20:31     ` Suren Baghdasaryan
  0 siblings, 1 reply; 14+ messages in thread
From: Heiko Carstens @ 2026-02-17 19:15 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: akpm, willy, david, ziy, matthew.brost, joshua.hahnjy, rakie.kim,
	byungchul, gourry, ying.huang, apopple, lorenzo.stoakes,
	baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
	baohua, lance.yang, vbabka, jannh, rppt, mhocko, pfalcato, kees,
	maddy, npiggin, mpe, chleroy, borntraeger, frankja, imbrenda,
	gor, agordeev, svens, gerald.schaefer, linux-mm, linuxppc-dev,
	kvm, linux-kernel, linux-s390

On Tue, Feb 17, 2026 at 08:32:50AM -0800, Suren Baghdasaryan wrote:
> Replace vma_start_write() with vma_start_write_killable() when
> process_vma_walk_lock() is used with PGWALK_WRLOCK option.
> Adjust its direct and indirect users to check for a possible error
> and handle it.
> 
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  arch/s390/kvm/kvm-s390.c |  5 +++--
>  arch/s390/mm/gmap.c      | 13 ++++++++++---
>  fs/proc/task_mmu.c       |  7 ++++++-
>  mm/pagewalk.c            | 20 ++++++++++++++------
>  4 files changed, 33 insertions(+), 12 deletions(-)

The s390 code modified with this patch does not exist upstream
anymore. It has been replaced with Claudio's huge gmap rewrite.



* Re: [PATCH v2 2/3] mm: replace vma_start_write() with vma_start_write_killable()
  2026-02-17 16:32 ` [PATCH v2 2/3] mm: replace vma_start_write() with vma_start_write_killable() Suren Baghdasaryan
@ 2026-02-17 19:19   ` Liam R. Howlett
  2026-02-17 21:02     ` Suren Baghdasaryan
  0 siblings, 1 reply; 14+ messages in thread
From: Liam R. Howlett @ 2026-02-17 19:19 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: akpm, willy, david, ziy, matthew.brost, joshua.hahnjy, rakie.kim,
	byungchul, gourry, ying.huang, apopple, lorenzo.stoakes,
	baolin.wang, npache, ryan.roberts, dev.jain, baohua, lance.yang,
	vbabka, jannh, rppt, mhocko, pfalcato, kees, maddy, npiggin, mpe,
	chleroy, borntraeger, frankja, imbrenda, hca, gor, agordeev,
	svens, gerald.schaefer, linux-mm, linuxppc-dev, kvm,
	linux-kernel, linux-s390, Ritesh Harjani (IBM)

* Suren Baghdasaryan <surenb@google.com> [260217 11:33]:
> Now that we have vma_start_write_killable() we can replace most of the
> vma_start_write() calls with it, improving reaction time to the kill
> signal.
> 
> There are several places which are left untouched by this patch:
> 
> 1. free_pgtables() because function should free page tables even if a
> fatal signal is pending.
> 
> 2. process_vma_walk_lock(), which requires changes in its callers and
> will be handled in the next patch.
> 
> 3. userfaultd code, where some paths calling vma_start_write() can
> handle EINTR and some can't without a deeper code refactoring.
> 
> 4. vm_flags_{set|mod|clear} require refactoring that involves moving
> vma_start_write() out of these functions and replacing it with
> vma_assert_write_locked(), then callers of these functions should
> lock the vma themselves using vma_start_write_killable() whenever
> possible.
> 
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> # powerpc
> ---
>  arch/powerpc/kvm/book3s_hv_uvmem.c |  5 +-
>  include/linux/mempolicy.h          |  5 +-
>  mm/khugepaged.c                    |  5 +-
>  mm/madvise.c                       |  4 +-
>  mm/memory.c                        |  2 +
>  mm/mempolicy.c                     | 23 ++++++--
>  mm/mlock.c                         | 20 +++++--
>  mm/mprotect.c                      |  4 +-
>  mm/mremap.c                        |  4 +-
>  mm/vma.c                           | 93 +++++++++++++++++++++---------
>  mm/vma_exec.c                      |  6 +-
>  11 files changed, 123 insertions(+), 48 deletions(-)
> 

...

> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c

...

>  
>  static const struct mempolicy_operations mpol_ops[MPOL_MAX] = {
> @@ -1785,9 +1793,15 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
>  		return -EINVAL;
>  	if (end == start)
>  		return 0;
> -	mmap_write_lock(mm);
> +	if (mmap_write_lock_killable(mm))
> +		return -EINTR;
>  	prev = vma_prev(&vmi);
>  	for_each_vma_range(vmi, vma, end) {
> +		if (vma_start_write_killable(vma)) {
> +			err = -EINTR;
> +			break;
> +		}
> +
>  		/*
>  		 * If any vma in the range got policy other than MPOL_BIND
>  		 * or MPOL_PREFERRED_MANY we return error. We don't reset
> @@ -1808,7 +1822,6 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
>  			break;
>  		}
>  
> -		vma_start_write(vma);

Moving this vma_start_write() up means we will lock all vmas in the
range regardless of whether they are going to change.  Was that your
intention?

It might be better to move the locking to later in the loop, prior to
the mpol_dup(), but after skipping other vmas?
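
Something like this, maybe (only a sketch of the placement I mean,
written against the current loop body, so details may differ):

	for_each_vma_range(vmi, vma, end) {
		old = vma_policy(vma);
		if (!old) {
			prev = vma;
			continue;
		}
		if (old->mode != MPOL_BIND &&
		    old->mode != MPOL_PREFERRED_MANY) {
			err = -EOPNOTSUPP;
			break;
		}

		/* Only lock vmas we are actually going to modify. */
		if (vma_start_write_killable(vma)) {
			err = -EINTR;
			break;
		}

		new = mpol_dup(old);
		if (IS_ERR(new)) {
			err = PTR_ERR(new);
			break;
		}

		new->home_node = home_node;
		err = mbind_range(&vmi, vma, &prev, start, end, new);
		mpol_put(new);
		if (err)
			break;
	}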

>  		new->home_node = home_node;
>  		err = mbind_range(&vmi, vma, &prev, start, end, new);

...

> diff --git a/mm/vma.c b/mm/vma.c
> index bb4d0326fecb..1d21351282cf 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c

...

> @@ -2532,6 +2556,11 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
>  		goto free_vma;
>  	}
>  
> +	/* Lock the VMA since it is modified after insertion into VMA tree */
> +	error = vma_start_write_killable(vma);
> +	if (error)
> +		goto free_iter_vma;
> +
>  	if (map->file)
>  		error = __mmap_new_file_vma(map, vma);
>  	else if (map->vm_flags & VM_SHARED)
> @@ -2552,8 +2581,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
>  	WARN_ON_ONCE(!arch_validate_flags(map->vm_flags));
>  #endif
>  
> -	/* Lock the VMA since it is modified after insertion into VMA tree */
> -	vma_start_write(vma);
>  	vma_iter_store_new(vmi, vma);
>  	map->mm->map_count++;
>  	vma_link_file(vma, map->hold_file_rmap_lock);

This is a bit of a nit on the placement...

Prior to this change, the write lock on the vma was taken next to where
it was inserted in the tree.  Now the lock is taken between vma iterator
preallocations and part of the vma setup.

Would it make sense to put it closer to the vma allocation itself?  I
think all that's needed to be set is the mm struct for the locking to
work?


...

> @@ -3089,7 +3120,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)

Good luck testing this one.

>  	struct mm_struct *mm = vma->vm_mm;
>  	struct vm_area_struct *next;
>  	unsigned long gap_addr;
> -	int error = 0;
> +	int error;
>  	VMA_ITERATOR(vmi, mm, vma->vm_start);
>  
>  	if (!(vma->vm_flags & VM_GROWSUP))
> @@ -3126,12 +3157,14 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>  
>  	/* We must make sure the anon_vma is allocated. */
>  	if (unlikely(anon_vma_prepare(vma))) {
> -		vma_iter_free(&vmi);
> -		return -ENOMEM;
> +		error = -ENOMEM;
> +		goto free;
>  	}
>  
>  	/* Lock the VMA before expanding to prevent concurrent page faults */
> -	vma_start_write(vma);
> +	error = vma_start_write_killable(vma);
> +	if (error)
> +		goto free;
>  	/* We update the anon VMA tree. */
>  	anon_vma_lock_write(vma->anon_vma);
>  
> @@ -3160,6 +3193,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>  		}
>  	}
>  	anon_vma_unlock_write(vma->anon_vma);
> +free:
>  	vma_iter_free(&vmi);
>  	validate_mm(mm);
>  	return error;

Looks okay.

...





* Re: [PATCH v2 3/3] mm: use vma_start_write_killable() in process_vma_walk_lock()
  2026-02-17 19:15   ` Heiko Carstens
@ 2026-02-17 20:31     ` Suren Baghdasaryan
  2026-02-18  7:10       ` Heiko Carstens
  2026-02-18 13:07       ` Matthew Wilcox
  0 siblings, 2 replies; 14+ messages in thread
From: Suren Baghdasaryan @ 2026-02-17 20:31 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: akpm, willy, david, ziy, matthew.brost, joshua.hahnjy, rakie.kim,
	byungchul, gourry, ying.huang, apopple, lorenzo.stoakes,
	baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
	baohua, lance.yang, vbabka, jannh, rppt, mhocko, pfalcato, kees,
	maddy, npiggin, mpe, chleroy, borntraeger, frankja, imbrenda,
	gor, agordeev, svens, gerald.schaefer, linux-mm, linuxppc-dev,
	kvm, linux-kernel, linux-s390

On Tue, Feb 17, 2026 at 11:15 AM Heiko Carstens <hca@linux.ibm.com> wrote:
>
> On Tue, Feb 17, 2026 at 08:32:50AM -0800, Suren Baghdasaryan wrote:
> > Replace vma_start_write() with vma_start_write_killable() when
> > process_vma_walk_lock() is used with PGWALK_WRLOCK option.
> > Adjust its direct and indirect users to check for a possible error
> > and handle it.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > ---
> >  arch/s390/kvm/kvm-s390.c |  5 +++--
> >  arch/s390/mm/gmap.c      | 13 ++++++++++---
> >  fs/proc/task_mmu.c       |  7 ++++++-
> >  mm/pagewalk.c            | 20 ++++++++++++++------
> >  4 files changed, 33 insertions(+), 12 deletions(-)
>
> The s390 code modified with this patch does not exist upstream
> anymore. It has been replaced with Claudio's huge gmap rewrite.

Hmm. My patchset is based on mm-new. I guess the code was modified in
some other tree. Could you please provide a link to that patchset so I
can track it? I'll probably remove this patch from my set until that
one is merged.



* Re: [PATCH v2 2/3] mm: replace vma_start_write() with vma_start_write_killable()
  2026-02-17 19:19   ` Liam R. Howlett
@ 2026-02-17 21:02     ` Suren Baghdasaryan
  2026-02-18 16:46       ` Liam R. Howlett
  0 siblings, 1 reply; 14+ messages in thread
From: Suren Baghdasaryan @ 2026-02-17 21:02 UTC (permalink / raw)
  To: Liam R. Howlett, Suren Baghdasaryan, akpm, willy, david, ziy,
	matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
	ying.huang, apopple, lorenzo.stoakes, baolin.wang, npache,
	ryan.roberts, dev.jain, baohua, lance.yang, vbabka, jannh, rppt,
	mhocko, pfalcato, kees, maddy, npiggin, mpe, chleroy,
	borntraeger, frankja, imbrenda, hca, gor, agordeev, svens,
	gerald.schaefer, linux-mm, linuxppc-dev, kvm, linux-kernel,
	linux-s390, Ritesh Harjani (IBM)

On Tue, Feb 17, 2026 at 11:19 AM Liam R. Howlett
<Liam.Howlett@oracle.com> wrote:
>
> * Suren Baghdasaryan <surenb@google.com> [260217 11:33]:
> > Now that we have vma_start_write_killable() we can replace most of the
> > vma_start_write() calls with it, improving reaction time to the kill
> > signal.
> >
> > There are several places which are left untouched by this patch:
> >
> > 1. free_pgtables() because function should free page tables even if a
> > fatal signal is pending.
> >
> > 2. process_vma_walk_lock(), which requires changes in its callers and
> > will be handled in the next patch.
> >
> > 3. userfaultd code, where some paths calling vma_start_write() can
> > handle EINTR and some can't without a deeper code refactoring.
> >
> > 4. vm_flags_{set|mod|clear} require refactoring that involves moving
> > vma_start_write() out of these functions and replacing it with
> > vma_assert_write_locked(), then callers of these functions should
> > lock the vma themselves using vma_start_write_killable() whenever
> > possible.
> >
> > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> # powerpc
> > ---
> >  arch/powerpc/kvm/book3s_hv_uvmem.c |  5 +-
> >  include/linux/mempolicy.h          |  5 +-
> >  mm/khugepaged.c                    |  5 +-
> >  mm/madvise.c                       |  4 +-
> >  mm/memory.c                        |  2 +
> >  mm/mempolicy.c                     | 23 ++++++--
> >  mm/mlock.c                         | 20 +++++--
> >  mm/mprotect.c                      |  4 +-
> >  mm/mremap.c                        |  4 +-
> >  mm/vma.c                           | 93 +++++++++++++++++++++---------
> >  mm/vma_exec.c                      |  6 +-
> >  11 files changed, 123 insertions(+), 48 deletions(-)
> >
>
> ...
>
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
>
> ...
>
> >
> >  static const struct mempolicy_operations mpol_ops[MPOL_MAX] = {
> > @@ -1785,9 +1793,15 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
> >               return -EINVAL;
> >       if (end == start)
> >               return 0;
> > -     mmap_write_lock(mm);
> > +     if (mmap_write_lock_killable(mm))
> > +             return -EINTR;
> >       prev = vma_prev(&vmi);
> >       for_each_vma_range(vmi, vma, end) {
> > +             if (vma_start_write_killable(vma)) {
> > +                     err = -EINTR;
> > +                     break;
> > +             }
> > +
> >               /*
> >                * If any vma in the range got policy other than MPOL_BIND
> >                * or MPOL_PREFERRED_MANY we return error. We don't reset
> > @@ -1808,7 +1822,6 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
> >                       break;
> >               }
> >
> > -             vma_start_write(vma);
>
> Moving this vma_start_write() up means we will lock all vmas in the
> range regardless of if they are going to change.  Was that your
> intention?

No, I missed that this would result in unnecessary locks.

>
> It might be better to move the locking to later in the loop, prior to
> the mpol_dup(), but after skipping other vmas?

Yes, that's the right place for it. Will move.

>
> >               new->home_node = home_node;
> >               err = mbind_range(&vmi, vma, &prev, start, end, new);
>
> ...
>
> > diff --git a/mm/vma.c b/mm/vma.c
> > index bb4d0326fecb..1d21351282cf 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
>
> ...
>
> > @@ -2532,6 +2556,11 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
> >               goto free_vma;
> >       }
> >
> > +     /* Lock the VMA since it is modified after insertion into VMA tree */
> > +     error = vma_start_write_killable(vma);
> > +     if (error)
> > +             goto free_iter_vma;
> > +
> >       if (map->file)
> >               error = __mmap_new_file_vma(map, vma);
> >       else if (map->vm_flags & VM_SHARED)
> > @@ -2552,8 +2581,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
> >       WARN_ON_ONCE(!arch_validate_flags(map->vm_flags));
> >  #endif
> >
> > -     /* Lock the VMA since it is modified after insertion into VMA tree */
> > -     vma_start_write(vma);
> >       vma_iter_store_new(vmi, vma);
> >       map->mm->map_count++;
> >       vma_link_file(vma, map->hold_file_rmap_lock);
>
> This is a bit of a nit on the placement..
>
> Prior to this change, the write lock on the vma was taken next to where
> it was inserted in the tree.  Now the lock is taken between vma iterator
> preallocations and part of the vma setup.
>
> Would it make sense to put it closer to the vma allocation itself?  I
> think all that's needed to be set is the mm struct for the locking to
> work?

I guess locking the vma before vma_iter_prealloc() would save us
unnecessary alloc/free in case of a pending fatal signal. I'll move
the lock right after vm_area_alloc() so that the entire vma setup is
done on a locked vma.
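
Something like this (a rough sketch of the ordering I have in mind,
with the surrounding __mmap_new_vma() details abbreviated):

	vma = vm_area_alloc(map->mm);
	if (!vma)
		return -ENOMEM;

	/*
	 * Lock the new vma before any further setup; only vma->vm_mm
	 * needs to be valid for the locking to work.
	 */
	error = vma_start_write_killable(vma);
	if (error)
		goto free_vma;

	/* ... the rest of the vma setup and the tree insertion ... */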

>
>
> ...
>
> > @@ -3089,7 +3120,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>
> Good luck testing this one.

Yeah... Any suggestions for tests I should use?

>
> >       struct mm_struct *mm = vma->vm_mm;
> >       struct vm_area_struct *next;
> >       unsigned long gap_addr;
> > -     int error = 0;
> > +     int error;
> >       VMA_ITERATOR(vmi, mm, vma->vm_start);
> >
> >       if (!(vma->vm_flags & VM_GROWSUP))
> > @@ -3126,12 +3157,14 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
> >
> >       /* We must make sure the anon_vma is allocated. */
> >       if (unlikely(anon_vma_prepare(vma))) {
> > -             vma_iter_free(&vmi);
> > -             return -ENOMEM;
> > +             error = -ENOMEM;
> > +             goto free;
> >       }
> >
> >       /* Lock the VMA before expanding to prevent concurrent page faults */
> > -     vma_start_write(vma);
> > +     error = vma_start_write_killable(vma);
> > +     if (error)
> > +             goto free;
> >       /* We update the anon VMA tree. */
> >       anon_vma_lock_write(vma->anon_vma);
> >
> > @@ -3160,6 +3193,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
> >               }
> >       }
> >       anon_vma_unlock_write(vma->anon_vma);
> > +free:
> >       vma_iter_free(&vmi);
> >       validate_mm(mm);
> >       return error;
>
> Looks okay.

Thanks for the review, Liam! I'll wait a couple of days and post the
v3 with fixes.

>
> ...
>
>
>



* Re: [PATCH v2 3/3] mm: use vma_start_write_killable() in process_vma_walk_lock()
  2026-02-17 20:31     ` Suren Baghdasaryan
@ 2026-02-18  7:10       ` Heiko Carstens
  2026-02-18 13:07       ` Matthew Wilcox
  1 sibling, 0 replies; 14+ messages in thread
From: Heiko Carstens @ 2026-02-18  7:10 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: akpm, willy, david, ziy, matthew.brost, joshua.hahnjy, rakie.kim,
	byungchul, gourry, ying.huang, apopple, lorenzo.stoakes,
	baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
	baohua, lance.yang, vbabka, jannh, rppt, mhocko, pfalcato, kees,
	maddy, npiggin, mpe, chleroy, borntraeger, frankja, imbrenda,
	gor, agordeev, svens, gerald.schaefer, linux-mm, linuxppc-dev,
	kvm, linux-kernel, linux-s390

On Tue, Feb 17, 2026 at 12:31:32PM -0800, Suren Baghdasaryan wrote:
> On Tue, Feb 17, 2026 at 11:15 AM Heiko Carstens <hca@linux.ibm.com> wrote:
> >
> > On Tue, Feb 17, 2026 at 08:32:50AM -0800, Suren Baghdasaryan wrote:
> > > Replace vma_start_write() with vma_start_write_killable() when
> > > process_vma_walk_lock() is used with PGWALK_WRLOCK option.
> > > Adjust its direct and indirect users to check for a possible error
> > > and handle it.
> > >
> > > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > > ---
> > >  arch/s390/kvm/kvm-s390.c |  5 +++--
> > >  arch/s390/mm/gmap.c      | 13 ++++++++++---
> > >  fs/proc/task_mmu.c       |  7 ++++++-
> > >  mm/pagewalk.c            | 20 ++++++++++++++------
> > >  4 files changed, 33 insertions(+), 12 deletions(-)
> >
> > The s390 code modified with this patch does not exist upstream
> > anymore. It has been replaced with Claudio's huge gmap rewrite.
> 
> Hmm. My patchset is based on mm-new. I guess the code was modified in
> some other tree. Could you please provide a link to that patchset so I
> can track it? I'll probably remove this patch from my set until that
> one is merged.

This is the corresponding merge commit in Linus' tree:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b1195183ed42f1522fae3fe44ebee3af437aa000



* Re: [PATCH v2 3/3] mm: use vma_start_write_killable() in process_vma_walk_lock()
  2026-02-17 20:31     ` Suren Baghdasaryan
  2026-02-18  7:10       ` Heiko Carstens
@ 2026-02-18 13:07       ` Matthew Wilcox
  2026-02-18 15:52         ` Suren Baghdasaryan
  1 sibling, 1 reply; 14+ messages in thread
From: Matthew Wilcox @ 2026-02-18 13:07 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: Heiko Carstens, akpm, david, ziy, matthew.brost, joshua.hahnjy,
	rakie.kim, byungchul, gourry, ying.huang, apopple,
	lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts,
	dev.jain, baohua, lance.yang, vbabka, jannh, rppt, mhocko,
	pfalcato, kees, maddy, npiggin, mpe, chleroy, borntraeger,
	frankja, imbrenda, gor, agordeev, svens, gerald.schaefer,
	linux-mm, linuxppc-dev, kvm, linux-kernel, linux-s390

On Tue, Feb 17, 2026 at 12:31:32PM -0800, Suren Baghdasaryan wrote:
> Hmm. My patchset is based on mm-new. I guess the code was modified in
> some other tree. Could you please provide a link to that patchset so I
> can track it? I'll probably remove this patch from my set until that
> one is merged.

mm-new is a bad place to be playing; better to base off mm-unstable.



* Re: [PATCH v2 3/3] mm: use vma_start_write_killable() in process_vma_walk_lock()
  2026-02-18 13:07       ` Matthew Wilcox
@ 2026-02-18 15:52         ` Suren Baghdasaryan
  0 siblings, 0 replies; 14+ messages in thread
From: Suren Baghdasaryan @ 2026-02-18 15:52 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Heiko Carstens, akpm, david, ziy, matthew.brost, joshua.hahnjy,
	rakie.kim, byungchul, gourry, ying.huang, apopple,
	lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts,
	dev.jain, baohua, lance.yang, vbabka, jannh, rppt, mhocko,
	pfalcato, kees, maddy, npiggin, mpe, chleroy, borntraeger,
	frankja, imbrenda, gor, agordeev, svens, gerald.schaefer,
	linux-mm, linuxppc-dev, kvm, linux-kernel, linux-s390

On Wed, Feb 18, 2026 at 1:07 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Feb 17, 2026 at 12:31:32PM -0800, Suren Baghdasaryan wrote:
> > Hmm. My patchset is based on mm-new. I guess the code was modified in
> > some other tree. Could you please provide a link to that patchset so I
> > can track it? I'll probably remove this patch from my set until that
> > one is merged.
>
> mm-new is a bad place to be playing; better to base off mm-unstable.

Ah, I see. I'll redo my patchset on top of that then. Thanks for the tip!



* Re: [PATCH v2 2/3] mm: replace vma_start_write() with vma_start_write_killable()
  2026-02-17 21:02     ` Suren Baghdasaryan
@ 2026-02-18 16:46       ` Liam R. Howlett
  2026-02-18 23:40         ` Suren Baghdasaryan
  0 siblings, 1 reply; 14+ messages in thread
From: Liam R. Howlett @ 2026-02-18 16:46 UTC (permalink / raw)
  To: Suren Baghdasaryan
  Cc: akpm, willy, david, ziy, matthew.brost, joshua.hahnjy, rakie.kim,
	byungchul, gourry, ying.huang, apopple, lorenzo.stoakes,
	baolin.wang, npache, ryan.roberts, dev.jain, baohua, lance.yang,
	vbabka, jannh, rppt, mhocko, pfalcato, kees, maddy, npiggin, mpe,
	chleroy, borntraeger, frankja, imbrenda, hca, gor, agordeev,
	svens, gerald.schaefer, linux-mm, linuxppc-dev, kvm,
	linux-kernel, linux-s390, Ritesh Harjani (IBM)

* Suren Baghdasaryan <surenb@google.com> [260217 16:03]:
> On Tue, Feb 17, 2026 at 11:19 AM Liam R. Howlett
> <Liam.Howlett@oracle.com> wrote:
> >
> > * Suren Baghdasaryan <surenb@google.com> [260217 11:33]:
> > > Now that we have vma_start_write_killable() we can replace most of the
> > > vma_start_write() calls with it, improving reaction time to the kill
> > > signal.
> > >
> > > There are several places which are left untouched by this patch:
> > >
> > > 1. free_pgtables() because the function should free page tables even if a
> > > fatal signal is pending.
> > >
> > > 2. process_vma_walk_lock(), which requires changes in its callers and
> > > will be handled in the next patch.
> > >
> > > 3. userfaultfd code, where some paths calling vma_start_write() can
> > > handle EINTR and some can't without a deeper code refactoring.
> > >
> > > 4. vm_flags_{set|mod|clear} require refactoring that involves moving
> > > vma_start_write() out of these functions and replacing it with
> > > vma_assert_write_locked(); callers of these functions should then
> > > lock the vma themselves using vma_start_write_killable() whenever
> > > possible.
> > >
> > > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > > Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> # powerpc
> > > ---
> > >  arch/powerpc/kvm/book3s_hv_uvmem.c |  5 +-
> > >  include/linux/mempolicy.h          |  5 +-
> > >  mm/khugepaged.c                    |  5 +-
> > >  mm/madvise.c                       |  4 +-
> > >  mm/memory.c                        |  2 +
> > >  mm/mempolicy.c                     | 23 ++++++--
> > >  mm/mlock.c                         | 20 +++++--
> > >  mm/mprotect.c                      |  4 +-
> > >  mm/mremap.c                        |  4 +-
> > >  mm/vma.c                           | 93 +++++++++++++++++++++---------
> > >  mm/vma_exec.c                      |  6 +-
> > >  11 files changed, 123 insertions(+), 48 deletions(-)
> > >
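For anyone skimming the thread, the conversion pattern under discussion
looks roughly like the sketch below.  Illustrative only, not taken from
the patch: it assumes vma_start_write_killable() returns 0 on success
and -EINTR when a fatal signal is pending, and do_mm_op() is a made-up
caller.

	/*
	 * Sketch; assumes <linux/mm.h> for struct vm_area_struct and the
	 * per-VMA lock helpers.
	 */
	static int do_mm_op(struct vm_area_struct *vma)
	{
		int err;

		/* Previously: vma_start_write(vma); -- an uninterruptible wait. */
		err = vma_start_write_killable(vma);
		if (err)
			return err;	/* -EINTR: a fatal signal is pending */

		/* ... modify the vma under the per-VMA write lock ... */
		return 0;
	}

The point of the change is that a task stuck waiting here can react to a
kill signal right away instead of only after the write lock is taken.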

...

> >
> >
> > ...
> >
> > > @@ -3089,7 +3120,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
> >
> > Good luck testing this one.
> 
> Yeah... Any suggestions for tests I should use?

I think you have to either isolate it or boot parisc.

To boot parisc, you can use the debian hppa image [1].  The file is a
zip file which can be decompressed to a qcow2, initrd, and kernel.  You
can boot with qemu-system-hppa (debian has this in qemu-system-misc
package); there is a readme that has a boot line as well.

Building can be done using the cross-compiler tools for hppa [2] and the
make command with CROSS_COMPILE=<path>/bin/hppa64-linux-
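
Something like the below is what I'd expect to work, though treat it as
a sketch: the defconfig choice, the qemu options and the root=/console=
arguments are guesses on my part -- the dqib readme has the
authoritative boot line.

	# Cross-build; ARCH=parisc is the kernel's name for hppa.  Plain
	# defconfig is a guess; the tree also ships parisc
	# generic-{32,64}bit_defconfig files.
	make ARCH=parisc CROSS_COMPILE=<path>/bin/hppa64-linux- defconfig
	make ARCH=parisc CROSS_COMPILE=<path>/bin/hppa64-linux- -j"$(nproc)"

	# Boot the dqib image with the freshly built kernel; the drive,
	# root= and console= settings below are assumptions, not copied
	# from the readme.
	qemu-system-hppa -m 1G -nographic \
		-kernel vmlinux -initrd initrd \
		-drive file=image.qcow2,format=qcow2 \
		-append "root=/dev/sda1 console=ttyS0"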

Cheers,
Liam

[1]. https://people.debian.org/~gio/dqib/
[2]. https://cdn.kernel.org/pub/tools/crosstool/files/bin/x86_64/15.2.0/




* Re: [PATCH v2 2/3] mm: replace vma_start_write() with vma_start_write_killable()
  2026-02-18 16:46       ` Liam R. Howlett
@ 2026-02-18 23:40         ` Suren Baghdasaryan
  0 siblings, 0 replies; 14+ messages in thread
From: Suren Baghdasaryan @ 2026-02-18 23:40 UTC (permalink / raw)
  To: Liam R. Howlett, Suren Baghdasaryan, akpm, willy, david, ziy,
	matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
	ying.huang, apopple, lorenzo.stoakes, baolin.wang, npache,
	ryan.roberts, dev.jain, baohua, lance.yang, vbabka, jannh, rppt,
	mhocko, pfalcato, kees, maddy, npiggin, mpe, chleroy,
	borntraeger, frankja, imbrenda, hca, gor, agordeev, svens,
	gerald.schaefer, linux-mm, linuxppc-dev, kvm, linux-kernel,
	linux-s390, Ritesh Harjani (IBM)

On Wed, Feb 18, 2026 at 8:46 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Suren Baghdasaryan <surenb@google.com> [260217 16:03]:
> > On Tue, Feb 17, 2026 at 11:19 AM Liam R. Howlett
> > <Liam.Howlett@oracle.com> wrote:
> > >
> > > * Suren Baghdasaryan <surenb@google.com> [260217 11:33]:
> > > > Now that we have vma_start_write_killable() we can replace most of the
> > > > vma_start_write() calls with it, improving reaction time to the kill
> > > > signal.
> > > >
> > > > There are several places which are left untouched by this patch:
> > > >
> > > > 1. free_pgtables() because the function should free page tables even if a
> > > > fatal signal is pending.
> > > >
> > > > 2. process_vma_walk_lock(), which requires changes in its callers and
> > > > will be handled in the next patch.
> > > >
> > > > 3. userfaultfd code, where some paths calling vma_start_write() can
> > > > handle EINTR and some can't without a deeper code refactoring.
> > > >
> > > > 4. vm_flags_{set|mod|clear} require refactoring that involves moving
> > > > vma_start_write() out of these functions and replacing it with
> > > > vma_assert_write_locked(); callers of these functions should then
> > > > lock the vma themselves using vma_start_write_killable() whenever
> > > > possible.
> > > >
> > > > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > > > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > > > Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> # powerpc
> > > > ---
> > > >  arch/powerpc/kvm/book3s_hv_uvmem.c |  5 +-
> > > >  include/linux/mempolicy.h          |  5 +-
> > > >  mm/khugepaged.c                    |  5 +-
> > > >  mm/madvise.c                       |  4 +-
> > > >  mm/memory.c                        |  2 +
> > > >  mm/mempolicy.c                     | 23 ++++++--
> > > >  mm/mlock.c                         | 20 +++++--
> > > >  mm/mprotect.c                      |  4 +-
> > > >  mm/mremap.c                        |  4 +-
> > > >  mm/vma.c                           | 93 +++++++++++++++++++++---------
> > > >  mm/vma_exec.c                      |  6 +-
> > > >  11 files changed, 123 insertions(+), 48 deletions(-)
> > > >
>
> ...
>
> > >
> > >
> > > ...
> > >
> > > > @@ -3089,7 +3120,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
> > >
> > > Good luck testing this one.
> >
> > Yeah... Any suggestions for tests I should use?
>
> I think you have to either isolate it or boot parisc.
>
> To boot parisc, you can use the debian hppa image [1].  The file is a
> zip file which can be decompressed to a qcow2, initrd, and kernel.  You
> can boot with qemu-system-hppa (debian has this in qemu-system-misc
> package); there is a readme that has a boot line as well.
>
> Building can be done using the cross-compiler tools for hppa [2] and the
> make command with CROSS_COMPILE=<path>/bin/hppa64-linux-

Ah, I thought you were referring to the difficulty of finding specific
tests to verify this change, but these instructions are helpful too.
Thanks!


>
> Cheers,
> Liam
>
> [1]. https://people.debian.org/~gio/dqib/
> [2]. https://cdn.kernel.org/pub/tools/crosstool/files/bin/x86_64/15.2.0/
>
>



end of thread, other threads:[~2026-02-18 23:40 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-17 16:32 [PATCH v2 0/3] Use killable vma write locking in most places Suren Baghdasaryan
2026-02-17 16:32 ` [PATCH v2 1/3] mm/vma: cleanup error handling path in vma_expand() Suren Baghdasaryan
2026-02-17 18:26   ` Liam R. Howlett
2026-02-17 16:32 ` [PATCH v2 2/3] mm: replace vma_start_write() with vma_start_write_killable() Suren Baghdasaryan
2026-02-17 19:19   ` Liam R. Howlett
2026-02-17 21:02     ` Suren Baghdasaryan
2026-02-18 16:46       ` Liam R. Howlett
2026-02-18 23:40         ` Suren Baghdasaryan
2026-02-17 16:32 ` [PATCH v2 3/3] mm: use vma_start_write_killable() in process_vma_walk_lock() Suren Baghdasaryan
2026-02-17 19:15   ` Heiko Carstens
2026-02-17 20:31     ` Suren Baghdasaryan
2026-02-18  7:10       ` Heiko Carstens
2026-02-18 13:07       ` Matthew Wilcox
2026-02-18 15:52         ` Suren Baghdasaryan
