linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jann Horn <jannh@google.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, willy@infradead.org, david@kernel.org,
	 ziy@nvidia.com, matthew.brost@intel.com,
	joshua.hahnjy@gmail.com,  rakie.kim@sk.com, byungchul@sk.com,
	gourry@gourry.net,  ying.huang@linux.alibaba.com,
	apopple@nvidia.com, lorenzo.stoakes@oracle.com,
	 baolin.wang@linux.alibaba.com, Liam.Howlett@oracle.com,
	npache@redhat.com,  ryan.roberts@arm.com, dev.jain@arm.com,
	baohua@kernel.org,  lance.yang@linux.dev, vbabka@suse.cz,
	rppt@kernel.org, mhocko@suse.com,  pfalcato@suse.de,
	kees@kernel.org, maddy@linux.ibm.com, npiggin@gmail.com,
	 mpe@ellerman.id.au, chleroy@kernel.org, linux-mm@kvack.org,
	 linuxppc-dev@lists.ozlabs.org, kvm@vger.kernel.org,
	 linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/1] mm: replace vma_start_write() with vma_start_write_killable()
Date: Tue, 10 Feb 2026 22:18:47 +0100	[thread overview]
Message-ID: <CAG48ez2zFfCO7RKhHKaATFge7DWzaTfO+Yta0y4_HXGHZAtkqw@mail.gmail.com> (raw)
In-Reply-To: <20260209220849.2126486-1-surenb@google.com>

On Mon, Feb 9, 2026 at 11:08 PM Suren Baghdasaryan <surenb@google.com> wrote:
> Now that we have vma_start_write_killable() we can replace most of the
> vma_start_write() calls with it, improving reaction time to the kill
> signal.
[...]
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index dbd48502ac24..3de7ab4f4cee 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
[...]
> @@ -1808,7 +1817,11 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
>                         break;
>                 }
>
> -               vma_start_write(vma);
> +               if (vma_start_write_killable(vma)) {
> +                       err = -EINTR;

Doesn't this need mpol_put(new)? Or less complicated, move the
vma_start_write_killable() up to somewhere above the mpol_dup() call.

> +                       break;
> +               }
> +
>                 new->home_node = home_node;
>                 err = mbind_range(&vmi, vma, &prev, start, end, new);
>                 mpol_put(new);
[...]
> diff --git a/mm/pagewalk.c b/mm/pagewalk.c
> index a94c401ab2cf..dc9f7a7709c6 100644
> --- a/mm/pagewalk.c
> +++ b/mm/pagewalk.c
> @@ -425,14 +425,13 @@ static inline void process_mm_walk_lock(struct mm_struct *mm,
>                 mmap_assert_write_locked(mm);
>  }
>
> -static inline void process_vma_walk_lock(struct vm_area_struct *vma,
> +static inline int process_vma_walk_lock(struct vm_area_struct *vma,
>                                          enum page_walk_lock walk_lock)
>  {
>  #ifdef CONFIG_PER_VMA_LOCK
>         switch (walk_lock) {
>         case PGWALK_WRLOCK:
> -               vma_start_write(vma);
> -               break;
> +               return vma_start_write_killable(vma);

There are two users of PGWALK_WRLOCK in arch/s390/mm/gmap.c code that
don't check pagewalk return values, have you checked that they are not
negatively affected by this new possible error return?

>         case PGWALK_WRLOCK_VERIFY:
>                 vma_assert_write_locked(vma);
>                 break;
[...]
> diff --git a/mm/vma.c b/mm/vma.c
> index be64f781a3aa..3cfb81b3b7cf 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -540,8 +540,12 @@ __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
>         if (new->vm_ops && new->vm_ops->open)
>                 new->vm_ops->open(new);
>
> -       vma_start_write(vma);
> -       vma_start_write(new);
> +       err = vma_start_write_killable(vma);
> +       if (err)
> +               goto out_fput;
> +       err = vma_start_write_killable(new);
> +       if (err)
> +               goto out_fput;

What about the new->vm_ops->open() call and the anon_vma_clone()
above? I don't think the error path properly undoes either. These
calls should probably be moved further up, so that the point of no
return in this function stays where it was.

>         init_vma_prep(&vp, vma);
>         vp.insert = new;
[...]
> @@ -1155,10 +1168,12 @@ int vma_expand(struct vma_merge_struct *vmg)
>         struct vm_area_struct *next = vmg->next;
>         bool remove_next = false;
>         vm_flags_t sticky_flags;
> -       int ret = 0;
> +       int ret;
>
>         mmap_assert_write_locked(vmg->mm);
> -       vma_start_write(target);
> +       ret = vma_start_write_killable(target);
> +       if (ret)
> +               return ret;
>
>         if (next && target != next && vmg->end == next->vm_end)
>                 remove_next = true;
> @@ -1186,17 +1201,19 @@ int vma_expand(struct vma_merge_struct *vmg)
>          * Note that, by convention, callers ignore OOM for this case, so
>          * we don't need to account for vmg->give_up_on_mm here.
>          */
> -       if (remove_next)
> +       if (remove_next) {
> +               ret = vma_start_write_killable(next);
> +               if (ret)
> +                       return ret;
>                 ret = dup_anon_vma(target, next, &anon_dup);
> +       }
>         if (!ret && vmg->copied_from)
>                 ret = dup_anon_vma(target, vmg->copied_from, &anon_dup);
>         if (ret)
>                 return ret;

nit: the control flow here is kinda chaotic, with some "if (ret)
return ret;" mixed with "if (!ret && ...) ret = ...;".

>
> -       if (remove_next) {
> -               vma_start_write(next);
> +       if (remove_next)
>                 vmg->__remove_next = true;
> -       }
>         if (commit_merge(vmg))
>                 goto nomem;
>
[...]
> @@ -2211,9 +2240,8 @@ int mm_take_all_locks(struct mm_struct *mm)
>          * is reached.
>          */
>         for_each_vma(vmi, vma) {
> -               if (signal_pending(current))
> +               if (vma_start_write_killable(vma))
>                         goto out_unlock;
> -               vma_start_write(vma);

nit: might want to keep the signal_pending() so that this can sort of
be interrupted by non-fatal signals, which seems to be the intention

>         }
>
>         vma_iter_init(&vmi, mm, 0);
> @@ -2549,7 +2577,9 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
>  #endif
>
>         /* Lock the VMA since it is modified after insertion into VMA tree */
> -       vma_start_write(vma);
> +       error = vma_start_write_killable(vma);
> +       if (error)
> +               goto free_iter_vma;

This seems way past the point of no return, we've already called the
->mmap() handler which I think means removing the VMA again would
require a ->close() call. The VMA should be locked further up if we
want to do it killably.

>         vma_iter_store_new(vmi, vma);
>         map->mm->map_count++;
>         vma_link_file(vma, map->hold_file_rmap_lock);
>


  parent reply	other threads:[~2026-02-10 21:19 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-09 22:08 Suren Baghdasaryan
2026-02-10  4:30 ` kernel test robot
2026-02-10  9:19 ` kernel test robot
2026-02-10 21:18 ` Jann Horn [this message]
2026-02-10 23:41   ` Suren Baghdasaryan
2026-02-11  3:55 ` Ritesh Harjani

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAG48ez2zFfCO7RKhHKaATFge7DWzaTfO+Yta0y4_HXGHZAtkqw@mail.gmail.com \
    --to=jannh@google.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=baohua@kernel.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=byungchul@sk.com \
    --cc=chleroy@kernel.org \
    --cc=david@kernel.org \
    --cc=dev.jain@arm.com \
    --cc=gourry@gourry.net \
    --cc=joshua.hahnjy@gmail.com \
    --cc=kees@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=lance.yang@linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=maddy@linux.ibm.com \
    --cc=matthew.brost@intel.com \
    --cc=mhocko@suse.com \
    --cc=mpe@ellerman.id.au \
    --cc=npache@redhat.com \
    --cc=npiggin@gmail.com \
    --cc=pfalcato@suse.de \
    --cc=rakie.kim@sk.com \
    --cc=rppt@kernel.org \
    --cc=ryan.roberts@arm.com \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=ying.huang@linux.alibaba.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox