Re: [PATCH v2] mm: Fix memory ordering for mm_lock_seq and vm_lock_seq

From: Andrew Morton <akpm@linux-foundation.org>
To: Jann Horn <jannh@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Will Deacon <will@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH v2] mm: Fix memory ordering for mm_lock_seq and vm_lock_seq
Date: Mon, 24 Jul 2023 10:11:36 -0700
Message-ID: <20230724101136.4c58e8291961e87f6c5c1c79@linux-foundation.org>
In-Reply-To: <20230721225107.942336-1-jannh@google.com>

On Sat, 22 Jul 2023 00:51:07 +0200 Jann Horn <jannh@google.com> wrote:

> mm->mm_lock_seq effectively functions as a read/write lock; therefore it
> must be used with acquire/release semantics.
> 
> A specific example is the interaction between userfaultfd_register() and
> lock_vma_under_rcu().
> userfaultfd_register() does the following from the point where it changes
> a VMA's flags to the point where concurrent readers are permitted again
> (in a simple scenario where only a single private VMA is accessed and no
> merging/splitting is involved):
> 
> userfaultfd_register
>   userfaultfd_set_vm_flags
>     vm_flags_reset
>       vma_start_write
>         down_write(&vma->vm_lock->lock)
>         vma->vm_lock_seq = mm_lock_seq [marks VMA as busy]
>         up_write(&vma->vm_lock->lock)
>       vm_flags_init
>         [sets VM_UFFD_* in __vm_flags]
>   vma->vm_userfaultfd_ctx.ctx = ctx
>   mmap_write_unlock
>     vma_end_write_all
>       WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1) [unlocks VMA]
> 
> There are no memory barriers in between the __vm_flags update and the
> mm->mm_lock_seq update that unlocks the VMA, so the unlock can be reordered
> to above the `vm_flags_init()` call, which means from the perspective of a
> concurrent reader, a VMA can be marked as a userfaultfd VMA while it is not
> VMA-locked. That's bad, we definitely need a store-release for the unlock
> operation.
> 
> The non-atomic write to vma->vm_lock_seq in vma_start_write() is mostly
> fine because all accesses to vma->vm_lock_seq that matter are always
> protected by the VMA lock. There is a racy read in vma_start_read() though
> that can tolerate false-positives, so we should be using WRITE_ONCE() to
> keep things tidy and data-race-free (including for KCSAN).
> 
> On the other side, lock_vma_under_rcu() works as follows in the relevant
> region for locking and userfaultfd check:
> 
> lock_vma_under_rcu
>   vma_start_read
>     vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [early bailout]
>     down_read_trylock(&vma->vm_lock->lock)
>     vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [main check]
>   userfaultfd_armed
>     checks vma->vm_flags & __VM_UFFD_FLAGS
> 
> Here, the interesting aspect is how far down the mm->mm_lock_seq read
> can be reordered - if this read is reordered down below the vma->vm_flags
> access, this could cause lock_vma_under_rcu() to partly operate on
> information that was read while the VMA was supposed to be locked.
> To prevent this kind of downwards bleeding of the mm->mm_lock_seq read, we
> need to read it with a load-acquire.
> 
> Some of the comment wording is based on suggestions by Suren.
> 
> BACKPORT WARNING: One of the functions changed by this patch (which I've
> written against Linus' tree) is vma_try_start_write(), but this function
> no longer exists in mm/mm-everything. I don't know whether the merged
> version of this patch will be ordered before or after the patch that
> removes vma_try_start_write(). If you're backporting this patch to a
> tree with vma_try_start_write(), make sure this patch changes that
> function.

I staged this patch as a hotfix, ahead of mm-unstable material.

The conflict is with Hugh's "mm: delete mmap_write_trylock() and
vma_try_start_write()"
(https://lkml.kernel.org/r/4e6db3d-e8e-73fb-1f2a-8de2dab2a87c@google.com)

I fixed the reject in the obvious way (deleted the function anyway),
but there's a possibility that the ordering issue you have addressed
will now be reintroduced by Hugh's series, so please let's review that.
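
For anyone reviewing the ordering, here is a minimal userspace sketch of
the release/acquire pairing described in the quoted changelog. It uses
C11 <stdatomic.h> rather than the kernel primitives, and every toy_*
name is hypothetical: this is not the kernel code and not the patch
itself. It also leaves out the per-VMA rwsem entirely, so it only
illustrates the sequence-count comparison and where the barriers sit.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for mm_struct / vm_area_struct (assumption). */
struct toy_mm {
	atomic_int mm_lock_seq;
};

struct toy_vma {
	struct toy_mm *mm;
	atomic_int vm_lock_seq;
	unsigned long flags;	/* plain data guarded by the sequence scheme */
};

/*
 * Writer side: mark the VMA busy by copying the current mm sequence.
 * The relaxed store plays the role WRITE_ONCE() has in the changelog:
 * it keeps the racy reader data-race-free but adds no ordering.
 */
static void toy_vma_start_write(struct toy_vma *vma)
{
	int seq = atomic_load_explicit(&vma->mm->mm_lock_seq,
				       memory_order_relaxed);

	atomic_store_explicit(&vma->vm_lock_seq, seq, memory_order_relaxed);
}

/*
 * Writer side: unlock every VMA by bumping the mm sequence.
 * The store-release keeps earlier writes (e.g. to vma->flags) from
 * being reordered after the unlock.
 */
static void toy_vma_end_write_all(struct toy_mm *mm)
{
	int seq = atomic_load_explicit(&mm->mm_lock_seq,
				       memory_order_relaxed);

	atomic_store_explicit(&mm->mm_lock_seq, seq + 1,
			      memory_order_release);
}

/*
 * Reader side: returns true if the VMA is not write-locked.
 * The load-acquire keeps later reads (e.g. of vma->flags) from being
 * reordered before the sequence check.
 */
static bool toy_vma_start_read(struct toy_vma *vma)
{
	int vseq = atomic_load_explicit(&vma->vm_lock_seq,
					memory_order_relaxed);
	int mseq = atomic_load_explicit(&vma->mm->mm_lock_seq,
					memory_order_acquire);

	return vseq != mseq;
}
```

A reader that gets true from toy_vma_start_read() after the writer's
toy_vma_end_write_all() is guaranteed to see the writer's earlier
update to flags. With a plain or relaxed store on the unlock, or a
relaxed load on the check, that guarantee is lost.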

Thread overview: 4+ messages
2023-07-21 22:51 Jann Horn
2023-07-21 23:26 ` Suren Baghdasaryan
2023-07-24 17:11 ` Andrew Morton [this message]
2023-07-24 17:29   ` Jann Horn
