From: Andrew Morton <akpm@linux-foundation.org>
To: Jann Horn <jannh@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Will Deacon <will@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH v2] mm: Fix memory ordering for mm_lock_seq and vm_lock_seq
Date: Mon, 24 Jul 2023 10:11:36 -0700
Message-ID: <20230724101136.4c58e8291961e87f6c5c1c79@linux-foundation.org>
In-Reply-To: <20230721225107.942336-1-jannh@google.com>

On Sat, 22 Jul 2023 00:51:07 +0200 Jann Horn <jannh@google.com> wrote:
> mm->mm_lock_seq effectively functions as a read/write lock; therefore it
> must be used with acquire/release semantics.
>
> A specific example is the interaction between userfaultfd_register() and
> lock_vma_under_rcu().
> userfaultfd_register() does the following from the point where it changes
> a VMA's flags to the point where concurrent readers are permitted again
> (in a simple scenario where only a single private VMA is accessed and no
> merging/splitting is involved):
>
>   userfaultfd_register
>   userfaultfd_set_vm_flags
>     vm_flags_reset
>       vma_start_write
>         down_write(&vma->vm_lock->lock)
>         vma->vm_lock_seq = mm_lock_seq [marks VMA as busy]
>         up_write(&vma->vm_lock->lock)
>       vm_flags_init
>         [sets VM_UFFD_* in __vm_flags]
>   vma->vm_userfaultfd_ctx.ctx = ctx
>   mmap_write_unlock
>     vma_end_write_all
>       WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1) [unlocks VMA]
>
> There are no memory barriers in between the __vm_flags update and the
> mm->mm_lock_seq update that unlocks the VMA, so the unlock can be reordered
> to above the `vm_flags_init()` call, which means from the perspective of a
> concurrent reader, a VMA can be marked as a userfaultfd VMA while it is not
> VMA-locked. That's bad, we definitely need a store-release for the unlock
> operation.
>
> The non-atomic write to vma->vm_lock_seq in vma_start_write() is mostly
> fine because all accesses to vma->vm_lock_seq that matter are always
> protected by the VMA lock. There is a racy read in vma_start_read() though
> that can tolerate false-positives, so we should be using WRITE_ONCE() to
> keep things tidy and data-race-free (including for KCSAN).
>
> On the other side, lock_vma_under_rcu() works as follows in the relevant
> region for locking and userfaultfd check:
>
>   lock_vma_under_rcu
>     vma_start_read
>       vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [early bailout]
>       down_read_trylock(&vma->vm_lock->lock)
>       vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [main check]
>     userfaultfd_armed
>       checks vma->vm_flags & __VM_UFFD_FLAGS
>
> Here, the interesting aspect is how far down the mm->mm_lock_seq read
> can be reordered - if this read is reordered down below the vma->vm_flags
> access, this could cause lock_vma_under_rcu() to partly operate on
> information that was read while the VMA was supposed to be locked.
> To prevent this kind of downwards bleeding of the mm->mm_lock_seq read, we
> need to read it with a load-acquire.
>
> Some of the comment wording is based on suggestions by Suren.
>
> BACKPORT WARNING: One of the functions changed by this patch (which I've
> written against Linus' tree) is vma_try_start_write(), but this function
> no longer exists in mm/mm-everything. I don't know whether the merged
> version of this patch will be ordered before or after the patch that
> removes vma_try_start_write(). If you're backporting this patch to a
> tree with vma_try_start_write(), make sure this patch changes that
> function.

I staged this patch as a hotfix, ahead of mm-unstable material.

The conflict is with Hugh's "mm: delete mmap_write_trylock() and
vma_try_start_write()"
(https://lkml.kernel.org/r/4e6db3d-e8e-73fb-1f2a-8de2dab2a87c@google.com)

I fixed the reject in the obvious way (deleted the function anyway),
but there's a possibility that the ordering issue you have addressed
will now be reintroduced by Hugh's series, so please let's review that.
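
For anyone reviewing that, the ordering requirement described in the quoted
changelog can be sketched in userspace with C11 atomics. This is only an
illustration under stated assumptions, not the kernel code: the toy_* names
and TOY_VM_UFFD are hypothetical, the per-VMA rwsem and mmap_lock are left
out entirely, and memory_order_release / memory_order_acquire are used as
rough analogues of the kernel's smp_store_release() and smp_load_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_VM_UFFD 0x1UL	/* hypothetical flag bit, for illustration only */

/* Hypothetical stand-ins for mm_struct and vm_area_struct. */
struct toy_mm {
	atomic_int mm_lock_seq;
};

struct toy_vma {
	struct toy_mm *mm;
	atomic_int vm_lock_seq;	/* equals mm_lock_seq while write-locked */
	unsigned long flags;	/* plain data guarded by the sequence scheme */
};

/* Writer: mark the VMA busy. The relaxed store plays the role of WRITE_ONCE(). */
static void toy_vma_start_write(struct toy_vma *vma)
{
	int seq = atomic_load_explicit(&vma->mm->mm_lock_seq,
				       memory_order_relaxed);

	atomic_store_explicit(&vma->vm_lock_seq, seq, memory_order_relaxed);
}

/*
 * Writer: unlock every VMA by bumping the mm-wide sequence number.
 * Release ordering keeps earlier stores (such as the flags update)
 * from being reordered after this store.
 */
static void toy_vma_end_write_all(struct toy_mm *mm)
{
	int seq = atomic_load_explicit(&mm->mm_lock_seq, memory_order_relaxed);

	atomic_store_explicit(&mm->mm_lock_seq, seq + 1, memory_order_release);
}

/*
 * Reader: returns true if the VMA is not write-locked.
 * Acquire ordering keeps later loads (such as reading the flags)
 * from being reordered before the mm_lock_seq load.
 */
static bool toy_vma_start_read(struct toy_vma *vma)
{
	int vseq = atomic_load_explicit(&vma->vm_lock_seq,
					memory_order_relaxed);
	int mseq = atomic_load_explicit(&vma->mm->mm_lock_seq,
					memory_order_acquire);

	return vseq != mseq;
}

/* Single-threaded walk through lock, update, unlock; returns final flags. */
static unsigned long toy_demo(void)
{
	struct toy_mm mm;
	struct toy_vma vma = { .mm = &mm, .flags = 0 };

	atomic_init(&mm.mm_lock_seq, 1);
	atomic_init(&vma.vm_lock_seq, 0);

	assert(toy_vma_start_read(&vma));	/* not write-locked yet */
	toy_vma_start_write(&vma);
	assert(!toy_vma_start_read(&vma));	/* readers must back off */
	vma.flags |= TOY_VM_UFFD;		/* update while "locked" */
	toy_vma_end_write_all(&mm);
	assert(toy_vma_start_read(&vma));	/* readers allowed again */
	return vma.flags;
}
```

The single-threaded toy_demo() only checks the sequence-number logic; the
release/acquire pairing matters once toy_vma_start_read() runs concurrently
with the writer, which this sketch does not exercise.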