From: Peter Xu <peterx@redhat.com>
To: Mike Rapoport <rppt@linux.ibm.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Hugh Dickins <hughd@google.com>, Maya Gokhale <gokhale2@llnl.gov>,
Jerome Glisse <jglisse@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Martin Cracauer <cracauer@cons.org>,
Denis Plotnikov <dplotnikov@virtuozzo.com>,
Shaohua Li <shli@fb.com>, Andrea Arcangeli <aarcange@redhat.com>,
Pavel Emelyanov <xemul@parallels.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Marty McFadden <mcfadden8@llnl.gov>,
Mike Rapoport <rppt@linux.vnet.ibm.com>,
Mel Gorman <mgorman@suse.de>,
"Kirill A . Shutemov" <kirill@shutemov.name>,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH RFC 07/24] userfaultfd: wp: add the writeprotect API to userfaultfd ioctl
Date: Thu, 24 Jan 2019 12:56:15 +0800 [thread overview]
Message-ID: <20190124045551.GD18231@xz-x1> (raw)
In-Reply-To: <20190121104232.GA26461@rapoport-lnx>
On Mon, Jan 21, 2019 at 12:42:33PM +0200, Mike Rapoport wrote:
[...]
> > @@ -1343,7 +1344,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
> >
> > /* check not compatible vmas */
> > ret = -EINVAL;
> > - if (!vma_can_userfault(cur))
> > + if (!vma_can_userfault(cur, vm_flags))
> > goto out_unlock;
> >
> > /*
> > @@ -1371,6 +1372,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
> > if (end & (vma_hpagesize - 1))
> > goto out_unlock;
> > }
> > + if ((vm_flags & VM_UFFD_WP) && !(cur->vm_flags & VM_WRITE))
> > + goto out_unlock;
>
> This is problematic for the non-cooperative use-case. Way may still want to
> monitor a read-only area because it may eventually become writable, e.g. if
> the monitored process runs mprotect().
Firstly I think I should be able to change it to VM_MAYWRITE which
seems to suite more.
Meanwhile, frankly speaking I didn't think a lot about how to nest the
usages of uffd-wp and mprotect(), so far I was only considering it as
a replacement of mprotect(). But indeed it can happen that the
monitored process calls mprotect(). Is there an existing scenario of
such usage?
The problem is I'm uncertain about whether this scenario can work
after all. Say, the monitor process A write protected process B's
page P, so logically A will definitely receive a message before B
writes to page P. However here if we allow process B to do
mprotect(PROT_WRITE) upon page P and grant write permission to it on
its own, then A will not be able to capture the write operation at
all? Then I don't know how it can work here... or whether we should
fail the mprotect() at least upon uffd-wp ranges?
> Particularity, for using uffd-wp as a replacement for soft-dirty would
> require it.
>
> >
> > /*
> > * Check that this vma isn't already owned by a
> > @@ -1400,7 +1403,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
> > do {
> > cond_resched();
> >
> > - BUG_ON(!vma_can_userfault(vma));
> > + BUG_ON(!vma_can_userfault(vma, vm_flags));
> > BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
> > vma->vm_userfaultfd_ctx.ctx != ctx);
> > WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
> > @@ -1535,7 +1538,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
> > * provides for more strict behavior to notice
> > * unregistration errors.
> > */
> > - if (!vma_can_userfault(cur))
> > + if (!vma_can_userfault(cur, cur->vm_flags))
> > goto out_unlock;
> >
> > found = true;
> > @@ -1549,7 +1552,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
> > do {
> > cond_resched();
> >
> > - BUG_ON(!vma_can_userfault(vma));
> > + BUG_ON(!vma_can_userfault(vma, vma->vm_flags));
> > WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
> >
> > /*
> > @@ -1760,6 +1763,46 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
> > return ret;
> > }
> >
> > +static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
> > + unsigned long arg)
> > +{
> > + int ret;
> > + struct uffdio_writeprotect uffdio_wp;
> > + struct uffdio_writeprotect __user *user_uffdio_wp;
> > + struct userfaultfd_wake_range range;
> > +
>
> In the non-cooperative mode the userfaultfd_writeprotect() may race with VM
> layout changes, pretty much as uffdio_copy() [1]. My solution for uffdio_copy()
> was to return -EAGAIN if such race is encountered. I think the same would
> apply here.
I tried to understand the problem at [1] but failed... could you help
to clarify it a bit more?
I'm quoting some of the discussions from [1] here directly between you
and Pavel:
> Since the monitor cannot assume that the process will access all its memory
> it has to copy some pages "in the background". A simple monitor may look
> like:
>
> for (;;) {
> wait_for_uffd_events(timeout);
> handle_uffd_events();
> uffd_copy(some not faulted pages);
> }
>
> Then, if the "background" uffd_copy() races with fork, the pages we've
> copied may be already present in parent's mappings before the call to
> copy_page_range() and may be not.
>
> If the pages were not present, uffd_copy'ing them again to the child's
> memory would be ok.
>
> But if uffd_copy() was first to catch mmap_sem, and we would uffd_copy them
> again, child process will get memory corruption.
Here I don't understand why the child process will get memory
corruption if uffd_copy() caught the mmap_sem first.
If it did it, then IMHO when uffd_copy() copies the page again it'll
simply get a -EEXIST showing that the page has already been copied.
Could you explain on why there will be a data corruption?
Thanks in advance,
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df2cc96e77011cf7989208b206da9817e0321028
>
--
Peter Xu
next prev parent reply other threads:[~2019-01-24 4:56 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-21 7:56 [PATCH RFC 00/24] userfaultfd: write protection support Peter Xu
2019-01-21 7:56 ` [PATCH RFC 01/24] mm: gup: rename "nonblocking" to "locked" where proper Peter Xu
2019-01-21 10:20 ` Mike Rapoport
2019-01-21 7:57 ` [PATCH RFC 02/24] mm: userfault: return VM_FAULT_RETRY on signals Peter Xu
2019-01-21 15:40 ` Jerome Glisse
2019-01-22 6:10 ` Peter Xu
2019-01-21 7:57 ` [PATCH RFC 03/24] mm: allow VM_FAULT_RETRY for multiple times Peter Xu
2019-01-21 15:55 ` Jerome Glisse
2019-01-22 8:22 ` Peter Xu
2019-01-22 16:53 ` Jerome Glisse
2019-01-23 2:12 ` Peter Xu
2019-01-23 2:39 ` Jerome Glisse
2019-01-24 5:45 ` Peter Xu
2019-01-21 7:57 ` [PATCH RFC 04/24] mm: gup: " Peter Xu
2019-01-21 16:24 ` Jerome Glisse
2019-01-24 7:05 ` Peter Xu
2019-01-24 15:34 ` Jerome Glisse
2019-01-25 2:49 ` Peter Xu
2019-01-21 7:57 ` [PATCH RFC 05/24] userfaultfd: wp: add helper for writeprotect check Peter Xu
2019-01-21 10:23 ` Mike Rapoport
2019-01-22 8:31 ` Peter Xu
2019-01-21 7:57 ` [PATCH RFC 06/24] userfaultfd: wp: support write protection for userfault vma range Peter Xu
2019-01-21 10:20 ` Mike Rapoport
2019-01-22 8:55 ` Peter Xu
2019-01-21 14:05 ` Jerome Glisse
2019-01-22 9:39 ` Peter Xu
2019-01-22 17:02 ` Jerome Glisse
2019-01-23 2:17 ` Peter Xu
2019-01-23 2:43 ` Jerome Glisse
2019-01-24 5:47 ` Peter Xu
2019-01-21 7:57 ` [PATCH RFC 07/24] userfaultfd: wp: add the writeprotect API to userfaultfd ioctl Peter Xu
2019-01-21 10:42 ` Mike Rapoport
2019-01-24 4:56 ` Peter Xu [this message]
2019-01-24 7:27 ` Mike Rapoport
2019-01-24 9:28 ` Peter Xu
2019-01-25 7:54 ` Mike Rapoport
2019-01-25 10:12 ` Peter Xu
2019-01-21 7:57 ` [PATCH RFC 08/24] userfaultfd: wp: hook userfault handler to write protection fault Peter Xu
2019-01-21 7:57 ` [PATCH RFC 09/24] userfaultfd: wp: enabled write protection in userfaultfd API Peter Xu
2019-01-21 7:57 ` [PATCH RFC 10/24] userfaultfd: wp: add WP pagetable tracking to x86 Peter Xu
2019-01-21 15:09 ` Jerome Glisse
2019-01-24 5:16 ` Peter Xu
2019-01-24 15:40 ` Jerome Glisse
2019-01-25 3:30 ` Peter Xu
2019-01-21 7:57 ` [PATCH RFC 11/24] userfaultfd: wp: userfaultfd_pte/huge_pmd_wp() helpers Peter Xu
2019-01-21 7:57 ` [PATCH RFC 12/24] userfaultfd: wp: add UFFDIO_COPY_MODE_WP Peter Xu
2019-01-21 7:57 ` [PATCH RFC 13/24] mm: merge parameters for change_protection() Peter Xu
2019-01-21 13:54 ` Jerome Glisse
2019-01-24 5:22 ` Peter Xu
2019-01-21 7:57 ` [PATCH RFC 14/24] userfaultfd: wp: apply _PAGE_UFFD_WP bit Peter Xu
2019-01-21 7:57 ` [PATCH RFC 15/24] mm: export wp_page_copy() Peter Xu
2019-01-21 7:57 ` [PATCH RFC 16/24] userfaultfd: wp: handle COW properly for uffd-wp Peter Xu
2019-01-21 7:57 ` [PATCH RFC 17/24] userfaultfd: wp: drop _PAGE_UFFD_WP properly when fork Peter Xu
2019-01-21 7:57 ` [PATCH RFC 18/24] userfaultfd: wp: add pmd_swp_*uffd_wp() helpers Peter Xu
2019-01-21 7:57 ` [PATCH RFC 19/24] userfaultfd: wp: support swap and page migration Peter Xu
2019-01-21 7:57 ` [PATCH RFC 20/24] userfaultfd: wp: don't wake up when doing write protect Peter Xu
2019-01-21 11:10 ` Mike Rapoport
2019-01-24 5:36 ` Peter Xu
2019-01-21 7:57 ` [PATCH RFC 21/24] khugepaged: skip collapse if uffd-wp detected Peter Xu
2019-01-21 7:57 ` [PATCH RFC 22/24] userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update Peter Xu
2019-01-21 7:57 ` [PATCH RFC 23/24] userfaultfd: selftests: refactor statistics Peter Xu
2019-01-21 7:57 ` [PATCH RFC 24/24] userfaultfd: selftests: add write-protect test Peter Xu
2019-01-21 14:33 ` [PATCH RFC 00/24] userfaultfd: write protection support David Hildenbrand
2019-01-22 3:18 ` Peter Xu
2019-01-22 8:59 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190124045551.GD18231@xz-x1 \
--to=peterx@redhat.com \
--cc=aarcange@redhat.com \
--cc=cracauer@cons.org \
--cc=dgilbert@redhat.com \
--cc=dplotnikov@virtuozzo.com \
--cc=gokhale2@llnl.gov \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=jglisse@redhat.com \
--cc=kirill@shutemov.name \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mcfadden8@llnl.gov \
--cc=mgorman@suse.de \
--cc=mike.kravetz@oracle.com \
--cc=rppt@linux.ibm.com \
--cc=rppt@linux.vnet.ibm.com \
--cc=shli@fb.com \
--cc=xemul@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox