From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins <hughd@google.com>, Maya Gokhale <gokhale2@llnl.gov>,
	Jerome Glisse <jglisse@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Martin Cracauer <cracauer@cons.org>,
	Denis Plotnikov <dplotnikov@virtuozzo.com>,
	Shaohua Li <shli@fb.com>, Andrea Arcangeli <aarcange@redhat.com>,
	Pavel Emelyanov <xemul@parallels.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Marty McFadden <mcfadden8@llnl.gov>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Mel Gorman <mgorman@suse.de>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH RFC 00/24] userfaultfd: write protection support
Date: Mon, 21 Jan 2019 15:33:21 +0100
Message-ID: <c2485a2d-25b3-2fc0-4902-01fa278be9c7@redhat.com>
In-Reply-To: <20190121075722.7945-1-peterx@redhat.com>

On 21.01.19 08:56, Peter Xu wrote:
> Hi,
> 
> This series implements initial write protection support for
> userfaultfd.  Currently only anonymous memory is supported; shmem and
> hugetlbfs are not supported yet.
> 
> For brevity, either "userfaultfd-wp" or "uffd-wp" might be used in
> later paragraphs.
> 
> The whole series can also be found at:
> 
>   https://github.com/xzpeter/linux/tree/uffd-wp-merged
> 
> Any comments would be very welcome.  Thanks.
> 
> Overview
> ====================
> 
> The uffd-wp work was initiated by Shaohua Li [1], and later
> continued by Andrea [2]. This series is based upon Andrea's latest
> userfaultfd tree, and it continues the work of both Shaohua and
> Andrea.  Many of the follow-up ideas come from Andrea too.
> 
> Besides the old MISSING register mode of userfaultfd, the new uffd-wp
> support provides another register mode called
> UFFDIO_REGISTER_MODE_WP, which is used to listen to write protection
> page faults rather than missing page faults; the two modes can also
> be registered together.  At the same time, the new feature also
> provides a new userfaultfd ioctl called UFFDIO_WRITEPROTECT which
> allows userspace to write protect a range of memory or restore the
> write permission of faulted pages.
> 
> Please refer to the document patch "userfaultfd: wp:
> UFFDIO_REGISTER_MODE_WP documentation update" for more information on
> the new interface and what it can do.
> 
> The major workflow of a uffd-wp program should be:
> 
>   1. Register a memory region with WP mode using UFFDIO_REGISTER_MODE_WP
> 
>   2. Write protect part of the whole registered region using
>      UFFDIO_WRITEPROTECT, passing in UFFDIO_WRITEPROTECT_MODE_WP to
>      show that we want to write protect the range.
> 
>   3. Start a worker thread that modifies the protected pages, while
>      a fault handling thread listens for UFFD messages.
> 
>   4. When a write to the protected range is detected, a page fault
>      happens, and a UFFD message is generated and reported to the
>      page fault handling thread.
> 
>   5. The page fault handler thread resolves the page fault using the
>      new UFFDIO_WRITEPROTECT ioctl, but this time passing in
>      !UFFDIO_WRITEPROTECT_MODE_WP instead, showing that we want to
>      restore the write permission.  Before this operation, the fault
>      handler thread can do anything it wants, e.g., dump the page to
>      persistent storage.
> 
>   6. The worker thread will continue running with the correctly
>      applied write permission from step 5.
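
For readers following along, the six steps above might look roughly
like the sketch below from userspace.  This is only an illustration,
assuming the names found in <linux/userfaultfd.h> of later mainline
kernels (UFFD_FEATURE_PAGEFAULT_FLAG_WP, struct uffdio_writeprotect);
the interface in this RFC may differ in detail.  The helper names
(uffd_open_wp, uffd_wp_range, uffd_handle_one) are made up for this
example and do not come from the series.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Step 1: create a userfaultfd and register [addr, addr + len) in WP
 * mode.  Returns the fd, or -1 if the kernel or sandbox refuses. */
int uffd_open_wp(void *addr, size_t len)
{
	struct uffdio_api api = {
		.api = UFFD_API,
		.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP,
	};
	struct uffdio_register reg = {
		.range = { .start = (uintptr_t)addr, .len = len },
		.mode = UFFDIO_REGISTER_MODE_WP,
	};
	int uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);

	if (uffd < 0)
		return -1;
	if (ioctl(uffd, UFFDIO_API, &api) < 0 ||
	    ioctl(uffd, UFFDIO_REGISTER, &reg) < 0) {
		close(uffd);
		return -1;
	}
	return uffd;
}

/* Steps 2 and 5: protect != 0 write protects the range; protect == 0
 * restores write permission (and wakes the faulting thread). */
int uffd_wp_range(int uffd, void *addr, size_t len, int protect)
{
	struct uffdio_writeprotect wp = {
		.range = { .start = (uintptr_t)addr, .len = len },
		.mode = protect ? UFFDIO_WRITEPROTECT_MODE_WP : 0,
	};

	return ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
}

/* Steps 4 and 5: read one message and resolve it if it is a
 * write-protect fault.  The fd is non-blocking, so a real handler
 * would poll() before calling this.  Returns 0 on success, -1 if no
 * suitable message could be read or the ioctl failed. */
int uffd_handle_one(int uffd, size_t page_size)
{
	struct uffd_msg msg;
	void *page;

	if (read(uffd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
		return -1;
	if (msg.event != UFFD_EVENT_PAGEFAULT ||
	    !(msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP))
		return -1;

	page = (void *)(uintptr_t)(msg.arg.pagefault.address &
				   ~((__u64)page_size - 1));
	/* The page is still write protected here, so this is the place
	 * to copy it out (e.g. into a snapshot) before unprotecting. */
	return uffd_wp_range(uffd, page, page_size, 0);
}
```

A caller would mmap() an anonymous region, pass it to uffd_open_wp(),
protect it with uffd_wp_range(..., 1), and then run uffd_handle_one()
from the fault handling thread whenever poll() reports the fd readable.
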
> 
> Currently there are already two projects that are based on this new
> userfaultfd feature.
> 
> QEMU Live Snapshot: The project provides a way to allow the QEMU
>                     hypervisor to take snapshots of VMs without
>                     stopping them [3].
> 
> LLNL umap library:  The project provides a mmap-like interface and
>                     "allow to have an application specific buffer of
>                     pages cached from a large file, i.e. out-of-core
>                     execution using memory map" [4][5].
> 
> Before posting the patchset, this series was smoke tested against QEMU
> live snapshot and the LLNL umap library (by doing parallel quicksort
> using 128 sorting threads + 80 uffd servicing threads).  My sincere
> thanks to Marty McFadden and Denis Plotnikov for the help along the
> way.
> 
> Implementation
> ==============
> 
> Patch 1-4: The whole uffd-wp requires the kernel page fault path to
>            be able to retry more than once.  In the previous work
>            started by Shaohua, a new fault flag
>            FAULT_FLAG_ALLOW_UFFD_RETRY was introduced for this [6].
>            However, in this series we have dropped that patch;
>            instead the whole work is based on the recent series
>            "[PATCH RFC v3 0/4] mm: some enhancements to the page
>            fault mechanism" [7], which removes the assumption that
>            VM_FAULT_RETRY can only happen once.  These four patches
>            are identical to the ones in that series and are simply
>            picked up here.  Please refer to the cover letter [7] for
>            more information.  Further discussion upstream shows that
>            this work could even benefit existing use cases [8], so
>            please help judge whether patches 1-4 can be considered
>            for acceptance even earlier than the rest of the series.
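
To make the "retry only once" assumption concrete, here is a toy
model of a fault loop.  It is not kernel code: every name below
(demo_handle_fault, demo_fault_loop, max_retries) is invented for
illustration, and the cap merely stands in for the old behaviour
where the fault path would not accept a second VM_FAULT_RETRY.

```c
enum demo_fault_result { DEMO_FAULT_DONE, DEMO_FAULT_RETRY };

/* Hypothetical handler: asks for a retry while *waits_left > 0,
 * e.g. because userspace has not resolved the fault yet. */
enum demo_fault_result demo_handle_fault(int *waits_left)
{
	if (*waits_left > 0) {
		(*waits_left)--;
		return DEMO_FAULT_RETRY;
	}
	return DEMO_FAULT_DONE;
}

/* Calls the handler until it completes.  Returns the number of
 * handler invocations, or -1 if the handler asked for more than
 * max_retries retries. */
int demo_fault_loop(int waits, int max_retries)
{
	int calls = 0;

	for (;;) {
		calls++;
		if (demo_handle_fault(&waits) == DEMO_FAULT_DONE)
			return calls;
		if (calls > max_retries)
			return -1;
	}
}
```

With max_retries set to 1, a fault that needs two retries cannot
complete; with a large cap the same fault finishes on the third call.
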
> 
> Patch 5-21:   Implements the uffd-wp logic.  To avoid collision with
>               existing write protections (e.g., a private anonymous
>               page can be write protected if it was shared between
>               multiple processes), a new PTE bit (_PAGE_UFFD_WP) was
>               introduced to explicitly mark a PTE as userfault
>               write-protected.  A similar bit was also used in the
>               swap/migration entry (_PAGE_SWP_UFFD_WP) to make sure
>               that even if the pages were swapped or migrated, the
>               uffd-wp tracking information won't be lost.  When
>               resolving a page fault, we'll do a page copy beforehand
>               if the page was COWed to make sure we won't corrupt any
>               shared pages.  Please see the individual patches for
>               more details.
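
The reason for a dedicated bit can be shown with a small model: a
cleared write bit alone cannot tell a COW-protected page from a
uffd-protected one.  The sketch below uses made-up bit positions and
names (DEMO_PTE_WRITE, DEMO_PTE_UFFD_WP, demo_*); they are not the
kernel's values or helpers, only an illustration of the idea.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define DEMO_PTE_WRITE   (UINT64_C(1) << 1)
#define DEMO_PTE_UFFD_WP (UINT64_C(1) << 10)

enum demo_wp_fault {
	DEMO_FAULT_NONE,    /* PTE is writable, no fault */
	DEMO_FAULT_COW,     /* ordinary write protection (e.g. COW) */
	DEMO_FAULT_UFFD_WP, /* must be reported to userfaultfd */
};

/* Mark a PTE as userfault write-protected: set the tracking bit and
 * drop the write permission. */
uint64_t demo_pte_mk_uffd_wp(uint64_t pte)
{
	return (pte | DEMO_PTE_UFFD_WP) & ~DEMO_PTE_WRITE;
}

/* Clear only the tracking bit; whether the PTE becomes writable again
 * is a separate decision (the page may still need a COW copy). */
uint64_t demo_pte_clear_uffd_wp(uint64_t pte)
{
	return pte & ~DEMO_PTE_UFFD_WP;
}

/* Decide what a write access to this PTE should trigger. */
enum demo_wp_fault demo_classify_write_fault(uint64_t pte)
{
	if (pte & DEMO_PTE_WRITE)
		return DEMO_FAULT_NONE;
	if (pte & DEMO_PTE_UFFD_WP)
		return DEMO_FAULT_UFFD_WP;
	return DEMO_FAULT_COW;
}
```

In this model, clearing the tracking bit on a page that is still
read-only leaves an ordinary COW fault, which matches the cover
letter's point that the two kinds of protection must not be confused.
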
> 
> Patch 22:     Documentation update for uffd-wp
> 
> Patch 23,24:  Uffd-wp selftests
> 
> TODO
> =============
> 
> - hugetlbfs/shmem support
> - performance
> - more architectures
> - ...
> 
> References
> ==========
> 
> [1] https://lwn.net/Articles/666187/
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/log/?h=userfault
> [3] https://github.com/denis-plotnikov/qemu/commits/background-snapshot-kvm
> [4] https://github.com/LLNL/umap
> [5] https://llnl-umap.readthedocs.io/en/develop/
> [6] https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?h=userfault&id=b245ecf6cf59156966f3da6e6b674f6695a5ffa5
> [7] https://lkml.org/lkml/2018/11/21/370
> [8] https://lkml.org/lkml/2018/12/30/64
> 
> Andrea Arcangeli (5):
>   userfaultfd: wp: add the writeprotect API to userfaultfd ioctl
>   userfaultfd: wp: hook userfault handler to write protection fault
>   userfaultfd: wp: add WP pagetable tracking to x86
>   userfaultfd: wp: userfaultfd_pte/huge_pmd_wp() helpers
>   userfaultfd: wp: add UFFDIO_COPY_MODE_WP
> 
> Martin Cracauer (1):
>   userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update
> 
> Peter Xu (15):
>   mm: gup: rename "nonblocking" to "locked" where proper
>   mm: userfault: return VM_FAULT_RETRY on signals
>   mm: allow VM_FAULT_RETRY for multiple times
>   mm: gup: allow VM_FAULT_RETRY for multiple times
>   mm: merge parameters for change_protection()
>   userfaultfd: wp: apply _PAGE_UFFD_WP bit
>   mm: export wp_page_copy()
>   userfaultfd: wp: handle COW properly for uffd-wp
>   userfaultfd: wp: drop _PAGE_UFFD_WP properly when fork
>   userfaultfd: wp: add pmd_swp_*uffd_wp() helpers
>   userfaultfd: wp: support swap and page migration
>   userfaultfd: wp: don't wake up when doing write protect
>   khugepaged: skip collapse if uffd-wp detected
>   userfaultfd: selftests: refactor statistics
>   userfaultfd: selftests: add write-protect test
> 
> Shaohua Li (3):
>   userfaultfd: wp: add helper for writeprotect check
>   userfaultfd: wp: support write protection for userfault vma range
>   userfaultfd: wp: enabled write protection in userfaultfd API
> 
>  Documentation/admin-guide/mm/userfaultfd.rst |  51 +++++
>  arch/alpha/mm/fault.c                        |   4 +-
>  arch/arc/mm/fault.c                          |  12 +-
>  arch/arm/mm/fault.c                          |  17 +-
>  arch/arm64/mm/fault.c                        |  11 +-
>  arch/hexagon/mm/vm_fault.c                   |   3 +-
>  arch/ia64/mm/fault.c                         |   3 +-
>  arch/m68k/mm/fault.c                         |   5 +-
>  arch/microblaze/mm/fault.c                   |   3 +-
>  arch/mips/mm/fault.c                         |   3 +-
>  arch/nds32/mm/fault.c                        |   7 +-
>  arch/nios2/mm/fault.c                        |   5 +-
>  arch/openrisc/mm/fault.c                     |   3 +-
>  arch/parisc/mm/fault.c                       |   4 +-
>  arch/powerpc/mm/fault.c                      |   9 +-
>  arch/riscv/mm/fault.c                        |   9 +-
>  arch/s390/mm/fault.c                         |  14 +-
>  arch/sh/mm/fault.c                           |   5 +-
>  arch/sparc/mm/fault_32.c                     |   4 +-
>  arch/sparc/mm/fault_64.c                     |   4 +-
>  arch/um/kernel/trap.c                        |   6 +-
>  arch/unicore32/mm/fault.c                    |  10 +-
>  arch/x86/Kconfig                             |   1 +
>  arch/x86/include/asm/pgtable.h               |  67 ++++++
>  arch/x86/include/asm/pgtable_64.h            |   8 +-
>  arch/x86/include/asm/pgtable_types.h         |  11 +-
>  arch/x86/mm/fault.c                          |  13 +-
>  arch/xtensa/mm/fault.c                       |   4 +-
>  fs/userfaultfd.c                             | 110 +++++----
>  include/asm-generic/pgtable.h                |   1 +
>  include/asm-generic/pgtable_uffd.h           |  66 ++++++
>  include/linux/huge_mm.h                      |   2 +-
>  include/linux/mm.h                           |  21 +-
>  include/linux/swapops.h                      |   2 +
>  include/linux/userfaultfd_k.h                |  41 +++-
>  include/trace/events/huge_memory.h           |   1 +
>  include/uapi/linux/userfaultfd.h             |  28 ++-
>  init/Kconfig                                 |   5 +
>  mm/gup.c                                     |  61 ++---
>  mm/huge_memory.c                             |  28 ++-
>  mm/hugetlb.c                                 |   8 +-
>  mm/khugepaged.c                              |  23 ++
>  mm/memory.c                                  |  28 ++-
>  mm/mempolicy.c                               |   2 +-
>  mm/migrate.c                                 |   7 +
>  mm/mprotect.c                                |  99 +++++++--
>  mm/rmap.c                                    |   6 +
>  mm/userfaultfd.c                             |  92 +++++++-
>  tools/testing/selftests/vm/userfaultfd.c     | 222 ++++++++++++++-----
>  49 files changed, 898 insertions(+), 251 deletions(-)
>  create mode 100644 include/asm-generic/pgtable_uffd.h
> 

Does this series fix the "false positives" case I experienced on early
prototypes of uffd-wp? (getting notified about a write access although
it was not a write access?)

-- 

Thanks,

David / dhildenb
