From: Dan Williams <dan.j.williams@intel.com>
To: linux-nvdimm@lists.01.org
Cc: Jan Kara <jack@suse.cz>, Dave Chinner <david@fromorbit.com>,
"J. Bruce Fields" <bfields@fieldses.org>,
linux-mm@kvack.org, Sean Hefty <sean.hefty@intel.com>,
Jeff Layton <jlayton@poochiereds.net>,
Marek Szyprowski <m.szyprowski@samsung.com>,
Ashok Raj <ashok.raj@intel.com>,
"Darrick J. Wong" <darrick.wong@oracle.com>,
linux-rdma@vger.kernel.org, Joerg Roedel <joro@8bytes.org>,
Doug Ledford <dledford@redhat.com>,
Christoph Hellwig <hch@lst.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jeff Moyer <jmoyer@redhat.com>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
Hal Rosenstock <hal.rosenstock@gmail.com>,
Arnd Bergmann <arnd@arndb.de>,
Robin Murphy <robin.murphy@arm.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Andy Lutomirski <luto@kernel.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
linux-xfs@vger.kernel.org, iommu@lists.linux-foundation.org,
linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
David Woodhouse <dwmw2@infradead.org>
Subject: [PATCH v8 00/14] MAP_DIRECT for DAX RDMA and userspace flush
Date: Tue, 10 Oct 2017 07:48:55 -0700
Message-ID: <150764693502.16882.15848797003793552156.stgit@dwillia2-desk3.amr.corp.intel.com>
Changes since v7 [1]:
* Fix IOVA reuse race by leaving the dma scatterlist mapped until
unregistration time. Use iommu_unmap() in ib_umem_lease_break() to
force-invalidate the ibverbs memory registration. (David Woodhouse)
* Introduce iomap_can_allocate() as a way to check if any layouts are
present in the mmap write-fault path to prevent block map changes, and
start the lease break process when an allocating write-fault occurs.
This also removes the i_mapdcount bloat of 'struct inode' from v7.
(Dave Chinner)
* Provide generic_map_direct_{open,close,lease} to clean up the
filesystem wiring to implement MAP_DIRECT support (Dave Chinner)
* Abandon (defer to a potential new fcntl()) support for using
MAP_DIRECT on non-DAX files. With this change we can validate the
inode is MAP_DIRECT capable just once at mmap time rather than every
fault. (Dave Chinner)
* Arrange for lease_direct leases to also wait for the
/proc/sys/fs/lease-break-time period before calling break_fn. For
example, allow the lease-holder time to quiesce RDMA operations before
the iommu starts throwing io-faults.
* Switch intel-iommu to use iommu_num_sg_pages().
[1]: https://lists.01.org/pipermail/linux-nvdimm/2017-October/012707.html
---
MAP_DIRECT is a mechanism that allows an application to establish a
mapping where the kernel will not change the block-map, or otherwise
dirty the block-map metadata of a file without notification. It supports
a "flush from userspace" model where persistent memory applications can
bypass the overhead of ongoing coordination of writes with the
filesystem, and it provides safety to RDMA operations involving DAX
mappings.
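
For illustration only, the userspace side of this model might look like
the sketch below. It is not taken from the patches. The helper name
map_direct_or_shared() is hypothetical, and the numeric value of
MAP_DIRECT is left as a parameter because this cover letter does not
state it; a caller would take it from the uapi headers of a kernel
carrying this series. MAP_SHARED_VALIDATE (patch 01) is what lets
mmap() reject a flag it does not understand instead of silently
ignoring it, so a failed first attempt is a reliable signal to fall
back.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Value used by mainline Linux (4.15 and later); older libc headers may lack it. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif

/*
 * Hypothetical helper: request a MAP_DIRECT mapping, falling back to a
 * plain MAP_SHARED mapping when the kernel rejects the request.
 *
 * map_direct_flag: the MAP_DIRECT value from the patched kernel's headers
 *                  (an assumption; the value is not given in this mail).
 * is_direct:       set to 1 only when the validated request succeeded.
 *
 * Returns the mapping, or NULL when both attempts fail.
 */
void *map_direct_or_shared(int fd, size_t len, int map_direct_flag,
                           int *is_direct)
{
    assert(is_direct != NULL);

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_SHARED_VALIDATE | map_direct_flag, fd, 0);
    if (addr != MAP_FAILED) {
        *is_direct = 1;
        return addr;
    }

    /* Unknown flag, unsupported file, or a kernel without MAP_SHARED_VALIDATE. */
    fprintf(stderr, "direct mapping rejected: %s\n", strerror(errno));

    /* Ordinary shared mapping: the caller must keep using msync()/fsync(). */
    *is_direct = 0;
    addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return addr == MAP_FAILED ? NULL : addr;
}
```

Reporting which path was taken through is_direct matters because only
a successful direct mapping permits the flush-from-userspace model; on
the fallback path the application still needs fsync/msync. The result
is only meaningful when map_direct_flag is the genuine MAP_DIRECT value.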
The kernel always has the ability to revoke access and convert the file
back to normal operation after performing a "lease break". Similar to
fcntl leases, there is no way for userspace to cancel the lease break
process once it has started; it can only delay it via the
/proc/sys/fs/lease-break-time setting.
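
As a rough sketch of how an application could learn how long it has to
quiesce after a lease break starts, the helper below reads that sysctl.
The function name read_lease_break_time() is hypothetical and not part
of this series; the path is passed in so the caller decides which file
to read.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the lease break delay, in seconds, read
 * from the file at `path`, or -1 when the file cannot be opened or does
 * not start with an integer.
 */
long read_lease_break_time(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    long seconds = -1;
    if (fscanf(f, "%ld", &seconds) != 1)
        seconds = -1;   /* file did not contain a number */

    fclose(f);
    return seconds;
}
```

A caller would typically invoke it as
read_lease_break_time("/proc/sys/fs/lease-break-time") and treat -1 as
"unknown", stopping I/O to the mapping as soon as it is notified.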
MAP_DIRECT enables XFS to supplant the device-dax interface for
mmap-write access to persistent memory with no ongoing coordination with
the filesystem via fsync/msync syscalls.
---
Dan Williams (14):
mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
fs, mm: pass fd to ->mmap_validate()
fs: MAP_DIRECT core
xfs: prepare xfs_break_layouts() for reuse with MAP_DIRECT
fs, xfs, iomap: introduce iomap_can_allocate()
xfs: wire up MAP_DIRECT
iommu, dma-mapping: introduce dma_get_iommu_domain()
fs, mapdirect: introduce ->lease_direct()
xfs: wire up ->lease_direct()
device-dax: wire up ->lease_direct()
iommu: up-level sg_num_pages() from amd-iommu
iommu/vt-d: use iommu_num_sg_pages
IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings
tools/testing/nvdimm: enable rdma unit tests
arch/alpha/include/uapi/asm/mman.h | 1
arch/mips/include/uapi/asm/mman.h | 1
arch/mips/kernel/vdso.c | 2
arch/parisc/include/uapi/asm/mman.h | 1
arch/tile/mm/elf.c | 3
arch/x86/mm/mpx.c | 3
arch/xtensa/include/uapi/asm/mman.h | 1
drivers/base/dma-mapping.c | 10 +
drivers/dax/Kconfig | 1
drivers/dax/device.c | 4
drivers/infiniband/core/umem.c | 90 +++++-
drivers/iommu/amd_iommu.c | 40 +--
drivers/iommu/intel-iommu.c | 30 +-
drivers/iommu/iommu.c | 27 ++
fs/Kconfig | 5
fs/Makefile | 1
fs/aio.c | 2
fs/mapdirect.c | 382 ++++++++++++++++++++++++++
fs/xfs/Kconfig | 4
fs/xfs/Makefile | 1
fs/xfs/xfs_file.c | 103 +++++++
fs/xfs/xfs_iomap.c | 3
fs/xfs/xfs_layout.c | 45 +++
fs/xfs/xfs_layout.h | 13 +
fs/xfs/xfs_pnfs.c | 30 --
fs/xfs/xfs_pnfs.h | 10 -
include/linux/dma-mapping.h | 3
include/linux/fs.h | 2
include/linux/iomap.h | 10 +
include/linux/iommu.h | 2
include/linux/mapdirect.h | 57 ++++
include/linux/mm.h | 17 +
include/linux/mman.h | 42 +++
include/rdma/ib_umem.h | 8 +
include/uapi/asm-generic/mman-common.h | 1
include/uapi/asm-generic/mman.h | 1
ipc/shm.c | 3
mm/internal.h | 2
mm/mmap.c | 28 +-
mm/nommu.c | 5
mm/util.c | 7
tools/include/uapi/asm-generic/mman-common.h | 1
tools/testing/nvdimm/Kbuild | 31 ++
tools/testing/nvdimm/config_check.c | 2
tools/testing/nvdimm/test/iomap.c | 14 +
45 files changed, 938 insertions(+), 111 deletions(-)
create mode 100644 fs/mapdirect.c
create mode 100644 fs/xfs/xfs_layout.c
create mode 100644 fs/xfs/xfs_layout.h
create mode 100644 include/linux/mapdirect.h