From: Dan Williams <dan.j.williams@intel.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave@sr71.net>, Toshi Kani <toshi.kani@hpe.com>,
David Airlie <airlied@linux.ie>,
Dave Hansen <dave.hansen@linux.intel.com>,
Dave Chinner <david@fromorbit.com>, Linux MM <linux-mm@kvack.org>,
"H. Peter Anvin" <hpa@zytor.com>, Christoph Hellwig <hch@lst.de>,
Andrea Arcangeli <aarcange@redhat.com>,
kbuild test robot <lkp@intel.com>,
linux-nvdimm <linux-nvdimm@ml01.01.org>,
Richard Weinberger <richard@nod.at>, X86 ML <x86@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Mel Gorman <mgorman@suse.de>,
Matthew Wilcox <willy@linux.intel.com>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
Jeff Dike <jdike@addtoit.com>, Jens Axboe <axboe@fb.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Thomas Gleixner <tglx@linutronix.de>,
Christoffer Dall <christoffer.dall@linaro.org>,
Jan Kara <jack@suse.com>, Paolo Bonzini <pbonzini@redhat.com>,
Logan Gunthorpe <logang@deltatee.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [-mm PATCH v2 00/25] get_user_pages() for dax pte and pmd mappings
Date: Thu, 10 Dec 2015 18:03:31 -0800
Message-ID: <CAPcyv4jtF2LwK3jbsjPHB7=JE1O0-TkRQGQcMSrB9bPZVdFd8A@mail.gmail.com>
In-Reply-To: <x49fuzat8k9.fsf@segfault.boston.devel.redhat.com>

On Thu, Dec 10, 2015 at 11:20 AM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Dan Williams <dan.j.williams@intel.com> writes:
>
>> On Thu, Dec 10, 2015 at 10:08 AM, Jeff Moyer <jmoyer@redhat.com> wrote:
>>> Dan Williams <dan.j.williams@intel.com> writes:
>>>
>>>> Summary:
>>>>
>>>> To date, we have implemented two I/O usage models for persistent memory,
>>>> PMEM (a persistent "ram disk") and DAX (mmap persistent memory into
>>>> userspace). This series adds a third, DAX-GUP, that allows DAX mappings
>>>> to be the target of direct-i/o. It allows userspace to coordinate
>>>> DMA/RDMA from/to persistent memory.
>>>>
>>>> The implementation leverages the ZONE_DEVICE mm-zone that went into
>>>> 4.3-rc1 (also discussed at kernel summit) to flag pages that are owned
>>>> and dynamically mapped by a device driver. The pmem driver, after
>>>> mapping a persistent memory range into the system memmap via
>>>> devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus
>>>> page-backed pmem-pfns via flags in the new pfn_t type.
>>>
>>> So, this basically means that an admin has to decide whether or not DMA
>>> will be used on a given device before making a file system on it. That
>>> seems like an odd requirement. There's also a configuration option of
>>> whether to put those backing struct pages into DRAM or PMEM (which, of
>>> course, will be dictated by the size of pmem). I really think we should
>>> reconsider this approach.
>>>
>>> First, the admin shouldn't have to choose whether or not DMA will be
>>> done on the file system.
>>
>> To be clear it's not "whether or not DMA will be done on the file
>> system", it's whether or not both DMA and DAX will be done
>> simultaneously on the filesystem.
>
> Fair point, but I'd view one of those configurations as not recommended.
> To be clear, if you're just going to use the device for block based
> access, using btt is the safer option.

Speaking of btt, the mechanism for setting up a btt is identical to
specifying a reserved area for the memmap. I.e. write an info block
to the namespace to specify a new mode of operation.

>> DAX is already a capability that an admin can inadvertently disable by
>> mis-configuring the alignment of a partition [1].
>
> Heh, using my own commit against me? ;-) Anyway, the commit message
> suggests that dax *could* be supported on misaligned partitions.

All's fair in love, war, and code defense. :-)

>> Why not also disable it when DMA support is not configured and force
>> the fs back to page-cache? Namespace creation tooling in userspace
>> can default to enabling DAX + DMA.
>
> Well, the only reason I can come up with is manufactured: we've forced
> the admin to decide between having that extra space for storage and
> doing DMA, and he or she opted for more space.

Is this any worse than the "forcing" we're imposing in the btt /
no-btt decision that impacts DAX? This additional configuration
flexibility for whether / where to store a memmap array is merely
incremental, not fatal. It's also a configuration decision we can
stop asking an admin to make when / if we ever re-write the kernel to
reduce its dependency on struct page.

In the meantime, I expect some would say DAX is a toy as long as it
continues to fail at DMA.