From: Mike Kravetz <mike.kravetz@oracle.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Guy Shattah <sguy@mellanox.com>,
Christopher Lameter <cl@linux.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-api@vger.kernel.org,
Marek Szyprowski <m.szyprowski@samsung.com>,
Michal Nazarewicz <mina86@mina86.com>,
"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Anshuman Khandual <khandual@linux.vnet.ibm.com>,
Laura Abbott <labbott@redhat.com>,
Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [RFC PATCH 3/3] mm/map_contig: Add mmap(MAP_CONTIG) support
Date: Mon, 16 Oct 2017 13:32:45 -0700
Message-ID: <e8cf6227-003d-8a82-8b4d-07176b43810c@oracle.com>
In-Reply-To: <20171016180749.2y2v4ucchb33xnde@dhcp22.suse.cz>

On 10/16/2017 11:07 AM, Michal Hocko wrote:
> On Mon 16-10-17 10:43:38, Mike Kravetz wrote:
>> Just to be clear, the posix standard talks about a typed memory object.
>> The suggested implementation has one create a connection to the memory
>> object to receive a fd, then use mmap as usual to get a mapping backed
>> by contiguous pages/memory. Of course, this type of implementation is
>> not a requirement.
>
> I am not sure that the POSIX standard for typed memory is easily
> implementable in Linux. Does any OS actually implement this API?

A quick search turns up only BlackBerry QNX and PlayBook OS.

Somewhat related: in an earlier thread, someone pointed out this
out-of-tree module used for contiguous allocations in SoC (and other?)
environments. It even has the option of making use of CMA.
http://processors.wiki.ti.com/index.php/CMEM_Overview

>> However, this type of implementation looks quite a
>> bit like hugetlbfs today.
>> - Both require opening a special file/device, and then calling mmap on
>> the returned fd. You can technically use mmap(MAP_HUGETLB), but that
>> still ends up using hugetlbfs. BTW, there was resistance to adding the
>> MAP_HUGETLB flag to mmap.
>
> And I think we shouldn't really shape any API based on hugetlb.

Agreed. I only wanted to point out the similarities.

But it does make me wonder how much of a benefit hugetlb 1G pages would
provide in the RDMA performance comparison. The table in the presentation
shows an average speedup of something like 27% (or so) for contiguous
allocations, which I assume are 2GB in size. Certainly, using hugetlb is not
the ideal case; I am just wondering whether it helps, and by how much.

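For illustration only (this is not part of the patch set), the
mmap(MAP_HUGETLB) path mentioned above, combined with a fallback to base
pages, could be sketched roughly as follows. The helper name
map_prefer_huge() is hypothetical, and the caller is assumed to pass a
length that is a multiple of the system's huge page size (often 2 MiB on
x86-64, but that is system dependent).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory, preferring
 * hugetlb pages. len is assumed to be a multiple of the huge page size.
 *
 * On return, *used_huge is 1 if the mapping is hugetlb backed and 0 if
 * we fell back to base pages. Returns MAP_FAILED if both attempts fail.
 */
static void *map_prefer_huge(size_t len, int *used_huge)
{
	const int prot = PROT_READ | PROT_WRITE;
	const int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	void *p;

	assert(used_huge != NULL);

	/* First attempt: ask for huge pages from the hugetlb pool. */
	p = mmap(NULL, len, prot, flags | MAP_HUGETLB, -1, 0);
	*used_huge = (p != MAP_FAILED);

	/* Pool empty or hugetlb unavailable: fall back to base pages. */
	if (p == MAP_FAILED)
		p = mmap(NULL, len, prot, flags, -1, 0);

	return p;
}
```

A caller would check the result against MAP_FAILED, look at used_huge if
it cares which case it got, and release the mapping with munmap() using
the same length.
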
>> - Allocation of contiguous memory is much like 'on demand' allocation of
>> huge pages. There are some (not many) users that use this model. They
>> attempt to allocate huge pages on demand, and if not available fall back
>> to base pages. This is how contiguous allocations would need to work.
>> Of course, most hugetlbfs users pre-allocate pages for their use, and
>> this 'might' be something useful for contiguous allocations as well.
>
> But there is still admin configuration required to consume memory from
> the pool or overcommit that pool.
>
>> I wonder if going down the path of a separate device/filesystem/etc for
>> contiguous allocations might be a better option. It would keep the
>> implementation somewhat separate. However, I would then be afraid that
>> we end up with another 'separate/special vm' as in the case of hugetlbfs
>> today.
>
> That depends on who is actually going to use the contiguous memory. If
> we are talking about drivers communicating with userspace, then using
> a driver-specific fd with its own mmap implementation means we do not
> need any special fs or separate infrastructure. Well, except for a
> library function to handle the MM side of the thing.

If we embed this functionality into device-specific mmap calls, it will
closely tie the usage to the devices. However, don't we still have to
worry about potential interaction with other parts of the mm, as you mention
below? I guess that would be the library function and how it is used
by drivers.

--
Mike Kravetz
> If we really need a general purpose physically contiguous memory allocator
> then I would agree that using a MAP_ flag might be a way to go but that
> would require a very careful consideration of who is allowed to allocate
> and how much/how large the blocks are. I do not see a good way of conveying
> that information to the kernel right now. Moreover, and most importantly, I
> haven't heard any sound use case for such a functionality in the first
> place. There is some hand waving about performance but there are no real
> numbers to back those claims AFAIK. Not to mention a serious
> consideration of potential consequences for the whole MM.
>