From: Mike Kravetz <mike.kravetz@oracle.com>
To: "C.Wehrmeyer" <c.wehrmeyer@gmx.de>
Cc: Michal Hocko <mhocko@kernel.org>,
linux-mm@kvack.org, linux-kernel <linux-kernel@vger.kernel.org>,
Andrea Arcangeli <aarcange@redhat.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Vlastimil Babka <vbabka@suse.cz>
Subject: Re: PROBLEM: Remapping hugepages mappings causes kernel to return EINVAL
Date: Mon, 23 Oct 2017 11:51:44 -0700
Message-ID: <4d855be6-7718-f428-91d6-d0c6b44b7ff4@oracle.com>
In-Reply-To: <b138bcf8-0a66-a988-4040-520d767da266@gmx.de>

On 10/23/2017 10:52 AM, C.Wehrmeyer wrote:
> On 2017-10-23 18:57, Michal Hocko wrote:
>> On Mon 23-10-17 18:46:59, C.Wehrmeyer wrote:
>>> On 23-10-17 18:13, Michal Hocko wrote:
>>>> On Mon 23-10-17 16:00:13, C.Wehrmeyer wrote:
>>>>> And just to be very sure I've added:
>>>>>
>>>>> if (madvise(buf1,ALLOC_SIZE_1,MADV_HUGEPAGE)) {
>>>>> errno_tmp = errno;
>>>>> fprintf(stderr,"madvise: %u\n",errno_tmp);
>>>>> goto out;
>>>>> }
>>>>>
>>>>> /*Make sure the mapping is actually used*/
>>>>> memset(buf1,'!',ALLOC_SIZE_1);
>>>>
>>>> Is the buffer aligned to 2MB?
>>>
>>> When I omit MAP_HUGETLB for the flags that mmap receives - no.
>>>
>>> #define ALLOC_SIZE_1 (2 * 1024 * 1024)
>>> [...]
>>> buf1 = mmap (
>>> NULL,
>>> ALLOC_SIZE_1,
>>> prot, /*PROT_READ | PROT_WRITE*/
>>> flags /*MAP_PRIVATE | MAP_ANONYMOUS*/,
>>> -1,
>>> 0
>>> );
>>>
>>> In such a case buf1 usually contains addresses which are aligned to 4 KiBs,
>>> such as 0x7f07d76e9000. 2-MiB-aligned addresses, such as 0x7f89f5e00000, are
>>> only produced with MAP_HUGETLB - which, if I understood the documentation
>>> correctly, is not the point of THPs as they are supposed to be transparent.
>>
>> yes. You can use posix_memalign
>
> Useless. We don't use the memory allocation structures of malloc/free, and yet that's exactly what this function requires us to do. The reason why we use mmap and mremap is to get rid of userspace-crap in the first place.
>
>> or you can mmap a larger block and
>> munmap the initial unaligned part.
>
> And how is that supposed to be transparent? When I hear "transparent" I think of a mechanism which I can put under a system so that it benefits from it, while the system does not notice or at least does not need to be aware of it. The system also does not need to be changed for it.
>
> This approach is even more un-transparent than providing a flag to mmap in order to make hugepages work correctly.

Well, at least the THP approach has a built-in fallback mechanism. When using
hugetlb(fs) pages, you would need to handle the case where mremap fails due to
a lack of configured huge pages.
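
To make the difference concrete, here is a minimal sketch of the kind of handling I mean, shown for the initial mmap rather than mremap. It assumes 2 MiB huge pages (the x86-64 default); the helper names `alloc_huge` and `map_aligned_thp` are hypothetical and not taken from your code. It first tries an explicit hugetlb mapping and, if that fails, falls back to an ordinary anonymous mapping that is over-allocated, trimmed to a 2 MiB boundary, and advised with MADV_HUGEPAGE:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_HUGETLB and MADV_HUGEPAGE */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define HUGE_SZ (2UL * 1024 * 1024) /* assumption: 2 MiB huge pages */

/* Hypothetical helper: map len bytes (a multiple of HUGE_SZ) of ordinary
 * anonymous memory at a 2 MiB boundary and ask for THP backing. */
static void *map_aligned_thp(size_t len)
{
    size_t span = len + HUGE_SZ; /* over-allocate so an aligned window fits */
    uint8_t *raw = mmap(NULL, span, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    uintptr_t addr = (uintptr_t)raw;
    uintptr_t aligned = (addr + HUGE_SZ - 1) & ~(uintptr_t)(HUGE_SZ - 1);
    size_t head = aligned - addr;    /* unaligned part in front */
    size_t tail = span - head - len; /* surplus behind the window */

    if (head)
        munmap(raw, head);
    if (tail)
        munmap((void *)(aligned + len), tail);

    /* Advisory only: if THP is unavailable the mapping still works
     * with base pages, so the return value is deliberately ignored. */
    (void)madvise((void *)aligned, len, MADV_HUGEPAGE);
    return (void *)aligned;
}

/* Hypothetical helper: prefer hugetlb pages, fall back to THP.
 * *used_hugetlb reports which path was taken. */
static void *alloc_huge(size_t len, int *used_hugetlb)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        *used_hugetlb = 1;
        return p;
    }
    /* Typically ENOMEM when too few huge pages are configured. */
    *used_hugetlb = 0;
    return map_aligned_thp(len);
}
```

A caller would do something like `int hl; void *buf = alloc_huge(HUGE_SZ, &hl);` and keep `hl` around, since a later mremap on a hugetlb mapping needs the same kind of failure handling.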

I assume your allocator will be for somewhat general application usage. Yet,
for the greatest reliability, the user/admin will need to know at boot time how
many huge pages will be needed and set that up.
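
As a rough illustration of checking that setup at run time, the sketch below reads the hugetlb pool counters from /proc/meminfo. The helper names `meminfo_field` and `hugepages_available` are hypothetical. It treats HugePages_Free minus HugePages_Rsvd as the number of pages still available, and the result is only a snapshot, so the allocation itself can still fail afterwards:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the numeric value of a /proc/meminfo field
 * such as "HugePages_Free", or -1 if the file or the field is missing. */
static long meminfo_field(const char *name)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    long value = -1;
    size_t n = strlen(name);

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        /* Match "<name>:" exactly at the start of the line. */
        if (strncmp(line, name, n) == 0 && line[n] == ':') {
            if (sscanf(line + n + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(f);
    return value;
}

/* Hypothetical helper: nonzero if at least `needed` huge pages in the
 * default pool are free and not already reserved by another mapping. */
static int hugepages_available(long needed)
{
    long free_pages = meminfo_field("HugePages_Free");
    long rsvd_pages = meminfo_field("HugePages_Rsvd");

    if (free_pages < 0)
        return 0; /* no hugetlb counters on this system */
    if (rsvd_pages < 0)
        rsvd_pages = 0;
    return free_pages - rsvd_pages >= needed;
}
```

The pool itself is sized by the admin, for example with the `hugepages=` boot parameter or by writing to /proc/sys/vm/nr_hugepages; an allocator can only observe it, as above.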
--
Mike Kravetz