linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Eric B Munson <linux-mm@mgebm.net>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Alexey Korolev <akorolex@gmail.com>,
	Alexey Korolev <akorolev@infradead.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/3]HTLB mapping for drivers (take 2)
Date: Wed, 19 Aug 2009 11:35:50 +0100	[thread overview]
Message-ID: <56e00de0908190335n6b120114kb6ece4623f024319@mail.gmail.com> (raw)
In-Reply-To: <20090819100553.GE24809@csn.ul.ie>

On Wed, Aug 19, 2009 at 11:05 AM, Mel Gorman<mel@csn.ul.ie> wrote:
> On Wed, Aug 19, 2009 at 05:48:11PM +1200, Alexey Korolev wrote:
>> Hi,
>> >
>> > It sounds like this patch set working towards the same goal as my
>> > MAP_HUGETLB set.  The only difference I see is you allocate huge page
>> > at a time and (if I am understanding the patch) fault the page in
>> > immediately, where MAP_HUGETLB only faults pages as needed.  Does the
>> > MAP_HUGETLB patch set provide the functionality that you need, and if
>> > not, what can be done to provide what you need?
>> >
>>
>> Thanks a lot for willing to help. I'll be much appreciate if you have
>> an interesting idea how HTLB mapping for drivers can be done.
>>
>> It is better to describe use case in order to make it clear what needs
>> to be done.
>> Driver provides mapping of device DMA buffers to user level
>> applications.
>
> Ok, so the buffer is in normal memory. When mmap() is called, the buffer
> is already populated by data DMA'd from the device. That scenario rules out
> calling mmap(MAP_ANONYMOUS|MAP_HUGETLB) because userspace has access to the
> buffer before it is populated by data from the device.
>
> However, it does not rule out mmap(MAP_ANONYMOUS|MAP_HUGETLB) when userspace
> is responsible for populating a buffer for sending to a device. i.e. whether it
> is suitable or not depends on when the buffer is populated and who is doing it.
>
>> User level applications process the data.
>> Device is using a master DMA to send data to the user buffer, buffer
>> size can be >1GB and performance is very important. (So huge pages
>> mapping really makes sense.)
>>
>
> Ok, so the DMA may be faster because you have to do less scatter/gather
> and can DMA in larger chunks and and reading from userspace may be faster
> because there is less translation overhead. Right?
>
>> In addition we have to mention that:
>> 1. It is hard for user to tell how much huge pages needs to be
>>    reserved by the driver.
>
> I think you have this problem either way. If the buffer is allocated and
> populated before mmap(), then the driver is going to have to guess how many
> pages it needs. If the DMA occurs as a result of mmap(), it's easier because
> you know the number of huge pages to be reserved at that point and you have
> the option of falling back to small pages if necessary.
>
>> 2. Devices add constrains on memory regions. For example it needs to
>>    be contiguous with in the physical address space. It is necessary to
>>   have ability to specify special gfp flags.
>
> The contiguity constraints are the same for huge pages. Do you mean there
> are zone restrictions? If so, the hugetlbfs_file_setup() function could be
> extended to specify a GFP mask that is used for the allocation of hugepages
> and associated with the hugetlbfs inode. Right now, there is a htlb_alloc_mask
> mask that is applied to some additional flags so htlb_alloc_mask would be
> the default mask unless otherwise specified.
>
>> 3 The HW needs to access physical memory before the user level
>> software can access it. (Hugetlbfs picks up pages on page fault from
>> pool).
>> It means memory allocation needs to be driven by device driver.
>>
>
> How about;
>
>        o Extend Eric's helper slightly to take a GFP mask that is
>          associated with the inode and used for allocations from
>          outside the hugepage pool
>        o A helper that returns the page at a given offset within
>          a hugetlbfs file for population before the page has been
>          faulted.
>
> I know this is a bit hand-wavy, but it would allow significant sharing
> of the existing code and remove much of the hugetlbfs-awareness from
> your current driver.
>
>> Original idea was: create hugetlbfs file which has common mapping with
>> device file. Allocate memory. Populate page cache of hugetlbfs file
>> with allocated pages.
>> When fault occurs, page will be taken from page cache and then
>> remapped to user space by hugetlbfs.
>>
>> Another possible approach is described here:
>> http://marc.info/?l=linux-mm&m=125065257431410&w=2
>> But currently not sure  will it work or not.
>>
>>
>> Thanks,
>> Alexey
>>
>
> --
> Mel Gorman
> Part-time Phd Student                          Linux Technology Center
> University of Limerick                         IBM Dublin Software Lab
>

Alexey,

I'd be willing to take a stab at a prototype of Mel's suggestion based
on my patch set if you this it would be useful to you.

Eric

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2009-08-19 10:35 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-17 22:24 Alexey Korolev
2009-08-18 10:29 ` Eric Munson
2009-08-19  5:48   ` Alexey Korolev
2009-08-19 10:05     ` Mel Gorman
2009-08-19 10:35       ` Eric B Munson [this message]
2009-08-20  7:03       ` Alexey Korolev
2009-08-25 10:47         ` Mel Gorman
2009-08-25 11:00           ` Benjamin Herrenschmidt
2009-08-25 11:10             ` Mel Gorman
2009-08-26  9:58               ` Benjamin Herrenschmidt
2009-08-26 10:05                 ` Mel Gorman
2009-08-24  6:14       ` Alexey Korolev
2009-08-25 10:53         ` Mel Gorman
2009-08-27 12:02           ` Alexey Korolev
2009-08-27 12:50             ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56e00de0908190335n6b120114kb6ece4623f024319@mail.gmail.com \
    --to=linux-mm@mgebm.net \
    --cc=akorolev@infradead.org \
    --cc=akorolex@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox