On Tue, Apr 16, 2013 at 3:03 AM, Simon Jeons wrote:
> Hi Jerome,
>
> On 02/08/2013 11:21 PM, Jerome Glisse wrote:
>
>> On Fri, Feb 8, 2013 at 6:18 AM, Shachar Raindel wrote:
>>
>>> Hi,
>>>
>>> We would like to present a reference implementation for safely sharing
>>> memory pages from user space with the hardware, without pinning.
>>>
>>> We will be happy to hear the community's feedback on our prototype
>>> implementation, and suggestions for future improvements.
>>>
>>> We would also like to discuss adding features to the core MM subsystem
>>> to assist hardware access to user memory without pinning.
>>>
>>> Following is a longer motivation and explanation of the technology
>>> presented:
>>>
>>> Many application developers would like to be able to communicate
>>> directly with the hardware from user space.
>>>
>>> Use cases for that include high-performance networking APIs such as
>>> InfiniBand, RoCE and iWarp, and interfacing with GPUs.
>>>
>>> Currently, if the user space application wants to share system memory
>>> with the hardware device, the kernel component must pin the memory
>>> pages in RAM, using get_user_pages.
>>>
>>> This is a hurdle, as it usually makes large portions of the
>>> application memory unmovable. This pinning also makes the user space
>>> development model very complicated: one needs to register memory
>>> before using it for communication with the hardware.
>>>
>>> We use the mmu-notifiers [1] mechanism to inform the hardware when the
>>> mapping of a page is changed. If the hardware tries to access a page
>>> which is not yet mapped for the hardware, it requests a resolution for
>>> the page address from the kernel.
>>>
>>> This mechanism allows the hardware to access the entire address space
>>> of the user application, without pinning even a single page.
>>>
>>> We would like to use the LSF/MM forum opportunity to discuss open
>>> issues we have for further development, such as:
>>>
>>> -Allowing the hardware to perform page table walk, similar to
>>> get_user_pages_fast to resolve user pages that are already in RAM.
>>>
>>
> get_user_pages_fast just takes a page reference count instead of
> populating the pte in the page table, correct? Then how can the GPU
> driver use the iommu to access the page?
>

As I said, this is for pre-filling already present entries, i.e. ptes that
are present with a valid page (no special bit set). This is an optimization
so that the GPU can pre-fill its TLB without having to take mmap_sem. The
hope is that in the most common case this will be enough, but in some cases
you will have to go through the lengthy non-fast gup.

Cheers,
Jerome
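
To make the fast/slow split above concrete, here is a minimal user-space toy model in C. It is only a sketch under stated assumptions, not kernel code and not the implementation discussed in this thread: every name in it (toy_pte, toy_mm, toy_dev_tlb, toy_prefill_fast, toy_resolve_slow, toy_dev_access, toy_invalidate) is hypothetical, and the mmap_sem_taken counter merely stands in for taking mmap_sem. The fast path copies only entries that are present with no special bit set and never touches the "lock"; anything it skipped is resolved later through the slow path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Toy CPU page-table entry (hypothetical, not the kernel's pte_t). */
struct toy_pte {
    bool present;          /* page is resident in RAM */
    bool special;          /* stands in for any "special bit" */
    unsigned long pfn;     /* backing page frame number */
};

/* Toy address space (hypothetical). */
struct toy_mm {
    struct toy_pte pte[NPAGES];
    unsigned long next_free_pfn;   /* next frame handed out on a fault */
    unsigned mmap_sem_taken;       /* counts slow-path "lock" acquisitions */
};

/* Toy device TLB, one slot per page (hypothetical). */
struct toy_dev_tlb {
    bool valid[NPAGES];
    unsigned long pfn[NPAGES];
};

/*
 * Fast path: pre-fill the device TLB from entries that are already
 * present and not special. Takes no lock and faults nothing in.
 * Returns the number of entries filled.
 */
size_t toy_prefill_fast(const struct toy_mm *mm, struct toy_dev_tlb *tlb,
                        size_t start, size_t n)
{
    size_t filled = 0;

    assert(start + n <= NPAGES);
    for (size_t i = start; i < start + n; i++) {
        const struct toy_pte *pte = &mm->pte[i];

        if (!pte->present || pte->special)
            continue;              /* left for the slow path */
        tlb->pfn[i] = pte->pfn;
        tlb->valid[i] = true;
        filled++;
    }
    return filled;
}

/*
 * Slow path: models the lengthy non-fast lookup. It "takes" the lock,
 * faults the page in if needed, then fills the device TLB slot.
 */
void toy_resolve_slow(struct toy_mm *mm, struct toy_dev_tlb *tlb, size_t idx)
{
    struct toy_pte *pte;

    assert(idx < NPAGES);
    mm->mmap_sem_taken++;          /* stand-in for taking mmap_sem */
    pte = &mm->pte[idx];
    if (!pte->present) {
        pte->pfn = mm->next_free_pfn++;
        pte->present = true;
    }
    tlb->pfn[idx] = pte->pfn;
    tlb->valid[idx] = true;
}

/* Device access: use the TLB slot if valid, else fall back to the slow path. */
unsigned long toy_dev_access(struct toy_mm *mm, struct toy_dev_tlb *tlb,
                             size_t idx)
{
    assert(idx < NPAGES);
    if (!tlb->valid[idx])
        toy_resolve_slow(mm, tlb, idx);
    return tlb->pfn[idx];
}

/* Models a notifier-style callback: the CPU mapping changed, drop the slot. */
void toy_invalidate(struct toy_dev_tlb *tlb, size_t idx)
{
    assert(idx < NPAGES);
    tlb->valid[idx] = false;
}
```

In this model, a device access to a page that was pre-filled never increments mmap_sem_taken, while an access to a non-present page does so exactly once. That is the trade-off described above: the common case stays lock-free, and only the leftovers pay for the slow path.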