Hi Jerome,

On 04/12/2013 10:57 AM, Jerome Glisse wrote:
> On Thu, Apr 11, 2013 at 9:54 PM, Simon Jeons wrote:
>> Hi Jerome,
>>
>> On 04/12/2013 02:38 AM, Jerome Glisse wrote:
>>> On Thu, Apr 11, 2013 at 11:42:05AM +0800, Simon Jeons wrote:
>>>> Hi Jerome,
>>>>
>>>> On 04/11/2013 04:45 AM, Jerome Glisse wrote:
>>>>> On Wed, Apr 10, 2013 at 09:41:57AM +0800, Simon Jeons wrote:
>>>>>> Hi Jerome,
>>>>>>
>>>>>> On 04/09/2013 10:21 PM, Jerome Glisse wrote:
>>>>>>> On Tue, Apr 09, 2013 at 04:28:09PM +0800, Simon Jeons wrote:
>>>>>>>> Hi Jerome,
>>>>>>>>
>>>>>>>> On 02/10/2013 12:29 AM, Jerome Glisse wrote:
>>>>>>>>> On Sat, Feb 9, 2013 at 1:05 AM, Michel Lespinasse wrote:
>>>>>>>>>> On Fri, Feb 8, 2013 at 3:18 AM, Shachar Raindel wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> We would like to present a reference implementation for
>>>>>>>>>>> safely sharing memory pages from user space with the
>>>>>>>>>>> hardware, without pinning.
>>>>>>>>>>>
>>>>>>>>>>> We will be happy to hear the community's feedback on our
>>>>>>>>>>> prototype implementation, and suggestions for future
>>>>>>>>>>> improvements.
>>>>>>>>>>>
>>>>>>>>>>> We would also like to discuss adding features to the core
>>>>>>>>>>> MM subsystem to assist hardware access to user memory
>>>>>>>>>>> without pinning.
>>>>>>>>>>
>>>>>>>>>> This sounds kinda scary TBH; however I do understand the
>>>>>>>>>> need for such technology.
>>>>>>>>>>
>>>>>>>>>> I think one issue is that many MM developers are
>>>>>>>>>> insufficiently aware of such developments; having a
>>>>>>>>>> technology presentation would probably help there; but
>>>>>>>>>> traditionally LSF/MM sessions are more interactive, between
>>>>>>>>>> developers who are already quite familiar with the
>>>>>>>>>> technology. I think it would help if you could send in
>>>>>>>>>> advance a detailed presentation of the problem and the
>>>>>>>>>> proposed solutions (and then what they require of the MM
>>>>>>>>>> layer) so people can be better prepared.
>>>>>>>>>>
>>>>>>>>>> And first I'd like to ask: aren't IOMMUs supposed to already
>>>>>>>>>> largely solve this problem? (Probably a dumb question, but
>>>>>>>>>> that just tells you how much you need to explain :)
>>>>>>>>>
>>>>>>>>> For GPUs the motivation is threefold. With the advance of GPU
>>>>>>>>> compute, and also with newer graphics programs, we see a
>>>>>>>>> massive increase in GPU memory consumption. We can easily
>>>>>>>>> reach buffers that are bigger than 1GB. So the first
>>>>>>>>> motivation is to use the memory the user allocated through
>>>>>>>>> malloc directly in the GPU; this avoids copying 1GB of data
>>>>>>>>> from the CPU to the GPU buffer. The second, and most
>>>>>>>>> important for GPU compute, is using the GPU seamlessly with
>>>>>>>>> the CPU; to achieve this you want the programmer to have a
>>>>>>>>> single address space on the CPU and GPU, so that the same
>>>>>>>>> address points to the same object on the GPU as on the CPU.
>>>>>>>>> This would also be a tremendously cleaner design, from the
>>>>>>>>> driver's point of view, for memory management.
>>>>>>>>>
>>>>>>>>> And last, most important of all: with such big buffers
>>>>>>>>> (>1GB), memory pinning becomes way too expensive and also
>>>>>>>>> drastically reduces the freedom of the mm to free pages for
>>>>>>>>> other processes. Most of the time only a small window of the
>>>>>>>>> object will be in use by the hardware (everything is
>>>>>>>>> relative; the window can be >100MB, not so small :)). The
>>>>>>>>> hardware pagefault support would avoid the necessity to
>>>>>>>>
>>>>>>>> What's the meaning of hardware pagefault?
>>>>>>>
>>>>>>> It's a PCIE extension (well, a combination of extensions that
>>>>>>> allow that; see
>>>>>>> http://www.pcisig.com/specifications/iov/ats/). The idea is
>>>>>>> that the iommu can trigger a regular pagefault inside a process
>>>>>>> address space on behalf of the hardware. The only iommu
>>>>>>> supporting that right now is the AMD iommu v2 that you find on
>>>>>>> recent AMD platforms.
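(To make sure I follow: the in-tree interface to this is the
amd_iommu_v2 driver plus the PCIe ATS/PRI/PASID extensions, right? A
minimal sketch of how I understand a driver would wire a device to a
process address space with that API -- the gpu_pdev variable and the
PASID value are made up for illustration:)

#include <linux/amd-iommu.h>
#include <linux/pci.h>
#include <linux/sched.h>

/* Sketch only: gpu_pdev and PASID 1 are illustrative, not a real driver. */
static int gpu_attach_current_mm(struct pci_dev *gpu_pdev)
{
	int ret;

	/* Enable ATS/PRI/PASID on the device; allow up to 16 PASIDs. */
	ret = amd_iommu_init_device(gpu_pdev, 16);
	if (ret)
		return ret;

	/*
	 * Bind PASID 1 to the current task. Device faults tagged with
	 * this PASID are then resolved by the IOMMU driver against
	 * current->mm, much like a CPU page fault -- no pinning needed.
	 */
	ret = amd_iommu_bind_pasid(gpu_pdev, 1, current);
	if (ret)
		amd_iommu_free_device(gpu_pdev);
	return ret;
}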
>>>>>> Why is a hardware page fault needed? A regular page fault is
>>>>>> triggered by the cpu mmu, correct?
>>>>>
>>>>> Well, here I abuse the regular page fault term. The idea is that
>>>>> with hardware page fault you don't need to pin memory or take a
>>>>> reference on a page for the hardware to use it. So the kernel can
>>>>> free, as usual, pages that would otherwise have been
>>>>
>>>> For the case when the GPU needs to pin memory, why does the GPU
>>>> need to grab the memory of a normal process instead of allocating
>>>> it for itself?
>>>
>>> Pinned memory is today's world, where the gpu allocates its own
>>> memory (GB of memory) that disappears from kernel control, ie the
>>> kernel can no longer reclaim this memory; it's lost memory (I have
>>> had complaints about that already, from users who saw GB of memory
>>> vanish and couldn't understand why the GPU was using so much).
>>>
>>> Tomorrow's world is one where we want the gpu to be able to access
>>> memory that the application allocated through a simple malloc, and
>>> we want the kernel to be able to recycle any page at any time,
>>> because of memory pressure or because the kernel decides to do so.
>>>
>>> That's just what we want to do. To achieve it we are getting hw
>>> that can do pagefaults. No change to kernel core mm code (some
>>> improvements might be made).
>>
>> The memory disappears because you hold a reference (gup) against
>> it, correct? In tomorrow's world you want the page fault triggered
>> through the iommu driver, which calls get_user_pages; that will
>> also take a reference (since gup is called), won't it? Anyway,
>> assuming tomorrow's world doesn't take a reference, don't we need
>> to care about a page that is being used by the GPU getting
>> reclaimed?
>
> Right now the code uses gup because it's convenient, but it drops
> the reference right after the fault. So the reference is held only
> for a short period of time.

Are you sure gup will drop the reference right after the fault? I dug
through the code again and failed to verify it. Could you point it out
to me? (I've put a sketch of the pattern I mean at the end of this
mail.)

> No, you don't need to care about reclaim, thanks to mmu notifiers:
> before a page is removed the mmu notifier is called, and the iommu
> registers a notifier, so it gets the invalidate event, invalidates
> the device tlb, and things go on. If the gpu accesses the page, a
> new pagefault happens and a new page is allocated.

Good idea! ;-) (My understanding of that scheme is also sketched at
the end of this mail.)

> All this code is upstream in the linux kernel, just read it. There
> is just no device that uses it yet.
>
> That being said, we will want improvements so that pages that are
> hot in the device are not reclaimed. But it can work without such
> improvements.
>
> Cheers,
> Jerome
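P.S. To make my gup question concrete, the fault-then-drop pattern I
understand you to describe is roughly the following (a sketch only;
get_user_pages() signature as in the current 3.x kernels, error
handling elided):

#include <linux/mm.h>
#include <linux/sched.h>

/* Populate the PTE for one faulting device address, then immediately
 * drop the page reference so the page stays reclaimable. */
static void fault_for_device(struct mm_struct *mm, unsigned long address,
			     int write)
{
	struct page *page;
	long npages;

	down_read(&mm->mmap_sem);
	/* Fault the page in; this takes a reference on the page. */
	npages = get_user_pages(current, mm, address, 1, write, 0,
				&page, NULL);
	if (npages == 1) {
		/*
		 * Drop the reference right away: the page table entry now
		 * exists, which is all the IOMMU needs to answer the
		 * device's page request.
		 */
		put_page(page);
	}
	up_read(&mm->mmap_sem);
}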
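P.P.S. And my understanding of the mmu notifier side, as a sketch (all
my_* names are hypothetical driver state, not real kernel symbols):

#include <linux/mmu_notifier.h>

struct my_dev {
	struct mmu_notifier mn;
	/* ... device state, tlb handles, etc. ... */
};

/* Hypothetical hook that tells the hardware to drop its cached
 * translations for a virtual address range. */
static void my_dev_flush_tlb_range(struct my_dev *dev,
				   unsigned long start, unsigned long end);

static void my_invalidate_range_start(struct mmu_notifier *mn,
				      struct mm_struct *mm,
				      unsigned long start, unsigned long end)
{
	struct my_dev *dev = container_of(mn, struct my_dev, mn);

	/*
	 * The kernel is about to unmap [start, end): invalidate the
	 * device tlb so the hardware refaults on its next access, at
	 * which point a fresh page is faulted in.
	 */
	my_dev_flush_tlb_range(dev, start, end);
}

static const struct mmu_notifier_ops my_mn_ops = {
	.invalidate_range_start = my_invalidate_range_start,
};

/* Registered against the process mm at bind time, e.g.:
 *	dev->mn.ops = &my_mn_ops;
 *	mmu_notifier_register(&dev->mn, current->mm);
 */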