Date: Sun, 10 Oct 1999 07:24:58 -0400
Message-Id: <199910101124.HAA32129@light.alephnull.com>
From: Rik Faith
Subject: Re: MMIO regions
In-reply-to:
Sender: owner-linux-mm@kvack.org
Return-Path:
To: James Simmons
Cc: Linux MM
List-ID:

On 7 Oct 99 19:40:32 GMT, James Simmons wrote:
> On Mon, 4 Oct 1999, Stephen C. Tweedie wrote:
> > On Mon, 4 Oct 1999 14:29:14 -0400 (EDT), James Simmons said:
> >
> > > Okay.  But none of this prevents a rogue app from hosing your
> > > system.  Such a process doesn't have to bother with locks or
> > > semaphores.
> >
> > And we talked about this before.  You _can_ make such a guarantee,
> > but it is hideously expensive, especially on SMP.  You either
> > protect the memory or the CPU against access by the other app, and
> > that requires either scheduler or VM interrupts between CPUs.
>
> No VM stuff.  I think the better approach is with the scheduler.  The
> nice thing about the scheduler is the scheduler lock.  I'm assuming
> that during this lock no other process on any CPU can be rescheduled.
> It's during the lock that I can test whether a process is using an
> MMIO region that is already in use by another process.  If it is,
> skip this process.  If not, weigh this process with the others.  If a
> process is selected to be the next executed process, then lock the
> MMIO region.

If I understand what you are saying, there are serious performance
implications for direct-rendering clients (in addition to the added
scheduler overhead, which will negatively impact overall system
performance).  I believe you are saying:

1) There are n processes, each of which has the MMIO region mmap'd.

2) The scheduler will only schedule one of these processes at a time,
   even on an SMP system.

[I'm assuming this is what you mean by "in use", since the scheduler
can't know about actual MMIO writes -- it has to assume that a mapped
region is a region that is "in use", even if it isn't (e.g., a
threaded program may have the MMIO region mapped in n-1 threads, but
may only direct render in 1 thread).]

On MMIO-based graphics cards (i.e., those that do not use traditional
DMA), a direct-rendering client will intersperse relatively long
periods of computation with relatively short periods of MMIO writes.
In your scheme, one of these clients will run for a whole time slice
before the other one runs (i.e., they will run in alternate time
slices, even on an SMP system with sufficient processors to run both
simultaneously).  Because actual MMIO writes take up a relatively
small fraction of that time slice, rendering performance will
potentially decrease by a factor of 2 (or more, if more CPUs are
available).  This is significant, especially since many high-end
OpenGL applications are threaded and expect to be able to run
simultaneously on SMP systems.

The cooperative locking system used by the DRI (see
http://precisioninsight.com/dr/locking.html) allows direct-rendering
clients to perform fine-grained locking only when the MMIO region is
actually being written.  The overhead for this system is extremely low
(about 2 instructions to lock, and 1 instruction to unlock).
Cooperative locking like this allows several threads that all map the
same MMIO region to run simultaneously on an SMP system.

--
Rik Faith: faith@precisioninsight.com
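
P.S.  For concreteness, here is a rough user-space sketch of the
scheduler-side scheme as I understand it.  The names (task->region,
region->in_use, pick_next) are invented for illustration -- this is
not real Linux scheduler code -- and it only shows the consequence I
am describing: while one client that maps the region is running, no
other client mapping the same region can be picked.

/* Hypothetical sketch of the proposed scheduler-side check; the field
 * and function names are invented, not actual kernel code. */
#include <stdio.h>
#include <stddef.h>

struct mmio_region {
        int in_use;                   /* claimed by a running task? */
};

struct task {
        const char *name;
        struct mmio_region *region;   /* NULL if no MMIO mapping */
        int goodness;                 /* scheduling weight */
};

/* Assume this runs under the scheduler lock, so no other CPU can pick
 * a task at the same time.  A candidate whose region is already in
 * use is skipped; the winner's region is then marked in use for the
 * length of its time slice. */
static struct task *pick_next(struct task *rq, int n)
{
        struct task *best = NULL;
        for (int i = 0; i < n; i++) {
                if (rq[i].region && rq[i].region->in_use)
                        continue;               /* region busy: skip */
                if (!best || rq[i].goodness > best->goodness)
                        best = &rq[i];
        }
        if (best && best->region)
                best->region->in_use = 1;       /* lock for the slice */
        return best;
}

int main(void)
{
        struct mmio_region fb = { 0 };
        struct task rq[2] = {
                { "client A", &fb, 10 },
                { "client B", &fb, 10 },
        };

        struct task *first  = pick_next(rq, 2);  /* gets client A */
        struct task *second = pick_next(rq, 2);  /* B is skipped  */

        printf("first pick:  %s\n", first ? first->name : "(none)");
        printf("second pick: %s\n", second ? second->name : "(none)");
        return 0;
}

Running it prints "client A" for the first pick and "(none)" for the
second: as long as client A holds the region for its slice, client B
is not runnable at all as far as this scheme is concerned.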
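
And, for contrast, a minimal sketch of the cooperative, fine-grained
approach.  This is not the actual DRI lock (which also records the
owning context and can fall back to the kernel when the lock is
contended); it just shows the shape of the fast path -- one
compare-and-swap to take the lock, one store to release it, held only
around the MMIO writes.  It assumes C11 atomics and POSIX threads
(build with something like "cc -O2 -pthread coop.c").

/* Two "clients" that mostly compute and only briefly write to the
 * "hardware".  The lock word stands in for a word in a mapping that
 * all clients share; the names here are invented for illustration. */
#include <stdatomic.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static atomic_int hw_lock;            /* word in the shared mapping */
static long pixels_written[2];

static void lock_hw(void)
{
        int expected = 0;
        /* Fast path: one compare-and-swap.  On contention we just
         * yield and retry; a real design can sleep in the kernel. */
        while (!atomic_compare_exchange_weak(&hw_lock, &expected, 1)) {
                expected = 0;
                sched_yield();
        }
}

static void unlock_hw(void)
{
        atomic_store(&hw_lock, 0);    /* one store to release */
}

static void *client(void *arg)
{
        int id = (int)(long)arg;
        for (int frame = 0; frame < 1000; frame++) {
                /* Long computation phase: no lock held, so both
                 * clients can run at the same time on SMP. */
                volatile double x = 0;
                for (int i = 0; i < 100000; i++)
                        x += i * 0.5;

                /* Short MMIO-write phase: lock held only here. */
                lock_hw();
                pixels_written[id] += 64;    /* pretend register writes */
                unlock_hw();
        }
        return NULL;
}

int main(void)
{
        pthread_t t[2];
        for (long i = 0; i < 2; i++)
                pthread_create(&t[i], NULL, client, (void *)i);
        for (int i = 0; i < 2; i++)
                pthread_join(t[i], NULL);
        printf("client 0 wrote %ld, client 1 wrote %ld\n",
               pixels_written[0], pixels_written[1]);
        return 0;
}

Because the lock is held only for the short write bursts, the two
clients spend almost all of their time running concurrently on an SMP
machine -- which is exactly what scheduling them in alternate time
slices gives up.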