From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Thu, 2 Nov 2006 12:45:42 +0000 (GMT) From: Mel Gorman Subject: Re: Page allocator: Single Zone optimizations In-Reply-To: Message-ID: References: <20061027190452.6ff86cae.akpm@osdl.org> <20061027192429.42bb4be4.akpm@osdl.org> <20061027214324.4f80e992.akpm@osdl.org> <20061028180402.7c3e6ad8.akpm@osdl.org> <4544914F.3000502@yahoo.com.au> <20061101182605.GC27386@skynet.ie> <20061101123451.3fd6cfa4.akpm@osdl.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Sender: owner-linux-mm@kvack.org Return-Path: To: Christoph Lameter Cc: Andrew Morton , Nick Piggin , KAMEZAWA Hiroyuki , Linux Memory Management List List-ID: On Wed, 1 Nov 2006, Christoph Lameter wrote: > On Wed, 1 Nov 2006, Mel Gorman wrote: > >>> I still think that we need to generalize the approach to be >>> able to cover as much memory as possible. Remapping can solve some of the >>> issues, for others we could add additional ways to make things movable. >>> F.e. one could make page table pages movable by adding a back pointer to >>> the mm, reclaimable slab pages by adding a move function, driver >>> allocations could have a backpointer to the driver that would be able to >>> move its memory. >> >> I got the impression that we wouldn't be allowed to introduce such a mechanism >> because driver writers would get it wrong. It was why proper defragmentation >> was never really implemented. > > I think that choice is better than fiddling with the VM by adding > additional zones which will introduce lots of other problems. > The amount of work required to overhawl the device driver model and then update each driver makes the suspend problem look straight-forward. > The ability to move memory in general is beneficial for many purposes. > Defragmentation is certainly one of them. If all memory would be movable > then you would not need the separate list in the zone either. > If we had that ability, sure, but it's pie-in-the-sky. Page migration as it is can move easily reclaimable pages. With list-based anti-frag clustering non-movable allocations together as much as possible, I could use the existing page migration infrastructure to keep MAX_ORDER_NR_PAGES blocks of pages containing kernel pages free of easily-reclaimed allocations and vastly improve the resizing of the huge pages pool at runtime which is my *primary* concern, not memory hot unplug. If hot unplug became a big issue again, __rmqueue_fallback() from the list-based anti-frag patches could be taught how to place kernel non-reclaimable (different to short-lived kernel allocation or caches) blocks of pages always at the lower PFNs and migrate existing easily-reclaimed pages out of there without adding zones. This would allow hot unplug of higher PFN ranges. This would be easier than making all kernel memory movable. With the additional lists in the buddy allocator as well, I believe we could get rid of the special code managing hugepage pools and put it all back into the buddy allocator as an additional list that is only used for hugepage allocations. That would remove some hugepage-specific code for page allocation. If I can get list-based anti-frag merged into a testing tree for a while, I can start working on this sort of thing properly as well as finding out for sure if anti-frag helps the resizing of the huge page pool or not. > Maybe we can have special mempools for unreclaimable > allocations for starters and with that have the rest of memory be > movable? Then we can gradually reduce the need for unreclaimable memory. > Maybe we can keep unmovable memory completely out of the page allocator? > Mempools that grow or shrink in MAX_ORDER_NR_PAGES blocks is an option although balancing gets tricky. You'd have pages that are free-but-not-free stuck in there. > With that approach we would not break the NUMA layer because we can keep > the one zone per node approach for memory policies. The special > unreclaimable memory would not obey memory policies (which makes sense > since device driverws do not want user space policies applied to their > allocations anyways. Device drivers need memory near the device). > >>> Hmm.... Maybe generally a way to provide a >>> function to move data in the page struct for kernel allocations? >>> >> As devices are able to get physical addresses which then get pinned for IO, it >> gets messy. > > Right. So the device needs to disengage and then move its structures. > That is far easier said than done. -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org