From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Tue, 8 Aug 2000 12:21:00 -0300 (BRST) From: Rik van Riel Subject: Re: RFC: design for new VM In-Reply-To: <200008080048.RAA13326@eng2.sequent.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org Return-Path: To: Gerrit.Huizenga@us.ibm.com Cc: chucklever@bigfoot.com, linux-mm@kvack.org, linux-kernel@vger.rutgers.edu, Linus Torvalds List-ID: On Mon, 7 Aug 2000 Gerrit.Huizenga@us.ibm.com wrote: > > On Mon, 7 Aug 2000, Rik van Riel wrote: > > The idea is that the memory_pressure variable indicates how > > much page stealing is going on (on average) so every time > > kswapd wakes up it knows how much pages to steal. That way > > it should (if we're "lucky") free enough pages to get us > > along until the next time kswapd wakes up. > > Seems like you could signal kswapd when either the page fault > rate increases or the rate of (memory allocations / memory > frees) hits a tuneable? ratio We will. Each page steal and each allocation will increase the memory_pressure variable, and because of that, also the inactive_target. Whenever either - one zone gets low on free memory *OR* - all zones get more or less low on free+inactive_clean pages *OR* - we get low on inactive pages (inactive_shortage > inactive_target/2), THEN kswapd gets woken up immediately. We do this both from the page allocation code and from __find_page_nolock (which gets hit every time we reclaim an inactive page back for its original purpose). > > About NUMA scalability: we'll have different memory pools > > per NUMA node. So if you have a 32-node, 64GB NUMA machine, > > it'll partly function like 32 independant 2GB machines. > > One lesson we learned early on is that anything you can > possibly do on a per-CPU basis helps both SMP and NUMA > activity. This includes memory management, scheduling, > TCP performance counters, any kind of system counters, etc. > Once you have the basic SMP hierarchy in place, adding a NUMA > hierarchy (or more than one for architectures that need it) > is much easier. > > Also, is there a kswapd per pool? Or does one kswapd oversee > all of the pools (in the NUMA world, that is)? Currently we have none of this, but once 2.5 is forked off, I'll submit a patch which shuffles all variables into per-node (per pgdat) structures. regards, Rik -- "What you're running that piece of shit Gnome?!?!" -- Miguel de Icaza, UKUUG 2000 http://www.conectiva.com/ http://www.surriel.com/ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/