Date: Fri, 6 Sep 2002 20:03:28 -0300 (BRT)
From: Rik van Riel
Subject: Re: inactive_dirty list
In-Reply-To: <3D7930D6.F658E5B9@zip.com.au>
To: Andrew Morton
Cc: "linux-mm@kvack.org"

On Fri, 6 Sep 2002, Andrew Morton wrote:
> Rik van Riel wrote:
> > On Fri, 6 Sep 2002, Andrew Morton wrote:
> >
> > I guess this means the dirty limit should be near 1% for the
> > VM.
>
> What is the thinking behind that?

Dirty pages could sit on the list practically forever if there
are enough clean pages. This means we can have a significant
amount of memory "parked" on the dirty list, without it ever
getting reclaimed, even if we could use the memory for something
better.

> I still have not got my head around:
>
> > We did this in early 2.4 kernels and it was a disaster. The
> > reason it was a disaster was that in many workloads we'd
> > always have some clean pages and we'd end up always reclaiming
> > those before even starting writeout on any of the dirty pages.
>
> Does this imply that we need to block on writeout *instead*
> of reclaiming clean pagecache?

No, it means that whenever we reclaim clean pagecache pages,
we should also start the writeout of some dirty pages.

> We could do something like:
>
> 	if (zone->nr_inactive_dirty > zone->nr_inactive_clean) {
> 		wakeup_bdflush();	/* Hope this writes the correct zone */
> 		yield();
> 	}
>
> which would get the IO underway promptly. But the caller would
> still go in and gobble remaining clean pagecache.

This is nice, but it would still be possible to have oodles of
pages "parked" on the dirty list, which we definitely need to
prevent.

> So a 1G box running dbench 1000 acts like a 600M box. Which
> is not a bad model, perhaps. If we can twiddle that 40%
> up and down based on criteria...

Writing out dirty pages whenever we reclaim free pages could
fix that problem.
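The policy argued for above (start some writeout whenever clean pages are reclaimed) can be sketched in user-space C. This is only a toy model under stated assumptions, not kernel code: `struct zone` carries just the two counters named in the quoted snippet, and `nr_writeback`, `WRITEOUT_PER_RECLAIM`, `reclaim_and_writeout()` and `complete_writeback()` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of one zone. The first two field names follow the quoted
 * snippet; nr_writeback is a hypothetical counter for pages with IO
 * in flight. */
struct zone {
    unsigned long nr_inactive_clean;  /* clean pages, reclaimable at once */
    unsigned long nr_inactive_dirty;  /* dirty pages, need writeout first */
    unsigned long nr_writeback;       /* pages queued for writeout */
};

/* Hypothetical tunable: dirty pages queued per page requested. */
#define WRITEOUT_PER_RECLAIM 1UL

static unsigned long min_ul(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

/* Reclaim up to `want` clean pages and, in the same pass, queue a
 * proportional number of dirty pages for writeout, so the dirty list
 * cannot stay parked while clean pages are plentiful.
 * Returns the number of clean pages reclaimed. */
unsigned long reclaim_and_writeout(struct zone *z, unsigned long want)
{
    assert(z != NULL);

    unsigned long reclaimed = min_ul(want, z->nr_inactive_clean);
    z->nr_inactive_clean -= reclaimed;

    /* Writeout is driven by demand, not by how many clean pages
     * happened to be available. */
    unsigned long queued = min_ul(want * WRITEOUT_PER_RECLAIM,
                                  z->nr_inactive_dirty);
    z->nr_inactive_dirty -= queued;
    z->nr_writeback += queued;

    return reclaimed;
}

/* Simulated IO completion: written-back pages become clean. */
void complete_writeback(struct zone *z)
{
    assert(z != NULL);
    z->nr_inactive_clean += z->nr_writeback;
    z->nr_writeback = 0;
}
```

In this model, a zone with 100 clean and 50 dirty pages that receives a request for 10 pages gives up 10 clean pages and queues 10 dirty ones, so the dirty list shrinks even though clean pages were never scarce.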

> But that separation of the 40% of unusable memory from the
> 60% of usable memory is done by scanning at present, and
> it costs a bit of CPU. Not much, but a bit.

There are other reasons we're wasting CPU in scanning:
1) the scanning isn't really rate limited yet (or is it?)
2) every thread in the system can fall into the scanning
   function, so if we have 50 page allocators they'll all
   happily scan the list, even though the first of these
   threads already found there wasn't anything freeable

> (btw, is there any reason at all for having page reserves
> in ZONE_HIGHMEM? I have a suspicion that this is just wasted
> memory...)

Dunno, but I guess it is to prevent a 4GB box from acting
like a 900MB box under corner conditions ;)

regards,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/
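One possible way to avoid the second scanning cost listed above (every allocating thread rescanning the same list) is a single-scanner gate. The sketch below uses C11 atomics in user space. It is an assumption-laden illustration, not how any kernel does it: `scan_in_progress`, `try_scan_inactive_list()` and the stub `scan_found_nothing()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Set while some thread is scanning the inactive list. */
atomic_flag scan_in_progress = ATOMIC_FLAG_INIT;

/* A scan callback returns the number of pages it freed. */
typedef unsigned long (*scan_fn)(void);

/* Stub scan for illustration: finds nothing freeable. */
unsigned long scan_found_nothing(void)
{
    return 0;
}

/* Run `scan` only if no other thread is already scanning.
 * Returns true and stores the result in *freed if this caller did
 * the scan; returns false at once if a scan was already running,
 * so the other 49 allocators do not repeat the same work. */
bool try_scan_inactive_list(scan_fn scan, unsigned long *freed)
{
    assert(scan != NULL && freed != NULL);

    if (atomic_flag_test_and_set(&scan_in_progress))
        return false;               /* someone else is scanning */

    *freed = scan();
    atomic_flag_clear(&scan_in_progress);
    return true;
}
```

A caller that gets `false` back would typically wait or retry the allocation rather than scan. The gate only stops concurrent scans; it does not remember that the previous scan found nothing freeable, which would need extra state.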