From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from max.phys.uu.nl (max.phys.uu.nl [131.211.32.73])
	by kvack.org (8.8.7/8.8.7) with ESMTP id EAA11034
	for ; Thu, 18 Jun 1998 04:31:07 -0400
Date: Thu, 18 Jun 1998 09:25:55 +0200 (CEST)
From: Rik van Riel
Reply-To: Rik van Riel
Subject: Re: PTE chaining, kswapd and swapin readahead
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
To: "Eric W. Biederman"
Cc: Linux MM
List-ID:

On 17 Jun 1998, Eric W. Biederman wrote:
> >>>>> "RR" == Rik van Riel writes:
>
> RR> True LRU swapping might actually be a disadvantage. The way
> RR> we do things now (walking process address space) can result
> RR> in a much larger I/O bandwidth to/from the swapping device.
>
> The goal should be to reduce disk I/O as disk bandwidth is way below
> memory bandwidth. Using ``unused'' disk bandwidth in prepaging may
> also be a help.

For normal disks, most I/O is so completely dominated by
_seeks_ that transfer time can be almost completely ignored.
This is why we should focus on reducing the number of
disk I/Os, not the number of blocks transferred.

> Note: much of the write I/O performance we achieve is because
> get_swap_page() is very efficient at returning adjacent swap pages.
> I don't see where location in memory makes a difference.

It makes a difference because:
- programs usually use larger areas in one piece
- swapout clustering saves a lot of I/O (like you just said)
- when pages from a process are adjacent on disk, we can
  save the same amount of I/O on _swapin_ too.
- this will result in a _much_ improved bandwidth

> We could probably add a few more likely cases to the vm system. The
> only simple special cases I can think to add are reverse sequential
> access, and stack access where pages 1 2 3 4 are accessed and then 4 3
> 2 1 are accessed in reverse order.

Maybe that's why stacks grow down :-) Looking at the addresses
of a shrinking stack, you'll notice that linear forward readahead
is still the best algorithm.

> >> Also for swapin readahead the only effective strategy I know is to
> >> implement a kernel system call, that says I'm going to be accessing
>
> The point I was hoping to make is that for programs that find
> themselves swapping frequently a non blocking read (for mmapped areas)
> can be quite effective.

Agreed. And combined with your other (snipped) call, it might
give a _huge_ advantage for large simulations and other processes
which implement it.

> RR> There are more possibilities. One of them is to use the
> RR> same readahead tactic that is being used for mmap()
> RR> readahead.
>
> Actually that sounds like a decent idea. But I doubt it will help
> much. I will start on the vnodes fairly soon, after I get a kernel
> pgflush daemon working.

It _does_ help very much. A lot of the perceived slowness of
Linux is the 'task switching' in X. By this I mean new people
selecting another window as their foreground window, causing
X and the programs to read in huge amounts of graphics data,
while simultaneously swapping out other data.

By implementing the same readahead tactic as we use for
mmap()ed files, we could cut the number of I/Os by more
than one third, and probably even more.

Rik.
+-------------------------------------------------------------------+
| Linux memory management tour guide.        H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader.      http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+
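
A rough way to see the seek argument made in this message is a toy
model that counts disk requests for a run of swapin faults, once
reading a single page per fault and once reading a whole cluster of
adjacent swap slots per fault. This is only a sketch and not kernel
code: the names io_time_ms and swapin_requests, the aligned-cluster
policy, and the two timing constants are assumptions made up for
illustration, not figures or functions from this thread. It also
assumes the faulting pages really are adjacent on the swap device,
which is the condition the clustering argument depends on.

```python
# Illustrative model only. The disk figures below are assumptions,
# not measurements from this thread.
SEEK_MS = 10.0               # assumed seek + rotational delay per request
TRANSFER_MS_PER_PAGE = 0.4   # assumed transfer time for one page


def io_time_ms(requests, pages):
    """Total time for `requests` separate disk requests moving `pages` pages."""
    return requests * SEEK_MS + pages * TRANSFER_MS_PER_PAGE


def swapin_requests(fault_offsets, cluster_pages):
    """Count requests and pages read for a trace of swap-slot faults.

    Hypothetical policy: every miss reads the whole aligned cluster of
    `cluster_pages` slots that contains the faulting slot.
    """
    resident = set()   # swap slots already brought into memory
    requests = 0
    pages_read = 0
    for offset in fault_offsets:
        if offset in resident:
            continue   # an earlier cluster read already covered this slot
        start = (offset // cluster_pages) * cluster_pages
        resident.update(range(start, start + cluster_pages))
        requests += 1
        pages_read += cluster_pages
    return requests, pages_read


if __name__ == "__main__":
    trace = list(range(64))   # 64 pages touched in ascending order
    for cluster in (1, 16):
        reqs, pages = swapin_requests(trace, cluster)
        print(cluster, reqs, pages, io_time_ms(reqs, pages))
```

With this trace both runs transfer the same 64 pages, but the
clustered run issues 4 requests instead of 64, so under the assumed
constants nearly all of the difference in total time comes from the
seeks. Walking the same trace in descending order gives the same
request count, because the cluster containing the faulting slot is
read as a whole.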