Re: [RFC] start_aggressive_readahead

From: Scott Kaplan <sfkaplan@cs.amherst.edu>
To: "Martin J. Bligh" <Martin.Bligh@us.ibm.com>
Cc: Andrew Morton <akpm@zip.com.au>,
	Rik van Riel <riel@conectiva.com.br>,
	Christoph Hellwig <hch@lst.de>,
	linux-mm@kvack.org
Subject: Re: [RFC] start_aggressive_readahead
Date: Mon, 5 Aug 2002 14:54:03 -0400
Message-ID: <B5FE047C-A8A4-11D6-A6B5-000393829FA4@cs.amherst.edu>
In-Reply-To: <646802512.1028022723@[10.10.2.3]>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin,

Sorry for the slowness of the response, but just a thought or two...

> Both sets of heuristics seem backwards to me, depending on the
> circumstances ;-)

I don't agree, but more on that in a moment.  First, I'd like to point out 
a minor difference between what I meant by my suggestion and your 
interpretation of it.  The heuristic that I was suggesting -- grow in 
response to read-ahead misses, shrink in response to hits -- was not 
intended as a mere replacement.  It was meant as a ``blind'' approach to 
discovering the reference distribution for read-ahead pages.  So, the 
heuristic wouldn't be used simply as stated; instead, it would be a first 
approach to changing the read-ahead window size until evidence was 
gathered to make higher-level decisions.

For example, the VM system could shrink the window in response to hits, 
but if that shrinking decreased the hit count ``significantly'', it would 
return to the smallest window size that did not cause a hit decrease.  
Similarly, the VM system could increase the window size in response to 
misses, but after reaching some limit of increase where the misses do not 
decrease ``sufficiently'', it could return the window to the smallest size 
at which miss decrease was observed.
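
A rough sketch of that probing, in Python for brevity.  Everything here is 
hypothetical: the names probe_grow and probe_shrink, the doubling and 
halving steps, and the 10% thresholds standing in for ``sufficiently'' and 
``significantly'' are assumptions for illustration, and misses_at / hits_at 
stand in for whatever the VM system would observe over an interval at a 
given window size.

```python
def probe_grow(misses_at, start, limit, min_drop=0.1):
    """Grow the window in response to misses, then settle.

    misses_at(window) is a hypothetical stand-in for "misses observed
    over one interval at this window size".  Returns the window reached
    by the last growth step that cut misses by at least min_drop, or 0
    if no growth step ever helped (the un-cache-able case).
    """
    window = start
    prev = misses_at(window)
    if prev == 0:
        return window          # nothing to fix at the current size
    best = 0                   # no evidence yet that any window helps
    while window < limit and prev > 0:
        window = min(max(2 * window, 1), limit)
        cur = misses_at(window)
        if prev - cur >= min_drop * prev:
            best = window      # this step produced a real miss decrease
        prev = cur
    return best


def probe_shrink(hits_at, start, max_loss=0.1):
    """Shrink the window in response to hits until shrinking costs hits."""
    window = start
    base = hits_at(window)
    while window > 0:
        smaller = window // 2
        if base - hits_at(smaller) > max_loss * base:
            break              # hit count fell too far; keep this window
        window = smaller
    return window


# Misses fall from 100 to 20 once the window reaches 8 pages, then stay flat.
print(probe_grow(lambda w: 100 if w < 8 else 20, start=2, limit=64))  # 8
# Un-cache-able references: growth never helps, so revert to no window.
print(probe_grow(lambda w: 100, start=2, limit=64))                   # 0
# Hits collapse below 8 pages, so shrinking from 64 stops at 8.
print(probe_shrink(lambda w: 50 if w >= 8 else 5, start=64))          # 8
```

Because the steps double and halve, this only finds the smallest helpful 
window among the sizes it actually tries; a finer search would need more 
probing intervals.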

Now back to my claim: the heuristic that I suggested is not just the flip 
side of the original heuristics, with the two roughly equivalent and the 
success of one or the other merely a matter of the reference behavior.  
Assuming that an LRU-like replacement strategy is in place -- 
and I believe that page aging is LRU-like in the vast majority of 
situations -- the only way to turn a miss into a hit is to increase the 
window size.  Thus, the original heuristic's approach of shrinking the 
window in response to misses is a guarantee that future references that 
are part of the same reference behavior will remain misses.  Put 
differently, the *only* case in which it makes sense to shrink the 
read-ahead window in response to misses is one in which the misses are the 
result of un-cache-able references -- ones that would have required an 
absurdly large window, and so no window would be the best choice.  However, 
the heuristic that I described above will reach the same conclusion, 
although more slowly.  After growing the window in response to the misses 
and observing no miss decrease, it would revert to a zero-sized window.
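
To make the LRU argument concrete, here is a small simulation, again in 
Python.  It is a toy model under stated assumptions, not how any kernel 
does it: read-ahead pages sit in a bounded queue ordered by insertion, the 
oldest is evicted when the queue exceeds the window, and a page leaves the 
queue when it is referenced.  The trace format and the names 
readahead_hits and streaming_trace are made up for the example.

```python
from collections import OrderedDict


def readahead_hits(trace, window):
    """Count read-ahead hits for a trace at a given window size.

    trace is a list of ("ra", page) and ("ref", page) events: "ra"
    brings a page in speculatively, "ref" is a demand reference.
    """
    pending = OrderedDict()            # unreferenced read-ahead pages
    hits = 0
    for kind, page in trace:
        if kind == "ra":
            pending.pop(page, None)    # re-read-ahead refreshes position
            pending[page] = True
            if len(pending) > window:
                pending.popitem(last=False)   # evict the oldest page
        elif pending.pop(page, None) is not None:
            hits += 1                  # page was still in the window
    return hits


def streaming_trace(n, lead):
    """Sequential stream whose read-ahead runs `lead` pages ahead."""
    trace = [("ra", p) for p in range(lead)]
    for i in range(n):
        trace.append(("ra", i + lead))
        trace.append(("ref", i))
    return trace


trace = streaming_trace(10, lead=4)
for w in (2, 4, 5, 8):
    print(w, readahead_hits(trace, w))
# 2 0
# 4 0
# 5 10
# 8 10
```

In this trace every reference misses at a window of 4, and shrinking to 2 
in response leaves every one of them a miss; only growing to 5 turns them 
into hits.  In this model the hit count never decreases as the window 
grows.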

Granted, this discussion is based only on the read-ahead references, and 
not on the references to other, used pages.  However, even with that 
consideration, there's almost no situation in which you want to respond to 
read-ahead misses by shrinking the window -- and in those cases where you 
do, it's because of other factors, such as the need for a hopelessly large 
window or a heavy demand on used pages that are near eviction.  
Read-ahead misses may not motivate larger 
read-ahead windows, but alone they *never* motivate smaller read-ahead 
windows.

>> So, while it is ideal to have some foresight before resizing the
>> window -- some calculation that determines whether or not growth
>> will help or shrinkage will hurt -- it will require the VM system
>> to gather hit distributions.
>
> Yup, but I think it's almost certainly worth that expense.

I'm happy that you think so, because I'm trying to do that now, and it's 
going to create some overhead.  Much like current rmap implementations, 
it's going to be the most intrusive for those cases where no paging is 
involved, and so the gains of tracking such information cannot be realized.

> How you actually calculate the window is a matter for debate and
> experimentation, but just growing and shrinking based on purely the
> hit rate seems like a bad idea.

Here I do agree.  Rather than finding the hit distribution by blindly 
setting allocations and observing the outcome, we can gather data to 
indicate what the outcome *would* be for that allocation.  Note, however, 
that VM systems have a long, long history of simply responding to blindly 
gathered data, such as increasing or decreasing an allocation based on 
hit rate.  It's a matter of convincing people that 
gathering data that shows you the search space on-line is worth the 
complexity and the overhead.
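
For illustration, one standard way to gather that kind of data for a plain 
LRU queue is to record, for each re-reference, the depth at which the page 
was found; the histogram of depths then predicts the hit count for every 
queue size at once.  The sketch below is a deliberately naive Python 
version of that technique, not the implementation I'm working on, and the 
names lru_distance_histogram and predicted_hits are made up for the 
example.

```python
from collections import Counter


def lru_distance_histogram(refs):
    """Histogram of LRU queue depths at which re-references are found.

    First-time references are not recorded, since no queue size could
    have made them hits.
    """
    stack = []                 # most recently used page first
    hist = Counter()
    for page in refs:
        if page in stack:
            hist[stack.index(page) + 1] += 1   # depth is 1-based
            stack.remove(page)
        stack.insert(0, page)                  # page is now most recent
    return hist


def predicted_hits(hist, size):
    """Hits an LRU queue of `size` pages would have seen on this trace."""
    return sum(count for depth, count in hist.items() if depth <= size)


hist = lru_distance_histogram(list("abcabcabc"))
print(predicted_hits(hist, 2))   # 0: a 2-page queue never hits this loop
print(predicted_hits(hist, 3))   # 6: a 3-page queue catches every re-use
```

The list scan makes this linear in the queue length per reference, which 
is exactly the sort of overhead at issue; a real implementation would need 
something much cheaper.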

Scott
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (Darwin)
Comment: For info see http://www.gnupg.org

iD8DBQE9TsnO8eFdWQtoOmgRAiC1AJsE3nhGa5zIGtkTsn7FBEuwrhX2uwCfcgzK
x7JgsWbQcQIhk3BSS2Wyu/o=
=oSsq
-----END PGP SIGNATURE-----
