Date: Mon, 29 Jul 2002 11:24:07 -0400
From: Scott Kaplan
To: Rik van Riel
Cc: Andrew Morton, Christoph Hellwig, torvalds@transmeta.com, linux-mm@kvack.org
Subject: Re: [RFC] start_aggressive_readahead
Message-Id: <397881C8-A307-11D6-A4C0-000393829FA4@cs.amherst.edu>

On Sunday, July 28, 2002, at 11:05 PM, Rik van Riel wrote:

> My experience with 300 ftp clients pulling a collective 40 Mbit/s
> suggests otherwise.
>
> About 70% of the clients were on modem speed and the other 30% of
> the clients were on widely variable higher speeds.
>
> Since a disk seek + read is about 10ms, the absolute maximum
> number of seeks that can be done is 100 a second and the minimum
> amount of time between disk seeks for one stream should be about
> 3 seconds.

This is a very interesting example of real (and important) reference
behavior that must be understood to be handled well.  In the context of
this thread, this case is substantially different from your original
comment on read-ahead for ``linear file I/O''.

As a refresher for myself and anyone else who needs it: I claimed that
linear file I/O is a case in which read-ahead blocks should not be
cached for long before they are either used or evicted for lack of use.
(That is, they should be placed nearer to the end of the LRU ordering.)
The claim was based on the observation that sequential file traversal
is a very good case for read-ahead, where the read-ahead blocks are
very likely to be used very soon.

What's important about this example is that, due to the whole-system
workload and the disparate connection speeds of the ftp clients, it is
*NOT* a typical case of linear file I/O.  In fact, what's odd about it
is that block `i' of a file will be read, and, for slower connections,
block `i+1' will *not* be used for some time, since delivering block
`i' over a slow link takes a while.  In other words, the interleaved
reference behavior from all of these ftp downloads makes the prediction
that block `i+1' will be used soon a weaker prediction.  It is very
likely to be used, yes, but in many cases not soon, because of all the
other files being read and referenced.  Because the soonness of use is
weak, we do indeed want to cache the read-ahead pages for longer.
(That is, I agree that for this example, read-ahead pages should go
into the active list.)

Caching read-ahead pages for longer, though, displaces more of the
pages that have actually been used, forcing them to be evicted sooner
than they would have been without the aggressive read-ahead caching.
Critically, for *this* workload, that's probably just fine.  Assuming
that different files are being downloaded by different ftp clients, a
block is probably worth little to cache for long in case of re-use
once it has been read and referenced.  In other words, among the
referenced pages, those near the end of the LRU ordering are referenced
rarely.  The competition between read-ahead pages and less recently
used referenced pages is lopsided in favor of the read-ahead pages.
But that is only a consequence of the reference pattern for *this
specific workload* -- it may not be true for other workloads.
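(For concreteness, here is a back-of-the-envelope version of the
arithmetic above as a trivial C program.  The 10 ms seek+read time and
the 300 clients are the figures from Rik's message; the 4 KB page size
and 56 kbit/s modem rate are my own assumptions, just to show how long
a read-ahead page may sit before a slow client gets to it.)

#include <stdio.h>

int main(void)
{
    /* Figures from Rik's message. */
    const double seek_plus_read_ms = 10.0;   /* one disk seek + read   */
    const int    nr_streams        = 300;    /* concurrent ftp clients */

    /* My assumptions, for illustration only. */
    const double page_kbit  = 4.0 * 8.0;     /* one 4 KB page, in kbit */
    const double modem_kbps = 56.0;          /* "modem speed" client   */

    /* At ~10 ms per seek+read the disk can service at most ~100
     * requests per second, so each of the 300 streams can get back
     * to the disk only about once every 3 seconds. */
    const double max_seeks_per_sec  = 1000.0 / seek_plus_read_ms;
    const double secs_between_seeks = nr_streams / max_seeks_per_sec;

    /* A modem client takes on the order of half a second to consume a
     * single 4 KB page, so a read-ahead page for such a stream can
     * easily sit unreferenced for seconds -- which is why "will be
     * used soon" is a weak prediction for this workload. */
    const double secs_per_page_modem = page_kbit / modem_kbps;

    printf("max seeks/sec:             %.0f\n", max_seeks_per_sec);
    printf("secs between seeks/stream: %.1f\n", secs_between_seeks);
    printf("secs per 4KB page (modem): %.2f\n", secs_per_page_modem);
    return 0;
}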
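(Along the same lines, here is a rough sketch of the placement decision
itself.  This is not kernel code; every name below is made up.  It only
shows the shape of the policy I have in mind: if the stream that
triggered the read-ahead is consuming pages quickly, the page can go
near the eviction end of the LRU, but if the stream is slow, like the
modem clients above, the page has to survive longer and belongs on the
active list.)

#include <stdio.h>

enum lru_placement { PLACE_INACTIVE_TAIL, PLACE_ACTIVE };

/* Per-stream bookkeeping; in a real system this would be measured. */
struct stream_stats {
    double avg_secs_between_reads;
};

/*
 * Threshold below which "will be used soon" is a strong prediction.
 * The 0.25 s value is a placeholder; an adaptive policy would derive
 * it from how long a page actually survives on the inactive list.
 */
#define SOON_SECS 0.25

static enum lru_placement place_readahead_page(const struct stream_stats *s)
{
    return s->avg_secs_between_reads <= SOON_SECS
               ? PLACE_INACTIVE_TAIL   /* fast, truly linear consumer */
               : PLACE_ACTIVE;         /* slow, interleaved consumer  */
}

int main(void)
{
    struct stream_stats local_cp  = { .avg_secs_between_reads = 0.01 };
    struct stream_stats modem_ftp = { .avg_secs_between_reads = 0.57 };

    printf("local cp:  %s\n",
           place_readahead_page(&local_cp) == PLACE_ACTIVE
               ? "active list" : "inactive tail");
    printf("modem ftp: %s\n",
           place_readahead_page(&modem_ftp) == PLACE_ACTIVE
               ? "active list" : "inactive tail");
    return 0;
}

The interesting part, of course, is how something like SOON_SECS gets
chosen and updated, which is exactly the adaptivity question below.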
Incidentally, this is all just mental masturbation until someone
actually records and measures the reference behavior from this kind of
workload.  It all sounds about right, but that's neither good science
nor good engineering.

In short, I agree that for this case, inserting read-ahead pages into
the inactive list may not be aggressive enough.  I disagree that the
reason is ``linear file I/O'', since the reference pattern here is more
complex than that.

This is also a wonderful case for getting read-ahead caching adaptivity
right: a system that can weigh read-ahead caching allocations against
allocations for less recently used referenced pages would detect and
adjust to this case quickly, while avoiding such aggressive read-ahead
caching for other workloads.

Scott