From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from dax.scot.redhat.com (sct@dax.scot.redhat.com [195.89.149.242]) by kvack.org (8.8.7/8.8.7) with ESMTP id PAA00492 for ; Thu, 26 Nov 1998 15:54:05 -0500 Date: Fri, 27 Nov 1998 16:02:51 GMT Message-Id: <199811271602.QAA00642@dax.scot.redhat.com> From: "Stephen C. Tweedie" MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Subject: Re: [2.1.130-3] Page cache DEFINATELY too persistant... feature? In-Reply-To: References: <199811261236.MAA14785@dax.scot.redhat.com> Sender: owner-linux-mm@kvack.org To: Linus Torvalds Cc: "Stephen C. Tweedie" , Benjamin Redelings I , linux-kernel@vger.rutgers.edu, linux-mm@kvack.org List-ID: Hi, Looks like I have a handle on what's wrong with the 2.1.130 vm (in particular, its tendency to cache too much at the expense of swapping). The real problem seems to be that shrink_mmap() can fail for two completely separate reasons. First of all, we might fail to find a free page because all of the cache pages we find are recently referenced. Secondly, we might fail to find a cache page at all. The first case is an example of an overactive, large cache; the second is an example of a very small cache. Currently, however, we treat these two cases pretty much the same. In the second case, the correct reaction is to swap, and 2.1.130 is sufficiently good at swapping that we do so, heavily. In the first case, high cache throughput, what we really _should_ be doing is to age the pages more quickly. What we actually do is to swap. On reflection, there is a completely natural way of distinguishing between these two cases, and that is to extend the size of the shrink_mmap() pass whenever we encounter many recently touched pages. This is easy to do: simply restricting the "count_min" accounting in shrink_mmap to avoid including salvageable but recently-touched pages will automatically cause us to age faster as we encounter more touched pages in the cache. The patch below both makes sense from this perspective and seems to work, which is always a good sign! Moreover, it is inherently self-tuning. The more recently-accessed cache pages we encounter, the faster we will age the cache. On an 8MB boot, it is just as fast as plain 2.1.130 at doing a make defrag over NFS (which means it is still 25% faster than 2.0.36 at this job even on low-memory NFS). On 64MB, I've been running emacs with two or three 10MB mail folders loaded, netscape running a bunch of pages, a couple of large 16-bit images under xv and a kernel build all on different X pages, and switching between them is all extremely fast. It still has the great responsiveness under this sort of load that first came with 2.1.130. The good news, however, is that an extra filesystem load of "wc /usr/bin/*" does not cause a swap storm on the new kernel: it is finally able to distinguish a cache which is too active and a cache which is too small. The system does still swap well once it decides it needs to, giving swapout burst rates of 2--3MB/sec during this test. Note that older kernels did not have this problem because in 1.2, the buffer cache would never ever grow into the space used by the rest of the kernel, and in 2.0, the presence of page aging in the swapper but not in the cache caused us to run around the shrink_mmap() loop far too much anyway, at the expense of good swap performance. --Stephen ---------------------------------------------------------------- --- mm/filemap.c.~1~ Thu Nov 26 18:48:52 1998 +++ mm/filemap.c Fri Nov 27 12:45:03 1998 @@ -200,8 +200,8 @@ struct page * page; int count_max, count_min; - count_max = (limit<<1) >> (priority>>1); - count_min = (limit<<1) >> (priority); + count_max = limit; + count_min = (limit<<2) >> (priority); page = mem_map + clock; do { @@ -214,7 +214,15 @@ if (shrink_one_page(page, gfp_mask)) return 1; count_max--; - if (page->inode || page->buffers) + /* + * If the page we looked at was recyclable but we didn't + * reclaim it (presumably due to PG_referenced), don't + * count it as scanned. This way, the more referenced + * page cache pages we encounter, the more rapidly we + * will age them. + */ + if (atomic_read(&page->count) != 1 || + (!page->inode && !page->buffers)) count_min--; page++; clock++; -- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org