From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 08 Feb 2006 18:53:46 +0900
From: IWAMOTO Toshihiro
Subject: Re: [PATCH 6/9] clockpro-clockpro.patch
In-Reply-To: <20060124072503.BAF6A7402F@sv1.valinux.co.jp>
References: <20051230223952.765.21096.sendpatchset@twins.localnet>
 <20051230224312.765.58575.sendpatchset@twins.localnet>
 <20051231002417.GA4913@dmt.cnet>
 <1136028546.17853.69.camel@twins>
 <20060105094722.897C574030@sv1.valinux.co.jp>
 <20060106090135.3525D74031@sv1.valinux.co.jp>
 <20060124063010.B85C77402D@sv1.valinux.co.jp>
 <20060124072503.BAF6A7402F@sv1.valinux.co.jp>
MIME-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka")
Content-Type: text/plain; charset=US-ASCII
Message-Id: <20060208095346.0BB9774038@sv1.valinux.co.jp>
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Peter Zijlstra
Cc: IWAMOTO Toshihiro, Rik van Riel, Marcelo Tosatti, linux-mm@kvack.org,
 Andrew Morton, Christoph Lameter, Wu Fengguang, Nick Piggin,
 Marijn Meijles
List-ID:

At Tue, 24 Jan 2006 16:25:03 +0900,
> Environment: Dell 1850 4GB EM64T CPUx2 HT disabled, x86_64 kernel
> Kernel 1: linux-2.6.15-rc5
> Kernel 2: linux-2.6.15-rc5 + clockpro patch posted in 2005/12/31
> Kernel 3: linux-2.6.15-rc5 + clockpro patch posted in 2005/12/31 +
>           modification to disable page cache usage from ZONE_DMA
>           (to rule out possible zone balancing related problem)
> Kernel 1 and 2 were booted with "mem=1008m", Kernel 3 was booted with
> "mem=1024m".
>
> The test program: 2read.c (attached below)
>     2read.c repeatedly reads from two files zero and zero2.
>     Command line arguments specify the ranges to be read. (See the
>     code for detail)
>     It prints the number of read operations/2 every 5 seconds and
>     terminates in 5 minutes.
>
> $ cc -O 2read.c
> $ ls -l zero*
> -rw-r--r-- 1 toshii users 1073741824 2006-01-13 17:27 zero
> -rw-r--r-- 1 toshii users 1572864000 2006-01-20 18:20 zero2
>
> (with Kernel 1)
> $ for n in 100 200 300 400 500; do
> > ./a.out -n $n $((1100-$n)) > /tmp/2d.$n ; done
> (with Kernel 2)
> $ for n in 100 200 300 400 500; do
> > ./a.out -n $n $((1100-$n)) > /tmp/2d.c.$n ; done
> (with Kernel 3)
> $ for n in 100 200 300 400 500; do
> > ./a.out -n $n $((1100-$n)) > /tmp/2d.c.nodma.$n ; done
>
> The table below is the last numbers printed by the test program
> ((number of reads)/2 in 5 minutes). Clockpro (with or without the
> ZONE_DMA modification) is always slower with one exception, and
> the slowdown can be as large as 42-54%.
>
> I've put the complete data and some generated figures at
> http://people.valinux.co.jp/~iwamoto/clockpro-20051231/
>
>   n    Kernel 1    Kernel 2    Kernel 3
> ======================================
> 100    373600      298720      395818
> 200    385639      272749      272166
> 300    371047      243734      262370
> 400    367691      213974      169714
> 500    147130      126284      103038

I did some more measurements a while ago. They might not apply to
the latest patch, but I'm summarizing the results briefly. If more
detail or measurements with a newer patch are needed, please let me
know.

1. Clockpro cannot correctly compare the page access frequencies of
   zero and zero2, and so does not prefer caching pages of zero,
   whose access frequency per page is higher.

   When n=300 or n=400, kernel 1 caches the accessed region of zero
   completely, while kernel 3 caches only 98-100% of it. However,
   kernel 3 caches 10-15% more pages of zero2 than kernel 1 does.

   When the accessed region of zero is mmap+mlocked, the following
   changes are observed:
     kernel 1: +150% performance improvement when n=500,
               no significant change otherwise
     kernel 3: -20% to +132% performance change
   Kernel 3 is still slower.

2. nr_cold_target keeps changing while the test program runs.
   Holding nr_cold_target at a fixed value generally improves
   performance, by up to 50%.
   But when n=100 and a certain value of nr_cold_target is selected,
   2read.c gets 20% slower.

--
IWAMOTO Toshihiro
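
For reference, the slowdown figures in the quoted table can be rechecked with a short script. The following is only a sketch in Python (it is not part of 2read.c or of the original measurements); the names RESULTS, slowdown_percent and slowdown_table are made up for illustration, and the numbers are copied from the table quoted above.

```python
# Final counts ((number of reads)/2 after 5 minutes), copied from the
# quoted table. Tuple order: (Kernel 1, Kernel 2, Kernel 3).
RESULTS = {
    100: (373600, 298720, 395818),
    200: (385639, 272749, 272166),
    300: (371047, 243734, 262370),
    400: (367691, 213974, 169714),
    500: (147130, 126284, 103038),
}


def slowdown_percent(baseline, patched):
    """Percent drop in throughput relative to baseline (negative = faster)."""
    return 100.0 * (baseline - patched) / baseline


def slowdown_table(results):
    """Map n -> (Kernel 2 slowdown, Kernel 3 slowdown), both vs Kernel 1."""
    return {
        n: (slowdown_percent(k1, k2), slowdown_percent(k1, k3))
        for n, (k1, k2, k3) in results.items()
    }


if __name__ == "__main__":
    for n, (s2, s3) in sorted(slowdown_table(RESULTS).items()):
        print(f"n={n}: Kernel 2 {s2:+.1f}%  Kernel 3 {s3:+.1f}%")
```

For the n=400 row this gives roughly 42% (Kernel 2) and 54% (Kernel 3), which matches the range quoted. A negative value, as for Kernel 3 at n=100, means the patched kernel was faster than Kernel 1.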