On Thu, Jan 11, 2007 at 10:13:55AM +1100, Nick Piggin wrote:
> David Chinner wrote:
> >On Wed, Jan 10, 2007 at 03:04:15PM -0800, Christoph Lameter wrote:
> >
> >>On Thu, 11 Jan 2007, David Chinner wrote:
> >>
> >>
> >>>The performance and smoothness is fully restored on 2.6.20-rc3
> >>>by setting dirty_ratio down to 10 (from the default 40), so
> >>>something in the VM is not working as well as it used to....
> >>
> >>dirty_background_ratio is left as is at 10?
> >
> >
> >Yes.
> >
> >
> >>So you gain performance by switching off background writes via pdflush?
> >
> >
> >Well, pdflush appears to be doing very little on both 2.6.18 and
> >2.6.20-rc3. In both cases kswapd is consuming 10-20% of a CPU and
> >all of the pdflush threads combined (I've seen up to 7 active at
> >once) use maybe 1-2% of cpu time. This occurs regardless of the
> >dirty_ratio setting.
>
> Hi David,
>
> Could you get /proc/vmstat deltas for each kernel, to start with?

Sure, but that doesn't really show how erratic the per-filesystem
throughput is, because the test I'm running is PCI-X bus limited in
its throughput at about 750MB/s. Each dm device is capable of about
340MB/s write, so when one slows down, the others will typically
speed up.

So, what I've attached is three files which have both 'vmstat 5'
output and 'iostat 5 |grep dm-' output in them.

	- 2.6.18.out - 2.6.18 behaviour near start of writes.
	  Behaviour does not change over the course of the test,
	  just gets a bit slower as the test moves from the outer
	  edge of the disk to the inner. Erratic behaviour is
	  highlighted.

	- 2.6.20-rc3.out - 2.6.20-rc3 behaviour near start of writes.
	  Somewhat more erratic than 2.6.18, but about 100-150GB into
	  the write test, things change with dirty_ratio=40. Erratic
	  behaviour is highlighted.

	- 2.6.20-rc3-worse.out - 2.6.20-rc3 behaviour when things go
	  bad. We're not keeping the disks or the PCI-X bus fully
	  utilised (each dm device can do about 300MB/s at this
	  offset) and aggregate throughput has dropped to 500-600MB/s.

With 2.6.20-rc3 and dirty_ratio = 10, the performance drop-off part
way into the test does not occur and the output is almost identical
to 2.6.18.out.

> I'm guessing CPU time isn't a problem, but if it is then I guess
> profiles as well.

Plenty of idle cpu so I don't think it's a problem.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
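
For reference, a minimal sketch of one way the /proc/vmstat deltas
requested in the thread could be collected. It is not taken from the
original message: the function names parse_vmstat, vmstat_delta and
sample_deltas are hypothetical, and it assumes every line of
/proc/vmstat is a "name value" pair with an integer value.

```python
import time


def parse_vmstat(text):
    """Parse 'name value' lines (as in /proc/vmstat) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Assumption: valid lines have exactly two fields; skip anything else.
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


def vmstat_delta(before, after):
    """Return after - before for every counter present in both snapshots."""
    return {name: after[name] - before[name] for name in after if name in before}


def sample_deltas(interval=5.0, path="/proc/vmstat"):
    """Take two snapshots `interval` seconds apart and return the deltas."""
    with open(path) as f:
        before = parse_vmstat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_vmstat(f.read())
    return vmstat_delta(before, after)
```

Calling sample_deltas(5.0) on each kernel while the write test runs
would give per-interval deltas on the same 5 second cadence as the
'vmstat 5' output. Some entries, such as nr_dirty, are current levels
rather than running totals, so their deltas can be negative.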