From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 20 Sep 2007 11:19:36 +0100
Subject: Re: [PATCH/RFC 1/14] Reclaim Scalability: Convert anon_vma lock to read/write lock
Message-ID: <20070920101936.GA24105@skynet.ie>
References: <20070914205359.6536.98017.sendpatchset@localhost>
 <20070914205405.6536.37532.sendpatchset@localhost>
 <20070917110234.GF25706@skynet.ie>
 <1190146641.5035.80.camel@localhost>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <1190146641.5035.80.camel@localhost>
From: mel@skynet.ie (Mel Gorman)
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Lee Schermerhorn
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, clameter@sgi.com,
 riel@redhat.com, balbir@linux.vnet.ibm.com, andrea@suse.de,
 a.p.zijlstra@chello.nl, eric.whitney@hp.com, npiggin@suse.de
List-ID:

On (18/09/07 16:17), Lee Schermerhorn didst pronounce:
> On Mon, 2007-09-17 at 12:02 +0100, Mel Gorman wrote:
> > On (14/09/07 16:54), Lee Schermerhorn didst pronounce:
> > > [PATCH/RFC] 01/14 Reclaim Scalability: Convert anon_vma list lock to a read/write lock
> > >
> > > Against 2.6.23-rc4-mm1
> > >
> > > Make the anon_vma list lock a read/write lock.  Heaviest use of this
> > > lock is in the page_referenced()/try_to_unmap() calls from vmscan
> > > [shrink_page_list()].  These functions can use a read lock to allow
> > > some parallelism for different cpus trying to reclaim pages mapped
> > > via the same set of vmas.
> > >
> > > This change should not change the footprint of the anon_vma in the
> > > non-debug case.
> > >
> > > Note: I have seen systems livelock with all cpus in reclaim, down
> > > in page_referenced_anon() or try_to_unmap_anon() spinning on the
> > > anon_vma lock.  I have only seen this with the AIM7 benchmark with
> > > workloads of 10s of thousands of tasks.  All of these tasks are
> > > children of a single ancestor, so they all share the same anon_vma
> > > for each vm area in their respective mm's.  I'm told that Apache
> > > can fork thousands of children to handle incoming connections, and
> > > I've seen similar livelocks--albeit on the i_mmap_lock [next patch]--
> > > running 1000s of Oracle users on a large ia64 platform.
> > >
> > > With this patch [along with Rik van Riel's split LRU patch] we were
> > > able to see the AIM7 workload start swapping, instead of hanging,
> > > for the first time.  Same workload DID hang with just Rik's patch,
> > > so this patch is apparently useful.
> > >
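For reference, the shape of the change being described is roughly as
below.  This is only a sketch against the 2.6.23-era rmap code, not the
patch itself, so the names used (anon_vma->lock, page_lock_anon_vma(),
anon_vma_unlink()) are from mainline of that time and the details of the
actual patch may differ:

/* include/linux/rmap.h: the list lock becomes a read/write lock */
struct anon_vma {
	rwlock_t lock;			/* was: spinlock_t lock */
	struct list_head head;		/* list of private "related" vmas */
};

/*
 * mm/rmap.c: the reclaim-side walkers (page_referenced_anon(),
 * try_to_unmap_anon()) only traverse the vma list, so the lookup
 * helper can take the lock shared and different cpus can walk the
 * same anon_vma concurrently ...
 */
static struct anon_vma *page_lock_anon_vma(struct page *page)
{
	struct anon_vma *anon_vma;
	unsigned long anon_mapping;

	rcu_read_lock();
	anon_mapping = (unsigned long)page->mapping;
	if (!(anon_mapping & PAGE_MAPPING_ANON) || !page_mapped(page)) {
		rcu_read_unlock();
		return NULL;
	}

	anon_vma = (struct anon_vma *)(anon_mapping - PAGE_MAPPING_ANON);
	read_lock(&anon_vma->lock);		/* was: spin_lock() */
	return anon_vma;
}

static void page_unlock_anon_vma(struct anon_vma *anon_vma)
{
	read_unlock(&anon_vma->lock);		/* was: spin_unlock() */
	rcu_read_unlock();
}

/* ... while paths that modify the list still take the lock exclusively */
void anon_vma_unlink(struct vm_area_struct *vma)
{
	struct anon_vma *anon_vma = vma->anon_vma;
	int empty;

	if (!anon_vma)
		return;

	write_lock(&anon_vma->lock);		/* was: spin_lock() */
	list_del(&vma->anon_vma_node);
	empty = list_empty(&anon_vma->head);
	write_unlock(&anon_vma->lock);		/* was: spin_unlock() */

	if (empty)
		anon_vma_free(anon_vma);
}

The read-side paths still serialise against anything modifying the list;
the parallelism gained is only between cpus walking the same anon_vma
during reclaim, which is exactly the case described as livelocking above.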
> >
> > In light of what Peter and Linus said about rw-locks being more expensive
> > than spinlocks, we'll need to measure this with some benchmark. The plus
> > side is that this patch can be handled in isolation because it's either a
> > scalability fix or it isn't. It's worth investigating because you say it
> > fixed a real problem where under load the job was able to complete with
> > this patch and live-locked without it.
> >
> > kernbench is unlikely to show up anything useful here although it might be
> > worth running anyway just in case. brk_test from aim9 might be useful as it's
> > a micro-benchmark that uses brk(), which is a path affected by this patch. As
> > aim7 exercises this path, it would be interesting to see whether it shows
> > performance differences in the normal non-stressed case. Other suggestions?
>
> As Mel predicted, kernel builds don't seem to be affected by this patch,
> nor by the i_mmap_lock rw_lock patch.  Below I've included results for an
> old ia64 system that I have pretty much exclusive access to.  I can't
> get 23-rc4-mm1 nor rc6-mm1 to boot on an x86_64 [AMD-based] system right
> now--still trying to capture a stack trace [not easy from a remote
> console :-(].
>

On x86_64, I got -0.34% and -0.03% regressions on two different machines
with kernbench.  However, that is pretty close to noise.  On a range of
NUMA and non-NUMA machines with 2.6.23-rc6-mm1, I saw Total CPU figures
ranging from -1.23% to 1.02% and System CPU time from -1.09% to 6.54%.
DBench figures were from -2.54% to 4.94%, but DBench tends to vary by
about that much anyway, so consider it a basic smoke test at least.
hackbench (tested just in case) didn't show anything unusual.  I haven't
done scalability testing with multiple processes like aim7 yet, but so
far we're looking ok.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org