From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jesse Barnes
Subject: Re: Anticipatory prefaulting in the page fault handler V1
Date: Wed, 8 Dec 2004 09:33:13 -0800
References: <20041202101029.7fe8b303.cliffw@osdl.org>
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200412080933.13396.jbarnes@engr.sgi.com>
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Christoph Lameter
Cc: nickpiggin@yahoo.com.au, Jeff Garzik, torvalds@osdl.org,
    hugh@veritas.com, benh@kernel.crashing.org, linux-mm@kvack.org,
    linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org
List-ID:

On Wednesday, December 8, 2004 9:24 am, Christoph Lameter wrote:
> Page fault scalability patch and prefaulting. Max prefault order
> increased to 5 (max preallocation of 32 pages):
>
>  Gb Rep Threads    User    System     Wall  flt/cpu/s  fault/wsec
> 256  10       8 33.571s 4516.293s 863.021s  36874.099  194356.930
> 256  10      16 33.103s 3737.688s 461.028s  44492.553  363704.484
> 256  10      32 35.094s 3436.561s 321.080s  48326.262  521352.840
> 256  10      64 46.675s 2899.997s 245.020s  56936.124  684214.256
> 256  10     128 85.493s 2890.198s 203.008s  56380.890  826122.524
> 256  10     256 74.299s 1374.973s  99.088s 115762.963 1679630.272
> 256  10     512 62.760s  706.559s  53.027s 218078.311 3149273.714
>
> We are getting into an almost linear scalability in the high end with
> both patches and end up with a fault rate > 3 mio faults per second.

Nice results! Any idea how many applications benefit from this sort of
anticipatory faulting? It has implications for NUMA allocation. Imagine
an app that allocates a large virtual address space and then tries to
fault in pages near each CPU in turn. With this patch applied, CPU 2
would be referencing pages near CPU 1, and CPU 3 would then fault in 4
pages, which would then be used by CPUs 4-6. Unless I'm missing
something...
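
To make the placement concern concrete, here is a rough toy model, not
the patch's actual logic. It assumes one node per CPU, first-touch
placement, a fixed prefault window of 2**order pages (the patch is
anticipatory rather than fixed), and an app in which page p is meant to
be local to CPU p % num_cpus. The names simulate_first_touch and
remote_fraction are made up for illustration.

```python
def simulate_first_touch(num_pages, num_cpus, prefault_order):
    """Return a list mapping each page to the CPU whose fault allocated it.

    Assumptions (not taken from the patch): one node per CPU, first-touch
    placement, a fixed prefault window of 2**prefault_order pages, and
    page p is intended for CPU p % num_cpus.
    """
    window = 1 << prefault_order
    owner = [None] * num_pages
    for page in range(num_pages):
        cpu = page % num_cpus  # the CPU that wants this page to be local
        if owner[page] is not None:
            continue  # already prefaulted by an earlier fault, no new fault
        # The fault allocates the faulting page plus the pages that follow it.
        for p in range(page, min(page + window, num_pages)):
            if owner[p] is None:
                owner[p] = cpu
    return owner


def remote_fraction(owner, num_cpus):
    """Fraction of pages placed on a node other than the intended CPU's."""
    remote = sum(1 for page, cpu in enumerate(owner) if cpu != page % num_cpus)
    return remote / len(owner)


if __name__ == "__main__":
    for order in (0, 2, 5):
        owner = simulate_first_touch(num_pages=32, num_cpus=8, prefault_order=order)
        print(order, remote_fraction(owner, 8))
```

In this model, order 0 puts every page on its intended node, while
order 2 leaves three of every four pages remote. Real workloads that
touch large contiguous ranges from a single CPU would not see this
effect, so the model only illustrates the worst case described above.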

And again, I'm not sure how important that is; maybe this approach will
work well in the majority of cases. Obviously it's a big win in
faults/sec for your benchmark, but I wonder about subsequent references
from other CPUs to those pages. You can look at
/sys/devices/platform/nodeN/meminfo to see where the pages are coming
from.

Jesse