On Wed, 20 Jan 2010 15:07:52 +0530, Balbir Singh wrote:
> > This includes no functional changes.
> >
> > Signed-off-by: KAMEZAWA Hiroyuki
> >
> Before review, could you please post parallel pagefault data on a large
> system, since root now uses these per cpu counters and its overhead is
> now dependent on these counters. Also the data read from root cgroup is
> also dependent on these, could you make sure that is not broken.
>
Hmm, I rewrote the test program to avoid mmap_sem contention. This version
does fork() instead of pthread_create() and measures parallel-process
page fault speed.

[Before patch]
[root@bluextal memory]# /root/bin/perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault-fork 8

 Performance counter stats for './multi-fault-fork 8' (5 runs):

       45256919  page-faults                ( +-   0.851% )
      602230144  cache-misses               ( +-   0.187% )

   61.020533723  seconds time elapsed       ( +-   0.002% )

[After patch]
[root@bluextal memory]# /root/bin/perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault-fork 8

 Performance counter stats for './multi-fault-fork 8' (5 runs):

       46007166  page-faults                ( +-   0.339% )
      599553505  cache-misses               ( +-   0.298% )

   61.020937843  seconds time elapsed       ( +-   0.004% )

Slightly improved? But this test program exercises extreme behavior, and
you won't see a difference in real-world applications, I think. So I
guess the change falls within the error range of well-known (not small)
benchmarks.

Thanks,
-Kame
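
[Editor's sketch] For reference, below is a minimal sketch of what a
fork()-based parallel page-fault test like the one measured above might
look like. This is a hypothetical reconstruction, not the actual
multi-fault-fork source: the per-iteration mapping size, the 60-second
runtime, and the worker structure are all assumed. The key point it
illustrates is that fork()ed children each get their own mm, so the
workers do not contend on a shared mmap_sem the way threads would.

/*
 * Hypothetical sketch of a fork()-based parallel page-fault test.
 * Each child repeatedly maps, touches, and unmaps anonymous memory,
 * so every touch takes a fresh minor fault. Because the workers are
 * separate processes, there is no mmap_sem contention between them.
 */
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

#define CHUNK   (64 * 1024 * 1024)  /* per-iteration mapping size (assumed) */
#define RUNTIME 60                  /* seconds; roughly matches ~61s elapsed */

static void worker(void)
{
	long pgsz = sysconf(_SC_PAGESIZE);

	alarm(RUNTIME);             /* default SIGALRM action kills the child */
	for (;;) {
		char *p = mmap(NULL, CHUNK, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		long i;

		if (p == MAP_FAILED)
			exit(1);
		/* touch one byte per page -> one page fault per page */
		for (i = 0; i < CHUNK; i += pgsz)
			p[i] = 1;
		munmap(p, CHUNK);
	}
}

int main(int argc, char *argv[])
{
	int i, nproc = (argc > 1) ? atoi(argv[1]) : 1;

	for (i = 0; i < nproc; i++) {
		pid_t pid = fork();

		if (pid == 0)
			worker();   /* child loops until alarm() fires */
		if (pid < 0)
			exit(1);
	}
	for (i = 0; i < nproc; i++)
		wait(NULL);
	return 0;
}

A sketch like this would be driven exactly as in the measurements above,
e.g.:

  perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault-fork 8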