A test program creates an anonymous memory mapping the size of the system's RAM (2G). It faults in all pages of it linearly, then kicks off 128 reclaimers (on 4 cores) that map, fault and unmap 2G in total, in parallel, thereby evicting the first mapping to swap. The time is then taken for the initial mapping to get faulted in from swap linearly again, thus measuring how badly the 128 reclaimers distributed the pages on the swap space.

Average over 5 runs, standard deviation in parens:

      swap-in          user            system            total
old:  74.97s (0.38s)   0.52s (0.02s)   291.07s (3.28s)   2m52.66s (0m1.32s)
new:  45.26s (0.68s)   0.53s (0.01s)   250.47s (5.17s)   2m45.93s (0m2.63s)

where old is the current mmotm snapshot 2009-04-17-15-19 and new is these three patches applied to it.

Test program attached. Kernbench didn't show any differences on my single core x86 laptop with 256MB of RAM (poor thing).