On Wed, May 04, 2011 at 10:32:01AM +0800, Dave Young wrote:
> On Wed, May 4, 2011 at 9:56 AM, Dave Young wrote:
> > On Thu, Apr 28, 2011 at 9:36 PM, Wu Fengguang wrote:
> >> Concurrent page allocations are suffering from high failure rates.
> >>
> >> On a 8p, 3GB ram test box, when reading 1000 sparse files of size 1GB,
> >> the page allocation failures are
> >>
> >>         nr_alloc_fail 733       # interleaved reads by 1 single task
> >>         nr_alloc_fail 11799     # concurrent reads by 1000 tasks
> >>
> >> The concurrent read test script is:
> >>
> >>         for i in `seq 1000`
> >>         do
> >>                 truncate -s 1G /fs/sparse-$i
> >>                 dd if=/fs/sparse-$i of=/dev/null &
> >>         done
> >>
> >
> > With Core2 Duo, 3G ram, No swap partition I can not produce the alloc fail
>
> unset CONFIG_SCHED_AUTOGROUP and CONFIG_CGROUP_SCHED seems affects the
> test results, now I see several nr_alloc_fail (dd is not finished
> yet):
>
> dave@darkstar-32:$ grep fail /proc/vmstat:
> nr_alloc_fail 4
> compact_pagemigrate_failed 0
> compact_fail 3
> htlb_buddy_alloc_fail 0
> thp_collapse_alloc_fail 4
>
> So the result is related to cpu scheduler.

Good catch! My kernel also disabled CONFIG_CGROUP_SCHED and
CONFIG_SCHED_AUTOGROUP.

Thanks,
Fengguang