On Fri, 2007-05-04 at 11:27 -0700, Christoph Lameter wrote:
>
> Not sure where to go here. Increasing the per cpu slab size may hold off
> the issue up to a certain cpu cache size. For that we would need to
> identify which slabs create the performance issue.
>
> One easy way to check that this is indeed the case: Enable fake NUMA. You
> will then have separate queues for each processor since they are on
> different "nodes". Create two fake nodes. Run one thread in each node and
> see if this fixes it.

I tried with fake NUMA (booted with numa=fake=2) and used

    numactl --physcpubind=1 --membind=0 ./netserver
    numactl --physcpubind=2 --membind=1 ./netperf -t TCP_STREAM -l 60 \
        -H 127.0.0.1 -i 5,5 -I 99,5 -- -s 57344 -S 57344 -m 4096

to run the tests. The results are about the same as in the non-NUMA case,
with slab about 5% better than slub.

So the difference is probably due to something other than partial slabs.

The kernel config file is attached.

Tim