On Wed, Jun 08, 2016 at 02:43:44PM -0400, neha agarwal wrote:
> On Mon, Jun 6, 2016 at 9:51 AM, Kirill A. Shutemov wrote:
> > On Wed, May 25, 2016 at 03:11:55PM -0400, neha agarwal wrote:
> > > Hi All,
> > >
> > > I have been testing Hugh's and Kirill's huge tmpfs patch sets with
> > > Cassandra (NoSQL database). I am seeing a significant performance gap
> > > between these two implementations (~30%): Hugh's implementation
> > > performs better than Kirill's. I am surprised to see this performance
> > > gap. Following is my test setup.
> > >
> > > Patchsets
> > > ========
> > > - For Hugh's:
> > > I checked out 4.6-rc3, applied Hugh's preliminary patches (01 to 10)
> > > from here: https://lkml.org/lkml/2016/4/5/792 and then applied the
> > > THP patches posted on April 16 (01 to 29).
> > >
> > > - For Kirill's:
> > > I am using his branch
> > > "git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git hugetmpfs/v8",
> > > which is based off of 4.6-rc3, posted on May 12.
> > >
> > > Khugepaged settings
> > > ================
> > > cd /sys/kernel/mm/transparent_hugepage
> > > echo 10 >khugepaged/alloc_sleep_millisecs
> > > echo 10 >khugepaged/scan_sleep_millisecs
> > > echo 511 >khugepaged/max_ptes_none
> > >
> > > Mount options
> > > ===========
> > > - For Hugh's:
> > > sudo sysctl -w vm/shmem_huge=2
> > > sudo mount -o remount,huge=1 /hugetmpfs
> > >
> > > - For Kirill's:
> > > sudo mount -o remount,huge=always /hugetmpfs
> > > echo force > /sys/kernel/mm/transparent_hugepage/shmem_enabled
> > > echo 511 >khugepaged/max_ptes_swap
> > >
> > > Workload Setting
> > > =============
> > > Please look at the attached setup document for Cassandra (NoSQL
> > > database): cassandra-setup.txt
> > >
> > > Machine setup
> > > ===========
> > > 36-core (72 hardware thread) dual-socket x86 server with 512 GB RAM
> > > running Ubuntu. I use control groups for resource isolation. Server
> > > and client threads run on different sockets. The frequency governor is
> > > set to "performance" to remove any performance fluctuations due to
> > > frequency variation.
> > >
> > > Throughput numbers
> > > ================
> > > Hugh's implementation: 74522.08 ops/sec
> > > Kirill's implementation: 54919.10 ops/sec
> >
> > In my setup I don't see the difference:
> >
> > v4.7-rc1 + my implementation:
> > [OVERALL], RunTime(ms), 822862.0
> > [OVERALL], Throughput(ops/sec), 60763.53021527304
> > ShmemPmdMapped: 4999168 kB
> >
> > v4.6-rc2 + Hugh's implementation:
> > [OVERALL], RunTime(ms), 833157.0
> > [OVERALL], Throughput(ops/sec), 60012.698687042175
> > ShmemPmdMapped: 5021696 kB
> >
> > It's basically within measurement error. 'ShmemPmdMapped' indicates how
> > much memory is mapped with huge pages by the end of the test.
> >
> > It's on a dual-socket 24-core machine with 64G of RAM.
> >
> > I guess we have some configuration difference or something, but so far
> > I don't see the drastic performance difference you've pointed to.
> >
> > Maybe my implementation behaves slower on bigger machines, I don't
> > know. There's no architectural reason for this.
> >
> > I'll post my updated patchset today.
> >
> > --
> > Kirill A. Shutemov
>
> Thanks a lot, Kirill, for the testing. It is interesting that you don't
> see any significant performance difference. Also, your absolute
> throughput numbers are different from mine, more so for Hugh's
> implementation.
>
> Can you please share your kernel config file?

Attached.
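
By the way, the ShmemPmdMapped numbers above come straight from
/proc/meminfo. If you want to watch how much shmem ends up mapped with
huge pages while the benchmark is running, something like this is enough
(the 1-second interval is just an example):

  watch -n1 'grep ShmemPmdMapped /proc/meminfo'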
> I will try to look if I have some different config settings. Also, I am
> assuming that you had turned off DVFS.

DVFS? I'm not sure what you're talking about. I guess it's not "dynamic
voltage and frequency scaling". :)

> One thing I forgot to mention in my previous setup email: I use 8 cores
> for running the Cassandra server threads. Can you please tell me how many
> cores you used? As Cassandra is CPU-bound, that can make a difference in
> the throughput numbers we are seeing.

I have a 24-core machine, and I didn't limit CPU usage in any way. I can
see the load average easily over 15.

--
Kirill A. Shutemov
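
P.S. In case it helps to compare against your 8-core setup: one way to pin
the Cassandra server to a fixed set of cores is a cpuset cgroup, roughly
along these lines (a sketch assuming cgroup v1 with cpuset mounted at
/sys/fs/cgroup/cpuset; the group name "cassandra" and the pid are
placeholders):

  sudo mkdir /sys/fs/cgroup/cpuset/cassandra
  # 8 cores on one socket, plus that socket's memory node
  echo 0-7 | sudo tee /sys/fs/cgroup/cpuset/cassandra/cpuset.cpus
  echo 0 | sudo tee /sys/fs/cgroup/cpuset/cassandra/cpuset.mems
  # move the Cassandra JVM into the group
  echo <cassandra-pid> | sudo tee /sys/fs/cgroup/cpuset/cassandra/tasks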