Greetings,

FYI, we noticed a 1.2% improvement of vm-scalability.throughput due to commit:


commit: 878308e2e0d003e923a0fad51657441916ca1a86 ("[RFC PATCH 2/3] hugetlbfs: Only take i_mmap_rwsem when sharing is possible")
url: https://github.com/0day-ci/linux/commits/Mike-Kravetz/hugetlbfs-address-fault-time-regression/20200707-043055
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 9ebcfadb0610322ac537dd7aa5d9cbc2b2894c68

in testcase: vm-scalability
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:

        runtime: 300s
        size: 8T
        test: anon-cow-seq-hugetlb
        cpufreq_governor: performance
        ucode: 0x5002f01

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/

Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml   # job file is attached in this email
        bin/lkp run job.yaml
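For reference, the sketch below shows one way to fetch the base tree and the
posted series for a manual A/B comparison outside the LKP harness. The URLs,
branch name and commit ids are the ones quoted above; the clone URL, the
FETCH_HEAD workflow and the build/boot steps are assumptions for illustration
only, not the robot's exact procedure:

        # hedged sketch: fetch the quoted base commit and the posted series
        git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
        cd linux
        git checkout 9ebcfadb0610322ac537dd7aa5d9cbc2b2894c68    # quoted base commit
        git fetch https://github.com/0day-ci/linux \
                Mike-Kravetz/hugetlbfs-address-fault-time-regression/20200707-043055
        git checkout FETCH_HEAD    # tip of the posted series
        # build and boot each kernel of interest (e1f9bcc75b vs 878308e2e0 in the
        # comparison below), then run the attached job:  bin/lkp run job.yaml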
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/8T/lkp-csl-2sp6/anon-cow-seq-hugetlb/vm-scalability/0x5002f01

commit:
  e1f9bcc75b ("Revert: "hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race"")
  878308e2e0 ("hugetlbfs: Only take i_mmap_rwsem when sharing is possible")

e1f9bcc75b135fa7 878308e2e0d003e923a0fad5165
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
    368149            +1.2%     372740        vm-scalability.median
      2.08 ±  6%      +1.1        3.21 ±  6%  vm-scalability.median_stddev%
  36859123            +1.2%   37316599        vm-scalability.throughput
      7183            +2.3%       7351        vm-scalability.time.percent_of_cpu_this_job_got
     13255            +1.3%      13426        vm-scalability.time.system_time
      8653            +2.9%       8904        vm-scalability.time.user_time
    353403           -87.2%      45099        vm-scalability.time.voluntary_context_switches
      2215            -2.0%       2170        boot-time.idle
  24601711 ± 19%     +27.9%   31454336 ±  9%  meminfo.DirectMap2M
     58850 ±  2%     -10.7%      52572 ±  2%  cpuidle.C1E.usage
   1542818           -15.3%    1306149        cpuidle.C6.usage
      3422 ± 11%     -13.6%       2957 ±  7%  slabinfo.fsnotify_mark_connector.active_objs
      3422 ± 11%     -13.6%       2957 ±  7%  slabinfo.fsnotify_mark_connector.num_objs
     13429 ± 10%     -19.8%      10769 ±  4%  softirqs.CPU0.SCHED
    545104           -26.3%     401941        softirqs.SCHED
     24.00            -8.3%      22.00        vmstat.cpu.id
     29.00            +3.4%      30.00        vmstat.cpu.us
      4618           -43.5%       2609 ±  2%  vmstat.system.cs
     76078            +2.7%      78129        vmstat.system.in
    580.50 ± 17%     -33.2%     387.75 ±  9%  interrupts.CPU15.CAL:Function_call_interrupts
    519.75 ± 20%     -25.5%     387.25 ± 11%  interrupts.CPU19.CAL:Function_call_interrupts
    163.50 ± 77%     -90.1%      16.25 ±100%  interrupts.CPU24.TLB:TLB_shootdowns
     69.00 ± 10%     +40.9%      97.25 ± 16%  interrupts.CPU25.RES:Rescheduling_interrupts
    727.75 ± 41%     -42.3%     420.00 ± 28%  interrupts.CPU31.CAL:Function_call_interrupts
    159.25 ± 61%     -57.9%      67.00 ± 32%  interrupts.CPU31.RES:Rescheduling_interrupts
      9.25 ±142%    +518.9%      57.25 ± 61%  interrupts.CPU39.TLB:TLB_shootdowns
    746.50 ±  8%     -25.6%     555.25 ± 10%  interrupts.CPU7.CAL:Function_call_interrupts
     99.50 ± 18%     -25.1%      74.50 ± 28%  interrupts.CPU73.RES:Rescheduling_interrupts
    208.25 ± 36%     -68.3%      66.00 ± 44%  interrupts.CPU82.RES:Rescheduling_interrupts
    612.75 ± 34%     -35.4%     395.75 ± 22%  interrupts.CPU88.CAL:Function_call_interrupts
    509479 ± 44%     -52.4%     242707 ± 32%  sched_debug.cfs_rq:/.load.max
   9228443 ± 10%     +23.1%   11358951        sched_debug.cfs_rq:/.min_vruntime.max
    538248 ± 13%     +60.0%     861238 ± 15%  sched_debug.cfs_rq:/.min_vruntime.stddev
     18.97 ±  9%     +33.8%      25.40 ±  6%  sched_debug.cfs_rq:/.nr_spread_over.avg
     88.65 ± 33%     +64.1%     145.50 ± 26%  sched_debug.cfs_rq:/.nr_spread_over.max
     18.49 ± 13%     +45.9%      26.98 ± 19%  sched_debug.cfs_rq:/.nr_spread_over.stddev
    223.44 ± 11%     +19.6%     267.14 ±  4%  sched_debug.cfs_rq:/.runnable_avg.stddev
    701309 ± 21%     +83.4%    1286114 ± 13%  sched_debug.cfs_rq:/.spread0.max
    538445 ± 13%     +59.9%     861184 ± 15%  sched_debug.cfs_rq:/.spread0.stddev
    222.23 ± 11%     +17.7%     261.61 ±  5%  sched_debug.cfs_rq:/.util_avg.stddev
    296.77 ±  5%     +22.6%     363.91 ±  9%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
      2299 ±100%    +404.8%      11607 ± 52%  sched_debug.cpu.max_idle_balance_cost.stddev
      0.27 ±  6%     +27.7%       0.35 ± 23%  sched_debug.cpu.nr_running.stddev
      7774 ±  9%     -30.7%       5387 ±  2%  sched_debug.cpu.nr_switches.avg
      5083 ± 10%     -61.2%       1972 ±  8%  sched_debug.cpu.nr_switches.min
      0.02 ± 53%    +159.3%       0.05 ± 41%  sched_debug.cpu.nr_uninterruptible.avg
      6771 ± 11%     -34.5%       4435 ±  2%  sched_debug.cpu.sched_count.avg
      4287 ± 13%     -72.0%       1201 ± 10%  sched_debug.cpu.sched_count.min
      3032 ± 14%     +23.7%       3752 ±  4%  sched_debug.cpu.sched_count.stddev
      3015 ± 11%     -42.6%       1731 ±  2%  sched_debug.cpu.sched_goidle.avg
      1853 ± 13%     -82.3%     328.67 ± 18%  sched_debug.cpu.sched_goidle.min
      3128 ± 11%     -37.8%       1944 ±  3%  sched_debug.cpu.ttwu_count.avg
      1007 ± 12%     -35.6%     648.58 ± 10%  sched_debug.cpu.ttwu_count.min
    783.48 ± 10%     +19.2%     933.93 ±  6%  sched_debug.cpu.ttwu_local.avg


                            vm-scalability.throughput

   [trend plot: bisect-bad (O) samples cluster near 3.72e+07-3.74e+07;
    bisect-good samples fall between roughly 3.6e+07 and 3.7e+07]

                vm-scalability.time.voluntary_context_switches

   [trend plot: bisect-good samples stay around 350000; bisect-bad (O)
    samples drop to roughly 45000-50000]

 [*] bisect-good sample
 [O] bisect-bad sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen