Greetings,

Please note that we previously reported a regression in the will-it-scale malloc1 benchmark caused by commit f35b5d7d676e ("mm: align larger anonymous mappings on THP boundaries") at
  https://lore.kernel.org/all/202210181535.7144dd15-yujie.liu@intel.com/
and Nathan reported a kbuild slowdown under the clang toolchain at
  https://lore.kernel.org/all/Y1DNQaoPWxE+rGce@dev-arch.thelio-3990X/
That commit was eventually reverted. When we tested the revert commit, the score in the malloc1 benchmark recovered, but we observed another regression in the mmap1 benchmark. "Yin, Fengwei" helped to check and found the following clues:

1. The regression is related to the VMA merge with the prev/next VMA when doing mmap.
2. Before the patch was reverted, almost none of the VMAs for the 128M mappings could be merged with the prev/next VMA, so a new VMA was always created. With the patch reverted, most VMAs for the 128M mappings can be merged. It looks like VMA merging introduces more latency than creating a new VMA.
3. If we force a new VMA to be created with the patch reverted, the mmap1_thread result is restored.
4. thp_get_unmapped_area() adds padding to the requested mapping length. The padding is 2M in general. I believe this padding breaks the VMA merging behavior (a sketch of the padding logic follows the test summary below).
5. It is not yet clear why the difference between the two paths (VMA merging vs. new VMA) does not show up in the perf data.

Please check the report below for details.

FYI, we noticed a -21.1% regression of will-it-scale.per_thread_ops due to commit:

commit: 0ba09b1733878afe838fe35c310715fda3d46428 ("Revert "mm: align larger anonymous mappings on THP boundaries"")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: will-it-scale
on test machine: 104 threads 2 sockets (Skylake) with 192G memory
with following parameters:

	nr_task: 50%
	mode: thread
	test: mmap1
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
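For context, the mmap1 testcase essentially maps and unmaps a 128MB anonymous region in a tight loop, so every iteration exercises the mmap()/munmap() paths that either merge with a neighboring VMA or create and destroy a new one. A minimal sketch of that loop (simplified from the will-it-scale source; not verbatim):

	#include <assert.h>
	#include <sys/mman.h>

	#define MEMSIZE (128UL * 1024 * 1024)	/* the 128M mapping discussed above */

	void testcase(unsigned long long *iterations)
	{
		while (1) {
			char *c = mmap(NULL, MEMSIZE, PROT_READ | PROT_WRITE,
				       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
			assert(c != MAP_FAILED);
			munmap(c, MEMSIZE);
			(*iterations)++;
		}
	}

And here is a sketch of the padding logic behind clue 4, loosely following __thp_get_unmapped_area() in mm/huge_memory.c as it looked before the revert (details may differ across kernel versions): the search length is padded by one extra "size" (PMD_SIZE, i.e. 2M on x86_64) so that the returned address can then be rounded up to a THP boundary.

	static unsigned long __thp_get_unmapped_area(struct file *filp,
			unsigned long addr, unsigned long len,
			loff_t off, unsigned long flags, unsigned long size)
	{
		loff_t off_end = off + len;
		loff_t off_align = round_up(off, size);
		unsigned long len_pad, ret;

		if (off_end <= off_align || (off_end - off_align) < size)
			return 0;

		/* Ask for one extra "size" (the ~2M padding from clue 4). */
		len_pad = len + size;
		if (len_pad < len || (off + len_pad) < off)
			return 0;

		ret = current->mm->get_unmapped_area(filp, addr, len_pad,
						     off >> PAGE_SHIFT, flags);
		if (IS_ERR_VALUE(ret))
			return 0;
		if (ret == addr)
			return addr;

		/* Round the start up to the THP boundary inside the padded range. */
		ret += (off - ret) & (size - 1);
		return ret;
	}

Because the returned start address is bumped up to the 2M boundary, the resulting VMA typically no longer abuts its prev/next neighbor, which matches clue 4's theory for why merging stops happening.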
In addition to that, the commit also has significant impact on the following tests:

+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 1943.6% improvement                              |
| test machine     | 128 threads 4 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory    |
| test parameters  | cpufreq_governor=performance                                                                   |
|                  | mode=process                                                                                   |
|                  | nr_task=50%                                                                                    |
|                  | test=malloc1                                                                                   |
+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | unixbench: unixbench.score 2.6% improvement                                                    |
| test machine     | 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory      |
| test parameters  | cpufreq_governor=performance                                                                   |
|                  | nr_task=30%                                                                                    |
|                  | runtime=300s                                                                                   |
|                  | test=shell8                                                                                    |
+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.build-eigen.0.seconds 9.1% regression                 |
| test machine     | 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory  |
| test parameters  | cpufreq_governor=performance                                                                   |
|                  | test=build-eigen-1.1.0                                                                         |
+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 2882.9% improvement                              |
| test machine     | 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz (Cascade Lake) with 128G memory |
| test parameters  | cpufreq_governor=performance                                                                   |
|                  | mode=process                                                                                   |
|                  | nr_task=100%                                                                                   |
|                  | test=malloc1                                                                                   |
+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 12.7% improvement                                |
| test machine     | 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz (Cascade Lake) with 128G memory |
| test parameters  | cpufreq_governor=performance                                                                   |
|                  | mode=process                                                                                   |
|                  | nr_task=50%                                                                                    |
|                  | test=mmap1                                                                                     |
+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.pthread.ops_per_sec 600.6% improvement                                    |
| test machine     | 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory |
| test parameters  | class=scheduler                                                                                |
|                  | cpufreq_governor=performance                                                                   |
|                  | nr_threads=100%                                                                                |
|                  | sc_pid_max=4194304                                                                             |
|                  | test=pthread                                                                                   |
|                  | testtime=60s                                                                                   |
+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 601.0% improvement                               |
| test machine     | 104 threads 2 sockets (Skylake) with 192G memory                                               |
| test parameters  | cpufreq_governor=performance                                                                   |
|                  | mode=process                                                                                   |
|                  | nr_task=50%                                                                                    |
|                  | test=malloc1                                                                                   |
+------------------+------------------------------------------------------------------------------------------------+

Details are as below:

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/thread/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/mmap1/will-it-scale

commit:
  23393c6461 ("char: tpm: Protect tpm_pm_suspend with locks")
tpm_pm_suspend with locks") 0ba09b1733 ("Revert "mm: align larger anonymous mappings on THP boundaries"") 23393c6461422df5 0ba09b1733878afe838fe35c310 ---------------- --------------------------- %stddev %change %stddev \ | \ 140227 -21.1% 110582 ± 3% will-it-scale.52.threads 49.74 +0.1% 49.78 will-it-scale.52.threads_idle 2696 -21.1% 2126 ± 3% will-it-scale.per_thread_ops 301.30 -0.0% 301.26 will-it-scale.time.elapsed_time 301.30 -0.0% 301.26 will-it-scale.time.elapsed_time.max 3.67 ± 71% -22.7% 2.83 ± 47% will-it-scale.time.involuntary_context_switches 0.67 ±165% -75.0% 0.17 ±223% will-it-scale.time.major_page_faults 9772 -0.7% 9702 will-it-scale.time.maximum_resident_set_size 7274 -0.3% 7254 will-it-scale.time.minor_page_faults 4096 +0.0% 4096 will-it-scale.time.page_size 0.04 ± 16% -4.0% 0.04 will-it-scale.time.system_time 0.06 ± 24% -11.8% 0.05 ± 16% will-it-scale.time.user_time 102.83 +1.9% 104.83 ± 2% will-it-scale.time.voluntary_context_switches 140227 -21.1% 110582 ± 3% will-it-scale.workload 1.582e+10 +0.1% 1.584e+10 cpuidle..time 33034032 -0.0% 33021393 cpuidle..usage 10.00 +0.0% 10.00 dmesg.bootstage:last 172.34 +0.1% 172.58 dmesg.timestamp:last 10.00 +0.0% 10.00 kmsg.bootstage:last 172.34 +0.1% 172.58 kmsg.timestamp:last 362.22 +0.0% 362.25 uptime.boot 21363 +0.1% 21389 uptime.idle 55.94 +0.2% 56.06 boot-time.boot 38.10 +0.2% 38.19 boot-time.dhcp 5283 +0.2% 5295 boot-time.idle 1.11 -0.1% 1.11 boot-time.smp_boot 50.14 +0.0 50.16 mpstat.cpu.all.idle% 0.03 ±223% -0.0 0.00 ±223% mpstat.cpu.all.iowait% 1.02 +0.0 1.03 mpstat.cpu.all.irq% 0.03 ± 4% -0.0 0.02 mpstat.cpu.all.soft% 48.59 +0.0 48.61 mpstat.cpu.all.sys% 0.20 ± 2% -0.0 0.17 ± 4% mpstat.cpu.all.usr% 0.00 -100.0% 0.00 numa-numastat.node0.interleave_hit 328352 ± 15% -7.2% 304842 ± 20% numa-numastat.node0.local_node 374230 ± 6% -4.2% 358578 ± 7% numa-numastat.node0.numa_hit 45881 ± 75% +17.1% 53735 ± 69% numa-numastat.node0.other_node 0.00 -100.0% 0.00 numa-numastat.node1.interleave_hit 381812 ± 13% +5.9% 404461 ± 14% numa-numastat.node1.local_node 430007 ± 5% +3.4% 444810 ± 5% numa-numastat.node1.numa_hit 48195 ± 71% -16.3% 40348 ± 92% numa-numastat.node1.other_node 301.30 -0.0% 301.26 time.elapsed_time 301.30 -0.0% 301.26 time.elapsed_time.max 3.67 ± 71% -22.7% 2.83 ± 47% time.involuntary_context_switches 0.67 ±165% -75.0% 0.17 ±223% time.major_page_faults 9772 -0.7% 9702 time.maximum_resident_set_size 7274 -0.3% 7254 time.minor_page_faults 4096 +0.0% 4096 time.page_size 0.04 ± 16% -4.0% 0.04 time.system_time 0.06 ± 24% -11.8% 0.05 ± 16% time.user_time 102.83 +1.9% 104.83 ± 2% time.voluntary_context_switches 50.00 +0.0% 50.00 vmstat.cpu.id 49.00 +0.0% 49.00 vmstat.cpu.sy 0.00 -100.0% 0.00 vmstat.cpu.us 0.00 -100.0% 0.00 vmstat.cpu.wa 12.50 ±100% -66.7% 4.17 ±223% vmstat.io.bi 3.33 ±141% -55.0% 1.50 ±223% vmstat.io.bo 6.00 ± 47% -16.7% 5.00 ± 44% vmstat.memory.buff 4150651 -0.1% 4148516 vmstat.memory.cache 1.912e+08 +0.1% 1.913e+08 vmstat.memory.free 0.00 -100.0% 0.00 vmstat.procs.b 50.50 -0.3% 50.33 vmstat.procs.r 8274 ± 2% +1.2% 8371 ± 4% vmstat.system.cs 211078 -0.1% 210826 vmstat.system.in 1399 +0.0% 1399 turbostat.Avg_MHz 50.12 +0.0 50.13 turbostat.Busy% 2799 -0.0% 2798 turbostat.Bzy_MHz 208677 ± 13% +1112.3% 2529776 ±194% turbostat.C1 0.03 ± 89% +0.3 0.36 ±203% turbostat.C1% 27078371 ± 15% -22.0% 21125809 ± 51% turbostat.C1E 37.41 ± 33% -9.4 28.04 ± 62% turbostat.C1E% 5088326 ± 84% +63.1% 8298766 ± 77% turbostat.C6 12.59 ± 99% +9.1 21.69 ± 78% turbostat.C6% 49.79 -0.1% 49.75 turbostat.CPU%c1 0.08 ± 71% +37.3% 0.12 ± 78% 
43.67  -0.4%  43.50  turbostat.CoreTmp
0.03  +0.0%  0.03  turbostat.IPC
64483530  -0.2%  64338768  turbostat.IRQ
647657 ± 2%  +63.2%  1057048 ± 98%  turbostat.POLL
0.01  +0.0  0.05 ±178%  turbostat.POLL%
0.01 ±223%  +200.0%  0.04 ±147%  turbostat.Pkg%pc2
0.01 ±223%  +140.0%  0.02 ±165%  turbostat.Pkg%pc6
44.17  +0.4%  44.33  turbostat.PkgTmp
284.98  +0.1%  285.28  turbostat.PkgWatt
26.78  +0.4%  26.89  turbostat.RAMWatt
2095  +0.0%  2095  turbostat.TSC_MHz
49585 ± 7%  +1.1%  50139 ± 7%  meminfo.Active
49182 ± 7%  +1.4%  49889 ± 7%  meminfo.Active(anon)
402.33 ± 99%  -37.9%  250.00 ±123%  meminfo.Active(file)
290429  -33.7%  192619  meminfo.AnonHugePages
419654  -25.9%  311054  meminfo.AnonPages
6.00 ± 47%  -16.7%  5.00 ± 44%  meminfo.Buffers
4026046  -0.1%  4023990  meminfo.Cached
98360160  +0.0%  98360160  meminfo.CommitLimit
4319751  +0.4%  4337801  meminfo.Committed_AS
1.877e+08  -0.1%  1.875e+08  meminfo.DirectMap1G
14383445 ± 12%  +0.7%  14491306 ± 4%  meminfo.DirectMap2M
1042426 ± 9%  +6.4%  1109328 ± 7%  meminfo.DirectMap4k
4.00 ±141%  -50.0%  2.00 ±223%  meminfo.Dirty
2048  +0.0%  2048  meminfo.Hugepagesize
434675  -26.3%  320518  meminfo.Inactive
431330  -26.0%  319346  meminfo.Inactive(anon)
3344 ± 95%  -65.0%  1171 ±186%  meminfo.Inactive(file)
124528  -0.1%  124460  meminfo.KReclaimable
18433  +0.7%  18559  meminfo.KernelStack
40185 ± 2%  -0.9%  39837  meminfo.Mapped
1.903e+08  +0.1%  1.904e+08  meminfo.MemAvailable
1.912e+08  +0.1%  1.913e+08  meminfo.MemFree
1.967e+08  +0.0%  1.967e+08  meminfo.MemTotal
5569412  -1.8%  5466754  meminfo.Memused
4763  -5.7%  4489  meminfo.PageTables
51956  +0.0%  51956  meminfo.Percpu
124528  -0.1%  124460  meminfo.SReclaimable
197128  +0.1%  197293  meminfo.SUnreclaim
57535 ± 7%  +0.8%  57986 ± 6%  meminfo.Shmem
321657  +0.0%  321754  meminfo.Slab
3964769  -0.0%  3964586  meminfo.Unevictable
3.436e+10  +0.0%  3.436e+10  meminfo.VmallocTotal
280612  +0.1%  280841  meminfo.VmallocUsed
6194619  -2.0%  6071944  meminfo.max_used_kB
2626 ± 28%  -7.7%  2423 ± 11%  numa-meminfo.node0.Active
2361 ± 20%  -5.3%  2236 ± 10%  numa-meminfo.node0.Active(anon)
264.67 ±117%  -29.5%  186.67 ±152%  numa-meminfo.node0.Active(file)
135041 ± 20%  -22.4%  104774 ± 42%  numa-meminfo.node0.AnonHugePages
197759 ± 18%  -20.4%  157470 ± 35%  numa-meminfo.node0.AnonPages
235746 ± 19%  -11.8%  207988 ± 29%  numa-meminfo.node0.AnonPages.max
2.00 ±223%  +0.0%  2.00 ±223%  numa-meminfo.node0.Dirty
1386137 ±123%  +89.5%  2626100 ± 67%  numa-meminfo.node0.FilePages
202317 ± 19%  -21.0%  159846 ± 36%  numa-meminfo.node0.Inactive
200223 ± 19%  -20.7%  158765 ± 35%  numa-meminfo.node0.Inactive(anon)
2093 ±129%  -48.4%  1080 ±200%  numa-meminfo.node0.Inactive(file)
46369 ± 57%  +43.5%  66525 ± 41%  numa-meminfo.node0.KReclaimable
9395 ± 4%  +4.6%  9822 ± 5%  numa-meminfo.node0.KernelStack
14343 ±101%  +65.1%  23681 ± 58%  numa-meminfo.node0.Mapped
95532160  -1.3%  94306066  numa-meminfo.node0.MemFree
97681544  +0.0%  97681544  numa-meminfo.node0.MemTotal
2149382 ± 82%  +57.0%  3375476 ± 53%  numa-meminfo.node0.MemUsed
2356 ± 21%  -9.9%  2122 ± 9%  numa-meminfo.node0.PageTables
46369 ± 57%  +43.5%  66525 ± 41%  numa-meminfo.node0.SReclaimable
109141 ± 6%  +1.5%  110817 ± 7%  numa-meminfo.node0.SUnreclaim
4514 ± 34%  -22.4%  3505 ± 30%  numa-meminfo.node0.Shmem
155511 ± 18%  +14.0%  177344 ± 14%  numa-meminfo.node0.Slab
1379264 ±124%  +90.1%  2621327 ± 67%  numa-meminfo.node0.Unevictable
46974 ± 8%  +1.5%  47665 ± 7%  numa-meminfo.node1.Active
46837 ± 8%  +1.6%  47601 ± 7%  numa-meminfo.node1.Active(anon)
137.33 ±219%  -54.0%  63.17 ± 85%  numa-meminfo.node1.Active(file)
155559 ± 18%  -43.5%  87865 ± 52%  numa-meminfo.node1.AnonHugePages
222077 ± 16%  -30.8%  153725 ± 36%  numa-meminfo.node1.AnonPages
304080 ± 17%  -27.5%  220544 ± 28%  numa-meminfo.node1.AnonPages.max
2.00 ±223%  -100.0%  0.00  numa-meminfo.node1.Dirty
2639873 ± 65%  -47.0%  1397913 ±126%  numa-meminfo.node1.FilePages
232481 ± 17%  -30.8%  160887 ± 34%  numa-meminfo.node1.Inactive
231228 ± 16%  -30.5%  160796 ± 34%  numa-meminfo.node1.Inactive(anon)
1252 ±213%  -92.8%  90.33 ± 96%  numa-meminfo.node1.Inactive(file)
78155 ± 34%  -25.9%  57927 ± 47%  numa-meminfo.node1.KReclaimable
9041 ± 4%  -3.3%  8740 ± 5%  numa-meminfo.node1.KernelStack
25795 ± 55%  -37.5%  16118 ± 85%  numa-meminfo.node1.Mapped
95619356  +1.4%  96947357  numa-meminfo.node1.MemFree
99038776  +0.0%  99038776  numa-meminfo.node1.MemTotal
3419418 ± 52%  -38.8%  2091417 ± 85%  numa-meminfo.node1.MemUsed
2405 ± 21%  -1.5%  2369 ± 7%  numa-meminfo.node1.PageTables
78155 ± 34%  -25.9%  57927 ± 47%  numa-meminfo.node1.SReclaimable
87984 ± 7%  -1.7%  86475 ± 9%  numa-meminfo.node1.SUnreclaim
52978 ± 9%  +2.9%  54500 ± 8%  numa-meminfo.node1.Shmem
166140 ± 16%  -13.1%  144403 ± 17%  numa-meminfo.node1.Slab
2585504 ± 66%  -48.0%  1343258 ±131%  numa-meminfo.node1.Unevictable
486.17 ± 9%  +6.8%  519.17 ± 7%  proc-vmstat.direct_map_level2_splits
8.00 ± 22%  +2.1%  8.17 ± 8%  proc-vmstat.direct_map_level3_splits
12303 ± 7%  +1.3%  12461 ± 7%  proc-vmstat.nr_active_anon
100.50 ± 99%  -37.8%  62.50 ±123%  proc-vmstat.nr_active_file
104906  -25.9%  77785  proc-vmstat.nr_anon_pages
141.00  -33.6%  93.67  proc-vmstat.nr_anon_transparent_hugepages
264.00 ±141%  -54.3%  120.67 ±223%  proc-vmstat.nr_dirtied
1.00 ±141%  -50.0%  0.50 ±223%  proc-vmstat.nr_dirty
4750146  +0.1%  4752612  proc-vmstat.nr_dirty_background_threshold
9511907  +0.1%  9516846  proc-vmstat.nr_dirty_threshold
1006517  -0.1%  1005995  proc-vmstat.nr_file_pages
47787985  +0.1%  47813269  proc-vmstat.nr_free_pages
107821  -25.9%  79869  proc-vmstat.nr_inactive_anon
836.17 ± 95%  -65.1%  292.17 ±186%  proc-vmstat.nr_inactive_file
18434  +0.7%  18563  proc-vmstat.nr_kernel_stack
10033 ± 2%  -1.1%  9924  proc-vmstat.nr_mapped
1190  -5.7%  1122  proc-vmstat.nr_page_table_pages
14387 ± 7%  +0.7%  14493 ± 6%  proc-vmstat.nr_shmem
31131  -0.1%  31114  proc-vmstat.nr_slab_reclaimable
49281  +0.1%  49323  proc-vmstat.nr_slab_unreclaimable
991192  -0.0%  991146  proc-vmstat.nr_unevictable
264.00 ±141%  -54.3%  120.67 ±223%  proc-vmstat.nr_written
12303 ± 7%  +1.3%  12461 ± 7%  proc-vmstat.nr_zone_active_anon
100.50 ± 99%  -37.8%  62.50 ±123%  proc-vmstat.nr_zone_active_file
107821  -25.9%  79869  proc-vmstat.nr_zone_inactive_anon
836.17 ± 95%  -65.1%  292.17 ±186%  proc-vmstat.nr_zone_inactive_file
991192  -0.0%  991146  proc-vmstat.nr_zone_unevictable
1.00 ±141%  -50.0%  0.50 ±223%  proc-vmstat.nr_zone_write_pending
17990 ± 21%  -17.6%  14820 ± 46%  proc-vmstat.numa_hint_faults
7847 ± 37%  -41.5%  4588 ± 26%  proc-vmstat.numa_hint_faults_local
806662  +0.3%  809070  proc-vmstat.numa_hit
488.50 ± 13%  -73.4%  130.17 ± 22%  proc-vmstat.numa_huge_pte_updates
0.00  -100.0%  0.00  proc-vmstat.numa_interleave
712588  -0.2%  711419  proc-vmstat.numa_local
94077  +0.0%  94084  proc-vmstat.numa_other
18894 ± 67%  -3.1%  18303 ± 41%  proc-vmstat.numa_pages_migrated
337482 ± 10%  -59.0%  138314 ± 10%  proc-vmstat.numa_pte_updates
61815  -1.6%  60823  proc-vmstat.pgactivate
0.00  -100.0%  0.00  proc-vmstat.pgalloc_dma32
933601  -3.8%  898485  proc-vmstat.pgalloc_normal
899579  -0.5%  895253  proc-vmstat.pgfault
896972  -3.9%  861819  proc-vmstat.pgfree
18894 ± 67%  -3.1%  18303 ± 41%  proc-vmstat.pgmigrate_success
3845 ±100%  -66.8%  1277 ±223%  proc-vmstat.pgpgin
1064 ±141%  -54.3%  486.67 ±223%  proc-vmstat.pgpgout
40396  -0.6%  40172  proc-vmstat.pgreuse
105.50  -9.2%  95.83 ± 5%  proc-vmstat.thp_collapse_alloc
57.00  -87.4%  7.17 ± 5%  proc-vmstat.thp_deferred_split_page
74.83  -72.4%  20.67 ± 4%  proc-vmstat.thp_fault_alloc
19.50 ±105%  -15.4%  16.50 ± 71%  proc-vmstat.thp_migration_success
57.00  -87.4%  7.17 ± 5%  proc-vmstat.thp_split_pmd
0.00  -100.0%  0.00  proc-vmstat.thp_zero_page_alloc
17.00  +0.0%  17.00  proc-vmstat.unevictable_pgs_culled
589.83 ± 21%  -5.2%  559.00 ± 10%  numa-vmstat.node0.nr_active_anon
66.00 ±117%  -29.3%  46.67 ±152%  numa-vmstat.node0.nr_active_file
49406 ± 18%  -20.3%  39355 ± 35%  numa-vmstat.node0.nr_anon_pages
65.17 ± 21%  -22.0%  50.83 ± 42%  numa-vmstat.node0.nr_anon_transparent_hugepages
132.00 ±223%  -8.6%  120.67 ±223%  numa-vmstat.node0.nr_dirtied
0.50 ±223%  +0.0%  0.50 ±223%  numa-vmstat.node0.nr_dirty
346534 ±123%  +89.5%  656525 ± 67%  numa-vmstat.node0.nr_file_pages
23883055  -1.3%  23576561  numa-vmstat.node0.nr_free_pages
50051 ± 19%  -20.7%  39679 ± 35%  numa-vmstat.node0.nr_inactive_anon
522.67 ±129%  -48.4%  269.67 ±200%  numa-vmstat.node0.nr_inactive_file
0.00  -100.0%  0.00  numa-vmstat.node0.nr_isolated_anon
9392 ± 4%  +4.6%  9823 ± 5%  numa-vmstat.node0.nr_kernel_stack
3594 ±101%  +64.8%  5922 ± 58%  numa-vmstat.node0.nr_mapped
587.83 ± 21%  -9.8%  530.00 ± 9%  numa-vmstat.node0.nr_page_table_pages
1129 ± 34%  -22.4%  876.67 ± 30%  numa-vmstat.node0.nr_shmem
11591 ± 57%  +43.5%  16631 ± 41%  numa-vmstat.node0.nr_slab_reclaimable
27285 ± 6%  +1.5%  27704 ± 7%  numa-vmstat.node0.nr_slab_unreclaimable
344815 ±124%  +90.1%  655331 ± 67%  numa-vmstat.node0.nr_unevictable
132.00 ±223%  -8.6%  120.67 ±223%  numa-vmstat.node0.nr_written
589.83 ± 21%  -5.2%  559.00 ± 10%  numa-vmstat.node0.nr_zone_active_anon
66.00 ±117%  -29.3%  46.67 ±152%  numa-vmstat.node0.nr_zone_active_file
50051 ± 19%  -20.7%  39679 ± 35%  numa-vmstat.node0.nr_zone_inactive_anon
522.67 ±129%  -48.4%  269.67 ±200%  numa-vmstat.node0.nr_zone_inactive_file
344815 ±124%  +90.1%  655331 ± 67%  numa-vmstat.node0.nr_zone_unevictable
0.50 ±223%  +0.0%  0.50 ±223%  numa-vmstat.node0.nr_zone_write_pending
374134 ± 6%  -4.1%  358690 ± 7%  numa-vmstat.node0.numa_hit
0.00  -100.0%  0.00  numa-vmstat.node0.numa_interleave
328256 ± 15%  -7.1%  304955 ± 20%  numa-vmstat.node0.numa_local
45881 ± 75%  +17.1%  53735 ± 69%  numa-vmstat.node0.numa_other
11706 ± 8%  +1.7%  11901 ± 7%  numa-vmstat.node1.nr_active_anon
34.17 ±219%  -54.1%  15.67 ± 84%  numa-vmstat.node1.nr_active_file
55500 ± 16%  -30.8%  38424 ± 36%  numa-vmstat.node1.nr_anon_pages
75.50 ± 18%  -43.7%  42.50 ± 53%  numa-vmstat.node1.nr_anon_transparent_hugepages
132.00 ±223%  -100.0%  0.00  numa-vmstat.node1.nr_dirtied
0.50 ±223%  -100.0%  0.00  numa-vmstat.node1.nr_dirty
659985 ± 65%  -47.0%  349484 ±126%  numa-vmstat.node1.nr_file_pages
23904828  +1.4%  24236871  numa-vmstat.node1.nr_free_pages
57826 ± 16%  -30.5%  40197 ± 34%  numa-vmstat.node1.nr_inactive_anon
313.00 ±213%  -92.9%  22.33 ± 96%  numa-vmstat.node1.nr_inactive_file
9043 ± 4%  -3.3%  8740 ± 5%  numa-vmstat.node1.nr_kernel_stack
6467 ± 55%  -37.6%  4038 ± 85%  numa-vmstat.node1.nr_mapped
601.50 ± 21%  -1.6%  591.83 ± 7%  numa-vmstat.node1.nr_page_table_pages
13261 ± 9%  +2.8%  13630 ± 8%  numa-vmstat.node1.nr_shmem
19538 ± 34%  -25.9%  14481 ± 47%  numa-vmstat.node1.nr_slab_reclaimable
21995 ± 7%  -1.7%  21618 ± 9%  numa-vmstat.node1.nr_slab_unreclaimable
646375 ± 66%  -48.0%  335813 ±131%  numa-vmstat.node1.nr_unevictable
132.00 ±223%  -100.0%  0.00  numa-vmstat.node1.nr_written
11706 ± 8%  +1.7%  11901 ± 7%  numa-vmstat.node1.nr_zone_active_anon
34.17 ±219%  -54.1%  15.67 ± 84%  numa-vmstat.node1.nr_zone_active_file
57826 ± 16%  -30.5%  40197 ± 34%  numa-vmstat.node1.nr_zone_inactive_anon
313.00 ±213%  -92.9%  22.33 ± 96%  numa-vmstat.node1.nr_zone_inactive_file
646375 ± 66%  -48.0%  335813 ±131%  numa-vmstat.node1.nr_zone_unevictable
0.50 ±223%  -100.0%  0.00  numa-vmstat.node1.nr_zone_write_pending
429997 ± 5%  +3.5%  444962 ± 5%  numa-vmstat.node1.numa_hit
0.00  -100.0%  0.00  numa-vmstat.node1.numa_interleave
381801 ± 13%  +6.0%  404613 ± 14%  numa-vmstat.node1.numa_local
48195 ± 71%  -16.3%  40348 ± 92%  numa-vmstat.node1.numa_other
2.47 ± 2%  -2.0%  2.42 ± 5%  perf-stat.i.MPKI
3.282e+09  +0.7%  3.305e+09  perf-stat.i.branch-instructions
0.41  -0.1  0.33  perf-stat.i.branch-miss-rate%
13547319  -16.6%  11300609  perf-stat.i.branch-misses
42.88  +0.7  43.53  perf-stat.i.cache-miss-rate%
17114713 ± 3%  +1.4%  17346470 ± 5%  perf-stat.i.cache-misses
40081707 ± 2%  -0.0%  40073189 ± 5%  perf-stat.i.cache-references
8192 ± 2%  +1.4%  8311 ± 4%  perf-stat.i.context-switches
8.84  -0.8%  8.77  perf-stat.i.cpi
104007  +0.0%  104008  perf-stat.i.cpu-clock
1.446e+11  +0.1%  1.447e+11  perf-stat.i.cpu-cycles
140.10  -1.0%  138.76  perf-stat.i.cpu-migrations
8487 ± 3%  -0.9%  8412 ± 6%  perf-stat.i.cycles-between-cache-misses
0.01 ± 6%  -0.0  0.01  perf-stat.i.dTLB-load-miss-rate%
434358 ± 3%  -16.9%  360889  perf-stat.i.dTLB-load-misses
4.316e+09  +1.3%  4.373e+09  perf-stat.i.dTLB-loads
0.00 ± 15%  -0.0  0.00 ± 9%  perf-stat.i.dTLB-store-miss-rate%
10408 ± 11%  -2.6%  10135 ± 8%  perf-stat.i.dTLB-store-misses
4.302e+08  +5.5%  4.539e+08  perf-stat.i.dTLB-stores
16.21 ± 2%  -2.5  13.73 ± 18%  perf-stat.i.iTLB-load-miss-rate%
394805 ± 5%  -26.0%  292089 ± 8%  perf-stat.i.iTLB-load-misses
2041963 ± 3%  -8.3%  1872405 ± 12%  perf-stat.i.iTLB-loads
1.638e+10  +1.0%  1.654e+10  perf-stat.i.instructions
41729 ± 6%  +37.4%  57323 ± 8%  perf-stat.i.instructions-per-iTLB-miss
0.11  +0.8%  0.11  perf-stat.i.ipc
0.01 ± 55%  -1.5%  0.01 ± 85%  perf-stat.i.major-faults
1.39  +0.1%  1.39  perf-stat.i.metric.GHz
468.46 ± 2%  -1.5%  461.59 ± 4%  perf-stat.i.metric.K/sec
77.18  +1.3%  78.18  perf-stat.i.metric.M/sec
2473  -0.0%  2472  perf-stat.i.minor-faults
89.67  -0.5  89.18  perf-stat.i.node-load-miss-rate%
5070484  -10.3%  4547670  perf-stat.i.node-load-misses
585336 ± 2%  -5.5%  553260 ± 8%  perf-stat.i.node-loads
98.73  +0.2  98.91  perf-stat.i.node-store-miss-rate%
935187  +2.2%  955923 ± 3%  perf-stat.i.node-store-misses
13301 ± 8%  -12.6%  11631 ± 5%  perf-stat.i.node-stores
2473  -0.0%  2472  perf-stat.i.page-faults
104007  +0.0%  104008  perf-stat.i.task-clock
2.45 ± 2%  -1.0%  2.42 ± 5%  perf-stat.overall.MPKI
0.41  -0.1  0.34  perf-stat.overall.branch-miss-rate%
42.68  +0.6  43.26  perf-stat.overall.cache-miss-rate%
8.83  -0.9%  8.75  perf-stat.overall.cpi
8459 ± 3%  -1.0%  8372 ± 6%  perf-stat.overall.cycles-between-cache-misses
0.01 ± 3%  -0.0  0.01  perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 11%  -0.0  0.00 ± 8%  perf-stat.overall.dTLB-store-miss-rate%
16.19 ± 2%  -2.5  13.73 ± 18%  perf-stat.overall.iTLB-load-miss-rate%
41644 ± 6%  +37.0%  57047 ± 8%  perf-stat.overall.instructions-per-iTLB-miss
0.11  +0.9%  0.11  perf-stat.overall.ipc
89.65  -0.5  89.15  perf-stat.overall.node-load-miss-rate%
98.59  +0.2  98.78  perf-stat.overall.node-store-miss-rate%
35314961  +28.0%  45213422 ± 3%  perf-stat.overall.path-length
3.272e+09  +0.7%  3.295e+09  perf-stat.ps.branch-instructions
13563215  -16.5%  11329031  perf-stat.ps.branch-misses
17059170 ± 3%  +1.3%  17288798 ± 5%  perf-stat.ps.cache-misses
39960738 ± 2%  -0.0%  39951411 ± 5%  perf-stat.ps.cache-references
8205 ± 2%  +1.4%  8320 ± 4%  perf-stat.ps.context-switches
103658  -0.0%  103657  perf-stat.ps.cpu-clock
1.441e+11  +0.1%  1.442e+11  perf-stat.ps.cpu-cycles
140.16  -1.0%  138.77  perf-stat.ps.cpu-migrations
433133 ± 3%  -16.9%  359910  perf-stat.ps.dTLB-load-misses
4.302e+09  +1.3%  4.359e+09  perf-stat.ps.dTLB-loads
10392 ± 11%  -2.6%  10120 ± 8%  perf-stat.ps.dTLB-store-misses
4.29e+08  +5.5%  4.527e+08  perf-stat.ps.dTLB-stores
393499 ± 5%  -26.0%  291118 ± 8%  perf-stat.ps.iTLB-load-misses
2035052 ± 3%  -8.3%  1866106 ± 12%  perf-stat.ps.iTLB-loads
1.633e+10  +1.0%  1.649e+10  perf-stat.ps.instructions
0.01 ± 55%  +0.1%  0.01 ± 85%  perf-stat.ps.major-faults
2466  +0.0%  2466  perf-stat.ps.minor-faults
5053378  -10.3%  4532205  perf-stat.ps.node-load-misses
583428 ± 2%  -5.5%  551516 ± 8%  perf-stat.ps.node-loads
932227  +2.2%  952780 ± 3%  perf-stat.ps.node-store-misses
13342 ± 8%  -12.1%  11729 ± 6%  perf-stat.ps.node-stores
2466  +0.0%  2466  perf-stat.ps.page-faults
103658  -0.0%  103657  perf-stat.ps.task-clock
4.952e+12  +0.9%  4.994e+12  perf-stat.total.instructions
10.88 ±223%  -100.0%  0.00  sched_debug.cfs_rq:/.MIN_vruntime.avg
1132 ±223%  -100.0%  0.00  sched_debug.cfs_rq:/.MIN_vruntime.max
0.00  +0.0%  0.00  sched_debug.cfs_rq:/.MIN_vruntime.min
110.47 ±223%  -100.0%  0.00  sched_debug.cfs_rq:/.MIN_vruntime.stddev
0.53 ± 4%  +7.4%  0.57 ± 4%  sched_debug.cfs_rq:/.h_nr_running.avg
1.03 ± 7%  -3.2%  1.00  sched_debug.cfs_rq:/.h_nr_running.max
0.45 ± 2%  -1.9%  0.44 ± 3%  sched_debug.cfs_rq:/.h_nr_running.stddev
11896 ± 12%  -0.1%  11883 ± 13%  sched_debug.cfs_rq:/.load.avg
123097 ±123%  -80.1%  24487 ± 18%  sched_debug.cfs_rq:/.load.max
19029 ± 74%  -49.9%  9525 ± 13%  sched_debug.cfs_rq:/.load.stddev
22.63 ± 23%  +1.4%  22.93 ± 16%  sched_debug.cfs_rq:/.load_avg.avg
530.85 ± 73%  -13.1%  461.19 ± 43%  sched_debug.cfs_rq:/.load_avg.max
73.53 ± 46%  -7.1%  68.30 ± 33%  sched_debug.cfs_rq:/.load_avg.stddev
10.88 ±223%  -100.0%  0.00  sched_debug.cfs_rq:/.max_vruntime.avg
1132 ±223%  -100.0%  0.00  sched_debug.cfs_rq:/.max_vruntime.max
0.00  +0.0%  0.00  sched_debug.cfs_rq:/.max_vruntime.min
110.47 ±223%  -100.0%  0.00  sched_debug.cfs_rq:/.max_vruntime.stddev
3883756 ± 13%  +12.7%  4377466 ± 4%  sched_debug.cfs_rq:/.min_vruntime.avg
6993455 ± 10%  +6.5%  7445221 ± 2%  sched_debug.cfs_rq:/.min_vruntime.max
219925 ± 60%  +43.7%  315970 ± 71%  sched_debug.cfs_rq:/.min_vruntime.min
2240239 ± 11%  +14.0%  2554847 ± 14%  sched_debug.cfs_rq:/.min_vruntime.stddev
0.53 ± 5%  +7.5%  0.57 ± 4%  sched_debug.cfs_rq:/.nr_running.avg
1.03 ± 7%  -3.2%  1.00  sched_debug.cfs_rq:/.nr_running.max
0.45 ± 2%  -1.9%  0.44 ± 3%  sched_debug.cfs_rq:/.nr_running.stddev
6.96 ± 55%  +26.9%  8.83 ± 45%  sched_debug.cfs_rq:/.removed.load_avg.avg
305.28 ± 32%  +39.3%  425.39 ± 44%  sched_debug.cfs_rq:/.removed.load_avg.max
42.94 ± 36%  +34.4%  57.70 ± 42%  sched_debug.cfs_rq:/.removed.load_avg.stddev
2.96 ± 58%  +39.1%  4.12 ± 48%  sched_debug.cfs_rq:/.removed.runnable_avg.avg
150.06 ± 34%  +44.0%  216.03 ± 45%  sched_debug.cfs_rq:/.removed.runnable_avg.max
19.33 ± 42%  +42.6%  27.56 ± 45%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
2.96 ± 58%  +39.1%  4.12 ± 48%  sched_debug.cfs_rq:/.removed.util_avg.avg
150.06 ± 34%  +44.0%  216.03 ± 45%  sched_debug.cfs_rq:/.removed.util_avg.max
19.33 ± 42%  +42.6%  27.56 ± 45%  sched_debug.cfs_rq:/.removed.util_avg.stddev
540.76 ± 6%  +7.5%  581.25 ± 5%  sched_debug.cfs_rq:/.runnable_avg.avg
1060 ± 2%  +2.5%  1087 ± 3%  sched_debug.cfs_rq:/.runnable_avg.max
442.07 ± 4%  -0.1%  441.69 ± 5%  sched_debug.cfs_rq:/.runnable_avg.stddev
3123464 ± 14%  +10.0%  3436745 ± 3%  sched_debug.cfs_rq:/.spread0.avg
6233151 ± 10%  +4.4%  6504505 ± 3%  sched_debug.cfs_rq:/.spread0.max
-540338  +15.6%  -624739  sched_debug.cfs_rq:/.spread0.min
2240217 ± 11%  +14.0%  2554844 ± 14%  sched_debug.cfs_rq:/.spread0.stddev
540.71 ± 6%  +7.5%  581.22 ± 5%  sched_debug.cfs_rq:/.util_avg.avg
1060 ± 2%  +2.5%  1086 ± 3%  sched_debug.cfs_rq:/.util_avg.max
442.07 ± 4%  -0.1%  441.67 ± 5%  sched_debug.cfs_rq:/.util_avg.stddev
454.69 ± 6%  +7.0%  486.47 ± 8%  sched_debug.cfs_rq:/.util_est_enqueued.avg
1024  -0.0%  1023  sched_debug.cfs_rq:/.util_est_enqueued.max
396.02 ± 2%  -0.1%  395.79  sched_debug.cfs_rq:/.util_est_enqueued.stddev
642171 ± 4%  +16.6%  748912 ± 2%  sched_debug.cpu.avg_idle.avg
1051166  -1.2%  1038098  sched_debug.cpu.avg_idle.max
2402 ± 5%  +28.5%  3088 ± 9%  sched_debug.cpu.avg_idle.min
384501 ± 3%  -12.3%  337306 ± 5%  sched_debug.cpu.avg_idle.stddev
198632 ± 7%  +5.1%  208788  sched_debug.cpu.clock.avg
198638 ± 7%  +5.1%  208794  sched_debug.cpu.clock.max
198626 ± 7%  +5.1%  208783  sched_debug.cpu.clock.min
3.25  +2.3%  3.32 ± 5%  sched_debug.cpu.clock.stddev
196832 ± 7%  +5.1%  206882  sched_debug.cpu.clock_task.avg
197235 ± 7%  +5.1%  207282  sched_debug.cpu.clock_task.max
181004 ± 7%  +5.7%  191329  sched_debug.cpu.clock_task.min
1575 ± 3%  -1.8%  1546  sched_debug.cpu.clock_task.stddev
2411 ± 4%  +2.8%  2478  sched_debug.cpu.curr->pid.avg
8665 ± 4%  +3.1%  8935  sched_debug.cpu.curr->pid.max
2522 ± 2%  +1.0%  2548  sched_debug.cpu.curr->pid.stddev
501318  -0.0%  501249  sched_debug.cpu.max_idle_balance_cost.avg
528365  +0.5%  531236 ± 2%  sched_debug.cpu.max_idle_balance_cost.max
500000  +0.0%  500000  sched_debug.cpu.max_idle_balance_cost.min
5157 ± 19%  -4.2%  4941 ± 23%  sched_debug.cpu.max_idle_balance_cost.stddev
4294  +0.0%  4294  sched_debug.cpu.next_balance.avg
4294  +0.0%  4294  sched_debug.cpu.next_balance.max
4294  +0.0%  4294  sched_debug.cpu.next_balance.min
0.00 ± 41%  -40.0%  0.00 ± 13%  sched_debug.cpu.next_balance.stddev
0.44 ± 4%  +2.4%  0.45  sched_debug.cpu.nr_running.avg
1.00  +0.0%  1.00  sched_debug.cpu.nr_running.max
0.47  +0.5%  0.47  sched_debug.cpu.nr_running.stddev
14345 ± 8%  +6.7%  15305 ± 4%  sched_debug.cpu.nr_switches.avg
30800 ± 8%  +34.5%  41437 ± 10%  sched_debug.cpu.nr_switches.max
4563 ± 28%  +5.7%  4822 ± 25%  sched_debug.cpu.nr_switches.min
5491 ± 8%  +26.4%  6941 ± 10%  sched_debug.cpu.nr_switches.stddev
2.111e+09 ± 7%  +1.5%  2.142e+09 ± 6%  sched_debug.cpu.nr_uninterruptible.avg
4.295e+09  +0.0%  4.295e+09  sched_debug.cpu.nr_uninterruptible.max
2.14e+09  +0.1%  2.143e+09  sched_debug.cpu.nr_uninterruptible.stddev
198627 ± 7%  +5.1%  208783  sched_debug.cpu_clk
996147  +0.0%  996147  sched_debug.dl_rq:.dl_bw->bw.avg
996147  +0.0%  996147  sched_debug.dl_rq:.dl_bw->bw.max
996147  +0.0%  996147  sched_debug.dl_rq:.dl_bw->bw.min
4.295e+09  +0.0%  4.295e+09  sched_debug.jiffies
198022 ± 7%  +5.1%  208178  sched_debug.ktime
950.00  +0.0%  950.00  sched_debug.rt_rq:.rt_runtime.avg
950.00  +0.0%  950.00  sched_debug.rt_rq:.rt_runtime.max
950.00  +0.0%  950.00  sched_debug.rt_rq:.rt_runtime.min
199377 ± 7%  +5.1%  209531  sched_debug.sched_clk
1.00  +0.0%  1.00  sched_debug.sched_clock_stable()
58611259  +0.0%  58611259  sched_debug.sysctl_sched.sysctl_sched_features
0.75  +0.0%  0.75  sched_debug.sysctl_sched.sysctl_sched_idle_min_granularity
24.00  +0.0%  24.00  sched_debug.sysctl_sched.sysctl_sched_latency
3.00  +0.0%  3.00  sched_debug.sysctl_sched.sysctl_sched_min_granularity
1.00  +0.0%  1.00  sched_debug.sysctl_sched.sysctl_sched_tunable_scaling
4.00  +0.0%  4.00  sched_debug.sysctl_sched.sysctl_sched_wakeup_granularity
20.90 ± 47%  -6.4  14.49 ±100%  perf-profile.calltrace.cycles-pp.mwait_idle_with_hints.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
20.90 ± 47%  -6.4  14.49 ±100%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
0.48 ± 44%  -0.5  0.00  perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
29.41 ± 19%  -0.2  29.23 ± 18%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
35.02 ± 8%  -0.2  34.86 ± 7%  perf-profile.calltrace.cycles-pp.__mmap
34.95 ± 8%  -0.1  34.81 ± 7%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__mmap
34.92 ± 8%  -0.1  34.79 ± 7%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
34.87 ± 8%  -0.1  34.74 ± 7%  perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
0.41 ± 74%  -0.1  0.30 ±156%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.rest_init.arch_call_rest_init.start_kernel.secondary_startup_64_no_verify
0.41 ± 74%  -0.1  0.30 ±156%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.rest_init.arch_call_rest_init.start_kernel
0.41 ± 74%  -0.1  0.30 ±156%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.rest_init.arch_call_rest_init
0.41 ± 74%  -0.1  0.30 ±156%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.rest_init
0.41 ± 74%  -0.1  0.30 ±156%  perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64_no_verify
0.41 ± 74%  -0.1  0.30 ±156%  perf-profile.calltrace.cycles-pp.arch_call_rest_init.start_kernel.secondary_startup_64_no_verify
0.41 ± 74%  -0.1  0.30 ±156%  perf-profile.calltrace.cycles-pp.rest_init.arch_call_rest_init.start_kernel.secondary_startup_64_no_verify
29.59 ± 19%  -0.1  29.50 ± 17%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
29.03 ± 19%  -0.1  28.95 ± 17%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
29.03 ± 19%  -0.1  28.95 ± 17%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
29.03 ± 19%  -0.1  28.95 ± 17%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
29.00 ± 19%  -0.1  28.93 ± 17%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
29.00 ± 19%  -0.1  28.93 ± 17%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
33.56 ± 8%  -0.0  33.53 ± 7%  perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
34.26 ± 8%  -0.0  34.24 ± 7%  perf-profile.calltrace.cycles-pp.down_write_killable.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
34.23 ± 8%  -0.0  34.21 ± 7%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
34.19 ± 8%  -0.0  34.18 ± 7%  perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff.do_syscall_64
0.44 ± 44%  +0.0  0.48 ± 44%  perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
0.45 ± 44%  +0.0  0.48 ± 44%  perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
33.62 ± 8%  +0.1  33.71 ± 7%  perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
34.32 ± 8%  +0.1  34.42 ± 7%  perf-profile.calltrace.cycles-pp.down_write_killable.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
34.29 ± 8%  +0.1  34.39 ± 7%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write_killable.__vm_munmap.__x64_sys_munmap.do_syscall_64
34.25 ± 8%  +0.1  34.36 ± 7%  perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__vm_munmap.__x64_sys_munmap
35.11 ± 8%  +0.2  35.31 ± 7%  perf-profile.calltrace.cycles-pp.__munmap
35.04 ± 8%  +0.2  35.25 ± 7%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
35.02 ± 8%  +0.2  35.24 ± 7%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
0.00  +0.2  0.22 ±223%  perf-profile.calltrace.cycles-pp.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
34.97 ± 8%  +0.2  35.20 ± 7%  perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
34.97 ± 8%  +0.2  35.20 ± 7%  perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
0.47 ± 44%  +0.2  0.70 ± 7%  perf-profile.calltrace.cycles-pp.do_mas_align_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00  +0.4  0.44 ±223%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.mwait_idle_with_hints.intel_idle_irq.cpuidle_enter_state.cpuidle_enter
8.27 ± 91%  +6.2  14.46 ± 77%  perf-profile.calltrace.cycles-pp.mwait_idle_with_hints.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
8.27 ± 91%  +6.2  14.46 ± 77%  perf-profile.calltrace.cycles-pp.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
21.09 ± 47%  -6.5  14.62 ± 99%  perf-profile.children.cycles-pp.intel_idle
35.02 ± 8%  -0.2  34.86 ± 7%  perf-profile.children.cycles-pp.__mmap
0.14 ± 9%  -0.1  0.00  perf-profile.children.cycles-pp.thp_get_unmapped_area
34.87 ± 8%  -0.1  34.74 ± 7%  perf-profile.children.cycles-pp.vm_mmap_pgoff
0.55 ± 9%  -0.1  0.46 ± 7%  perf-profile.children.cycles-pp.do_mmap
29.59 ± 19%  -0.1  29.50 ± 17%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
29.59 ± 19%  -0.1  29.50 ± 17%  perf-profile.children.cycles-pp.cpu_startup_entry
29.59 ± 19%  -0.1  29.50 ± 17%  perf-profile.children.cycles-pp.do_idle
29.03 ± 19%  -0.1  28.95 ± 17%  perf-profile.children.cycles-pp.start_secondary
29.56 ± 19%  -0.1  29.49 ± 17%  perf-profile.children.cycles-pp.cpuidle_idle_call
29.56 ± 19%  -0.1  29.48 ± 17%  perf-profile.children.cycles-pp.cpuidle_enter
29.56 ± 19%  -0.1  29.48 ± 17%  perf-profile.children.cycles-pp.cpuidle_enter_state
29.52 ± 19%  -0.1  29.45 ± 17%  perf-profile.children.cycles-pp.mwait_idle_with_hints
0.38 ± 9%  -0.1  0.32 ± 6%  perf-profile.children.cycles-pp.mmap_region
0.05 ± 7%  -0.1  0.00  perf-profile.children.cycles-pp.unmap_vmas
0.11 ± 8%  -0.1  0.06 ± 13%  perf-profile.children.cycles-pp.unmap_region
0.16 ± 10%  -0.0  0.13 ± 9%  perf-profile.children.cycles-pp.get_unmapped_area
0.07 ± 7%  -0.0  0.03 ± 70%  perf-profile.children.cycles-pp.mas_find
0.05 ± 44%  -0.0  0.02 ±141%  perf-profile.children.cycles-pp.mas_wr_node_store
0.10 ± 10%  -0.0  0.07 ± 14%  perf-profile.children.cycles-pp.mas_spanning_rebalance
0.14 ± 9%  -0.0  0.11 ± 9%  perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown
0.06 ± 11%  -0.0  0.04 ± 72%  perf-profile.children.cycles-pp.__schedule
0.14 ± 10%  -0.0  0.11 ± 9%  perf-profile.children.cycles-pp.vm_unmapped_area
0.07 ± 10%  -0.0  0.04 ± 45%  perf-profile.children.cycles-pp.do_mas_munmap
0.02 ± 99%  -0.0  0.00  perf-profile.children.cycles-pp.mas_next_entry
0.04 ± 44%  -0.0  0.02 ±141%  perf-profile.children.cycles-pp.schedule
0.06 ± 9%  -0.0  0.04 ± 71%  perf-profile.children.cycles-pp.mas_wr_modify
0.10 ± 8%  -0.0  0.08 ± 11%  perf-profile.children.cycles-pp.mas_rev_awalk
0.10 ± 12%  -0.0  0.08 ± 16%  perf-profile.children.cycles-pp.mas_wr_spanning_store
0.06 ± 7%  -0.0  0.04 ± 45%  perf-profile.children.cycles-pp.mas_walk
0.09 ± 11%  -0.0  0.08 ± 16%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.02 ±141%  -0.0  0.00  perf-profile.children.cycles-pp.perf_event_mmap
0.02 ±141%  -0.0  0.00  perf-profile.children.cycles-pp.unmap_page_range
0.11 ± 26%  -0.0  0.10 ± 10%  perf-profile.children.cycles-pp.__get_user_nocheck_8
0.35 ± 19%  -0.0  0.34 ± 11%  perf-profile.children.cycles-pp.perf_tp_event
0.11 ± 26%  -0.0  0.10 ± 11%  perf-profile.children.cycles-pp.perf_callchain_user
0.34 ± 19%  -0.0  0.33 ± 10%  perf-profile.children.cycles-pp.__perf_event_overflow
0.34 ± 19%  -0.0  0.33 ± 10%  perf-profile.children.cycles-pp.perf_event_output_forward
0.31 ± 19%  -0.0  0.30 ± 12%  perf-profile.children.cycles-pp.perf_prepare_sample
0.30 ± 19%  -0.0  0.29 ± 10%  perf-profile.children.cycles-pp.perf_callchain
0.30 ± 19%  -0.0  0.29 ± 10%  perf-profile.children.cycles-pp.get_perf_callchain
0.12 ± 9%  -0.0  0.11 ± 9%  perf-profile.children.cycles-pp.mas_empty_area_rev
0.08 ± 7%  -0.0  0.07 ± 8%  perf-profile.children.cycles-pp.syscall_return_via_sysret
0.01 ±223%  -0.0  0.00  perf-profile.children.cycles-pp.mas_wr_bnode
0.01 ±223%  -0.0  0.00  perf-profile.children.cycles-pp.perf_event_mmap_event
0.01 ±223%  -0.0  0.00  perf-profile.children.cycles-pp.__entry_text_start
0.33 ± 10%  -0.0  0.32 ± 7%  perf-profile.children.cycles-pp.mas_store_prealloc
0.32 ± 20%  -0.0  0.32 ± 10%  perf-profile.children.cycles-pp.update_curr
0.32 ± 19%  -0.0  0.31 ± 11%  perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
0.56 ± 22%  -0.0  0.56 ± 58%  perf-profile.children.cycles-pp.start_kernel
0.56 ± 22%  -0.0  0.56 ± 58%  perf-profile.children.cycles-pp.arch_call_rest_init
0.56 ± 22%  -0.0  0.56 ± 58%  perf-profile.children.cycles-pp.rest_init
0.07 ± 45%  -0.0  0.07 ± 11%  perf-profile.children.cycles-pp.native_irq_return_iret
0.01 ±223%  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
0.06 ± 45%  +0.0  0.06 ± 8%  perf-profile.children.cycles-pp.asm_exc_page_fault
0.18 ± 16%  +0.0  0.18 ± 14%  perf-profile.children.cycles-pp.perf_callchain_kernel
0.12 ± 16%  +0.0  0.12 ± 12%  perf-profile.children.cycles-pp.unwind_next_frame
0.36 ± 18%  +0.0  0.37 ± 10%  perf-profile.children.cycles-pp.task_tick_fair
0.58 ± 14%  +0.0  0.58 ± 10%  perf-profile.children.cycles-pp.hrtimer_interrupt
0.49 ± 14%  +0.0  0.50 ± 11%  perf-profile.children.cycles-pp.__hrtimer_run_queues
0.05 ± 46%  +0.0  0.05 ± 45%  perf-profile.children.cycles-pp.__unwind_start
0.45 ± 14%  +0.0  0.46 ± 11%  perf-profile.children.cycles-pp.tick_sched_handle
0.46 ± 14%  +0.0  0.46 ± 11%  perf-profile.children.cycles-pp.tick_sched_timer
0.45 ± 15%  +0.0  0.45 ± 11%  perf-profile.children.cycles-pp.update_process_times
0.06 ± 11%  +0.0  0.07 ± 12%  perf-profile.children.cycles-pp.kmem_cache_free_bulk
0.58 ± 14%  +0.0  0.58 ± 10%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.00  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.record__mmap_read_evlist
0.00  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.perf_mmap__push
0.00  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.record__pushfn
0.00  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.ksys_write
0.00  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.vfs_write
0.00  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.__libc_write
0.00  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.generic_file_write_iter
0.00  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.__generic_file_write_iter
0.00  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.generic_perform_write
0.00  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.build_id__mark_dso_hit
0.39 ± 17%  +0.0  0.40 ± 10%  perf-profile.children.cycles-pp.scheduler_tick
0.00  +0.0  0.01 ±223%  perf-profile.children.cycles-pp.clockevents_program_event
0.05 ± 45%  +0.0  0.06 ± 11%  perf-profile.children.cycles-pp.mas_wr_store_entry
0.60 ± 14%  +0.0  0.61 ± 9%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.08 ± 8%  +0.0  0.10 ± 12%  perf-profile.children.cycles-pp.mas_destroy
0.08 ± 9%  +0.0  0.09 ± 21%  perf-profile.children.cycles-pp.perf_session__deliver_event
0.08 ± 12%  +0.0  0.09 ± 33%  perf-profile.children.cycles-pp.ordered_events__queue
0.08 ± 11%  +0.0  0.10 ± 22%  perf-profile.children.cycles-pp.__ordered_events__flush
0.08 ± 9%  +0.0  0.10 ± 22%  perf-profile.children.cycles-pp.perf_session__process_user_event
0.06 ± 13%  +0.0  0.08 ± 14%  perf-profile.children.cycles-pp.kmem_cache_alloc
0.07 ± 9%  +0.0  0.09 ± 33%  perf-profile.children.cycles-pp.queue_event
0.08 ± 8%  +0.0  0.10 ± 31%  perf-profile.children.cycles-pp.process_simple
0.00  +0.0  0.03 ±100%  perf-profile.children.cycles-pp.evlist__parse_sample
0.06 ± 6%  +0.0  0.08 ± 8%  perf-profile.children.cycles-pp.memset_erms
0.22 ± 7%  +0.0  0.26 ± 23%  perf-profile.children.cycles-pp.__libc_start_main
0.22 ± 7%  +0.0  0.26 ± 23%  perf-profile.children.cycles-pp.main
0.22 ± 7%  +0.0  0.26 ± 23%  perf-profile.children.cycles-pp.run_builtin
0.21 ± 9%  +0.0  0.25 ± 23%  perf-profile.children.cycles-pp.cmd_record
0.21 ± 9%  +0.0  0.25 ± 23%  perf-profile.children.cycles-pp.__cmd_record
0.20 ± 9%  +0.0  0.24 ± 24%  perf-profile.children.cycles-pp.cmd_sched
0.17 ± 11%  +0.0  0.21 ± 25%  perf-profile.children.cycles-pp.reader__read_event
0.17 ± 11%  +0.0  0.21 ± 26%  perf-profile.children.cycles-pp.record__finish_output
0.17 ± 11%  +0.0  0.21 ± 26%  perf-profile.children.cycles-pp.perf_session__process_events
0.00  +0.0  0.04 ± 45%  perf-profile.children.cycles-pp.kmem_cache_free
0.17 ± 7%  +0.1  0.22 ± 8%  perf-profile.children.cycles-pp.mas_alloc_nodes
0.11 ± 9%  +0.1  0.17 ± 6%  perf-profile.children.cycles-pp.kmem_cache_alloc_bulk
0.00  +0.1  0.06 ± 13%  perf-profile.children.cycles-pp.vm_area_dup
0.16 ± 8%  +0.1  0.22 ± 6%  perf-profile.children.cycles-pp.mas_preallocate
67.20 ± 8%  +0.1  67.28 ± 7%  perf-profile.children.cycles-pp.osq_lock
68.59 ± 8%  +0.1  68.66 ± 7%  perf-profile.children.cycles-pp.down_write_killable
1.04 ± 8%  +0.1  1.12 ± 7%  perf-profile.children.cycles-pp.rwsem_spin_on_owner
70.08 ± 8%  +0.1  70.15 ± 7%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
68.52 ± 8%  +0.1  68.60 ± 7%  perf-profile.children.cycles-pp.rwsem_down_write_slowpath
70.03 ± 8%  +0.1  70.11 ± 7%  perf-profile.children.cycles-pp.do_syscall_64
68.46 ± 8%  +0.1  68.55 ± 7%  perf-profile.children.cycles-pp.rwsem_optimistic_spin
0.55 ± 8%  +0.2  0.71 ± 8%  perf-profile.children.cycles-pp.do_mas_align_munmap
35.12 ± 8%  +0.2  35.31 ± 7%  perf-profile.children.cycles-pp.__munmap
0.00  +0.2  0.22 ± 7%  perf-profile.children.cycles-pp.vma_expand
0.00  +0.2  0.22 ±223%  perf-profile.children.cycles-pp.intel_idle_irq
34.98 ± 8%  +0.2  35.20 ± 7%  perf-profile.children.cycles-pp.__x64_sys_munmap
34.97 ± 8%  +0.2  35.20 ± 7%  perf-profile.children.cycles-pp.__vm_munmap
0.64 ± 13%  +0.2  0.88 ± 55%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.00  +0.3  0.30 ± 7%  perf-profile.children.cycles-pp.__vma_adjust
0.00  +0.4  0.36 ± 6%  perf-profile.children.cycles-pp.__split_vma
8.42 ± 91%  +6.2  14.60 ± 77%  perf-profile.children.cycles-pp.intel_idle_ibrs
29.52 ± 19%  -0.1  29.45 ± 17%  perf-profile.self.cycles-pp.mwait_idle_with_hints
0.18 ± 9%  -0.1  0.12 ± 10%  perf-profile.self.cycles-pp.rwsem_optimistic_spin
0.04 ± 45%  -0.0  0.00  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.04 ± 44%  -0.0  0.00  perf-profile.self.cycles-pp.mmap_region
0.10 ± 5%  -0.0  0.08 ± 9%  perf-profile.self.cycles-pp.mas_rev_awalk
0.06 ± 7%  -0.0  0.04 ± 45%  perf-profile.self.cycles-pp.mas_walk
0.06 ± 11%  -0.0  0.04 ± 45%  perf-profile.self.cycles-pp.do_mas_align_munmap
0.08 ± 8%  -0.0  0.07 ± 14%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.08 ± 7%  -0.0  0.07 ± 8%  perf-profile.self.cycles-pp.syscall_return_via_sysret
0.06 ± 13%  -0.0  0.05 ± 7%  perf-profile.self.cycles-pp.down_write_killable
0.07 ± 45%  -0.0  0.07 ± 11%  perf-profile.self.cycles-pp.native_irq_return_iret
0.05 ± 45%  -0.0  0.05 ± 47%  perf-profile.self.cycles-pp.unwind_next_frame
0.00  +0.0  0.01 ±223%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
0.05 ± 45%  +0.0  0.06 ± 11%  perf-profile.self.cycles-pp.kmem_cache_free_bulk
0.00  +0.0  0.02 ±141%  perf-profile.self.cycles-pp.kmem_cache_free
0.07 ± 8%  +0.0  0.09 ± 33%  perf-profile.self.cycles-pp.queue_event
0.06 ± 8%  +0.0  0.08 ± 8%  perf-profile.self.cycles-pp.memset_erms
0.04 ± 45%  +0.0  0.08 ± 6%  perf-profile.self.cycles-pp.kmem_cache_alloc_bulk
66.61 ± 8%  +0.1  66.68 ± 7%  perf-profile.self.cycles-pp.osq_lock
1.02 ± 8%  +0.1  1.10 ± 7%  perf-profile.self.cycles-pp.rwsem_spin_on_owner

If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot
| Link: https://lore.kernel.org/oe-lkp/202212151657.5d11a672-yujie.liu@intel.com

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # If you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp dirs to run from a clean state.

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

--
0-DAY CI Kernel Test Service
https://01.org/lkp