FYI, we noticed a 343.9% improvement of vm-scalability.throughput due to commit:

commit 0c649028cd2ffe58eed59287ae1e3a6b3e19419e ("mm: Provide helper for finishing mkwrite faults")
https://github.com/0day-ci/linux Jan-Kara/dax-Clear-dirty-bits-after-flushing-caches/20160725-043348

in testcase: vm-scalability
on test machine: 56 threads Grantley Haswell-EP with 64G memory
with following parameters:

	runtime: 300s
	size: 1T
	nr_pmem: 1
	test: msync-mt
	cpufreq_governor: performance

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

	git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
	cd lkp-tests
	bin/lkp install job.yaml  # job file is attached in this email
	bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_pmem/rootfs/runtime/size/tbox_group/test/testcase:
  gcc-6/performance/x86_64-rhel/1/debian-x86_64-2015-02-07.cgz/300s/1T/lkp-hsw01/msync-mt/vm-scalability

commit:
  ee48f4e9e2 ("mm: Lift vm_fault structure creation from do_page_mkwrite()")
  0c649028cd ("mm: Provide helper for finishing mkwrite faults")

ee48f4e9e21466ed 0c649028cd2ffe58eed59287ae
---------------- --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
         %stddev     %change         %stddev
             \          |                \
4408522 +- 4% +343.9% 19567723 +- 16% vm-scalability.throughput
960.89 +- 3% -7.4% 889.56 +- 2% vm-scalability.time.elapsed_time
960.89 +- 3% -7.4% 889.56 +- 2% vm-scalability.time.elapsed_time.max
1.387e+08 +- 0% +279.0% 5.256e+08 +- 14% vm-scalability.time.file_system_outputs
5102 +- 62% +85.4% 9459 +- 12% vm-scalability.time.major_page_faults
1511 +- 1% +73.4% 2619 +- 10% vm-scalability.time.percent_of_cpu_this_job_got
12786 +- 3% -44.5% 7093 +- 16% vm-scalability.time.system_time
1733 +- 6% +833.1% 16173 +- 20% vm-scalability.time.user_time
2.021e+08 +- 3% +21.1% 2.447e+08 +- 9% vm-scalability.time.voluntary_context_switches
351389 +- 13% +6.8e+07% 2.405e+11 +- 0% interrupts.CAL:Function_call_interrupts
2465280 +- 3% +20.0% 2958379 +- 10% softirqs.RCU
4871363 +- 2% +52.8% 7441080 +- 5% softirqs.SCHED
7418823 +- 2% +61.6% 11986432 +- 9% softirqs.TIMER
12982929 +- 18% +316.1% 54024649 +- 46% numa-numastat.node0.local_node
1393946 +- 37% +784.3% 12326850 +- 49% numa-numastat.node0.numa_foreign
12982932 +- 18% +316.1% 54024652 +- 46% numa-numastat.node0.numa_hit
6541446 +- 27% +313.7% 27059676 +- 52% numa-numastat.node0.numa_miss
6545171 +- 27% +313.5% 27066574 +- 52% numa-numastat.node1.numa_foreign
1397671 +- 37% +782.4% 12333748 +- 48% numa-numastat.node1.numa_miss
30.30 +- 0% +65.6% 50.17 +- 9% turbostat.%Busy
934.67 +- 0% +66.1% 1552 +- 9% turbostat.Avg_MHz
56.35 +- 0% -34.7% 36.77 +- 16% turbostat.CPU%c1
0.46 +- 5% -69.0% 0.14 +- 21% turbostat.CPU%c3
6.31 +- 5% -42.0% 3.66 +- 15% turbostat.Pkg%pc2
183.68 +- 0% +12.4% 206.39 +- 1% turbostat.PkgWatt
27.23 +- 0% +31.0% 35.67 +- 3% turbostat.RAMWatt
2.367e+08 +- 4% +32.4% 3.133e+08 +- 17% cpuidle.C1-HSW.usage
3.774e+09 +- 2% -78.4% 8.143e+08 +- 15% cpuidle.C1E-HSW.time
31623470 +- 2% -71.7% 8962095 +- 24% cpuidle.C1E-HSW.usage
5.597e+08 +- 3% -36.4% 3.558e+08 +- 8% cpuidle.C3-HSW.time
2241690 +- 2% -62.0% 852112 +- 4% cpuidle.C3-HSW.usage
2.099e+10 +- 4% -37.9% 1.303e+10 +- 4% cpuidle.C6-HSW.time
22173023 +- 4% -38.1% 13727677 +- 5% cpuidle.C6-HSW.usage
1.018e+09 +- 5% -36.5% 6.462e+08 +- 18% cpuidle.POLL.time
72114 +- 3% +310.0% 295683 +- 15% vmstat.io.bo
247.67 +- 5% -72.0% 69.25 +- 6% vmstat.memory.buff
2714683 +- 3% -28.9% 1930583 +- 5% vmstat.memory.free
493822 +- 27% +54.8% 764350 +- 2% vmstat.memory.swpd
39.33 +- 1% -42.2% 22.75 +- 18% vmstat.procs.b
15.00 +- 0% +78.3% 26.75 +- 10% vmstat.procs.r
416849 +- 0% +28.5% 535821 +- 7% vmstat.system.cs
266417 +- 0% +154.0% 676590 +- 22% vmstat.system.in
4543 +- 8% +116.0% 9814 +- 13% slabinfo.bio-1.active_objs
91.33 +- 8% +114.9% 196.25 +- 13% slabinfo.bio-1.active_slabs
4689 +- 7% +114.1% 10042 +- 13% slabinfo.bio-1.num_objs
91.33 +- 8% +114.9% 196.25 +- 13% slabinfo.bio-1.num_slabs
1577 +- 11% +17.4% 1852 +- 3% slabinfo.file_lock_cache.active_objs
1577 +- 11% +17.4% 1852 +- 3% slabinfo.file_lock_cache.num_objs
141137 +- 0% +9.6% 154714 +- 0% slabinfo.radix_tree_node.active_objs
2536 +- 0% +10.1% 2792 +- 0% slabinfo.radix_tree_node.active_slabs
2536 +- 0% +10.1% 2792 +- 0% slabinfo.radix_tree_node.num_slabs
237.67 +- 1% +362.8% 1100 +- 10% slabinfo.scsi_data_buffer.active_objs
237.67 +- 1% +362.8% 1100 +- 10% slabinfo.scsi_data_buffer.num_objs
55.67 +- 2% +364.4% 258.50 +- 10% slabinfo.xfs_efd_item.active_objs
55.67 +- 2% +364.4% 258.50 +- 10% slabinfo.xfs_efd_item.num_objs
16164022 +- 9% +41.9% 22943479 +- 3% meminfo.Active
6736631 +- 21% +94.4% 13096110 +- 8% meminfo.Active(anon)
40644 +- 3% -22.7% 31400 +- 8% meminfo.AnonPages
53765 +- 3% -57.0% 23116 +- 40% meminfo.CmaFree
14739819 +- 2% +20.1% 17695829 +- 1% meminfo.Committed_AS
4970502 +- 0% -15.7% 4191241 +- 10% meminfo.DirectMap2M
269001 +- 9% +184.4% 765157 +- 9% meminfo.Dirty
12785098 +- 12% -47.4% 6720300 +- 11% meminfo.Inactive
7016777 +- 22% -51.5% 3400607 +- 24% meminfo.Inactive(anon)
5768320 +- 2% -42.4% 3319691 +- 3% meminfo.Inactive(file)
14994701 +- 0% -14.5% 12819235 +- 2% meminfo.Mapped
17735647 +- 0% -15.9% 14916220 +- 1% meminfo.MemAvailable
2684208 +- 3% -29.3% 1897326 +- 5% meminfo.MemFree
13712595 +- 1% +20.1% 16463501 +- 1% meminfo.Shmem
343.00 +- 43% +487.0% 2013 +- 20% meminfo.SwapCached
371115 +- 1% +25.3% 465109 +- 0% meminfo.Unevictable
14413 +- 4% +203.9% 43804 +- 9% meminfo.Writeback
9385084 +- 1% +13.8% 10679611 +- 4% numa-meminfo.node0.Active
756567 +- 32% +409.3% 3853571 +- 72% numa-meminfo.node0.Active(anon)
5904595 +- 1% -34.2% 3885121 +- 14% numa-meminfo.node0.Inactive
5205257 +- 2% -55.2% 2330053 +- 41% numa-meminfo.node0.Inactive(file)
13699647 +- 1% -34.8% 8928305 +- 46% numa-meminfo.node0.Mapped
440455 +- 3% +142.8% 1069535 +- 6% numa-meminfo.node0.MemFree
29250 +- 1% -27.7% 21144 +- 33% numa-meminfo.node0.PageTables
1439104 +- 16% +274.8% 5393219 +- 80% numa-meminfo.node0.Shmem
185595 +- 1% +25.3% 232539 +- 0% numa-meminfo.node0.Unevictable
6772106 +- 22% +80.9% 12250113 +- 3% numa-meminfo.node1.Active
23876 +- 17% -28.2% 17134 +- 6% numa-meminfo.node1.AnonPages
32952 +- 18% +607.5% 233145 +- 75% numa-meminfo.node1.Dirty
13808627 +- 0% +10.8% 15298244 +- 0% numa-meminfo.node1.FilePages
6874580 +- 21% -58.8% 2832391 +- 14% numa-meminfo.node1.Inactive
6314396 +- 24% -70.8% 1844616 +- 45% numa-meminfo.node1.Inactive(anon)
2256364 +- 2% -62.6% 844215 +- 5% numa-meminfo.node1.MemFree
5797 +- 7% +101.6% 11688 +- 64% numa-meminfo.node1.PageTables
185687 +- 1% +25.2% 232559 +- 0% numa-meminfo.node1.Unevictable
719.33 +- 38% +1632.4% 12461 +- 71% numa-meminfo.node1.Writeback
6.292e+12 +- 2% +95.7% 1.232e+13 +- 8% perf-stat.branch-instructions
0.30 +- 1% -42.1% 0.17 +- 20% perf-stat.branch-miss-rate
1.888e+10 +- 3% +11.5% 2.105e+10 +- 11% perf-stat.branch-misses
14.62 +- 1% +23.2% 18.02 +- 3% perf-stat.cache-miss-rate
1.919e+10 +- 1% +123.0% 4.28e+10 +- 2% perf-stat.cache-misses
1.313e+11 +- 2% +81.0% 2.377e+11 +- 2% perf-stat.cache-references
4.012e+08 +- 3% +19.1% 4.78e+08 +- 9% perf-stat.context-switches
5.136e+13 +- 3% +52.5% 7.832e+13 +- 8% perf-stat.cpu-cycles
88752877 +- 3% -48.0% 46187791 +- 31% perf-stat.cpu-migrations
2.412e+09 +- 3% +86.1% 4.487e+09 +- 14% perf-stat.dTLB-load-misses
5.305e+12 +- 2% +67.3% 8.875e+12 +- 5% perf-stat.dTLB-loads
0.12 +- 0% +29.7% 0.16 +- 5% perf-stat.dTLB-store-miss-rate
2.088e+09 +- 1% +173.3% 5.706e+09 +- 9% perf-stat.dTLB-store-misses
1.688e+12 +- 1% +110.0% 3.546e+12 +- 5% perf-stat.dTLB-stores
4.126e+08 +- 7% +58.7% 6.55e+08 +- 5% perf-stat.iTLB-load-misses
4.566e+09 +- 3% +36.1% 6.213e+09 +- 13% perf-stat.iTLB-loads
2.373e+13 +- 2% +79.4% 4.259e+13 +- 7% perf-stat.instructions
0.46 +- 0% +17.8% 0.54 +- 1% perf-stat.ipc
5241 +- 61% +104.7% 10732 +- 12% perf-stat.major-faults
90.94 +- 0% -6.6% 84.93 +- 2% perf-stat.node-load-miss-rate
7.659e+09 +- 3% +18.8% 9.102e+09 +- 21% perf-stat.node-load-misses
7.635e+08 +- 3% +106.3% 1.575e+09 +- 9% perf-stat.node-loads
81.14 +- 0% +5.1% 85.25 +- 0% perf-stat.node-store-miss-rate
8.228e+09 +- 0% +236.1% 2.766e+10 +- 6% perf-stat.node-store-misses
1.913e+09 +- 1% +149.6% 4.775e+09 +- 3% perf-stat.node-stores
189100 +- 32% +409.9% 964261 +- 72% numa-vmstat.node0.nr_active_anon
109202 +- 5% +143.6% 266006 +- 6% numa-vmstat.node0.nr_free_pages
1301621 +- 2% -55.2% 583094 +- 41% numa-vmstat.node0.nr_inactive_file
3426754 +- 1% -34.9% 2229299 +- 46% numa-vmstat.node0.nr_mapped
7314 +- 1% -27.7% 5286 +- 33% numa-vmstat.node0.nr_page_table_pages
359786 +- 16% +275.0% 1349171 +- 80% numa-vmstat.node0.nr_shmem
46372 +- 1% +25.3% 58124 +- 0% numa-vmstat.node0.nr_unevictable
119145 +- 75% +891.4% 1181173 +- 90% numa-vmstat.node0.nr_vmscan_write
849473 +- 21% +788.8% 7550257 +- 37% numa-vmstat.node0.numa_foreign
8162609 +- 7% +243.0% 27998161 +- 45% numa-vmstat.node0.numa_hit
8162607 +- 7% +243.0% 27998158 +- 45% numa-vmstat.node0.numa_local
3187818 +- 11% +339.2% 14002372 +- 48% numa-vmstat.node0.numa_miss
515307 +- 44% +607.8% 3647375 +- 38% numa-vmstat.node0.workingset_refault
5968 +- 17% -28.2% 4283 +- 6% numa-vmstat.node1.nr_anon_pages
776330 +- 15% +1072.5% 9102811 +- 72% numa-vmstat.node1.nr_dirtied
8251 +- 19% +607.4% 58366 +- 74% numa-vmstat.node1.nr_dirty
3453327 +- 0% +10.8% 3826357 +- 0% numa-vmstat.node1.nr_file_pages
13419 +- 3% -57.2% 5749 +- 39% numa-vmstat.node1.nr_free_cma
562913 +- 3% -62.8% 209251 +- 5% numa-vmstat.node1.nr_free_pages
1579041 +- 24% -70.8% 461228 +- 45% numa-vmstat.node1.nr_inactive_anon
1448 +- 7% +101.9% 2924 +- 64% numa-vmstat.node1.nr_page_table_pages
46394 +- 1% +25.3% 58129 +- 0% numa-vmstat.node1.nr_unevictable
155.67 +- 45% +1942.3% 3179 +- 71% numa-vmstat.node1.nr_writeback
2058751 +- 37% +420.0% 10706310 +- 52% numa-vmstat.node1.nr_written
3133640 +- 11% +347.1% 14010896 +- 48% numa-vmstat.node1.numa_foreign
795294 +- 22% +850.4% 7558764 +- 37% numa-vmstat.node1.numa_miss
287258 +- 48% +1387.2% 4272007 +- 42% numa-vmstat.node1.workingset_refault
11409 +- 2% +268.3% 42026 +- 13% latency_stats.hits.call_rwsem_down_read_failed.__do_page_fault.do_page_fault.page_fault
281007 +- 24% +653.2% 2116410 +- 29% latency_stats.hits.call_rwsem_down_read_failed.xfs_ilock.xfs_ilock_data_map_shared.__xfs_get_blocks.xfs_get_blocks.do_mpage_readpage.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead.ondemand_readahead.page_cache_async_readahead.filemap_fault
69181 +- 29% +634.1% 507896 +- 24% latency_stats.hits.call_rwsem_down_write_failed.xfs_ilock.__xfs_get_blocks.xfs_get_blocks.__block_write_begin.block_page_mkwrite.xfs_filemap_page_mkwrite.do_page_mkwrite.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault
11018 +- 32% +526.5% 69026 +- 26% latency_stats.hits.call_rwsem_down_write_failed.xfs_ilock.xfs_vn_update_time.file_update_time.xfs_filemap_page_mkwrite.do_page_mkwrite.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
2055 +- 2% +364.7% 9553 +- 25% latency_stats.hits.wait_on_page_bit.__filemap_fdatawait_range.filemap_fdatawait_range.filemap_write_and_wait_range.xfs_file_fsync.vfs_fsync_range.SyS_msync.entry_SYSCALL_64_fastpath
817463 +- 7% +222.0% 2632368 +- 50% latency_stats.hits.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
16296929 +- 5% +446.0% 88981686 +- 15% latency_stats.hits.wait_on_page_bit_killable.__lock_page_or_retry.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
9653 +-111% +1480.4% 152555 +-138% latency_stats.max.bt_get.blk_mq_get_tag.__blk_mq_alloc_request.blk_mq_map_request.blk_sq_make_request.generic_make_request.submit_bio.xfs_submit_ioend.xfs_do_writepage.write_cache_pages.xfs_vm_writepages.do_writepages
385.67 +- 66% +3536.9% 14026 +- 52% latency_stats.max.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
5259 +- 50% +904.5% 52829 +- 41% latency_stats.sum.bt_get.blk_mq_get_tag.__blk_mq_alloc_request.blk_mq_map_request.blk_sq_make_request.generic_make_request.submit_bio.xfs_add_to_ioend.xfs_do_writepage.write_cache_pages.xfs_vm_writepages.do_writepages
19723 +- 50% +1299.7% 276069 +- 33% latency_stats.sum.call_rwsem_down_read_failed.SyS_madvise.entry_SYSCALL_64_fastpath
1246087 +- 7% +284.0% 4785330 +- 11% latency_stats.sum.call_rwsem_down_read_failed.__do_page_fault.do_page_fault.page_fault
8300 +- 40% +506.6% 50348 +- 38% latency_stats.sum.call_rwsem_down_read_failed.do_exit.SyS_exit.entry_SYSCALL_64_fastpath
1126112 +- 27% +618.3% 8088329 +- 26% latency_stats.sum.call_rwsem_down_read_failed.xfs_ilock.xfs_ilock_data_map_shared.__xfs_get_blocks.xfs_get_blocks.do_mpage_readpage.mpage_readpages.xfs_vm_readpages.__do_page_cache_readahead.ondemand_readahead.page_cache_async_readahead.filemap_fault
27051 +- 6% +249.3% 94500 +- 27% latency_stats.sum.call_rwsem_down_write_failed.xfs_ilock.__xfs_get_blocks.xfs_get_blocks.__block_write_begin.block_page_mkwrite.xfs_filemap_page_mkwrite.do_page_mkwrite.do_wp_page.handle_pte_fault.handle_mm_fault.__do_page_fault
857851 +- 35% +543.9% 5524012 +- 25% latency_stats.sum.call_rwsem_down_write_failed.xfs_ilock.__xfs_get_blocks.xfs_get_blocks.__block_write_begin.block_page_mkwrite.xfs_filemap_page_mkwrite.do_page_mkwrite.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault
141821 +- 5% +189.9% 411157 +- 24% latency_stats.sum.call_rwsem_down_write_failed.xfs_ilock.xfs_vn_update_time.file_update_time.xfs_filemap_page_mkwrite.do_page_mkwrite.do_wp_page.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
1642514 +- 37% +427.8% 8669074 +- 29% latency_stats.sum.call_rwsem_down_write_failed.xfs_ilock.xfs_vn_update_time.file_update_time.xfs_filemap_page_mkwrite.do_page_mkwrite.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
14112 +- 15% +264.5% 51435 +- 5% latency_stats.sum.call_rwsem_down_write_failed_killable.SyS_mprotect.entry_SYSCALL_64_fastpath
9798 +- 20% +605.6% 69135 +- 22% latency_stats.sum.call_rwsem_down_write_failed_killable.SyS_munmap.entry_SYSCALL_64_fastpath
14778 +- 16% +248.9% 51564 +- 16% latency_stats.sum.call_rwsem_down_write_failed_killable.vm_mmap_pgoff.SyS_mmap_pgoff.SyS_mmap.entry_SYSCALL_64_fastpath
23928 +- 15% +252.2% 84281 +- 20% latency_stats.sum.devkmsg_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
3134 +- 67% +327.5% 13401 +- 55% latency_stats.sum.futex_wait_queue_me.futex_wait.do_futex.SyS_futex.entry_SYSCALL_64_fastpath
2503 +- 54% +550.3% 16282 +- 70% latency_stats.sum.stop_two_cpus.migrate_swap.task_numa_migrate.numa_migrate_preferred.task_numa_fault.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
335934 +- 6% +334.3% 1459124 +- 24% latency_stats.sum.wait_on_page_bit.__filemap_fdatawait_range.filemap_fdatawait_range.filemap_write_and_wait_range.xfs_file_fsync.vfs_fsync_range.SyS_msync.entry_SYSCALL_64_fastpath
6291964 +- 7% +275.4% 23622933 +- 45% latency_stats.sum.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
3.97e+08 +- 12% +553.5% 2.595e+09 +- 12% latency_stats.sum.wait_on_page_bit_killable.__lock_page_or_retry.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
61635 +- 10% +173.5% 168594 +- 21% latency_stats.sum.wait_woken.inotify_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
44.33 +- 82% +945.5% 463.50 +- 52% proc-vmstat.allocstall
239.67 +- 62% +407.6% 1216 +- 35% proc-vmstat.kswapd_high_wmark_hit_quickly
1684142 +- 22% +94.3% 3272966 +- 8% proc-vmstat.nr_active_anon
10169 +- 3% -22.7% 7856 +- 8% proc-vmstat.nr_anon_pages
17334991 +- 0% +279.0% 65696881 +- 14% proc-vmstat.nr_dirtied
67150 +- 9% +185.0% 191346 +- 9% proc-vmstat.nr_dirty
440744 +- 0% -15.9% 370493 +- 1% proc-vmstat.nr_dirty_background_threshold
882566 +- 0% -15.9% 741892 +- 1% proc-vmstat.nr_dirty_threshold
13420 +- 3% -56.9% 5778 +- 40% proc-vmstat.nr_free_cma
672764 +- 3% -29.2% 476003 +- 5% proc-vmstat.nr_free_pages
1753420 +- 22% -51.5% 850281 +- 24% proc-vmstat.nr_inactive_anon
1441315 +- 2% -42.4% 830209 +- 3% proc-vmstat.nr_inactive_file
3747801 +- 0% -14.6% 3201756 +- 2% proc-vmstat.nr_mapped
3271 +- 65% +234.5% 10940 +- 11% proc-vmstat.nr_pages_scanned
3427344 +- 1% +20.1% 4114926 +- 1% proc-vmstat.nr_shmem
92800 +- 1% +25.3% 116264 +- 0% proc-vmstat.nr_unevictable
1283 +- 43% +937.2% 13310 +- 36% proc-vmstat.nr_vmscan_immediate_reclaim
1471488 +- 50% +156.7% 3776625 +- 19% proc-vmstat.nr_vmscan_write
3738 +- 8% +204.2% 11371 +- 6% proc-vmstat.nr_writeback
19863390 +- 5% +248.9% 69302109 +- 13% proc-vmstat.nr_written
7939117 +- 17% +396.2% 39393425 +- 24% proc-vmstat.numa_foreign
13788271 +- 4% +234.3% 46097371 +- 12% proc-vmstat.numa_hint_faults
8596174 +- 4% +325.2% 36550560 +- 13% proc-vmstat.numa_hint_faults_local
44180066 +- 1% +117.7% 96181368 +- 16% proc-vmstat.numa_hit
44180062 +- 1% +117.7% 96181364 +- 16% proc-vmstat.numa_local
7939117 +- 17% +396.2% 39393425 +- 24% proc-vmstat.numa_miss
83653 +- 5% +198.3% 249552 +- 56% proc-vmstat.numa_pages_migrated
31393579 +- 0% +264.6% 1.145e+08 +- 13% proc-vmstat.numa_pte_updates
950.00 +- 76% +213.2% 2975 +- 16% proc-vmstat.pageoutrun
22952028 +- 4% +256.9% 81926442 +- 16% proc-vmstat.pgactivate
2593340 +- 6% +301.7% 10418533 +- 27% proc-vmstat.pgalloc_dma32
50666335 +- 2% +150.7% 1.27e+08 +- 3% proc-vmstat.pgalloc_normal
7790592 +- 16% +406.6% 39465289 +- 29% proc-vmstat.pgdeactivate
4.301e+08 +- 3% +19.4% 5.134e+08 +- 4% proc-vmstat.pgfault
51805819 +- 3% +162.3% 1.359e+08 +- 5% proc-vmstat.pgfree
5200 +- 61% +99.6% 10380 +- 12% proc-vmstat.pgmajfault
2396462 +- 7% +62.3% 3890122 +- 55% proc-vmstat.pgmigrate_fail
85114 +- 4% +199.0% 254469 +- 57% proc-vmstat.pgmigrate_success
69341960 +- 0% +279.0% 2.628e+08 +- 14% proc-vmstat.pgpgout
934568 +- 7% +357.9% 4279061 +- 30% proc-vmstat.pgrefill_dma32
6860254 +- 17% +412.9% 35186854 +- 28% proc-vmstat.pgrefill_normal
445307 +- 69% +544.2% 2868453 +- 36% proc-vmstat.pgrotated
286646 +- 88% +408.9% 1458838 +- 27% proc-vmstat.pgscan_direct_dma32
5511171 +-110% +186.8% 15804483 +- 25% proc-vmstat.pgscan_direct_normal
1931438 +- 18% +314.0% 7995975 +- 23% proc-vmstat.pgscan_kswapd_dma32
22368566 +- 17% +355.1% 1.018e+08 +- 19% proc-vmstat.pgscan_kswapd_normal
4343 +-136% +239.2% 14732 +- 59% proc-vmstat.pgsteal_direct_dma32
5192 +-115% +1117.3% 63207 +- 25% proc-vmstat.pgsteal_direct_normal
405523 +- 35% +447.9% 2221947 +- 22% proc-vmstat.pgsteal_kswapd_dma32
6418737 +- 24% +389.7% 31434080 +- 23% proc-vmstat.pgsteal_kswapd_normal
629598 +- 44% +554.0% 4117299 +- 67% proc-vmstat.workingset_activate
802748 +- 44% +886.8% 7921519 +- 33% proc-vmstat.workingset_refault
2.02 +- 7% +180.4% 5.67 +- 21% perf-profile.cycles-pp.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate
0.00 +- -1% +Inf% 3.09 +- 72% perf-profile.cycles-pp.__do_fault.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault
50.59 +- 2% -35.6% 32.61 +- 25% perf-profile.cycles-pp.__do_page_fault.do_page_fault.page_fault
0.00 +- -1% +Inf% 2.22 +- 67% perf-profile.cycles-pp.__lock_page_or_retry.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault
4.08 +- 18% -63.2% 1.50 +- 75% perf-profile.cycles-pp.__migration_entry_wait.migration_entry_wait.handle_pte_fault.handle_mm_fault.__do_page_fault
0.58 +- 7% +157.6% 1.50 +- 17% perf-profile.cycles-pp.__schedule.schedule.schedule_preempt_disabled.cpu_startup_entry.start_secondary
0.82 +- 7% +101.9% 1.66 +- 28% perf-profile.cycles-pp.__schedule.schedule.schedule_timeout.io_schedule_timeout.bit_wait_io
0.00 +- -1% +Inf% 2.04 +- 67% perf-profile.cycles-pp.__wait_on_bit.wait_on_page_bit_killable.__lock_page_or_retry.filemap_fault.xfs_filemap_fault
0.00 +- -1% +Inf% 2.56 +- 62% perf-profile.cycles-pp.__wake_up_common.__wake_up.__wake_up_bit.unlock_page.fault_dirty_shared_page.isra.58
0.00 +- -1% +Inf% 1.21 +- 42% perf-profile.cycles-pp.__writeback_inodes_wb.wb_writeback.wb_workfn.process_one_work.worker_thread
0.00 +- -1% +Inf% 1.21 +- 42% perf-profile.cycles-pp.__writeback_single_inode.writeback_sb_inodes.__writeback_inodes_wb.wb_writeback.wb_workfn
4.06 +- 18% -63.9% 1.46 +- 75% perf-profile.cycles-pp._raw_spin_lock.__migration_entry_wait.migration_entry_wait.handle_pte_fault.handle_mm_fault
11.58 +- 8% -100.0% 0.00 +- -1% perf-profile.cycles-pp._raw_spin_lock.do_wp_page.handle_pte_fault.handle_mm_fault.__do_page_fault
24.80 +- 7% -76.8% 5.75 +- 65% perf-profile.cycles-pp._raw_spin_lock.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault
0.26 +-141% +505.7% 1.59 +- 6% perf-profile.cycles-pp.activate_task.ttwu_do_activate.sched_ttwu_pending.cpu_startup_entry.start_secondary
2.12 +- 6% +146.1% 5.23 +- 27% perf-profile.cycles-pp.activate_task.ttwu_do_activate.try_to_wake_up.default_wake_function.wake_bit_function
0.00 +- -1% +Inf% 0.79 +- 38% perf-profile.cycles-pp.bit_wait_io.__wait_on_bit.wait_on_page_bit_killable.__lock_page_or_retry.filemap_fault
2.81 +- 4% +171.7% 7.64 +- 22% perf-profile.cycles-pp.default_wake_function.wake_bit_function.__wake_up_common.__wake_up.__wake_up_bit
50.62 +- 2% -35.4% 32.70 +- 25% perf-profile.cycles-pp.do_page_fault.page_fault
17.68 +- 5% -32.6% 11.92 +- 46% perf-profile.cycles-pp.do_wp_page.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault
0.00 +- -1% +Inf% 1.21 +- 42% perf-profile.cycles-pp.do_writepages.__writeback_single_inode.writeback_sb_inodes.__writeback_inodes_wb.wb_writeback
1.80 +- 6% +138.5% 4.29 +- 15% perf-profile.cycles-pp.dump_trace.save_stack_trace_tsk.__account_scheduler_latency.enqueue_entity.enqueue_task_fair
0.25 +-141% +506.0% 1.51 +- 6% perf-profile.cycles-pp.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate.sched_ttwu_pending
2.05 +- 7% +146.6% 5.05 +- 27% perf-profile.cycles-pp.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up
0.26 +-141% +503.9% 1.55 +- 6% perf-profile.cycles-pp.enqueue_task_fair.activate_task.ttwu_do_activate.sched_ttwu_pending.cpu_startup_entry
2.07 +- 7% +146.8% 5.12 +- 27% perf-profile.cycles-pp.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up.default_wake_function
0.00 +- -1% +Inf% 2.95 +- 72% perf-profile.cycles-pp.filemap_fault.xfs_filemap_fault.__do_fault.handle_pte_fault.handle_mm_fault
0.00 +- -1% +Inf% 2.09 +- 65% perf-profile.cycles-pp.finish_mkwrite_fault.do_wp_page.handle_pte_fault.handle_mm_fault.__do_page_fault
49.86 +- 2% -38.3% 30.75 +- 25% perf-profile.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
49.51 +- 3% -39.5% 29.95 +- 25% perf-profile.cycles-pp.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
0.00 +- -1% +Inf% 0.78 +- 38% perf-profile.cycles-pp.io_schedule_timeout.bit_wait_io.__wait_on_bit.wait_on_page_bit_killable.__lock_page_or_retry
0.00 +- -1% +Inf% 3.66 +- 41% perf-profile.cycles-pp.kthread.ret_from_fork
0.00 +- -1% +Inf% 1.34 +- 48% perf-profile.cycles-pp.kthread_worker_fn.kthread.ret_from_fork
0.00 +- -1% +Inf% 1.32 +- 48% perf-profile.cycles-pp.loop_queue_work.kthread_worker_fn.kthread.ret_from_fork
4.12 +- 18% -62.9% 1.53 +- 75% perf-profile.cycles-pp.migration_entry_wait.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault
4.03 +- 18% -65.3% 1.40 +- 75% perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__migration_entry_wait.migration_entry_wait.handle_pte_fault
11.44 +- 9% -100.0% 0.00 +- -1% perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.do_wp_page.handle_pte_fault.handle_mm_fault
24.44 +- 7% -77.2% 5.56 +- 65% perf-profile.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.handle_pte_fault.handle_mm_fault.__do_page_fault
50.63 +- 2% -35.3% 32.75 +- 25% perf-profile.cycles-pp.page_fault
5.02 +- 3% -50.8% 2.47 +- 18% perf-profile.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
1.36 +- 9% +160.4% 3.53 +- 16% perf-profile.cycles-pp.print_context_stack.dump_trace.save_stack_trace_tsk.__account_scheduler_latency.enqueue_entity
0.00 +- -1% +Inf% 1.29 +- 38% perf-profile.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
0.00 +- -1% +Inf% 3.66 +- 41% perf-profile.cycles-pp.ret_from_fork
1.80 +- 6% +138.4% 4.30 +- 15% perf-profile.cycles-pp.save_stack_trace_tsk.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.activate_task
0.66 +- 25% +170.2% 1.79 +- 6% perf-profile.cycles-pp.sched_ttwu_pending.cpu_startup_entry.start_secondary
0.61 +- 6% +157.6% 1.58 +- 17% perf-profile.cycles-pp.schedule.schedule_preempt_disabled.cpu_startup_entry.start_secondary
0.62 +- 6% +156.3% 1.60 +- 17% perf-profile.cycles-pp.schedule_preempt_disabled.cpu_startup_entry.start_secondary
2.77 +- 4% +172.7% 7.54 +- 23% perf-profile.cycles-pp.try_to_wake_up.default_wake_function.wake_bit_function.__wake_up_common.__wake_up
0.28 +-141% +495.5% 1.67 +- 6% perf-profile.cycles-pp.ttwu_do_activate.sched_ttwu_pending.cpu_startup_entry.start_secondary
2.24 +- 5% +145.3% 5.49 +- 28% perf-profile.cycles-pp.ttwu_do_activate.try_to_wake_up.default_wake_function.wake_bit_function.__wake_up_common
0.00 +- -1% +Inf% 2.06 +- 67% perf-profile.cycles-pp.wait_on_page_bit_killable.__lock_page_or_retry.filemap_fault.xfs_filemap_fault.__do_fault
3.03 +- 4% +155.6% 7.75 +- 22% perf-profile.cycles-pp.wake_bit_function.__wake_up_common.__wake_up.__wake_up_bit.unlock_page
0.00 +- -1% +Inf% 1.21 +- 42% perf-profile.cycles-pp.wb_workfn.process_one_work.worker_thread.kthread.ret_from_fork
0.00 +- -1% +Inf% 1.21 +- 42% perf-profile.cycles-pp.wb_writeback.wb_workfn.process_one_work.worker_thread.kthread
0.00 +- -1% +Inf% 1.29 +- 38% perf-profile.cycles-pp.worker_thread.kthread.ret_from_fork
0.00 +- -1% +Inf% 1.21 +- 42% perf-profile.cycles-pp.write_cache_pages.xfs_vm_writepages.do_writepages.__writeback_single_inode.writeback_sb_inodes
0.00 +- -1% +Inf% 1.21 +- 42% perf-profile.cycles-pp.writeback_sb_inodes.__writeback_inodes_wb.wb_writeback.wb_workfn.process_one_work
0.00 +- -1% +Inf% 3.08 +- 72% perf-profile.cycles-pp.xfs_filemap_fault.__do_fault.handle_pte_fault.handle_mm_fault.__do_page_fault
0.00 +- -1% +Inf% 1.21 +- 42% perf-profile.cycles-pp.xfs_vm_writepages.do_writepages.__writeback_single_inode.writeback_sb_inodes.__writeback_inodes_wb
0.22 +- 4% +299.2% 0.86 +- 28% perf-profile.func.cycles-pp.__account_scheduler_latency
0.56 +- 0% +97.5% 1.11 +- 12% perf-profile.func.cycles-pp.__kernel_text_address
0.55 +- 12% +81.6% 1.00 +- 25% perf-profile.func.cycles-pp._raw_spin_lock
0.38 +- 6% +132.2% 0.89 +- 16% perf-profile.func.cycles-pp._raw_spin_lock_irqsave
0.08 +- 0% +712.5% 0.65 +- 53% perf-profile.func.cycles-pp.memcpy_erms
41.85 +- 4% -51.6% 20.27 +- 46% perf-profile.func.cycles-pp.native_queued_spin_lock_slowpath
5.06 +- 3% -50.2% 2.52 +- 18% perf-profile.func.cycles-pp.poll_idle
0.47 +- 0% +147.7% 1.17 +- 21% perf-profile.func.cycles-pp.print_context_stack
0.20 +- 16% +477.9% 1.18 +- 27% perf-profile.func.cycles-pp.smp_call_function_many
83234 +- 59% +210.7% 258569 +- 29% sched_debug.cfs_rq:/.MIN_vruntime.avg
2954789 +- 51% +147.4% 7310402 +- 24% sched_debug.cfs_rq:/.MIN_vruntime.max
460002 +- 54% +174.1% 1261004 +- 24% sched_debug.cfs_rq:/.MIN_vruntime.stddev
122858 +- 1% +64.1% 201649 +- 8% sched_debug.cfs_rq:/.exec_clock.avg
32708 +- 5% +383.9% 158287 +- 17% sched_debug.cfs_rq:/.exec_clock.min
68617 +- 3% -52.1% 32856 +- 29% sched_debug.cfs_rq:/.exec_clock.stddev
44289 +- 97% +724.1% 364993 +- 58% sched_debug.cfs_rq:/.load.avg
1994839 +-120% +808.8% 18128382 +- 54% sched_debug.cfs_rq:/.load.max
271440 +-117% +816.6% 2488134 +- 56% sched_debug.cfs_rq:/.load.stddev
34.80 +- 59% +929.5% 358.22 +- 36% sched_debug.cfs_rq:/.load_avg.avg
1295 +- 72% +1158.0% 16294 +- 35% sched_debug.cfs_rq:/.load_avg.max
180.36 +- 68% +1127.0% 2213 +- 36% sched_debug.cfs_rq:/.load_avg.stddev
83234 +- 59% +210.7% 258569 +- 29% sched_debug.cfs_rq:/.max_vruntime.avg
2954789 +- 51% +147.4% 7310402 +- 24% sched_debug.cfs_rq:/.max_vruntime.max
460002 +- 54% +174.1% 1261004 +- 24% sched_debug.cfs_rq:/.max_vruntime.stddev
4319334 +- 0% +141.6% 10435459 +- 9% sched_debug.cfs_rq:/.min_vruntime.avg
7840197 +- 2% +60.1% 12552896 +- 3% sched_debug.cfs_rq:/.min_vruntime.max
1231917 +- 6% +570.5% 8260526 +- 18% sched_debug.cfs_rq:/.min_vruntime.min
0.30 +- 1% +52.3% 0.46 +- 17% sched_debug.cfs_rq:/.nr_running.avg
0.97 +- 2% +60.9% 1.56 +- 16% sched_debug.cfs_rq:/.nr_spread_over.avg
1.92 +- 27% +173.7% 5.25 +- 22% sched_debug.cfs_rq:/.nr_spread_over.max
0.15 +- 59% +482.4% 0.87 +- 17% sched_debug.cfs_rq:/.nr_spread_over.stddev
18.77 +-104% +1282.1% 259.45 +- 48% sched_debug.cfs_rq:/.runnable_load_avg.avg
955.11 +-110% +1346.5% 13816 +- 48% sched_debug.cfs_rq:/.runnable_load_avg.max
127.51 +-109% +1336.9% 1832 +- 48% sched_debug.cfs_rq:/.runnable_load_avg.stddev
-3520631 +- -4% -59.1% -1440204 +--72% sched_debug.cfs_rq:/.spread0.avg
310.50 +-112% +2.2e+05% 677308 +- 64% sched_debug.cfs_rq:/.spread0.max
-6608099 +- -3% -45.3% -3615267 +--43% sched_debug.cfs_rq:/.spread0.min
286.24 +- 5% +61.1% 461.24 +- 19% sched_debug.cfs_rq:/.util_avg.avg
623.95 +- 6% +42.5% 888.93 +- 7% sched_debug.cfs_rq:/.util_avg.max
45.40 +- 33% +291.8% 177.89 +- 37% sched_debug.cfs_rq:/.util_avg.min
177.52 +- 7% -22.0% 138.47 +- 8% sched_debug.cfs_rq:/.util_avg.stddev
241086 +- 4% -31.1% 166030 +- 10% sched_debug.cpu.avg_idle.stddev
3.09 +- 12% +191.0% 9.00 +- 36% sched_debug.cpu.clock.stddev
3.09 +- 12% +191.0% 9.00 +- 36% sched_debug.cpu.clock_task.stddev
2.05 +- 24% +10208.7% 211.72 +- 52% sched_debug.cpu.cpu_load[0].avg
78.43 +- 34% +14349.1% 11332 +- 54% sched_debug.cpu.cpu_load[0].max
10.80 +- 33% +13850.8% 1506 +- 54% sched_debug.cpu.cpu_load[0].stddev
20.80 +- 96% +1256.9% 282.25 +- 47% sched_debug.cpu.cpu_load[1].avg
588.31 +- 86% +2355.9% 14448 +- 48% sched_debug.cpu.cpu_load[1].max
100.96 +- 95% +1813.0% 1931 +- 48% sched_debug.cpu.cpu_load[1].stddev
13.59 +- 78% +1932.5% 276.32 +- 42% sched_debug.cpu.cpu_load[2].avg
358.32 +- 64% +3821.9% 14053 +- 44% sched_debug.cpu.cpu_load[2].max
60.06 +- 76% +3020.5% 1874 +- 44% sched_debug.cpu.cpu_load[2].stddev
12.06 +- 74% +2159.9% 272.51 +- 39% sched_debug.cpu.cpu_load[3].avg
315.61 +- 68% +4193.3% 13549 +- 42% sched_debug.cpu.cpu_load[3].max
49.43 +- 73% +3567.2% 1812 +- 41% sched_debug.cpu.cpu_load[3].stddev
12.65 +- 80% +1970.8% 261.96 +- 37% sched_debug.cpu.cpu_load[4].avg
433.85 +- 94% +2821.9% 12676 +- 38% sched_debug.cpu.cpu_load[4].max
59.89 +- 91% +2730.7% 1695 +- 38% sched_debug.cpu.cpu_load[4].stddev
3131 +- 3% +75.0% 5479 +- 18% sched_debug.cpu.curr->pid.avg
43980 +- 96% +608.4% 311539 +- 47% sched_debug.cpu.load.avg
1993032 +-120% +733.6% 16614394 +- 48% sched_debug.cpu.load.max
272149 +-116% +711.8% 2209313 +- 48% sched_debug.cpu.load.stddev
500244 +- 0% +9.0% 545514 +- 4% sched_debug.cpu.max_idle_balance_cost.max
32.43 +-117% +23144.1% 7538 +- 58% sched_debug.cpu.max_idle_balance_cost.stddev
0.00 +- 23% +48.1% 0.00 +- 10% sched_debug.cpu.next_balance.stddev
438825 +- 2% -16.1% 368018 +- 5% sched_debug.cpu.nr_load_updates.max
141481 +- 2% +84.8% 261422 +- 4% sched_debug.cpu.nr_load_updates.min
105941 +- 2% -63.6% 38569 +- 25% sched_debug.cpu.nr_load_updates.stddev
0.30 +- 5% +72.1% 0.52 +- 16% sched_debug.cpu.nr_running.avg
1.46 +- 7% +47.1% 2.14 +- 8% sched_debug.cpu.nr_running.max
0.47 +- 3% +12.2% 0.53 +- 3% sched_debug.cpu.nr_running.stddev
3507693 +- 2% +19.5% 4192972 +- 9% sched_debug.cpu.nr_switches.avg
6655564 +- 3% -14.9% 5664879 +- 8% sched_debug.cpu.nr_switches.max
853965 +- 3% +177.3% 2368023 +- 7% sched_debug.cpu.nr_switches.min
2071286 +- 3% -43.5% 1171089 +- 19% sched_debug.cpu.nr_switches.stddev
0.59 +- 5% -51.3% 0.29 +- 43% sched_debug.cpu.nr_uninterruptible.avg
3509145 +- 2% +19.6% 4197314 +- 9% sched_debug.cpu.sched_count.avg
6652383 +- 3% -14.8% 5667506 +- 8% sched_debug.cpu.sched_count.max
853301 +- 3% +177.7% 2369720 +- 7% sched_debug.cpu.sched_count.min
2071442 +- 3% -43.4% 1171689 +- 19% sched_debug.cpu.sched_count.stddev
1702491 +- 2% +17.9% 2007809 +- 9% sched_debug.cpu.sched_goidle.avg
3253713 +- 3% -16.9% 2704602 +- 8% sched_debug.cpu.sched_goidle.max
409921 +- 3% +176.7% 1134208 +- 6% sched_debug.cpu.sched_goidle.min
1018693 +- 3% -45.2% 558120 +- 18% sched_debug.cpu.sched_goidle.stddev
1793145 +- 2% +21.7% 2182201 +- 9% sched_debug.cpu.ttwu_count.avg
434146 +- 3% +197.5% 1291761 +- 7% sched_debug.cpu.ttwu_count.min
44928 +- 4% +16.7% 52427 +- 5% sched_debug.cpu.ttwu_local.avg
19131 +- 10% -31.1% 13178 +- 23% sched_debug.cpu.ttwu_local.stddev

[ASCII plot: vm-scalability.throughput per bisect sample, y-axis from 0 to 2.5e+07]

	[*] bisect-good sample
	[O] bisect-bad sample

Thanks,
Xiaolong