Greeting,

FYI, we noticed a -2.9% regression of will-it-scale.per_process_ops due to commit:


commit: 71ee870ccb768a5019ff8ebeb47cc9f062559a7a ("[RFC PATCH] mm: readahead: add readahead_shift into backing device")
url: https://github.com/0day-ci/linux/commits/Martin-Liu/mm-readahead-add-readahead_shift-into-backing-device/20190325-133415


in testcase: will-it-scale
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
with following parameters:

	nr_task: 50%
	mode: process
	test: poll2
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale



Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.6/process/50%/debian-x86_64-2018-04-03-no-ucode.cgz/lkp-bdw-ep3d/poll2/will-it-scale

commit:
  v5.1-rc2
  71ee870ccb ("mm: readahead: add readahead_shift into backing device")

        v5.1-rc2 71ee870ccb768a5019ff8ebeb47
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          :5           40%           2:4     dmesg.BUG:unable_to_handle_kernel
          :5           40%           2:4     dmesg.Kernel_panic-not_syncing:Fatal_exception
          :5           40%           2:4     dmesg.Oops:#[##]
          :5           40%           2:4     dmesg.RIP:ondemand_readahead
         %stddev     %change         %stddev
             \          |                \
    330269            -2.9%     320782        will-it-scale.per_process_ops
  14531906            -2.9%   14114429        will-it-scale.workload
     38.92 ± 10%      +2.3%      39.81 ± 10%  boot-time.boot
   6108941 ±121%     -99.4%      38017 ±  3%  cpuidle.C1.usage
     53094 ±  3%      +6.3%      56436 ±  6%  meminfo.Shmem
   6106659 ±121%     -99.4%      36364 ±  3%  turbostat.C1
    245.02            -3.1%     237.54        turbostat.PkgWatt
     16384 ±  9%     -15.8%      13797        numa-vmstat.node0.nr_slab_unreclaimable
      2715 ± 18%     +30.4%       3542        numa-vmstat.node1.nr_mapped
      8641 ± 11%     +14.4%       9889 ±  5%  numa-vmstat.node1.nr_slab_reclaimable
     14101 ± 12%     +19.9%      16907        numa-vmstat.node1.nr_slab_unreclaimable
     65541 ±  9%     -15.8%      55189        numa-meminfo.node0.SUnreclaim
     34563 ± 11%     +14.5%      39558 ±  5%  numa-meminfo.node1.KReclaimable
     34563 ± 11%     +14.5%      39558 ±  5%  numa-meminfo.node1.SReclaimable
     56405 ± 12%     +19.9%      67631        numa-meminfo.node1.SUnreclaim
     90969 ± 12%     +17.8%     107190        numa-meminfo.node1.Slab
     28.50 ± 21%      +7.2       35.70 ± 22%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     28.50 ± 21%      +7.2       35.71 ± 22%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
     28.50 ± 21%      +7.2       35.71 ± 22%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
     29.06 ± 22%      +7.2       36.28 ± 22%  perf-profile.calltrace.cycles-pp.secondary_startup_64
     28.43 ± 22%      +7.2       35.66 ± 22%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     28.32 ± 22%      +7.3       35.62 ± 22%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
     68918            +1.6%      70055        proc-vmstat.nr_active_anon
      4420            +1.1%       4471        proc-vmstat.nr_inactive_anon
     13272 ±  3%      +6.3%      14110 ±  6%  proc-vmstat.nr_shmem
     68918            +1.6%      70055        proc-vmstat.nr_zone_active_anon
      4420            +1.1%       4471        proc-vmstat.nr_zone_inactive_anon
     12592 ±  5%      +9.8%      13822 ±  8%  proc-vmstat.pgactivate
     22.97 ± 16%     +15.2%      26.46 ±  4%  sched_debug.cfs_rq:/.load_avg.avg
     31.76 ± 68%     +91.5%      60.83 ± 16%  sched_debug.cfs_rq:/.removed.load_avg.stddev
      1465 ± 68%     +90.9%       2798 ± 16%  sched_debug.cfs_rq:/.removed.runnable_sum.stddev
     90.67 ± 81%    +122.4%     201.67 ± 28%  sched_debug.cfs_rq:/.removed.util_avg.max
     13.91 ± 72%     +96.0%      27.27 ± 24%  sched_debug.cfs_rq:/.removed.util_avg.stddev
    161427 ±  2%     +53.9%     248474 ± 23%  sched_debug.cpu.avg_idle.stddev
     32.13 ± 11%     -17.8%      26.42 ±  3%  sched_debug.cpu.cpu_load[3].max
    319.40 ±  7%      +9.5%     349.67 ±  3%  sched_debug.cpu.nr_switches.min
     16306 ± 32%     +65.7%      27026 ± 23%  softirqs.CPU11.RCU
     19898 ± 15%     +17.4%      23362 ± 12%  softirqs.CPU25.RCU
     25900 ± 19%     -46.7%      13796 ± 39%  softirqs.CPU38.RCU
     18084 ± 26%     +58.8%      28722 ±  6%  softirqs.CPU39.RCU
     90524 ±  2%     +37.1%     124077 ± 24%  softirqs.CPU39.TIMER
     19146 ± 30%     +21.8%      23323 ± 33%  softirqs.CPU4.RCU
      4625 ±  4%    +336.9%      20209 ± 76%  softirqs.CPU45.SCHED
      4695 ±  2%    +401.2%      23537 ± 79%  softirqs.CPU61.SCHED
     89544           +36.5%     122247 ± 26%  softirqs.CPU61.TIMER
      4691 ±  2%    +404.2%      23656 ± 80%  softirqs.CPU65.SCHED
     89357           +46.9%     131246 ± 32%  softirqs.CPU65.TIMER
     10339 ±  4%     -20.7%       8197 ± 13%  softirqs.CPU66.RCU
      4555 ±  3%    +337.5%      19929 ± 76%  softirqs.CPU66.SCHED
     13869 ± 37%     -39.3%       8419 ±  2%  softirqs.CPU78.RCU
     12163 ± 34%     -33.2%       8125 ±  4%  softirqs.CPU80.RCU
      4660 ±  3%    +331.1%      20087 ± 77%  softirqs.CPU80.SCHED
      5222 ± 24%    +352.1%      23611 ± 79%  softirqs.CPU82.SCHED
     90496 ±  3%     +35.0%     122171 ± 27%  softirqs.CPU82.TIMER
      9935 ±  6%     -19.9%       7958 ±  8%  softirqs.CPU84.RCU
      4653 ±  4%    +325.4%      19793 ± 76%  softirqs.CPU84.SCHED
      9701 ±  6%     -21.8%       7582 ±  8%  softirqs.CPU85.RCU
      4643 ±  4%    +325.2%      19741 ± 76%  softirqs.CPU85.SCHED
      0.08 ±  4%     +11.7%       0.09 ±  3%  perf-stat.i.MPKI
 3.464e+10            -2.9%  3.365e+10        perf-stat.i.branch-instructions
      0.27            +0.0        0.27        perf-stat.i.branch-miss-rate%
    206283 ±  6%     +10.1%     227086 ±  7%  perf-stat.i.cache-misses
  12679443 ±  4%      +9.2%   13840943 ±  3%  perf-stat.i.cache-references
      0.75            +3.0%       0.78        perf-stat.i.cpi
  13397222            -5.7%   12635139        perf-stat.i.dTLB-load-misses
 3.619e+10            -2.9%  3.515e+10        perf-stat.i.dTLB-loads
 1.854e+10            -2.3%  1.811e+10        perf-stat.i.dTLB-stores
 1.636e+11            -2.9%  1.589e+11        perf-stat.i.instructions
      1.33            -2.9%       1.29        perf-stat.i.ipc
     24604 ± 32%     +13.2%      27859 ± 26%  perf-stat.i.node-stores
      0.08 ±  4%     +12.4%       0.09 ±  3%  perf-stat.overall.MPKI
      0.26            +0.0        0.27        perf-stat.overall.branch-miss-rate%
      0.75            +3.0%       0.78        perf-stat.overall.cpi
      1.33            -2.9%       1.29        perf-stat.overall.ipc
 3.452e+10            -2.9%  3.353e+10        perf-stat.ps.branch-instructions
    205685 ±  6%     +10.1%     226440 ±  7%  perf-stat.ps.cache-misses
  12640358 ±  4%      +9.2%   13798889 ±  3%  perf-stat.ps.cache-references
  13352001            -5.7%   12592542        perf-stat.ps.dTLB-load-misses
 3.606e+10            -2.9%  3.503e+10        perf-stat.ps.dTLB-loads
 1.847e+10            -2.3%  1.805e+10        perf-stat.ps.dTLB-stores
  1.63e+11            -2.9%  1.584e+11        perf-stat.ps.instructions
     24533 ± 32%     +13.2%      27780 ± 26%  perf-stat.ps.node-stores
 4.923e+13            -2.9%  4.782e+13        perf-stat.total.instructions
    659.20 ± 14%     -33.4%     439.00 ±  5%  interrupts.32:PCI-MSI.3145729-edge.eth0-TxRx-0
    193.20 ± 13%     +46.5%     283.00 ±  4%  interrupts.35:PCI-MSI.3145732-edge.eth0-TxRx-3
    341.60            +9.0%     372.50        interrupts.9:IO-APIC.9-fasteoi.acpi
    926.20 ± 57%     -59.0%     380.00        interrupts.CPU0.RES:Rescheduling_interrupts
    341.60            +9.0%     372.50        interrupts.CPU1.9:IO-APIC.9-fasteoi.acpi
      2403 ± 32%    +123.2%       5363 ± 46%  interrupts.CPU1.NMI:Non-maskable_interrupts
      2403 ± 32%    +123.2%       5363 ± 46%  interrupts.CPU1.PMI:Performance_monitoring_interrupts
      2983 ± 52%     +44.4%       4309 ± 32%  interrupts.CPU10.NMI:Non-maskable_interrupts
      2983 ± 52%     +44.4%       4309 ± 32%  interrupts.CPU10.PMI:Performance_monitoring_interrupts
    659.20 ± 14%     -33.4%     439.00 ±  5%  interrupts.CPU11.32:PCI-MSI.3145729-edge.eth0-TxRx-0
     60.80 ± 91%   +1764.3%       1133 ± 92%  interrupts.CPU11.RES:Rescheduling_interrupts
    193.20 ± 13%     +46.5%     283.00 ±  4%  interrupts.CPU14.35:PCI-MSI.3145732-edge.eth0-TxRx-3
      2401 ± 32%     +79.0%       4299 ± 33%  interrupts.CPU2.NMI:Non-maskable_interrupts
      2401 ± 32%     +79.0%       4299 ± 33%  interrupts.CPU2.PMI:Performance_monitoring_interrupts
      3289 ± 42%    +107.2%       6814 ± 15%  interrupts.CPU21.NMI:Non-maskable_interrupts
      3289 ± 42%    +107.2%       6814 ± 15%  interrupts.CPU21.PMI:Performance_monitoring_interrupts
      3275 ± 42%    +108.5%       6829 ± 16%  interrupts.CPU22.NMI:Non-maskable_interrupts
      3275 ± 42%    +108.5%       6829 ± 16%  interrupts.CPU22.PMI:Performance_monitoring_interrupts
      2404 ± 32%     +79.0%       4305 ± 33%  interrupts.CPU3.NMI:Non-maskable_interrupts
      2404 ± 32%     +79.0%       4305 ± 33%  interrupts.CPU3.PMI:Performance_monitoring_interrupts
      2940 ± 51%     +46.3%       4301 ± 33%  interrupts.CPU31.NMI:Non-maskable_interrupts
      2940 ± 51%     +46.3%       4301 ± 33%  interrupts.CPU31.PMI:Performance_monitoring_interrupts
      2368 ± 28%     +81.8%       4304 ± 33%  interrupts.CPU33.NMI:Non-maskable_interrupts
      2368 ± 28%     +81.8%       4304 ± 33%  interrupts.CPU33.PMI:Performance_monitoring_interrupts
     41.80 ± 77%    +164.4%     110.50 ± 52%  interrupts.CPU33.RES:Rescheduling_interrupts
      2872 ± 26%     +50.0%       4308 ± 32%  interrupts.CPU34.NMI:Non-maskable_interrupts
      2872 ± 26%     +50.0%       4308 ± 32%  interrupts.CPU34.PMI:Performance_monitoring_interrupts
      2872 ± 25%    +137.3%       6815 ± 15%  interrupts.CPU35.NMI:Non-maskable_interrupts
      2872 ± 25%    +137.3%       6815 ± 15%  interrupts.CPU35.PMI:Performance_monitoring_interrupts
      2660 ± 18%    +156.6%       6826 ± 16%  interrupts.CPU36.NMI:Non-maskable_interrupts
      2660 ± 18%    +156.6%       6826 ± 16%  interrupts.CPU36.PMI:Performance_monitoring_interrupts
     40.60 ± 50%    +134.0%      95.00 ± 36%  interrupts.CPU38.RES:Rescheduling_interrupts
      2395 ± 32%     +79.5%       4300 ± 33%  interrupts.CPU4.NMI:Non-maskable_interrupts
      2395 ± 32%     +79.5%       4300 ± 33%  interrupts.CPU4.PMI:Performance_monitoring_interrupts
     68.20 ± 33%   +1063.5%     793.50 ± 12%  interrupts.CPU4.RES:Rescheduling_interrupts
      2986 ±  5%    +129.2%       6843 ± 16%  interrupts.CPU40.NMI:Non-maskable_interrupts
      2986 ±  5%    +129.2%       6843 ± 16%  interrupts.CPU40.PMI:Performance_monitoring_interrupts
      2984 ±  5%    +128.6%       6821 ± 16%  interrupts.CPU41.NMI:Non-maskable_interrupts
      2984 ±  5%    +128.6%       6821 ± 16%  interrupts.CPU41.PMI:Performance_monitoring_interrupts
      2991 ±  5%     +43.9%       4305 ± 32%  interrupts.CPU43.NMI:Non-maskable_interrupts
      2991 ±  5%     +43.9%       4305 ± 32%  interrupts.CPU43.PMI:Performance_monitoring_interrupts
      2392 ± 31%     +79.8%       4302 ± 33%  interrupts.CPU5.NMI:Non-maskable_interrupts
      2392 ± 31%     +79.8%       4302 ± 33%  interrupts.CPU5.PMI:Performance_monitoring_interrupts
    345.00 ±173%    +382.3%       1664 ± 53%  interrupts.CPU6.RES:Rescheduling_interrupts
      1.20 ± 33%   +4233.3%      52.00 ± 96%  interrupts.CPU77.RES:Rescheduling_interrupts
    471010 ±  6%     +10.0%     518230 ± 10%  interrupts.NMI:Non-maskable_interrupts
    471010 ±  6%     +10.0%     518230 ± 10%  interrupts.PMI:Performance_monitoring_interrupts


                            will-it-scale.per_process_ops

  [ASCII run-to-run trend plot, y-axis 0 to 350000: good and bad samples both sit
   near 330000, with several bisect-bad runs at 0 (the failed/panicked runs)]

                               will-it-scale.workload

  [ASCII run-to-run trend plot, y-axis 0 to 1.6e+07: same shape, bisect-bad samples
   near 1.4e+07 with several runs at 0]

 [*] bisect-good sample
 [O] bisect-bad sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen
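P.S. For readers new to the table format: the first numeric column is the mean on the base kernel (v5.1-rc2), the third is the mean on the patched kernel (71ee870ccb), and %change is the relative difference between the two. A minimal sketch of that arithmetic, using the headline numbers from the table:

```python
# Relative change between base and patched kernels, as reported in the
# %change column. Values are the will-it-scale.per_process_ops means above.
base = 330269     # mean ops on v5.1-rc2
patched = 320782  # mean ops on 71ee870ccb ("add readahead_shift into backing device")

change_pct = (patched - base) / base * 100
print(f"{change_pct:+.1f}%")  # prints -2.9%
```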