Greeting,

FYI, we noticed a 2.4% improvement of aim7.jobs-per-min due to commit:


commit: 1fea323ff00526dcc04fbb4ee6e7d04e4e2ab0e1 ("xfs: reduce debug overhead of dir leaf/node checks")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master


in testcase: aim7
on test machine: 88 threads Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
with following parameters:

	disk: 4BRD_12G
	md: RAID1
	fs: xfs
	test: disk_rw
	load: 3000
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: AIM7 is a traditional UNIX system level benchmark suite which is used to test and measure the performance of multiuser system.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/

In addition to that, the commit also has significant impact on the following tests:

+------------------+------------------------------------------------------------------------+
| testcase: change | aim7: aim7.jobs-per-min 1.6% improvement                               |
| test machine     | 144 threads Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory |
| test parameters  | cpufreq_governor=performance                                           |
|                  | disk=1BRD_48G                                                          |
|                  | fs=xfs                                                                 |
|                  | load=3000                                                              |
|                  | test=disk_rw                                                           |
|                  | ucode=0x700001e                                                        |
+------------------+------------------------------------------------------------------------+


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml
        bin/lkp run                    compatible-job.yaml

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/4BRD_12G/xfs/x86_64-rhel-8.3/3000/RAID1/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/disk_rw/aim7/0x5003006

commit:
  39d3c0b596 ("xfs: No need for inode number error injection in __xfs_dir3_data_check")
  1fea323ff0 ("xfs: reduce debug overhead of dir leaf/node checks")

39d3c0b5968b5421 1fea323ff00526dcc04fbb4ee6e
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :6           33%           2:6     kmsg.XFS(md#):xlog_verify_grant_tail:space>BBTOB(tail_blocks)
         %stddev    %change          %stddev
             \          |                \
    505405            +2.4%     517621        aim7.jobs-per-min
     35.82            -2.4%      34.98        aim7.time.elapsed_time
     35.82            -2.4%      34.98        aim7.time.elapsed_time.max
      2866 ± 35%     +39.0%       3985 ±  3%  interrupts.CPU53.NMI:Non-maskable_interrupts
      2866 ± 35%     +39.0%       3985 ±  3%  interrupts.CPU53.PMI:Performance_monitoring_interrupts
    286711            -2.5%     279423        proc-vmstat.nr_dirty
    554636            -1.3%     547330        proc-vmstat.nr_file_pages
    286865            -2.5%     279593        proc-vmstat.nr_inactive_file
    286865            -2.5%     279593        proc-vmstat.nr_zone_inactive_file
    287057            -2.6%     279704        proc-vmstat.nr_zone_write_pending
 1.313e+10            +2.0%   1.34e+10        perf-stat.i.branch-instructions
     52558            +2.7%      53962        perf-stat.i.context-switches
      1942            +7.4%       2086 ±  2%  perf-stat.i.cpu-migrations
   1.9e+10            +2.0%  1.939e+10        perf-stat.i.dTLB-loads
 1.061e+10            +2.5%  1.087e+10        perf-stat.i.dTLB-stores
 6.606e+10            +2.1%  6.743e+10        perf-stat.i.instructions
    487.84            +2.1%     498.03        perf-stat.i.metric.M/sec
   3171946            +6.1%    3364545        perf-stat.i.node-store-misses
  10014711            +2.8%   10299278        perf-stat.i.node-stores
     24.04             +0.6      24.62        perf-stat.overall.node-store-miss-rate%
 1.286e+10            +2.0%  1.311e+10        perf-stat.ps.branch-instructions
     51473            +2.6%      52806        perf-stat.ps.context-switches
      1903            +7.2%       2040 ±  2%  perf-stat.ps.cpu-migrations
 1.861e+10            +2.0%  1.898e+10        perf-stat.ps.dTLB-loads
 1.039e+10            +2.4%  1.064e+10        perf-stat.ps.dTLB-stores
 6.469e+10            +2.0%  6.598e+10        perf-stat.ps.instructions
   3106311            +6.0%    3293166        perf-stat.ps.node-store-misses
   9812026            +2.8%   10082006        perf-stat.ps.node-stores
      2.29 ±  7%       -0.2       2.07 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      2.28 ±  7%       -0.2       2.06 ±  2%  perf-profile.calltrace.cycles-pp.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      2.31 ±  7%       -0.2       2.10 ±  2%  perf-profile.calltrace.cycles-pp.unlink
      2.29 ±  7%       -0.2       2.08 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.unlink
      1.66 ±  8%       -0.2       1.48 ±  4%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      2.00 ±  6%       -0.2       1.83        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.creat64
      2.00 ±  6%       -0.2       1.83        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      1.98 ±  6%       -0.2       1.80 ±  2%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.00 ±  6%       -0.2       1.83        perf-profile.calltrace.cycles-pp.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      2.00 ±  6%       -0.2       1.83        perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      1.97 ±  6%       -0.2       1.80 ±  2%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64
      2.01 ±  6%       -0.2       1.84        perf-profile.calltrace.cycles-pp.creat64
      0.90 ± 11%       -0.1       0.79 ±  3%  perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.92 ±  9%       -0.1       0.81 ±  2%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2.do_sys_open
      0.73 ± 11%       -0.1       0.63 ±  3%  perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2
      0.69 ±  6%       -0.1       0.61 ±  6%  perf-profile.calltrace.cycles-pp.osq_lock.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.58 ±  8%       -0.3       2.29 ±  3%  perf-profile.children.cycles-pp.rwsem_down_write_slowpath
      2.29 ±  7%       -0.2       2.06 ±  2%  perf-profile.children.cycles-pp.do_unlinkat
      2.32 ±  7%       -0.2       2.10 ±  2%  perf-profile.children.cycles-pp.unlink
      1.62 ± 11%       -0.2       1.42 ±  3%  perf-profile.children.cycles-pp.rwsem_spin_on_owner
      2.08 ±  6%       -0.2       1.90        perf-profile.children.cycles-pp.do_sys_open
      2.04 ±  6%       -0.2       1.86 ±  2%  perf-profile.children.cycles-pp.do_filp_open
      2.07 ±  6%       -0.2       1.90        perf-profile.children.cycles-pp.do_sys_openat2
      2.02 ±  6%       -0.2       1.85 ±  2%  perf-profile.children.cycles-pp.creat64
      2.03 ±  6%       -0.2       1.86        perf-profile.children.cycles-pp.path_openat
      0.82 ±  6%       -0.1       0.72 ±  5%  perf-profile.children.cycles-pp.osq_lock
      0.18 ± 84%       -0.1       0.09 ±  9%  perf-profile.children.cycles-pp.xfs_vn_lookup
      0.50 ±  2%       -0.1       0.44 ±  2%  perf-profile.children.cycles-pp.__fsnotify_parent
      0.14 ±  6%       -0.0       0.10 ±  7%  perf-profile.children.cycles-pp.write@plt
      0.12 ± 11%       -0.0       0.09        perf-profile.children.cycles-pp.xfs_dir2_leafn_lookup_for_entry
      0.11 ± 18%       -0.0       0.08 ±  8%  perf-profile.children.cycles-pp.xfs_dir_lookup
      0.22 ±  7%       -0.0       0.20 ±  2%  perf-profile.children.cycles-pp.update_process_times
      0.09 ±  8%       -0.0       0.07 ± 14%  perf-profile.children.cycles-pp.xfs_dir2_node_lookup
      0.25 ± 44%       +0.1       0.35 ±  5%  perf-profile.children.cycles-pp.xfs_file_llseek
      1.61 ± 11%       -0.2       1.41 ±  3%  perf-profile.self.cycles-pp.rwsem_spin_on_owner
      0.81 ±  6%       -0.1       0.72 ±  5%  perf-profile.self.cycles-pp.osq_lock
      0.48 ±  2%       -0.1       0.41 ±  3%  perf-profile.self.cycles-pp.__fsnotify_parent
      0.10 ±  6%       -0.1       0.04 ± 44%  perf-profile.self.cycles-pp.write@plt
      0.24 ± 44%       +0.1       0.34 ±  5%  perf-profile.self.cycles-pp.xfs_file_llseek
      0.77 ± 13%       +0.2       0.94 ±  4%  perf-profile.self.cycles-pp.xfs_file_buffered_write



                                   aim7.jobs-per-min

  [ASCII plot not recoverable; y-axis spans 480000 to 540000]

[*] bisect-good sample
[O] bisect-bad sample

***************************************************************************************************
lkp-cpl-4sp1: 144 threads Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/1BRD_48G/xfs/x86_64-rhel-8.3/3000/debian-10.4-x86_64-20200603.cgz/lkp-cpl-4sp1/disk_rw/aim7/0x700001e

commit:
  39d3c0b596 ("xfs: No need for inode number error injection in __xfs_dir3_data_check")
  1fea323ff0 ("xfs: reduce debug overhead of dir leaf/node checks")

39d3c0b5968b5421 1fea323ff00526dcc04fbb4ee6e
---------------- ---------------------------
         %stddev    %change          %stddev
             \          |                \
    500977            +1.6%     509113        aim7.jobs-per-min
     36.14            -1.6%      35.57        aim7.time.elapsed_time
     36.14            -1.6%      35.57        aim7.time.elapsed_time.max
     40.93 ±  2%      -4.3%      39.19        aim7.time.user_time
     28267 ± 79%     -81.7%       5164 ±  5%  numa-meminfo.node2.KernelStack
     28180 ± 78%     -81.7%       5162 ±  5%  numa-vmstat.node2.nr_kernel_stack
    291109            -1.6%     286393        proc-vmstat.nr_dirty
     11049 ±  5%      +9.0%      12039 ±  4%  slabinfo.pde_opener.active_objs
     11049 ±  5%      +9.0%      12039 ±  4%  slabinfo.pde_opener.num_objs
      1579 ± 33%     +29.9%       2051 ± 25%  interrupts.CPU109.NMI:Non-maskable_interrupts
      1579 ± 33%     +29.9%       2051 ± 25%  interrupts.CPU109.PMI:Performance_monitoring_interrupts
      1785 ± 30%     +45.8%       2602 ±  8%  interrupts.CPU117.NMI:Non-maskable_interrupts
      1785 ± 30%     +45.8%       2602 ±  8%  interrupts.CPU117.PMI:Performance_monitoring_interrupts
    891.67 ±  8%     +99.4%       1778 ± 47%  interrupts.CPU4.CAL:Function_call_interrupts
 1.301e+10            +1.6%  1.322e+10        perf-stat.i.branch-instructions
     52509            +2.1%      53602        perf-stat.i.context-switches
  1.89e+10            +1.8%  1.924e+10        perf-stat.i.dTLB-loads
 1.061e+10            +1.9%  1.081e+10        perf-stat.i.dTLB-stores
 6.554e+10            +1.6%   6.66e+10        perf-stat.i.instructions
    296.86            +1.8%     302.18        perf-stat.i.metric.M/sec
     76.63             +1.0      77.63        perf-stat.i.node-load-miss-rate%
   3774653 ±  2%      +6.6%    4025641 ±  3%  perf-stat.i.node-loads
   4414091 ±  3%      +7.6%    4747750        perf-stat.i.node-store-misses
   9344160            +2.3%    9559103        perf-stat.i.node-stores
     32.07 ±  2%       +1.1      33.18        perf-stat.overall.node-store-miss-rate%
 1.271e+10            +1.8%  1.293e+10        perf-stat.ps.branch-instructions
     51278            +2.3%      52440        perf-stat.ps.context-switches
 1.846e+10            +2.0%  1.883e+10        perf-stat.ps.dTLB-loads
 1.036e+10            +2.1%  1.057e+10        perf-stat.ps.dTLB-stores
   6.4e+10            +1.8%  6.516e+10        perf-stat.ps.instructions
   3686827 ±  2%      +6.9%    3940762 ±  3%  perf-stat.ps.node-loads
   4310625 ±  3%      +7.8%    4645614        perf-stat.ps.node-store-misses
   9127740            +2.5%    9355048        perf-stat.ps.node-stores
      2.54             -0.2       2.30 ±  4%  perf-profile.calltrace.cycles-pp.creat64
      2.53             -0.2       2.29 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.creat64
      2.52             -0.2       2.28 ±  4%  perf-profile.calltrace.cycles-pp.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      2.50             -0.2       2.26 ±  4%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.50             -0.2       2.26 ±  4%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64
      2.52             -0.2       2.28 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      2.52             -0.2       2.28 ±  4%  perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
      2.82 ±  2%       -0.2       2.62 ±  3%  perf-profile.calltrace.cycles-pp.unlink
      2.79 ±  2%       -0.2       2.59 ±  3%  perf-profile.calltrace.cycles-pp.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      2.80 ±  2%       -0.2       2.61 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.unlink
      2.79 ±  2%       -0.2       2.60 ±  3%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      2.10             -0.1       1.95 ±  4%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
      1.21 ±  3%       -0.1       1.10 ±  4%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2.do_sys_open
      0.96 ±  5%       -0.1       0.87 ±  4%  perf-profile.calltrace.cycles-pp.xfs_generic_create.path_openat.do_filp_open.do_sys_openat2.do_sys_open
      1.07 ±  3%       -0.1       0.99 ±  3%  perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.92 ±  2%       -0.1       0.85 ±  4%  perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2
      3.31             -0.3       3.06 ±  4%  perf-profile.children.cycles-pp.rwsem_down_write_slowpath
      2.56             -0.2       2.31 ±  4%  perf-profile.children.cycles-pp.do_filp_open
      2.55 ±  2%       -0.2       2.30 ±  4%  perf-profile.children.cycles-pp.creat64
      2.56             -0.2       2.31 ±  4%  perf-profile.children.cycles-pp.path_openat
      2.60             -0.2       2.36 ±  4%  perf-profile.children.cycles-pp.do_sys_open
      2.60             -0.2       2.36 ±  4%  perf-profile.children.cycles-pp.do_sys_openat2
      2.83 ±  2%       -0.2       2.63 ±  3%  perf-profile.children.cycles-pp.unlink
      2.79 ±  2%       -0.2       2.60 ±  3%  perf-profile.children.cycles-pp.do_unlinkat
      1.99 ±  2%       -0.1       1.84 ±  3%  perf-profile.children.cycles-pp.rwsem_spin_on_owner
      0.96 ±  5%       -0.1       0.87 ±  4%  perf-profile.children.cycles-pp.xfs_generic_create
      0.45 ±  5%       -0.1       0.37 ±  7%  perf-profile.children.cycles-pp.__fsnotify_parent
      0.17 ±  5%       -0.0       0.13 ±  9%  perf-profile.children.cycles-pp.write@plt
      0.12 ±  4%       -0.0       0.09 ±  5%  perf-profile.children.cycles-pp.xfs_dir2_leafn_lookup_for_entry
      0.09 ±  7%       -0.0       0.07 ±  7%  perf-profile.children.cycles-pp.generic_file_llseek_size
      0.09 ±  7%       -0.0       0.07 ± 10%  perf-profile.children.cycles-pp.xfs_dir2_node_lookup
      0.08 ± 11%       -0.0       0.06 ±  7%  perf-profile.children.cycles-pp.wake_up_q
      1.97 ±  2%       -0.2       1.82 ±  3%  perf-profile.self.cycles-pp.rwsem_spin_on_owner
      0.43 ±  5%       -0.1       0.34 ±  7%  perf-profile.self.cycles-pp.__fsnotify_parent
      1.17 ±  3%       -0.1       1.10 ±  4%  perf-profile.self.cycles-pp.write
      0.10 ±  7%       -0.1       0.05 ± 45%  perf-profile.self.cycles-pp.write@plt
      0.09 ±  7%       -0.0       0.07 ±  7%  perf-profile.self.cycles-pp.generic_file_llseek_size
      0.19 ±  3%       -0.0       0.17 ±  4%  perf-profile.self.cycles-pp.xfs_get_extsz_hint
      0.21 ±  6%       +0.0       0.24 ±  6%  perf-profile.self.cycles-pp.propagate_protected_usage



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang
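
P.S. The %change figures reported for aim7.jobs-per-min are consistent with a plain relative difference between the parent commit and the patched commit. A minimal sketch in Python that rechecks the two headline numbers; the helper name percent_change is hypothetical (it is not part of lkp-tests), and the inputs are the aim7.jobs-per-min values copied from the two comparison tables in this report:

```python
def percent_change(base: float, patched: float) -> float:
    """Relative change of `patched` against `base`, in percent.

    Assumption: %change is (patched - base) / base * 100. This matches the
    reported aim7.jobs-per-min figures, but the helper is illustrative only.
    """
    return (patched - base) / base * 100.0


# aim7.jobs-per-min for 39d3c0b596 (base) and 1fea323ff0 (patched),
# keyed by the tbox_group name of each test machine.
results = {
    "lkp-csl-2sp9": (505405, 517621),
    "lkp-cpl-4sp1": (500977, 509113),
}

for tbox, (base, patched) in results.items():
    print(f"{tbox}: {percent_change(base, patched):+.1f}%")
```

Rows whose displayed values are already rounded (for example aim7.time.elapsed_time) may differ in the last digit when recomputed this way, since the report presumably derives %change from unrounded means.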