Greeting,

FYI, we noticed a 4.5% improvement of stress-ng.clock.ops_per_sec due to commit:


commit: df29d3cd5ad4d400767caa199ec7c0ecbab10fc8 ("clocksource: Limit number of CPUs checked for clock synchronization")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master


in testcase: stress-ng
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with following parameters:

        nr_threads: 100%
        disk: 1HDD
        testtime: 60s
        class: interrupt
        test: clock
        cpufreq_governor: performance
        ucode: 0x5003006


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml                 # job file is attached in this email
        bin/lkp split-job --compatible job.yaml  # generate the yaml file for lkp run
        bin/lkp run generated-yaml-file

=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
  interrupt/gcc-9/performance/1HDD/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp7/clock/stress-ng/60s/0x5003006

commit:
  b509a98006 ("clocksource: Check per-CPU clock synchronization when marked unstable")
  df29d3cd5a ("clocksource: Limit number of CPUs checked for clock synchronization")

b509a9800648b24a df29d3cd5ad4d400767caa199ec
---------------- ---------------------------
         %stddev    %change          %stddev
            \         |                 \
    114360            +4.5%     119554        stress-ng.clock.ops_per_sec
     27349 ± 89%     -90.9%       2477 ±  4%  stress-ng.time.file_system_inputs
  11510358            +4.6%   12034522        stress-ng.time.voluntary_context_switches
 1.807e+08            +2.4%  1.851e+08        interrupts.CAL:Function_call_interrupts
      0.00 ± 53%       -0.0       0.00 ± 46%  mpstat.cpu.all.iowait%
      2024 ±  7%     -32.1%       1375 ±  7%  slabinfo.dmaengine-unmap-16.active_objs
      2024 ±  7%     -32.1%       1375 ±  7%  slabinfo.dmaengine-unmap-16.num_objs
     43640 ± 13%     -23.3%      33491 ±  7%  meminfo.Active
     10027 ± 82%     -96.9%     306.00 ± 29%  meminfo.Active(file)
    752.67 ± 98%     -98.7%       9.67 ± 30%  meminfo.Buffers
      2506 ± 82%     -97.0%      76.33 ± 29%  proc-vmstat.nr_active_file
      2506 ± 82%     -97.0%      76.33 ± 29%  proc-vmstat.nr_zone_active_file
     14176 ± 82%     -90.7%       1323 ± 13%  proc-vmstat.pgpgin
    217.33 ± 82%     -90.9%      19.83 ± 14%  vmstat.io.bi
    753.17 ± 98%     -98.7%       9.67 ± 30%  vmstat.memory.buff
    356471            +4.0%     370646        vmstat.system.cs
      1.02 ± 30%    +128.4%       2.34 ± 28%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      3.74 ± 33%    +201.5%      11.28 ± 26%  perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      1.02 ± 30%    +128.8%       2.33 ± 28%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      3.74 ± 33%    +201.1%      11.25 ± 26%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
 6.048e+09            +2.2%   6.18e+09        perf-stat.i.branch-instructions
  69934017            +2.7%   71856066        perf-stat.i.branch-misses
     26.91             -0.6      26.30        perf-stat.i.cache-miss-rate%
  53135032            -2.4%   51884388        perf-stat.i.cache-misses
    366423            +4.5%     382812        perf-stat.i.context-switches
      9.45            -2.6%       9.20        perf-stat.i.cpi
      4686            +2.5%       4804        perf-stat.i.cycles-between-cache-misses
 6.913e+09            +2.9%   7.11e+09        perf-stat.i.dTLB-loads
 2.371e+09            +3.8%  2.461e+09        perf-stat.i.dTLB-stores
 2.612e+10            +2.7%  2.683e+10        perf-stat.i.instructions
    161.64            +2.7%     166.00        perf-stat.i.metric.M/sec
  22851223            +3.2%   23572316        perf-stat.i.node-load-misses
   9794183            +3.9%   10176496        perf-stat.i.node-store-misses
      7.45            -2.7%       7.24        perf-stat.overall.MPKI
     27.37             -0.6      26.74        perf-stat.overall.cache-miss-rate%
      9.74            -2.7%       9.48        perf-stat.overall.cpi
      1240 ±  3%      +5.9%       1313 ±  2%  perf-stat.overall.instructions-per-iTLB-miss
      0.10            +2.7%       0.11        perf-stat.overall.ipc
 5.959e+09            +2.2%  6.088e+09        perf-stat.ps.branch-instructions
  68700080            +2.8%   70597822        perf-stat.ps.branch-misses
  52442864            -2.4%   51206685        perf-stat.ps.cache-misses
    361424            +4.5%     377658        perf-stat.ps.context-switches
 6.812e+09            +2.9%  7.007e+09        perf-stat.ps.dTLB-loads
 2.337e+09            +3.8%  2.426e+09        perf-stat.ps.dTLB-stores
 2.573e+10            +2.7%  2.643e+10        perf-stat.ps.instructions
  22537158            +3.2%   23252556        perf-stat.ps.node-load-misses
   9660122            +3.9%   10039431        perf-stat.ps.node-store-misses
 1.628e+12            +2.8%  1.673e+12        perf-stat.total.instructions
     29.42             -0.9      28.53        perf-profile.calltrace.cycles-pp.release_posix_timer.__x64_sys_timer_delete.do_syscall_64.entry_SYSCALL_64_after_hwframe
     29.55             -0.9      28.68        perf-profile.calltrace.cycles-pp.__x64_sys_timer_delete.do_syscall_64.entry_SYSCALL_64_after_hwframe
     10.29             -0.5       9.83 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_unlock_irqrestore.release_posix_timer.__x64_sys_timer_delete.do_syscall_64.entry_SYSCALL_64_after_hwframe
     20.82             -0.4      20.38        perf-profile.calltrace.cycles-pp._raw_spin_lock.do_timer_create.__x64_sys_timer_create.do_syscall_64.entry_SYSCALL_64_after_hwframe
     18.82             -0.4      18.39        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.release_posix_timer.__x64_sys_timer_delete.do_syscall_64.entry_SYSCALL_64_after_hwframe
     21.26             -0.4      20.84        perf-profile.calltrace.cycles-pp.do_timer_create.__x64_sys_timer_create.do_syscall_64.entry_SYSCALL_64_after_hwframe
     21.29             -0.4      20.87        perf-profile.calltrace.cycles-pp.__x64_sys_timer_create.do_syscall_64.entry_SYSCALL_64_after_hwframe
     18.73             -0.4      18.31        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.release_posix_timer.__x64_sys_timer_delete.do_syscall_64
     20.59             -0.4      20.18        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.do_timer_create.__x64_sys_timer_create.do_syscall_64
      9.40             -0.4       9.02        perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function._raw_spin_unlock_irqrestore.release_posix_timer.__x64_sys_timer_delete
      9.56             -0.4       9.18        perf-profile.calltrace.cycles-pp.asm_sysvec_call_function._raw_spin_unlock_irqrestore.release_posix_timer.__x64_sys_timer_delete.do_syscall_64
      9.37             -0.4       8.98        perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function._raw_spin_unlock_irqrestore.release_posix_timer
     11.25             -0.3      10.94        perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function._raw_spin_unlock_irqrestore
      1.33 ±  2%       -0.1       1.24        perf-profile.calltrace.cycles-pp.ktime_get_real_ts64.posix_get_realtime_timespec.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.19             -0.1       1.11        perf-profile.calltrace.cycles-pp.ktime_get_ts64.posix_get_monotonic_timespec.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.21             -0.1       1.13        perf-profile.calltrace.cycles-pp.posix_get_monotonic_timespec.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.86             -0.0       0.82        perf-profile.calltrace.cycles-pp.ktime_get_with_offset.posix_get_boottime_timespec.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.88             -0.0       0.84        perf-profile.calltrace.cycles-pp.posix_get_boottime_timespec.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.71             -0.0       0.67 ±  2%  perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.ktime_get_with_offset.posix_get_tai_timespec.__x64_sys_clock_gettime.do_syscall_64
      0.69 ±  2%       -0.0       0.65        perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.ktime_get_with_offset
      0.69             -0.0       0.66 ±  2%  perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.ktime_get_with_offset.posix_get_tai_timespec.__x64_sys_clock_gettime
      0.63             +0.0       0.66        perf-profile.calltrace.cycles-pp.common_timer_get.do_timer_gettime.__x64_sys_timer_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.95             +0.0       0.99        perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.llist_add_batch.smp_call_function_many_cond
      0.94             +0.0       0.99        perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.llist_add_batch
      0.97             +0.0       1.02        perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.clock_was_set
      0.95             +0.0       1.00        perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask
      1.14             +0.1       1.19        perf-profile.calltrace.cycles-pp.default_send_IPI_mask_sequence_phys.smp_call_function_many_cond.on_each_cpu_cond_mask.clock_was_set.timekeeping_inject_offset
      1.19             +0.1       1.24        perf-profile.calltrace.cycles-pp.do_timer_gettime.__x64_sys_timer_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.23             +0.1       1.29        perf-profile.calltrace.cycles-pp.__x64_sys_timer_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.74             +0.1       2.81        perf-profile.calltrace.cycles-pp.ktime_get.clockevents_program_event.retrigger_next_event.flush_smp_call_function_queue.__sysvec_call_function
      0.62 ±  3%       +0.1       0.69 ±  4%  perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.do_sys_open
      0.85 ±  2%       +0.1       0.94 ±  3%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.85 ±  2%       +0.1       0.94 ±  3%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64
      0.90 ±  3%       +0.1       1.00 ±  3%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.do_adjtimex.__do_sys_clock_adjtime.do_syscall_64
      0.94 ±  2%       +0.1       1.03 ±  3%  perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.92 ±  2%       +0.1       1.02 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.do_adjtimex.__do_sys_clock_adjtime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.94 ±  2%       +0.1       1.03 ±  3%  perf-profile.calltrace.cycles-pp.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.49             +0.1       2.58        perf-profile.calltrace.cycles-pp.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.clock_was_set.timekeeping_inject_offset
     19.32             +0.2      19.54        perf-profile.calltrace.cycles-pp.timekeeping_inject_offset.do_adjtimex.__do_sys_clock_adjtime.do_syscall_64.entry_SYSCALL_64_after_hwframe
     21.84             +0.3      22.18        perf-profile.calltrace.cycles-pp.do_adjtimex.__do_sys_clock_adjtime.do_syscall_64.entry_SYSCALL_64_after_hwframe
     22.02             +0.3      22.36        perf-profile.calltrace.cycles-pp.__do_sys_clock_adjtime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.69 ±  3%       +0.4       3.14 ±  4%  perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.osq_lock
      2.71 ±  3%       +0.4       3.15 ±  4%  perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.osq_lock.__mutex_lock.i40e_ptp_gettimex
      2.69 ±  3%       +0.4       3.14 ±  4%  perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.osq_lock.__mutex_lock
      2.75 ±  3%       +0.5       3.20 ±  4%  perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.osq_lock.__mutex_lock.i40e_ptp_gettimex.pc_clock_gettime
      3.47             +0.5       3.92 ±  6%  perf-profile.calltrace.cycles-pp.clockevents_program_event.retrigger_next_event.flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function
      0.09 ±223%       +0.5       0.55 ±  5%  perf-profile.calltrace.cycles-pp.do_dentry_open.do_open.path_openat.do_filp_open.do_sys_openat2
     14.21             +0.7      14.88 ±  2%  perf-profile.calltrace.cycles-pp.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      5.68 ±  2%       +0.9       6.60 ±  3%  perf-profile.calltrace.cycles-pp.osq_lock.__mutex_lock.i40e_ptp_gettimex.pc_clock_gettime.__x64_sys_clock_gettime
      6.09 ±  2%       +1.0       7.05 ±  3%  perf-profile.calltrace.cycles-pp.__mutex_lock.i40e_ptp_gettimex.pc_clock_gettime.__x64_sys_clock_gettime.do_syscall_64
      6.52 ±  2%       +1.0       7.50 ±  3%  perf-profile.calltrace.cycles-pp.i40e_ptp_gettimex.pc_clock_gettime.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.59 ±  2%       +1.0       7.57 ±  3%  perf-profile.calltrace.cycles-pp.pc_clock_gettime.__x64_sys_clock_gettime.do_syscall_64.entry_SYSCALL_64_after_hwframe
     29.42             -0.9      28.53        perf-profile.children.cycles-pp.release_posix_timer
     29.55             -0.9      28.68        perf-profile.children.cycles-pp.__x64_sys_timer_delete
     41.94             -0.6      41.29        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     21.27             -0.4      20.84        perf-profile.children.cycles-pp.do_timer_create
     21.29             -0.4      20.87        perf-profile.children.cycles-pp.__x64_sys_timer_create
     21.39             -0.4      20.99        perf-profile.children.cycles-pp._raw_spin_lock
     15.00             -0.4      14.64        perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
      2.38             -0.1       2.27        perf-profile.children.cycles-pp.ktime_get_with_offset
      1.90 ±  2%       -0.1       1.79        perf-profile.children.cycles-pp.ktime_get_real_ts64
      1.21             -0.1       1.13        perf-profile.children.cycles-pp.posix_get_monotonic_timespec
      1.20             -0.1       1.12        perf-profile.children.cycles-pp.ktime_get_ts64
      0.88             -0.0       0.84        perf-profile.children.cycles-pp.posix_get_boottime_timespec
      0.33             -0.0       0.31        perf-profile.children.cycles-pp.poll_idle
      0.37             +0.0       0.39        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.28             +0.0       0.29        perf-profile.children.cycles-pp.hrtimer_update_next_event
      0.21 ±  2%       +0.0       0.23 ±  3%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.30             +0.0       0.33        perf-profile.children.cycles-pp._copy_to_user
      0.17 ±  3%       +0.0       0.19        perf-profile.children.cycles-pp.__lock_timer
      0.31 ±  2%       +0.0       0.33 ±  3%  perf-profile.children.cycles-pp.mutex_spin_on_owner
      0.63             +0.0       0.65        perf-profile.children.cycles-pp.__default_send_IPI_dest_field
      0.04 ± 45%       +0.0       0.07 ± 10%  perf-profile.children.cycles-pp.__legitimize_path
      0.63             +0.0       0.66        perf-profile.children.cycles-pp.common_timer_get
      1.12             +0.0       1.17        perf-profile.children.cycles-pp.llist_reverse_order
      1.36             +0.1       1.41        perf-profile.children.cycles-pp.lapic_next_deadline
      1.16             +0.1       1.21        perf-profile.children.cycles-pp.default_send_IPI_mask_sequence_phys
      0.49 ±  3%       +0.1       0.55 ±  5%  perf-profile.children.cycles-pp.do_dentry_open
      1.19             +0.1       1.24        perf-profile.children.cycles-pp.do_timer_gettime
      1.24             +0.1       1.29        perf-profile.children.cycles-pp.__x64_sys_timer_gettime
      0.62 ±  3%       +0.1       0.69 ±  4%  perf-profile.children.cycles-pp.do_open
      0.85 ±  2%       +0.1       0.94 ±  3%  perf-profile.children.cycles-pp.do_filp_open
      0.85 ±  2%       +0.1       0.94 ±  3%  perf-profile.children.cycles-pp.path_openat
      0.94 ±  2%       +0.1       1.04 ±  3%  perf-profile.children.cycles-pp.do_sys_open
      0.94 ±  2%       +0.1       1.03 ±  3%  perf-profile.children.cycles-pp.do_sys_openat2
      2.50             +0.1       2.61        perf-profile.children.cycles-pp.llist_add_batch
     19.33             +0.2      19.54        perf-profile.children.cycles-pp.timekeeping_inject_offset
      8.95             +0.2       9.17        perf-profile.children.cycles-pp.ktime_get
      7.83             +0.2       8.08        perf-profile.children.cycles-pp.clockevents_program_event
     21.85             +0.3      22.18        perf-profile.children.cycles-pp.do_adjtimex
     22.02             +0.4      22.37        perf-profile.children.cycles-pp.__do_sys_clock_adjtime
     14.21             +0.7      14.88 ±  2%  perf-profile.children.cycles-pp.__x64_sys_clock_gettime
      5.71 ±  2%       +0.9       6.64 ±  3%  perf-profile.children.cycles-pp.osq_lock
      6.09 ±  2%       +1.0       7.05 ±  3%  perf-profile.children.cycles-pp.__mutex_lock
      6.52 ±  2%       +1.0       7.50 ±  3%  perf-profile.children.cycles-pp.i40e_ptp_gettimex
      6.59 ±  2%       +1.0       7.57 ±  3%  perf-profile.children.cycles-pp.pc_clock_gettime
     33.43             -0.6      32.85        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      1.02 ±  2%       -0.1       0.96        perf-profile.self.cycles-pp.ktime_get_real_ts64
      0.65             -0.0       0.60        perf-profile.self.cycles-pp.ktime_get_ts64
      0.62             -0.0       0.60        perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
      0.63             +0.0       0.65        perf-profile.self.cycles-pp.__default_send_IPI_dest_field
      0.59             +0.0       0.62        perf-profile.self.cycles-pp.flush_smp_call_function_queue
      1.12             +0.0       1.17        perf-profile.self.cycles-pp.llist_reverse_order
      1.36             +0.0       1.41        perf-profile.self.cycles-pp.lapic_next_deadline
      1.49             +0.1       1.54        perf-profile.self.cycles-pp.llist_add_batch
      8.74             +0.2       8.95        perf-profile.self.cycles-pp.ktime_get
      2.84 ±  2%       +0.5       3.32 ±  4%  perf-profile.self.cycles-pp.osq_lock


                            stress-ng.clock.ops_per_sec

  [ASCII plot; point positions were lost in extraction. Y-axis: 106000 to 126000.
   The [O] samples sit at roughly 119000-124000, the other series at roughly 109000-114000.]

[*] bisect-good sample
[O] bisect-bad  sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang