Hello,

kernel test robot noticed a 3.0% improvement of stress-ng.handle.ops_per_sec on:


commit: a1a690e009744e4526526b2f838beec5ef9233cc ("[PATCH v7 3/3] shmem: stable directory offsets")
url: https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/libfs-Add-directory-operations-for-stable-offsets/20230701-014925
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/all/168814734331.530310.3911190551060453102.stgit@manet.1015granger.net/
patch subject: [PATCH v7 3/3] shmem: stable directory offsets

testcase: stress-ng
test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz (Skylake) with 32G memory
parameters:

	nr_threads: 10%
	disk: 1SSD
	testtime: 60s
	fs: ext4
	class: filesystem
	test: handle
	cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.

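As a sanity check on the headline figure, the relative change can be recomputed from the two stress-ng.handle.ops_per_sec averages in the comparison table. The following is only a minimal sketch: the helper name pct_change is illustrative and is not part of lkp-tests, and it assumes the %change column is the plain relative difference between the two commit averages.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change from the base commit to the patched commit, in percent."""
    return (patched - base) / base * 100.0


# Averages copied from the comparison table:
# c2c3172c26 (parent) vs. a1a690e009 (patched).
ops_per_sec = (186841, 192371)
ops_total = (11210546, 11542342)

print(f"stress-ng.handle.ops_per_sec: {pct_change(*ops_per_sec):+.1f}%")
print(f"stress-ng.handle.ops:         {pct_change(*ops_total):+.1f}%")
```

Both values round to +3.0%, which agrees with the figure in the subject of this report.
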
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  filesystem/gcc-12/performance/1SSD/ext4/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-skl-d08/handle/stress-ng/60s

commit:
  c2c3172c26 ("shmem: Refactor shmem_symlink()")
  a1a690e009 ("shmem: stable directory offsets")

c2c3172c269f808d a1a690e009744e4526526b2f838
---------------- ---------------------------
         %stddev    %change          %stddev
             \          |                \
      5.92                 -1.2%       5.85        iostat.cpu.system
     88.56 ± 21%     -29.4%      62.51 ± 39%  sched_debug.cfs_rq:/.util_est_enqueued.avg
      1294 ± 10%     -21.2%       1020 ± 13%  sched_debug.cpu.curr->pid.avg
  11210546                 +3.0%   11542342        stress-ng.handle.ops
    186841                 +3.0%     192371        stress-ng.handle.ops_per_sec
    100.40 ± 53%     -51.2%      49.00 ± 74%  proc-vmstat.nr_dirtied
    488851 ±  5%      -8.4%     447861        proc-vmstat.numa_hit
    480285 ±  5%      -6.8%     447764        proc-vmstat.numa_local
    750318 ±  6%      -8.4%     687348        proc-vmstat.pgalloc_normal
    710717 ±  7%      -9.0%     646894        proc-vmstat.pgfree
  1.43e+09                 -1.7%  1.406e+09        perf-stat.i.branch-instructions
      1.25                 -0.0        1.20        perf-stat.i.branch-miss-rate%
  19188631                 -5.1%   18207135        perf-stat.i.branch-misses
      0.15                 +0.0        0.15        perf-stat.i.dTLB-load-miss-rate%
   2781970                 +2.6%    2854604        perf-stat.i.dTLB-load-misses
 1.904e+09                 -1.5%  1.875e+09        perf-stat.i.dTLB-loads
 1.139e+09                 -1.7%   1.12e+09        perf-stat.i.dTLB-stores
     67.67 ± 10%      +7.1       74.72        perf-stat.i.iTLB-load-miss-rate%
   1568674 ± 46%     -32.7%    1055235 ± 14%  perf-stat.i.iTLB-loads
 7.352e+09                 -1.4%  7.247e+09        perf-stat.i.instructions
    124.64                 -1.9%     122.30        perf-stat.i.metric.M/sec
     41311 ±  5%     -11.0%      36758 ±  2%  perf-stat.i.node-stores
      1.34                 -0.0        1.29        perf-stat.overall.branch-miss-rate%
      0.15                 +0.0        0.15        perf-stat.overall.dTLB-load-miss-rate%
     68.34 ± 10%      +7.3       75.68 ±  2%  perf-stat.overall.iTLB-load-miss-rate%
 1.407e+09                 -1.7%  1.383e+09        perf-stat.ps.branch-instructions
  18876614                 -5.1%   17906237        perf-stat.ps.branch-misses
   2737798                 +2.6%    2809345        perf-stat.ps.dTLB-load-misses
 1.873e+09                 -1.5%  1.845e+09        perf-stat.ps.dTLB-loads
 1.121e+09                 -1.7%  1.102e+09        perf-stat.ps.dTLB-stores
   1543675 ± 46%     -32.8%    1037667 ± 14%  perf-stat.ps.iTLB-loads
 7.235e+09                 -1.4%   7.13e+09        perf-stat.ps.instructions
     40647 ±  5%     -11.1%      36139 ±  3%  perf-stat.ps.node-stores
 4.557e+11                 -1.2%  4.502e+11        perf-stat.total.instructions
      0.01                -20.0%       0.00        perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork
      0.08 ± 10%     -30.6%       0.06 ±  9%  perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      0.19 ±  5%     -26.8%       0.14 ±  3%  perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      0.08 ± 20%     -30.2%       0.06 ± 15%  perf-sched.wait_time.avg.ms.__cond_resched.__fput.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare
      0.08 ± 10%     -31.9%       0.06 ±  9%  perf-sched.wait_time.avg.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc.do_sys_name_to_handle.__x64_sys_name_to_handle_at
      0.08 ± 12%     -31.4%       0.06 ±  8%  perf-sched.wait_time.avg.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc.handle_to_path.do_handle_open
      0.08 ±  9%     -29.4%       0.06 ± 14%  perf-sched.wait_time.avg.ms.__cond_resched.apparmor_file_alloc_security.security_file_alloc.init_file.alloc_empty_file
      0.08 ±  9%    -100.0%       0.00        perf-sched.wait_time.avg.ms.__cond_resched.dentry_kill.dput.dcache_dir_close.__fput
      0.08 ± 10%     -30.5%       0.06 ±  7%  perf-sched.wait_time.avg.ms.__cond_resched.dput.__fput.task_work_run.exit_to_user_mode_loop
      0.08 ± 11%    -100.0%       0.00        perf-sched.wait_time.avg.ms.__cond_resched.dput.dcache_dir_close.__fput.task_work_run
      0.08 ± 10%     -30.3%       0.06 ±  8%  perf-sched.wait_time.avg.ms.__cond_resched.dput.path_put.__x64_sys_name_to_handle_at.do_syscall_64
      0.08 ±  9%     -34.9%       0.05 ±  8%  perf-sched.wait_time.avg.ms.__cond_resched.dput.path_put.do_handle_open.do_syscall_64
      0.08 ± 10%     -29.6%       0.05 ± 10%  perf-sched.wait_time.avg.ms.__cond_resched.dput.terminate_walk.path_openat.do_file_open_root
      0.08 ±  9%     -30.6%       0.06 ± 10%  perf-sched.wait_time.avg.ms.__cond_resched.dput.terminate_walk.path_openat.do_filp_open
      0.08 ± 10%     -29.1%       0.06 ±  9%  perf-sched.wait_time.avg.ms.__cond_resched.ilookup5.shmem_fh_to_dentry.exportfs_decode_fh_raw.exportfs_decode_fh
      0.08 ± 18%     -30.9%       0.05 ± 10%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.alloc_empty_file.path_openat.do_file_open_root
      0.07 ±  7%     -23.8%       0.06 ± 10%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.alloc_empty_file.path_openat.do_filp_open
      0.08 ± 10%     -32.2%       0.05 ±  9%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.getname_flags.part.0
      0.08 ± 11%     -33.4%       0.05 ± 12%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.getname_kernel.do_file_open_root.file_open_root
      0.08 ± 11%     -30.8%       0.06 ±  8%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
      0.08 ± 10%    -100.0%       0.00        perf-sched.wait_time.avg.ms.__cond_resched.slab_pre_alloc_hook.constprop.0.kmem_cache_alloc_lru
      0.08 ± 11%     -34.6%       0.05 ± 11%  perf-sched.wait_time.avg.ms.__cond_resched.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
      0.08 ± 10%     -30.6%       0.06 ±  9%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      0.08 ± 11%     -32.3%       0.05 ±  9%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
      0.13 ± 29%     -43.2%       0.07 ± 22%  perf-sched.wait_time.max.ms.__cond_resched.__fput.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare
      0.18 ± 11%     -22.3%       0.14 ± 12%  perf-sched.wait_time.max.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc.do_sys_name_to_handle.__x64_sys_name_to_handle_at
      0.17 ±  9%     -28.2%       0.12 ±  6%  perf-sched.wait_time.max.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc.handle_to_path.do_handle_open
      0.13 ± 14%    -100.0%       0.00        perf-sched.wait_time.max.ms.__cond_resched.dentry_kill.dput.dcache_dir_close.__fput
      0.15 ±  9%     -26.8%       0.11 ± 16%  perf-sched.wait_time.max.ms.__cond_resched.dput.__fput.task_work_run.exit_to_user_mode_loop
      0.11 ± 32%    -100.0%       0.00        perf-sched.wait_time.max.ms.__cond_resched.dput.dcache_dir_close.__fput.task_work_run
      0.17 ±  2%     -26.6%       0.12 ±  7%  perf-sched.wait_time.max.ms.__cond_resched.dput.path_put.__x64_sys_name_to_handle_at.do_syscall_64
      0.16 ± 10%     -42.7%       0.09 ± 18%  perf-sched.wait_time.max.ms.__cond_resched.dput.path_put.do_handle_open.do_syscall_64
      0.16 ± 10%     -27.0%       0.12 ±  6%  perf-sched.wait_time.max.ms.__cond_resched.dput.terminate_walk.path_openat.do_file_open_root
      0.16 ±  6%     -24.1%       0.12 ±  9%  perf-sched.wait_time.max.ms.__cond_resched.ilookup5.shmem_fh_to_dentry.exportfs_decode_fh_raw.exportfs_decode_fh
      0.13 ± 27%     -36.6%       0.08 ± 24%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.alloc_empty_file.path_openat.do_file_open_root
      0.13 ± 16%     -27.5%       0.10 ± 20%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.alloc_empty_file.path_openat.do_filp_open
      0.18 ±  9%     -23.2%       0.14 ± 13%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.getname_flags.part.0
      0.15 ± 11%     -35.8%       0.10 ± 15%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.getname_kernel.do_file_open_root.file_open_root
      0.16 ±  7%     -35.5%       0.10 ± 24%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
      0.17 ± 10%    -100.0%       0.00        perf-sched.wait_time.max.ms.__cond_resched.slab_pre_alloc_hook.constprop.0.kmem_cache_alloc_lru
      0.16 ±  8%     -36.2%       0.10 ± 21%  perf-sched.wait_time.max.ms.__cond_resched.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
      0.19 ±  5%     -26.8%       0.14 ±  3%  perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      0.18 ±  4%     -31.4%       0.12 ±  8%  perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
      1.42 ±  6%      -0.8        0.59 ±  3%  perf-profile.calltrace.cycles-pp.do_dentry_open.do_open.path_openat.do_filp_open.do_sys_openat2
      1.89 ±  4%      -0.8        1.07 ±  5%  perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
      4.24 ±  3%      -0.8        3.43 ±  2%  perf-profile.calltrace.cycles-pp.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      4.19 ±  3%      -0.8        3.38 ±  2%  perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      3.46 ±  3%      -0.8        2.68 ±  3%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.36 ±  3%      -0.8        2.59 ±  3%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64
      6.64 ±  3%      -0.8        5.88 ±  2%  perf-profile.calltrace.cycles-pp.open64
      5.39 ±  3%      -0.8        4.63 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.open64
      4.95 ±  3%      -0.8        4.19 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      5.48 ±  4%      -0.7        4.80 ±  3%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      6.32 ±  3%      -0.6        5.68 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__close
      8.83 ±  3%      -0.5        8.29        perf-profile.calltrace.cycles-pp.__close
      4.33 ±  4%      -0.5        3.82 ±  5%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      2.36 ±  5%      -0.5        1.86 ±  8%  perf-profile.calltrace.cycles-pp.__fput.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
      2.93 ±  5%      -0.5        2.45 ±  6%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.76 ±  5%      -0.5        2.28 ±  7%  perf-profile.calltrace.cycles-pp.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
      2.98 ±  5%      -0.5        2.50 ±  6%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      1.17 ± 13%      -0.3        0.85 ± 17%  perf-profile.calltrace.cycles-pp.rcu_core.__do_softirq.run_ksoftirqd.smpboot_thread_fn.kthread
      1.00 ±  3%      -0.1        0.85 ±  9%  perf-profile.calltrace.cycles-pp.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      0.77 ±  4%      -0.1        0.66 ± 13%  perf-profile.calltrace.cycles-pp.filp_close.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      1.17 ±  6%      +0.1        1.29 ±  5%  perf-profile.calltrace.cycles-pp.do_dentry_open.do_open.path_openat.do_file_open_root.file_open_root
      3.33 ±  5%      +0.2        3.56 ±  2%  perf-profile.calltrace.cycles-pp.do_file_open_root.file_open_root.do_handle_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.38 ±  5%      +0.2        3.61 ±  2%  perf-profile.calltrace.cycles-pp.file_open_root.do_handle_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.open_by_handle_at
      4.26 ±  3%      -0.8        3.44 ±  2%  perf-profile.children.cycles-pp.__x64_sys_openat
      4.21 ±  3%      -0.8        3.40 ±  2%  perf-profile.children.cycles-pp.do_sys_openat2
      3.47 ±  3%      -0.8        2.69 ±  3%  perf-profile.children.cycles-pp.do_filp_open
      6.69 ±  3%      -0.8        5.93 ±  2%  perf-profile.children.cycles-pp.open64
      2.59 ±  5%      -0.7        1.89 ±  3%  perf-profile.children.cycles-pp.do_dentry_open
      3.35 ±  4%      -0.7        2.66 ±  3%  perf-profile.children.cycles-pp.do_open
      6.40 ±  4%      -0.6        5.80 ±  2%  perf-profile.children.cycles-pp.path_openat
      8.94 ±  3%      -0.5        8.41 ±  2%  perf-profile.children.cycles-pp.__close
      2.38 ±  5%      -0.5        1.88 ±  8%  perf-profile.children.cycles-pp.__fput
      2.95 ±  5%      -0.5        2.46 ±  6%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
      2.77 ±  5%      -0.5        2.30 ±  7%  perf-profile.children.cycles-pp.task_work_run
      3.20 ±  5%      -0.4        2.75 ±  5%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      2.89 ±  7%      -0.4        2.44 ±  4%  perf-profile.children.cycles-pp.dput
      0.36 ±  8%      -0.2        0.14 ± 16%  perf-profile.children.cycles-pp.lockref_put_return
      1.00 ±  3%      -0.1        0.86 ± 10%  perf-profile.children.cycles-pp.__x64_sys_close
      0.77 ±  4%      -0.1        0.66 ± 13%  perf-profile.children.cycles-pp.filp_close
      0.35 ± 15%      -0.1        0.24 ± 16%  perf-profile.children.cycles-pp.__slab_free
      0.44 ± 13%      -0.1        0.33 ±  8%  perf-profile.children.cycles-pp.shmem_encode_fh
      0.49 ±  6%      -0.1        0.39 ±  3%  perf-profile.children.cycles-pp.memcg_slab_post_alloc_hook
      0.70 ±  7%      -0.1        0.60 ±  9%  perf-profile.children.cycles-pp.lockref_get
      0.39 ±  8%      -0.1        0.30 ± 24%  perf-profile.children.cycles-pp.locks_remove_posix
      0.34 ±  7%      -0.1        0.26 ±  5%  perf-profile.children.cycles-pp.__call_rcu_common
      0.13 ±  8%      -0.1        0.07 ±  9%  perf-profile.children.cycles-pp.___slab_alloc
      0.25 ± 12%      -0.0        0.20 ± 11%  perf-profile.children.cycles-pp.get_obj_cgroup_from_current
      0.37 ± 10%      -0.0        0.33 ±  9%  perf-profile.children.cycles-pp.rep_movs_alternative
      0.08 ± 16%      -0.0        0.05 ± 54%  perf-profile.children.cycles-pp.obj_cgroup_charge
      0.16 ± 12%      -0.0        0.13 ±  4%  perf-profile.children.cycles-pp.close_fd
      0.13 ±  4%      -0.0        0.10 ± 14%  perf-profile.children.cycles-pp.is_vmalloc_addr
      3.34 ±  5%      +0.2        3.57 ±  2%  perf-profile.children.cycles-pp.do_file_open_root
      0.36 ±  8%      -0.2        0.13 ± 19%  perf-profile.self.cycles-pp.lockref_put_return
      0.34 ± 15%      -0.1        0.23 ± 16%  perf-profile.self.cycles-pp.__slab_free
      0.44 ± 13%      -0.1        0.33 ± 10%  perf-profile.self.cycles-pp.shmem_encode_fh
      0.69 ±  7%      -0.1        0.60 ±  9%  perf-profile.self.cycles-pp.lockref_get
      0.39 ±  7%      -0.1        0.30 ± 25%  perf-profile.self.cycles-pp.locks_remove_posix
      0.37 ±  7%      -0.1        0.29 ±  5%  perf-profile.self.cycles-pp.memcg_slab_post_alloc_hook
      0.20 ± 10%      -0.1        0.15 ±  9%  perf-profile.self.cycles-pp.__call_rcu_common
      0.13 ± 21%      -0.0        0.09 ± 28%  perf-profile.self.cycles-pp.__legitimize_path
      0.13 ±  6%      -0.0        0.10 ± 14%  perf-profile.self.cycles-pp.is_vmalloc_addr
      0.10 ±  7%      -0.0        0.08 ±  7%  perf-profile.self.cycles-pp.do_handle_open
      0.15 ± 11%      +0.0        0.18 ±  5%  perf-profile.self.cycles-pp.init_file


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki