* [linux-next:master] [fsnotify] a5e57b4d37: stress-ng.full.ops_per_sec -17.3% regression
@ 2024-04-11 1:42 kernel test robot
2024-04-11 9:23 ` Amir Goldstein
From: kernel test robot @ 2024-04-11 1:42 UTC
To: Amir Goldstein
Cc: oe-lkp, lkp, Linux Memory Management List, Jan Kara,
linux-fsdevel, ying.huang, feng.tang, fengwei.yin, oliver.sang
hi, Amir,
for "[amir73il:fsnotify-sbconn] [fsnotify] 629f30e073: unixbench.throughput 5.8% improvement"
(https://lore.kernel.org/all/202403141505.807a722b-oliver.sang@intel.com/)
you requested us to test unixbench for this commit on different branches and we
observed consistent performance improvement.
Now that this commit has been merged into linux-next/master, we still observe a
similar unixbench improvement; however, we also captured a stress-ng regression.
Details below, FYI.
Hello,
kernel test robot noticed a -17.3% regression of stress-ng.full.ops_per_sec on:
commit: a5e57b4d370c6d320e5bfb0c919fe00aee29e039 ("fsnotify: optimize the case of no permission event watchers")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: full
cpufreq_governor: performance
In addition to that, the commit also has significant impact on the following tests:
+------------------+-------------------------------------------------------------------------------------------------+
| testcase: change | unixbench: unixbench.throughput 6.4% improvement                                                |
| test machine     | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory |
| test parameters  | cpufreq_governor=performance                                                                    |
|                  | nr_task=1                                                                                       |
|                  | runtime=300s                                                                                    |
|                  | test=fsbuffer-r                                                                                 |
+------------------+-------------------------------------------------------------------------------------------------+
| testcase: change | unixbench: unixbench.throughput 5.8% improvement                                                |
| test machine     | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory |
| test parameters  | cpufreq_governor=performance                                                                    |
|                  | nr_task=1                                                                                       |
|                  | runtime=300s                                                                                    |
|                  | test=fstime-r                                                                                   |
+------------------+-------------------------------------------------------------------------------------------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202404101624.85684be8-oliver.sang@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240410/202404101624.85684be8-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/full/stress-ng/60s
commit:
477cf917dd ("fsnotify: use an enum for group priority constants")
a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
---------------- ---------------------------
%stddev %change %stddev
\ | \
20489 ± 7% -19.2% 16565 ± 13% perf-c2c.HITM.remote
409.48 ± 9% -14.0% 352.13 ± 5% sched_debug.cfs_rq:/.util_est.avg
217.94 ± 8% +12.9% 246.07 ± 4% sched_debug.cfs_rq:/.util_est.stddev
1.461e+08 ± 3% -17.3% 1.208e+08 ± 5% stress-ng.full.ops
2434462 ± 3% -17.3% 2013444 ± 5% stress-ng.full.ops_per_sec
71.04 ± 3% -16.6% 59.28 ± 6% stress-ng.time.user_time
9.95e+09 ± 4% -13.4% 8.617e+09 ± 3% perf-stat.i.branch-instructions
0.48 ± 3% +0.1 0.55 ± 2% perf-stat.i.branch-miss-rate%
4.36 ± 4% +17.1% 5.10 ± 3% perf-stat.i.cpi
5.162e+10 ± 4% -14.5% 4.416e+10 ± 3% perf-stat.i.instructions
0.24 ± 3% -13.8% 0.21 ± 3% perf-stat.i.ipc
0.46 ± 3% +0.1 0.54 ± 2% perf-stat.overall.branch-miss-rate%
4.38 ± 4% +16.9% 5.12 ± 3% perf-stat.overall.cpi
0.23 ± 4% -14.5% 0.20 ± 3% perf-stat.overall.ipc
9.781e+09 ± 4% -13.4% 8.471e+09 ± 3% perf-stat.ps.branch-instructions
5.075e+10 ± 4% -14.5% 4.341e+10 ± 3% perf-stat.ps.instructions
3.111e+12 ± 4% -14.5% 2.66e+12 ± 3% perf-stat.total.instructions
8.39 ± 7% -2.8 5.56 ± 4% perf-profile.calltrace.cycles-pp.__mmap
8.09 ± 7% -2.8 5.31 ± 4% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__mmap
8.05 ± 7% -2.8 5.28 ± 4% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
7.95 ± 7% -2.8 5.19 ± 4% perf-profile.calltrace.cycles-pp.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
6.80 ± 8% -2.7 4.14 ± 4% perf-profile.calltrace.cycles-pp.security_file_open.do_dentry_open.do_open.path_openat.do_filp_open
7.46 ± 8% -2.7 4.80 ± 4% perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
6.78 ± 8% -2.7 4.13 ± 4% perf-profile.calltrace.cycles-pp.apparmor_file_open.security_file_open.do_dentry_open.do_open.path_openat
4.12 ± 14% -2.0 2.09 ± 10% perf-profile.calltrace.cycles-pp.security_mmap_file.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.54 ± 14% -1.7 1.81 ± 10% perf-profile.calltrace.cycles-pp.apparmor_mmap_file.security_mmap_file.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
3.46 ± 8% -1.5 1.99 ± 6% perf-profile.calltrace.cycles-pp.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
3.15 ± 8% -1.4 1.71 ± 7% perf-profile.calltrace.cycles-pp.init_file.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2
3.06 ± 9% -1.4 1.63 ± 7% perf-profile.calltrace.cycles-pp.security_file_alloc.init_file.alloc_empty_file.path_openat.do_filp_open
2.95 ± 9% -1.4 1.54 ± 8% perf-profile.calltrace.cycles-pp.apparmor_file_alloc_security.security_file_alloc.init_file.alloc_empty_file.path_openat
5.50 ± 7% -1.1 4.39 ± 5% perf-profile.calltrace.cycles-pp.fstatat64
5.34 ± 7% -1.1 4.26 ± 6% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fstatat64
5.32 ± 7% -1.1 4.24 ± 6% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
5.27 ± 8% -1.1 4.20 ± 6% perf-profile.calltrace.cycles-pp.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
4.95 ± 8% -1.0 3.91 ± 7% perf-profile.calltrace.cycles-pp.vfs_fstat.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
4.78 ± 8% -1.0 3.77 ± 7% perf-profile.calltrace.cycles-pp.security_inode_getattr.vfs_fstat.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.75 ± 9% -1.0 3.74 ± 7% perf-profile.calltrace.cycles-pp.common_perm_cond.security_inode_getattr.vfs_fstat.__do_sys_newfstatat.do_syscall_64
1.74 ± 12% -0.9 0.83 ± 11% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.__x64_sys_pread64
1.75 ± 12% -0.9 0.84 ± 11% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64
2.08 ± 13% -0.9 1.17 ± 9% perf-profile.calltrace.cycles-pp.write
1.78 ± 13% -0.9 0.88 ± 13% perf-profile.calltrace.cycles-pp.security_file_post_open.do_open.path_openat.do_filp_open.do_sys_openat2
1.77 ± 13% -0.9 0.87 ± 13% perf-profile.calltrace.cycles-pp.ima_file_check.security_file_post_open.do_open.path_openat.do_filp_open
1.68 ± 15% -0.9 0.80 ± 13% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
1.68 ± 15% -0.9 0.80 ± 13% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
1.68 ± 14% -0.9 0.80 ± 14% perf-profile.calltrace.cycles-pp.apparmor_current_getsecid_subj.security_current_getsecid_subj.ima_file_check.security_file_post_open.do_open
1.68 ± 14% -0.9 0.81 ± 14% perf-profile.calltrace.cycles-pp.security_current_getsecid_subj.ima_file_check.security_file_post_open.do_open.path_openat
1.90 ± 14% -0.9 1.02 ± 10% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
1.88 ± 14% -0.9 1.00 ± 11% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
1.82 ± 15% -0.9 0.96 ± 11% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
1.77 ± 15% -0.8 0.92 ± 11% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
1.74 ± 15% -0.8 0.90 ± 12% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.72 ± 15% -0.8 0.87 ± 12% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_write.ksys_write
1.73 ± 15% -0.8 0.89 ± 12% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_write.ksys_write.do_syscall_64
1.32 ± 5% -0.5 0.80 ± 5% perf-profile.calltrace.cycles-pp.security_file_free.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.31 ± 5% -0.5 0.80 ± 5% perf-profile.calltrace.cycles-pp.apparmor_file_free_security.security_file_free.__fput.__x64_sys_close.do_syscall_64
2.72 ± 2% -0.5 2.24 ± 6% perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.68 ± 9% -0.4 0.26 ±100% perf-profile.calltrace.cycles-pp.kobject_put.cdev_put.__fput.__x64_sys_close.do_syscall_64
2.48 ± 2% -0.4 2.07 ± 5% perf-profile.calltrace.cycles-pp.get_unmapped_area.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
2.39 ± 2% -0.4 1.99 ± 6% perf-profile.calltrace.cycles-pp.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff
2.22 ± 2% -0.4 1.84 ± 5% perf-profile.calltrace.cycles-pp.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap.vm_mmap_pgoff
1.54 ± 2% -0.3 1.27 ± 6% perf-profile.calltrace.cycles-pp.mas_empty_area_rev.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap
0.91 ± 8% -0.2 0.66 ± 6% perf-profile.calltrace.cycles-pp.cdev_put.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.17 ± 3% -0.2 0.96 ± 6% perf-profile.calltrace.cycles-pp.mas_rev_awalk.mas_empty_area_rev.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area
0.64 ± 2% -0.1 0.57 ± 4% perf-profile.calltrace.cycles-pp.ioctl
2.80 ± 7% +1.7 4.48 ± 6% perf-profile.calltrace.cycles-pp.__libc_pread
2.65 ± 7% +1.7 4.35 ± 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pread
2.63 ± 7% +1.7 4.33 ± 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
2.58 ± 7% +1.7 4.29 ± 7% perf-profile.calltrace.cycles-pp.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
2.79 ± 8% +1.7 4.50 ± 7% perf-profile.calltrace.cycles-pp.read
2.53 ± 8% +1.7 4.25 ± 7% perf-profile.calltrace.cycles-pp.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
2.64 ± 9% +1.7 4.37 ± 8% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
2.62 ± 9% +1.7 4.35 ± 8% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
2.57 ± 9% +1.7 4.31 ± 8% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
2.52 ± 10% +1.7 4.27 ± 8% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
1.77 ± 12% +1.9 3.64 ± 8% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.71 ± 15% +1.9 3.64 ± 9% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +2.8 2.79 ± 5% perf-profile.calltrace.cycles-pp.fsnotify_open_perm.do_dentry_open.do_open.path_openat.do_filp_open
8.50 ± 7% -2.8 5.66 ± 4% perf-profile.children.cycles-pp.__mmap
7.96 ± 7% -2.8 5.20 ± 4% perf-profile.children.cycles-pp.ksys_mmap_pgoff
6.81 ± 8% -2.7 4.14 ± 4% perf-profile.children.cycles-pp.security_file_open
6.79 ± 8% -2.7 4.14 ± 4% perf-profile.children.cycles-pp.apparmor_file_open
7.48 ± 7% -2.7 4.83 ± 4% perf-profile.children.cycles-pp.vm_mmap_pgoff
5.14 ± 14% -2.6 2.51 ± 12% perf-profile.children.cycles-pp.apparmor_file_permission
5.18 ± 14% -2.6 2.54 ± 11% perf-profile.children.cycles-pp.security_file_permission
4.13 ± 14% -2.0 2.10 ± 10% perf-profile.children.cycles-pp.security_mmap_file
3.55 ± 14% -1.7 1.81 ± 10% perf-profile.children.cycles-pp.apparmor_mmap_file
3.47 ± 8% -1.5 2.00 ± 6% perf-profile.children.cycles-pp.alloc_empty_file
3.15 ± 8% -1.4 1.72 ± 7% perf-profile.children.cycles-pp.init_file
3.06 ± 9% -1.4 1.64 ± 7% perf-profile.children.cycles-pp.security_file_alloc
2.95 ± 9% -1.4 1.55 ± 8% perf-profile.children.cycles-pp.apparmor_file_alloc_security
2.18 ± 16% -1.2 1.02 ± 14% perf-profile.children.cycles-pp.security_current_getsecid_subj
2.16 ± 16% -1.2 1.00 ± 14% perf-profile.children.cycles-pp.apparmor_current_getsecid_subj
5.55 ± 7% -1.1 4.44 ± 5% perf-profile.children.cycles-pp.fstatat64
5.27 ± 8% -1.1 4.20 ± 6% perf-profile.children.cycles-pp.__do_sys_newfstatat
4.96 ± 8% -1.0 3.92 ± 7% perf-profile.children.cycles-pp.vfs_fstat
4.78 ± 8% -1.0 3.77 ± 7% perf-profile.children.cycles-pp.security_inode_getattr
4.75 ± 9% -1.0 3.74 ± 7% perf-profile.children.cycles-pp.common_perm_cond
2.16 ± 12% -0.9 1.25 ± 8% perf-profile.children.cycles-pp.write
1.78 ± 13% -0.9 0.88 ± 13% perf-profile.children.cycles-pp.security_file_post_open
1.77 ± 13% -0.9 0.87 ± 13% perf-profile.children.cycles-pp.ima_file_check
1.86 ± 14% -0.9 1.00 ± 10% perf-profile.children.cycles-pp.ksys_write
1.81 ± 15% -0.8 0.96 ± 10% perf-profile.children.cycles-pp.vfs_write
1.32 ± 5% -0.5 0.80 ± 5% perf-profile.children.cycles-pp.security_file_free
1.31 ± 5% -0.5 0.80 ± 5% perf-profile.children.cycles-pp.apparmor_file_free_security
2.73 ± 2% -0.5 2.25 ± 6% perf-profile.children.cycles-pp.do_mmap
2.50 ± 2% -0.4 2.08 ± 6% perf-profile.children.cycles-pp.get_unmapped_area
2.41 ± 2% -0.4 2.01 ± 6% perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown
2.24 ± 2% -0.4 1.86 ± 5% perf-profile.children.cycles-pp.vm_unmapped_area
0.52 ± 23% -0.3 0.23 ± 14% perf-profile.children.cycles-pp.ima_file_mmap
1.58 ± 2% -0.3 1.31 ± 6% perf-profile.children.cycles-pp.mas_empty_area_rev
0.91 ± 7% -0.2 0.67 ± 6% perf-profile.children.cycles-pp.cdev_put
0.44 ± 3% -0.2 0.22 ± 6% perf-profile.children.cycles-pp.__fsnotify_parent
1.21 ± 3% -0.2 0.99 ± 6% perf-profile.children.cycles-pp.mas_rev_awalk
0.69 ± 9% -0.2 0.50 ± 6% perf-profile.children.cycles-pp.kobject_put
1.13 ± 3% -0.2 0.96 ± 4% perf-profile.children.cycles-pp.read_iter_zero
1.09 ± 3% -0.2 0.93 ± 4% perf-profile.children.cycles-pp.iov_iter_zero
0.96 ± 2% -0.1 0.82 ± 4% perf-profile.children.cycles-pp.rep_stos_alternative
0.76 ± 3% -0.1 0.64 ± 4% perf-profile.children.cycles-pp.entry_SYSCALL_64
0.21 ± 24% -0.1 0.11 ± 12% perf-profile.children.cycles-pp.aa_file_perm
0.31 ± 7% -0.1 0.20 ± 8% perf-profile.children.cycles-pp.down_write_killable
0.75 ± 2% -0.1 0.66 ± 4% perf-profile.children.cycles-pp.ioctl
0.59 ± 2% -0.1 0.50 ± 4% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
0.31 ± 9% -0.1 0.23 ± 8% perf-profile.children.cycles-pp.fget
0.52 ± 3% -0.1 0.44 ± 5% perf-profile.children.cycles-pp.stress_full
0.34 -0.1 0.27 ± 5% perf-profile.children.cycles-pp.llseek
0.30 ± 3% -0.1 0.24 ± 8% perf-profile.children.cycles-pp.kmem_cache_free
0.34 ± 2% -0.0 0.29 ± 6% perf-profile.children.cycles-pp.mas_prev_slot
0.29 ± 2% -0.0 0.24 ± 5% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.16 ± 5% -0.0 0.11 ± 8% perf-profile.children.cycles-pp.__legitimize_mnt
0.16 ± 6% -0.0 0.12 ± 13% perf-profile.children.cycles-pp.__memcg_slab_free_hook
0.07 ± 5% -0.0 0.03 ± 81% perf-profile.children.cycles-pp.ksys_lseek
0.25 ± 3% -0.0 0.22 ± 6% perf-profile.children.cycles-pp.mas_ascend
0.18 -0.0 0.15 ± 5% perf-profile.children.cycles-pp.mas_data_end
0.19 ± 2% -0.0 0.16 ± 5% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.11 ± 7% -0.0 0.08 ± 8% perf-profile.children.cycles-pp.open_last_lookups
0.07 ± 4% -0.0 0.04 ± 50% perf-profile.children.cycles-pp.mas_prev
0.11 ± 4% -0.0 0.08 ± 9% perf-profile.children.cycles-pp.__fdget_pos
0.07 ± 4% -0.0 0.04 ± 51% perf-profile.children.cycles-pp.process_measurement
0.06 -0.0 0.04 ± 65% perf-profile.children.cycles-pp.vfs_getattr_nosec
0.06 -0.0 0.04 ± 33% perf-profile.children.cycles-pp.amd_clear_divider
0.08 ± 5% -0.0 0.06 ± 7% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.07 ± 10% +0.0 0.10 ± 10% perf-profile.children.cycles-pp.walk_component
0.35 +0.0 0.40 ± 6% perf-profile.children.cycles-pp.link_path_walk
97.57 +0.4 97.94 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
97.40 +0.4 97.80 perf-profile.children.cycles-pp.do_syscall_64
2.85 ± 7% +1.7 4.53 ± 6% perf-profile.children.cycles-pp.__libc_pread
2.85 ± 8% +1.7 4.54 ± 7% perf-profile.children.cycles-pp.read
2.59 ± 7% +1.7 4.30 ± 7% perf-profile.children.cycles-pp.__x64_sys_pread64
2.58 ± 9% +1.7 4.31 ± 8% perf-profile.children.cycles-pp.ksys_read
0.00 +2.8 2.80 ± 5% perf-profile.children.cycles-pp.fsnotify_open_perm
5.23 ± 14% +3.0 8.19 ± 8% perf-profile.children.cycles-pp.rw_verify_area
5.06 ± 8% +3.5 8.53 ± 7% perf-profile.children.cycles-pp.vfs_read
6.77 ± 8% -2.6 4.12 ± 4% perf-profile.self.cycles-pp.apparmor_file_open
5.01 ± 14% -2.6 2.44 ± 12% perf-profile.self.cycles-pp.apparmor_file_permission
3.45 ± 13% -1.7 1.77 ± 10% perf-profile.self.cycles-pp.apparmor_mmap_file
2.93 ± 9% -1.4 1.54 ± 8% perf-profile.self.cycles-pp.apparmor_file_alloc_security
2.14 ± 16% -1.2 0.99 ± 14% perf-profile.self.cycles-pp.apparmor_current_getsecid_subj
4.74 ± 9% -1.0 3.73 ± 7% perf-profile.self.cycles-pp.common_perm_cond
1.31 ± 5% -0.5 0.79 ± 5% perf-profile.self.cycles-pp.apparmor_file_free_security
0.43 ± 3% -0.2 0.21 ± 5% perf-profile.self.cycles-pp.__fsnotify_parent
1.07 ± 3% -0.2 0.88 ± 6% perf-profile.self.cycles-pp.mas_rev_awalk
0.68 ± 9% -0.2 0.50 ± 6% perf-profile.self.cycles-pp.kobject_put
0.95 ± 2% -0.1 0.81 ± 4% perf-profile.self.cycles-pp.rep_stos_alternative
0.20 ± 25% -0.1 0.10 ± 14% perf-profile.self.cycles-pp.aa_file_perm
0.28 ± 8% -0.1 0.18 ± 8% perf-profile.self.cycles-pp.down_write_killable
0.57 ± 3% -0.1 0.48 ± 4% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.31 ± 8% -0.1 0.22 ± 9% perf-profile.self.cycles-pp.fget
0.50 ± 3% -0.1 0.43 ± 5% perf-profile.self.cycles-pp.stress_full
0.22 ± 6% -0.1 0.16 ± 6% perf-profile.self.cycles-pp.cdev_put
0.15 ± 5% -0.0 0.11 ± 6% perf-profile.self.cycles-pp.__legitimize_mnt
0.24 ± 4% -0.0 0.20 ± 6% perf-profile.self.cycles-pp.mas_empty_area_rev
0.28 ± 3% -0.0 0.24 ± 4% perf-profile.self.cycles-pp.do_syscall_64
0.24 ± 3% -0.0 0.20 ± 6% perf-profile.self.cycles-pp.mas_ascend
0.18 ± 3% -0.0 0.14 ± 6% perf-profile.self.cycles-pp.do_mmap
0.14 ± 5% -0.0 0.11 ± 12% perf-profile.self.cycles-pp.chrdev_open
0.19 ± 2% -0.0 0.15 ± 5% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.20 ± 3% -0.0 0.17 ± 5% perf-profile.self.cycles-pp.entry_SYSCALL_64
0.20 ± 4% -0.0 0.17 ± 3% perf-profile.self.cycles-pp.vfs_read
0.18 ± 2% -0.0 0.15 ± 3% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.16 ± 2% -0.0 0.13 ± 4% perf-profile.self.cycles-pp.mas_data_end
0.07 ± 4% -0.0 0.04 ± 50% perf-profile.self.cycles-pp.process_measurement
0.16 ± 3% -0.0 0.13 ± 5% perf-profile.self.cycles-pp.vm_unmapped_area
0.12 ± 4% -0.0 0.09 ± 6% perf-profile.self.cycles-pp.mas_prev_slot
0.14 ± 2% -0.0 0.12 ± 5% perf-profile.self.cycles-pp.kmem_cache_free
0.10 ± 5% -0.0 0.07 ± 6% perf-profile.self.cycles-pp.open64
0.15 ± 2% -0.0 0.13 ± 5% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.15 ± 2% -0.0 0.13 ± 4% perf-profile.self.cycles-pp.ioctl
0.09 ± 5% -0.0 0.07 ± 8% perf-profile.self.cycles-pp.write
0.07 ± 6% -0.0 0.06 perf-profile.self.cycles-pp.__close
0.11 ± 4% +0.0 0.13 ± 4% perf-profile.self.cycles-pp.link_path_walk
0.01 ±200% +0.0 0.06 ± 9% perf-profile.self.cycles-pp.__virt_addr_valid
0.75 ± 2% +0.1 0.89 ± 3% perf-profile.self.cycles-pp._raw_spin_lock
0.00 +2.8 2.79 ± 5% perf-profile.self.cycles-pp.fsnotify_open_perm
0.05 +5.6 5.63 ± 10% perf-profile.self.cycles-pp.rw_verify_area
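As a sanity check on how to read the comparison table above, the sketch below recomputes a few of its figures. It assumes that the %change column is a plain relative difference between the two means for throughput-style metrics, and an absolute difference in percentage points for perf-profile rows. That reading matches the printed numbers for the rows used here, but it is not taken from the lkp-tests source, and the helper names are illustrative only.

```python
# Recompute two kinds of "change" columns from the comparison table.
# Input values are copied from the report; helper names are hypothetical.


def relative_change_pct(base: float, patched: float) -> float:
    """Relative change in percent, as assumed for throughput-style metrics."""
    return (patched - base) / base * 100.0


def point_change(base_pct: float, patched_pct: float) -> float:
    """Absolute difference in percentage points, as assumed for perf-profile rows."""
    return patched_pct - base_pct


# stress-ng.full.ops_per_sec: 2434462 -> 2013444
ops_change = relative_change_pct(2434462, 2013444)

# perf-profile.self.cycles-pp.rw_verify_area: 0.05 -> 5.63
rw_verify_area_delta = point_change(0.05, 5.63)

# perf-profile.self.cycles-pp.fsnotify_open_perm: 0.00 -> 2.79
fsnotify_open_perm_delta = point_change(0.00, 2.79)

print(f"stress-ng.full.ops_per_sec: {ops_change:+.1f}%")
print(f"rw_verify_area self cycles: {rw_verify_area_delta:+.1f} points")
print(f"fsnotify_open_perm self cycles: {fsnotify_open_perm_delta:+.1f} points")
```

Read this way, the two functions that gain self cycles in the profile, rw_verify_area and fsnotify_open_perm, are the ones on the read/write and open paths.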
***************************************************************************************************
lkp-csl-d02: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/lkp-csl-d02/fsbuffer-r/unixbench
commit:
477cf917dd ("fsnotify: use an enum for group priority constants")
a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
---------------- ---------------------------
%stddev %change %stddev
\ | \
1339661 +6.4% 1425877 unixbench.throughput
5.765e+08 +6.4% 6.131e+08 unixbench.workload
1.159e+09 +2.2% 1.184e+09 perf-stat.i.branch-instructions
1.49 +0.0 1.54 perf-stat.i.branch-miss-rate%
10449249 ± 2% +6.7% 11149426 perf-stat.i.branch-misses
4514 -5.3% 4273 perf-stat.overall.path-length
1.156e+09 +2.2% 1.181e+09 perf-stat.ps.branch-instructions
10430168 ± 2% +6.7% 11128869 perf-stat.ps.branch-misses
7.02 ± 2% -3.3 3.70 ± 3% perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.45 ± 3% +0.2 1.62 ± 3% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.read
1.24 ± 3% +0.2 1.44 ± 3% perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.filemap_read.vfs_read
2.55 ± 8% +0.4 2.91 ± 4% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
3.04 ± 6% +0.4 3.44 ± 3% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
1.94 ± 9% +0.5 2.42 ± 3% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
8.62 ± 3% +0.5 9.14 perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.vfs_read.ksys_read.do_syscall_64
7.90 ± 2% +0.6 8.51 perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read.ksys_read
9.29 ± 2% +0.8 10.04 perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.vfs_read.ksys_read.do_syscall_64
4.43 ± 7% +0.8 5.28 ± 2% perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read
29.04 ± 3% +1.8 30.80 perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.06 ± 2% -3.3 3.73 ± 3% perf-profile.children.cycles-pp.__fsnotify_parent
0.77 ± 6% +0.1 0.88 ± 7% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
1.26 ± 2% +0.2 1.45 ± 3% perf-profile.children.cycles-pp.current_time
1.66 ± 3% +0.2 1.90 ± 3% perf-profile.children.cycles-pp.syscall_return_via_sysret
3.72 ± 2% +0.3 4.03 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
2.56 ± 7% +0.4 2.91 ± 4% perf-profile.children.cycles-pp.apparmor_file_permission
5.72 ± 2% +0.4 6.08 perf-profile.children.cycles-pp.entry_SYSCALL_64
4.40 ± 4% +0.4 4.81 ± 2% perf-profile.children.cycles-pp.rep_movs_alternative
3.10 ± 6% +0.4 3.52 ± 3% perf-profile.children.cycles-pp.security_file_permission
1.94 ± 9% +0.5 2.42 ± 3% perf-profile.children.cycles-pp.__fdget_pos
8.68 ± 3% +0.5 9.20 perf-profile.children.cycles-pp.filemap_get_pages
8.37 ± 2% +0.7 9.05 perf-profile.children.cycles-pp._copy_to_iter
9.52 ± 2% +0.8 10.28 perf-profile.children.cycles-pp.copy_page_to_iter
29.25 ± 3% +1.7 30.99 perf-profile.children.cycles-pp.filemap_read
6.94 -3.2 3.72 ± 3% perf-profile.self.cycles-pp.__fsnotify_parent
0.77 ± 6% +0.1 0.88 ± 7% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.83 ± 5% +0.1 0.97 ± 7% perf-profile.self.cycles-pp.current_time
1.66 ± 3% +0.2 1.90 ± 3% perf-profile.self.cycles-pp.syscall_return_via_sysret
3.52 ± 2% +0.2 3.76 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
2.42 ± 3% +0.3 2.67 ± 3% perf-profile.self.cycles-pp.entry_SYSCALL_64
1.92 ± 6% +0.3 2.20 ± 5% perf-profile.self.cycles-pp.apparmor_file_permission
3.92 ± 4% +0.3 4.25 ± 2% perf-profile.self.cycles-pp.rep_movs_alternative
4.38 +0.3 4.72 ± 2% perf-profile.self.cycles-pp._copy_to_iter
1.16 ± 8% +0.3 1.51 ± 2% perf-profile.self.cycles-pp.ksys_read
1.85 ± 10% +0.5 2.36 ± 2% perf-profile.self.cycles-pp.__fdget_pos
***************************************************************************************************
lkp-csl-d02: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/lkp-csl-d02/fstime-r/unixbench
commit:
477cf917dd ("fsnotify: use an enum for group priority constants")
a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
---------------- ---------------------------
%stddev %change %stddev
\ | \
4709035 +5.8% 4980152 unixbench.throughput
2.026e+09 +5.7% 2.141e+09 unixbench.workload
1.034e+09 +1.4% 1.048e+09 perf-stat.i.branch-instructions
1.56 +0.0 1.59 perf-stat.i.branch-miss-rate%
60950726 +5.3% 64193405 perf-stat.i.cache-references
0.02 ± 30% -36.7% 0.01 ± 39% perf-stat.i.major-faults
0.78 -0.0 0.75 perf-stat.overall.cache-miss-rate%
1145 -5.4% 1083 perf-stat.overall.path-length
1.031e+09 +1.4% 1.046e+09 perf-stat.ps.branch-instructions
60812120 +5.3% 64047513 perf-stat.ps.cache-references
0.02 ± 30% -36.7% 0.01 ± 39% perf-stat.ps.major-faults
6.22 ± 3% -2.9 3.30 ± 3% perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
49.43 -1.5 47.90 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
52.39 -1.0 51.34 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
55.16 -0.9 54.29 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
56.49 -0.7 55.80 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
2.40 ± 4% +0.2 2.64 ± 5% perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_read.vfs_read.ksys_read
2.59 ± 4% +0.3 2.86 ± 5% perf-profile.calltrace.cycles-pp.touch_atime.filemap_read.vfs_read.ksys_read.do_syscall_64
6.88 +0.3 7.23 ± 2% perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.vfs_read.ksys_read
2.26 ± 3% +0.4 2.64 ± 10% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
7.90 ± 3% +0.4 8.29 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read
2.68 ± 2% +0.4 3.13 ± 8% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
8.47 +0.4 8.91 perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.vfs_read.ksys_read.do_syscall_64
32.80 +1.8 34.63 perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.27 ± 3% -2.9 3.34 ± 3% perf-profile.children.cycles-pp.__fsnotify_parent
49.50 -1.4 48.07 perf-profile.children.cycles-pp.vfs_read
52.46 -1.0 51.45 perf-profile.children.cycles-pp.ksys_read
1.16 ± 4% +0.1 1.28 ± 4% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
2.46 ± 4% +0.2 2.69 ± 6% perf-profile.children.cycles-pp.atime_needs_update
5.03 ± 3% +0.3 5.30 perf-profile.children.cycles-pp.entry_SYSCALL_64
2.66 ± 4% +0.3 2.94 ± 6% perf-profile.children.cycles-pp.touch_atime
3.27 ± 2% +0.3 3.59 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
6.96 +0.4 7.31 ± 2% perf-profile.children.cycles-pp.filemap_get_read_batch
2.27 ± 3% +0.4 2.64 ± 10% perf-profile.children.cycles-pp.apparmor_file_permission
2.76 ± 2% +0.4 3.20 ± 7% perf-profile.children.cycles-pp.security_file_permission
8.52 +0.5 8.98 perf-profile.children.cycles-pp.filemap_get_pages
32.99 +1.8 34.80 perf-profile.children.cycles-pp.filemap_read
6.16 ± 3% -2.8 3.32 ± 3% perf-profile.self.cycles-pp.__fsnotify_parent
1.19 ± 3% -0.4 0.81 ± 6% perf-profile.self.cycles-pp.rw_verify_area
1.55 ± 3% +0.1 1.64 ± 2% perf-profile.self.cycles-pp.filemap_get_pages
0.70 ± 3% +0.1 0.81 ± 7% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
1.31 ± 4% +0.1 1.43 ± 4% perf-profile.self.cycles-pp.do_syscall_64
2.15 ± 4% +0.1 2.28 perf-profile.self.cycles-pp.entry_SYSCALL_64
4.00 ± 2% +0.2 4.22 perf-profile.self.cycles-pp.read
1.06 ± 4% +0.3 1.31 ± 5% perf-profile.self.cycles-pp.ksys_read
3.09 ± 2% +0.3 3.36 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
3.89 ± 2% +0.3 4.19 ± 3% perf-profile.self.cycles-pp._copy_to_iter
1.66 ± 2% +0.3 2.01 ± 13% perf-profile.self.cycles-pp.apparmor_file_permission
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [linux-next:master] [fsnotify] a5e57b4d37: stress-ng.full.ops_per_sec -17.3% regression
2024-04-11 1:42 [linux-next:master] [fsnotify] a5e57b4d37: stress-ng.full.ops_per_sec -17.3% regression kernel test robot
@ 2024-04-11 9:23 ` Amir Goldstein
2024-04-11 11:54 ` Jan Kara
From: Amir Goldstein @ 2024-04-11 9:23 UTC
To: kernel test robot, Jan Kara
Cc: oe-lkp, lkp, Linux Memory Management List, linux-fsdevel,
ying.huang, feng.tang, fengwei.yin
On Thu, Apr 11, 2024 at 4:42 AM kernel test robot <oliver.sang@intel.com> wrote:
>
>
> hi, Amir,
>
> for "[amir73il:fsnotify-sbconn] [fsnotify] 629f30e073: unixbench.throughput 5.8% improvement"
> (https://lore.kernel.org/all/202403141505.807a722b-oliver.sang@intel.com/)
> you requested us to test unixbench for this commit on different branches and we
> observed consistent performance improvement.
>
> Now that this commit has been merged into linux-next/master, we still observe a
> similar unixbench improvement; however, we also captured a stress-ng regression.
> Details below, FYI.
>
>
>
> Hello,
>
> kernel test robot noticed a -17.3% regression of stress-ng.full.ops_per_sec on:
>
>
> commit: a5e57b4d370c6d320e5bfb0c919fe00aee29e039 ("fsnotify: optimize the case of no permission event watchers")
Odd. This commit does add an extra fsnotify_sb_has_priority_watchers()
inline check for reads and writes, but the inline helper
fsnotify_sb_has_watchers() already exists in fsnotify_parent() and it
already accesses fsnotify_sb_info.
It seems like stress-ng.full does read/write/mmap operations on /dev/full,
so the fsnotify_sb_info object would be that of devtmpfs.
I think that the permission events on special files are not very relevant,
but I am not sure.
Jan, any ideas?
Thanks,
Amir.
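For readers who want to see the access pattern discussed above, here is a minimal userspace sketch of open/read/write on /dev/full. It is not stress-ng's code; the function name and the 4096-byte size are assumptions made for illustration, and the mmap part of the stressor is left out. Going by the call traces in the report, each open passes through do_dentry_open (where fsnotify_open_perm shows up) and each read or write passes through rw_verify_area.

```python
import errno
import os


def poke_dev_full(path: str = "/dev/full", size: int = 4096) -> dict:
    """Open a character device, read from it once, and attempt one write.

    Returns the number of bytes read, whether they were all zero, and the
    errno of the write attempt (0 if the write unexpectedly succeeded).
    """
    fd = os.open(path, os.O_RDWR)
    try:
        data = os.read(fd, size)       # /dev/full reads back as zero bytes
        try:
            os.write(fd, b"x" * size)  # expected to fail: the device is "full"
            write_errno = 0
        except OSError as exc:
            write_errno = exc.errno
    finally:
        os.close(fd)
    return {
        "read_len": len(data),
        "all_zero": data == bytes(len(data)),
        "write_errno": write_errno,
    }


if __name__ == "__main__":
    # Only meaningful on a Linux system that exposes /dev/full.
    if os.path.exists("/dev/full"):
        result = poke_dev_full()
        print(result, errno.errorcode.get(result["write_errno"], "OK"))
```

Because the device does almost no work per call, fixed per-syscall costs such as the permission hooks make up a large share of each operation, which is presumably why a small change in those hooks is so visible in this benchmark.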
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> testcase: stress-ng
> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> parameters:
>
> nr_threads: 100%
> testtime: 60s
> test: full
> cpufreq_governor: performance
>
>
> In addition to that, the commit also has significant impact on the following tests:
>
> +------------------+-------------------------------------------------------------------------------------------------+
> | testcase: change | unixbench: unixbench.throughput 6.4% improvement                                                |
> | test machine     | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory |
> | test parameters  | cpufreq_governor=performance                                                                    |
> |                  | nr_task=1                                                                                       |
> |                  | runtime=300s                                                                                    |
> |                  | test=fsbuffer-r                                                                                 |
> +------------------+-------------------------------------------------------------------------------------------------+
> | testcase: change | unixbench: unixbench.throughput 5.8% improvement                                                |
> | test machine     | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory |
> | test parameters  | cpufreq_governor=performance                                                                    |
> |                  | nr_task=1                                                                                       |
> |                  | runtime=300s                                                                                    |
> |                  | test=fstime-r                                                                                   |
> +------------------+-------------------------------------------------------------------------------------------------+
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202404101624.85684be8-oliver.sang@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240410/202404101624.85684be8-oliver.sang@intel.com
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/full/stress-ng/60s
>
> commit:
> 477cf917dd ("fsnotify: use an enum for group priority constants")
> a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
>
> 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 20489 ± 7% -19.2% 16565 ± 13% perf-c2c.HITM.remote
> 409.48 ± 9% -14.0% 352.13 ± 5% sched_debug.cfs_rq:/.util_est.avg
> 217.94 ± 8% +12.9% 246.07 ± 4% sched_debug.cfs_rq:/.util_est.stddev
> 1.461e+08 ± 3% -17.3% 1.208e+08 ± 5% stress-ng.full.ops
> 2434462 ± 3% -17.3% 2013444 ± 5% stress-ng.full.ops_per_sec
> 71.04 ± 3% -16.6% 59.28 ± 6% stress-ng.time.user_time
> 9.95e+09 ± 4% -13.4% 8.617e+09 ± 3% perf-stat.i.branch-instructions
> 0.48 ± 3% +0.1 0.55 ± 2% perf-stat.i.branch-miss-rate%
> 4.36 ± 4% +17.1% 5.10 ± 3% perf-stat.i.cpi
> 5.162e+10 ± 4% -14.5% 4.416e+10 ± 3% perf-stat.i.instructions
> 0.24 ± 3% -13.8% 0.21 ± 3% perf-stat.i.ipc
> 0.46 ± 3% +0.1 0.54 ± 2% perf-stat.overall.branch-miss-rate%
> 4.38 ± 4% +16.9% 5.12 ± 3% perf-stat.overall.cpi
> 0.23 ± 4% -14.5% 0.20 ± 3% perf-stat.overall.ipc
> 9.781e+09 ± 4% -13.4% 8.471e+09 ± 3% perf-stat.ps.branch-instructions
> 5.075e+10 ± 4% -14.5% 4.341e+10 ± 3% perf-stat.ps.instructions
> 3.111e+12 ± 4% -14.5% 2.66e+12 ± 3% perf-stat.total.instructions
> 8.39 ą 7% -2.8 5.56 ą 4% perf-profile.calltrace.cycles-pp.__mmap
> 8.09 ą 7% -2.8 5.31 ą 4% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__mmap
> 8.05 ą 7% -2.8 5.28 ą 4% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
> 7.95 ą 7% -2.8 5.19 ą 4% perf-profile.calltrace.cycles-pp.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
> 6.80 ą 8% -2.7 4.14 ą 4% perf-profile.calltrace.cycles-pp.security_file_open.do_dentry_open.do_open.path_openat.do_filp_open
> 7.46 ą 8% -2.7 4.80 ą 4% perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
> 6.78 ą 8% -2.7 4.13 ą 4% perf-profile.calltrace.cycles-pp.apparmor_file_open.security_file_open.do_dentry_open.do_open.path_openat
> 4.12 ą 14% -2.0 2.09 ą 10% perf-profile.calltrace.cycles-pp.security_mmap_file.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 3.54 ą 14% -1.7 1.81 ą 10% perf-profile.calltrace.cycles-pp.apparmor_mmap_file.security_mmap_file.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
> 3.46 ą 8% -1.5 1.99 ą 6% perf-profile.calltrace.cycles-pp.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
> 3.15 ą 8% -1.4 1.71 ą 7% perf-profile.calltrace.cycles-pp.init_file.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2
> 3.06 ą 9% -1.4 1.63 ą 7% perf-profile.calltrace.cycles-pp.security_file_alloc.init_file.alloc_empty_file.path_openat.do_filp_open
> 2.95 ą 9% -1.4 1.54 ą 8% perf-profile.calltrace.cycles-pp.apparmor_file_alloc_security.security_file_alloc.init_file.alloc_empty_file.path_openat
> 5.50 ą 7% -1.1 4.39 ą 5% perf-profile.calltrace.cycles-pp.fstatat64
> 5.34 ą 7% -1.1 4.26 ą 6% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fstatat64
> 5.32 ą 7% -1.1 4.24 ą 6% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
> 5.27 ą 8% -1.1 4.20 ą 6% perf-profile.calltrace.cycles-pp.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
> 4.95 ą 8% -1.0 3.91 ą 7% perf-profile.calltrace.cycles-pp.vfs_fstat.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
> 4.78 ą 8% -1.0 3.77 ą 7% perf-profile.calltrace.cycles-pp.security_inode_getattr.vfs_fstat.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 4.75 ą 9% -1.0 3.74 ą 7% perf-profile.calltrace.cycles-pp.common_perm_cond.security_inode_getattr.vfs_fstat.__do_sys_newfstatat.do_syscall_64
> 1.74 ą 12% -0.9 0.83 ą 11% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.__x64_sys_pread64
> 1.75 ą 12% -0.9 0.84 ą 11% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64
> 2.08 ą 13% -0.9 1.17 ą 9% perf-profile.calltrace.cycles-pp.write
> 1.78 ą 13% -0.9 0.88 ą 13% perf-profile.calltrace.cycles-pp.security_file_post_open.do_open.path_openat.do_filp_open.do_sys_openat2
> 1.77 ą 13% -0.9 0.87 ą 13% perf-profile.calltrace.cycles-pp.ima_file_check.security_file_post_open.do_open.path_openat.do_filp_open
> 1.68 ą 15% -0.9 0.80 ą 13% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
> 1.68 ą 15% -0.9 0.80 ą 13% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
> 1.68 ą 14% -0.9 0.80 ą 14% perf-profile.calltrace.cycles-pp.apparmor_current_getsecid_subj.security_current_getsecid_subj.ima_file_check.security_file_post_open.do_open
> 1.68 ą 14% -0.9 0.81 ą 14% perf-profile.calltrace.cycles-pp.security_current_getsecid_subj.ima_file_check.security_file_post_open.do_open.path_openat
> 1.90 ą 14% -0.9 1.02 ą 10% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
> 1.88 ą 14% -0.9 1.00 ą 11% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
> 1.82 ą 15% -0.9 0.96 ą 11% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
> 1.77 ą 15% -0.8 0.92 ą 11% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
> 1.74 ą 15% -0.8 0.90 ą 12% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.72 ą 15% -0.8 0.87 ą 12% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_write.ksys_write
> 1.73 ą 15% -0.8 0.89 ą 12% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_write.ksys_write.do_syscall_64
> 1.32 ą 5% -0.5 0.80 ą 5% perf-profile.calltrace.cycles-pp.security_file_free.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.31 ą 5% -0.5 0.80 ą 5% perf-profile.calltrace.cycles-pp.apparmor_file_free_security.security_file_free.__fput.__x64_sys_close.do_syscall_64
> 2.72 ą 2% -0.5 2.24 ą 6% perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.68 ą 9% -0.4 0.26 ą100% perf-profile.calltrace.cycles-pp.kobject_put.cdev_put.__fput.__x64_sys_close.do_syscall_64
> 2.48 ą 2% -0.4 2.07 ą 5% perf-profile.calltrace.cycles-pp.get_unmapped_area.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
> 2.39 ą 2% -0.4 1.99 ą 6% perf-profile.calltrace.cycles-pp.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff
> 2.22 ą 2% -0.4 1.84 ą 5% perf-profile.calltrace.cycles-pp.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap.vm_mmap_pgoff
> 1.54 ą 2% -0.3 1.27 ą 6% perf-profile.calltrace.cycles-pp.mas_empty_area_rev.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap
> 0.91 ą 8% -0.2 0.66 ą 6% perf-profile.calltrace.cycles-pp.cdev_put.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.17 ą 3% -0.2 0.96 ą 6% perf-profile.calltrace.cycles-pp.mas_rev_awalk.mas_empty_area_rev.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area
> 0.64 ą 2% -0.1 0.57 ą 4% perf-profile.calltrace.cycles-pp.ioctl
> 2.80 ą 7% +1.7 4.48 ą 6% perf-profile.calltrace.cycles-pp.__libc_pread
> 2.65 ą 7% +1.7 4.35 ą 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pread
> 2.63 ą 7% +1.7 4.33 ą 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
> 2.58 ą 7% +1.7 4.29 ą 7% perf-profile.calltrace.cycles-pp.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
> 2.79 ą 8% +1.7 4.50 ą 7% perf-profile.calltrace.cycles-pp.read
> 2.53 ą 8% +1.7 4.25 ą 7% perf-profile.calltrace.cycles-pp.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
> 2.64 ą 9% +1.7 4.37 ą 8% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
> 2.62 ą 9% +1.7 4.35 ą 8% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 2.57 ą 9% +1.7 4.31 ą 8% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 2.52 ą 10% +1.7 4.27 ą 8% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 1.77 ą 12% +1.9 3.64 ą 8% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.71 ą 15% +1.9 3.64 ą 9% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +2.8 2.79 ą 5% perf-profile.calltrace.cycles-pp.fsnotify_open_perm.do_dentry_open.do_open.path_openat.do_filp_open
> 8.50 ą 7% -2.8 5.66 ą 4% perf-profile.children.cycles-pp.__mmap
> 7.96 ą 7% -2.8 5.20 ą 4% perf-profile.children.cycles-pp.ksys_mmap_pgoff
> 6.81 ą 8% -2.7 4.14 ą 4% perf-profile.children.cycles-pp.security_file_open
> 6.79 ą 8% -2.7 4.14 ą 4% perf-profile.children.cycles-pp.apparmor_file_open
> 7.48 ą 7% -2.7 4.83 ą 4% perf-profile.children.cycles-pp.vm_mmap_pgoff
> 5.14 ą 14% -2.6 2.51 ą 12% perf-profile.children.cycles-pp.apparmor_file_permission
> 5.18 ą 14% -2.6 2.54 ą 11% perf-profile.children.cycles-pp.security_file_permission
> 4.13 ą 14% -2.0 2.10 ą 10% perf-profile.children.cycles-pp.security_mmap_file
> 3.55 ą 14% -1.7 1.81 ą 10% perf-profile.children.cycles-pp.apparmor_mmap_file
> 3.47 ą 8% -1.5 2.00 ą 6% perf-profile.children.cycles-pp.alloc_empty_file
> 3.15 ą 8% -1.4 1.72 ą 7% perf-profile.children.cycles-pp.init_file
> 3.06 ą 9% -1.4 1.64 ą 7% perf-profile.children.cycles-pp.security_file_alloc
> 2.95 ą 9% -1.4 1.55 ą 8% perf-profile.children.cycles-pp.apparmor_file_alloc_security
> 2.18 ą 16% -1.2 1.02 ą 14% perf-profile.children.cycles-pp.security_current_getsecid_subj
> 2.16 ą 16% -1.2 1.00 ą 14% perf-profile.children.cycles-pp.apparmor_current_getsecid_subj
> 5.55 ą 7% -1.1 4.44 ą 5% perf-profile.children.cycles-pp.fstatat64
> 5.27 ą 8% -1.1 4.20 ą 6% perf-profile.children.cycles-pp.__do_sys_newfstatat
> 4.96 ą 8% -1.0 3.92 ą 7% perf-profile.children.cycles-pp.vfs_fstat
> 4.78 ą 8% -1.0 3.77 ą 7% perf-profile.children.cycles-pp.security_inode_getattr
> 4.75 ą 9% -1.0 3.74 ą 7% perf-profile.children.cycles-pp.common_perm_cond
> 2.16 ą 12% -0.9 1.25 ą 8% perf-profile.children.cycles-pp.write
> 1.78 ą 13% -0.9 0.88 ą 13% perf-profile.children.cycles-pp.security_file_post_open
> 1.77 ą 13% -0.9 0.87 ą 13% perf-profile.children.cycles-pp.ima_file_check
> 1.86 ą 14% -0.9 1.00 ą 10% perf-profile.children.cycles-pp.ksys_write
> 1.81 ą 15% -0.8 0.96 ą 10% perf-profile.children.cycles-pp.vfs_write
> 1.32 ą 5% -0.5 0.80 ą 5% perf-profile.children.cycles-pp.security_file_free
> 1.31 ą 5% -0.5 0.80 ą 5% perf-profile.children.cycles-pp.apparmor_file_free_security
> 2.73 ą 2% -0.5 2.25 ą 6% perf-profile.children.cycles-pp.do_mmap
> 2.50 ą 2% -0.4 2.08 ą 6% perf-profile.children.cycles-pp.get_unmapped_area
> 2.41 ą 2% -0.4 2.01 ą 6% perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown
> 2.24 ą 2% -0.4 1.86 ą 5% perf-profile.children.cycles-pp.vm_unmapped_area
> 0.52 ą 23% -0.3 0.23 ą 14% perf-profile.children.cycles-pp.ima_file_mmap
> 1.58 ą 2% -0.3 1.31 ą 6% perf-profile.children.cycles-pp.mas_empty_area_rev
> 0.91 ą 7% -0.2 0.67 ą 6% perf-profile.children.cycles-pp.cdev_put
> 0.44 ą 3% -0.2 0.22 ą 6% perf-profile.children.cycles-pp.__fsnotify_parent
> 1.21 ą 3% -0.2 0.99 ą 6% perf-profile.children.cycles-pp.mas_rev_awalk
> 0.69 ą 9% -0.2 0.50 ą 6% perf-profile.children.cycles-pp.kobject_put
> 1.13 ą 3% -0.2 0.96 ą 4% perf-profile.children.cycles-pp.read_iter_zero
> 1.09 ą 3% -0.2 0.93 ą 4% perf-profile.children.cycles-pp.iov_iter_zero
> 0.96 ą 2% -0.1 0.82 ą 4% perf-profile.children.cycles-pp.rep_stos_alternative
> 0.76 ą 3% -0.1 0.64 ą 4% perf-profile.children.cycles-pp.entry_SYSCALL_64
> 0.21 ą 24% -0.1 0.11 ą 12% perf-profile.children.cycles-pp.aa_file_perm
> 0.31 ą 7% -0.1 0.20 ą 8% perf-profile.children.cycles-pp.down_write_killable
> 0.75 ą 2% -0.1 0.66 ą 4% perf-profile.children.cycles-pp.ioctl
> 0.59 ą 2% -0.1 0.50 ą 4% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> 0.31 ą 9% -0.1 0.23 ą 8% perf-profile.children.cycles-pp.fget
> 0.52 ą 3% -0.1 0.44 ą 5% perf-profile.children.cycles-pp.stress_full
> 0.34 -0.1 0.27 ą 5% perf-profile.children.cycles-pp.llseek
> 0.30 ą 3% -0.1 0.24 ą 8% perf-profile.children.cycles-pp.kmem_cache_free
> 0.34 ą 2% -0.0 0.29 ą 6% perf-profile.children.cycles-pp.mas_prev_slot
> 0.29 ą 2% -0.0 0.24 ą 5% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
> 0.16 ą 5% -0.0 0.11 ą 8% perf-profile.children.cycles-pp.__legitimize_mnt
> 0.16 ą 6% -0.0 0.12 ą 13% perf-profile.children.cycles-pp.__memcg_slab_free_hook
> 0.07 ą 5% -0.0 0.03 ą 81% perf-profile.children.cycles-pp.ksys_lseek
> 0.25 ą 3% -0.0 0.22 ą 6% perf-profile.children.cycles-pp.mas_ascend
> 0.18 -0.0 0.15 ą 5% perf-profile.children.cycles-pp.mas_data_end
> 0.19 ą 2% -0.0 0.16 ą 5% perf-profile.children.cycles-pp.syscall_return_via_sysret
> 0.11 ą 7% -0.0 0.08 ą 8% perf-profile.children.cycles-pp.open_last_lookups
> 0.07 ą 4% -0.0 0.04 ą 50% perf-profile.children.cycles-pp.mas_prev
> 0.11 ą 4% -0.0 0.08 ą 9% perf-profile.children.cycles-pp.__fdget_pos
> 0.07 ą 4% -0.0 0.04 ą 51% perf-profile.children.cycles-pp.process_measurement
> 0.06 -0.0 0.04 ą 65% perf-profile.children.cycles-pp.vfs_getattr_nosec
> 0.06 -0.0 0.04 ą 33% perf-profile.children.cycles-pp.amd_clear_divider
> 0.08 ą 5% -0.0 0.06 ą 7% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
> 0.07 ą 10% +0.0 0.10 ą 10% perf-profile.children.cycles-pp.walk_component
> 0.35 +0.0 0.40 ą 6% perf-profile.children.cycles-pp.link_path_walk
> 97.57 +0.4 97.94 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> 97.40 +0.4 97.80 perf-profile.children.cycles-pp.do_syscall_64
> 2.85 ą 7% +1.7 4.53 ą 6% perf-profile.children.cycles-pp.__libc_pread
> 2.85 ą 8% +1.7 4.54 ą 7% perf-profile.children.cycles-pp.read
> 2.59 ą 7% +1.7 4.30 ą 7% perf-profile.children.cycles-pp.__x64_sys_pread64
> 2.58 ą 9% +1.7 4.31 ą 8% perf-profile.children.cycles-pp.ksys_read
> 0.00 +2.8 2.80 ą 5% perf-profile.children.cycles-pp.fsnotify_open_perm
> 5.23 ą 14% +3.0 8.19 ą 8% perf-profile.children.cycles-pp.rw_verify_area
> 5.06 ą 8% +3.5 8.53 ą 7% perf-profile.children.cycles-pp.vfs_read
> 6.77 ą 8% -2.6 4.12 ą 4% perf-profile.self.cycles-pp.apparmor_file_open
> 5.01 ą 14% -2.6 2.44 ą 12% perf-profile.self.cycles-pp.apparmor_file_permission
> 3.45 ą 13% -1.7 1.77 ą 10% perf-profile.self.cycles-pp.apparmor_mmap_file
> 2.93 ą 9% -1.4 1.54 ą 8% perf-profile.self.cycles-pp.apparmor_file_alloc_security
> 2.14 ą 16% -1.2 0.99 ą 14% perf-profile.self.cycles-pp.apparmor_current_getsecid_subj
> 4.74 ą 9% -1.0 3.73 ą 7% perf-profile.self.cycles-pp.common_perm_cond
> 1.31 ą 5% -0.5 0.79 ą 5% perf-profile.self.cycles-pp.apparmor_file_free_security
> 0.43 ą 3% -0.2 0.21 ą 5% perf-profile.self.cycles-pp.__fsnotify_parent
> 1.07 ą 3% -0.2 0.88 ą 6% perf-profile.self.cycles-pp.mas_rev_awalk
> 0.68 ą 9% -0.2 0.50 ą 6% perf-profile.self.cycles-pp.kobject_put
> 0.95 ą 2% -0.1 0.81 ą 4% perf-profile.self.cycles-pp.rep_stos_alternative
> 0.20 ą 25% -0.1 0.10 ą 14% perf-profile.self.cycles-pp.aa_file_perm
> 0.28 ą 8% -0.1 0.18 ą 8% perf-profile.self.cycles-pp.down_write_killable
> 0.57 ą 3% -0.1 0.48 ą 4% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> 0.31 ą 8% -0.1 0.22 ą 9% perf-profile.self.cycles-pp.fget
> 0.50 ą 3% -0.1 0.43 ą 5% perf-profile.self.cycles-pp.stress_full
> 0.22 ą 6% -0.1 0.16 ą 6% perf-profile.self.cycles-pp.cdev_put
> 0.15 ą 5% -0.0 0.11 ą 6% perf-profile.self.cycles-pp.__legitimize_mnt
> 0.24 ą 4% -0.0 0.20 ą 6% perf-profile.self.cycles-pp.mas_empty_area_rev
> 0.28 ą 3% -0.0 0.24 ą 4% perf-profile.self.cycles-pp.do_syscall_64
> 0.24 ą 3% -0.0 0.20 ą 6% perf-profile.self.cycles-pp.mas_ascend
> 0.18 ą 3% -0.0 0.14 ą 6% perf-profile.self.cycles-pp.do_mmap
> 0.14 ą 5% -0.0 0.11 ą 12% perf-profile.self.cycles-pp.chrdev_open
> 0.19 ą 2% -0.0 0.15 ą 5% perf-profile.self.cycles-pp.syscall_return_via_sysret
> 0.20 ą 3% -0.0 0.17 ą 5% perf-profile.self.cycles-pp.entry_SYSCALL_64
> 0.20 ą 4% -0.0 0.17 ą 3% perf-profile.self.cycles-pp.vfs_read
> 0.18 ą 2% -0.0 0.15 ą 3% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> 0.16 ą 2% -0.0 0.13 ą 4% perf-profile.self.cycles-pp.mas_data_end
> 0.07 ą 4% -0.0 0.04 ą 50% perf-profile.self.cycles-pp.process_measurement
> 0.16 ą 3% -0.0 0.13 ą 5% perf-profile.self.cycles-pp.vm_unmapped_area
> 0.12 ą 4% -0.0 0.09 ą 6% perf-profile.self.cycles-pp.mas_prev_slot
> 0.14 ą 2% -0.0 0.12 ą 5% perf-profile.self.cycles-pp.kmem_cache_free
> 0.10 ą 5% -0.0 0.07 ą 6% perf-profile.self.cycles-pp.open64
> 0.15 ą 2% -0.0 0.13 ą 5% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
> 0.15 ą 2% -0.0 0.13 ą 4% perf-profile.self.cycles-pp.ioctl
> 0.09 ą 5% -0.0 0.07 ą 8% perf-profile.self.cycles-pp.write
> 0.07 ą 6% -0.0 0.06 perf-profile.self.cycles-pp.__close
> 0.11 ą 4% +0.0 0.13 ą 4% perf-profile.self.cycles-pp.link_path_walk
> 0.01 ą200% +0.0 0.06 ą 9% perf-profile.self.cycles-pp.__virt_addr_valid
> 0.75 ą 2% +0.1 0.89 ą 3% perf-profile.self.cycles-pp._raw_spin_lock
> 0.00 +2.8 2.79 ą 5% perf-profile.self.cycles-pp.fsnotify_open_perm
> 0.05 +5.6 5.63 ą 10% perf-profile.self.cycles-pp.rw_verify_area
>
>
> ***************************************************************************************************
> lkp-csl-d02: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
> gcc-13/performance/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/lkp-csl-d02/fsbuffer-r/unixbench
>
> commit:
> 477cf917dd ("fsnotify: use an enum for group priority constants")
> a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
>
> 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 1339661 +6.4% 1425877 unixbench.throughput
> 5.765e+08 +6.4% 6.131e+08 unixbench.workload
> 1.159e+09 +2.2% 1.184e+09 perf-stat.i.branch-instructions
> 1.49 +0.0 1.54 perf-stat.i.branch-miss-rate%
> 10449249 ą 2% +6.7% 11149426 perf-stat.i.branch-misses
> 4514 -5.3% 4273 perf-stat.overall.path-length
> 1.156e+09 +2.2% 1.181e+09 perf-stat.ps.branch-instructions
> 10430168 ą 2% +6.7% 11128869 perf-stat.ps.branch-misses
> 7.02 ą 2% -3.3 3.70 ą 3% perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.45 ą 3% +0.2 1.62 ą 3% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.read
> 1.24 ą 3% +0.2 1.44 ą 3% perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.filemap_read.vfs_read
> 2.55 ą 8% +0.4 2.91 ą 4% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
> 3.04 ą 6% +0.4 3.44 ą 3% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
> 1.94 ą 9% +0.5 2.42 ą 3% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 8.62 ą 3% +0.5 9.14 perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.vfs_read.ksys_read.do_syscall_64
> 7.90 ą 2% +0.6 8.51 perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read.ksys_read
> 9.29 ą 2% +0.8 10.04 perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.vfs_read.ksys_read.do_syscall_64
> 4.43 ą 7% +0.8 5.28 ą 2% perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read
> 29.04 ą 3% +1.8 30.80 perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 7.06 ą 2% -3.3 3.73 ą 3% perf-profile.children.cycles-pp.__fsnotify_parent
> 0.77 ą 6% +0.1 0.88 ą 7% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
> 1.26 ą 2% +0.2 1.45 ą 3% perf-profile.children.cycles-pp.current_time
> 1.66 ą 3% +0.2 1.90 ą 3% perf-profile.children.cycles-pp.syscall_return_via_sysret
> 3.72 ą 2% +0.3 4.03 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> 2.56 ą 7% +0.4 2.91 ą 4% perf-profile.children.cycles-pp.apparmor_file_permission
> 5.72 ą 2% +0.4 6.08 perf-profile.children.cycles-pp.entry_SYSCALL_64
> 4.40 ą 4% +0.4 4.81 ą 2% perf-profile.children.cycles-pp.rep_movs_alternative
> 3.10 ą 6% +0.4 3.52 ą 3% perf-profile.children.cycles-pp.security_file_permission
> 1.94 ą 9% +0.5 2.42 ą 3% perf-profile.children.cycles-pp.__fdget_pos
> 8.68 ą 3% +0.5 9.20 perf-profile.children.cycles-pp.filemap_get_pages
> 8.37 ą 2% +0.7 9.05 perf-profile.children.cycles-pp._copy_to_iter
> 9.52 ą 2% +0.8 10.28 perf-profile.children.cycles-pp.copy_page_to_iter
> 29.25 ą 3% +1.7 30.99 perf-profile.children.cycles-pp.filemap_read
> 6.94 -3.2 3.72 ą 3% perf-profile.self.cycles-pp.__fsnotify_parent
> 0.77 ą 6% +0.1 0.88 ą 7% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
> 0.83 ą 5% +0.1 0.97 ą 7% perf-profile.self.cycles-pp.current_time
> 1.66 ą 3% +0.2 1.90 ą 3% perf-profile.self.cycles-pp.syscall_return_via_sysret
> 3.52 ą 2% +0.2 3.76 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> 2.42 ą 3% +0.3 2.67 ą 3% perf-profile.self.cycles-pp.entry_SYSCALL_64
> 1.92 ą 6% +0.3 2.20 ą 5% perf-profile.self.cycles-pp.apparmor_file_permission
> 3.92 ą 4% +0.3 4.25 ą 2% perf-profile.self.cycles-pp.rep_movs_alternative
> 4.38 +0.3 4.72 ą 2% perf-profile.self.cycles-pp._copy_to_iter
> 1.16 ą 8% +0.3 1.51 ą 2% perf-profile.self.cycles-pp.ksys_read
> 1.85 ą 10% +0.5 2.36 ą 2% perf-profile.self.cycles-pp.__fdget_pos
>
>
>
> ***************************************************************************************************
> lkp-csl-d02: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
> gcc-13/performance/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/lkp-csl-d02/fstime-r/unixbench
>
> commit:
> 477cf917dd ("fsnotify: use an enum for group priority constants")
> a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
>
> 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 4709035 +5.8% 4980152 unixbench.throughput
> 2.026e+09 +5.7% 2.141e+09 unixbench.workload
> 1.034e+09 +1.4% 1.048e+09 perf-stat.i.branch-instructions
> 1.56 +0.0 1.59 perf-stat.i.branch-miss-rate%
> 60950726 +5.3% 64193405 perf-stat.i.cache-references
> 0.02 ą 30% -36.7% 0.01 ą 39% perf-stat.i.major-faults
> 0.78 -0.0 0.75 perf-stat.overall.cache-miss-rate%
> 1145 -5.4% 1083 perf-stat.overall.path-length
> 1.031e+09 +1.4% 1.046e+09 perf-stat.ps.branch-instructions
> 60812120 +5.3% 64047513 perf-stat.ps.cache-references
> 0.02 ą 30% -36.7% 0.01 ą 39% perf-stat.ps.major-faults
> 6.22 ą 3% -2.9 3.30 ą 3% perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 49.43 -1.5 47.90 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 52.39 -1.0 51.34 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 55.16 -0.9 54.29 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 56.49 -0.7 55.80 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
> 2.40 ą 4% +0.2 2.64 ą 5% perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_read.vfs_read.ksys_read
> 2.59 ą 4% +0.3 2.86 ą 5% perf-profile.calltrace.cycles-pp.touch_atime.filemap_read.vfs_read.ksys_read.do_syscall_64
> 6.88 +0.3 7.23 ą 2% perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.vfs_read.ksys_read
> 2.26 ą 3% +0.4 2.64 ą 10% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
> 7.90 ą 3% +0.4 8.29 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read
> 2.68 ą 2% +0.4 3.13 ą 8% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
> 8.47 +0.4 8.91 perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.vfs_read.ksys_read.do_syscall_64
> 32.80 +1.8 34.63 perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 6.27 ą 3% -2.9 3.34 ą 3% perf-profile.children.cycles-pp.__fsnotify_parent
> 49.50 -1.4 48.07 perf-profile.children.cycles-pp.vfs_read
> 52.46 -1.0 51.45 perf-profile.children.cycles-pp.ksys_read
> 1.16 ą 4% +0.1 1.28 ą 4% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
> 2.46 ą 4% +0.2 2.69 ą 6% perf-profile.children.cycles-pp.atime_needs_update
> 5.03 ą 3% +0.3 5.30 perf-profile.children.cycles-pp.entry_SYSCALL_64
> 2.66 ą 4% +0.3 2.94 ą 6% perf-profile.children.cycles-pp.touch_atime
> 3.27 ą 2% +0.3 3.59 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> 6.96 +0.4 7.31 ą 2% perf-profile.children.cycles-pp.filemap_get_read_batch
> 2.27 ą 3% +0.4 2.64 ą 10% perf-profile.children.cycles-pp.apparmor_file_permission
> 2.76 ą 2% +0.4 3.20 ą 7% perf-profile.children.cycles-pp.security_file_permission
> 8.52 +0.5 8.98 perf-profile.children.cycles-pp.filemap_get_pages
> 32.99 +1.8 34.80 perf-profile.children.cycles-pp.filemap_read
> 6.16 ą 3% -2.8 3.32 ą 3% perf-profile.self.cycles-pp.__fsnotify_parent
> 1.19 ą 3% -0.4 0.81 ą 6% perf-profile.self.cycles-pp.rw_verify_area
> 1.55 ą 3% +0.1 1.64 ą 2% perf-profile.self.cycles-pp.filemap_get_pages
> 0.70 ą 3% +0.1 0.81 ą 7% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
> 1.31 ą 4% +0.1 1.43 ą 4% perf-profile.self.cycles-pp.do_syscall_64
> 2.15 ą 4% +0.1 2.28 perf-profile.self.cycles-pp.entry_SYSCALL_64
> 4.00 ą 2% +0.2 4.22 perf-profile.self.cycles-pp.read
> 1.06 ą 4% +0.3 1.31 ą 5% perf-profile.self.cycles-pp.ksys_read
> 3.09 ą 2% +0.3 3.36 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> 3.89 ą 2% +0.3 4.19 ą 3% perf-profile.self.cycles-pp._copy_to_iter
> 1.66 ą 2% +0.3 2.01 ą 13% perf-profile.self.cycles-pp.apparmor_file_permission
>
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>
* Re: [linux-next:master] [fsnotify] a5e57b4d37: stress-ng.full.ops_per_sec -17.3% regression
2024-04-11 9:23 ` Amir Goldstein
@ 2024-04-11 11:54 ` Jan Kara
2024-04-11 16:22 ` Amir Goldstein
0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2024-04-11 11:54 UTC (permalink / raw)
To: Amir Goldstein
Cc: kernel test robot, Jan Kara, oe-lkp, lkp,
Linux Memory Management List, linux-fsdevel, ying.huang,
feng.tang, fengwei.yin
On Thu 11-04-24 12:23:34, Amir Goldstein wrote:
> On Thu, Apr 11, 2024 at 4:42 AM kernel test robot <oliver.sang@intel.com> wrote:
> > for "[amir73il:fsnotify-sbconn] [fsnotify] 629f30e073: unixbench.throughput 5.8% improvement"
> > (https://lore.kernel.org/all/202403141505.807a722b-oliver.sang@intel.com/)
> > you requested us to test unixbench for this commit on different branches and we
> > observed consistent performance improvement.
> >
> > now we noticed this commit is merged into linux-next/master, we still
> > observed similar unixbench improvement, however, we also captured a
> > stress-ng regression now. below details FYI.
> >
> > Hello,
> >
> > kernel test robot noticed a -17.3% regression of stress-ng.full.ops_per_sec on:
> >
> >
> > commit: a5e57b4d370c6d320e5bfb0c919fe00aee29e039 ("fsnotify: optimize the case of no permission event watchers")
>
> Odd. This commit does add an extra fsnotify_sb_has_priority_watchers()
> inline check for reads and writes, but the inline helper
> fsnotify_sb_has_watchers()
> already exists in fsnotify_parent() and it already accesses fsnotify_sb_info.
>
> It seems like stress-ng.full does read/write/mmap operations on /dev/full,
> so the fsnotify_sb_info object would be that of devtmpfs.
>
> I think that the permission events on special files are not very relevant,
> but I am not sure.
>
> Jan, any ideas?

So I'm not 100% sure, but this load simply seems to run 'stress-ng' with all
the syscalls it is able to exercise (one per CPU if I'm right). Hum...
looking at the perf numbers I've noticed changes like:

0.43 ± 3% -0.2 0.21 ± 5% perf-profile.self.cycles-pp.__fsnotify_parent
0.00 +2.8 2.79 ± 5% perf-profile.self.cycles-pp.fsnotify_open_perm

or

1.77 ± 12% +1.9 3.64 ± 8% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.71 ± 15% +1.9 3.64 ± 9% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +2.8 2.79 ± 5% perf-profile.calltrace.cycles-pp.fsnotify_open_perm.do_dentry_open.do_open.path_openat.do_filp_open

So the savings in __fsnotify_parent() don't really outweigh the costs in
fsnotify_file()... I can see stress-ng also exercises inotify, so maybe
there's some contention on the counters which is causing the regression now
that we have more of them?
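
As a rough cross-check of that comparison, the arithmetic can be sketched in C using the self-cycle figures from the report (__fsnotify_parent 0.43 -> 0.21, fsnotify_open_perm 0.00 -> 2.79, rw_verify_area 0.05 -> 5.63) and the headline ops_per_sec values. The helper names are hypothetical, and summing these three symbols assumes the inlined fsnotify checks are attributed to rw_verify_area() and fsnotify_open_perm() in the profile.

```c
/* Relative change from base to patched, in percent. */
static double pct_change(double base, double patched)
{
	return (patched - base) / base * 100.0;
}

/* Change in percentage points of total cycles for one symbol. */
static double pp_delta(double base, double patched)
{
	return patched - base;
}

/* Net change, in percentage points, over the three fsnotify-related
 * symbols taken from the perf-profile.self.cycles-pp rows. */
static double net_fsnotify_pp(void)
{
	return pp_delta(0.43, 0.21)	/* __fsnotify_parent: saving */
	     + pp_delta(0.00, 2.79)	/* fsnotify_open_perm: new cost */
	     + pp_delta(0.05, 5.63);	/* rw_verify_area: new cost */
}
```

Under that assumption, net_fsnotify_pp() comes out at roughly +8.15 percentage points of cycles, against a saving of only 0.22 in __fsnotify_parent(), and pct_change(2434462, 2013444) reproduces the reported -17.3% on stress-ng.full.ops_per_sec.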

BTW, I'm not sure how you've arrived at the conclusion that the test is using
/dev/full. As far as I can tell, e.g. the stress-mmap test is using a file
in a subdir of CWD.

Honza

> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> >
> > testcase: stress-ng
> > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> > parameters:
> >
> > nr_threads: 100%
> > testtime: 60s
> > test: full
> > cpufreq_governor: performance
> >
> >
> > In addition to that, the commit also has significant impact on the following tests:
> >
> > +------------------+-------------------------------------------------------------------------------------------------+
> > | testcase: change | unixbench: unixbench.throughput 6.4% improvement |
> > | test machine | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory |
> > | test parameters | cpufreq_governor=performance |
> > | | nr_task=1 |
> > | | runtime=300s |
> > | | test=fsbuffer-r |
> > +------------------+-------------------------------------------------------------------------------------------------+
> > | testcase: change | unixbench: unixbench.throughput 5.8% improvement |
> > | test machine | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory |
> > | test parameters | cpufreq_governor=performance |
> > | | nr_task=1 |
> > | | runtime=300s |
> > | | test=fstime-r |
> > +------------------+-------------------------------------------------------------------------------------------------+
> >
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202404101624.85684be8-oliver.sang@intel.com
> >
> >
> > Details are as below:
> > -------------------------------------------------------------------------------------------------->
> >
> >
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20240410/202404101624.85684be8-oliver.sang@intel.com
> >
> > =========================================================================================
> > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/full/stress-ng/60s
> >
> > commit:
> > 477cf917dd ("fsnotify: use an enum for group priority constants")
> > a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
> >
> > 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 20489 ± 7% -19.2% 16565 ± 13% perf-c2c.HITM.remote
> > 409.48 ± 9% -14.0% 352.13 ± 5% sched_debug.cfs_rq:/.util_est.avg
> > 217.94 ± 8% +12.9% 246.07 ± 4% sched_debug.cfs_rq:/.util_est.stddev
> > 1.461e+08 ± 3% -17.3% 1.208e+08 ± 5% stress-ng.full.ops
> > 2434462 ± 3% -17.3% 2013444 ± 5% stress-ng.full.ops_per_sec
> > 71.04 ± 3% -16.6% 59.28 ± 6% stress-ng.time.user_time
> > 9.95e+09 ± 4% -13.4% 8.617e+09 ± 3% perf-stat.i.branch-instructions
> > 0.48 ± 3% +0.1 0.55 ± 2% perf-stat.i.branch-miss-rate%
> > 4.36 ± 4% +17.1% 5.10 ± 3% perf-stat.i.cpi
> > 5.162e+10 ± 4% -14.5% 4.416e+10 ± 3% perf-stat.i.instructions
> > 0.24 ± 3% -13.8% 0.21 ± 3% perf-stat.i.ipc
> > 0.46 ± 3% +0.1 0.54 ± 2% perf-stat.overall.branch-miss-rate%
> > 4.38 ± 4% +16.9% 5.12 ± 3% perf-stat.overall.cpi
> > 0.23 ± 4% -14.5% 0.20 ± 3% perf-stat.overall.ipc
> > 9.781e+09 ± 4% -13.4% 8.471e+09 ± 3% perf-stat.ps.branch-instructions
> > 5.075e+10 ± 4% -14.5% 4.341e+10 ± 3% perf-stat.ps.instructions
> > 3.111e+12 ± 4% -14.5% 2.66e+12 ± 3% perf-stat.total.instructions
> > 8.39 ą 7% -2.8 5.56 ą 4% perf-profile.calltrace.cycles-pp.__mmap
> > 8.09 ą 7% -2.8 5.31 ą 4% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__mmap
> > 8.05 ą 7% -2.8 5.28 ą 4% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
> > 7.95 ą 7% -2.8 5.19 ą 4% perf-profile.calltrace.cycles-pp.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
> > 6.80 ą 8% -2.7 4.14 ą 4% perf-profile.calltrace.cycles-pp.security_file_open.do_dentry_open.do_open.path_openat.do_filp_open
> > 7.46 ą 8% -2.7 4.80 ą 4% perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
> > 6.78 ą 8% -2.7 4.13 ą 4% perf-profile.calltrace.cycles-pp.apparmor_file_open.security_file_open.do_dentry_open.do_open.path_openat
> > 4.12 ą 14% -2.0 2.09 ą 10% perf-profile.calltrace.cycles-pp.security_mmap_file.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 3.54 ą 14% -1.7 1.81 ą 10% perf-profile.calltrace.cycles-pp.apparmor_mmap_file.security_mmap_file.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
> > 3.46 ą 8% -1.5 1.99 ą 6% perf-profile.calltrace.cycles-pp.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
> > 3.15 ą 8% -1.4 1.71 ą 7% perf-profile.calltrace.cycles-pp.init_file.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2
> > 3.06 ą 9% -1.4 1.63 ą 7% perf-profile.calltrace.cycles-pp.security_file_alloc.init_file.alloc_empty_file.path_openat.do_filp_open
> > 2.95 ą 9% -1.4 1.54 ą 8% perf-profile.calltrace.cycles-pp.apparmor_file_alloc_security.security_file_alloc.init_file.alloc_empty_file.path_openat
> > 5.50 ą 7% -1.1 4.39 ą 5% perf-profile.calltrace.cycles-pp.fstatat64
> > 5.34 ą 7% -1.1 4.26 ą 6% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fstatat64
> > 5.32 ą 7% -1.1 4.24 ą 6% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
> > 5.27 ą 8% -1.1 4.20 ą 6% perf-profile.calltrace.cycles-pp.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
> > 4.95 ą 8% -1.0 3.91 ą 7% perf-profile.calltrace.cycles-pp.vfs_fstat.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
> > 4.78 ą 8% -1.0 3.77 ą 7% perf-profile.calltrace.cycles-pp.security_inode_getattr.vfs_fstat.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 4.75 ą 9% -1.0 3.74 ą 7% perf-profile.calltrace.cycles-pp.common_perm_cond.security_inode_getattr.vfs_fstat.__do_sys_newfstatat.do_syscall_64
> > 1.74 ą 12% -0.9 0.83 ą 11% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.__x64_sys_pread64
> > 1.75 ą 12% -0.9 0.84 ą 11% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64
> > 2.08 ą 13% -0.9 1.17 ą 9% perf-profile.calltrace.cycles-pp.write
> > 1.78 ą 13% -0.9 0.88 ą 13% perf-profile.calltrace.cycles-pp.security_file_post_open.do_open.path_openat.do_filp_open.do_sys_openat2
> > 1.77 ą 13% -0.9 0.87 ą 13% perf-profile.calltrace.cycles-pp.ima_file_check.security_file_post_open.do_open.path_openat.do_filp_open
> > 1.68 ą 15% -0.9 0.80 ą 13% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
> > 1.68 ą 15% -0.9 0.80 ą 13% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
> > 1.68 ą 14% -0.9 0.80 ą 14% perf-profile.calltrace.cycles-pp.apparmor_current_getsecid_subj.security_current_getsecid_subj.ima_file_check.security_file_post_open.do_open
> > 1.68 ą 14% -0.9 0.81 ą 14% perf-profile.calltrace.cycles-pp.security_current_getsecid_subj.ima_file_check.security_file_post_open.do_open.path_openat
> > 1.90 ą 14% -0.9 1.02 ą 10% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
> > 1.88 ą 14% -0.9 1.00 ą 11% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
> > 1.82 ą 15% -0.9 0.96 ą 11% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
> > 1.77 ą 15% -0.8 0.92 ą 11% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
> > 1.74 ą 15% -0.8 0.90 ą 12% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 1.72 ą 15% -0.8 0.87 ą 12% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_write.ksys_write
> > 1.73 ą 15% -0.8 0.89 ą 12% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_write.ksys_write.do_syscall_64
> > 1.32 ą 5% -0.5 0.80 ą 5% perf-profile.calltrace.cycles-pp.security_file_free.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 1.31 ą 5% -0.5 0.80 ą 5% perf-profile.calltrace.cycles-pp.apparmor_file_free_security.security_file_free.__fput.__x64_sys_close.do_syscall_64
> > 2.72 ą 2% -0.5 2.24 ą 6% perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 0.68 ą 9% -0.4 0.26 ą100% perf-profile.calltrace.cycles-pp.kobject_put.cdev_put.__fput.__x64_sys_close.do_syscall_64
> > 2.48 ą 2% -0.4 2.07 ą 5% perf-profile.calltrace.cycles-pp.get_unmapped_area.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
> > 2.39 ą 2% -0.4 1.99 ą 6% perf-profile.calltrace.cycles-pp.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff
> > 2.22 ą 2% -0.4 1.84 ą 5% perf-profile.calltrace.cycles-pp.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap.vm_mmap_pgoff
> > 1.54 ą 2% -0.3 1.27 ą 6% perf-profile.calltrace.cycles-pp.mas_empty_area_rev.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap
> > 0.91 ą 8% -0.2 0.66 ą 6% perf-profile.calltrace.cycles-pp.cdev_put.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 1.17 ą 3% -0.2 0.96 ą 6% perf-profile.calltrace.cycles-pp.mas_rev_awalk.mas_empty_area_rev.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area
> > 0.64 ą 2% -0.1 0.57 ą 4% perf-profile.calltrace.cycles-pp.ioctl
> > 2.80 ą 7% +1.7 4.48 ą 6% perf-profile.calltrace.cycles-pp.__libc_pread
> > 2.65 ą 7% +1.7 4.35 ą 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pread
> > 2.63 ą 7% +1.7 4.33 ą 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
> > 2.58 ą 7% +1.7 4.29 ą 7% perf-profile.calltrace.cycles-pp.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
> > 2.79 ą 8% +1.7 4.50 ą 7% perf-profile.calltrace.cycles-pp.read
> > 2.53 ą 8% +1.7 4.25 ą 7% perf-profile.calltrace.cycles-pp.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
> > 2.64 ą 9% +1.7 4.37 ą 8% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
> > 2.62 ą 9% +1.7 4.35 ą 8% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> > 2.57 ą 9% +1.7 4.31 ą 8% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> > 2.52 ą 10% +1.7 4.27 ą 8% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> > 1.77 ą 12% +1.9 3.64 ą 8% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 1.71 ą 15% +1.9 3.64 ą 9% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 0.00 +2.8 2.79 ą 5% perf-profile.calltrace.cycles-pp.fsnotify_open_perm.do_dentry_open.do_open.path_openat.do_filp_open
> > 8.50 ą 7% -2.8 5.66 ą 4% perf-profile.children.cycles-pp.__mmap
> > 7.96 ą 7% -2.8 5.20 ą 4% perf-profile.children.cycles-pp.ksys_mmap_pgoff
> > 6.81 ą 8% -2.7 4.14 ą 4% perf-profile.children.cycles-pp.security_file_open
> > 6.79 ą 8% -2.7 4.14 ą 4% perf-profile.children.cycles-pp.apparmor_file_open
> > 7.48 ą 7% -2.7 4.83 ą 4% perf-profile.children.cycles-pp.vm_mmap_pgoff
> > 5.14 ą 14% -2.6 2.51 ą 12% perf-profile.children.cycles-pp.apparmor_file_permission
> > 5.18 ą 14% -2.6 2.54 ą 11% perf-profile.children.cycles-pp.security_file_permission
> > 4.13 ą 14% -2.0 2.10 ą 10% perf-profile.children.cycles-pp.security_mmap_file
> > 3.55 ą 14% -1.7 1.81 ą 10% perf-profile.children.cycles-pp.apparmor_mmap_file
> > 3.47 ą 8% -1.5 2.00 ą 6% perf-profile.children.cycles-pp.alloc_empty_file
> > 3.15 ą 8% -1.4 1.72 ą 7% perf-profile.children.cycles-pp.init_file
> > 3.06 ą 9% -1.4 1.64 ą 7% perf-profile.children.cycles-pp.security_file_alloc
> > 2.95 ą 9% -1.4 1.55 ą 8% perf-profile.children.cycles-pp.apparmor_file_alloc_security
> > 2.18 ą 16% -1.2 1.02 ą 14% perf-profile.children.cycles-pp.security_current_getsecid_subj
> > 2.16 ą 16% -1.2 1.00 ą 14% perf-profile.children.cycles-pp.apparmor_current_getsecid_subj
> > 5.55 ą 7% -1.1 4.44 ą 5% perf-profile.children.cycles-pp.fstatat64
> > 5.27 ą 8% -1.1 4.20 ą 6% perf-profile.children.cycles-pp.__do_sys_newfstatat
> > 4.96 ą 8% -1.0 3.92 ą 7% perf-profile.children.cycles-pp.vfs_fstat
> > 4.78 ą 8% -1.0 3.77 ą 7% perf-profile.children.cycles-pp.security_inode_getattr
> > 4.75 ą 9% -1.0 3.74 ą 7% perf-profile.children.cycles-pp.common_perm_cond
> > 2.16 ą 12% -0.9 1.25 ą 8% perf-profile.children.cycles-pp.write
> > 1.78 ą 13% -0.9 0.88 ą 13% perf-profile.children.cycles-pp.security_file_post_open
> > 1.77 ą 13% -0.9 0.87 ą 13% perf-profile.children.cycles-pp.ima_file_check
> > 1.86 ą 14% -0.9 1.00 ą 10% perf-profile.children.cycles-pp.ksys_write
> > 1.81 ą 15% -0.8 0.96 ą 10% perf-profile.children.cycles-pp.vfs_write
> > 1.32 ą 5% -0.5 0.80 ą 5% perf-profile.children.cycles-pp.security_file_free
> > 1.31 ą 5% -0.5 0.80 ą 5% perf-profile.children.cycles-pp.apparmor_file_free_security
> > 2.73 ą 2% -0.5 2.25 ą 6% perf-profile.children.cycles-pp.do_mmap
> > 2.50 ą 2% -0.4 2.08 ą 6% perf-profile.children.cycles-pp.get_unmapped_area
> > 2.41 ą 2% -0.4 2.01 ą 6% perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown
> > 2.24 ą 2% -0.4 1.86 ą 5% perf-profile.children.cycles-pp.vm_unmapped_area
> > 0.52 ą 23% -0.3 0.23 ą 14% perf-profile.children.cycles-pp.ima_file_mmap
> > 1.58 ą 2% -0.3 1.31 ą 6% perf-profile.children.cycles-pp.mas_empty_area_rev
> > 0.91 ą 7% -0.2 0.67 ą 6% perf-profile.children.cycles-pp.cdev_put
> > 0.44 ą 3% -0.2 0.22 ą 6% perf-profile.children.cycles-pp.__fsnotify_parent
> > 1.21 ą 3% -0.2 0.99 ą 6% perf-profile.children.cycles-pp.mas_rev_awalk
> > 0.69 ą 9% -0.2 0.50 ą 6% perf-profile.children.cycles-pp.kobject_put
> > 1.13 ą 3% -0.2 0.96 ą 4% perf-profile.children.cycles-pp.read_iter_zero
> > 1.09 ą 3% -0.2 0.93 ą 4% perf-profile.children.cycles-pp.iov_iter_zero
> > 0.96 ą 2% -0.1 0.82 ą 4% perf-profile.children.cycles-pp.rep_stos_alternative
> > 0.76 ą 3% -0.1 0.64 ą 4% perf-profile.children.cycles-pp.entry_SYSCALL_64
> > 0.21 ą 24% -0.1 0.11 ą 12% perf-profile.children.cycles-pp.aa_file_perm
> > 0.31 ą 7% -0.1 0.20 ą 8% perf-profile.children.cycles-pp.down_write_killable
> > 0.75 ą 2% -0.1 0.66 ą 4% perf-profile.children.cycles-pp.ioctl
> > 0.59 ą 2% -0.1 0.50 ą 4% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> > 0.31 ą 9% -0.1 0.23 ą 8% perf-profile.children.cycles-pp.fget
> > 0.52 ą 3% -0.1 0.44 ą 5% perf-profile.children.cycles-pp.stress_full
> > 0.34 -0.1 0.27 ą 5% perf-profile.children.cycles-pp.llseek
> > 0.30 ą 3% -0.1 0.24 ą 8% perf-profile.children.cycles-pp.kmem_cache_free
> > 0.34 ą 2% -0.0 0.29 ą 6% perf-profile.children.cycles-pp.mas_prev_slot
> > 0.29 ą 2% -0.0 0.24 ą 5% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
> > 0.16 ą 5% -0.0 0.11 ą 8% perf-profile.children.cycles-pp.__legitimize_mnt
> > 0.16 ą 6% -0.0 0.12 ą 13% perf-profile.children.cycles-pp.__memcg_slab_free_hook
> > 0.07 ą 5% -0.0 0.03 ą 81% perf-profile.children.cycles-pp.ksys_lseek
> > 0.25 ą 3% -0.0 0.22 ą 6% perf-profile.children.cycles-pp.mas_ascend
> > 0.18 -0.0 0.15 ą 5% perf-profile.children.cycles-pp.mas_data_end
> > 0.19 ą 2% -0.0 0.16 ą 5% perf-profile.children.cycles-pp.syscall_return_via_sysret
> > 0.11 ą 7% -0.0 0.08 ą 8% perf-profile.children.cycles-pp.open_last_lookups
> > 0.07 ą 4% -0.0 0.04 ą 50% perf-profile.children.cycles-pp.mas_prev
> > 0.11 ą 4% -0.0 0.08 ą 9% perf-profile.children.cycles-pp.__fdget_pos
> > 0.07 ą 4% -0.0 0.04 ą 51% perf-profile.children.cycles-pp.process_measurement
> > 0.06 -0.0 0.04 ą 65% perf-profile.children.cycles-pp.vfs_getattr_nosec
> > 0.06 -0.0 0.04 ą 33% perf-profile.children.cycles-pp.amd_clear_divider
> > 0.08 ą 5% -0.0 0.06 ą 7% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
> > 0.07 ą 10% +0.0 0.10 ą 10% perf-profile.children.cycles-pp.walk_component
> > 0.35 +0.0 0.40 ą 6% perf-profile.children.cycles-pp.link_path_walk
> > 97.57 +0.4 97.94 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 97.40 +0.4 97.80 perf-profile.children.cycles-pp.do_syscall_64
> > 2.85 ą 7% +1.7 4.53 ą 6% perf-profile.children.cycles-pp.__libc_pread
> > 2.85 ą 8% +1.7 4.54 ą 7% perf-profile.children.cycles-pp.read
> > 2.59 ą 7% +1.7 4.30 ą 7% perf-profile.children.cycles-pp.__x64_sys_pread64
> > 2.58 ą 9% +1.7 4.31 ą 8% perf-profile.children.cycles-pp.ksys_read
> > 0.00 +2.8 2.80 ą 5% perf-profile.children.cycles-pp.fsnotify_open_perm
> > 5.23 ą 14% +3.0 8.19 ą 8% perf-profile.children.cycles-pp.rw_verify_area
> > 5.06 ą 8% +3.5 8.53 ą 7% perf-profile.children.cycles-pp.vfs_read
> > 6.77 ą 8% -2.6 4.12 ą 4% perf-profile.self.cycles-pp.apparmor_file_open
> > 5.01 ą 14% -2.6 2.44 ą 12% perf-profile.self.cycles-pp.apparmor_file_permission
> > 3.45 ą 13% -1.7 1.77 ą 10% perf-profile.self.cycles-pp.apparmor_mmap_file
> > 2.93 ą 9% -1.4 1.54 ą 8% perf-profile.self.cycles-pp.apparmor_file_alloc_security
> > 2.14 ą 16% -1.2 0.99 ą 14% perf-profile.self.cycles-pp.apparmor_current_getsecid_subj
> > 4.74 ą 9% -1.0 3.73 ą 7% perf-profile.self.cycles-pp.common_perm_cond
> > 1.31 ą 5% -0.5 0.79 ą 5% perf-profile.self.cycles-pp.apparmor_file_free_security
> > 0.43 ą 3% -0.2 0.21 ą 5% perf-profile.self.cycles-pp.__fsnotify_parent
> > 1.07 ą 3% -0.2 0.88 ą 6% perf-profile.self.cycles-pp.mas_rev_awalk
> > 0.68 ą 9% -0.2 0.50 ą 6% perf-profile.self.cycles-pp.kobject_put
> > 0.95 ą 2% -0.1 0.81 ą 4% perf-profile.self.cycles-pp.rep_stos_alternative
> > 0.20 ą 25% -0.1 0.10 ą 14% perf-profile.self.cycles-pp.aa_file_perm
> > 0.28 ą 8% -0.1 0.18 ą 8% perf-profile.self.cycles-pp.down_write_killable
> > 0.57 ą 3% -0.1 0.48 ą 4% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> > 0.31 ą 8% -0.1 0.22 ą 9% perf-profile.self.cycles-pp.fget
> > 0.50 ą 3% -0.1 0.43 ą 5% perf-profile.self.cycles-pp.stress_full
> > 0.22 ą 6% -0.1 0.16 ą 6% perf-profile.self.cycles-pp.cdev_put
> > 0.15 ą 5% -0.0 0.11 ą 6% perf-profile.self.cycles-pp.__legitimize_mnt
> > 0.24 ą 4% -0.0 0.20 ą 6% perf-profile.self.cycles-pp.mas_empty_area_rev
> > 0.28 ą 3% -0.0 0.24 ą 4% perf-profile.self.cycles-pp.do_syscall_64
> > 0.24 ą 3% -0.0 0.20 ą 6% perf-profile.self.cycles-pp.mas_ascend
> > 0.18 ą 3% -0.0 0.14 ą 6% perf-profile.self.cycles-pp.do_mmap
> > 0.14 ą 5% -0.0 0.11 ą 12% perf-profile.self.cycles-pp.chrdev_open
> > 0.19 ą 2% -0.0 0.15 ą 5% perf-profile.self.cycles-pp.syscall_return_via_sysret
> > 0.20 ą 3% -0.0 0.17 ą 5% perf-profile.self.cycles-pp.entry_SYSCALL_64
> > 0.20 ą 4% -0.0 0.17 ą 3% perf-profile.self.cycles-pp.vfs_read
> > 0.18 ą 2% -0.0 0.15 ą 3% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 0.16 ą 2% -0.0 0.13 ą 4% perf-profile.self.cycles-pp.mas_data_end
> > 0.07 ą 4% -0.0 0.04 ą 50% perf-profile.self.cycles-pp.process_measurement
> > 0.16 ą 3% -0.0 0.13 ą 5% perf-profile.self.cycles-pp.vm_unmapped_area
> > 0.12 ą 4% -0.0 0.09 ą 6% perf-profile.self.cycles-pp.mas_prev_slot
> > 0.14 ą 2% -0.0 0.12 ą 5% perf-profile.self.cycles-pp.kmem_cache_free
> > 0.10 ą 5% -0.0 0.07 ą 6% perf-profile.self.cycles-pp.open64
> > 0.15 ą 2% -0.0 0.13 ą 5% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
> > 0.15 ą 2% -0.0 0.13 ą 4% perf-profile.self.cycles-pp.ioctl
> > 0.09 ą 5% -0.0 0.07 ą 8% perf-profile.self.cycles-pp.write
> > 0.07 ą 6% -0.0 0.06 perf-profile.self.cycles-pp.__close
> > 0.11 ą 4% +0.0 0.13 ą 4% perf-profile.self.cycles-pp.link_path_walk
> > 0.01 ą200% +0.0 0.06 ą 9% perf-profile.self.cycles-pp.__virt_addr_valid
> > 0.75 ą 2% +0.1 0.89 ą 3% perf-profile.self.cycles-pp._raw_spin_lock
> > 0.00 +2.8 2.79 ą 5% perf-profile.self.cycles-pp.fsnotify_open_perm
> > 0.05 +5.6 5.63 ą 10% perf-profile.self.cycles-pp.rw_verify_area
> >
> >
> > ***************************************************************************************************
> > lkp-csl-d02: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
> > =========================================================================================
> > compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
> > gcc-13/performance/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/lkp-csl-d02/fsbuffer-r/unixbench
> >
> > commit:
> > 477cf917dd ("fsnotify: use an enum for group priority constants")
> > a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
> >
> > 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 1339661 +6.4% 1425877 unixbench.throughput
> > 5.765e+08 +6.4% 6.131e+08 unixbench.workload
> > 1.159e+09 +2.2% 1.184e+09 perf-stat.i.branch-instructions
> > 1.49 +0.0 1.54 perf-stat.i.branch-miss-rate%
> > 10449249 ą 2% +6.7% 11149426 perf-stat.i.branch-misses
> > 4514 -5.3% 4273 perf-stat.overall.path-length
> > 1.156e+09 +2.2% 1.181e+09 perf-stat.ps.branch-instructions
> > 10430168 ą 2% +6.7% 11128869 perf-stat.ps.branch-misses
> > 7.02 ą 2% -3.3 3.70 ą 3% perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 1.45 ą 3% +0.2 1.62 ą 3% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.read
> > 1.24 ą 3% +0.2 1.44 ą 3% perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.filemap_read.vfs_read
> > 2.55 ą 8% +0.4 2.91 ą 4% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
> > 3.04 ą 6% +0.4 3.44 ą 3% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
> > 1.94 ą 9% +0.5 2.42 ą 3% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> > 8.62 ą 3% +0.5 9.14 perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.vfs_read.ksys_read.do_syscall_64
> > 7.90 ą 2% +0.6 8.51 perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read.ksys_read
> > 9.29 ą 2% +0.8 10.04 perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.vfs_read.ksys_read.do_syscall_64
> > 4.43 ą 7% +0.8 5.28 ą 2% perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read
> > 29.04 ą 3% +1.8 30.80 perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 7.06 ą 2% -3.3 3.73 ą 3% perf-profile.children.cycles-pp.__fsnotify_parent
> > 0.77 ą 6% +0.1 0.88 ą 7% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
> > 1.26 ą 2% +0.2 1.45 ą 3% perf-profile.children.cycles-pp.current_time
> > 1.66 ą 3% +0.2 1.90 ą 3% perf-profile.children.cycles-pp.syscall_return_via_sysret
> > 3.72 ą 2% +0.3 4.03 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> > 2.56 ą 7% +0.4 2.91 ą 4% perf-profile.children.cycles-pp.apparmor_file_permission
> > 5.72 ą 2% +0.4 6.08 perf-profile.children.cycles-pp.entry_SYSCALL_64
> > 4.40 ą 4% +0.4 4.81 ą 2% perf-profile.children.cycles-pp.rep_movs_alternative
> > 3.10 ą 6% +0.4 3.52 ą 3% perf-profile.children.cycles-pp.security_file_permission
> > 1.94 ą 9% +0.5 2.42 ą 3% perf-profile.children.cycles-pp.__fdget_pos
> > 8.68 ą 3% +0.5 9.20 perf-profile.children.cycles-pp.filemap_get_pages
> > 8.37 ą 2% +0.7 9.05 perf-profile.children.cycles-pp._copy_to_iter
> > 9.52 ą 2% +0.8 10.28 perf-profile.children.cycles-pp.copy_page_to_iter
> > 29.25 ą 3% +1.7 30.99 perf-profile.children.cycles-pp.filemap_read
> > 6.94 -3.2 3.72 ą 3% perf-profile.self.cycles-pp.__fsnotify_parent
> > 0.77 ą 6% +0.1 0.88 ą 7% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
> > 0.83 ą 5% +0.1 0.97 ą 7% perf-profile.self.cycles-pp.current_time
> > 1.66 ą 3% +0.2 1.90 ą 3% perf-profile.self.cycles-pp.syscall_return_via_sysret
> > 3.52 ą 2% +0.2 3.76 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> > 2.42 ą 3% +0.3 2.67 ą 3% perf-profile.self.cycles-pp.entry_SYSCALL_64
> > 1.92 ą 6% +0.3 2.20 ą 5% perf-profile.self.cycles-pp.apparmor_file_permission
> > 3.92 ą 4% +0.3 4.25 ą 2% perf-profile.self.cycles-pp.rep_movs_alternative
> > 4.38 +0.3 4.72 ą 2% perf-profile.self.cycles-pp._copy_to_iter
> > 1.16 ą 8% +0.3 1.51 ą 2% perf-profile.self.cycles-pp.ksys_read
> > 1.85 ą 10% +0.5 2.36 ą 2% perf-profile.self.cycles-pp.__fdget_pos
> >
> >
> >
> > ***************************************************************************************************
> > lkp-csl-d02: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
> > =========================================================================================
> > compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
> > gcc-13/performance/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/lkp-csl-d02/fstime-r/unixbench
> >
> > commit:
> > 477cf917dd ("fsnotify: use an enum for group priority constants")
> > a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
> >
> > 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 4709035 +5.8% 4980152 unixbench.throughput
> > 2.026e+09 +5.7% 2.141e+09 unixbench.workload
> > 1.034e+09 +1.4% 1.048e+09 perf-stat.i.branch-instructions
> > 1.56 +0.0 1.59 perf-stat.i.branch-miss-rate%
> > 60950726 +5.3% 64193405 perf-stat.i.cache-references
> > 0.02 ą 30% -36.7% 0.01 ą 39% perf-stat.i.major-faults
> > 0.78 -0.0 0.75 perf-stat.overall.cache-miss-rate%
> > 1145 -5.4% 1083 perf-stat.overall.path-length
> > 1.031e+09 +1.4% 1.046e+09 perf-stat.ps.branch-instructions
> > 60812120 +5.3% 64047513 perf-stat.ps.cache-references
> > 0.02 ą 30% -36.7% 0.01 ą 39% perf-stat.ps.major-faults
> > 6.22 ą 3% -2.9 3.30 ą 3% perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 49.43 -1.5 47.90 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> > 52.39 -1.0 51.34 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> > 55.16 -0.9 54.29 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> > 56.49 -0.7 55.80 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
> > 2.40 ą 4% +0.2 2.64 ą 5% perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_read.vfs_read.ksys_read
> > 2.59 ą 4% +0.3 2.86 ą 5% perf-profile.calltrace.cycles-pp.touch_atime.filemap_read.vfs_read.ksys_read.do_syscall_64
> > 6.88 +0.3 7.23 ą 2% perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.vfs_read.ksys_read
> > 2.26 ą 3% +0.4 2.64 ą 10% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
> > 7.90 ą 3% +0.4 8.29 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read
> > 2.68 ą 2% +0.4 3.13 ą 8% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
> > 8.47 +0.4 8.91 perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.vfs_read.ksys_read.do_syscall_64
> > 32.80 +1.8 34.63 perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 6.27 ą 3% -2.9 3.34 ą 3% perf-profile.children.cycles-pp.__fsnotify_parent
> > 49.50 -1.4 48.07 perf-profile.children.cycles-pp.vfs_read
> > 52.46 -1.0 51.45 perf-profile.children.cycles-pp.ksys_read
> > 1.16 ą 4% +0.1 1.28 ą 4% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
> > 2.46 ą 4% +0.2 2.69 ą 6% perf-profile.children.cycles-pp.atime_needs_update
> > 5.03 ą 3% +0.3 5.30 perf-profile.children.cycles-pp.entry_SYSCALL_64
> > 2.66 ą 4% +0.3 2.94 ą 6% perf-profile.children.cycles-pp.touch_atime
> > 3.27 ą 2% +0.3 3.59 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> > 6.96 +0.4 7.31 ą 2% perf-profile.children.cycles-pp.filemap_get_read_batch
> > 2.27 ą 3% +0.4 2.64 ą 10% perf-profile.children.cycles-pp.apparmor_file_permission
> > 2.76 ą 2% +0.4 3.20 ą 7% perf-profile.children.cycles-pp.security_file_permission
> > 8.52 +0.5 8.98 perf-profile.children.cycles-pp.filemap_get_pages
> > 32.99 +1.8 34.80 perf-profile.children.cycles-pp.filemap_read
> > 6.16 ą 3% -2.8 3.32 ą 3% perf-profile.self.cycles-pp.__fsnotify_parent
> > 1.19 ą 3% -0.4 0.81 ą 6% perf-profile.self.cycles-pp.rw_verify_area
> > 1.55 ą 3% +0.1 1.64 ą 2% perf-profile.self.cycles-pp.filemap_get_pages
> > 0.70 ą 3% +0.1 0.81 ą 7% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
> > 1.31 ą 4% +0.1 1.43 ą 4% perf-profile.self.cycles-pp.do_syscall_64
> > 2.15 ą 4% +0.1 2.28 perf-profile.self.cycles-pp.entry_SYSCALL_64
> > 4.00 ą 2% +0.2 4.22 perf-profile.self.cycles-pp.read
> > 1.06 ą 4% +0.3 1.31 ą 5% perf-profile.self.cycles-pp.ksys_read
> > 3.09 ą 2% +0.3 3.36 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> > 3.89 ą 2% +0.3 4.19 ą 3% perf-profile.self.cycles-pp._copy_to_iter
> > 1.66 ą 2% +0.3 2.01 ą 13% perf-profile.self.cycles-pp.apparmor_file_permission
> >
> >
> >
> >
> >
> > Disclaimer:
> > Results have been estimated based on internal Intel analysis and are provided
> > for informational purposes only. Any difference in system hardware or software
> > design or configuration may affect actual performance.
> >
> >
> > --
> > 0-DAY CI Kernel Test Service
> > https://github.com/intel/lkp-tests/wiki
> >
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [linux-next:master] [fsnotify] a5e57b4d37: stress-ng.full.ops_per_sec -17.3% regression
2024-04-11 11:54 ` Jan Kara
@ 2024-04-11 16:22 ` Amir Goldstein
2024-04-12 15:04 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Amir Goldstein @ 2024-04-11 16:22 UTC (permalink / raw)
To: Jan Kara
Cc: kernel test robot, oe-lkp, lkp, Linux Memory Management List,
linux-fsdevel, ying.huang, feng.tang, fengwei.yin
On Thu, Apr 11, 2024 at 2:54 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 11-04-24 12:23:34, Amir Goldstein wrote:
> > On Thu, Apr 11, 2024 at 4:42 AM kernel test robot <oliver.sang@intel.com> wrote:
> > > for "[amir73il:fsnotify-sbconn] [fsnotify] 629f30e073: unixbench.throughput 5.8% improvement"
> > > (https://lore.kernel.org/all/202403141505.807a722b-oliver.sang@intel.com/)
> > > you requested us to test unixbench for this commit on different branches and we
> > > observed consistent performance improvement.
> > >
> > > now we noticed this commit is merged into linux-next/master, we still
> > > observed similar unixbench improvement, however, we also captured a
> > > stress-ng regression now. below details FYI.
> > >
> > > Hello,
> > >
> > > kernel test robot noticed a -17.3% regression of stress-ng.full.ops_per_sec on:
> > >
> > >
> > > commit: a5e57b4d370c6d320e5bfb0c919fe00aee29e039 ("fsnotify: optimize the case of no permission event watchers")
> >
> > Odd. This commit does add an extra fsnotify_sb_has_priority_watchers()
> > inline check for reads and writes, but the inline helper
> > fsnotify_sb_has_watchers()
> > already exists in fsnotify_parent() and it already accesses fsnotify_sb_info.
> >
> > It seems like stress-ng.full does read/write/mmap operations on /dev/full,
> > so the fsnotify_sb_info object would be that of devtmpfs.
> >
> > I think that the permission events on special files are not very relevant,
> > but I am not sure.
> >
> > Jan, any ideas?
>
> So I'm not 100% sure but this load simply seems to run 'stress-ng' with all
> the syscalls it is able to exercise (one per CPU if I'm right). Hum...
> looking at perf numbers I've noticed changes like:
>
> 0.43 ± 3% -0.2 0.21 ± 5% perf-profile.self.cycles-pp.__fsnotify_parent
> 0.00 +2.8 2.79 ± 5% perf-profile.self.cycles-pp.fsnotify_open_perm
>
> or
>
> 1.77 ± 12% +1.9 3.64 ± 8% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.71 ± 15% +1.9 3.64 ± 9% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +2.8 2.79 ± 5% perf-profile.calltrace.cycles-pp.fsnotify_open_perm.do_dentry_open.do_open.path_openat.do_filp_open
>
> So the savings in __fsnotify_parent() don't really outweigh the costs in
> fsnotify_file()... I can see stress-ng exercises also inotify so maybe
> there's some contention on the counters which is causing the regression now
> that we have more of them?
>
> BTW, I'm not sure how you've arrived at the conclusion that the test is using
> /dev/full. For all I can tell, e.g. the stress-mmap test is using a file
> in a subdir of CWD.
>
Oh, I just saw the file stress-full.c in stress-ng and wrongly assumed that
test stress-ng.full refers to this code.
Where do I find the code for this test?
Thanks,
Amir.
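
For readers following the quoted discussion of the inline watcher checks and the per-superblock counters, here is a minimal userspace sketch of that kind of check. It is not the kernel's code: `struct sb_info`, the `PRIO_*` names, and both helpers are hypothetical stand-ins, and the layout (one counter per priority class) is an assumption made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical priority classes, lowest first (names are illustrative). */
enum prio { PRIO_NORMAL, PRIO_CONTENT, PRIO_PRE_CONTENT, PRIO_NUM };

/* Hypothetical stand-in for per-superblock notification state. */
struct sb_info {
    atomic_long watched[PRIO_NUM]; /* watched objects per priority class */
};

/* True if at least one watcher exists at priority `prio` or higher. */
static inline bool sb_has_priority_watchers(struct sb_info *sbi, int prio)
{
    for (int p = prio; p < PRIO_NUM; p++) {
        if (atomic_load_explicit(&sbi->watched[p], memory_order_relaxed) > 0)
            return true;
    }
    return false;
}

/* True if at least one watcher of any priority exists. */
static inline bool sb_has_watchers(struct sb_info *sbi)
{
    return sb_has_priority_watchers(sbi, PRIO_NORMAL);
}
```

In this model a hot path calls `sb_has_priority_watchers(sbi, PRIO_CONTENT)` before doing any permission-event work. The check only reads counters, so any extra cost would have to come from touching those counters (for example, cache lines shared with writers), which is the contention question raised in the quoted text.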
* Re: [linux-next:master] [fsnotify] a5e57b4d37: stress-ng.full.ops_per_sec -17.3% regression
2024-04-11 16:22 ` Amir Goldstein
@ 2024-04-12 15:04 ` Jan Kara
0 siblings, 0 replies; 5+ messages in thread
From: Jan Kara @ 2024-04-12 15:04 UTC
To: Amir Goldstein
Cc: Jan Kara, kernel test robot, oe-lkp, lkp,
Linux Memory Management List, linux-fsdevel, ying.huang,
feng.tang, fengwei.yin
On Thu 11-04-24 19:22:29, Amir Goldstein wrote:
> On Thu, Apr 11, 2024 at 2:54 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Thu 11-04-24 12:23:34, Amir Goldstein wrote:
> > > On Thu, Apr 11, 2024 at 4:42 AM kernel test robot <oliver.sang@intel.com> wrote:
> > > > for "[amir73il:fsnotify-sbconn] [fsnotify] 629f30e073: unixbench.throughput 5.8% improvement"
> > > > (https://lore.kernel.org/all/202403141505.807a722b-oliver.sang@intel.com/)
> > > > you requested us to test unixbench for this commit on different branches and we
> > > > observed consistent performance improvement.
> > > >
> > > > now we noticed this commit is merged into linux-next/master, we still
> > > > observed similar unixbench improvement, however, we also captured a
> > > > stress-ng regression now. below details FYI.
> > > >
> > > > Hello,
> > > >
> > > > kernel test robot noticed a -17.3% regression of stress-ng.full.ops_per_sec on:
> > > >
> > > >
> > > > commit: a5e57b4d370c6d320e5bfb0c919fe00aee29e039 ("fsnotify: optimize the case of no permission event watchers")
> > >
> > > Odd. This commit does add an extra fsnotify_sb_has_priority_watchers()
> > > inline check for reads and writes, but the inline helper
> > > fsnotify_sb_has_watchers()
> > > already exists in fsnotify_parent() and it already accesses fsnotify_sb_info.
> > >
> > > It seems like stress-ng.full does read/write/mmap operations on /dev/full,
> > > so the fsnotify_sb_info object would be that of devtmpfs.
> > >
> > > I think that the permission events on special files are not very relevant,
> > > but I am not sure.
> > >
> > > Jan, any ideas?
> >
> > So I'm not 100% sure but this load simply seems to run 'stress-ng' with all
> > the syscalls it is able to exercise (one per CPU if I'm right). Hum...
> > looking at perf numbers I've noticed changes like:
> >
> > 0.43 ± 3% -0.2 0.21 ± 5% perf-profile.self.cycles-pp.__fsnotify_parent
> > 0.00 +2.8 2.79 ± 5% perf-profile.self.cycles-pp.fsnotify_open_perm
> >
> > or
> >
> > 1.77 ± 12% +1.9 3.64 ± 8% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 1.71 ± 15% +1.9 3.64 ± 9% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 0.00 +2.8 2.79 ± 5% perf-profile.calltrace.cycles-pp.fsnotify_open_perm.do_dentry_open.do_open.path_openat.do_filp_open
> >
> > So the savings in __fsnotify_parent() don't really outweigh the costs in
> > fsnotify_file()... I can see stress-ng exercises also inotify so maybe
> > there's some contention on the counters which is causing the regression now
> > that we have more of them?
> >
> > BTW, I'm not sure how you've arrived at the conclusion that the test is using
> > /dev/full. For all I can tell, e.g. the stress-mmap test is using a file
> > in a subdir of CWD.
> >
>
> Oh, I just saw the file stress-full.c in stress-ng and wrongly assumed that
> test stress-ng.full refers to this code.
>
> Where do I find the code for this test?
Ah, now that I've investigated the LKP details again, you're indeed right.
The repro-script shows how stress-ng is run, and when I do that with a cloned
stress-ng repository, it is the test using /dev/full.
So with that, I'm not sure why the patch adds so much cost to fsnotify_file()...
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
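
As a rough illustration of the kind of syscalls a /dev/full workload issues, the sketch below opens the device, reads from it, and attempts a write. It is not stress-ng's code; the function name `exercise_dev_full` is hypothetical. The only behaviour relied on is the documented one for /dev/full: reads return NUL bytes and writes fail with ENOSPC.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Exercise /dev/full once: open, read, write, close.
 * Returns 0 if the device behaved as documented, -1 otherwise
 * (including when the device cannot be opened).
 */
int exercise_dev_full(void)
{
    char buf[64];
    int ok;
    ssize_t n;
    int fd = open("/dev/full", O_RDWR);

    if (fd < 0)
        return -1;

    /* Pre-fill the buffer so the NUL check below is meaningful. */
    memset(buf, 0xff, sizeof(buf));

    /* Reads from /dev/full succeed and return NUL bytes. */
    n = read(fd, buf, sizeof(buf));
    ok = (n == (ssize_t)sizeof(buf));
    for (ssize_t i = 0; ok && i < n; i++)
        ok = (buf[i] == 0);

    /* Writes to /dev/full always fail with ENOSPC. */
    errno = 0;
    if (write(fd, buf, sizeof(buf)) != -1 || errno != ENOSPC)
        ok = 0;

    close(fd);
    return ok ? 0 : -1;
}
```

Calling this in a tight loop goes through open and read on every iteration, the same paths that appear in the quoted profile (`fsnotify_open_perm` under `do_dentry_open`, `rw_verify_area` under `vfs_read`), so a small per-call cost there would be multiplied many times over.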