Greeting,

FYI, we noticed an 8.4% improvement of stress-ng.sendfile.ops_per_sec due to commit:

commit: 3eb3c59b128509a5e8a8349dafced64b9769438e ("splice: Do splice read from a file without using ITER_PIPE")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: stress-ng
on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
with following parameters:

	nr_threads: 100%
	testtime: 60s
	class: pipe
	test: sendfile
	cpufreq_governor: performance

In addition to that, the commit also has significant impact on the following tests:

+------------------+------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.sendfile.ops_per_sec 11.2% improvement                              |
| test machine     | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480+ (Sapphire Rapids) with 256G memory |
| test parameters  | class=pipe                                                                               |
|                  | cpufreq_governor=performance                                                             |
|                  | nr_threads=100%                                                                          |
|                  | test=sendfile                                                                            |
|                  | testtime=60s                                                                             |
+------------------+------------------------------------------------------------------------------------------+

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.
=========================================================================================
class/compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  pipe/gcc-11/performance/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp6/sendfile/stress-ng/60s

commit:
  82ab8404c9 ("tty, proc, kernfs, random: Use direct_splice_read()")
  3eb3c59b12 ("splice: Do splice read from a file without using ITER_PIPE")

82ab8404c910d4ab 3eb3c59b128509a5e8a8349dafc
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     36216            +8.4%      39262        stress-ng.sendfile.MB_per_sec_sent_to_/dev/null
  68875003            +9.2%   75213668        stress-ng.sendfile.ops
   1156844            +8.4%    1253553        stress-ng.sendfile.ops_per_sec
 1.015e+11            -2.2%  9.931e+10        perf-stat.i.branch-instructions
      0.22            -0.1        0.13        perf-stat.i.branch-miss-rate%
   1.3e+08           -78.2%   28315322        perf-stat.i.branch-misses
      0.76            +3.3%       0.78        perf-stat.i.cpi
  1.47e+11            -7.9%  1.353e+11        perf-stat.i.dTLB-loads
      0.00 ±  5%      -0.0        0.00        perf-stat.i.dTLB-store-miss-rate%
    223138 ± 34%     -77.1%      51179 ±  4%  perf-stat.i.dTLB-store-misses
 7.976e+10           -12.6%  6.967e+10        perf-stat.i.dTLB-stores
      1.33            -3.8%       1.28        perf-stat.i.ipc
      1.13            -8.0%       1.04        perf-stat.i.metric.G/sec
      1433            -6.7%       1337        perf-stat.i.metric.M/sec
      0.13            -0.1        0.02 ± 70%  perf-stat.overall.branch-miss-rate%
      0.00 ± 35%      -0.0        0.00 ± 70%  perf-stat.overall.dTLB-store-miss-rate%
      1.33           -35.6%       0.86 ± 70%  perf-stat.overall.ipc
 9.991e+10           -34.9%  6.505e+10 ± 70%  perf-stat.ps.branch-instructions
 1.279e+08           -85.5%   18523463 ± 70%  perf-stat.ps.branch-misses
 1.447e+11           -38.7%  8.866e+10 ± 70%  perf-stat.ps.dTLB-loads
    220282 ± 33%     -84.3%      34605 ± 70%  perf-stat.ps.dTLB-store-misses
  7.85e+10           -41.9%  4.564e+10 ± 70%  perf-stat.ps.dTLB-stores
 3.136e+13           -34.9%  2.041e+13 ± 70%  perf-stat.total.instructions
     73.35           -73.4        0.00        perf-profile.calltrace.cycles-pp.generic_file_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
     69.91           -69.9        0.00        perf-profile.calltrace.cycles-pp.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
     26.95           -27.0        0.00        perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct
     25.81 ±  2%     -25.8        0.00        perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.generic_file_splice_read.splice_direct_to_actor
     23.20           -23.2        0.00        perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct
      5.56            -5.6        0.00        perf-profile.calltrace.cycles-pp.sanity.copy_page_to_iter.filemap_read.generic_file_splice_read.splice_direct_to_actor
     99.21            -0.1       99.15        perf-profile.calltrace.cycles-pp.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64.do_syscall_64
     99.33            -0.1       99.27        perf-profile.calltrace.cycles-pp.do_splice_direct.do_sendfile.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe
     99.44            -0.0       99.40        perf-profile.calltrace.cycles-pp.do_sendfile.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendfile
     99.53            -0.0       99.49        perf-profile.calltrace.cycles-pp.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendfile
     99.55            -0.0       99.52        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendfile
      1.80            +0.1        1.86        perf-profile.calltrace.cycles-pp.page_cache_pipe_buf_confirm.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor
      0.60            +0.1        0.67        perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_splice_read.splice_direct_to_actor.do_splice_direct
      0.82            +0.1        0.92        perf-profile.calltrace.cycles-pp.security_file_permission.vfs_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      2.12            +0.2        2.28        perf-profile.calltrace.cycles-pp.vfs_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
      0.00            +0.5        0.51        perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.filemap_splice_read.splice_direct_to_actor
      0.00            +0.6        0.63        perf-profile.calltrace.cycles-pp.__might_resched.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
     10.48            +1.0       11.52        perf-profile.calltrace.cycles-pp.page_cache_pipe_buf_release.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor
      0.00            +1.2        1.20        perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_splice_read.splice_direct_to_actor.do_splice_direct
      0.00            +1.3        1.34        perf-profile.calltrace.cycles-pp.xas_load.filemap_get_read_batch.filemap_get_pages.filemap_splice_read.splice_direct_to_actor
     23.28            +1.6       24.84        perf-profile.calltrace.cycles-pp.direct_splice_actor.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
      0.00            +1.6        1.56        perf-profile.calltrace.cycles-pp.touch_atime.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
     22.93            +1.6       24.52        perf-profile.calltrace.cycles-pp.splice_from_pipe.direct_splice_actor.splice_direct_to_actor.do_splice_direct.do_sendfile
     20.74            +1.9       22.66        perf-profile.calltrace.cycles-pp.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor.do_splice_direct
      0.00            +2.2        2.15        perf-profile.calltrace.cycles-pp.folio_mark_accessed.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00           +13.8       13.84        perf-profile.calltrace.cycles-pp.release_pages.__pagevec_release.filemap_splice_read.splice_direct_to_actor.do_splice_direct
      0.00           +14.6       14.58        perf-profile.calltrace.cycles-pp.__pagevec_release.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00           +18.2       18.20        perf-profile.calltrace.cycles-pp.splice_folio_into_pipe.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00           +27.5       27.51        perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_splice_read.splice_direct_to_actor.do_splice_direct
      0.00           +28.8       28.78        perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00           +70.1       70.08        perf-profile.calltrace.cycles-pp.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
     73.51           -73.4        0.15 ±  3%  perf-profile.children.cycles-pp.generic_file_splice_read
     70.87           -70.9        0.00        perf-profile.children.cycles-pp.filemap_read
     23.73           -23.7        0.00        perf-profile.children.cycles-pp.copy_page_to_iter
      5.09            -5.1        0.00        perf-profile.children.cycles-pp.sanity
      3.00            -1.3        1.66        perf-profile.children.cycles-pp.touch_atime
      2.48 ±  2%      -1.1        1.36        perf-profile.children.cycles-pp.atime_needs_update
      1.11 ±  2%      -0.5        0.59        perf-profile.children.cycles-pp.current_time
      1.32            -0.2        1.08        perf-profile.children.cycles-pp.__might_resched
      2.75            -0.2        2.58        perf-profile.children.cycles-pp.folio_mark_accessed
      0.34 ±  7%      -0.2        0.19        perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
      0.26            -0.1        0.15 ±  2%  perf-profile.children.cycles-pp.make_vfsgid
      0.24 ±  2%      -0.1        0.15        perf-profile.children.cycles-pp.make_vfsuid
     99.33            -0.1       99.27        perf-profile.children.cycles-pp.do_splice_direct
     99.30            -0.1       99.25        perf-profile.children.cycles-pp.splice_direct_to_actor
      1.58            -0.0        1.53        perf-profile.children.cycles-pp.xas_load
      0.50            -0.0        0.45        perf-profile.children.cycles-pp.xas_start
     99.45            -0.0       99.41        perf-profile.children.cycles-pp.do_sendfile
     99.60            -0.0       99.56        perf-profile.children.cycles-pp.do_syscall_64
     99.53            -0.0       99.50        perf-profile.children.cycles-pp.__x64_sys_sendfile64
     99.69            -0.0       99.66        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.25            +0.0        0.26        perf-profile.children.cycles-pp.splice_from_pipe_next
      0.21 ±  3%      +0.0        0.23 ±  3%  perf-profile.children.cycles-pp.stress_sendfile
      0.17 ±  2%      +0.0        0.19 ±  2%  perf-profile.children.cycles-pp.rw_verify_area
      0.32 ±  2%      +0.0        0.34        perf-profile.children.cycles-pp.rcu_all_qs
      0.32            +0.0        0.36 ±  2%  perf-profile.children.cycles-pp.aa_file_perm
      0.25            +0.0        0.29        perf-profile.children.cycles-pp.__get_task_ioprio
      0.94            +0.1        1.01        perf-profile.children.cycles-pp.pipe_to_null
      2.22            +0.1        2.30        perf-profile.children.cycles-pp.page_cache_pipe_buf_confirm
      0.70            +0.1        0.78        perf-profile.children.cycles-pp.apparmor_file_permission
      0.78            +0.1        0.87        perf-profile.children.cycles-pp.__cond_resched
      0.93            +0.1        1.04        perf-profile.children.cycles-pp.security_file_permission
      2.19            +0.2        2.36        perf-profile.children.cycles-pp.vfs_splice_read
      0.00            +0.2        0.22 ±  2%  perf-profile.children.cycles-pp.mlock_drain_local
      0.00            +0.2        0.23        perf-profile.children.cycles-pp.free_unref_page_list
      0.00            +0.3        0.26        perf-profile.children.cycles-pp.lru_add_drain_cpu
      0.00            +0.4        0.45        perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
     10.45            +1.0       11.46        perf-profile.children.cycles-pp.page_cache_pipe_buf_release
     21.90            +1.5       23.40        perf-profile.children.cycles-pp.__splice_from_pipe
     23.34            +1.6       24.90        perf-profile.children.cycles-pp.direct_splice_actor
     23.10            +1.6       24.70        perf-profile.children.cycles-pp.splice_from_pipe
     25.98            +1.7       27.66        perf-profile.children.cycles-pp.filemap_get_read_batch
     27.08            +1.8       28.91        perf-profile.children.cycles-pp.filemap_get_pages
      0.00           +14.0       14.03        perf-profile.children.cycles-pp.release_pages
      0.00           +14.8       14.80        perf-profile.children.cycles-pp.__pagevec_release
      0.00           +18.0       18.00        perf-profile.children.cycles-pp.splice_folio_into_pipe
      0.00           +71.6       71.62        perf-profile.children.cycles-pp.filemap_splice_read
     19.07           -19.1        0.00        perf-profile.self.cycles-pp.copy_page_to_iter
     15.48           -15.5        0.00        perf-profile.self.cycles-pp.filemap_read
      1.00            -0.4        0.55        perf-profile.self.cycles-pp.atime_needs_update
      0.55            -0.4        0.11        perf-profile.self.cycles-pp.generic_file_splice_read
      0.76            -0.4        0.40        perf-profile.self.cycles-pp.current_time
      0.52            -0.2        0.27        perf-profile.self.cycles-pp.touch_atime
      1.19            -0.2        0.94        perf-profile.self.cycles-pp.__might_resched
      2.31            -0.2        2.14        perf-profile.self.cycles-pp.folio_mark_accessed
      0.27 ±  9%      -0.1        0.15        perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
      0.19 ±  2%      -0.1        0.11        perf-profile.self.cycles-pp.make_vfsgid
      0.18 ±  3%      -0.1        0.11 ±  3%  perf-profile.self.cycles-pp.make_vfsuid
      0.21            -0.0        0.17        perf-profile.self.cycles-pp.direct_splice_actor
      0.36            -0.0        0.33        perf-profile.self.cycles-pp.xas_start
      0.14 ±  2%      +0.0        0.16 ±  3%  perf-profile.self.cycles-pp.rw_verify_area
      0.22            +0.0        0.24        perf-profile.self.cycles-pp.__get_task_ioprio
      0.20 ±  3%      +0.0        0.22        perf-profile.self.cycles-pp.rcu_all_qs
      0.20            +0.0        0.22 ±  3%  perf-profile.self.cycles-pp.stress_sendfile
      0.24 ±  2%      +0.0        0.26 ±  2%  perf-profile.self.cycles-pp.security_file_permission
      0.29            +0.0        0.32        perf-profile.self.cycles-pp.aa_file_perm
      0.48            +0.0        0.51        perf-profile.self.cycles-pp.pipe_to_null
      0.37            +0.0        0.42        perf-profile.self.cycles-pp.apparmor_file_permission
      0.49            +0.1        0.54        perf-profile.self.cycles-pp.__cond_resched
      1.76            +0.1        1.83        perf-profile.self.cycles-pp.page_cache_pipe_buf_confirm
      1.11            +0.1        1.20        perf-profile.self.cycles-pp.splice_from_pipe
      1.10            +0.1        1.24        perf-profile.self.cycles-pp.filemap_get_pages
      0.00            +0.1        0.15 ±  2%  perf-profile.self.cycles-pp.mlock_drain_local
      0.00            +0.2        0.17        perf-profile.self.cycles-pp.free_unref_page_list
      0.00            +0.2        0.24 ±  2%  perf-profile.self.cycles-pp.lru_add_drain_cpu
      8.88            +0.4        9.25        perf-profile.self.cycles-pp.__splice_from_pipe
      0.00            +0.4        0.38        perf-profile.self.cycles-pp.__mem_cgroup_uncharge_list
      0.00            +0.4        0.39        perf-profile.self.cycles-pp.__pagevec_release
      9.87            +1.0       10.85        perf-profile.self.cycles-pp.page_cache_pipe_buf_release
     24.08 ±  2%      +1.7       25.80        perf-profile.self.cycles-pp.filemap_get_read_batch
      0.00            +5.6        5.58        perf-profile.self.cycles-pp.filemap_splice_read
      0.00           +13.4       13.36        perf-profile.self.cycles-pp.release_pages
      0.00           +17.3       17.26        perf-profile.self.cycles-pp.splice_folio_into_pipe


***************************************************************************************************
lkp-spr-2sp1: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480+ (Sapphire Rapids) with 256G memory

=========================================================================================
class/compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  pipe/gcc-11/performance/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/lkp-spr-2sp1/sendfile/stress-ng/60s

commit:
  82ab8404c9 ("tty, proc, kernfs, random: Use direct_splice_read()")
  3eb3c59b12 ("splice: Do splice read from a file without using ITER_PIPE")

82ab8404c910d4ab 3eb3c59b128509a5e8a8349dafc
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     33210 ±  5%     +11.5%      37023 ±  5%  stress-ng.sendfile.MB_per_sec_sent_to_/dev/null
 1.122e+08 ±  4%     +11.2%  1.247e+08 ±  4%  stress-ng.sendfile.ops
   1870141 ±  4%     +11.2%    2078774 ±  4%  stress-ng.sendfile.ops_per_sec
     47.80 ± 12%     +12.3%      53.69 ± 10%  stress-ng.time.user_time
     52.98            -0.6%      52.65        boot-time.boot
     10781            -0.8%      10698        boot-time.idle
     60353 ±  5%     -19.9%      48332 ±  7%  meminfo.Active
     60295 ±  5%     -19.9%      48268 ±  7%  meminfo.Active(anon)
    489.67 ± 64%     -89.9%      49.33 ±  5%  turbostat.C1
      0.88 ±  4%      -6.0%       0.83 ±  4%  turbostat.IPC
    667.43            +1.6%     678.03        turbostat.PkgWatt
      1183 ±  2%      -5.0%       1123 ±  2%  proc-vmstat.direct_map_level2_splits
     15019 ±  6%     -19.7%      12064 ±  7%  proc-vmstat.nr_active_anon
     15019 ±  6%     -19.7%      12064 ±  7%  proc-vmstat.nr_zone_active_anon
      8704 ± 12%     -87.0%       1127 ± 93%  proc-vmstat.numa_hint_faults
      4349 ± 71%     -81.9%     788.00 ±141%  proc-vmstat.numa_hint_faults_local
      2625 ±  5%     -15.4%       2221 ±  5%  proc-vmstat.pgactivate
    520605            -0.8%     516293        proc-vmstat.pgfault
    483584 ± 14%     +11.4%     538624        proc-vmstat.unevictable_pgs_scanned
     38.60 ± 19%     -31.4%      26.47 ± 31%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
     38.60 ± 19%     -31.5%      26.46 ± 31%  sched_debug.cfs_rq:/.removed.util_avg.stddev
   1203822 ±  5%      +8.8%    1310010 ±  2%  sched_debug.cpu.avg_idle.max
    212000 ±  8%     -19.8%     170034 ± 16%  sched_debug.cpu.avg_idle.min
    118019 ±  4%     +13.4%     133830 ±  3%  sched_debug.cpu.avg_idle.stddev
      7280            -6.7%       6792 ±  4%  sched_debug.cpu.curr->pid.max
      7017 ±  7%     +35.4%       9500 ± 19%  sched_debug.cpu.max_idle_balance_cost.stddev
     48116 ± 26%     +47.4%      70902 ±  2%  sched_debug.cpu.nr_switches.max
      3889 ± 11%     +27.3%       4949 ±  2%  sched_debug.cpu.nr_switches.stddev
     20951 ±  7%     -12.6%      18321        numa-vmstat.node0.nr_kernel_stack
      2045 ± 21%     -33.2%       1365 ±  6%  numa-vmstat.node0.nr_shmem
     24416 ±  7%     -12.3%      21410 ±  3%  numa-vmstat.node0.nr_slab_reclaimable
     14788 ±  5%     -20.6%      11736 ±  7%  numa-vmstat.node1.nr_active_anon
     72983 ±  9%     +22.0%      89035 ± 10%  numa-vmstat.node1.nr_inactive_anon
     18553 ±  8%     +13.6%      21078        numa-vmstat.node1.nr_kernel_stack
      1966 ± 67%     +96.7%       3867 ±  7%  numa-vmstat.node1.nr_page_table_pages
     13840 ± 13%     +21.1%      16757 ±  4%  numa-vmstat.node1.nr_slab_reclaimable
     14788 ±  5%     -20.6%      11736 ±  7%  numa-vmstat.node1.nr_zone_active_anon
     72983 ±  9%     +22.0%      89035 ± 10%  numa-vmstat.node1.nr_zone_inactive_anon
     97667 ±  7%     -12.3%      85641 ±  3%  numa-meminfo.node0.KReclaimable
     20951 ±  7%     -12.5%      18321        numa-meminfo.node0.KernelStack
     97667 ±  7%     -12.3%      85641 ±  3%  numa-meminfo.node0.SReclaimable
      8182 ± 21%     -33.2%       5463 ±  6%  numa-meminfo.node0.Shmem
     59122 ±  5%     -20.6%      46961 ±  7%  numa-meminfo.node1.Active
     59122 ±  5%     -20.6%      46924 ±  7%  numa-meminfo.node1.Active(anon)
    291808 ±  9%     +21.8%     355513 ± 10%  numa-meminfo.node1.Inactive
    291796 ±  9%     +21.8%     355404 ± 10%  numa-meminfo.node1.Inactive(anon)
     55356 ± 13%     +21.1%      67024 ±  4%  numa-meminfo.node1.KReclaimable
     18552 ±  8%     +13.6%      21078        numa-meminfo.node1.KernelStack
      7866 ± 67%     +96.7%      15473 ±  7%  numa-meminfo.node1.PageTables
     55356 ± 13%     +21.1%      67024 ±  4%  numa-meminfo.node1.SReclaimable
      0.20            -0.1        0.11 ±  2%  perf-stat.i.branch-miss-rate%
 2.049e+08 ±  3%     -79.2%   42614103 ±  3%  perf-stat.i.branch-misses
      1.80 ±  5%      +0.5        2.29 ± 13%  perf-stat.i.cache-miss-rate%
   8357009 ± 20%     +63.8%   13689961 ± 18%  perf-stat.i.cache-misses
      0.68 ±  4%      +6.1%       0.72 ±  3%  perf-stat.i.cpi
    277.36 ±  2%      +4.9%     290.87        perf-stat.i.cpu-migrations
     79850 ± 27%     -40.5%      47537 ± 25%  perf-stat.i.cycles-between-cache-misses
   1724783 ±  7%     +30.1%    2244259 ±  7%  perf-stat.i.dTLB-load-misses
 2.399e+11 ±  3%      -6.7%  2.238e+11 ±  3%  perf-stat.i.dTLB-loads
    190678 ±  2%     +76.5%     336608 ±  2%  perf-stat.i.dTLB-store-misses
 1.301e+11 ±  3%     -11.5%  1.151e+11 ±  3%  perf-stat.i.dTLB-stores
      1.49 ±  4%      -6.1%       1.40 ±  4%  perf-stat.i.ipc
      0.11 ± 11%     +78.6%       0.19 ± 20%  perf-stat.i.major-faults
     20.02           +12.1%      22.44        perf-stat.i.metric.K/sec
      1532 ±  6%     -12.1%       1347 ±  6%  perf-stat.i.metric.M/sec
   1776926 ±  9%     +18.9%    2113532 ±  4%  perf-stat.i.node-load-misses
      1.31            +3.6%       1.35 ±  3%  perf-stat.overall.MPKI
      0.12            -0.1        0.03        perf-stat.overall.branch-miss-rate%
      0.75 ± 19%      +0.5        1.23 ± 23%  perf-stat.overall.cache-miss-rate%
      0.68 ±  4%      +6.4%       0.72 ±  4%  perf-stat.overall.cpi
     71742 ± 21%     -35.9%      46019 ± 24%  perf-stat.overall.cycles-between-cache-misses
      0.00 ±  9%      +0.0        0.00 ±  3%  perf-stat.overall.dTLB-load-miss-rate%
      0.00            +0.0        0.00        perf-stat.overall.dTLB-store-miss-rate%
      1.48 ±  4%      -6.0%       1.39 ±  4%  perf-stat.overall.ipc
 1.972e+08 ±  4%     -79.2%   41095397 ±  4%  perf-stat.ps.branch-misses
   7895746 ± 21%     +66.4%   13139497 ± 18%  perf-stat.ps.cache-misses
    257.74 ±  2%      +5.8%     272.63        perf-stat.ps.cpu-migrations
   1747228 ±  5%     +28.3%    2242162 ±  8%  perf-stat.ps.dTLB-load-misses
    180288 ±  3%     +79.3%     323194 ±  4%  perf-stat.ps.dTLB-store-misses
 1.254e+11 ±  4%     -10.9%  1.117e+11 ±  4%  perf-stat.ps.dTLB-stores
      0.10 ± 14%     +74.4%       0.17 ± 20%  perf-stat.ps.major-faults
   1686094 ± 10%     +20.5%    2031440 ±  5%  perf-stat.ps.node-load-misses
     76.50           -76.5        0.00        perf-profile.calltrace.cycles-pp.generic_file_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
     73.71 ±  2%     -73.7        0.00        perf-profile.calltrace.cycles-pp.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
     35.69 ± 11%     -35.7        0.00        perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct
     34.71 ± 12%     -34.7        0.00        perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.generic_file_splice_read.splice_direct_to_actor
     17.39 ±  5%     -17.4        0.00        perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.generic_file_splice_read.splice_direct_to_actor.do_splice_direct
      1.38 ±  8%      -0.1        1.26 ±  9%  perf-profile.calltrace.cycles-pp.page_cache_pipe_buf_confirm.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor
     99.20            -0.1       99.09        perf-profile.calltrace.cycles-pp.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64.do_syscall_64
      0.00            +1.0        0.98 ± 11%  perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_splice_read.splice_direct_to_actor.do_splice_direct
      0.00            +1.1        1.14 ± 11%  perf-profile.calltrace.cycles-pp.xas_load.filemap_get_read_batch.filemap_get_pages.filemap_splice_read.splice_direct_to_actor
      0.00            +1.2        1.23 ± 10%  perf-profile.calltrace.cycles-pp.touch_atime.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00            +1.4        1.35 ±  7%  perf-profile.calltrace.cycles-pp.folio_mark_accessed.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00           +13.6       13.57 ±  7%  perf-profile.calltrace.cycles-pp.release_pages.__pagevec_release.filemap_splice_read.splice_direct_to_actor.do_splice_direct
      0.00           +14.1       14.12 ±  7%  perf-profile.calltrace.cycles-pp.__pagevec_release.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00           +16.0       15.95 ±  7%  perf-profile.calltrace.cycles-pp.splice_folio_into_pipe.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00           +38.4       38.42 ± 11%  perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_splice_read.splice_direct_to_actor.do_splice_direct
      0.00           +39.4       39.37 ± 10%  perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00           +74.9       74.88 ±  2%  perf-profile.calltrace.cycles-pp.filemap_splice_read.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
     76.66           -76.6        0.10 ±  9%  perf-profile.children.cycles-pp.generic_file_splice_read
     74.75           -74.8        0.00        perf-profile.children.cycles-pp.filemap_read
     18.22 ±  5%     -18.2        0.00        perf-profile.children.cycles-pp.copy_page_to_iter
      2.36 ±  8%      -1.1        1.30 ± 10%  perf-profile.children.cycles-pp.touch_atime
      2.01 ±  8%      -0.9        1.12 ± 10%  perf-profile.children.cycles-pp.atime_needs_update
      2.52 ± 10%      -0.8        1.67 ±  7%  perf-profile.children.cycles-pp.folio_mark_accessed
      0.88 ±  9%      -0.4        0.47 ± 11%  perf-profile.children.cycles-pp.current_time
      1.17 ±  8%      -0.3        0.87 ±  9%  perf-profile.children.cycles-pp.__might_resched
      0.29 ± 10%      -0.2        0.14 ±  8%  perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
      0.22 ±  7%      -0.1        0.12 ± 11%  perf-profile.children.cycles-pp.make_vfsgid
      0.20 ±  9%      -0.1        0.13 ± 13%  perf-profile.children.cycles-pp.make_vfsuid
      0.22 ±  7%      +0.0        0.26 ±  9%  perf-profile.children.cycles-pp.__xas_next
      0.00            +0.1        0.07 ± 14%  perf-profile.children.cycles-pp.splice_write_null
      0.00            +0.2        0.16 ± 11%  perf-profile.children.cycles-pp.free_unref_page_list
      0.00            +0.2        0.19 ±  9%  perf-profile.children.cycles-pp.lru_add_drain_cpu
      0.00            +0.2        0.20 ± 10%  perf-profile.children.cycles-pp.mlock_drain_local
      0.00            +0.3        0.28 ±  8%  perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
      0.00           +13.7       13.71 ±  7%  perf-profile.children.cycles-pp.release_pages
      0.00           +14.3       14.31 ±  8%  perf-profile.children.cycles-pp.__pagevec_release
      0.00           +15.6       15.58 ±  6%  perf-profile.children.cycles-pp.splice_folio_into_pipe
      0.00           +76.1       76.07        perf-profile.children.cycles-pp.filemap_splice_read
     16.46 ±  7%     -16.5        0.00        perf-profile.self.cycles-pp.filemap_read
     15.06 ±  4%     -15.1        0.00        perf-profile.self.cycles-pp.copy_page_to_iter
      2.18 ± 10%      -0.8        1.41 ±  7%  perf-profile.self.cycles-pp.folio_mark_accessed
      0.81 ±  8%      -0.3        0.48 ± 10%  perf-profile.self.cycles-pp.atime_needs_update
      0.38 ±  8%      -0.3        0.07 ±  6%  perf-profile.self.cycles-pp.generic_file_splice_read
      1.03 ±  8%      -0.3        0.72 ±  9%  perf-profile.self.cycles-pp.__might_resched
      0.58 ±  8%      -0.3        0.32 ± 11%  perf-profile.self.cycles-pp.current_time
      0.30 ±  7%      -0.1        0.16 ±  8%  perf-profile.self.cycles-pp.touch_atime
      0.23 ± 10%      -0.1        0.12 ±  8%  perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
      1.37 ±  7%      -0.1        1.27 ±  8%  perf-profile.self.cycles-pp.page_cache_pipe_buf_confirm
      1.03 ±  9%      -0.1        0.94 ± 10%  perf-profile.self.cycles-pp.xas_load
      0.17 ±  7%      -0.1        0.09 ± 15%  perf-profile.self.cycles-pp.make_vfsgid
      0.16 ±  7%      -0.1        0.09 ± 10%  perf-profile.self.cycles-pp.make_vfsuid
      0.08            +0.1        0.13 ± 10%  perf-profile.self.cycles-pp.direct_splice_actor
      0.18 ±  7%      +0.1        0.25 ±  8%  perf-profile.self.cycles-pp.splice_direct_to_actor
      0.53 ±  7%      +0.1        0.64 ± 10%  perf-profile.self.cycles-pp.splice_from_pipe
      0.00            +0.1        0.12 ±  8%  perf-profile.self.cycles-pp.free_unref_page_list
      0.00            +0.1        0.12 ± 10%  perf-profile.self.cycles-pp.mlock_drain_local
      0.00            +0.2        0.18 ±  7%  perf-profile.self.cycles-pp.lru_add_drain_cpu
      0.00            +0.2        0.24 ±  7%  perf-profile.self.cycles-pp.__mem_cgroup_uncharge_list
      0.00            +0.3        0.31 ±  9%  perf-profile.self.cycles-pp.__pagevec_release
      0.00            +3.6        3.63 ±  7%  perf-profile.self.cycles-pp.filemap_splice_read
      0.00           +13.2       13.20 ±  7%  perf-profile.self.cycles-pp.release_pages
      0.00           +14.9       14.86 ±  6%  perf-profile.self.cycles-pp.splice_folio_into_pipe


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests