Hello,

kernel test robot noticed a -40.2% regression of fio.write_iops on:


commit: 40b77b3e82f7bfdfa48d8def763d33bdd49c0e4e ("btrfs: switch to the new mount API")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

testcase: fio-basic
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

        runtime: 300s
        disk: 1HDD
        fs: btrfs
        nr_task: 1
        test_size: 128G
        rw: write
        bs: 4k
        ioengine: falloc
        cpufreq_governor: performance


In addition to that, the commit also has a significant impact on the following tests:

+------------------+---------------------------------------------------------------------------------------------+
| testcase: change | fileio: fileio.write_operations/s -6.8% regression                                          |
| test machine     | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory |
| test parameters  | cpufreq_governor=performance                                                                |
|                  | disk=1SSD                                                                                   |
|                  | filenum=1024f                                                                               |
|                  | fs=btrfs                                                                                    |
|                  | iomode=sync                                                                                 |
|                  | nr_threads=100%                                                                             |
|                  | period=600s                                                                                 |
|                  | rwmode=seqrewr                                                                              |
|                  | size=64G                                                                                    |
+------------------+---------------------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot
| Closes: https://lore.kernel.org/oe-lkp/202312022313.a9606b59-oliver.sang@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231202/202312022313.a9606b59-oliver.sang@intel.com

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
  4k/gcc-12/performance/1HDD/btrfs/falloc/x86_64-rhel-8.3/1/debian-11.1-x86_64-20220510.cgz/300s/write/lkp-icl-2sp9/128G/fio-basic

commit:
  0002e97e35 ("btrfs: handle the ro->rw transition for mounting different subvolumes")
  40b77b3e82 ("btrfs: switch to the new mount API")

0002e97e35a6bc53 40b77b3e82f7bfdfa48d8def763
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      1.23           -33.8%       0.81 ± 17%  iostat.cpu.user
    148.85 ±100%   +4490.0%       6832 ± 64%  numa-meminfo.node0.Inactive(file)
 1.352e+09           +83.0%  2.473e+09 ± 24%  cpuidle..time
   1381290           +82.0%    2513835 ± 24%  cpuidle..usage
     23629 ± 29%     +59.2%      37607 ± 15%  meminfo.AnonHugePages
    615.70 ± 56%   +1027.0%       6938 ± 65%  meminfo.Inactive(file)
      0.02 ± 14%       +0.0       0.03 ± 10%  mpstat.cpu.all.soft%
      1.27 ±  2%       -0.4       0.82 ± 18%  mpstat.cpu.all.usr%
     58.46           +30.7%      76.41 ± 12%  uptime.boot
      3485           +32.4%       4613 ± 13%  uptime.idle
   1365686           +83.3%    2503432 ± 24%  turbostat.C1
   1404935           +81.8%    2554851 ± 24%  turbostat.IRQ
    241.62            -1.0%     239.14        turbostat.PkgWatt
      2.67 ±223%  +3.2e+05%       8548        numa-vmstat.node0.nr_dirtied
     37.21 ±100%   +4490.0%       1708 ± 64%  numa-vmstat.node0.nr_inactive_file
      0.00       +8.6e+105%       8562        numa-vmstat.node0.nr_written
     37.21 ±100%   +4490.0%       1708 ± 64%  numa-vmstat.node0.nr_zone_inactive_file
      0.09          +1e+06%     888.06 ± 15%  vmstat.io.bo
      2.36 ±  5%     -28.1%       1.69 ± 19%  vmstat.procs.r
      4868 ±  9%     -23.3%       3735 ± 14%  vmstat.system.cs
     62700            +1.4%      63592        vmstat.system.in
      0.04 ±  6%       +0.0       0.06 ±  3%  fio.latency_10us%
      0.01             +0.0       0.02 ± 12%  fio.latency_20us%
      0.05 ± 17%       +0.1       0.14 ±  6%  fio.latency_2us%
      0.01 ±  4%       +0.0       0.01 ± 18%  fio.latency_4us%
     19.77           +66.3%      32.87        fio.time.elapsed_time
     19.77           +66.3%      32.87        fio.time.elapsed_time.max
     10.83           +98.7%      21.51        fio.time.system_time
      2154           +64.3%       3539        fio.time.voluntary_context_switches
      6725           -40.2%       4022        fio.write_bw_MBps
    396.00           +98.0%     784.00        fio.write_clat_90%_us
    399.33           +97.7%     789.33        fio.write_clat_95%_us
    411.33           +95.8%     805.33        fio.write_clat_99%_us
    392.13           +97.9%     775.93        fio.write_clat_mean_us
    160.94 ±  3%  +14158.7%      22948 ± 61%  fio.write_clat_stddev
   1721819           -40.2%    1029637        fio.write_iops
     95506            +1.2%      96646        proc-vmstat.nr_anon_pages
      5.50 ±141%  +1.6e+05%       8550        proc-vmstat.nr_dirtied
     97087            +1.2%      98222        proc-vmstat.nr_inactive_anon
    153.92 ± 56%   +1027.0%       1734 ± 65%  proc-vmstat.nr_inactive_file
      0.17 ±223%  +5.1e+06%       8576        proc-vmstat.nr_written
     97087            +1.2%      98222        proc-vmstat.nr_zone_inactive_anon
    153.92 ± 56%   +1027.0%       1734 ± 65%  proc-vmstat.nr_zone_inactive_file
    271193           +11.9%     303408 ±  4%  proc-vmstat.numa_hit
    204981           +15.7%     237187 ±  5%  proc-vmstat.numa_local
    292107           +13.0%     330070 ±  4%  proc-vmstat.pgalloc_normal
    165913           +21.4%     201423 ±  9%  proc-vmstat.pgfault
    125209 ±  2%     +23.1%     154085 ±  9%  proc-vmstat.pgfree
      0.00       +3.5e+106%      34568        proc-vmstat.pgpgout
      5579 ±  2%     +28.9%       7193 ± 13%  proc-vmstat.pgreuse
    336512 ±  9%     +27.8%     429952 ± 17%  proc-vmstat.unevictable_pgs_scanned
      6.76 ±132%       -6.8       0.00        perf-profile.calltrace.cycles-pp.free_unref_page_commit.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range
      4.98 ±112%       -5.0       0.00        perf-profile.calltrace.cycles-pp.mtree_load.show_interrupts.seq_read_iter.proc_reg_read_iter.vfs_read
      5.42 ±101%       -3.6       1.85 ±223%  perf-profile.calltrace.cycles-pp.proc_fill_cache.proc_pid_readdir.iterate_dir.__x64_sys_getdents64.do_syscall_64
      6.34 ± 77%       -3.3       3.04 ±146%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.getdents64
      6.34 ± 77%       -3.3       3.04 ±146%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.getdents64
      6.34 ± 77%       -3.3       3.04 ±146%  perf-profile.calltrace.cycles-pp.__x64_sys_getdents64.do_syscall_64.entry_SYSCALL_64_after_hwframe.getdents64
      6.34 ± 77%       -3.3       3.04 ±146%  perf-profile.calltrace.cycles-pp.iterate_dir.__x64_sys_getdents64.do_syscall_64.entry_SYSCALL_64_after_hwframe.getdents64
      6.34 ± 77%       -3.3       3.04 ±146%  perf-profile.calltrace.cycles-pp.proc_pid_readdir.iterate_dir.__x64_sys_getdents64.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.34 ± 77%       -3.3       3.04 ±146%  perf-profile.calltrace.cycles-pp.getdents64
      6.76 ±132%       -6.8       0.00        perf-profile.children.cycles-pp.free_unref_page_commit
      4.98 ±112%       -5.0       0.00        perf-profile.children.cycles-pp.mtree_load
      4.68 ±109%       -4.7       0.00        perf-profile.children.cycles-pp.free_pcppages_bulk
      5.42 ±101%       -3.6       1.85 ±223%  perf-profile.children.cycles-pp.proc_fill_cache
      6.34 ± 77%       -3.3       3.04 ±146%  perf-profile.children.cycles-pp.__x64_sys_getdents64
      6.34 ± 77%       -3.3       3.04 ±146%  perf-profile.children.cycles-pp.iterate_dir
      6.34 ± 77%       -3.3       3.04 ±146%  perf-profile.children.cycles-pp.proc_pid_readdir
      6.34 ± 77%       -3.3       3.04 ±146%  perf-profile.children.cycles-pp.getdents64
      4.26 ±105%       -3.1       1.11 ±223%  perf-profile.children.cycles-pp.sync_regs
      4.98 ±112%       -5.0       0.00        perf-profile.self.cycles-pp.mtree_load
      4.26 ±105%       -3.1       1.11 ±223%  perf-profile.self.cycles-pp.sync_regs
      5.39 ±111%       -1.8       3.57 ±152%  perf-profile.self.cycles-pp.show_interrupts
 2.132e+09           -18.7%  1.733e+09 ± 16%  perf-stat.i.branch-instructions
      0.41 ±  3%       +0.1       0.54 ± 15%  perf-stat.i.branch-miss-rate%
  13163572 ±  3%     -15.4%   11142658 ± 17%  perf-stat.i.branch-misses
    606150 ± 10%     -31.3%     416611 ± 15%  perf-stat.i.cache-misses
   7363367 ±  2%     -32.9%    4940433 ± 11%  perf-stat.i.cache-references
      4573 ± 13%     -24.7%       3444 ± 15%  perf-stat.i.context-switches
      0.61           +32.2%       0.80 ± 49%  perf-stat.i.cpi
 6.365e+09           -14.9%   5.42e+09 ± 13%  perf-stat.i.cpu-cycles
     82.94 ±  2%     -10.1%      74.54 ±  2%  perf-stat.i.cpu-migrations
 2.724e+09           -19.0%  2.206e+09 ± 16%  perf-stat.i.dTLB-loads
     29695 ±  4%     -25.0%      22284 ± 13%  perf-stat.i.dTLB-store-misses
 1.505e+09           -19.5%  1.211e+09 ± 16%  perf-stat.i.dTLB-stores
  1.04e+10           -18.3%  8.491e+09 ± 16%  perf-stat.i.instructions
      1.67            -9.0%       1.52 ± 13%  perf-stat.i.ipc
      0.10           -14.8%       0.08 ± 13%  perf-stat.i.metric.GHz
     99.30           -19.1%      80.36 ± 16%  perf-stat.i.metric.M/sec
      3096           -15.5%       2616 ±  3%  perf-stat.i.minor-faults
     88118 ± 15%     -25.3%      65786 ± 12%  perf-stat.i.node-load-misses
     28399 ±  7%     -39.6%      17140 ± 25%  perf-stat.i.node-loads
     82979 ± 10%     -27.8%      59871 ± 10%  perf-stat.i.node-stores
      3096           -15.5%       2616 ±  3%  perf-stat.i.page-faults
      0.06 ± 10%     -14.7%       0.05 ± 12%  perf-stat.overall.MPKI
      0.61            +5.0%       0.64 ±  4%  perf-stat.overall.cpi
     10622 ± 10%     +23.2%      13081 ±  9%  perf-stat.overall.cycles-between-cache-misses
      6014 ±  2%     +49.7%       9005 ±  2%  perf-stat.overall.path-length
 2.028e+09           -16.9%  1.685e+09 ± 15%  perf-stat.ps.branch-instructions
  12533665 ±  2%     -13.6%   10833492 ± 16%  perf-stat.ps.branch-misses
    576316 ± 10%     -29.7%     405030 ± 14%  perf-stat.ps.cache-misses
   7008904 ±  2%     -31.4%    4808668 ± 10%  perf-stat.ps.cache-references
      4355 ± 13%     -23.0%       3355 ± 14%  perf-stat.ps.context-switches
     60895            +2.2%      62261        perf-stat.ps.cpu-clock
 6.055e+09           -13.0%   5.27e+09 ± 13%  perf-stat.ps.cpu-cycles
     78.84 ±  2%      -8.1%      72.48 ±  2%  perf-stat.ps.cpu-migrations
 2.591e+09           -17.3%  2.144e+09 ± 15%  perf-stat.ps.dTLB-loads
     28241 ±  4%     -23.2%      21677 ± 13%  perf-stat.ps.dTLB-store-misses
 1.431e+09           -17.8%  1.177e+09 ± 16%  perf-stat.ps.dTLB-stores
  9.89e+09           -16.6%  8.253e+09 ± 16%  perf-stat.ps.instructions
      2942           -13.6%       2543 ±  3%  perf-stat.ps.minor-faults
     83810 ± 15%     -23.7%      63968 ± 12%  perf-stat.ps.node-load-misses
     26995 ±  7%     -38.3%      16655 ± 25%  perf-stat.ps.node-loads
     78936 ± 10%     -26.2%      58237 ± 10%  perf-stat.ps.node-stores
      2943           -13.6%       2543 ±  3%  perf-stat.ps.page-faults
     60895            +2.2%      62261        perf-stat.ps.task-clock
 2.018e+11 ±  2%     +49.7%  3.022e+11 ±  2%  perf-stat.total.instructions


***************************************************************************************************
lkp-spr-2sp4: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
=========================================================================================
compiler/cpufreq_governor/disk/filenum/fs/iomode/kconfig/nr_threads/period/rootfs/rwmode/size/tbox_group/testcase:
  gcc-12/performance/1SSD/1024f/btrfs/sync/x86_64-rhel-8.3/100%/600s/debian-11.1-x86_64-20220510.cgz/seqrewr/64G/lkp-spr-2sp4/fileio

commit:
  0002e97e35 ("btrfs: handle the ro->rw transition for mounting different subvolumes")
  40b77b3e82 ("btrfs: switch to the new mount API")

0002e97e35a6bc53 40b77b3e82f7bfdfa48d8def763
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     75.27            +2.9%      77.49        iostat.cpu.idle
     24.17            -9.0%      22.00        iostat.cpu.system
   1013831            -6.6%     946721        vmstat.io.bo
     53.34            -9.1%      48.49 ±  2%  vmstat.procs.r
    457782            -7.2%     424715 ±  3%  vmstat.system.cs
    314745 ±  2%      -4.2%     301431 ±  3%  vmstat.system.in
  36467329 ±  3%     -12.8%   31796228 ±  7%  turbostat.C1
     55.94             +1.1      57.02        turbostat.C1E%
  12697302           +22.3%   15527272 ±  3%  turbostat.C6
      8.14             +1.9      10.03 ±  3%  turbostat.C6%
    465.51            -1.3%     459.36        turbostat.PkgWatt
  14157383            -1.7%   13912161        proc-vmstat.nr_active_file
 1.531e+08            -6.6%   1.43e+08        proc-vmstat.nr_dirtied
   2106003            +5.5%    2222878        proc-vmstat.nr_inactive_file
    137670            +1.4%     139649        proc-vmstat.nr_slab_unreclaimable
 1.529e+08            -6.6%  1.428e+08        proc-vmstat.nr_written
  14157383            -1.7%   13912161        proc-vmstat.nr_zone_active_file
   2106003            +5.5%    2222878        proc-vmstat.nr_zone_inactive_file
 6.133e+08            -6.6%  5.727e+08        proc-vmstat.pgpgout
    604125            -6.8%     563103        fileio.fsync_operations/s
      0.31            +8.1%       0.34        fileio.latency_avg_ms
 1.224e+08            +1.6%  1.243e+08        fileio.latency_sum_ms
 1.222e+09            -6.6%  1.141e+09        fileio.time.file_system_outputs
    123116            -7.5%     113904        fileio.time.involuntary_context_switches
      5337            -9.2%       4845        fileio.time.percent_of_cpu_this_job_got
     31472            -9.3%      28554        fileio.time.system_time
    646.61            -7.5%     597.80        fileio.time.user_time
 1.336e+08            -7.9%  1.231e+08 ±  3%  fileio.time.voluntary_context_switches
    965.99            -6.8%     900.36        fileio.write_bytes_MB/s
    921.24            -6.8%     858.65        fileio.write_bytes_MiB/s
     58959            -6.8%      54953        fileio.write_operations/s
      0.06 ± 93%     -81.3%       0.01 ± 65%  perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
      0.35           +12.6%       0.40 ±  2%  perf-sched.wait_and_delay.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
      6.61 ±  6%     -19.0%       5.35 ±  8%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__btrfs_tree_read_lock
      1.35 ±  2%     +16.1%       1.57 ±  4%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.btrfs_inode_lock
     20403           -11.6%      18033 ±  2%  perf-sched.wait_and_delay.count.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
      5436 ±  3%     +62.4%       8829 ±  4%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__btrfs_tree_read_lock
      3.98 ± 62%     -84.2%       0.63 ±223%  perf-sched.wait_time.avg.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc.btrfs_buffered_write.btrfs_do_write_iter
      0.35           +12.6%       0.39 ±  2%  perf-sched.wait_time.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
      6.58 ±  6%     -19.0%       5.34 ±  8%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__btrfs_tree_read_lock
      1.34 ±  2%     +16.2%       1.56 ±  4%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.btrfs_inode_lock
      0.36 ± 41%     -42.9%       0.20 ± 21%  perf-sched.wait_time.max.ms.btrfs_sync_log.btrfs_sync_file.__x64_sys_fsync.do_syscall_64
      9.94 ±  5%     -12.0%       8.75 ±  5%  perf-sched.wait_time.max.ms.wait_log_commit.btrfs_sync_log.btrfs_sync_file.__x64_sys_fsync
   3885192           -15.7%    3275108        sched_debug.cfs_rq:/.avg_vruntime.avg
   4154904           -12.7%    3628008 ±  3%  sched_debug.cfs_rq:/.avg_vruntime.max
   3482162 ±  2%     -16.0%    2925983        sched_debug.cfs_rq:/.avg_vruntime.min
     76271 ±  2%     +11.1%      84714 ±  4%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      0.23 ±  3%     -11.6%       0.20 ±  3%  sched_debug.cfs_rq:/.h_nr_running.avg
   3885192           -15.7%    3275108        sched_debug.cfs_rq:/.min_vruntime.avg
   4154904           -12.7%    3628008 ±  3%  sched_debug.cfs_rq:/.min_vruntime.max
   3482162 ±  2%     -16.0%    2925983        sched_debug.cfs_rq:/.min_vruntime.min
     76271 ±  2%     +11.1%      84714 ±  4%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.23 ±  3%     -11.7%       0.20 ±  3%  sched_debug.cfs_rq:/.nr_running.avg
    227.98 ±  4%      -9.3%     206.72 ±  3%  sched_debug.cfs_rq:/.runnable_avg.avg
    227.72 ±  4%      -9.3%     206.48 ±  3%  sched_debug.cfs_rq:/.util_avg.avg
     54.04 ±  2%     -20.6%      42.90 ±  6%  sched_debug.cfs_rq:/.util_est_enqueued.avg
    102.98           -13.1%      89.51 ±  4%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
      2333 ±  3%     -11.9%       2055 ±  3%  sched_debug.cpu.curr->pid.avg
      0.22 ±  3%     -11.9%       0.20 ±  3%  sched_debug.cpu.nr_running.avg
 7.875e+09            -5.5%  7.442e+09        perf-stat.i.branch-instructions
      0.26             +0.0       0.26        perf-stat.i.branch-miss-rate%
  19949855            -3.6%   19228835        perf-stat.i.branch-misses
     40.22             -0.8      39.41        perf-stat.i.cache-miss-rate%
  59747484            -6.2%   56072710        perf-stat.i.cache-misses
  1.49e+08            -4.2%  1.428e+08        perf-stat.i.cache-references
    460439            -7.2%     427143 ±  3%  perf-stat.i.context-switches
      4.29            -3.9%       4.12        perf-stat.i.cpi
 1.644e+11            -8.9%  1.497e+11        perf-stat.i.cpu-cycles
      2749            -3.0%       2667        perf-stat.i.cycles-between-cache-misses
 1.016e+10            -5.2%  9.629e+09        perf-stat.i.dTLB-loads
 1.352e+09            -2.4%   1.32e+09        perf-stat.i.dTLB-stores
 3.832e+10            -5.2%  3.633e+10        perf-stat.i.instructions
      0.24            +4.0%       0.25        perf-stat.i.ipc
      0.73            -9.0%       0.67        perf-stat.i.metric.GHz
    722.53            -4.3%     691.67        perf-stat.i.metric.K/sec
     86.55            -5.2%      82.06        perf-stat.i.metric.M/sec
  12017109            -5.3%   11380497        perf-stat.i.node-load-misses
    480747 ±  2%      -6.2%     451146 ±  2%  perf-stat.i.node-loads
      0.25             +0.0       0.26        perf-stat.overall.branch-miss-rate%
     40.03             -0.8      39.22        perf-stat.overall.cache-miss-rate%
      4.29            -3.9%       4.12        perf-stat.overall.cpi
      2752            -2.9%       2671        perf-stat.overall.cycles-between-cache-misses
      0.23            +4.1%       0.24        perf-stat.overall.ipc
 7.868e+09            -5.5%  7.435e+09        perf-stat.ps.branch-instructions
  19908907            -3.6%   19186085        perf-stat.ps.branch-misses
  59618596            -6.2%   55946919        perf-stat.ps.cache-misses
 1.489e+08            -4.2%  1.427e+08        perf-stat.ps.cache-references
    459046            -7.2%     425904 ±  3%  perf-stat.ps.context-switches
 1.641e+11            -8.9%  1.495e+11        perf-stat.ps.cpu-cycles
 1.015e+10            -5.3%  9.621e+09        perf-stat.ps.dTLB-loads
 1.351e+09            -2.5%  1.317e+09        perf-stat.ps.dTLB-stores
 3.828e+10            -5.2%   3.63e+10        perf-stat.ps.instructions
  11987188            -5.3%   11352253        perf-stat.ps.node-load-misses
    483196 ±  2%      -6.3%     452637 ±  2%  perf-stat.ps.node-loads
 2.308e+13            -5.2%  2.187e+13        perf-stat.total.instructions
     70.29 ±  5%       -8.6      61.72 ±  6%  perf-profile.calltrace.cycles-pp.lapic_next_deadline.clockevents_program_event.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
     31.27 ±  6%       -4.4      26.90 ± 10%  perf-profile.calltrace.cycles-pp.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite64
     31.26 ±  6%       -4.4      26.89 ± 10%  perf-profile.calltrace.cycles-pp.btrfs_buffered_write.btrfs_do_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
     31.28 ±  6%       -4.4      26.91 ± 10%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pwrite64
     31.28 ±  6%       -4.4      26.91 ± 10%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite64
     31.28 ±  6%       -4.4      26.91 ± 10%  perf-profile.calltrace.cycles-pp.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite64
     31.28 ±  6%       -4.4      26.91 ± 10%  perf-profile.calltrace.cycles-pp.__libc_pwrite64
     31.26 ±  6%       -4.4      26.90 ± 10%  perf-profile.calltrace.cycles-pp.btrfs_do_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe
     30.80 ±  6%       -4.4      26.44 ± 11%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write.btrfs_inode_lock.btrfs_buffered_write.btrfs_do_write_iter
     30.80 ±  6%       -4.4      26.45 ± 11%  perf-profile.calltrace.cycles-pp.down_write.btrfs_inode_lock.btrfs_buffered_write.btrfs_do_write_iter.vfs_write
     30.80 ±  6%       -4.3      26.45 ± 11%  perf-profile.calltrace.cycles-pp.btrfs_inode_lock.btrfs_buffered_write.btrfs_do_write_iter.vfs_write.__x64_sys_pwrite64
     30.61 ±  7%       -4.3      26.27 ± 11%  perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.btrfs_inode_lock.btrfs_buffered_write
     30.11 ±  7%       -4.3      25.78 ± 11%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath
     30.11 ±  7%       -4.3      25.79 ± 11%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write
     30.11 ±  7%       -4.3      25.79 ± 11%  perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.btrfs_inode_lock
     25.84 ±  7%       -4.2      21.62 ± 11%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.osq_lock.rwsem_optimistic_spin
     25.71 ±  7%       -4.2      21.50 ± 10%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.osq_lock
      7.43 ±  6%       -1.3       6.14 ±  6%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.native_queued_spin_lock_slowpath._raw_spin_lock
      7.39 ±  6%       -1.3       6.10 ±  6%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.native_queued_spin_lock_slowpath
      4.75 ±  6%       -0.7       4.04 ±  3%  perf-profile.calltrace.cycles-pp.__lll_lock_wait
      4.67 ±  6%       -0.7       3.96 ±  3%  perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.__lll_lock_wait
      4.67 ±  6%       -0.7       3.96 ±  3%  perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
      4.66 ±  6%       -0.7       3.96 ±  3%  perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
      4.68 ±  6%       -0.7       3.98 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__lll_lock_wait
      4.68 ±  6%       -0.7       3.98 ±  3%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__lll_lock_wait
      4.67 ±  6%       -0.7       3.96 ±  3%  perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.__lll_lock_wait
      4.55 ±  6%       -0.7       3.86 ±  3%  perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      4.48 ±  6%       -0.7       3.79 ±  3%  perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      4.41 ±  6%       -0.7       3.72 ±  3%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.native_queued_spin_lock_slowpath._raw_spin_lock.futex_q_lock.futex_wait_setup
      4.41 ±  6%       -0.7       3.72 ±  3%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.futex_q_lock.futex_wait_setup.__futex_wait
      4.40 ±  6%       -0.7       3.72 ±  3%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.native_queued_spin_lock_slowpath._raw_spin_lock.futex_q_lock
      4.40 ±  6%       -0.7       3.72 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
      4.45 ±  5%       -0.6       3.82 ±  4%  perf-profile.calltrace.cycles-pp.__pthread_mutex_unlock_usercnt
      4.36 ±  6%       -0.6       3.73 ±  4%  perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.__pthread_mutex_unlock_usercnt
      4.36 ±  6%       -0.6       3.73 ±  4%  perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.__pthread_mutex_unlock_usercnt
      4.36 ±  6%       -0.6       3.73 ±  4%  perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
      4.37 ±  6%       -0.6       3.74 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__pthread_mutex_unlock_usercnt
      4.37 ±  6%       -0.6       3.74 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__pthread_mutex_unlock_usercnt
      4.18 ±  6%       -0.6       3.56 ±  4%  perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      4.17 ±  6%       -0.6       3.55 ±  4%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.native_queued_spin_lock_slowpath._raw_spin_lock.futex_wake.do_futex
      4.17 ±  6%       -0.6       3.55 ±  4%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.native_queued_spin_lock_slowpath._raw_spin_lock.futex_wake
      4.17 ±  6%       -0.6       3.55 ±  4%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.futex_wake.do_futex.__x64_sys_futex
      1.02 ±  2%       -0.1       0.88 ±  4%  perf-profile.calltrace.cycles-pp.fsync
      1.01 ±  2%       -0.1       0.88 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fsync
      1.01 ±  2%       -0.1       0.88 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fsync
      0.99 ±  2%       -0.1       0.86 ±  4%  perf-profile.calltrace.cycles-pp.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe.fsync
      0.89 ±  3%       -0.1       0.77 ±  4%  perf-profile.calltrace.cycles-pp.btrfs_sync_file.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe.fsync
     17.65 ±  3%       +1.6      19.27 ±  4%  perf-profile.calltrace.cycles-pp.perf_event_task_tick.scheduler_tick.update_process_times.tick_sched_handle.tick_nohz_highres_handler
     17.80 ±  3%       +1.7      19.48 ±  4%  perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues
     19.28 ±  4%       +1.7      21.02 ±  5%  perf-profile.calltrace.cycles-pp.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
     19.00 ±  4%       +1.8      20.80 ±  4%  perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
     18.97 ±  4%       +1.8      20.78 ±  4%  perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt
     57.55 ±  3%       +5.9      63.42 ±  4%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     57.55 ±  3%       +5.9      63.42 ±  4%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     57.55 ±  3%       +5.9      63.42 ±  4%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
     57.29 ±  3%       +5.9      63.16 ±  4%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     57.28 ±  3%       +5.9      63.16 ±  4%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
     57.84 ±  3%       +5.9      63.78 ±  4%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
     57.28 ±  3%       +6.0      63.25 ±  4%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
     56.70 ±  3%       +6.0      62.69 ±  4%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
     56.71 ±  3%       +6.0      62.70 ±  4%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
     41.52 ±  4%       -5.9      35.64 ±  7%  perf-profile.children.cycles-pp.do_syscall_64
     41.52 ±  4%       -5.9      35.64 ±  7%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     27.70 ±  5%       -5.1      22.57 ± 22%  perf-profile.children.cycles-pp.native_apic_msr_write
     31.07 ±  6%       -4.4      26.66 ± 11%  perf-profile.children.cycles-pp.btrfs_inode_lock
     31.09 ±  6%       -4.4      26.68 ± 11%  perf-profile.children.cycles-pp.down_write
     31.07 ±  6%       -4.4      26.66 ± 11%  perf-profile.children.cycles-pp.rwsem_down_write_slowpath
     31.36 ±  6%       -4.4      26.96 ± 10%  perf-profile.children.cycles-pp.vfs_write
     30.88 ±  7%       -4.4      26.48 ± 11%  perf-profile.children.cycles-pp.rwsem_optimistic_spin
     30.36 ±  7%       -4.4      25.98 ± 11%  perf-profile.children.cycles-pp.osq_lock
     31.26 ±  6%       -4.4      26.89 ± 10%  perf-profile.children.cycles-pp.btrfs_buffered_write
     31.28 ±  6%       -4.4      26.91 ± 10%  perf-profile.children.cycles-pp.__x64_sys_pwrite64
     31.28 ±  6%       -4.4      26.91 ± 10%  perf-profile.children.cycles-pp.__libc_pwrite64
     31.26 ±  6%       -4.4      26.90 ± 10%  perf-profile.children.cycles-pp.btrfs_do_write_iter
      9.03 ±  6%       -1.3       7.69 ±  3%  perf-profile.children.cycles-pp.do_futex
      9.03 ±  6%       -1.3       7.69 ±  3%  perf-profile.children.cycles-pp.__x64_sys_futex
      8.85 ±  6%       -1.3       7.54 ±  4%  perf-profile.children.cycles-pp._raw_spin_lock
      8.59 ±  6%       -1.3       7.29 ±  4%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      2.89 ±  8%       -1.0       1.91 ± 44%  perf-profile.children.cycles-pp.nmi_cpu_backtrace
      4.75 ±  6%       -0.7       4.04 ±  3%  perf-profile.children.cycles-pp.__lll_lock_wait
      4.67 ±  6%       -0.7       3.96 ±  3%  perf-profile.children.cycles-pp.futex_wait
      4.66 ±  6%       -0.7       3.96 ±  3%  perf-profile.children.cycles-pp.__futex_wait
      4.55 ±  6%       -0.7       3.86 ±  3%  perf-profile.children.cycles-pp.futex_wait_setup
      4.48 ±  6%       -0.7       3.79 ±  3%  perf-profile.children.cycles-pp.futex_q_lock
      4.45 ±  5%       -0.6       3.82 ±  4%  perf-profile.children.cycles-pp.__pthread_mutex_unlock_usercnt
      4.36 ±  6%       -0.6       3.73 ±  4%  perf-profile.children.cycles-pp.futex_wake
      0.87 ±  8%       -0.3       0.60 ± 36%  perf-profile.children.cycles-pp.nmi_cpu_backtrace_handler
      1.60 ±  5%       -0.2       1.44 ±  6%  perf-profile.children.cycles-pp.nmi_handle
      1.02 ±  2%       -0.1       0.88 ±  4%  perf-profile.children.cycles-pp.fsync
      0.99 ±  2%       -0.1       0.86 ±  4%  perf-profile.children.cycles-pp.__x64_sys_fsync
      0.89 ±  3%       -0.1       0.77 ±  4%  perf-profile.children.cycles-pp.btrfs_sync_file
      0.60 ±  2%       -0.0       0.56 ±  3%  perf-profile.children.cycles-pp.perf_event_nmi_handler
      0.36 ±  3%       -0.0       0.32 ±  5%  perf-profile.children.cycles-pp.__filemap_fdatawrite_range
      0.35 ±  3%       -0.0       0.31 ±  4%  perf-profile.children.cycles-pp.start_ordered_ops
      0.35 ±  4%       -0.0       0.31 ±  4%  perf-profile.children.cycles-pp.filemap_fdatawrite_wbc
      0.34 ±  4%       -0.0       0.31 ±  5%  perf-profile.children.cycles-pp.do_writepages
      0.32 ±  3%       -0.0       0.28 ±  4%  perf-profile.children.cycles-pp.extent_writepages
      0.31 ±  4%       -0.0       0.27 ±  5%  perf-profile.children.cycles-pp.extent_write_cache_pages
      0.17 ±  4%       -0.0       0.15 ±  3%  perf-profile.children.cycles-pp.__extent_writepage_io
      0.14 ±  3%       -0.0       0.12        perf-profile.children.cycles-pp.submit_one_bio
      0.19 ±  4%       -0.0       0.17 ±  4%  perf-profile.children.cycles-pp.schedule_idle
      0.10 ±  4%       -0.0       0.09 ±  6%  perf-profile.children.cycles-pp.btrfs_dirty_pages
      0.12 ±  3%       -0.0       0.11        perf-profile.children.cycles-pp.btrfs_csum_one_bio
      0.12 ±  3%       -0.0       0.11        perf-profile.children.cycles-pp.crc_pcl
      0.10 ±  7%       -0.0       0.08 ±  5%  perf-profile.children.cycles-pp.prepare_pages
      0.07 ±  5%       +0.0       0.09        perf-profile.children.cycles-pp.btrfs_search_slot
      1.01 ±  4%       +0.2       1.24 ± 13%  perf-profile.children.cycles-pp.perf_pmu_nop_void
      2.43 ±  9%       +1.0       3.39 ± 27%  perf-profile.children.cycles-pp.__intel_pmu_enable_all
     18.59 ±  3%       +1.7      20.26 ±  4%  perf-profile.children.cycles-pp.perf_event_task_tick
     18.59 ±  3%       +1.7      20.26 ±  4%  perf-profile.children.cycles-pp.perf_adjust_freq_unthr_context
     18.75 ±  3%       +1.7      20.48 ±  4%  perf-profile.children.cycles-pp.scheduler_tick
     20.30 ±  4%       +1.8      22.10 ±  5%  perf-profile.children.cycles-pp.tick_nohz_highres_handler
     20.00 ±  4%       +1.9      21.87 ±  5%  perf-profile.children.cycles-pp.tick_sched_handle
     19.97 ±  4%       +1.9      21.84 ±  5%  perf-profile.children.cycles-pp.update_process_times
     57.55 ±  3%       +5.9      63.42 ±  4%  perf-profile.children.cycles-pp.start_secondary
     57.84 ±  3%       +5.9      63.78 ±  4%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
     57.84 ±  3%       +5.9      63.78 ±  4%  perf-profile.children.cycles-pp.cpu_startup_entry
     57.84 ±  3%       +5.9      63.78 ±  4%  perf-profile.children.cycles-pp.do_idle
     57.57 ±  3%       +6.0      63.52 ±  4%  perf-profile.children.cycles-pp.cpuidle_enter
     57.57 ±  3%       +6.0      63.52 ±  4%  perf-profile.children.cycles-pp.cpuidle_enter_state
     57.57 ±  3%       +6.0      63.53 ±  4%  perf-profile.children.cycles-pp.cpuidle_idle_call
     27.70 ±  5%       -5.1      22.57 ± 22%  perf-profile.self.cycles-pp.native_apic_msr_write
      2.89 ±  8%       -1.0       1.91 ± 44%  perf-profile.self.cycles-pp.nmi_cpu_backtrace
      0.87 ±  8%       -0.3       0.60 ± 36%  perf-profile.self.cycles-pp.nmi_cpu_backtrace_handler
      1.60 ±  5%       -0.2       1.44 ±  6%  perf-profile.self.cycles-pp.nmi_handle
      0.60 ±  2%       -0.0       0.56 ±  3%  perf-profile.self.cycles-pp.perf_event_nmi_handler
      0.03 ±100%       +0.1       0.08 ± 19%  perf-profile.self.cycles-pp.timerqueue_add
      0.16 ± 16%       +0.1       0.29 ± 28%  perf-profile.self.cycles-pp._raw_spin_lock_irq


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki