Hello,

kernel test robot noticed a 2.1% improvement of fsmark.files_per_sec on:


commit: e5b9a37505880cb3d76ebddca25a7242fd9d6f91 ("NFSD: Enable write delegation support")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

testcase: fsmark
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
parameters:

	iterations: 1x
	nr_threads: 32t
	disk: 1SSD
	fs: btrfs
	fs2: nfsv4
	filesize: 9B
	test_size: 400M
	sync_method: fsyncBeforeClose
	nr_directories: 16d
	nr_files_per_directory: 256fpd
	cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp dirs to run from a clean state.

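For readers who want to sanity-check the comparison that follows, the %change column appears to be the relative difference between the two commits' mean values, while metrics that are themselves percentages (names ending in "%") appear to be reported as a plain difference in percentage points. The sketch below is only an illustration under that assumption: the helper names pct_change and point_change are hypothetical and not part of lkp-tests, and the inputs are the rounded values printed in this report, so recomputed figures can differ slightly from the robot's.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from the base commit to the new commit, in percent."""
    return (new - base) / base * 100.0


def point_change(base_pct: float, new_pct: float) -> float:
    """Absolute change in percentage points, for metrics already in percent."""
    return new_pct - base_pct


# Values copied from this report (af7c14f91a vs. e5b9a37505).
files_per_sec = pct_change(5058, 5165)          # fsmark.files_per_sec
context_switches = pct_change(310366, 264523)   # vmstat.system.cs
sys_points = point_change(5.93, 5.47)           # mpstat.cpu.all.sys%

print(f"fsmark.files_per_sec: {files_per_sec:+.1f}%")
print(f"vmstat.system.cs:     {context_switches:+.1f}%")
print(f"mpstat.cpu.all.sys%:  {sys_points:+.1f} points")
```

Under these assumptions the three printed values round to the same figures the report lists for those rows (+2.1%, -14.8% and -0.5).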
=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
  gcc-12/performance/1SSD/9B/nfsv4/btrfs/1x/x86_64-rhel-8.3/16d/256fpd/32t/debian-11.1-x86_64-20220510.cgz/fsyncBeforeClose/lkp-ivb-2ep1/400M/fsmark

commit:
  af7c14f91a ("NFSD: Enforce flush-on-close for write delegations")
  e5b9a37505 ("NFSD: Enable write delegation support")

af7c14f91a306eee e5b9a37505880cb3d76ebddca25
---------------- ---------------------------
         %stddev  %change           %stddev
             \       |                  \
   4426369         -12.5%     3873415        cpuidle..usage
      5.93 ±  2%    -0.5         5.47        mpstat.cpu.all.sys%
      9.48          +2.1%        9.68        iostat.cpu.iowait
      7.84          -6.0%        7.37        iostat.cpu.system
    310366 ±  2%   -14.8%      264523        vmstat.system.cs
     69011          -5.3%       65342        vmstat.system.in
  11823460         -11.4%    10471571        fsmark.app_overhead
      5058          +2.1%        5165        fsmark.files_per_sec
     55.00          +2.4%       56.33        fsmark.time.percent_of_cpu_this_job_got
    477760          -9.2%      433730        fsmark.time.voluntary_context_switches
      5.77 ± 69%    -4.4         1.36 ±148%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      5.77 ± 69%    -4.4         1.36 ±148%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64
      7.35 ± 68%    -5.5         1.86 ±142%  perf-profile.children.cycles-pp.__x64_sys_openat
      7.35 ± 68%    -5.5         1.86 ±142%  perf-profile.children.cycles-pp.do_sys_openat2
      5.77 ± 69%    -4.4         1.36 ±148%  perf-profile.children.cycles-pp.do_filp_open
      5.77 ± 69%    -4.4         1.36 ±148%  perf-profile.children.cycles-pp.path_openat
      5.10 ± 88%    -3.2         1.92 ±142%  perf-profile.children.cycles-pp.sched_setaffinity
   2453462         -20.3%     1954727        turbostat.C1
     14.80          -1.8        13.03        turbostat.C1%
     18.26          +0.8        19.04        turbostat.C1E%
     92.44          -1.7%       90.91        turbostat.CorWatt
     32589         -34.7%       21295 ±  2%  turbostat.POLL
      0.13 ±  2%    -0.0         0.09        turbostat.POLL%
    120.46          -1.3%      118.92        turbostat.PkgWatt
    712376          -1.4%      702472        proc-vmstat.nr_dirtied
     46109          -6.9%       42910        proc-vmstat.nr_slab_unreclaimable
    705524          -1.4%      695317        proc-vmstat.nr_written
   2314060          -3.9%     2223615        proc-vmstat.numa_hit
    489398 ±  9%   -12.0%      430864 ±  9%  proc-vmstat.numa_other
    218171          -1.6%      214672        proc-vmstat.pgactivate
   2765841          -3.3%     2673651        proc-vmstat.pgalloc_normal
   1981302          -7.0%     1842873 ±  3%  proc-vmstat.pgfree
   2140457          -1.8%     2102818        proc-vmstat.pgpgout
 2.073e+09          -4.5%    1.98e+09        perf-stat.i.branch-instructions
      4.89          +0.2         5.10        perf-stat.i.branch-miss-rate%
      6.86          -0.3         6.60        perf-stat.i.cache-miss-rate%
  16995704          -6.9%    15830167        perf-stat.i.cache-misses
 2.565e+08          -3.3%    2.48e+08        perf-stat.i.cache-references
    355982         -13.9%      306606        perf-stat.i.context-switches
 1.713e+10          -5.2%   1.623e+10        perf-stat.i.cpu-cycles
      1706          -6.8%        1589 ±  2%  perf-stat.i.cpu-migrations
 2.443e+09          -4.5%   2.334e+09        perf-stat.i.dTLB-loads
 1.196e+09          -5.0%   1.137e+09        perf-stat.i.dTLB-stores
     52.68          +1.9        54.58        perf-stat.i.iTLB-load-miss-rate%
   2927293          -9.6%     2646648        perf-stat.i.iTLB-loads
 1.018e+10          -4.2%   9.752e+09        perf-stat.i.instructions
      0.60          +1.0%        0.60        perf-stat.i.ipc
      0.36          -5.2%        0.34        perf-stat.i.metric.GHz
    669.99         -10.0%      602.81        perf-stat.i.metric.K/sec
    124.30          -4.5%      118.67        perf-stat.i.metric.M/sec
   8004496         -12.0%     7040516        perf-stat.i.node-load-misses
   8642285         -11.0%     7688294        perf-stat.i.node-loads
     35.10          -1.0        34.11        perf-stat.i.node-store-miss-rate%
   3782168         -10.9%     3370639        perf-stat.i.node-store-misses
   6991064          -7.0%     6505160        perf-stat.i.node-stores
      4.87          +0.2         5.06        perf-stat.overall.branch-miss-rate%
      6.63          -0.2         6.38        perf-stat.overall.cache-miss-rate%
     52.17          +1.9        54.08        perf-stat.overall.iTLB-load-miss-rate%
      0.59          +1.1%        0.60        perf-stat.overall.ipc
     35.11          -1.0        34.13        perf-stat.overall.node-store-miss-rate%
 1.979e+09          -4.7%   1.886e+09        perf-stat.ps.branch-instructions
  16221561          -7.1%    15076188        perf-stat.ps.cache-misses
 2.448e+08          -3.5%   2.362e+08        perf-stat.ps.cache-references
    339800         -14.1%      292030        perf-stat.ps.context-switches
 1.635e+10          -5.4%   1.546e+10        perf-stat.ps.cpu-cycles
      1628          -7.0%        1513 ±  2%  perf-stat.ps.cpu-migrations
 2.332e+09          -4.7%   2.223e+09        perf-stat.ps.dTLB-loads
 1.141e+09          -5.2%   1.082e+09        perf-stat.ps.dTLB-stores
   2794208          -9.8%     2520816        perf-stat.ps.iTLB-loads
 9.712e+09          -4.4%   9.288e+09        perf-stat.ps.instructions
   7640561         -12.2%     6705747        perf-stat.ps.node-load-misses
   8249263         -11.2%     7322640        perf-stat.ps.node-loads
   3609780         -11.1%     3209985        perf-stat.ps.node-store-misses
   6672465          -7.2%     6195169        perf-stat.ps.node-stores
 2.143e+11          -8.7%   1.956e+11        perf-stat.total.instructions


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki