Greetings,

FYI, we noticed a 13.7% improvement in will-it-scale.per_thread_ops due to commit:

commit: 4601e2fc8b57840660ce1a1ee98aea873fa15eee ("shmem: convert shmem_file_read_iter() to use shmem_get_folio()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: will-it-scale
on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
with the following parameters:

	nr_task: 100%
	mode: thread
	test: pread2
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale


Details are as below:

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/thread/100%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp5/pread2/will-it-scale

commit: 
  eff1f906c2 ("shmem: convert shmem_write_begin() to use shmem_get_folio()")
  4601e2fc8b ("shmem: convert shmem_file_read_iter() to use shmem_get_folio()")

eff1f906c2dcd83c 4601e2fc8b57840660ce1a1ee98 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   1508791 ±  3%     +13.7%    1715505 ±  2%  will-it-scale.128.threads
     11786 ±  3%     +13.7%      13401 ±  2%  will-it-scale.per_thread_ops
   1508791 ±  3%     +13.7%    1715505 ±  2%  will-it-scale.workload
      2.92 ± 15%     +43.7%       4.20 ± 16%  turbostat.CPU%c1
     58550 ±  4%     -16.4%      48936 ±  5%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.20 ±  9%     +17.5%       0.23 ±  5%  sched_debug.cfs_rq:/.nr_running.stddev
     58605 ±  5%     -16.5%      48957 ±  5%  sched_debug.cfs_rq:/.spread0.stddev
    191.02 ±  4%     +16.1%     221.72 ±  5%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
      0.23 ±  3%     +11.1%       0.25 ±  4%  sched_debug.cpu.nr_running.stddev
     12.20            -1.1%      12.07        perf-stat.i.cpi
      0.00 ±  9%      -0.0        0.00 ±  5%  perf-stat.i.dTLB-store-miss-rate%
 9.003e+08 ±  2%      +6.4%  9.582e+08        perf-stat.i.dTLB-stores
     82.71            +2.2       84.95        perf-stat.i.node-store-miss-rate%
   5815837           +10.2%    6408731        perf-stat.i.node-store-misses
   1223798 ±  2%      -6.6%    1142824 ±  2%  perf-stat.i.node-stores
     12.19            -1.0%      12.06        perf-stat.overall.cpi
      0.01 ±  3%      -0.0        0.00 ±  5%  perf-stat.overall.dTLB-store-miss-rate%
     82.60            +2.2       84.85        perf-stat.overall.node-store-miss-rate%
   6712074 ±  2%     -12.0%    5904631 ±  2%  perf-stat.overall.path-length
 8.981e+08 ±  2%      +6.4%  9.558e+08        perf-stat.ps.dTLB-stores
   5796378           +10.2%    6387291        perf-stat.ps.node-store-misses
   1220724 ±  2%      -6.6%    1140426 ±  2%  perf-stat.ps.node-stores
     41.14           -41.1        0.00        perf-profile.calltrace.cycles-pp.shmem_getpage.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64
     41.10           -41.1        0.00        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_getpage.shmem_file_read_iter.vfs_read.__x64_sys_pread64
     41.04           -41.0        0.00        perf-profile.calltrace.cycles-pp.__filemap_get_folio.shmem_get_folio_gfp.shmem_getpage.shmem_file_read_iter.vfs_read
     40.18           -40.2        0.00        perf-profile.calltrace.cycles-pp.folio_wait_bit_common.__filemap_get_folio.shmem_get_folio_gfp.shmem_getpage.shmem_file_read_iter
     39.18           -39.2        0.00        perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.folio_wait_bit_common.__filemap_get_folio.shmem_get_folio_gfp.shmem_getpage
      0.00            +0.6        0.59 ±  7%  perf-profile.calltrace.cycles-pp.io_schedule.folio_wait_bit_common.__filemap_get_folio.shmem_get_folio_gfp.shmem_file_read_iter
      0.00           +39.4       39.45        perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.folio_wait_bit_common.__filemap_get_folio.shmem_get_folio_gfp.shmem_file_read_iter
      0.00           +40.5       40.46        perf-profile.calltrace.cycles-pp.folio_wait_bit_common.__filemap_get_folio.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read
      0.00           +41.2       41.24        perf-profile.calltrace.cycles-pp.__filemap_get_folio.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.__x64_sys_pread64
      0.00           +41.3       41.30        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64
     41.14           -41.1        0.00        perf-profile.children.cycles-pp.shmem_getpage
      0.10 ±  4%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.copyout
      0.12 ±  3%      +0.0        0.14 ±  3%  perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      0.07            +0.0        0.09        perf-profile.children.cycles-pp.folio_unlock
      0.12 ±  3%      +0.0        0.14 ±  3%  perf-profile.children.cycles-pp._copy_to_iter
      0.13 ±  2%      +0.0        0.15 ±  4%  perf-profile.children.cycles-pp.copy_page_to_iter
      0.00            +0.1        0.06 ±  9%  perf-profile.children.cycles-pp.PageHeadHuge
      0.46            -0.1        0.37 ±  3%  perf-profile.self.cycles-pp.shmem_file_read_iter
      0.82 ±  2%      -0.1        0.74 ±  4%  perf-profile.self.cycles-pp.__filemap_get_folio
      0.12 ±  3%      +0.0        0.14 ±  3%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
      0.07            +0.0        0.09        perf-profile.self.cycles-pp.folio_unlock
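
As a reading aid, the %change column can be reproduced from the two mean
columns. The sketch below is illustrative only and is not part of the
lkp-tests tooling; the helper names relative_change and point_change are
hypothetical. It assumes that count-like metrics report a relative change,
while percentage-valued metrics (miss rates, perf-profile cycles-pp rows)
report the absolute difference in percentage points, which is consistent
with the rows above.

```python
def relative_change(base: float, new: float) -> float:
    """Relative change in percent, as shown for count-like metrics."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, as shown for metrics already expressed in percent."""
    return new - base


# will-it-scale.workload: parent commit vs. tested commit
print(f"{relative_change(1508791, 1715505):+.1f}%")  # +13.7%

# perf-stat.overall.path-length
print(f"{relative_change(6712074, 5904631):+.1f}%")  # -12.0%

# perf-stat.i.node-store-miss-rate% (already a percentage)
print(f"{point_change(82.71, 84.95):+.1f}")          # +2.2
```
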


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://01.org/lkp