Greetings,

FYI, we noticed a 34.5% improvement in vm-scalability.throughput due to commit:


commit: cb67f4282bf9693658dbda934a441ddbbb1446df ("mm,thp,rmap: simplify compound page mapcount handling")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: vm-scalability
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
with following parameters:

	runtime: 300s
	size: 128G
	test: truncate-seq
	cpufreq_governor: performance

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # If you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/300s/128G/lkp-csl-2sp3/truncate-seq/vm-scalability

commit: 
  dad6a5eb55 ("mm,hugetlb: use folio fields in second tail page")
  cb67f4282b ("mm,thp,rmap: simplify compound page mapcount handling")

dad6a5eb55564845 cb67f4282bf9693658dbda934a4 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 2.352e+08           +34.5%  3.164e+08        vm-scalability.median
 2.352e+08           +34.5%  3.164e+08        vm-scalability.throughput
   1132841 ± 37%     +90.2%    2154205 ± 22%  proc-vmstat.compact_free_scanned
      2.40 ±107%      +5.3        7.74 ± 47%  perf-profile.children.cycles-pp.do_filp_open
      2.40 ±107%      +5.3        7.74 ± 47%  perf-profile.children.cycles-pp.path_openat
    105.57            +1.5%     107.20        perf-stat.i.cpu-migrations
     31.13 ±  4%      +4.8       35.91 ±  4%  perf-stat.i.iTLB-load-miss-rate%
    821423 ±  5%     +17.4%     963953 ±  2%  perf-stat.i.iTLB-load-misses
      1724 ±  5%     -15.5%       1456 ±  4%  perf-stat.i.instructions-per-iTLB-miss
    727.93           -11.3%     645.86 ± 18%  perf-stat.i.metric.K/sec
    572194 ±  3%     -29.0%     406083 ±  4%  perf-stat.i.node-load-misses
    603100 ±  3%     -29.9%     422840 ±  3%  perf-stat.i.node-loads
     31.13 ±  4%      +4.8       35.88 ±  4%  perf-stat.overall.iTLB-load-miss-rate%
      1769 ±  5%     -15.7%       1490 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
    104.86            +1.5%     106.46        perf-stat.ps.cpu-migrations
    815914 ±  5%     +17.3%     957385 ±  2%  perf-stat.ps.iTLB-load-misses
    568201 ±  3%     -29.0%     403248 ±  4%  perf-stat.ps.node-load-misses
    598933 ±  3%     -29.9%     420034 ±  3%  perf-stat.ps.node-loads
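
For readers checking the %change column by hand: entries with a trailing "%" appear to be relative changes against the parent commit, while entries without it (the perf-profile and miss-rate rows) appear to be absolute differences in percentage points. The sketch below recomputes a few rows under that assumption. The helper names pct_change and abs_delta are illustrative only and are not part of lkp-tests.

```python
def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def abs_delta(base, new):
    """Absolute difference, for metrics that are already percentages."""
    return new - base


# Values copied from the comparison table above (parent commit, tested commit).
throughput = pct_change(2.352e08, 3.164e08)   # vm-scalability.throughput
node_loads = pct_change(603100, 422840)       # perf-stat.i.node-loads
itlb_rate = abs_delta(31.13, 35.91)           # perf-stat.i.iTLB-load-miss-rate%

print(f"throughput: {throughput:+.1f}%")
print(f"node-loads: {node_loads:+.1f}%")
print(f"iTLB-load-miss-rate: {itlb_rate:+.1f}")
```

Rounded to one decimal place, these agree with the +34.5%, -29.9% and +4.8 entries reported in the table.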


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests