Greetings,

FYI, we noticed a 34.5% improvement of vm-scalability.throughput due to commit:


commit: cb67f4282bf9693658dbda934a441ddbbb1446df ("mm,thp,rmap: simplify compound page mapcount handling")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: vm-scalability
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
with the following parameters:

	runtime: 300s
	size: 128G
	test: truncate-seq
	cpufreq_governor: performance

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.
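
As a reading aid for the comparison that follows: the %change values appear
consistent with a plain relative change between the two commits for ordinary
metrics, and with an absolute difference for metrics that are already
percentages (the rows where %change has no trailing "%"). The sketch below
checks that reading against three rows of the report. The helper names are
hypothetical and are not part of lkp-tests; the formulas are an assumption
inferred from the numbers, not taken from the tool's source.

```python
# Hypothetical helpers; names are illustrative and not part of lkp-tests.
# Assumption: %change is (new - base) / base for ordinary metrics, and a
# plain difference for metrics that are already expressed in percent.


def relative_change(base: float, new: float) -> float:
    """Percent change of `new` relative to `base`."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, for metrics already expressed in percent."""
    return new - base


if __name__ == "__main__":
    # vm-scalability.throughput: 2.352e+08 -> 3.164e+08
    print(f"{relative_change(2.352e+08, 3.164e+08):+.1f}%")  # +34.5%
    # perf-stat.i.node-loads: 603100 -> 422840
    print(f"{relative_change(603100, 422840):+.1f}%")        # -29.9%
    # perf-stat.i.iTLB-load-miss-rate%: 31.13 -> 35.91
    print(f"{point_change(31.13, 35.91):+.1f}")              # +4.8
```

The %stddev columns are not modelled here; they report run-to-run variation
for each commit and are worth checking before reading much into a row such as
perf-profile.children.cycles-pp.do_filp_open (±107% on the base commit).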
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/300s/128G/lkp-csl-2sp3/truncate-seq/vm-scalability

commit:
  dad6a5eb55 ("mm,hugetlb: use folio fields in second tail page")
  cb67f4282b ("mm,thp,rmap: simplify compound page mapcount handling")

dad6a5eb55564845 cb67f4282bf9693658dbda934a4
---------------- ---------------------------
         %stddev    %change          %stddev
            \          |                \
 2.352e+08           +34.5%  3.164e+08        vm-scalability.median
 2.352e+08           +34.5%  3.164e+08        vm-scalability.throughput
   1132841 ± 37%     +90.2%    2154205 ± 22%  proc-vmstat.compact_free_scanned
      2.40 ±107%      +5.3        7.74 ± 47%  perf-profile.children.cycles-pp.do_filp_open
      2.40 ±107%      +5.3        7.74 ± 47%  perf-profile.children.cycles-pp.path_openat
    105.57            +1.5%     107.20        perf-stat.i.cpu-migrations
     31.13 ±  4%      +4.8       35.91 ±  4%  perf-stat.i.iTLB-load-miss-rate%
    821423 ±  5%     +17.4%     963953 ±  2%  perf-stat.i.iTLB-load-misses
      1724 ±  5%     -15.5%       1456 ±  4%  perf-stat.i.instructions-per-iTLB-miss
    727.93           -11.3%     645.86 ± 18%  perf-stat.i.metric.K/sec
    572194 ±  3%     -29.0%     406083 ±  4%  perf-stat.i.node-load-misses
    603100 ±  3%     -29.9%     422840 ±  3%  perf-stat.i.node-loads
     31.13 ±  4%      +4.8       35.88 ±  4%  perf-stat.overall.iTLB-load-miss-rate%
      1769 ±  5%     -15.7%       1490 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
    104.86            +1.5%     106.46        perf-stat.ps.cpu-migrations
    815914 ±  5%     +17.3%     957385 ±  2%  perf-stat.ps.iTLB-load-misses
    568201 ±  3%     -29.0%     403248 ±  4%  perf-stat.ps.node-load-misses
    598933 ±  3%     -29.9%     420034 ±  3%  perf-stat.ps.node-loads


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests