Greeting,

FYI, we noticed a -8.1% regression of phoronix-test-suite.fio.SequentialRead.LinuxAIO.Yes.Yes.4KB.DefaultTestDirectory.mb_s due to commit:


commit: 8b157c14b505f861cf8da783ff89f679a0e50abe ("[PATCH -next] mm/filemap: fix that first page is not mark accessed in filemap_read()")
url: https://github.com/intel-lab-lkp/linux/commits/Yu-Kuai/mm-filemap-fix-that-first-page-is-not-mark-accessed-in-filemap_read/20220602-161035
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/linux-fsdevel/20220602082129.2805890-1-yukuai3@huawei.com

in testcase: phoronix-test-suite
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with following parameters:

	test: fio-1.14.1
	option_a: Sequential Read
	option_b: Linux AIO
	option_c: Yes
	option_d: Yes
	option_e: 4KB
	option_f: Default Test Directory
	cpufreq_governor: performance
	ucode: 0x500320a

test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available that provides an extensible framework for which new tests can be easily added.
test-url: http://www.phoronix-test-suite.com/


If you fix the issue, kindly add following tag
Reported-by: kernel test robot


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.

=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/option_c/option_d/option_e/option_f/rootfs/tbox_group/test/testcase/ucode:
  gcc-11/performance/x86_64-rhel-8.3/Sequential Read/Linux AIO/Yes/Yes/4KB/Default Test Directory/debian-x86_64-phoronix/lkp-csl-2sp7/fio-1.14.1/phoronix-test-suite/0x500320a

commit:
  2408f14000 ("Merge branch 'mm-nonmm-unstable' into mm-everything")
  8b157c14b5 ("mm/filemap: fix that first page is not mark accessed in filemap_read()")

2408f140000f9597 8b157c14b505f861cf8da783ff8
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
    481388            -8.1%     442333        phoronix-test-suite.fio.SequentialRead.LinuxAIO.Yes.Yes.4KB.DefaultTestDirectory.iops
      1880            -8.1%       1727        phoronix-test-suite.fio.SequentialRead.LinuxAIO.Yes.Yes.4KB.DefaultTestDirectory.mb_s
 2.894e+08            -8.1%  2.659e+08        phoronix-test-suite.time.file_system_inputs
      0.11 ± 22%      -0.0        0.08        mpstat.cpu.all.soft%
    292.39 ± 35%     -35.3%     189.30 ±  8%  sched_debug.cpu.clock_task.stddev
    933030           +47.4%    1374932 ±  2%  numa-meminfo.node0.Active
     92985 ± 16%    +478.0%     537464 ±  6%  numa-meminfo.node0.Active(file)
     23246 ± 16%    +475.4%     133769 ±  6%  numa-vmstat.node0.nr_active_file
     23246 ± 16%    +475.4%     133769 ±  6%  numa-vmstat.node0.nr_zone_active_file
   1181131            -8.1%    1085364        vmstat.io.bi
     20529            -7.4%      19019        vmstat.system.cs
    954480           +45.1%    1384840 ±  3%  meminfo.Active
    112134          +386.0%     544959 ±  7%  meminfo.Active(file)
   2756213           -13.9%    2371792        meminfo.Inactive
   1492877           -25.8%    1108430        meminfo.Inactive(file)
     84.17 ± 10%     -11.7%      74.33        turbostat.Avg_MHz
      4.72 ± 18%      -0.9        3.84        turbostat.Busy%
    854421 ±133%     -82.0%     154039 ± 20%  turbostat.C1
      0.49 ±155%      -0.4        0.06 ± 11%  turbostat.C1%
     28033          +386.2%     136307 ±  7%  proc-vmstat.nr_active_file
    373247           -25.8%     277108        proc-vmstat.nr_inactive_file
     28033          +386.2%     136308 ±  7%  proc-vmstat.nr_zone_active_file
    373247           -25.8%     277108        proc-vmstat.nr_zone_inactive_file
  40703167 ±  2%      -8.5%   37255189        proc-vmstat.numa_hit
  40122593            -7.5%   37096628        proc-vmstat.numa_local
    316253        +10501.8%   33528470        proc-vmstat.pgactivate
  40072448            -7.3%   37140540        proc-vmstat.pgalloc_normal
  39689252            -7.5%   36696525        proc-vmstat.pgfree
 1.447e+08            -8.1%   1.33e+08        proc-vmstat.pgpgin
     22.95 ± 52%     -53.5%      10.67        perf-stat.i.MPKI
 1.088e+09            -3.3%  1.052e+09        perf-stat.i.branch-instructions
  14531811 ± 29%     -30.9%   10047658        perf-stat.i.branch-misses
  31350962            -9.2%   28459348        perf-stat.i.cache-misses
  86567058 ± 24%     -29.3%   61243543        perf-stat.i.cache-references
     21004            -7.6%      19398        perf-stat.i.context-switches
 7.243e+09 ± 11%     -13.5%  6.262e+09        perf-stat.i.cpu-cycles
      0.14 ± 95%      -0.1        0.01 ± 10%  perf-stat.i.dTLB-load-miss-rate%
   1307140 ± 15%     +17.6%    1537276        perf-stat.i.iTLB-loads
 5.234e+09            -2.9%  5.084e+09        perf-stat.i.instructions
      2655 ±  5%     -10.9%       2366 ±  3%  perf-stat.i.instructions-per-iTLB-miss
     75383 ± 11%     -13.5%      65208        perf-stat.i.metric.GHz
   6029414            -6.2%    5655914        perf-stat.i.node-loads
     20.94 ± 15%      +3.7       24.66 ±  3%  perf-stat.i.node-store-miss-rate%
     82166 ± 23%     +29.0%     106019 ±  2%  perf-stat.i.node-store-misses
   6382540            -9.0%    5805257        perf-stat.i.node-stores
     16.54 ± 24%     -27.2%      12.04        perf-stat.overall.MPKI
      2862 ±  5%     -11.1%       2544 ±  3%  perf-stat.overall.instructions-per-iTLB-miss
      5.63 ± 15%      +1.0        6.67        perf-stat.overall.node-load-miss-rate%
      1.27 ± 23%      +0.5        1.79        perf-stat.overall.node-store-miss-rate%
 1.078e+09            -3.3%  1.043e+09        perf-stat.ps.branch-instructions
  14418791 ± 29%     -30.9%    9965662        perf-stat.ps.branch-misses
  31056696            -9.2%   28199667        perf-stat.ps.cache-misses
  85785810 ± 24%     -29.3%   60689278        perf-stat.ps.cache-references
     20807            -7.6%      19221        perf-stat.ps.context-switches
 7.181e+09 ± 11%     -13.5%  6.209e+09        perf-stat.ps.cpu-cycles
   1296058 ± 15%     +17.6%    1524338        perf-stat.ps.iTLB-loads
 5.189e+09            -2.9%   5.04e+09        perf-stat.ps.instructions
   5972497            -6.2%    5604175        perf-stat.ps.node-loads
     81503 ± 23%     +29.0%     105130 ±  2%  perf-stat.ps.node-store-misses
   6322173            -9.0%    5752078        perf-stat.ps.node-stores
 6.205e+11            -2.6%  6.041e+11        perf-stat.total.instructions
      7.61 ± 14%      -1.6        6.00 ± 13%  perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.aio_read.io_submit_one.__x64_sys_io_submit
      4.09 ± 14%      -0.8        3.27 ± 11%  perf-profile.calltrace.cycles-pp.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.calltrace.cycles-pp.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.calltrace.cycles-pp.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.calltrace.cycles-pp.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.46 ± 15%      -0.5        2.00 ± 11%  perf-profile.calltrace.cycles-pp.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.35 ± 16%      -0.4        1.90 ± 11%  perf-profile.calltrace.cycles-pp.do_io_getevents.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.68 ± 16%      -0.4        1.30 ± 15%  perf-profile.calltrace.cycles-pp.read_pages.page_cache_ra_unbounded.filemap_get_pages.filemap_read.aio_read
      1.75 ± 14%      -0.4        1.38 ± 10%  perf-profile.calltrace.cycles-pp.release_pages.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64
      1.77 ± 14%      -0.4        1.40 ± 10%  perf-profile.calltrace.cycles-pp.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64
      0.89 ± 18%      -0.3        0.59 ± 46%  perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.aio_read.io_submit_one
      0.85 ± 18%      -0.3        0.57 ± 46%  perf-profile.calltrace.cycles-pp.ext4_mpage_readpages.read_pages.page_cache_ra_unbounded.filemap_get_pages.filemap_read
      1.49 ± 14%      -0.3        1.22 ± 12%  perf-profile.calltrace.cycles-pp.folio_alloc.page_cache_ra_unbounded.filemap_get_pages.filemap_read.aio_read
      1.32 ± 13%      -0.2        1.08 ± 12%  perf-profile.calltrace.cycles-pp.__alloc_pages.folio_alloc.page_cache_ra_unbounded.filemap_get_pages.filemap_read
      0.98 ± 13%      -0.2        0.76 ± 11%  perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise
      1.05 ± 12%      -0.2        0.84 ± 14%  perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.folio_alloc.page_cache_ra_unbounded.filemap_get_pages
      0.90 ± 10%      -0.2        0.72 ± 14%  perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc.page_cache_ra_unbounded
      0.75 ± 15%      -0.2        0.59 ± 10%  perf-profile.calltrace.cycles-pp.free_unref_page_commit.free_unref_page_list.release_pages.__pagevec_release.invalidate_mapping_pagevec
      1.53 ± 17%      +0.4        1.95 ±  9%  perf-profile.calltrace.cycles-pp.schedule.worker_thread.kthread.ret_from_fork
      1.53 ± 17%      +0.4        1.95 ±  9%  perf-profile.calltrace.cycles-pp.__schedule.schedule.worker_thread.kthread.ret_from_fork
      0.00            +1.2        1.17 ± 18%  perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.folio_mark_accessed.filemap_read.aio_read.io_submit_one
      0.31 ±101%      +2.2        2.47 ± 17%  perf-profile.calltrace.cycles-pp.folio_mark_accessed.filemap_read.aio_read.io_submit_one.__x64_sys_io_submit
      7.61 ± 14%      -1.6        6.00 ± 13%  perf-profile.children.cycles-pp.filemap_get_pages
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.children.cycles-pp.__x64_sys_fadvise64
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.children.cycles-pp.ksys_fadvise64_64
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.children.cycles-pp.generic_fadvise
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.children.cycles-pp.invalidate_mapping_pagevec
      2.47 ± 15%      -0.5        2.00 ± 11%  perf-profile.children.cycles-pp.__x64_sys_io_getevents
      2.36 ± 16%      -0.5        1.90 ± 11%  perf-profile.children.cycles-pp.do_io_getevents
      1.68 ± 16%      -0.4        1.30 ± 15%  perf-profile.children.cycles-pp.read_pages
      1.77 ± 14%      -0.4        1.40 ± 10%  perf-profile.children.cycles-pp.__pagevec_release
      1.49 ± 14%      -0.3        1.22 ± 12%  perf-profile.children.cycles-pp.folio_alloc
      1.40 ± 12%      -0.3        1.14 ± 12%  perf-profile.children.cycles-pp.__alloc_pages
      1.16 ± 15%      -0.3        0.90 ± 13%  perf-profile.children.cycles-pp.lookup_ioctx
      0.90 ± 18%      -0.2        0.67 ± 17%  perf-profile.children.cycles-pp.filemap_get_read_batch
      1.00 ± 12%      -0.2        0.78 ± 12%  perf-profile.children.cycles-pp.free_unref_page_list
      1.08 ± 11%      -0.2        0.86 ± 14%  perf-profile.children.cycles-pp.get_page_from_freelist
      0.85 ± 18%      -0.2        0.65 ± 15%  perf-profile.children.cycles-pp.ext4_mpage_readpages
      0.88 ± 16%      -0.2        0.70 ± 14%  perf-profile.children.cycles-pp.__might_resched
      0.93 ± 10%      -0.2        0.75 ± 14%  perf-profile.children.cycles-pp.rmqueue
      0.78 ± 15%      -0.2        0.61 ± 10%  perf-profile.children.cycles-pp.free_unref_page_commit
      0.61 ± 16%      -0.1        0.48 ± 12%  perf-profile.children.cycles-pp.free_pcppages_bulk
      0.27 ± 15%      -0.1        0.20 ± 11%  perf-profile.children.cycles-pp.hrtimer_next_event_without
      0.16 ± 22%      -0.1        0.11 ± 19%  perf-profile.children.cycles-pp.hrtimer_update_next_event
      0.08 ± 20%      -0.0        0.04 ± 47%  perf-profile.children.cycles-pp.mem_cgroup_charge_statistics
      0.08 ±  9%      -0.0        0.05 ± 47%  perf-profile.children.cycles-pp.tick_program_event
      1.46 ± 13%      +0.3        1.76 ±  8%  perf-profile.children.cycles-pp.load_balance
      0.00            +0.4        0.43 ± 16%  perf-profile.children.cycles-pp.workingset_age_nonresident
      0.00            +0.7        0.65 ± 17%  perf-profile.children.cycles-pp.workingset_activation
      0.00            +0.7        0.67 ± 17%  perf-profile.children.cycles-pp.__folio_activate
      0.00            +1.2        1.18 ± 18%  perf-profile.children.cycles-pp.pagevec_lru_move_fn
      0.57 ± 17%      +1.9        2.51 ± 17%  perf-profile.children.cycles-pp.folio_mark_accessed
      4.33 ± 17%      -0.9        3.45 ± 13%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
      1.36 ± 12%      -0.5        0.84 ± 36%  perf-profile.self.cycles-pp.menu_select
      0.64 ± 17%      -0.2        0.45 ± 18%  perf-profile.self.cycles-pp.filemap_get_read_batch
      0.86 ± 16%      -0.2        0.67 ± 13%  perf-profile.self.cycles-pp.__might_resched
      0.46 ± 19%      -0.1        0.32 ± 18%  perf-profile.self.cycles-pp.__get_user_4
      0.34 ± 10%      -0.1        0.24 ±  3%  perf-profile.self.cycles-pp.copy_page_to_iter
      0.14 ± 17%      -0.0        0.09 ± 32%  perf-profile.self.cycles-pp.aio_prep_rw
      0.11 ± 14%      -0.0        0.07 ± 23%  perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
      0.08 ± 12%      -0.0        0.04 ± 73%  perf-profile.self.cycles-pp.tick_program_event
      0.14 ±  9%      -0.0        0.10 ± 11%  perf-profile.self.cycles-pp.atime_needs_update
      0.00            +0.2        0.22 ± 26%  perf-profile.self.cycles-pp.workingset_activation
      0.00            +0.3        0.29 ± 19%  perf-profile.self.cycles-pp.pagevec_lru_move_fn
      0.00            +0.4        0.35 ± 16%  perf-profile.self.cycles-pp.__folio_activate
      0.00            +0.4        0.43 ± 16%  perf-profile.self.cycles-pp.workingset_age_nonresident


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://01.org/lkp