* [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
@ 2024-05-17 5:56 kernel test robot
2024-05-17 23:38 ` Yosry Ahmed
2024-05-18 6:28 ` Shakeel Butt
0 siblings, 2 replies; 15+ messages in thread
From: kernel test robot @ 2024-05-17 5:56 UTC (permalink / raw)
To: Shakeel Butt
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin, oliver.sang
Hello,
kernel test robot noticed a -11.9% regression of will-it-scale.per_process_ops on:
commit: 70a64b7919cbd6c12306051ff2825839a9d65605 ("memcg: dynamically allocate lruvec_stats")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
testcase: will-it-scale
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:
nr_task: 100%
mode: process
test: page_fault2
cpufreq_governor: performance
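The page_fault2 testcase is, roughly, a per-process loop that maps a file-backed
shared region, write-faults every page, and unmaps it again. A simplified sketch
of one iteration is below (not the actual will-it-scale source; the file name,
mapping size and the 4 KiB page size are assumptions), which is why the fault,
munmap/zap and lruvec-lock paths dominate the profiles further down:

#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAPSIZE (128UL * 1024 * 1024)	/* illustrative size only */

int main(void)
{
	/* temp file under /tmp (tmpfs on the test machine), hence the shmem_fault frames */
	char path[] = "/tmp/page_fault2.XXXXXX";
	int fd = mkstemp(path);

	unlink(path);
	ftruncate(fd, MAPSIZE);

	for (;;) {
		char *p = mmap(NULL, MAPSIZE, PROT_READ | PROT_WRITE,
			       MAP_SHARED, fd, 0);

		/* write fault on every page: the do_fault()/finish_fault() side */
		for (unsigned long off = 0; off < MAPSIZE; off += 4096)
			p[off] = 1;

		/* unmap: the zap_pte_range()/folios_put_refs() side */
		munmap(p, MAPSIZE);
	}
}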
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202405171353.b56b845-oliver.sang@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240517/202405171353.b56b845-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/page_fault2/will-it-scale
commit:
59142d87ab ("memcg: reduce memory size of mem_cgroup_events_index")
70a64b7919 ("memcg: dynamically allocate lruvec_stats")
59142d87ab03b8ff 70a64b7919cbd6c12306051ff28
---------------- ---------------------------
%stddev %change %stddev
\ | \
7.14 -0.8 6.32 mpstat.cpu.all.usr%
245257 ± 7% -13.8% 211354 ± 4% sched_debug.cfs_rq:/.avg_vruntime.stddev
245258 ± 7% -13.8% 211353 ± 4% sched_debug.cfs_rq:/.min_vruntime.stddev
21099 ± 5% -14.9% 17946 ± 5% perf-c2c.DRAM.local
4025 ± 2% +29.1% 5197 ± 3% perf-c2c.HITM.local
105.17 ± 8% -12.7% 91.83 ± 6% perf-c2c.HITM.remote
9538291 -11.9% 8402170 will-it-scale.104.processes
91713 -11.9% 80789 will-it-scale.per_process_ops
9538291 -11.9% 8402170 will-it-scale.workload
1.438e+09 -11.2% 1.276e+09 numa-numastat.node0.local_node
1.44e+09 -11.3% 1.278e+09 numa-numastat.node0.numa_hit
83001 ± 15% -68.9% 25774 ± 34% numa-numastat.node0.other_node
1.453e+09 -12.5% 1.271e+09 numa-numastat.node1.local_node
1.454e+09 -12.5% 1.272e+09 numa-numastat.node1.numa_hit
24752 ± 51% +230.9% 81910 ± 10% numa-numastat.node1.other_node
1.44e+09 -11.3% 1.278e+09 numa-vmstat.node0.numa_hit
1.438e+09 -11.3% 1.276e+09 numa-vmstat.node0.numa_local
83001 ± 15% -68.9% 25774 ± 34% numa-vmstat.node0.numa_other
1.454e+09 -12.5% 1.272e+09 numa-vmstat.node1.numa_hit
1.453e+09 -12.5% 1.271e+09 numa-vmstat.node1.numa_local
24752 ± 51% +230.9% 81910 ± 10% numa-vmstat.node1.numa_other
14952 -3.2% 14468 proc-vmstat.nr_mapped
2.894e+09 -11.9% 2.55e+09 proc-vmstat.numa_hit
2.891e+09 -11.9% 2.548e+09 proc-vmstat.numa_local
2.88e+09 -11.8% 2.539e+09 proc-vmstat.pgalloc_normal
2.869e+09 -11.9% 2.529e+09 proc-vmstat.pgfault
2.88e+09 -11.8% 2.539e+09 proc-vmstat.pgfree
17.51 -2.6% 17.05 perf-stat.i.MPKI
9.457e+09 -9.2% 8.585e+09 perf-stat.i.branch-instructions
45022022 -8.2% 41340795 perf-stat.i.branch-misses
84.38 -4.9 79.51 perf-stat.i.cache-miss-rate%
8.353e+08 -12.1% 7.345e+08 perf-stat.i.cache-misses
9.877e+08 -6.7% 9.216e+08 perf-stat.i.cache-references
6.06 +10.8% 6.72 perf-stat.i.cpi
136.25 -1.2% 134.59 perf-stat.i.cpu-migrations
348.56 +13.9% 396.93 perf-stat.i.cycles-between-cache-misses
4.763e+10 -9.7% 4.302e+10 perf-stat.i.instructions
0.17 -9.6% 0.15 perf-stat.i.ipc
182.56 -11.9% 160.88 perf-stat.i.metric.K/sec
9494393 -11.9% 8368012 perf-stat.i.minor-faults
9494393 -11.9% 8368012 perf-stat.i.page-faults
17.54 -2.6% 17.08 perf-stat.overall.MPKI
0.47 +0.0 0.48 perf-stat.overall.branch-miss-rate%
84.57 -4.9 79.71 perf-stat.overall.cache-miss-rate%
6.07 +10.8% 6.73 perf-stat.overall.cpi
346.33 +13.8% 393.97 perf-stat.overall.cycles-between-cache-misses
0.16 -9.7% 0.15 perf-stat.overall.ipc
1503802 +2.6% 1542599 perf-stat.overall.path-length
9.424e+09 -9.2% 8.553e+09 perf-stat.ps.branch-instructions
44739120 -8.3% 41034189 perf-stat.ps.branch-misses
8.326e+08 -12.1% 7.321e+08 perf-stat.ps.cache-misses
9.846e+08 -6.7% 9.185e+08 perf-stat.ps.cache-references
134.98 -1.3% 133.26 perf-stat.ps.cpu-migrations
4.747e+10 -9.7% 4.286e+10 perf-stat.ps.instructions
9463902 -11.9% 8339836 perf-stat.ps.minor-faults
9463902 -11.9% 8339836 perf-stat.ps.page-faults
1.434e+13 -9.6% 1.296e+13 perf-stat.total.instructions
64.15 -2.4 61.72 perf-profile.calltrace.cycles-pp.testcase
58.30 -1.9 56.41 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
52.64 -1.4 51.28 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
52.50 -1.3 51.16 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
50.81 -1.0 49.86 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
49.86 -0.8 49.02 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
9.27 -0.8 8.45 ± 3% perf-profile.calltrace.cycles-pp.copy_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
49.21 -0.8 48.43 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
5.15 -0.5 4.68 perf-profile.calltrace.cycles-pp.__irqentry_text_end.testcase
3.24 -0.5 2.77 perf-profile.calltrace.cycles-pp.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.82 -0.3 0.51 perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
1.68 -0.3 1.42 perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
2.52 -0.2 2.28 perf-profile.calltrace.cycles-pp.error_entry.testcase
1.50 ± 2% -0.2 1.30 perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
1.85 -0.1 1.70 ± 3% perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
0.68 -0.1 0.55 ± 2% perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
1.55 -0.1 1.44 ± 3% perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault
0.55 -0.1 0.43 ± 44% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc
1.07 -0.1 0.98 perf-profile.calltrace.cycles-pp.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault
0.90 -0.1 0.81 perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
0.89 -0.0 0.86 perf-profile.calltrace.cycles-pp.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault
1.00 +0.1 1.05 perf-profile.calltrace.cycles-pp.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
3.85 +0.2 4.10 perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
3.85 +0.2 4.10 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
3.85 +0.2 4.10 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
3.82 +0.3 4.07 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region
3.68 +0.3 3.94 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu
0.83 +0.3 1.10 ± 2% perf-profile.calltrace.cycles-pp.__lruvec_stat_mod_folio.set_pte_range.finish_fault.do_fault.__handle_mm_fault
0.00 +0.5 0.54 perf-profile.calltrace.cycles-pp.__lruvec_stat_mod_folio.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range
0.00 +0.7 0.66 perf-profile.calltrace.cycles-pp.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range
32.87 +0.7 33.62 perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
29.54 +2.3 31.80 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range
29.54 +2.3 31.80 perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
29.53 +2.3 31.80 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range
30.66 +2.3 32.93 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
30.66 +2.3 32.93 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
30.66 +2.3 32.93 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
30.66 +2.3 32.93 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
29.26 +2.3 31.60 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
28.41 +2.4 30.78 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu
34.56 +2.5 37.08 perf-profile.calltrace.cycles-pp.__munmap
34.56 +2.5 37.08 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
34.56 +2.5 37.08 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
34.55 +2.5 37.07 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
34.55 +2.5 37.08 perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
34.55 +2.5 37.08 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
34.55 +2.5 37.08 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
34.55 +2.5 37.08 perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
31.41 +2.8 34.20 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache
31.42 +2.8 34.23 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages
31.38 +2.8 34.19 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs
65.26 -2.5 62.73 perf-profile.children.cycles-pp.testcase
56.09 -1.7 54.41 perf-profile.children.cycles-pp.asm_exc_page_fault
52.66 -1.4 51.30 perf-profile.children.cycles-pp.exc_page_fault
52.52 -1.3 51.18 perf-profile.children.cycles-pp.do_user_addr_fault
50.83 -1.0 49.88 perf-profile.children.cycles-pp.handle_mm_fault
49.87 -0.8 49.02 perf-profile.children.cycles-pp.__handle_mm_fault
9.35 -0.8 8.53 ± 3% perf-profile.children.cycles-pp.copy_page
49.23 -0.8 48.45 perf-profile.children.cycles-pp.do_fault
5.15 -0.5 4.68 perf-profile.children.cycles-pp.__irqentry_text_end
3.27 -0.5 2.80 perf-profile.children.cycles-pp.folio_prealloc
0.82 -0.3 0.52 perf-profile.children.cycles-pp.lock_vma_under_rcu
0.57 -0.3 0.32 perf-profile.children.cycles-pp.mas_walk
1.69 -0.3 1.43 perf-profile.children.cycles-pp.vma_alloc_folio_noprof
2.54 -0.2 2.30 perf-profile.children.cycles-pp.error_entry
1.52 ± 2% -0.2 1.31 perf-profile.children.cycles-pp.__mem_cgroup_charge
0.95 -0.2 0.79 ± 4% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
1.87 -0.2 1.72 ± 3% perf-profile.children.cycles-pp.__pte_offset_map_lock
0.60 ± 4% -0.1 0.46 ± 6% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
0.70 -0.1 0.56 ± 2% perf-profile.children.cycles-pp.lru_add_fn
1.57 -0.1 1.45 ± 3% perf-profile.children.cycles-pp._raw_spin_lock
1.16 -0.1 1.04 perf-profile.children.cycles-pp.native_irq_return_iret
1.12 -0.1 1.01 perf-profile.children.cycles-pp.alloc_pages_mpol_noprof
0.44 -0.1 0.35 perf-profile.children.cycles-pp.get_vma_policy
0.94 -0.1 0.85 perf-profile.children.cycles-pp.sync_regs
0.96 -0.1 0.87 perf-profile.children.cycles-pp.__perf_sw_event
0.43 -0.1 0.34 ± 2% perf-profile.children.cycles-pp.free_unref_folios
0.21 ± 3% -0.1 0.13 ± 3% perf-profile.children.cycles-pp._compound_head
0.75 -0.1 0.68 perf-profile.children.cycles-pp.___perf_sw_event
0.31 -0.1 0.25 perf-profile.children.cycles-pp.__mem_cgroup_uncharge_folios
0.94 -0.0 0.90 perf-profile.children.cycles-pp.__alloc_pages_noprof
0.41 ± 4% -0.0 0.37 ± 4% perf-profile.children.cycles-pp.mem_cgroup_commit_charge
0.44 ± 5% -0.0 0.40 ± 5% perf-profile.children.cycles-pp.__count_memcg_events
0.17 ± 2% -0.0 0.13 ± 4% perf-profile.children.cycles-pp.uncharge_batch
0.57 -0.0 0.53 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
0.13 ± 2% -0.0 0.09 ± 5% perf-profile.children.cycles-pp.__mod_zone_page_state
0.19 ± 3% -0.0 0.16 ± 6% perf-profile.children.cycles-pp.cgroup_rstat_updated
0.15 ± 2% -0.0 0.12 ± 3% perf-profile.children.cycles-pp.free_unref_page_commit
0.10 ± 3% -0.0 0.07 ± 5% perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
0.08 -0.0 0.05 perf-profile.children.cycles-pp.policy_nodemask
0.13 ± 3% -0.0 0.10 ± 3% perf-profile.children.cycles-pp.page_counter_uncharge
0.32 ± 3% -0.0 0.30 ± 2% perf-profile.children.cycles-pp.__mod_node_page_state
0.17 ± 2% -0.0 0.15 ± 3% perf-profile.children.cycles-pp.percpu_counter_add_batch
0.16 ± 2% -0.0 0.14 ± 2% perf-profile.children.cycles-pp.shmem_get_policy
0.16 -0.0 0.14 ± 2% perf-profile.children.cycles-pp.handle_pte_fault
0.16 ± 4% -0.0 0.14 ± 4% perf-profile.children.cycles-pp.__pte_offset_map
0.09 -0.0 0.07 ± 5% perf-profile.children.cycles-pp.get_pfnblock_flags_mask
0.12 ± 3% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.uncharge_folio
0.36 -0.0 0.34 perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.10 ± 3% -0.0 0.08 ± 5% perf-profile.children.cycles-pp.pte_offset_map_nolock
0.30 -0.0 0.28 ± 2% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.09 ± 4% -0.0 0.08 perf-profile.children.cycles-pp.down_read_trylock
0.08 -0.0 0.07 ± 5% perf-profile.children.cycles-pp.folio_unlock
0.40 +0.0 0.43 perf-profile.children.cycles-pp.__mod_lruvec_state
1.02 +0.0 1.06 perf-profile.children.cycles-pp.zap_present_ptes
0.47 +0.2 0.67 perf-profile.children.cycles-pp.folio_remove_rmap_ptes
3.87 +0.3 4.12 perf-profile.children.cycles-pp.tlb_finish_mmu
1.17 +0.5 1.71 ± 2% perf-profile.children.cycles-pp.__lruvec_stat_mod_folio
32.88 +0.8 33.63 perf-profile.children.cycles-pp.set_pte_range
29.54 +2.3 31.80 perf-profile.children.cycles-pp.tlb_flush_mmu
30.66 +2.3 32.93 perf-profile.children.cycles-pp.zap_pte_range
30.66 +2.3 32.94 perf-profile.children.cycles-pp.unmap_page_range
30.66 +2.3 32.94 perf-profile.children.cycles-pp.zap_pmd_range
30.66 +2.3 32.94 perf-profile.children.cycles-pp.unmap_vmas
33.41 +2.5 35.92 perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
33.40 +2.5 35.92 perf-profile.children.cycles-pp.free_pages_and_swap_cache
34.56 +2.5 37.08 perf-profile.children.cycles-pp.__munmap
34.56 +2.5 37.08 perf-profile.children.cycles-pp.__vm_munmap
34.56 +2.5 37.08 perf-profile.children.cycles-pp.__x64_sys_munmap
34.56 +2.5 37.09 perf-profile.children.cycles-pp.do_vmi_munmap
34.56 +2.5 37.09 perf-profile.children.cycles-pp.do_vmi_align_munmap
34.67 +2.5 37.20 perf-profile.children.cycles-pp.do_syscall_64
34.67 +2.5 37.20 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
34.56 +2.5 37.09 perf-profile.children.cycles-pp.unmap_region
33.22 +2.6 35.80 perf-profile.children.cycles-pp.folios_put_refs
32.12 +2.6 34.75 perf-profile.children.cycles-pp.__page_cache_release
61.97 +3.3 65.27 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
61.94 +3.3 65.26 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
61.98 +3.3 65.30 perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
9.32 -0.8 8.49 ± 3% perf-profile.self.cycles-pp.copy_page
5.15 -0.5 4.68 perf-profile.self.cycles-pp.__irqentry_text_end
0.56 -0.3 0.31 perf-profile.self.cycles-pp.mas_walk
2.58 -0.2 2.33 perf-profile.self.cycles-pp.testcase
2.53 -0.2 2.30 perf-profile.self.cycles-pp.error_entry
0.60 ± 4% -0.2 0.44 ± 6% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
0.85 -0.1 0.71 ± 4% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
1.54 -0.1 1.43 ± 3% perf-profile.self.cycles-pp._raw_spin_lock
1.15 -0.1 1.04 perf-profile.self.cycles-pp.native_irq_return_iret
0.94 -0.1 0.85 perf-profile.self.cycles-pp.sync_regs
0.20 ± 3% -0.1 0.13 ± 3% perf-profile.self.cycles-pp._compound_head
0.27 ± 3% -0.1 0.20 ± 3% perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.26 -0.1 0.18 ± 2% perf-profile.self.cycles-pp.get_vma_policy
0.26 -0.1 0.19 ± 2% perf-profile.self.cycles-pp.__page_cache_release
0.16 -0.1 0.09 ± 5% perf-profile.self.cycles-pp.vma_alloc_folio_noprof
0.28 ± 2% -0.1 0.22 ± 3% perf-profile.self.cycles-pp.zap_present_ptes
0.66 -0.1 0.60 perf-profile.self.cycles-pp.___perf_sw_event
0.32 -0.1 0.27 ± 5% perf-profile.self.cycles-pp.lru_add_fn
0.47 -0.0 0.43 ± 2% perf-profile.self.cycles-pp.__handle_mm_fault
0.16 ± 4% -0.0 0.12 perf-profile.self.cycles-pp.lock_vma_under_rcu
0.20 -0.0 0.16 ± 4% perf-profile.self.cycles-pp.free_unref_folios
0.30 -0.0 0.26 perf-profile.self.cycles-pp.handle_mm_fault
0.10 ± 4% -0.0 0.07 perf-profile.self.cycles-pp.zap_pte_range
0.09 ± 5% -0.0 0.06 ± 6% perf-profile.self.cycles-pp.mem_cgroup_update_lru_size
0.14 ± 2% -0.0 0.11 ± 4% perf-profile.self.cycles-pp.mem_cgroup_commit_charge
0.14 ± 3% -0.0 0.12 ± 4% perf-profile.self.cycles-pp.folio_remove_rmap_ptes
0.12 ± 4% -0.0 0.09 ± 7% perf-profile.self.cycles-pp.__mod_zone_page_state
0.10 ± 4% -0.0 0.08 ± 6% perf-profile.self.cycles-pp.alloc_pages_mpol_noprof
0.11 -0.0 0.08 ± 5% perf-profile.self.cycles-pp.free_unref_page_commit
0.22 ± 2% -0.0 0.19 perf-profile.self.cycles-pp.__pte_offset_map_lock
0.21 -0.0 0.18 ± 2% perf-profile.self.cycles-pp.__perf_sw_event
0.21 -0.0 0.18 ± 2% perf-profile.self.cycles-pp.do_user_addr_fault
0.31 ± 2% -0.0 0.29 perf-profile.self.cycles-pp.__mod_node_page_state
0.16 ± 2% -0.0 0.14 ± 5% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.17 ± 2% -0.0 0.15 ± 2% perf-profile.self.cycles-pp.percpu_counter_add_batch
0.11 -0.0 0.09 ± 4% perf-profile.self.cycles-pp.page_counter_uncharge
0.09 -0.0 0.07 perf-profile.self.cycles-pp.get_pfnblock_flags_mask
0.28 ± 2% -0.0 0.26 ± 2% perf-profile.self.cycles-pp.xas_load
0.16 ± 2% -0.0 0.14 ± 3% perf-profile.self.cycles-pp.get_page_from_freelist
0.12 -0.0 0.10 ± 3% perf-profile.self.cycles-pp.uncharge_folio
0.16 ± 4% -0.0 0.14 ± 3% perf-profile.self.cycles-pp.__pte_offset_map
0.20 ± 2% -0.0 0.19 ± 2% perf-profile.self.cycles-pp.shmem_get_folio_gfp
0.16 ± 3% -0.0 0.14 ± 3% perf-profile.self.cycles-pp.shmem_get_policy
0.14 ± 3% -0.0 0.12 ± 4% perf-profile.self.cycles-pp.do_fault
0.08 -0.0 0.07 ± 7% perf-profile.self.cycles-pp.folio_unlock
0.12 ± 3% -0.0 0.11 perf-profile.self.cycles-pp.folio_add_new_anon_rmap
0.09 -0.0 0.08 perf-profile.self.cycles-pp.down_read_trylock
0.07 -0.0 0.06 perf-profile.self.cycles-pp.folio_prealloc
0.38 ± 2% +0.0 0.42 ± 3% perf-profile.self.cycles-pp.filemap_get_entry
0.26 +0.1 0.36 perf-profile.self.cycles-pp.folios_put_refs
0.33 +0.1 0.44 ± 3% perf-profile.self.cycles-pp.folio_batch_move_lru
0.40 ± 5% +0.6 0.98 perf-profile.self.cycles-pp.__lruvec_stat_mod_folio
61.94 +3.3 65.26 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-17 5:56 [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression kernel test robot
@ 2024-05-17 23:38 ` Yosry Ahmed
2024-05-18 6:28 ` Shakeel Butt
1 sibling, 0 replies; 15+ messages in thread
From: Yosry Ahmed @ 2024-05-17 23:38 UTC (permalink / raw)
To: kernel test robot
Cc: Shakeel Butt, oe-lkp, lkp, Linux Memory Management List,
Andrew Morton, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin
On Thu, May 16, 2024 at 10:56 PM kernel test robot
<oliver.sang@intel.com> wrote:
>
>
>
> Hello,
>
> kernel test robot noticed a -11.9% regression of will-it-scale.per_process_ops on:
>
>
> commit: 70a64b7919cbd6c12306051ff2825839a9d65605 ("memcg: dynamically allocate lruvec_stats")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
I think we may want to go back to the approach of reordering the
indices to separate memcg and non-memcg stats. If we really want to
preserve the order in which the stats are exported to userspace, we
can use a translation table on the read path instead of the update
path.
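For illustration, a minimal sketch of that translation-table idea (hypothetical
stat names and helpers, not the actual memcg enums): the hot update path indexes
a packed, memcg-only array directly, and only readers map the userspace-exported
order back to the packed order:

/* packed, update-path order (names are made up) */
enum memcg_stat_idx {
	MEMCG_STAT_ANON,
	MEMCG_STAT_FILE,
	MEMCG_STAT_SHMEM,
	NR_MEMCG_STATS,
};

/* order in which the stats are exported to userspace (also made up) */
enum exported_stat_idx {
	EXPORTED_FILE,
	EXPORTED_ANON,
	EXPORTED_SHMEM,
	NR_EXPORTED_STATS,
};

/* read-path translation table: exported index -> packed index */
static const int exported_to_packed[NR_EXPORTED_STATS] = {
	[EXPORTED_FILE]		= MEMCG_STAT_FILE,
	[EXPORTED_ANON]		= MEMCG_STAT_ANON,
	[EXPORTED_SHMEM]	= MEMCG_STAT_SHMEM,
};

struct memcg_stats {
	long state[NR_MEMCG_STATS];
};

/* hot path: direct indexing, no translation on every update */
static inline void memcg_stat_add(struct memcg_stats *s,
				  enum memcg_stat_idx idx, long delta)
{
	s->state[idx] += delta;
}

/* cold path: pay for the indirection only when reading/exporting */
static inline long memcg_stat_read(const struct memcg_stats *s,
				   enum exported_stat_idx idx)
{
	return s->state[exported_to_packed[idx]];
}

That keeps the extra lookup off the per-fault update path and confines it to the
comparatively rare reads of the exported stats.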
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-17 5:56 [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression kernel test robot
2024-05-17 23:38 ` Yosry Ahmed
@ 2024-05-18 6:28 ` Shakeel Butt
2024-05-19 9:14 ` Oliver Sang
1 sibling, 1 reply; 15+ messages in thread
From: Shakeel Butt @ 2024-05-18 6:28 UTC (permalink / raw)
To: kernel test robot
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin
On Fri, May 17, 2024 at 01:56:30PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed a -11.9% regression of will-it-scale.per_process_ops on:
>
>
> commit: 70a64b7919cbd6c12306051ff2825839a9d65605 ("memcg: dynamically allocate lruvec_stats")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
Thanks for the report. Can you please run the same benchmark but with
the full series (of 8 patches), or at least include ff48c71c26aa
("memcg: reduce memory for the lruvec and memcg stats")?
thanks,
Shakeel
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-18 6:28 ` Shakeel Butt
@ 2024-05-19 9:14 ` Oliver Sang
2024-05-19 17:20 ` Shakeel Butt
0 siblings, 1 reply; 15+ messages in thread
From: Oliver Sang @ 2024-05-19 9:14 UTC (permalink / raw)
To: Shakeel Butt
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin, oliver.sang
hi, Shakeel,
On Fri, May 17, 2024 at 11:28:10PM -0700, Shakeel Butt wrote:
> On Fri, May 17, 2024 at 01:56:30PM +0800, kernel test robot wrote:
> >
> >
> > Hello,
> >
> > kernel test robot noticed a -11.9% regression of will-it-scale.per_process_ops on:
> >
> >
> > commit: 70a64b7919cbd6c12306051ff2825839a9d65605 ("memcg: dynamically allocate lruvec_stats")
> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> >
>
> Thanks for the report. Can you please run the same benchmark but with
> the full series (of 8 patches) or at least include the ff48c71c26aa
> ("memcg: reduce memory for the lruvec and memcg stats").
while doing this bisect, ff48c71c26aa has been checked. It has similar data to
70a64b7919 (a little worse actually)
59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
91713 -11.9% 80789 -13.2% 79612 will-it-scale.per_process_ops
OK, we will run tests on the tip of the series, which should be the commit below
if I understand it correctly.
* a94032b35e5f9 memcg: use proper type for mod_memcg_state
>
> thanks,
> Shakeel
>
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-19 9:14 ` Oliver Sang
@ 2024-05-19 17:20 ` Shakeel Butt
2024-05-20 2:43 ` Oliver Sang
0 siblings, 1 reply; 15+ messages in thread
From: Shakeel Butt @ 2024-05-19 17:20 UTC (permalink / raw)
To: Oliver Sang
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin
On Sun, May 19, 2024 at 05:14:39PM +0800, Oliver Sang wrote:
> hi, Shakeel,
>
> On Fri, May 17, 2024 at 11:28:10PM -0700, Shakeel Butt wrote:
> > On Fri, May 17, 2024 at 01:56:30PM +0800, kernel test robot wrote:
> > >
> > >
> > > Hello,
> > >
> > > kernel test robot noticed a -11.9% regression of will-it-scale.per_process_ops on:
> > >
> > >
> > > commit: 70a64b7919cbd6c12306051ff2825839a9d65605 ("memcg: dynamically allocate lruvec_stats")
> > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > >
> >
> > Thanks for the report. Can you please run the same benchmark but with
> > the full series (of 8 patches) or at least include the ff48c71c26aa
> > ("memcg: reduce memory for the lruvec and memcg stats").
>
> while doing this bisect, ff48c71c26aa has been checked. It has similar data to
> 70a64b7919 (a little worse actually)
>
> 59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803
> ---------------- --------------------------- ---------------------------
> %stddev %change %stddev %change %stddev
> \ | \ | \
> 91713 -11.9% 80789 -13.2% 79612 will-it-scale.per_process_ops
>
>
> ok, we will run tests on tip of the series which should be below if I understand
> it correctly.
>
> * a94032b35e5f9 memcg: use proper type for mod_memcg_state
>
>
Thanks a lot Oliver. One question: what is the filesystem mounted at
/tmp on your test machine? I just wanted to make sure I run the test
with minimal changes from your setup.
> >
> > thanks,
> > Shakeel
> >
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-19 17:20 ` Shakeel Butt
@ 2024-05-20 2:43 ` Oliver Sang
2024-05-20 3:49 ` Shakeel Butt
0 siblings, 1 reply; 15+ messages in thread
From: Oliver Sang @ 2024-05-20 2:43 UTC (permalink / raw)
To: Shakeel Butt
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin, oliver.sang
hi, Shakeel,
On Sun, May 19, 2024 at 10:20:28AM -0700, Shakeel Butt wrote:
> On Sun, May 19, 2024 at 05:14:39PM +0800, Oliver Sang wrote:
> > hi, Shakeel,
> >
> > On Fri, May 17, 2024 at 11:28:10PM -0700, Shakeel Butt wrote:
> > > On Fri, May 17, 2024 at 01:56:30PM +0800, kernel test robot wrote:
> > > >
> > > >
> > > > Hello,
> > > >
> > > > kernel test robot noticed a -11.9% regression of will-it-scale.per_process_ops on:
> > > >
> > > >
> > > > commit: 70a64b7919cbd6c12306051ff2825839a9d65605 ("memcg: dynamically allocate lruvec_stats")
> > > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > > >
> > >
> > > Thanks for the report. Can you please run the same benchmark but with
> > > the full series (of 8 patches) or at least include the ff48c71c26aa
> > > ("memcg: reduce memory for the lruvec and memcg stats").
> >
> > while doing this bisect, ff48c71c26aa has been checked. It has similar data to
> > 70a64b7919 (a little worse actually)
> >
> > 59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803
> > ---------------- --------------------------- ---------------------------
> > %stddev %change %stddev %change %stddev
> > \ | \ | \
> > 91713 -11.9% 80789 -13.2% 79612 will-it-scale.per_process_ops
> >
> >
> > ok, we will run tests on tip of the series which should be below if I understand
> > it correctly.
> >
> > * a94032b35e5f9 memcg: use proper type for mod_memcg_state
> >
> >
>
> Thanks a lot Oliver. One question: what is the filesystem mounted at
> /tmp on your test machine? I just wanted to make sure I run the test
> with minimal changes from your setup.
we don't have a specific partition for /tmp, we just use tmpfs:
tmp on /tmp type tmpfs (rw,relatime)
BTW, the test on a94032b35e5f9 has finished; it still has a similar score to 70a64b7919
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/page_fault2/will-it-scale
59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803 a94032b35e5f97dc1023030d929
---------------- --------------------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
91713 -11.9% 80789 -13.2% 79612 -13.0% 79833 will-it-scale.per_process_ops
>
> > >
> > > thanks,
> > > Shakeel
> > >
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-20 2:43 ` Oliver Sang
@ 2024-05-20 3:49 ` Shakeel Butt
2024-05-21 2:43 ` Oliver Sang
0 siblings, 1 reply; 15+ messages in thread
From: Shakeel Butt @ 2024-05-20 3:49 UTC (permalink / raw)
To: Oliver Sang
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin
On Mon, May 20, 2024 at 10:43:35AM +0800, Oliver Sang wrote:
> hi, Shakeel,
>
> On Sun, May 19, 2024 at 10:20:28AM -0700, Shakeel Butt wrote:
> > On Sun, May 19, 2024 at 05:14:39PM +0800, Oliver Sang wrote:
> > > hi, Shakeel,
> > >
> > > On Fri, May 17, 2024 at 11:28:10PM -0700, Shakeel Butt wrote:
> > > > On Fri, May 17, 2024 at 01:56:30PM +0800, kernel test robot wrote:
> > > > >
> > > > >
> > > > > Hello,
> > > > >
> > > > > kernel test robot noticed a -11.9% regression of will-it-scale.per_process_ops on:
> > > > >
> > > > >
> > > > > commit: 70a64b7919cbd6c12306051ff2825839a9d65605 ("memcg: dynamically allocate lruvec_stats")
> > > > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > > > >
> > > >
> > > > Thanks for the report. Can you please run the same benchmark but with
> > > > the full series (of 8 patches) or at least include the ff48c71c26aa
> > > > ("memcg: reduce memory for the lruvec and memcg stats").
> > >
> > > while doing this bisect, ff48c71c26aa has been checked. It has similar data to
> > > 70a64b7919 (a little worse actually)
> > >
> > > 59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803
> > > ---------------- --------------------------- ---------------------------
> > > %stddev %change %stddev %change %stddev
> > > \ | \ | \
> > > 91713 -11.9% 80789 -13.2% 79612 will-it-scale.per_process_ops
> > >
> > >
> > > ok, we will run tests on tip of the series which should be below if I understand
> > > it correctly.
> > >
> > > * a94032b35e5f9 memcg: use proper type for mod_memcg_state
> > >
> > >
> >
> > Thanks a lot Oliver. One question: what is the filesystem mounted at
> > /tmp on your test machine? I just wanted to make sure I run the test
> > with minimal changes from your setup.
>
> we don't have specific partition for /tmp, just use tmpfs
>
> tmp on /tmp type tmpfs (rw,relatime)
>
>
> BTW, the test on a94032b35e5f9 finished, still have similar score to 70a64b7919
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
> gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/page_fault2/will-it-scale
>
> 59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803 a94032b35e5f97dc1023030d929
> ---------------- --------------------------- --------------------------- ---------------------------
> %stddev %change %stddev %change %stddev %change %stddev
> \ | \ | \ | \
> 91713 -11.9% 80789 -13.2% 79612 -13.0% 79833 will-it-scale.per_process_ops
>
Thanks again. I am not sure if you have a single-node machine, but if you
do, can you try to repro this issue on such a machine? At the moment, I
don't have access to one, but I will try to repro myself as well.
Shakeel
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-20 3:49 ` Shakeel Butt
@ 2024-05-21 2:43 ` Oliver Sang
2024-05-22 4:18 ` Shakeel Butt
0 siblings, 1 reply; 15+ messages in thread
From: Oliver Sang @ 2024-05-21 2:43 UTC (permalink / raw)
To: Shakeel Butt
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin, oliver.sang
hi, Shakeel,
On Sun, May 19, 2024 at 08:49:33PM -0700, Shakeel Butt wrote:
> On Mon, May 20, 2024 at 10:43:35AM +0800, Oliver Sang wrote:
> > hi, Shakeel,
> >
> > On Sun, May 19, 2024 at 10:20:28AM -0700, Shakeel Butt wrote:
> > > On Sun, May 19, 2024 at 05:14:39PM +0800, Oliver Sang wrote:
> > > > hi, Shakeel,
> > > >
> > > > On Fri, May 17, 2024 at 11:28:10PM -0700, Shakeel Butt wrote:
> > > > > On Fri, May 17, 2024 at 01:56:30PM +0800, kernel test robot wrote:
> > > > > >
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > kernel test robot noticed a -11.9% regression of will-it-scale.per_process_ops on:
> > > > > >
> > > > > >
> > > > > > commit: 70a64b7919cbd6c12306051ff2825839a9d65605 ("memcg: dynamically allocate lruvec_stats")
> > > > > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > > > > >
> > > > >
> > > > > Thanks for the report. Can you please run the same benchmark but with
> > > > > the full series (of 8 patches) or at least include the ff48c71c26aa
> > > > > ("memcg: reduce memory for the lruvec and memcg stats").
> > > >
> > > > while doing this bisect, ff48c71c26aa has been checked. It has similar data to
> > > > 70a64b7919 (a little worse actually)
> > > >
> > > > 59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803
> > > > ---------------- --------------------------- ---------------------------
> > > > %stddev %change %stddev %change %stddev
> > > > \ | \ | \
> > > > 91713 -11.9% 80789 -13.2% 79612 will-it-scale.per_process_ops
> > > >
> > > >
> > > > ok, we will run tests on tip of the series which should be below if I understand
> > > > it correctly.
> > > >
> > > > * a94032b35e5f9 memcg: use proper type for mod_memcg_state
> > > >
> > > >
> > >
> > > Thanks a lot Oliver. One question: what is the filesystem mounted at
> > > /tmp on your test machine? I just wanted to make sure I run the test
> > > with minimal changes from your setup.
> >
> > we don't have specific partition for /tmp, just use tmpfs
> >
> > tmp on /tmp type tmpfs (rw,relatime)
> >
> >
> > BTW, the test on a94032b35e5f9 finished, still have similar score to 70a64b7919
> >
> > =========================================================================================
> > compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
> > gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/page_fault2/will-it-scale
> >
> > 59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803 a94032b35e5f97dc1023030d929
> > ---------------- --------------------------- --------------------------- ---------------------------
> > %stddev %change %stddev %change %stddev %change %stddev
> > \ | \ | \ | \
> > 91713 -11.9% 80789 -13.2% 79612 -13.0% 79833 will-it-scale.per_process_ops
> >
>
> Thanks again. I am not sure if you have a single node machine but if you
> have, can you try to repro this issue on such machine. At the moment, I
> don't have access to such machine but I will try to repro myself as
> well.
we reported the regression on a 2-node Skylake server, so I found a 1-node Skylake
desktop (we don't have a 1-node server) to check.
model: Skylake
nr_node: 1
nr_cpu: 36
memory: 32G
brand: Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz
but could not reproduce this regression:
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-d08/page_fault2/will-it-scale
59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803 a94032b35e5f97dc1023030d929
---------------- --------------------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
136040 -0.2% 135718 -0.2% 135829 -0.1% 135881 will-it-scale.per_process_ops
then I tried 2-node servers with other models.
for
model: Ice Lake
nr_node: 2
nr_cpu: 64
memory: 256G
brand: Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz
a similar regression shows up:
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp9/page_fault2/will-it-scale
59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803 a94032b35e5f97dc1023030d929
---------------- --------------------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
240373 -14.4% 205702 -14.1% 206368 -12.9% 209394 will-it-scale.per_process_ops
full data is as below [1]
for
model: Sapphire Rapids
nr_node: 2
nr_cpu: 224
memory: 512G
brand: Intel(R) Xeon(R) Platinum 8480CTDX
the regression is smaller but still exists.
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/page_fault2/will-it-scale
59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803 a94032b35e5f97dc1023030d929
---------------- --------------------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
78072 -3.4% 75386 -6.0% 73363 -5.6% 73683 will-it-scale.per_process_ops
full data is as below [2]
hope these data are useful.
[1]
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp9/page_fault2/will-it-scale
59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803 a94032b35e5f97dc1023030d929
---------------- --------------------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
0.27 ± 3% -0.0 0.24 ± 3% -0.0 0.23 ± 3% -0.0 0.24 ± 2% mpstat.cpu.all.irq%
3.83 -0.7 3.17 ± 2% -0.6 3.23 ± 3% -0.6 3.21 mpstat.cpu.all.usr%
62547 -10.1% 56227 -10.8% 55807 -8.9% 56984 perf-c2c.DRAM.local
194.40 ± 9% -11.5% 172.00 ± 4% -11.5% 172.00 ± 5% -13.9% 167.40 ± 2% perf-c2c.HITM.remote
15383898 -14.4% 13164951 -14.1% 13207631 -12.9% 13401271 will-it-scale.64.processes
240373 -14.4% 205702 -14.1% 206368 -12.9% 209394 will-it-scale.per_process_ops
15383898 -14.4% 13164951 -14.1% 13207631 -12.9% 13401271 will-it-scale.workload
2.359e+09 -12.9% 2.055e+09 -14.2% 2.023e+09 -12.8% 2.057e+09 numa-numastat.node0.local_node
2.359e+09 -12.9% 2.055e+09 -14.2% 2.023e+09 -12.8% 2.057e+09 numa-numastat.node0.numa_hit
2.346e+09 -16.1% 1.967e+09 -14.2% 2.013e+09 -13.2% 2.035e+09 ± 2% numa-numastat.node1.local_node
2.345e+09 -16.1% 1.967e+09 -14.2% 2.013e+09 -13.2% 2.036e+09 ± 2% numa-numastat.node1.numa_hit
567382 ± 8% +2.1% 579061 ± 10% -9.5% 513215 ± 5% +1.2% 574201 ± 9% numa-vmstat.node0.nr_anon_pages
2.36e+09 -12.9% 2.055e+09 -14.3% 2.023e+09 -12.9% 2.056e+09 numa-vmstat.node0.numa_hit
2.36e+09 -12.9% 2.055e+09 -14.3% 2.023e+09 -12.9% 2.056e+09 numa-vmstat.node0.numa_local
2.346e+09 -16.2% 1.966e+09 -14.2% 2.012e+09 -13.3% 2.035e+09 ± 2% numa-vmstat.node1.numa_hit
2.347e+09 -16.2% 1.967e+09 -14.2% 2.013e+09 -13.3% 2.034e+09 ± 2% numa-vmstat.node1.numa_local
1137116 -1.9% 1115597 -1.5% 1119624 -1.8% 1116759 proc-vmstat.nr_anon_pages
4575 +2.1% 4673 +2.1% 4671 +1.7% 4654 proc-vmstat.nr_page_table_pages
4.705e+09 -14.5% 4.022e+09 -14.2% 4.036e+09 -13.0% 4.093e+09 proc-vmstat.numa_hit
4.706e+09 -14.5% 4.023e+09 -14.2% 4.037e+09 -13.0% 4.092e+09 proc-vmstat.numa_local
4.645e+09 -14.3% 3.979e+09 -14.1% 3.991e+09 -12.8% 4.05e+09 proc-vmstat.pgalloc_normal
4.631e+09 -14.3% 3.967e+09 -14.1% 3.979e+09 -12.8% 4.038e+09 proc-vmstat.pgfault
4.643e+09 -14.3% 3.978e+09 -14.1% 3.99e+09 -12.8% 4.049e+09 proc-vmstat.pgfree
29780 ± 54% -49.0% 15173 ± 50% -87.2% 3818 ±199% -33.2% 19878 ±112% sched_debug.cfs_rq:/.left_deadline.avg
1905931 ± 54% -49.1% 971033 ± 50% -87.2% 244356 ±199% -33.2% 1272254 ±112% sched_debug.cfs_rq:/.left_deadline.max
236372 ± 54% -49.1% 120428 ± 50% -87.2% 30306 ±199% -33.2% 157784 ±112% sched_debug.cfs_rq:/.left_deadline.stddev
29779 ± 54% -49.0% 15172 ± 50% -87.2% 3818 ±199% -33.2% 19878 ±112% sched_debug.cfs_rq:/.left_vruntime.avg
1905916 ± 54% -49.1% 971025 ± 50% -87.2% 244349 ±199% -33.2% 1272236 ±112% sched_debug.cfs_rq:/.left_vruntime.max
236371 ± 54% -49.1% 120427 ± 50% -87.2% 30304 ±199% -33.2% 157782 ±112% sched_debug.cfs_rq:/.left_vruntime.stddev
12745 ± 8% +2.4% 13045 -9.7% 11510 ± 11% -6.0% 11984 ± 10% sched_debug.cfs_rq:/.load.min
253.83 ± 24% +56.9% 398.30 ± 27% +58.4% 402.13 ± 56% +23.8% 314.20 ± 23% sched_debug.cfs_rq:/.load_avg.max
22.93 ± 4% -12.2% 20.14 ± 17% -12.0% 20.17 ± 17% -18.5% 18.68 ± 15% sched_debug.cfs_rq:/.removed.runnable_avg.stddev
22.93 ± 4% -13.0% 19.94 ± 16% -12.1% 20.16 ± 17% -19.9% 18.35 ± 14% sched_debug.cfs_rq:/.removed.util_avg.stddev
29779 ± 54% -49.0% 15172 ± 50% -87.2% 3818 ±199% -33.2% 19878 ±112% sched_debug.cfs_rq:/.right_vruntime.avg
1905916 ± 54% -49.1% 971025 ± 50% -87.2% 244349 ±199% -33.2% 1272236 ±112% sched_debug.cfs_rq:/.right_vruntime.max
236371 ± 54% -49.1% 120427 ± 50% -87.2% 30304 ±199% -33.2% 157782 ±112% sched_debug.cfs_rq:/.right_vruntime.stddev
149.50 ± 33% -81.3% 28.00 ±180% -71.2% 43.03 ±120% -70.9% 43.57 ±125% sched_debug.cfs_rq:/.util_est.min
1930 ± 4% -15.5% 1631 ± 7% -18.1% 1581 ± 5% -10.5% 1729 ± 16% sched_debug.cpu.nr_switches.min
0.79 ± 98% +89.1% 1.49 ± 48% +147.8% 1.96 ± 16% -12.4% 0.69 ± 91% sched_debug.rt_rq:.rt_time.avg
50.52 ± 98% +89.2% 95.60 ± 48% +147.8% 125.19 ± 17% -12.3% 44.29 ± 91% sched_debug.rt_rq:.rt_time.max
6.27 ± 98% +89.2% 11.86 ± 48% +147.8% 15.53 ± 17% -12.3% 5.49 ± 91% sched_debug.rt_rq:.rt_time.stddev
21.14 -10.1% 19.00 -10.1% 19.01 ± 2% -9.9% 19.05 perf-stat.i.MPKI
1.468e+10 -9.4% 1.33e+10 -9.0% 1.336e+10 -7.9% 1.351e+10 perf-stat.i.branch-instructions
14349180 -7.8% 13236560 -6.6% 13407521 -6.2% 13464962 perf-stat.i.branch-misses
69.58 -5.1 64.51 -4.8 64.81 -4.6 64.96 perf-stat.i.cache-miss-rate%
1.57e+09 -19.5% 1.263e+09 ± 2% -18.9% 1.273e+09 ± 3% -17.8% 1.291e+09 perf-stat.i.cache-misses
2.252e+09 -13.2% 1.955e+09 -12.9% 1.961e+09 -11.9% 1.985e+09 perf-stat.i.cache-references
3.00 +12.8% 3.39 +12.0% 3.36 +10.6% 3.32 perf-stat.i.cpi
99.00 -0.9% 98.11 -1.1% 97.90 -0.9% 98.13 perf-stat.i.cpu-migrations
143.06 +25.2% 179.10 ± 2% +24.5% 178.15 ± 3% +22.4% 175.18 perf-stat.i.cycles-between-cache-misses
7.403e+10 -10.4% 6.634e+10 -9.8% 6.679e+10 -8.7% 6.76e+10 perf-stat.i.instructions
0.34 -11.4% 0.30 -10.7% 0.30 -9.7% 0.30 perf-stat.i.ipc
478.41 -14.3% 410.14 -14.0% 411.31 -12.7% 417.50 perf-stat.i.metric.K/sec
15310132 -14.3% 13125768 -14.0% 13162999 -12.7% 13361235 perf-stat.i.minor-faults
15310132 -14.3% 13125768 -14.0% 13163000 -12.7% 13361235 perf-stat.i.page-faults
21.21 -28.4% 15.17 ± 50% -10.2% 19.05 ± 2% -28.3% 15.20 ± 50% perf-stat.overall.MPKI
0.10 -0.0 0.08 ± 50% +0.0 0.10 -0.0 0.08 ± 50% perf-stat.overall.branch-miss-rate%
69.71 -18.2 51.52 ± 50% -4.8 64.89 -17.9 51.83 ± 50% perf-stat.overall.cache-miss-rate%
3.01 -9.7% 2.72 ± 50% +11.9% 3.37 -11.4% 2.67 ± 50% perf-stat.overall.cpi
141.98 +1.0% 143.41 ± 50% +24.6% 176.94 ± 3% -1.2% 140.33 ± 50% perf-stat.overall.cycles-between-cache-misses
0.33 -29.1% 0.24 ± 50% -10.6% 0.30 -27.7% 0.24 ± 50% perf-stat.overall.ipc
1453908 -16.2% 1217875 ± 50% +4.9% 1524841 -16.2% 1218410 ± 50% perf-stat.overall.path-length
1.463e+10 -27.6% 1.059e+10 ± 50% -9.0% 1.332e+10 -26.4% 1.077e+10 ± 50% perf-stat.ps.branch-instructions
14253731 -25.8% 10569701 ± 50% -6.6% 13307817 -25.1% 10681742 ± 50% perf-stat.ps.branch-misses
1.565e+09 -36.0% 1.002e+09 ± 50% -18.9% 1.269e+09 ± 3% -34.6% 1.023e+09 ± 50% perf-stat.ps.cache-misses
2.245e+09 -30.7% 1.556e+09 ± 50% -12.9% 1.954e+09 -29.6% 1.579e+09 ± 50% perf-stat.ps.cache-references
98.42 -20.7% 78.08 ± 50% -1.0% 97.40 -20.6% 78.12 ± 50% perf-stat.ps.cpu-migrations
7.378e+10 -28.4% 5.281e+10 ± 50% -9.8% 6.656e+10 -27.0% 5.385e+10 ± 50% perf-stat.ps.instructions
15260342 -31.6% 10437993 ± 50% -14.0% 13119215 -30.3% 10633461 ± 50% perf-stat.ps.minor-faults
15260342 -31.6% 10437993 ± 50% -14.0% 13119215 -30.3% 10633461 ± 50% perf-stat.ps.page-faults
2.237e+13 -28.5% 1.599e+13 ± 50% -10.0% 2.014e+13 -27.2% 1.629e+13 ± 50% perf-stat.total.instructions
75.68 -6.2 69.50 -6.1 69.63 -5.4 70.26 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
72.31 -5.8 66.56 -5.6 66.68 -5.1 67.25 perf-profile.calltrace.cycles-pp.testcase
63.50 -4.4 59.13 -4.4 59.13 -3.9 59.64 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
63.32 -4.4 58.97 -4.4 58.97 -3.8 59.48 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
61.04 -4.1 56.99 -4.1 56.98 -3.6 57.49 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
21.29 -3.9 17.43 ± 3% -3.6 17.67 ± 3% -3.5 17.77 ± 2% perf-profile.calltrace.cycles-pp.copy_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
59.53 -3.8 55.69 -3.9 55.68 -3.3 56.21 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
58.35 -3.7 54.65 -3.7 54.65 -3.2 55.17 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
5.31 -0.9 4.40 ± 2% -0.9 4.44 ± 2% -0.8 4.50 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
4.97 -0.8 4.13 ± 2% -0.8 4.15 ± 2% -0.8 4.21 perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault
4.40 -0.7 3.72 ± 3% -0.6 3.79 ± 3% -0.6 3.78 ± 2% perf-profile.calltrace.cycles-pp.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
2.63 -0.4 2.23 ± 2% -0.4 2.26 ± 2% -0.3 2.29 perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
1.82 -0.4 1.44 ± 2% -0.4 1.47 ± 2% -0.3 1.49 perf-profile.calltrace.cycles-pp.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
2.21 -0.3 1.89 ± 2% -0.3 1.88 ± 2% -0.3 1.90 perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
2.01 -0.3 1.69 ± 4% -0.2 1.76 ± 5% -0.3 1.73 ± 2% perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
1.80 -0.3 1.52 ± 2% -0.3 1.52 ± 2% -0.3 1.54 perf-profile.calltrace.cycles-pp.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault
1.74 -0.2 1.50 ± 3% -0.2 1.51 ± 3% -0.2 1.52 ± 2% perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
1.55 -0.2 1.31 ± 2% -0.2 1.30 ± 2% -0.2 1.33 perf-profile.calltrace.cycles-pp.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault
1.60 -0.2 1.37 ± 3% -0.2 1.39 ± 3% -0.2 1.39 ± 2% perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.29 -0.2 1.08 ± 3% -0.2 1.14 ± 4% -0.2 1.11 ± 3% perf-profile.calltrace.cycles-pp.mem_cgroup_commit_charge.__mem_cgroup_charge.folio_prealloc.do_fault.__handle_mm_fault
1.42 -0.2 1.21 ± 3% -0.2 1.23 ± 3% -0.2 1.24 ± 2% perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
1.50 -0.2 1.31 ± 2% -0.1 1.41 ± 2% -0.1 1.36 ± 3% perf-profile.calltrace.cycles-pp.__lruvec_stat_mod_folio.set_pte_range.finish_fault.do_fault.__handle_mm_fault
1.12 -0.2 0.93 ± 3% -0.2 0.93 ± 2% -0.2 0.95 ± 2% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc
0.92 -0.1 0.78 ± 4% -0.1 0.80 ± 3% -0.1 0.81 ± 3% perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault
0.74 -0.1 0.61 ± 2% -0.1 0.65 ± 2% -0.1 0.64 perf-profile.calltrace.cycles-pp.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range
0.98 -0.1 0.86 ± 2% -0.1 0.87 ± 2% -0.1 0.87 ± 2% perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.72 ± 2% -0.1 0.61 ± 2% -0.1 0.61 ± 2% -0.1 0.60 ± 3% perf-profile.calltrace.cycles-pp.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
0.63 ± 2% -0.1 0.53 -0.1 0.53 ± 2% -0.2 0.41 ± 50% perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault
1.15 -0.1 1.05 -0.1 1.08 -0.1 1.07 perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
0.66 -0.1 0.56 ± 2% -0.1 0.56 ± 2% -0.1 0.56 ± 2% perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.64 -0.1 0.55 ± 4% -0.1 0.54 ± 3% -0.1 0.56 ± 2% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof
0.66 -0.1 0.58 ± 2% -0.1 0.59 ± 3% -0.1 0.58 ± 2% perf-profile.calltrace.cycles-pp.mas_walk.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
2.71 +0.7 3.39 +0.7 3.36 +0.6 3.31 ± 2% perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
2.71 +0.7 3.39 +0.7 3.36 +0.6 3.31 ± 2% perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
2.71 +0.7 3.39 +0.7 3.37 +0.6 3.31 ± 2% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
2.65 +0.7 3.34 +0.7 3.32 +0.6 3.26 ± 2% perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region
2.44 +0.7 3.15 +0.7 3.13 +0.6 3.07 ± 2% perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu
24.39 +2.2 26.56 ± 5% +1.8 26.19 ± 4% +2.1 26.54 ± 3% perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
22.46 +2.4 24.88 ± 5% +2.0 24.41 ± 5% +2.3 24.81 ± 4% perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_fault.__handle_mm_fault
22.25 +2.5 24.70 ± 5% +2.0 24.24 ± 5% +2.4 24.63 ± 4% perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_fault
20.38 +2.5 22.90 ± 6% +2.0 22.42 ± 5% +2.5 22.84 ± 4% perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
20.37 +2.5 22.89 ± 6% +2.0 22.41 ± 5% +2.5 22.83 ± 4% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
20.30 +2.5 22.83 ± 6% +2.0 22.35 ± 5% +2.5 22.77 ± 4% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
22.59 +5.3 27.93 +5.3 27.84 +4.7 27.29 ± 2% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
22.59 +5.3 27.93 +5.3 27.84 +4.7 27.29 ± 2% perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
22.59 +5.3 27.93 +5.3 27.84 +4.7 27.29 ± 2% perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
22.58 +5.3 27.92 +5.3 27.83 +4.7 27.28 ± 2% perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
20.59 +5.8 26.34 +5.6 26.22 +5.1 25.64 ± 2% perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
20.59 +5.8 26.34 +5.6 26.22 +5.1 25.64 ± 2% perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range
20.56 +5.8 26.32 +5.6 26.20 +5.1 25.62 ± 2% perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range
20.07 +5.9 25.95 +5.8 25.83 +5.2 25.23 ± 3% perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
18.73 +6.0 24.73 +5.9 24.63 +5.3 24.01 ± 3% perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu
25.34 +6.0 31.36 +5.9 31.24 +5.3 30.64 ± 2% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
25.34 +6.0 31.36 +5.9 31.24 +5.3 30.64 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
25.34 +6.0 31.36 +5.9 31.24 +5.3 30.64 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
25.34 +6.0 31.36 +5.9 31.24 +5.3 30.64 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
25.34 +6.0 31.36 +5.9 31.24 +5.3 30.64 ± 2% perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
25.33 +6.0 31.36 +5.9 31.24 +5.3 30.64 ± 2% perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
25.34 +6.0 31.37 +5.9 31.25 +5.3 30.65 ± 2% perf-profile.calltrace.cycles-pp.__munmap
25.33 +6.0 31.36 +5.9 31.24 +5.3 30.64 ± 2% perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
20.35 +6.7 27.09 +6.6 26.96 +5.9 26.29 ± 3% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache
20.36 +6.7 27.11 +6.6 26.98 +5.9 26.30 ± 3% perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages
20.28 +6.8 27.04 +6.6 26.91 +6.0 26.24 ± 3% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs
74.49 -6.0 68.46 -5.9 68.59 -5.3 69.18 perf-profile.children.cycles-pp.testcase
71.15 -5.5 65.63 -5.4 65.72 -4.8 66.30 perf-profile.children.cycles-pp.asm_exc_page_fault
63.55 -4.4 59.16 -4.4 59.17 -3.9 59.68 perf-profile.children.cycles-pp.exc_page_fault
63.38 -4.4 59.03 -4.4 59.03 -3.8 59.54 perf-profile.children.cycles-pp.do_user_addr_fault
61.10 -4.1 57.04 -4.1 57.03 -3.6 57.54 perf-profile.children.cycles-pp.handle_mm_fault
21.32 -3.9 17.45 ± 3% -3.6 17.70 ± 3% -3.5 17.80 ± 2% perf-profile.children.cycles-pp.copy_page
59.57 -3.9 55.72 -3.9 55.72 -3.3 56.24 perf-profile.children.cycles-pp.__handle_mm_fault
58.44 -3.7 54.74 -3.7 54.74 -3.2 55.25 perf-profile.children.cycles-pp.do_fault
5.36 -0.9 4.44 ± 2% -0.9 4.48 ± 2% -0.8 4.54 perf-profile.children.cycles-pp.__pte_offset_map_lock
5.02 -0.9 4.16 ± 2% -0.8 4.19 ± 2% -0.8 4.25 perf-profile.children.cycles-pp._raw_spin_lock
4.45 -0.7 3.76 ± 3% -0.6 3.83 ± 3% -0.6 3.82 ± 2% perf-profile.children.cycles-pp.folio_prealloc
2.64 -0.4 2.24 ± 2% -0.4 2.27 ± 2% -0.3 2.30 perf-profile.children.cycles-pp.sync_regs
1.89 -0.4 1.49 ± 2% -0.4 1.52 ± 2% -0.3 1.55 perf-profile.children.cycles-pp.zap_present_ptes
2.42 -0.4 2.04 ± 2% -0.3 2.08 ± 3% -0.3 2.09 ± 2% perf-profile.children.cycles-pp.native_irq_return_iret
2.24 -0.3 1.91 ± 2% -0.3 1.91 ± 2% -0.3 1.93 perf-profile.children.cycles-pp.vma_alloc_folio_noprof
2.07 -0.3 1.74 ± 3% -0.3 1.80 ± 5% -0.3 1.77 ± 2% perf-profile.children.cycles-pp.__mem_cgroup_charge
1.89 -0.3 1.61 ± 2% -0.3 1.60 ± 2% -0.3 1.62 perf-profile.children.cycles-pp.alloc_pages_mpol_noprof
2.04 -0.3 1.77 ± 2% -0.1 1.90 ± 2% -0.2 1.83 ± 3% perf-profile.children.cycles-pp.__lruvec_stat_mod_folio
1.64 -0.3 1.39 ± 2% -0.3 1.39 ± 2% -0.2 1.41 perf-profile.children.cycles-pp.__alloc_pages_noprof
1.77 -0.2 1.52 ± 3% -0.2 1.53 ± 3% -0.2 1.54 ± 2% perf-profile.children.cycles-pp.__do_fault
1.62 -0.2 1.39 ± 3% -0.2 1.41 ± 3% -0.2 1.41 ± 2% perf-profile.children.cycles-pp.shmem_fault
1.32 -0.2 1.10 ± 3% -0.2 1.16 ± 4% -0.2 1.13 ± 2% perf-profile.children.cycles-pp.mem_cgroup_commit_charge
1.42 -0.2 1.21 ± 2% -0.2 1.20 ± 2% -0.2 1.19 ± 2% perf-profile.children.cycles-pp.__perf_sw_event
1.47 -0.2 1.27 ± 3% -0.2 1.28 ± 3% -0.2 1.29 ± 2% perf-profile.children.cycles-pp.shmem_get_folio_gfp
1.13 ± 2% -0.2 0.93 ± 4% -0.1 1.06 ± 2% -0.1 1.03 ± 3% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
1.17 -0.2 0.98 ± 2% -0.2 0.98 ± 2% -0.2 1.00 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
1.25 -0.2 1.06 ± 2% -0.2 1.06 ± 2% -0.2 1.05 ± 2% perf-profile.children.cycles-pp.___perf_sw_event
0.84 -0.2 0.67 ± 3% -0.2 0.68 ± 4% -0.2 0.69 ± 2% perf-profile.children.cycles-pp.__mod_lruvec_state
0.61 -0.2 0.44 ± 3% -0.2 0.43 ± 3% -0.2 0.46 ± 2% perf-profile.children.cycles-pp._compound_head
0.65 -0.1 0.51 ± 2% -0.1 0.53 ± 4% -0.1 0.53 ± 2% perf-profile.children.cycles-pp.__mod_node_page_state
0.94 -0.1 0.80 ± 4% -0.1 0.82 ± 4% -0.1 0.82 ± 3% perf-profile.children.cycles-pp.filemap_get_entry
1.02 -0.1 0.89 ± 2% -0.1 0.90 ± 3% -0.1 0.90 ± 2% perf-profile.children.cycles-pp.lock_vma_under_rcu
0.76 -0.1 0.63 ± 2% -0.1 0.67 ± 2% -0.1 0.66 perf-profile.children.cycles-pp.folio_remove_rmap_ptes
1.20 -0.1 1.10 -0.1 1.13 -0.1 1.11 perf-profile.children.cycles-pp.lru_add_fn
0.69 -0.1 0.59 ± 4% -0.1 0.58 ± 2% -0.1 0.60 ± 2% perf-profile.children.cycles-pp.rmqueue
0.47 -0.1 0.38 ± 2% -0.1 0.37 ± 2% -0.1 0.38 perf-profile.children.cycles-pp.__mem_cgroup_uncharge_folios
0.59 -0.1 0.49 ± 2% -0.1 0.49 -0.1 0.50 perf-profile.children.cycles-pp.free_unref_folios
0.54 -0.1 0.45 ± 4% -0.1 0.46 ± 3% -0.1 0.47 ± 3% perf-profile.children.cycles-pp.xas_load
0.67 -0.1 0.58 ± 3% -0.1 0.60 ± 3% -0.1 0.59 ± 2% perf-profile.children.cycles-pp.mas_walk
0.63 ± 3% -0.1 0.55 ± 3% -0.0 0.61 ± 4% -0.1 0.55 ± 3% perf-profile.children.cycles-pp.__count_memcg_events
0.27 ± 3% -0.1 0.21 ± 3% -0.1 0.21 ± 3% -0.1 0.21 perf-profile.children.cycles-pp.uncharge_batch
0.38 -0.1 0.32 ± 5% -0.0 0.33 -0.0 0.33 perf-profile.children.cycles-pp.try_charge_memcg
0.22 ± 3% -0.1 0.17 ± 4% -0.1 0.17 ± 4% -0.1 0.17 ± 2% perf-profile.children.cycles-pp.page_counter_uncharge
0.32 -0.1 0.27 -0.0 0.28 -0.1 0.26 perf-profile.children.cycles-pp.cgroup_rstat_updated
0.26 ± 3% -0.0 0.21 ± 4% -0.0 0.22 ± 2% -0.0 0.23 ± 5% perf-profile.children.cycles-pp.__pte_offset_map
0.30 -0.0 0.26 ± 2% -0.0 0.26 -0.0 0.26 ± 3% perf-profile.children.cycles-pp.handle_pte_fault
0.28 -0.0 0.24 ± 2% -0.0 0.25 ± 3% -0.0 0.25 perf-profile.children.cycles-pp.error_entry
0.31 -0.0 0.27 -0.0 0.26 ± 5% -0.0 0.26 perf-profile.children.cycles-pp.percpu_counter_add_batch
0.31 ± 2% -0.0 0.27 ± 6% -0.0 0.27 ± 4% -0.0 0.27 ± 3% perf-profile.children.cycles-pp.get_vma_policy
0.22 -0.0 0.19 ± 2% -0.0 0.19 ± 2% -0.0 0.19 ± 2% perf-profile.children.cycles-pp.free_unref_page_commit
0.22 ± 2% -0.0 0.19 ± 3% -0.0 0.19 -0.0 0.20 ± 2% perf-profile.children.cycles-pp.folio_add_new_anon_rmap
0.26 ± 2% -0.0 0.22 ± 9% -0.0 0.22 ± 4% -0.0 0.23 ± 5% perf-profile.children.cycles-pp._raw_spin_trylock
0.28 ± 2% -0.0 0.25 ± 3% -0.0 0.25 -0.0 0.25 ± 2% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.32 ± 2% -0.0 0.29 ± 4% -0.0 0.28 ± 2% -0.0 0.29 ± 2% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.26 ± 3% -0.0 0.22 ± 4% -0.0 0.22 ± 3% -0.0 0.23 perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.25 ± 3% -0.0 0.21 ± 4% -0.0 0.21 ± 2% -0.0 0.22 perf-profile.children.cycles-pp.hrtimer_interrupt
0.22 ± 2% -0.0 0.19 ± 3% -0.0 0.19 ± 2% -0.0 0.19 ± 3% perf-profile.children.cycles-pp.pte_offset_map_nolock
0.17 ± 2% -0.0 0.14 ± 4% -0.0 0.14 ± 5% -0.0 0.15 ± 3% perf-profile.children.cycles-pp.folio_unlock
0.14 ± 2% -0.0 0.11 -0.0 0.11 ± 3% -0.0 0.11 perf-profile.children.cycles-pp.__mod_zone_page_state
0.19 ± 2% -0.0 0.16 ± 2% -0.0 0.17 ± 2% -0.0 0.17 ± 3% perf-profile.children.cycles-pp.down_read_trylock
0.18 -0.0 0.15 ± 3% -0.0 0.15 ± 4% -0.0 0.15 ± 2% perf-profile.children.cycles-pp.__rmqueue_pcplist
0.14 ± 2% -0.0 0.11 ± 6% -0.0 0.11 ± 8% -0.0 0.12 ± 6% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
0.14 ± 3% -0.0 0.12 ± 3% -0.0 0.12 ± 5% -0.0 0.12 ± 4% perf-profile.children.cycles-pp.perf_exclude_event
0.19 ± 2% -0.0 0.17 ± 4% -0.0 0.17 ± 2% -0.0 0.17 ± 4% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.16 ± 2% -0.0 0.14 -0.0 0.13 ± 3% -0.0 0.14 ± 2% perf-profile.children.cycles-pp.uncharge_folio
0.12 ± 3% -0.0 0.10 ± 5% -0.0 0.09 ± 5% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.get_pfnblock_flags_mask
0.18 ± 3% -0.0 0.16 ± 4% -0.0 0.16 ± 3% -0.0 0.16 ± 3% perf-profile.children.cycles-pp.tick_nohz_handler
0.13 ± 3% -0.0 0.10 ± 4% -0.0 0.10 ± 4% -0.0 0.11 ± 4% perf-profile.children.cycles-pp.page_counter_try_charge
0.16 -0.0 0.14 ± 2% -0.0 0.14 ± 4% -0.0 0.14 ± 2% perf-profile.children.cycles-pp.folio_put
0.18 ± 2% -0.0 0.16 ± 3% -0.0 0.16 ± 3% -0.0 0.15 ± 2% perf-profile.children.cycles-pp.__cond_resched
0.18 ± 2% -0.0 0.16 ± 5% -0.0 0.16 ± 4% -0.0 0.16 ± 2% perf-profile.children.cycles-pp.up_read
0.14 -0.0 0.12 -0.0 0.12 -0.0 0.12 ± 3% perf-profile.children.cycles-pp.policy_nodemask
0.16 ± 2% -0.0 0.14 ± 3% -0.0 0.14 ± 2% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.update_process_times
0.11 ± 3% -0.0 0.09 ± 8% -0.0 0.09 ± 4% -0.0 0.09 ± 4% perf-profile.children.cycles-pp.xas_start
0.13 ± 3% -0.0 0.11 -0.0 0.11 ± 3% -0.0 0.11 perf-profile.children.cycles-pp.access_error
0.09 ± 4% -0.0 0.08 ± 5% -0.0 0.08 ± 5% -0.0 0.08 perf-profile.children.cycles-pp.__irqentry_text_end
0.07 ± 5% -0.0 0.05 ± 9% -0.0 0.06 ± 6% -0.0 0.06 perf-profile.children.cycles-pp.vm_normal_page
0.06 ± 7% -0.0 0.05 ± 7% -0.0 0.05 -0.0 0.05 ± 7% perf-profile.children.cycles-pp.__tlb_remove_folio_pages_size
0.08 -0.0 0.07 ± 5% -0.0 0.07 ± 5% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.memcg_check_events
0.12 ± 3% -0.0 0.11 ± 6% -0.0 0.11 ± 4% -0.0 0.11 ± 3% perf-profile.children.cycles-pp.perf_swevent_event
0.06 -0.0 0.05 ± 7% -0.0 0.05 -0.0 0.05 perf-profile.children.cycles-pp.pte_alloc_one
0.06 -0.0 0.05 ± 7% -0.0 0.05 -0.0 0.05 ± 7% perf-profile.children.cycles-pp.irqentry_enter
0.06 -0.0 0.05 ± 7% -0.0 0.05 -0.0 0.05 ± 7% perf-profile.children.cycles-pp.vmf_anon_prepare
0.05 +0.0 0.06 ± 8% +0.0 0.06 +0.0 0.06 ± 8% perf-profile.children.cycles-pp.write
0.05 +0.0 0.06 +0.0 0.06 +0.0 0.06 perf-profile.children.cycles-pp.perf_mmap__push
0.19 ± 2% +0.2 0.40 ± 6% +0.2 0.37 ± 7% +0.2 0.35 ± 4% perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
2.72 +0.7 3.40 +0.7 3.38 +0.6 3.32 ± 2% perf-profile.children.cycles-pp.tlb_finish_mmu
24.44 +2.2 26.60 ± 5% +1.8 26.23 ± 4% +2.1 26.58 ± 3% perf-profile.children.cycles-pp.set_pte_range
22.47 +2.4 24.89 ± 5% +2.0 24.42 ± 5% +2.3 24.81 ± 4% perf-profile.children.cycles-pp.folio_add_lru_vma
22.31 +2.5 24.77 ± 5% +2.0 24.30 ± 5% +2.4 24.70 ± 4% perf-profile.children.cycles-pp.folio_batch_move_lru
22.59 +5.3 27.93 +5.2 27.84 +4.7 27.29 ± 2% perf-profile.children.cycles-pp.zap_pmd_range
22.59 +5.3 27.93 +5.3 27.84 +4.7 27.29 ± 2% perf-profile.children.cycles-pp.unmap_page_range
22.59 +5.3 27.93 +5.3 27.84 +4.7 27.29 ± 2% perf-profile.children.cycles-pp.zap_pte_range
22.59 +5.3 27.93 +5.3 27.84 +4.7 27.29 ± 2% perf-profile.children.cycles-pp.unmap_vmas
20.59 +5.8 26.34 +5.6 26.22 +5.1 25.64 ± 2% perf-profile.children.cycles-pp.tlb_flush_mmu
25.34 +6.0 31.36 +5.9 31.24 +5.3 30.64 ± 2% perf-profile.children.cycles-pp.__x64_sys_munmap
25.34 +6.0 31.36 +5.9 31.24 +5.3 30.64 ± 2% perf-profile.children.cycles-pp.__vm_munmap
25.34 +6.0 31.37 +5.9 31.25 +5.3 30.65 ± 2% perf-profile.children.cycles-pp.__munmap
25.33 +6.0 31.36 +5.9 31.24 +5.3 30.64 ± 2% perf-profile.children.cycles-pp.unmap_region
25.34 +6.0 31.37 +5.9 31.25 +5.3 30.65 ± 2% perf-profile.children.cycles-pp.do_vmi_align_munmap
25.34 +6.0 31.37 +5.9 31.25 +5.3 30.65 ± 2% perf-profile.children.cycles-pp.do_vmi_munmap
25.46 +6.0 31.49 +5.9 31.37 +5.3 30.77 ± 2% perf-profile.children.cycles-pp.do_syscall_64
25.46 +6.0 31.49 +5.9 31.37 +5.3 30.77 ± 2% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
23.30 +6.4 29.74 +6.3 29.59 +5.7 28.96 ± 2% perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
23.29 +6.4 29.73 +6.3 29.58 +5.7 28.95 ± 2% perf-profile.children.cycles-pp.free_pages_and_swap_cache
23.00 +6.5 29.52 +6.4 29.38 +5.7 28.73 ± 2% perf-profile.children.cycles-pp.folios_put_refs
21.22 +6.7 27.93 +6.6 27.81 +5.9 27.13 ± 3% perf-profile.children.cycles-pp.__page_cache_release
40.79 +9.3 50.07 ± 2% +8.7 49.46 ± 2% +8.4 49.20 perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
40.78 +9.3 50.06 ± 2% +8.7 49.44 ± 2% +8.4 49.19 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
40.64 +9.3 49.96 ± 2% +8.7 49.34 ± 2% +8.4 49.09 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
21.23 -3.9 17.38 ± 3% -3.6 17.63 ± 3% -3.5 17.73 ± 2% perf-profile.self.cycles-pp.copy_page
4.99 -0.8 4.14 ± 2% -0.8 4.17 ± 2% -0.8 4.22 perf-profile.self.cycles-pp._raw_spin_lock
5.21 -0.8 4.45 ± 2% -0.7 4.49 ± 2% -0.7 4.53 perf-profile.self.cycles-pp.testcase
2.63 -0.4 2.24 ± 2% -0.4 2.26 ± 2% -0.3 2.29 perf-profile.self.cycles-pp.sync_regs
2.42 -0.4 2.04 ± 2% -0.3 2.08 ± 3% -0.3 2.09 ± 2% perf-profile.self.cycles-pp.native_irq_return_iret
0.58 ± 2% -0.2 0.42 ± 3% -0.2 0.40 ± 2% -0.1 0.43 ± 3% perf-profile.self.cycles-pp._compound_head
0.93 ± 2% -0.2 0.77 ± 5% -0.0 0.89 ± 2% -0.1 0.86 ± 3% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
1.00 -0.1 0.85 -0.2 0.85 ± 2% -0.2 0.83 ± 2% perf-profile.self.cycles-pp.___perf_sw_event
0.93 ± 2% -0.1 0.78 ± 3% -0.1 0.79 ± 4% -0.1 0.80 ± 3% perf-profile.self.cycles-pp.mem_cgroup_commit_charge
0.61 -0.1 0.48 ± 3% -0.1 0.50 ± 4% -0.1 0.50 ± 3% perf-profile.self.cycles-pp.__mod_node_page_state
0.51 -0.1 0.38 -0.1 0.38 ± 2% -0.1 0.40 ± 2% perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.80 -0.1 0.70 ± 2% -0.1 0.69 ± 3% -0.1 0.70 ± 2% perf-profile.self.cycles-pp.__handle_mm_fault
0.61 ± 2% -0.1 0.51 -0.1 0.51 ± 2% -0.1 0.51 perf-profile.self.cycles-pp.lru_add_fn
0.47 -0.1 0.38 -0.1 0.38 -0.1 0.39 ± 2% perf-profile.self.cycles-pp.get_page_from_freelist
0.45 -0.1 0.37 ± 2% -0.1 0.37 ± 2% -0.1 0.38 perf-profile.self.cycles-pp.zap_present_ptes
0.44 -0.1 0.36 ± 4% -0.1 0.37 ± 4% -0.1 0.38 ± 3% perf-profile.self.cycles-pp.xas_load
0.65 -0.1 0.57 ± 2% -0.1 0.58 ± 2% -0.1 0.58 ± 2% perf-profile.self.cycles-pp.mas_walk
0.46 -0.1 0.39 ± 2% -0.1 0.40 ± 2% -0.1 0.41 ± 3% perf-profile.self.cycles-pp.handle_mm_fault
0.44 -0.1 0.38 ± 2% -0.1 0.38 ± 2% -0.1 0.39 perf-profile.self.cycles-pp.shmem_get_folio_gfp
0.52 ± 3% -0.1 0.46 ± 3% -0.0 0.51 ± 6% -0.1 0.46 ± 5% perf-profile.self.cycles-pp.__count_memcg_events
0.89 ± 2% -0.1 0.84 -0.0 0.88 ± 3% -0.1 0.83 ± 3% perf-profile.self.cycles-pp.__lruvec_stat_mod_folio
0.32 -0.1 0.26 -0.1 0.26 -0.0 0.27 perf-profile.self.cycles-pp.__page_cache_release
0.39 -0.1 0.34 ± 4% -0.0 0.35 ± 4% -0.0 0.35 ± 3% perf-profile.self.cycles-pp.filemap_get_entry
0.20 ± 4% -0.1 0.15 ± 5% -0.1 0.15 ± 3% -0.0 0.15 ± 2% perf-profile.self.cycles-pp.page_counter_uncharge
0.24 -0.0 0.19 -0.0 0.20 ± 2% -0.0 0.20 perf-profile.self.cycles-pp.folio_remove_rmap_ptes
0.34 ± 3% -0.0 0.29 ± 2% -0.0 0.29 ± 2% -0.0 0.29 ± 3% perf-profile.self.cycles-pp.__alloc_pages_noprof
0.27 -0.0 0.23 ± 3% -0.0 0.23 ± 3% -0.0 0.23 ± 2% perf-profile.self.cycles-pp.free_unref_folios
0.27 ± 3% -0.0 0.23 ± 2% -0.0 0.23 ± 2% -0.0 0.22 ± 2% perf-profile.self.cycles-pp.rmqueue
0.30 -0.0 0.26 -0.0 0.26 -0.0 0.26 perf-profile.self.cycles-pp.do_user_addr_fault
0.26 -0.0 0.22 ± 2% -0.0 0.22 ± 2% -0.0 0.22 ± 4% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.23 ± 3% -0.0 0.19 ± 4% -0.0 0.20 ± 5% -0.0 0.20 ± 3% perf-profile.self.cycles-pp.__pte_offset_map_lock
0.22 ± 3% -0.0 0.19 ± 2% -0.0 0.19 ± 3% -0.0 0.20 ± 4% perf-profile.self.cycles-pp.__pte_offset_map
0.29 -0.0 0.26 -0.0 0.25 ± 5% -0.0 0.25 perf-profile.self.cycles-pp.percpu_counter_add_batch
0.19 ± 2% -0.0 0.16 ± 2% -0.0 0.16 ± 4% -0.0 0.16 ± 4% perf-profile.self.cycles-pp.__mod_lruvec_state
0.21 ± 3% -0.0 0.17 ± 2% -0.0 0.18 ± 2% -0.0 0.19 ± 4% perf-profile.self.cycles-pp.finish_fault
0.25 -0.0 0.21 -0.0 0.21 ± 3% -0.0 0.22 ± 2% perf-profile.self.cycles-pp.error_entry
0.24 -0.0 0.21 ± 3% -0.0 0.22 -0.0 0.22 perf-profile.self.cycles-pp.try_charge_memcg
0.21 ± 2% -0.0 0.18 ± 4% -0.0 0.18 ± 2% -0.0 0.19 ± 2% perf-profile.self.cycles-pp.folio_add_new_anon_rmap
0.22 -0.0 0.19 ± 2% -0.0 0.19 ± 2% -0.0 0.19 ± 2% perf-profile.self.cycles-pp.set_pte_range
0.24 ± 3% -0.0 0.21 ± 7% -0.0 0.20 ± 4% -0.0 0.21 ± 6% perf-profile.self.cycles-pp._raw_spin_trylock
0.06 -0.0 0.03 ± 81% -0.0 0.04 ± 50% -0.0 0.05 perf-profile.self.cycles-pp.vm_normal_page
0.23 ± 2% -0.0 0.20 ± 2% -0.0 0.20 ± 2% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.do_fault
0.18 -0.0 0.15 ± 2% -0.0 0.15 ± 2% -0.0 0.15 ± 2% perf-profile.self.cycles-pp.free_unref_page_commit
0.15 ± 2% -0.0 0.12 -0.0 0.12 ± 6% -0.0 0.13 ± 3% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.13 ± 3% -0.0 0.10 ± 4% -0.0 0.11 ± 4% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.__mem_cgroup_charge
0.18 -0.0 0.15 ± 2% -0.0 0.16 ± 3% -0.0 0.16 ± 2% perf-profile.self.cycles-pp.down_read_trylock
0.11 ± 3% -0.0 0.08 ± 4% -0.0 0.09 -0.0 0.09 ± 5% perf-profile.self.cycles-pp.__mod_zone_page_state
0.19 ± 2% -0.0 0.17 ± 2% -0.0 0.16 ± 2% -0.0 0.16 ± 2% perf-profile.self.cycles-pp.folio_add_lru_vma
0.19 ± 2% -0.0 0.17 ± 8% -0.0 0.17 ± 3% -0.0 0.17 ± 3% perf-profile.self.cycles-pp.get_vma_policy
0.16 ± 2% -0.0 0.13 ± 3% -0.0 0.13 ± 5% -0.0 0.14 ± 2% perf-profile.self.cycles-pp.folio_unlock
0.12 ± 3% -0.0 0.10 ± 6% -0.0 0.10 ± 6% -0.0 0.10 perf-profile.self.cycles-pp.perf_exclude_event
0.19 ± 2% -0.0 0.17 -0.0 0.17 ± 2% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.asm_exc_page_fault
0.15 ± 2% -0.0 0.13 ± 3% -0.0 0.13 ± 3% -0.0 0.13 perf-profile.self.cycles-pp.folio_put
0.14 ± 2% -0.0 0.12 -0.0 0.12 ± 3% -0.0 0.12 perf-profile.self.cycles-pp.__rmqueue_pcplist
0.17 ± 2% -0.0 0.14 ± 5% -0.0 0.14 ± 2% -0.0 0.15 ± 3% perf-profile.self.cycles-pp.__perf_sw_event
0.10 ± 3% -0.0 0.08 ± 7% -0.0 0.08 ± 11% -0.0 0.09 ± 5% perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
0.15 ± 2% -0.0 0.13 -0.0 0.13 ± 3% -0.0 0.13 ± 3% perf-profile.self.cycles-pp.uncharge_folio
0.12 ± 3% -0.0 0.10 -0.0 0.10 ± 3% -0.0 0.11 ± 4% perf-profile.self.cycles-pp.alloc_pages_mpol_noprof
0.11 ± 3% -0.0 0.09 ± 8% -0.0 0.09 ± 4% -0.0 0.09 perf-profile.self.cycles-pp.page_counter_try_charge
0.17 ± 4% -0.0 0.15 ± 4% -0.0 0.15 ± 2% -0.0 0.15 perf-profile.self.cycles-pp.lock_vma_under_rcu
0.17 ± 2% -0.0 0.15 ± 3% -0.0 0.16 ± 3% -0.0 0.15 ± 3% perf-profile.self.cycles-pp.up_read
0.11 -0.0 0.09 ± 4% -0.0 0.09 ± 5% -0.0 0.09 ± 5% perf-profile.self.cycles-pp.zap_pte_range
0.10 -0.0 0.08 ± 4% -0.0 0.08 ± 5% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.get_pfnblock_flags_mask
0.16 ± 2% -0.0 0.15 ± 5% -0.0 0.15 ± 3% -0.0 0.15 ± 5% perf-profile.self.cycles-pp.shmem_fault
0.10 ± 4% -0.0 0.08 ± 4% -0.0 0.08 ± 4% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.__do_fault
0.12 ± 3% -0.0 0.10 ± 7% -0.0 0.10 ± 7% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.exc_page_fault
0.12 ± 3% -0.0 0.10 ± 3% -0.0 0.10 ± 3% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.access_error
0.12 ± 4% -0.0 0.10 -0.0 0.10 -0.0 0.10 ± 3% perf-profile.self.cycles-pp.vma_alloc_folio_noprof
0.11 -0.0 0.10 ± 5% -0.0 0.09 ± 4% -0.0 0.09 ± 5% perf-profile.self.cycles-pp.perf_swevent_event
0.09 ± 5% -0.0 0.08 -0.0 0.08 -0.0 0.08 perf-profile.self.cycles-pp.policy_nodemask
0.09 -0.0 0.08 ± 13% -0.0 0.08 ± 5% -0.0 0.08 ± 5% perf-profile.self.cycles-pp.xas_start
0.10 ± 4% -0.0 0.09 ± 4% -0.0 0.09 -0.0 0.09 ± 4% perf-profile.self.cycles-pp.pte_offset_map_nolock
0.08 ± 4% -0.0 0.07 -0.0 0.07 ± 5% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.__irqentry_text_end
0.10 -0.0 0.09 -0.0 0.09 ± 5% -0.0 0.09 perf-profile.self.cycles-pp.folio_prealloc
0.09 -0.0 0.08 -0.0 0.08 -0.0 0.08 perf-profile.self.cycles-pp.__cond_resched
0.38 ± 2% +0.1 0.47 ± 2% +0.1 0.46 +0.1 0.44 perf-profile.self.cycles-pp.folio_batch_move_lru
0.18 ± 2% +0.2 0.38 ± 6% +0.2 0.35 ± 7% +0.2 0.34 ± 4% perf-profile.self.cycles-pp.mem_cgroup_update_lru_size
40.64 +9.3 49.96 ± 2% +8.7 49.34 ± 2% +8.4 49.08 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
[2]
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/page_fault2/will-it-scale
59142d87ab03b8ff 70a64b7919cbd6c12306051ff28 ff48c71c26aaefb090c108d8803 a94032b35e5f97dc1023030d929
---------------- --------------------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
17488267 -3.4% 16886777 -6.0% 16433590 -5.6% 16505101 will-it-scale.224.processes
78072 -3.4% 75386 -6.0% 73363 -5.6% 73683 will-it-scale.per_process_ops
17488267 -3.4% 16886777 -6.0% 16433590 -5.6% 16505101 will-it-scale.workload
5.296e+09 -3.4% 5.116e+09 -6.0% 4.977e+09 -5.6% 4.998e+09 proc-vmstat.numa_hit
5.291e+09 -3.4% 5.111e+09 -6.0% 4.973e+09 -5.6% 4.995e+09 proc-vmstat.numa_local
5.285e+09 -3.4% 5.105e+09 -6.0% 4.968e+09 -5.6% 4.989e+09 proc-vmstat.pgalloc_normal
5.264e+09 -3.4% 5.084e+09 -6.0% 4.947e+09 -5.6% 4.969e+09 proc-vmstat.pgfault
5.283e+09 -3.4% 5.104e+09 -6.0% 4.967e+09 -5.6% 4.989e+09 proc-vmstat.pgfree
3067 +20.1% 3685 ± 8% +19.5% 3665 ± 8% -0.4% 3056 sched_debug.cfs_rq:/.load.min
0.07 ± 19% -12.8% 0.06 ± 14% -31.1% 0.05 ± 14% -8.8% 0.06 ± 14% sched_debug.cfs_rq:/.nr_running.stddev
1727628 ± 22% +2.3% 1767491 ± 32% +8.6% 1876362 ± 25% -24.1% 1310525 ± 7% sched_debug.cpu.avg_idle.max
6058 ± 41% +71.5% 10389 ±118% +96.1% 11878 ± 66% -47.9% 3156 ± 43% sched_debug.cpu.max_idle_balance_cost.stddev
17928 ± 11% +133.0% 41768 ± 36% +39.4% 24992 ± 57% +6.3% 19052 ± 15% sched_debug.cpu.nr_switches.max
2270 ± 6% +70.6% 3874 ± 28% +21.4% 2756 ± 37% +0.5% 2282 ± 4% sched_debug.cpu.nr_switches.stddev
4369255 -9.9% 3934784 ± 8% -3.0% 4238563 ± 6% -3.0% 4239325 ± 7% numa-vmstat.node0.nr_file_pages
20526 ± 3% -25.8% 15236 ± 22% -11.5% 18161 ± 16% -6.4% 19205 ± 16% numa-vmstat.node0.nr_mapped
35617 ± 5% -27.8% 25727 ± 20% -12.1% 31303 ± 13% -9.1% 32375 ± 21% numa-vmstat.node0.nr_slab_reclaimable
65089 ± 16% -8.1% 59820 ± 19% -19.8% 52215 ± 3% -18.3% 53200 ± 3% numa-vmstat.node0.nr_slab_unreclaimable
738801 ± 3% -59.2% 301176 ±113% -17.7% 608173 ± 48% -18.0% 605778 ± 49% numa-vmstat.node0.nr_unevictable
738801 ± 3% -59.2% 301176 ±113% -17.7% 608173 ± 48% -18.0% 605778 ± 49% numa-vmstat.node0.nr_zone_unevictable
4024866 +10.9% 4465333 ± 7% +3.2% 4152344 ± 7% +3.4% 4163009 ± 7% numa-vmstat.node1.nr_file_pages
19132 ± 10% +51.8% 29044 ± 18% +22.2% 23371 ± 18% +17.3% 22446 ± 30% numa-vmstat.node1.nr_slab_reclaimable
45845 ± 24% +12.0% 51337 ± 23% +28.7% 58982 ± 2% +26.8% 58122 ± 3% numa-vmstat.node1.nr_slab_unreclaimable
30816 ± 81% +1420.1% 468441 ± 72% +423.9% 161444 ±184% +431.7% 163839 ±184% numa-vmstat.node1.nr_unevictable
30816 ± 81% +1420.1% 468441 ± 72% +423.9% 161444 ±184% +431.7% 163839 ±184% numa-vmstat.node1.nr_zone_unevictable
142458 ± 5% -27.7% 102968 ± 20% -12.1% 125181 ± 13% -9.1% 129506 ± 21% numa-meminfo.node0.KReclaimable
81201 ± 3% -25.4% 60607 ± 21% -11.8% 71622 ± 16% -6.6% 75868 ± 16% numa-meminfo.node0.Mapped
142458 ± 5% -27.7% 102968 ± 20% -12.1% 125181 ± 13% -9.1% 129506 ± 21% numa-meminfo.node0.SReclaimable
260359 ± 16% -8.1% 239286 ± 19% -19.8% 208866 ± 3% -18.3% 212806 ± 3% numa-meminfo.node0.SUnreclaim
402817 ± 12% -15.0% 342254 ± 18% -17.1% 334047 ± 6% -15.0% 342313 ± 9% numa-meminfo.node0.Slab
2955204 ± 3% -59.2% 1204704 ±113% -17.7% 2432692 ± 48% -18.0% 2423114 ± 49% numa-meminfo.node0.Unevictable
16107004 +11.0% 17872044 ± 7% +3.0% 16587232 ± 7% +3.3% 16635393 ± 7% numa-meminfo.node1.FilePages
76509 ± 10% +51.9% 116237 ± 18% +22.1% 93450 ± 18% +17.4% 89791 ± 30% numa-meminfo.node1.KReclaimable
76509 ± 10% +51.9% 116237 ± 18% +22.1% 93450 ± 18% +17.4% 89791 ± 30% numa-meminfo.node1.SReclaimable
183385 ± 24% +12.0% 205353 ± 23% +28.7% 235933 ± 2% +26.8% 232488 ± 3% numa-meminfo.node1.SUnreclaim
259894 ± 20% +23.7% 321590 ± 19% +26.7% 329384 ± 6% +24.0% 322280 ± 10% numa-meminfo.node1.Slab
123266 ± 81% +1420.1% 1873767 ± 72% +423.9% 645778 ±184% +431.7% 655357 ±184% numa-meminfo.node1.Unevictable
20.16 -1.4% 19.89 -2.9% 19.57 -2.9% 19.58 perf-stat.i.MPKI
2.501e+10 -1.7% 2.46e+10 -2.6% 2.436e+10 -2.4% 2.44e+10 perf-stat.i.branch-instructions
18042153 -0.3% 17981852 -1.9% 17692517 -2.8% 17539874 perf-stat.i.branch-misses
2.382e+09 -3.3% 2.304e+09 -5.8% 2.244e+09 -5.6% 2.249e+09 perf-stat.i.cache-misses
2.561e+09 -3.2% 2.479e+09 -5.5% 2.42e+09 -5.3% 2.424e+09 perf-stat.i.cache-references
5.49 +1.9% 5.59 +3.1% 5.66 +2.8% 5.64 perf-stat.i.cpi
274.25 +2.9% 282.07 +5.7% 289.98 +5.4% 289.07 perf-stat.i.cycles-between-cache-misses
1.177e+11 -1.9% 1.155e+11 -2.9% 1.143e+11 -2.7% 1.145e+11 perf-stat.i.instructions
0.19 -1.9% 0.18 -3.0% 0.18 -2.7% 0.18 perf-stat.i.ipc
155.11 -3.3% 150.03 -5.9% 145.89 -5.5% 146.59 perf-stat.i.metric.K/sec
17405977 -3.4% 16819060 -5.9% 16378605 -5.5% 16441964 perf-stat.i.minor-faults
17405978 -3.4% 16819060 -5.9% 16378606 -5.5% 16441964 perf-stat.i.page-faults
4.41 ± 50% +27.3% 5.61 +3.1% 4.54 ± 50% +28.5% 5.66 perf-stat.overall.cpi
217.50 ± 50% +29.2% 280.93 +6.3% 231.09 ± 50% +32.4% 287.87 perf-stat.overall.cycles-between-cache-misses
1623235 ± 50% +26.9% 2060668 +3.4% 1677714 ± 50% +29.0% 2093187 perf-stat.overall.path-length
5.48 -0.3 5.15 -0.4 5.10 -0.4 5.11 perf-profile.calltrace.cycles-pp.copy_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
57.55 -0.3 57.24 -0.4 57.15 -0.3 57.20 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
56.14 -0.2 55.94 -0.3 55.86 -0.2 55.90 perf-profile.calltrace.cycles-pp.testcase
1.86 -0.1 1.73 ± 2% -0.1 1.72 -0.2 1.71 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.77 -0.1 1.64 ± 2% -0.1 1.63 -0.1 1.63 perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault
1.17 -0.1 1.10 -0.1 1.09 -0.1 1.10 perf-profile.calltrace.cycles-pp.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
52.55 -0.1 52.49 -0.1 52.42 -0.1 52.47 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
52.62 -0.1 52.56 -0.1 52.48 -0.1 52.54 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
0.96 -0.0 0.91 -0.0 0.91 -0.0 0.91 perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
0.71 -0.0 0.68 -0.0 0.67 -0.0 0.67 perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
51.87 -0.0 51.84 -0.1 51.76 -0.0 51.82 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.60 -0.0 0.57 -0.0 0.56 -0.0 0.57 perf-profile.calltrace.cycles-pp.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault
4.87 +0.0 4.90 +0.0 4.91 +0.0 4.91 perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
4.85 +0.0 4.88 +0.0 4.90 +0.0 4.90 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region
4.86 +0.0 4.90 +0.0 4.91 +0.0 4.91 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
4.86 +0.0 4.89 +0.1 4.91 +0.0 4.91 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
4.77 +0.0 4.80 +0.1 4.83 +0.1 4.82 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu
37.74 +0.2 37.98 +0.3 38.04 +0.3 38.01 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
37.74 +0.2 37.98 +0.3 38.04 +0.3 38.01 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
37.74 +0.2 37.98 +0.3 38.04 +0.3 38.01 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
37.73 +0.2 37.97 +0.3 38.04 +0.3 38.01 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
37.27 +0.3 37.53 +0.3 37.60 +0.3 37.57 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range
37.28 +0.3 37.54 +0.3 37.61 +0.3 37.58 perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
37.28 +0.3 37.54 +0.3 37.61 +0.3 37.58 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range
36.72 +0.3 36.98 +0.4 37.08 +0.3 37.04 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu
37.15 +0.3 37.41 +0.3 37.49 +0.3 37.46 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
42.65 +0.3 42.92 +0.4 43.00 +0.3 42.97 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
42.65 +0.3 42.92 +0.4 43.00 +0.3 42.97 perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
42.65 +0.3 42.92 +0.4 43.00 +0.3 42.97 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
42.65 +0.3 42.92 +0.4 43.00 +0.3 42.97 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
42.65 +0.3 42.92 +0.4 43.00 +0.3 42.97 perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
42.65 +0.3 42.92 +0.4 43.00 +0.3 42.97 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
42.65 +0.3 42.92 +0.4 43.00 +0.3 42.97 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
42.65 +0.3 42.92 +0.4 43.00 +0.3 42.97 perf-profile.calltrace.cycles-pp.__munmap
41.26 +0.3 41.56 +0.4 41.68 +0.4 41.64 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages
41.26 +0.3 41.56 +0.4 41.68 +0.4 41.63 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache
41.23 +0.3 41.53 +0.4 41.66 +0.4 41.61 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs
43.64 +0.5 44.09 +0.4 44.05 +0.5 44.12 perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
41.57 +0.6 42.17 +0.6 42.14 +0.6 42.22 perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
40.93 +0.6 41.56 +0.6 41.53 +0.7 41.59 perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_fault.__handle_mm_fault
40.84 +0.6 41.48 +0.6 41.44 +0.7 41.50 perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_fault
40.16 +0.7 40.83 +0.6 40.80 +0.7 40.87 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
40.19 +0.7 40.85 +0.6 40.83 +0.7 40.89 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
40.19 +0.7 40.85 +0.6 40.83 +0.7 40.89 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
5.49 -0.3 5.16 -0.4 5.12 -0.4 5.12 perf-profile.children.cycles-pp.copy_page
57.05 -0.3 56.79 -0.4 56.70 -0.3 56.75 perf-profile.children.cycles-pp.testcase
55.66 -0.2 55.44 -0.3 55.36 -0.2 55.41 perf-profile.children.cycles-pp.asm_exc_page_fault
1.88 -0.1 1.75 ± 2% -0.1 1.74 -0.2 1.73 perf-profile.children.cycles-pp.__pte_offset_map_lock
1.79 -0.1 1.66 ± 2% -0.1 1.64 -0.1 1.64 perf-profile.children.cycles-pp._raw_spin_lock
1.19 -0.1 1.11 -0.1 1.11 -0.1 1.11 perf-profile.children.cycles-pp.folio_prealloc
52.64 -0.1 52.57 -0.1 52.49 -0.1 52.55 perf-profile.children.cycles-pp.exc_page_fault
0.96 -0.1 0.91 -0.1 0.91 -0.1 0.91 perf-profile.children.cycles-pp.sync_regs
52.57 -0.0 52.52 -0.1 52.44 -0.1 52.50 perf-profile.children.cycles-pp.do_user_addr_fault
0.73 -0.0 0.69 -0.0 0.68 -0.0 0.68 perf-profile.children.cycles-pp.vma_alloc_folio_noprof
0.63 -0.0 0.60 -0.0 0.59 -0.0 0.59 perf-profile.children.cycles-pp.alloc_pages_mpol_noprof
0.55 -0.0 0.52 -0.0 0.51 -0.0 0.51 perf-profile.children.cycles-pp.__alloc_pages_noprof
51.89 -0.0 51.86 -0.1 51.78 -0.0 51.84 perf-profile.children.cycles-pp.handle_mm_fault
1.02 -0.0 0.99 -0.0 0.99 -0.0 0.98 perf-profile.children.cycles-pp.native_irq_return_iret
0.46 -0.0 0.43 ± 2% -0.0 0.44 -0.0 0.43 perf-profile.children.cycles-pp.shmem_fault
0.39 -0.0 0.36 ± 2% -0.0 0.36 -0.0 0.38 perf-profile.children.cycles-pp.__mem_cgroup_charge
0.51 -0.0 0.48 ± 2% -0.0 0.49 -0.0 0.48 perf-profile.children.cycles-pp.__do_fault
0.38 -0.0 0.36 -0.0 0.35 -0.0 0.36 perf-profile.children.cycles-pp.lru_add_fn
0.51 -0.0 0.49 -0.0 0.50 -0.0 0.48 perf-profile.children.cycles-pp.shmem_get_folio_gfp
0.36 -0.0 0.34 -0.0 0.34 -0.0 0.34 perf-profile.children.cycles-pp.___perf_sw_event
0.42 -0.0 0.40 ± 2% -0.0 0.40 -0.0 0.39 perf-profile.children.cycles-pp.__perf_sw_event
0.41 -0.0 0.39 -0.0 0.39 -0.0 0.39 perf-profile.children.cycles-pp.get_page_from_freelist
0.25 ± 2% -0.0 0.23 -0.0 0.24 ± 2% -0.0 0.23 perf-profile.children.cycles-pp.filemap_get_entry
0.42 -0.0 0.41 -0.0 0.40 -0.0 0.40 perf-profile.children.cycles-pp.zap_present_ptes
0.14 ± 2% -0.0 0.12 ± 3% -0.0 0.12 ± 3% -0.0 0.13 perf-profile.children.cycles-pp.xas_load
0.21 ± 2% -0.0 0.20 -0.0 0.19 ± 2% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.__mod_node_page_state
0.26 -0.0 0.25 -0.0 0.24 -0.0 0.24 perf-profile.children.cycles-pp.__mod_lruvec_state
0.27 -0.0 0.26 ± 2% -0.0 0.26 -0.0 0.26 perf-profile.children.cycles-pp.lock_vma_under_rcu
0.11 -0.0 0.10 -0.0 0.09 ± 5% -0.0 0.10 perf-profile.children.cycles-pp._compound_head
0.23 ± 2% -0.0 0.22 ± 2% -0.0 0.22 -0.0 0.21 perf-profile.children.cycles-pp.rmqueue
0.09 -0.0 0.08 -0.0 0.08 -0.0 0.08 perf-profile.children.cycles-pp.scheduler_tick
0.12 -0.0 0.11 -0.0 0.11 -0.0 0.11 perf-profile.children.cycles-pp.tick_nohz_handler
0.21 -0.0 0.20 -0.0 0.20 -0.0 0.19 ± 2% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.16 -0.0 0.15 ± 2% -0.0 0.15 -0.0 0.15 perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.11 -0.0 0.10 ± 3% -0.0 0.10 -0.0 0.10 perf-profile.children.cycles-pp.update_process_times
0.14 ± 3% -0.0 0.14 ± 3% -0.0 0.13 -0.0 0.13 ± 3% perf-profile.children.cycles-pp.try_charge_memcg
0.15 -0.0 0.14 ± 2% -0.0 0.14 ± 2% -0.0 0.14 perf-profile.children.cycles-pp.hrtimer_interrupt
0.06 -0.0 0.06 ± 8% -0.0 0.05 ± 7% -0.0 0.05 perf-profile.children.cycles-pp.task_tick_fair
0.16 ± 2% -0.0 0.16 -0.0 0.15 -0.0 0.15 ± 2% perf-profile.children.cycles-pp.__mem_cgroup_uncharge_folios
0.07 +0.0 0.08 ± 6% +0.0 0.08 +0.0 0.08 perf-profile.children.cycles-pp.folio_add_lru
4.88 +0.0 4.91 +0.0 4.93 +0.0 4.93 perf-profile.children.cycles-pp.tlb_finish_mmu
37.74 +0.2 37.98 +0.3 38.04 +0.3 38.01 perf-profile.children.cycles-pp.unmap_page_range
37.74 +0.2 37.98 +0.3 38.04 +0.3 38.01 perf-profile.children.cycles-pp.unmap_vmas
37.74 +0.2 37.98 +0.3 38.04 +0.3 38.01 perf-profile.children.cycles-pp.zap_pmd_range
37.74 +0.2 37.98 +0.3 38.04 +0.3 38.01 perf-profile.children.cycles-pp.zap_pte_range
37.28 +0.3 37.54 +0.3 37.61 +0.3 37.58 perf-profile.children.cycles-pp.tlb_flush_mmu
42.65 +0.3 42.92 +0.3 43.00 +0.3 42.97 perf-profile.children.cycles-pp.__vm_munmap
42.65 +0.3 42.92 +0.3 43.00 +0.3 42.97 perf-profile.children.cycles-pp.__x64_sys_munmap
42.65 +0.3 42.92 +0.4 43.00 +0.3 42.97 perf-profile.children.cycles-pp.__munmap
42.65 +0.3 42.92 +0.4 43.00 +0.3 42.97 perf-profile.children.cycles-pp.unmap_region
42.65 +0.3 42.93 +0.4 43.01 +0.3 42.98 perf-profile.children.cycles-pp.do_vmi_align_munmap
42.65 +0.3 42.93 +0.4 43.01 +0.3 42.98 perf-profile.children.cycles-pp.do_vmi_munmap
42.86 +0.3 43.14 +0.4 43.22 +0.3 43.18 perf-profile.children.cycles-pp.do_syscall_64
42.86 +0.3 43.14 +0.4 43.22 +0.3 43.19 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
42.15 +0.3 42.44 +0.4 42.54 +0.3 42.50 perf-profile.children.cycles-pp.free_pages_and_swap_cache
42.12 +0.3 42.41 +0.4 42.50 +0.3 42.46 perf-profile.children.cycles-pp.folios_put_refs
42.15 +0.3 42.45 +0.4 42.54 +0.3 42.50 perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
41.51 +0.3 41.80 +0.4 41.93 +0.4 41.89 perf-profile.children.cycles-pp.__page_cache_release
43.66 +0.5 44.12 +0.4 44.08 +0.5 44.15 perf-profile.children.cycles-pp.finish_fault
41.59 +0.6 42.19 +0.6 42.16 +0.6 42.24 perf-profile.children.cycles-pp.set_pte_range
40.94 +0.6 41.57 +0.6 41.53 +0.7 41.59 perf-profile.children.cycles-pp.folio_add_lru_vma
40.99 +0.6 41.63 +0.6 41.60 +0.7 41.66 perf-profile.children.cycles-pp.folio_batch_move_lru
81.57 +1.0 82.53 +1.1 82.62 +1.1 82.65 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
81.60 +1.0 82.56 +1.1 82.66 +1.1 82.68 perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
81.59 +1.0 82.56 +1.1 82.66 +1.1 82.68 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
5.47 -0.3 5.14 -0.4 5.10 -0.4 5.10 perf-profile.self.cycles-pp.copy_page
1.77 -0.1 1.65 ± 2% -0.1 1.63 -0.1 1.63 perf-profile.self.cycles-pp._raw_spin_lock
2.19 -0.1 2.08 -0.1 2.08 -0.1 2.07 perf-profile.self.cycles-pp.testcase
0.96 -0.0 0.91 -0.0 0.91 -0.0 0.91 perf-profile.self.cycles-pp.sync_regs
1.02 -0.0 0.99 -0.0 0.99 -0.0 0.98 perf-profile.self.cycles-pp.native_irq_return_iret
0.28 ± 2% -0.0 0.26 ± 2% +0.0 0.29 ± 2% +0.0 0.30 ± 2% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
0.19 ± 2% -0.0 0.17 ± 2% -0.0 0.17 ± 2% -0.0 0.17 perf-profile.self.cycles-pp.get_page_from_freelist
0.20 -0.0 0.19 -0.0 0.18 ± 2% -0.0 0.19 ± 2% perf-profile.self.cycles-pp.__mod_node_page_state
0.28 -0.0 0.27 ± 3% -0.0 0.27 -0.0 0.26 perf-profile.self.cycles-pp.___perf_sw_event
0.16 ± 2% -0.0 0.15 ± 2% -0.0 0.15 -0.0 0.15 ± 2% perf-profile.self.cycles-pp.handle_mm_fault
0.06 -0.0 0.05 -0.0 0.05 -0.0 0.05 perf-profile.self.cycles-pp.down_read_trylock
0.09 -0.0 0.08 -0.0 0.08 -0.0 0.08 perf-profile.self.cycles-pp.folio_add_new_anon_rmap
0.11 -0.0 0.10 ± 3% -0.0 0.10 -0.0 0.11 ± 3% perf-profile.self.cycles-pp.xas_load
0.16 -0.0 0.15 ± 2% -0.0 0.15 ± 2% -0.0 0.15 perf-profile.self.cycles-pp.mas_walk
0.12 ± 4% -0.0 0.11 ± 3% +0.0 0.12 -0.0 0.10 perf-profile.self.cycles-pp.filemap_get_entry
0.11 ± 3% -0.0 0.11 ± 4% -0.0 0.10 ± 4% -0.0 0.10 perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.11 -0.0 0.11 ± 4% -0.0 0.10 -0.0 0.10 ± 4% perf-profile.self.cycles-pp.error_entry
0.09 ± 4% -0.0 0.09 -0.0 0.08 -0.0 0.09 ± 4% perf-profile.self.cycles-pp._compound_head
0.21 +0.0 0.21 -0.0 0.20 -0.0 0.20 perf-profile.self.cycles-pp.folios_put_refs
0.12 +0.0 0.12 -0.0 0.11 +0.0 0.12 perf-profile.self.cycles-pp.do_fault
0.00 +0.0 0.00 +0.1 0.05 +0.0 0.00 perf-profile.self.cycles-pp.folio_unlock
81.57 +1.0 82.53 +1.1 82.62 +1.1 82.65 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
>
> Shakeel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-21 2:43 ` Oliver Sang
@ 2024-05-22 4:18 ` Shakeel Butt
2024-05-23 7:48 ` Oliver Sang
0 siblings, 1 reply; 15+ messages in thread
From: Shakeel Butt @ 2024-05-22 4:18 UTC (permalink / raw)
To: Oliver Sang
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin
On Tue, May 21, 2024 at 10:43:16AM +0800, Oliver Sang wrote:
> hi, Shakeel,
>
[...]
>
> we reported regression on a 2-node Skylake server. so I found a 1-node Skylake
> desktop (we don't have 1 node server) to check.
>
Please try the following patch on both single-node and dual-node
machines:
From 00a84b489b9e18abd1b8ec575ea31afacaf0734b Mon Sep 17 00:00:00 2001
From: Shakeel Butt <shakeel.butt@linux.dev>
Date: Tue, 21 May 2024 20:27:11 -0700
Subject: [PATCH] memcg: rearrange fields of mem_cgroup_per_node
At the moment the fields of mem_cgroup_per_node which get read on the
performance-critical path share a cacheline with fields which may get
updated. This causes contention on that cacheline for concurrent
readers. Let's move all the read-only pointers to the start of the
struct, followed by the memcg-v1-only fields, and put the fields which
get updated often at the end.
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
include/linux/memcontrol.h | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 030d34e9d117..16efd9737be9 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -96,23 +96,25 @@ struct mem_cgroup_reclaim_iter {
* per-node information in memory controller.
*/
struct mem_cgroup_per_node {
- struct lruvec lruvec;
+ /* Keep the read-only fields at the start */
+ struct mem_cgroup *memcg; /* Back pointer, we cannot */
+ /* use container_of */
struct lruvec_stats_percpu __percpu *lruvec_stats_percpu;
struct lruvec_stats *lruvec_stats;
-
- unsigned long lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
-
- struct mem_cgroup_reclaim_iter iter;
-
struct shrinker_info __rcu *shrinker_info;
+ /* memcg-v1 only stuff in middle */
+
struct rb_node tree_node; /* RB tree node */
unsigned long usage_in_excess;/* Set to the value by which */
/* the soft limit is exceeded*/
bool on_tree;
- struct mem_cgroup *memcg; /* Back pointer, we cannot */
- /* use container_of */
+
+ /* Fields which get updated often at the end. */
+ struct lruvec lruvec;
+ unsigned long lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
+ struct mem_cgroup_reclaim_iter iter;
};
struct mem_cgroup_threshold {
--
2.43.0
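As a minimal standalone sketch of the layout idea described in the commit
message above -- not the kernel structure itself; the field names, the
4-entry array and the 64-byte cacheline size are illustrative assumptions --
the following userspace C program separates read-mostly pointers from
frequently updated counters. Unlike the patch, which relies on field
ordering alone, the sketch uses an explicit alignas so the split shows up
in the printed offsets:

/*
 * Userspace-only illustration, not the kernel struct: the field names,
 * the 4-entry array and the 64-byte cacheline size are assumptions.
 */
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>

#define CACHELINE_SIZE 64

struct per_node_demo {
	/* Read-mostly pointers, dereferenced on the hot path. */
	void *memcg;
	void *lruvec_stats_percpu;
	void *lruvec_stats;
	void *shrinker_info;

	/*
	 * Frequently written fields start on their own cacheline, so
	 * stores to them do not keep invalidating the line holding the
	 * pointers above in other CPUs' caches (false sharing).
	 */
	alignas(CACHELINE_SIZE) unsigned long lru_counts[4];
	unsigned long update_seq;
};

int main(void)
{
	printf("read-mostly block starts at offset %zu\n",
	       offsetof(struct per_node_demo, memcg));
	printf("write-hot block starts at offset %zu (a multiple of %d)\n",
	       offsetof(struct per_node_demo, lru_counts), CACHELINE_SIZE);
	return 0;
}

In the kernel the analogous annotation would be ____cacheline_aligned_in_smp,
and the resulting offsets of the real struct can be inspected with pahole
(e.g. pahole -C mem_cgroup_per_node vmlinux); the posted patch deliberately
relies on ordering alone, so the struct size does not grow.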
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-22 4:18 ` Shakeel Butt
@ 2024-05-23 7:48 ` Oliver Sang
2024-05-23 16:47 ` Shakeel Butt
0 siblings, 1 reply; 15+ messages in thread
From: Oliver Sang @ 2024-05-23 7:48 UTC (permalink / raw)
To: Shakeel Butt
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin, oliver.sang
[-- Attachment #1: Type: text/plain, Size: 7554 bytes --]
hi, Shakeel,
On Tue, May 21, 2024 at 09:18:19PM -0700, Shakeel Butt wrote:
> On Tue, May 21, 2024 at 10:43:16AM +0800, Oliver Sang wrote:
> > hi, Shakeel,
> >
> [...]
> >
> > we reported regression on a 2-node Skylake server. so I found a 1-node Skylake
> > desktop (we don't have 1 node server) to check.
> >
>
> Please try the following patch on both single node and dual node
> machines:
the regression is partially recovered by applying your patch
(though in one case below the regression actually gets a little worse).
details:
since you mentioned the whole patch-set behavior last time, I applied the
patch on top of
a94032b35e5f9 ("memcg: use proper type for mod_memcg_state")
so below, fd2296741e2686ed6ecd05187e4 = a94032b35e5f9 + patch
for the regression in our original report, the test machine is:
model: Skylake
nr_node: 2
nr_cpu: 104
memory: 192G
the regression is partially recovered:
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/page_fault2/will-it-scale
59142d87ab03b8ff a94032b35e5f97dc1023030d929 fd2296741e2686ed6ecd05187e4
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
91713 -13.0% 79833 -4.5% 87614 will-it-scale.per_process_ops
detailed data is in part [1] of the attachment.
in later threads, we also reported a similar regression on other platforms.
on:
model: Ice Lake
nr_node: 2
nr_cpu: 64
memory: 256G
brand: Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz
the regression is partially recovered, though less markedly than above:
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp9/page_fault2/will-it-scale
59142d87ab03b8ff a94032b35e5f97dc1023030d929 fd2296741e2686ed6ecd05187e4
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
240373 -12.9% 209394 -10.1% 215996 will-it-scale.per_process_ops
detailed data is in part [2] of the attachment.
on:
model: Sapphire Rapids
nr_node: 2
nr_cpu: 224
memory: 512G
brand: Intel(R) Xeon(R) Platinum 8480CTDX
the regression is NOT recovered; it is even a little worse:
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/page_fault2/will-it-scale
59142d87ab03b8ff a94032b35e5f97dc1023030d929 fd2296741e2686ed6ecd05187e4
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
78072 -5.6% 73683 -6.5% 72975 will-it-scale.per_process_ops
detailed data is in part [3] of the attachment.
for a single-node machine, we reported last time that there was no regression on:
model: Skylake
nr_node: 1
nr_cpu: 36
memory: 32G
brand: Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz
we confirmed it's not impacted by this new patch, either:
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-d08/page_fault2/will-it-scale
59142d87ab03b8ff a94032b35e5f97dc1023030d929 fd2296741e2686ed6ecd05187e4
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
136040 -0.1% 135881 -0.1% 135953 will-it-scale.per_process_ops
if you need detail data for this comparison, please let us know.
BTW, after the last update, we found another single-node machine which can reproduce
the regression in our original report:
model: Cascade Lake
nr_node: 1
nr_cpu: 36
memory: 128G
brand: Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz
the regression is also partially recovered now:
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-csl-d02/page_fault2/will-it-scale
59142d87ab03b8ff a94032b35e5f97dc1023030d929 fd2296741e2686ed6ecd05187e4
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
187483 -19.4% 151162 -12.1% 164714 will-it-scale.per_process_ops
detailed data is in part [4] of the attachment.
[-- Attachment #2: detail-comparison --]
[-- Type: text/plain, Size: 136381 bytes --]
[1]
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/page_fault2/will-it-scale
59142d87ab03b8ff a94032b35e5f97dc1023030d929 fd2296741e2686ed6ecd05187e4
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
1.646e+08 +7.6% 1.772e+08 ± 14% +34.5% 2.215e+08 ± 20% cpuidle..time
41.99 ± 16% -24.4% 31.73 ± 16% -25.2% 31.39 ± 12% sched_debug.cfs_rq:/.removed.load_avg.stddev
34.17 -0.9% 33.87 -0.2% 34.12 boot-time.boot
3182 -1.0% 3151 -0.2% 3176 boot-time.idle
21099 ± 5% -16.5% 17627 ± 2% -7.4% 19540 ± 3% perf-c2c.DRAM.local
4025 ± 2% +31.3% 5285 ± 4% -14.7% 3432 ± 2% perf-c2c.HITM.local
0.44 ± 24% +0.1 0.58 +0.2 0.65 ± 20% mpstat.cpu.all.idle%
0.01 ± 23% +0.0 0.01 ± 9% +0.0 0.02 ± 6% mpstat.cpu.all.soft%
7.14 -0.9 6.23 -0.3 6.79 mpstat.cpu.all.usr%
9538291 -13.0% 8302761 -4.5% 9111939 will-it-scale.104.processes
91713 -13.0% 79833 -4.5% 87614 will-it-scale.per_process_ops
9538291 -13.0% 8302761 -4.5% 9111939 will-it-scale.workload
1.438e+09 -12.9% 1.253e+09 -4.2% 1.378e+09 numa-numastat.node0.local_node
1.44e+09 -12.9% 1.254e+09 -4.2% 1.38e+09 numa-numastat.node0.numa_hit
1.453e+09 -13.1% 1.263e+09 -4.9% 1.382e+09 numa-numastat.node1.local_node
1.454e+09 -12.9% 1.265e+09 -4.8% 1.384e+09 numa-numastat.node1.numa_hit
1.44e+09 -12.9% 1.254e+09 -4.2% 1.38e+09 numa-vmstat.node0.numa_hit
1.438e+09 -12.9% 1.253e+09 -4.2% 1.378e+09 numa-vmstat.node0.numa_local
1.454e+09 -12.9% 1.265e+09 -4.8% 1.384e+09 numa-vmstat.node1.numa_hit
1.453e+09 -13.1% 1.263e+09 -4.9% 1.382e+09 numa-vmstat.node1.numa_local
2.894e+09 -12.9% 2.52e+09 -4.5% 2.764e+09 proc-vmstat.numa_hit
2.891e+09 -13.0% 2.516e+09 -4.5% 2.76e+09 proc-vmstat.numa_local
2.88e+09 -12.9% 2.509e+09 -4.5% 2.752e+09 proc-vmstat.pgalloc_normal
2.869e+09 -12.9% 2.499e+09 -4.5% 2.741e+09 proc-vmstat.pgfault
2.88e+09 -12.9% 2.509e+09 -4.5% 2.751e+09 proc-vmstat.pgfree
17.51 -3.2% 16.95 -1.5% 17.23 perf-stat.i.MPKI
9.457e+09 -9.7% 8.542e+09 -3.1% 9.165e+09 perf-stat.i.branch-instructions
45022022 -9.0% 40951240 -2.6% 43850606 perf-stat.i.branch-misses
84.38 -5.7 78.65 -3.2 81.15 perf-stat.i.cache-miss-rate%
8.353e+08 -12.9% 7.271e+08 -4.6% 7.969e+08 perf-stat.i.cache-misses
9.877e+08 -6.6% 9.224e+08 -0.8% 9.799e+08 perf-stat.i.cache-references
6.06 +11.3% 6.75 +3.2% 6.26 perf-stat.i.cpi
136.25 -1.1% 134.73 -0.1% 136.12 perf-stat.i.cpu-migrations
348.56 +14.9% 400.65 +4.9% 365.77 perf-stat.i.cycles-between-cache-misses
4.763e+10 -10.1% 4.285e+10 -3.1% 4.617e+10 perf-stat.i.instructions
0.17 -9.9% 0.15 -3.2% 0.16 perf-stat.i.ipc
182.56 -12.9% 158.99 -4.5% 174.33 perf-stat.i.metric.K/sec
9494393 -12.9% 8270117 -4.5% 9066901 perf-stat.i.minor-faults
9494393 -12.9% 8270117 -4.5% 9066902 perf-stat.i.page-faults
17.54 -3.2% 16.98 -1.6% 17.27 perf-stat.overall.MPKI
84.57 -5.7 78.84 -3.2 81.34 perf-stat.overall.cache-miss-rate%
6.07 +11.2% 6.76 +3.2% 6.27 perf-stat.overall.cpi
346.33 +14.9% 397.97 +4.8% 362.97 perf-stat.overall.cycles-between-cache-misses
0.16 -10.1% 0.15 -3.1% 0.16 perf-stat.overall.ipc
1503802 +3.5% 1555989 +1.7% 1528933 perf-stat.overall.path-length
9.424e+09 -9.7% 8.509e+09 -3.1% 9.133e+09 perf-stat.ps.branch-instructions
44739120 -9.2% 40645392 -2.6% 43568159 perf-stat.ps.branch-misses
8.326e+08 -13.0% 7.247e+08 -4.6% 7.945e+08 perf-stat.ps.cache-misses
9.846e+08 -6.6% 9.193e+08 -0.8% 9.768e+08 perf-stat.ps.cache-references
134.98 -1.1% 133.49 -0.1% 134.89 perf-stat.ps.cpu-migrations
4.747e+10 -10.1% 4.268e+10 -3.1% 4.601e+10 perf-stat.ps.instructions
9463902 -12.9% 8241837 -4.5% 9037920 perf-stat.ps.minor-faults
9463902 -12.9% 8241837 -4.5% 9037920 perf-stat.ps.page-faults
1.434e+13 -9.9% 1.292e+13 -2.9% 1.393e+13 perf-stat.total.instructions
64.15 -2.5 61.69 -0.9 63.21 perf-profile.calltrace.cycles-pp.testcase
58.30 -1.9 56.36 -0.7 57.58 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
52.64 -1.3 51.29 -0.5 52.17 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
52.50 -1.3 51.18 -0.5 52.05 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
50.81 -1.0 49.86 -0.2 50.64 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
9.27 -0.9 8.36 -0.4 8.83 ± 2% perf-profile.calltrace.cycles-pp.copy_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
49.86 -0.8 49.02 -0.1 49.76 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
49.21 -0.8 48.45 -0.1 49.14 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
0.60 ± 4% -0.6 0.00 -0.2 0.35 ± 70% perf-profile.calltrace.cycles-pp.get_mem_cgroup_from_mm.__mem_cgroup_charge.folio_prealloc.do_fault.__handle_mm_fault
3.24 -0.5 2.73 -0.3 2.98 perf-profile.calltrace.cycles-pp.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
5.15 -0.5 4.65 -0.2 4.94 perf-profile.calltrace.cycles-pp.__irqentry_text_end.testcase
0.82 -0.3 0.53 -0.3 0.56 perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
1.68 -0.3 1.43 -0.2 1.51 perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
1.50 ± 2% -0.2 1.26 ± 3% -0.1 1.42 perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
2.52 -0.2 2.27 -0.1 2.40 perf-profile.calltrace.cycles-pp.error_entry.testcase
1.85 -0.2 1.68 -0.1 1.78 ± 2% perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.55 -0.1 1.42 -0.1 1.49 ± 3% perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault
1.07 -0.1 0.95 -0.1 1.00 perf-profile.calltrace.cycles-pp.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault
0.68 -0.1 0.56 ± 2% -0.1 0.61 perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
0.55 -0.1 0.42 ± 44% -0.0 0.53 ± 2% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc
0.90 -0.1 0.80 -0.0 0.86 perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
0.89 -0.1 0.84 -0.0 0.88 perf-profile.calltrace.cycles-pp.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault
1.23 -0.0 1.21 +0.0 1.27 perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
1.15 -0.0 1.13 +0.0 1.19 perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
0.96 +0.0 0.96 +0.1 1.01 perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
0.73 ± 2% +0.0 0.75 +0.1 0.79 perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault
1.00 +0.1 1.06 +0.1 1.08 perf-profile.calltrace.cycles-pp.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
3.85 +0.2 4.09 +0.1 3.95 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
3.85 +0.2 4.09 +0.1 3.95 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
3.85 +0.2 4.09 +0.1 3.96 perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
3.82 +0.2 4.07 +0.1 3.92 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region
3.68 +0.3 3.93 +0.1 3.80 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu
0.83 +0.3 1.12 ± 2% +0.3 1.14 perf-profile.calltrace.cycles-pp.__lruvec_stat_mod_folio.set_pte_range.finish_fault.do_fault.__handle_mm_fault
0.00 +0.6 0.56 ± 3% +0.3 0.34 ± 70% perf-profile.calltrace.cycles-pp.__lruvec_stat_mod_folio.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range
31.81 +0.6 32.44 +0.4 32.22 perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_fault.__handle_mm_fault
31.69 +0.6 32.33 +0.4 32.11 perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_fault
30.47 +0.6 31.11 +0.4 30.90 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
30.48 +0.6 31.13 +0.4 30.91 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
30.44 +0.7 31.09 +0.4 30.88 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
0.00 +0.7 0.68 ± 2% +0.6 0.63 perf-profile.calltrace.cycles-pp.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range
35.03 +0.7 35.76 +0.6 35.66 perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
32.87 +0.9 33.79 +0.7 33.58 perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
29.54 +2.3 31.84 +0.9 30.39 perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
29.54 +2.3 31.84 +0.9 30.39 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range
29.53 +2.3 31.83 +0.9 30.39 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range
30.66 +2.3 32.98 +0.9 31.57 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
30.66 +2.3 32.98 +0.9 31.57 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
30.66 +2.3 32.98 +0.9 31.57 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
30.66 +2.3 32.98 +0.9 31.57 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
29.26 +2.4 31.64 +0.9 30.16 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
28.41 +2.4 30.83 +1.0 29.39 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu
34.56 +2.6 37.12 +1.0 35.57 perf-profile.calltrace.cycles-pp.__munmap
34.55 +2.6 37.12 +1.0 35.57 perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
34.55 +2.6 37.12 +1.0 35.57 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
34.55 +2.6 37.12 +1.0 35.57 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
34.55 +2.6 37.12 +1.0 35.57 perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
34.56 +2.6 37.12 +1.0 35.57 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
34.56 +2.6 37.12 +1.0 35.57 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
34.55 +2.6 37.11 +1.0 35.56 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
31.41 +2.8 34.25 +1.1 32.55 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache
31.38 +2.9 34.24 +1.1 32.53 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs
31.42 +2.9 34.28 +1.1 32.56 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages
65.26 -2.6 62.67 -1.0 64.26 perf-profile.children.cycles-pp.testcase
56.09 -1.7 54.39 -0.6 55.47 perf-profile.children.cycles-pp.asm_exc_page_fault
52.66 -1.3 51.31 -0.5 52.19 perf-profile.children.cycles-pp.exc_page_fault
52.52 -1.3 51.20 -0.5 52.07 perf-profile.children.cycles-pp.do_user_addr_fault
50.83 -1.0 49.88 -0.2 50.66 perf-profile.children.cycles-pp.handle_mm_fault
9.35 -0.9 8.44 -0.4 8.91 ± 2% perf-profile.children.cycles-pp.copy_page
49.87 -0.8 49.03 -0.1 49.77 perf-profile.children.cycles-pp.__handle_mm_fault
49.23 -0.8 48.47 -0.1 49.16 perf-profile.children.cycles-pp.do_fault
3.27 -0.5 2.76 -0.3 3.01 perf-profile.children.cycles-pp.folio_prealloc
5.15 -0.5 4.65 -0.2 4.94 perf-profile.children.cycles-pp.__irqentry_text_end
0.82 -0.3 0.53 -0.3 0.57 perf-profile.children.cycles-pp.lock_vma_under_rcu
1.52 ± 2% -0.3 1.26 ± 3% -0.1 1.43 perf-profile.children.cycles-pp.__mem_cgroup_charge
1.69 -0.2 1.44 -0.2 1.52 perf-profile.children.cycles-pp.vma_alloc_folio_noprof
2.54 -0.2 2.29 -0.1 2.43 perf-profile.children.cycles-pp.error_entry
0.57 -0.2 0.33 -0.2 0.34 perf-profile.children.cycles-pp.mas_walk
1.87 -0.2 1.70 -0.1 1.80 ± 2% perf-profile.children.cycles-pp.__pte_offset_map_lock
0.60 ± 4% -0.2 0.44 ± 6% -0.1 0.52 ± 5% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
1.57 -0.1 1.43 -0.1 1.51 ± 3% perf-profile.children.cycles-pp._raw_spin_lock
1.12 -0.1 0.99 -0.1 1.04 perf-profile.children.cycles-pp.alloc_pages_mpol_noprof
0.70 -0.1 0.57 ± 2% -0.1 0.62 perf-profile.children.cycles-pp.lru_add_fn
0.95 -0.1 0.82 ± 5% +0.3 1.22 ± 2% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
1.16 -0.1 1.04 -0.0 1.11 perf-profile.children.cycles-pp.native_irq_return_iret
0.94 -0.1 0.84 -0.0 0.90 perf-profile.children.cycles-pp.sync_regs
0.43 -0.1 0.34 ± 2% -0.0 0.39 perf-profile.children.cycles-pp.free_unref_folios
0.96 -0.1 0.87 -0.0 0.92 perf-profile.children.cycles-pp.__perf_sw_event
0.44 -0.1 0.36 -0.1 0.39 perf-profile.children.cycles-pp.get_vma_policy
0.21 ± 3% -0.1 0.13 ± 2% -0.0 0.16 ± 2% perf-profile.children.cycles-pp._compound_head
0.75 -0.1 0.68 -0.0 0.72 perf-profile.children.cycles-pp.___perf_sw_event
0.94 -0.1 0.88 -0.0 0.92 perf-profile.children.cycles-pp.__alloc_pages_noprof
0.44 ± 5% -0.1 0.37 ± 7% -0.0 0.42 ± 6% perf-profile.children.cycles-pp.__count_memcg_events
0.31 -0.1 0.24 ± 2% -0.0 0.28 ± 3% perf-profile.children.cycles-pp.__mem_cgroup_uncharge_folios
0.41 ± 4% -0.1 0.35 ± 7% -0.0 0.40 ± 5% perf-profile.children.cycles-pp.mem_cgroup_commit_charge
0.57 -0.0 0.52 -0.0 0.55 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
0.17 ± 2% -0.0 0.12 ± 4% -0.0 0.15 ± 3% perf-profile.children.cycles-pp.uncharge_batch
0.19 ± 3% -0.0 0.15 ± 8% -0.0 0.18 ± 2% perf-profile.children.cycles-pp.cgroup_rstat_updated
0.15 ± 2% -0.0 0.12 ± 4% -0.0 0.13 ± 3% perf-profile.children.cycles-pp.free_unref_page_commit
0.32 ± 3% -0.0 0.29 ± 2% -0.0 0.30 ± 2% perf-profile.children.cycles-pp.__mod_node_page_state
0.13 ± 3% -0.0 0.10 ± 5% -0.0 0.11 ± 3% perf-profile.children.cycles-pp.page_counter_uncharge
0.13 ± 2% -0.0 0.10 ± 4% -0.0 0.12 ± 6% perf-profile.children.cycles-pp.__mod_zone_page_state
0.10 ± 3% -0.0 0.07 ± 5% -0.0 0.09 ± 5% perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
0.08 -0.0 0.05 -0.0 0.05 ± 8% perf-profile.children.cycles-pp.policy_nodemask
1.24 -0.0 1.21 +0.0 1.28 perf-profile.children.cycles-pp.__do_fault
0.36 -0.0 0.33 -0.0 0.34 perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.39 -0.0 0.37 -0.0 0.38 ± 2% perf-profile.children.cycles-pp.rmqueue
0.17 ± 2% -0.0 0.15 -0.0 0.16 ± 3% perf-profile.children.cycles-pp.percpu_counter_add_batch
0.32 -0.0 0.30 -0.0 0.31 perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
1.15 -0.0 1.13 +0.0 1.19 perf-profile.children.cycles-pp.shmem_fault
0.09 -0.0 0.07 -0.0 0.08 perf-profile.children.cycles-pp.get_pfnblock_flags_mask
0.16 -0.0 0.14 -0.0 0.15 ± 3% perf-profile.children.cycles-pp.handle_pte_fault
0.12 ± 3% -0.0 0.10 ± 3% -0.0 0.11 ± 4% perf-profile.children.cycles-pp.uncharge_folio
0.16 ± 2% -0.0 0.14 ± 2% -0.0 0.15 ± 2% perf-profile.children.cycles-pp.shmem_get_policy
0.29 -0.0 0.27 -0.0 0.28 ± 2% perf-profile.children.cycles-pp.hrtimer_interrupt
0.08 -0.0 0.06 ± 6% -0.0 0.07 ± 5% perf-profile.children.cycles-pp.folio_unlock
0.16 ± 4% -0.0 0.14 ± 3% -0.0 0.15 ± 2% perf-profile.children.cycles-pp.__pte_offset_map
0.25 -0.0 0.24 -0.0 0.24 perf-profile.children.cycles-pp.__hrtimer_run_queues
0.30 -0.0 0.28 ± 2% -0.0 0.28 perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.20 ± 2% -0.0 0.18 ± 3% -0.0 0.19 ± 2% perf-profile.children.cycles-pp.tick_nohz_handler
0.09 ± 4% -0.0 0.08 -0.0 0.09 perf-profile.children.cycles-pp.down_read_trylock
0.12 ± 3% -0.0 0.11 -0.0 0.12 ± 3% perf-profile.children.cycles-pp.folio_add_new_anon_rmap
0.99 -0.0 0.99 +0.1 1.04 ± 2% perf-profile.children.cycles-pp.shmem_get_folio_gfp
0.04 ± 44% +0.0 0.06 ± 7% -0.0 0.02 ±142% perf-profile.children.cycles-pp.kthread
0.04 ± 44% +0.0 0.06 ± 7% -0.0 0.02 ±142% perf-profile.children.cycles-pp.ret_from_fork
0.04 ± 44% +0.0 0.06 ± 7% -0.0 0.02 ±142% perf-profile.children.cycles-pp.ret_from_fork_asm
0.73 +0.0 0.75 +0.1 0.79 perf-profile.children.cycles-pp.filemap_get_entry
0.00 +0.1 0.05 +0.0 0.01 ±223% perf-profile.children.cycles-pp._raw_spin_lock_irq
1.02 +0.1 1.07 +0.1 1.10 perf-profile.children.cycles-pp.zap_present_ptes
0.47 +0.2 0.68 ± 2% +0.2 0.64 perf-profile.children.cycles-pp.folio_remove_rmap_ptes
3.87 +0.2 4.11 +0.1 3.97 perf-profile.children.cycles-pp.tlb_finish_mmu
1.17 +0.6 1.75 ± 2% +0.5 1.67 perf-profile.children.cycles-pp.__lruvec_stat_mod_folio
31.81 +0.6 32.44 +0.4 32.22 perf-profile.children.cycles-pp.folio_add_lru_vma
31.77 +0.6 32.42 +0.4 32.19 perf-profile.children.cycles-pp.folio_batch_move_lru
35.04 +0.7 35.77 +0.6 35.67 perf-profile.children.cycles-pp.finish_fault
32.88 +0.9 33.80 +0.7 33.59 perf-profile.children.cycles-pp.set_pte_range
29.54 +2.3 31.84 +0.9 30.39 perf-profile.children.cycles-pp.tlb_flush_mmu
30.66 +2.3 32.98 +0.9 31.57 perf-profile.children.cycles-pp.zap_pte_range
30.66 +2.3 32.98 +0.9 31.58 perf-profile.children.cycles-pp.unmap_page_range
30.66 +2.3 32.98 +0.9 31.58 perf-profile.children.cycles-pp.unmap_vmas
30.66 +2.3 32.98 +0.9 31.58 perf-profile.children.cycles-pp.zap_pmd_range
33.41 +2.5 35.95 +1.0 34.36 perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
33.40 +2.5 35.94 +1.0 34.36 perf-profile.children.cycles-pp.free_pages_and_swap_cache
34.56 +2.6 37.12 +1.0 35.57 perf-profile.children.cycles-pp.__x64_sys_munmap
34.56 +2.6 37.12 +1.0 35.57 perf-profile.children.cycles-pp.__vm_munmap
34.56 +2.6 37.12 +1.0 35.58 perf-profile.children.cycles-pp.do_vmi_munmap
34.56 +2.6 37.12 +1.0 35.57 perf-profile.children.cycles-pp.__munmap
34.56 +2.6 37.12 +1.0 35.58 perf-profile.children.cycles-pp.do_vmi_align_munmap
34.56 +2.6 37.12 +1.0 35.58 perf-profile.children.cycles-pp.unmap_region
34.67 +2.6 37.24 +1.0 35.68 perf-profile.children.cycles-pp.do_syscall_64
34.67 +2.6 37.24 +1.0 35.69 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
33.22 +2.6 35.83 +1.0 34.21 perf-profile.children.cycles-pp.folios_put_refs
32.12 +2.7 34.80 +1.1 33.22 perf-profile.children.cycles-pp.__page_cache_release
61.97 +3.5 65.47 +1.6 63.54 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
61.98 +3.5 65.50 +1.6 63.56 perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
61.94 +3.5 65.48 +1.6 63.51 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
9.32 -0.9 8.41 -0.4 8.88 ± 2% perf-profile.self.cycles-pp.copy_page
5.15 -0.5 4.65 -0.2 4.94 perf-profile.self.cycles-pp.__irqentry_text_end
2.58 -0.3 2.30 -0.1 2.46 perf-profile.self.cycles-pp.testcase
2.53 -0.2 2.28 -0.1 2.42 perf-profile.self.cycles-pp.error_entry
0.56 -0.2 0.32 ± 2% -0.2 0.34 perf-profile.self.cycles-pp.mas_walk
0.60 ± 4% -0.2 0.43 ± 5% -0.1 0.51 ± 5% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
1.54 -0.1 1.42 -0.1 1.49 ± 3% perf-profile.self.cycles-pp._raw_spin_lock
1.15 -0.1 1.04 -0.0 1.11 perf-profile.self.cycles-pp.native_irq_return_iret
0.94 -0.1 0.84 -0.0 0.90 perf-profile.self.cycles-pp.sync_regs
0.85 -0.1 0.75 ± 5% +0.3 1.13 ± 2% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
0.20 ± 3% -0.1 0.12 ± 3% -0.1 0.15 ± 2% perf-profile.self.cycles-pp._compound_head
0.27 ± 3% -0.1 0.19 ± 2% -0.0 0.23 ± 3% perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.26 -0.1 0.19 ± 3% -0.0 0.25 ± 2% perf-profile.self.cycles-pp.__page_cache_release
0.66 -0.1 0.59 -0.0 0.63 perf-profile.self.cycles-pp.___perf_sw_event
0.28 ± 2% -0.1 0.22 ± 3% -0.0 0.25 perf-profile.self.cycles-pp.zap_present_ptes
0.32 -0.1 0.27 ± 4% -0.0 0.28 perf-profile.self.cycles-pp.lru_add_fn
0.37 ± 5% -0.1 0.32 ± 6% -0.0 0.36 ± 6% perf-profile.self.cycles-pp.__count_memcg_events
0.26 -0.1 0.20 -0.0 0.21 perf-profile.self.cycles-pp.get_vma_policy
0.47 -0.1 0.42 -0.0 0.44 ± 2% perf-profile.self.cycles-pp.__handle_mm_fault
0.16 -0.0 0.12 ± 4% -0.0 0.12 ± 3% perf-profile.self.cycles-pp.vma_alloc_folio_noprof
0.20 -0.0 0.16 ± 3% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.free_unref_folios
0.30 -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.handle_mm_fault
0.16 ± 4% -0.0 0.12 ± 3% -0.0 0.13 ± 3% perf-profile.self.cycles-pp.lock_vma_under_rcu
0.14 ± 3% -0.0 0.11 ± 3% -0.0 0.13 perf-profile.self.cycles-pp.folio_remove_rmap_ptes
0.10 ± 4% -0.0 0.07 -0.0 0.09 ± 4% perf-profile.self.cycles-pp.zap_pte_range
0.16 ± 2% -0.0 0.12 ± 7% -0.0 0.16 ± 3% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.10 ± 4% -0.0 0.07 ± 5% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.alloc_pages_mpol_noprof
0.11 -0.0 0.08 -0.0 0.10 ± 4% perf-profile.self.cycles-pp.free_unref_page_commit
0.09 ± 5% -0.0 0.06 ± 7% -0.0 0.08 perf-profile.self.cycles-pp.mem_cgroup_update_lru_size
0.11 -0.0 0.08 ± 4% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.page_counter_uncharge
0.12 ± 4% -0.0 0.09 -0.0 0.11 ± 5% perf-profile.self.cycles-pp.__mod_zone_page_state
0.31 ± 2% -0.0 0.29 ± 2% -0.0 0.30 ± 2% perf-profile.self.cycles-pp.__mod_node_page_state
0.14 ± 2% -0.0 0.12 ± 4% -0.0 0.14 ± 3% perf-profile.self.cycles-pp.mem_cgroup_commit_charge
0.21 -0.0 0.19 ± 2% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.do_user_addr_fault
0.09 -0.0 0.07 ± 5% -0.0 0.08 perf-profile.self.cycles-pp.get_pfnblock_flags_mask
0.21 -0.0 0.19 ± 2% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.__perf_sw_event
0.17 ± 2% -0.0 0.15 -0.0 0.16 ± 2% perf-profile.self.cycles-pp.percpu_counter_add_batch
0.28 -0.0 0.26 ± 2% -0.0 0.27 perf-profile.self.cycles-pp.__alloc_pages_noprof
0.22 ± 2% -0.0 0.19 ± 2% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.__pte_offset_map_lock
0.20 ± 2% -0.0 0.18 ± 2% -0.0 0.20 ± 3% perf-profile.self.cycles-pp.shmem_get_folio_gfp
0.12 -0.0 0.10 -0.0 0.11 ± 4% perf-profile.self.cycles-pp.uncharge_folio
0.11 ± 4% -0.0 0.09 ± 4% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.__mem_cgroup_charge
0.08 -0.0 0.06 ± 6% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.folio_unlock
0.14 ± 3% -0.0 0.12 ± 3% -0.0 0.12 ± 3% perf-profile.self.cycles-pp.do_fault
0.16 ± 3% -0.0 0.14 ± 2% -0.0 0.15 ± 3% perf-profile.self.cycles-pp.shmem_get_policy
0.10 ± 3% -0.0 0.08 ± 5% -0.0 0.09 perf-profile.self.cycles-pp.set_pte_range
0.16 ± 2% -0.0 0.15 ± 2% -0.0 0.16 ± 2% perf-profile.self.cycles-pp.get_page_from_freelist
0.10 ± 3% -0.0 0.09 -0.0 0.10 ± 5% perf-profile.self.cycles-pp.exc_page_fault
0.12 ± 3% -0.0 0.11 -0.0 0.12 ± 3% perf-profile.self.cycles-pp.folio_add_new_anon_rmap
0.09 -0.0 0.08 +0.0 0.09 perf-profile.self.cycles-pp.down_read_trylock
0.38 ± 2% +0.0 0.42 +0.1 0.44 ± 2% perf-profile.self.cycles-pp.filemap_get_entry
0.26 +0.1 0.36 -0.0 0.23 perf-profile.self.cycles-pp.folios_put_refs
0.33 +0.1 0.45 ± 4% +0.1 0.40 perf-profile.self.cycles-pp.folio_batch_move_lru
0.40 ± 5% +0.6 0.99 +0.2 0.59 perf-profile.self.cycles-pp.__lruvec_stat_mod_folio
61.94 +3.5 65.48 +1.6 63.51 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
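
One way to build intuition for the larger __lruvec_stat_mod_folio / __mod_memcg_lruvec_state self time in the profile above is the extra pointer dereference that a separately allocated stats object costs on every counter update. The snippet below is a minimal userspace sketch only: struct embedded, struct indirect, struct stats, NSTATS, ITERS and seconds() are all made-up names, not the kernel's lruvec_stats layout, and it is not the code under test; it just models "counters embedded in the parent object" vs. "counters reached through a separately allocated object".

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical illustration only: none of these structures come from the
 * kernel; they model "stats inline in the parent" vs. "stats via pointer".
 */
#define NSTATS 64
#define ITERS  (50 * 1000 * 1000)

struct stats    { long count[NSTATS]; };
struct embedded { long pad[8]; struct stats  s;  };  /* counters inline      */
struct indirect { long pad[8]; struct stats *sp; };  /* counters via pointer */

static double seconds(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
        struct embedded e = { { 0 }, { { 0 } } };
        struct indirect i = { { 0 }, NULL };
        unsigned idx = (unsigned)time(NULL);   /* runtime seed so the loops are not folded away */
        long sum = 0;
        double t0, t1, t2;

        i.sp = calloc(1, sizeof(*i.sp));       /* separate allocation, likely another cache line */
        if (!i.sp)
                return 1;

        t0 = seconds();
        for (long n = 0; n < ITERS; n++) {
                idx = idx * 1103515245u + 12345u;
                e.s.count[idx & (NSTATS - 1)]++;     /* address derivable from &e alone        */
        }
        t1 = seconds();
        for (long n = 0; n < ITERS; n++) {
                idx = idx * 1103515245u + 12345u;
                i.sp->count[idx & (NSTATS - 1)]++;   /* extra dependent load of i.sp first     */
        }
        t2 = seconds();

        for (int k = 0; k < NSTATS; k++)
                sum += e.s.count[k] + i.sp->count[k];    /* keep the stores observable */

        printf("embedded %.3fs  indirect %.3fs  (checksum %ld)\n", t1 - t0, t2 - t1, sum);
        free(i.sp);
        return 0;
}

Built with something like "gcc -O2" (older glibc may need -lrt for clock_gettime), the indirect variant pays one extra dependent load per update; the timings are only indicative, since a real measurement would pin the CPU and defeat the optimizer more carefully. Whether this effect or the lruvec spinlock contention visible in native_queued_spin_lock_slowpath dominates the regression is not something the sketch can decide.
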
[2]
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp9/page_fault2/will-it-scale
59142d87ab03b8ff a94032b35e5f97dc1023030d929 fd2296741e2686ed6ecd05187e4
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
194.40 ± 9% -13.9% 167.40 ± 2% -10.0% 175.00 ± 4% perf-c2c.HITM.remote
0.27 ± 3% -0.0 0.24 ± 2% -0.0 0.25 ± 2% mpstat.cpu.all.irq%
3.83 -0.6 3.21 -0.5 3.37 ± 2% mpstat.cpu.all.usr%
15383898 -12.9% 13401271 -10.1% 13823802 will-it-scale.64.processes
240373 -12.9% 209394 -10.1% 215996 will-it-scale.per_process_ops
15383898 -12.9% 13401271 -10.1% 13823802 will-it-scale.workload
2.359e+09 -12.8% 2.057e+09 -10.2% 2.118e+09 ± 2% numa-numastat.node0.local_node
2.359e+09 -12.8% 2.057e+09 -10.2% 2.118e+09 ± 2% numa-numastat.node0.numa_hit
2.346e+09 -13.2% 2.035e+09 ± 2% -10.3% 2.105e+09 numa-numastat.node1.local_node
2.345e+09 -13.2% 2.036e+09 ± 2% -10.2% 2.105e+09 numa-numastat.node1.numa_hit
2.36e+09 -12.9% 2.056e+09 -10.2% 2.118e+09 ± 2% numa-vmstat.node0.numa_hit
2.36e+09 -12.9% 2.056e+09 -10.3% 2.118e+09 ± 2% numa-vmstat.node0.numa_local
2.346e+09 -13.3% 2.035e+09 ± 2% -10.3% 2.105e+09 numa-vmstat.node1.numa_hit
2.347e+09 -13.3% 2.034e+09 ± 2% -10.3% 2.105e+09 numa-vmstat.node1.numa_local
7.86 ± 5% -29.5% 5.54 ± 34% -37.0% 4.95 ± 30% sched_debug.cfs_rq:/.removed.runnable_avg.avg
22.93 ± 4% -18.5% 18.68 ± 15% -21.7% 17.96 ± 20% sched_debug.cfs_rq:/.removed.runnable_avg.stddev
7.86 ± 5% -30.0% 5.50 ± 34% -37.0% 4.95 ± 30% sched_debug.cfs_rq:/.removed.util_avg.avg
22.93 ± 4% -19.9% 18.35 ± 14% -21.7% 17.96 ± 20% sched_debug.cfs_rq:/.removed.util_avg.stddev
149.50 ± 33% -70.9% 43.57 ±125% -58.2% 62.42 ± 67% sched_debug.cfs_rq:/.util_est.min
1930 ± 4% -10.5% 1729 ± 16% -14.9% 1643 ± 6% sched_debug.cpu.nr_switches.min
1137116 -1.8% 1116759 -1.8% 1116590 proc-vmstat.nr_anon_pages
4575 +1.7% 4654 +1.7% 4652 proc-vmstat.nr_page_table_pages
4.705e+09 -13.0% 4.093e+09 -10.2% 4.224e+09 proc-vmstat.numa_hit
4.706e+09 -13.0% 4.092e+09 -10.3% 4.223e+09 proc-vmstat.numa_local
4.645e+09 -12.8% 4.05e+09 -10.1% 4.177e+09 proc-vmstat.pgalloc_normal
4.631e+09 -12.8% 4.038e+09 -10.1% 4.164e+09 proc-vmstat.pgfault
4.643e+09 -12.8% 4.049e+09 -10.1% 4.176e+09 proc-vmstat.pgfree
21.14 -9.9% 19.05 -7.4% 19.58 perf-stat.i.MPKI
1.468e+10 -7.9% 1.351e+10 -6.2% 1.378e+10 perf-stat.i.branch-instructions
14349180 -6.2% 13464962 -5.2% 13596701 perf-stat.i.branch-misses
69.58 -4.6 64.96 -3.2 66.40 perf-stat.i.cache-miss-rate%
1.57e+09 -17.8% 1.291e+09 -13.6% 1.356e+09 ± 2% perf-stat.i.cache-misses
2.252e+09 -11.9% 1.985e+09 -9.4% 2.039e+09 perf-stat.i.cache-references
3.00 +10.6% 3.32 +8.1% 3.25 perf-stat.i.cpi
99.00 -0.9% 98.13 -1.1% 97.87 perf-stat.i.cpu-migrations
143.06 +22.4% 175.18 +16.4% 166.58 ± 2% perf-stat.i.cycles-between-cache-misses
7.403e+10 -8.7% 6.76e+10 -6.7% 6.91e+10 perf-stat.i.instructions
0.34 -9.7% 0.30 -7.6% 0.31 perf-stat.i.ipc
478.41 -12.7% 417.50 -10.0% 430.74 perf-stat.i.metric.K/sec
15310132 -12.7% 13361235 -10.0% 13784853 perf-stat.i.minor-faults
15310132 -12.7% 13361235 -10.0% 13784853 perf-stat.i.page-faults
21.21 -28.3% 15.20 ± 50% -7.5% 19.62 perf-stat.overall.MPKI
0.10 -0.0 0.08 ± 50% +0.0 0.10 perf-stat.overall.branch-miss-rate%
69.71 -17.9 51.83 ± 50% -3.2 66.46 perf-stat.overall.cache-miss-rate%
3.01 -11.4% 2.67 ± 50% +8.0% 3.25 perf-stat.overall.cpi
141.98 -1.2% 140.33 ± 50% +16.8% 165.83 ± 2% perf-stat.overall.cycles-between-cache-misses
0.33 -27.7% 0.24 ± 50% -7.4% 0.31 perf-stat.overall.ipc
1453908 -16.2% 1218410 ± 50% +3.6% 1506867 perf-stat.overall.path-length
1.463e+10 -26.4% 1.077e+10 ± 50% -6.2% 1.373e+10 perf-stat.ps.branch-instructions
14253731 -25.1% 10681742 ± 50% -5.2% 13506212 perf-stat.ps.branch-misses
1.565e+09 -34.6% 1.023e+09 ± 50% -13.6% 1.351e+09 ± 2% perf-stat.ps.cache-misses
2.245e+09 -29.6% 1.579e+09 ± 50% -9.4% 2.032e+09 perf-stat.ps.cache-references
7.378e+10 -27.0% 5.385e+10 ± 50% -6.7% 6.886e+10 perf-stat.ps.instructions
15260342 -30.3% 10633461 ± 50% -10.0% 13738637 perf-stat.ps.minor-faults
15260342 -30.3% 10633461 ± 50% -10.0% 13738637 perf-stat.ps.page-faults
2.237e+13 -27.2% 1.629e+13 ± 50% -6.9% 2.083e+13 perf-stat.total.instructions
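
For the derived metric above, perf-stat.overall.path-length appears to be instructions retired per benchmark operation; this is an assumption about how lkp computes it, not something stated in the report, but it cross-checks against the rows in this block:

\[
\text{path-length} \approx \frac{\text{perf-stat.total.instructions}}{\text{will-it-scale.workload}}
= \frac{2.237\times10^{13}}{15383898} \approx 1.45\times10^{6}
\]

which matches the 1453908 figure in the base column, and the same relation holds for the third column (2.083e13 / 13823802 ≈ 1.507e6).
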
75.68 -5.4 70.26 -5.0 70.73 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
72.31 -5.1 67.25 -4.7 67.66 perf-profile.calltrace.cycles-pp.testcase
63.50 -3.9 59.64 -3.7 59.78 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
63.32 -3.8 59.48 -3.7 59.63 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
61.04 -3.6 57.49 -3.5 57.55 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
21.29 -3.5 17.77 ± 2% -2.8 18.48 ± 2% perf-profile.calltrace.cycles-pp.copy_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
59.53 -3.3 56.21 -3.3 56.24 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
58.35 -3.2 55.17 -3.2 55.16 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
5.31 -0.8 4.50 -0.7 4.64 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
4.97 -0.8 4.21 -0.6 4.35 perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault
4.40 -0.6 3.78 ± 2% -0.4 3.96 ± 3% perf-profile.calltrace.cycles-pp.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.57 -0.6 0.00 -0.3 0.26 ±100% perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
2.63 -0.3 2.29 -0.3 2.36 perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
1.82 -0.3 1.49 -0.3 1.55 perf-profile.calltrace.cycles-pp.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
2.21 -0.3 1.90 -0.2 1.97 perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
2.01 -0.3 1.73 ± 2% -0.2 1.84 ± 5% perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
1.80 -0.3 1.54 -0.2 1.59 perf-profile.calltrace.cycles-pp.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault
1.55 -0.2 1.33 -0.2 1.36 perf-profile.calltrace.cycles-pp.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault
1.74 -0.2 1.52 ± 2% -0.2 1.57 perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.63 ± 2% -0.2 0.41 ± 50% -0.1 0.53 ± 2% perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault
1.60 -0.2 1.39 ± 2% -0.2 1.44 perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.29 -0.2 1.11 ± 3% -0.1 1.19 ± 6% perf-profile.calltrace.cycles-pp.mem_cgroup_commit_charge.__mem_cgroup_charge.folio_prealloc.do_fault.__handle_mm_fault
1.42 -0.2 1.24 ± 2% -0.1 1.28 ± 2% perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
1.12 -0.2 0.95 ± 2% -0.1 0.98 perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc
1.50 -0.1 1.36 ± 3% -0.2 1.33 ± 2% perf-profile.calltrace.cycles-pp.__lruvec_stat_mod_folio.set_pte_range.finish_fault.do_fault.__handle_mm_fault
0.72 ± 2% -0.1 0.60 ± 3% -0.1 0.62 ± 2% perf-profile.calltrace.cycles-pp.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
0.98 -0.1 0.87 ± 2% -0.1 0.90 ± 2% perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.92 -0.1 0.81 ± 3% -0.1 0.84 ± 3% perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault
0.74 -0.1 0.64 -0.1 0.66 perf-profile.calltrace.cycles-pp.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range
0.66 -0.1 0.56 ± 2% -0.1 0.59 perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.64 -0.1 0.56 ± 2% -0.1 0.57 ± 2% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof
1.15 -0.1 1.07 -0.1 1.08 ± 2% perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
0.66 -0.1 0.58 ± 2% -0.1 0.60 ± 2% perf-profile.calltrace.cycles-pp.mas_walk.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
2.71 +0.6 3.31 ± 2% +0.5 3.23 perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
2.71 +0.6 3.31 ± 2% +0.5 3.23 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
2.71 +0.6 3.31 ± 2% +0.5 3.22 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
2.65 +0.6 3.26 ± 2% +0.5 3.17 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region
2.44 +0.6 3.07 ± 2% +0.5 2.98 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu
24.39 +2.1 26.54 ± 3% +1.0 25.41 ± 4% perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
22.46 +2.3 24.81 ± 4% +1.2 23.70 ± 4% perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_fault.__handle_mm_fault
22.25 +2.4 24.63 ± 4% +1.3 23.52 ± 5% perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_fault
20.38 +2.5 22.84 ± 4% +1.3 21.71 ± 5% perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
20.37 +2.5 22.83 ± 4% +1.3 21.70 ± 5% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
20.30 +2.5 22.77 ± 4% +1.3 21.63 ± 5% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
22.59 +4.7 27.29 ± 2% +4.3 26.92 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
22.59 +4.7 27.29 ± 2% +4.3 26.92 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
22.59 +4.7 27.29 ± 2% +4.3 26.92 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
22.58 +4.7 27.28 ± 2% +4.3 26.91 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
20.59 +5.1 25.64 ± 2% +4.6 25.21 perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
20.59 +5.1 25.64 ± 2% +4.6 25.20 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range
20.56 +5.1 25.62 ± 2% +4.6 25.18 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range
20.07 +5.2 25.23 ± 3% +4.7 24.78 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
18.73 +5.3 24.01 ± 3% +4.8 23.55 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu
25.34 +5.3 30.64 ± 2% +4.8 30.19 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
25.34 +5.3 30.64 ± 2% +4.8 30.19 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
25.34 +5.3 30.64 ± 2% +4.8 30.19 perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
25.34 +5.3 30.64 ± 2% +4.8 30.19 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
25.34 +5.3 30.65 ± 2% +4.9 30.19 perf-profile.calltrace.cycles-pp.__munmap
25.34 +5.3 30.64 ± 2% +4.9 30.19 perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
25.33 +5.3 30.64 ± 2% +4.9 30.18 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
25.33 +5.3 30.64 ± 2% +4.9 30.19 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
20.36 +5.9 26.30 ± 3% +5.4 25.74 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages
20.35 +5.9 26.29 ± 3% +5.4 25.73 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache
20.28 +6.0 26.24 ± 3% +5.4 25.67 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs
74.49 -5.3 69.18 -4.9 69.64 perf-profile.children.cycles-pp.testcase
71.15 -4.8 66.30 -4.5 66.66 perf-profile.children.cycles-pp.asm_exc_page_fault
63.55 -3.9 59.68 -3.7 59.82 perf-profile.children.cycles-pp.exc_page_fault
63.38 -3.8 59.54 -3.7 59.68 perf-profile.children.cycles-pp.do_user_addr_fault
61.10 -3.6 57.54 -3.5 57.61 perf-profile.children.cycles-pp.handle_mm_fault
21.32 -3.5 17.80 ± 2% -2.8 18.51 ± 2% perf-profile.children.cycles-pp.copy_page
59.57 -3.3 56.24 -3.3 56.27 perf-profile.children.cycles-pp.__handle_mm_fault
58.44 -3.2 55.25 -3.2 55.25 perf-profile.children.cycles-pp.do_fault
5.36 -0.8 4.54 -0.7 4.69 perf-profile.children.cycles-pp.__pte_offset_map_lock
5.02 -0.8 4.25 -0.6 4.38 perf-profile.children.cycles-pp._raw_spin_lock
4.45 -0.6 3.82 ± 2% -0.4 4.00 ± 3% perf-profile.children.cycles-pp.folio_prealloc
2.64 -0.3 2.30 -0.3 2.37 perf-profile.children.cycles-pp.sync_regs
1.89 -0.3 1.55 -0.3 1.62 perf-profile.children.cycles-pp.zap_present_ptes
2.42 -0.3 2.09 ± 2% -0.3 2.16 ± 2% perf-profile.children.cycles-pp.native_irq_return_iret
2.24 -0.3 1.93 -0.2 2.00 perf-profile.children.cycles-pp.vma_alloc_folio_noprof
2.07 -0.3 1.77 ± 2% -0.2 1.88 ± 5% perf-profile.children.cycles-pp.__mem_cgroup_charge
1.89 -0.3 1.62 -0.2 1.67 perf-profile.children.cycles-pp.alloc_pages_mpol_noprof
1.64 -0.2 1.41 -0.2 1.45 perf-profile.children.cycles-pp.__alloc_pages_noprof
1.42 -0.2 1.19 ± 2% -0.2 1.23 ± 2% perf-profile.children.cycles-pp.__perf_sw_event
1.77 -0.2 1.54 ± 2% -0.2 1.60 perf-profile.children.cycles-pp.__do_fault
1.62 -0.2 1.41 ± 2% -0.2 1.46 ± 2% perf-profile.children.cycles-pp.shmem_fault
1.25 -0.2 1.05 ± 2% -0.2 1.08 ± 2% perf-profile.children.cycles-pp.___perf_sw_event
2.04 -0.2 1.83 ± 3% -0.2 1.82 ± 2% perf-profile.children.cycles-pp.__lruvec_stat_mod_folio
1.32 -0.2 1.13 ± 2% -0.1 1.21 ± 6% perf-profile.children.cycles-pp.mem_cgroup_commit_charge
1.47 -0.2 1.29 ± 2% -0.1 1.34 ± 2% perf-profile.children.cycles-pp.shmem_get_folio_gfp
1.17 -0.2 1.00 ± 2% -0.1 1.03 perf-profile.children.cycles-pp.get_page_from_freelist
0.84 -0.2 0.69 ± 2% -0.1 0.71 ± 3% perf-profile.children.cycles-pp.__mod_lruvec_state
0.61 -0.2 0.46 ± 2% -0.1 0.48 perf-profile.children.cycles-pp._compound_head
0.65 -0.1 0.53 ± 2% -0.1 0.54 ± 3% perf-profile.children.cycles-pp.__mod_node_page_state
1.02 -0.1 0.90 ± 2% -0.1 0.93 ± 2% perf-profile.children.cycles-pp.lock_vma_under_rcu
0.94 -0.1 0.82 ± 3% -0.1 0.85 ± 2% perf-profile.children.cycles-pp.filemap_get_entry
1.13 ± 2% -0.1 1.03 ± 3% -0.1 1.02 ± 3% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
0.76 -0.1 0.66 -0.1 0.68 perf-profile.children.cycles-pp.folio_remove_rmap_ptes
1.20 -0.1 1.11 -0.1 1.12 perf-profile.children.cycles-pp.lru_add_fn
0.69 -0.1 0.60 ± 2% -0.1 0.61 ± 2% perf-profile.children.cycles-pp.rmqueue
0.47 -0.1 0.38 -0.1 0.40 perf-profile.children.cycles-pp.__mem_cgroup_uncharge_folios
0.59 -0.1 0.50 -0.1 0.52 perf-profile.children.cycles-pp.free_unref_folios
0.63 ± 3% -0.1 0.55 ± 3% -0.0 0.59 ± 7% perf-profile.children.cycles-pp.__count_memcg_events
0.67 -0.1 0.59 ± 2% -0.1 0.61 ± 2% perf-profile.children.cycles-pp.mas_walk
0.54 -0.1 0.47 ± 3% -0.1 0.49 ± 3% perf-profile.children.cycles-pp.xas_load
0.27 ± 3% -0.1 0.21 -0.1 0.22 ± 3% perf-profile.children.cycles-pp.uncharge_batch
0.32 -0.1 0.26 -0.0 0.28 ± 3% perf-profile.children.cycles-pp.cgroup_rstat_updated
0.22 ± 3% -0.1 0.17 ± 2% -0.0 0.18 ± 2% perf-profile.children.cycles-pp.page_counter_uncharge
0.38 -0.0 0.33 -0.0 0.34 ± 2% perf-profile.children.cycles-pp.try_charge_memcg
0.31 -0.0 0.26 -0.0 0.28 ± 2% perf-profile.children.cycles-pp.percpu_counter_add_batch
0.31 ± 2% -0.0 0.27 ± 3% -0.0 0.28 ± 5% perf-profile.children.cycles-pp.get_vma_policy
0.30 -0.0 0.26 ± 3% -0.0 0.27 perf-profile.children.cycles-pp.handle_pte_fault
0.28 -0.0 0.25 -0.0 0.26 perf-profile.children.cycles-pp.error_entry
0.22 -0.0 0.19 ± 2% -0.0 0.20 perf-profile.children.cycles-pp.free_unref_page_commit
0.28 ± 2% -0.0 0.25 ± 2% -0.0 0.26 ± 2% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.32 ± 2% -0.0 0.29 ± 2% -0.0 0.29 ± 3% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.26 ± 2% -0.0 0.23 ± 5% -0.0 0.23 ± 5% perf-profile.children.cycles-pp._raw_spin_trylock
0.22 ± 2% -0.0 0.20 ± 2% -0.0 0.20 ± 3% perf-profile.children.cycles-pp.folio_add_new_anon_rmap
0.22 ± 2% -0.0 0.19 ± 3% -0.0 0.19 perf-profile.children.cycles-pp.pte_offset_map_nolock
0.14 ± 2% -0.0 0.11 -0.0 0.12 ± 4% perf-profile.children.cycles-pp.__mod_zone_page_state
0.14 ± 3% -0.0 0.12 ± 4% -0.0 0.12 ± 3% perf-profile.children.cycles-pp.perf_exclude_event
0.18 -0.0 0.15 ± 2% -0.0 0.16 perf-profile.children.cycles-pp.__rmqueue_pcplist
0.26 ± 3% -0.0 0.23 ± 5% -0.0 0.22 ± 3% perf-profile.children.cycles-pp.__pte_offset_map
0.26 ± 3% -0.0 0.23 -0.0 0.23 ± 4% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.25 ± 3% -0.0 0.22 -0.0 0.22 ± 4% perf-profile.children.cycles-pp.hrtimer_interrupt
0.18 ± 2% -0.0 0.15 ± 2% -0.0 0.16 ± 2% perf-profile.children.cycles-pp.__cond_resched
0.16 ± 2% -0.0 0.14 ± 2% -0.0 0.14 perf-profile.children.cycles-pp.uncharge_folio
0.19 ± 2% -0.0 0.17 ± 4% -0.0 0.18 ± 4% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.17 ± 2% -0.0 0.15 ± 3% -0.0 0.15 ± 4% perf-profile.children.cycles-pp.folio_unlock
0.19 ± 2% -0.0 0.17 ± 3% -0.0 0.18 ± 2% perf-profile.children.cycles-pp.down_read_trylock
0.16 -0.0 0.14 ± 2% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.folio_put
0.14 ± 2% -0.0 0.12 ± 6% -0.0 0.12 ± 6% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
0.11 ± 3% -0.0 0.09 ± 4% -0.0 0.09 ± 5% perf-profile.children.cycles-pp.xas_start
0.13 ± 3% -0.0 0.11 ± 4% -0.0 0.11 ± 4% perf-profile.children.cycles-pp.page_counter_try_charge
0.18 ± 3% -0.0 0.16 ± 3% -0.0 0.16 ± 5% perf-profile.children.cycles-pp.tick_nohz_handler
0.12 ± 3% -0.0 0.10 ± 4% -0.0 0.10 perf-profile.children.cycles-pp.get_pfnblock_flags_mask
0.18 ± 2% -0.0 0.16 ± 2% -0.0 0.17 perf-profile.children.cycles-pp.up_read
0.16 ± 2% -0.0 0.14 ± 3% -0.0 0.14 ± 5% perf-profile.children.cycles-pp.update_process_times
0.14 -0.0 0.12 ± 3% -0.0 0.13 perf-profile.children.cycles-pp.policy_nodemask
0.08 -0.0 0.06 ± 6% -0.0 0.07 ± 5% perf-profile.children.cycles-pp.memcg_check_events
0.13 ± 3% -0.0 0.11 -0.0 0.12 ± 4% perf-profile.children.cycles-pp.access_error
0.12 ± 3% -0.0 0.11 ± 3% -0.0 0.11 perf-profile.children.cycles-pp.perf_swevent_event
0.09 ± 4% -0.0 0.08 -0.0 0.08 perf-profile.children.cycles-pp.__irqentry_text_end
0.06 -0.0 0.05 -0.0 0.05 ± 7% perf-profile.children.cycles-pp.pte_alloc_one
0.05 +0.0 0.06 +0.0 0.06 ± 8% perf-profile.children.cycles-pp.perf_mmap__push
0.19 ± 2% +0.2 0.35 ± 4% +0.1 0.30 ± 3% perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
2.72 +0.6 3.32 ± 2% +0.5 3.24 perf-profile.children.cycles-pp.tlb_finish_mmu
24.44 +2.1 26.58 ± 3% +1.0 25.45 ± 4% perf-profile.children.cycles-pp.set_pte_range
22.47 +2.3 24.81 ± 4% +1.2 23.71 ± 4% perf-profile.children.cycles-pp.folio_add_lru_vma
22.31 +2.4 24.70 ± 4% +1.3 23.58 ± 4% perf-profile.children.cycles-pp.folio_batch_move_lru
22.59 +4.7 27.29 ± 2% +4.3 26.92 perf-profile.children.cycles-pp.unmap_page_range
22.59 +4.7 27.29 ± 2% +4.3 26.92 perf-profile.children.cycles-pp.unmap_vmas
22.59 +4.7 27.29 ± 2% +4.3 26.92 perf-profile.children.cycles-pp.zap_pmd_range
22.59 +4.7 27.29 ± 2% +4.3 26.92 perf-profile.children.cycles-pp.zap_pte_range
20.59 +5.1 25.64 ± 2% +4.6 25.21 perf-profile.children.cycles-pp.tlb_flush_mmu
25.34 +5.3 30.64 ± 2% +4.9 30.19 perf-profile.children.cycles-pp.__vm_munmap
25.34 +5.3 30.64 ± 2% +4.9 30.19 perf-profile.children.cycles-pp.__x64_sys_munmap
25.34 +5.3 30.65 ± 2% +4.9 30.19 perf-profile.children.cycles-pp.__munmap
25.34 +5.3 30.65 ± 2% +4.9 30.20 perf-profile.children.cycles-pp.do_vmi_align_munmap
25.34 +5.3 30.65 ± 2% +4.9 30.20 perf-profile.children.cycles-pp.do_vmi_munmap
25.46 +5.3 30.77 ± 2% +4.9 30.32 perf-profile.children.cycles-pp.do_syscall_64
25.33 +5.3 30.64 ± 2% +4.9 30.19 perf-profile.children.cycles-pp.unmap_region
25.46 +5.3 30.77 ± 2% +4.9 30.32 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
23.30 +5.7 28.96 ± 2% +5.1 28.44 perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
23.29 +5.7 28.95 ± 2% +5.1 28.43 perf-profile.children.cycles-pp.free_pages_and_swap_cache
23.00 +5.7 28.73 ± 2% +5.2 28.20 perf-profile.children.cycles-pp.folios_put_refs
21.22 +5.9 27.13 ± 3% +5.4 26.57 perf-profile.children.cycles-pp.__page_cache_release
40.79 +8.4 49.20 +6.7 47.50 ± 2% perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
40.78 +8.4 49.19 +6.7 47.49 ± 2% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
40.64 +8.4 49.09 +6.7 47.38 ± 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
21.23 -3.5 17.73 ± 2% -2.8 18.43 ± 2% perf-profile.self.cycles-pp.copy_page
4.99 -0.8 4.22 -0.6 4.36 perf-profile.self.cycles-pp._raw_spin_lock
5.21 -0.7 4.53 -0.5 4.68 perf-profile.self.cycles-pp.testcase
2.63 -0.3 2.29 -0.3 2.37 ± 2% perf-profile.self.cycles-pp.sync_regs
2.42 -0.3 2.09 ± 2% -0.3 2.16 ± 2% perf-profile.self.cycles-pp.native_irq_return_iret
1.00 -0.2 0.83 ± 2% -0.1 0.87 ± 2% perf-profile.self.cycles-pp.___perf_sw_event
0.58 ± 2% -0.1 0.43 ± 3% -0.1 0.46 perf-profile.self.cycles-pp._compound_head
0.93 ± 2% -0.1 0.80 ± 3% -0.1 0.84 ± 6% perf-profile.self.cycles-pp.mem_cgroup_commit_charge
0.61 -0.1 0.50 ± 3% -0.1 0.51 ± 3% perf-profile.self.cycles-pp.__mod_node_page_state
0.51 -0.1 0.40 ± 2% -0.1 0.42 perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.80 -0.1 0.70 ± 2% -0.1 0.72 perf-profile.self.cycles-pp.__handle_mm_fault
0.61 ± 2% -0.1 0.51 -0.1 0.54 perf-profile.self.cycles-pp.lru_add_fn
0.47 -0.1 0.39 ± 2% -0.1 0.41 perf-profile.self.cycles-pp.get_page_from_freelist
0.93 ± 2% -0.1 0.86 ± 3% -0.1 0.85 ± 3% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
0.45 -0.1 0.38 -0.1 0.40 perf-profile.self.cycles-pp.zap_present_ptes
0.65 -0.1 0.58 ± 2% -0.1 0.60 ± 2% perf-profile.self.cycles-pp.mas_walk
0.89 ± 2% -0.1 0.83 ± 3% -0.1 0.83 ± 2% perf-profile.self.cycles-pp.__lruvec_stat_mod_folio
0.44 -0.1 0.39 -0.0 0.40 ± 2% perf-profile.self.cycles-pp.shmem_get_folio_gfp
0.52 ± 3% -0.1 0.46 ± 5% -0.0 0.49 ± 8% perf-profile.self.cycles-pp.__count_memcg_events
0.46 -0.1 0.41 ± 3% -0.0 0.41 ± 3% perf-profile.self.cycles-pp.handle_mm_fault
0.44 -0.1 0.38 ± 3% -0.0 0.40 ± 3% perf-profile.self.cycles-pp.xas_load
0.32 -0.0 0.27 -0.0 0.28 ± 2% perf-profile.self.cycles-pp.__page_cache_release
0.34 ± 3% -0.0 0.29 ± 3% -0.0 0.29 ± 2% perf-profile.self.cycles-pp.__alloc_pages_noprof
0.39 -0.0 0.35 ± 3% -0.0 0.36 ± 3% perf-profile.self.cycles-pp.filemap_get_entry
0.20 ± 4% -0.0 0.15 ± 2% -0.0 0.16 ± 3% perf-profile.self.cycles-pp.page_counter_uncharge
0.27 ± 3% -0.0 0.22 ± 2% -0.0 0.23 ± 2% perf-profile.self.cycles-pp.rmqueue
0.29 -0.0 0.25 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.percpu_counter_add_batch
0.27 -0.0 0.23 ± 2% -0.0 0.24 perf-profile.self.cycles-pp.free_unref_folios
0.24 -0.0 0.20 -0.0 0.21 ± 2% perf-profile.self.cycles-pp.folio_remove_rmap_ptes
0.26 -0.0 0.22 ± 4% -0.0 0.23 ± 3% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.30 -0.0 0.26 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.do_user_addr_fault
0.23 ± 3% -0.0 0.20 ± 3% -0.0 0.21 ± 4% perf-profile.self.cycles-pp.__pte_offset_map_lock
0.22 -0.0 0.19 ± 2% -0.0 0.19 ± 2% perf-profile.self.cycles-pp.set_pte_range
0.19 ± 2% -0.0 0.16 ± 4% -0.0 0.16 ± 3% perf-profile.self.cycles-pp.__mod_lruvec_state
0.13 ± 3% -0.0 0.10 ± 3% -0.0 0.11 perf-profile.self.cycles-pp.__mem_cgroup_charge
0.25 -0.0 0.22 ± 2% -0.0 0.22 ± 2% perf-profile.self.cycles-pp.error_entry
0.23 ± 2% -0.0 0.20 ± 2% -0.0 0.21 perf-profile.self.cycles-pp.do_fault
0.21 ± 2% -0.0 0.19 ± 2% -0.0 0.19 perf-profile.self.cycles-pp.folio_add_new_anon_rmap
0.19 ± 2% -0.0 0.16 ± 2% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.folio_add_lru_vma
0.18 -0.0 0.15 ± 2% -0.0 0.16 ± 2% perf-profile.self.cycles-pp.free_unref_page_commit
0.15 ± 2% -0.0 0.13 ± 3% -0.0 0.13 perf-profile.self.cycles-pp.uncharge_folio
0.12 ± 3% -0.0 0.10 -0.0 0.10 ± 4% perf-profile.self.cycles-pp.perf_exclude_event
0.19 ± 2% -0.0 0.17 ± 3% -0.0 0.18 ± 6% perf-profile.self.cycles-pp.get_vma_policy
0.24 -0.0 0.22 -0.0 0.22 ± 3% perf-profile.self.cycles-pp.try_charge_memcg
0.14 ± 2% -0.0 0.12 -0.0 0.12 ± 3% perf-profile.self.cycles-pp.__rmqueue_pcplist
0.11 ± 3% -0.0 0.09 ± 5% -0.0 0.09 ± 4% perf-profile.self.cycles-pp.__mod_zone_page_state
0.11 ± 3% -0.0 0.09 -0.0 0.10 ± 4% perf-profile.self.cycles-pp.page_counter_try_charge
0.15 ± 2% -0.0 0.13 ± 3% -0.0 0.13 ± 2% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.17 ± 4% -0.0 0.15 -0.0 0.15 ± 3% perf-profile.self.cycles-pp.lock_vma_under_rcu
0.15 ± 2% -0.0 0.13 -0.0 0.14 ± 3% perf-profile.self.cycles-pp.folio_put
0.18 -0.0 0.16 ± 2% -0.0 0.16 ± 2% perf-profile.self.cycles-pp.down_read_trylock
0.21 ± 3% -0.0 0.19 ± 4% -0.0 0.18 ± 4% perf-profile.self.cycles-pp.finish_fault
0.17 ± 2% -0.0 0.15 ± 3% -0.0 0.15 perf-profile.self.cycles-pp.__perf_sw_event
0.19 ± 2% -0.0 0.17 ± 2% -0.0 0.18 perf-profile.self.cycles-pp.asm_exc_page_fault
0.16 ± 2% -0.0 0.14 ± 2% -0.0 0.14 ± 3% perf-profile.self.cycles-pp.folio_unlock
0.22 ± 3% -0.0 0.20 ± 4% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.__pte_offset_map
0.16 ± 2% -0.0 0.15 ± 5% -0.0 0.15 ± 2% perf-profile.self.cycles-pp.shmem_fault
0.17 ± 2% -0.0 0.15 ± 3% -0.0 0.16 ± 2% perf-profile.self.cycles-pp.up_read
0.10 -0.0 0.08 ± 4% -0.0 0.09 ± 5% perf-profile.self.cycles-pp.get_pfnblock_flags_mask
0.11 -0.0 0.09 ± 5% -0.0 0.09 ± 4% perf-profile.self.cycles-pp.perf_swevent_event
0.10 ± 3% -0.0 0.09 ± 5% -0.0 0.09 ± 6% perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
0.11 -0.0 0.09 ± 5% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.zap_pte_range
0.10 ± 4% -0.0 0.09 ± 4% -0.0 0.09 ± 5% perf-profile.self.cycles-pp.pte_offset_map_nolock
0.10 ± 4% -0.0 0.08 ± 4% -0.0 0.09 ± 4% perf-profile.self.cycles-pp.__do_fault
0.12 ± 3% -0.0 0.10 ± 3% -0.0 0.10 perf-profile.self.cycles-pp.exc_page_fault
0.12 ± 3% -0.0 0.11 ± 4% -0.0 0.11 ± 4% perf-profile.self.cycles-pp.alloc_pages_mpol_noprof
0.12 ± 3% -0.0 0.10 ± 3% -0.0 0.11 ± 4% perf-profile.self.cycles-pp.access_error
0.09 ± 5% -0.0 0.08 -0.0 0.08 ± 5% perf-profile.self.cycles-pp.policy_nodemask
0.12 ± 4% -0.0 0.10 ± 3% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.vma_alloc_folio_noprof
0.09 -0.0 0.08 ± 5% -0.0 0.08 ± 6% perf-profile.self.cycles-pp.xas_start
0.10 -0.0 0.09 -0.0 0.09 perf-profile.self.cycles-pp.folio_prealloc
0.09 -0.0 0.08 -0.0 0.08 perf-profile.self.cycles-pp.__cond_resched
0.06 -0.0 0.05 -0.0 0.05 perf-profile.self.cycles-pp.vm_normal_page
0.38 ± 2% +0.1 0.44 +0.1 0.44 ± 3% perf-profile.self.cycles-pp.folio_batch_move_lru
0.18 ± 2% +0.2 0.34 ± 4% +0.1 0.29 ± 4% perf-profile.self.cycles-pp.mem_cgroup_update_lru_size
40.64 +8.4 49.08 +6.7 47.38 ± 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
[3]
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/page_fault2/will-it-scale
59142d87ab03b8ff a94032b35e5f97dc1023030d929 fd2296741e2686ed6ecd05187e4
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
1727628 ± 22% -24.1% 1310525 ± 7% -5.3% 1636459 ± 30% sched_debug.cpu.avg_idle.max
6058 ± 41% -47.9% 3156 ± 43% +1.0% 6121 ± 61% sched_debug.cpu.max_idle_balance_cost.stddev
35617 ± 5% -9.1% 32375 ± 21% -26.2% 26270 ± 25% numa-vmstat.node0.nr_slab_reclaimable
4024866 +3.4% 4163009 ± 7% +8.7% 4374953 ± 7% numa-vmstat.node1.nr_file_pages
19132 ± 10% +17.3% 22446 ± 30% +49.4% 28587 ± 23% numa-vmstat.node1.nr_slab_reclaimable
17488267 -5.6% 16505101 -6.5% 16346741 will-it-scale.224.processes
78072 -5.6% 73683 -6.5% 72975 will-it-scale.per_process_ops
17488267 -5.6% 16505101 -6.5% 16346741 will-it-scale.workload
142458 ± 5% -9.1% 129506 ± 21% -26.2% 105066 ± 25% numa-meminfo.node0.KReclaimable
142458 ± 5% -9.1% 129506 ± 21% -26.2% 105066 ± 25% numa-meminfo.node0.SReclaimable
16107004 +3.3% 16635393 ± 7% +8.6% 17491995 ± 7% numa-meminfo.node1.FilePages
76509 ± 10% +17.4% 89791 ± 30% +49.4% 114321 ± 23% numa-meminfo.node1.KReclaimable
76509 ± 10% +17.4% 89791 ± 30% +49.4% 114321 ± 23% numa-meminfo.node1.SReclaimable
5.296e+09 -5.6% 4.998e+09 -6.5% 4.949e+09 proc-vmstat.numa_hit
5.291e+09 -5.6% 4.995e+09 -6.5% 4.947e+09 proc-vmstat.numa_local
5.285e+09 -5.6% 4.989e+09 -6.5% 4.941e+09 proc-vmstat.pgalloc_normal
5.264e+09 -5.6% 4.969e+09 -6.5% 4.921e+09 proc-vmstat.pgfault
5.283e+09 -5.6% 4.989e+09 -6.5% 4.941e+09 proc-vmstat.pgfree
20.16 -2.9% 19.58 -3.3% 19.50 perf-stat.i.MPKI
2.501e+10 -2.4% 2.44e+10 -2.9% 2.428e+10 perf-stat.i.branch-instructions
18042153 -2.8% 17539874 -3.8% 17362741 perf-stat.i.branch-misses
2.382e+09 -5.6% 2.249e+09 -6.5% 2.228e+09 perf-stat.i.cache-misses
2.561e+09 -5.3% 2.424e+09 -6.5% 2.394e+09 perf-stat.i.cache-references
5.49 +2.8% 5.64 +3.3% 5.67 perf-stat.i.cpi
274.25 +5.4% 289.07 +6.4% 291.86 perf-stat.i.cycles-between-cache-misses
1.177e+11 -2.7% 1.145e+11 -3.2% 1.139e+11 perf-stat.i.instructions
0.19 -2.7% 0.18 -3.2% 0.18 perf-stat.i.ipc
155.11 -5.5% 146.59 -6.5% 145.09 perf-stat.i.metric.K/sec
17405977 -5.5% 16441964 -6.5% 16274188 perf-stat.i.minor-faults
17405978 -5.5% 16441964 -6.5% 16274188 perf-stat.i.page-faults
4.41 ± 50% +28.5% 5.66 +29.1% 5.69 perf-stat.overall.cpi
217.50 ± 50% +32.4% 287.87 +33.6% 290.48 perf-stat.overall.cycles-between-cache-misses
1623235 ± 50% +29.0% 2093187 +29.6% 2103156 perf-stat.overall.path-length
5.48 -0.4 5.11 -0.4 5.11 perf-profile.calltrace.cycles-pp.copy_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
57.55 -0.3 57.20 -0.1 57.48 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
56.14 -0.2 55.90 +0.0 56.16 perf-profile.calltrace.cycles-pp.testcase
1.86 -0.2 1.71 -0.1 1.73 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.77 -0.1 1.63 -0.1 1.64 perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault
1.17 -0.1 1.10 -0.1 1.08 perf-profile.calltrace.cycles-pp.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
51.87 -0.0 51.82 +0.2 52.11 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.96 -0.0 0.91 -0.1 0.91 perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
0.71 -0.0 0.67 -0.0 0.66 perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
0.60 -0.0 0.57 -0.0 0.56 perf-profile.calltrace.cycles-pp.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault
51.39 -0.0 51.37 +0.3 51.67 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
51.03 +0.0 51.03 +0.3 51.33 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
4.86 +0.0 4.91 +0.0 4.90 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
4.87 +0.0 4.91 +0.0 4.90 perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
4.86 +0.0 4.91 +0.0 4.90 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
4.85 +0.0 4.90 +0.0 4.88 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region
4.77 +0.1 4.82 +0.0 4.81 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu
37.74 +0.3 38.01 -0.0 37.74 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
37.74 +0.3 38.01 -0.0 37.74 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
37.74 +0.3 38.01 -0.0 37.74 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
37.73 +0.3 38.01 +0.0 37.74 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
37.27 +0.3 37.57 +0.0 37.30 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range
37.28 +0.3 37.58 +0.0 37.31 perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
37.28 +0.3 37.58 +0.0 37.31 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range
37.15 +0.3 37.46 +0.0 37.20 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
42.65 +0.3 42.97 +0.0 42.68 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
42.65 +0.3 42.97 +0.0 42.68 perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
42.65 +0.3 42.97 +0.0 42.68 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
42.65 +0.3 42.97 +0.0 42.68 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
42.65 +0.3 42.97 +0.0 42.68 perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
36.72 +0.3 37.04 +0.1 36.79 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu
42.65 +0.3 42.97 +0.0 42.69 perf-profile.calltrace.cycles-pp.__munmap
42.65 +0.3 42.97 +0.0 42.69 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
42.65 +0.3 42.97 +0.0 42.69 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
41.26 +0.4 41.63 +0.1 41.38 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache
41.26 +0.4 41.64 +0.1 41.38 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages
41.23 +0.4 41.61 +0.1 41.36 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs
43.64 +0.5 44.12 +0.8 44.42 perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
41.57 +0.6 42.22 +0.9 42.50 perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
40.93 +0.7 41.59 +1.0 41.90 perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_fault.__handle_mm_fault
40.84 +0.7 41.50 +1.0 41.81 perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_fault
40.19 +0.7 40.89 +1.0 41.19 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
40.19 +0.7 40.89 +1.0 41.19 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
40.16 +0.7 40.87 +1.0 41.16 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
5.49 -0.4 5.12 -0.4 5.12 perf-profile.children.cycles-pp.copy_page
57.05 -0.3 56.75 -0.0 57.02 perf-profile.children.cycles-pp.testcase
55.66 -0.2 55.41 +0.0 55.70 perf-profile.children.cycles-pp.asm_exc_page_fault
1.88 -0.2 1.73 -0.1 1.75 perf-profile.children.cycles-pp.__pte_offset_map_lock
1.79 -0.1 1.64 -0.1 1.66 perf-profile.children.cycles-pp._raw_spin_lock
1.19 -0.1 1.11 -0.1 1.10 perf-profile.children.cycles-pp.folio_prealloc
0.96 -0.1 0.91 -0.1 0.91 perf-profile.children.cycles-pp.sync_regs
51.89 -0.0 51.84 +0.2 52.13 perf-profile.children.cycles-pp.handle_mm_fault
0.73 -0.0 0.68 -0.0 0.68 perf-profile.children.cycles-pp.vma_alloc_folio_noprof
1.02 -0.0 0.98 -0.1 0.96 perf-profile.children.cycles-pp.native_irq_return_iret
0.63 -0.0 0.59 -0.0 0.59 perf-profile.children.cycles-pp.alloc_pages_mpol_noprof
0.55 -0.0 0.51 -0.0 0.51 perf-profile.children.cycles-pp.__alloc_pages_noprof
0.51 -0.0 0.48 -0.0 0.48 perf-profile.children.cycles-pp.__do_fault
0.46 -0.0 0.43 -0.0 0.44 perf-profile.children.cycles-pp.shmem_fault
0.41 -0.0 0.39 -0.0 0.38 perf-profile.children.cycles-pp.get_page_from_freelist
0.51 -0.0 0.48 -0.0 0.50 perf-profile.children.cycles-pp.shmem_get_folio_gfp
0.36 -0.0 0.34 -0.0 0.34 perf-profile.children.cycles-pp.___perf_sw_event
0.42 -0.0 0.39 -0.0 0.39 perf-profile.children.cycles-pp.__perf_sw_event
0.42 -0.0 0.40 -0.0 0.40 perf-profile.children.cycles-pp.zap_present_ptes
0.26 -0.0 0.24 -0.0 0.24 perf-profile.children.cycles-pp.__mod_lruvec_state
0.38 -0.0 0.36 -0.0 0.36 perf-profile.children.cycles-pp.lru_add_fn
0.25 ± 2% -0.0 0.23 -0.0 0.24 perf-profile.children.cycles-pp.filemap_get_entry
0.21 ± 2% -0.0 0.20 ± 2% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.__mod_node_page_state
0.21 -0.0 0.19 ± 2% -0.0 0.20 perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
51.40 -0.0 51.39 +0.3 51.68 perf-profile.children.cycles-pp.__handle_mm_fault
0.23 ± 2% -0.0 0.21 -0.0 0.21 ± 2% perf-profile.children.cycles-pp.rmqueue
0.39 -0.0 0.38 -0.0 0.36 ± 2% perf-profile.children.cycles-pp.__mem_cgroup_charge
0.16 ± 2% -0.0 0.15 ± 2% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.__mem_cgroup_uncharge_folios
0.11 -0.0 0.10 -0.0 0.10 ± 5% perf-profile.children.cycles-pp._compound_head
0.17 ± 2% -0.0 0.16 ± 2% -0.0 0.16 perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.16 -0.0 0.15 -0.0 0.15 ± 2% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.27 -0.0 0.26 -0.0 0.26 perf-profile.children.cycles-pp.lock_vma_under_rcu
0.11 -0.0 0.10 -0.0 0.10 perf-profile.children.cycles-pp.update_process_times
0.09 -0.0 0.08 -0.0 0.08 perf-profile.children.cycles-pp.scheduler_tick
0.06 -0.0 0.05 -0.0 0.05 perf-profile.children.cycles-pp.task_tick_fair
0.12 -0.0 0.11 -0.0 0.11 ± 3% perf-profile.children.cycles-pp.tick_nohz_handler
0.15 -0.0 0.14 -0.0 0.14 perf-profile.children.cycles-pp.hrtimer_interrupt
0.11 ± 4% -0.0 0.10 -0.0 0.09 ± 5% perf-profile.children.cycles-pp.uncharge_batch
51.07 -0.0 51.06 +0.3 51.36 perf-profile.children.cycles-pp.do_fault
0.08 -0.0 0.08 ± 6% -0.0 0.07 perf-profile.children.cycles-pp.page_counter_uncharge
0.06 -0.0 0.06 ± 6% +0.0 0.07 perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
0.15 ± 2% +0.0 0.16 ± 6% +0.0 0.17 ± 4% perf-profile.children.cycles-pp.generic_perform_write
0.07 +0.0 0.08 +0.0 0.08 ± 4% perf-profile.children.cycles-pp.folio_add_lru
0.09 ± 4% +0.0 0.10 ± 3% +0.0 0.10 ± 4% perf-profile.children.cycles-pp.shmem_write_begin
4.88 +0.0 4.93 +0.0 4.91 perf-profile.children.cycles-pp.tlb_finish_mmu
37.74 +0.3 38.01 -0.0 37.74 perf-profile.children.cycles-pp.unmap_page_range
37.74 +0.3 38.01 -0.0 37.74 perf-profile.children.cycles-pp.unmap_vmas
37.74 +0.3 38.01 -0.0 37.74 perf-profile.children.cycles-pp.zap_pmd_range
37.74 +0.3 38.01 -0.0 37.74 perf-profile.children.cycles-pp.zap_pte_range
37.28 +0.3 37.58 +0.0 37.31 perf-profile.children.cycles-pp.tlb_flush_mmu
42.65 +0.3 42.97 +0.0 42.68 perf-profile.children.cycles-pp.__x64_sys_munmap
42.65 +0.3 42.97 +0.0 42.68 perf-profile.children.cycles-pp.__vm_munmap
42.65 +0.3 42.97 +0.0 42.69 perf-profile.children.cycles-pp.__munmap
42.65 +0.3 42.98 +0.0 42.69 perf-profile.children.cycles-pp.do_vmi_align_munmap
42.65 +0.3 42.98 +0.0 42.69 perf-profile.children.cycles-pp.do_vmi_munmap
42.86 +0.3 43.18 +0.1 42.91 perf-profile.children.cycles-pp.do_syscall_64
42.65 +0.3 42.97 +0.0 42.69 perf-profile.children.cycles-pp.unmap_region
42.86 +0.3 43.19 +0.1 42.91 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
42.15 +0.3 42.50 +0.1 42.22 perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
42.12 +0.3 42.46 +0.1 42.19 perf-profile.children.cycles-pp.folios_put_refs
42.15 +0.3 42.50 +0.1 42.22 perf-profile.children.cycles-pp.free_pages_and_swap_cache
41.51 +0.4 41.89 +0.1 41.63 perf-profile.children.cycles-pp.__page_cache_release
43.66 +0.5 44.15 +0.8 44.45 perf-profile.children.cycles-pp.finish_fault
41.59 +0.6 42.24 +0.9 42.52 perf-profile.children.cycles-pp.set_pte_range
40.94 +0.7 41.59 +1.0 41.90 perf-profile.children.cycles-pp.folio_add_lru_vma
40.99 +0.7 41.66 +1.0 41.97 perf-profile.children.cycles-pp.folio_batch_move_lru
81.57 +1.1 82.65 +1.1 82.68 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
81.59 +1.1 82.68 +1.1 82.72 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
81.60 +1.1 82.68 +1.1 82.72 perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
5.47 -0.4 5.10 -0.4 5.11 perf-profile.self.cycles-pp.copy_page
1.77 -0.1 1.63 -0.1 1.64 perf-profile.self.cycles-pp._raw_spin_lock
2.19 -0.1 2.07 -0.1 2.06 perf-profile.self.cycles-pp.testcase
0.96 -0.0 0.91 -0.1 0.90 perf-profile.self.cycles-pp.sync_regs
1.02 -0.0 0.98 -0.1 0.96 perf-profile.self.cycles-pp.native_irq_return_iret
0.28 -0.0 0.26 -0.0 0.26 ± 2% perf-profile.self.cycles-pp.___perf_sw_event
0.19 ± 2% -0.0 0.17 -0.0 0.17 ± 2% perf-profile.self.cycles-pp.get_page_from_freelist
0.20 -0.0 0.19 ± 2% -0.0 0.19 ± 2% perf-profile.self.cycles-pp.__mod_node_page_state
0.12 ± 4% -0.0 0.10 -0.0 0.11 ± 4% perf-profile.self.cycles-pp.filemap_get_entry
0.11 ± 3% -0.0 0.10 -0.0 0.10 ± 3% perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.21 -0.0 0.20 -0.0 0.20 ± 2% perf-profile.self.cycles-pp.folios_put_refs
0.16 -0.0 0.15 -0.0 0.15 ± 3% perf-profile.self.cycles-pp.mas_walk
0.09 -0.0 0.08 -0.0 0.08 perf-profile.self.cycles-pp.folio_add_new_anon_rmap
0.06 -0.0 0.05 -0.0 0.05 ± 8% perf-profile.self.cycles-pp.down_read_trylock
0.18 -0.0 0.17 ± 2% -0.0 0.17 perf-profile.self.cycles-pp.lru_add_fn
0.09 ± 4% -0.0 0.09 ± 4% -0.0 0.08 perf-profile.self.cycles-pp._compound_head
81.57 +1.1 82.65 +1.1 82.68 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-csl-d02/page_fault2/will-it-scale
59142d87ab03b8ff a94032b35e5f97dc1023030d929 fd2296741e2686ed6ecd05187e4
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
13383 -14.7% 11416 -10.2% 12023 perf-c2c.DRAM.local
878.00 ± 4% +39.1% 1221 ± 6% +11.3% 977.00 ± 4% perf-c2c.HITM.local
0.54 ± 3% -0.1 0.43 ± 2% -0.1 0.47 ± 2% mpstat.cpu.all.irq%
0.04 ± 6% -0.0 0.03 +0.0 0.04 ± 11% mpstat.cpu.all.soft%
8.44 ± 2% -1.1 7.32 -0.9 7.53 mpstat.cpu.all.usr%
59743 ± 11% -22.9% 46054 ± 7% -15.0% 50754 ± 8% sched_debug.cfs_rq:/.avg_vruntime.stddev
59744 ± 11% -22.9% 46054 ± 7% -15.0% 50754 ± 8% sched_debug.cfs_rq:/.min_vruntime.stddev
3843 ± 4% -28.8% 2737 ± 8% -14.2% 3296 ± 10% sched_debug.cpu.nr_switches.min
6749425 -19.4% 5441878 -12.1% 5929733 will-it-scale.36.processes
187483 -19.4% 151162 -12.1% 164714 will-it-scale.per_process_ops
6749425 -19.4% 5441878 -12.1% 5929733 will-it-scale.workload
734606 -2.1% 718878 -1.8% 721386 proc-vmstat.nr_anon_pages
9660 -4.0% 9278 -2.9% 9383 proc-vmstat.nr_mapped
2999 +3.2% 3095 +2.3% 3069 proc-vmstat.nr_page_table_pages
2.043e+09 -19.3% 1.649e+09 -12.0% 1.799e+09 proc-vmstat.numa_hit
2.049e+09 -19.3% 1.653e+09 -12.0% 1.803e+09 proc-vmstat.numa_local
2.036e+09 -19.2% 1.644e+09 -12.0% 1.791e+09 proc-vmstat.pgalloc_normal
2.029e+09 -19.3% 1.639e+09 -12.0% 1.785e+09 proc-vmstat.pgfault
2.035e+09 -19.2% 1.644e+09 -12.0% 1.791e+09 proc-vmstat.pgfree
21123 ± 2% +3.4% 21833 +3.9% 21942 proc-vmstat.pgreuse
17.45 -8.6% 15.96 -6.0% 16.41 perf-stat.i.MPKI
6.199e+09 -10.2% 5.567e+09 -5.5% 5.856e+09 perf-stat.i.branch-instructions
0.26 -0.0 0.25 -0.0 0.25 perf-stat.i.branch-miss-rate%
16660671 -10.6% 14902193 -7.3% 15444974 perf-stat.i.branch-misses
87.85 -2.9 84.90 -2.8 85.02 perf-stat.i.cache-miss-rate%
5.476e+08 -19.5% 4.407e+08 -12.3% 4.805e+08 perf-stat.i.cache-misses
6.227e+08 -16.7% 5.186e+08 -9.3% 5.647e+08 perf-stat.i.cache-references
4.35 +14.1% 4.96 +7.6% 4.68 perf-stat.i.cpi
61.84 ± 2% -16.2% 51.79 -14.1% 53.13 perf-stat.i.cpu-migrations
251.09 +24.4% 312.35 +14.2% 286.75 perf-stat.i.cycles-between-cache-misses
3.137e+10 -11.8% 2.768e+10 -6.6% 2.931e+10 perf-stat.i.instructions
0.23 -11.7% 0.21 -6.5% 0.22 perf-stat.i.ipc
373.37 -19.3% 301.36 -12.0% 328.39 perf-stat.i.metric.K/sec
6720929 -19.3% 5424836 -12.0% 5911373 perf-stat.i.minor-faults
6720929 -19.3% 5424836 -12.0% 5911373 perf-stat.i.page-faults
17.45 -8.8% 15.92 -6.1% 16.39 perf-stat.overall.MPKI
0.27 -0.0 0.27 -0.0 0.26 perf-stat.overall.branch-miss-rate%
87.94 -3.0 84.96 -2.9 85.08 perf-stat.overall.cache-miss-rate%
4.35 +13.4% 4.93 +7.1% 4.65 perf-stat.overall.cpi
249.03 +24.3% 309.56 +14.0% 283.85 perf-stat.overall.cycles-between-cache-misses
0.23 -11.8% 0.20 -6.6% 0.21 perf-stat.overall.ipc
1400364 +9.4% 1532615 +6.5% 1491568 perf-stat.overall.path-length
6.178e+09 -10.2% 5.548e+09 -5.5% 5.835e+09 perf-stat.ps.branch-instructions
16578081 -10.7% 14811244 -7.4% 15346617 perf-stat.ps.branch-misses
5.458e+08 -19.5% 4.392e+08 -12.3% 4.788e+08 perf-stat.ps.cache-misses
6.206e+08 -16.7% 5.169e+08 -9.3% 5.628e+08 perf-stat.ps.cache-references
61.60 ± 2% -16.3% 51.58 -14.2% 52.85 perf-stat.ps.cpu-migrations
3.127e+10 -11.8% 2.758e+10 -6.6% 2.921e+10 perf-stat.ps.instructions
6698560 -19.3% 5406176 -12.1% 5890997 perf-stat.ps.minor-faults
6698560 -19.3% 5406177 -12.1% 5890998 perf-stat.ps.page-faults
9.451e+12 -11.8% 8.34e+12 -6.4% 8.845e+12 perf-stat.total.instructions
78.09 -11.0 67.12 -7.4 70.68 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
84.87 ± 2% -10.3 74.55 -6.9 77.97 perf-profile.calltrace.cycles-pp.testcase
68.48 ± 2% -9.3 59.13 -6.2 62.28 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
68.26 ± 2% -9.3 58.94 -6.2 62.08 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
65.58 ± 2% -8.7 56.90 -5.7 59.92 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
64.14 ± 2% -8.5 55.61 -5.6 58.59 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
63.24 ± 2% -8.4 54.84 -5.5 57.78 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
40.12 ± 4% -4.1 36.02 -2.9 37.23 perf-profile.calltrace.cycles-pp.copy_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
15.19 ± 3% -3.5 11.73 -1.9 13.28 perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
9.10 ± 8% -3.1 6.01 ± 2% -1.9 7.16 ± 3% perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_fault.__handle_mm_fault
8.89 ± 8% -3.1 5.83 ± 3% -1.9 6.96 ± 3% perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_fault
10.98 ± 6% -3.0 7.97 ± 2% -1.6 9.38 ± 2% perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
7.41 ± 10% -2.9 4.49 ± 4% -1.9 5.50 ± 4% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
7.42 ± 10% -2.9 4.51 ± 4% -1.9 5.52 ± 4% perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
7.35 ± 10% -2.9 4.44 ± 4% -1.9 5.45 ± 4% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
2.14 ± 15% -1.4 0.70 ± 6% -1.2 0.93 ± 3% perf-profile.calltrace.cycles-pp._compound_head.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range
3.15 ± 11% -1.3 1.84 -1.2 1.96 perf-profile.calltrace.cycles-pp.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
3.60 ± 3% -0.4 3.16 -0.3 3.28 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
3.88 -0.4 3.46 -0.4 3.50 perf-profile.calltrace.cycles-pp.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
1.29 -0.4 0.87 -0.4 0.92 perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
3.09 ± 3% -0.4 2.68 -0.3 2.81 perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault
0.96 -0.3 0.62 ± 2% -0.3 0.65 perf-profile.calltrace.cycles-pp.mas_walk.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
3.45 ± 3% -0.3 3.12 -0.2 3.24 perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
3.31 ± 3% -0.3 3.00 -0.2 3.11 perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
3.09 ± 3% -0.3 2.80 -0.2 2.90 perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
2.42 -0.3 2.16 -0.3 2.14 perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
2.72 ± 4% -0.2 2.50 -0.1 2.58 perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault
1.55 ± 2% -0.2 1.33 -0.2 1.38 perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
0.87 -0.2 0.72 -0.1 0.79 perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
1.39 ± 3% -0.1 1.25 ± 3% -0.1 1.30 ± 2% perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.folio_prealloc.do_fault.__handle_mm_fault.handle_mm_fault
0.81 -0.1 0.70 ± 2% -0.1 0.73 ± 2% perf-profile.calltrace.cycles-pp.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
1.74 -0.1 1.63 -0.1 1.62 perf-profile.calltrace.cycles-pp.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault.__handle_mm_fault
0.85 ± 2% -0.1 0.74 ± 3% -0.1 0.78 perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.71 -0.1 0.62 -0.1 0.64 ± 3% perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault
1.01 ± 4% -0.1 0.93 ± 2% -0.1 0.94 perf-profile.calltrace.cycles-pp.xas_load.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault
0.72 ± 2% -0.1 0.64 ± 3% -0.0 0.67 perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
1.56 -0.1 1.50 -0.1 1.48 perf-profile.calltrace.cycles-pp.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof.folio_prealloc.do_fault
0.35 ± 81% +0.1 0.44 ± 50% +0.3 0.68 ± 7% perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.__lruvec_stat_mod_folio.set_pte_range.finish_fault.do_fault
0.77 ± 2% +0.1 0.87 ± 2% +0.0 0.80 ± 2% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages_noprof.alloc_pages_mpol_noprof.vma_alloc_folio_noprof
1.47 ± 2% +0.2 1.63 ± 6% +0.4 1.90 ± 2% perf-profile.calltrace.cycles-pp.__lruvec_stat_mod_folio.set_pte_range.finish_fault.do_fault.__handle_mm_fault
0.62 ± 5% +0.2 0.84 ± 2% +0.1 0.69 ± 2% perf-profile.calltrace.cycles-pp.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range
0.00 +0.7 0.68 ± 3% +0.4 0.35 ± 70% perf-profile.calltrace.cycles-pp.__lruvec_stat_mod_folio.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range
1.66 ± 12% +1.2 2.86 +0.8 2.50 perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
1.66 ± 12% +1.2 2.86 +0.8 2.49 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
1.66 ± 12% +1.2 2.86 +0.8 2.49 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
1.51 ± 15% +1.3 2.80 +0.9 2.41 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu.unmap_region
1.31 ± 18% +1.3 2.64 ± 2% +0.9 2.25 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_finish_mmu
16.10 ± 9% +9.5 25.63 ± 2% +6.4 22.50 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
16.10 ± 9% +9.5 25.63 ± 2% +6.4 22.50 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
16.10 ± 9% +9.5 25.63 ± 2% +6.4 22.50 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
16.09 ± 9% +9.5 25.62 ± 2% +6.4 22.49 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
17.82 ± 10% +10.7 28.54 ± 2% +7.2 25.03 perf-profile.calltrace.cycles-pp.__munmap
17.81 ± 10% +10.7 28.53 ± 2% +7.2 25.02 perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
17.81 ± 10% +10.7 28.53 ± 2% +7.2 25.02 perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
17.81 ± 10% +10.7 28.53 +7.2 25.02 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
17.82 ± 10% +10.7 28.54 ± 2% +7.2 25.03 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
17.82 ± 10% +10.7 28.54 ± 2% +7.2 25.03 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
17.81 ± 10% +10.7 28.53 ± 2% +7.2 25.02 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
17.79 ± 10% +10.7 28.53 ± 2% +7.2 25.02 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
12.80 ± 15% +10.9 23.68 ± 2% +7.6 20.42 perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
12.78 ± 15% +10.9 23.68 ± 2% +7.6 20.41 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range
12.77 ± 15% +10.9 23.67 ± 2% +7.6 20.40 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range
11.49 ± 18% +11.7 23.22 ± 2% +8.3 19.79 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
10.49 ± 20% +11.9 22.36 ± 2% +8.4 18.90 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu
11.02 ± 22% +13.4 24.43 ± 2% +9.4 20.44 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache
11.03 ± 22% +13.4 24.46 ± 2% +9.4 20.46 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages
10.97 ± 22% +13.4 24.41 ± 2% +9.4 20.40 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs
81.97 ± 2% -10.7 71.28 -7.2 74.78 perf-profile.children.cycles-pp.testcase
74.32 ± 2% -10.3 64.01 -6.9 67.40 perf-profile.children.cycles-pp.asm_exc_page_fault
68.51 ± 2% -9.4 59.15 -6.2 62.30 perf-profile.children.cycles-pp.exc_page_fault
68.29 ± 2% -9.3 58.97 -6.2 62.11 perf-profile.children.cycles-pp.do_user_addr_fault
65.61 ± 2% -8.7 56.92 -5.7 59.95 perf-profile.children.cycles-pp.handle_mm_fault
64.16 ± 2% -8.5 55.63 -5.6 58.60 perf-profile.children.cycles-pp.__handle_mm_fault
63.27 ± 2% -8.4 54.87 -5.5 57.82 perf-profile.children.cycles-pp.do_fault
40.21 ± 4% -4.1 36.11 -2.9 37.33 perf-profile.children.cycles-pp.copy_page
15.21 ± 3% -3.5 11.75 -1.9 13.30 perf-profile.children.cycles-pp.finish_fault
9.10 ± 8% -3.1 6.02 ± 2% -1.9 7.16 ± 3% perf-profile.children.cycles-pp.folio_add_lru_vma
8.91 ± 8% -3.0 5.87 ± 3% -1.9 6.99 ± 3% perf-profile.children.cycles-pp.folio_batch_move_lru
10.99 ± 6% -3.0 7.98 ± 2% -1.6 9.40 ± 2% perf-profile.children.cycles-pp.set_pte_range
2.16 ± 15% -1.4 0.71 ± 6% -1.2 0.94 ± 4% perf-profile.children.cycles-pp._compound_head
3.17 ± 11% -1.3 1.85 -1.2 1.98 perf-profile.children.cycles-pp.zap_present_ptes
3.63 ± 3% -0.5 3.17 -0.3 3.30 perf-profile.children.cycles-pp.__pte_offset_map_lock
3.14 ± 3% -0.4 2.71 -0.3 2.85 perf-profile.children.cycles-pp._raw_spin_lock
1.30 -0.4 0.88 -0.4 0.93 perf-profile.children.cycles-pp.lock_vma_under_rcu
3.90 -0.4 3.49 -0.4 3.53 perf-profile.children.cycles-pp.folio_prealloc
0.97 -0.3 0.62 ± 2% -0.3 0.66 perf-profile.children.cycles-pp.mas_walk
3.46 ± 3% -0.3 3.13 -0.2 3.25 perf-profile.children.cycles-pp.__do_fault
3.31 ± 3% -0.3 3.00 -0.2 3.12 perf-profile.children.cycles-pp.shmem_fault
6.74 ± 4% -0.3 6.44 -0.2 6.53 perf-profile.children.cycles-pp.native_irq_return_iret
3.10 ± 3% -0.3 2.82 -0.2 2.92 perf-profile.children.cycles-pp.shmem_get_folio_gfp
2.43 -0.3 2.17 -0.3 2.15 perf-profile.children.cycles-pp.vma_alloc_folio_noprof
1.60 ± 2% -0.2 1.37 -0.2 1.42 perf-profile.children.cycles-pp.sync_regs
2.73 ± 4% -0.2 2.51 -0.1 2.58 perf-profile.children.cycles-pp.filemap_get_entry
1.66 -0.2 1.44 ± 2% -0.1 1.51 perf-profile.children.cycles-pp.__perf_sw_event
0.64 ± 4% -0.2 0.44 ± 2% -0.1 0.53 perf-profile.children.cycles-pp.free_unref_folios
1.45 -0.2 1.28 ± 2% -0.1 1.33 perf-profile.children.cycles-pp.___perf_sw_event
0.88 -0.2 0.73 -0.1 0.80 perf-profile.children.cycles-pp.lru_add_fn
1.40 ± 3% -0.1 1.26 ± 3% -0.1 1.31 ± 2% perf-profile.children.cycles-pp.__mem_cgroup_charge
1.23 ± 9% -0.1 1.09 ± 8% +0.2 1.41 ± 4% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
1.83 -0.1 1.71 -0.1 1.69 perf-profile.children.cycles-pp.alloc_pages_mpol_noprof
0.58 ± 7% -0.1 0.47 ± 5% -0.1 0.51 ± 3% perf-profile.children.cycles-pp.__count_memcg_events
0.69 ± 3% -0.1 0.59 -0.1 0.63 ± 4% perf-profile.children.cycles-pp.__mod_lruvec_state
0.33 ± 5% -0.1 0.22 ± 2% -0.1 0.27 perf-profile.children.cycles-pp.__mem_cgroup_uncharge_folios
0.51 ± 5% -0.1 0.42 ± 7% -0.1 0.46 ± 3% perf-profile.children.cycles-pp.mem_cgroup_commit_charge
0.53 -0.1 0.44 ± 3% -0.1 0.43 ± 2% perf-profile.children.cycles-pp.get_vma_policy
1.02 ± 4% -0.1 0.93 ± 3% -0.1 0.94 perf-profile.children.cycles-pp.xas_load
0.58 ± 3% -0.1 0.50 ± 2% -0.0 0.54 ± 5% perf-profile.children.cycles-pp.__mod_node_page_state
0.57 ± 6% -0.1 0.50 ± 3% -0.0 0.53 ± 2% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.48 ± 7% -0.1 0.40 ± 4% -0.0 0.44 ± 2% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.23 ± 5% -0.1 0.16 ± 4% -0.0 0.19 perf-profile.children.cycles-pp.free_unref_page_commit
0.43 ± 7% -0.1 0.36 ± 3% -0.0 0.39 ± 2% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.43 ± 6% -0.1 0.36 ± 3% -0.0 0.39 ± 2% perf-profile.children.cycles-pp.hrtimer_interrupt
0.16 ± 4% -0.1 0.10 -0.0 0.12 perf-profile.children.cycles-pp.get_pfnblock_flags_mask
0.15 ± 9% -0.1 0.09 ± 4% -0.0 0.11 ± 6% perf-profile.children.cycles-pp.uncharge_batch
0.30 ± 6% -0.1 0.24 ± 4% -0.0 0.27 ± 3% perf-profile.children.cycles-pp.tick_nohz_handler
0.37 ± 7% -0.1 0.31 ± 3% -0.0 0.34 ± 2% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.27 ± 8% -0.1 0.22 ± 4% -0.0 0.24 ± 3% perf-profile.children.cycles-pp.update_process_times
1.64 -0.1 1.59 -0.1 1.56 perf-profile.children.cycles-pp.__alloc_pages_noprof
0.30 ± 9% -0.0 0.25 ± 28% -0.0 0.25 ± 5% perf-profile.children.cycles-pp.cgroup_rstat_updated
0.16 ± 6% -0.0 0.12 ± 3% -0.0 0.15 ± 4% perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
0.25 -0.0 0.20 -0.0 0.21 ± 2% perf-profile.children.cycles-pp.handle_pte_fault
0.11 ± 11% -0.0 0.07 ± 5% -0.0 0.08 perf-profile.children.cycles-pp.page_counter_uncharge
0.20 ± 5% -0.0 0.16 ± 2% -0.0 0.16 ± 4% perf-profile.children.cycles-pp.__pte_offset_map
0.22 ± 3% -0.0 0.19 ± 4% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.error_entry
0.10 ± 3% -0.0 0.07 ± 7% -0.0 0.07 ± 8% perf-profile.children.cycles-pp.policy_nodemask
0.16 ± 3% -0.0 0.12 ± 3% -0.0 0.13 ± 4% perf-profile.children.cycles-pp.pte_offset_map_nolock
0.15 ± 5% -0.0 0.12 ± 3% -0.0 0.14 ± 4% perf-profile.children.cycles-pp.uncharge_folio
0.22 ± 3% -0.0 0.19 ± 3% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.folio_add_new_anon_rmap
0.17 ± 9% -0.0 0.14 ± 5% -0.0 0.16 ± 3% perf-profile.children.cycles-pp.scheduler_tick
0.18 ± 5% -0.0 0.16 ± 5% -0.0 0.17 ± 2% perf-profile.children.cycles-pp.up_read
0.19 ± 2% -0.0 0.16 ± 7% -0.0 0.15 ± 7% perf-profile.children.cycles-pp.shmem_get_policy
0.14 ± 4% -0.0 0.12 ± 4% -0.0 0.12 ± 6% perf-profile.children.cycles-pp.down_read_trylock
0.13 ± 6% -0.0 0.10 ± 3% -0.0 0.11 perf-profile.children.cycles-pp.folio_put
0.08 ± 11% -0.0 0.06 ± 14% -0.0 0.06 ± 9% perf-profile.children.cycles-pp.free_pcppages_bulk
0.12 ± 6% -0.0 0.09 ± 5% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.task_tick_fair
0.29 ± 3% -0.0 0.27 ± 4% -0.0 0.28 ± 3% perf-profile.children.cycles-pp._raw_spin_trylock
0.12 ± 6% -0.0 0.10 ± 4% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.folio_unlock
0.79 ± 2% +0.1 0.90 ± 2% +0.0 0.83 ± 2% perf-profile.children.cycles-pp.rmqueue
0.24 ± 3% +0.2 0.41 ± 6% +0.1 0.33 ± 4% perf-profile.children.cycles-pp.__rmqueue_pcplist
0.10 ± 5% +0.2 0.29 ± 9% +0.1 0.20 ± 9% perf-profile.children.cycles-pp.rmqueue_bulk
0.62 ± 4% +0.2 0.84 ± 2% +0.1 0.70 ± 3% perf-profile.children.cycles-pp.folio_remove_rmap_ptes
1.89 ± 2% +0.4 2.34 ± 5% +0.5 2.43 ± 2% perf-profile.children.cycles-pp.__lruvec_stat_mod_folio
1.67 ± 12% +1.2 2.87 +0.8 2.50 perf-profile.children.cycles-pp.tlb_finish_mmu
16.10 ± 9% +9.5 25.63 ± 2% +6.4 22.50 perf-profile.children.cycles-pp.unmap_vmas
16.10 ± 9% +9.5 25.63 ± 2% +6.4 22.50 perf-profile.children.cycles-pp.unmap_page_range
16.10 ± 9% +9.5 25.63 ± 2% +6.4 22.50 perf-profile.children.cycles-pp.zap_pmd_range
16.10 ± 9% +9.5 25.63 ± 2% +6.4 22.50 perf-profile.children.cycles-pp.zap_pte_range
18.48 ± 17% +10.5 29.01 +7.5 26.01 perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
17.97 ± 9% +10.7 28.66 +7.2 25.16 perf-profile.children.cycles-pp.do_syscall_64
17.97 ± 9% +10.7 28.66 +7.2 25.16 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
18.49 ± 17% +10.7 29.20 +7.6 26.12 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
17.82 ± 10% +10.7 28.54 ± 2% +7.2 25.03 perf-profile.children.cycles-pp.__munmap
17.81 ± 10% +10.7 28.53 ± 2% +7.2 25.02 perf-profile.children.cycles-pp.__vm_munmap
17.81 ± 10% +10.7 28.53 ± 2% +7.2 25.02 perf-profile.children.cycles-pp.__x64_sys_munmap
17.82 ± 10% +10.7 28.54 ± 2% +7.2 25.03 perf-profile.children.cycles-pp.do_vmi_munmap
17.81 ± 10% +10.7 28.54 ± 2% +7.2 25.03 perf-profile.children.cycles-pp.do_vmi_align_munmap
17.80 ± 10% +10.7 28.53 ± 2% +7.2 25.02 perf-profile.children.cycles-pp.unmap_region
18.38 ± 17% +10.8 29.13 +7.7 26.03 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
12.80 ± 15% +10.9 23.68 ± 2% +7.6 20.42 perf-profile.children.cycles-pp.tlb_flush_mmu
14.44 ± 15% +12.1 26.54 ± 2% +8.5 22.91 perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
14.43 ± 15% +12.1 26.54 ± 2% +8.5 22.90 perf-profile.children.cycles-pp.free_pages_and_swap_cache
13.19 ± 17% +13.0 26.17 ± 2% +9.2 22.36 perf-profile.children.cycles-pp.folios_put_refs
11.81 ± 20% +13.2 25.01 ± 2% +9.4 21.17 perf-profile.children.cycles-pp.__page_cache_release
39.99 ± 4% -4.1 35.92 -2.9 37.12 perf-profile.self.cycles-pp.copy_page
2.14 ± 15% -1.4 0.70 ± 5% -1.2 0.93 ± 4% perf-profile.self.cycles-pp._compound_head
1.39 ± 13% -0.9 0.48 ± 3% -0.7 0.67 ± 4% perf-profile.self.cycles-pp.free_pages_and_swap_cache
4.45 -0.7 3.74 -0.5 3.92 perf-profile.self.cycles-pp.testcase
3.12 ± 3% -0.4 2.69 -0.3 2.83 perf-profile.self.cycles-pp._raw_spin_lock
0.96 -0.3 0.61 ± 2% -0.3 0.64 perf-profile.self.cycles-pp.mas_walk
6.74 ± 4% -0.3 6.44 -0.2 6.53 perf-profile.self.cycles-pp.native_irq_return_iret
1.59 ± 2% -0.2 1.36 -0.2 1.42 perf-profile.self.cycles-pp.sync_regs
1.22 ± 2% -0.2 1.04 ± 2% -0.1 1.10 perf-profile.self.cycles-pp.___perf_sw_event
1.71 ± 4% -0.1 1.57 -0.1 1.64 perf-profile.self.cycles-pp.filemap_get_entry
1.06 ± 10% -0.1 0.94 ± 10% +0.2 1.25 ± 4% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
0.48 ± 9% -0.1 0.38 ± 6% -0.0 0.44 ± 5% perf-profile.self.cycles-pp.__count_memcg_events
0.63 -0.1 0.54 -0.1 0.56 perf-profile.self.cycles-pp.__handle_mm_fault
0.44 -0.1 0.35 -0.1 0.38 ± 2% perf-profile.self.cycles-pp.lru_add_fn
0.29 -0.1 0.21 ± 3% -0.0 0.28 ± 3% perf-profile.self.cycles-pp.__page_cache_release
0.36 ± 3% -0.1 0.28 ± 2% -0.1 0.31 ± 4% perf-profile.self.cycles-pp.get_page_from_freelist
0.57 ± 3% -0.1 0.49 ± 2% -0.0 0.53 ± 5% perf-profile.self.cycles-pp.__mod_node_page_state
0.25 ± 3% -0.1 0.18 ± 2% -0.0 0.22 ± 2% perf-profile.self.cycles-pp.free_unref_folios
0.23 ± 2% -0.1 0.16 ± 2% -0.0 0.18 ± 4% perf-profile.self.cycles-pp.folio_remove_rmap_ptes
0.85 ± 4% -0.1 0.78 ± 2% -0.1 0.80 perf-profile.self.cycles-pp.xas_load
0.28 ± 3% -0.1 0.21 ± 3% -0.0 0.23 ± 3% perf-profile.self.cycles-pp.do_user_addr_fault
0.19 ± 2% -0.1 0.13 ± 8% -0.1 0.13 ± 3% perf-profile.self.cycles-pp.set_pte_range
0.30 ± 2% -0.1 0.24 ± 3% -0.0 0.26 ± 2% perf-profile.self.cycles-pp.zap_present_ptes
0.29 ± 2% -0.1 0.23 ± 6% -0.1 0.24 ± 7% perf-profile.self.cycles-pp.get_vma_policy
0.15 ± 2% -0.1 0.09 -0.1 0.09 ± 4% perf-profile.self.cycles-pp.vma_alloc_folio_noprof
0.16 ± 5% -0.1 0.10 ± 4% -0.0 0.12 perf-profile.self.cycles-pp.get_pfnblock_flags_mask
0.32 ± 5% -0.1 0.26 ± 4% -0.0 0.28 ± 3% perf-profile.self.cycles-pp.__alloc_pages_noprof
0.28 ± 4% -0.1 0.23 ± 6% -0.0 0.23 ± 4% perf-profile.self.cycles-pp.rmqueue
0.26 -0.0 0.21 ± 3% -0.0 0.23 ± 3% perf-profile.self.cycles-pp.asm_exc_page_fault
0.07 ± 5% -0.0 0.02 ±122% -0.0 0.04 ± 45% perf-profile.self.cycles-pp.policy_nodemask
0.19 ± 6% -0.0 0.14 -0.0 0.15 ± 6% perf-profile.self.cycles-pp.lock_vma_under_rcu
0.16 ± 5% -0.0 0.11 ± 3% -0.0 0.14 ± 4% perf-profile.self.cycles-pp.mem_cgroup_update_lru_size
0.32 ± 3% -0.0 0.28 -0.0 0.29 ± 2% perf-profile.self.cycles-pp.shmem_get_folio_gfp
0.15 ± 2% -0.0 0.10 ± 3% -0.0 0.13 ± 3% perf-profile.self.cycles-pp.free_unref_page_commit
0.20 ± 6% -0.0 0.16 ± 5% -0.0 0.18 ± 3% perf-profile.self.cycles-pp.__perf_sw_event
0.19 ± 2% -0.0 0.14 ± 3% -0.0 0.16 ± 3% perf-profile.self.cycles-pp.do_fault
0.12 -0.0 0.08 ± 5% -0.0 0.10 ± 4% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.27 ± 9% -0.0 0.23 ± 32% -0.0 0.22 ± 6% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.19 ± 5% -0.0 0.15 ± 3% -0.0 0.15 ± 4% perf-profile.self.cycles-pp.__pte_offset_map
0.10 ± 11% -0.0 0.06 ± 10% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.page_counter_uncharge
0.15 ± 6% -0.0 0.12 ± 3% -0.0 0.14 ± 4% perf-profile.self.cycles-pp.uncharge_folio
0.21 ± 4% -0.0 0.17 ± 2% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.error_entry
0.09 ± 4% -0.0 0.06 ± 6% -0.0 0.07 ± 5% perf-profile.self.cycles-pp.alloc_pages_mpol_noprof
0.18 -0.0 0.15 ± 4% -0.0 0.16 ± 2% perf-profile.self.cycles-pp.exc_page_fault
0.22 ± 3% -0.0 0.19 ± 3% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.folio_add_new_anon_rmap
0.22 ± 4% -0.0 0.19 ± 4% -0.0 0.20 ± 3% perf-profile.self.cycles-pp.shmem_fault
0.18 ± 6% -0.0 0.15 ± 3% -0.0 0.16 ± 3% perf-profile.self.cycles-pp.up_read
0.11 ± 4% -0.0 0.09 -0.0 0.10 ± 4% perf-profile.self.cycles-pp.zap_pte_range
0.13 ± 6% -0.0 0.10 ± 3% -0.0 0.11 perf-profile.self.cycles-pp.folio_put
0.29 -0.0 0.26 ± 4% -0.0 0.27 ± 3% perf-profile.self.cycles-pp._raw_spin_trylock
0.14 ± 2% -0.0 0.12 ± 4% -0.0 0.12 ± 6% perf-profile.self.cycles-pp.down_read_trylock
0.11 ± 4% -0.0 0.09 ± 4% -0.0 0.10 ± 5% perf-profile.self.cycles-pp.__mod_lruvec_state
0.09 ± 4% -0.0 0.07 ± 5% -0.0 0.07 perf-profile.self.cycles-pp.pte_offset_map_nolock
0.12 ± 6% -0.0 0.10 ± 4% -0.0 0.10 ± 4% perf-profile.self.cycles-pp.folio_unlock
0.18 ± 4% -0.0 0.16 ± 7% -0.0 0.14 ± 8% perf-profile.self.cycles-pp.shmem_get_policy
0.07 ± 7% -0.0 0.05 ± 7% -0.0 0.06 ± 6% perf-profile.self.cycles-pp.__do_fault
0.08 ± 5% -0.0 0.07 -0.0 0.08 ± 6% perf-profile.self.cycles-pp.handle_pte_fault
0.08 -0.0 0.07 ± 5% -0.0 0.08 ± 6% perf-profile.self.cycles-pp.__mem_cgroup_charge
0.40 -0.0 0.39 ± 4% -0.0 0.38 ± 2% perf-profile.self.cycles-pp.__pte_offset_map_lock
0.38 ± 3% +0.1 0.44 ± 3% +0.1 0.47 ± 2% perf-profile.self.cycles-pp.folio_batch_move_lru
0.39 ± 3% +0.1 0.46 -0.0 0.35 ± 2% perf-profile.self.cycles-pp.folios_put_refs
0.61 ± 13% +0.5 1.15 ± 3% +0.4 0.97 ± 3% perf-profile.self.cycles-pp.__lruvec_stat_mod_folio
18.38 ± 17% +10.8 29.13 +7.7 26.03 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-23 7:48 ` Oliver Sang
@ 2024-05-23 16:47 ` Shakeel Butt
2024-05-24 7:45 ` Oliver Sang
0 siblings, 1 reply; 15+ messages in thread
From: Shakeel Butt @ 2024-05-23 16:47 UTC (permalink / raw)
To: Oliver Sang
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin
On Thu, May 23, 2024 at 03:48:40PM +0800, Oliver Sang wrote:
> hi, Shakeel,
>
> On Tue, May 21, 2024 at 09:18:19PM -0700, Shakeel Butt wrote:
> > On Tue, May 21, 2024 at 10:43:16AM +0800, Oliver Sang wrote:
> > > hi, Shakeel,
> > >
> > [...]
> > >
> > > we reported regression on a 2-node Skylake server. so I found a 1-node Skylake
> > > desktop (we don't have 1 node server) to check.
> > >
> >
> > Please try the following patch on both single node and dual node
> > machines:
>
>
> the regression is partially recovered by applying your patch.
> (but one even more regression case as below)
>
> details:
>
> since you mentioned the whole patch-set behavior last time, I applied the
> patch upon
> a94032b35e5f9 memcg: use proper type for mod_memcg_state
>
> below fd2296741e2686ed6ecd05187e4 = a94032b35e5f9 + patch
>
Thanks a lot, Oliver. I have a couple of questions and requests:
1. What is the baseline kernel you are using? Is it linux-next or linus?
If linux-next, which one specifically?
2. What is the cgroup hierarchy where the workload is running? Is it
running in the root cgroup?
3. For the follow-up experiments, when needed, can you please remove the
whole series (including 59142d87ab03b8ff) for the base numbers?
4. My experiment [1] on Cooper Lake (2 node) and Skylake (1 node) shows
significant improvement, but I noticed that I am directly running
page_fault2_processes with -t equal to nr_cpus while you are running through
runtest.py (a rough sketch of both invocations is below). Also, it seems like
lkp has modified runtest.py. I will try to run the same setup as yours to
reproduce.
[1] https://lore.kernel.org/all/20240523034824.1255719-1-shakeel.butt@linux.dev
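For reference, a sketch of the two ways of running the test (hedged: only -t
is mentioned in this thread for the standalone binary, and the runtest.py line
mirrors the runs shown later below; any other flags are left at defaults):
# via the will-it-scale harness, as lkp does (295s, process mode, one task per CPU):
$ python3 ./runtest.py page_fault2 295 process 0 0 $(nproc)
# running the per-test binary directly, one task per CPU:
$ ./page_fault2_processes -t $(nproc)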
thanks,
Shakeel
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-23 16:47 ` Shakeel Butt
@ 2024-05-24 7:45 ` Oliver Sang
2024-05-24 18:06 ` Shakeel Butt
0 siblings, 1 reply; 15+ messages in thread
From: Oliver Sang @ 2024-05-24 7:45 UTC (permalink / raw)
To: Shakeel Butt
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin, oliver.sang
hi, Shakeel,
On Thu, May 23, 2024 at 09:47:30AM -0700, Shakeel Butt wrote:
> On Thu, May 23, 2024 at 03:48:40PM +0800, Oliver Sang wrote:
> > hi, Shakeel,
> >
> > On Tue, May 21, 2024 at 09:18:19PM -0700, Shakeel Butt wrote:
> > > On Tue, May 21, 2024 at 10:43:16AM +0800, Oliver Sang wrote:
> > > > hi, Shakeel,
> > > >
> > > [...]
> > > >
> > > > we reported regression on a 2-node Skylake server. so I found a 1-node Skylake
> > > > desktop (we don't have 1 node server) to check.
> > > >
> > >
> > > Please try the following patch on both single node and dual node
> > > machines:
> >
> >
> > the regression is partially recovered by applying your patch.
> > (but one even more regression case as below)
> >
> > details:
> >
> > since you mentioned the whole patch-set behavior last time, I applied the
> > patch upon
> > a94032b35e5f9 memcg: use proper type for mod_memcg_state
> >
> > below fd2296741e2686ed6ecd05187e4 = a94032b35e5f9 + patch
> >
>
> Thanks a lot Oliver. I have couple of questions and requests:
you are welcome!
>
> 1. What is the baseline kernel you are using? Is it linux-next or linus?
> If linux-next, which one specifically?
base is just 59142d87ab03b, which is in current linux-next/master,
and is already merged into linus/master now.
linux$ git rev-list linux-next/master | grep 59142d87ab03b
59142d87ab03b8ff969074348f65730d465f42ee
linux$ git rev-list linus/master | grep 59142d87ab03b
59142d87ab03b8ff969074348f65730d465f42ee
the data for it is the first column in the tables we supplied.
I just applied your patch upon a94032b35e5f9, so:
linux$ git log --oneline --graph fd2296741e2686ed6ecd05187e4
* fd2296741e268 fix for 70a64b7919 from Shakeel <----- your fix patch
* a94032b35e5f9 memcg: use proper type for mod_memcg_state <--- patch-set tip, I believe
* acb5fe2f1aff0 memcg: warn for unexpected events and stats
* 4715c6a753dcc mm: cleanup WORKINGSET_NODES in workingset
* 0667c7870a186 memcg: cleanup __mod_memcg_lruvec_state
* ff48c71c26aae memcg: reduce memory for the lruvec and memcg stats
* aab6103b97f1c mm: memcg: account memory used for memcg vmstats and lruvec stats
* 70a64b7919cbd memcg: dynamically allocate lruvec_stats <--- we reported this as 'fbc' in original report
* 59142d87ab03b memcg: reduce memory size of mem_cgroup_events_index <--- base
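For reference, roughly how that tree can be reproduced (a sketch; "fix.patch"
is just a placeholder file name for your fix):
$ git checkout a94032b35e5f9
$ git am fix.patch          # corresponds to fd2296741e268 in the log above
$ git log --oneline -3      # should show the top of the graph above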
>
> 2. What is the cgroup hierarchy where the workload is running? Is it
> running in the root cgroup?
Our test system uses systemd from the distribution (debian-12). The workload is
automatically assigned by systemd to a specific cgroup in a sub-hierarchy of
root, so it is not running directly in the root cgroup.
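For reference, a quick way to see exactly where systemd placed the workload
(a sketch; <workload-pid> is a placeholder for the test process id):
$ cat /proc/<workload-pid>/cgroup    # prints the cgroup v2 path of the process
$ systemd-cgls                       # shows the full systemd cgroup tree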
>
> 3. For the followup experiments when needed, can you please remove the
> whole series (including 59142d87ab03b8ff) for the base numbers.
I cannot understand this very well. If the patch is to fix the regression
caused by this series, it seems to me the best way is to apply this patch on
top of the series. Is there anything I misunderstood here?
Anyway, I could do that. Do you mean something like v6.9, which doesn't
include this series yet? I could use it as the base, then apply your patch
onto it, then check the diff between v6.9 and v6.9+patch.
But I still have a concern: a big improvement shown in this test cannot
guarantee there will be the same improvement when comparing the series
against the series+patch.
>
> 4. My experiment [1] on Cooper Lake (2 node) and Skylake (1 node) shows
> significant improvement but I noticed that I am directly running
> page_fault2_processes with -t equal nr_cpus but you are running through
> runtest.py. Also it seems like lkp has modified runtest.py. I will try
> to run the same setup as yours to repro.
>
>
> [1] https://lore.kernel.org/all/20240523034824.1255719-1-shakeel.butt@linux.dev
>
> thanks,
> Shakeel
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-24 7:45 ` Oliver Sang
@ 2024-05-24 18:06 ` Shakeel Butt
2024-05-28 6:30 ` Shakeel Butt
0 siblings, 1 reply; 15+ messages in thread
From: Shakeel Butt @ 2024-05-24 18:06 UTC (permalink / raw)
To: Oliver Sang
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin
On Fri, May 24, 2024 at 03:45:54PM +0800, Oliver Sang wrote:
> hi, Shakeel,
>
[...]
>
> >
> > 1. What is the baseline kernel you are using? Is it linux-next or linus?
> > If linux-next, which one specifically?
>
> base is just 59142d87ab03b, which is in current linux-next/master,
> and is already merged into linus/master now.
>
> linux$ git rev-list linux-next/master | grep 59142d87ab03b
> 59142d87ab03b8ff969074348f65730d465f42ee
>
> linux$ git rev-list linus/master | grep 59142d87ab03b
> 59142d87ab03b8ff969074348f65730d465f42ee
>
>
> the data for it is the first column in the tables we supplied.
>
> I just applied your patch upon a94032b35e5f9, so:
>
> linux$ git log --oneline --graph fd2296741e2686ed6ecd05187e4
> * fd2296741e268 fix for 70a64b7919 from Shakeel <----- your fix patch
> * a94032b35e5f9 memcg: use proper type for mod_memcg_state <--- patch-set tip, I believe
> * acb5fe2f1aff0 memcg: warn for unexpected events and stats
> * 4715c6a753dcc mm: cleanup WORKINGSET_NODES in workingset
> * 0667c7870a186 memcg: cleanup __mod_memcg_lruvec_state
> * ff48c71c26aae memcg: reduce memory for the lruvec and memcg stats
> * aab6103b97f1c mm: memcg: account memory used for memcg vmstats and lruvec stats
> * 70a64b7919cbd memcg: dynamically allocate lruvec_stats <--- we reported this as 'fbc' in original report
> * 59142d87ab03b memcg: reduce memory size of mem_cgroup_events_index <--- base
>
Cool, let's stick to the Linus tree. I was actually taking next-20240521
and reverting all the patches in the series to use as the base. One request
I have would be to make the base the commit just before 59142d87ab03b,
i.e. not 59142d87ab03b itself.
>
> >
> > 2. What is the cgroup hierarchy where the workload is running? Is it
> > running in the root cgroup?
>
> Our test system uses systemd from the distribution (debian-12). The workload is
> automatically assigned to a specific cgroup by systemd which is in the
> sub-hierarchy of root, so it is not directly running in the root cgroup.
>
> >
> > 3. For the followup experiments when needed, can you please remove the
> > whole series (including 59142d87ab03b8ff) for the base numbers.
>
> I cannot understand this very well, if the patch is to fix the regression
> cause by this series, seems to me the best way is to apply this patch on top
> of the series. anything I misunderstood here?
>
Sorry, I just meant that the 'base' case should be the commit just before
59142d87ab03b, as I said above.
I will re-run my experiments on the Linus tree and report back.
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-24 18:06 ` Shakeel Butt
@ 2024-05-28 6:30 ` Shakeel Butt
2024-05-30 6:17 ` Oliver Sang
0 siblings, 1 reply; 15+ messages in thread
From: Shakeel Butt @ 2024-05-28 6:30 UTC (permalink / raw)
To: Oliver Sang
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin
On Fri, May 24, 2024 at 11:06:54AM GMT, Shakeel Butt wrote:
> On Fri, May 24, 2024 at 03:45:54PM +0800, Oliver Sang wrote:
[...]
> I will re-run my experiments on linus tree and report back.
I am not able to reproduce the regression with the fix I have proposed,
at least on my 1 node 52 CPUs (Cooper Lake) and 2 node 80 CPUs (Skylake)
machines. Let me give more details below:
Setup instructions:
-------------------
mount -t tmpfs tmpfs /tmp
mkdir -p /sys/fs/cgroup/A
mkdir -p /sys/fs/cgroup/A/B
mkdir -p /sys/fs/cgroup/A/B/C
echo +memory > /sys/fs/cgroup/A/cgroup.subtree_control
echo +memory > /sys/fs/cgroup/A/B/cgroup.subtree_control
echo $$ > /sys/fs/cgroup/A/B/C/cgroup.procs
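A quick sanity check of the hierarchy above, in case it helps compare setups
(just a sketch using the same paths):
$ cat /sys/fs/cgroup/A/B/C/cgroup.controllers   # should list "memory"
$ cat /proc/self/cgroup                         # should show 0::/A/B/C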
The base case (commit a4c43b8a0980):
------------------------------------
$ python3 ./runtest.py page_fault2 295 process 0 0 52
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
52,2796769,0.03,0,0.00,0
$ python3 ./runtest.py page_fault2 295 process 0 0 80
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
80,6755010,0.04,0,0.00,0
The regressing series (last commit a94032b35e5f)
------------------------------------------------
$ python3 ./runtest.py page_fault2 295 process 0 0 52
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
52,2684859,0.03,0,0.00,0
$ python3 ./runtest.py page_fault2 295 process 0 0 80
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
80,6010438,0.13,0,0.00,0
The fix on top of regressing series:
------------------------------------
$ python3 ./runtest.py page_fault2 295 process 0 0 52
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
52,3812133,0.02,0,0.00,0
$ python3 ./runtest.py page_fault2 295 process 0 0 80
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
80,7979893,0.15,0,0.00,0
As you can see, the fix improves performance over the base, at least for me.
I can only speculate that either the hardware difference is giving us
different results (you have newer CPUs) or there is still some disparity in
the experiment setup/environment between us.
Are you disabling hyperthreading? Are the prefetching heuristics
different on your systems?
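A couple of quick checks for comparing the environments (a sketch; these are
the standard sysfs/lscpu knobs, availability can vary by kernel config):
$ cat /sys/devices/system/cpu/smt/active    # 1 = SMT siblings online, 0 = not
$ cat /sys/devices/system/cpu/smt/control   # on / off / forceoff / notsupported
$ lscpu | grep -E 'Thread|Socket|NUMA'      # topology summary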
Regarding the test environment, can you check my setup instructions above
and see if I am doing something wrong or different?
At the moment, I am inclined towards asking Andrew to include my fix in a
following 6.10-rc*, but to keep this report open so we can continue to
improve. Let me know if you have concerns.
thanks,
Shakeel
* Re: [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression
2024-05-28 6:30 ` Shakeel Butt
@ 2024-05-30 6:17 ` Oliver Sang
0 siblings, 0 replies; 15+ messages in thread
From: Oliver Sang @ 2024-05-30 6:17 UTC (permalink / raw)
To: Shakeel Butt
Cc: oe-lkp, lkp, Linux Memory Management List, Andrew Morton,
Yosry Ahmed, T.J. Mercier, Roman Gushchin, Johannes Weiner,
Michal Hocko, Muchun Song, cgroups, ying.huang, feng.tang,
fengwei.yin, oliver.sang
hi, Shakeel,
On Mon, May 27, 2024 at 11:30:38PM -0700, Shakeel Butt wrote:
> On Fri, May 24, 2024 at 11:06:54AM GMT, Shakeel Butt wrote:
> > On Fri, May 24, 2024 at 03:45:54PM +0800, Oliver Sang wrote:
> [...]
> > I will re-run my experiments on linus tree and report back.
>
> I am not able to reproduce the regression with the fix I have proposed,
> at least on my 1 node 52 CPUs (Cooper Lake) and 2 node 80 CPUs (Skylake)
> machines. Let me give more details below:
>
> Setup instructions:
> -------------------
> mount -t tmpfs tmpfs /tmp
> mkdir -p /sys/fs/cgroup/A
> mkdir -p /sys/fs/cgroup/A/B
> mkdir -p /sys/fs/cgroup/A/B/C
> echo +memory > /sys/fs/cgroup/A/cgroup.subtree_control
> echo +memory > /sys/fs/cgroup/A/B/cgroup.subtree_control
> echo $$ > /sys/fs/cgroup/A/B/C/cgroup.procs
>
> The base case (commit a4c43b8a0980):
> ------------------------------------
> $ python3 ./runtest.py page_fault2 295 process 0 0 52
> tasks,processes,processes_idle,threads,threads_idle,linear
> 0,0,100,0,100,0
> 52,2796769,0.03,0,0.00,0
>
> $ python3 ./runtest.py page_fault2 295 process 0 0 80
> tasks,processes,processes_idle,threads,threads_idle,linear
> 0,0,100,0,100,0
> 80,6755010,0.04,0,0.00,0
>
>
> The regressing series (last commit a94032b35e5f)
> ------------------------------------------------
> $ python3 ./runtest.py page_fault2 295 process 0 0 52
> tasks,processes,processes_idle,threads,threads_idle,linear
> 0,0,100,0,100,0
> 52,2684859,0.03,0,0.00,0
>
> $ python3 ./runtest.py page_fault2 295 process 0 0 80
> tasks,processes,processes_idle,threads,threads_idle,linear
> 0,0,100,0,100,0
> 80,6010438,0.13,0,0.00,0
>
> The fix on top of regressing series:
> ------------------------------------
> $ python3 ./runtest.py page_fault2 295 process 0 0 52
> tasks,processes,processes_idle,threads,threads_idle,linear
> 0,0,100,0,100,0
> 52,3812133,0.02,0,0.00,0
>
> $ python3 ./runtest.py page_fault2 295 process 0 0 80
> tasks,processes,processes_idle,threads,threads_idle,linear
> 0,0,100,0,100,0
> 80,7979893,0.15,0,0.00,0
>
>
> As you can see, the fix is improving the performance over the base, at
> least for me. I can only speculate that either the difference of
> hardware is giving us different results (you have newer CPUs) or there
> is still disparity of experiment setup/environment between us.
>
> Are you disabling hyperthreading? Is the prefetching heuristics
> different on your systems?
we don't disable hyperthreading.
for prefetching, we don't change the BIOS default settings. for the SKL server
in our original report:
MLC Spatial Prefetcher - enabled
DCU Data Prefetcher - enabled
DCU Instruction Prefetcher - enabled
LLC Prefetch - disabled
but we don't apply uniform settings across all our servers. for example, on
the Ice Lake server mentioned in a previous mail, "LLC Prefetch" is enabled
by default, so we keep it enabled.
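For reference, one way to dump the hardware-prefetcher state directly (a
sketch, assuming msr-tools is installed; MSR 0x1a4 is MISC_FEATURE_CONTROL on
many recent Intel parts and its bit layout can vary by generation, so treat
the decoding below as an assumption):
$ modprobe msr
$ rdmsr -a 0x1a4   # bits 0-3: L2 HW / L2 adjacent-line / DCU streamer / DCU IP
                   # prefetcher disable bits (1 = disabled)
the "LLC Prefetch" knob above may only be exposed through the BIOS on these
platforms.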
>
> Regarding test environment, can you check my setup instructions above
> and see if I am doing something wrong or different?
>
> At the moment, I am inclined towards asking Andrew to include my fix in
> following 6.10-rc* but keep this report open, so we continue to improve.
> Let me know if you have concerns.
yeah, differences in setup/environment could cause this. anyway, when your
fix is merged, we should capture it as a performance improvement. or if you
want us to do a manual check, you could let us know. Thanks!
>
> thanks,
> Shakeel
end of thread, other threads:[~2024-05-30 6:18 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-17 5:56 [linux-next:master] [memcg] 70a64b7919: will-it-scale.per_process_ops -11.9% regression kernel test robot
2024-05-17 23:38 ` Yosry Ahmed
2024-05-18 6:28 ` Shakeel Butt
2024-05-19 9:14 ` Oliver Sang
2024-05-19 17:20 ` Shakeel Butt
2024-05-20 2:43 ` Oliver Sang
2024-05-20 3:49 ` Shakeel Butt
2024-05-21 2:43 ` Oliver Sang
2024-05-22 4:18 ` Shakeel Butt
2024-05-23 7:48 ` Oliver Sang
2024-05-23 16:47 ` Shakeel Butt
2024-05-24 7:45 ` Oliver Sang
2024-05-24 18:06 ` Shakeel Butt
2024-05-28 6:30 ` Shakeel Butt
2024-05-30 6:17 ` Oliver Sang