From: Yujie Liu <yujie.liu@intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
"Hugh Dickins" <hughd@google.com>,
Nadav Amit <nadav.amit@gmail.com>,
"Linux Memory Management List" <linux-mm@kvack.org>,
<linux-arch@vger.kernel.org>, <ying.huang@intel.com>,
<feng.tang@intel.com>, <zhengjun.xing@linux.intel.com>,
<fengwei.yin@intel.com>
Subject: Re: [linux-next:master] [mm] 5df397dec7: will-it-scale.per_thread_ops -53.3% regression
Date: Wed, 7 Dec 2022 10:12:45 +0800
Message-ID: <Y4/2nZoYtDufIMSK@yujie-X299>
In-Reply-To: <CAHk-=wg330wAAxwSaJBPUtL5Rrn7PoQK3ksJw2OLvBxA0NGg+g@mail.gmail.com>

On Mon, Dec 05, 2022 at 12:43:37PM -0800, Linus Torvalds wrote:
> On Mon, Dec 5, 2022 at 1:02 AM kernel test robot <yujie.liu@intel.com> wrote:
> >
> > FYI, we noticed a -53.3% regression of will-it-scale.per_thread_ops due to commit:
> > 5df397dec7c4 ("mm: delay page_remove_rmap() until after the TLB has been flushed")
>
> Sadly, I think this may be at least partially expected.
>
> The code fundamentally moves one "loop over pages" and splits it up
> (with the TLB flush in between).
>
> Which can't be great for locality, but it's kind of fundamental for
> the fix - but some of it might be due to the batch limit logic.
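
(To make sure we read the change correctly: the reordering is roughly the
sketch below. This is simplified pseudocode from our reading of
zap_pte_range() and mm/mmu_gather.c, not the exact kernel code - the
"for each pte" lines stand in for the real page table walk.)

        /* before 5df397dec7c4: one pass, rmap torn down per pte */
        for each pte in the range {
                clear the pte, remember the page;
                page_remove_rmap(page, vma, false);  /* before the TLB flush */
                __tlb_remove_page(tlb, page);        /* gather page for freeing */
        }
        tlb_flush_mmu(tlb);                          /* flush TLB, free pages */

        /* after 5df397dec7c4: rmap removal becomes a second pass */
        for each pte in the range {
                clear the pte, remember the page;
                __tlb_remove_page(tlb, page, delay_rmap);
        }
        tlb_flush_mmu_tlbonly(tlb);                  /* flush the TLB first... */
        tlb_flush_rmaps(tlb, vma);                   /* ...then page_remove_rmap()
                                                        on each gathered page */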
>
> I wouldn't have expected it to actually show up in any real loads, but:
>
> > in testcase: will-it-scale
> > test: page_fault3
>
> I assume that this test is doing a lot of mmap/munmap on dirty shared
> memory regions (both because of the regression, and because of the
> name of that test ;)
>
> So this is hopefully an extreme case.
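
Yes - each iteration of page_fault3 write-faults every page of a
MAP_SHARED file mapping and then unmaps it, so the munmap() side has to
zap a fully dirty region every time. Roughly (simplified from
will-it-scale/tests/page_fault3.c; the temp file comes from mkstemp(),
which lands on tmpfs in our runs, and the size below is illustrative):

        #define MEMSIZE (128UL * 1024 * 1024)

        char *c = mmap(NULL, MEMSIZE, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);  /* fd: ftruncate()d temp file */
        for (size_t i = 0; i < MEMSIZE; i += pagesize)
                c[i] = 0;                   /* dirty every shared page */
        munmap(c, MEMSIZE);                 /* zap + TLB flush + delayed rmap */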
>
> Now, it's likely that this particular case probably also triggers that
>
> /* No more batching if we have delayed rmaps pending */
>
> which means that the loops in between the TLB flushes will be smaller,
> since we don't batch up as many pages as we used to before we force a
> TLB (and rmap) flush and free them.
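
(For other readers: that check sits in tlb_next_batch(), so once the
gather contains pages with delayed rmaps, no further batch is allocated;
__tlb_remove_page() then reports the batch as full much sooner and
zap_pte_range() has to force a flush. Simplified from our reading of
mm/mmu_gather.c after the commit:)

        static bool tlb_next_batch(struct mmu_gather *tlb)
        {
                struct mmu_gather_batch *batch;

                /* No more batching if we have delayed rmaps pending */
                if (tlb->delayed_rmap)
                        return false;

                /* ... otherwise reuse batch->next or allocate a new
                   batch, up to MAX_GATHER_BATCH_COUNT ... */
        }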
>
> If it's due to that batching issue it may be fixable - I'll think
> about this some more, but
>
> > Details are as below:
>
> The bug it fixes ends up meaning that we run that rmap removal code
> _after_ the TLB flush, and it looks like this (probably combined with
> the batching limit) then causes some nasty iTLB load issues:
>
> > 2291312 ą 2% +1452.8% 35580378 ą 4% perf-stat.i.iTLB-loads
>
> although it also does look like it's at least partly due to some irq
> timing issue (and/or bad NUMA/CPU migration luck):
>
> > 388169 +267.4% 1426305 ą 6% vmstat.system.in
> > 161.37 +84.9% 298.43 ą 6% perf-stat.ps.cpu-migrations
> > 172442 ą 4% +26.9% 218745 ą 8% perf-stat.ps.node-load-misses
>
> so it might be that some of the regression comes down to "bad luck" -
> it happened to run really nicely on that particular machine, and then
> the timing changes caused some random "phase change" to the load.
>
> The profile doesn't actually seem to show all that much more IPI
> overhead, so maybe these incidental issues are what then causes that
> big regression.
>
> It would be lovely to hear if you see this on other machines and/or loads.

FYI, we ran this "will-it-scale page_fault3" testcase on two other x86
platforms and observed similar performance regressions. We haven't
seen regressions from other benchmarks/workloads yet.
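
In case anyone wants to try these runs elsewhere, the usual lkp-tests
flow is roughly the following ("job.yaml" stands for the job file that
the robot attaches to its reports):

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml   # install the job's dependencies
        sudo bin/lkp run job.yaml       # run will-it-scale page_fault3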

96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake)
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-11/performance/x86_64-rhel-8.3/thread/16/debian-11.1-x86_64-20220510.cgz/lkp-csl-2sp7/page_fault3/will-it-scale
commit:
7cc8f9c7146a5 ("mm: mmu_gather: prepare to gather encoded page pointers with flags")
5df397dec7c4c ("mm: delay page_remove_rmap() until after the TLB has been flushed")
  7cc8f9c7146a5 5df397dec7c4c
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
5292018 -41.6% 3090618 ± 2% will-it-scale.16.threads
84.04 +5.5% 88.64 will-it-scale.16.threads_idle
330750 -41.6% 193163 ± 2% will-it-scale.per_thread_ops
5292018 -41.6% 3090618 ± 2% will-it-scale.workload
3777076 -33.9% 2496224 numa-numastat.node0.local_node
3834886 -33.7% 2541691 numa-numastat.node0.numa_hit
1.17 ± 9% +1.2 2.39 ± 8% mpstat.cpu.all.irq%
13.50 ± 2% -5.3 8.17 mpstat.cpu.all.sys%
1.14 ± 39% -0.5 0.64 mpstat.cpu.all.usr%
83.83 +6.0% 88.83 vmstat.cpu.id
13.83 ± 2% -32.5% 9.33 ± 5% vmstat.procs.r
9325 ± 3% -46.2% 5018 vmstat.system.cs
298875 +422.0% 1560096 vmstat.system.in
160279 ± 23% +36.5% 218776 ± 12% numa-meminfo.node0.AnonPages
166313 ± 21% +34.5% 223688 ± 11% numa-meminfo.node0.Inactive
164286 ± 22% +35.9% 223228 ± 11% numa-meminfo.node0.Inactive(anon)
4048 ± 6% +14.1% 4620 ± 5% numa-meminfo.node0.PageTables
247964 ± 16% -28.7% 176690 ± 17% numa-meminfo.node1.AnonPages.max
40074 ± 23% +36.5% 54693 ± 12% numa-vmstat.node0.nr_anon_pages
41076 ± 22% +35.9% 55806 ± 11% numa-vmstat.node0.nr_inactive_anon
1012 ± 6% +14.0% 1154 ± 5% numa-vmstat.node0.nr_page_table_pages
41076 ± 22% +35.9% 55806 ± 11% numa-vmstat.node0.nr_zone_inactive_anon
3834883 -33.7% 2541696 numa-vmstat.node0.numa_hit
3777072 -33.9% 2496229 numa-vmstat.node0.numa_local
442.00 -29.3% 312.67 ± 2% turbostat.Avg_MHz
16.87 ± 2% -4.5 12.33 ± 4% turbostat.Busy%
611287 ± 13% -91.1% 54248 ± 13% turbostat.C1
0.27 ± 16% -0.3 0.01 turbostat.C1%
1.238e+08 +624.2% 8.965e+08 turbostat.IRQ
167.40 -6.5% 156.53 turbostat.PkgWatt
270220 +1.8% 275170 proc-vmstat.nr_mapped
4434671 -29.9% 3110296 proc-vmstat.numa_hit
4348904 -30.5% 3023405 proc-vmstat.numa_local
548152 -1.4% 540422 proc-vmstat.pgactivate
4512817 -29.2% 3193199 proc-vmstat.pgalloc_normal
1.594e+09 -41.6% 9.308e+08 ± 2% proc-vmstat.pgfault
4490607 -29.4% 3171990 proc-vmstat.pgfree
0.42 ± 4% -12.7% 0.36 ± 7% sched_debug.cfs_rq:/.h_nr_running.stddev
78690 ± 2% +78.7% 140636 ± 46% sched_debug.cfs_rq:/.load.max
30480 ± 5% +31.9% 40209 ± 15% sched_debug.cfs_rq:/.load.stddev
317962 ± 8% -48.8% 162930 ± 15% sched_debug.cfs_rq:/.min_vruntime.avg
1279285 ± 12% -56.3% 558508 ± 11% sched_debug.cfs_rq:/.min_vruntime.max
404116 ± 10% -57.3% 172730 ± 10% sched_debug.cfs_rq:/.min_vruntime.stddev
0.42 ± 5% -12.6% 0.36 ± 7% sched_debug.cfs_rq:/.nr_running.stddev
231.45 ± 7% -23.5% 177.15 ± 6% sched_debug.cfs_rq:/.runnable_avg.avg
854.02 ± 3% -15.8% 719.28 ± 7% sched_debug.cfs_rq:/.runnable_avg.max
315.94 ± 4% -27.2% 229.91 ± 2% sched_debug.cfs_rq:/.runnable_avg.stddev
681750 ± 31% -50.1% 340262 ± 13% sched_debug.cfs_rq:/.spread0.max
-577443 -66.6% -193080 sched_debug.cfs_rq:/.spread0.min
404120 ± 10% -57.3% 172733 ± 10% sched_debug.cfs_rq:/.spread0.stddev
231.41 ± 7% -23.5% 177.11 ± 6% sched_debug.cfs_rq:/.util_avg.avg
853.96 ± 3% -15.8% 719.22 ± 7% sched_debug.cfs_rq:/.util_avg.max
315.91 ± 4% -27.2% 229.88 ± 2% sched_debug.cfs_rq:/.util_avg.stddev
155.50 ± 9% -49.7% 78.27 ± 26% sched_debug.cfs_rq:/.util_est_enqueued.avg
781.56 ± 2% -22.9% 602.36 ± 5% sched_debug.cfs_rq:/.util_est_enqueued.max
267.82 ± 5% -39.1% 163.05 ± 10% sched_debug.cfs_rq:/.util_est_enqueued.stddev
151897 ± 35% +76.2% 267629 ± 33% sched_debug.cpu.avg_idle.min
215458 ± 12% -41.9% 125119 ± 7% sched_debug.cpu.avg_idle.stddev
1645 ± 9% +85.1% 3044 ± 6% sched_debug.cpu.clock_task.stddev
872.08 ± 4% -28.3% 624.97 ± 12% sched_debug.cpu.curr->pid.avg
1962 ± 2% -14.7% 1674 ± 6% sched_debug.cpu.curr->pid.stddev
0.17 ± 5% -27.9% 0.12 ± 12% sched_debug.cpu.nr_running.avg
0.37 ± 2% -16.5% 0.31 ± 6% sched_debug.cpu.nr_running.stddev
16252 ± 9% -38.7% 9956 ± 7% sched_debug.cpu.nr_switches.avg
18638 ± 12% -43.9% 10451 ± 19% sched_debug.cpu.nr_switches.stddev
3.315e+09 -29.4% 2.34e+09 perf-stat.i.branch-instructions
0.30 ± 21% +0.2 0.50 ± 24% perf-stat.i.branch-miss-rate%
8341916 ± 11% -31.9% 5682680 ± 15% perf-stat.i.cache-misses
9303 ± 3% -46.8% 4949 perf-stat.i.context-switches
4.193e+10 -29.8% 2.944e+10 ± 2% perf-stat.i.cpu-cycles
4.344e+09 -28.1% 3.122e+09 perf-stat.i.dTLB-loads
5.87 -1.0 4.85 perf-stat.i.dTLB-store-miss-rate%
1.477e+08 -41.7% 86041370 ± 2% perf-stat.i.dTLB-store-misses
2.369e+09 -28.9% 1.685e+09 perf-stat.i.dTLB-stores
86.29 -60.2 26.04 perf-stat.i.iTLB-load-miss-rate%
10635919 ± 17% -15.9% 8947125 perf-stat.i.iTLB-load-misses
1651323 ± 6% +1441.1% 25448763 perf-stat.i.iTLB-loads
1.593e+10 -29.2% 1.128e+10 perf-stat.i.instructions
0.44 -29.8% 0.31 ± 2% perf-stat.i.metric.GHz
450.64 ± 78% +335.6% 1963 ± 15% perf-stat.i.metric.K/sec
106.83 -30.1% 74.65 perf-stat.i.metric.M/sec
5278281 -41.7% 3075728 ± 2% perf-stat.i.minor-faults
0.48 ± 14% +0.5 0.94 ± 22% perf-stat.i.node-store-miss-rate%
5329558 -41.5% 3115251 ± 2% perf-stat.i.node-stores
5278281 -41.7% 3075729 ± 2% perf-stat.i.page-faults
0.32 ± 20% +0.2 0.53 ± 22% perf-stat.overall.branch-miss-rate%
5.87 -1.0 4.86 perf-stat.overall.dTLB-store-miss-rate%
86.34 -60.3 26.01 perf-stat.overall.iTLB-load-miss-rate%
0.47 ± 14% +0.3 0.81 ± 23% perf-stat.overall.node-store-miss-rate%
909203 +21.4% 1104227 perf-stat.overall.path-length
3.304e+09 -29.4% 2.333e+09 perf-stat.ps.branch-instructions
8314122 ± 11% -31.9% 5663748 ± 15% perf-stat.ps.cache-misses
9272 ± 3% -46.8% 4933 perf-stat.ps.context-switches
4.179e+10 -29.8% 2.935e+10 ± 2% perf-stat.ps.cpu-cycles
4.33e+09 -28.1% 3.111e+09 perf-stat.ps.dTLB-loads
1.472e+08 -41.8% 85755366 ± 2% perf-stat.ps.dTLB-store-misses
2.361e+09 -28.9% 1.679e+09 perf-stat.ps.dTLB-stores
10601230 ± 17% -15.9% 8917293 perf-stat.ps.iTLB-load-misses
1645797 ± 6% +1441.2% 25364210 perf-stat.ps.iTLB-loads
1.588e+10 -29.2% 1.124e+10 perf-stat.ps.instructions
5260752 -41.7% 3065504 ± 2% perf-stat.ps.minor-faults
5311793 -41.5% 3104889 ± 2% perf-stat.ps.node-stores
5260753 -41.7% 3065504 ± 2% perf-stat.ps.page-faults
4.812e+12 -29.1% 3.412e+12 perf-stat.total.instructions
22.05 ± 8% -4.7 17.34 ± 11% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
11.90 ± 8% -4.0 7.95 ± 9% perf-profile.calltrace.cycles-pp.up_read.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
15.06 ± 8% -2.9 12.11 ± 9% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
11.63 ± 7% -2.3 9.32 ± 9% perf-profile.calltrace.cycles-pp.down_read_trylock.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.00 +0.6 0.59 ± 11% perf-profile.calltrace.cycles-pp.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range
0.00 +0.6 0.62 ± 10% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.testcase
0.00 +0.6 0.62 ± 11% perf-profile.calltrace.cycles-pp.page_remove_rmap.tlb_flush_rmaps.zap_pte_range.zap_pmd_range.unmap_page_range
0.00 +0.7 0.66 ± 10% perf-profile.calltrace.cycles-pp.tlb_flush_rmaps.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
0.00 +0.8 0.78 ± 9% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.testcase
0.00 +0.8 0.85 ± 11% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.__handle_mm_fault
0.00 +0.9 0.85 ± 11% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.__handle_mm_fault.handle_mm_fault
0.00 +0.9 0.89 ± 9% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.down_read_trylock
0.00 +0.9 0.90 ± 9% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.down_read_trylock.do_user_addr_fault
0.00 +0.9 0.90 ± 11% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.00 +0.9 0.93 ± 7% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.up_read
0.00 +0.9 0.94 ± 6% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.up_read.do_user_addr_fault
0.00 +0.9 0.95 ± 9% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.down_read_trylock.do_user_addr_fault.exc_page_fault
0.00 +1.0 1.00 ± 6% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.up_read.do_user_addr_fault.exc_page_fault
0.00 +1.0 1.00 ± 36% perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range
0.00 +1.1 1.05 ± 36% perf-profile.calltrace.cycles-pp.flush_tlb_func.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range
0.00 +1.1 1.10 ± 11% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
0.00 +1.1 1.15 ± 8% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.down_read_trylock.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
0.00 +1.2 1.20 ± 7% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.up_read.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
0.00 +1.5 1.48 ± 7% perf-profile.calltrace.cycles-pp.__default_send_IPI_dest_field.default_send_IPI_mask_sequence_phys.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range
0.00 +1.5 1.52 ± 7% perf-profile.calltrace.cycles-pp.default_send_IPI_mask_sequence_phys.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range
0.00 +1.7 1.71 ± 10% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.do_user_addr_fault
0.00 +1.7 1.72 ± 10% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.do_user_addr_fault.exc_page_fault
0.00 +1.8 1.82 ± 10% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
0.00 +3.1 3.14 ± 10% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.00 +3.6 3.61 ± 15% perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function
0.00 +3.8 3.85 ± 7% perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range.zap_pmd_range
0.00 +3.9 3.87 ± 7% perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range.zap_pmd_range.unmap_page_range
0.00 +4.2 4.22 ± 20% perf-profile.calltrace.cycles-pp.flush_tlb_func.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function
1.96 ± 7% +4.3 6.27 ± 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
1.96 ± 7% +4.3 6.27 ± 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
1.96 ± 7% +4.3 6.27 ± 7% perf-profile.calltrace.cycles-pp.__munmap
1.96 ± 7% +4.3 6.27 ± 7% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
1.96 ± 7% +4.3 6.27 ± 7% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
1.94 ± 7% +4.3 6.25 ± 7% perf-profile.calltrace.cycles-pp.do_mas_align_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.91 ± 7% +4.3 6.23 ± 7% perf-profile.calltrace.cycles-pp.unmap_region.do_mas_align_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
1.89 ± 7% +4.3 6.22 ± 7% perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_mas_align_munmap.__vm_munmap.__x64_sys_munmap
1.89 ± 7% +4.3 6.22 ± 7% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_mas_align_munmap.__vm_munmap
1.89 ± 7% +4.3 6.22 ± 7% perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_mas_align_munmap
1.86 ± 7% +4.3 6.21 ± 7% perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
0.00 +4.6 4.56 ± 7% perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
30.20 ± 17% +6.8 37.04 ± 15% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
30.64 ± 16% +7.0 37.60 ± 14% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
30.73 ± 16% +7.0 37.70 ± 14% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
30.73 ± 16% +7.0 37.71 ± 14% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
30.73 ± 16% +7.0 37.71 ± 14% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
22.10 ± 8% -4.6 17.48 ± 11% perf-profile.children.cycles-pp.handle_mm_fault
11.92 ± 8% -3.8 8.15 ± 8% perf-profile.children.cycles-pp.up_read
11.65 ± 7% -2.1 9.51 ± 9% perf-profile.children.cycles-pp.down_read_trylock
0.16 ± 21% -0.1 0.09 ± 26% perf-profile.children.cycles-pp.process_simple
0.13 ± 14% -0.1 0.07 ± 10% perf-profile.children.cycles-pp.rwsem_down_read_slowpath
0.14 ± 21% -0.1 0.08 ± 22% perf-profile.children.cycles-pp.queue_event
0.14 ± 21% -0.1 0.08 ± 26% perf-profile.children.cycles-pp.ordered_events__queue
0.14 ± 14% -0.0 0.10 ± 10% perf-profile.children.cycles-pp.__schedule
0.10 ± 10% -0.0 0.06 ± 19% perf-profile.children.cycles-pp.schedule
0.02 ±141% +0.0 0.06 ± 13% perf-profile.children.cycles-pp.ret_from_fork
0.02 ±141% +0.0 0.06 ± 13% perf-profile.children.cycles-pp.kthread
0.16 ± 21% +0.1 0.22 ± 10% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.00 +0.1 0.09 ± 7% perf-profile.children.cycles-pp._find_next_bit
0.08 ± 10% +0.2 0.25 ± 15% perf-profile.children.cycles-pp.native_sched_clock
0.08 ± 8% +0.2 0.29 ± 14% perf-profile.children.cycles-pp.sched_clock_cpu
0.20 ± 12% +0.2 0.43 ± 13% perf-profile.children.cycles-pp.__irq_exit_rcu
0.07 ± 11% +0.2 0.31 ± 10% perf-profile.children.cycles-pp.irqtime_account_irq
1.46 ± 8% +0.3 1.77 ± 10% perf-profile.children.cycles-pp.__filemap_get_folio
1.71 ± 8% +0.3 2.02 ± 10% perf-profile.children.cycles-pp.shmem_get_folio_gfp
0.00 +0.4 0.43 ± 10% perf-profile.children.cycles-pp.llist_reverse_order
0.00 +0.6 0.59 ± 11% perf-profile.children.cycles-pp.llist_add_batch
0.00 +0.7 0.67 ± 10% perf-profile.children.cycles-pp.tlb_flush_rmaps
0.09 ± 15% +1.4 1.48 ± 7% perf-profile.children.cycles-pp.__default_send_IPI_dest_field
0.09 ± 14% +1.4 1.53 ± 7% perf-profile.children.cycles-pp.default_send_IPI_mask_sequence_phys
0.19 ± 8% +3.7 3.87 ± 7% perf-profile.children.cycles-pp.smp_call_function_many_cond
0.19 ± 8% +3.7 3.87 ± 7% perf-profile.children.cycles-pp.on_each_cpu_cond_mask
2.19 ± 7% +4.3 6.49 ± 7% perf-profile.children.cycles-pp.do_syscall_64
2.19 ± 7% +4.3 6.49 ± 7% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
1.97 ± 7% +4.3 6.27 ± 7% perf-profile.children.cycles-pp.__vm_munmap
1.96 ± 7% +4.3 6.27 ± 7% perf-profile.children.cycles-pp.__x64_sys_munmap
1.96 ± 7% +4.3 6.27 ± 7% perf-profile.children.cycles-pp.__munmap
1.94 ± 7% +4.3 6.25 ± 7% perf-profile.children.cycles-pp.do_mas_align_munmap
1.92 ± 7% +4.3 6.24 ± 7% perf-profile.children.cycles-pp.unmap_region
1.90 ± 7% +4.3 6.23 ± 7% perf-profile.children.cycles-pp.unmap_vmas
1.90 ± 7% +4.3 6.23 ± 7% perf-profile.children.cycles-pp.unmap_page_range
1.90 ± 7% +4.3 6.23 ± 7% perf-profile.children.cycles-pp.zap_pmd_range
1.90 ± 7% +4.3 6.23 ± 7% perf-profile.children.cycles-pp.zap_pte_range
0.19 ± 8% +4.4 4.56 ± 7% perf-profile.children.cycles-pp.flush_tlb_mm_range
0.14 ± 11% +6.5 6.64 ± 9% perf-profile.children.cycles-pp.__flush_smp_call_function_queue
0.14 ± 11% +6.5 6.69 ± 9% perf-profile.children.cycles-pp.__sysvec_call_function
0.00 +6.8 6.84 ± 9% perf-profile.children.cycles-pp.native_flush_tlb_one_user
0.17 ± 11% +6.9 7.09 ± 9% perf-profile.children.cycles-pp.sysvec_call_function
30.73 ± 16% +7.0 37.71 ± 14% perf-profile.children.cycles-pp.start_secondary
0.08 ± 13% +7.3 7.39 ± 9% perf-profile.children.cycles-pp.flush_tlb_func
0.30 ± 12% +8.7 8.96 ± 8% perf-profile.children.cycles-pp.asm_sysvec_call_function
11.80 ± 8% -4.8 6.96 ± 9% perf-profile.self.cycles-pp.up_read
10.49 ± 7% -4.1 6.38 ± 9% perf-profile.self.cycles-pp.__handle_mm_fault
11.55 ± 7% -3.2 8.38 ± 9% perf-profile.self.cycles-pp.down_read_trylock
6.15 ± 11% -2.4 3.74 ± 23% perf-profile.self.cycles-pp.handle_mm_fault
9.08 ± 7% -1.8 7.24 ± 9% perf-profile.self.cycles-pp.testcase
0.32 ± 8% -0.1 0.23 ± 10% perf-profile.self.cycles-pp.page_remove_rmap
0.14 ± 21% -0.1 0.08 ± 22% perf-profile.self.cycles-pp.queue_event
0.26 ± 9% -0.0 0.21 ± 8% perf-profile.self.cycles-pp.page_add_file_rmap
0.15 ± 7% -0.0 0.12 ± 10% perf-profile.self.cycles-pp.__mod_lruvec_page_state
0.12 ± 9% -0.0 0.09 ± 11% perf-profile.self.cycles-pp.do_fault
0.10 ± 9% -0.0 0.07 ± 17% perf-profile.self.cycles-pp.__count_memcg_events
0.00 +0.1 0.06 ± 15% perf-profile.self.cycles-pp.flush_tlb_mm_range
0.00 +0.1 0.07 ± 10% perf-profile.self.cycles-pp.sysvec_call_function
0.00 +0.1 0.08 ± 12% perf-profile.self.cycles-pp.irqtime_account_irq
0.00 +0.1 0.08 ± 10% perf-profile.self.cycles-pp._find_next_bit
0.00 +0.1 0.10 ± 9% perf-profile.self.cycles-pp.asm_sysvec_call_function
0.64 ± 8% +0.1 0.77 ± 8% perf-profile.self.cycles-pp.__filemap_get_folio
0.07 ± 12% +0.2 0.24 ± 16% perf-profile.self.cycles-pp.native_sched_clock
0.00 +0.4 0.42 ± 10% perf-profile.self.cycles-pp.llist_reverse_order
0.00 +0.5 0.51 ± 6% perf-profile.self.cycles-pp.__flush_smp_call_function_queue
0.42 ± 7% +0.6 0.97 ± 10% perf-profile.self.cycles-pp.do_user_addr_fault
0.03 ±100% +0.6 0.58 ± 4% perf-profile.self.cycles-pp.smp_call_function_many_cond
0.00 +0.6 0.57 ± 9% perf-profile.self.cycles-pp.flush_tlb_func
0.00 +0.6 0.59 ± 10% perf-profile.self.cycles-pp.llist_add_batch
0.09 ± 15% +1.4 1.48 ± 7% perf-profile.self.cycles-pp.__default_send_IPI_dest_field
0.00 +6.8 6.82 ± 9% perf-profile.self.cycles-pp.native_flush_tlb_one_user

128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake)
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-11/performance/x86_64-rhel-8.3/thread/16/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp5/page_fault3/will-it-scale
commit:
7cc8f9c7146a5 ("mm: mmu_gather: prepare to gather encoded page pointers with flags")
5df397dec7c4c ("mm: delay page_remove_rmap() until after the TLB has been flushed")
  7cc8f9c7146a5 5df397dec7c4c
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
6221506 ± 5% -44.7% 3439127 ± 2% will-it-scale.16.threads
87.20 +4.3% 90.98 will-it-scale.16.threads_idle
388843 ± 5% -44.7% 214944 ± 2% will-it-scale.per_thread_ops
6221506 ± 5% -44.7% 3439127 ± 2% will-it-scale.workload
1.23 ± 8% +1.3 2.54 ± 19% mpstat.cpu.all.irq%
10.86 -4.9 5.96 ± 2% mpstat.cpu.all.sys%
0.61 ± 6% -0.2 0.40 ± 3% mpstat.cpu.all.usr%
4388857 ± 5% -36.6% 2782588 ± 2% numa-numastat.node0.local_node
4446685 ± 4% -36.3% 2833518 ± 2% numa-numastat.node0.numa_hit
618831 ± 3% -10.3% 554830 ± 4% numa-numastat.node1.local_node
14.50 ± 3% -39.1% 8.83 ± 7% vmstat.procs.r
10336 ± 8% -45.7% 5616 vmstat.system.cs
390901 ± 3% +350.1% 1759451 ± 2% vmstat.system.in
410.83 -31.2% 282.50 ± 4% turbostat.Avg_MHz
13.65 -3.6 10.06 ± 8% turbostat.Busy%
1.603e+08 ± 5% +524.0% 1e+09 ± 2% turbostat.IRQ
60.67 -7.1% 56.33 ± 5% turbostat.PkgTmp
274.70 -8.6% 251.09 ± 6% turbostat.PkgWatt
126930 ± 15% -35.2% 82187 ± 25% numa-meminfo.node0.AnonHugePages
248450 ± 9% -19.6% 199671 ± 25% numa-meminfo.node0.AnonPages
255089 ± 9% -19.6% 205148 ± 24% numa-meminfo.node0.Inactive
254353 ± 9% -19.4% 204910 ± 24% numa-meminfo.node0.Inactive(anon)
22546 ± 12% -24.4% 17051 ± 9% numa-meminfo.node1.Active
22116 ± 11% -25.2% 16539 ± 10% numa-meminfo.node1.Active(anon)
24956 ± 10% -24.9% 18736 ± 9% numa-meminfo.node1.Shmem
264468 +3.9% 274871 proc-vmstat.nr_mapped
5125804 ± 4% -32.6% 3455449 proc-vmstat.numa_hit
5010073 ± 4% -33.3% 3339738 proc-vmstat.numa_local
551502 -1.7% 542068 proc-vmstat.pgactivate
5213112 ± 4% -32.1% 3539426 proc-vmstat.pgalloc_normal
1.874e+09 ± 5% -44.7% 1.036e+09 ± 2% proc-vmstat.pgfault
5251524 ± 4% -31.8% 3580764 proc-vmstat.pgfree
62112 ± 9% -19.6% 49917 ± 25% numa-vmstat.node0.nr_anon_pages
63588 ± 9% -19.4% 51227 ± 24% numa-vmstat.node0.nr_inactive_anon
63588 ± 9% -19.4% 51227 ± 24% numa-vmstat.node0.nr_zone_inactive_anon
4446807 ± 4% -36.3% 2833561 ± 2% numa-vmstat.node0.numa_hit
4388978 ± 5% -36.6% 2782630 ± 2% numa-vmstat.node0.numa_local
5529 ± 11% -25.2% 4134 ± 10% numa-vmstat.node1.nr_active_anon
6238 ± 10% -24.9% 4684 ± 9% numa-vmstat.node1.nr_shmem
5529 ± 11% -25.2% 4134 ± 10% numa-vmstat.node1.nr_zone_active_anon
618919 ± 3% -10.4% 554861 ± 4% numa-vmstat.node1.numa_local
0.30 ± 12% -57.3% 0.13 ± 16% sched_debug.cfs_rq:/.h_nr_running.avg
0.42 ± 3% -24.8% 0.32 ± 7% sched_debug.cfs_rq:/.h_nr_running.stddev
17954 ± 13% -38.2% 11093 ± 20% sched_debug.cfs_rq:/.load.avg
59044 ± 2% +55.1% 91605 ± 3% sched_debug.cfs_rq:/.load.max
35.12 ± 14% -38.9% 21.45 ± 12% sched_debug.cfs_rq:/.load_avg.avg
425478 ± 9% -58.9% 175030 ± 14% sched_debug.cfs_rq:/.min_vruntime.avg
1451058 ± 13% -68.9% 451040 ± 3% sched_debug.cfs_rq:/.min_vruntime.max
511538 ± 10% -66.3% 172194 ± 7% sched_debug.cfs_rq:/.min_vruntime.stddev
0.30 ± 12% -57.3% 0.13 ± 16% sched_debug.cfs_rq:/.nr_running.avg
0.42 ± 3% -25.0% 0.32 ± 7% sched_debug.cfs_rq:/.nr_running.stddev
316.29 ± 10% -51.8% 152.57 ± 11% sched_debug.cfs_rq:/.runnable_avg.avg
1011 ± 2% -26.9% 739.47 ± 4% sched_debug.cfs_rq:/.runnable_avg.max
396.72 ± 2% -38.9% 242.26 ± 4% sched_debug.cfs_rq:/.runnable_avg.stddev
-550745 -65.2% -191612 sched_debug.cfs_rq:/.spread0.avg
474857 ± 58% -82.2% 84412 ± 28% sched_debug.cfs_rq:/.spread0.max
-956414 -63.9% -345608 sched_debug.cfs_rq:/.spread0.min
511547 ± 10% -66.3% 172197 ± 7% sched_debug.cfs_rq:/.spread0.stddev
316.22 ± 10% -51.8% 152.49 ± 11% sched_debug.cfs_rq:/.util_avg.avg
1010 ± 2% -26.9% 739.42 ± 4% sched_debug.cfs_rq:/.util_avg.max
396.65 ± 2% -38.9% 242.22 ± 4% sched_debug.cfs_rq:/.util_avg.stddev
237.99 ± 14% -75.7% 57.81 ± 16% sched_debug.cfs_rq:/.util_est_enqueued.avg
962.81 ± 2% -27.2% 701.03 ± 2% sched_debug.cfs_rq:/.util_est_enqueued.max
359.62 ± 5% -54.3% 164.36 ± 7% sched_debug.cfs_rq:/.util_est_enqueued.stddev
242264 ± 6% -51.7% 116978 ± 7% sched_debug.cpu.avg_idle.stddev
1801 ± 6% +58.5% 2855 ± 3% sched_debug.cpu.clock_task.stddev
711.81 ± 4% -35.4% 459.59 ± 7% sched_debug.cpu.curr->pid.avg
1909 -16.7% 1589 ± 4% sched_debug.cpu.curr->pid.stddev
0.13 ± 4% -35.4% 0.08 ± 6% sched_debug.cpu.nr_running.avg
0.33 ± 2% -20.1% 0.26 ± 3% sched_debug.cpu.nr_running.stddev
13910 ± 6% -36.7% 8800 sched_debug.cpu.nr_switches.avg
18507 ± 12% -40.5% 11004 ± 14% sched_debug.cpu.nr_switches.stddev
4.30 ± 13% +95.0% 8.38 ± 39% perf-stat.i.MPKI
3.88e+09 ± 5% -32.4% 2.621e+09 perf-stat.i.branch-instructions
0.07 ± 8% +0.3 0.38 ± 81% perf-stat.i.branch-miss-rate%
2747085 ± 10% +269.7% 10156358 ± 79% perf-stat.i.branch-misses
8.60 ± 12% -3.3 5.34 ± 20% perf-stat.i.cache-miss-rate%
6857924 ± 5% -23.2% 5265295 ± 16% perf-stat.i.cache-misses
10324 ± 8% -46.2% 5552 perf-stat.i.context-switches
5.216e+10 -32.0% 3.545e+10 ± 4% perf-stat.i.cpu-cycles
139.62 +48.4% 207.16 ± 4% perf-stat.i.cpu-migrations
5.128e+09 ± 5% -31.6% 3.508e+09 perf-stat.i.dTLB-loads
7.55 -1.3 6.25 perf-stat.i.dTLB-store-miss-rate%
2.287e+08 ± 5% -44.8% 1.262e+08 ± 2% perf-stat.i.dTLB-store-misses
2.798e+09 ± 5% -32.4% 1.893e+09 perf-stat.i.dTLB-stores
1.876e+10 ± 5% -32.4% 1.269e+10 perf-stat.i.instructions
0.41 -32.0% 0.28 ± 4% perf-stat.i.metric.GHz
94.02 ± 5% -32.5% 63.48 ± 2% perf-stat.i.metric.M/sec
6207930 ± 5% -44.7% 3430475 ± 2% perf-stat.i.minor-faults
55974 ± 8% +39.7% 78180 ± 8% perf-stat.i.node-load-misses
6339958 ± 5% -42.7% 3633731 ± 3% perf-stat.i.node-stores
6207930 ± 5% -44.7% 3430475 ± 2% perf-stat.i.page-faults
4.30 ± 13% +94.4% 8.35 ± 38% perf-stat.overall.MPKI
0.07 ± 7% +0.3 0.39 ± 79% perf-stat.overall.branch-miss-rate%
8.65 ± 12% -3.3 5.39 ± 20% perf-stat.overall.cache-miss-rate%
7.55 -1.3 6.25 perf-stat.overall.dTLB-store-miss-rate%
0.27 ± 37% +0.4 0.66 ± 31% perf-stat.overall.node-store-miss-rate%
910167 +22.4% 1114007 perf-stat.overall.path-length
3.867e+09 ± 5% -32.5% 2.612e+09 perf-stat.ps.branch-instructions
2739799 ± 9% +269.3% 10116762 ± 80% perf-stat.ps.branch-misses
6834912 ± 5% -23.2% 5246515 ± 16% perf-stat.ps.cache-misses
10291 ± 8% -46.2% 5533 perf-stat.ps.context-switches
5.198e+10 -32.0% 3.534e+10 ± 4% perf-stat.ps.cpu-cycles
139.18 +48.4% 206.52 ± 4% perf-stat.ps.cpu-migrations
5.111e+09 ± 5% -31.6% 3.496e+09 perf-stat.ps.dTLB-loads
2.279e+08 ± 5% -44.8% 1.258e+08 ± 2% perf-stat.ps.dTLB-store-misses
2.789e+09 ± 5% -32.4% 1.887e+09 perf-stat.ps.dTLB-stores
1.87e+10 ± 5% -32.4% 1.264e+10 perf-stat.ps.instructions
6187409 ± 5% -44.7% 3418985 ± 2% perf-stat.ps.minor-faults
55825 ± 8% +39.6% 77936 ± 8% perf-stat.ps.node-load-misses
6318444 ± 5% -42.7% 3620465 ± 3% perf-stat.ps.node-stores
6187409 ± 5% -44.7% 3418985 ± 2% perf-stat.ps.page-faults
5.662e+12 ± 5% -32.4% 3.83e+12 perf-stat.total.instructions
92.72 -14.8 77.93 ± 3% perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
69.44 ± 2% -11.5 57.91 ± 4% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
82.34 -11.4 70.95 ± 3% perf-profile.calltrace.cycles-pp.testcase
69.74 ± 2% -10.5 59.24 ± 4% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
26.90 -6.0 20.87 ± 5% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
20.44 ± 4% -5.8 14.62 ± 5% perf-profile.calltrace.cycles-pp.up_read.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
18.29 ± 3% -5.0 13.33 ± 4% perf-profile.calltrace.cycles-pp.down_read_trylock.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
27.60 ± 2% -4.9 22.73 ± 3% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
4.19 ± 5% -0.7 3.50 ± 5% perf-profile.calltrace.cycles-pp.error_entry.testcase
3.93 ± 5% -0.6 3.28 ± 5% perf-profile.calltrace.cycles-pp.sync_regs.error_entry.testcase
1.15 ± 8% +0.3 1.42 ± 5% perf-profile.calltrace.cycles-pp.do_set_pte.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.68 ± 7% +0.3 1.96 ± 5% perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
4.58 ± 5% +0.5 5.05 ± 4% perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
0.00 +0.5 0.54 ± 4% perf-profile.calltrace.cycles-pp.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
0.09 ±223% +0.6 0.65 ± 4% perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.20 ±141% +0.7 0.88 ± 40% perf-profile.calltrace.cycles-pp.menu_select.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
0.00 +0.7 0.72 ± 7% perf-profile.calltrace.cycles-pp.page_remove_rmap.tlb_flush_rmaps.zap_pte_range.zap_pmd_range.unmap_page_range
0.00 +0.8 0.79 ± 6% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.testcase
0.00 +0.8 0.79 ± 5% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.testcase
0.00 +0.8 0.80 ± 6% perf-profile.calltrace.cycles-pp.tlb_flush_rmaps.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
0.00 +0.8 0.82 ± 6% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.testcase
0.81 ± 20% +0.9 1.70 ± 43% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
0.82 ± 20% +0.9 1.74 ± 44% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
0.10 ±223% +1.0 1.13 ± 60% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.00 +1.0 1.04 ± 5% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.testcase
1.28 ± 20% +1.4 2.68 ± 41% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
1.55 ± 19% +1.6 3.14 ± 37% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
0.00 +1.6 1.63 ± 7% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.down_read_trylock
0.00 +1.6 1.64 ± 7% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.down_read_trylock.do_user_addr_fault
0.00 +1.7 1.72 ± 7% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.down_read_trylock.do_user_addr_fault.exc_page_fault
0.00 +1.8 1.76 ± 44% perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range
0.00 +1.9 1.85 ± 43% perf-profile.calltrace.cycles-pp.flush_tlb_func.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range
0.00 +2.1 2.08 ± 7% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.down_read_trylock.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
0.00 +2.1 2.09 ± 8% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.up_read
0.00 +2.1 2.09 ± 8% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.up_read.do_user_addr_fault
0.00 +2.1 2.15 ± 6% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.__handle_mm_fault
0.00 +2.2 2.16 ± 6% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.__handle_mm_fault.handle_mm_fault
0.00 +2.2 2.19 ± 8% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.up_read.do_user_addr_fault.exc_page_fault
0.00 +2.3 2.25 ± 6% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.00 +2.5 2.53 ± 4% perf-profile.calltrace.cycles-pp.__default_send_IPI_dest_field.default_send_IPI_mask_sequence_phys.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range
0.00 +2.6 2.59 ± 4% perf-profile.calltrace.cycles-pp.default_send_IPI_mask_sequence_phys.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range
0.00 +2.6 2.63 ± 8% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.up_read.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
0.00 +2.7 2.70 ± 4% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.do_user_addr_fault
0.00 +2.7 2.71 ± 4% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.do_user_addr_fault.exc_page_fault
0.00 +2.7 2.72 ± 6% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
0.00 +2.8 2.84 ± 4% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
4.34 ± 11% +3.0 7.38 ± 16% perf-profile.calltrace.cycles-pp.mwait_idle_with_hints.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
4.36 ± 11% +3.1 7.42 ± 16% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
6.20 ± 12% +4.9 11.08 ± 22% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
6.43 ± 12% +5.0 11.48 ± 21% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
0.00 +5.2 5.18 ± 3% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
6.94 ± 13% +5.6 12.54 ± 23% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
7.04 ± 13% +5.7 12.75 ± 23% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
7.05 ± 13% +5.7 12.77 ± 23% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
7.05 ± 13% +5.7 12.77 ± 23% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
0.00 +5.8 5.77 ± 17% perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range.zap_pmd_range
7.10 ± 13% +5.8 12.88 ± 23% perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
0.00 +5.8 5.81 ± 17% perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range.zap_pmd_range.unmap_page_range
0.00 +6.8 6.82 ± 4% perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
2.05 ± 5% +7.1 9.11 ± 3% perf-profile.calltrace.cycles-pp.__munmap
2.04 ± 5% +7.1 9.11 ± 3% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
2.04 ± 5% +7.1 9.11 ± 3% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
2.04 ± 5% +7.1 9.11 ± 3% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
2.04 ± 5% +7.1 9.11 ± 3% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
2.01 ± 5% +7.1 9.08 ± 3% perf-profile.calltrace.cycles-pp.do_mas_align_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.99 ± 5% +7.1 9.06 ± 3% perf-profile.calltrace.cycles-pp.unmap_region.do_mas_align_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
1.96 ± 5% +7.1 9.04 ± 3% perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_mas_align_munmap
1.96 ± 5% +7.1 9.04 ± 3% perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_mas_align_munmap.__vm_munmap.__x64_sys_munmap
1.96 ± 5% +7.1 9.04 ± 3% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_mas_align_munmap.__vm_munmap
1.89 ± 5% +7.1 8.99 ± 3% perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
0.00 +7.3 7.30 ± 4% perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function
0.00 +7.9 7.92 ± 4% perf-profile.calltrace.cycles-pp.flush_tlb_func.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function
90.30 -12.9 77.37 ± 3% perf-profile.children.cycles-pp.testcase
81.56 -12.7 68.87 ± 3% perf-profile.children.cycles-pp.asm_exc_page_fault
69.82 ± 2% -10.5 59.30 ± 4% perf-profile.children.cycles-pp.exc_page_fault
69.67 ± 2% -10.5 59.18 ± 4% perf-profile.children.cycles-pp.do_user_addr_fault
26.76 -5.6 21.20 ± 4% perf-profile.children.cycles-pp.__handle_mm_fault
20.49 ± 4% -5.5 15.03 ± 5% perf-profile.children.cycles-pp.up_read
28.06 -4.7 23.34 ± 4% perf-profile.children.cycles-pp.handle_mm_fault
18.33 ± 3% -4.7 13.66 ± 4% perf-profile.children.cycles-pp.down_read_trylock
3.94 ± 5% -0.6 3.30 ± 5% perf-profile.children.cycles-pp.sync_regs
4.36 ± 5% -0.6 3.77 ± 5% perf-profile.children.cycles-pp.error_entry
0.14 ± 12% -0.0 0.10 ± 16% perf-profile.children.cycles-pp.rwsem_down_read_slowpath
0.14 ± 7% -0.0 0.11 ± 9% perf-profile.children.cycles-pp.folio_memcg_lock
0.07 ± 8% -0.0 0.04 ± 45% perf-profile.children.cycles-pp.__tlb_remove_page_size
0.17 ± 6% -0.0 0.14 ± 9% perf-profile.children.cycles-pp.__irqentry_text_end
0.07 ± 6% -0.0 0.05 ± 8% perf-profile.children.cycles-pp.noop_dirty_folio
0.08 ± 6% -0.0 0.06 ± 11% perf-profile.children.cycles-pp.perf_callchain_user
0.05 ± 7% +0.0 0.09 ± 7% perf-profile.children.cycles-pp.free_pages_and_swap_cache
0.07 ± 20% +0.0 0.12 ± 22% perf-profile.children.cycles-pp.arch_scale_freq_tick
0.21 ± 5% +0.0 0.26 ± 7% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
0.10 ± 14% +0.1 0.16 ± 17% perf-profile.children.cycles-pp.read_tsc
0.04 ± 45% +0.1 0.10 ± 52% perf-profile.children.cycles-pp.update_rq_clock
0.05 ± 46% +0.1 0.11 ± 18% perf-profile.children.cycles-pp.start_kernel
0.05 ± 46% +0.1 0.11 ± 18% perf-profile.children.cycles-pp.arch_call_rest_init
0.05 ± 46% +0.1 0.11 ± 18% perf-profile.children.cycles-pp.rest_init
0.12 ± 11% +0.1 0.18 ± 11% perf-profile.children.cycles-pp.lapic_next_deadline
0.00 +0.1 0.06 ± 17% perf-profile.children.cycles-pp.restore_regs_and_return_to_kernel
0.06 ± 11% +0.1 0.12 ± 22% perf-profile.children.cycles-pp.find_busiest_group
0.04 ± 47% +0.1 0.11 ± 32% perf-profile.children.cycles-pp.get_next_timer_interrupt
0.05 ± 45% +0.1 0.11 ± 26% perf-profile.children.cycles-pp.update_sd_lb_stats
0.04 ± 44% +0.1 0.11 ± 29% perf-profile.children.cycles-pp.irqentry_enter
0.02 ±141% +0.1 0.08 ± 42% perf-profile.children.cycles-pp.hrtimer_next_event_without
0.46 ± 5% +0.1 0.52 ± 4% perf-profile.children.cycles-pp.__might_resched
0.01 ±223% +0.1 0.08 ± 23% perf-profile.children.cycles-pp.update_sg_lb_stats
0.00 +0.1 0.07 ± 12% perf-profile.children.cycles-pp.idle_cpu
0.20 ± 4% +0.1 0.27 ± 6% perf-profile.children.cycles-pp.__cond_resched
0.12 ± 9% +0.1 0.20 ± 6% perf-profile.children.cycles-pp.__mod_node_page_state
0.02 ± 99% +0.1 0.11 ± 19% perf-profile.children.cycles-pp.ret_from_fork
0.02 ± 99% +0.1 0.11 ± 19% perf-profile.children.cycles-pp.kthread
0.08 ± 11% +0.1 0.17 ± 28% perf-profile.children.cycles-pp.load_balance
0.35 ± 7% +0.1 0.44 ± 4% perf-profile.children.cycles-pp._raw_spin_lock
0.00 +0.1 0.10 ± 35% perf-profile.children.cycles-pp.update_blocked_averages
0.12 ± 15% +0.1 0.22 ± 25% perf-profile.children.cycles-pp.rebalance_domains
0.00 +0.1 0.10 ± 39% perf-profile.children.cycles-pp.run_rebalance_domains
0.21 ± 5% +0.1 0.31 ± 6% perf-profile.children.cycles-pp.__mod_lruvec_state
0.12 ± 10% +0.1 0.22 ± 30% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
0.69 ± 5% +0.1 0.79 ± 6% perf-profile.children.cycles-pp.page_remove_rmap
0.32 ± 6% +0.1 0.44 ± 3% perf-profile.children.cycles-pp.tlb_batch_pages_flush
0.12 ± 60% +0.1 0.25 ± 42% perf-profile.children.cycles-pp.irq_enter_rcu
0.26 ± 32% +0.1 0.40 ± 25% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.65 ± 6% +0.2 0.82 ± 4% perf-profile.children.cycles-pp.___perf_sw_event
0.56 ± 4% +0.2 0.73 ± 6% perf-profile.children.cycles-pp.__mod_lruvec_page_state
0.36 ± 5% +0.2 0.54 ± 30% perf-profile.children.cycles-pp.scheduler_tick
0.00 +0.2 0.19 ± 28% perf-profile.children.cycles-pp._find_next_bit
0.00 +0.2 0.19 ± 6% perf-profile.children.cycles-pp.irq_exit_rcu
0.20 ± 14% +0.2 0.40 ± 29% perf-profile.children.cycles-pp.__softirqentry_text_start
0.00 +0.2 0.22 ± 5% perf-profile.children.cycles-pp.error_return
0.00 +0.2 0.24 ± 9% perf-profile.children.cycles-pp.llist_add_batch
0.22 ± 30% +0.3 0.48 ± 9% perf-profile.children.cycles-pp.percpu_counter_add_batch
1.19 ± 8% +0.3 1.46 ± 5% perf-profile.children.cycles-pp.do_set_pte
0.12 ± 12% +0.3 0.40 ± 12% perf-profile.children.cycles-pp.native_sched_clock
1.72 ± 7% +0.3 2.01 ± 5% perf-profile.children.cycles-pp.finish_fault
0.14 ± 10% +0.3 0.49 ± 10% perf-profile.children.cycles-pp.sched_clock_cpu
0.44 ± 8% +0.4 0.81 ± 46% perf-profile.children.cycles-pp.update_process_times
0.09 ± 10% +0.4 0.47 ± 6% perf-profile.children.cycles-pp.irqtime_account_irq
0.86 ± 6% +0.4 1.26 ± 4% perf-profile.children.cycles-pp.__perf_sw_event
0.45 ± 8% +0.4 0.87 ± 51% perf-profile.children.cycles-pp.tick_sched_handle
0.52 ± 13% +0.5 0.97 ± 46% perf-profile.children.cycles-pp.tick_sched_timer
4.65 ± 5% +0.5 5.11 ± 4% perf-profile.children.cycles-pp.do_fault
0.43 ± 29% +0.5 0.90 ± 40% perf-profile.children.cycles-pp.menu_select
0.28 ± 11% +0.5 0.80 ± 18% perf-profile.children.cycles-pp.__irq_exit_rcu
3.58 ± 6% +0.6 4.22 ± 7% perf-profile.children.cycles-pp.native_irq_return_iret
0.02 ± 99% +0.7 0.72 ± 3% perf-profile.children.cycles-pp.llist_reverse_order
0.72 ± 11% +0.7 1.43 ± 48% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.00 +0.8 0.82 ± 6% perf-profile.children.cycles-pp.tlb_flush_rmaps
1.18 ± 18% +0.9 2.06 ± 35% perf-profile.children.cycles-pp.hrtimer_interrupt
1.18 ± 18% +0.9 2.10 ± 36% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
1.70 ± 18% +1.4 3.09 ± 36% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
2.05 ± 17% +1.6 3.62 ± 31% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.16 ± 14% +2.4 2.54 ± 4% perf-profile.children.cycles-pp.__default_send_IPI_dest_field
0.16 ± 14% +2.4 2.60 ± 4% perf-profile.children.cycles-pp.default_send_IPI_mask_sequence_phys
4.42 ± 10% +3.1 7.50 ± 15% perf-profile.children.cycles-pp.mwait_idle_with_hints
4.40 ± 10% +3.1 7.50 ± 16% perf-profile.children.cycles-pp.intel_idle
6.47 ± 12% +5.1 11.57 ± 21% perf-profile.children.cycles-pp.cpuidle_enter_state
6.48 ± 12% +5.1 11.58 ± 21% perf-profile.children.cycles-pp.cpuidle_enter
0.26 ± 11% +5.5 5.81 ± 17% perf-profile.children.cycles-pp.smp_call_function_many_cond
0.26 ± 11% +5.6 5.82 ± 17% perf-profile.children.cycles-pp.on_each_cpu_cond_mask
6.99 ± 13% +5.7 12.65 ± 23% perf-profile.children.cycles-pp.cpuidle_idle_call
7.05 ± 13% +5.7 12.77 ± 23% perf-profile.children.cycles-pp.start_secondary
7.10 ± 13% +5.8 12.88 ± 23% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
7.10 ± 13% +5.8 12.88 ± 23% perf-profile.children.cycles-pp.cpu_startup_entry
7.10 ± 13% +5.8 12.88 ± 23% perf-profile.children.cycles-pp.do_idle
0.27 ± 11% +6.6 6.83 ± 4% perf-profile.children.cycles-pp.flush_tlb_mm_range
2.05 ± 5% +7.1 9.11 ± 3% perf-profile.children.cycles-pp.__munmap
2.05 ± 5% +7.1 9.11 ± 3% perf-profile.children.cycles-pp.__vm_munmap
2.05 ± 5% +7.1 9.11 ± 3% perf-profile.children.cycles-pp.__x64_sys_munmap
2.02 ± 5% +7.1 9.08 ± 3% perf-profile.children.cycles-pp.do_mas_align_munmap
1.99 ± 5% +7.1 9.06 ± 3% perf-profile.children.cycles-pp.unmap_region
1.97 ± 5% +7.1 9.05 ± 3% perf-profile.children.cycles-pp.unmap_vmas
1.96 ± 5% +7.1 9.05 ± 3% perf-profile.children.cycles-pp.unmap_page_range
1.96 ± 5% +7.1 9.05 ± 3% perf-profile.children.cycles-pp.zap_pmd_range
1.96 ± 5% +7.1 9.05 ± 3% perf-profile.children.cycles-pp.zap_pte_range
2.26 ± 5% +7.1 9.37 ± 3% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
2.26 ± 5% +7.1 9.37 ± 3% perf-profile.children.cycles-pp.do_syscall_64
0.20 ± 11% +10.6 10.77 ± 4% perf-profile.children.cycles-pp.__flush_smp_call_function_queue
0.20 ± 11% +10.6 10.79 ± 4% perf-profile.children.cycles-pp.__sysvec_call_function
0.00 +11.1 11.09 ± 3% perf-profile.children.cycles-pp.native_flush_tlb_one_user
0.23 ± 11% +11.3 11.49 ± 4% perf-profile.children.cycles-pp.sysvec_call_function
0.10 ± 13% +11.8 11.89 ± 3% perf-profile.children.cycles-pp.flush_tlb_func
0.43 ± 12% +14.0 14.47 ± 4% perf-profile.children.cycles-pp.asm_sysvec_call_function
21.36 ± 4% -8.6 12.74 ± 5% perf-profile.self.cycles-pp.__handle_mm_fault
20.32 ± 4% -7.9 12.42 ± 5% perf-profile.self.cycles-pp.up_read
18.17 ± 3% -6.6 11.61 ± 4% perf-profile.self.cycles-pp.down_read_trylock
12.27 ± 4% -2.1 10.17 ± 5% perf-profile.self.cycles-pp.testcase
3.88 ± 5% -0.6 3.24 ± 5% perf-profile.self.cycles-pp.sync_regs
1.01 ± 11% -0.2 0.80 ± 7% perf-profile.self.cycles-pp.mt_find
0.65 ± 4% -0.1 0.56 ± 4% perf-profile.self.cycles-pp.__filemap_get_folio
0.30 ± 6% -0.1 0.24 ± 4% perf-profile.self.cycles-pp.page_add_file_rmap
0.29 ± 3% -0.0 0.24 ± 3% perf-profile.self.cycles-pp.asm_exc_page_fault
0.24 ± 5% -0.0 0.20 ± 6% perf-profile.self.cycles-pp.do_set_pte
0.07 ± 5% -0.0 0.04 ± 71% perf-profile.self.cycles-pp.lock_page_memcg
0.27 ± 2% -0.0 0.24 ± 5% perf-profile.self.cycles-pp.xas_load
0.15 ± 4% -0.0 0.14 ± 6% perf-profile.self.cycles-pp.handle_pte_fault
0.04 ± 45% +0.0 0.07 ± 11% perf-profile.self.cycles-pp.asm_sysvec_apic_timer_interrupt
0.06 ± 8% +0.0 0.08 ± 8% perf-profile.self.cycles-pp.find_vma
0.13 ± 9% +0.0 0.16 ± 6% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.13 ± 5% +0.0 0.16 ± 9% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
0.09 ± 7% +0.0 0.12 ± 11% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.12 ± 4% +0.0 0.16 ± 5% perf-profile.self.cycles-pp.__cond_resched
0.07 ± 20% +0.0 0.12 ± 22% perf-profile.self.cycles-pp.arch_scale_freq_tick
0.06 ± 13% +0.0 0.10 ± 28% perf-profile.self.cycles-pp.cpuidle_idle_call
0.10 ± 7% +0.1 0.16 ± 6% perf-profile.self.cycles-pp.__mod_node_page_state
0.10 ± 14% +0.1 0.16 ± 17% perf-profile.self.cycles-pp.read_tsc
0.12 ± 11% +0.1 0.18 ± 11% perf-profile.self.cycles-pp.lapic_next_deadline
0.00 +0.1 0.06 ± 21% perf-profile.self.cycles-pp.irqentry_enter
0.00 +0.1 0.07 ± 10% perf-profile.self.cycles-pp.idle_cpu
0.00 +0.1 0.08 perf-profile.self.cycles-pp.error_return
0.00 +0.1 0.09 ± 7% perf-profile.self.cycles-pp.flush_tlb_mm_range
0.00 +0.1 0.13 ± 18% perf-profile.self.cycles-pp.irqtime_account_irq
0.23 ± 6% +0.1 0.38 ± 3% perf-profile.self.cycles-pp.__perf_sw_event
0.00 +0.2 0.16 ± 31% perf-profile.self.cycles-pp._find_next_bit
0.44 ± 4% +0.2 0.61 ± 5% perf-profile.self.cycles-pp.zap_pte_range
0.19 ± 33% +0.2 0.39 ± 8% perf-profile.self.cycles-pp.percpu_counter_add_batch
0.00 +0.2 0.21 ± 6% perf-profile.self.cycles-pp.asm_sysvec_call_function
0.00 +0.2 0.24 ± 8% perf-profile.self.cycles-pp.llist_add_batch
0.00 +0.2 0.24 ± 7% perf-profile.self.cycles-pp.sysvec_call_function
0.15 ± 35% +0.3 0.41 ± 48% perf-profile.self.cycles-pp.menu_select
0.37 ± 14% +0.3 0.64 ± 20% perf-profile.self.cycles-pp.cpuidle_enter_state
0.11 ± 12% +0.3 0.38 ± 10% perf-profile.self.cycles-pp.native_sched_clock
3.58 ± 6% +0.6 4.21 ± 7% perf-profile.self.cycles-pp.native_irq_return_iret
0.02 ± 99% +0.7 0.72 ± 3% perf-profile.self.cycles-pp.llist_reverse_order
0.00 +0.8 0.84 ± 3% perf-profile.self.cycles-pp.flush_tlb_func
0.07 ± 12% +0.8 0.92 ± 6% perf-profile.self.cycles-pp.smp_call_function_many_cond
0.06 ± 13% +0.9 0.93 ± 4% perf-profile.self.cycles-pp.__flush_smp_call_function_queue
0.61 ± 6% +1.0 1.60 ± 3% perf-profile.self.cycles-pp.do_user_addr_fault
0.16 ± 14% +2.4 2.54 ± 4% perf-profile.self.cycles-pp.__default_send_IPI_dest_field
4.40 ± 10% +3.1 7.48 ± 15% perf-profile.self.cycles-pp.mwait_idle_with_hints
0.00 +11.0 11.04 ± 3% perf-profile.self.cycles-pp.native_flush_tlb_one_user

The fix patch is under testing. We will send the results once the test is
done.

Best Regards,
Yujie

> Because if it's a one-off, it's probably best ignored. If it shows up
> elsewhere, I think that batching logic might need looking at.
>
> Linus