From: Yang Shi <shy828301@gmail.com>
To: kernel test robot <oliver.sang@intel.com>
Cc: Rik van Riel <riel@surriel.com>,
oe-lkp@lists.linux.dev, lkp@intel.com,
Linux Memory Management List <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <willy@infradead.org>,
Christopher Lameter <cl@linux.com>,
ying.huang@intel.com, feng.tang@intel.com,
fengwei.yin@intel.com
Subject: Re: [linux-next:master] [mm] 1111d46b5c: stress-ng.pthread.ops_per_sec -84.3% regression
Date: Tue, 19 Dec 2023 21:27:07 -0800 [thread overview]
Message-ID: <CAHbLzkogaL-VTuZbBbPp=O8TPZxJmabJLRx1hrD-65rtbRmTtQ@mail.gmail.com> (raw)
In-Reply-To: <202312192310.56367035-oliver.sang@intel.com>

On Tue, Dec 19, 2023 at 7:41 AM kernel test robot <oliver.sang@intel.com> wrote:
>
>
>
> Hello,
>
> for this commit, we reported
> "[mm] 96db82a66d: will-it-scale.per_process_ops -95.3% regression"
> in Aug 2022 when it was in linux-next/master
> https://lore.kernel.org/all/YwIoiIYo4qsYBcgd@xsang-OptiPlex-9020/
>
> later, we reported
> "[mm] f35b5d7d67: will-it-scale.per_process_ops -95.5% regression"
> in Oct 2022 when it was in linus/master
> https://lore.kernel.org/all/202210181535.7144dd15-yujie.liu@intel.com/
>
> and the commit was reverted finally by
> commit 0ba09b1733878afe838fe35c310715fda3d46428
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Sun Dec 4 12:51:59 2022 -0800
>
> now we noticed it goes into linux-next/master again.
>
> we are not sure if there is an agreement that the benefit of this commit
> already outweighs the performance drop in some micro benchmarks.
>
> we also noticed from https://lore.kernel.org/all/20231214223423.1133074-1-yang@os.amperecomputing.com/
> that
> "This patch was applied to v6.1, but was reverted due to a regression
> report. However it turned out the regression was not due to this patch.
> I ping'ed Andrew to reapply this patch, Andrew may forget it. This
> patch helps promote THP, so I rebased it onto the latest mm-unstable."
IIRC, Huang Ying's analysis showed the regression in the will-it-scale
micro benchmark was fine; the commit was actually reverted due to a
kernel build regression with LLVM reported by Nathan Chancellor. That
regression was then resolved by commit
81e506bec9be1eceaf5a2c654e28ba5176ef48d8 ("mm/thp: check and bail out
if page in deferred queue already"). And this patch did improve the
kernel build with GCC by ~3%, if I remember correctly.
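
For reference, the gist of that fix, from memory (a simplified sketch,
not the verbatim commit), was an early bail-out in
deferred_split_huge_page() before taking the split-queue lock:

void deferred_split_huge_page(struct page *page)
{
	struct deferred_split *ds_queue = get_deferred_split_queue(page);
	unsigned long flags;

	/*
	 * Bail out early if the page is already queued, so repeated
	 * partial unmaps of the same THP don't bounce split_queue_lock.
	 */
	if (!list_empty(page_deferred_list(page)))
		return;

	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
	if (list_empty(page_deferred_list(page))) {
		list_add_tail(page_deferred_list(page), &ds_queue->split_queue);
		ds_queue->split_queue_len++;
	}
	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
}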
>
> however, unfortunately, in our latest tests, we still observed the below regression
> upon this commit. Just FYI.
>
>
>
> kernel test robot noticed a -84.3% regression of stress-ng.pthread.ops_per_sec on:
Interesting, wasn't the same regression seen last time? And I'm a
little bit confused about how the pthread test regressed. I didn't see
the pthread benchmark do any intensive memory alloc/free operations. Do
the pthread APIs do any intensive memory operations? I saw that the
benchmark does allocate memory for thread stacks, but that should be
just 8K per thread, so it should not trigger what this patch does. With
1024 threads, the thread stacks may get merged into one single VMA (8M
total), but that merging can happen even without this patch applied.
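
To check that reasoning, a quick userspace probe (my own illustration,
not something from the report) shows which anonymous mapping sizes the
new logic would align to a 2M boundary; an 8K mapping should fall
through to the regular allocator, while the 8M case is where the
alignment (and the extra THP work) kicks in:

#include <stdio.h>
#include <sys/mman.h>

static void probe(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return;
	printf("%8zu KiB -> %p (2M-aligned: %s)\n", len >> 10, p,
	       ((unsigned long)p & ((2UL << 20) - 1)) ? "no" : "yes");
	munmap(p, len);
}

int main(void)
{
	probe(8UL << 10);	/* one 8K thread stack: below PMD size */
	probe(8UL << 20);	/* 1024 stacks' worth: THP-eligible */
	return 0;
}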
>
>
> commit: 1111d46b5cbad57486e7a3fab75888accac2f072 ("mm: align larger anonymous mappings on THP boundaries")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
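
For anyone not following the earlier threads: the change routes larger
anonymous mappings through thp_get_unmapped_area(). Roughly (a
simplified sketch of the logic, not the exact diff):

/* mm/mmap.c, simplified: anonymous mappings now use the THP-aware
 * address picker, which rounds the start of mappings that can hold
 * at least one PMD-sized (2M on x86-64) region to a 2M boundary. */
unsigned long get_unmapped_area(struct file *file, unsigned long addr,
				unsigned long len, unsigned long pgoff,
				unsigned long flags)
{
	unsigned long (*get_area)(struct file *, unsigned long,
				  unsigned long, unsigned long,
				  unsigned long);

	get_area = current->mm->get_unmapped_area;
	if (file && file->f_op->get_unmapped_area)
		get_area = file->f_op->get_unmapped_area;
	else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && !file)
		/* Ensures larger anonymous mappings are THP aligned. */
		get_area = thp_get_unmapped_area;

	return get_area(file, addr, len, pgoff, flags);
}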
>
> testcase: stress-ng
> test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
> parameters:
>
> nr_threads: 1
> disk: 1HDD
> testtime: 60s
> fs: ext4
> class: os
> test: pthread
> cpufreq_governor: performance
>
>
> In addition to that, the commit also has a significant impact on the following tests:
>
> +------------------+-----------------------------------------------------------------------------------------------+
> | testcase: change | stream: stream.triad_bandwidth_MBps -12.1% regression |
> | test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory |
> | test parameters | array_size=50000000 |
> | | cpufreq_governor=performance |
> | | iterations=10x |
> | | loop=100 |
> | | nr_threads=25% |
> | | omp=true |
> +------------------+-----------------------------------------------------------------------------------------------+
> | testcase: change | phoronix-test-suite: phoronix-test-suite.ramspeed.Average.Integer.mb_s -3.5% regression |
> | test machine | 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 16G memory |
> | test parameters | cpufreq_governor=performance |
> | | option_a=Average |
> | | option_b=Integer |
> | | test=ramspeed-1.4.3 |
> +------------------+-----------------------------------------------------------------------------------------------+
> | testcase: change | phoronix-test-suite: phoronix-test-suite.ramspeed.Average.FloatingPoint.mb_s -3.0% regression |
> | test machine | 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 16G memory |
> | test parameters | cpufreq_governor=performance |
> | | option_a=Average |
> | | option_b=Floating Point |
> | | test=ramspeed-1.4.3 |
> +------------------+-----------------------------------------------------------------------------------------------+
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add the following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202312192310.56367035-oliver.sang@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20231219/202312192310.56367035-oliver.sang@intel.com
>
> =========================================================================================
> class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> os/gcc-12/performance/1HDD/ext4/x86_64-rhel-8.3/1/debian-11.1-x86_64-20220510.cgz/lkp-csl-d02/pthread/stress-ng/60s
>
> commit:
> 30749e6fbb ("mm/memory: replace kmap() with kmap_local_page()")
> 1111d46b5c ("mm: align larger anonymous mappings on THP boundaries")
>
> 30749e6fbb3d391a 1111d46b5cbad57486e7a3fab75
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 13405796 -65.5% 4620124 cpuidle..usage
> 8.00 +8.2% 8.66 ± 2% iostat.cpu.system
> 1.61 -60.6% 0.63 iostat.cpu.user
> 597.50 ± 14% -64.3% 213.50 ± 14% perf-c2c.DRAM.local
> 1882 ± 14% -74.7% 476.83 ± 7% perf-c2c.HITM.local
> 3768436 -12.9% 3283395 vmstat.memory.cache
> 355105 -75.7% 86344 ± 3% vmstat.system.cs
> 385435 -20.7% 305714 ± 3% vmstat.system.in
> 1.13 -0.2 0.88 mpstat.cpu.all.irq%
> 0.29 -0.2 0.10 ± 2% mpstat.cpu.all.soft%
> 6.76 ± 2% +1.1 7.88 ± 2% mpstat.cpu.all.sys%
> 1.62 -1.0 0.62 ± 2% mpstat.cpu.all.usr%
> 2234397 -84.3% 350161 ± 5% stress-ng.pthread.ops
> 37237 -84.3% 5834 ± 5% stress-ng.pthread.ops_per_sec
> 294706 ± 2% -68.0% 94191 ± 6% stress-ng.time.involuntary_context_switches
> 41442 ± 2% +5023.4% 2123284 stress-ng.time.maximum_resident_set_size
> 4466457 -83.9% 717053 ± 5% stress-ng.time.minor_page_faults
The larger RSS and fewer page faults are expected.
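
A rough sanity check on those two numbers (my arithmetic, from the
table above): each 2M THP fault populates 512 base pages at once, so
both shifts point the same way:

	4466457 / 717053 ~= 6.2x fewer minor page faults
	2123284 / 41442  ~= 51x  larger maximum RSS (in KB)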
> 243.33 +13.5% 276.17 ± 3% stress-ng.time.percent_of_cpu_this_job_got
> 131.64 +27.7% 168.11 ± 3% stress-ng.time.system_time
> 19.73 -82.1% 3.53 ± 4% stress-ng.time.user_time
Much less user time, and it seems to match the drop in the pthread metric.
> 7715609 -80.2% 1530125 ± 4% stress-ng.time.voluntary_context_switches
> 494566 -59.5% 200338 ± 3% meminfo.Active
> 478287 -61.5% 184050 ± 3% meminfo.Active(anon)
> 58549 ± 17% +1532.8% 956006 ± 14% meminfo.AnonHugePages
> 424631 +194.9% 1252445 ± 10% meminfo.AnonPages
> 3677263 -13.0% 3197755 meminfo.Cached
> 5829485 ± 4% -19.0% 4724784 ± 10% meminfo.Committed_AS
> 692486 +108.6% 1444669 ± 8% meminfo.Inactive
> 662179 +113.6% 1414338 ± 9% meminfo.Inactive(anon)
> 182416 -50.2% 90759 meminfo.Mapped
> 4614466 +10.0% 5076604 ± 2% meminfo.Memused
> 6985 +47.6% 10307 ± 4% meminfo.PageTables
> 718445 -66.7% 238913 ± 3% meminfo.Shmem
> 35906 -20.7% 28471 ± 3% meminfo.VmallocUsed
> 4838522 +25.6% 6075302 meminfo.max_used_kB
> 488.83 -20.9% 386.67 ± 2% turbostat.Avg_MHz
> 12.95 -2.7 10.26 ± 2% turbostat.Busy%
> 7156734 -87.2% 919149 ± 4% turbostat.C1
> 10.59 -8.9 1.65 ± 5% turbostat.C1%
> 3702647 -55.1% 1663518 ± 2% turbostat.C1E
> 32.99 -20.6 12.36 ± 3% turbostat.C1E%
> 1161078 +64.5% 1909611 turbostat.C6
> 44.25 +31.8 76.10 turbostat.C6%
> 0.18 -33.3% 0.12 turbostat.IPC
> 74338573 ± 2% -33.9% 49159610 ± 4% turbostat.IRQ
> 1381661 -91.0% 124075 ± 6% turbostat.POLL
> 0.26 -0.2 0.04 ± 12% turbostat.POLL%
> 96.15 -5.4% 90.95 turbostat.PkgWatt
> 12.12 +19.3% 14.46 turbostat.RAMWatt
> 119573 -61.5% 46012 ± 3% proc-vmstat.nr_active_anon
> 106168 +195.8% 314047 ± 10% proc-vmstat.nr_anon_pages
> 28.60 ± 17% +1538.5% 468.68 ± 14% proc-vmstat.nr_anon_transparent_hugepages
> 923365 -13.0% 803489 proc-vmstat.nr_file_pages
> 165571 +113.5% 353493 ± 9% proc-vmstat.nr_inactive_anon
> 45605 -50.2% 22690 proc-vmstat.nr_mapped
> 1752 +47.1% 2578 ± 4% proc-vmstat.nr_page_table_pages
> 179613 -66.7% 59728 ± 3% proc-vmstat.nr_shmem
> 21490 -2.4% 20981 proc-vmstat.nr_slab_reclaimable
> 28260 -7.3% 26208 proc-vmstat.nr_slab_unreclaimable
> 119573 -61.5% 46012 ± 3% proc-vmstat.nr_zone_active_anon
> 165570 +113.5% 353492 ± 9% proc-vmstat.nr_zone_inactive_anon
> 17343640 -76.3% 4116748 ± 4% proc-vmstat.numa_hit
> 17364975 -76.3% 4118098 ± 4% proc-vmstat.numa_local
> 249252 -66.2% 84187 ± 2% proc-vmstat.pgactivate
> 27528916 +567.1% 1.836e+08 ± 5% proc-vmstat.pgalloc_normal
> 4912427 -79.2% 1019949 ± 3% proc-vmstat.pgfault
> 27227124 +574.1% 1.835e+08 ± 5% proc-vmstat.pgfree
> 8728 +3896.4% 348802 ± 5% proc-vmstat.thp_deferred_split_page
> 8730 +3895.3% 348814 ± 5% proc-vmstat.thp_fault_alloc
> 8728 +3896.4% 348802 ± 5% proc-vmstat.thp_split_pmd
> 316745 -21.5% 248756 ± 4% sched_debug.cfs_rq:/.avg_vruntime.avg
> 112735 ± 4% -34.3% 74061 ± 6% sched_debug.cfs_rq:/.avg_vruntime.min
> 0.49 ± 6% -17.2% 0.41 ± 8% sched_debug.cfs_rq:/.h_nr_running.stddev
> 12143 ±120% -99.9% 15.70 ±116% sched_debug.cfs_rq:/.left_vruntime.avg
> 414017 ±126% -99.9% 428.50 ±102% sched_debug.cfs_rq:/.left_vruntime.max
> 68492 ±125% -99.9% 78.15 ±106% sched_debug.cfs_rq:/.left_vruntime.stddev
> 41917 ± 24% -48.3% 21690 ± 57% sched_debug.cfs_rq:/.load.avg
> 176151 ± 30% -56.9% 75963 ± 57% sched_debug.cfs_rq:/.load.stddev
> 6489 ± 17% -29.0% 4608 ± 12% sched_debug.cfs_rq:/.load_avg.max
> 4.42 ± 45% -81.1% 0.83 ± 74% sched_debug.cfs_rq:/.load_avg.min
> 1112 ± 17% -31.0% 767.62 ± 11% sched_debug.cfs_rq:/.load_avg.stddev
> 316745 -21.5% 248756 ± 4% sched_debug.cfs_rq:/.min_vruntime.avg
> 112735 ± 4% -34.3% 74061 ± 6% sched_debug.cfs_rq:/.min_vruntime.min
> 0.49 ± 6% -17.2% 0.41 ± 8% sched_debug.cfs_rq:/.nr_running.stddev
> 12144 ±120% -99.9% 15.70 ±116% sched_debug.cfs_rq:/.right_vruntime.avg
> 414017 ±126% -99.9% 428.50 ±102% sched_debug.cfs_rq:/.right_vruntime.max
> 68492 ±125% -99.9% 78.15 ±106% sched_debug.cfs_rq:/.right_vruntime.stddev
> 14.25 ± 44% -76.6% 3.33 ± 58% sched_debug.cfs_rq:/.runnable_avg.min
> 11.58 ± 49% -77.7% 2.58 ± 58% sched_debug.cfs_rq:/.util_avg.min
> 423972 ± 23% +59.3% 675379 ± 3% sched_debug.cpu.avg_idle.avg
> 5720 ± 43% +439.5% 30864 sched_debug.cpu.avg_idle.min
> 99.79 ± 2% -23.7% 76.11 ± 2% sched_debug.cpu.clock_task.stddev
> 162475 ± 49% -95.8% 6813 ± 26% sched_debug.cpu.curr->pid.avg
> 1061268 -84.0% 170212 ± 4% sched_debug.cpu.curr->pid.max
> 365404 ± 20% -91.3% 31839 ± 10% sched_debug.cpu.curr->pid.stddev
> 0.51 ± 3% -20.1% 0.41 ± 9% sched_debug.cpu.nr_running.stddev
> 311923 -74.2% 80615 ± 2% sched_debug.cpu.nr_switches.avg
> 565973 ± 4% -77.8% 125597 ± 10% sched_debug.cpu.nr_switches.max
> 192666 ± 4% -70.6% 56695 ± 6% sched_debug.cpu.nr_switches.min
> 67485 ± 8% -79.9% 13558 ± 10% sched_debug.cpu.nr_switches.stddev
> 2.62 +102.1% 5.30 perf-stat.i.MPKI
> 2.09e+09 -47.6% 1.095e+09 ± 4% perf-stat.i.branch-instructions
> 1.56 -0.5 1.01 perf-stat.i.branch-miss-rate%
> 31951200 -60.9% 12481432 ± 2% perf-stat.i.branch-misses
> 19.38 +23.7 43.08 perf-stat.i.cache-miss-rate%
> 26413597 -5.7% 24899132 ± 4% perf-stat.i.cache-misses
> 1.363e+08 -58.3% 56906133 ± 4% perf-stat.i.cache-references
> 370628 -75.8% 89743 ± 3% perf-stat.i.context-switches
> 1.77 +65.1% 2.92 ± 2% perf-stat.i.cpi
> 1.748e+10 -21.8% 1.367e+10 ± 2% perf-stat.i.cpu-cycles
> 61611 -79.1% 12901 ± 6% perf-stat.i.cpu-migrations
> 716.97 ± 2% -17.2% 593.35 ± 2% perf-stat.i.cycles-between-cache-misses
> 0.12 ± 4% -0.1 0.05 perf-stat.i.dTLB-load-miss-rate%
> 3066100 ± 3% -81.3% 573066 ± 5% perf-stat.i.dTLB-load-misses
> 2.652e+09 -50.1% 1.324e+09 ± 4% perf-stat.i.dTLB-loads
> 0.08 ± 2% -0.0 0.03 perf-stat.i.dTLB-store-miss-rate%
> 1168195 ± 2% -82.9% 199438 ± 5% perf-stat.i.dTLB-store-misses
> 1.478e+09 -56.8% 6.384e+08 ± 3% perf-stat.i.dTLB-stores
> 8080423 -73.2% 2169371 ± 3% perf-stat.i.iTLB-load-misses
> 5601321 -74.3% 1440571 ± 2% perf-stat.i.iTLB-loads
> 1.028e+10 -49.7% 5.173e+09 ± 4% perf-stat.i.instructions
> 1450 +73.1% 2511 ± 2% perf-stat.i.instructions-per-iTLB-miss
> 0.61 -35.9% 0.39 perf-stat.i.ipc
> 0.48 -21.4% 0.38 ± 2% perf-stat.i.metric.GHz
> 616.28 -17.6% 507.69 ± 4% perf-stat.i.metric.K/sec
> 175.16 -50.8% 86.18 ± 4% perf-stat.i.metric.M/sec
> 76728 -80.8% 14724 ± 4% perf-stat.i.minor-faults
> 5600408 -61.4% 2160997 ± 5% perf-stat.i.node-loads
> 8873996 +52.1% 13499744 ± 5% perf-stat.i.node-stores
> 112409 -81.9% 20305 ± 4% perf-stat.i.page-faults
> 2.55 +89.6% 4.83 perf-stat.overall.MPKI
Many more TLB misses.
> 1.51 -0.4 1.13 perf-stat.overall.branch-miss-rate%
> 19.26 +24.5 43.71 perf-stat.overall.cache-miss-rate%
> 1.70 +56.4% 2.65 perf-stat.overall.cpi
> 665.84 -17.5% 549.51 ± 2% perf-stat.overall.cycles-between-cache-misses
> 0.12 ± 4% -0.1 0.04 perf-stat.overall.dTLB-load-miss-rate%
> 0.08 ± 2% -0.0 0.03 perf-stat.overall.dTLB-store-miss-rate%
> 59.16 +0.9 60.04 perf-stat.overall.iTLB-load-miss-rate%
> 1278 +86.1% 2379 ± 2% perf-stat.overall.instructions-per-iTLB-miss
> 0.59 -36.1% 0.38 perf-stat.overall.ipc
Worse IPC and CPI.
> 2.078e+09 -48.3% 1.074e+09 ± 4% perf-stat.ps.branch-instructions
> 31292687 -61.2% 12133349 ± 2% perf-stat.ps.branch-misses
> 26057291 -5.9% 24512034 ± 4% perf-stat.ps.cache-misses
> 1.353e+08 -58.6% 56072195 ± 4% perf-stat.ps.cache-references
> 365254 -75.8% 88464 ± 3% perf-stat.ps.context-switches
> 1.735e+10 -22.4% 1.346e+10 ± 2% perf-stat.ps.cpu-cycles
> 60838 -79.1% 12727 ± 6% perf-stat.ps.cpu-migrations
> 3056601 ± 4% -81.5% 565354 ± 4% perf-stat.ps.dTLB-load-misses
> 2.636e+09 -50.7% 1.3e+09 ± 4% perf-stat.ps.dTLB-loads
> 1155253 ± 2% -83.0% 196581 ± 5% perf-stat.ps.dTLB-store-misses
> 1.473e+09 -57.4% 6.268e+08 ± 3% perf-stat.ps.dTLB-stores
> 7997726 -73.3% 2131477 ± 3% perf-stat.ps.iTLB-load-misses
> 5521346 -74.3% 1418623 ± 2% perf-stat.ps.iTLB-loads
> 1.023e+10 -50.4% 5.073e+09 ± 4% perf-stat.ps.instructions
> 75671 -80.9% 14479 ± 4% perf-stat.ps.minor-faults
> 5549722 -61.4% 2141750 ± 4% perf-stat.ps.node-loads
> 8769156 +51.6% 13296579 ± 5% perf-stat.ps.node-stores
> 110795 -82.0% 19977 ± 4% perf-stat.ps.page-faults
> 6.482e+11 -50.7% 3.197e+11 ± 4% perf-stat.total.instructions
> 0.00 ± 37% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc_node.memcg_alloc_slab_cgroups.allocate_slab
> 0.01 ± 18% +8373.1% 0.73 ± 49% perf-sched.sch_delay.avg.ms.__cond_resched.down_read.do_madvise.__x64_sys_madvise.do_syscall_64
> 0.01 ± 16% +4600.0% 0.38 ± 24% perf-sched.sch_delay.avg.ms.__cond_resched.down_read.exit_mm.do_exit.__x64_sys_exit
More time spent in madvise and munmap, but I'm not sure whether this
is caused by tearing down the address space when exiting the test. If
so, it should not count toward the regression.
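
If it is not exit-time teardown, one plausible path (a hypothetical
illustration, not a confirmed reproducer) is freeing sub-2M pieces
inside a region that is now THP-backed: zapping less than a whole PMD
forces a huge-PMD split, which is the __split_huge_pmd/_raw_spin_lock
path that dominates the profile further down.

#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 8UL << 20;	/* 8M: gets THP-aligned with this patch */
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return 1;
	memset(p, 1, len);	/* fault the region in as 2M THPs */
	/*
	 * Zap one 8K "stack" inside the region: less than a PMD, so
	 * the kernel has to split the huge PMD before zapping, i.e.
	 * madvise -> zap_page_range_single -> __split_huge_pmd.
	 */
	madvise(p + (2UL << 20), 8192, MADV_DONTNEED);
	return 0;
}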
> 0.01 ±204% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.down_write.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
> 0.01 ± 8% +3678.9% 0.36 ± 79% perf-sched.sch_delay.avg.ms.__cond_resched.exit_signals.do_exit.__x64_sys_exit.do_syscall_64
> 0.01 ± 14% -38.5% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
> 0.01 ± 5% +2946.2% 0.26 ± 43% perf-sched.sch_delay.avg.ms.__cond_resched.mutex_lock.futex_exit_release.exit_mm_release.exit_mm
> 0.00 ± 14% +125.0% 0.01 ± 12% perf-sched.sch_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 0.02 ±170% -83.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.switch_task_namespaces.__do_sys_setns.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 ± 69% +6578.6% 0.31 ± 4% perf-sched.sch_delay.avg.ms.__cond_resched.tlb_batch_pages_flush.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior
> 0.00 +100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
> 0.02 ± 86% +4234.4% 0.65 ± 4% perf-sched.sch_delay.avg.ms.__cond_resched.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise
> 0.01 ± 6% +6054.3% 0.47 perf-sched.sch_delay.avg.ms.__cond_resched.zap_pmd_range.isra.0.unmap_page_range
> 0.00 ± 14% +195.2% 0.01 ± 89% perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 0.00 ±102% +340.0% 0.01 ± 85% perf-sched.sch_delay.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
> 0.00 +100.0% 0.00 perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
> 0.00 ± 11% +66.7% 0.01 ± 21% perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
> 0.01 ± 89% +1096.1% 0.15 ± 30% perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
> 0.00 +141.7% 0.01 ± 61% perf-sched.sch_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
> 0.00 ±223% +9975.0% 0.07 ±203% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select
> 0.00 ± 10% +789.3% 0.04 ± 69% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
> 0.00 ± 31% +6691.3% 0.26 ± 5% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.do_madvise
> 0.00 ± 28% +14612.5% 0.59 ± 4% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.exit_mm
> 0.00 ± 24% +4904.2% 0.20 ± 4% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
> 0.00 ± 28% +450.0% 0.01 ± 74% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
> 0.00 ± 17% +984.6% 0.02 ± 79% perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 0.00 ± 20% +231.8% 0.01 ± 89% perf-sched.sch_delay.avg.ms.schedule_timeout.io_schedule_timeout.__wait_for_common.submit_bio_wait
> 0.00 +350.0% 0.01 ± 16% perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 0.02 ± 16% +320.2% 0.07 ± 2% perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 0.02 ± 2% +282.1% 0.09 ± 5% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 0.00 ± 14% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc_node.memcg_alloc_slab_cgroups.allocate_slab
> 0.05 ± 35% +3784.5% 1.92 ± 16% perf-sched.sch_delay.max.ms.__cond_resched.down_read.do_madvise.__x64_sys_madvise.do_syscall_64
> 0.29 ±128% +563.3% 1.92 ± 7% perf-sched.sch_delay.max.ms.__cond_resched.down_read.exit_mm.do_exit.__x64_sys_exit
> 0.14 ±217% -99.7% 0.00 ±223% perf-sched.sch_delay.max.ms.__cond_resched.down_write.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
> 0.03 ± 49% -74.0% 0.01 ± 51% perf-sched.sch_delay.max.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
> 0.01 ± 54% -57.4% 0.00 ± 75% perf-sched.sch_delay.max.ms.__cond_resched.dput.__ns_get_path.ns_get_path.proc_ns_get_link
> 0.12 ± 21% +873.0% 1.19 ± 60% perf-sched.sch_delay.max.ms.__cond_resched.exit_signals.do_exit.__x64_sys_exit.do_syscall_64
> 2.27 ±220% -99.7% 0.01 ± 19% perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc.create_new_namespaces.__do_sys_setns.do_syscall_64
> 0.02 ± 36% -54.4% 0.01 ± 55% perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc.getname_flags.part.0
> 0.04 ± 36% -77.1% 0.01 ± 31% perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
> 0.12 ± 32% +1235.8% 1.58 ± 31% perf-sched.sch_delay.max.ms.__cond_resched.mutex_lock.futex_exit_release.exit_mm_release.exit_mm
> 2.25 ±218% -99.3% 0.02 ± 52% perf-sched.sch_delay.max.ms.__cond_resched.switch_task_namespaces.__do_sys_setns.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.01 ± 85% +19836.4% 2.56 ± 7% perf-sched.sch_delay.max.ms.__cond_resched.tlb_batch_pages_flush.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior
> 0.03 ± 70% -93.6% 0.00 ±223% perf-sched.sch_delay.max.ms.__cond_resched.unmap_page_range.zap_page_range_single.madvise_vma_behavior.do_madvise
> 0.10 ± 16% +2984.2% 3.21 ± 6% perf-sched.sch_delay.max.ms.__cond_resched.zap_pmd_range.isra.0.unmap_page_range
> 0.01 ± 20% +883.9% 0.05 ±177% perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 0.01 ± 15% +694.7% 0.08 ±123% perf-sched.sch_delay.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
> 0.00 ±223% +6966.7% 0.07 ±199% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select
> 0.01 ± 38% +8384.6% 0.55 ± 72% perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 0.01 ± 13% +12995.7% 1.51 ±103% perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 117.80 ± 56% -96.4% 4.26 ± 36% perf-sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 0.01 ± 68% +331.9% 0.03 perf-sched.total_sch_delay.average.ms
> 4.14 +242.6% 14.20 ± 4% perf-sched.total_wait_and_delay.average.ms
> 700841 -69.6% 212977 ± 3% perf-sched.total_wait_and_delay.count.ms
> 4.14 +242.4% 14.16 ± 4% perf-sched.total_wait_time.average.ms
> 11.68 ± 8% +213.3% 36.59 ± 28% perf-sched.wait_and_delay.avg.ms.__cond_resched.apparmor_file_alloc_security.security_file_alloc.init_file.alloc_empty_file
> 10.00 ± 2% +226.1% 32.62 ± 20% perf-sched.wait_and_delay.avg.ms.__cond_resched.dentry_kill.dput.__fput.__x64_sys_close
> 10.55 ± 3% +259.8% 37.96 ± 7% perf-sched.wait_and_delay.avg.ms.__cond_resched.dput.nd_jump_link.proc_ns_get_link.pick_link
> 9.80 ± 12% +196.5% 29.07 ± 32% perf-sched.wait_and_delay.avg.ms.__cond_resched.dput.pick_link.step_into.open_last_lookups
> 9.80 ± 4% +234.9% 32.83 ± 14% perf-sched.wait_and_delay.avg.ms.__cond_resched.dput.terminate_walk.path_openat.do_filp_open
> 10.32 ± 2% +223.8% 33.42 ± 6% perf-sched.wait_and_delay.avg.ms.__cond_resched.kmem_cache_alloc.alloc_empty_file.path_openat.do_filp_open
> 8.15 ± 14% +271.3% 30.25 ± 35% perf-sched.wait_and_delay.avg.ms.__cond_resched.kmem_cache_alloc.create_new_namespaces.__do_sys_setns.do_syscall_64
> 9.60 ± 4% +240.8% 32.73 ± 16% perf-sched.wait_and_delay.avg.ms.__cond_resched.kmem_cache_alloc.getname_flags.part.0
> 10.37 ± 4% +232.0% 34.41 ± 10% perf-sched.wait_and_delay.avg.ms.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
> 7.32 ± 46% +269.7% 27.07 ± 49% perf-sched.wait_and_delay.avg.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
> 9.88 +236.2% 33.23 ± 4% perf-sched.wait_and_delay.avg.ms.__cond_resched.slab_pre_alloc_hook.constprop.0.kmem_cache_alloc_lru
> 4.44 ± 4% +379.0% 21.27 ± 18% perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 10.05 ± 2% +235.6% 33.73 ± 11% perf-sched.wait_and_delay.avg.ms.__cond_resched.switch_task_namespaces.__do_sys_setns.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.03 +462.6% 0.15 ± 6% perf-sched.wait_and_delay.avg.ms.do_task_dead.do_exit.__x64_sys_exit.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 6.78 ± 4% +482.1% 39.46 ± 3% perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
> 3.17 +683.3% 24.85 ± 8% perf-sched.wait_and_delay.avg.ms.futex_wait_queue.__futex_wait.futex_wait.do_futex
> 36.64 ± 13% +244.7% 126.32 ± 6% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
> 9.81 +302.4% 39.47 ± 4% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.do_sigtimedwait.__x64_sys_rt_sigtimedwait.do_syscall_64
> 1.05 +48.2% 1.56 perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.do_madvise
> 0.93 +14.2% 1.06 ± 2% perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
> 9.93 -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.schedule_timeout.ext4_lazyinit_thread.part.0.kthread
> 12.02 ± 3% +139.8% 28.83 ± 6% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 6.09 ± 2% +403.0% 30.64 ± 5% perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 23.17 ± 19% -83.5% 3.83 ±143% perf-sched.wait_and_delay.count.__cond_resched.__alloc_pages.alloc_pages_mpol.shmem_alloc_folio.shmem_alloc_and_add_folio
> 79.83 ± 9% -55.1% 35.83 ± 16% perf-sched.wait_and_delay.count.__cond_resched.dentry_kill.dput.__fput.__x64_sys_close
> 14.83 ± 14% -59.6% 6.00 ± 56% perf-sched.wait_and_delay.count.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
> 8.50 ± 17% -80.4% 1.67 ± 89% perf-sched.wait_and_delay.count.__cond_resched.dput.__ns_get_path.ns_get_path.proc_ns_get_link
> 114.00 ± 14% -62.4% 42.83 ± 11% perf-sched.wait_and_delay.count.__cond_resched.dput.nd_jump_link.proc_ns_get_link.pick_link
> 94.67 ± 7% -48.1% 49.17 ± 13% perf-sched.wait_and_delay.count.__cond_resched.dput.terminate_walk.path_openat.do_filp_open
> 59.83 ± 13% -76.0% 14.33 ± 48% perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
> 103.00 ± 12% -48.1% 53.50 ± 20% perf-sched.wait_and_delay.count.__cond_resched.kmem_cache_alloc.alloc_empty_file.path_openat.do_filp_open
> 19.33 ± 16% -56.0% 8.50 ± 29% perf-sched.wait_and_delay.count.__cond_resched.kmem_cache_alloc.create_new_namespaces.__do_sys_setns.do_syscall_64
> 68.17 ± 11% -39.1% 41.50 ± 19% perf-sched.wait_and_delay.count.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
> 36.67 ± 22% -79.1% 7.67 ± 46% perf-sched.wait_and_delay.count.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
> 465.50 ± 9% -47.4% 244.83 ± 11% perf-sched.wait_and_delay.count.__cond_resched.slab_pre_alloc_hook.constprop.0.kmem_cache_alloc_lru
> 14492 ± 3% -96.3% 533.67 ± 10% perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 128.67 ± 7% -53.5% 59.83 ± 10% perf-sched.wait_and_delay.count.__cond_resched.switch_task_namespaces.__do_sys_setns.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 7.67 ± 34% -80.4% 1.50 ±107% perf-sched.wait_and_delay.count.__cond_resched.vunmap_p4d_range.__vunmap_range_noflush.remove_vm_area.vfree
> 147533 -81.0% 28023 ± 5% perf-sched.wait_and_delay.count.do_task_dead.do_exit.__x64_sys_exit.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 4394 ± 4% -78.5% 942.83 ± 7% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
> 228791 -79.3% 47383 ± 4% perf-sched.wait_and_delay.count.futex_wait_queue.__futex_wait.futex_wait.do_futex
> 368.50 ± 2% -67.1% 121.33 ± 3% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
> 147506 -81.0% 28010 ± 5% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.do_sigtimedwait.__x64_sys_rt_sigtimedwait.do_syscall_64
> 5387 ± 6% -16.7% 4488 ± 5% perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.do_madvise
> 8303 ± 2% -56.9% 3579 ± 5% perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
> 14.67 ± 7% -100.0% 0.00 perf-sched.wait_and_delay.count.schedule_timeout.ext4_lazyinit_thread.part.0.kthread
> 370.50 ±141% +221.9% 1192 ± 5% perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 24395 ± 2% -51.2% 11914 ± 6% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 31053 ± 2% -80.5% 6047 ± 5% perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 16.41 ± 2% +342.7% 72.65 ± 29% perf-sched.wait_and_delay.max.ms.__cond_resched.apparmor_file_alloc_security.security_file_alloc.init_file.alloc_empty_file
> 16.49 ± 3% +463.3% 92.90 ± 27% perf-sched.wait_and_delay.max.ms.__cond_resched.dentry_kill.dput.__fput.__x64_sys_close
> 17.32 ± 5% +520.9% 107.52 ± 14% perf-sched.wait_and_delay.max.ms.__cond_resched.dput.nd_jump_link.proc_ns_get_link.pick_link
> 15.38 ± 6% +325.2% 65.41 ± 22% perf-sched.wait_and_delay.max.ms.__cond_resched.dput.pick_link.step_into.open_last_lookups
> 16.73 ± 4% +456.2% 93.04 ± 11% perf-sched.wait_and_delay.max.ms.__cond_resched.dput.terminate_walk.path_openat.do_filp_open
> 17.14 ± 3% +510.6% 104.68 ± 14% perf-sched.wait_and_delay.max.ms.__cond_resched.kmem_cache_alloc.alloc_empty_file.path_openat.do_filp_open
> 15.70 ± 4% +379.4% 75.25 ± 28% perf-sched.wait_and_delay.max.ms.__cond_resched.kmem_cache_alloc.create_new_namespaces.__do_sys_setns.do_syscall_64
> 15.70 ± 3% +422.1% 81.97 ± 19% perf-sched.wait_and_delay.max.ms.__cond_resched.kmem_cache_alloc.getname_flags.part.0
> 16.38 +528.4% 102.91 ± 21% perf-sched.wait_and_delay.max.ms.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
> 45.20 ± 48% +166.0% 120.23 ± 27% perf-sched.wait_and_delay.max.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
> 17.25 +495.5% 102.71 ± 2% perf-sched.wait_and_delay.max.ms.__cond_resched.slab_pre_alloc_hook.constprop.0.kmem_cache_alloc_lru
> 402.57 ± 15% -52.8% 189.90 ± 14% perf-sched.wait_and_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 16.96 ± 4% +521.3% 105.40 ± 15% perf-sched.wait_and_delay.max.ms.__cond_resched.switch_task_namespaces.__do_sys_setns.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 28.45 +517.3% 175.65 ± 14% perf-sched.wait_and_delay.max.ms.do_task_dead.do_exit.__x64_sys_exit.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 22.49 +628.5% 163.83 ± 16% perf-sched.wait_and_delay.max.ms.futex_wait_queue.__futex_wait.futex_wait.do_futex
> 26.53 ± 30% +326.9% 113.25 ± 16% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.do_sigtimedwait.__x64_sys_rt_sigtimedwait.do_syscall_64
> 15.54 -100.0% 0.00 perf-sched.wait_and_delay.max.ms.schedule_timeout.ext4_lazyinit_thread.part.0.kthread
> 1.67 ±141% +284.6% 6.44 ± 4% perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 0.07 ± 34% -93.6% 0.00 ±105% perf-sched.wait_time.avg.ms.__cond_resched.__alloc_pages.alloc_pages_mpol.pte_alloc_one.__pte_alloc
> 10.21 ± 15% +295.8% 40.43 ± 50% perf-sched.wait_time.avg.ms.__cond_resched.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 3.89 ± 40% -99.8% 0.01 ±113% perf-sched.wait_time.avg.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc_node.memcg_alloc_slab_cgroups.allocate_slab
> 11.67 ± 8% +213.5% 36.58 ± 28% perf-sched.wait_time.avg.ms.__cond_resched.apparmor_file_alloc_security.security_file_alloc.init_file.alloc_empty_file
> 9.98 ± 2% +226.8% 32.61 ± 20% perf-sched.wait_time.avg.ms.__cond_resched.dentry_kill.dput.__fput.__x64_sys_close
> 1.03 +71.2% 1.77 ± 20% perf-sched.wait_time.avg.ms.__cond_resched.down_read.do_madvise.__x64_sys_madvise.do_syscall_64
> 0.06 ± 79% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.down_write.__split_vma.vma_modify.mprotect_fixup
> 0.05 ± 22% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.down_write.vma_expand.mmap_region.do_mmap
> 0.08 ± 82% -98.2% 0.00 ±223% perf-sched.wait_time.avg.ms.__cond_resched.down_write_killable.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 10.72 ± 10% +166.9% 28.61 ± 29% perf-sched.wait_time.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
> 10.53 ± 3% +260.5% 37.95 ± 7% perf-sched.wait_time.avg.ms.__cond_resched.dput.nd_jump_link.proc_ns_get_link.pick_link
> 9.80 ± 12% +196.6% 29.06 ± 32% perf-sched.wait_time.avg.ms.__cond_resched.dput.pick_link.step_into.open_last_lookups
> 9.80 ± 4% +235.1% 32.82 ± 14% perf-sched.wait_time.avg.ms.__cond_resched.dput.terminate_walk.path_openat.do_filp_open
> 9.50 ± 12% +281.9% 36.27 ± 70% perf-sched.wait_time.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
> 10.31 ± 2% +223.9% 33.40 ± 6% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.alloc_empty_file.path_openat.do_filp_open
> 8.04 ± 15% +276.1% 30.25 ± 35% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.create_new_namespaces.__do_sys_setns.do_syscall_64
> 9.60 ± 4% +240.9% 32.72 ± 16% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.getname_flags.part.0
> 0.06 ± 66% -98.3% 0.00 ±223% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.mas_alloc_nodes.mas_preallocate.__split_vma
> 10.36 ± 4% +232.1% 34.41 ± 10% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
> 0.08 ± 50% -95.7% 0.00 ±100% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.vm_area_dup.__split_vma.vma_modify
> 0.01 ± 49% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node.alloc_vmap_area.__get_vm_area_node.__vmalloc_node_range
> 0.03 ± 73% -87.4% 0.00 ±145% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node.dup_task_struct.copy_process.kernel_clone
> 8.01 ± 25% +238.0% 27.07 ± 49% perf-sched.wait_time.avg.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
> 9.86 +237.0% 33.23 ± 4% perf-sched.wait_time.avg.ms.__cond_resched.slab_pre_alloc_hook.constprop.0.kmem_cache_alloc_lru
> 4.44 ± 4% +379.2% 21.26 ± 18% perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 10.03 +236.3% 33.73 ± 11% perf-sched.wait_time.avg.ms.__cond_resched.switch_task_namespaces.__do_sys_setns.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.97 ± 8% -87.8% 0.12 ±221% perf-sched.wait_time.avg.ms.__cond_resched.unmap_page_range.zap_page_range_single.madvise_vma_behavior.do_madvise
> 0.02 ± 13% +1846.8% 0.45 ± 11% perf-sched.wait_time.avg.ms.__cond_resched.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
> 1.01 +64.7% 1.66 perf-sched.wait_time.avg.ms.__cond_resched.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise
> 0.75 ± 4% +852.1% 7.10 ± 5% perf-sched.wait_time.avg.ms.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 0.03 +462.6% 0.15 ± 6% perf-sched.wait_time.avg.ms.do_task_dead.do_exit.__x64_sys_exit.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.24 ± 4% +25.3% 0.30 ± 8% perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
> 1.98 ± 15% +595.7% 13.80 ± 90% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
> 2.78 ± 14% +444.7% 15.12 ± 16% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function
> 6.77 ± 4% +483.0% 39.44 ± 3% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
> 3.17 +684.7% 24.85 ± 8% perf-sched.wait_time.avg.ms.futex_wait_queue.__futex_wait.futex_wait.do_futex
> 36.64 ± 13% +244.7% 126.32 ± 6% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
> 9.79 +303.0% 39.45 ± 4% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_sigtimedwait.__x64_sys_rt_sigtimedwait.do_syscall_64
> 1.05 +23.8% 1.30 perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.do_madvise
> 0.86 +101.2% 1.73 ± 3% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.exit_mm
> 0.11 ± 21% +438.9% 0.61 ± 15% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
> 0.32 ± 4% +28.5% 0.41 ± 13% perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 12.00 ± 3% +139.6% 28.76 ± 6% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 6.07 ± 2% +403.5% 30.56 ± 5% perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 0.38 ± 41% -98.8% 0.00 ±105% perf-sched.wait_time.max.ms.__cond_resched.__alloc_pages.alloc_pages_mpol.pte_alloc_one.__pte_alloc
> 0.36 ± 34% -84.3% 0.06 ±200% perf-sched.wait_time.max.ms.__cond_resched.__alloc_pages.alloc_pages_mpol.vma_alloc_folio.do_anonymous_page
> 0.36 ± 51% -92.9% 0.03 ±114% perf-sched.wait_time.max.ms.__cond_resched.__anon_vma_prepare.do_anonymous_page.__handle_mm_fault.handle_mm_fault
> 15.98 ± 5% +361.7% 73.80 ± 23% perf-sched.wait_time.max.ms.__cond_resched.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.51 ± 14% -92.8% 0.04 ±196% perf-sched.wait_time.max.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc_node.__vmalloc_area_node.__vmalloc_node_range
> 8.56 ± 11% -99.9% 0.01 ±126% perf-sched.wait_time.max.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc_node.memcg_alloc_slab_cgroups.allocate_slab
> 0.43 ± 32% -68.2% 0.14 ±119% perf-sched.wait_time.max.ms.__cond_resched.__kmem_cache_alloc_node.kmalloc_node_trace.__get_vm_area_node.__vmalloc_node_range
> 0.46 ± 20% -89.3% 0.05 ±184% perf-sched.wait_time.max.ms.__cond_resched.__vmalloc_area_node.__vmalloc_node_range.alloc_thread_stack_node.dup_task_struct
> 16.40 ± 2% +342.9% 72.65 ± 29% perf-sched.wait_time.max.ms.__cond_resched.apparmor_file_alloc_security.security_file_alloc.init_file.alloc_empty_file
> 0.31 ± 63% -76.2% 0.07 ±169% perf-sched.wait_time.max.ms.__cond_resched.cgroup_css_set_fork.cgroup_can_fork.copy_process.kernel_clone
> 0.14 ± 93% +258.7% 0.49 ± 14% perf-sched.wait_time.max.ms.__cond_resched.clear_huge_page.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault
> 16.49 ± 3% +463.5% 92.89 ± 27% perf-sched.wait_time.max.ms.__cond_resched.dentry_kill.dput.__fput.__x64_sys_close
> 1.09 +171.0% 2.96 ± 10% perf-sched.wait_time.max.ms.__cond_resched.down_read.do_madvise.__x64_sys_madvise.do_syscall_64
> 1.16 ± 7% +155.1% 2.97 ± 4% perf-sched.wait_time.max.ms.__cond_resched.down_read.exit_mm.do_exit.__x64_sys_exit
> 0.19 ± 78% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.down_write.__split_vma.vma_modify.mprotect_fixup
> 0.33 ± 35% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.down_write.vma_expand.mmap_region.do_mmap
> 0.20 ±101% -99.3% 0.00 ±223% perf-sched.wait_time.max.ms.__cond_resched.down_write_killable.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 17.31 ± 5% +521.0% 107.51 ± 14% perf-sched.wait_time.max.ms.__cond_resched.dput.nd_jump_link.proc_ns_get_link.pick_link
> 15.38 ± 6% +325.3% 65.40 ± 22% perf-sched.wait_time.max.ms.__cond_resched.dput.pick_link.step_into.open_last_lookups
> 16.72 ± 4% +456.6% 93.04 ± 11% perf-sched.wait_time.max.ms.__cond_resched.dput.terminate_walk.path_openat.do_filp_open
> 1.16 ± 2% +88.7% 2.20 ± 33% perf-sched.wait_time.max.ms.__cond_resched.exit_signals.do_exit.__x64_sys_exit.do_syscall_64
> 53.96 ± 32% +444.0% 293.53 ±109% perf-sched.wait_time.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
> 17.13 ± 2% +511.2% 104.68 ± 14% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.alloc_empty_file.path_openat.do_filp_open
> 15.69 ± 4% +379.5% 75.25 ± 28% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.create_new_namespaces.__do_sys_setns.do_syscall_64
> 15.70 ± 3% +422.2% 81.97 ± 19% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.getname_flags.part.0
> 0.27 ± 80% -99.6% 0.00 ±223% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.mas_alloc_nodes.mas_preallocate.__split_vma
> 16.37 +528.6% 102.90 ± 21% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
> 0.44 ± 33% -99.1% 0.00 ±104% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.vm_area_dup.__split_vma.vma_modify
> 0.02 ± 83% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_node.alloc_vmap_area.__get_vm_area_node.__vmalloc_node_range
> 0.08 ± 83% -95.4% 0.00 ±147% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_node.dup_task_struct.copy_process.kernel_clone
> 1.16 ± 2% +134.7% 2.72 ± 19% perf-sched.wait_time.max.ms.__cond_resched.mutex_lock.futex_exit_release.exit_mm_release.exit_mm
> 49.88 ± 25% +141.0% 120.23 ± 27% perf-sched.wait_time.max.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
> 17.24 +495.7% 102.70 ± 2% perf-sched.wait_time.max.ms.__cond_resched.slab_pre_alloc_hook.constprop.0.kmem_cache_alloc_lru
> 402.56 ± 15% -52.8% 189.89 ± 14% perf-sched.wait_time.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 16.96 ± 4% +521.4% 105.39 ± 15% perf-sched.wait_time.max.ms.__cond_resched.switch_task_namespaces.__do_sys_setns.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.06 +241.7% 3.61 ± 4% perf-sched.wait_time.max.ms.__cond_resched.tlb_batch_pages_flush.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior
> 1.07 -88.9% 0.12 ±221% perf-sched.wait_time.max.ms.__cond_resched.unmap_page_range.zap_page_range_single.madvise_vma_behavior.do_madvise
> 0.28 ± 27% +499.0% 1.67 ± 18% perf-sched.wait_time.max.ms.__cond_resched.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
> 1.21 ± 2% +207.2% 3.71 ± 3% perf-sched.wait_time.max.ms.__cond_resched.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise
> 13.43 ± 26% +38.8% 18.64 perf-sched.wait_time.max.ms.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 28.45 +517.3% 175.65 ± 14% perf-sched.wait_time.max.ms.do_task_dead.do_exit.__x64_sys_exit.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.79 ± 10% +62.2% 1.28 ± 25% perf-sched.wait_time.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
> 13.22 ± 2% +317.2% 55.16 ± 35% perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function
> 834.29 ± 28% -48.5% 429.53 ± 94% perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
> 22.48 +628.6% 163.83 ± 16% perf-sched.wait_time.max.ms.futex_wait_queue.__futex_wait.futex_wait.do_futex
> 22.74 ± 18% +398.0% 113.25 ± 16% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.do_sigtimedwait.__x64_sys_rt_sigtimedwait.do_syscall_64
> 7.72 ± 7% +80.6% 13.95 ± 2% perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
> 0.74 ± 4% +77.2% 1.31 ± 32% perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 5.01 +14.1% 5.72 ± 2% perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 44.98 -19.7 25.32 ± 2% perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
> 43.21 -19.6 23.65 ± 3% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
> 43.21 -19.6 23.65 ± 3% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 43.18 -19.5 23.63 ± 3% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 40.30 -17.5 22.75 ± 3% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 41.10 -17.4 23.66 ± 2% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
> 39.55 -17.3 22.24 ± 3% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
> 24.76 ± 2% -8.5 16.23 ± 3% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> 8.68 ± 4% -6.5 2.22 ± 6% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
> 7.23 ± 4% -5.8 1.46 ± 8% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
> 7.23 ± 4% -5.8 1.46 ± 8% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 7.11 ± 4% -5.7 1.39 ± 7% perf-profile.calltrace.cycles-pp.__x64_sys_exit.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 7.09 ± 4% -5.7 1.39 ± 7% perf-profile.calltrace.cycles-pp.do_exit.__x64_sys_exit.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 6.59 ± 3% -5.1 1.47 ± 7% perf-profile.calltrace.cycles-pp.ret_from_fork_asm
> 6.59 ± 3% -5.1 1.47 ± 7% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
> 6.59 ± 3% -5.1 1.47 ± 7% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
> 5.76 ± 2% -5.0 0.80 ± 9% perf-profile.calltrace.cycles-pp.start_thread
> 7.43 ± 2% -4.9 2.52 ± 7% perf-profile.calltrace.cycles-pp.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> 5.51 ± 3% -4.8 0.70 ± 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.start_thread
> 5.50 ± 3% -4.8 0.70 ± 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.start_thread
> 5.48 ± 3% -4.8 0.69 ± 7% perf-profile.calltrace.cycles-pp.__x64_sys_exit.do_syscall_64.entry_SYSCALL_64_after_hwframe.start_thread
> 5.42 ± 3% -4.7 0.69 ± 7% perf-profile.calltrace.cycles-pp.do_exit.__x64_sys_exit.do_syscall_64.entry_SYSCALL_64_after_hwframe.start_thread
> 5.90 ± 5% -3.9 2.01 ± 4% perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise
> 4.18 ± 5% -3.8 0.37 ± 71% perf-profile.calltrace.cycles-pp.exit_notify.do_exit.__x64_sys_exit.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 5.76 ± 5% -3.8 1.98 ± 4% perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior
> 5.04 ± 7% -3.7 1.32 ± 9% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__clone
> 5.03 ± 7% -3.7 1.32 ± 9% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__clone
> 5.02 ± 7% -3.7 1.32 ± 9% perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe.__clone
> 5.02 ± 7% -3.7 1.32 ± 9% perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe.__clone
> 5.62 ± 5% -3.7 1.96 ± 3% perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single
> 4.03 ± 4% -3.1 0.92 ± 7% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 6.03 ± 5% -3.1 2.94 ± 3% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise
> 3.43 ± 5% -2.8 0.67 ± 13% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 3.43 ± 5% -2.8 0.67 ± 13% perf-profile.calltrace.cycles-pp.__do_softirq.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
> 3.41 ± 5% -2.7 0.66 ± 13% perf-profile.calltrace.cycles-pp.rcu_core.__do_softirq.run_ksoftirqd.smpboot_thread_fn.kthread
> 3.40 ± 5% -2.7 0.66 ± 13% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.__do_softirq.run_ksoftirqd.smpboot_thread_fn
> 3.67 ± 7% -2.7 0.94 ± 10% perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 2.92 ± 7% -2.4 0.50 ± 46% perf-profile.calltrace.cycles-pp.stress_pthread
> 2.54 ± 6% -2.2 0.38 ± 70% perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 2.46 ± 6% -1.8 0.63 ± 10% perf-profile.calltrace.cycles-pp.dup_task_struct.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
> 3.00 ± 6% -1.6 1.43 ± 7% perf-profile.calltrace.cycles-pp.__munmap
> 2.96 ± 6% -1.5 1.42 ± 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
> 2.96 ± 6% -1.5 1.42 ± 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
> 2.95 ± 6% -1.5 1.41 ± 7% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
> 2.95 ± 6% -1.5 1.41 ± 7% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
> 2.02 ± 4% -1.5 0.52 ± 46% perf-profile.calltrace.cycles-pp.__lll_lock_wait
> 1.78 ± 3% -1.5 0.30 ±100% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__lll_lock_wait
> 1.77 ± 3% -1.5 0.30 ±100% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__lll_lock_wait
> 1.54 ± 6% -1.3 0.26 ±100% perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 2.54 ± 6% -1.2 1.38 ± 6% perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 2.51 ± 6% -1.1 1.37 ± 7% perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
> 1.13 -0.7 0.40 ± 70% perf-profile.calltrace.cycles-pp.exit_mm.do_exit.__x64_sys_exit.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.15 ± 5% -0.7 0.46 ± 45% perf-profile.calltrace.cycles-pp.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu
> 1.58 ± 5% -0.6 0.94 ± 7% perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
> 0.99 ± 5% -0.5 0.51 ± 45% perf-profile.calltrace.cycles-pp.__do_softirq.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
> 1.01 ± 5% -0.5 0.54 ± 45% perf-profile.calltrace.cycles-pp.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
> 0.82 ± 4% -0.2 0.59 ± 5% perf-profile.calltrace.cycles-pp.default_send_IPI_mask_sequence_phys.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu
> 0.00 +0.5 0.54 ± 5% perf-profile.calltrace.cycles-pp.flush_tlb_func.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function
> 0.00 +0.6 0.60 ± 5% perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior
> 0.00 +0.6 0.61 ± 6% perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
> 0.00 +0.6 0.62 ± 6% perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
> 0.53 ± 5% +0.6 1.17 ± 13% perf-profile.calltrace.cycles-pp.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
> 1.94 ± 2% +0.7 2.64 ± 9% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
> 0.00 +0.7 0.73 ± 5% perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.__mod_lruvec_page_state.page_remove_rmap.zap_pte_range.zap_pmd_range
> 0.00 +0.8 0.75 ± 20% perf-profile.calltrace.cycles-pp.__cond_resched.clear_huge_page.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault
> 2.02 ± 2% +0.8 2.85 ± 9% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> 0.74 ± 5% +0.8 1.57 ± 11% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
> 0.00 +0.9 0.90 ± 4% perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise
> 0.00 +0.9 0.92 ± 13% perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues
> 0.86 ± 4% +1.0 1.82 ± 10% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
> 0.86 ± 4% +1.0 1.83 ± 10% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
> 0.00 +1.0 0.98 ± 7% perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.pmdp_invalidate.__split_huge_pmd_locked
> 0.09 ±223% +1.0 1.07 ± 11% perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt
> 0.00 +1.0 0.99 ± 6% perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.pmdp_invalidate.__split_huge_pmd_locked.__split_huge_pmd
> 0.00 +1.0 1.00 ± 7% perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.pmdp_invalidate.__split_huge_pmd_locked.__split_huge_pmd.zap_pmd_range
> 0.09 ±223% +1.0 1.10 ± 12% perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
> 0.00 +1.0 1.01 ± 6% perf-profile.calltrace.cycles-pp.pmdp_invalidate.__split_huge_pmd_locked.__split_huge_pmd.zap_pmd_range.unmap_page_range
> 0.00 +1.1 1.10 ± 5% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.native_queued_spin_lock_slowpath
> 0.00 +1.1 1.12 ± 5% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.native_queued_spin_lock_slowpath._raw_spin_lock
> 0.00 +1.2 1.23 ± 4% perf-profile.calltrace.cycles-pp.page_add_anon_rmap.__split_huge_pmd_locked.__split_huge_pmd.zap_pmd_range.unmap_page_range
> 0.00 +1.3 1.32 ± 4% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.native_queued_spin_lock_slowpath._raw_spin_lock.__split_huge_pmd
> 0.00 +1.4 1.38 ± 5% perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.page_remove_rmap.zap_pte_range.zap_pmd_range.unmap_page_range
> 0.00 +2.4 2.44 ± 10% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.native_queued_spin_lock_slowpath._raw_spin_lock.__split_huge_pmd.zap_pmd_range
> 0.00 +3.1 3.10 ± 5% perf-profile.calltrace.cycles-pp.page_remove_rmap.zap_pte_range.zap_pmd_range.unmap_page_range.zap_page_range_single
> 0.00 +3.5 3.52 ± 5% perf-profile.calltrace.cycles-pp.__split_huge_pmd_locked.__split_huge_pmd.zap_pmd_range.unmap_page_range.zap_page_range_single
> 0.88 ± 4% +3.8 4.69 ± 4% perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.zap_page_range_single.madvise_vma_behavior
> 6.30 ± 6% +13.5 19.85 ± 7% perf-profile.calltrace.cycles-pp.__clone
> 0.00 +16.7 16.69 ± 7% perf-profile.calltrace.cycles-pp.clear_page_erms.clear_huge_page.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault
> 1.19 ± 29% +17.1 18.32 ± 7% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
> 0.00 +17.6 17.56 ± 7% perf-profile.calltrace.cycles-pp.clear_huge_page.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
> 0.63 ± 7% +17.7 18.35 ± 7% perf-profile.calltrace.cycles-pp.asm_exc_page_fault.__clone
> 0.59 ± 5% +17.8 18.34 ± 7% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.__clone
> 0.59 ± 5% +17.8 18.34 ± 7% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.__clone
> 0.00 +17.9 17.90 ± 7% perf-profile.calltrace.cycles-pp.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
> 0.36 ± 71% +18.0 18.33 ± 7% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.__clone
> 0.00 +32.0 32.03 ± 2% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__split_huge_pmd.zap_pmd_range.unmap_page_range
> 0.00 +32.6 32.62 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock.__split_huge_pmd.zap_pmd_range.unmap_page_range.zap_page_range_single
> 0.00 +36.2 36.19 ± 2% perf-profile.calltrace.cycles-pp.__split_huge_pmd.zap_pmd_range.unmap_page_range.zap_page_range_single.madvise_vma_behavior
> 7.97 ± 4% +36.6 44.52 ± 2% perf-profile.calltrace.cycles-pp.__madvise
> 7.91 ± 4% +36.6 44.46 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
> 7.90 ± 4% +36.6 44.46 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> 7.87 ± 4% +36.6 44.44 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> 7.86 ± 4% +36.6 44.44 ± 2% perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> 7.32 ± 4% +36.8 44.07 ± 2% perf-profile.calltrace.cycles-pp.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 7.25 ± 4% +36.8 44.06 ± 2% perf-profile.calltrace.cycles-pp.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64
> 1.04 ± 4% +40.0 41.08 ± 2% perf-profile.calltrace.cycles-pp.unmap_page_range.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise
> 1.00 ± 3% +40.1 41.06 ± 2% perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.zap_page_range_single.madvise_vma_behavior.do_madvise
> 44.98 -19.7 25.32 ± 2% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
> 44.98 -19.7 25.32 ± 2% perf-profile.children.cycles-pp.cpu_startup_entry
> 44.96 -19.6 25.31 ± 2% perf-profile.children.cycles-pp.do_idle
> 43.21 -19.6 23.65 ± 3% perf-profile.children.cycles-pp.start_secondary
> 41.98 -17.6 24.40 ± 2% perf-profile.children.cycles-pp.cpuidle_idle_call
> 41.21 -17.3 23.86 ± 2% perf-profile.children.cycles-pp.cpuidle_enter
> 41.20 -17.3 23.86 ± 2% perf-profile.children.cycles-pp.cpuidle_enter_state
> 12.69 ± 3% -10.6 2.12 ± 6% perf-profile.children.cycles-pp.do_exit
> 12.60 ± 3% -10.5 2.08 ± 7% perf-profile.children.cycles-pp.__x64_sys_exit
> 24.76 ± 2% -8.5 16.31 ± 2% perf-profile.children.cycles-pp.intel_idle
> 12.34 ± 2% -8.4 3.90 ± 5% perf-profile.children.cycles-pp.intel_idle_irq
> 6.96 ± 4% -5.4 1.58 ± 7% perf-profile.children.cycles-pp.ret_from_fork_asm
> 6.69 ± 4% -5.2 1.51 ± 7% perf-profile.children.cycles-pp.ret_from_fork
> 6.59 ± 3% -5.1 1.47 ± 7% perf-profile.children.cycles-pp.kthread
> 5.78 ± 2% -5.0 0.80 ± 8% perf-profile.children.cycles-pp.start_thread
> 4.68 ± 4% -4.5 0.22 ± 10% perf-profile.children.cycles-pp._raw_spin_lock_irq
> 5.03 ± 7% -3.7 1.32 ± 9% perf-profile.children.cycles-pp.__do_sys_clone
> 5.02 ± 7% -3.7 1.32 ± 9% perf-profile.children.cycles-pp.kernel_clone
> 4.20 ± 5% -3.7 0.53 ± 9% perf-profile.children.cycles-pp.exit_notify
> 4.67 ± 5% -3.6 1.10 ± 9% perf-profile.children.cycles-pp.rcu_core
> 4.60 ± 4% -3.5 1.06 ± 10% perf-profile.children.cycles-pp.rcu_do_batch
> 4.89 ± 5% -3.4 1.44 ± 11% perf-profile.children.cycles-pp.__do_softirq
> 5.64 ± 3% -3.2 2.39 ± 6% perf-profile.children.cycles-pp.__schedule
> 6.27 ± 5% -3.2 3.03 ± 4% perf-profile.children.cycles-pp.flush_tlb_mm_range
> 4.03 ± 4% -3.1 0.92 ± 7% perf-profile.children.cycles-pp.smpboot_thread_fn
> 6.68 ± 4% -3.1 3.61 ± 3% perf-profile.children.cycles-pp.tlb_finish_mmu
> 6.04 ± 5% -3.1 2.99 ± 4% perf-profile.children.cycles-pp.on_each_cpu_cond_mask
> 6.04 ą 5% -3.0 2.99 ą 4% perf-profile.children.cycles-pp.smp_call_function_many_cond
> 3.77 ą 2% -3.0 0.73 ą 16% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> 7.78 -3.0 4.77 ą 5% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> 3.43 ą 5% -2.8 0.67 ą 13% perf-profile.children.cycles-pp.run_ksoftirqd
> 3.67 ą 7% -2.7 0.94 ą 10% perf-profile.children.cycles-pp.copy_process
> 2.80 ą 6% -2.5 0.34 ą 15% perf-profile.children.cycles-pp.queued_write_lock_slowpath
> 3.41 ą 2% -2.5 0.96 ą 16% perf-profile.children.cycles-pp.do_futex
> 3.06 ą 5% -2.4 0.68 ą 16% perf-profile.children.cycles-pp.free_unref_page_commit
> 3.02 ą 5% -2.4 0.67 ą 16% perf-profile.children.cycles-pp.free_pcppages_bulk
> 2.92 ą 7% -2.3 0.58 ą 14% perf-profile.children.cycles-pp.stress_pthread
> 3.22 ą 3% -2.3 0.90 ą 18% perf-profile.children.cycles-pp.__x64_sys_futex
> 2.52 ą 5% -2.2 0.35 ą 7% perf-profile.children.cycles-pp.release_task
> 2.54 ą 6% -2.0 0.53 ą 10% perf-profile.children.cycles-pp.worker_thread
> 3.12 ą 5% -1.9 1.17 ą 11% perf-profile.children.cycles-pp.free_unref_page
> 2.31 ą 6% -1.9 0.45 ą 11% perf-profile.children.cycles-pp.process_one_work
> 2.47 ą 6% -1.8 0.63 ą 10% perf-profile.children.cycles-pp.dup_task_struct
> 2.19 ą 5% -1.8 0.41 ą 12% perf-profile.children.cycles-pp.delayed_vfree_work
> 2.14 ą 5% -1.7 0.40 ą 11% perf-profile.children.cycles-pp.vfree
> 3.19 ą 2% -1.6 1.58 ą 8% perf-profile.children.cycles-pp.schedule
> 2.06 ą 3% -1.6 0.46 ą 7% perf-profile.children.cycles-pp.__sigtimedwait
> 3.02 ą 6% -1.6 1.44 ą 7% perf-profile.children.cycles-pp.__munmap
> 1.94 ą 4% -1.6 0.39 ą 14% perf-profile.children.cycles-pp.__unfreeze_partials
> 2.95 ą 6% -1.5 1.41 ą 7% perf-profile.children.cycles-pp.__x64_sys_munmap
> 2.95 ą 6% -1.5 1.41 ą 7% perf-profile.children.cycles-pp.__vm_munmap
> 2.14 ą 3% -1.5 0.60 ą 21% perf-profile.children.cycles-pp.futex_wait
> 2.08 ą 4% -1.5 0.60 ą 19% perf-profile.children.cycles-pp.__lll_lock_wait
> 2.04 ą 3% -1.5 0.56 ą 20% perf-profile.children.cycles-pp.__futex_wait
> 1.77 ą 5% -1.5 0.32 ą 10% perf-profile.children.cycles-pp.remove_vm_area
> 1.86 ą 5% -1.4 0.46 ą 10% perf-profile.children.cycles-pp.open64
> 1.74 ą 4% -1.4 0.37 ą 7% perf-profile.children.cycles-pp.__x64_sys_rt_sigtimedwait
> 1.71 ą 4% -1.4 0.36 ą 8% perf-profile.children.cycles-pp.do_sigtimedwait
> 1.79 ą 5% -1.3 0.46 ą 9% perf-profile.children.cycles-pp.__x64_sys_openat
> 1.78 ą 5% -1.3 0.46 ą 8% perf-profile.children.cycles-pp.do_sys_openat2
> 1.61 ą 4% -1.3 0.32 ą 12% perf-profile.children.cycles-pp.poll_idle
> 1.65 ą 9% -1.3 0.37 ą 14% perf-profile.children.cycles-pp.pthread_create@@GLIBC_2.2.5
> 1.56 ą 8% -1.2 0.35 ą 7% perf-profile.children.cycles-pp.alloc_thread_stack_node
> 2.32 ą 3% -1.2 1.13 ą 8% perf-profile.children.cycles-pp.pick_next_task_fair
> 2.59 ą 6% -1.2 1.40 ą 7% perf-profile.children.cycles-pp.do_vmi_munmap
> 1.55 ą 4% -1.2 0.40 ą 19% perf-profile.children.cycles-pp.futex_wait_queue
> 1.37 ą 5% -1.1 0.22 ą 12% perf-profile.children.cycles-pp.find_unlink_vmap_area
> 2.52 ą 6% -1.1 1.38 ą 6% perf-profile.children.cycles-pp.do_vmi_align_munmap
> 1.53 ą 5% -1.1 0.39 ą 8% perf-profile.children.cycles-pp.do_filp_open
> 1.52 ą 5% -1.1 0.39 ą 7% perf-profile.children.cycles-pp.path_openat
> 1.25 ą 3% -1.1 0.14 ą 12% perf-profile.children.cycles-pp.sigpending
> 1.58 ą 5% -1.1 0.50 ą 6% perf-profile.children.cycles-pp.schedule_idle
> 1.29 ą 5% -1.1 0.21 ą 21% perf-profile.children.cycles-pp.__mprotect
> 1.40 ą 8% -1.1 0.32 ą 4% perf-profile.children.cycles-pp.__vmalloc_node_range
> 2.06 ą 3% -1.0 1.02 ą 9% perf-profile.children.cycles-pp.newidle_balance
> 1.04 ą 3% -1.0 0.08 ą 23% perf-profile.children.cycles-pp.__x64_sys_rt_sigpending
> 1.14 ą 6% -1.0 0.18 ą 18% perf-profile.children.cycles-pp.__x64_sys_mprotect
> 1.13 ą 6% -1.0 0.18 ą 17% perf-profile.children.cycles-pp.do_mprotect_pkey
> 1.30 ą 7% -0.9 0.36 ą 10% perf-profile.children.cycles-pp.wake_up_new_task
> 1.14 ą 9% -0.9 0.22 ą 16% perf-profile.children.cycles-pp.do_anonymous_page
> 0.95 ą 3% -0.9 0.04 ą 71% perf-profile.children.cycles-pp.do_sigpending
> 1.24 ą 3% -0.9 0.34 ą 9% perf-profile.children.cycles-pp.futex_wake
> 1.02 ą 6% -0.9 0.14 ą 15% perf-profile.children.cycles-pp.mprotect_fixup
> 1.91 ą 2% -0.9 1.06 ą 9% perf-profile.children.cycles-pp.load_balance
> 1.38 ą 5% -0.8 0.53 ą 6% perf-profile.children.cycles-pp.select_task_rq_fair
> 1.14 ą 4% -0.8 0.31 ą 12% perf-profile.children.cycles-pp.__pthread_mutex_unlock_usercnt
> 2.68 ą 3% -0.8 1.91 ą 6% perf-profile.children.cycles-pp.__flush_smp_call_function_queue
> 1.00 ą 4% -0.7 0.26 ą 10% perf-profile.children.cycles-pp.flush_smp_call_function_queue
> 1.44 ą 3% -0.7 0.73 ą 10% perf-profile.children.cycles-pp.find_busiest_group
> 0.81 ą 6% -0.7 0.10 ą 18% perf-profile.children.cycles-pp.vma_modify
> 1.29 ą 3% -0.7 0.60 ą 8% perf-profile.children.cycles-pp.exit_mm
> 1.40 ą 3% -0.7 0.71 ą 10% perf-profile.children.cycles-pp.update_sd_lb_stats
> 0.78 ą 7% -0.7 0.10 ą 19% perf-profile.children.cycles-pp.__split_vma
> 0.90 ą 8% -0.7 0.22 ą 10% perf-profile.children.cycles-pp.__vmalloc_area_node
> 0.75 ą 4% -0.7 0.10 ą 5% perf-profile.children.cycles-pp.__exit_signal
> 1.49 ą 2% -0.7 0.84 ą 7% perf-profile.children.cycles-pp.try_to_wake_up
> 0.89 ą 7% -0.6 0.24 ą 10% perf-profile.children.cycles-pp.find_idlest_cpu
> 1.59 ą 5% -0.6 0.95 ą 7% perf-profile.children.cycles-pp.unmap_region
> 0.86 ą 3% -0.6 0.22 ą 26% perf-profile.children.cycles-pp.pthread_cond_timedwait@@GLIBC_2.3.2
> 1.59 ą 3% -0.6 0.95 ą 9% perf-profile.children.cycles-pp.irq_exit_rcu
> 1.24 ą 3% -0.6 0.61 ą 10% perf-profile.children.cycles-pp.update_sg_lb_stats
> 0.94 ą 5% -0.6 0.32 ą 11% perf-profile.children.cycles-pp.do_task_dead
> 0.87 ą 3% -0.6 0.25 ą 19% perf-profile.children.cycles-pp.perf_iterate_sb
> 0.82 ą 4% -0.6 0.22 ą 10% perf-profile.children.cycles-pp.sched_ttwu_pending
> 1.14 ą 3% -0.6 0.54 ą 10% perf-profile.children.cycles-pp.activate_task
> 0.84 -0.6 0.25 ą 10% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
> 0.81 ą 6% -0.6 0.22 ą 11% perf-profile.children.cycles-pp.find_idlest_group
> 0.75 ą 5% -0.6 0.18 ą 14% perf-profile.children.cycles-pp.step_into
> 0.74 ą 8% -0.6 0.18 ą 14% perf-profile.children.cycles-pp.__alloc_pages_bulk
> 0.74 ą 6% -0.5 0.19 ą 11% perf-profile.children.cycles-pp.update_sg_wakeup_stats
> 0.72 ą 5% -0.5 0.18 ą 15% perf-profile.children.cycles-pp.pick_link
> 1.06 ą 2% -0.5 0.52 ą 9% perf-profile.children.cycles-pp.enqueue_task_fair
> 0.77 ą 6% -0.5 0.23 ą 12% perf-profile.children.cycles-pp.unmap_vmas
> 0.76 ą 2% -0.5 0.22 ą 8% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
> 0.94 ą 2% -0.5 0.42 ą 10% perf-profile.children.cycles-pp.dequeue_task_fair
> 0.65 ą 5% -0.5 0.15 ą 18% perf-profile.children.cycles-pp.open_last_lookups
> 1.37 ą 3% -0.5 0.87 ą 4% perf-profile.children.cycles-pp.llist_add_batch
> 0.70 ą 4% -0.5 0.22 ą 19% perf-profile.children.cycles-pp.memcpy_orig
> 0.91 ą 4% -0.5 0.44 ą 7% perf-profile.children.cycles-pp.update_load_avg
> 0.67 -0.5 0.20 ą 8% perf-profile.children.cycles-pp.switch_fpu_return
> 0.88 ą 3% -0.5 0.42 ą 8% perf-profile.children.cycles-pp.enqueue_entity
> 0.91 ą 4% -0.5 0.45 ą 12% perf-profile.children.cycles-pp.ttwu_do_activate
> 0.77 ą 4% -0.5 0.32 ą 10% perf-profile.children.cycles-pp.schedule_hrtimeout_range_clock
> 0.63 ą 5% -0.4 0.20 ą 21% perf-profile.children.cycles-pp.arch_dup_task_struct
> 0.74 ą 3% -0.4 0.32 ą 15% perf-profile.children.cycles-pp.dequeue_entity
> 0.62 ą 5% -0.4 0.21 ą 5% perf-profile.children.cycles-pp.finish_task_switch
> 0.56 -0.4 0.16 ą 7% perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
> 0.53 ą 4% -0.4 0.13 ą 9% perf-profile.children.cycles-pp.syscall
> 0.50 ą 9% -0.4 0.11 ą 18% perf-profile.children.cycles-pp.__get_vm_area_node
> 0.51 ą 3% -0.4 0.12 ą 12% perf-profile.children.cycles-pp.__slab_free
> 0.52 ą 2% -0.4 0.14 ą 10% perf-profile.children.cycles-pp.kmem_cache_free
> 0.75 ą 3% -0.4 0.37 ą 9% perf-profile.children.cycles-pp.exit_mm_release
> 0.50 ą 6% -0.4 0.12 ą 21% perf-profile.children.cycles-pp.do_send_specific
> 0.74 ą 3% -0.4 0.37 ą 8% perf-profile.children.cycles-pp.futex_exit_release
> 0.45 ą 10% -0.4 0.09 ą 17% perf-profile.children.cycles-pp.alloc_vmap_area
> 0.47 ą 3% -0.4 0.11 ą 20% perf-profile.children.cycles-pp.tgkill
> 0.68 ą 11% -0.4 0.32 ą 12% perf-profile.children.cycles-pp.__mmap
> 0.48 ą 3% -0.4 0.13 ą 6% perf-profile.children.cycles-pp.entry_SYSCALL_64
> 0.76 ą 5% -0.3 0.41 ą 10% perf-profile.children.cycles-pp.wake_up_q
> 0.42 ą 7% -0.3 0.08 ą 22% perf-profile.children.cycles-pp.__close
> 0.49 ą 7% -0.3 0.14 ą 25% perf-profile.children.cycles-pp.kmem_cache_alloc
> 0.49 ą 9% -0.3 0.15 ą 14% perf-profile.children.cycles-pp.mas_store_gfp
> 0.46 ą 4% -0.3 0.12 ą 23% perf-profile.children.cycles-pp.perf_event_task_output
> 0.44 ą 10% -0.3 0.10 ą 28% perf-profile.children.cycles-pp.pthread_sigqueue
> 0.46 ą 4% -0.3 0.12 ą 15% perf-profile.children.cycles-pp.link_path_walk
> 0.42 ą 8% -0.3 0.10 ą 20% perf-profile.children.cycles-pp.proc_ns_get_link
> 0.63 ą 10% -0.3 0.32 ą 12% perf-profile.children.cycles-pp.vm_mmap_pgoff
> 0.45 ą 4% -0.3 0.14 ą 13% perf-profile.children.cycles-pp.sched_move_task
> 0.36 ą 8% -0.3 0.06 ą 49% perf-profile.children.cycles-pp.__x64_sys_close
> 0.46 ą 8% -0.3 0.17 ą 14% perf-profile.children.cycles-pp.prctl
> 0.65 ą 3% -0.3 0.35 ą 7% perf-profile.children.cycles-pp.futex_cleanup
> 0.42 ą 7% -0.3 0.12 ą 15% perf-profile.children.cycles-pp.mas_store_prealloc
> 0.49 ą 5% -0.3 0.20 ą 13% perf-profile.children.cycles-pp.__rmqueue_pcplist
> 0.37 ą 7% -0.3 0.08 ą 16% perf-profile.children.cycles-pp.do_tkill
> 0.36 ą 10% -0.3 0.08 ą 20% perf-profile.children.cycles-pp.ns_get_path
> 0.37 ą 4% -0.3 0.09 ą 18% perf-profile.children.cycles-pp.setns
> 0.67 ą 3% -0.3 0.41 ą 8% perf-profile.children.cycles-pp.hrtimer_wakeup
> 0.35 ą 5% -0.3 0.10 ą 16% perf-profile.children.cycles-pp.__task_pid_nr_ns
> 0.41 ą 5% -0.3 0.16 ą 12% perf-profile.children.cycles-pp.mas_wr_bnode
> 0.35 ą 4% -0.3 0.10 ą 20% perf-profile.children.cycles-pp.rcu_cblist_dequeue
> 0.37 ą 5% -0.2 0.12 ą 17% perf-profile.children.cycles-pp.exit_task_stack_account
> 0.56 ą 4% -0.2 0.31 ą 12% perf-profile.children.cycles-pp.select_task_rq
> 0.29 ą 6% -0.2 0.05 ą 46% perf-profile.children.cycles-pp.mas_wr_store_entry
> 0.34 ą 4% -0.2 0.10 ą 27% perf-profile.children.cycles-pp.perf_event_task
> 0.39 ą 9% -0.2 0.15 ą 12% perf-profile.children.cycles-pp.__switch_to_asm
> 0.35 ą 5% -0.2 0.11 ą 11% perf-profile.children.cycles-pp.account_kernel_stack
> 0.30 ą 7% -0.2 0.06 ą 48% perf-profile.children.cycles-pp.__ns_get_path
> 0.31 ą 9% -0.2 0.07 ą 17% perf-profile.children.cycles-pp.free_vmap_area_noflush
> 0.31 ą 5% -0.2 0.08 ą 19% perf-profile.children.cycles-pp.__do_sys_setns
> 0.33 ą 7% -0.2 0.10 ą 7% perf-profile.children.cycles-pp.__free_one_page
> 0.31 ą 11% -0.2 0.08 ą 13% perf-profile.children.cycles-pp.__pte_alloc
> 0.36 ą 6% -0.2 0.13 ą 12% perf-profile.children.cycles-pp.switch_mm_irqs_off
> 0.27 ą 12% -0.2 0.05 ą 71% perf-profile.children.cycles-pp.__fput
> 0.53 ą 9% -0.2 0.31 ą 12% perf-profile.children.cycles-pp.do_mmap
> 0.27 ą 12% -0.2 0.05 ą 77% perf-profile.children.cycles-pp.__x64_sys_rt_tgsigqueueinfo
> 0.28 ą 5% -0.2 0.06 ą 50% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> 0.34 ą 10% -0.2 0.12 ą 29% perf-profile.children.cycles-pp.futex_wait_setup
> 0.27 ą 6% -0.2 0.06 ą 45% perf-profile.children.cycles-pp.__x64_sys_tgkill
> 0.31 ą 7% -0.2 0.11 ą 18% perf-profile.children.cycles-pp.__switch_to
> 0.26 ą 8% -0.2 0.06 ą 21% perf-profile.children.cycles-pp.__call_rcu_common
> 0.33 ą 9% -0.2 0.13 ą 18% perf-profile.children.cycles-pp.__do_sys_prctl
> 0.28 ą 5% -0.2 0.08 ą 17% perf-profile.children.cycles-pp.mm_release
> 0.52 ą 2% -0.2 0.32 ą 9% perf-profile.children.cycles-pp.__get_user_8
> 0.24 ą 10% -0.2 0.04 ą 72% perf-profile.children.cycles-pp.dput
> 0.25 ą 14% -0.2 0.05 ą 46% perf-profile.children.cycles-pp.perf_event_mmap
> 0.24 ą 7% -0.2 0.06 ą 50% perf-profile.children.cycles-pp.mas_walk
> 0.28 ą 6% -0.2 0.10 ą 24% perf-profile.children.cycles-pp.rmqueue_bulk
> 0.23 ą 15% -0.2 0.05 ą 46% perf-profile.children.cycles-pp.perf_event_mmap_event
> 0.25 ą 15% -0.2 0.08 ą 45% perf-profile.children.cycles-pp.___slab_alloc
> 0.20 ą 14% -0.2 0.03 ą100% perf-profile.children.cycles-pp.lookup_fast
> 0.20 ą 10% -0.2 0.04 ą 75% perf-profile.children.cycles-pp.memcg_slab_post_alloc_hook
> 0.28 ą 7% -0.2 0.12 ą 24% perf-profile.children.cycles-pp.prepare_task_switch
> 0.22 ą 11% -0.2 0.05 ą 8% perf-profile.children.cycles-pp.ttwu_queue_wakelist
> 0.63 ą 5% -0.2 0.47 ą 12% perf-profile.children.cycles-pp.llist_reverse_order
> 0.25 ą 11% -0.2 0.09 ą 34% perf-profile.children.cycles-pp.futex_q_lock
> 0.21 ą 6% -0.2 0.06 ą 47% perf-profile.children.cycles-pp.kmem_cache_alloc_node
> 0.18 ą 11% -0.2 0.03 ą100% perf-profile.children.cycles-pp.alloc_empty_file
> 0.19 ą 5% -0.2 0.04 ą 71% perf-profile.children.cycles-pp.__put_task_struct
> 0.19 ą 15% -0.2 0.03 ą 70% perf-profile.children.cycles-pp.asm_sysvec_call_function_single
> 0.24 ą 6% -0.2 0.09 ą 20% perf-profile.children.cycles-pp.___perf_sw_event
> 0.18 ą 7% -0.2 0.03 ą100% perf-profile.children.cycles-pp.perf_event_fork
> 0.19 ą 11% -0.1 0.04 ą 71% perf-profile.children.cycles-pp.select_idle_core
> 0.30 ą 11% -0.1 0.15 ą 7% perf-profile.children.cycles-pp.pte_alloc_one
> 0.25 ą 6% -0.1 0.11 ą 10% perf-profile.children.cycles-pp.set_next_entity
> 0.20 ą 10% -0.1 0.06 ą 49% perf-profile.children.cycles-pp.__perf_event_header__init_id
> 0.18 ą 15% -0.1 0.03 ą101% perf-profile.children.cycles-pp.__radix_tree_lookup
> 0.22 ą 11% -0.1 0.08 ą 21% perf-profile.children.cycles-pp.mas_spanning_rebalance
> 0.20 ą 9% -0.1 0.06 ą 9% perf-profile.children.cycles-pp.stress_pthread_func
> 0.18 ą 12% -0.1 0.04 ą 73% perf-profile.children.cycles-pp.__getpid
> 0.16 ą 13% -0.1 0.02 ą 99% perf-profile.children.cycles-pp.walk_component
> 0.28 ą 5% -0.1 0.15 ą 13% perf-profile.children.cycles-pp.update_curr
> 0.25 ą 5% -0.1 0.11 ą 22% perf-profile.children.cycles-pp.balance_fair
> 0.16 ą 9% -0.1 0.03 ą100% perf-profile.children.cycles-pp.futex_wake_mark
> 0.16 ą 12% -0.1 0.04 ą 71% perf-profile.children.cycles-pp.get_futex_key
> 0.17 ą 6% -0.1 0.05 ą 47% perf-profile.children.cycles-pp.memcg_account_kmem
> 0.25 ą 11% -0.1 0.12 ą 11% perf-profile.children.cycles-pp._find_next_bit
> 0.15 ą 13% -0.1 0.02 ą 99% perf-profile.children.cycles-pp.do_open
> 0.20 ą 8% -0.1 0.08 ą 16% perf-profile.children.cycles-pp.mas_rebalance
> 0.17 ą 13% -0.1 0.05 ą 45% perf-profile.children.cycles-pp.__memcg_kmem_charge_page
> 0.33 ą 6% -0.1 0.21 ą 10% perf-profile.children.cycles-pp.select_idle_sibling
> 0.14 ą 11% -0.1 0.03 ą100% perf-profile.children.cycles-pp.get_user_pages_fast
> 0.18 ą 7% -0.1 0.07 ą 14% perf-profile.children.cycles-pp.mas_alloc_nodes
> 0.14 ą 11% -0.1 0.03 ą101% perf-profile.children.cycles-pp.set_task_cpu
> 0.14 ą 12% -0.1 0.03 ą101% perf-profile.children.cycles-pp.vm_unmapped_area
> 0.38 ą 6% -0.1 0.27 ą 7% perf-profile.children.cycles-pp.native_sched_clock
> 0.16 ą 10% -0.1 0.05 ą 47% perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown
> 0.36 ą 9% -0.1 0.25 ą 12% perf-profile.children.cycles-pp.mmap_region
> 0.23 ą 7% -0.1 0.12 ą 9% perf-profile.children.cycles-pp.available_idle_cpu
> 0.13 ą 11% -0.1 0.02 ą 99% perf-profile.children.cycles-pp.internal_get_user_pages_fast
> 0.16 ą 10% -0.1 0.06 ą 18% perf-profile.children.cycles-pp.get_unmapped_area
> 0.50 ą 7% -0.1 0.40 ą 6% perf-profile.children.cycles-pp.menu_select
> 0.24 ą 9% -0.1 0.14 ą 13% perf-profile.children.cycles-pp.rmqueue
> 0.17 ą 14% -0.1 0.07 ą 26% perf-profile.children.cycles-pp.perf_event_comm
> 0.17 ą 15% -0.1 0.07 ą 23% perf-profile.children.cycles-pp.perf_event_comm_event
> 0.17 ą 11% -0.1 0.07 ą 14% perf-profile.children.cycles-pp.pick_next_entity
> 0.13 ą 14% -0.1 0.03 ą102% perf-profile.children.cycles-pp.perf_output_begin
> 0.23 ą 6% -0.1 0.13 ą 21% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
> 0.14 ą 18% -0.1 0.04 ą 72% perf-profile.children.cycles-pp.perf_event_comm_output
> 0.21 ą 9% -0.1 0.12 ą 9% perf-profile.children.cycles-pp.update_rq_clock
> 0.16 ą 8% -0.1 0.06 ą 19% perf-profile.children.cycles-pp.mas_split
> 0.13 ą 14% -0.1 0.04 ą 71% perf-profile.children.cycles-pp.raw_spin_rq_lock_nested
> 0.13 ą 6% -0.1 0.04 ą 71% perf-profile.children.cycles-pp.syscall_return_via_sysret
> 0.13 ą 7% -0.1 0.04 ą 72% perf-profile.children.cycles-pp.mas_topiary_replace
> 0.14 ą 8% -0.1 0.06 ą 9% perf-profile.children.cycles-pp.mas_preallocate
> 0.16 ą 11% -0.1 0.07 ą 18% perf-profile.children.cycles-pp.__pick_eevdf
> 0.11 ą 14% -0.1 0.02 ą 99% perf-profile.children.cycles-pp.mas_empty_area_rev
> 0.25 ą 7% -0.1 0.17 ą 10% perf-profile.children.cycles-pp.select_idle_cpu
> 0.14 ą 12% -0.1 0.06 ą 14% perf-profile.children.cycles-pp.cpu_stopper_thread
> 0.14 ą 10% -0.1 0.06 ą 13% perf-profile.children.cycles-pp.active_load_balance_cpu_stop
> 0.14 ą 14% -0.1 0.06 ą 11% perf-profile.children.cycles-pp.os_xsave
> 0.18 ą 6% -0.1 0.11 ą 14% perf-profile.children.cycles-pp.idle_cpu
> 0.17 ą 4% -0.1 0.10 ą 15% perf-profile.children.cycles-pp.hrtimer_start_range_ns
> 0.11 ą 14% -0.1 0.03 ą100% perf-profile.children.cycles-pp.__pthread_mutex_lock
> 0.32 ą 5% -0.1 0.25 ą 5% perf-profile.children.cycles-pp.sched_clock
> 0.11 ą 6% -0.1 0.03 ą 70% perf-profile.children.cycles-pp.wakeup_preempt
> 0.23 ą 7% -0.1 0.16 ą 13% perf-profile.children.cycles-pp.update_rq_clock_task
> 0.13 ą 8% -0.1 0.06 ą 16% perf-profile.children.cycles-pp.local_clock_noinstr
> 0.11 ą 10% -0.1 0.04 ą 71% perf-profile.children.cycles-pp.kmem_cache_alloc_bulk
> 0.34 ą 4% -0.1 0.27 ą 6% perf-profile.children.cycles-pp.sched_clock_cpu
> 0.11 ą 9% -0.1 0.04 ą 76% perf-profile.children.cycles-pp.avg_vruntime
> 0.15 ą 8% -0.1 0.08 ą 14% perf-profile.children.cycles-pp.update_cfs_group
> 0.10 ą 8% -0.1 0.04 ą 71% perf-profile.children.cycles-pp.__kmem_cache_alloc_bulk
> 0.13 ą 8% -0.1 0.06 ą 11% perf-profile.children.cycles-pp.sched_use_asym_prio
> 0.09 ą 12% -0.1 0.02 ą 99% perf-profile.children.cycles-pp.getname_flags
> 0.18 ą 9% -0.1 0.12 ą 12% perf-profile.children.cycles-pp.__update_load_avg_se
> 0.11 ą 8% -0.1 0.05 ą 46% perf-profile.children.cycles-pp.place_entity
> 0.08 ą 12% -0.0 0.02 ą 99% perf-profile.children.cycles-pp.folio_add_lru_vma
> 0.10 ą 7% -0.0 0.05 ą 46% perf-profile.children.cycles-pp._find_next_and_bit
> 0.10 ą 6% -0.0 0.06 ą 24% perf-profile.children.cycles-pp.reweight_entity
> 0.03 ą 70% +0.0 0.08 ą 14% perf-profile.children.cycles-pp.perf_rotate_context
> 0.19 ą 10% +0.1 0.25 ą 7% perf-profile.children.cycles-pp.irqtime_account_irq
> 0.08 ą 11% +0.1 0.14 ą 21% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
> 0.00 +0.1 0.06 ą 14% perf-profile.children.cycles-pp.rcu_pending
> 0.10 ą 17% +0.1 0.16 ą 13% perf-profile.children.cycles-pp.rebalance_domains
> 0.14 ą 16% +0.1 0.21 ą 12% perf-profile.children.cycles-pp.downgrade_write
> 0.14 ą 14% +0.1 0.21 ą 10% perf-profile.children.cycles-pp.down_read_killable
> 0.00 +0.1 0.07 ą 11% perf-profile.children.cycles-pp.free_tail_page_prepare
> 0.02 ą141% +0.1 0.09 ą 20% perf-profile.children.cycles-pp.rcu_sched_clock_irq
> 0.01 ą223% +0.1 0.08 ą 25% perf-profile.children.cycles-pp.arch_scale_freq_tick
> 0.55 ą 9% +0.1 0.62 ą 9% perf-profile.children.cycles-pp.__alloc_pages
> 0.34 ą 5% +0.1 0.41 ą 9% perf-profile.children.cycles-pp.clock_nanosleep
> 0.00 +0.1 0.08 ą 23% perf-profile.children.cycles-pp.tick_nohz_next_event
> 0.70 ą 2% +0.1 0.78 ą 5% perf-profile.children.cycles-pp.flush_tlb_func
> 0.14 ą 10% +0.1 0.23 ą 13% perf-profile.children.cycles-pp.__intel_pmu_enable_all
> 0.07 ą 19% +0.1 0.17 ą 17% perf-profile.children.cycles-pp.cgroup_rstat_updated
> 0.04 ą 71% +0.1 0.14 ą 11% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
> 0.25 ą 9% +0.1 0.38 ą 11% perf-profile.children.cycles-pp.down_read
> 0.43 ą 9% +0.1 0.56 ą 10% perf-profile.children.cycles-pp.get_page_from_freelist
> 0.00 +0.1 0.15 ą 6% perf-profile.children.cycles-pp.vm_normal_page
> 0.31 ą 7% +0.2 0.46 ą 9% perf-profile.children.cycles-pp.native_flush_tlb_local
> 0.00 +0.2 0.16 ą 8% perf-profile.children.cycles-pp.__tlb_remove_page_size
> 0.28 ą 11% +0.2 0.46 ą 13% perf-profile.children.cycles-pp.vma_alloc_folio
> 0.00 +0.2 0.24 ą 5% perf-profile.children.cycles-pp._compound_head
> 0.07 ą 16% +0.2 0.31 ą 6% perf-profile.children.cycles-pp.__mod_node_page_state
> 0.38 ą 5% +0.2 0.62 ą 7% perf-profile.children.cycles-pp.perf_adjust_freq_unthr_context
> 0.22 ą 12% +0.2 0.47 ą 10% perf-profile.children.cycles-pp.schedule_preempt_disabled
> 0.38 ą 5% +0.3 0.64 ą 7% perf-profile.children.cycles-pp.perf_event_task_tick
> 0.00 +0.3 0.27 ą 5% perf-profile.children.cycles-pp.free_swap_cache
> 0.30 ą 10% +0.3 0.58 ą 10% perf-profile.children.cycles-pp.rwsem_down_read_slowpath
> 0.00 +0.3 0.30 ą 4% perf-profile.children.cycles-pp.free_pages_and_swap_cache
> 0.09 ą 10% +0.3 0.42 ą 7% perf-profile.children.cycles-pp.__mod_lruvec_state
> 0.00 +0.3 0.34 ą 9% perf-profile.children.cycles-pp.deferred_split_folio
> 0.00 +0.4 0.36 ą 13% perf-profile.children.cycles-pp.prep_compound_page
> 0.09 ą 10% +0.4 0.50 ą 9% perf-profile.children.cycles-pp.free_unref_page_prepare
> 0.00 +0.4 0.42 ą 11% perf-profile.children.cycles-pp.do_huge_pmd_anonymous_page
> 1.67 ą 3% +0.4 2.12 ą 8% perf-profile.children.cycles-pp.__hrtimer_run_queues
> 0.63 ą 3% +0.5 1.11 ą 12% perf-profile.children.cycles-pp.scheduler_tick
> 1.93 ą 3% +0.5 2.46 ą 8% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
> 1.92 ą 3% +0.5 2.45 ą 8% perf-profile.children.cycles-pp.hrtimer_interrupt
> 0.73 ą 3% +0.6 1.31 ą 11% perf-profile.children.cycles-pp.update_process_times
> 0.74 ą 3% +0.6 1.34 ą 11% perf-profile.children.cycles-pp.tick_sched_handle
> 0.20 ą 8% +0.6 0.83 ą 18% perf-profile.children.cycles-pp.__cond_resched
> 0.78 ą 4% +0.6 1.43 ą 12% perf-profile.children.cycles-pp.tick_nohz_highres_handler
> 0.12 ą 7% +0.7 0.81 ą 5% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
> 0.28 ą 7% +0.9 1.23 ą 4% perf-profile.children.cycles-pp.release_pages
> 0.00 +1.0 1.01 ą 6% perf-profile.children.cycles-pp.pmdp_invalidate
> 0.35 ą 6% +1.2 1.56 ą 5% perf-profile.children.cycles-pp.__mod_lruvec_page_state
> 0.30 ą 8% +1.2 1.53 ą 4% perf-profile.children.cycles-pp.tlb_batch_pages_flush
> 0.00 +1.3 1.26 ą 4% perf-profile.children.cycles-pp.page_add_anon_rmap
> 0.09 ą 11% +3.1 3.20 ą 5% perf-profile.children.cycles-pp.page_remove_rmap
> 1.60 ą 2% +3.4 5.04 ą 4% perf-profile.children.cycles-pp.zap_pte_range
> 0.03 ą100% +3.5 3.55 ą 5% perf-profile.children.cycles-pp.__split_huge_pmd_locked
> 41.36 +11.6 52.92 ą 2% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> 41.22 +11.7 52.88 ą 2% perf-profile.children.cycles-pp.do_syscall_64
> 6.42 ą 6% +13.5 19.88 ą 7% perf-profile.children.cycles-pp.__clone
> 0.82 ą 6% +16.2 16.98 ą 7% perf-profile.children.cycles-pp.clear_page_erms
> 2.62 ą 5% +16.4 19.04 ą 7% perf-profile.children.cycles-pp.asm_exc_page_fault
> 2.18 ą 5% +16.8 18.94 ą 7% perf-profile.children.cycles-pp.exc_page_fault
> 2.06 ą 6% +16.8 18.90 ą 7% perf-profile.children.cycles-pp.do_user_addr_fault
> 1.60 ą 8% +17.0 18.60 ą 7% perf-profile.children.cycles-pp.handle_mm_fault
> 1.52 ą 7% +17.1 18.58 ą 7% perf-profile.children.cycles-pp.__handle_mm_fault
> 0.30 ą 7% +17.4 17.72 ą 7% perf-profile.children.cycles-pp.clear_huge_page
> 0.31 ą 8% +17.6 17.90 ą 7% perf-profile.children.cycles-pp.__do_huge_pmd_anonymous_page
> 11.66 ą 3% +22.2 33.89 ą 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> 3.29 ą 3% +30.2 33.46 perf-profile.children.cycles-pp._raw_spin_lock
> 0.04 ą 71% +36.2 36.21 ą 2% perf-profile.children.cycles-pp.__split_huge_pmd
> 8.00 ą 4% +36.5 44.54 ą 2% perf-profile.children.cycles-pp.__madvise
> 7.87 ą 4% +36.6 44.44 ą 2% perf-profile.children.cycles-pp.__x64_sys_madvise
> 7.86 ą 4% +36.6 44.44 ą 2% perf-profile.children.cycles-pp.do_madvise
> 7.32 ą 4% +36.8 44.07 ą 2% perf-profile.children.cycles-pp.madvise_vma_behavior
> 7.26 ą 4% +36.8 44.06 ą 2% perf-profile.children.cycles-pp.zap_page_range_single
> 1.78 +39.5 41.30 ą 2% perf-profile.children.cycles-pp.unmap_page_range
> 1.72 +39.6 41.28 ą 2% perf-profile.children.cycles-pp.zap_pmd_range
> 24.76 ą 2% -8.5 16.31 ą 2% perf-profile.self.cycles-pp.intel_idle
> 11.46 ą 2% -7.8 3.65 ą 5% perf-profile.self.cycles-pp.intel_idle_irq
> 3.16 ą 7% -2.1 1.04 ą 6% perf-profile.self.cycles-pp.smp_call_function_many_cond
> 1.49 ą 4% -1.2 0.30 ą 12% perf-profile.self.cycles-pp.poll_idle
> 1.15 ą 3% -0.6 0.50 ą 9% perf-profile.self.cycles-pp._raw_spin_lock
> 0.60 ą 6% -0.6 0.03 ą100% perf-profile.self.cycles-pp.queued_write_lock_slowpath
> 0.69 ą 4% -0.5 0.22 ą 20% perf-profile.self.cycles-pp.memcpy_orig
> 0.66 ą 7% -0.5 0.18 ą 11% perf-profile.self.cycles-pp.update_sg_wakeup_stats
> 0.59 ą 4% -0.5 0.13 ą 8% perf-profile.self.cycles-pp._raw_spin_lock_irq
> 0.86 ą 3% -0.4 0.43 ą 12% perf-profile.self.cycles-pp.update_sg_lb_stats
> 0.56 -0.4 0.16 ą 7% perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
> 0.48 ą 3% -0.4 0.12 ą 10% perf-profile.self.cycles-pp.__slab_free
> 1.18 ą 2% -0.4 0.82 ą 3% perf-profile.self.cycles-pp.llist_add_batch
> 0.54 ą 5% -0.3 0.19 ą 6% perf-profile.self.cycles-pp.__schedule
> 0.47 ą 7% -0.3 0.18 ą 13% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> 0.34 ą 5% -0.2 0.09 ą 18% perf-profile.self.cycles-pp.kmem_cache_free
> 0.43 ą 4% -0.2 0.18 ą 11% perf-profile.self.cycles-pp.update_load_avg
> 0.35 ą 4% -0.2 0.10 ą 23% perf-profile.self.cycles-pp.rcu_cblist_dequeue
> 0.38 ą 9% -0.2 0.15 ą 10% perf-profile.self.cycles-pp.__switch_to_asm
> 0.33 ą 5% -0.2 0.10 ą 16% perf-profile.self.cycles-pp.__task_pid_nr_ns
> 0.36 ą 6% -0.2 0.13 ą 14% perf-profile.self.cycles-pp.switch_mm_irqs_off
> 0.31 ą 6% -0.2 0.09 ą 6% perf-profile.self.cycles-pp.__free_one_page
> 0.28 ą 5% -0.2 0.06 ą 50% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> 0.27 ą 13% -0.2 0.06 ą 23% perf-profile.self.cycles-pp.pthread_create@@GLIBC_2.2.5
> 0.30 ą 7% -0.2 0.10 ą 19% perf-profile.self.cycles-pp.__switch_to
> 0.27 ą 4% -0.2 0.10 ą 17% perf-profile.self.cycles-pp.finish_task_switch
> 0.23 ą 7% -0.2 0.06 ą 50% perf-profile.self.cycles-pp.mas_walk
> 0.22 ą 9% -0.2 0.05 ą 48% perf-profile.self.cycles-pp.__clone
> 0.63 ą 5% -0.2 0.46 ą 12% perf-profile.self.cycles-pp.llist_reverse_order
> 0.20 ą 4% -0.2 0.04 ą 72% perf-profile.self.cycles-pp.entry_SYSCALL_64
> 0.24 ą 10% -0.1 0.09 ą 19% perf-profile.self.cycles-pp.rmqueue_bulk
> 0.18 ą 13% -0.1 0.03 ą101% perf-profile.self.cycles-pp.__radix_tree_lookup
> 0.18 ą 11% -0.1 0.04 ą 71% perf-profile.self.cycles-pp.stress_pthread_func
> 0.36 ą 8% -0.1 0.22 ą 11% perf-profile.self.cycles-pp.menu_select
> 0.22 ą 4% -0.1 0.08 ą 19% perf-profile.self.cycles-pp.___perf_sw_event
> 0.20 ą 13% -0.1 0.07 ą 20% perf-profile.self.cycles-pp.start_thread
> 0.16 ą 13% -0.1 0.03 ą101% perf-profile.self.cycles-pp.alloc_vmap_area
> 0.17 ą 10% -0.1 0.04 ą 73% perf-profile.self.cycles-pp.kmem_cache_alloc
> 0.14 ą 9% -0.1 0.03 ą100% perf-profile.self.cycles-pp.futex_wake
> 0.17 ą 4% -0.1 0.06 ą 11% perf-profile.self.cycles-pp.dequeue_task_fair
> 0.23 ą 6% -0.1 0.12 ą 11% perf-profile.self.cycles-pp.available_idle_cpu
> 0.22 ą 13% -0.1 0.11 ą 12% perf-profile.self.cycles-pp._find_next_bit
> 0.21 ą 7% -0.1 0.10 ą 6% perf-profile.self.cycles-pp.__rmqueue_pcplist
> 0.37 ą 7% -0.1 0.26 ą 8% perf-profile.self.cycles-pp.native_sched_clock
> 0.22 ą 7% -0.1 0.12 ą 21% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
> 0.19 ą 7% -0.1 0.10 ą 11% perf-profile.self.cycles-pp.enqueue_entity
> 0.15 ą 5% -0.1 0.06 ą 45% perf-profile.self.cycles-pp.enqueue_task_fair
> 0.15 ą 11% -0.1 0.06 ą 17% perf-profile.self.cycles-pp.__pick_eevdf
> 0.13 ą 13% -0.1 0.05 ą 72% perf-profile.self.cycles-pp.prepare_task_switch
> 0.17 ą 10% -0.1 0.08 ą 8% perf-profile.self.cycles-pp.update_rq_clock_task
> 0.54 ą 4% -0.1 0.46 ą 6% perf-profile.self.cycles-pp.__flush_smp_call_function_queue
> 0.14 ą 14% -0.1 0.06 ą 11% perf-profile.self.cycles-pp.os_xsave
> 0.11 ą 10% -0.1 0.03 ą 70% perf-profile.self.cycles-pp.try_to_wake_up
> 0.10 ą 8% -0.1 0.03 ą100% perf-profile.self.cycles-pp.futex_wait
> 0.14 ą 9% -0.1 0.07 ą 10% perf-profile.self.cycles-pp.update_curr
> 0.18 ą 9% -0.1 0.11 ą 14% perf-profile.self.cycles-pp.idle_cpu
> 0.11 ą 11% -0.1 0.04 ą 76% perf-profile.self.cycles-pp.avg_vruntime
> 0.15 ą 10% -0.1 0.08 ą 14% perf-profile.self.cycles-pp.update_cfs_group
> 0.09 ą 9% -0.1 0.03 ą100% perf-profile.self.cycles-pp.reweight_entity
> 0.12 ą 13% -0.1 0.06 ą 8% perf-profile.self.cycles-pp.do_idle
> 0.18 ą 10% -0.1 0.12 ą 13% perf-profile.self.cycles-pp.__update_load_avg_se
> 0.09 ą 17% -0.1 0.04 ą 71% perf-profile.self.cycles-pp.cpuidle_idle_call
> 0.10 ą 11% -0.0 0.06 ą 45% perf-profile.self.cycles-pp.update_rq_clock
> 0.12 ą 15% -0.0 0.07 ą 16% perf-profile.self.cycles-pp.update_sd_lb_stats
> 0.09 ą 5% -0.0 0.05 ą 46% perf-profile.self.cycles-pp._find_next_and_bit
> 0.01 ą223% +0.1 0.08 ą 25% perf-profile.self.cycles-pp.arch_scale_freq_tick
> 0.78 ą 4% +0.1 0.87 ą 4% perf-profile.self.cycles-pp.default_send_IPI_mask_sequence_phys
> 0.14 ą 10% +0.1 0.23 ą 13% perf-profile.self.cycles-pp.__intel_pmu_enable_all
> 0.06 ą 46% +0.1 0.15 ą 19% perf-profile.self.cycles-pp.cgroup_rstat_updated
> 0.19 ą 3% +0.1 0.29 ą 4% perf-profile.self.cycles-pp.cpuidle_enter_state
> 0.00 +0.1 0.10 ą 11% perf-profile.self.cycles-pp.__mod_lruvec_state
> 0.00 +0.1 0.11 ą 18% perf-profile.self.cycles-pp.__tlb_remove_page_size
> 0.00 +0.1 0.12 ą 9% perf-profile.self.cycles-pp.vm_normal_page
> 0.23 ą 7% +0.1 0.36 ą 8% perf-profile.self.cycles-pp.perf_adjust_freq_unthr_context
> 0.20 ą 8% +0.2 0.35 ą 7% perf-profile.self.cycles-pp.__mod_lruvec_page_state
> 1.12 ą 2% +0.2 1.28 ą 4% perf-profile.self.cycles-pp.zap_pte_range
> 0.31 ą 8% +0.2 0.46 ą 9% perf-profile.self.cycles-pp.native_flush_tlb_local
> 0.00 +0.2 0.16 ą 5% perf-profile.self.cycles-pp._compound_head
> 0.06 ą 17% +0.2 0.26 ą 4% perf-profile.self.cycles-pp.__mod_node_page_state
> 0.00 +0.2 0.24 ą 6% perf-profile.self.cycles-pp.free_swap_cache
> 0.00 +0.3 0.27 ą 15% perf-profile.self.cycles-pp.clear_huge_page
> 0.00 +0.3 0.27 ą 11% perf-profile.self.cycles-pp.deferred_split_folio
> 0.00 +0.4 0.36 ą 13% perf-profile.self.cycles-pp.prep_compound_page
> 0.05 ą 47% +0.4 0.43 ą 9% perf-profile.self.cycles-pp.free_unref_page_prepare
> 0.08 ą 7% +0.5 0.57 ą 23% perf-profile.self.cycles-pp.__cond_resched
> 0.08 ą 12% +0.5 0.58 ą 5% perf-profile.self.cycles-pp.release_pages
> 0.10 ą 10% +0.5 0.63 ą 6% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
> 0.00 +1.1 1.11 ą 7% perf-profile.self.cycles-pp.__split_huge_pmd_locked
> 0.00 +1.2 1.18 ą 4% perf-profile.self.cycles-pp.page_add_anon_rmap
> 0.03 ą101% +1.3 1.35 ą 7% perf-profile.self.cycles-pp.page_remove_rmap
> 0.82 ą 5% +16.1 16.88 ą 7% perf-profile.self.cycles-pp.clear_page_erms
> 11.65 ą 3% +20.2 31.88 ą 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
>
>
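Looking at the profile above, nearly all of the added cycles sit on two paths:
clear_huge_page() on the fault side, and __madvise -> zap_page_range_single ->
__split_huge_pmd -> _raw_spin_lock on the teardown side. That is the pattern
you get when a THP-backed anonymous range is faulted in and then partially
zapped. A minimal userspace sketch of just that pattern (the 8M/8K sizes are
illustrative, not taken from stress-ng; assumes THP is enabled for anonymous
memory on the test kernel):

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 8UL << 20;	/* 8MB anon mapping, THP-eligible */
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	memset(p, 1, len);	/* fault in, possibly as 2MB huge pages */

	/*
	 * Zap an 8K hole inside what may be a 2MB huge page: the kernel
	 * has to split the PMD first, which is the __split_huge_pmd ->
	 * _raw_spin_lock path dominating the profile above.
	 */
	if (madvise(p + 4096, 8192, MADV_DONTNEED))
		perror("madvise");

	munmap(p, len);
	return 0;
}

Running something like this in a loop under perf may reproduce the
__split_huge_pmd and native_queued_spin_lock_slowpath samples seen above.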
> ***************************************************************************************************
> lkp-spr-2sp4: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
> =========================================================================================
> array_size/compiler/cpufreq_governor/iterations/kconfig/loop/nr_threads/omp/rootfs/tbox_group/testcase:
> 50000000/gcc-12/performance/10x/x86_64-rhel-8.3/100/25%/true/debian-11.1-x86_64-20220510.cgz/lkp-spr-2sp4/stream
>
> commit:
> 30749e6fbb ("mm/memory: replace kmap() with kmap_local_page()")
> 1111d46b5c ("mm: align larger anonymous mappings on THP boundaries")
>
> 30749e6fbb3d391a 1111d46b5cbad57486e7a3fab75
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 10.50 ą 14% +55.6% 16.33 ą 16% perf-c2c.DRAM.local
> 6724 -11.4% 5954 ą 2% vmstat.system.cs
> 2.746e+09 +16.7% 3.205e+09 ą 2% cpuidle..time
> 2771516 +16.0% 3213723 ą 2% cpuidle..usage
> 0.06 ą 4% -0.0 0.05 ą 5% mpstat.cpu.all.soft%
> 0.47 ą 2% -0.1 0.39 ą 2% mpstat.cpu.all.sys%
> 0.01 ą 85% +1700.0% 0.20 ą188% perf-sched.sch_delay.avg.ms.syslog_print.do_syslog.kmsg_read.vfs_read
> 15.11 ą 13% -28.8% 10.76 ą 34% perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 15.09 ą 13% -30.3% 10.51 ą 38% perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 1023952 +13.4% 1161219 meminfo.AnonHugePages
> 1319741 +10.8% 1461995 meminfo.AnonPages
> 1331039 +11.2% 1480149 meminfo.Inactive
> 1330865 +11.2% 1479975 meminfo.Inactive(anon)
> 1266202 +16.0% 1469399 ą 2% turbostat.C1E
> 1509871 +16.6% 1760853 ą 2% turbostat.C6
> 3521203 +17.4% 4134075 ą 3% turbostat.IRQ
> 580.32 -3.8% 558.30 turbostat.PkgWatt
> 77.42 -14.0% 66.60 ą 2% turbostat.RAMWatt
> 330416 +10.8% 366020 proc-vmstat.nr_anon_pages
> 500.90 +13.4% 567.99 proc-vmstat.nr_anon_transparent_hugepages
> 333197 +11.2% 370536 proc-vmstat.nr_inactive_anon
> 333197 +11.2% 370536 proc-vmstat.nr_zone_inactive_anon
> 129879 ą 11% -46.7% 69207 ą 12% proc-vmstat.numa_pages_migrated
> 3879028 +5.9% 4109180 proc-vmstat.pgalloc_normal
> 3403414 +6.6% 3628929 proc-vmstat.pgfree
> 129879 ą 11% -46.7% 69207 ą 12% proc-vmstat.pgmigrate_success
> 5763 +9.8% 6327 proc-vmstat.thp_fault_alloc
> 350993 -15.6% 296081 ą 2% stream.add_bandwidth_MBps
> 349830 -16.1% 293492 ą 2% stream.add_bandwidth_MBps_harmonicMean
> 333973 -20.5% 265439 ą 3% stream.copy_bandwidth_MBps
> 332930 -21.7% 260548 ą 3% stream.copy_bandwidth_MBps_harmonicMean
> 302788 -16.2% 253817 ą 2% stream.scale_bandwidth_MBps
> 302157 -17.1% 250577 ą 2% stream.scale_bandwidth_MBps_harmonicMean
> 1177276 +9.3% 1286614 stream.time.maximum_resident_set_size
> 5038 +1.1% 5095 stream.time.percent_of_cpu_this_job_got
> 694.19 ą 2% +19.5% 829.85 ą 2% stream.time.user_time
> 339047 -12.1% 298061 stream.triad_bandwidth_MBps
> 338186 -12.4% 296218 stream.triad_bandwidth_MBps_harmonicMean
> 8.42 ą100% -8.4 0.00 perf-profile.calltrace.cycles-pp.asm_sysvec_reschedule_ipi
> 8.42 ą100% -8.4 0.00 perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
> 8.42 ą100% -8.4 0.00 perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
> 8.42 ą100% -8.4 0.00 perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
> 8.42 ą100% -8.4 0.00 perf-profile.calltrace.cycles-pp.arch_do_signal_or_restart.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
> 8.42 ą100% -8.4 0.00 perf-profile.calltrace.cycles-pp.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode
> 0.84 ą103% +1.7 2.57 ą 59% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> 0.84 ą103% +1.7 2.57 ą 59% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
> 0.31 ą223% +2.0 2.33 ą 44% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
> 0.31 ą223% +2.0 2.33 ą 44% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
> 3.07 ą 56% +2.8 5.88 ą 28% perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 8.42 ą100% -8.4 0.00 perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi
> 8.42 ą100% -8.1 0.36 ą223% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
> 12.32 ą 25% -6.6 5.69 ą 69% perf-profile.children.cycles-pp.vsnprintf
> 12.76 ą 27% -6.6 6.19 ą 67% perf-profile.children.cycles-pp.seq_printf
> 3.07 ą 56% +2.8 5.88 ą 28% perf-profile.children.cycles-pp.__x64_sys_exit_group
> 40.11 -11.0% 35.71 ą 2% perf-stat.i.MPKI
> 1.563e+10 -12.3% 1.371e+10 ą 2% perf-stat.i.branch-instructions
> 3.721e+09 ą 2% -23.2% 2.858e+09 ą 4% perf-stat.i.cache-misses
> 4.471e+09 ą 3% -22.7% 3.458e+09 ą 4% perf-stat.i.cache-references
> 5970 ą 5% -15.9% 5021 ą 4% perf-stat.i.context-switches
> 1.66 ą 2% +15.8% 1.92 ą 2% perf-stat.i.cpi
> 41.83 ą 4% +30.6% 54.63 ą 4% perf-stat.i.cycles-between-cache-misses
> 2.282e+10 ą 2% -14.5% 1.952e+10 ą 2% perf-stat.i.dTLB-loads
> 572602 ą 3% -9.2% 519922 ą 5% perf-stat.i.dTLB-store-misses
> 1.483e+10 ą 2% -15.7% 1.25e+10 ą 2% perf-stat.i.dTLB-stores
> 9.179e+10 -13.7% 7.924e+10 ą 2% perf-stat.i.instructions
> 0.61 -13.4% 0.52 ą 2% perf-stat.i.ipc
> 373.79 ą 4% -37.8% 232.60 ą 9% perf-stat.i.metric.K/sec
> 251.45 -13.4% 217.72 ą 2% perf-stat.i.metric.M/sec
> 21446 ą 3% -24.1% 16278 ą 8% perf-stat.i.minor-faults
> 15.07 ą 5% -6.0 9.10 ą 10% perf-stat.i.node-load-miss-rate%
> 68275790 ą 5% -44.9% 37626128 ą 12% perf-stat.i.node-load-misses
> 21448 ą 3% -24.1% 16281 ą 8% perf-stat.i.page-faults
> 40.71 -11.3% 36.10 ą 2% perf-stat.overall.MPKI
> 1.67 +15.3% 1.93 ą 2% perf-stat.overall.cpi
> 41.07 ą 3% +30.1% 53.42 ą 4% perf-stat.overall.cycles-between-cache-misses
> 0.00 ą 2% +0.0 0.00 ą 2% perf-stat.overall.dTLB-store-miss-rate%
> 0.60 -13.2% 0.52 ą 2% perf-stat.overall.ipc
> 15.19 ą 5% -6.2 9.03 ą 11% perf-stat.overall.node-load-miss-rate%
> 1.4e+10 -9.3% 1.269e+10 perf-stat.ps.branch-instructions
> 3.352e+09 ą 3% -20.9% 2.652e+09 ą 4% perf-stat.ps.cache-misses
> 4.026e+09 ą 3% -20.3% 3.208e+09 ą 4% perf-stat.ps.cache-references
> 4888 ą 4% -10.8% 4362 ą 3% perf-stat.ps.context-switches
> 206092 +2.1% 210375 perf-stat.ps.cpu-clock
> 1.375e+11 +2.8% 1.414e+11 perf-stat.ps.cpu-cycles
> 258.23 ą 5% +8.8% 280.85 ą 4% perf-stat.ps.cpu-migrations
> 2.048e+10 -11.7% 1.809e+10 ą 2% perf-stat.ps.dTLB-loads
> 1.333e+10 ą 2% -13.0% 1.16e+10 ą 2% perf-stat.ps.dTLB-stores
> 8.231e+10 -10.8% 7.342e+10 perf-stat.ps.instructions
> 15755 ą 3% -16.3% 13187 ą 6% perf-stat.ps.minor-faults
> 61706790 ą 6% -43.8% 34699716 ą 11% perf-stat.ps.node-load-misses
> 15757 ą 3% -16.3% 13189 ą 6% perf-stat.ps.page-faults
> 206092 +2.1% 210375 perf-stat.ps.task-clock
> 1.217e+12 +4.1% 1.267e+12 ą 2% perf-stat.total.instructions
>
>
>
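For the stream case, the report shows more THP coverage (thp_fault_alloc
+9.8%, AnonHugePages +13.4%) alongside the bandwidth drop and the -44%
node-load-misses, so a useful first step is confirming how much of the
benchmark's anonymous memory actually ends up THP-backed. A small sketch that
prints the AnonHugePages rollup for the current process (assumes a kernel with
/proc/<pid>/smaps_rollup, i.e. v4.14 or later):

#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/self/smaps_rollup", "r");
	char line[256];

	if (!f) {
		perror("fopen");
		return 1;
	}
	/* AnonHugePages: how much of our anon memory is THP-backed */
	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, "AnonHugePages:", 14))
			fputs(line, stdout);
	fclose(f);
	return 0;
}

Comparing that number between the two commits for the stream process would
show whether the extra THPs land on the hot arrays or somewhere else.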
> ***************************************************************************************************
> lkp-cfl-d1: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 16G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/option_a/option_b/rootfs/tbox_group/test/testcase:
> gcc-12/performance/x86_64-rhel-8.3/Average/Integer/debian-x86_64-phoronix/lkp-cfl-d1/ramspeed-1.4.3/phoronix-test-suite
>
> commit:
> 30749e6fbb ("mm/memory: replace kmap() with kmap_local_page()")
> 1111d46b5c ("mm: align larger anonymous mappings on THP boundaries")
>
> 30749e6fbb3d391a 1111d46b5cbad57486e7a3fab75
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 232.12 ą 7% -12.0% 204.18 ą 8% sched_debug.cfs_rq:/.load_avg.stddev
> 6797 -3.3% 6576 vmstat.system.cs
> 15161 -0.9% 15029 vmstat.system.in
> 349927 +44.3% 504820 meminfo.AnonHugePages
> 507807 +27.1% 645169 meminfo.AnonPages
> 1499332 +10.2% 1652612 meminfo.Inactive(anon)
> 8.67 ą 62% +184.6% 24.67 ą 25% turbostat.C10
> 1.50 -0.1 1.45 turbostat.C1E%
> 3.30 -3.2% 3.20 turbostat.RAMWatt
> 1.40 ą 14% -0.3 1.09 ą 13% perf-profile.calltrace.cycles-pp.asm_exc_page_fault
> 1.44 ą 12% -0.3 1.12 ą 13% perf-profile.children.cycles-pp.asm_exc_page_fault
> 0.03 ą141% +0.1 0.10 ą 30% perf-profile.children.cycles-pp.next_uptodate_folio
> 0.02 ą141% +0.1 0.10 ą 22% perf-profile.children.cycles-pp.rcu_nocb_flush_deferred_wakeup
> 0.02 ą143% +0.1 0.10 ą 25% perf-profile.self.cycles-pp.next_uptodate_folio
> 0.01 ą223% +0.1 0.09 ą 19% perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
> 19806 -3.5% 19109 phoronix-test-suite.ramspeed.Average.Integer.mb_s
> 283.70 +3.8% 294.50 phoronix-test-suite.time.elapsed_time
> 283.70 +3.8% 294.50 phoronix-test-suite.time.elapsed_time.max
> 120454 +1.6% 122334 phoronix-test-suite.time.maximum_resident_set_size
> 281337 -54.8% 127194 phoronix-test-suite.time.minor_page_faults
> 259.13 +4.1% 269.81 phoronix-test-suite.time.user_time
> 126951 +27.0% 161291 proc-vmstat.nr_anon_pages
> 170.86 +44.3% 246.49 proc-vmstat.nr_anon_transparent_hugepages
> 355917 -1.0% 352250 proc-vmstat.nr_dirty_background_threshold
> 712705 -1.0% 705362 proc-vmstat.nr_dirty_threshold
> 3265201 -1.1% 3228465 proc-vmstat.nr_free_pages
> 374833 +10.2% 413153 proc-vmstat.nr_inactive_anon
> 1767 +4.8% 1853 proc-vmstat.nr_page_table_pages
> 374833 +10.2% 413153 proc-vmstat.nr_zone_inactive_anon
> 854665 -34.3% 561406 proc-vmstat.numa_hit
> 854632 -34.3% 561397 proc-vmstat.numa_local
> 5548755 +1.1% 5610598 proc-vmstat.pgalloc_normal
> 1083315 -26.2% 799129 proc-vmstat.pgfault
> 113425 +3.7% 117656 proc-vmstat.pgreuse
> 9025 +7.6% 9714 proc-vmstat.thp_fault_alloc
> 3.38 +0.1 3.45 perf-stat.i.branch-miss-rate%
> 4.135e+08 -3.2% 4.003e+08 perf-stat.i.cache-misses
> 5.341e+08 -2.7% 5.197e+08 perf-stat.i.cache-references
> 6832 -3.4% 6600 perf-stat.i.context-switches
> 4.06 +3.1% 4.19 perf-stat.i.cpi
> 438639 ą 5% -18.7% 356730 ą 6% perf-stat.i.dTLB-load-misses
> 1.119e+09 -3.8% 1.077e+09 perf-stat.i.dTLB-loads
> 0.02 ą 15% -0.0 0.01 ą 26% perf-stat.i.dTLB-store-miss-rate%
> 80407 ą 10% -63.5% 29387 ą 23% perf-stat.i.dTLB-store-misses
> 7.319e+08 -3.8% 7.043e+08 perf-stat.i.dTLB-stores
> 57.72 +0.8 58.52 perf-stat.i.iTLB-load-miss-rate%
> 129846 -3.8% 124973 perf-stat.i.iTLB-load-misses
> 144448 -5.3% 136837 perf-stat.i.iTLB-loads
> 2.389e+09 -3.5% 2.305e+09 perf-stat.i.instructions
> 0.28 -2.9% 0.27 perf-stat.i.ipc
> 220.59 -3.4% 213.11 perf-stat.i.metric.M/sec
> 3610 -31.2% 2483 perf-stat.i.minor-faults
> 49238342 +1.1% 49776834 perf-stat.i.node-loads
> 98106028 -3.1% 95018390 perf-stat.i.node-stores
> 3615 -31.2% 2487 perf-stat.i.page-faults
> 3.65 +3.7% 3.78 perf-stat.overall.cpi
> 21.08 +3.3% 21.79 perf-stat.overall.cycles-between-cache-misses
> 0.04 ą 5% -0.0 0.03 ą 6% perf-stat.overall.dTLB-load-miss-rate%
> 0.01 ą 10% -0.0 0.00 ą 23% perf-stat.overall.dTLB-store-miss-rate%
> 0.27 -3.6% 0.26 perf-stat.overall.ipc
> 4.122e+08 -3.2% 3.99e+08 perf-stat.ps.cache-misses
> 5.324e+08 -2.7% 5.181e+08 perf-stat.ps.cache-references
> 6809 -3.4% 6580 perf-stat.ps.context-switches
> 437062 ą 5% -18.7% 355481 ą 6% perf-stat.ps.dTLB-load-misses
> 1.115e+09 -3.8% 1.073e+09 perf-stat.ps.dTLB-loads
> 80134 ą 10% -63.5% 29283 ą 23% perf-stat.ps.dTLB-store-misses
> 7.295e+08 -3.8% 7.021e+08 perf-stat.ps.dTLB-stores
> 129362 -3.7% 124535 perf-stat.ps.iTLB-load-misses
> 143865 -5.2% 136338 perf-stat.ps.iTLB-loads
> 2.381e+09 -3.5% 2.297e+09 perf-stat.ps.instructions
> 3596 -31.2% 2473 perf-stat.ps.minor-faults
> 49081949 +1.1% 49621463 perf-stat.ps.node-loads
> 97795918 -3.1% 94724831 perf-stat.ps.node-stores
> 3600 -31.2% 2477 perf-stat.ps.page-faults
>
>
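Since both ramspeed runs only regress by a few percent, it may also be worth
A/B-testing with THP disabled for just the benchmark process, to separate the
effect of the alignment change from THP itself. A hedged sketch of a wrapper
using PR_SET_THP_DISABLE (a standard prctl since v3.15; the exec of the real
benchmark is left as a placeholder):

#include <stdio.h>
#include <sys/prctl.h>

#ifndef PR_SET_THP_DISABLE
#define PR_SET_THP_DISABLE 41
#endif

int main(void)
{
	/*
	 * Disable THP for this process and its children, then run the
	 * benchmark from here to compare against a THP-enabled run on
	 * the same kernel.
	 */
	if (prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0))
		perror("prctl");

	/* execv("/path/to/benchmark", argv);  placeholder */
	return 0;
}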
>
> ***************************************************************************************************
> lkp-cfl-d1: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 16G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/option_a/option_b/rootfs/tbox_group/test/testcase:
> gcc-12/performance/x86_64-rhel-8.3/Average/Floating Point/debian-x86_64-phoronix/lkp-cfl-d1/ramspeed-1.4.3/phoronix-test-suite
>
> commit:
> 30749e6fbb ("mm/memory: replace kmap() with kmap_local_page()")
> 1111d46b5c ("mm: align larger anonymous mappings on THP boundaries")
>
> 30749e6fbb3d391a 1111d46b5cbad57486e7a3fab75
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 167.28 ą 5% -13.1% 145.32 ą 6% sched_debug.cfs_rq:/.util_est_enqueued.avg
> 6845 -2.5% 6674 vmstat.system.cs
> 351910 ą 2% +40.2% 493341 meminfo.AnonHugePages
> 505908 +27.2% 643328 meminfo.AnonPages
> 1497656 +10.2% 1650453 meminfo.Inactive(anon)
> 18957 ą 13% +26.3% 23947 ą 17% turbostat.C1
> 1.52 -0.0 1.48 turbostat.C1E%
> 3.32 -2.9% 3.23 turbostat.RAMWatt
> 19978 -3.0% 19379 phoronix-test-suite.ramspeed.Average.FloatingPoint.mb_s
> 280.71 +3.3% 289.93 phoronix-test-suite.time.elapsed_time
> 280.71 +3.3% 289.93 phoronix-test-suite.time.elapsed_time.max
> 120465 +1.5% 122257 phoronix-test-suite.time.maximum_resident_set_size
> 281047 -54.7% 127190 phoronix-test-suite.time.minor_page_faults
> 257.03 +3.5% 265.95 phoronix-test-suite.time.user_time
> 126473 +27.2% 160831 proc-vmstat.nr_anon_pages
> 171.83 ą 2% +40.2% 240.89 proc-vmstat.nr_anon_transparent_hugepages
> 355973 -1.0% 352304 proc-vmstat.nr_dirty_background_threshold
> 712818 -1.0% 705471 proc-vmstat.nr_dirty_threshold
> 3265800 -1.1% 3228879 proc-vmstat.nr_free_pages
> 374410 +10.2% 412613 proc-vmstat.nr_inactive_anon
> 1770 +4.4% 1848 proc-vmstat.nr_page_table_pages
> 374410 +10.2% 412613 proc-vmstat.nr_zone_inactive_anon
> 852082 -34.9% 555093 proc-vmstat.numa_hit
> 852125 -34.9% 555018 proc-vmstat.numa_local
> 1078293 -26.6% 791038 proc-vmstat.pgfault
> 112693 +2.9% 116004 proc-vmstat.pgreuse
> 9025 +7.6% 9713 proc-vmstat.thp_fault_alloc
> 3.63 ą 6% +0.6 4.25 ą 9% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
> 0.25 ą 55% -0.2 0.08 ą 68% perf-profile.children.cycles-pp.ret_from_fork_asm
> 0.25 ą 55% -0.2 0.08 ą 68% perf-profile.children.cycles-pp.ret_from_fork
> 0.23 ą 56% -0.2 0.07 ą 69% perf-profile.children.cycles-pp.kthread
> 0.14 ą 36% -0.1 0.05 ą120% perf-profile.children.cycles-pp.do_anonymous_page
> 0.14 ą 35% -0.1 0.05 ą 76% perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
> 0.04 ą 72% +0.0 0.08 ą 19% perf-profile.children.cycles-pp.try_to_wake_up
> 0.04 ą118% +0.1 0.10 ą 36% perf-profile.children.cycles-pp.update_rq_clock
> 0.07 ą 79% +0.1 0.17 ą 21% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> 7.99 ą 11% +1.0 9.02 ą 5% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> 0.23 ą 28% -0.1 0.14 ą 49% perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
> 0.14 ą 35% -0.1 0.05 ą 76% perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
> 0.06 ą 79% +0.1 0.16 ą 21% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> 0.21 ą 34% +0.2 0.36 ą 18% perf-profile.self.cycles-pp.ktime_get
> 1.187e+08 -4.6% 1.133e+08 perf-stat.i.branch-instructions
> 3.36 +0.1 3.42 perf-stat.i.branch-miss-rate%
> 5492420 -3.9% 5275592 perf-stat.i.branch-misses
> 4.148e+08 -2.8% 4.034e+08 perf-stat.i.cache-misses
> 5.251e+08 -2.6% 5.114e+08 perf-stat.i.cache-references
> 6880 -2.5% 6711 perf-stat.i.context-switches
> 4.30 +2.9% 4.43 perf-stat.i.cpi
> 0.10 ą 7% -0.0 0.09 ą 2% perf-stat.i.dTLB-load-miss-rate%
> 472268 ą 6% -19.9% 378489 perf-stat.i.dTLB-load-misses
> 8.107e+08 -3.4% 7.831e+08 perf-stat.i.dTLB-loads
> 0.02 ą 16% -0.0 0.01 ą 2% perf-stat.i.dTLB-store-miss-rate%
> 90535 ą 11% -59.8% 36371 ą 2% perf-stat.i.dTLB-store-misses
> 5.323e+08 -3.3% 5.145e+08 perf-stat.i.dTLB-stores
> 129981 -3.0% 126061 perf-stat.i.iTLB-load-misses
> 143662 -3.1% 139223 perf-stat.i.iTLB-loads
> 2.253e+09 -3.6% 2.172e+09 perf-stat.i.instructions
> 0.26 -3.2% 0.25 perf-stat.i.ipc
> 4.71 ą 2% -6.4% 4.41 ą 2% perf-stat.i.major-faults
> 180.03 -3.0% 174.57 perf-stat.i.metric.M/sec
> 3627 -30.8% 2510 ą 2% perf-stat.i.minor-faults
> 3632 -30.8% 2514 ą 2% perf-stat.i.page-faults
> 3.88 +3.6% 4.02 perf-stat.overall.cpi
> 21.08 +2.7% 21.65 perf-stat.overall.cycles-between-cache-misses
> 0.06 ą 6% -0.0 0.05 perf-stat.overall.dTLB-load-miss-rate%
> 0.02 ą 11% -0.0 0.01 ą 2% perf-stat.overall.dTLB-store-miss-rate%
> 0.26 -3.5% 0.25 perf-stat.overall.ipc
> 1.182e+08 -4.6% 1.128e+08 perf-stat.ps.branch-instructions
> 5468166 -4.0% 5251939 perf-stat.ps.branch-misses
> 4.135e+08 -2.7% 4.021e+08 perf-stat.ps.cache-misses
> 5.234e+08 -2.6% 5.098e+08 perf-stat.ps.cache-references
> 6859 -2.5% 6685 perf-stat.ps.context-switches
> 470567 ą 6% -19.9% 377127 perf-stat.ps.dTLB-load-misses
> 8.079e+08 -3.4% 7.805e+08 perf-stat.ps.dTLB-loads
> 90221 ą 11% -59.8% 36239 ą 2% perf-stat.ps.dTLB-store-misses
> 5.305e+08 -3.3% 5.128e+08 perf-stat.ps.dTLB-stores
> 129499 -3.0% 125601 perf-stat.ps.iTLB-load-misses
> 143121 -3.1% 138638 perf-stat.ps.iTLB-loads
> 2.246e+09 -3.6% 2.165e+09 perf-stat.ps.instructions
> 4.69 ą 2% -6.3% 4.39 ą 2% perf-stat.ps.major-faults
> 3613 -30.8% 2500 ą 2% perf-stat.ps.minor-faults
> 3617 -30.8% 2504 ą 2% perf-stat.ps.page-faults
>
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>