From: Yang Shi <yang@os.amperecomputing.com>
To: Oliver Sang <oliver.sang@intel.com>
Cc: oe-lkp@lists.linux.dev, lkp@intel.com,
linux-kernel@vger.kernel.org, arnd@arndb.de,
gregkh@linuxfoundation.org, Liam.Howlett@oracle.com,
lorenzo.stoakes@oracle.com, vbabka@suse.cz, jannh@google.com,
willy@infradead.org, liushixin2@huawei.com,
akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [PATCH] /dev/zero: make private mapping full anonymous mapping
Date: Tue, 18 Feb 2025 17:12:41 -0800 [thread overview]
Message-ID: <a2907666-2b43-4bdc-96c7-193538945542@os.amperecomputing.com> (raw)
In-Reply-To: <Z7Qo/ijUeRcJM91j@xsang-OptiPlex-9020>
On 2/17/25 10:30 PM, Oliver Sang wrote:
> hi, Yang Shi,
>
> On Fri, Feb 14, 2025 at 02:53:37PM -0800, Yang Shi wrote:
>> On 2/12/25 6:04 PM, Oliver Sang wrote:
>>> hi, Yang Shi,
>>>
>>> On Fri, Feb 07, 2025 at 10:10:37AM -0800, Yang Shi wrote:
>>>> On 2/6/25 12:02 AM, Oliver Sang wrote:
>>> [...]
>>>
>>>>> since we applied your "/dev/zero: make private mapping full anonymous mapping"
>>>>> patch upon a68d3cbfad like below:
>>>>>
>>>>> * 7143ee2391f1e /dev/zero: make private mapping full anonymous mapping
>>>>> * a68d3cbfade64 memstick: core: fix kernel-doc notation
>>>>>
>>>>> so I applied below patch also upon a68d3cbfad.
>>>>>
>>>>> we saw big improvement but not that big.
>>>>>
>>>>> =========================================================================================
>>>>> compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
>>>>> gcc-12/performance/x86_64-rhel-9.4/debian-12-x86_64-20240206.cgz/300s/lkp-cpl-4sp2/small-allocs/vm-scalability
>>>>>
>>>>> commit:
>>>>> a68d3cbfad ("memstick: core: fix kernel-doc notation")
>>>>> 52ec85cb99 <--- your patch
>>>>>
>>>>>
>>>>> a68d3cbfade64392 52ec85cb99e9b31dc304eae965a
>>>>> ---------------- ---------------------------
>>>>> %stddev %change %stddev
>>>>> \ | \
>>>>> 14364828 ± 4% +410.6% 73349239 ± 3% vm-scalability.throughput
>>>>>
>>>>> full comparison as below [1] just FYI.
>>>> Thanks for the update. I stared at the profiling report for a whole day, but
>>>> I couldn't figure out where that 400% was lost. I just saw that the number of
>>>> page faults was lower, and the reduction in page faults seems to match the
>>>> 400% loss. So I did more tracing and profiling.
>>>>
>>>> The test case did the below stuff in a tight loop:
>>>> mmap 40K memory from /dev/zero (read only)
>>>> read the area
>>>>
>>>> So there are two major factors in the performance: mmap and page faults. The
>>>> alternative patch did reduce the mmap overhead to the same level as the
>>>> original patch.
>>>>
>>>> Further perf profiling showed the cost of page faults is higher than with the
>>>> original patch. But the page fault profile was interesting:
>>>>
>>>> - 44.87% 0.01% usemem [kernel.kallsyms] [k] do_translation_fault
>>>> - 44.86% do_translation_fault
>>>> - 44.83% do_page_fault
>>>> - 44.53% handle_mm_fault
>>>> 9.04% __handle_mm_fault
>>>>
>>>> Page faults consumed about 40% of CPU time in handle_mm_fault, but
>>>> __handle_mm_fault consumed just 9%, even though I expected it to be the
>>>> major consumer.
>>>>
>>>> So I annotated handle_mm_fault, then found the most time was consumed by
>>>> lru_gen_enter_fault() -> vma_has_recency() (my kernel has multi-gen LRU
>>>> enabled):
>>>>
>>>> │ if (vma->vm_file && (vma->vm_file->f_mode & FMODE_NOREUSE))
>>>> │ ↓ cbz x1, b4
>>>> 0.00 │ ldr w0, [x1, #12]
>>>> 99.59 │ eor x0, x0, #0x800000
>>>> 0.00 │ ubfx w0, w0, #23, #1
>>>> │ current->in_lru_fault = vma_has_recency(vma);
>>>> 0.00 │ b4: ldrh w1, [x2, #1992]
>>>> 0.01 │ bfi w1, w0, #5, #1
>>>> 0.00 │ strh w1, [x2, #1992]
>>>>
>>>>
>>>> vma_has_recency() reads vma->vm_file->f_mode if vma->vm_file is not NULL, but
>>>> that load took a long time. So I inspected struct file and saw:
>>>>
>>>> struct file {
>>>> file_ref_t f_ref;
>>>> spinlock_t f_lock;
>>>> fmode_t f_mode;
>>>> const struct file_operations *f_op;
>>>> ...
>>>> }
>>>>
>>>> The f_mode is in the same cache line as f_ref (my kernel does NOT have
>>>> spinlock debugging enabled). The test case mmaps /dev/zero in a tight loop,
>>>> so the refcount is modified (fget/fput) very frequently, which results in
>>>> false sharing.
>>>>
>>>> So I tried the below patch on top of the alternative patch:
>>>>
>>>> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
>>>> index f9157a0c42a5..ba11dc0b1c7c 100644
>>>> --- a/include/linux/mm_inline.h
>>>> +++ b/include/linux/mm_inline.h
>>>> @@ -608,6 +608,9 @@ static inline bool vma_has_recency(struct vm_area_struct *vma)
>>>> if (vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))
>>>> return false;
>>>>
>>>> + if (vma_is_anonymous(vma))
>>>> + return true;
>>>> +
>>>> if (vma->vm_file && (vma->vm_file->f_mode & FMODE_NOREUSE))
>>>> return false;
>>>>
>>>> This made the profiling of page fault look normal:
>>>>
>>>> - 1.90% do_translation_fault
>>>> - 1.87% do_page_fault
>>>> - 1.49% handle_mm_fault
>>>> - 1.36% __handle_mm_fault
>>>>
>>>> Please try this in your test.
>>>>
>>>> But AFAICT I have never seen a performance issue reported due to false
>>>> sharing between the refcount and other fields in struct file. This benchmark
>>>> stresses it quite badly.
>>> I applied your above patch on top of the alternative patch from last time,
>>> then saw more improvement (+445.2% vs a68d3cbfad), but still not as big as in
>>> our original report.
>> Thanks for the update. It looks like the problem is still in page faults. I
>> did my test on an arm64 machine. I also noticed struct file has
>> "__randomize_layout", so it may have a different layout on x86 than on arm64?
>>
>> The page fault handler may also access other fields of struct file that may
>> cause false sharing, for example, accessing f_mapping to read the gfp flags.
>> This may not be a problem on my machine, but it may be more costly on yours,
>> depending on the real layout of struct file on each machine.
>>
>> Can you please try the below patch on top of the current patches? Thank you
>> so much for your patience.
> you are welcome!
>
> now there are more improvements. I just list "a68d3cbfad + 3 patches so far" vs
> a68d3cbfad below; if you want more data, please let me know.
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
> gcc-12/performance/x86_64-rhel-9.4/debian-12-x86_64-20240206.cgz/300s/lkp-cpl-4sp2/small-allocs/vm-scalability
>
> commit:
> a68d3cbfad ("memstick: core: fix kernel-doc notation")
> edc84ea79f <--- a68d3cbfad + 3 patches so far
>
> a68d3cbfade64392 edc84ea79f8dc11853076b96ad5
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 14364828 ± 4% +685.6% 1.129e+08 ± 5% vm-scalability.throughput
>
> full data is as below [1] FYI.
Thank you for the update. It is close to the 800% target, and it looks
like there may still be some overhead in the page fault handler due to
the false sharing. For example, __thp_vma_allowable_orders(), which is
called if the pmd is null, calls vma_is_dax(), and that dereferences
vma->vm_file again. I'm not sure how much that costs, and I'm also not
sure whether we should keep chasing it, because this kind of false
sharing in struct file should be very rare for real-life workloads. The
workload has to map the same file and then take page faults again and
again in a tight loop, with the struct file shared by multiple
processes. Such behavior should be rare in real life.
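
(For reference, the access pattern being discussed is roughly the
following. This is only a sketch reconstructed from the test description
quoted above, not the actual vm-scalability/usemem source; the 40K size
and per-page stride come from that description.)

#include <fcntl.h>
#include <sys/mman.h>

int main(void)
{
        int fd = open("/dev/zero", O_RDONLY);
        volatile char sink = 0;

        /* loops forever like the benchmark; bound it for a quick test */
        for (;;) {
                /* each mmap() takes fget()/fput() on the same struct file */
                char *p = mmap(NULL, 40 * 1024, PROT_READ, MAP_PRIVATE, fd, 0);

                if (p == MAP_FAILED)
                        return 1;
                for (long off = 0; off < 40 * 1024; off += 4096)
                        sink += p[off];         /* one minor fault per page */
                munmap(p, 40 * 1024);
        }
}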
Changing the layout of struct file to avoid the false sharing sounds
better than adding a vma_is_anonymous() call in all the possible places,
but it may just introduce new false sharing elsewhere. Putting the
refcount in a dedicated cache line is doable too; however, it would grow
struct file from 192 bytes to 256 bytes. So neither seems worth it.
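
For completeness, the false sharing itself is easy to reproduce in
userspace. Below is a minimal sketch (an illustration only, plain C11 +
pthreads rather than kernel code; the field names merely mirror the
struct file snippet quoted above and the loop counts are arbitrary). One
thread hammers the "refcount" the way the mmap loop hammers f_ref, while
the main thread keeps loading the neighboring read-mostly field the way
vma_has_recency() loads f_mode; building with -DPAD moves the read-mostly
field onto its own cache line and makes the reader loop far cheaper.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

struct fake_file {
        atomic_long f_ref;              /* written constantly, like fget()/fput() */
#ifdef PAD
        char pad[64];                   /* push f_mode onto its own cache line */
#endif
        volatile long f_mode;           /* read-mostly, like vma_has_recency() */
};

static struct fake_file f;
static atomic_int stop;

static void *ref_thread(void *arg)
{
        (void)arg;
        while (!atomic_load(&stop))
                atomic_fetch_add(&f.f_ref, 1);
        return NULL;
}

int main(void)
{
        pthread_t t;
        long sink = 0;

        pthread_create(&t, NULL, ref_thread, NULL);
        for (long i = 0; i < 200000000L; i++)
                sink += f.f_mode;       /* the load that showed up hot in the annotation */
        atomic_store(&stop, 1);
        pthread_join(t, NULL);
        printf("sink=%ld\n", sink);
        return 0;
}

Build with "gcc -O2 -pthread demo.c", then again with -DPAD, and time the
reader loop; the difference is the cache line ping-pong described above.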
We can split all the patches into two parts: the first part avoids the
i_mmap_rwsem contention, and the second part addresses the struct file
false sharing. IMHO the first part is the more realistic concern. I can
come up with a formal patch and send it to the mailing list.
Thanks,
Yang
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 539c0f7c6d54..1fa9dbce0f66 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -3214,6 +3214,9 @@ static gfp_t __get_fault_gfp_mask(struct vm_area_struct *vma)
>> {
>> struct file *vm_file = vma->vm_file;
>>
>> + if (vma_is_anonymous(vma))
>> + return GFP_KERNEL;
>> +
>> if (vm_file)
>> return mapping_gfp_mask(vm_file->f_mapping) | __GFP_FS |
>> __GFP_IO;
>>
> [1]
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
> gcc-12/performance/x86_64-rhel-9.4/debian-12-x86_64-20240206.cgz/300s/lkp-cpl-4sp2/small-allocs/vm-scalability
>
> commit:
> a68d3cbfad ("memstick: core: fix kernel-doc notation")
> edc84ea79f <--- a68d3cbfad + 3 patches so far
>
> a68d3cbfade64392 edc84ea79f8dc11853076b96ad5
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 5.262e+09 ± 3% -59.8% 2.114e+09 ± 2% cpuidle..time
> 7924008 ± 3% -83.9% 1275131 ± 5% cpuidle..usage
> 1871164 ± 4% -16.8% 1557233 ± 8% numa-numastat.node3.local_node
> 1952164 ± 3% -14.8% 1663189 ± 7% numa-numastat.node3.numa_hit
> 399.52 -75.0% 99.77 ± 2% uptime.boot
> 14507 -22.1% 11296 uptime.idle
> 3408 ± 5% -99.8% 7.25 ± 46% perf-c2c.DRAM.local
> 18076 ± 3% -99.8% 43.00 ±100% perf-c2c.DRAM.remote
> 8082 ± 5% -99.8% 12.50 ± 63% perf-c2c.HITM.local
> 6544 ± 6% -99.7% 22.88 ±151% perf-c2c.HITM.remote
> 14627 ± 4% -99.8% 35.38 ±114% perf-c2c.HITM.total
> 6.99 ± 3% +177.6% 19.41 ± 3% vmstat.cpu.id
> 91.35 -28.5% 65.31 vmstat.cpu.sy
> 1.71 +793.1% 15.25 ± 4% vmstat.cpu.us
> 34204 ± 5% -64.1% 12271 ± 9% vmstat.system.cs
> 266575 -21.2% 210049 vmstat.system.in
> 6.49 ± 3% +10.0 16.46 ± 3% mpstat.cpu.all.idle%
> 0.63 -0.3 0.34 ± 3% mpstat.cpu.all.irq%
> 0.03 ± 2% +0.3 0.31 ± 4% mpstat.cpu.all.soft%
> 91.17 -24.1 67.09 mpstat.cpu.all.sys%
> 1.68 ± 2% +14.1 15.80 ± 4% mpstat.cpu.all.usr%
> 337.33 -98.7% 4.25 ± 10% mpstat.max_utilization.seconds
> 352.76 -84.7% 53.95 ± 4% time.elapsed_time
> 352.76 -84.7% 53.95 ± 4% time.elapsed_time.max
> 225965 ± 7% -17.1% 187329 ± 12% time.involuntary_context_switches
> 9.592e+08 ± 4% +11.9% 1.074e+09 time.minor_page_faults
> 20852 -10.0% 18761 time.percent_of_cpu_this_job_got
> 72302 -88.6% 8227 ± 6% time.system_time
> 1260 ± 3% +50.7% 1899 time.user_time
> 5393707 ± 5% -98.8% 66895 ± 21% time.voluntary_context_switches
> 1609925 -50.7% 793216 meminfo.Active
> 1609925 -50.7% 793216 meminfo.Active(anon)
> 160837 ± 33% -72.5% 44155 ± 9% meminfo.AnonHugePages
> 4435665 -18.7% 3608195 meminfo.Cached
> 1775547 -44.2% 990889 meminfo.Committed_AS
> 148539 -47.4% 78096 meminfo.Mapped
> 4245538 ± 4% -24.6% 3202495 meminfo.PageTables
> 929777 -88.9% 102759 meminfo.Shmem
> 25676018 ± 3% +14.3% 29335678 meminfo.max_used_kB
> 64129 ± 4% +706.8% 517389 ± 7% vm-scalability.median
> 45.40 ± 5% +2248.9 2294 ± 2% vm-scalability.stddev%
> 14364828 ± 4% +685.6% 1.129e+08 ± 5% vm-scalability.throughput
> 352.76 -84.7% 53.95 ± 4% vm-scalability.time.elapsed_time
> 352.76 -84.7% 53.95 ± 4% vm-scalability.time.elapsed_time.max
> 225965 ± 7% -17.1% 187329 ± 12% vm-scalability.time.involuntary_context_switches
> 9.592e+08 ± 4% +11.9% 1.074e+09 vm-scalability.time.minor_page_faults
> 20852 -10.0% 18761 vm-scalability.time.percent_of_cpu_this_job_got
> 72302 -88.6% 8227 ± 6% vm-scalability.time.system_time
> 1260 ± 3% +50.7% 1899 vm-scalability.time.user_time
> 5393707 ± 5% -98.8% 66895 ± 21% vm-scalability.time.voluntary_context_switches
> 4.316e+09 ± 4% +11.9% 4.832e+09 vm-scalability.workload
> 1063552 ± 4% -24.9% 799008 ± 3% numa-meminfo.node0.PageTables
> 125455 ±106% -85.5% 18164 ±165% numa-meminfo.node0.Shmem
> 1062709 ± 4% -25.7% 789746 ± 4% numa-meminfo.node1.PageTables
> 176171 ± 71% -92.4% 13303 ±230% numa-meminfo.node1.Shmem
> 35515 ± 91% -97.3% 976.55 ± 59% numa-meminfo.node2.Mapped
> 1058901 ± 4% -25.3% 791392 ± 4% numa-meminfo.node2.PageTables
> 770405 ± 30% -79.2% 160245 ±101% numa-meminfo.node3.Active
> 770405 ± 30% -79.2% 160245 ±101% numa-meminfo.node3.Active(anon)
> 380096 ± 50% -62.5% 142513 ± 98% numa-meminfo.node3.AnonPages.max
> 1146977 ±108% -92.8% 82894 ± 60% numa-meminfo.node3.FilePages
> 52663 ± 47% -97.2% 1488 ± 39% numa-meminfo.node3.Mapped
> 1058539 ± 4% -22.3% 821992 ± 3% numa-meminfo.node3.PageTables
> 558943 ± 14% -93.7% 35227 ±124% numa-meminfo.node3.Shmem
> 265763 ± 4% -24.9% 199601 ± 3% numa-vmstat.node0.nr_page_table_pages
> 31364 ±106% -85.5% 4539 ±165% numa-vmstat.node0.nr_shmem
> 265546 ± 4% -25.5% 197854 ± 5% numa-vmstat.node1.nr_page_table_pages
> 44052 ± 71% -92.5% 3323 ±230% numa-vmstat.node1.nr_shmem
> 8961 ± 91% -97.3% 244.02 ± 59% numa-vmstat.node2.nr_mapped
> 264589 ± 4% -25.2% 197920 ± 3% numa-vmstat.node2.nr_page_table_pages
> 192683 ± 30% -79.2% 40126 ±101% numa-vmstat.node3.nr_active_anon
> 286819 ±108% -92.8% 20761 ± 60% numa-vmstat.node3.nr_file_pages
> 13124 ± 49% -97.2% 372.02 ± 39% numa-vmstat.node3.nr_mapped
> 264499 ± 4% -22.4% 205376 ± 3% numa-vmstat.node3.nr_page_table_pages
> 139810 ± 14% -93.7% 8844 ±124% numa-vmstat.node3.nr_shmem
> 192683 ± 30% -79.2% 40126 ±101% numa-vmstat.node3.nr_zone_active_anon
> 1951359 ± 3% -14.9% 1661427 ± 7% numa-vmstat.node3.numa_hit
> 1870359 ± 4% -16.8% 1555470 ± 8% numa-vmstat.node3.numa_local
> 402515 -50.7% 198246 proc-vmstat.nr_active_anon
> 170568 +1.8% 173591 proc-vmstat.nr_anon_pages
> 1109246 -18.7% 902238 proc-vmstat.nr_file_pages
> 37525 -47.3% 19768 proc-vmstat.nr_mapped
> 1059932 ± 4% -24.2% 803105 ± 2% proc-vmstat.nr_page_table_pages
> 232507 -89.0% 25623 proc-vmstat.nr_shmem
> 37297 -5.4% 35299 proc-vmstat.nr_slab_reclaimable
> 402515 -50.7% 198246 proc-vmstat.nr_zone_active_anon
> 61931 ± 8% -83.9% 9948 ± 59% proc-vmstat.numa_hint_faults
> 15755 ± 21% -96.6% 541.38 ± 36% proc-vmstat.numa_hint_faults_local
> 6916516 ± 3% -8.0% 6360040 proc-vmstat.numa_hit
> 6568542 ± 3% -8.5% 6012265 proc-vmstat.numa_local
> 293942 ± 3% -68.8% 91724 ± 48% proc-vmstat.numa_pte_updates
> 9.608e+08 ± 4% +11.8% 1.074e+09 proc-vmstat.pgfault
> 55981 ± 2% -68.7% 17541 ± 2% proc-vmstat.pgreuse
> 0.82 ± 4% -51.0% 0.40 ± 8% perf-stat.i.MPKI
> 2.714e+10 ± 2% +378.3% 1.298e+11 ± 9% perf-stat.i.branch-instructions
> 0.11 ± 3% +0.1 0.24 ± 8% perf-stat.i.branch-miss-rate%
> 24932893 +306.8% 1.014e+08 ± 9% perf-stat.i.branch-misses
> 64.93 -7.5 57.48 perf-stat.i.cache-miss-rate%
> 88563288 ± 3% +35.0% 1.196e+08 ± 7% perf-stat.i.cache-misses
> 1.369e+08 ± 3% +43.7% 1.968e+08 ± 7% perf-stat.i.cache-references
> 34508 ± 4% -47.3% 18199 ± 9% perf-stat.i.context-switches
> 7.67 -75.7% 1.87 ± 3% perf-stat.i.cpi
> 224605 +22.5% 275084 ± 6% perf-stat.i.cpu-clock
> 696.35 ± 2% -53.5% 323.77 ± 2% perf-stat.i.cpu-migrations
> 10834 ± 4% -24.1% 8224 ± 11% perf-stat.i.cycles-between-cache-misses
> 1.102e+11 +282.2% 4.212e+11 ± 9% perf-stat.i.instructions
> 0.14 +334.6% 0.62 ± 5% perf-stat.i.ipc
> 24.25 ± 3% +626.9% 176.25 ± 4% perf-stat.i.metric.K/sec
> 2722043 ± 3% +803.8% 24600740 ± 9% perf-stat.i.minor-faults
> 2722043 ± 3% +803.8% 24600739 ± 9% perf-stat.i.page-faults
> 224605 +22.5% 275084 ± 6% perf-stat.i.task-clock
> 0.81 ± 3% -62.2% 0.31 ± 11% perf-stat.overall.MPKI
> 0.09 -0.0 0.08 ± 2% perf-stat.overall.branch-miss-rate%
> 64.81 -2.4 62.37 perf-stat.overall.cache-miss-rate%
> 7.24 -70.7% 2.12 ± 5% perf-stat.overall.cpi
> 8933 ± 4% -21.9% 6978 ± 7% perf-stat.overall.cycles-between-cache-misses
> 0.14 +242.2% 0.47 ± 5% perf-stat.overall.ipc
> 9012 ± 2% -57.8% 3806 perf-stat.overall.path-length
> 2.701e+10 ± 2% +285.4% 1.041e+11 ± 5% perf-stat.ps.branch-instructions
> 24708939 +215.8% 78042343 ± 4% perf-stat.ps.branch-misses
> 89032538 ± 3% +15.9% 1.032e+08 ± 8% perf-stat.ps.cache-misses
> 1.374e+08 ± 3% +20.6% 1.656e+08 ± 9% perf-stat.ps.cache-references
> 34266 ± 5% -66.2% 11570 ± 10% perf-stat.ps.context-switches
> 223334 -1.6% 219861 perf-stat.ps.cpu-clock
> 7.941e+11 -9.9% 7.157e+11 perf-stat.ps.cpu-cycles
> 693.54 ± 2% -67.2% 227.38 ± 4% perf-stat.ps.cpu-migrations
> 1.097e+11 +208.3% 3.381e+11 ± 5% perf-stat.ps.instructions
> 2710577 ± 3% +626.7% 19698901 ± 5% perf-stat.ps.minor-faults
> 2710577 ± 3% +626.7% 19698902 ± 5% perf-stat.ps.page-faults
> 223334 -1.6% 219861 perf-stat.ps.task-clock
> 3.886e+13 ± 2% -52.7% 1.839e+13 perf-stat.total.instructions
> 64052898 ± 5% -99.9% 81213 ± 23% sched_debug.cfs_rq:/.avg_vruntime.avg
> 95701822 ± 7% -96.4% 3425672 ± 7% sched_debug.cfs_rq:/.avg_vruntime.max
> 43098762 ± 6% -100.0% 153.42 ± 36% sched_debug.cfs_rq:/.avg_vruntime.min
> 9223270 ± 9% -95.9% 380347 ± 16% sched_debug.cfs_rq:/.avg_vruntime.stddev
> 0.00 ± 22% -100.0% 0.00 sched_debug.cfs_rq:/.h_nr_delayed.avg
> 0.69 ± 8% -100.0% 0.00 sched_debug.cfs_rq:/.h_nr_delayed.max
> 0.05 ± 12% -100.0% 0.00 sched_debug.cfs_rq:/.h_nr_delayed.stddev
> 0.78 ± 2% -94.5% 0.04 ± 21% sched_debug.cfs_rq:/.h_nr_running.avg
> 1.97 ± 5% -49.3% 1.00 sched_debug.cfs_rq:/.h_nr_running.max
> 0.28 ± 7% -29.1% 0.20 ± 10% sched_debug.cfs_rq:/.h_nr_running.stddev
> 411536 ± 58% -100.0% 1.15 ±182% sched_debug.cfs_rq:/.left_deadline.avg
> 43049468 ± 22% -100.0% 258.27 ±182% sched_debug.cfs_rq:/.left_deadline.max
> 3836405 ± 37% -100.0% 17.22 ±182% sched_debug.cfs_rq:/.left_deadline.stddev
> 411536 ± 58% -100.0% 1.06 ±191% sched_debug.cfs_rq:/.left_vruntime.avg
> 43049467 ± 22% -100.0% 236.56 ±191% sched_debug.cfs_rq:/.left_vruntime.max
> 3836405 ± 37% -100.0% 15.77 ±191% sched_debug.cfs_rq:/.left_vruntime.stddev
> 64052901 ± 5% -99.9% 81213 ± 23% sched_debug.cfs_rq:/.min_vruntime.avg
> 95701822 ± 7% -96.4% 3425672 ± 7% sched_debug.cfs_rq:/.min_vruntime.max
> 43098762 ± 6% -100.0% 153.42 ± 36% sched_debug.cfs_rq:/.min_vruntime.min
> 9223270 ± 9% -95.9% 380347 ± 16% sched_debug.cfs_rq:/.min_vruntime.stddev
> 0.77 ± 2% -94.4% 0.04 ± 21% sched_debug.cfs_rq:/.nr_running.avg
> 1.50 ± 9% -33.3% 1.00 sched_debug.cfs_rq:/.nr_running.max
> 0.26 ± 10% -22.7% 0.20 ± 10% sched_debug.cfs_rq:/.nr_running.stddev
> 1.61 ± 24% +413.4% 8.24 ± 60% sched_debug.cfs_rq:/.removed.runnable_avg.avg
> 86.69 +508.6% 527.62 ± 4% sched_debug.cfs_rq:/.removed.runnable_avg.max
> 11.14 ± 13% +428.4% 58.87 ± 32% sched_debug.cfs_rq:/.removed.runnable_avg.stddev
> 1.61 ± 24% +413.3% 8.24 ± 60% sched_debug.cfs_rq:/.removed.util_avg.avg
> 86.69 +508.6% 527.62 ± 4% sched_debug.cfs_rq:/.removed.util_avg.max
> 11.14 ± 13% +428.4% 58.87 ± 32% sched_debug.cfs_rq:/.removed.util_avg.stddev
> 411536 ± 58% -100.0% 1.06 ±191% sched_debug.cfs_rq:/.right_vruntime.avg
> 43049467 ± 22% -100.0% 236.56 ±191% sched_debug.cfs_rq:/.right_vruntime.max
> 3836405 ± 37% -100.0% 15.77 ±191% sched_debug.cfs_rq:/.right_vruntime.stddev
> 769.03 -84.7% 117.79 ± 3% sched_debug.cfs_rq:/.util_avg.avg
> 1621 ± 5% -32.7% 1092 ± 16% sched_debug.cfs_rq:/.util_avg.max
> 159.12 ± 8% +33.2% 211.88 ± 7% sched_debug.cfs_rq:/.util_avg.stddev
> 724.17 ± 2% -98.6% 10.41 ± 32% sched_debug.cfs_rq:/.util_est.avg
> 1360 ± 15% -51.5% 659.38 ± 10% sched_debug.cfs_rq:/.util_est.max
> 234.34 ± 9% -68.2% 74.43 ± 18% sched_debug.cfs_rq:/.util_est.stddev
> 766944 ± 3% +18.9% 912012 sched_debug.cpu.avg_idle.avg
> 1067639 ± 5% +25.5% 1339736 ± 9% sched_debug.cpu.avg_idle.max
> 3799 ± 7% -38.3% 2346 ± 23% sched_debug.cpu.avg_idle.min
> 321459 ± 2% -36.6% 203909 ± 7% sched_debug.cpu.avg_idle.stddev
> 195573 -76.9% 45144 sched_debug.cpu.clock.avg
> 195596 -76.9% 45160 sched_debug.cpu.clock.max
> 195548 -76.9% 45123 sched_debug.cpu.clock.min
> 13.79 ± 3% -36.0% 8.83 ± 2% sched_debug.cpu.clock.stddev
> 194424 -76.8% 45019 sched_debug.cpu.clock_task.avg
> 194608 -76.8% 45145 sched_debug.cpu.clock_task.max
> 181834 -82.1% 32559 sched_debug.cpu.clock_task.min
> 4241 ± 2% -96.8% 136.38 ± 21% sched_debug.cpu.curr->pid.avg
> 9799 ± 2% -59.8% 3934 sched_debug.cpu.curr->pid.max
> 1365 ± 10% -49.1% 695.11 ± 10% sched_debug.cpu.curr->pid.stddev
> 537665 ± 4% +28.3% 690006 ± 6% sched_debug.cpu.max_idle_balance_cost.max
> 3119 ± 56% +479.5% 18078 ± 29% sched_debug.cpu.max_idle_balance_cost.stddev
> 0.00 ± 12% -68.3% 0.00 ± 17% sched_debug.cpu.next_balance.stddev
> 0.78 ± 2% -95.3% 0.04 ± 20% sched_debug.cpu.nr_running.avg
> 2.17 ± 8% -53.8% 1.00 sched_debug.cpu.nr_running.max
> 0.29 ± 8% -35.4% 0.19 ± 9% sched_debug.cpu.nr_running.stddev
> 25773 ± 5% -97.0% 764.82 ± 3% sched_debug.cpu.nr_switches.avg
> 48669 ± 10% -77.2% 11080 ± 12% sched_debug.cpu.nr_switches.max
> 19006 ± 7% -99.2% 151.12 ± 15% sched_debug.cpu.nr_switches.min
> 4142 ± 8% -69.5% 1264 ± 6% sched_debug.cpu.nr_switches.stddev
> 0.07 ± 23% -93.3% 0.01 ± 53% sched_debug.cpu.nr_uninterruptible.avg
> 240.19 ± 16% -80.2% 47.50 ± 44% sched_debug.cpu.nr_uninterruptible.max
> -77.92 -88.1% -9.25 sched_debug.cpu.nr_uninterruptible.min
> 37.87 ± 5% -84.7% 5.78 ± 13% sched_debug.cpu.nr_uninterruptible.stddev
> 195549 -76.9% 45130 sched_debug.cpu_clk
> 194699 -77.3% 44280 sched_debug.ktime
> 0.00 -100.0% 0.00 sched_debug.rt_rq:.rt_nr_running.avg
> 0.17 -100.0% 0.00 sched_debug.rt_rq:.rt_nr_running.max
> 0.01 -100.0% 0.00 sched_debug.rt_rq:.rt_nr_running.stddev
> 196368 -76.6% 45975 sched_debug.sched_clk
> 95.59 -95.6 0.00 perf-profile.calltrace.cycles-pp.__mmap
> 95.54 -95.5 0.00 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
> 95.54 -95.5 0.00 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__mmap
> 94.54 -94.5 0.00 perf-profile.calltrace.cycles-pp.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
> 94.46 -94.4 0.07 ±264% perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
> 94.45 -94.0 0.41 ±158% perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 94.14 -93.9 0.29 ±134% perf-profile.calltrace.cycles-pp.__mmap_new_vma.__mmap_region.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff
> 94.25 -93.8 0.41 ±158% perf-profile.calltrace.cycles-pp.__mmap_region.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
> 93.79 -93.7 0.07 ±264% perf-profile.calltrace.cycles-pp.vma_link_file.__mmap_new_vma.__mmap_region.do_mmap.vm_mmap_pgoff
> 93.44 -93.4 0.00 perf-profile.calltrace.cycles-pp.down_write.vma_link_file.__mmap_new_vma.__mmap_region.do_mmap
> 93.40 -93.4 0.00 perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write.vma_link_file.__mmap_new_vma.__mmap_region
> 93.33 -93.3 0.00 perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.vma_link_file.__mmap_new_vma
> 92.89 -92.9 0.00 perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.vma_link_file
> 0.00 +1.7 1.69 ± 65% perf-profile.calltrace.cycles-pp.dup_mm.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
> 0.00 +1.9 1.90 ± 55% perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
> 0.00 +1.9 1.90 ± 55% perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
> 0.00 +1.9 1.93 ± 53% perf-profile.calltrace.cycles-pp.proc_reg_read_iter.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +1.9 1.93 ± 53% perf-profile.calltrace.cycles-pp.seq_read_iter.proc_reg_read_iter.vfs_read.ksys_read.do_syscall_64
> 0.00 +2.0 1.99 ± 53% perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +2.0 2.02 ± 64% perf-profile.calltrace.cycles-pp.do_pte_missing.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
> 0.00 +2.3 2.27 ± 56% perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
> 0.00 +2.3 2.27 ± 56% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
> 0.00 +2.3 2.27 ± 56% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._Fork
> 0.00 +2.3 2.27 ± 56% perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
> 0.00 +2.4 2.45 ± 53% perf-profile.calltrace.cycles-pp._Fork
> 0.00 +2.5 2.51 ± 52% perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +2.5 2.51 ± 52% perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64
> 0.00 +2.5 2.51 ± 52% perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +2.5 2.51 ± 52% perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +3.2 3.17 ± 42% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
> 0.00 +3.3 3.28 ± 52% perf-profile.calltrace.cycles-pp.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin
> 0.00 +3.3 3.28 ± 52% perf-profile.calltrace.cycles-pp.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin.handle_internal_command
> 0.00 +4.1 4.10 ± 45% perf-profile.calltrace.cycles-pp.__cmd_record.cmd_record.run_builtin.handle_internal_command.main
> 0.00 +4.1 4.10 ± 45% perf-profile.calltrace.cycles-pp.cmd_record.run_builtin.handle_internal_command.main
> 0.00 +4.8 4.80 ± 61% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter
> 0.00 +5.0 4.98 ± 69% perf-profile.calltrace.cycles-pp.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64
> 0.00 +5.1 5.07 ± 71% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.writen.record__pushfn
> 0.00 +5.1 5.07 ± 71% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write.writen.record__pushfn.perf_mmap__push
> 0.00 +5.1 5.07 ± 71% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.writen
> 0.00 +5.1 5.07 ± 71% perf-profile.calltrace.cycles-pp.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +5.1 5.07 ± 71% perf-profile.calltrace.cycles-pp.write.writen.record__pushfn.perf_mmap__push.record__mmap_read_evlist
> 0.00 +5.1 5.07 ± 71% perf-profile.calltrace.cycles-pp.writen.record__pushfn.perf_mmap__push.record__mmap_read_evlist.__cmd_record
> 0.00 +5.1 5.11 ± 47% perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
> 0.00 +5.1 5.12 ± 70% perf-profile.calltrace.cycles-pp.record__pushfn.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record
> 0.00 +6.1 6.08 ± 50% perf-profile.calltrace.cycles-pp.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter
> 0.00 +7.8 7.84 ± 21% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 0.00 +7.9 7.88 ± 20% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 0.00 +7.9 7.88 ± 20% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
> 0.00 +7.9 7.88 ± 20% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 0.00 +7.9 7.88 ± 20% perf-profile.calltrace.cycles-pp.read
> 0.00 +11.1 11.10 ± 41% perf-profile.calltrace.cycles-pp.handle_internal_command.main
> 0.00 +11.1 11.10 ± 41% perf-profile.calltrace.cycles-pp.main
> 0.00 +11.1 11.10 ± 41% perf-profile.calltrace.cycles-pp.run_builtin.handle_internal_command.main
> 0.00 +11.2 11.18 ± 73% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
> 0.00 +15.9 15.94 ± 41% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +15.9 15.94 ± 41% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
> 0.00 +19.5 19.54 ± 41% perf-profile.calltrace.cycles-pp.asm_sysvec_reschedule_ipi.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state
> 1.21 ± 3% +36.7 37.86 ± 7% perf-profile.calltrace.cycles-pp.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
> 1.21 ± 3% +36.7 37.86 ± 7% perf-profile.calltrace.cycles-pp.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> 1.21 ± 3% +37.0 38.24 ± 7% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
> 1.21 ± 3% +37.2 38.41 ± 7% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
> 1.21 ± 3% +37.4 38.57 ± 6% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
> 1.22 ± 3% +38.5 39.67 ± 7% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
> 1.22 ± 3% +38.5 39.67 ± 7% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
> 1.22 ± 3% +38.5 39.67 ± 7% perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
> 1.22 ± 3% +38.9 40.09 ± 6% perf-profile.calltrace.cycles-pp.common_startup_64
> 2.19 ± 3% +45.2 47.41 ± 14% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state
> 95.60 -95.4 0.22 ±135% perf-profile.children.cycles-pp.__mmap
> 94.55 -93.9 0.60 ±103% perf-profile.children.cycles-pp.ksys_mmap_pgoff
> 94.14 -93.7 0.44 ±112% perf-profile.children.cycles-pp.__mmap_new_vma
> 93.79 -93.7 0.10 ±264% perf-profile.children.cycles-pp.vma_link_file
> 94.46 -93.5 0.96 ± 76% perf-profile.children.cycles-pp.vm_mmap_pgoff
> 94.45 -93.5 0.96 ± 76% perf-profile.children.cycles-pp.do_mmap
> 94.25 -93.4 0.86 ± 87% perf-profile.children.cycles-pp.__mmap_region
> 93.40 -93.4 0.00 perf-profile.children.cycles-pp.rwsem_down_write_slowpath
> 93.33 -93.3 0.00 perf-profile.children.cycles-pp.rwsem_optimistic_spin
> 93.44 -93.2 0.22 ±149% perf-profile.children.cycles-pp.down_write
> 92.91 -92.9 0.00 perf-profile.children.cycles-pp.osq_lock
> 95.58 -45.4 50.16 ± 8% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> 95.58 -45.4 50.16 ± 8% perf-profile.children.cycles-pp.do_syscall_64
> 0.00 +1.1 1.12 ± 74% perf-profile.children.cycles-pp.filemap_map_pages
> 0.00 +1.1 1.12 ± 76% perf-profile.children.cycles-pp.vfs_fstatat
> 0.00 +1.2 1.19 ± 35% perf-profile.children.cycles-pp.vsnprintf
> 0.00 +1.2 1.20 ± 46% perf-profile.children.cycles-pp.seq_printf
> 0.00 +1.3 1.28 ± 78% perf-profile.children.cycles-pp.__do_sys_newfstatat
> 0.00 +1.5 1.54 ± 75% perf-profile.children.cycles-pp.folios_put_refs
> 0.00 +1.6 1.56 ± 52% perf-profile.children.cycles-pp.__cond_resched
> 0.00 +1.6 1.60 ± 32% perf-profile.children.cycles-pp.sched_balance_newidle
> 0.00 +1.7 1.69 ± 65% perf-profile.children.cycles-pp.dup_mm
> 0.00 +1.9 1.93 ± 53% perf-profile.children.cycles-pp.proc_reg_read_iter
> 0.00 +2.0 1.99 ± 53% perf-profile.children.cycles-pp.copy_process
> 0.00 +2.1 2.06 ± 51% perf-profile.children.cycles-pp.__x64_sys_ioctl
> 0.00 +2.1 2.08 ± 45% perf-profile.children.cycles-pp.proc_single_show
> 0.00 +2.1 2.14 ± 45% perf-profile.children.cycles-pp.seq_read
> 0.00 +2.2 2.16 ± 47% perf-profile.children.cycles-pp.ioctl
> 0.00 +2.2 2.17 ± 33% perf-profile.children.cycles-pp.schedule
> 0.00 +2.2 2.20 ± 28% perf-profile.children.cycles-pp.__pick_next_task
> 0.00 +2.2 2.21 ± 47% perf-profile.children.cycles-pp.perf_evsel__run_ioctl
> 0.00 +2.3 2.26 ± 58% perf-profile.children.cycles-pp.do_read_fault
> 0.00 +2.3 2.27 ± 56% perf-profile.children.cycles-pp.__do_sys_clone
> 0.00 +2.3 2.27 ± 56% perf-profile.children.cycles-pp.kernel_clone
> 0.00 +2.4 2.37 ± 58% perf-profile.children.cycles-pp.zap_present_ptes
> 0.00 +2.4 2.45 ± 53% perf-profile.children.cycles-pp._Fork
> 0.00 +2.6 2.59 ± 53% perf-profile.children.cycles-pp.__x64_sys_exit_group
> 0.00 +2.6 2.59 ± 53% perf-profile.children.cycles-pp.x64_sys_call
> 0.00 +2.6 2.64 ± 44% perf-profile.children.cycles-pp.do_pte_missing
> 0.00 +3.1 3.13 ± 59% perf-profile.children.cycles-pp.zap_pte_range
> 0.00 +3.2 3.21 ± 58% perf-profile.children.cycles-pp.zap_pmd_range
> 0.00 +3.4 3.40 ± 56% perf-profile.children.cycles-pp.unmap_page_range
> 0.00 +3.4 3.43 ± 55% perf-profile.children.cycles-pp.unmap_vmas
> 0.19 ± 23% +3.9 4.06 ± 45% perf-profile.children.cycles-pp.__handle_mm_fault
> 0.51 ± 6% +4.0 4.49 ± 38% perf-profile.children.cycles-pp.handle_mm_fault
> 0.04 ± 44% +4.0 4.04 ± 28% perf-profile.children.cycles-pp.__schedule
> 0.77 ± 3% +4.4 5.18 ± 39% perf-profile.children.cycles-pp.exc_page_fault
> 0.76 ± 3% +4.4 5.18 ± 39% perf-profile.children.cycles-pp.do_user_addr_fault
> 0.58 ± 2% +4.7 5.26 ± 53% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
> 0.00 +5.1 5.07 ± 71% perf-profile.children.cycles-pp.writen
> 0.00 +5.1 5.07 ± 69% perf-profile.children.cycles-pp.generic_perform_write
> 0.00 +5.1 5.12 ± 47% perf-profile.children.cycles-pp.exit_mm
> 0.00 +5.1 5.12 ± 70% perf-profile.children.cycles-pp.record__pushfn
> 0.00 +5.1 5.12 ± 70% perf-profile.children.cycles-pp.shmem_file_write_iter
> 1.18 +5.5 6.69 ± 33% perf-profile.children.cycles-pp.asm_exc_page_fault
> 0.00 +6.2 6.24 ± 43% perf-profile.children.cycles-pp.__mmput
> 0.00 +6.2 6.24 ± 43% perf-profile.children.cycles-pp.exit_mmap
> 0.00 +7.0 7.00 ± 51% perf-profile.children.cycles-pp.perf_mmap__push
> 0.00 +7.0 7.00 ± 51% perf-profile.children.cycles-pp.record__mmap_read_evlist
> 0.00 +7.2 7.25 ± 52% perf-profile.children.cycles-pp.__fput
> 0.00 +7.3 7.35 ± 20% perf-profile.children.cycles-pp.seq_read_iter
> 0.00 +7.8 7.84 ± 21% perf-profile.children.cycles-pp.vfs_read
> 0.00 +7.9 7.88 ± 20% perf-profile.children.cycles-pp.ksys_read
> 0.00 +7.9 7.88 ± 20% perf-profile.children.cycles-pp.read
> 0.00 +9.9 9.93 ± 41% perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi
> 0.02 ±141% +11.1 11.10 ± 41% perf-profile.children.cycles-pp.__cmd_record
> 0.02 ±141% +11.1 11.10 ± 41% perf-profile.children.cycles-pp.cmd_record
> 0.02 ±141% +11.1 11.10 ± 41% perf-profile.children.cycles-pp.handle_internal_command
> 0.02 ±141% +11.1 11.10 ± 41% perf-profile.children.cycles-pp.main
> 0.02 ±141% +11.1 11.10 ± 41% perf-profile.children.cycles-pp.run_builtin
> 0.00 +11.2 11.18 ± 73% perf-profile.children.cycles-pp.vfs_write
> 0.00 +11.2 11.23 ± 73% perf-profile.children.cycles-pp.ksys_write
> 0.00 +11.2 11.23 ± 73% perf-profile.children.cycles-pp.write
> 0.00 +13.6 13.61 ± 44% perf-profile.children.cycles-pp.do_exit
> 0.00 +13.6 13.61 ± 44% perf-profile.children.cycles-pp.do_group_exit
> 1.70 ± 2% +25.0 26.72 ± 15% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> 1.21 ± 3% +36.6 37.81 ± 7% perf-profile.children.cycles-pp.acpi_safe_halt
> 1.21 ± 3% +36.6 37.86 ± 7% perf-profile.children.cycles-pp.acpi_idle_do_entry
> 1.21 ± 3% +36.6 37.86 ± 7% perf-profile.children.cycles-pp.acpi_idle_enter
> 1.21 ± 3% +37.4 38.57 ± 6% perf-profile.children.cycles-pp.cpuidle_enter_state
> 1.21 ± 3% +37.4 38.66 ± 6% perf-profile.children.cycles-pp.cpuidle_enter
> 1.22 ± 3% +37.6 38.82 ± 6% perf-profile.children.cycles-pp.cpuidle_idle_call
> 1.22 ± 3% +38.5 39.67 ± 7% perf-profile.children.cycles-pp.start_secondary
> 1.22 ± 3% +38.9 40.09 ± 6% perf-profile.children.cycles-pp.common_startup_64
> 1.22 ± 3% +38.9 40.09 ± 6% perf-profile.children.cycles-pp.cpu_startup_entry
> 1.22 ± 3% +38.9 40.09 ± 6% perf-profile.children.cycles-pp.do_idle
> 92.37 -92.4 0.00 perf-profile.self.cycles-pp.osq_lock
> 1.19 ± 3% +30.7 31.90 ± 7% perf-profile.self.cycles-pp.acpi_safe_halt
> 0.17 ±142% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.__do_fault.do_read_fault.do_pte_missing.__handle_mm_fault
> 0.19 ± 34% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_cache_noprof.perf_event_mmap_event.perf_event_mmap.__mmap_region
> 0.14 ± 55% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
> 0.14 ± 73% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.change_pud_range.isra.0.change_protection_range
> 0.10 ± 66% -99.9% 0.00 ±264% perf-sched.sch_delay.avg.ms.__cond_resched.down_write.__mmap_new_vma.__mmap_region.do_mmap
> 0.11 ± 59% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.down_write.vma_link_file.__mmap_new_vma.__mmap_region
> 0.04 ±132% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
> 0.07 ±101% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.down_write_killable.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
> 0.02 ± 31% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
> 0.02 ±143% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__mmap_new_vma
> 0.10 ± 44% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.vm_area_alloc.__mmap_new_vma.__mmap_region
> 0.12 ±145% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
> 0.04 ± 55% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 0.25 ± 41% -98.5% 0.00 ±105% perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
> 0.11 ± 59% -97.1% 0.00 ± 61% perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
> 0.40 ± 50% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 0.32 ±104% -100.0% 0.00 perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
> 0.01 ± 12% -100.0% 0.00 perf-sched.sch_delay.avg.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
> 0.08 ± 28% -99.5% 0.00 ±264% perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
> 0.18 ± 57% -96.8% 0.01 ±193% perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> 0.03 ± 83% -100.0% 0.00 perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
> 0.01 ± 20% -100.0% 0.00 perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
> 0.02 ± 65% -100.0% 0.00 perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
> 0.32 ± 47% -98.2% 0.01 ± 42% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
> 0.19 ±185% -96.5% 0.01 ± 33% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
> 0.07 ± 20% -100.0% 0.00 perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.vma_link_file
> 0.26 ± 17% -100.0% 0.00 perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 0.02 ± 60% -94.2% 0.00 ±264% perf-sched.sch_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
> 0.01 ±128% -100.0% 0.00 perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
> 1.00 ±151% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.__do_fault.do_read_fault.do_pte_missing.__handle_mm_fault
> 25.45 ± 94% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.__kmalloc_cache_noprof.perf_event_mmap_event.perf_event_mmap.__mmap_region
> 4.56 ± 67% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
> 3.55 ± 97% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.change_pud_range.isra.0.change_protection_range
> 2.13 ± 67% -100.0% 0.00 ±264% perf-sched.sch_delay.max.ms.__cond_resched.down_write.__mmap_new_vma.__mmap_region.do_mmap
> 3.16 ± 78% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.down_write.vma_link_file.__mmap_new_vma.__mmap_region
> 0.30 ±159% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
> 1.61 ±100% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.down_write_killable.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
> 0.03 ± 86% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
> 0.20 ±182% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__mmap_new_vma
> 3.51 ± 21% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.vm_area_alloc.__mmap_new_vma.__mmap_region
> 0.83 ±160% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
> 0.09 ± 31% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 3.59 ± 11% -99.9% 0.00 ±105% perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
> 1.60 ± 69% -99.6% 0.01 ±129% perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
> 0.81 ± 43% -100.0% 0.00 perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 1.02 ± 88% -100.0% 0.00 perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
> 0.02 ± 7% -100.0% 0.00 perf-sched.sch_delay.max.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
> 9.68 ± 32% -100.0% 0.00 ±264% perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
> 12.26 ±109% -100.0% 0.01 ±193% perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> 5.60 ±139% -100.0% 0.00 perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
> 0.03 ±106% -100.0% 0.00 perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
> 2.11 ± 61% -100.0% 0.00 perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
> 3.67 ± 25% -99.8% 0.01 ± 16% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
> 1.65 ±187% -99.3% 0.01 ± 23% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
> 37.84 ± 47% -100.0% 0.00 perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.vma_link_file
> 4.68 ± 36% -100.0% 0.00 perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 0.21 ±169% -99.6% 0.00 ±264% perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
> 7.92 ±131% -99.2% 0.06 ± 92% perf-sched.sch_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 0.36 ±186% -100.0% 0.00 perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
> 33.45 ± 3% -91.6% 2.81 ± 90% perf-sched.total_wait_and_delay.average.ms
> 97903 ± 4% -98.2% 1776 ± 28% perf-sched.total_wait_and_delay.count.ms
> 2942 ± 23% -95.2% 141.09 ± 36% perf-sched.total_wait_and_delay.max.ms
> 33.37 ± 3% -91.9% 2.69 ± 95% perf-sched.total_wait_time.average.ms
> 2942 ± 23% -96.7% 97.14 ± 19% perf-sched.total_wait_time.max.ms
> 3.97 ± 6% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__cond_resched.__kmalloc_cache_noprof.perf_event_mmap_event.perf_event_mmap.__mmap_region
> 3.08 ± 4% -94.3% 0.18 ± 92% perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
> 119.91 ± 38% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 433.73 ± 41% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 302.41 ± 5% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
> 1.48 ± 6% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
> 23.24 ± 25% -96.7% 0.76 ± 27% perf-sched.wait_and_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
> 327.16 ± 9% -99.8% 0.76 ±188% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
> 369.37 ± 2% -98.9% 4.03 ±204% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
> 0.96 ± 6% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.vma_link_file
> 453.60 -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
> 187.66 -96.7% 6.11 ±109% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 2.37 ± 29% -99.6% 0.01 ±264% perf-sched.wait_and_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 750.07 -99.3% 5.10 ± 84% perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 1831 ± 9% -100.0% 0.00 perf-sched.wait_and_delay.count.__cond_resched.__kmalloc_cache_noprof.perf_event_mmap_event.perf_event_mmap.__mmap_region
> 1269 ± 8% -45.8% 688.12 ± 21% perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
> 6.17 ± 45% -100.0% 0.00 perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 5.00 -100.0% 0.00 perf-sched.wait_and_delay.count.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 14.33 ± 5% -100.0% 0.00 perf-sched.wait_and_delay.count.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
> 810.00 ± 10% -100.0% 0.00 perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
> 3112 ± 24% -97.9% 65.75 ±106% perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
> 40.50 ± 8% -98.8% 0.50 ±173% perf-sched.wait_and_delay.count.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
> 73021 ± 3% -100.0% 0.00 perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.vma_link_file
> 40.00 -100.0% 0.00 perf-sched.wait_and_delay.count.schedule_timeout.kcompactd.kthread.ret_from_fork
> 1122 -99.0% 10.88 ± 98% perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 11323 ± 3% -93.6% 722.25 ± 20% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 1887 ± 45% -100.0% 0.88 ±264% perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 1238 -93.9% 75.62 ± 79% perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 35.19 ± 57% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__cond_resched.__kmalloc_cache_noprof.perf_event_mmap_event.perf_event_mmap.__mmap_region
> 1002 -91.0% 89.82 ± 93% perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
> 318.48 ± 65% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 1000 -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 966.90 ± 7% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
> 20.79 ± 19% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
> 1043 -98.4% 16.64 ±214% perf-sched.wait_and_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
> 1240 ± 20% -99.9% 1.52 ±188% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
> 500.34 -96.9% 15.38 ±232% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
> 58.83 ± 39% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.vma_link_file
> 505.17 -100.0% 0.00 perf-sched.wait_and_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
> 19.77 ± 55% -62.8% 7.36 ± 85% perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 1237 ± 34% -91.7% 102.88 ± 33% perf-sched.wait_and_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 1001 -100.0% 0.05 ±264% perf-sched.wait_and_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 2794 ± 24% -97.9% 59.20 ± 61% perf-sched.wait_and_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 49.27 ±119% -100.0% 0.01 ±264% perf-sched.wait_time.avg.ms.__cond_resched.__alloc_pages_noprof.alloc_pages_mpol_noprof.folio_alloc_mpol_noprof.shmem_alloc_folio
> 58.17 ±187% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.__do_fault.do_read_fault.do_pte_missing.__handle_mm_fault
> 3.78 ± 5% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_cache_noprof.perf_event_mmap_event.perf_event_mmap.__mmap_region
> 2.99 ± 4% -97.0% 0.09 ± 91% perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
> 3.92 ± 5% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
> 4.71 ± 8% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.change_pud_range.isra.0.change_protection_range
> 1.67 ± 20% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.down_write.__mmap_new_vma.__mmap_region.do_mmap
> 2.10 ± 27% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.down_write.vma_link_file.__mmap_new_vma.__mmap_region
> 0.01 ± 44% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
> 1.67 ± 21% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.down_write_killable.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
> 0.04 ±133% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
> 67.14 ± 73% -99.5% 0.32 ±177% perf-sched.wait_time.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
> 1.65 ± 67% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__mmap_new_vma
> 2.30 ± 14% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.vm_area_alloc.__mmap_new_vma.__mmap_region
> 42.44 ±200% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
> 152.73 ±152% -100.0% 0.06 ±249% perf-sched.wait_time.avg.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
> 119.87 ± 38% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 3.80 ± 18% -99.9% 0.00 ±105% perf-sched.wait_time.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
> 433.32 ± 41% -100.0% 0.00 perf-sched.wait_time.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 250.23 ±107% -100.0% 0.00 perf-sched.wait_time.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
> 29.19 ± 5% -99.2% 0.25 ± 24% perf-sched.wait_time.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
> 302.40 ± 5% -100.0% 0.00 perf-sched.wait_time.avg.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
> 1.40 ± 6% -100.0% 0.00 perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
> 4.03 ± 8% -99.9% 0.01 ±193% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> 35.38 ±192% -100.0% 0.00 ±264% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
> 0.05 ± 40% -100.0% 0.00 perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
> 0.72 ±220% -100.0% 0.00 perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
> 1.00 ±120% -99.9% 0.00 ±264% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
> 23.07 ± 24% -97.1% 0.67 ± 10% perf-sched.wait_time.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
> 326.84 ± 9% -99.6% 1.19 ±108% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
> 369.18 ± 2% -98.7% 4.72 ±167% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
> 0.89 ± 6% -100.0% 0.00 perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.vma_link_file
> 1.17 ± 16% -99.7% 0.00 ±264% perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 453.58 -100.0% 0.00 perf-sched.wait_time.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
> 4.42 -25.4% 3.30 ± 17% perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 187.58 -96.8% 6.05 ±110% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 2.36 ± 29% -99.1% 0.02 ± 84% perf-sched.wait_time.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 0.01 ±156% -100.0% 0.00 perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
> 750.01 -99.5% 3.45 ±141% perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 340.69 ±135% -100.0% 0.01 ±264% perf-sched.wait_time.max.ms.__cond_resched.__alloc_pages_noprof.alloc_pages_mpol_noprof.folio_alloc_mpol_noprof.shmem_alloc_folio
> 535.09 ±128% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.__do_fault.do_read_fault.do_pte_missing.__handle_mm_fault
> 22.04 ± 32% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.__kmalloc_cache_noprof.perf_event_mmap_event.perf_event_mmap.__mmap_region
> 1001 -95.5% 44.91 ± 93% perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
> 13.57 ± 17% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
> 13.54 ± 10% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.change_pud_range.isra.0.change_protection_range
> 10.17 ± 19% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.down_write.__mmap_new_vma.__mmap_region.do_mmap
> 11.35 ± 25% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.down_write.vma_link_file.__mmap_new_vma.__mmap_region
> 0.01 ± 32% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
> 10.62 ± 9% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.down_write_killable.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
> 0.20 ±199% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
> 1559 ± 64% -100.0% 0.44 ±167% perf-sched.wait_time.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
> 6.93 ± 53% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__mmap_new_vma
> 14.42 ± 22% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_noprof.vm_area_alloc.__mmap_new_vma.__mmap_region
> 159.10 ±148% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
> 391.02 ±171% -100.0% 0.12 ±256% perf-sched.wait_time.max.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
> 318.43 ± 65% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 13.14 ± 21% -100.0% 0.00 ±105% perf-sched.wait_time.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
> 1000 -100.0% 0.00 perf-sched.wait_time.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 500.84 ± 99% -100.0% 0.00 perf-sched.wait_time.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
> 641.50 ± 23% -99.2% 5.27 ± 76% perf-sched.wait_time.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
> 10.75 ± 98% -89.8% 1.10 ± 78% perf-sched.wait_time.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 966.89 ± 7% -100.0% 0.00 perf-sched.wait_time.max.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
> 15.80 ± 8% -100.0% 0.00 perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
> 16.69 ± 10% -100.0% 0.01 ±193% perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
> 41.71 ±158% -100.0% 0.00 ±264% perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
> 11.64 ± 61% -100.0% 0.00 perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
> 2.94 ±213% -100.0% 0.00 perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
> 175.70 ±210% -100.0% 0.00 ±264% perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
> 1043 -99.6% 4.46 ±105% perf-sched.wait_time.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
> 1240 ± 20% -99.8% 2.37 ±108% perf-sched.wait_time.max.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
> 500.11 -96.5% 17.32 ±201% perf-sched.wait_time.max.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
> 32.65 ± 33% -100.0% 0.00 perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.vma_link_file
> 22.94 ± 56% -100.0% 0.00 ±264% perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 505.00 -100.0% 0.00 perf-sched.wait_time.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
> 12.20 ± 43% -59.2% 4.98 perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 1237 ± 34% -92.5% 92.94 ± 20% perf-sched.wait_time.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 1000 -100.0% 0.09 ±111% perf-sched.wait_time.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 0.36 ±190% -100.0% 0.00 perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
> 2794 ± 24% -98.9% 30.12 ±114% perf-sched.wait_time.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm