From: kernel test robot <oliver.sang@intel.com>
To: Yang Shi <yang@os.amperecomputing.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
	"Oscar Salvador" <osalvador@suse.de>,
	Rafael Aquini <aquini@redhat.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	David Rientjes <rientjes@google.com>, <linux-mm@kvack.org>,
	<ying.huang@intel.com>, <feng.tang@intel.com>,
	<fengwei.yin@intel.com>, <oliver.sang@intel.com>
Subject: [linus:master] [mm]  24526268f4:  stress-ng.numa.ops_per_sec 4.7% improvement
Date: Sat, 7 Oct 2023 15:08:48 +0800
Message-ID: <202310071416.df82eed7-oliver.sang@intel.com>



Hello,

kernel test robot noticed a 4.7% improvement of stress-ng.numa.ops_per_sec on:


commit: 24526268f4e38c9ec0c4a30de4f37ad2a2a84e47 ("mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
parameters:

	nr_threads: 1
	testtime: 60s
	class: cpu
	test: numa
	cpufreq_governor: performance


In addition, the commit has a significant impact on the following test:

+------------------+-------------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.numa.ops_per_sec 4.5% improvement                                          |
| test machine     | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory |
| test parameters  | class=os                                                                                        |
|                  | cpufreq_governor=performance                                                                    |
|                  | disk=1HDD                                                                                       |
|                  | fs=ext4                                                                                         |
|                  | nr_threads=1                                                                                    |
|                  | test=numa                                                                                       |
|                  | testtime=60s                                                                                    |
+------------------+-------------------------------------------------------------------------------------------------+




Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231007/202310071416.df82eed7-oliver.sang@intel.com

=========================================================================================
class/compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  cpu/gcc-12/performance/x86_64-rhel-8.3/1/debian-11.1-x86_64-20220510.cgz/lkp-csl-d02/numa/stress-ng/60s

commit: 
  45120b1574 ("mm/damon/vaddr-test: fix memory leak in damon_do_test_apply_three_regions()")
  24526268f4 ("mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified")

45120b15743fa7c0 24526268f4e38c9ec0c4a30de4f 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    272.18 ± 77%     -99.9%       0.31 ±220%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
      1089            +4.7%       1141        stress-ng.numa.ops
     18.16            +4.7%      19.01        stress-ng.numa.ops_per_sec
     20387            +5.2%      21456        stress-ng.time.involuntary_context_switches
 2.173e+09            +3.6%  2.251e+09        perf-stat.i.branch-instructions
      0.50            -3.5%       0.48        perf-stat.i.cpi
 1.865e+09            +3.6%  1.932e+09        perf-stat.i.dTLB-loads
  1.06e+10            +3.4%  1.096e+10        perf-stat.i.instructions
      2.02            +3.8%       2.10        perf-stat.i.ipc
    130.34            +3.1%     134.39        perf-stat.i.metric.M/sec
      0.50            -3.6%       0.49        perf-stat.overall.cpi
      1.99            +3.7%       2.06        perf-stat.overall.ipc
 2.139e+09            +3.6%  2.216e+09        perf-stat.ps.branch-instructions
 1.836e+09            +3.6%  1.901e+09        perf-stat.ps.dTLB-loads
 1.043e+10            +3.4%  1.079e+10        perf-stat.ps.instructions
 6.597e+11            +3.4%  6.822e+11        perf-stat.total.instructions
     17.43 ±  5%      -1.9       15.50 ±  2%  perf-profile.calltrace.cycles-pp.queue_folios_pte_range.walk_pmd_range.walk_pud_range.walk_p4d_range.walk_pgd_range
     18.49 ±  4%      -1.9       16.61 ±  2%  perf-profile.calltrace.cycles-pp.walk_pmd_range.walk_pud_range.walk_p4d_range.walk_pgd_range.__walk_page_range
     19.07 ±  4%      -1.8       17.25 ±  2%  perf-profile.calltrace.cycles-pp.walk_pud_range.walk_p4d_range.walk_pgd_range.__walk_page_range.walk_page_range
     19.67 ±  4%      -1.8       17.86 ±  2%  perf-profile.calltrace.cycles-pp.walk_p4d_range.walk_pgd_range.__walk_page_range.walk_page_range.migrate_to_node
      3.76 ±  4%      -0.4        3.33 ±  9%  perf-profile.calltrace.cycles-pp.mt_find.find_vma.queue_pages_test_walk.walk_page_range.migrate_to_node
      3.94 ±  4%      -0.4        3.53 ±  8%  perf-profile.calltrace.cycles-pp.find_vma.queue_pages_test_walk.walk_page_range.migrate_to_node.do_migrate_pages
     17.60 ±  4%      -1.9       15.71 ±  2%  perf-profile.children.cycles-pp.queue_folios_pte_range
     18.50 ±  4%      -1.9       16.63 ±  2%  perf-profile.children.cycles-pp.walk_pmd_range
     19.11 ±  4%      -1.8       17.29 ±  2%  perf-profile.children.cycles-pp.walk_pud_range
     19.69 ±  4%      -1.8       17.88 ±  2%  perf-profile.children.cycles-pp.walk_p4d_range
     20.79 ±  4%      -1.8       19.02 ±  3%  perf-profile.children.cycles-pp.__walk_page_range
      0.08 ± 19%      +0.1        0.15 ± 17%  perf-profile.children.cycles-pp.rcu_all_qs
      0.27 ±  9%      +0.1        0.35 ± 13%  perf-profile.children.cycles-pp.__cond_resched
     11.70 ±  6%      -1.9        9.84        perf-profile.self.cycles-pp.queue_folios_pte_range
      2.01 ± 10%      -0.3        1.72 ±  6%  perf-profile.self.cycles-pp.vm_normal_folio
      0.14 ± 20%      +0.1        0.22 ± 16%  perf-profile.self.cycles-pp.__cond_resched
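
For readers checking the %change column by hand, the following is a minimal
sketch (not part of the lkp tooling) of how the two kinds of entries appear to
be derived: rows with a "%" sign read as a relative change against the base
commit, and the perf-profile rows read as an absolute difference in percentage
points. The function names are hypothetical. The inputs are the rounded means
printed above, so recomputing other rows may differ from the report in the
last digit.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change in percent, for rows whose %change ends in '%'."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference in percentage points, as in the perf-profile rows."""
    return new - base


# Values copied from the table above (already rounded by the report).
print(f"{pct_change(18.16, 19.01):+.1f}%")  # stress-ng.numa.ops_per_sec -> +4.7%
print(f"{pct_change(20387, 21456):+.1f}%")  # involuntary_context_switches -> +5.2%
print(f"{point_delta(17.43, 15.50):+.1f}")  # queue_folios_pte_range calltrace -> -1.9
```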


***************************************************************************************************
lkp-csl-d02: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  os/gcc-12/performance/1HDD/ext4/x86_64-rhel-8.3/1/debian-11.1-x86_64-20220510.cgz/lkp-csl-d02/numa/stress-ng/60s

commit: 
  45120b1574 ("mm/damon/vaddr-test: fix memory leak in damon_do_test_apply_three_regions()")
  24526268f4 ("mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified")

45120b15743fa7c0 24526268f4e38c9ec0c4a30de4f 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1023 ± 22%     -42.3%     590.75 ± 35%  sched_debug.cpu.nr_switches.min
      1096            +4.5%       1145        stress-ng.numa.ops
     18.26            +4.5%      19.08        stress-ng.numa.ops_per_sec
     20712 ±  2%      +4.6%      21663        stress-ng.time.involuntary_context_switches
      6.57 ± 17%      -1.4        5.17 ± 12%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      5.55 ± 15%      -1.0        4.55 ±  9%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      4.37 ± 17%      -0.8        3.60 ±  8%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      4.32 ± 17%      -0.7        3.57 ±  8%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      2.54 ± 17%      -0.5        2.08 ± 10%  perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
      0.20 ± 28%      -0.1        0.13 ± 27%  perf-profile.children.cycles-pp.irqtime_account_irq
      0.13 ± 19%      +0.1        0.20 ± 24%  perf-profile.children.cycles-pp.hrtimer_start_range_ns
 2.068e+09            +3.7%  2.143e+09        perf-stat.i.branch-instructions
      0.55            -0.0        0.52        perf-stat.i.branch-miss-rate%
  12019422            -4.1%   11526701        perf-stat.i.branch-misses
      0.50            -3.5%       0.48        perf-stat.i.cpi
 1.767e+09            +3.6%   1.83e+09        perf-stat.i.dTLB-loads
 1.009e+10            +3.5%  1.044e+10        perf-stat.i.instructions
     19534            +2.4%      20010        perf-stat.i.instructions-per-iTLB-miss
      2.03            +3.7%       2.11        perf-stat.i.ipc
    123.98            +3.1%     127.81        perf-stat.i.metric.M/sec
      0.58            -0.0        0.54        perf-stat.overall.branch-miss-rate%
      0.49            -3.6%       0.48        perf-stat.overall.cpi
     17843            +2.3%      18252        perf-stat.overall.instructions-per-iTLB-miss
      2.02            +3.7%       2.10        perf-stat.overall.ipc
 2.035e+09            +3.7%   2.11e+09        perf-stat.ps.branch-instructions
  11834693            -4.1%   11344043        perf-stat.ps.branch-misses
 1.739e+09            +3.6%  1.801e+09        perf-stat.ps.dTLB-loads
    497472            +1.6%     505490        perf-stat.ps.iTLB-loads
 9.932e+09            +3.5%  1.028e+10        perf-stat.ps.instructions
 6.277e+11            +3.7%  6.512e+11        perf-stat.total.instructions
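
To post-process comparison rows like the ones above, something along these
lines may help. It is a sketch only: the row layout is inferred from the rows
in this report rather than taken from any lkp-tests specification, and the
names ROW and parse_row are hypothetical.

```python
import re

# Assumed row shape: <base> [± sd%]  <change>[%]  <new> [± sd%]  <metric>
ROW = re.compile(
    r"^\s*(?P<base>[\d.]+(?:e[+-]?\d+)?)(?:\s*±\s*(?P<base_sd>\d+)%)?"
    r"\s+(?P<change>[+-][\d.]+)(?P<unit>%?)"
    r"\s+(?P<new>[\d.]+(?:e[+-]?\d+)?)(?:\s*±\s*(?P<new_sd>\d+)%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line: str):
    """Return a dict for one comparison row, or None if the line does not match."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "metric": m["metric"],
        "base": float(m["base"]),
        "new": float(m["new"]),
        "change": float(m["change"]),
        # True for relative changes ("+4.5%"), False for point deltas ("-1.4").
        "relative": m["unit"] == "%",
    }


row = parse_row("     18.26            +4.5%      19.08        stress-ng.numa.ops_per_sec")
print(row["metric"], row["base"], row["new"], row["change"], row["relative"])
# prints: stress-ng.numa.ops_per_sec 18.26 19.08 4.5 True
```

Header and separator lines do not match the pattern, so parse_row returns None
for them and they can be skipped when iterating over the whole table.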





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


