linux-mm.kvack.org archive mirror
From: kernel test robot <oliver.sang@intel.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Shivank Garg <shivankg@amd.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	Christian Brauner <brauner@kernel.org>,
	"David Hildenbrand" <david@redhat.com>,
	David Howells <dhowells@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Hugh Dickins <hughd@google.com>, "Jann Horn" <jannh@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	Klara Modin <klarasmodin@gmail.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Lokesh Gidra <lokeshgidra@google.com>,
	Mateusz Guzik <mjguzik@gmail.com>,
	Matthew Wilcox <willy@infradead.org>,
	"Mel Gorman" <mgorman@techsingularity.net>,
	Michal Hocko <mhocko@suse.com>,
	"Minchan Kim" <minchan@google.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	"Peter Xu" <peterx@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Sourav Panda <souravpanda@google.com>,
	Wei Yang <richard.weiyang@gmail.com>,
	Will Deacon <will@kernel.org>, Heiko Carstens <hca@linux.ibm.com>,
	Stephen Rothwell <sfr@canb.auug.org.au>, <linux-mm@kvack.org>,
	<oliver.sang@intel.com>
Subject: [linus:master] [mm] 6bef4c2f97: stress-ng.mlockmany.ops_per_sec 5.2% improvement
Date: Tue, 10 Jun 2025 22:38:45 +0800	[thread overview]
Message-ID: <202506102254.13cda0af-lkp@intel.com> (raw)



Hello,

The kernel test robot noticed a 5.2% improvement of stress-ng.mlockmany.ops_per_sec on:


commit: 6bef4c2f97221f3b595d08c8656eb5845ef80fe9 ("mm: move lesser used vma_area_struct members into the last cacheline")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: mlockmany
	cpufreq_governor: performance
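The parameters above can be approximated by hand with a plain stress-ng invocation (a sketch only; the authoritative job file is in the 0-day archive linked below, and setting the `performance` cpufreq governor is a separate step). `nr_threads: 100%` means one worker per CPU:

```shell
# Hypothetical reproduction sketch; assumes stress-ng is installed.
nr_threads=$(nproc)   # nr_threads: 100% -> one worker per online CPU
cmd="stress-ng --mlockmany ${nr_threads} --timeout 60s --metrics-brief"
echo "$cmd"
# Uncomment to actually run (mlock limits may require root):
# sudo $cmd
```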



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250610/202506102254.13cda0af-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/mlockmany/stress-ng/60s

commit: 
  f35ab95ca0 ("mm: replace vm_lock and detached flag with a reference count")
  6bef4c2f97 ("mm: move lesser used vma_area_struct members into the last cacheline")

f35ab95ca0af7a27 6bef4c2f97221f3b595d08c8656 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.66 ±  5%      -0.1        0.57 ±  9%  mpstat.cpu.all.soft%
     27183            +1.9%      27708        vmstat.system.cs
    264643            +5.2%     278326        stress-ng.mlockmany.ops
      4406            +5.2%       4634        stress-ng.mlockmany.ops_per_sec
    314509            +4.9%     329874        stress-ng.time.voluntary_context_switches
    343582            -3.7%     330742 ±  2%  proc-vmstat.nr_active_anon
    454064            -2.7%     441886        proc-vmstat.nr_anon_pages
     54743            -3.5%      52828        proc-vmstat.nr_slab_unreclaimable
    343583            -3.7%     330741 ±  2%  proc-vmstat.nr_zone_active_anon
      1.99 ±  8%     -14.0%       1.72 ± 12%  sched_debug.cfs_rq:/.h_nr_queued.stddev
      1.98 ±  8%     -13.9%       1.71 ± 12%  sched_debug.cfs_rq:/.h_nr_runnable.stddev
      0.00 ± 18%     -24.8%       0.00 ± 20%  sched_debug.cpu.next_balance.stddev
      1.99 ±  8%     -13.8%       1.72 ± 12%  sched_debug.cpu.nr_running.stddev
      0.25            +0.0        0.25        perf-stat.i.branch-miss-rate%
  21663531            +1.7%   22033919        perf-stat.i.branch-misses
     27855            +1.8%      28352        perf-stat.i.context-switches
      0.25            +0.0        0.25        perf-stat.overall.branch-miss-rate%
  21319615            +1.7%   21691011        perf-stat.ps.branch-misses
     27388            +1.7%      27866        perf-stat.ps.context-switches
     19.64 ±  7%     -18.7%      15.97 ± 11%  perf-sched.sch_delay.avg.ms.__cond_resched.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node.dup_task_struct
     11.34 ±  8%     -13.5%       9.80 ±  6%  perf-sched.sch_delay.avg.ms.__cond_resched.down_read.__mm_populate.do_mlock.__x64_sys_mlock
     17.11 ±  4%      -8.2%      15.70 ±  5%  perf-sched.sch_delay.avg.ms.__cond_resched.mlock_pte_range.walk_pmd_range.isra.0
     10.51 ± 10%     +35.6%      14.26 ± 15%  perf-sched.sch_delay.avg.ms.__cond_resched.wp_page_copy.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
     52.76 ± 22%     -31.2%      36.28 ± 18%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
     50.19 ±  7%     -26.9%      36.68 ± 45%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node.dup_task_struct
     23.36 ±  9%     -14.2%      20.03 ±  6%  perf-sched.wait_and_delay.avg.ms.__cond_resched.down_read.__mm_populate.do_mlock.__x64_sys_mlock
     51.05 ± 10%     -34.3%      33.53 ± 45%  perf-sched.wait_and_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.prepare_creds.copy_creds.copy_process
    245.67 ±  6%     -47.6%     128.83 ±  4%  perf-sched.wait_and_delay.count.__cond_resched.copy_page_range.dup_mmap.dup_mm.constprop
    286.83 ±  7%     -21.0%     226.67 ±  5%  perf-sched.wait_and_delay.count.__cond_resched.down_write.anon_vma_clone.anon_vma_fork.dup_mmap
    120.67 ±  9%     +32.6%     160.00 ±  8%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.anon_vma_fork
    225.41 ± 31%     -33.7%     149.44 ±  7%  perf-sched.wait_and_delay.max.ms.__cond_resched.copy_page_range.dup_mmap.dup_mm.constprop
     77.77 ± 73%     +79.0%     139.22 ± 15%  perf-sched.wait_and_delay.max.ms.__cond_resched.uprobe_start_dup_mmap.dup_mm.constprop.0
     12.02 ± 11%     -14.9%      10.23 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.down_read.__mm_populate.do_mlock.__x64_sys_mlock
     31.78 ± 18%     -31.9%      21.63 ± 11%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.prepare_creds.copy_creds.copy_process
     16.57 ±  5%      -9.3%      15.03 ±  5%  perf-sched.wait_time.avg.ms.__cond_resched.mlock_pte_range.walk_pmd_range.isra.0
     25.21 ±  7%     +12.4%      28.34 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.mutex_lock_killable.pcpu_alloc_noprof.mm_init.dup_mm
     24.68 ± 29%     +39.0%      34.31 ± 15%  perf-sched.wait_time.avg.ms.__cond_resched.uprobe_start_dup_mmap.dup_mm.constprop.0
    207.48 ± 35%     -32.5%     140.02 ±  6%  perf-sched.wait_time.max.ms.__cond_resched.copy_page_range.dup_mmap.dup_mm.constprop
     70.62 ± 41%     +75.6%     124.03 ± 15%  perf-sched.wait_time.max.ms.__cond_resched.uprobe_start_dup_mmap.dup_mm.constprop.0




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


