From: kernel test robot <oliver.sang@intel.com>
To: Nikhil Dhama <nikhil.dhama@amd.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Ying Huang <huang.ying.caritas@gmail.com>,
Bharata B Rao <bharata@amd.com>,
Raghavendra <raghavendra.kodsarathimmappa@amd.com>,
<linux-mm@kvack.org>, <ying.huang@linux.alibaba.com>,
Nikhil Dhama <nikhil.dhama@amd.com>,
<linux-kernel@vger.kernel.org>, <oliver.sang@intel.com>
Subject: Re: [PATCH] mm: pcp: scale batch to reduce number of high order pcp flushes on deallocation
Date: Mon, 31 Mar 2025 22:10:13 +0800
Message-ID: <202503312148.c74b0351-lkp@intel.com>
In-Reply-To: <20250325171915.14384-1-nikhil.dhama@amd.com>
Hello,
kernel test robot noticed a 32.2% improvement of lmbench3.TCP.socket.bandwidth.10MB.MB/sec on:
commit: 6570c41610d1d2d3b143c253010b38ce9cbc0012 ("[PATCH] mm: pcp: scale batch to reduce number of high order pcp flushes on deallocation")
url: https://github.com/intel-lab-lkp/linux/commits/Nikhil-Dhama/mm-pcp-scale-batch-to-reduce-number-of-high-order-pcp-flushes-on-deallocation/20250326-012247
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/all/20250325171915.14384-1-nikhil.dhama@amd.com/
patch subject: [PATCH] mm: pcp: scale batch to reduce number of high order pcp flushes on deallocation
testcase: lmbench3
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:
test_memory_size: 50%
nr_threads: 100%
mode: development
test: TCP
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250331/202503312148.c74b0351-lkp@intel.com
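To reproduce the job locally, the usual lkp-tests workflow is roughly the following (a sketch assuming the standard harness; the job file name and the exact sub-commands follow typical 0-day reports and may differ between lkp-tests versions):

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml            # job.yaml is the job file from the archive above
        bin/lkp split-job --compatible job.yaml  # generate a yaml file runnable on your setup
        sudo bin/lkp run generated-yaml-file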
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_threads/rootfs/tbox_group/test/test_memory_size/testcase:
gcc-12/performance/x86_64-rhel-9.4/development/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/TCP/50%/lmbench3
commit:
7514d3cb91 ("foo")
6570c41610 ("mm: pcp: scale batch to reduce number of high order pcp flushes on deallocation")
7514d3cb916f9344 6570c41610d1d2d3b143c253010
---------------- ---------------------------
%stddev %change %stddev
\ | \
143.28 ± 38% +49.0% 213.49 ± 20% numa-vmstat.node1.nr_anon_transparent_hugepages
118.00 ± 21% +50.3% 177.33 ± 17% perf-c2c.DRAM.local
182485 +32.2% 241267 lmbench3.TCP.socket.bandwidth.10MB.MB/sec
40582104 ± 6% +114.4% 87026622 ± 2% lmbench3.time.involuntary_context_switches
0.46 ± 2% +0.1 0.52 ± 3% mpstat.cpu.all.irq%
4.57 ± 11% +1.4 5.96 ± 6% mpstat.cpu.all.soft%
291657 ± 38% +49.6% 436355 ± 20% numa-meminfo.node1.AnonHugePages
4728254 ± 36% +32.0% 6241931 ± 26% numa-meminfo.node1.MemUsed
0.40 -24.4% 0.30 ± 12% perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
13.88 ± 3% -78.2% 3.03 ±157% perf-sched.wait_time.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.50 ± 4% +670.3% 11.58 ± 38% perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
1.209e+09 ± 3% +6.5% 1.288e+09 proc-vmstat.numa_hit
1.209e+09 ± 3% +6.5% 1.287e+09 proc-vmstat.numa_local
9.644e+09 ± 3% +6.6% 1.028e+10 proc-vmstat.pgalloc_normal
9.644e+09 ± 3% +6.6% 1.028e+10 proc-vmstat.pgfree
92870937 ± 14% -17.9% 76271910 ± 8% sched_debug.cfs_rq:/.avg_vruntime.avg
2343 ± 10% -17.3% 1938 ± 17% sched_debug.cfs_rq:/.load.min
92870938 ± 14% -17.9% 76271910 ± 8% sched_debug.cfs_rq:/.min_vruntime.avg
13803 ± 10% -22.2% 10740 ± 14% sched_debug.cpu.curr->pid.min
2.87 ± 9% +69.1% 4.85 ± 4% perf-stat.i.MPKI
0.31 ± 6% +0.0 0.34 ± 3% perf-stat.i.branch-miss-rate%
13.92 +1.1 15.06 perf-stat.i.cache-miss-rate%
2.719e+08 ± 9% +27.6% 3.469e+08 ± 4% perf-stat.i.cache-misses
5.658e+11 -2.5% 5.516e+11 perf-stat.i.cpu-cycles
3.618e+11 ± 7% +10.5% 3.996e+11 ± 4% perf-stat.i.instructions
1.64 ± 9% -42.0% 0.95 ± 70% perf-stat.overall.cpi
2233 ± 11% -50.7% 1100 ± 71% perf-stat.overall.cycles-between-cache-misses
5.691e+11 -35.0% 3.702e+11 ± 70% perf-stat.ps.cpu-cycles
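The headline lmbench3.TCP.socket.bandwidth.10MB.MB/sec figure comes from lmbench3's bw_tcp micro-benchmark. A quick way to sanity-check it outside the lkp harness is to run bw_tcp by hand (a rough sketch; the option names follow lmbench3's usage, and mapping the 10MB size to -m is an assumption):

        bw_tcp -s                   # start the bandwidth server on the test machine
        bw_tcp -m 10m localhost     # move 10 MB messages over loopback, prints MB/sec
        bw_tcp -S localhost         # ask the server to shut down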
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Thread overview: 8+ messages
2025-03-25 17:19 Nikhil Dhama
2025-03-30 6:52 ` Huang, Ying
2025-03-31 14:10 ` kernel test robot [this message]
2025-04-01 13:56 ` Nikhil Dhama
2025-04-03 1:36 ` Huang, Ying
2025-04-07 6:32 ` Nikhil Dhama
2025-04-07 7:38 ` Huang, Ying
2025-04-07 11:03 ` Nikhil Dhama