From: kernel test robot <oliver.sang@intel.com>
To: Yu Zhao <yuzhao@google.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	Kairui Song <kasong@tencent.com>,
	Kalesh Singh <kaleshsingh@google.com>, <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	<linux-kernel@vger.kernel.org>, Yu Zhao <yuzhao@google.com>,
	<oliver.sang@intel.com>
Subject: Re: [PATCH mm-unstable v3 6/6] mm/mglru: rework workingset protection
Date: Mon, 23 Dec 2024 16:44:44 +0800
Message-ID: <202412231601.f1eb8f84-lkp@intel.com>
In-Reply-To: <20241207221522.2250311-7-yuzhao@google.com>



Hello,

kernel test robot noticed a 5.7% regression of will-it-scale.per_process_ops on:


commit: 3b7734aa8458b62ecbfd785ca7918e831565006e ("[PATCH mm-unstable v3 6/6] mm/mglru: rework workingset protection")
url: https://github.com/intel-lab-lkp/linux/commits/Yu-Zhao/mm-mglru-clean-up-workingset/20241208-061714
base: v6.13-rc1
patch link: https://lore.kernel.org/all/20241207221522.2250311-7-yuzhao@google.com/
patch subject: [PATCH mm-unstable v3 6/6] mm/mglru: rework workingset protection

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 100%
	mode: process
	test: pread2
	cpufreq_governor: performance
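
For context on what the headline metric measures: in the table below, will-it-scale.104.processes divided by the 104 workers (nr_task: 100%, i.e. one worker per CPU, in "process" mode) gives will-it-scale.per_process_ops, the per-worker iteration rate of the pread2 test loop. The C program below is NOT the actual will-it-scale pread2 source; it is only a minimal sketch of that kind of loop, assuming a tmpfs-backed file (consistent with the shmem_file_read_iter frames in the profile further down), a hypothetical /dev/shm/pread_test path, and an arbitrary 5-second measurement window.

/*
 * Illustrative sketch only -- not the will-it-scale pread2 source.
 * Each will-it-scale worker runs a tight loop of this kind and the
 * harness reports how many iterations it completes per second.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define BUF_SIZE 4096

int main(void)
{
	/* Assumed path: any tmpfs mount works, e.g. /dev/shm. */
	const char *path = "/dev/shm/pread_test";
	char buf[BUF_SIZE];
	unsigned long ops = 0;
	int fd;

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	memset(buf, 0xa5, sizeof(buf));
	if (pwrite(fd, buf, sizeof(buf), 0) != BUF_SIZE) {
		perror("pwrite");
		return 1;
	}

	/* Assumed measurement window of ~5 seconds. */
	time_t end = time(NULL) + 5;
	while (time(NULL) < end) {
		/* Re-read the same 4 KiB page; on tmpfs this goes through
		 * shmem_file_read_iter(), as seen in the profile below. */
		if (pread(fd, buf, sizeof(buf), 0) != BUF_SIZE) {
			perror("pread");
			return 1;
		}
		ops++;
	}
	printf("%lu pread() calls in ~5s (~%lu ops/s)\n", ops, ops / 5);
	close(fd);
	return 0;
}

The regression reported above is a 5.7% drop in that per-worker rate under full CPU load (13479 -> 12708 ops in the table below).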




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202412231601.f1eb8f84-lkp@intel.com


Details are as follows:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241223/202412231601.f1eb8f84-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/pread2/will-it-scale

commit: 
  4a202aca7c ("mm/mglru: rework refault detection")
  3b7734aa84 ("mm/mglru: rework workingset protection")

4a202aca7c7d9f99 3b7734aa8458b62ecbfd785ca79 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1.03 ±  3%      -0.1        0.92 ±  5%  mpstat.cpu.all.usr%
      0.29 ± 14%     +20.8%       0.35 ±  7%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      1.02 ± 21%     +50.7%       1.54 ± 23%  perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.01 ± 50%     -66.9%       0.00 ± 82%  perf-stat.i.major-faults
      0.01 ± 50%     -73.6%       0.00 ±112%  perf-stat.ps.major-faults
    335982           -60.7%     132060 ± 15%  proc-vmstat.nr_active_anon
    335982           -60.7%     132060 ± 15%  proc-vmstat.nr_zone_active_anon
   1343709           -60.7%     528460 ± 15%  meminfo.Active
   1343709           -60.7%     528460 ± 15%  meminfo.Active(anon)
    259.96        +3.2e+05%     821511 ± 11%  meminfo.Inactive
   1401961            -5.7%    1321692 ±  2%  will-it-scale.104.processes
     13479            -5.7%      12708 ±  2%  will-it-scale.per_process_ops
   1401961            -5.7%    1321692 ±  2%  will-it-scale.workload
    138691 ± 43%     -75.8%      33574 ± 55%  numa-vmstat.node0.nr_active_anon
    138691 ± 43%     -75.8%      33574 ± 55%  numa-vmstat.node0.nr_zone_active_anon
    197311 ± 30%     -50.1%      98494 ± 18%  numa-vmstat.node1.nr_active_anon
    197311 ± 30%     -50.1%      98494 ± 18%  numa-vmstat.node1.nr_zone_active_anon
    554600 ± 43%     -75.8%     134360 ± 55%  numa-meminfo.node0.Active
    554600 ± 43%     -75.8%     134360 ± 55%  numa-meminfo.node0.Active(anon)
    173.31 ± 70%  +1.4e+05%     247821 ± 50%  numa-meminfo.node0.Inactive
    789291 ± 30%     -50.1%     394029 ± 18%  numa-meminfo.node1.Active
    789291 ± 30%     -50.1%     394029 ± 18%  numa-meminfo.node1.Active(anon)
     86.66 ±141%  +6.6e+05%     573998 ± 27%  numa-meminfo.node1.Inactive
     38.95            -0.9       38.09        perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.folio_wait_bit_common.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read
     38.83            -0.9       37.97        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.folio_wait_bit_common.shmem_get_folio_gfp.shmem_file_read_iter
     39.70            -0.8       38.86        perf-profile.calltrace.cycles-pp.folio_wait_bit_common.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.__x64_sys_pread64
     41.03            -0.8       40.26        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64
      0.91            +0.0        0.95        perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_file_read_iter.vfs_read.__x64_sys_pread64
     53.14            +0.5       53.66        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_wake_bit.shmem_file_read_iter.vfs_read
     53.24            +0.5       53.76        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_wake_bit.shmem_file_read_iter.vfs_read.__x64_sys_pread64
     53.84            +0.5       54.38        perf-profile.calltrace.cycles-pp.folio_wake_bit.shmem_file_read_iter.vfs_read.__x64_sys_pread64.do_syscall_64
     38.96            -0.9       38.09        perf-profile.children.cycles-pp._raw_spin_lock_irq
     39.71            -0.8       38.87        perf-profile.children.cycles-pp.folio_wait_bit_common
     41.04            -0.8       40.26        perf-profile.children.cycles-pp.shmem_get_folio_gfp
     92.00            -0.3       91.67        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      0.22            -0.0        0.18 ±  3%  perf-profile.children.cycles-pp._copy_to_iter
      0.22 ±  2%      -0.0        0.19 ±  2%  perf-profile.children.cycles-pp.copy_page_to_iter
      0.20 ±  2%      -0.0        0.16 ±  4%  perf-profile.children.cycles-pp.rep_movs_alternative
      0.91            +0.0        0.96        perf-profile.children.cycles-pp.filemap_get_entry
      0.00            +0.3        0.35        perf-profile.children.cycles-pp.folio_mark_accessed
     53.27            +0.5       53.80        perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     53.86            +0.5       54.40        perf-profile.children.cycles-pp.folio_wake_bit
     92.00            -0.3       91.67        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.19            -0.0        0.16 ±  3%  perf-profile.self.cycles-pp.rep_movs_alternative
      0.41            +0.0        0.44        perf-profile.self.cycles-pp.shmem_get_folio_gfp
      0.37 ±  2%      +0.0        0.40        perf-profile.self.cycles-pp.folio_wait_bit_common
      0.90            +0.0        0.94        perf-profile.self.cycles-pp.filemap_get_entry
      0.61            +0.1        0.68        perf-profile.self.cycles-pp.shmem_file_read_iter
      0.00            +0.3        0.34 ±  2%  perf-profile.self.cycles-pp.folio_mark_accessed




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



Thread overview: 10+ messages
2024-12-07 22:15 [PATCH mm-unstable v3 0/6] mm/mglru: performance optimizations Yu Zhao
2024-12-07 22:15 ` [PATCH mm-unstable v3 1/6] mm/mglru: clean up workingset Yu Zhao
2024-12-07 22:15 ` [PATCH mm-unstable v3 2/6] mm/mglru: optimize deactivation Yu Zhao
2024-12-07 22:15 ` [PATCH mm-unstable v3 3/6] mm/mglru: rework aging feedback Yu Zhao
2024-12-07 22:15 ` [PATCH mm-unstable v3 4/6] mm/mglru: rework type selection Yu Zhao
2024-12-07 22:15 ` [PATCH mm-unstable v3 5/6] mm/mglru: rework refault detection Yu Zhao
2024-12-07 22:15 ` [PATCH mm-unstable v3 6/6] mm/mglru: rework workingset protection Yu Zhao
2024-12-23  8:44   ` kernel test robot [this message]
2024-12-24 19:04     ` Yu Zhao
2024-12-26  2:51       ` Oliver Sang
