linux-mm.kvack.org archive mirror
From: Libo Chen <libo.chen@oracle.com>
To: akpm@linux-foundation.org, rostedt@goodmis.org,
	peterz@infradead.org, mgorman@suse.de, mingo@redhat.com,
	juri.lelli@redhat.com, vincent.guittot@linaro.org, tj@kernel.org,
	llong@redhat.com
Cc: sraithal@amd.com, venkat88@linux.ibm.com, kprateek.nayak@amd.com,
	raghavendra.kt@amd.com, yu.c.chen@intel.com,
	tim.c.chen@intel.com, vineethr@linux.ibm.com,
	chris.hyser@oracle.com, daniel.m.jordan@oracle.com,
	lorenzo.stoakes@oracle.com, mkoutny@suse.com, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 1/2] sched/numa: Skip VMA scanning on memory pinned to one NUMA node via cpuset.mems
Date: Wed, 23 Apr 2025 17:01:45 -0700
Message-ID: <20250424000146.1197285-2-libo.chen@oracle.com>
In-Reply-To: <20250424000146.1197285-1-libo.chen@oracle.com>

When the memory of the current task is pinned to a single NUMA node via
cpuset.mems, no page can be migrated, so the rest of VMA scanning and the
NUMA hinting page faults it sets up are pure overhead. Return early from
task_numa_work() in that case. With this change, there are no more
unnecessary PTE updates or page faults in this scenario.
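
As an illustration only, the condition tested by this patch can be
sketched in user space. This is not kernel code: node_mask_t,
mask_weight() and should_skip_numa_scan() are hypothetical stand-ins for
nodemask_t, nodes_weight() and the early return added to
task_numa_work(), and the 64-bit mask width is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: bit n set means node n is allowed. */
typedef uint64_t node_mask_t;

/* Count the allowed nodes (user-space analogue of nodes_weight()). */
static int mask_weight(node_mask_t mask)
{
	int weight = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		weight++;
	}
	return weight;
}

/*
 * Same shape as the check in the patch: skip scanning only when cpusets
 * are in use and exactly one memory node is allowed.
 */
static bool should_skip_numa_scan(bool cpusets_on, node_mask_t mems_allowed)
{
	return cpusets_on && mask_weight(mems_allowed) == 1;
}
```

In this sketch a mask of 0x1 (node 0 only) skips the scan, while 0x3
(nodes 0 and 1) does not, and nothing is skipped when cpusets are off.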

We have seen up to a 6x improvement on a typical Java workload running on
VMs with memory and CPU pinned to one NUMA node via cpuset on a two-socket
AARCH64 system. With the same pinning, on an 18-cores-per-socket Intel
platform, we have seen a 20% improvement in a microbenchmark that creates a
30-vCPU selftest KVM guest with 4GB memory, where each vCPU reads 4KB
pages in a fixed number of loops.
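
For readers unfamiliar with that access pattern, a rough user-space
analogue of the per-vCPU read loop might look like the sketch below. It
is not the KVM selftest itself: read_pages(), PAGE_SZ and the
one-byte-per-page read are assumptions made for illustration. When NUMA
balancing has scanned the range, each such read can take a hinting
fault, which is the cost the early return avoids.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u	/* assumed 4KB page size, as in the description */

/*
 * Read one byte from every page of buf, repeated `loops` times.
 * The sum is returned so the reads cannot be optimized away.
 */
static uint64_t read_pages(const volatile uint8_t *buf, size_t len,
			   unsigned int loops)
{
	uint64_t sum = 0;

	for (unsigned int i = 0; i < loops; i++)
		for (size_t off = 0; off < len; off += PAGE_SZ)
			sum += buf[off];
	return sum;
}
```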

Signed-off-by: Libo Chen <libo.chen@oracle.com>
Tested-by: Chen Yu <yu.c.chen@intel.com>
---
 kernel/sched/fair.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e43993a4e580..c9903b1b3948 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3329,6 +3329,13 @@ static void task_numa_work(struct callback_head *work)
 	if (p->flags & PF_EXITING)
 		return;
 
+	/*
+	 * Memory is pinned to only one NUMA node via cpuset.mems, naturally
+	 * no page can be migrated.
+	 */
+	if (cpusets_enabled() && nodes_weight(cpuset_current_mems_allowed) == 1)
+		return;
+
 	if (!mm->numa_next_scan) {
 		mm->numa_next_scan = now +
 			msecs_to_jiffies(sysctl_numa_balancing_scan_delay);
-- 
2.43.5




Thread overview: 9+ messages
2025-04-24  0:01 [PATCH v4 0/2] sched/numa: Skip VMA scanning on memory pinned to one NUMA node via cpuset.mem Libo Chen
2025-04-24  0:01 ` Libo Chen [this message]
2025-04-24  0:01 ` [PATCH v4 2/2] sched/numa: Add tracepoint that tracks the skipping of numa balancing due to cpuset memory pinning Libo Chen
2025-04-24  0:18   ` Steven Rostedt
2025-04-24  0:36     ` Libo Chen
2025-04-24  1:01       ` Steven Rostedt
2025-04-24  1:12         ` Libo Chen
2025-04-24  1:33           ` Steven Rostedt
2025-04-24  1:41             ` Libo Chen
