* [PATCH v2 1/6] numa: generalize numa_map_to_online_node()
2023-08-19 14:12 [PATCH v2 0/6] sched fixes Yury Norov
@ 2023-08-19 14:12 ` Yury Norov
2023-08-19 14:12 ` [PATCH v2 2/6] sched/fair: fix opencoded numa_nearest_node() Yury Norov
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Yury Norov @ 2023-08-19 14:12 UTC
To: linux-kernel, linux-mm
Cc: Yury Norov, Ingo Molnar, Peter Zijlstra, Andrew Morton,
Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann,
Jacob Keller, Jakub Kicinski, Juri Lelli, Mel Gorman,
Steven Rostedt, Tariq Toukan, Valentin Schneider,
Vincent Guittot, shiju.jose, jonathan.cameron, prime.zeng,
linuxarm, yangyicong, Andy Shevchenko, Rasmus Villemoes
The function in fact searches for the node nearest to a given one,
based on the N_ONLINE state. Searching for the nearest node in a given
state is a common pattern.
This patch converts numa_map_to_online_node() to numa_nearest_node(),
which takes the node state as a parameter, so that others won't need to
open-code the logic.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
include/linux/numa.h | 7 +++++--
mm/mempolicy.c | 18 +++++++++++-------
2 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/include/linux/numa.h b/include/linux/numa.h
index 59df211d051f..fb30a42f0700 100644
--- a/include/linux/numa.h
+++ b/include/linux/numa.h
@@ -25,7 +25,7 @@
#include <asm/sparsemem.h>
/* Generic implementation available */
-int numa_map_to_online_node(int node);
+int numa_nearest_node(int node, unsigned int state);
#ifndef memory_add_physaddr_to_nid
static inline int memory_add_physaddr_to_nid(u64 start)
@@ -44,10 +44,11 @@ static inline int phys_to_target_node(u64 start)
}
#endif
#else /* !CONFIG_NUMA */
-static inline int numa_map_to_online_node(int node)
+static inline int numa_nearest_node(int node, unsigned int state)
{
return NUMA_NO_NODE;
}
+
static inline int memory_add_physaddr_to_nid(u64 start)
{
return 0;
@@ -58,6 +59,8 @@ static inline int phys_to_target_node(u64 start)
}
#endif
+#define numa_map_to_online_node(node) numa_nearest_node(node, N_ONLINE)
+
#ifdef CONFIG_HAVE_ARCH_NODE_DEV_GROUP
extern const struct attribute_group arch_node_dev_group;
#endif
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index c53f8beeb507..0fc9a3b1d765 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -131,22 +131,26 @@ static struct mempolicy default_policy = {
static struct mempolicy preferred_node_policy[MAX_NUMNODES];
/**
- * numa_map_to_online_node - Find closest online node
+ * numa_nearest_node - Find nearest node by state
* @node: Node id to start the search
+ * @state: State to filter the search
*
- * Lookup the next closest node by distance if @nid is not online.
+ * Lookup the closest node by distance if @node is not in @state.
*
- * Return: this @node if it is online, otherwise the closest node by distance
+ * Return: this @node if it is in state, otherwise the closest node by distance
*/
-int numa_map_to_online_node(int node)
+int numa_nearest_node(int node, unsigned int state)
{
int min_dist = INT_MAX, dist, n, min_node;
- if (node == NUMA_NO_NODE || node_online(node))
+ if (state >= NR_NODE_STATES)
+ return -EINVAL;
+
+ if (node == NUMA_NO_NODE || node_state(node, state))
return node;
min_node = node;
- for_each_online_node(n) {
+ for_each_node_state(n, state) {
dist = node_distance(node, n);
if (dist < min_dist) {
min_dist = dist;
@@ -156,7 +160,7 @@ int numa_map_to_online_node(int node)
return min_node;
}
-EXPORT_SYMBOL_GPL(numa_map_to_online_node);
+EXPORT_SYMBOL_GPL(numa_nearest_node);
struct mempolicy *get_task_policy(struct task_struct *p)
{
--
2.39.2
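The search described in this patch can be tried outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: the `toy_` names, the two-entry state enum, and the distance table are all made up for illustration, and only the control flow mirrors numa_nearest_node() as posted above.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

#define NR_NODES 4
#define NO_NODE  (-1)	/* stands in for NUMA_NO_NODE */

/* Hypothetical stand-ins for N_ONLINE and N_CPU. */
enum toy_state { TOY_ONLINE, TOY_CPU, TOY_NR_STATES };

/* Bit n of toy_states[s] is set when node n is in state s. */
static const unsigned int toy_states[TOY_NR_STATES] = {
	[TOY_ONLINE] = 0xf,	/* nodes 0-3 are online */
	[TOY_CPU]    = 0x5,	/* only nodes 0 and 2 have CPUs */
};

/* Made-up symmetric distance table. */
static const int toy_distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 20, 30 },
	{ 30, 20, 10, 20 },
	{ 40, 30, 20, 10 },
};

static int toy_node_state(int node, unsigned int state)
{
	return (toy_states[state] >> node) & 1;
}

/* Return @node if it is in @state, otherwise the closest node that is. */
static int toy_nearest_node(int node, unsigned int state)
{
	int min_dist = INT_MAX, min_node = node, n;

	if (state >= TOY_NR_STATES)
		return -EINVAL;
	assert(node == NO_NODE || (node >= 0 && node < NR_NODES));

	if (node == NO_NODE || toy_node_state(node, state))
		return node;

	for (n = 0; n < NR_NODES; n++) {
		if (!toy_node_state(n, state))
			continue;
		/* Strict '<' keeps the first node found on a tie. */
		if (toy_distance[node][n] < min_dist) {
			min_dist = toy_distance[node][n];
			min_node = n;
		}
	}
	return min_node;
}
```

With this table, node 3 has no CPUs and maps to node 2, while node 1 is equally far from nodes 0 and 2 and maps to node 0 because the comparison is strict.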
* [PATCH v2 2/6] sched/fair: fix opencoded numa_nearest_node()
2023-08-19 14:12 [PATCH v2 0/6] sched fixes Yury Norov
2023-08-19 14:12 ` [PATCH v2 1/6] numa: generalize numa_map_to_online_node() Yury Norov
@ 2023-08-19 14:12 ` Yury Norov
2023-08-19 14:12 ` [PATCH v2 3/6] sched: fix sched_numa_find_nth_cpu() in CPU-less case Yury Norov
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Yury Norov @ 2023-08-19 14:12 UTC
To: linux-kernel, linux-mm
Cc: Yury Norov, Ingo Molnar, Peter Zijlstra, Andrew Morton,
Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann,
Jacob Keller, Jakub Kicinski, Juri Lelli, Mel Gorman,
Steven Rostedt, Tariq Toukan, Valentin Schneider,
Vincent Guittot, shiju.jose, jonathan.cameron, prime.zeng,
linuxarm, yangyicong, Andy Shevchenko, Rasmus Villemoes
task_numa_placement() open-codes the search for the nearest node with
CPUs to migrate to, using for_each_node_state(). Now that we have
numa_nearest_node(), switch to using it.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
kernel/sched/fair.c | 14 +-------------
1 file changed, 1 insertion(+), 13 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b3e25be58e2b..e7b7cf87937b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2645,19 +2645,7 @@ static void task_numa_placement(struct task_struct *p)
}
/* Cannot migrate task to CPU-less node */
- if (max_nid != NUMA_NO_NODE && !node_state(max_nid, N_CPU)) {
- int near_nid = max_nid;
- int distance, near_distance = INT_MAX;
-
- for_each_node_state(nid, N_CPU) {
- distance = node_distance(max_nid, nid);
- if (distance < near_distance) {
- near_nid = nid;
- near_distance = distance;
- }
- }
- max_nid = near_nid;
- }
+ max_nid = numa_nearest_node(max_nid, N_CPU);
if (ng) {
numa_group_count_active_nodes(ng);
--
2.39.2
* [PATCH v2 3/6] sched: fix sched_numa_find_nth_cpu() in CPU-less case
2023-08-19 14:12 [PATCH v2 0/6] sched fixes Yury Norov
2023-08-19 14:12 ` [PATCH v2 1/6] numa: generalize numa_map_to_online_node() Yury Norov
2023-08-19 14:12 ` [PATCH v2 2/6] sched/fair: fix opencoded numa_nearest_node() Yury Norov
@ 2023-08-19 14:12 ` Yury Norov
2023-08-19 14:12 ` [PATCH v2 4/6] sched: fix sched_numa_find_nth_cpu() in non-NUMA case Yury Norov
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Yury Norov @ 2023-08-19 14:12 UTC
To: linux-kernel, linux-mm
Cc: Yury Norov, Ingo Molnar, Peter Zijlstra, Andrew Morton,
Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann,
Jacob Keller, Jakub Kicinski, Juri Lelli, Mel Gorman,
Steven Rostedt, Tariq Toukan, Valentin Schneider,
Vincent Guittot, shiju.jose, jonathan.cameron, prime.zeng,
linuxarm, yangyicong, Andy Shevchenko, Rasmus Villemoes,
Guenter Roeck
When the node provided by the user is CPU-less, the corresponding record
in sched_domains_numa_masks is not set. Trying to dereference it in the
following code leads to a kernel crash.
To avoid it, start searching from the nearest node with CPUs.
Fixes: cd7f55359c90 ("sched: add sched_numa_find_nth_cpu()")
Reported-by: Yicong Yang <yangyicong@hisilicon.com>
Closes: https://lore.kernel.org/lkml/CAAH8bW8C5humYnfpW3y5ypwx0E-09A3QxFE1JFzR66v+mO4XfA@mail.gmail.com/T/
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/lkml/ZMHSNQfv39HN068m@yury-ThinkPad/T/#mf6431cb0b7f6f05193c41adeee444bc95bf2b1c4
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
---
kernel/sched/topology.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index d3a3b2646ec4..c6e89afa0d65 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2113,12 +2113,16 @@ static int hop_cmp(const void *a, const void *b)
*/
int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
{
- struct __cmp_key k = { .cpus = cpus, .node = node, .cpu = cpu };
+ struct __cmp_key k = { .cpus = cpus, .cpu = cpu };
struct cpumask ***hop_masks;
int hop, ret = nr_cpu_ids;
rcu_read_lock();
+ /* CPU-less node entries are uninitialized in sched_domains_numa_masks */
+ node = numa_nearest_node(node, N_CPU);
+ k.node = node;
+
k.masks = rcu_dereference(sched_domains_numa_masks);
if (!k.masks)
goto unlock;
--
2.39.2
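The failure mode and the fix can be modelled in userspace. The sketch below is only an illustration under simplifying assumptions: a flat `toy_masks` array with a NULL slot for the CPU-less node stands in for sched_domains_numa_masks, and all `toy_` names and distances are hypothetical. It shows the node being remapped to the nearest node with CPUs before the table is dereferenced.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_NODES 3

/* Per-node CPU bitmaps; node 1 is CPU-less, so its slot was never set. */
static const unsigned long node0_cpus = 0x03;	/* CPUs 0-1 */
static const unsigned long node2_cpus = 0x0c;	/* CPUs 2-3 */
static const unsigned long *toy_masks[NR_NODES] = {
	&node0_cpus, NULL, &node2_cpus,
};

/* Made-up distances: node 1 is closer to node 2 than to node 0. */
static const int toy_distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 40 },
	{ 20, 10, 15 },
	{ 40, 15, 10 },
};

/* Return @node if it has CPUs, otherwise the nearest node that does. */
static int toy_nearest_cpu_node(int node)
{
	int best = node, best_dist = INT_MAX, n;

	if (toy_masks[node])
		return node;

	for (n = 0; n < NR_NODES; n++) {
		if (toy_masks[n] && toy_distance[node][n] < best_dist) {
			best_dist = toy_distance[node][n];
			best = n;
		}
	}
	return best;
}

/* Return the first CPU usable from @node, or -1 if there is none. */
static int toy_first_cpu(int node)
{
	const unsigned long *mask;
	int cpu;

	/* Remap first, so a CPU-less node never reaches the dereference. */
	node = toy_nearest_cpu_node(node);
	mask = toy_masks[node];
	if (!mask)
		return -1;

	for (cpu = 0; cpu < (int)(sizeof(*mask) * CHAR_BIT); cpu++)
		if (*mask & (1UL << cpu))
			return cpu;
	return -1;
}
```

Without the remap, toy_first_cpu(1) would read through the NULL slot of node 1; with it, the lookup starts from node 2.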
* [PATCH v2 4/6] sched: fix sched_numa_find_nth_cpu() in non-NUMA case
2023-08-19 14:12 [PATCH v2 0/6] sched fixes Yury Norov
` (2 preceding siblings ...)
2023-08-19 14:12 ` [PATCH v2 3/6] sched: fix sched_numa_find_nth_cpu() in CPU-less case Yury Norov
@ 2023-08-19 14:12 ` Yury Norov
2023-08-19 14:12 ` [PATCH v2 5/6] sched: handle NUMA_NO_NODE in sched_numa_find_nth_cpu() Yury Norov
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Yury Norov @ 2023-08-19 14:12 UTC
To: linux-kernel, linux-mm
Cc: Yury Norov, Ingo Molnar, Peter Zijlstra, Andrew Morton,
Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann,
Jacob Keller, Jakub Kicinski, Juri Lelli, Mel Gorman,
Steven Rostedt, Tariq Toukan, Valentin Schneider,
Vincent Guittot, shiju.jose, jonathan.cameron, prime.zeng,
linuxarm, yangyicong, Andy Shevchenko, Rasmus Villemoes
When CONFIG_NUMA is enabled, sched_numa_find_nth_cpu() searches for a
CPU in sched_domains_numa_masks. The masks include only online CPUs,
so offline CPUs are effectively skipped.
When CONFIG_NUMA is disabled, the fallback function should be consistent
and skip offline CPUs as well.
Fixes: cd7f55359c90 ("sched: add sched_numa_find_nth_cpu()")
Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
include/linux/topology.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/topology.h b/include/linux/topology.h
index fea32377f7c7..52f5850730b3 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -251,7 +251,7 @@ extern const struct cpumask *sched_numa_hop_mask(unsigned int node, unsigned int
#else
static __always_inline int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
{
- return cpumask_nth(cpu, cpus);
+ return cpumask_nth_and(cpu, cpus, cpu_online_mask);
}
static inline const struct cpumask *
--
2.39.2
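The difference between the old and the new fallback is easy to see on plain bitmaps. This is a userspace sketch, assuming a 64-bit word in place of struct cpumask; `toy_nth()` and `toy_nth_and()` are hypothetical helpers that only imitate the 0-based "Nth set bit" behaviour of cpumask_nth() and cpumask_nth_and().

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 64	/* plays the role of nr_cpu_ids */

/* Return the index of the n-th (0-based) set bit, or TOY_NR_CPUS. */
static unsigned int toy_nth(unsigned int n, uint64_t mask)
{
	unsigned int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!(mask & (UINT64_C(1) << cpu)))
			continue;
		if (n-- == 0)
			return cpu;
	}
	return TOY_NR_CPUS;
}

/* Same, but only bits set in both masks are counted. */
static unsigned int toy_nth_and(unsigned int n, uint64_t a, uint64_t b)
{
	return toy_nth(n, a & b);
}
```

For example, with cpus = 0x0f and an online mask of 0x0d (CPU 1 offline), toy_nth(1, cpus) picks the offline CPU 1, whereas toy_nth_and(1, cpus, online) skips it and picks CPU 2.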
* [PATCH v2 5/6] sched: handle NUMA_NO_NODE in sched_numa_find_nth_cpu()
2023-08-19 14:12 [PATCH v2 0/6] sched fixes Yury Norov
` (3 preceding siblings ...)
2023-08-19 14:12 ` [PATCH v2 4/6] sched: fix sched_numa_find_nth_cpu() in non-NUMA case Yury Norov
@ 2023-08-19 14:12 ` Yury Norov
2023-08-19 14:12 ` [PATCH v2 6/6] sched: fix sched_numa_find_nth_cpu() comment Yury Norov
2023-08-25 11:31 ` [PATCH v2 0/6] sched fixes Yury Norov
6 siblings, 0 replies; 9+ messages in thread
From: Yury Norov @ 2023-08-19 14:12 UTC
To: linux-kernel, linux-mm
Cc: Yury Norov, Ingo Molnar, Peter Zijlstra, Andrew Morton,
Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann,
Jacob Keller, Jakub Kicinski, Juri Lelli, Mel Gorman,
Steven Rostedt, Tariq Toukan, Valentin Schneider,
Vincent Guittot, shiju.jose, jonathan.cameron, prime.zeng,
linuxarm, yangyicong, Andy Shevchenko, Rasmus Villemoes
sched_numa_find_nth_cpu() doesn't handle NUMA_NO_NODE properly, and
may crash the kernel if it is passed. Meanwhile, the only user of
sched_numa_find_nth_cpu() has to check the NUMA_NO_NODE case explicitly.
It would be easier for users if this logic were moved into
sched_numa_find_nth_cpu().
Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
kernel/sched/topology.c | 3 +++
lib/cpumask.c | 4 +---
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index c6e89afa0d65..bc6802700103 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2117,6 +2117,9 @@ int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
struct cpumask ***hop_masks;
int hop, ret = nr_cpu_ids;
+ if (node == NUMA_NO_NODE)
+ return cpumask_nth_and(cpu, cpus, cpu_online_mask);
+
rcu_read_lock();
/* CPU-less node entries are uninitialized in sched_domains_numa_masks */
diff --git a/lib/cpumask.c b/lib/cpumask.c
index 19277c6d551f..e77ee9d46f71 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -147,9 +147,7 @@ unsigned int cpumask_local_spread(unsigned int i, int node)
/* Wrap: we always want a cpu. */
i %= num_online_cpus();
- cpu = (node == NUMA_NO_NODE) ?
- cpumask_nth(i, cpu_online_mask) :
- sched_numa_find_nth_cpu(cpu_online_mask, i, node);
+ cpu = sched_numa_find_nth_cpu(cpu_online_mask, i, node);
WARN_ON(cpu >= nr_cpu_ids);
return cpu;
--
2.39.2
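The effect on the caller can be sketched in userspace. The model below is deliberately simplified and is not how the kernel orders CPUs: the real function walks NUMA hop masks, while this toy version only distinguishes CPUs local to the node from all others. All `toy_` names, the 8-CPU layout, and the online mask are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 8
#define TOY_NO_NODE (-1)	/* stands in for NUMA_NO_NODE */

static const uint8_t toy_online = 0xbf;			/* CPU 6 is offline */
static const uint8_t toy_node_cpus[2] = { 0x0f, 0xf0 };	/* two nodes, 4 CPUs each */

/* Return the index of the n-th (0-based) set bit, or TOY_NR_CPUS. */
static unsigned int toy_nth(unsigned int n, uint8_t mask)
{
	unsigned int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!(mask & (1u << cpu)))
			continue;
		if (n-- == 0)
			return cpu;
	}
	return TOY_NR_CPUS;
}

/* Count set bits by clearing the lowest one until none are left. */
static unsigned int toy_weight(uint8_t mask)
{
	unsigned int w = 0;

	for (; mask; mask &= (uint8_t)(mask - 1))
		w++;
	return w;
}

/* The callee handles "no node" itself, as this patch proposes. */
static unsigned int toy_find_nth_cpu(uint8_t cpus, unsigned int n, int node)
{
	uint8_t local;
	unsigned int w;

	if (node == TOY_NO_NODE)
		return toy_nth(n, cpus & toy_online);

	/* Toy ordering: online CPUs of @node first, then everything else. */
	local = cpus & toy_online & toy_node_cpus[node];
	w = toy_weight(local);
	if (n < w)
		return toy_nth(n, local);
	return toy_nth(n - w, (uint8_t)(cpus & toy_online & ~local));
}

/* The caller no longer needs its own NO_NODE branch. */
static unsigned int toy_local_spread(unsigned int i, int node)
{
	i %= toy_weight(toy_online);	/* wrap: we always want a CPU */
	return toy_find_nth_cpu(toy_online, i, node);
}
```

With these masks, toy_local_spread(0, 1) starts on node 1 and picks CPU 4, while toy_local_spread(6, TOY_NO_NODE) simply picks the seventh online CPU, which is CPU 7.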
* [PATCH v2 6/6] sched: fix sched_numa_find_nth_cpu() comment
2023-08-19 14:12 [PATCH v2 0/6] sched fixes Yury Norov
` (4 preceding siblings ...)
2023-08-19 14:12 ` [PATCH v2 5/6] sched: handle NUMA_NO_NODE in sched_numa_find_nth_cpu() Yury Norov
@ 2023-08-19 14:12 ` Yury Norov
2023-08-25 11:31 ` [PATCH v2 0/6] sched fixes Yury Norov
6 siblings, 0 replies; 9+ messages in thread
From: Yury Norov @ 2023-08-19 14:12 UTC
To: linux-kernel, linux-mm
Cc: Yury Norov, Ingo Molnar, Peter Zijlstra, Andrew Morton,
Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann,
Jacob Keller, Jakub Kicinski, Juri Lelli, Mel Gorman,
Steven Rostedt, Tariq Toukan, Valentin Schneider,
Vincent Guittot, shiju.jose, jonathan.cameron, prime.zeng,
linuxarm, yangyicong, Andy Shevchenko, Rasmus Villemoes
Reword sched_numa_find_nth_cpu() comment and make it kernel-doc compatible.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
kernel/sched/topology.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index bc6802700103..789b281d2380 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2103,13 +2103,15 @@ static int hop_cmp(const void *a, const void *b)
return -1;
}
-/*
- * sched_numa_find_nth_cpu() - given the NUMA topology, find the Nth next cpu
- * closest to @cpu from @cpumask.
- * cpumask: cpumask to find a cpu from
- * cpu: Nth cpu to find
- *
- * returns: cpu, or nr_cpu_ids when nothing found.
+/**
+ * sched_numa_find_nth_cpu() - given the NUMA topology, find the Nth closest CPU
+ * from @cpus to @cpu, taking into account distance
+ * from a given @node.
+ * @cpus: cpumask to find a cpu from
+ * @cpu: CPU to start searching
+ * @node: NUMA node to order CPUs by distance
+ *
+ * Return: cpu, or nr_cpu_ids when nothing found.
*/
int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
{
--
2.39.2
* Re: [PATCH v2 0/6] sched fixes
2023-08-19 14:12 [PATCH v2 0/6] sched fixes Yury Norov
` (5 preceding siblings ...)
2023-08-19 14:12 ` [PATCH v2 6/6] sched: fix sched_numa_find_nth_cpu() comment Yury Norov
@ 2023-08-25 11:31 ` Yury Norov
2023-09-13 0:01 ` Yury Norov
6 siblings, 1 reply; 9+ messages in thread
From: Yury Norov @ 2023-08-25 11:31 UTC
To: linux-kernel, linux-mm
Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Ben Segall,
Daniel Bristot de Oliveira, Dietmar Eggemann, Jacob Keller,
Jakub Kicinski, Juri Lelli, Mel Gorman, Steven Rostedt,
Tariq Toukan, Valentin Schneider, Vincent Guittot, shiju.jose,
jonathan.cameron, prime.zeng, linuxarm, yangyicong,
Andy Shevchenko, Rasmus Villemoes
Ping?
On Sat, Aug 19, 2023 at 07:12:32AM -0700, Yury Norov wrote:
> Fixes for recently introduced sched_numa_find_nth_cpu(), and minor
> improvements in sched/fair.
>
> v1: https://lore.kernel.org/lkml/20230810162442.9863-1-yury.norov@gmail.com/T/
> v2:
> - fix wording in commit messages;
> - move nearest node search inside rcu lock section in
> sched_numa_find_nth_cpu();
> - move NUMA_NO_NODE handling inside sched_numa_find_nth_cpu();
> - rewrite comment for sched_numa_find_nth_cpu().
> - add review tag from Yicong Yang.
>
> Yury Norov (6):
> numa: generalize numa_map_to_online_node()
> sched/fair: fix opencoded numa_nearest_node()
> sched: fix sched_numa_find_nth_cpu() in CPU-less case
> sched: fix sched_numa_find_nth_cpu() in non-NUMA case
> sched: handle NUMA_NO_NODE in sched_numa_find_nth_cpu()
> sched: fix sched_numa_find_nth_cpu() comment
>
> include/linux/numa.h | 7 +++++--
> include/linux/topology.h | 2 +-
> kernel/sched/fair.c | 14 +-------------
> kernel/sched/topology.c | 25 +++++++++++++++++--------
> lib/cpumask.c | 4 +---
> mm/mempolicy.c | 18 +++++++++++-------
> 6 files changed, 36 insertions(+), 34 deletions(-)
>
> --
> 2.39.2
* Re: [PATCH v2 0/6] sched fixes
2023-08-25 11:31 ` [PATCH v2 0/6] sched fixes Yury Norov
@ 2023-09-13 0:01 ` Yury Norov
0 siblings, 0 replies; 9+ messages in thread
From: Yury Norov @ 2023-09-13 0:01 UTC
To: linux-kernel, linux-mm
Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Ben Segall,
Daniel Bristot de Oliveira, Dietmar Eggemann, Jacob Keller,
Jakub Kicinski, Juri Lelli, Mel Gorman, Steven Rostedt,
Tariq Toukan, Valentin Schneider, Vincent Guittot, shiju.jose,
jonathan.cameron, prime.zeng, linuxarm, yangyicong,
Andy Shevchenko, Rasmus Villemoes
Another ping...
On Fri, Aug 25, 2023 at 04:31:44AM -0700, Yury Norov wrote:
> Ping?