From: Lee Schermerhorn <lee.schermerhorn@hp.com>
To: linux-mm@kvack.org
Cc: ak@suse.de, Lee Schermerhorn <lee.schermerhorn@hp.com>,
Nishanth Aravamudan <nacc@us.ibm.com>,
pj@sgi.com, kxr@sgi.com, Christoph Lameter <clameter@sgi.com>,
Mel Gorman <mel@skynet.ie>,
akpm@linux-foundation.org,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: [PATCH 13/14] Memoryless Nodes: use "node_memory_map" for cpusets
Date: Fri, 27 Jul 2007 15:44:40 -0400 [thread overview]
Message-ID: <20070727194440.18614.95660.sendpatchset@localhost> (raw)
In-Reply-To: <20070727194316.18614.36380.sendpatchset@localhost>
[patch 13/14] Memoryless Nodes: use "node_memory_map" for cpusets - take 4
Against 2.6.22-rc1-mm1 atop Christoph Lameter's memoryless nodes
series
take 2:
+ replaced node_online_map in cpuset_current_mems_allowed()
with node_states[N_MEMORY]
+ replaced node_online_map in cpuset_init_smp() with
node_states[N_MEMORY]
take 3:
+ fix up comments and top level cpuset tracking of nodes
with memory [instead of on-line nodes]
take 4:
+ fix typo in !CPUSETS definition of cpuset_current_mems_allowed()
+ fix up Documentation/cpusets.txt to reflect these changes.
cpusets try to ensure that any node added to a cpuset's
mems_allowed is on-line and contains memory. The assumption
was that online nodes contained memory. Thus, it is possible
to add memoryless nodes to a cpuset and then add tasks to this
cpuset. This results in continuous series of oom-kill and
apparent system hang.
Change cpusets to use node_states[N_MEMORY] [a.k.a.
node_memory_map] in place of node_online_map when vetting
memories. Return error if admin attempts to write a non-empty
mems_allowed node mask containing only memoryless-nodes.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Tested-by: Nishanth Aravamudan <nacc@us.ibm.com>
Tested on 4-node ppc64 with 2 memoryless nodes. Top cpuset
(and all subsequent ones) only allow nodes 0 and 1 (the
nodes with memory).
Documentation/cpusets.txt | 8 ++++---
include/linux/cpuset.h | 2 -
kernel/cpuset.c | 51 +++++++++++++++++++++++++++++-----------------
3 files changed, 39 insertions(+), 22 deletions(-)
Index: Linux/kernel/cpuset.c
===================================================================
--- Linux.orig/kernel/cpuset.c 2007-07-26 12:40:16.000000000 -0400
+++ Linux/kernel/cpuset.c 2007-07-26 12:55:29.000000000 -0400
@@ -307,26 +307,26 @@ static void guarantee_online_cpus(const
/*
* Return in *pmask the portion of a cpusets's mems_allowed that
- * are online. If none are online, walk up the cpuset hierarchy
- * until we find one that does have some online mems. If we get
- * all the way to the top and still haven't found any online mems,
- * return node_online_map.
+ * are online, with memory. If none are online with memory, walk
+ * up the cpuset hierarchy until we find one that does have some
+ * online mems. If we get all the way to the top and still haven't
+ * found any online mems, return node_states[N_MEMORY].
*
* One way or another, we guarantee to return some non-empty subset
- * of node_online_map.
+ * of node_states[N_MEMORY].
*
* Call with callback_mutex held.
*/
static void guarantee_online_mems(const struct cpuset *cs, nodemask_t *pmask)
{
- while (cs && !nodes_intersects(cs->mems_allowed, node_online_map))
+ while (cs && !nodes_intersects(cs->mems_allowed, node_states[N_MEMORY]))
cs = cs->parent;
if (cs)
- nodes_and(*pmask, cs->mems_allowed, node_online_map);
+ nodes_and(*pmask, cs->mems_allowed, node_states[N_MEMORY]);
else
- *pmask = node_online_map;
- BUG_ON(!nodes_intersects(*pmask, node_online_map));
+ *pmask = node_states[N_MEMORY];
+ BUG_ON(!nodes_intersects(*pmask, node_states[N_MEMORY]));
}
/**
@@ -597,7 +597,7 @@ static int update_nodemask(struct cpuset
int retval;
struct container_iter it;
- /* top_cpuset.mems_allowed tracks node_online_map; it's read-only */
+ /* top_cpuset.mems_allowed tracks node_states[N_MEMORY]; it's read-only */
if (cs == &top_cpuset)
return -EACCES;
@@ -614,8 +614,21 @@ static int update_nodemask(struct cpuset
retval = nodelist_parse(buf, trialcs.mems_allowed);
if (retval < 0)
goto done;
+ if (!nodes_intersects(trialcs.mems_allowed,
+ node_states[N_MEMORY])) {
+ /*
+ * error if only memoryless nodes specified.
+ */
+ retval = -ENOSPC;
+ goto done;
+ }
}
- nodes_and(trialcs.mems_allowed, trialcs.mems_allowed, node_online_map);
+ /*
+ * Exclude memoryless nodes. We know that trialcs.mems_allowed
+ * contains at least one node with memory.
+ */
+ nodes_and(trialcs.mems_allowed, trialcs.mems_allowed,
+ node_states[N_MEMORY]);
oldmem = cs->mems_allowed;
if (nodes_equal(oldmem, trialcs.mems_allowed)) {
retval = 0; /* Too easy - nothing to do */
@@ -1356,8 +1369,9 @@ static void guarantee_online_cpus_mems_i
/*
* The cpus_allowed and mems_allowed nodemasks in the top_cpuset track
- * cpu_online_map and node_online_map. Force the top cpuset to track
- * whats online after any CPU or memory node hotplug or unplug event.
+ * cpu_online_map and node_states[N_MEMORY]. Force the top cpuset to
+ * track what's online after any CPU or memory node hotplug or unplug
+ * event.
*
* To ensure that we don't remove a CPU or node from the top cpuset
* that is currently in use by a child cpuset (which would violate
@@ -1377,7 +1391,7 @@ static void common_cpu_mem_hotplug_unplu
guarantee_online_cpus_mems_in_subtree(&top_cpuset);
top_cpuset.cpus_allowed = cpu_online_map;
- top_cpuset.mems_allowed = node_online_map;
+ top_cpuset.mems_allowed = node_states[N_MEMORY];
mutex_unlock(&callback_mutex);
container_unlock();
@@ -1405,8 +1419,9 @@ static int cpuset_handle_cpuhp(struct no
#ifdef CONFIG_MEMORY_HOTPLUG
/*
- * Keep top_cpuset.mems_allowed tracking node_online_map.
- * Call this routine anytime after you change node_online_map.
+ * Keep top_cpuset.mems_allowed tracking node_states[N_MEMORY].
+ * Call this routine anytime after you change
+ * node_states[N_MEMORY].
* See also the previous routine cpuset_handle_cpuhp().
*/
@@ -1425,7 +1440,7 @@ void cpuset_track_online_nodes(void)
void __init cpuset_init_smp(void)
{
top_cpuset.cpus_allowed = cpu_online_map;
- top_cpuset.mems_allowed = node_online_map;
+ top_cpuset.mems_allowed = node_states[N_MEMORY];
hotcpu_notifier(cpuset_handle_cpuhp, 0);
}
@@ -1465,7 +1480,7 @@ void cpuset_init_current_mems_allowed(vo
*
* Description: Returns the nodemask_t mems_allowed of the cpuset
* attached to the specified @tsk. Guaranteed to return some non-empty
- * subset of node_online_map, even if this means going outside the
+ * subset of node_states[N_MEMORY], even if this means going outside the
* tasks cpuset.
**/
Index: Linux/include/linux/cpuset.h
===================================================================
--- Linux.orig/include/linux/cpuset.h 2007-07-26 12:40:16.000000000 -0400
+++ Linux/include/linux/cpuset.h 2007-07-26 12:55:30.000000000 -0400
@@ -92,7 +92,7 @@ static inline nodemask_t cpuset_mems_all
return node_possible_map;
}
-#define cpuset_current_mems_allowed (node_online_map)
+#define cpuset_current_mems_allowed (node_states[N_MEMORY])
static inline void cpuset_init_current_mems_allowed(void) {}
static inline void cpuset_update_task_memory_state(void) {}
#define cpuset_nodes_subset_current_mems_allowed(nodes) (1)
Index: Linux/Documentation/cpusets.txt
===================================================================
--- Linux.orig/Documentation/cpusets.txt 2007-07-25 09:29:48.000000000 -0400
+++ Linux/Documentation/cpusets.txt 2007-07-26 13:02:00.000000000 -0400
@@ -8,6 +8,7 @@ Portions Copyright (c) 2004-2006 Silicon
Modified by Paul Jackson <pj@sgi.com>
Modified by Christoph Lameter <clameter@sgi.com>
Modified by Paul Menage <menage@google.com>
+Modified by Lee Schermerhorn <lee.schermerhorn@hp.com>
CONTENTS:
=========
@@ -35,7 +36,8 @@ CONTENTS:
----------------------
Cpusets provide a mechanism for assigning a set of CPUs and Memory
-Nodes to a set of tasks.
+Nodes to a set of tasks. In this document "Memory Node" refers to
+an on-line node that contains memory.
Cpusets constrain the CPU and Memory placement of tasks to only
the resources within a tasks current cpuset. They form a nested
@@ -207,8 +209,8 @@ and name space for cpusets, with a minim
The cpus and mems files in the root (top_cpuset) cpuset are
read-only. The cpus file automatically tracks the value of
cpu_online_map using a CPU hotplug notifier, and the mems file
-automatically tracks the value of node_online_map using the
-cpuset_track_online_nodes() hook.
+automatically tracks the value of node_states[N_MEMORY]--i.e.,
+nodes with memory--using the cpuset_track_online_nodes() hook.
1.4 What are exclusive cpusets ?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-07-27 19:44 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-07-27 19:43 [PATCH 00/14] NUMA: Memoryless node support V4 Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Lee Schermerhorn
2007-07-30 21:38 ` [PATCH/RFC] 2.6.23-rc1-mm1: MPOL_PREFERRED fixups for preferred_node < 0 Lee Schermerhorn
2007-07-30 22:00 ` Lee Schermerhorn
2007-07-31 15:32 ` Mel Gorman
2007-07-31 15:58 ` Lee Schermerhorn
2007-07-31 21:05 ` [PATCH/RFC] 2.6.23-rc1-mm1: MPOL_PREFERRED fixups for preferred_node < 0 - v2 Lee Schermerhorn
2007-08-01 2:22 ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Andrew Morton
2007-08-01 2:52 ` Christoph Lameter
2007-08-01 3:05 ` Andrew Morton
2007-08-01 3:14 ` Christoph Lameter
2007-08-01 3:32 ` Andrew Morton
2007-08-01 3:37 ` Christoph Lameter
[not found] ` <Pine.LNX.4.64.0707312151400.2894@schroedinger.engr.sgi.com>
2007-08-01 5:07 ` Andrew Morton
2007-08-01 5:11 ` Andrew Morton
2007-08-01 5:22 ` Christoph Lameter
2007-08-01 10:24 ` Mel Gorman
2007-08-02 16:23 ` Mel Gorman
2007-08-02 20:00 ` Christoph Lameter
2007-08-01 5:36 ` Paul Mundt
2007-08-01 9:19 ` Andi Kleen
2007-08-01 14:03 ` Lee Schermerhorn
2007-08-01 17:41 ` Christoph Lameter
2007-08-01 17:54 ` Lee Schermerhorn
2007-08-02 20:05 ` [PATCH/RFC/WIP] cpuset-independent interleave policy Lee Schermerhorn
2007-08-02 20:34 ` Christoph Lameter
2007-08-02 21:04 ` Lee Schermerhorn
2007-08-03 0:31 ` Christoph Lameter
2007-08-02 20:19 ` Audit of "all uses of node_online()" Lee Schermerhorn
2007-08-02 20:26 ` Christoph Lameter
2007-08-08 22:19 ` Lee Schermerhorn
2007-08-08 23:40 ` Christoph Lameter
2007-08-16 14:17 ` [PATCH/RFC] memoryless nodes - fixup uses of node_online_map in generic code Lee Schermerhorn
2007-08-16 18:33 ` Christoph Lameter
2007-08-16 19:15 ` Lee Schermerhorn
2007-08-16 21:10 ` Lee Schermerhorn
2007-08-16 21:13 ` Christoph Lameter
2007-08-24 16:09 ` [PATCH] 2.6.23-rc3-mm1 - Move setup of N_CPU node state mask Lee Schermerhorn
2007-09-06 13:56 ` Mel Gorman
2007-08-02 20:33 ` Audit of "all uses of node_online()" Andrew Morton
2007-08-02 20:45 ` Lee Schermerhorn
2007-08-01 15:58 ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Nishanth Aravamudan
2007-08-01 16:09 ` Nishanth Aravamudan
2007-08-01 17:47 ` Christoph Lameter
2007-08-01 15:25 ` Nishanth Aravamudan
2007-07-27 19:43 ` [PATCH 02/14] Memoryless nodes: introduce mask of nodes with memory Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 03/14] Memoryless Nodes: Fix interleave behavior Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 04/14] OOM: use the N_MEMORY map instead of constructing one on the fly Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 05/14] Memoryless Nodes: No need for kswapd Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 06/14] Memoryless Node: Slab support Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 07/14] Memoryless nodes: SLUB support Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 08/14] Uncached allocator: Handle memoryless nodes Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 09/14] Memoryless node: Allow profiling data to fall back to other nodes Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 10/14] Memoryless nodes: Update memory policy and page migration Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 11/14] Add N_CPU node state Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 12/14] Memoryless nodes: Fix GFP_THISNODE behavior Lee Schermerhorn
2007-07-27 19:44 ` Lee Schermerhorn [this message]
2007-07-27 19:44 ` [PATCH 14/14] Memoryless nodes: drop one memoryless node boot warning Lee Schermerhorn
2007-07-27 20:59 ` [PATCH 00/14] NUMA: Memoryless node support V4 Nishanth Aravamudan
2007-07-30 13:48 ` Lee Schermerhorn
2007-07-29 12:35 ` Paul Jackson
2007-07-30 16:07 ` Lee Schermerhorn
2007-07-30 18:56 ` Paul Jackson
2007-07-30 21:19 ` Nishanth Aravamudan
2007-07-30 22:06 ` Christoph Lameter
2007-07-30 22:35 ` Andi Kleen
2007-07-30 22:36 ` Christoph Lameter
2007-07-31 23:18 ` Nishanth Aravamudan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070727194440.18614.95660.sendpatchset@localhost \
--to=lee.schermerhorn@hp.com \
--cc=ak@suse.de \
--cc=akpm@linux-foundation.org \
--cc=clameter@sgi.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=kxr@sgi.com \
--cc=linux-mm@kvack.org \
--cc=mel@skynet.ie \
--cc=nacc@us.ibm.com \
--cc=pj@sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox