From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Christoph Lameter <clameter@sgi.com>, Paul Jackson <pj@sgi.com>,
Nishanth Aravamudan <nacc@us.ibm.com>
Cc: akpm@linux-foundation.org, kxr@sgi.com, linux-mm@kvack.org,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: [PATCH take3] Memoryless nodes: use "node_memory_map" for cpuset mems_allowed validation
Date: Tue, 24 Jul 2007 16:30:19 -0400 [thread overview]
Message-ID: <1185309019.5649.69.camel@localhost> (raw)
In-Reply-To: <Pine.LNX.4.64.0707111204470.17503@schroedinger.engr.sgi.com>
Memoryless Nodes: use "node_memory_map" for cpusets - take 3
Against 2.6.22-rc6-mm1 atop Christoph Lameter's memoryless nodes
series
take 2:
+ replaced node_online_map in cpuset_current_mems_allowed()
with node_states[N_MEMORY]
+ replaced node_online_map in cpuset_init_smp() with
node_states[N_MEMORY]
take 3:
+ fix up comments and top level cpuset tracking of nodes
with memory [instead of on-line nodes].
+ maybe I got them all this time?
cpusets try to ensure that any node added to a cpuset's
mems_allowed is on-line and contains memory. The assumption
was that online nodes contained memory. Thus, it is possible
to add memoryless nodes to a cpuset and then add tasks to this
cpuset. This results in continuous series of oom-kill and
apparent system hang.
Change cpusets to use node_states[N_MEMORY] [a.k.a.
node_memory_map] in place of node_online_map when vetting
memories. Return error if admin attempts to write a non-empty
mems_allowed node mask containing only memoryless-nodes.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
include/linux/cpuset.h | 2 -
kernel/cpuset.c | 51 +++++++++++++++++++++++++++++++------------------
2 files changed, 34 insertions(+), 19 deletions(-)
Index: Linux/kernel/cpuset.c
===================================================================
--- Linux.orig/kernel/cpuset.c 2007-07-24 11:24:56.000000000 -0400
+++ Linux/kernel/cpuset.c 2007-07-24 12:20:40.000000000 -0400
@@ -316,26 +316,26 @@ static void guarantee_online_cpus(const
/*
* Return in *pmask the portion of a cpusets's mems_allowed that
- * are online. If none are online, walk up the cpuset hierarchy
- * until we find one that does have some online mems. If we get
- * all the way to the top and still haven't found any online mems,
- * return node_online_map.
+ * are online, with memory. If none are online with memory, walk
+ * up the cpuset hierarchy until we find one that does have some
+ * online mems. If we get all the way to the top and still haven't
+ * found any online mems, return node_states[N_MEMORY].
*
* One way or another, we guarantee to return some non-empty subset
- * of node_online_map.
+ * of node_states[N_MEMORY].
*
* Call with callback_mutex held.
*/
static void guarantee_online_mems(const struct cpuset *cs, nodemask_t *pmask)
{
- while (cs && !nodes_intersects(cs->mems_allowed, node_online_map))
+ while (cs && !nodes_intersects(cs->mems_allowed, node_states[N_MEMORY]))
cs = cs->parent;
if (cs)
- nodes_and(*pmask, cs->mems_allowed, node_online_map);
+ nodes_and(*pmask, cs->mems_allowed, node_states[N_MEMORY]);
else
- *pmask = node_online_map;
- BUG_ON(!nodes_intersects(*pmask, node_online_map));
+ *pmask = node_states[N_MEMORY];
+ BUG_ON(!nodes_intersects(*pmask, node_states[N_MEMORY]));
}
/**
@@ -606,7 +606,7 @@ static int update_nodemask(struct cpuset
int retval;
struct container_iter it;
- /* top_cpuset.mems_allowed tracks node_online_map; it's read-only */
+ /* top_cpuset.mems_allowed tracks node_states[N_MEMORY]; it's read-only */
if (cs == &top_cpuset)
return -EACCES;
@@ -623,8 +623,21 @@ static int update_nodemask(struct cpuset
retval = nodelist_parse(buf, trialcs.mems_allowed);
if (retval < 0)
goto done;
+ if (!nodes_intersects(trialcs.mems_allowed,
+ node_states[N_MEMORY])) {
+ /*
+ * error if only memoryless nodes specified.
+ */
+ retval = -ENOSPC;
+ goto done;
+ }
}
- nodes_and(trialcs.mems_allowed, trialcs.mems_allowed, node_online_map);
+ /*
+ * Exclude memoryless nodes. We know that trialcs.mems_allowed
+ * contains at least one node with memory.
+ */
+ nodes_and(trialcs.mems_allowed, trialcs.mems_allowed,
+ node_states[N_MEMORY]);
oldmem = cs->mems_allowed;
if (nodes_equal(oldmem, trialcs.mems_allowed)) {
retval = 0; /* Too easy - nothing to do */
@@ -1366,8 +1379,9 @@ static void guarantee_online_cpus_mems_i
/*
* The cpus_allowed and mems_allowed nodemasks in the top_cpuset track
- * cpu_online_map and node_online_map. Force the top cpuset to track
- * whats online after any CPU or memory node hotplug or unplug event.
+ * cpu_online_map and node_states[N_MEMORY]. Force the top cpuset to
+ * track what's online after any CPU or memory node hotplug or unplug
+ * event.
*
* To ensure that we don't remove a CPU or node from the top cpuset
* that is currently in use by a child cpuset (which would violate
@@ -1387,7 +1401,7 @@ static void common_cpu_mem_hotplug_unplu
guarantee_online_cpus_mems_in_subtree(&top_cpuset);
top_cpuset.cpus_allowed = cpu_online_map;
- top_cpuset.mems_allowed = node_online_map;
+ top_cpuset.mems_allowed = node_states[N_MEMORY];
mutex_unlock(&callback_mutex);
container_unlock();
@@ -1412,8 +1426,9 @@ static int cpuset_handle_cpuhp(struct no
#ifdef CONFIG_MEMORY_HOTPLUG
/*
- * Keep top_cpuset.mems_allowed tracking node_online_map.
- * Call this routine anytime after you change node_online_map.
+ * Keep top_cpuset.mems_allowed tracking node_states[N_MEMORY].
+ * Call this routine anytime after you change
+ * node_states[N_MEMORY].
* See also the previous routine cpuset_handle_cpuhp().
*/
@@ -1432,7 +1447,7 @@ void cpuset_track_online_nodes(void)
void __init cpuset_init_smp(void)
{
top_cpuset.cpus_allowed = cpu_online_map;
- top_cpuset.mems_allowed = node_online_map;
+ top_cpuset.mems_allowed = node_states[N_MEMORY];
hotcpu_notifier(cpuset_handle_cpuhp, 0);
}
@@ -1472,7 +1487,7 @@ void cpuset_init_current_mems_allowed(vo
*
* Description: Returns the nodemask_t mems_allowed of the cpuset
* attached to the specified @tsk. Guaranteed to return some non-empty
- * subset of node_online_map, even if this means going outside the
+ * subset of node_states[N_MEMORY], even if this means going outside the
* tasks cpuset.
**/
Index: Linux/include/linux/cpuset.h
===================================================================
--- Linux.orig/include/linux/cpuset.h 2007-07-24 11:24:56.000000000 -0400
+++ Linux/include/linux/cpuset.h 2007-07-24 12:20:56.000000000 -0400
@@ -92,7 +92,7 @@ static inline nodemask_t cpuset_mems_all
return node_possible_map;
}
-#define cpuset_current_mems_allowed (node_online_map)
+#define cpuset_current_mems_allowed (node_states[N_MEMORY))
static inline void cpuset_init_current_mems_allowed(void) {}
static inline void cpuset_update_task_memory_state(void) {}
#define cpuset_nodes_subset_current_mems_allowed(nodes) (1)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-07-24 20:30 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20070711182219.234782227@sgi.com>
[not found] ` <20070711182252.138829364@sgi.com>
2007-07-11 18:46 ` [patch 10/12] Memoryless nodes: Update memory policy and page migration Nishanth Aravamudan
2007-07-11 18:56 ` Christoph Lameter
[not found] ` <20070711182252.376540447@sgi.com>
2007-07-11 19:04 ` [patch 11/12] Add N_CPU node state Christoph Lameter
[not found] ` <20070711182250.005856256@sgi.com>
2007-07-11 19:06 ` [patch 01/12] NUMA: Generic management of nodemasks for various purposes Christoph Lameter
2007-07-11 19:32 ` Lee Schermerhorn
2007-07-20 20:49 ` [PATCH] Memoryless nodes: use "node_memory_map" for cpuset mems_allowed validation Lee Schermerhorn
2007-07-20 22:07 ` Nishanth Aravamudan
2007-07-23 19:09 ` Nishanth Aravamudan
2007-07-23 19:23 ` Paul Jackson
2007-07-23 20:08 ` Nishanth Aravamudan
2007-07-23 20:59 ` Lee Schermerhorn
2007-07-23 21:48 ` Nishanth Aravamudan
2007-07-24 14:11 ` Lee Schermerhorn
2007-07-24 16:16 ` Nishanth Aravamudan
2007-07-24 14:15 ` [PATCH take2] " Lee Schermerhorn
2007-07-24 16:19 ` Nishanth Aravamudan
2007-07-24 19:01 ` Lee Schermerhorn
2007-07-25 15:50 ` Nishanth Aravamudan
2007-07-24 20:30 ` Lee Schermerhorn [this message]
2007-07-25 15:53 ` [PATCH take3] " Nishanth Aravamudan
2007-07-25 22:00 ` Nishanth Aravamudan
2007-07-26 13:04 ` Lee Schermerhorn
2007-07-27 0:40 ` Nishanth Aravamudan
2007-07-27 14:15 ` Lee Schermerhorn
2007-07-24 20:35 ` [PATCH/RFC] Memoryless nodes: Suppress redundant "node with no memory" messages Lee Schermerhorn
2007-07-25 15:56 ` Nishanth Aravamudan
[not found] ` <20070711182251.433134748@sgi.com>
2007-07-12 0:07 ` [patch 07/12] Memoryless nodes: SLUB support Andrew Morton
2007-07-12 1:42 ` Christoph Lameter
2007-07-12 18:33 ` Nishanth Aravamudan
2007-07-12 18:38 ` Christoph Lameter
2007-07-13 15:14 ` [patch 00/12] NUMA: Memoryless node support V3 Nishanth Aravamudan
2007-07-13 16:43 ` Christoph Lameter
2007-07-13 16:52 ` Nishanth Aravamudan
2007-07-13 17:20 ` Lee Schermerhorn
2007-07-13 17:23 ` Christoph Lameter
2007-07-13 19:22 ` Lee Schermerhorn
2007-07-13 20:53 ` Lee Schermerhorn
2007-07-13 21:34 ` Christoph Lameter
2007-07-13 23:18 ` Nishanth Aravamudan
[not found] ` <1185310277.5649.90.camel@localhost>
[not found] ` <Pine.LNX.4.64.0707241402010.4773@schroedinger.engr.sgi.com>
[not found] ` <1185372692.5604.22.camel@localhost>
2007-07-25 15:45 ` Lee Schermerhorn
2007-07-25 19:16 ` 2.6.23-rc1-mm1: boot hang on ia64 with memoryless nodes Lee Schermerhorn
2007-07-25 19:38 ` Christoph Lameter
2007-07-25 20:03 ` Christoph Lameter
2007-07-25 21:18 ` Lee Schermerhorn
2007-07-26 13:53 ` Lee Schermerhorn
2007-07-26 14:00 ` KAMEZAWA Hiroyuki
2007-07-26 18:10 ` Lee Schermerhorn
2007-07-26 14:33 ` Lee Schermerhorn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1185309019.5649.69.camel@localhost \
--to=lee.schermerhorn@hp.com \
--cc=akpm@linux-foundation.org \
--cc=clameter@sgi.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=kxr@sgi.com \
--cc=linux-mm@kvack.org \
--cc=nacc@us.ibm.com \
--cc=pj@sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox