linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for 2.6.16] Handle all and empty zones when setting up custom zonelists for mbind
@ 2006-02-17  0:39 Andi Kleen
  2006-02-17  0:48 ` Christoph Lameter
  0 siblings, 1 reply; 2+ messages in thread
From: Andi Kleen @ 2006-02-17  0:39 UTC (permalink / raw)
  Cc: Linus Torvalds, akpm, linux-mm, Christoph Lameter

The memory allocator doesn't like empty zones (which have an 
uninitialized freelist), so a x86-64 system with a node fully
in GFP_DMA32 only would crash on mbind.

Fix that up by putting all possible zones as fallback into the zonelist
and skipping the empty ones.

In fact the code always enough allocated space for all zones,
but only used it for the highest. This change just uses all the
memory that was allocated before.

This should work fine for now, but whoever implements node hot removal
needs to fix this somewhere else too (or make sure zone datastructures 
by itself never go away, only their memory)

Signed-off-by: Andi Kleen <ak@suse.de>

Index: linux/mm/mempolicy.c
===================================================================
--- linux.orig/mm/mempolicy.c
+++ linux/mm/mempolicy.c
@@ -132,19 +132,29 @@ static int mpol_check_policy(int mode, n
 	}
 	return nodes_subset(*nodes, node_online_map) ? 0 : -EINVAL;
 }
+
 /* Generate a custom zonelist for the BIND policy. */
 static struct zonelist *bind_zonelist(nodemask_t *nodes)
 {
 	struct zonelist *zl;
-	int num, max, nd;
+	int num, max, nd, k;
 
 	max = 1 + MAX_NR_ZONES * nodes_weight(*nodes);
-	zl = kmalloc(sizeof(void *) * max, GFP_KERNEL);
+	zl = kmalloc(sizeof(struct zone *) * max, GFP_KERNEL);
 	if (!zl)
 		return NULL;
 	num = 0;
-	for_each_node_mask(nd, *nodes)
-		zl->zones[num++] = &NODE_DATA(nd)->node_zones[policy_zone];
+	/* First put in the highest zones from all nodes, then all the next 
+	   lower zones etc. Avoid empty zones because the memory allocator
+	   doesn't like them. If you implement node hot removal you
+	   have to fix that. */
+	for (k = policy_zone; k >= 0; k--) { 
+		for_each_node_mask(nd, *nodes) { 
+			struct zone *z = &NODE_DATA(nd)->node_zones[k];
+			if (z->present_pages > 0) 
+				zl->zones[num++] = z;
+		}
+	}
 	zl->zones[num] = NULL;
 	return zl;
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH for 2.6.16] Handle all and empty zones when setting up custom zonelists for mbind
  2006-02-17  0:39 [PATCH for 2.6.16] Handle all and empty zones when setting up custom zonelists for mbind Andi Kleen
@ 2006-02-17  0:48 ` Christoph Lameter
  0 siblings, 0 replies; 2+ messages in thread
From: Christoph Lameter @ 2006-02-17  0:48 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Linus Torvalds, akpm, linux-mm

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2006-02-17  0:48 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-02-17  0:39 [PATCH for 2.6.16] Handle all and empty zones when setting up custom zonelists for mbind Andi Kleen
2006-02-17  0:48 ` Christoph Lameter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox