linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Nish Aravamudan <nish.aravamudan@gmail.com>
Cc: Christoph Lameter <clameter@sgi.com>,
	Anton Blanchard <anton@samba.org>,
	linux-mm@kvack.org, ak@suse.de, mel@csn.ul.ie, apw@shadowen.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Eric Whitney <eric.whitney@hp.com>,
	andyw@uk.ibm.com
Subject: Re: [PATCH/RFC] Fix hugetlb pool allocation with empty nodes - V4
Date: Mon, 21 May 2007 10:57:08 -0400	[thread overview]
Message-ID: <1179759429.5113.16.camel@localhost> (raw)
In-Reply-To: <29495f1d0705171730h552f7d80hc3f991f8dce9d4c2@mail.gmail.com>

On Thu, 2007-05-17 at 17:30 -0700, Nish Aravamudan wrote:
> On 5/16/07, Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:
> > On Wed, 2007-05-16 at 12:59 -0700, Nish Aravamudan wrote:
> >
> > <snip>
> > >
> > > This completely breaks hugepage allocation on 4-node x86_64 box I have
> > > here. Each node has <4GB of memory, so all memory is ZONE_DMA and
> > > ZONE_DMA32. gfp_zone(GFP_HIGHUSER) is ZONE_NORMAL, though. So all
> > > nodes are not populated by the default initialization to an empty
> > > nodemask.
> > >
> > > Thanks to Andy Whitcroft for helping me debug this.
> > >
> > > I'm not sure how to fix this -- but I ran into while trying to base my
> > > sysfs hugepage allocation patches on top of yours.
> >
> > OK.  Try this.  Tested OK on 4 node [+ 1 pseudo node] ia64 and 2 node
> > x86_64.  The x86_64 had 2G per node--all DMA32.
> >
> > Notes:
> >
> > 1) applies on 2.6.22-rc1-mm1 atop my earlier patch to add the call to
> > check_highest_zone() to build_zonelists_in_zone_order().  I think it'll
> > apply [with offsets] w/o that patch.
> 
> Could you give both patches (or just this one) against 2.6.22-rc1 or
> current -linus? -mm1 has build issues on ppc64 and i386 (as reported
> by Andy and Mel in other threads).

Nish:

Here's the hugepage fix against 2.6.22-rc2 for your testing.  I think
the patch still needs to cook in -mm if you think it's OK.

Lee

[PATCH 2.6.22-rc2] Fix hugetlb pool allocation with empty nodes V4a

Temporary patch against 2.6.22-rc2 for Nish's testing.  

description and signoffs intentionally omitted


 include/linux/nodemask.h |   10 ++++++++++
 mm/hugetlb.c             |   21 +++++++++++++++------
 mm/page_alloc.c          |   29 +++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 6 deletions(-)

Index: Linux/mm/hugetlb.c
===================================================================
--- Linux.orig/mm/hugetlb.c	2007-05-21 09:22:53.000000000 -0400
+++ Linux/mm/hugetlb.c	2007-05-21 10:00:45.000000000 -0400
@@ -101,13 +101,22 @@ static void free_huge_page(struct page *
 
 static int alloc_fresh_huge_page(void)
 {
-	static int nid = 0;
+	static int nid = -1;
 	struct page *page;
-	page = alloc_pages_node(nid, GFP_HIGHUSER|__GFP_COMP|__GFP_NOWARN,
-					HUGETLB_PAGE_ORDER);
-	nid = next_node(nid, node_online_map);
-	if (nid == MAX_NUMNODES)
-		nid = first_node(node_online_map);
+	int start_nid;
+
+	if (nid < 0)
+		nid = first_node(node_populated_map);
+	start_nid = nid;
+
+	do {
+		page = alloc_pages_node(nid,
+				GFP_HIGHUSER|__GFP_COMP|GFP_THISNODE,
+  				HUGETLB_PAGE_ORDER);
+		nid = next_node(nid, node_populated_map);
+		if (nid >= nr_node_ids)
+			nid = first_node(node_populated_map);
+	} while (!page && nid != start_nid);
 	if (page) {
 		set_compound_page_dtor(page, free_huge_page);
 		spin_lock(&hugetlb_lock);
Index: Linux/include/linux/nodemask.h
===================================================================
--- Linux.orig/include/linux/nodemask.h	2007-04-25 23:08:32.000000000 -0400
+++ Linux/include/linux/nodemask.h	2007-05-21 09:59:16.000000000 -0400
@@ -64,12 +64,16 @@
  *
  * int node_online(node)		Is some node online?
  * int node_possible(node)		Is some node possible?
+ * int node_populated(node)		Is some node populated [at policy_zone]
  *
  * int any_online_node(mask)		First online node in mask
  *
  * node_set_online(node)		set bit 'node' in node_online_map
  * node_set_offline(node)		clear bit 'node' in node_online_map
  *
+ * node_set_populated(node)		set bit 'node' in node_populated_map
+ * node_not_populated(node)		clear bit 'node' in node_populated_map
+ *
  * for_each_node(node)			for-loop node over node_possible_map
  * for_each_online_node(node)		for-loop node over node_online_map
  *
@@ -344,12 +348,14 @@ static inline void __nodes_remap(nodemas
 
 extern nodemask_t node_online_map;
 extern nodemask_t node_possible_map;
+extern nodemask_t node_populated_map;
 
 #if MAX_NUMNODES > 1
 #define num_online_nodes()	nodes_weight(node_online_map)
 #define num_possible_nodes()	nodes_weight(node_possible_map)
 #define node_online(node)	node_isset((node), node_online_map)
 #define node_possible(node)	node_isset((node), node_possible_map)
+#define node_populated(node)	node_isset((node), node_populated_map)
 #define first_online_node	first_node(node_online_map)
 #define next_online_node(nid)	next_node((nid), node_online_map)
 extern int nr_node_ids;
@@ -358,6 +364,7 @@ extern int nr_node_ids;
 #define num_possible_nodes()	1
 #define node_online(node)	((node) == 0)
 #define node_possible(node)	((node) == 0)
+#define node_populated(node)	((node) == 0)
 #define first_online_node	0
 #define next_online_node(nid)	(MAX_NUMNODES)
 #define nr_node_ids		1
@@ -375,6 +382,9 @@ extern int nr_node_ids;
 #define node_set_online(node)	   set_bit((node), node_online_map.bits)
 #define node_set_offline(node)	   clear_bit((node), node_online_map.bits)
 
+#define node_set_populated(node)   set_bit((node), node_populated_map.bits)
+#define node_not_populated(node)   clear_bit((node), node_populated_map.bits)
+
 #define for_each_node(node)	   for_each_node_mask((node), node_possible_map)
 #define for_each_online_node(node) for_each_node_mask((node), node_online_map)
 
Index: Linux/mm/page_alloc.c
===================================================================
--- Linux.orig/mm/page_alloc.c	2007-05-21 09:22:53.000000000 -0400
+++ Linux/mm/page_alloc.c	2007-05-21 10:03:17.000000000 -0400
@@ -54,6 +54,9 @@ nodemask_t node_online_map __read_mostly
 EXPORT_SYMBOL(node_online_map);
 nodemask_t node_possible_map __read_mostly = NODE_MASK_ALL;
 EXPORT_SYMBOL(node_possible_map);
+nodemask_t node_populated_map __read_mostly = NODE_MASK_NONE;
+EXPORT_SYMBOL(node_populated_map);
+
 unsigned long totalram_pages __read_mostly;
 unsigned long totalreserve_pages __read_mostly;
 long nr_swap_pages;
@@ -1719,6 +1722,22 @@ static int __meminit find_next_best_node
 	return best_node;
 }
 
+/*
+ * setup_populate_map() - record nodes whose "policy_zone" is "on-node".
+ */
+static void setup_populated_map(int nid)
+{
+	pg_data_t *pgdat = NODE_DATA(nid);
+	struct zonelist *zl = pgdat->node_zonelists + policy_zone;
+	struct zone *z = zl->zones[0];
+
+	VM_BUG_ON(!z);
+	if (z->zone_pgdat == pgdat)
+		node_set_populated(nid);
+	else
+		node_not_populated(nid);
+}
+
 static void __meminit build_zonelists(pg_data_t *pgdat)
 {
 	int j, node, local_node;
@@ -1788,6 +1807,15 @@ static void __meminit build_zonelist_cac
 
 #else	/* CONFIG_NUMA */
 
+/*
+ * setup_populated_map - non-NUMA case
+ * Only node 0 should be on-line, and it MUST be populated!
+ */
+static void setup_populated_map(int nid)
+{
+	node_set_populated(nid);
+}
+
 static void __meminit build_zonelists(pg_data_t *pgdat)
 {
 	int node, local_node;
@@ -1842,6 +1870,7 @@ static int __meminit __build_all_zonelis
 	for_each_online_node(nid) {
 		build_zonelists(NODE_DATA(nid));
 		build_zonelist_cache(NODE_DATA(nid));
+		setup_populated_map(nid);
 	}
 	return 0;
 }


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-05-21 14:57 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-03  2:21 [PATCH] Fix hugetlb pool allocation with empty nodes Anton Blanchard
2007-05-03  3:02 ` Christoph Lameter
2007-05-03  6:07   ` Anton Blanchard
2007-05-03  6:37     ` Christoph Lameter
2007-05-03  8:59 ` Andi Kleen
2007-05-03 13:22   ` Anton Blanchard
2007-05-04 20:29 ` [PATCH] Fix hugetlb pool allocation with empty nodes - V2 Lee Schermerhorn
2007-05-04 21:27   ` Christoph Lameter
2007-05-04 22:39     ` Nish Aravamudan
2007-05-07 13:40     ` Lee Schermerhorn
2007-05-09 16:37     ` [PATCH] Fix hugetlb pool allocation with empty nodes - V2 -> V3 Lee Schermerhorn
2007-05-09 16:57       ` Christoph Lameter
2007-05-09 19:17         ` Lee Schermerhorn
2007-05-16 17:27           ` Nish Aravamudan
2007-05-16 20:01             ` Lee Schermerhorn
2007-05-09 19:59       ` Nish Aravamudan
2007-05-09 20:37         ` Lee Schermerhorn
2007-05-09 20:54           ` Christoph Lameter
2007-05-09 22:34           ` Nish Aravamudan
2007-05-15 16:30             ` Lee Schermerhorn
2007-05-16 23:47               ` Nish Aravamudan
2007-05-16 19:59       ` Nish Aravamudan
2007-05-16 20:32         ` Lee Schermerhorn
2007-05-16 22:17         ` [PATCH/RFC] Fix hugetlb pool allocation with empty nodes - V4 Lee Schermerhorn
2007-05-18  0:30           ` Nish Aravamudan
2007-05-21 14:57             ` Lee Schermerhorn [this message]
2007-05-21 17:51               ` Nish Aravamudan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1179759429.5113.16.camel@localhost \
    --to=lee.schermerhorn@hp.com \
    --cc=ak@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=andyw@uk.ibm.com \
    --cc=anton@samba.org \
    --cc=apw@shadowen.org \
    --cc=clameter@sgi.com \
    --cc=eric.whitney@hp.com \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=nish.aravamudan@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox