linux-mm.kvack.org archive mirror
* [PATCH] dequeue a huge page near to this node
@ 2005-11-10 23:27 Christoph Lameter
  2005-11-10 23:34 ` Chen, Kenneth W
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Christoph Lameter @ 2005-11-10 23:27 UTC (permalink / raw)
  To: Adam Litke; +Cc: linux-mm, linux-kernel, kenneth.w.chen, akpm

The following patch changes the dequeueing to select a huge page near
the executing node instead of falling back to a search that always
starts at node 0. This places the huge pages near the executing
processor, which improves performance.

The existing implementation can place the huge pages far away from
the executing processor, causing significant degradation of performance.
Starting the search at node zero also means that the lower-numbered nodes
quickly run out of free huge pages. Selecting a huge page near the process
distributes the huge pages better.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: linux-2.6.14-mm1/mm/hugetlb.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/hugetlb.c	2005-11-09 10:47:37.000000000 -0800
+++ linux-2.6.14-mm1/mm/hugetlb.c	2005-11-10 15:02:05.000000000 -0800
@@ -36,14 +36,16 @@ static struct page *dequeue_huge_page(vo
 {
 	int nid = numa_node_id();
 	struct page *page = NULL;
+	struct zonelist *zonelist = NODE_DATA(nid)->node_zonelists;
+	struct zone **z;
 
-	if (list_empty(&hugepage_freelists[nid])) {
-		for (nid = 0; nid < MAX_NUMNODES; ++nid)
-			if (!list_empty(&hugepage_freelists[nid]))
-				break;
+	for (z = zonelist->zones; *z; z++) {
+		nid = (*z)->zone_pgdat->node_id;
+		if (!list_empty(&hugepage_freelists[nid]))
+			break;
 	}
-	if (nid >= 0 && nid < MAX_NUMNODES &&
-	    !list_empty(&hugepage_freelists[nid])) {
+
+	if (z) {
 		page = list_entry(hugepage_freelists[nid].next,
 				  struct page, lru);
 		list_del(&page->lru);
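
The difference in search order can be seen outside the kernel with a
small user-space sketch. Everything in it is an assumption made for
illustration: struct pool stands in for hugepage_freelists[], the
-1-terminated fallback array stands in for the node's zonelist, and
MAX_NODES, pick_node_from_zero() and pick_node_nearest() are
hypothetical names, not kernel interfaces.

```c
#include <assert.h>

#define MAX_NODES 4	/* hypothetical node count for this sketch */

/* Hypothetical stand-in for hugepage_freelists[]: free huge pages per node. */
struct pool {
	int free_pages[MAX_NODES];
};

/*
 * Old search order: try the local node, otherwise scan upward from
 * node 0. Returns the chosen node id, or -1 if no node has a free page.
 */
int pick_node_from_zero(const struct pool *p, int local)
{
	int nid;

	assert(local >= 0 && local < MAX_NODES);
	if (p->free_pages[local] > 0)
		return local;
	for (nid = 0; nid < MAX_NODES; nid++)
		if (p->free_pages[nid] > 0)
			return nid;
	return -1;
}

/*
 * New search order: walk a fallback list that is sorted by distance
 * from the local node and terminated by -1 (playing the role of the
 * NULL-terminated zonelist). Returns the first node with a free page,
 * or -1 if the list is exhausted.
 */
int pick_node_nearest(const struct pool *p, const int *fallback)
{
	for (; *fallback >= 0; fallback++)
		if (p->free_pages[*fallback] > 0)
			return *fallback;
	return -1;
}
```

With free pages only on nodes 0 and 3, and a fallback order of
2, 3, 1, 0 for a task running on node 2, the old order picks node 0
while the distance-ordered walk picks node 3.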

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* RE: [PATCH] dequeue a huge page near to this node
  2005-11-10 23:27 [PATCH] dequeue a huge page near to this node Christoph Lameter
@ 2005-11-10 23:34 ` Chen, Kenneth W
  2005-11-11  0:44   ` Christoph Lameter
  2005-11-11  0:51 ` William Lee Irwin III
  2005-11-11 14:19 ` Adam Litke
  2 siblings, 1 reply; 7+ messages in thread
From: Chen, Kenneth W @ 2005-11-10 23:34 UTC (permalink / raw)
  To: 'Christoph Lameter', Adam Litke; +Cc: linux-mm, linux-kernel, akpm

Christoph Lameter wrote on Thursday, November 10, 2005 3:27 PM
> The following patch changes the dequeueing to select a huge page near
> the node executing instead of always beginning to check for free 
> nodes from node 0. This will result in a placement of the huge pages near
> the executing processor improving performance.
> 
> The existing implementation can place the huge pages far away from 
> the executing processor causing significant degradation of performance.
> The search starting from zero also means that the lower zones quickly 
> run out of memory. Selecting a huge page near the process distributed the 
> huge pages better.


Looks great!

- Ken


* RE: [PATCH] dequeue a huge page near to this node
  2005-11-10 23:34 ` Chen, Kenneth W
@ 2005-11-11  0:44   ` Christoph Lameter
  2005-11-11  0:51     ` William Lee Irwin III
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Lameter @ 2005-11-11  0:44 UTC (permalink / raw)
  To: Chen, Kenneth W; +Cc: Adam Litke, linux-mm, linux-kernel, akpm

On Thu, 10 Nov 2005, Chen, Kenneth W wrote:

> Looks great!

Well in that case, we may do even more:

Make huge pages obey cpusets.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: linux-2.6.14-mm1/mm/hugetlb.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/hugetlb.c	2005-11-10 15:02:05.000000000 -0800
+++ linux-2.6.14-mm1/mm/hugetlb.c	2005-11-10 16:29:16.000000000 -0800
@@ -11,6 +11,7 @@
 #include <linux/highmem.h>
 #include <linux/nodemask.h>
 #include <linux/pagemap.h>
+#include <linux/cpuset.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
 
@@ -41,7 +42,8 @@ static struct page *dequeue_huge_page(vo
 
 	for (z = zonelist->zones; *z; z++) {
 		nid = (*z)->zone_pgdat->node_id;
-		if (!list_empty(&hugepage_freelists[nid]))
+		if (cpuset_zone_allowed(*z, GFP_HIGHUSER) &&
+		    !list_empty(&hugepage_freelists[nid]))
 			break;
 	}
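
The effect of the extra condition can be sketched in user space as
follows. This is only an illustration under stated assumptions: a
plain bitmask stands in for the task's cpuset, node_allowed() is a
hypothetical substitute for cpuset_zone_allowed(), and the
-1-terminated fallback array stands in for the zonelist.

```c
#include <assert.h>

#define MAX_NODES 4	/* hypothetical node count for this sketch */

/*
 * Hypothetical substitute for cpuset_zone_allowed(): bit n of
 * allowed_mask set means the task may allocate from node n.
 */
int node_allowed(unsigned int allowed_mask, int nid)
{
	return (allowed_mask >> nid) & 1u;
}

/*
 * Walk the distance-ordered, -1-terminated fallback list and return
 * the first node that is both allowed and has a free huge page.
 * Returns -1 if no such node exists.
 */
int pick_allowed_node(const int *free_pages, const int *fallback,
		      unsigned int allowed_mask)
{
	for (; *fallback >= 0; fallback++) {
		int nid = *fallback;

		assert(nid < MAX_NODES);
		if (node_allowed(allowed_mask, nid) && free_pages[nid] > 0)
			return nid;
	}
	return -1;
}
```

A node that has free huge pages but lies outside the allowed mask is
skipped, so the search may return a more distant node or nothing at
all.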
 


* Re: [PATCH] dequeue a huge page near to this node
  2005-11-10 23:27 [PATCH] dequeue a huge page near to this node Christoph Lameter
  2005-11-10 23:34 ` Chen, Kenneth W
@ 2005-11-11  0:51 ` William Lee Irwin III
  2005-11-11 14:19 ` Adam Litke
  2 siblings, 0 replies; 7+ messages in thread
From: William Lee Irwin III @ 2005-11-11  0:51 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Adam Litke, linux-mm, linux-kernel, kenneth.w.chen, akpm

On Thu, Nov 10, 2005 at 03:27:12PM -0800, Christoph Lameter wrote:
> The following patch changes the dequeueing to select a huge page near
> the node executing instead of always beginning to check for free 
> nodes from node 0. This will result in a placement of the huge pages near
> the executing processor improving performance.
> The existing implementation can place the huge pages far away from 
> the executing processor causing significant degradation of performance.
> The search starting from zero also means that the lower zones quickly 
> run out of memory. Selecting a huge page near the process distributed the 
> huge pages better.
> Signed-off-by: Christoph Lameter <clameter@sgi.com>

Long intended to have been corrected. Thanks.

Acked-by: William Irwin <wli@holomorphy.com>


-- wli


* Re: [PATCH] dequeue a huge page near to this node
  2005-11-11  0:44   ` Christoph Lameter
@ 2005-11-11  0:51     ` William Lee Irwin III
  0 siblings, 0 replies; 7+ messages in thread
From: William Lee Irwin III @ 2005-11-11  0:51 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Chen, Kenneth W, Adam Litke, linux-mm, linux-kernel, akpm

On Thu, Nov 10, 2005 at 04:44:40PM -0800, Christoph Lameter wrote:
> Well in that case, we may do even more:
> Make huge pages obey cpusets.
> Signed-off-by: Christoph Lameter <clameter@sgi.com>

Simple enough.

Acked-by: William Irwin <wli@holomorphy.com>


-- wli


* Re: [PATCH] dequeue a huge page near to this node
  2005-11-10 23:27 [PATCH] dequeue a huge page near to this node Christoph Lameter
  2005-11-10 23:34 ` Chen, Kenneth W
  2005-11-11  0:51 ` William Lee Irwin III
@ 2005-11-11 14:19 ` Adam Litke
  2005-11-11 17:33   ` Christoph Lameter
  2 siblings, 1 reply; 7+ messages in thread
From: Adam Litke @ 2005-11-11 14:19 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-mm, linux-kernel, kenneth.w.chen, akpm

On Thu, 2005-11-10 at 15:27 -0800, Christoph Lameter wrote:
> The following patch changes the dequeueing to select a huge page near
> the node executing instead of always beginning to check for free 
> nodes from node 0. This will result in a placement of the huge pages near
> the executing processor improving performance.
> 
> The existing implementation can place the huge pages far away from 
> the executing processor causing significant degradation of performance.
> The search starting from zero also means that the lower zones quickly 
> run out of memory. Selecting a huge page near the process distributed the 
> huge pages better.
> 
> Signed-off-by: Christoph Lameter <clameter@sgi.com>

I'll add my voice to the chorus of aye's.

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center


* Re: [PATCH] dequeue a huge page near to this node
  2005-11-11 14:19 ` Adam Litke
@ 2005-11-11 17:33   ` Christoph Lameter
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Lameter @ 2005-11-11 17:33 UTC (permalink / raw)
  To: Adam Litke; +Cc: linux-mm, linux-kernel, kenneth.w.chen, akpm, Paul T. Darga

On Fri, 11 Nov 2005, Adam Litke wrote:

> On Thu, 2005-11-10 at 15:27 -0800, Christoph Lameter wrote:
> > The following patch changes the dequeueing to select a huge page near
> > the node executing instead of always beginning to check for free 
> > nodes from node 0. This will result in a placement of the huge pages near
> > the executing processor improving performance.
> > 
> > The existing implementation can place the huge pages far away from 
> > the executing processor causing significant degradation of performance.
> > The search starting from zero also means that the lower zones quickly 
> > run out of memory. Selecting a huge page near the process distributed the 
> > huge pages better.
> > 
> > Signed-off-by: Christoph Lameter <clameter@sgi.com>
> 
> I'll add my voice to the chorus of aye's.

There is a slight problem with the patch. We need to check *z instead of 
z. Here is a fixed patch. Thanks to Paul T. Darga for pointing that out.

Index: linux-2.6.14-mm1/mm/hugetlb.c
===================================================================
--- linux-2.6.14-mm1.orig/mm/hugetlb.c	2005-11-09 10:47:37.000000000 -0800
+++ linux-2.6.14-mm1/mm/hugetlb.c	2005-11-11 09:31:02.000000000 -0800
@@ -36,14 +36,16 @@ static struct page *dequeue_huge_page(vo
 {
 	int nid = numa_node_id();
 	struct page *page = NULL;
+	struct zonelist *zonelist = NODE_DATA(nid)->node_zonelists;
+	struct zone **z;
 
-	if (list_empty(&hugepage_freelists[nid])) {
-		for (nid = 0; nid < MAX_NUMNODES; ++nid)
-			if (!list_empty(&hugepage_freelists[nid]))
-				break;
+	for (z = zonelist->zones; *z; z++) {
+		nid = (*z)->zone_pgdat->node_id;
+		if (!list_empty(&hugepage_freelists[nid]))
+			break;
 	}
-	if (nid >= 0 && nid < MAX_NUMNODES &&
-	    !list_empty(&hugepage_freelists[nid])) {
+
+	if (*z) {
 		page = list_entry(hugepage_freelists[nid].next,
 				  struct page, lru);
 		list_del(&page->lru);
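
Why the test has to be on *z rather than z can be shown with a small
user-space sketch. The struct zone below is a hypothetical two-field
stand-in, not the kernel's struct zone, and walk() is an illustrative
helper that mirrors the shape of the loop in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in; the kernel's struct zone is far larger. */
struct zone {
	int nid;
	int free_pages;
};

/*
 * Walk a NULL-terminated array of zone pointers and stop at the first
 * zone with a free page. Returns the cursor position where the loop
 * stopped: it points at the matching entry, or at the terminating
 * NULL entry if nothing matched.
 */
struct zone **walk(struct zone **zones)
{
	struct zone **z;

	for (z = zones; *z; z++)
		if ((*z)->free_pages > 0)
			break;
	return z;
}
```

The cursor z always points somewhere inside the array, so it is never
NULL after the loop. Only the entry it points at, *z, tells whether
the loop ended on a match or ran into the terminator.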

