[RFC PATCH 0/3] Reduce searching in the page allocator fast-path

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [RFC PATCH 0/3] Reduce searching in the page allocator fast-path
@ 2009-08-18 11:15 Mel Gorman
  2009-08-18 11:16 ` [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type Mel Gorman
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Mel Gorman @ 2009-08-18 11:15 UTC (permalink / raw)
  To: Linux Memory Management List
  Cc: Christoph Lameter, Nick Piggin, Linux Kernel Mailing List, Mel Gorman

The following three patches are a revisit of the proposal to remove searching
in the page allocator fast-path by maintaining multiple free-lists in the
per-cpu structure. At the time the search was introduced, increasing the
per-cpu structures would waste a lot of memory as per-cpu structures were
statically allocated at compile-time. This is no longer the case.

These patches have been brought up before but the results as to whether
they helped or not were inconclusive and I was worried about the pcpu drain
path. While the patches in various guises have been ACKd, they were never
merged because the performance results were always shaky. I beefed up the
of the testing methodology and the results indicate either no improvements
or small gains with two exceptionally large gains. I'm marginally happier
with the free path than I was previously.

The patches are as follows. They are based on mmotm-2009-08-12.

Patch 1 adds multiple lists to struct per_cpu_pages, one per
	migratetype that can be stored on the PCP lists.

Patch 2 notes that the pcpu drain path check empty lists multiple times. The
	patch  reduces the number of checks by maintaining a count of free
	lists encountered. Lists containing pages will then free multiple
	pages in batch

Patch 3 notes that the per-cpu structure is larger than it needs to be because
	pcpu->high and batch are read-mostly variables shared by the
	zone. The patch moves those fields to struct zone.

The patches were tested with kernbench, aim9, netperf udp/tcp, hackbench and
sysbench.  The netperf tests were not bound to any CPU in particular and
were run such that the results should be 99% confidence that the reported
results are within 1% of the estimated mean. sysbench was run with a
postgres background and read-only tests. Similar to netperf, it was run
multiple times so that it's 99% confidence results are within 1%. The
patches were tested on x86, x86-64 and ppc64 as

x86:	Intel Pentium D 3GHz with 8G RAM (no-brand machine)
	kernbench	- No significant difference, variance well within noise
	aim9		- 3-6% gain on page_test and brk_test
	netperf-udp	- No significant differences
	netperf-tcp	- Small variances, very close to noise
	hackbench	- Small variances, very close to noise
	sysbench	- Small gains, very close to noise

x86-64:	AMD Phenom 9950 1.3GHz with 8G RAM (no-brand machine)
	kernbench	- No significant difference, variance well within noise
	aim9		- No significant difference
	netperf-udp	- No difference until buffer >= PAGE_SIZE
				4096	+1.39%
				8192	+6.80%
				16384	+9.55%
	netperf-tcp	- No difference until buffer >= PAGE_SIZE
				4096	+14.14%
				8192	+ 0.23% (not significant)
				16384	-12.56%
	hackbench	- Small gains, very close to noise
	sysbench	- Small gains/losses, very close to noise

ppc64:	PPC970MP 2.5GHz with 10GB RAM (it's a terrasoft powerstation)
	kernbench	- No significant difference, variance well within noise
	aim9		- No significant difference
	netperf-udp	- 2-3% gain for almost all buffer sizes tested
	netperf-tcp	- losses on small buffers, gains on larger buffers
			  possibly indicates some bad caching effect. Suspect
			  struct zone could be laid out much better
	hackbench	- Small 1-2% gains
	sysbench	- 5-7% gain

For the most part, performance differences are marginal with some noticeable
exceptions. netperf-udp on x86-64 gained heavily as did sysbench on ppc64. I
suspect the TCP results, particularly for small buffers, point to some
cache line bouncing effect which I haven't pinned down yet.

 include/linux/mmzone.h |   10 ++-
 mm/page_alloc.c        |  162 +++++++++++++++++++++++++++---------------------
 mm/vmstat.c            |    4 +-
 3 files changed, 100 insertions(+), 76 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type
  2009-08-18 11:15 [RFC PATCH 0/3] Reduce searching in the page allocator fast-path Mel Gorman
@ 2009-08-18 11:16 ` Mel Gorman
  2009-08-18 11:43   ` Nick Piggin
  2009-08-18 22:57   ` Vincent Li
  2009-08-18 11:16 ` [PATCH 2/3] page-allocoator: Maintain rolling count of pages to free from the PCP Mel Gorman
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 20+ messages in thread
From: Mel Gorman @ 2009-08-18 11:16 UTC (permalink / raw)
  To: Linux Memory Management List
  Cc: Christoph Lameter, Nick Piggin, Linux Kernel Mailing List, Mel Gorman

Currently the per-cpu page allocator searches the PCP list for pages of the
correct migrate-type to reduce the possibility of pages being inappropriate
placed from a fragmentation perspective. This search is potentially expensive
in a fast-path and undesirable. Splitting the per-cpu list into multiple
lists increases the size of a per-cpu structure and this was potentially
a major problem at the time the search was introduced. These problem has
been mitigated as now only the necessary number of structures is allocated
for the running system.

This patch replaces a list search in the per-cpu allocator with one list per
migrate type. The potential snag with this approach is when bulk freeing
pages. We round-robin free pages based on migrate type which has little
bearing on the cache hotness of the page and potentially checks empty lists
repeatedly in the event the majority of PCP pages are of one type.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 include/linux/mmzone.h |    5 ++-
 mm/page_alloc.c        |  106 ++++++++++++++++++++++++++---------------------
 2 files changed, 63 insertions(+), 48 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 9c50309..6e0b624 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -38,6 +38,7 @@
 #define MIGRATE_UNMOVABLE     0
 #define MIGRATE_RECLAIMABLE   1
 #define MIGRATE_MOVABLE       2
+#define MIGRATE_PCPTYPES      3 /* the number of types on the pcp lists */
 #define MIGRATE_RESERVE       3
 #define MIGRATE_ISOLATE       4 /* can't allocate from here */
 #define MIGRATE_TYPES         5
@@ -169,7 +170,9 @@ struct per_cpu_pages {
 	int count;		/* number of pages in the list */
 	int high;		/* high watermark, emptying needed */
 	int batch;		/* chunk size for buddy add/remove */
-	struct list_head list;	/* the list of pages */
+
+	/* Lists of pages, one per migrate type stored on the pcp-lists */
+	struct list_head lists[MIGRATE_PCPTYPES];
 };
 
 struct per_cpu_pageset {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e5baa9..a06ddf0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -522,7 +522,7 @@ static inline int free_pages_check(struct page *page)
 }
 
 /*
- * Frees a list of pages. 
+ * Frees a number of pages from the PCP lists
  * Assumes all pages on list are in same zone, and of same order.
  * count is the number of pages to free.
  *
@@ -532,23 +532,36 @@ static inline int free_pages_check(struct page *page)
  * And clear the zone's pages_scanned counter, to hold off the "all pages are
  * pinned" detection logic.
  */
-static void free_pages_bulk(struct zone *zone, int count,
-					struct list_head *list, int order)
+static void free_pcppages_bulk(struct zone *zone, int count,
+					struct per_cpu_pages *pcp)
 {
+	int migratetype = 0;
+
 	spin_lock(&zone->lock);
 	zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
 	zone->pages_scanned = 0;
 
-	__mod_zone_page_state(zone, NR_FREE_PAGES, count << order);
+	__mod_zone_page_state(zone, NR_FREE_PAGES, count);
 	while (count--) {
 		struct page *page;
+		struct list_head *list;
+
+		/*
+		 * Remove pages from lists in a round-robin fashion. This spinning
+		 * around potentially empty lists is bloody awful, alternatives that
+		 * don't suck are welcome
+		 */
+		do {
+			if (++migratetype == MIGRATE_PCPTYPES)
+				migratetype = 0;
+			list = &pcp->lists[migratetype];
+		} while (list_empty(list));
 
-		VM_BUG_ON(list_empty(list));
 		page = list_entry(list->prev, struct page, lru);
 		/* have to delete it as __free_one_page list manipulates */
 		list_del(&page->lru);
-		trace_mm_page_pcpu_drain(page, order, page_private(page));
-		__free_one_page(page, zone, order, page_private(page));
+		trace_mm_page_pcpu_drain(page, 0, migratetype);
+		__free_one_page(page, zone, 0, migratetype);
 	}
 	spin_unlock(&zone->lock);
 }
@@ -974,7 +987,7 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
 		to_drain = pcp->batch;
 	else
 		to_drain = pcp->count;
-	free_pages_bulk(zone, to_drain, &pcp->list, 0);
+	free_pcppages_bulk(zone, to_drain, pcp);
 	pcp->count -= to_drain;
 	local_irq_restore(flags);
 }
@@ -1000,7 +1013,7 @@ static void drain_pages(unsigned int cpu)
 
 		pcp = &pset->pcp;
 		local_irq_save(flags);
-		free_pages_bulk(zone, pcp->count, &pcp->list, 0);
+		free_pcppages_bulk(zone, pcp->count, pcp);
 		pcp->count = 0;
 		local_irq_restore(flags);
 	}
@@ -1066,6 +1079,7 @@ static void free_hot_cold_page(struct page *page, int cold)
 	struct zone *zone = page_zone(page);
 	struct per_cpu_pages *pcp;
 	unsigned long flags;
+	int migratetype;
 	int wasMlocked = __TestClearPageMlocked(page);
 
 	kmemcheck_free_shadow(page, 0);
@@ -1083,21 +1097,39 @@ static void free_hot_cold_page(struct page *page, int cold)
 	kernel_map_pages(page, 1, 0);
 
 	pcp = &zone_pcp(zone, get_cpu())->pcp;
-	set_page_private(page, get_pageblock_migratetype(page));
+	migratetype = get_pageblock_migratetype(page);
+	set_page_private(page, migratetype);
 	local_irq_save(flags);
 	if (unlikely(wasMlocked))
 		free_page_mlock(page);
 	__count_vm_event(PGFREE);
 
+	/*
+	 * We only track unreclaimable, reclaimable and movable on pcp lists.
+	 * Free ISOLATE pages back to the allocator because they are being
+	 * offlined but treat RESERVE as movable pages so we can get those
+	 * areas back if necessary. Otherwise, we may have to free
+	 * excessively into the page allocator
+	 */
+	if (migratetype >= MIGRATE_PCPTYPES) {
+		if (unlikely(migratetype == MIGRATE_ISOLATE)) {
+			free_one_page(zone, page, 0, migratetype);
+			goto out;
+		}
+		migratetype = MIGRATE_MOVABLE;
+	}
+
 	if (cold)
-		list_add_tail(&page->lru, &pcp->list);
+		list_add_tail(&page->lru, &pcp->lists[migratetype]);
 	else
-		list_add(&page->lru, &pcp->list);
+		list_add(&page->lru, &pcp->lists[migratetype]);
 	pcp->count++;
 	if (pcp->count >= pcp->high) {
-		free_pages_bulk(zone, pcp->batch, &pcp->list, 0);
+		free_pcppages_bulk(zone, pcp->batch, pcp);
 		pcp->count -= pcp->batch;
 	}
+
+out:
 	local_irq_restore(flags);
 	put_cpu();
 }
@@ -1155,46 +1187,24 @@ again:
 	cpu  = get_cpu();
 	if (likely(order == 0)) {
 		struct per_cpu_pages *pcp;
+		struct list_head *list;
 
 		pcp = &zone_pcp(zone, cpu)->pcp;
+		list = &pcp->lists[migratetype];
 		local_irq_save(flags);
-		if (!pcp->count) {
-			pcp->count = rmqueue_bulk(zone, 0,
-					pcp->batch, &pcp->list,
-					migratetype, cold);
-			if (unlikely(!pcp->count))
-				goto failed;
-		}
-
-		/* Find a page of the appropriate migrate type */
-		if (cold) {
-			list_for_each_entry_reverse(page, &pcp->list, lru)
-				if (page_private(page) == migratetype)
-					break;
-		} else {
-			list_for_each_entry(page, &pcp->list, lru)
-				if (page_private(page) == migratetype)
-					break;
-		}
-
-		/* Allocate more to the pcp list if necessary */
-		if (unlikely(&page->lru == &pcp->list)) {
-			int get_one_page = 0;
-
+		if (list_empty(list)) {
 			pcp->count += rmqueue_bulk(zone, 0,
-					pcp->batch, &pcp->list,
+					pcp->batch, list,
 					migratetype, cold);
-			list_for_each_entry(page, &pcp->list, lru) {
-				if (get_pageblock_migratetype(page) !=
-					    MIGRATE_ISOLATE) {
-					get_one_page = 1;
-					break;
-				}
-			}
-			if (!get_one_page)
+			if (unlikely(list_empty(list)))
 				goto failed;
 		}
 
+		if (cold)
+			page = list_entry(list->prev, struct page, lru);
+		else
+			page = list_entry(list->next, struct page, lru);
+
 		list_del(&page->lru);
 		pcp->count--;
 	} else {
@@ -3033,6 +3043,7 @@ static int zone_batchsize(struct zone *zone)
 static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
 {
 	struct per_cpu_pages *pcp;
+	int migratetype;
 
 	memset(p, 0, sizeof(*p));
 
@@ -3040,7 +3051,8 @@ static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
 	pcp->count = 0;
 	pcp->high = 6 * batch;
 	pcp->batch = max(1UL, 1 * batch);
-	INIT_LIST_HEAD(&pcp->list);
+	for (migratetype = 0; migratetype < MIGRATE_PCPTYPES; migratetype++)
+		INIT_LIST_HEAD(&pcp->lists[migratetype]);
 }
 
 /*
@@ -3232,7 +3244,7 @@ static int __zone_pcp_update(void *data)
 		pcp = &pset->pcp;
 
 		local_irq_save(flags);
-		free_pages_bulk(zone, pcp->count, &pcp->list, 0);
+		free_pcppages_bulk(zone, pcp->count, pcp);
 		setup_pageset(pset, batch);
 		local_irq_restore(flags);
 	}
-- 
1.6.3.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type
  2009-08-18 11:16 ` [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type Mel Gorman
@ 2009-08-18 11:43   ` Nick Piggin
  2009-08-18 13:10     ` Mel Gorman
  2009-08-18 22:57   ` Vincent Li
  1 sibling, 1 reply; 20+ messages in thread
From: Nick Piggin @ 2009-08-18 11:43 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Linux Memory Management List, Christoph Lameter,
	Linux Kernel Mailing List

On Tue, Aug 18, 2009 at 12:16:00PM +0100, Mel Gorman wrote:
> Currently the per-cpu page allocator searches the PCP list for pages of the
> correct migrate-type to reduce the possibility of pages being inappropriate
> placed from a fragmentation perspective. This search is potentially expensive
> in a fast-path and undesirable. Splitting the per-cpu list into multiple
> lists increases the size of a per-cpu structure and this was potentially
> a major problem at the time the search was introduced. These problem has
> been mitigated as now only the necessary number of structures is allocated
> for the running system.
> 
> This patch replaces a list search in the per-cpu allocator with one list per
> migrate type. The potential snag with this approach is when bulk freeing
> pages. We round-robin free pages based on migrate type which has little
> bearing on the cache hotness of the page and potentially checks empty lists
> repeatedly in the event the majority of PCP pages are of one type.

Seems OK I guess. Trading off icache and branches for dcache and
algorithmic gains. Too bad everything is always a tradeoff ;)

But no I think this is a good idea.

> 
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> ---
>  include/linux/mmzone.h |    5 ++-
>  mm/page_alloc.c        |  106 ++++++++++++++++++++++++++---------------------
>  2 files changed, 63 insertions(+), 48 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 9c50309..6e0b624 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -38,6 +38,7 @@
>  #define MIGRATE_UNMOVABLE     0
>  #define MIGRATE_RECLAIMABLE   1
>  #define MIGRATE_MOVABLE       2
> +#define MIGRATE_PCPTYPES      3 /* the number of types on the pcp lists */
>  #define MIGRATE_RESERVE       3
>  #define MIGRATE_ISOLATE       4 /* can't allocate from here */
>  #define MIGRATE_TYPES         5
> @@ -169,7 +170,9 @@ struct per_cpu_pages {
>  	int count;		/* number of pages in the list */
>  	int high;		/* high watermark, emptying needed */
>  	int batch;		/* chunk size for buddy add/remove */
> -	struct list_head list;	/* the list of pages */
> +
> +	/* Lists of pages, one per migrate type stored on the pcp-lists */
> +	struct list_head lists[MIGRATE_PCPTYPES];
>  };
>  
>  struct per_cpu_pageset {
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0e5baa9..a06ddf0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -522,7 +522,7 @@ static inline int free_pages_check(struct page *page)
>  }
>  
>  /*
> - * Frees a list of pages. 
> + * Frees a number of pages from the PCP lists
>   * Assumes all pages on list are in same zone, and of same order.
>   * count is the number of pages to free.
>   *
> @@ -532,23 +532,36 @@ static inline int free_pages_check(struct page *page)
>   * And clear the zone's pages_scanned counter, to hold off the "all pages are
>   * pinned" detection logic.
>   */
> -static void free_pages_bulk(struct zone *zone, int count,
> -					struct list_head *list, int order)
> +static void free_pcppages_bulk(struct zone *zone, int count,
> +					struct per_cpu_pages *pcp)
>  {
> +	int migratetype = 0;
> +
>  	spin_lock(&zone->lock);
>  	zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
>  	zone->pages_scanned = 0;
>  
> -	__mod_zone_page_state(zone, NR_FREE_PAGES, count << order);
> +	__mod_zone_page_state(zone, NR_FREE_PAGES, count);
>  	while (count--) {
>  		struct page *page;
> +		struct list_head *list;
> +
> +		/*
> +		 * Remove pages from lists in a round-robin fashion. This spinning
> +		 * around potentially empty lists is bloody awful, alternatives that
> +		 * don't suck are welcome
> +		 */
> +		do {
> +			if (++migratetype == MIGRATE_PCPTYPES)
> +				migratetype = 0;
> +			list = &pcp->lists[migratetype];
> +		} while (list_empty(list));
>  
> -		VM_BUG_ON(list_empty(list));
>  		page = list_entry(list->prev, struct page, lru);
>  		/* have to delete it as __free_one_page list manipulates */
>  		list_del(&page->lru);
> -		trace_mm_page_pcpu_drain(page, order, page_private(page));
> -		__free_one_page(page, zone, order, page_private(page));
> +		trace_mm_page_pcpu_drain(page, 0, migratetype);
> +		__free_one_page(page, zone, 0, migratetype);
>  	}
>  	spin_unlock(&zone->lock);
>  }
> @@ -974,7 +987,7 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
>  		to_drain = pcp->batch;
>  	else
>  		to_drain = pcp->count;
> -	free_pages_bulk(zone, to_drain, &pcp->list, 0);
> +	free_pcppages_bulk(zone, to_drain, pcp);
>  	pcp->count -= to_drain;
>  	local_irq_restore(flags);
>  }
> @@ -1000,7 +1013,7 @@ static void drain_pages(unsigned int cpu)
>  
>  		pcp = &pset->pcp;
>  		local_irq_save(flags);
> -		free_pages_bulk(zone, pcp->count, &pcp->list, 0);
> +		free_pcppages_bulk(zone, pcp->count, pcp);
>  		pcp->count = 0;
>  		local_irq_restore(flags);
>  	}
> @@ -1066,6 +1079,7 @@ static void free_hot_cold_page(struct page *page, int cold)
>  	struct zone *zone = page_zone(page);
>  	struct per_cpu_pages *pcp;
>  	unsigned long flags;
> +	int migratetype;
>  	int wasMlocked = __TestClearPageMlocked(page);
>  
>  	kmemcheck_free_shadow(page, 0);
> @@ -1083,21 +1097,39 @@ static void free_hot_cold_page(struct page *page, int cold)
>  	kernel_map_pages(page, 1, 0);
>  
>  	pcp = &zone_pcp(zone, get_cpu())->pcp;
> -	set_page_private(page, get_pageblock_migratetype(page));
> +	migratetype = get_pageblock_migratetype(page);
> +	set_page_private(page, migratetype);
>  	local_irq_save(flags);
>  	if (unlikely(wasMlocked))
>  		free_page_mlock(page);
>  	__count_vm_event(PGFREE);
>  
> +	/*
> +	 * We only track unreclaimable, reclaimable and movable on pcp lists.
> +	 * Free ISOLATE pages back to the allocator because they are being
> +	 * offlined but treat RESERVE as movable pages so we can get those
> +	 * areas back if necessary. Otherwise, we may have to free
> +	 * excessively into the page allocator
> +	 */
> +	if (migratetype >= MIGRATE_PCPTYPES) {
> +		if (unlikely(migratetype == MIGRATE_ISOLATE)) {
> +			free_one_page(zone, page, 0, migratetype);
> +			goto out;
> +		}
> +		migratetype = MIGRATE_MOVABLE;
> +	}
> +
>  	if (cold)
> -		list_add_tail(&page->lru, &pcp->list);
> +		list_add_tail(&page->lru, &pcp->lists[migratetype]);
>  	else
> -		list_add(&page->lru, &pcp->list);
> +		list_add(&page->lru, &pcp->lists[migratetype]);
>  	pcp->count++;
>  	if (pcp->count >= pcp->high) {
> -		free_pages_bulk(zone, pcp->batch, &pcp->list, 0);
> +		free_pcppages_bulk(zone, pcp->batch, pcp);
>  		pcp->count -= pcp->batch;
>  	}
> +
> +out:
>  	local_irq_restore(flags);
>  	put_cpu();
>  }
> @@ -1155,46 +1187,24 @@ again:
>  	cpu  = get_cpu();
>  	if (likely(order == 0)) {
>  		struct per_cpu_pages *pcp;
> +		struct list_head *list;
>  
>  		pcp = &zone_pcp(zone, cpu)->pcp;
> +		list = &pcp->lists[migratetype];
>  		local_irq_save(flags);
> -		if (!pcp->count) {
> -			pcp->count = rmqueue_bulk(zone, 0,
> -					pcp->batch, &pcp->list,
> -					migratetype, cold);
> -			if (unlikely(!pcp->count))
> -				goto failed;
> -		}
> -
> -		/* Find a page of the appropriate migrate type */
> -		if (cold) {
> -			list_for_each_entry_reverse(page, &pcp->list, lru)
> -				if (page_private(page) == migratetype)
> -					break;
> -		} else {
> -			list_for_each_entry(page, &pcp->list, lru)
> -				if (page_private(page) == migratetype)
> -					break;
> -		}
> -
> -		/* Allocate more to the pcp list if necessary */
> -		if (unlikely(&page->lru == &pcp->list)) {
> -			int get_one_page = 0;
> -
> +		if (list_empty(list)) {
>  			pcp->count += rmqueue_bulk(zone, 0,
> -					pcp->batch, &pcp->list,
> +					pcp->batch, list,
>  					migratetype, cold);
> -			list_for_each_entry(page, &pcp->list, lru) {
> -				if (get_pageblock_migratetype(page) !=
> -					    MIGRATE_ISOLATE) {
> -					get_one_page = 1;
> -					break;
> -				}
> -			}
> -			if (!get_one_page)
> +			if (unlikely(list_empty(list)))
>  				goto failed;
>  		}
>  
> +		if (cold)
> +			page = list_entry(list->prev, struct page, lru);
> +		else
> +			page = list_entry(list->next, struct page, lru);
> +
>  		list_del(&page->lru);
>  		pcp->count--;
>  	} else {
> @@ -3033,6 +3043,7 @@ static int zone_batchsize(struct zone *zone)
>  static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
>  {
>  	struct per_cpu_pages *pcp;
> +	int migratetype;
>  
>  	memset(p, 0, sizeof(*p));
>  
> @@ -3040,7 +3051,8 @@ static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
>  	pcp->count = 0;
>  	pcp->high = 6 * batch;
>  	pcp->batch = max(1UL, 1 * batch);
> -	INIT_LIST_HEAD(&pcp->list);
> +	for (migratetype = 0; migratetype < MIGRATE_PCPTYPES; migratetype++)
> +		INIT_LIST_HEAD(&pcp->lists[migratetype]);
>  }
>  
>  /*
> @@ -3232,7 +3244,7 @@ static int __zone_pcp_update(void *data)
>  		pcp = &pset->pcp;
>  
>  		local_irq_save(flags);
> -		free_pages_bulk(zone, pcp->count, &pcp->list, 0);
> +		free_pcppages_bulk(zone, pcp->count, pcp);
>  		setup_pageset(pset, batch);
>  		local_irq_restore(flags);
>  	}
> -- 
> 1.6.3.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type
  2009-08-18 11:43   ` Nick Piggin
@ 2009-08-18 13:10     ` Mel Gorman
  2009-08-18 13:12       ` Nick Piggin
  0 siblings, 1 reply; 20+ messages in thread
From: Mel Gorman @ 2009-08-18 13:10 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Linux Memory Management List, Christoph Lameter,
	Linux Kernel Mailing List

On Tue, Aug 18, 2009 at 01:43:35PM +0200, Nick Piggin wrote:
> On Tue, Aug 18, 2009 at 12:16:00PM +0100, Mel Gorman wrote:
> > Currently the per-cpu page allocator searches the PCP list for pages of the
> > correct migrate-type to reduce the possibility of pages being inappropriate
> > placed from a fragmentation perspective. This search is potentially expensive
> > in a fast-path and undesirable. Splitting the per-cpu list into multiple
> > lists increases the size of a per-cpu structure and this was potentially
> > a major problem at the time the search was introduced. These problem has
> > been mitigated as now only the necessary number of structures is allocated
> > for the running system.
> > 
> > This patch replaces a list search in the per-cpu allocator with one list per
> > migrate type. The potential snag with this approach is when bulk freeing
> > pages. We round-robin free pages based on migrate type which has little
> > bearing on the cache hotness of the page and potentially checks empty lists
> > repeatedly in the event the majority of PCP pages are of one type.
> 
> Seems OK I guess. Trading off icache and branches for dcache and
> algorithmic gains. Too bad everything is always a tradeoff ;)
> 

Tell me about it. The dcache overhead of this is a problem although I
tried to limit the damage using pahole to see how much padding I had to
play with and staying within it where possible.

> But no I think this is a good idea.
> 

Thanks. Is that an Ack?

> > <SNIP>

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type
  2009-08-18 13:10     ` Mel Gorman
@ 2009-08-18 13:12       ` Nick Piggin
  0 siblings, 0 replies; 20+ messages in thread
From: Nick Piggin @ 2009-08-18 13:12 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Linux Memory Management List, Christoph Lameter,
	Linux Kernel Mailing List

On Tue, Aug 18, 2009 at 02:10:24PM +0100, Mel Gorman wrote:
> On Tue, Aug 18, 2009 at 01:43:35PM +0200, Nick Piggin wrote:
> > On Tue, Aug 18, 2009 at 12:16:00PM +0100, Mel Gorman wrote:
> Tell me about it. The dcache overhead of this is a problem although I
> tried to limit the damage using pahole to see how much padding I had to
> play with and staying within it where possible.
> 
> > But no I think this is a good idea.
> > 
> 
> Thanks. Is that an Ack?

Sure, your numbers seem OK. I don't know if there is much more you
can do without having it merged somewhere...

Acked-by: Nick Piggin <npiggin@suse.de>


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type
  2009-08-18 11:16 ` [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type Mel Gorman
  2009-08-18 11:43   ` Nick Piggin
@ 2009-08-18 22:57   ` Vincent Li
  2009-08-19  8:57     ` Mel Gorman
  1 sibling, 1 reply; 20+ messages in thread
From: Vincent Li @ 2009-08-18 22:57 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Linux Memory Management List, Christoph Lameter, Nick Piggin,
	Linux Kernel Mailing List

On Tue, 18 Aug 2009, Mel Gorman wrote:

> +	/*
> +	 * We only track unreclaimable, reclaimable and movable on pcp lists.
			 ^^^^^^^^^^^^^  
Is it unmovable? I don't see unreclaimable migrate type on pcp lists. 
Just ask to make sure I undsterstand the comment right.

> +	 * Free ISOLATE pages back to the allocator because they are being
> +	 * offlined but treat RESERVE as movable pages so we can get those
> +	 * areas back if necessary. Otherwise, we may have to free
> +	 * excessively into the page allocator
> +	 */
> +	if (migratetype >= MIGRATE_PCPTYPES) {
> +		if (unlikely(migratetype == MIGRATE_ISOLATE)) {
> +			free_one_page(zone, page, 0, migratetype);
> +			goto out;
> +		}
> +		migratetype = MIGRATE_MOVABLE;
> +	}
> +

Vincent Li
Biomedical Research Center
University of British Columbia

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type
  2009-08-18 22:57   ` Vincent Li
@ 2009-08-19  8:57     ` Mel Gorman
  0 siblings, 0 replies; 20+ messages in thread
From: Mel Gorman @ 2009-08-19  8:57 UTC (permalink / raw)
  To: Vincent Li
  Cc: Linux Memory Management List, Christoph Lameter, Nick Piggin,
	Linux Kernel Mailing List

On Tue, Aug 18, 2009 at 03:57:00PM -0700, Vincent Li wrote:
> On Tue, 18 Aug 2009, Mel Gorman wrote:
> 
> > +	/*
> > +	 * We only track unreclaimable, reclaimable and movable on pcp lists.
> 			 ^^^^^^^^^^^^^  
> Is it unmovable? I don't see unreclaimable migrate type on pcp lists. 
> Just ask to make sure I undsterstand the comment right.
> 

It should have said unmovable. Sorry

> > +	 * Free ISOLATE pages back to the allocator because they are being
> > +	 * offlined but treat RESERVE as movable pages so we can get those
> > +	 * areas back if necessary. Otherwise, we may have to free
> > +	 * excessively into the page allocator
> > +	 */
> > +	if (migratetype >= MIGRATE_PCPTYPES) {
> > +		if (unlikely(migratetype == MIGRATE_ISOLATE)) {
> > +			free_one_page(zone, page, 0, migratetype);
> > +			goto out;
> > +		}
> > +		migratetype = MIGRATE_MOVABLE;
> > +	}
> > +
> 
> Vincent Li
> Biomedical Research Center
> University of British Columbia
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/3] page-allocoator: Maintain rolling count of pages to free from the PCP
  2009-08-18 11:15 [RFC PATCH 0/3] Reduce searching in the page allocator fast-path Mel Gorman
  2009-08-18 11:16 ` [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type Mel Gorman
@ 2009-08-18 11:16 ` Mel Gorman
  2009-08-18 11:16 ` [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone Mel Gorman
  2009-08-18 14:22 ` [RFC PATCH 0/3] Reduce searching in the page allocator fast-path Christoph Lameter
  3 siblings, 0 replies; 20+ messages in thread
From: Mel Gorman @ 2009-08-18 11:16 UTC (permalink / raw)
  To: Linux Memory Management List
  Cc: Christoph Lameter, Nick Piggin, Linux Kernel Mailing List, Mel Gorman

When round-robin freeing pages from the PCP lists, empty lists may be
encountered. In the event one of the lists has more pages than another,
there may be numerous checks for list_empty() which is undesirable. This
patch maintains a count of pages to free which is incremented when empty
lists are encountered. The intention is that more pages will then be freed
from fuller lists than the empty ones reducing the number of empty list
checks in the free path.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 mm/page_alloc.c |   23 ++++++++++++++---------
 1 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a06ddf0..dd3f306 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -536,32 +536,37 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 					struct per_cpu_pages *pcp)
 {
 	int migratetype = 0;
+	int batch_free = 0;
 
 	spin_lock(&zone->lock);
 	zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
 	zone->pages_scanned = 0;
 
 	__mod_zone_page_state(zone, NR_FREE_PAGES, count);
-	while (count--) {
+	while (count) {
 		struct page *page;
 		struct list_head *list;
 
 		/*
-		 * Remove pages from lists in a round-robin fashion. This spinning
-		 * around potentially empty lists is bloody awful, alternatives that
-		 * don't suck are welcome
+		 * Remove pages from lists in a round-robin fashion. A batch_free
+		 * count is maintained that is incremented when an empty list is
+		 * encountered. This is so more pages are freed off fuller lists
+		 * instead of spinning excessively around empty lists
 		 */
 		do {
+			batch_free++;
 			if (++migratetype == MIGRATE_PCPTYPES)
 				migratetype = 0;
 			list = &pcp->lists[migratetype];
 		} while (list_empty(list));
 
-		page = list_entry(list->prev, struct page, lru);
-		/* have to delete it as __free_one_page list manipulates */
-		list_del(&page->lru);
-		trace_mm_page_pcpu_drain(page, 0, migratetype);
-		__free_one_page(page, zone, 0, migratetype);
+		do {
+			page = list_entry(list->prev, struct page, lru);
+			/* must delete as __free_one_page list manipulates */
+			list_del(&page->lru);
+			__free_one_page(page, zone, 0, migratetype);
+			trace_mm_page_pcpu_drain(page, 0, migratetype);
+		} while (--count && --batch_free && !list_empty(list));
 	}
 	spin_unlock(&zone->lock);
 }
-- 
1.6.3.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone
  2009-08-18 11:15 [RFC PATCH 0/3] Reduce searching in the page allocator fast-path Mel Gorman
  2009-08-18 11:16 ` [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type Mel Gorman
  2009-08-18 11:16 ` [PATCH 2/3] page-allocoator: Maintain rolling count of pages to free from the PCP Mel Gorman
@ 2009-08-18 11:16 ` Mel Gorman
  2009-08-18 11:47   ` Nick Piggin
  2009-08-18 14:18   ` Christoph Lameter
  2009-08-18 14:22 ` [RFC PATCH 0/3] Reduce searching in the page allocator fast-path Christoph Lameter
  3 siblings, 2 replies; 20+ messages in thread
From: Mel Gorman @ 2009-08-18 11:16 UTC (permalink / raw)
  To: Linux Memory Management List
  Cc: Christoph Lameter, Nick Piggin, Linux Kernel Mailing List, Mel Gorman

Having multiple lists per PCPU increased the size of the per-pcpu
structure. Two of the fields, high and batch, do not change within a
zone making that information redundant. This patch moves those fields
off the PCP and onto the zone to reduce the size of the PCPU.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 include/linux/mmzone.h |    9 +++++----
 mm/page_alloc.c        |   47 +++++++++++++++++++++++++----------------------
 mm/vmstat.c            |    4 ++--
 3 files changed, 32 insertions(+), 28 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6e0b624..57a3ef0 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -167,12 +167,10 @@ enum zone_watermarks {
 #define high_wmark_pages(z) (z->watermark[WMARK_HIGH])
 
 struct per_cpu_pages {
-	int count;		/* number of pages in the list */
-	int high;		/* high watermark, emptying needed */
-	int batch;		/* chunk size for buddy add/remove */
-
 	/* Lists of pages, one per migrate type stored on the pcp-lists */
 	struct list_head lists[MIGRATE_PCPTYPES];
+
+	int count;		/* number of pages in the list */
 };
 
 struct per_cpu_pageset {
@@ -284,6 +282,9 @@ struct zone {
 	/* zone watermarks, access with *_wmark_pages(zone) macros */
 	unsigned long watermark[NR_WMARK];
 
+	int pcp_high;		/* high watermark, emptying needed */
+	int pcp_batch;		/* chunk size for buddy add/remove */
+
 	/*
 	 * We don't know if the memory that we're going to allocate will be freeable
 	 * or/and it will be released eventually, so to avoid totally wasting several
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index dd3f306..65cdfbf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -988,8 +988,8 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
 	int to_drain;
 
 	local_irq_save(flags);
-	if (pcp->count >= pcp->batch)
-		to_drain = pcp->batch;
+	if (pcp->count >= zone->pcp_batch)
+		to_drain = zone->pcp_batch;
 	else
 		to_drain = pcp->count;
 	free_pcppages_bulk(zone, to_drain, pcp);
@@ -1129,9 +1129,9 @@ static void free_hot_cold_page(struct page *page, int cold)
 	else
 		list_add(&page->lru, &pcp->lists[migratetype]);
 	pcp->count++;
-	if (pcp->count >= pcp->high) {
-		free_pcppages_bulk(zone, pcp->batch, pcp);
-		pcp->count -= pcp->batch;
+	if (pcp->count >= zone->pcp_high) {
+		free_pcppages_bulk(zone, zone->pcp_batch, pcp);
+		pcp->count -= zone->pcp_batch;
 	}
 
 out:
@@ -1199,7 +1199,7 @@ again:
 		local_irq_save(flags);
 		if (list_empty(list)) {
 			pcp->count += rmqueue_bulk(zone, 0,
-					pcp->batch, list,
+					zone->pcp_batch, list,
 					migratetype, cold);
 			if (unlikely(list_empty(list)))
 				goto failed;
@@ -2178,8 +2178,8 @@ void show_free_areas(void)
 			pageset = zone_pcp(zone, cpu);
 
 			printk("CPU %4d: hi:%5d, btch:%4d usd:%4d\n",
-			       cpu, pageset->pcp.high,
-			       pageset->pcp.batch, pageset->pcp.count);
+			       cpu, zone->pcp_high,
+			       zone->pcp_batch, pageset->pcp.count);
 		}
 	}
 
@@ -3045,7 +3045,9 @@ static int zone_batchsize(struct zone *zone)
 #endif
 }
 
-static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
+static void setup_pageset(struct zone *zone,
+				struct per_cpu_pageset *p,
+				unsigned long batch)
 {
 	struct per_cpu_pages *pcp;
 	int migratetype;
@@ -3054,8 +3056,8 @@ static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
 
 	pcp = &p->pcp;
 	pcp->count = 0;
-	pcp->high = 6 * batch;
-	pcp->batch = max(1UL, 1 * batch);
+	zone->pcp_high = 6 * batch;
+	zone->pcp_batch = max(1UL, 1 * batch);
 	for (migratetype = 0; migratetype < MIGRATE_PCPTYPES; migratetype++)
 		INIT_LIST_HEAD(&pcp->lists[migratetype]);
 }
@@ -3065,16 +3067,17 @@ static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
  * to the value high for the pageset p.
  */
 
-static void setup_pagelist_highmark(struct per_cpu_pageset *p,
+static void setup_pagelist_highmark(struct zone *zone,
+				struct per_cpu_pageset *p,
 				unsigned long high)
 {
 	struct per_cpu_pages *pcp;
 
 	pcp = &p->pcp;
-	pcp->high = high;
-	pcp->batch = max(1UL, high/4);
+	zone->pcp_high = high;
+	zone->pcp_batch = max(1UL, high/4);
 	if ((high/4) > (PAGE_SHIFT * 8))
-		pcp->batch = PAGE_SHIFT * 8;
+		zone->pcp_batch = PAGE_SHIFT * 8;
 }
 
 
@@ -3115,10 +3118,10 @@ static int __cpuinit process_zones(int cpu)
 		if (!zone_pcp(zone, cpu))
 			goto bad;
 
-		setup_pageset(zone_pcp(zone, cpu), zone_batchsize(zone));
+		setup_pageset(zone, zone_pcp(zone, cpu), zone_batchsize(zone));
 
 		if (percpu_pagelist_fraction)
-			setup_pagelist_highmark(zone_pcp(zone, cpu),
+			setup_pagelist_highmark(zone, zone_pcp(zone, cpu),
 			 	(zone->present_pages / percpu_pagelist_fraction));
 	}
 
@@ -3250,7 +3253,7 @@ static int __zone_pcp_update(void *data)
 
 		local_irq_save(flags);
 		free_pcppages_bulk(zone, pcp->count, pcp);
-		setup_pageset(pset, batch);
+		setup_pageset(zone, pset, batch);
 		local_irq_restore(flags);
 	}
 	return 0;
@@ -3270,9 +3273,9 @@ static __meminit void zone_pcp_init(struct zone *zone)
 #ifdef CONFIG_NUMA
 		/* Early boot. Slab allocator not functional yet */
 		zone_pcp(zone, cpu) = &boot_pageset[cpu];
-		setup_pageset(&boot_pageset[cpu],0);
+		setup_pageset(zone, &boot_pageset[cpu],0);
 #else
-		setup_pageset(zone_pcp(zone,cpu), batch);
+		setup_pageset(zone, zone_pcp(zone,cpu), batch);
 #endif
 	}
 	if (zone->present_pages)
@@ -4781,7 +4784,7 @@ int lowmem_reserve_ratio_sysctl_handler(ctl_table *table, int write,
 }
 
 /*
- * percpu_pagelist_fraction - changes the pcp->high for each zone on each
+ * percpu_pagelist_fraction - changes the zone->pcp_high for each zone on each
  * cpu.  It is the fraction of total pages in each zone that a hot per cpu pagelist
  * can have before it gets flushed back to buddy allocator.
  */
@@ -4800,7 +4803,7 @@ int percpu_pagelist_fraction_sysctl_handler(ctl_table *table, int write,
 		for_each_online_cpu(cpu) {
 			unsigned long  high;
 			high = zone->present_pages / percpu_pagelist_fraction;
-			setup_pagelist_highmark(zone_pcp(zone, cpu), high);
+			setup_pagelist_highmark(zone, zone_pcp(zone, cpu), high);
 		}
 	}
 	return 0;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index c81321f..a9d23c3 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -746,8 +746,8 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
 			   "\n              batch: %i",
 			   i,
 			   pageset->pcp.count,
-			   pageset->pcp.high,
-			   pageset->pcp.batch);
+			   zone->pcp_high,
+			   zone->pcp_batch);
 #ifdef CONFIG_SMP
 		seq_printf(m, "\n  vm stats threshold: %d",
 				pageset->stat_threshold);
-- 
1.6.3.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone
  2009-08-18 11:16 ` [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone Mel Gorman
@ 2009-08-18 11:47   ` Nick Piggin
  2009-08-18 12:57     ` Mel Gorman
  2009-08-18 14:18   ` Christoph Lameter
  1 sibling, 1 reply; 20+ messages in thread
From: Nick Piggin @ 2009-08-18 11:47 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Linux Memory Management List, Christoph Lameter,
	Linux Kernel Mailing List

On Tue, Aug 18, 2009 at 12:16:02PM +0100, Mel Gorman wrote:
> Having multiple lists per PCPU increased the size of the per-pcpu
> structure. Two of the fields, high and batch, do not change within a
> zone making that information redundant. This patch moves those fields
> off the PCP and onto the zone to reduce the size of the PCPU.

Hmm.. I did have some patches a long long time ago that among other
things made the lists larger for the local node only....

But I guess if something like that is ever shown to be a good idea
then we can go back to the old scheme. So yeah this seems OK.

> 
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> ---
>  include/linux/mmzone.h |    9 +++++----
>  mm/page_alloc.c        |   47 +++++++++++++++++++++++++----------------------
>  mm/vmstat.c            |    4 ++--
>  3 files changed, 32 insertions(+), 28 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 6e0b624..57a3ef0 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -167,12 +167,10 @@ enum zone_watermarks {
>  #define high_wmark_pages(z) (z->watermark[WMARK_HIGH])
>  
>  struct per_cpu_pages {
> -	int count;		/* number of pages in the list */
> -	int high;		/* high watermark, emptying needed */
> -	int batch;		/* chunk size for buddy add/remove */
> -
>  	/* Lists of pages, one per migrate type stored on the pcp-lists */
>  	struct list_head lists[MIGRATE_PCPTYPES];
> +
> +	int count;		/* number of pages in the list */
>  };
>  
>  struct per_cpu_pageset {
> @@ -284,6 +282,9 @@ struct zone {
>  	/* zone watermarks, access with *_wmark_pages(zone) macros */
>  	unsigned long watermark[NR_WMARK];
>  
> +	int pcp_high;		/* high watermark, emptying needed */
> +	int pcp_batch;		/* chunk size for buddy add/remove */
> +
>  	/*
>  	 * We don't know if the memory that we're going to allocate will be freeable
>  	 * or/and it will be released eventually, so to avoid totally wasting several
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index dd3f306..65cdfbf 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -988,8 +988,8 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
>  	int to_drain;
>  
>  	local_irq_save(flags);
> -	if (pcp->count >= pcp->batch)
> -		to_drain = pcp->batch;
> +	if (pcp->count >= zone->pcp_batch)
> +		to_drain = zone->pcp_batch;
>  	else
>  		to_drain = pcp->count;
>  	free_pcppages_bulk(zone, to_drain, pcp);
> @@ -1129,9 +1129,9 @@ static void free_hot_cold_page(struct page *page, int cold)
>  	else
>  		list_add(&page->lru, &pcp->lists[migratetype]);
>  	pcp->count++;
> -	if (pcp->count >= pcp->high) {
> -		free_pcppages_bulk(zone, pcp->batch, pcp);
> -		pcp->count -= pcp->batch;
> +	if (pcp->count >= zone->pcp_high) {
> +		free_pcppages_bulk(zone, zone->pcp_batch, pcp);
> +		pcp->count -= zone->pcp_batch;
>  	}
>  
>  out:
> @@ -1199,7 +1199,7 @@ again:
>  		local_irq_save(flags);
>  		if (list_empty(list)) {
>  			pcp->count += rmqueue_bulk(zone, 0,
> -					pcp->batch, list,
> +					zone->pcp_batch, list,
>  					migratetype, cold);
>  			if (unlikely(list_empty(list)))
>  				goto failed;
> @@ -2178,8 +2178,8 @@ void show_free_areas(void)
>  			pageset = zone_pcp(zone, cpu);
>  
>  			printk("CPU %4d: hi:%5d, btch:%4d usd:%4d\n",
> -			       cpu, pageset->pcp.high,
> -			       pageset->pcp.batch, pageset->pcp.count);
> +			       cpu, zone->pcp_high,
> +			       zone->pcp_batch, pageset->pcp.count);
>  		}
>  	}
>  
> @@ -3045,7 +3045,9 @@ static int zone_batchsize(struct zone *zone)
>  #endif
>  }
>  
> -static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
> +static void setup_pageset(struct zone *zone,
> +				struct per_cpu_pageset *p,
> +				unsigned long batch)
>  {
>  	struct per_cpu_pages *pcp;
>  	int migratetype;
> @@ -3054,8 +3056,8 @@ static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
>  
>  	pcp = &p->pcp;
>  	pcp->count = 0;
> -	pcp->high = 6 * batch;
> -	pcp->batch = max(1UL, 1 * batch);
> +	zone->pcp_high = 6 * batch;
> +	zone->pcp_batch = max(1UL, 1 * batch);
>  	for (migratetype = 0; migratetype < MIGRATE_PCPTYPES; migratetype++)
>  		INIT_LIST_HEAD(&pcp->lists[migratetype]);
>  }
> @@ -3065,16 +3067,17 @@ static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
>   * to the value high for the pageset p.
>   */
>  
> -static void setup_pagelist_highmark(struct per_cpu_pageset *p,
> +static void setup_pagelist_highmark(struct zone *zone,
> +				struct per_cpu_pageset *p,
>  				unsigned long high)
>  {
>  	struct per_cpu_pages *pcp;
>  
>  	pcp = &p->pcp;
> -	pcp->high = high;
> -	pcp->batch = max(1UL, high/4);
> +	zone->pcp_high = high;
> +	zone->pcp_batch = max(1UL, high/4);
>  	if ((high/4) > (PAGE_SHIFT * 8))
> -		pcp->batch = PAGE_SHIFT * 8;
> +		zone->pcp_batch = PAGE_SHIFT * 8;
>  }
>  
>  
> @@ -3115,10 +3118,10 @@ static int __cpuinit process_zones(int cpu)
>  		if (!zone_pcp(zone, cpu))
>  			goto bad;
>  
> -		setup_pageset(zone_pcp(zone, cpu), zone_batchsize(zone));
> +		setup_pageset(zone, zone_pcp(zone, cpu), zone_batchsize(zone));
>  
>  		if (percpu_pagelist_fraction)
> -			setup_pagelist_highmark(zone_pcp(zone, cpu),
> +			setup_pagelist_highmark(zone, zone_pcp(zone, cpu),
>  			 	(zone->present_pages / percpu_pagelist_fraction));
>  	}
>  
> @@ -3250,7 +3253,7 @@ static int __zone_pcp_update(void *data)
>  
>  		local_irq_save(flags);
>  		free_pcppages_bulk(zone, pcp->count, pcp);
> -		setup_pageset(pset, batch);
> +		setup_pageset(zone, pset, batch);
>  		local_irq_restore(flags);
>  	}
>  	return 0;
> @@ -3270,9 +3273,9 @@ static __meminit void zone_pcp_init(struct zone *zone)
>  #ifdef CONFIG_NUMA
>  		/* Early boot. Slab allocator not functional yet */
>  		zone_pcp(zone, cpu) = &boot_pageset[cpu];
> -		setup_pageset(&boot_pageset[cpu],0);
> +		setup_pageset(zone, &boot_pageset[cpu],0);
>  #else
> -		setup_pageset(zone_pcp(zone,cpu), batch);
> +		setup_pageset(zone, zone_pcp(zone,cpu), batch);
>  #endif
>  	}
>  	if (zone->present_pages)
> @@ -4781,7 +4784,7 @@ int lowmem_reserve_ratio_sysctl_handler(ctl_table *table, int write,
>  }
>  
>  /*
> - * percpu_pagelist_fraction - changes the pcp->high for each zone on each
> + * percpu_pagelist_fraction - changes the zone->pcp_high for each zone on each
>   * cpu.  It is the fraction of total pages in each zone that a hot per cpu pagelist
>   * can have before it gets flushed back to buddy allocator.
>   */
> @@ -4800,7 +4803,7 @@ int percpu_pagelist_fraction_sysctl_handler(ctl_table *table, int write,
>  		for_each_online_cpu(cpu) {
>  			unsigned long  high;
>  			high = zone->present_pages / percpu_pagelist_fraction;
> -			setup_pagelist_highmark(zone_pcp(zone, cpu), high);
> +			setup_pagelist_highmark(zone, zone_pcp(zone, cpu), high);
>  		}
>  	}
>  	return 0;
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index c81321f..a9d23c3 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -746,8 +746,8 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
>  			   "\n              batch: %i",
>  			   i,
>  			   pageset->pcp.count,
> -			   pageset->pcp.high,
> -			   pageset->pcp.batch);
> +			   zone->pcp_high,
> +			   zone->pcp_batch);
>  #ifdef CONFIG_SMP
>  		seq_printf(m, "\n  vm stats threshold: %d",
>  				pageset->stat_threshold);
> -- 
> 1.6.3.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone
  2009-08-18 11:47   ` Nick Piggin
@ 2009-08-18 12:57     ` Mel Gorman
  0 siblings, 0 replies; 20+ messages in thread
From: Mel Gorman @ 2009-08-18 12:57 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Linux Memory Management List, Christoph Lameter,
	Linux Kernel Mailing List

On Tue, Aug 18, 2009 at 01:47:52PM +0200, Nick Piggin wrote:
> On Tue, Aug 18, 2009 at 12:16:02PM +0100, Mel Gorman wrote:
> > Having multiple lists per PCPU increased the size of the per-pcpu
> > structure. Two of the fields, high and batch, do not change within a
> > zone making that information redundant. This patch moves those fields
> > off the PCP and onto the zone to reduce the size of the PCPU.
> 
> Hmm.. I did have some patches a long long time ago that among other
> things made the lists larger for the local node only....
> 

To reduce the remote node lists, one could look at applying some fixed factor
to the high value or basing remote lists on some percentage of high.

> But I guess if something like that is ever shown to be a good idea
> then we can go back to the old scheme. So yeah this seems OK.
> 

Thanks.

> > 
> > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> > ---
> >  include/linux/mmzone.h |    9 +++++----
> >  mm/page_alloc.c        |   47 +++++++++++++++++++++++++----------------------
> >  mm/vmstat.c            |    4 ++--
> >  3 files changed, 32 insertions(+), 28 deletions(-)
> > 
> > <SNIP>

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone
  2009-08-18 11:16 ` [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone Mel Gorman
  2009-08-18 11:47   ` Nick Piggin
@ 2009-08-18 14:18   ` Christoph Lameter
  2009-08-18 16:42     ` Mel Gorman
  1 sibling, 1 reply; 20+ messages in thread
From: Christoph Lameter @ 2009-08-18 14:18 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Linux Memory Management List, Nick Piggin, Linux Kernel Mailing List


This will increase the cache footprint for the hot code path. Could these
new variable be moved next to zone fields that are already in use there?
The pageset array is used f.e.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone
  2009-08-18 14:18   ` Christoph Lameter
@ 2009-08-18 16:42     ` Mel Gorman
  2009-08-18 17:56       ` Christoph Lameter
  0 siblings, 1 reply; 20+ messages in thread
From: Mel Gorman @ 2009-08-18 16:42 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linux Memory Management List, Nick Piggin, Linux Kernel Mailing List

On Tue, Aug 18, 2009 at 10:18:48AM -0400, Christoph Lameter wrote:
> 
> This will increase the cache footprint for the hot code path. Could these
> new variable be moved next to zone fields that are already in use there?
> The pageset array is used f.e.
> 

pageset is ____cacheline_aligned_in_smp so putting pcp->high/batch near
it won't help in terms of cache footprint. This is why I located it near
watermarks because it's known they'll be needed at roughly the same time
pcp->high/batch would be normally accessed.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone
  2009-08-18 16:42     ` Mel Gorman
@ 2009-08-18 17:56       ` Christoph Lameter
  2009-08-18 20:50         ` Mel Gorman
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Lameter @ 2009-08-18 17:56 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Linux Memory Management List, Nick Piggin, Linux Kernel Mailing List

On Tue, 18 Aug 2009, Mel Gorman wrote:

> On Tue, Aug 18, 2009 at 10:18:48AM -0400, Christoph Lameter wrote:
> >
> > This will increase the cache footprint for the hot code path. Could these
> > new variable be moved next to zone fields that are already in use there?
> > The pageset array is used f.e.
> >
>
> pageset is ____cacheline_aligned_in_smp so putting pcp->high/batch near
> it won't help in terms of cache footprint. This is why I located it near
> watermarks because it's known they'll be needed at roughly the same time
> pcp->high/batch would be normally accessed.

watermarks are not accessed from the hot code path in free_hot_cold page.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone
  2009-08-18 17:56       ` Christoph Lameter
@ 2009-08-18 20:50         ` Mel Gorman
  0 siblings, 0 replies; 20+ messages in thread
From: Mel Gorman @ 2009-08-18 20:50 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linux Memory Management List, Nick Piggin, Linux Kernel Mailing List

On Tue, Aug 18, 2009 at 01:56:22PM -0400, Christoph Lameter wrote:
> On Tue, 18 Aug 2009, Mel Gorman wrote:
> 
> > On Tue, Aug 18, 2009 at 10:18:48AM -0400, Christoph Lameter wrote:
> > >
> > > This will increase the cache footprint for the hot code path. Could these
> > > new variable be moved next to zone fields that are already in use there?
> > > The pageset array is used f.e.
> > >
> >
> > pageset is ____cacheline_aligned_in_smp so putting pcp->high/batch near
> > it won't help in terms of cache footprint. This is why I located it near
> > watermarks because it's known they'll be needed at roughly the same time
> > pcp->high/batch would be normally accessed.
> 
> watermarks are not accessed from the hot code path in free_hot_cold page.
> 

They are used in a commonly-used path for allocation so there is some
advantage. Put beside pageset, there is no advantage as that structure
is already aligned to a cache-line.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 0/3] Reduce searching in the page allocator fast-path
  2009-08-18 11:15 [RFC PATCH 0/3] Reduce searching in the page allocator fast-path Mel Gorman
                   ` (2 preceding siblings ...)
  2009-08-18 11:16 ` [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone Mel Gorman
@ 2009-08-18 14:22 ` Christoph Lameter
  2009-08-18 16:53   ` Mel Gorman
  3 siblings, 1 reply; 20+ messages in thread
From: Christoph Lameter @ 2009-08-18 14:22 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Linux Memory Management List, Nick Piggin, Linux Kernel Mailing List


This could be combined with the per cpu ops patch that makes the page
allocator use alloc_percpu for its per cpu data needs. That in turn would
allow the use of per cpu atomics in the hot paths, maybe we can
get to a point where we can drop the irq disable there.



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 0/3] Reduce searching in the page allocator fast-path
  2009-08-18 14:22 ` [RFC PATCH 0/3] Reduce searching in the page allocator fast-path Christoph Lameter
@ 2009-08-18 16:53   ` Mel Gorman
  2009-08-18 19:05     ` Christoph Lameter
  0 siblings, 1 reply; 20+ messages in thread
From: Mel Gorman @ 2009-08-18 16:53 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linux Memory Management List, Nick Piggin, Linux Kernel Mailing List

On Tue, Aug 18, 2009 at 10:22:01AM -0400, Christoph Lameter wrote:
> 
> This could be combined with the per cpu ops patch that makes the page
> allocator use alloc_percpu for its per cpu data needs. That in turn would
> allow the use of per cpu atomics in the hot paths, maybe we can
> get to a point where we can drop the irq disable there.
> 

It would appear that getting rid of IRQ disabling and using per-cpu-atomics
would be a problem independent of searching the free lists. Either would
be good, both would be better or am I missing something that makes them
mutually exclusive?

Can you point me to which patchset you are talking about specifically that
uses per-cpu atomics in the hot path? There are a lot of per-cpu patches
related to you that have been posted in the last few months and I'm not sure
what any of their merge status' is.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 0/3] Reduce searching in the page allocator fast-path
  2009-08-18 16:53   ` Mel Gorman
@ 2009-08-18 19:05     ` Christoph Lameter
  2009-08-19  9:08       ` Mel Gorman
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Lameter @ 2009-08-18 19:05 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Linux Memory Management List, Nick Piggin, Linux Kernel Mailing List

On Tue, 18 Aug 2009, Mel Gorman wrote:

> Can you point me to which patchset you are talking about specifically that
> uses per-cpu atomics in the hot path? There are a lot of per-cpu patches
> related to you that have been posted in the last few months and I'm not sure
> what any of their merge status' is.

The following patch just moved the page allocator to use the new per cpu
allocator. It does not use per cpu atomic yet but its possible then.

http://marc.info/?l=linux-mm&m=124527414206546&w=2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 0/3] Reduce searching in the page allocator fast-path
  2009-08-18 19:05     ` Christoph Lameter
@ 2009-08-19  9:08       ` Mel Gorman
  2009-08-19 11:48         ` Christoph Lameter
  0 siblings, 1 reply; 20+ messages in thread
From: Mel Gorman @ 2009-08-19  9:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linux Memory Management List, Nick Piggin, Linux Kernel Mailing List

On Tue, Aug 18, 2009 at 03:05:25PM -0400, Christoph Lameter wrote:
> On Tue, 18 Aug 2009, Mel Gorman wrote:
> 
> > Can you point me to which patchset you are talking about specifically that
> > uses per-cpu atomics in the hot path? There are a lot of per-cpu patches
> > related to you that have been posted in the last few months and I'm not sure
> > what any of their merge status' is.
> 
> The following patch just moved the page allocator to use the new per cpu
> allocator. It does not use per cpu atomic yet but its possible then.
> 
> http://marc.info/?l=linux-mm&m=124527414206546&w=2
> 

Ok, I don't see this particular patch merged, is it in a merge queue somewhere?

After glancing through, I can see how it might help.  I'm going to drop patch
3 of this set that shuffles data from the PCP to the zone and take a closer
look at those patches. Patch 1 and 2 of this set should still go ahead. Do
you agree?

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC PATCH 0/3] Reduce searching in the page allocator fast-path
  2009-08-19  9:08       ` Mel Gorman
@ 2009-08-19 11:48         ` Christoph Lameter
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Lameter @ 2009-08-19 11:48 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Linux Memory Management List, Nick Piggin, Linux Kernel Mailing List

On Wed, 19 Aug 2009, Mel Gorman wrote:

> Ok, I don't see this particular patch merged, is it in a merge queue somewhere?

The patch depends on Tejun's work to be merged that makes the per cpu
allocator available on all platforms. I believe that is in the queue for
2.6.32.

> After glancing through, I can see how it might help.  I'm going to drop patch
> 3 of this set that shuffles data from the PCP to the zone and take a closer
> look at those patches. Patch 1 and 2 of this set should still go ahead. Do
> you agree?

Yes.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2009-08-19 11:48 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-08-18 11:15 [RFC PATCH 0/3] Reduce searching in the page allocator fast-path Mel Gorman
2009-08-18 11:16 ` [PATCH 1/3] page-allocator: Split per-cpu list into one-list-per-migrate-type Mel Gorman
2009-08-18 11:43   ` Nick Piggin
2009-08-18 13:10     ` Mel Gorman
2009-08-18 13:12       ` Nick Piggin
2009-08-18 22:57   ` Vincent Li
2009-08-19  8:57     ` Mel Gorman
2009-08-18 11:16 ` [PATCH 2/3] page-allocoator: Maintain rolling count of pages to free from the PCP Mel Gorman
2009-08-18 11:16 ` [PATCH 3/3] page-allocator: Move pcp static fields for high and batch off-pcp and onto the zone Mel Gorman
2009-08-18 11:47   ` Nick Piggin
2009-08-18 12:57     ` Mel Gorman
2009-08-18 14:18   ` Christoph Lameter
2009-08-18 16:42     ` Mel Gorman
2009-08-18 17:56       ` Christoph Lameter
2009-08-18 20:50         ` Mel Gorman
2009-08-18 14:22 ` [RFC PATCH 0/3] Reduce searching in the page allocator fast-path Christoph Lameter
2009-08-18 16:53   ` Mel Gorman
2009-08-18 19:05     ` Christoph Lameter
2009-08-19  9:08       ` Mel Gorman
2009-08-19 11:48         ` Christoph Lameter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox