* [PATCH 0/2] Finish polish for grouping pages by mobility
From: Mel Gorman @ 2007-04-18 13:53 UTC
To: akpm; +Cc: Mel Gorman, linux-mm
The following two patches are intended to polish off fragmentation avoidance
in its current incarnation. With these patches applied my TODO list is empty,
and I intend to look through other patch sets for a while and watch for
bugs. In contrast to previous patches, these remove code rather than add it.
The first patch removes CONFIG_PAGE_GROUP_BY_MOBILITY as a compile-time
option. Once applied, pages are always grouped by mobility except when it
is determined that there is not enough memory for grouping to work. The
compile-time option is removed because it was considered undesirable to
alter the page allocator's behavior between configurations.
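For reference, the runtime check that replaces the compile-time option looks
like this once both patches are applied (a sketch based on the hunks in the
patches below; the comment on the flag paraphrases the behaviour described
above rather than quoting the code that sets it):

	/* Set when the system is deemed too small for grouping to help */
	int page_group_by_mobility_disabled __read_mostly;

	static inline int allocflags_to_migratetype(gfp_t gfp_flags, int order)
	{
		/* Behave like a single free list when grouping is disabled */
		if (unlikely(page_group_by_mobility_disabled))
			return MIGRATE_UNMOVABLE;

		/* Cluster based on mobility */
		return (((gfp_flags & __GFP_MOVABLE) != 0) << 1) |
			((gfp_flags & __GFP_RECLAIMABLE) != 0);
	}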
The second patch stops grouping high-order atomic allocations together. I
have a strong feeling that the MIGRATE_RESERVE blocks that keep the
min_free_kbytes pages contiguous should be enough. If another order-3 failure
with e1000 or any other atomic allocation shows up, grouping high-order
atomic allocations can be tried again. That way, it will be known whether the
feature works as expected or not.
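For illustration, the reserve works roughly as sketched below. Only
get/set_pageblock_migratetype() and move_freepages_block() appear in the
patches; the sizing calculation and the loop are reconstructed from memory
and may differ in detail from the real setup_zone_migrate_reserve():

	/*
	 * Sketch: mark enough MAX_ORDER_NR_PAGES blocks as MIGRATE_RESERVE
	 * to hold zone->pages_min worth of free pages contiguously.
	 */
	static void setup_zone_migrate_reserve(struct zone *zone)
	{
		unsigned long pfn, end_pfn, reserve;
		struct page *page;

		end_pfn = zone->zone_start_pfn + zone->spanned_pages;

		/* Number of pageblocks needed to cover the min watermark */
		reserve = roundup(zone->pages_min, MAX_ORDER_NR_PAGES) >>
							(MAX_ORDER - 1);

		for (pfn = zone->zone_start_pfn; pfn < end_pfn && reserve;
						pfn += MAX_ORDER_NR_PAGES) {
			page = pfn_to_page(pfn);

			/* Claim the block for the reserve and move its
			 * free pages onto the MIGRATE_RESERVE free list */
			if (get_pageblock_migratetype(page) != MIGRATE_RESERVE) {
				set_pageblock_migratetype(page, MIGRATE_RESERVE);
				move_freepages_block(zone, page, MIGRATE_RESERVE);
			}
			reserve--;
		}
	}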
With these two patches, the stack is a little funky because it adds code
early in the set only to remove it again later. Andrew, if you like, I can
send a drop-in replacement stack with the config options never added.
The performance effect we have seen with kernbench remains in the -0.1% to
+3% range for total CPU time. Whether a regression or a gain is seen depends
on the size of the TLB. Every workload has a working set, but what is often
forgotten is that the kernel portion of the working set is backed by large
page table entries and does not necessarily exhibit the locality principle
the same way userspace does.
When grouping pages by mobility, kernel allocations are backed by fewer
large page table entries than when they are scattered throughout the physical
address space. This frees up TLB entries that can then be used by userspace,
so there can be a performance gain in both user and system CPU times due to
increased TLB reach. The gain is seen when the size of the working set would
normally exceed TLB reach, which is why we generally see performance gains
on x86_64 but not always on PPC64 with its much larger TLB [1]. It is
expected that the effect becomes more noticeable the longer the system is
running, but this has not been measured. Glancing through the performance
tests on test.kernel.org, there were some improvements when 2.6.21-rc2-mm2
was released which may or may not be due to fragmentation avoidance, but it
is certainly interesting.
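To make "TLB reach" concrete, here is a back-of-the-envelope illustration.
The entry count and page sizes below are made-up example numbers, not
measurements from any particular CPU:

	/*
	 * Illustrative TLB reach arithmetic. Reach is simply the number
	 * of TLB entries multiplied by the page size each entry maps.
	 */
	#include <stdio.h>

	int main(void)
	{
		unsigned long entries = 64;		/* hypothetical dTLB entries */
		unsigned long small = 4UL << 10;	/* 4KiB base pages */
		unsigned long large = 2UL << 20;	/* 2MiB kernel mappings */

		printf("reach with 4KiB pages: %lu KiB\n",
						entries * small >> 10);
		printf("reach with 2MiB pages: %lu MiB\n",
						entries * large >> 20);
		return 0;
	}

With the same number of entries, large kernel mappings multiply reach by the
ratio of the page sizes, which is where the system CPU time gains come from.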
As always, the success rates of high-order allocations are drastically
improved, particularly when used in combination with Andy's intelligent
reclaim work.
[1] Size does matter
--
Mel Gorman
Part-time PhD Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
* [PATCH 1/2] Back out add-a-configure-option-to-group-pages-by-mobility
From: Mel Gorman @ 2007-04-18 13:53 UTC
To: akpm; +Cc: Mel Gorman, linux-mm
Grouping pages by mobility can currently be disabled at compile time. A
number of people considered this undesirable. However, in the current stack
of patches it is not simply a case of dropping the configurable patch, as
that would cause merge conflicts. This patch backs out the configuration
option.
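As context for the mmzone.h hunk below: with the option gone, the
free-list iterator is compiled in unconditionally. Its second loop line is
cut off in the quoted diff context; the full macro is believed to expand as
follows:

	#define for_each_migratetype_order(order, type) \
		for (order = 0; order < MAX_ORDER; order++) \
			for (type = 0; type < MIGRATE_TYPES; type++)

so every free-list walker now visits MIGRATE_TYPES lists per order on all
configurations.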
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Andy Whitcroft <apw@shadowen.org>
---
include/linux/mmzone.h | 9 ---------
init/Kconfig | 13 -------------
mm/page_alloc.c | 42 ++----------------------------------------
3 files changed, 2 insertions(+), 62 deletions(-)
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.21-rc6-mm1-001_latest/include/linux/mmzone.h linux-2.6.21-rc6-mm1-002_backout_configurable/include/linux/mmzone.h
--- linux-2.6.21-rc6-mm1-001_latest/include/linux/mmzone.h 2007-04-17 14:49:33.000000000 +0100
+++ linux-2.6.21-rc6-mm1-002_backout_configurable/include/linux/mmzone.h 2007-04-17 16:35:48.000000000 +0100
@@ -25,21 +25,12 @@
#endif
#define MAX_ORDER_NR_PAGES (1 << (MAX_ORDER - 1))
-#ifdef CONFIG_PAGE_GROUP_BY_MOBILITY
#define MIGRATE_UNMOVABLE 0
#define MIGRATE_RECLAIMABLE 1
#define MIGRATE_MOVABLE 2
#define MIGRATE_HIGHATOMIC 3
#define MIGRATE_RESERVE 4
#define MIGRATE_TYPES 5
-#else
-#define MIGRATE_UNMOVABLE 0
-#define MIGRATE_UNRECLAIMABLE 0
-#define MIGRATE_MOVABLE 0
-#define MIGRATE_HIGHATOMIC 0
-#define MIGRATE_RESERVE 0
-#define MIGRATE_TYPES 1
-#endif
#define for_each_migratetype_order(order, type) \
for (order = 0; order < MAX_ORDER; order++) \
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.21-rc6-mm1-001_latest/init/Kconfig linux-2.6.21-rc6-mm1-002_backout_configurable/init/Kconfig
--- linux-2.6.21-rc6-mm1-001_latest/init/Kconfig 2007-04-17 14:32:03.000000000 +0100
+++ linux-2.6.21-rc6-mm1-002_backout_configurable/init/Kconfig 2007-04-17 16:35:48.000000000 +0100
@@ -636,19 +636,6 @@ config BASE_SMALL
default 0 if BASE_FULL
default 1 if !BASE_FULL
-config PAGE_GROUP_BY_MOBILITY
- bool "Group pages based on their mobility in the page allocator"
- def_bool y
- help
- The standard allocator will fragment memory over time which means
- that high order allocations will fail even if kswapd is running. If
- this option is set, the allocator will try and group page types
- based on their ability to migrate or reclaim. This is a best effort
- attempt at lowering fragmentation which a few workloads care about.
- The loss is a more complex allocator that may perform slower. If
- you are interested in working with large pages, say Y and set
- /proc/sys/vm/min_free_bytes to 16374. Otherwise say N
-
menu "Loadable module support"
config MODULES
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.21-rc6-mm1-001_latest/mm/page_alloc.c linux-2.6.21-rc6-mm1-002_backout_configurable/mm/page_alloc.c
--- linux-2.6.21-rc6-mm1-001_latest/mm/page_alloc.c 2007-04-17 16:33:48.000000000 +0100
+++ linux-2.6.21-rc6-mm1-002_backout_configurable/mm/page_alloc.c 2007-04-17 16:35:48.000000000 +0100
@@ -144,7 +144,6 @@ static unsigned long __meminitdata dma_r
EXPORT_SYMBOL(movable_zone);
#endif /* CONFIG_ARCH_POPULATES_NODE_MAP */
-#ifdef CONFIG_PAGE_GROUP_BY_MOBILITY
int page_group_by_mobility_disabled __read_mostly;
static inline int get_pageblock_migratetype(struct page *page)
@@ -178,22 +177,6 @@ static inline int allocflags_to_migratet
((gfp_flags & __GFP_RECLAIMABLE) != 0);
}
-#else
-static inline int get_pageblock_migratetype(struct page *page)
-{
- return MIGRATE_UNMOVABLE;
-}
-
-static void set_pageblock_migratetype(struct page *page, int migratetype)
-{
-}
-
-static inline int allocflags_to_migratetype(gfp_t gfp_flags, int order)
-{
- return MIGRATE_UNMOVABLE;
-}
-#endif /* CONFIG_PAGE_GROUP_BY_MOBILITY */
-
#ifdef CONFIG_DEBUG_VM
static int page_outside_zone_boundaries(struct zone *zone, struct page *page)
{
@@ -728,7 +711,6 @@ static struct page *__rmqueue_smallest(s
}
-#ifdef CONFIG_PAGE_GROUP_BY_MOBILITY
/*
* This array describes the order lists are fallen back to when
* the free lists for the desirable migrate type are depleted
@@ -760,7 +742,7 @@ int move_freepages(struct zone *zone,
* CONFIG_HOLES_IN_ZONE is set. This bug check is probably redundant
* anyway as we check zone boundaries in move_freepages_block().
* Remove at a later date when no bug reports exist related to
- * CONFIG_PAGE_GROUP_BY_MOBILITY
+ * grouping pages by mobility
*/
BUG_ON(page_zone(start_page) != page_zone(end_page));
#endif
@@ -909,13 +891,6 @@ retry:
/* Use MIGRATE_RESERVE rather than fail an allocation */
return __rmqueue_smallest(zone, order, MIGRATE_RESERVE);
}
-#else
-static struct page *__rmqueue_fallback(struct zone *zone, int order,
- int start_migratetype)
-{
- return NULL;
-}
-#endif /* CONFIG_PAGE_GROUP_BY_MOBILITY */
/*
* Do the hard work of removing an element from the buddy allocator.
@@ -1081,7 +1056,6 @@ void mark_free_pages(struct zone *zone)
}
#endif /* CONFIG_PM */
-#if defined(CONFIG_PM) || defined(CONFIG_PAGE_GROUP_BY_MOBILITY)
/*
* Spill all of this CPU's per-cpu pages back into the buddy allocator.
*/
@@ -1112,9 +1086,6 @@ void drain_all_local_pages(void)
smp_call_function(smp_drain_local_pages, NULL, 0, 1);
}
-#else
-void drain_all_local_pages(void) {}
-#endif /* CONFIG_PM || CONFIG_PAGE_GROUP_BY_MOBILITY */
/*
* Free a 0-order page
@@ -1205,7 +1176,6 @@ again:
goto failed;
}
-#ifdef CONFIG_PAGE_GROUP_BY_MOBILITY
/* Find a page of the appropriate migrate type */
list_for_each_entry(page, &pcp->list, lru)
if (page_private(page) == migratetype)
@@ -1217,9 +1187,6 @@ again:
pcp->batch, &pcp->list, migratetype);
page = list_entry(pcp->list.next, struct page, lru);
}
-#else
- page = list_entry(pcp->list.next, struct page, lru);
-#endif /* CONFIG_PAGE_GROUP_BY_MOBILITY */
list_del(&page->lru);
pcp->count--;
@@ -2385,7 +2352,6 @@ static inline unsigned long wait_table_b
#define LONG_ALIGN(x) (((x)+(sizeof(long))-1)&~((sizeof(long))-1))
-#ifdef CONFIG_PAGE_GROUP_BY_MOBILITY
/*
* Mark a number of MAX_ORDER_NR_PAGES blocks as MIGRATE_RESERVE. The number
* of blocks reserved is based on zone->pages_min. The memory within the
@@ -2439,11 +2405,7 @@ static void setup_zone_migrate_reserve(s
}
}
}
-#else
-static inline void setup_zone_migrate_reserve(struct zone *zone)
-{
-}
-#endif /* CONFIG_PAGE_GROUP_BY_MOBILITY */
+
/*
* Initially all pages are reserved - free ones are freed
* up by free_all_bootmem() once the early boot process is
* [PATCH 2/2] Back out group-high-order-atomic-allocations
From: Mel Gorman @ 2007-04-18 13:54 UTC
To: akpm; +Cc: Mel Gorman, linux-mm
Grouping high-order atomic allocations together was intended to allow bursty
users of atomic allocations, such as e1000, to work in situations where
their preallocated buffers were depleted. This did not work in at least one
case with a wireless network adapter that needed order-1 allocations
frequently. To resolve that, the free pages used for min_free_kbytes were
moved to separate contiguous blocks with the patch
bias-the-location-of-pages-freed-for-min_free_kbytes-in-the-same-max_order_nr_pages-blocks.
It is felt that keeping the free pages in the same contiguous blocks should
be sufficient for bursty, short-lived, high-order atomic allocations to
succeed, maybe even with the e1000. Even if there is a failure, increasing
the value of min_free_kbytes will free pages as contiguous blocks, in
contrast to the standard buddy allocator which makes no attempt to keep the
minimum number of free pages contiguous.
This patch backs out grouping high-order atomic allocations together to
determine whether it is really needed. If a new report comes in about
high-order atomic allocations failing, the feature can be reintroduced to
see whether it fixes the problem. As a side-effect, this patch reduces by
one the number of bits required to track the mobility type of pages within
a MAX_ORDER_NR_PAGES block.
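For reference, with MIGRATE_HIGHATOMIC gone the fallback path reduces to a
single pass with no retry. The following sketch is assembled from the hunks
below, with the block-stealing details elided:

	/* Sketch only: simplified __rmqueue_fallback() after this patch */
	static struct page *__rmqueue_fallback(struct zone *zone, int order,
							int start_migratetype)
	{
		struct free_area *area;
		int current_order, migratetype, i;

		/* Find the largest possible block of pages in the other lists */
		for (current_order = MAX_ORDER-1; current_order >= order;
							--current_order) {
			for (i = 0; i < MIGRATE_TYPES - 1; i++) {
				migratetype = fallbacks[start_migratetype][i];

				/* MIGRATE_RESERVE is handled last if necessary */
				if (migratetype == MIGRATE_RESERVE)
					continue;

				area = &zone->free_area[current_order];
				if (list_empty(&area->free_list[migratetype]))
					continue;

				/* ... remove the page, split the block and
				 * possibly claim it, as in the hunks below ... */
			}
		}

		/* Use MIGRATE_RESERVE rather than fail the allocation */
		return __rmqueue_smallest(zone, order, MIGRATE_RESERVE);
	}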
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Andy Whitcroft <apw@shadowen.org>
---
include/linux/mmzone.h | 5 ++---
include/linux/pageblock-flags.h | 2 +-
mm/page_alloc.c | 33 +++++----------------------------
3 files changed, 8 insertions(+), 32 deletions(-)
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.21-rc6-mm1-002_backout_configurable/include/linux/mmzone.h linux-2.6.21-rc6-mm1-003_backout_highatomic/include/linux/mmzone.h
--- linux-2.6.21-rc6-mm1-002_backout_configurable/include/linux/mmzone.h 2007-04-17 16:35:48.000000000 +0100
+++ linux-2.6.21-rc6-mm1-003_backout_highatomic/include/linux/mmzone.h 2007-04-17 16:37:39.000000000 +0100
@@ -28,9 +28,8 @@
#define MIGRATE_UNMOVABLE 0
#define MIGRATE_RECLAIMABLE 1
#define MIGRATE_MOVABLE 2
-#define MIGRATE_HIGHATOMIC 3
-#define MIGRATE_RESERVE 4
-#define MIGRATE_TYPES 5
+#define MIGRATE_RESERVE 3
+#define MIGRATE_TYPES 4
#define for_each_migratetype_order(order, type) \
for (order = 0; order < MAX_ORDER; order++) \
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.21-rc6-mm1-002_backout_configurable/include/linux/pageblock-flags.h linux-2.6.21-rc6-mm1-003_backout_highatomic/include/linux/pageblock-flags.h
--- linux-2.6.21-rc6-mm1-002_backout_configurable/include/linux/pageblock-flags.h 2007-04-17 14:32:03.000000000 +0100
+++ linux-2.6.21-rc6-mm1-003_backout_highatomic/include/linux/pageblock-flags.h 2007-04-17 16:37:39.000000000 +0100
@@ -31,7 +31,7 @@
/* Bit indices that affect a whole block of pages */
enum pageblock_bits {
- PB_range(PB_migrate, 3), /* 3 bits required for migrate types */
+ PB_range(PB_migrate, 2), /* 2 bits required for migrate types */
NR_PAGEBLOCK_BITS
};
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.21-rc6-mm1-002_backout_configurable/mm/page_alloc.c linux-2.6.21-rc6-mm1-003_backout_highatomic/mm/page_alloc.c
--- linux-2.6.21-rc6-mm1-002_backout_configurable/mm/page_alloc.c 2007-04-17 16:35:48.000000000 +0100
+++ linux-2.6.21-rc6-mm1-003_backout_highatomic/mm/page_alloc.c 2007-04-17 16:37:39.000000000 +0100
@@ -167,11 +167,6 @@ static inline int allocflags_to_migratet
if (unlikely(page_group_by_mobility_disabled))
return MIGRATE_UNMOVABLE;
- /* Cluster high-order atomic allocations together */
- if (unlikely(order > 0) &&
- (!(gfp_flags & __GFP_WAIT) || in_interrupt()))
- return MIGRATE_HIGHATOMIC;
-
/* Cluster based on mobility */
return (((gfp_flags & __GFP_MOVABLE) != 0) << 1) |
((gfp_flags & __GFP_RECLAIMABLE) != 0);
@@ -716,11 +711,10 @@ static struct page *__rmqueue_smallest(s
* the free lists for the desirable migrate type are depleted
*/
static int fallbacks[MIGRATE_TYPES][MIGRATE_TYPES-1] = {
- [MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_HIGHATOMIC, MIGRATE_RESERVE },
- [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_HIGHATOMIC, MIGRATE_RESERVE },
- [MIGRATE_MOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_HIGHATOMIC, MIGRATE_RESERVE },
- [MIGRATE_HIGHATOMIC] = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
- [MIGRATE_RESERVE] = { MIGRATE_RESERVE, MIGRATE_RESERVE, MIGRATE_RESERVE, MIGRATE_RESERVE }, /* Never used */
+ [MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
+ [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
+ [MIGRATE_MOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_RESERVE },
+ [MIGRATE_RESERVE] = { MIGRATE_RESERVE, MIGRATE_RESERVE, MIGRATE_RESERVE }, /* Never used */
};
/*
@@ -814,9 +808,7 @@ static struct page *__rmqueue_fallback(s
int current_order;
struct page *page;
int migratetype, i;
- int nonatomic_fallback_atomic = 0;
-retry:
/* Find the largest possible block of pages in the other list */
for (current_order = MAX_ORDER-1; current_order >= order;
--current_order) {
@@ -826,14 +818,6 @@ retry:
/* MIGRATE_RESERVE handled later if necessary */
if (migratetype == MIGRATE_RESERVE)
continue;
- /*
- * Make it hard to fallback to blocks used for
- * high-order atomic allocations
- */
- if (migratetype == MIGRATE_HIGHATOMIC &&
- start_migratetype != MIGRATE_UNMOVABLE &&
- !nonatomic_fallback_atomic)
- continue;
area = &(zone->free_area[current_order]);
if (list_empty(&area->free_list[migratetype]))
@@ -859,8 +843,7 @@ retry:
start_migratetype);
/* Claim the whole block if over half of it is free */
- if ((pages << current_order) >= (1 << (MAX_ORDER-2)) &&
- migratetype != MIGRATE_HIGHATOMIC)
+ if ((pages << current_order) >= (1 << (MAX_ORDER-2)))
set_pageblock_migratetype(page,
start_migratetype);
@@ -882,12 +865,6 @@ retry:
}
}
- /* Allow fallback to high-order atomic blocks if memory is that low */
- if (!nonatomic_fallback_atomic) {
- nonatomic_fallback_atomic = 1;
- goto retry;
- }
-
/* Use MIGRATE_RESERVE rather than fail an allocation */
return __rmqueue_smallest(zone, order, MIGRATE_RESERVE);
}