* [PATCH 0/4] enhance compaction success rate
@ 2014-12-08 7:16 Joonsoo Kim
2014-12-08 7:16 ` [PATCH 1/4] mm/compaction: fix wrong order check in compact_finished() Joonsoo Kim
` (4 more replies)
0 siblings, 5 replies; 20+ messages in thread
From: Joonsoo Kim @ 2014-12-08 7:16 UTC (permalink / raw)
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel, Joonsoo Kim
This patchset aims to increase the compaction success rate. The changes are
related to the compaction finish condition and the freepage isolation condition.
With these changes, I ran the stress-highalloc test in mmtests with a non-movable
order 7 allocation configuration, and the allocation success rates (%) at phase 1 are,
Base Patch-1 Patch-3 Patch-4
55.00 57.00 62.67 64.00
And the compaction success rates (%) on the same test are,
Base Patch-1 Patch-3 Patch-4
18.47 28.94 35.13 41.50
This patchset is based on my tracepoint update on compaction.
https://lkml.org/lkml/2014/12/3/71
Joonsoo Kim (4):
mm/compaction: fix wrong order check in compact_finished()
mm/page_alloc: expands broken freepage to proper buddy list when
steal
mm/compaction: enhance compaction finish condition
mm/compaction: stop the isolation when we isolate enough freepage
include/linux/mmzone.h | 3 ++
include/trace/events/kmem.h | 7 +++--
mm/compaction.c | 48 ++++++++++++++++++++++------
mm/internal.h | 1 +
mm/page_alloc.c | 73 +++++++++++++++++++++++++------------------
5 files changed, 89 insertions(+), 43 deletions(-)
--
1.7.9.5
* [PATCH 1/4] mm/compaction: fix wrong order check in compact_finished()
2014-12-08 7:16 [PATCH 0/4] enhance compaction success rate Joonsoo Kim
@ 2014-12-08 7:16 ` Joonsoo Kim
2014-12-08 9:06 ` Vlastimil Babka
2014-12-08 7:16 ` [PATCH 2/4] mm/page_alloc: expands broken freepage to proper buddy list when steal Joonsoo Kim
` (3 subsequent siblings)
4 siblings, 1 reply; 20+ messages in thread
From: Joonsoo Kim @ 2014-12-08 7:16 UTC (permalink / raw)
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel, Joonsoo Kim, stable
What we want to check here is whether there is a high-order freepage in the
buddy list of another migratetype, so that we can steal it without causing
fragmentation. But the current code just checks cc->order, which is the
allocation request order, so the check is wrong.
Without this fix, non-movable synchronous compaction below pageblock order
would not stop until compaction is complete, because the migratetype of most
pageblocks is movable and the high-order freepages made by compaction usually
end up on the movable buddy list.
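For reference, this check sits in a loop over orders starting from cc->order,
roughly like the following (simplified, not a verbatim copy of the source):

	for (order = cc->order; order < MAX_ORDER; order++) {
		struct free_area *area = &zone->free_area[order];

		/* Job done if page is free of the right migratetype */
		if (!list_empty(&area->free_list[migratetype]))
			return COMPACT_PARTIAL;

		/*
		 * With cc->order here the condition never becomes true for
		 * a request below pageblock_order; with the loop variable
		 * 'order' it triggers once the scan reaches pageblock_order.
		 */
		if (cc->order >= pageblock_order && area->nr_free)
			return COMPACT_PARTIAL;
	}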
There is a report related to this bug; see the link below.
http://www.spinics.net/lists/linux-mm/msg81666.html
Although the affected system still has load spikes coming from compaction,
this change makes that system completely stable and responsive according to
the reporter.
The stress-highalloc test in mmtests with non-movable order 7 allocation doesn't
show any notable difference in allocation success rate, but it shows a higher
compaction success rate and reduced elapsed time.
Compaction success rate (Compaction success * 100 / Compaction stalls, %)
18.47 : 28.94
Elapsed time (sec)
1429 : 1411
Cc: <stable@vger.kernel.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
mm/compaction.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index c9ee464..1a5f465 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1105,7 +1105,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
return COMPACT_PARTIAL;
/* Job done if allocation would set block type */
- if (cc->order >= pageblock_order && area->nr_free)
+ if (order >= pageblock_order && area->nr_free)
return COMPACT_PARTIAL;
}
--
1.7.9.5
* [PATCH 2/4] mm/page_alloc: expands broken freepage to proper buddy list when steal
2014-12-08 7:16 [PATCH 0/4] enhance compaction success rate Joonsoo Kim
2014-12-08 7:16 ` [PATCH 1/4] mm/compaction: fix wrong order check in compact_finished() Joonsoo Kim
@ 2014-12-08 7:16 ` Joonsoo Kim
2014-12-08 9:29 ` Vlastimil Babka
2014-12-08 7:16 ` [PATCH 3/4] mm/compaction: enhance compaction finish condition Joonsoo Kim
` (2 subsequent siblings)
4 siblings, 1 reply; 20+ messages in thread
From: Joonsoo Kim @ 2014-12-08 7:16 UTC (permalink / raw)
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel, Joonsoo Kim
There is odd behaviour when we steal freepages from another migratetype's
buddy list. In try_to_steal_freepages(), we move all freepages in the
pageblock that the found freepage belongs to to the requested migratetype,
in order to mitigate fragmentation. If the number of moved pages is enough
to change the pageblock migratetype, there is no problem. If it is not
enough, we don't change the pageblock migratetype, and we add the broken
freepages to the original migratetype's buddy list rather than to the
requested migratetype's one. To me, this is odd, because we already moved
all the freepages in this pageblock to the requested migratetype. This patch
fixes this situation by adding the broken freepages to the requested
migratetype's buddy list in this case.
This patch also introduces a new function that helps to decide whether we can
steal a page without resulting in fragmentation. It will be used in a
following patch for the compaction finish criteria.
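With this change, the fallback path in __rmqueue_fallback() becomes roughly
the following (condensed from the diff below, not a verbatim copy):

	if (can_steal_freepages(current_order, start_migratetype, migratetype)) {
		/* Move the block's free pages and possibly change its type. */
		try_to_steal_freepages(zone, page, start_migratetype);
		/*
		 * The freepages split off from this page now always go to
		 * the requested type's buddy list, even when the pageblock
		 * type itself was not changed.
		 */
		new_type = start_migratetype;
	}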
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
include/trace/events/kmem.h | 7 +++--
mm/page_alloc.c | 72 +++++++++++++++++++++++++------------------
2 files changed, 46 insertions(+), 33 deletions(-)
diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index aece134..4ad10ba 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -268,11 +268,11 @@ TRACE_EVENT(mm_page_alloc_extfrag,
TP_PROTO(struct page *page,
int alloc_order, int fallback_order,
- int alloc_migratetype, int fallback_migratetype, int new_migratetype),
+ int alloc_migratetype, int fallback_migratetype),
TP_ARGS(page,
alloc_order, fallback_order,
- alloc_migratetype, fallback_migratetype, new_migratetype),
+ alloc_migratetype, fallback_migratetype),
TP_STRUCT__entry(
__field( struct page *, page )
@@ -289,7 +289,8 @@ TRACE_EVENT(mm_page_alloc_extfrag,
__entry->fallback_order = fallback_order;
__entry->alloc_migratetype = alloc_migratetype;
__entry->fallback_migratetype = fallback_migratetype;
- __entry->change_ownership = (new_migratetype == alloc_migratetype);
+ __entry->change_ownership = (alloc_migratetype ==
+ get_pageblock_migratetype(page));
),
TP_printk("page=%p pfn=%lu alloc_order=%d fallback_order=%d pageblock_order=%d alloc_migratetype=%d fallback_migratetype=%d fragmenting=%d change_ownership=%d",
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7c46d0f..7b4c9aa 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1139,44 +1139,50 @@ static void change_pageblock_range(struct page *pageblock_page,
* Returns the new migratetype of the pageblock (or the same old migratetype
* if it was unchanged).
*/
-static int try_to_steal_freepages(struct zone *zone, struct page *page,
- int start_type, int fallback_type)
+static void try_to_steal_freepages(struct zone *zone, struct page *page,
+ int target_mt)
{
+ int pages;
int current_order = page_order(page);
- /*
- * When borrowing from MIGRATE_CMA, we need to release the excess
- * buddy pages to CMA itself. We also ensure the freepage_migratetype
- * is set to CMA so it is returned to the correct freelist in case
- * the page ends up being not actually allocated from the pcp lists.
- */
- if (is_migrate_cma(fallback_type))
- return fallback_type;
-
/* Take ownership for orders >= pageblock_order */
if (current_order >= pageblock_order) {
- change_pageblock_range(page, current_order, start_type);
- return start_type;
+ change_pageblock_range(page, current_order, target_mt);
+ return;
}
- if (current_order >= pageblock_order / 2 ||
- start_type == MIGRATE_RECLAIMABLE ||
- page_group_by_mobility_disabled) {
- int pages;
+ pages = move_freepages_block(zone, page, target_mt);
- pages = move_freepages_block(zone, page, start_type);
+ /* Claim the whole block if over half of it is free */
+ if (pages >= (1 << (pageblock_order-1)) ||
+ page_group_by_mobility_disabled) {
- /* Claim the whole block if over half of it is free */
- if (pages >= (1 << (pageblock_order-1)) ||
- page_group_by_mobility_disabled) {
+ set_pageblock_migratetype(page, target_mt);
+ }
+}
- set_pageblock_migratetype(page, start_type);
- return start_type;
- }
+static bool can_steal_freepages(unsigned int order,
+ int start_mt, int fallback_mt)
+{
+ /*
+ * When borrowing from MIGRATE_CMA, we need to release the excess
+ * buddy pages to CMA itself. We also ensure the freepage_migratetype
+ * is set to CMA so it is returned to the correct freelist in case
+ * the page ends up being not actually allocated from the pcp lists.
+ */
+ if (is_migrate_cma(fallback_mt))
+ return false;
- }
+ /* Can take ownership for orders >= pageblock_order */
+ if (order >= pageblock_order)
+ return true;
+
+ if (order >= pageblock_order / 2 ||
+ start_mt == MIGRATE_RECLAIMABLE ||
+ page_group_by_mobility_disabled)
+ return true;
- return fallback_type;
+ return false;
}
/* Remove an element from the buddy allocator from the fallback list */
@@ -1187,6 +1193,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
unsigned int current_order;
struct page *page;
int migratetype, new_type, i;
+ bool can_steal;
/* Find the largest possible block of pages in the other list */
for (current_order = MAX_ORDER-1;
@@ -1194,6 +1201,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
--current_order) {
for (i = 0;; i++) {
migratetype = fallbacks[start_migratetype][i];
+ new_type = migratetype;
/* MIGRATE_RESERVE handled later if necessary */
if (migratetype == MIGRATE_RESERVE)
@@ -1207,9 +1215,13 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
struct page, lru);
area->nr_free--;
- new_type = try_to_steal_freepages(zone, page,
- start_migratetype,
- migratetype);
+ can_steal = can_steal_freepages(current_order,
+ start_migratetype, migratetype);
+ if (can_steal) {
+ new_type = start_migratetype;
+ try_to_steal_freepages(zone, page,
+ start_migratetype);
+ }
/* Remove the page from the freelists */
list_del(&page->lru);
@@ -1225,7 +1237,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
set_freepage_migratetype(page, new_type);
trace_mm_page_alloc_extfrag(page, order, current_order,
- start_migratetype, migratetype, new_type);
+ start_migratetype, migratetype);
return page;
}
--
1.7.9.5
* [PATCH 3/4] mm/compaction: enhance compaction finish condition
2014-12-08 7:16 [PATCH 0/4] enhance compaction success rate Joonsoo Kim
2014-12-08 7:16 ` [PATCH 1/4] mm/compaction: fix wrong order check in compact_finished() Joonsoo Kim
2014-12-08 7:16 ` [PATCH 2/4] mm/page_alloc: expands broken freepage to proper buddy list when steal Joonsoo Kim
@ 2014-12-08 7:16 ` Joonsoo Kim
2014-12-08 9:34 ` Vlastimil Babka
2014-12-08 7:16 ` [PATCH 4/4] mm/compaction: stop the isolation when we isolate enough freepage Joonsoo Kim
2014-12-08 9:16 ` [PATCH 0/4] enhance compaction success rate Vlastimil Babka
4 siblings, 1 reply; 20+ messages in thread
From: Joonsoo Kim @ 2014-12-08 7:16 UTC (permalink / raw)
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel, Joonsoo Kim
Compaction has an anti-fragmentation algorithm: if we don't find any freepage
in the requested migratetype's buddy list, a freepage must be at least
pageblock order for compaction to finish. This is meant to mitigate
fragmentation, but it lacks migratetype consideration and is too strict.
First, it doesn't consider migratetype, so there can be false positives in
the compaction finish decision. For example, if the allocation request is
for the unmovable migratetype, a freepage in the CMA migratetype doesn't help
that allocation, so compaction should not be stopped. But the current logic
treats this as compaction no longer being needed and stops the compaction.
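Concretely, the existing check is the following (illustrative comment added):

	/* Job done if allocation would set block type */
	if (order >= pageblock_order && area->nr_free)
		return COMPACT_PARTIAL;
	/*
	 * area->nr_free counts free pages of every migratetype at this
	 * order, including MIGRATE_CMA ones that an unmovable request
	 * can neither use nor steal, so compaction may stop too early.
	 */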
Secondly, it is too strict. We can steal freepages from other migratetypes
and change the pageblock migratetype under more relaxed conditions. In the
page allocator, there are other conditions under which stealing can succeed
without introducing fragmentation.
To solve these problems, this patch borrows the anti-fragmentation logic from
the page allocator. It will reduce premature compaction finishes in some cases
and reduce excessive compaction work.
The stress-highalloc test in mmtests with non-movable order 7 allocation shows
improvements in the allocation success rate at phase 1 and in the compaction
success rate.
Allocation success rate on phase 1 (%)
57.00 : 63.67
Compaction success rate (Compaction success * 100 / Compaction stalls, %)
28.94 : 35.13
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
include/linux/mmzone.h | 3 +++
mm/compaction.c | 31 +++++++++++++++++++++++++++++--
mm/internal.h | 1 +
mm/page_alloc.c | 5 ++---
4 files changed, 35 insertions(+), 5 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 2f0856d..87f5bb5 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -63,6 +63,9 @@ enum {
MIGRATE_TYPES
};
+#define FALLBACK_MIGRATETYPES (4)
+extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES];
+
#ifdef CONFIG_CMA
# define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA)
#else
diff --git a/mm/compaction.c b/mm/compaction.c
index 1a5f465..2fd5f79 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1054,6 +1054,30 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
}
+static bool can_steal_fallbacks(struct free_area *area,
+ unsigned int order, int migratetype)
+{
+ int i;
+ int fallback_mt;
+
+ if (area->nr_free == 0)
+ return false;
+
+ for (i = 0; i < FALLBACK_MIGRATETYPES; i++) {
+ fallback_mt = fallbacks[migratetype][i];
+ if (fallback_mt == MIGRATE_RESERVE)
+ break;
+
+ if (list_empty(&area->free_list[fallback_mt]))
+ continue;
+
+ if (can_steal_freepages(order, migratetype, fallback_mt))
+ return true;
+ }
+
+ return false;
+}
+
static int __compact_finished(struct zone *zone, struct compact_control *cc,
const int migratetype)
{
@@ -1104,8 +1128,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
if (!list_empty(&area->free_list[migratetype]))
return COMPACT_PARTIAL;
- /* Job done if allocation would set block type */
- if (order >= pageblock_order && area->nr_free)
+ /*
+ * Job done if allocation would steal freepages from
+ * other migratetype buddy lists.
+ */
+ if (can_steal_fallbacks(area, order, migratetype))
return COMPACT_PARTIAL;
}
diff --git a/mm/internal.h b/mm/internal.h
index efad241..7028d83 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -179,6 +179,7 @@ unsigned long
isolate_migratepages_range(struct compact_control *cc,
unsigned long low_pfn, unsigned long end_pfn);
+bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt);
#endif
/*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7b4c9aa..dcb8523 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1031,7 +1031,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
* This array describes the order lists are fallen back to when
* the free lists for the desirable migrate type are depleted
*/
-static int fallbacks[MIGRATE_TYPES][4] = {
+int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = {
[MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
#ifdef CONFIG_CMA
@@ -1161,8 +1161,7 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page,
}
}
-static bool can_steal_freepages(unsigned int order,
- int start_mt, int fallback_mt)
+bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt)
{
/*
* When borrowing from MIGRATE_CMA, we need to release the excess
--
1.7.9.5
* [PATCH 4/4] mm/compaction: stop the isolation when we isolate enough freepage
2014-12-08 7:16 [PATCH 0/4] enhance compaction success rate Joonsoo Kim
` (2 preceding siblings ...)
2014-12-08 7:16 ` [PATCH 3/4] mm/compaction: enhance compaction finish condition Joonsoo Kim
@ 2014-12-08 7:16 ` Joonsoo Kim
2014-12-08 9:59 ` Vlastimil Babka
2014-12-08 9:16 ` [PATCH 0/4] enhance compaction success rate Vlastimil Babka
4 siblings, 1 reply; 20+ messages in thread
From: Joonsoo Kim @ 2014-12-08 7:16 UTC (permalink / raw)
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel, Joonsoo Kim
From: Joonsoo Kim <js1304@gmail.com>
Currently, freepage isolation within one pageblock doesn't consider how many
freepages we have already isolated. When I traced the flow of compaction,
compaction sometimes isolated more than 256 freepages to migrate just 32 pages.
With this patch, freepage isolation stops at the point where we have isolated
more freepages than pages isolated for migration. This slows down the free
page scanner and makes the compaction success rate higher.
The stress-highalloc test in mmtests with non-movable order 7 allocation shows
an increase in the compaction success rate and a slight improvement in the
allocation success rate.
Allocation success rate on phase 1 (%)
62.70 : 64.00
Compaction success rate (Compaction success * 100 / Compaction stalls, %)
35.13 : 41.50
pfn where both scanners meet on compaction completion
(separate test due to enormous tracepoint buffer)
(zone_start=4096, zone_end=1048576)
586034 : 654378
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
---
mm/compaction.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 2fd5f79..12223b9 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -422,6 +422,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
/* If a page was split, advance to the end of it */
if (isolated) {
+ cc->nr_freepages += isolated;
+ if (!strict &&
+ cc->nr_migratepages <= cc->nr_freepages) {
+ blockpfn += isolated;
+ break;
+ }
+
blockpfn += isolated - 1;
cursor += isolated - 1;
continue;
@@ -831,7 +838,6 @@ static void isolate_freepages(struct compact_control *cc)
unsigned long isolate_start_pfn; /* exact pfn we start at */
unsigned long block_end_pfn; /* end of current pageblock */
unsigned long low_pfn; /* lowest pfn scanner is able to scan */
- int nr_freepages = cc->nr_freepages;
struct list_head *freelist = &cc->freepages;
/*
@@ -856,11 +862,11 @@ static void isolate_freepages(struct compact_control *cc)
* pages on cc->migratepages. We stop searching if the migrate
* and free page scanners meet or enough free pages are isolated.
*/
- for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages;
+ for (; block_start_pfn >= low_pfn &&
+ cc->nr_migratepages > cc->nr_freepages;
block_end_pfn = block_start_pfn,
block_start_pfn -= pageblock_nr_pages,
isolate_start_pfn = block_start_pfn) {
- unsigned long isolated;
/*
* This can iterate a massively long zone without finding any
@@ -885,9 +891,8 @@ static void isolate_freepages(struct compact_control *cc)
continue;
/* Found a block suitable for isolating free pages from. */
- isolated = isolate_freepages_block(cc, &isolate_start_pfn,
+ isolate_freepages_block(cc, &isolate_start_pfn,
block_end_pfn, freelist, false);
- nr_freepages += isolated;
/*
* Remember where the free scanner should restart next time,
@@ -919,8 +924,6 @@ static void isolate_freepages(struct compact_control *cc)
*/
if (block_start_pfn < low_pfn)
cc->free_pfn = cc->migrate_pfn;
-
- cc->nr_freepages = nr_freepages;
}
/*
--
1.7.9.5
* Re: [PATCH 1/4] mm/compaction: fix wrong order check in compact_finished()
2014-12-08 7:16 ` [PATCH 1/4] mm/compaction: fix wrong order check in compact_finished() Joonsoo Kim
@ 2014-12-08 9:06 ` Vlastimil Babka
0 siblings, 0 replies; 20+ messages in thread
From: Vlastimil Babka @ 2014-12-08 9:06 UTC (permalink / raw)
To: Joonsoo Kim, Andrew Morton
Cc: Mel Gorman, David Rientjes, Rik van Riel, linux-mm, linux-kernel, stable
On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
> What we want to check here is whether there is highorder freepage
> in buddy list of other migratetype in order to steal it without
> fragmentation. But, current code just checks cc->order which means
> allocation request order. So, this is wrong.
>
> Without this fix, non-movable synchronous compaction below pageblock order
> would not stopped until compaction is complete, because migratetype of most
> pageblocks are movable and high order freepage made by compaction is usually
> on movable type buddy list.
>
> There is some report related to this bug. See below link.
>
> http://www.spinics.net/lists/linux-mm/msg81666.html
>
> Although the issued system still has load spike comes from compaction,
> this makes that system completely stable and responsive according to
> his report.
>
> stress-highalloc test in mmtests with non movable order 7 allocation doesn't
> show any notable difference in allocation success rate, but, it shows more
> compaction success rate and reduced elapsed time.
>
> Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> 18.47 : 28.94
>
> Elapsed time (sec)
> 1429 : 1411
>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> mm/compaction.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index c9ee464..1a5f465 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1105,7 +1105,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
> return COMPACT_PARTIAL;
>
> /* Job done if allocation would set block type */
> - if (cc->order >= pageblock_order && area->nr_free)
> + if (order >= pageblock_order && area->nr_free)
> return COMPACT_PARTIAL;
> }
>
>
* Re: [PATCH 0/4] enhance compaction success rate
2014-12-08 7:16 [PATCH 0/4] enhance compaction success rate Joonsoo Kim
` (3 preceding siblings ...)
2014-12-08 7:16 ` [PATCH 4/4] mm/compaction: stop the isolation when we isolate enough freepage Joonsoo Kim
@ 2014-12-08 9:16 ` Vlastimil Babka
2014-12-10 6:36 ` Joonsoo Kim
4 siblings, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2014-12-08 9:16 UTC (permalink / raw)
To: Joonsoo Kim, Andrew Morton
Cc: Mel Gorman, David Rientjes, Rik van Riel, linux-mm, linux-kernel
On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
> This patchset aims at increase of compaction success rate. Changes are
> related to compaction finish condition and freepage isolation condition.
>
> From these changes, I did stress highalloc test in mmtests with nonmovable
> order 7 allocation configuration, and success rate (%) at phase 1 are,
>
> Base Patch-1 Patch-3 Patch-4
> 55.00 57.00 62.67 64.00
>
> And, compaction success rate (%) on same test are,
>
> Base Patch-1 Patch-3 Patch-4
> 18.47 28.94 35.13 41.50
Did you test Patch-2 separately? Any difference to Patch 1?
> This patchset is based on my tracepoint update on compaction.
>
> https://lkml.org/lkml/2014/12/3/71
>
> Joonsoo Kim (4):
> mm/compaction: fix wrong order check in compact_finished()
> mm/page_alloc: expands broken freepage to proper buddy list when
> steal
> mm/compaction: enhance compaction finish condition
> mm/compaction: stop the isolation when we isolate enough freepage
>
> include/linux/mmzone.h | 3 ++
> include/trace/events/kmem.h | 7 +++--
> mm/compaction.c | 48 ++++++++++++++++++++++------
> mm/internal.h | 1 +
> mm/page_alloc.c | 73 +++++++++++++++++++++++++------------------
> 5 files changed, 89 insertions(+), 43 deletions(-)
>
* Re: [PATCH 2/4] mm/page_alloc: expands broken freepage to proper buddy list when steal
2014-12-08 7:16 ` [PATCH 2/4] mm/page_alloc: expands broken freepage to proper buddy list when steal Joonsoo Kim
@ 2014-12-08 9:29 ` Vlastimil Babka
2014-12-10 6:38 ` Joonsoo Kim
0 siblings, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2014-12-08 9:29 UTC (permalink / raw)
To: Joonsoo Kim, Andrew Morton
Cc: Mel Gorman, David Rientjes, Rik van Riel, linux-mm, linux-kernel
On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
> There is odd behaviour when we steal freepages from other migratetype
> buddy list. In try_to_steal_freepages(), we move all freepages in
> the pageblock that founded freepage is belong to to the request
> migratetype in order to mitigate fragmentation. If the number of moved
> pages are enough to change pageblock migratetype, there is no problem. If
> not enough, we don't change pageblock migratetype and add broken freepages
> to the original migratetype buddy list rather than request migratetype
> one. For me, this is odd, because we already moved all freepages in this
> pageblock to the request migratetype. This patch fixes this situation to
> add broken freepages to the request migratetype buddy list in this case.
I'd rather split the fix from the refactoring. And maybe my description
is longer, but easier to understand? (I guess somebody else should judge
this)
> This patch introduce new function that can help to decide if we can
> steal the page without resulting in fragmentation. It will be used in
> following patch for compaction finish criteria.
>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
> include/trace/events/kmem.h | 7 +++--
> mm/page_alloc.c | 72 +++++++++++++++++++++++++------------------
> 2 files changed, 46 insertions(+), 33 deletions(-)
>
> diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
> index aece134..4ad10ba 100644
> --- a/include/trace/events/kmem.h
> +++ b/include/trace/events/kmem.h
> @@ -268,11 +268,11 @@ TRACE_EVENT(mm_page_alloc_extfrag,
>
> TP_PROTO(struct page *page,
> int alloc_order, int fallback_order,
> - int alloc_migratetype, int fallback_migratetype, int new_migratetype),
> + int alloc_migratetype, int fallback_migratetype),
>
> TP_ARGS(page,
> alloc_order, fallback_order,
> - alloc_migratetype, fallback_migratetype, new_migratetype),
> + alloc_migratetype, fallback_migratetype),
>
> TP_STRUCT__entry(
> __field( struct page *, page )
> @@ -289,7 +289,8 @@ TRACE_EVENT(mm_page_alloc_extfrag,
> __entry->fallback_order = fallback_order;
> __entry->alloc_migratetype = alloc_migratetype;
> __entry->fallback_migratetype = fallback_migratetype;
> - __entry->change_ownership = (new_migratetype == alloc_migratetype);
> + __entry->change_ownership = (alloc_migratetype ==
> + get_pageblock_migratetype(page));
> ),
>
> TP_printk("page=%p pfn=%lu alloc_order=%d fallback_order=%d pageblock_order=%d alloc_migratetype=%d fallback_migratetype=%d fragmenting=%d change_ownership=%d",
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 7c46d0f..7b4c9aa 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1139,44 +1139,50 @@ static void change_pageblock_range(struct page *pageblock_page,
> * Returns the new migratetype of the pageblock (or the same old migratetype
> * if it was unchanged).
> */
> -static int try_to_steal_freepages(struct zone *zone, struct page *page,
> - int start_type, int fallback_type)
> +static void try_to_steal_freepages(struct zone *zone, struct page *page,
> + int target_mt)
> {
> + int pages;
> int current_order = page_order(page);
>
> - /*
> - * When borrowing from MIGRATE_CMA, we need to release the excess
> - * buddy pages to CMA itself. We also ensure the freepage_migratetype
> - * is set to CMA so it is returned to the correct freelist in case
> - * the page ends up being not actually allocated from the pcp lists.
> - */
> - if (is_migrate_cma(fallback_type))
> - return fallback_type;
> -
> /* Take ownership for orders >= pageblock_order */
> if (current_order >= pageblock_order) {
> - change_pageblock_range(page, current_order, start_type);
> - return start_type;
> + change_pageblock_range(page, current_order, target_mt);
> + return;
So here's a (current_order >= pageblock_order) check.
> }
>
> - if (current_order >= pageblock_order / 2 ||
> - start_type == MIGRATE_RECLAIMABLE ||
> - page_group_by_mobility_disabled) {
> - int pages;
> + pages = move_freepages_block(zone, page, target_mt);
>
> - pages = move_freepages_block(zone, page, start_type);
> + /* Claim the whole block if over half of it is free */
> + if (pages >= (1 << (pageblock_order-1)) ||
> + page_group_by_mobility_disabled) {
>
> - /* Claim the whole block if over half of it is free */
> - if (pages >= (1 << (pageblock_order-1)) ||
> - page_group_by_mobility_disabled) {
> + set_pageblock_migratetype(page, target_mt);
> + }
> +}
>
> - set_pageblock_migratetype(page, start_type);
> - return start_type;
> - }
> +static bool can_steal_freepages(unsigned int order,
> + int start_mt, int fallback_mt)
> +{
> + /*
> + * When borrowing from MIGRATE_CMA, we need to release the excess
> + * buddy pages to CMA itself. We also ensure the freepage_migratetype
> + * is set to CMA so it is returned to the correct freelist in case
> + * the page ends up being not actually allocated from the pcp lists.
> + */
> + if (is_migrate_cma(fallback_mt))
> + return false;
>
> - }
> + /* Can take ownership for orders >= pageblock_order */
> + if (order >= pageblock_order)
> + return true;
And another check.
> +
> + if (order >= pageblock_order / 2 ||
> + start_mt == MIGRATE_RECLAIMABLE ||
> + page_group_by_mobility_disabled)
> + return true;
>
> - return fallback_type;
> + return false;
> }
>
> /* Remove an element from the buddy allocator from the fallback list */
> @@ -1187,6 +1193,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
> unsigned int current_order;
> struct page *page;
> int migratetype, new_type, i;
> + bool can_steal;
>
> /* Find the largest possible block of pages in the other list */
> for (current_order = MAX_ORDER-1;
> @@ -1194,6 +1201,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
> --current_order) {
> for (i = 0;; i++) {
> migratetype = fallbacks[start_migratetype][i];
> + new_type = migratetype;
>
> /* MIGRATE_RESERVE handled later if necessary */
> if (migratetype == MIGRATE_RESERVE)
> @@ -1207,9 +1215,13 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
> struct page, lru);
> area->nr_free--;
So wouldn't it be better to handle the "order >= pageblock_order" case
separately at this level? I think it would be better for the compaction case
as well (I'll explain why in my comment on the later patch).
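Something along these lines, just to sketch the idea (untested, using the
names from your patch):

		if (current_order >= pageblock_order) {
			/* Take ownership of the whole range at once. */
			change_pageblock_range(page, current_order,
						start_migratetype);
			new_type = start_migratetype;
		} else if (can_steal_freepages(current_order,
					start_migratetype, migratetype)) {
			try_to_steal_freepages(zone, page, start_migratetype);
			new_type = start_migratetype;
		}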
> - new_type = try_to_steal_freepages(zone, page,
> - start_migratetype,
> - migratetype);
> + can_steal = can_steal_freepages(current_order,
> + start_migratetype, migratetype);
> + if (can_steal) {
> + new_type = start_migratetype;
> + try_to_steal_freepages(zone, page,
> + start_migratetype);
> + }
>
> /* Remove the page from the freelists */
> list_del(&page->lru);
> @@ -1225,7 +1237,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
> set_freepage_migratetype(page, new_type);
>
> trace_mm_page_alloc_extfrag(page, order, current_order,
> - start_migratetype, migratetype, new_type);
> + start_migratetype, migratetype);
>
> return page;
> }
>
* Re: [PATCH 3/4] mm/compaction: enhance compaction finish condition
2014-12-08 7:16 ` [PATCH 3/4] mm/compaction: enhance compaction finish condition Joonsoo Kim
@ 2014-12-08 9:34 ` Vlastimil Babka
2014-12-10 6:46 ` Joonsoo Kim
0 siblings, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2014-12-08 9:34 UTC (permalink / raw)
To: Joonsoo Kim, Andrew Morton
Cc: Mel Gorman, David Rientjes, Rik van Riel, linux-mm, linux-kernel
On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
> Compaction has anti fragmentation algorithm. It is that freepage
> should be more than pageblock order to finish the compaction if we don't
> find any freepage in requested migratetype buddy list. This is for
> mitigating fragmentation, but, it is a lack of migratetype consideration
> and too excessive.
>
> At first, it doesn't consider migratetype so there would be false positive
> on compaction finish decision. For example, if allocation request is
> for unmovable migratetype, freepage in CMA migratetype doesn't help that
> allocation, so compaction should not be stopped. But, current logic
> considers it as compaction is no longer needed and stop the compaction.
>
> Secondly, it is too excessive. We can steal freepage from other migratetype
> and change pageblock migratetype on more relaxed conditions. In page
> allocator, there is another conditions that can succeed to steal without
> introducing fragmentation.
>
> To solve these problems, this patch borrows anti fragmentation logic from
> page allocator. It will reduce premature compaction finish in some cases
> and reduce excessive compaction work.
>
> stress-highalloc test in mmtests with non movable order 7 allocation shows
> in allocation success rate on phase 1 and compaction success rate.
>
> Allocation success rate on phase 1 (%)
> 57.00 : 63.67
>
> Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> 28.94 : 35.13
>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
> include/linux/mmzone.h | 3 +++
> mm/compaction.c | 31 +++++++++++++++++++++++++++++--
> mm/internal.h | 1 +
> mm/page_alloc.c | 5 ++---
> 4 files changed, 35 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 2f0856d..87f5bb5 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -63,6 +63,9 @@ enum {
> MIGRATE_TYPES
> };
>
> +#define FALLBACK_MIGRATETYPES (4)
> +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES];
> +
> #ifdef CONFIG_CMA
> # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA)
> #else
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 1a5f465..2fd5f79 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1054,6 +1054,30 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
> return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
> }
>
> +static bool can_steal_fallbacks(struct free_area *area,
> + unsigned int order, int migratetype)
> +{
> + int i;
> + int fallback_mt;
> +
> + if (area->nr_free == 0)
> + return false;
> +
> + for (i = 0; i < FALLBACK_MIGRATETYPES; i++) {
> + fallback_mt = fallbacks[migratetype][i];
> + if (fallback_mt == MIGRATE_RESERVE)
> + break;
> +
> + if (list_empty(&area->free_list[fallback_mt]))
> + continue;
> +
> + if (can_steal_freepages(order, migratetype, fallback_mt))
> + return true;
> + }
> +
> + return false;
> +}
> +
> static int __compact_finished(struct zone *zone, struct compact_control *cc,
> const int migratetype)
> {
> @@ -1104,8 +1128,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
> if (!list_empty(&area->free_list[migratetype]))
> return COMPACT_PARTIAL;
>
> - /* Job done if allocation would set block type */
> - if (order >= pageblock_order && area->nr_free)
So, can_steal_fallbacks() -> can_steal_freepages() is quite an involved path
if in the end we just realize that order >= pageblock_order and we are
stealing the whole pageblock. Given that compaction is often done for THP,
it would be better to check order >= pageblock_order and handle it
upfront. This goes together with my comments on the previous patch that
order >= pageblock_order is better handled separately.
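I mean something like this (again only a sketch, not tested):

		/* Whole pageblock would change type - job done. */
		if (order >= pageblock_order && area->nr_free)
			return COMPACT_PARTIAL;

		if (can_steal_fallbacks(area, order, migratetype))
			return COMPACT_PARTIAL;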
> + /*
> + * Job done if allocation would steal freepages from
> + * other migratetype buddy lists.
> + */
> + if (can_steal_fallbacks(area, order, migratetype))
> return COMPACT_PARTIAL;
> }
>
> diff --git a/mm/internal.h b/mm/internal.h
> index efad241..7028d83 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -179,6 +179,7 @@ unsigned long
> isolate_migratepages_range(struct compact_control *cc,
> unsigned long low_pfn, unsigned long end_pfn);
>
> +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt);
> #endif
>
> /*
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 7b4c9aa..dcb8523 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1031,7 +1031,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
> * This array describes the order lists are fallen back to when
> * the free lists for the desirable migrate type are depleted
> */
> -static int fallbacks[MIGRATE_TYPES][4] = {
> +int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = {
> [MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
> [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
> #ifdef CONFIG_CMA
> @@ -1161,8 +1161,7 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page,
> }
> }
>
> -static bool can_steal_freepages(unsigned int order,
> - int start_mt, int fallback_mt)
> +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt)
> {
> /*
> * When borrowing from MIGRATE_CMA, we need to release the excess
>
* Re: [PATCH 4/4] mm/compaction: stop the isolation when we isolate enough freepage
2014-12-08 7:16 ` [PATCH 4/4] mm/compaction: stop the isolation when we isolate enough freepage Joonsoo Kim
@ 2014-12-08 9:59 ` Vlastimil Babka
2014-12-10 7:00 ` Joonsoo Kim
0 siblings, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2014-12-08 9:59 UTC (permalink / raw)
To: Joonsoo Kim, Andrew Morton
Cc: Mel Gorman, David Rientjes, Rik van Riel, linux-mm, linux-kernel,
Joonsoo Kim
On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
> From: Joonsoo Kim <js1304@gmail.com>
>
> Currently, freepage isolation in one pageblock doesn't consider how many
> freepages we isolate. When I traced flow of compaction, compaction
> sometimes isolates more than 256 freepages to migrate just 32 pages.
>
> In this patch, freepage isolation is stopped at the point that we
> have more isolated freepage than isolated page for migration. This
> results in slowing down free page scanner and make compaction success
> rate higher.
>
> stress-highalloc test in mmtests with non movable order 7 allocation shows
> increase of compaction success rate and slight improvement of allocation
> success rate.
>
> Allocation success rate on phase 1 (%)
> 62.70 : 64.00
>
> Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> 35.13 : 41.50
This is weird. I could maybe understand that isolating too many
freepages and then returning them is a waste of time if compaction
terminates immediately after the following migration (otherwise we would
keep those free pages for future migrations within the same compaction
run). And wasting time could reduce success rates for async compaction
terminating prematurely due to cond_resched(), but that should be all
the difference, unless there's another subtle bug, no?
> pfn where both scanners meets on compaction complete
> (separate test due to enormous tracepoint buffer)
> (zone_start=4096, zone_end=1048576)
> 586034 : 654378
The difference here suggests that there is indeed another subtle bug
related to where the free scanner restarts, and that we must be leaving the
excessively isolated (and then returned) freepages behind. Otherwise I
think the scanners should meet at the same place regardless of your patch.
> Signed-off-by: Joonsoo Kim <js1304@gmail.com>
> ---
> mm/compaction.c | 17 ++++++++++-------
> 1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 2fd5f79..12223b9 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -422,6 +422,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>
> /* If a page was split, advance to the end of it */
> if (isolated) {
> + cc->nr_freepages += isolated;
> + if (!strict &&
> + cc->nr_migratepages <= cc->nr_freepages) {
> + blockpfn += isolated;
> + break;
> + }
> +
> blockpfn += isolated - 1;
> cursor += isolated - 1;
> continue;
> @@ -831,7 +838,6 @@ static void isolate_freepages(struct compact_control *cc)
> unsigned long isolate_start_pfn; /* exact pfn we start at */
> unsigned long block_end_pfn; /* end of current pageblock */
> unsigned long low_pfn; /* lowest pfn scanner is able to scan */
> - int nr_freepages = cc->nr_freepages;
> struct list_head *freelist = &cc->freepages;
>
> /*
> @@ -856,11 +862,11 @@ static void isolate_freepages(struct compact_control *cc)
> * pages on cc->migratepages. We stop searching if the migrate
> * and free page scanners meet or enough free pages are isolated.
> */
> - for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages;
> + for (; block_start_pfn >= low_pfn &&
> + cc->nr_migratepages > cc->nr_freepages;
> block_end_pfn = block_start_pfn,
> block_start_pfn -= pageblock_nr_pages,
> isolate_start_pfn = block_start_pfn) {
> - unsigned long isolated;
>
> /*
> * This can iterate a massively long zone without finding any
> @@ -885,9 +891,8 @@ static void isolate_freepages(struct compact_control *cc)
> continue;
>
> /* Found a block suitable for isolating free pages from. */
> - isolated = isolate_freepages_block(cc, &isolate_start_pfn,
> + isolate_freepages_block(cc, &isolate_start_pfn,
> block_end_pfn, freelist, false);
> - nr_freepages += isolated;
>
> /*
> * Remember where the free scanner should restart next time,
> @@ -919,8 +924,6 @@ static void isolate_freepages(struct compact_control *cc)
> */
> if (block_start_pfn < low_pfn)
> cc->free_pfn = cc->migrate_pfn;
> -
> - cc->nr_freepages = nr_freepages;
> }
>
> /*
>
* Re: [PATCH 0/4] enhance compaction success rate
2014-12-08 9:16 ` [PATCH 0/4] enhance compaction success rate Vlastimil Babka
@ 2014-12-10 6:36 ` Joonsoo Kim
0 siblings, 0 replies; 20+ messages in thread
From: Joonsoo Kim @ 2014-12-10 6:36 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel
On Mon, Dec 08, 2014 at 10:16:34AM +0100, Vlastimil Babka wrote:
> On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
> >This patchset aims at increase of compaction success rate. Changes are
> >related to compaction finish condition and freepage isolation condition.
> >
> > From these changes, I did stress highalloc test in mmtests with nonmovable
> >order 7 allocation configuration, and success rate (%) at phase 1 are,
> >
> >Base Patch-1 Patch-3 Patch-4
> >55.00 57.00 62.67 64.00
> >
> >And, compaction success rate (%) on same test are,
> >
> >Base Patch-1 Patch-3 Patch-4
> >18.47 28.94 35.13 41.50
>
> Did you test Patch-2 separately? Any difference to Patch 1?
I didn't test it separately. I guess there is no remarkable difference,
because it only slightly changes the page stealing logic, not the
compaction logic. The compaction success rate would not be affected by
patch 2, but I will check it next time.
Thanks.
* Re: [PATCH 2/4] mm/page_alloc: expands broken freepage to proper buddy list when steal
2014-12-08 9:29 ` Vlastimil Babka
@ 2014-12-10 6:38 ` Joonsoo Kim
2014-12-10 9:55 ` Vlastimil Babka
2015-01-27 7:35 ` Vlastimil Babka
0 siblings, 2 replies; 20+ messages in thread
From: Joonsoo Kim @ 2014-12-10 6:38 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel
On Mon, Dec 08, 2014 at 10:29:44AM +0100, Vlastimil Babka wrote:
> On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
> >There is odd behaviour when we steal freepages from other migratetype
> >buddy list. In try_to_steal_freepages(), we move all freepages in
> >the pageblock that founded freepage is belong to to the request
> >migratetype in order to mitigate fragmentation. If the number of moved
> >pages are enough to change pageblock migratetype, there is no problem. If
> >not enough, we don't change pageblock migratetype and add broken freepages
> >to the original migratetype buddy list rather than request migratetype
> >one. For me, this is odd, because we already moved all freepages in this
> >pageblock to the request migratetype. This patch fixes this situation to
> >add broken freepages to the request migratetype buddy list in this case.
>
> I'd rather split the fix from the refactoring. And maybe my
> description is longer, but easier to understand? (I guess somebody
> else should judge this)
Your patch is much easier to understand than mine. :)
No need for somebody else to judge.
After your patch is merged, I will resubmit these on top of it.
>
> >This patch introduce new function that can help to decide if we can
> >steal the page without resulting in fragmentation. It will be used in
> >following patch for compaction finish criteria.
> >
> >Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >---
> > include/trace/events/kmem.h | 7 +++--
> > mm/page_alloc.c | 72 +++++++++++++++++++++++++------------------
> > 2 files changed, 46 insertions(+), 33 deletions(-)
> >
> >diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
> >index aece134..4ad10ba 100644
> >--- a/include/trace/events/kmem.h
> >+++ b/include/trace/events/kmem.h
> >@@ -268,11 +268,11 @@ TRACE_EVENT(mm_page_alloc_extfrag,
> >
> > TP_PROTO(struct page *page,
> > int alloc_order, int fallback_order,
> >- int alloc_migratetype, int fallback_migratetype, int new_migratetype),
> >+ int alloc_migratetype, int fallback_migratetype),
> >
> > TP_ARGS(page,
> > alloc_order, fallback_order,
> >- alloc_migratetype, fallback_migratetype, new_migratetype),
> >+ alloc_migratetype, fallback_migratetype),
> >
> > TP_STRUCT__entry(
> > __field( struct page *, page )
> >@@ -289,7 +289,8 @@ TRACE_EVENT(mm_page_alloc_extfrag,
> > __entry->fallback_order = fallback_order;
> > __entry->alloc_migratetype = alloc_migratetype;
> > __entry->fallback_migratetype = fallback_migratetype;
> >- __entry->change_ownership = (new_migratetype == alloc_migratetype);
> >+ __entry->change_ownership = (alloc_migratetype ==
> >+ get_pageblock_migratetype(page));
> > ),
> >
> > TP_printk("page=%p pfn=%lu alloc_order=%d fallback_order=%d pageblock_order=%d alloc_migratetype=%d fallback_migratetype=%d fragmenting=%d change_ownership=%d",
> >diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >index 7c46d0f..7b4c9aa 100644
> >--- a/mm/page_alloc.c
> >+++ b/mm/page_alloc.c
> >@@ -1139,44 +1139,50 @@ static void change_pageblock_range(struct page *pageblock_page,
> > * Returns the new migratetype of the pageblock (or the same old migratetype
> > * if it was unchanged).
> > */
> >-static int try_to_steal_freepages(struct zone *zone, struct page *page,
> >- int start_type, int fallback_type)
> >+static void try_to_steal_freepages(struct zone *zone, struct page *page,
> >+ int target_mt)
> > {
> >+ int pages;
> > int current_order = page_order(page);
> >
> >- /*
> >- * When borrowing from MIGRATE_CMA, we need to release the excess
> >- * buddy pages to CMA itself. We also ensure the freepage_migratetype
> >- * is set to CMA so it is returned to the correct freelist in case
> >- * the page ends up being not actually allocated from the pcp lists.
> >- */
> >- if (is_migrate_cma(fallback_type))
> >- return fallback_type;
> >-
> > /* Take ownership for orders >= pageblock_order */
> > if (current_order >= pageblock_order) {
> >- change_pageblock_range(page, current_order, start_type);
> >- return start_type;
> >+ change_pageblock_range(page, current_order, target_mt);
> >+ return;
>
> So here's a (current_order >= pageblock_order) check.
>
> > }
> >
> >- if (current_order >= pageblock_order / 2 ||
> >- start_type == MIGRATE_RECLAIMABLE ||
> >- page_group_by_mobility_disabled) {
> >- int pages;
> >+ pages = move_freepages_block(zone, page, target_mt);
> >
> >- pages = move_freepages_block(zone, page, start_type);
> >+ /* Claim the whole block if over half of it is free */
> >+ if (pages >= (1 << (pageblock_order-1)) ||
> >+ page_group_by_mobility_disabled) {
> >
> >- /* Claim the whole block if over half of it is free */
> >- if (pages >= (1 << (pageblock_order-1)) ||
> >- page_group_by_mobility_disabled) {
> >+ set_pageblock_migratetype(page, target_mt);
> >+ }
> >+}
> >
> >- set_pageblock_migratetype(page, start_type);
> >- return start_type;
> >- }
> >+static bool can_steal_freepages(unsigned int order,
> >+ int start_mt, int fallback_mt)
> >+{
> >+ /*
> >+ * When borrowing from MIGRATE_CMA, we need to release the excess
> >+ * buddy pages to CMA itself. We also ensure the freepage_migratetype
> >+ * is set to CMA so it is returned to the correct freelist in case
> >+ * the page ends up being not actually allocated from the pcp lists.
> >+ */
> >+ if (is_migrate_cma(fallback_mt))
> >+ return false;
> >
> >- }
> >+ /* Can take ownership for orders >= pageblock_order */
> >+ if (order >= pageblock_order)
> >+ return true;
>
> And another check.
>
> >+
> >+ if (order >= pageblock_order / 2 ||
> >+ start_mt == MIGRATE_RECLAIMABLE ||
> >+ page_group_by_mobility_disabled)
> >+ return true;
> >
> >- return fallback_type;
> >+ return false;
> > }
> >
> > /* Remove an element from the buddy allocator from the fallback list */
> >@@ -1187,6 +1193,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
> > unsigned int current_order;
> > struct page *page;
> > int migratetype, new_type, i;
> >+ bool can_steal;
> >
> > /* Find the largest possible block of pages in the other list */
> > for (current_order = MAX_ORDER-1;
> >@@ -1194,6 +1201,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
> > --current_order) {
> > for (i = 0;; i++) {
> > migratetype = fallbacks[start_migratetype][i];
> >+ new_type = migratetype;
> >
> > /* MIGRATE_RESERVE handled later if necessary */
> > if (migratetype == MIGRATE_RESERVE)
> >@@ -1207,9 +1215,13 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
> > struct page, lru);
> > area->nr_free--;
>
> So wouldn't it be better to handle the "order >= pageblock_order"
> case separately at this level? I think it would be better also for
> the compaction case (I'll comment on the later patch why).
I will also comment on the later patch.
Thanks.
>
> >- new_type = try_to_steal_freepages(zone, page,
> >- start_migratetype,
> >- migratetype);
> >+ can_steal = can_steal_freepages(current_order,
> >+ start_migratetype, migratetype);
> >+ if (can_steal) {
> >+ new_type = start_migratetype;
> >+ try_to_steal_freepages(zone, page,
> >+ start_migratetype);
> >+ }
> >
> > /* Remove the page from the freelists */
> > list_del(&page->lru);
> >@@ -1225,7 +1237,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
> > set_freepage_migratetype(page, new_type);
> >
> > trace_mm_page_alloc_extfrag(page, order, current_order,
> >- start_migratetype, migratetype, new_type);
> >+ start_migratetype, migratetype);
> >
> > return page;
> > }
> >
>
* Re: [PATCH 3/4] mm/compaction: enhance compaction finish condition
2014-12-08 9:34 ` Vlastimil Babka
@ 2014-12-10 6:46 ` Joonsoo Kim
0 siblings, 0 replies; 20+ messages in thread
From: Joonsoo Kim @ 2014-12-10 6:46 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel
On Mon, Dec 08, 2014 at 10:34:05AM +0100, Vlastimil Babka wrote:
> On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
> >Compaction has anti fragmentation algorithm. It is that freepage
> >should be more than pageblock order to finish the compaction if we don't
> >find any freepage in requested migratetype buddy list. This is for
> >mitigating fragmentation, but, it is a lack of migratetype consideration
> >and too excessive.
> >
> >At first, it doesn't consider migratetype so there would be false positive
> >on compaction finish decision. For example, if allocation request is
> >for unmovable migratetype, freepage in CMA migratetype doesn't help that
> >allocation, so compaction should not be stopped. But, current logic
> >considers it as compaction is no longer needed and stop the compaction.
> >
> >Secondly, it is too excessive. We can steal freepage from other migratetype
> >and change pageblock migratetype on more relaxed conditions. In page
> >allocator, there is another conditions that can succeed to steal without
> >introducing fragmentation.
> >
> >To solve these problems, this patch borrows anti fragmentation logic from
> >page allocator. It will reduce premature compaction finish in some cases
> >and reduce excessive compaction work.
> >
> >stress-highalloc test in mmtests with non movable order 7 allocation shows
> >in allocation success rate on phase 1 and compaction success rate.
> >
> >Allocation success rate on phase 1 (%)
> >57.00 : 63.67
> >
> >Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> >28.94 : 35.13
> >
> >Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >---
> > include/linux/mmzone.h | 3 +++
> > mm/compaction.c | 31 +++++++++++++++++++++++++++++--
> > mm/internal.h | 1 +
> > mm/page_alloc.c | 5 ++---
> > 4 files changed, 35 insertions(+), 5 deletions(-)
> >
> >diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >index 2f0856d..87f5bb5 100644
> >--- a/include/linux/mmzone.h
> >+++ b/include/linux/mmzone.h
> >@@ -63,6 +63,9 @@ enum {
> > MIGRATE_TYPES
> > };
> >
> >+#define FALLBACK_MIGRATETYPES (4)
> >+extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES];
> >+
> > #ifdef CONFIG_CMA
> > # define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA)
> > #else
> >diff --git a/mm/compaction.c b/mm/compaction.c
> >index 1a5f465..2fd5f79 100644
> >--- a/mm/compaction.c
> >+++ b/mm/compaction.c
> >@@ -1054,6 +1054,30 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
> > return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
> > }
> >
> >+static bool can_steal_fallbacks(struct free_area *area,
> >+ unsigned int order, int migratetype)
> >+{
> >+ int i;
> >+ int fallback_mt;
> >+
> >+ if (area->nr_free == 0)
> >+ return false;
> >+
> >+ for (i = 0; i < FALLBACK_MIGRATETYPES; i++) {
> >+ fallback_mt = fallbacks[migratetype][i];
> >+ if (fallback_mt == MIGRATE_RESERVE)
> >+ break;
> >+
> >+ if (list_empty(&area->free_list[fallback_mt]))
> >+ continue;
> >+
> >+ if (can_steal_freepages(order, migratetype, fallback_mt))
> >+ return true;
> >+ }
> >+
> >+ return false;
> >+}
> >+
> > static int __compact_finished(struct zone *zone, struct compact_control *cc,
> > const int migratetype)
> > {
> >@@ -1104,8 +1128,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
> > if (!list_empty(&area->free_list[migratetype]))
> > return COMPACT_PARTIAL;
> >
> >- /* Job done if allocation would set block type */
> >- if (order >= pageblock_order && area->nr_free)
>
> So, can_steal_fallbacks() -> can_steal_freepages() is quite involved
> way if in the end we just realize that order >= pageblock_order and
> we are stealing whole pageblock. Given that often compaction is done
> for THP, it would be better to check order >= pageblock_order and
> handle it upfront. This goes together with my comments on previous
> patch that order >= pageblock_order is better handled separately.
I'd like to keep this order check in can_steal_freepages(). First, we
should check the migratetype before the order: if a high-order page is
on CMA, we can't steal it. Second, I think that keeping one well-defined
function that decides whether we can steal is better than splitting the
logic; it will help future maintenance.
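Just to make the intended ordering concrete, here is a minimal standalone
sketch. It is illustrative only, not the patch: sketch_can_steal(), the
MT_* values and SKETCH_PAGEBLOCK_ORDER are stand-ins, and the
below-pageblock-order rule is simplified.

/*
 * Illustrative sketch only -- not mm/page_alloc.c.  It just shows why
 * the migratetype test has to come before the order test: a high-order
 * page sitting on the CMA free list must not make the compaction
 * finish check report "stealable".
 */
enum sketch_migratetype { MT_UNMOVABLE, MT_RECLAIMABLE, MT_MOVABLE, MT_CMA };

#define SKETCH_PAGEBLOCK_ORDER 9	/* assumed value; arch/config dependent */

static int sketch_can_steal(unsigned int order,
			    enum sketch_migratetype start_mt,
			    enum sketch_migratetype fallback_mt)
{
	if (fallback_mt == MT_CMA)		/* migratetype check comes first */
		return 0;
	if (order >= SKETCH_PAGEBLOCK_ORDER)	/* whole pageblock would move anyway */
		return 1;
	/* below pageblock order: only the more conservative cases (stand-in rule) */
	return order >= SKETCH_PAGEBLOCK_ORDER / 2 || start_mt != MT_MOVABLE;
}

With the order check inside, the page allocator and can_steal_fallbacks()
go through the same decision function.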
Thanks.
>
> >+ /*
> >+ * Job done if allocation would steal freepages from
> >+ * other migratetype buddy lists.
> >+ */
> >+ if (can_steal_fallbacks(area, order, migratetype))
> > return COMPACT_PARTIAL;
> > }
> >
> >diff --git a/mm/internal.h b/mm/internal.h
> >index efad241..7028d83 100644
> >--- a/mm/internal.h
> >+++ b/mm/internal.h
> >@@ -179,6 +179,7 @@ unsigned long
> > isolate_migratepages_range(struct compact_control *cc,
> > unsigned long low_pfn, unsigned long end_pfn);
> >
> >+bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt);
> > #endif
> >
> > /*
> >diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >index 7b4c9aa..dcb8523 100644
> >--- a/mm/page_alloc.c
> >+++ b/mm/page_alloc.c
> >@@ -1031,7 +1031,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
> > * This array describes the order lists are fallen back to when
> > * the free lists for the desirable migrate type are depleted
> > */
> >-static int fallbacks[MIGRATE_TYPES][4] = {
> >+int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = {
> > [MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
> > [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_RESERVE },
> > #ifdef CONFIG_CMA
> >@@ -1161,8 +1161,7 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page,
> > }
> > }
> >
> >-static bool can_steal_freepages(unsigned int order,
> >- int start_mt, int fallback_mt)
> >+bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt)
> > {
> > /*
> > * When borrowing from MIGRATE_CMA, we need to release the excess
> >
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] mm/compaction: stop the isolation when we isolate enough freepage
2014-12-08 9:59 ` Vlastimil Babka
@ 2014-12-10 7:00 ` Joonsoo Kim
2014-12-10 15:19 ` Vlastimil Babka
0 siblings, 1 reply; 20+ messages in thread
From: Joonsoo Kim @ 2014-12-10 7:00 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel
On Mon, Dec 08, 2014 at 10:59:17AM +0100, Vlastimil Babka wrote:
> On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
> >From: Joonsoo Kim <js1304@gmail.com>
> >
> >Currently, freepage isolation in one pageblock doesn't consider how many
> >freepages we isolate. When I traced flow of compaction, compaction
> >sometimes isolates more than 256 freepages to migrate just 32 pages.
> >
> >In this patch, freepage isolation is stopped at the point that we
> >have more isolated freepage than isolated page for migration. This
> >results in slowing down free page scanner and make compaction success
> >rate higher.
> >
> >stress-highalloc test in mmtests with non movable order 7 allocation shows
> >increase of compaction success rate and slight improvement of allocation
> >success rate.
> >
> >Allocation success rate on phase 1 (%)
> >62.70 : 64.00
> >
> >Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> >35.13 : 41.50
>
> This is weird. I could maybe understand that isolating too many
In fact, I also didn't fully understand why it leads to this
result. :)
> freepages and then returning them is a waste of time if compaction
> terminates immediately after the following migration (otherwise we
> would keep those free pages for the future migrations within same
> compaction run). And wasting time could reduce success rates for
> async compaction terminating prematurely due to cond_resched(), but
> that should be all the difference, unless there's another subtle
> bug, no?
My guess is that releasing the isolated freepages has a bad side effect.
In asynchronous compaction, this happens quite easily. In that case, the
freepages are returned to the page allocator and probably end up on a
pcp list or at the front of a buddy list, so another user consumes them
first. This reduces the freepages compaction can utilize, so compaction
finishes earlier.
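A toy model of what I suspect (purely illustrative userspace code, not
kernel code; toy_free_page()/toy_alloc_page() are made up and just treat
the pcp list as a LIFO stack):

#include <stdio.h>

/* Toy pcp list: released pages are pushed on top and handed out first. */
#define NPAGES 16
static unsigned long pcp[NPAGES];
static int top;

static void toy_free_page(unsigned long pfn)
{
	pcp[top++] = pfn;			/* lands at the "front" */
}

static unsigned long toy_alloc_page(void)
{
	return top ? pcp[--top] : 0;		/* taken from the "front" */
}

int main(void)
{
	unsigned long pfn;

	/* async compaction aborts and releases isolated freepages 100..103 */
	for (pfn = 100; pfn < 104; pfn++)
		toy_free_page(pfn);

	/* another allocator user immediately consumes exactly those pages,
	 * so a later compaction pass cannot re-isolate them */
	printf("reused pfn %lu\n", toy_alloc_page());	/* prints 103 */
	printf("reused pfn %lu\n", toy_alloc_page());	/* prints 102 */
	return 0;
}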
>
> >pfn where both scanners meets on compaction complete
> >(separate test due to enormous tracepoint buffer)
> >(zone_start=4096, zone_end=1048576)
> >586034 : 654378
>
> The difference here suggests that there is indeed another subtle bug
> related to where free scanner restarts, and we must be leaving the
> excessively isolated (and then returned) freepages behind. Otherwise
> I think the scanners should meet at the same place regardless of
> your patch.
I tried to find another subtle bug, but couldn't find any critical one.
Hmm...
Anyway, regardless of the reason for the result, this patch seems
reasonable, because we shouldn't waste time isolating freepages we don't
need.
Thanks.
>
> >Signed-off-by: Joonsoo Kim <js1304@gmail.com>
> >---
> > mm/compaction.c | 17 ++++++++++-------
> > 1 file changed, 10 insertions(+), 7 deletions(-)
> >
> >diff --git a/mm/compaction.c b/mm/compaction.c
> >index 2fd5f79..12223b9 100644
> >--- a/mm/compaction.c
> >+++ b/mm/compaction.c
> >@@ -422,6 +422,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> >
> > /* If a page was split, advance to the end of it */
> > if (isolated) {
> >+ cc->nr_freepages += isolated;
> >+ if (!strict &&
> >+ cc->nr_migratepages <= cc->nr_freepages) {
> >+ blockpfn += isolated;
> >+ break;
> >+ }
> >+
> > blockpfn += isolated - 1;
> > cursor += isolated - 1;
> > continue;
> >@@ -831,7 +838,6 @@ static void isolate_freepages(struct compact_control *cc)
> > unsigned long isolate_start_pfn; /* exact pfn we start at */
> > unsigned long block_end_pfn; /* end of current pageblock */
> > unsigned long low_pfn; /* lowest pfn scanner is able to scan */
> >- int nr_freepages = cc->nr_freepages;
> > struct list_head *freelist = &cc->freepages;
> >
> > /*
> >@@ -856,11 +862,11 @@ static void isolate_freepages(struct compact_control *cc)
> > * pages on cc->migratepages. We stop searching if the migrate
> > * and free page scanners meet or enough free pages are isolated.
> > */
> >- for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages;
> >+ for (; block_start_pfn >= low_pfn &&
> >+ cc->nr_migratepages > cc->nr_freepages;
> > block_end_pfn = block_start_pfn,
> > block_start_pfn -= pageblock_nr_pages,
> > isolate_start_pfn = block_start_pfn) {
> >- unsigned long isolated;
> >
> > /*
> > * This can iterate a massively long zone without finding any
> >@@ -885,9 +891,8 @@ static void isolate_freepages(struct compact_control *cc)
> > continue;
> >
> > /* Found a block suitable for isolating free pages from. */
> >- isolated = isolate_freepages_block(cc, &isolate_start_pfn,
> >+ isolate_freepages_block(cc, &isolate_start_pfn,
> > block_end_pfn, freelist, false);
> >- nr_freepages += isolated;
> >
> > /*
> > * Remember where the free scanner should restart next time,
> >@@ -919,8 +924,6 @@ static void isolate_freepages(struct compact_control *cc)
> > */
> > if (block_start_pfn < low_pfn)
> > cc->free_pfn = cc->migrate_pfn;
> >-
> >- cc->nr_freepages = nr_freepages;
> > }
> >
> > /*
> >
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/4] mm/page_alloc: expands broken freepage to proper buddy list when steal
2014-12-10 6:38 ` Joonsoo Kim
@ 2014-12-10 9:55 ` Vlastimil Babka
2015-01-27 7:35 ` Vlastimil Babka
1 sibling, 0 replies; 20+ messages in thread
From: Vlastimil Babka @ 2014-12-10 9:55 UTC (permalink / raw)
To: Joonsoo Kim
Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel
On 12/10/2014 07:38 AM, Joonsoo Kim wrote:
> On Mon, Dec 08, 2014 at 10:29:44AM +0100, Vlastimil Babka wrote:
>> On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
>>> There is odd behaviour when we steal freepages from other migratetype
>>> buddy list. In try_to_steal_freepages(), we move all freepages in
>>> the pageblock that founded freepage is belong to to the request
>>> migratetype in order to mitigate fragmentation. If the number of moved
>>> pages are enough to change pageblock migratetype, there is no problem. If
>>> not enough, we don't change pageblock migratetype and add broken freepages
>>> to the original migratetype buddy list rather than request migratetype
>>> one. For me, this is odd, because we already moved all freepages in this
>>> pageblock to the request migratetype. This patch fixes this situation to
>>> add broken freepages to the request migratetype buddy list in this case.
>>
>> I'd rather split the fix from the refactoring. And maybe my
>> description is longer, but easier to understand? (I guess somebody
>> else should judge this)
>
> Your patch is much better to understand than mine. :)
> No need to judge from somebody else.
> After your patch is merged, I will resubmit these on top of it.
Thanks. I'm doing another evaluation focusing on the number of unmovable
pageblocks, as Mel suggested, and will then resubmit with the tracepoint
fixed.
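(For reference, a minimal sketch of one way to sample this: dumping the
block-type summary from /proc/pagetypeinfo, assuming it is enabled on
the test kernel. This is just an illustration, not necessarily how I
actually collect the numbers.)

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[512];
	int in_summary = 0;
	FILE *f = fopen("/proc/pagetypeinfo", "r");

	if (!f) {
		perror("/proc/pagetypeinfo");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		/* print from the "Number of blocks type" summary onwards */
		if (strstr(line, "Number of blocks type"))
			in_summary = 1;
		if (in_summary)
			fputs(line, stdout);
	}
	fclose(f);
	return 0;
}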
Vlastimil
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] mm/compaction: stop the isolation when we isolate enough freepage
2014-12-10 7:00 ` Joonsoo Kim
@ 2014-12-10 15:19 ` Vlastimil Babka
2014-12-11 3:09 ` Joonsoo Kim
0 siblings, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2014-12-10 15:19 UTC (permalink / raw)
To: Joonsoo Kim
Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel
On 12/10/2014 08:00 AM, Joonsoo Kim wrote:
> On Mon, Dec 08, 2014 at 10:59:17AM +0100, Vlastimil Babka wrote:
>> On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
>>> From: Joonsoo Kim <js1304@gmail.com>
>>>
>>> Currently, freepage isolation in one pageblock doesn't consider how many
>>> freepages we isolate. When I traced flow of compaction, compaction
>>> sometimes isolates more than 256 freepages to migrate just 32 pages.
>>>
>>> In this patch, freepage isolation is stopped at the point that we
>>> have more isolated freepage than isolated page for migration. This
>>> results in slowing down free page scanner and make compaction success
>>> rate higher.
>>>
>>> stress-highalloc test in mmtests with non movable order 7 allocation shows
>>> increase of compaction success rate and slight improvement of allocation
>>> success rate.
>>>
>>> Allocation success rate on phase 1 (%)
>>> 62.70 : 64.00
>>>
>>> Compaction success rate (Compaction success * 100 / Compaction stalls, %)
>>> 35.13 : 41.50
>>
>> This is weird. I could maybe understand that isolating too many
>
> In fact, I also didn't fully understand why it leads to this
> result. :)
>
>> freepages and then returning them is a waste of time if compaction
>> terminates immediately after the following migration (otherwise we
>> would keep those free pages for the future migrations within same
>> compaction run). And wasting time could reduce success rates for
>> async compaction terminating prematurely due to cond_resched(), but
>> that should be all the difference, unless there's another subtle
>> bug, no?
>
> My guess is that releasing the isolated freepages has a bad side effect.
> In asynchronous compaction, this happens quite easily. In that case, the
> freepages are returned to the page allocator and probably end up on a
> pcp list or at the front of a buddy list, so another user consumes them
> first. This reduces the freepages compaction can utilize, so compaction
> finishes earlier.
Hmm, some might even stay on the pcplists and we won't isolate them
again, so we will leave them behind. I wouldn't expect such a big
difference here, but anyway...
It might be interesting to evaluate whether a pcplist drain after
returning the isolated freepages (unless the scanners have already met,
in which case it's pointless) would make any difference.
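Roughly something like the sketch below. It is only an illustration, not
a patch: release_isolated_freepages(), drain_pcplists() and struct
sketch_cc are placeholder names for the existing release path and a
pcplist drain, whose real helpers and signatures depend on the kernel
version.

/*
 * Sketch of the suggested experiment.  The helpers are stand-ins for
 * the code that returns cc->freepages to the allocator and for a
 * per-cpu pagelist drain.
 */
struct sketch_cc {
	unsigned long free_pfn;		/* free scanner position */
	unsigned long migrate_pfn;	/* migrate scanner position */
	unsigned int  nr_freepages;	/* isolated but unused freepages */
};

static unsigned int release_isolated_freepages(struct sketch_cc *cc)
{
	unsigned int n = cc->nr_freepages;

	cc->nr_freepages = 0;		/* pages go back to the allocator */
	return n;
}

static void drain_pcplists(void)
{
	/* would flush per-cpu free lists back into the buddy lists */
}

static void compact_finish_sketch(struct sketch_cc *cc)
{
	unsigned int released = release_isolated_freepages(cc);

	/*
	 * Drain only if we actually returned pages and the scanners have
	 * not met yet, so a later free-scanner pass could still pick the
	 * returned pages up from the buddy lists.
	 */
	if (released && cc->free_pfn > cc->migrate_pfn)
		drain_pcplists();
}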
>>
>>> pfn where both scanners meets on compaction complete
>>> (separate test due to enormous tracepoint buffer)
>>> (zone_start=4096, zone_end=1048576)
>>> 586034 : 654378
>>
>> The difference here suggests that there is indeed another subtle bug
>> related to where free scanner restarts, and we must be leaving the
>> excessively isolated (and then returned) freepages behind. Otherwise
>> I think the scanners should meet at the same place regardless of
>> your patch.
>
> I tried to find another subtle bug, but couldn't find any critical one.
> Hmm...
>
> Anyway, regardless of the reason for the result, this patch seems
> reasonable, because we shouldn't waste time isolating freepages we
> don't need.
Right.
> Thanks.
>
>>
>>> Signed-off-by: Joonsoo Kim <js1304@gmail.com>
>>> ---
>>> mm/compaction.c | 17 ++++++++++-------
>>> 1 file changed, 10 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>> index 2fd5f79..12223b9 100644
>>> --- a/mm/compaction.c
>>> +++ b/mm/compaction.c
>>> @@ -422,6 +422,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>>>
>>> /* If a page was split, advance to the end of it */
>>> if (isolated) {
>>> + cc->nr_freepages += isolated;
>>> + if (!strict &&
>>> + cc->nr_migratepages <= cc->nr_freepages) {
>>> + blockpfn += isolated;
>>> + break;
>>> + }
>>> +
>>> blockpfn += isolated - 1;
>>> cursor += isolated - 1;
>>> continue;
>>> @@ -831,7 +838,6 @@ static void isolate_freepages(struct compact_control *cc)
>>> unsigned long isolate_start_pfn; /* exact pfn we start at */
>>> unsigned long block_end_pfn; /* end of current pageblock */
>>> unsigned long low_pfn; /* lowest pfn scanner is able to scan */
>>> - int nr_freepages = cc->nr_freepages;
>>> struct list_head *freelist = &cc->freepages;
>>>
>>> /*
>>> @@ -856,11 +862,11 @@ static void isolate_freepages(struct compact_control *cc)
>>> * pages on cc->migratepages. We stop searching if the migrate
>>> * and free page scanners meet or enough free pages are isolated.
>>> */
>>> - for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages;
>>> + for (; block_start_pfn >= low_pfn &&
>>> + cc->nr_migratepages > cc->nr_freepages;
>>> block_end_pfn = block_start_pfn,
>>> block_start_pfn -= pageblock_nr_pages,
>>> isolate_start_pfn = block_start_pfn) {
>>> - unsigned long isolated;
>>>
>>> /*
>>> * This can iterate a massively long zone without finding any
>>> @@ -885,9 +891,8 @@ static void isolate_freepages(struct compact_control *cc)
>>> continue;
>>>
>>> /* Found a block suitable for isolating free pages from. */
>>> - isolated = isolate_freepages_block(cc, &isolate_start_pfn,
>>> + isolate_freepages_block(cc, &isolate_start_pfn,
>>> block_end_pfn, freelist, false);
>>> - nr_freepages += isolated;
>>>
>>> /*
>>> * Remember where the free scanner should restart next time,
>>> @@ -919,8 +924,6 @@ static void isolate_freepages(struct compact_control *cc)
>>> */
>>> if (block_start_pfn < low_pfn)
>>> cc->free_pfn = cc->migrate_pfn;
>>> -
>>> - cc->nr_freepages = nr_freepages;
>>> }
>>>
>>> /*
>>>
>>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] mm/compaction: stop the isolation when we isolate enough freepage
2014-12-10 15:19 ` Vlastimil Babka
@ 2014-12-11 3:09 ` Joonsoo Kim
0 siblings, 0 replies; 20+ messages in thread
From: Joonsoo Kim @ 2014-12-11 3:09 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel
On Wed, Dec 10, 2014 at 04:19:13PM +0100, Vlastimil Babka wrote:
> On 12/10/2014 08:00 AM, Joonsoo Kim wrote:
> >On Mon, Dec 08, 2014 at 10:59:17AM +0100, Vlastimil Babka wrote:
> >>On 12/08/2014 08:16 AM, Joonsoo Kim wrote:
> >>>From: Joonsoo Kim <js1304@gmail.com>
> >>>
> >>>Currently, freepage isolation in one pageblock doesn't consider how many
> >>>freepages we isolate. When I traced flow of compaction, compaction
> >>>sometimes isolates more than 256 freepages to migrate just 32 pages.
> >>>
> >>>In this patch, freepage isolation is stopped at the point that we
> >>>have more isolated freepage than isolated page for migration. This
> >>>results in slowing down free page scanner and make compaction success
> >>>rate higher.
> >>>
> >>>stress-highalloc test in mmtests with non movable order 7 allocation shows
> >>>increase of compaction success rate and slight improvement of allocation
> >>>success rate.
> >>>
> >>>Allocation success rate on phase 1 (%)
> >>>62.70 : 64.00
> >>>
> >>>Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> >>>35.13 : 41.50
> >>
> >>This is weird. I could maybe understand that isolating too many
> >
> >In fact, I also didn't fully understand why it leads to this
> >result. :)
> >
> >>freepages and then returning them is a waste of time if compaction
> >>terminates immediately after the following migration (otherwise we
> >>would keep those free pages for the future migrations within same
> >>compaction run). And wasting time could reduce success rates for
> >>async compaction terminating prematurely due to cond_resched(), but
> >>that should be all the difference, unless there's another subtle
> >>bug, no?
> >
> >My guess is that releasing the isolated freepages has a bad side effect.
> >In asynchronous compaction, this happens quite easily. In that case, the
> >freepages are returned to the page allocator and probably end up on a
> >pcp list or at the front of a buddy list, so another user consumes them
> >first. This reduces the freepages compaction can utilize, so compaction
> >finishes earlier.
>
> Hmm, some might even stay on the pcplists and we won't isolate them
> again, so we will leave them behind. I wouldn't expect such a big
> difference here, but anyway...
> It might be interesting to evaluate whether a pcplist drain after
> returning the isolated freepages (unless the scanners have already met,
> in which case it's pointless) would make any difference.
Yes, I will check it.
>
> >>
> >>>pfn where both scanners meets on compaction complete
> >>>(separate test due to enormous tracepoint buffer)
> >>>(zone_start=4096, zone_end=1048576)
> >>>586034 : 654378
> >>
> >>The difference here suggests that there is indeed another subtle bug
> >>related to where free scanner restarts, and we must be leaving the
> >>excessively isolated (and then returned) freepages behind. Otherwise
> >>I think the scanners should meet at the same place regardless of
> >>your patch.
> >
> >I tried to find another subtle bug, but couldn't find any critical one.
> >Hmm...
> >
> >Anyway, regardless of the reason for the result, this patch seems
> >reasonable, because we shouldn't waste time isolating freepages we
> >don't need.
>
> Right.
>
> >Thanks.
> >
> >>
> >>>Signed-off-by: Joonsoo Kim <js1304@gmail.com>
> >>>---
> >>> mm/compaction.c | 17 ++++++++++-------
> >>> 1 file changed, 10 insertions(+), 7 deletions(-)
> >>>
> >>>diff --git a/mm/compaction.c b/mm/compaction.c
> >>>index 2fd5f79..12223b9 100644
> >>>--- a/mm/compaction.c
> >>>+++ b/mm/compaction.c
> >>>@@ -422,6 +422,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
> >>>
> >>> /* If a page was split, advance to the end of it */
> >>> if (isolated) {
> >>>+ cc->nr_freepages += isolated;
> >>>+ if (!strict &&
> >>>+ cc->nr_migratepages <= cc->nr_freepages) {
> >>>+ blockpfn += isolated;
> >>>+ break;
> >>>+ }
> >>>+
> >>> blockpfn += isolated - 1;
> >>> cursor += isolated - 1;
> >>> continue;
> >>>@@ -831,7 +838,6 @@ static void isolate_freepages(struct compact_control *cc)
> >>> unsigned long isolate_start_pfn; /* exact pfn we start at */
> >>> unsigned long block_end_pfn; /* end of current pageblock */
> >>> unsigned long low_pfn; /* lowest pfn scanner is able to scan */
> >>>- int nr_freepages = cc->nr_freepages;
> >>> struct list_head *freelist = &cc->freepages;
> >>>
> >>> /*
> >>>@@ -856,11 +862,11 @@ static void isolate_freepages(struct compact_control *cc)
> >>> * pages on cc->migratepages. We stop searching if the migrate
> >>> * and free page scanners meet or enough free pages are isolated.
> >>> */
> >>>- for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages;
> >>>+ for (; block_start_pfn >= low_pfn &&
> >>>+ cc->nr_migratepages > cc->nr_freepages;
> >>> block_end_pfn = block_start_pfn,
> >>> block_start_pfn -= pageblock_nr_pages,
> >>> isolate_start_pfn = block_start_pfn) {
> >>>- unsigned long isolated;
> >>>
> >>> /*
> >>> * This can iterate a massively long zone without finding any
> >>>@@ -885,9 +891,8 @@ static void isolate_freepages(struct compact_control *cc)
> >>> continue;
> >>>
> >>> /* Found a block suitable for isolating free pages from. */
> >>>- isolated = isolate_freepages_block(cc, &isolate_start_pfn,
> >>>+ isolate_freepages_block(cc, &isolate_start_pfn,
> >>> block_end_pfn, freelist, false);
> >>>- nr_freepages += isolated;
> >>>
> >>> /*
> >>> * Remember where the free scanner should restart next time,
> >>>@@ -919,8 +924,6 @@ static void isolate_freepages(struct compact_control *cc)
> >>> */
> >>> if (block_start_pfn < low_pfn)
> >>> cc->free_pfn = cc->migrate_pfn;
> >>>-
> >>>- cc->nr_freepages = nr_freepages;
> >>> }
> >>>
> >>> /*
> >>>
> >>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/4] mm/page_alloc: expands broken freepage to proper buddy list when steal
2014-12-10 6:38 ` Joonsoo Kim
2014-12-10 9:55 ` Vlastimil Babka
@ 2015-01-27 7:35 ` Vlastimil Babka
2015-01-27 8:34 ` Joonsoo Kim
1 sibling, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2015-01-27 7:35 UTC (permalink / raw)
To: Joonsoo Kim
Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel
On 12/10/2014 07:38 AM, Joonsoo Kim wrote:
> After your patch is merged, I will resubmit these on top of it.
Hi Joonsoo,
my page stealing patches are now in -mm so are you planning to resubmit this? At
least patch 1 is an obvious bugfix, and patch 4 a clear compaction overhead
reduction. Those don't need to wait for the rest of the series. If you are busy
with other stuff, I can also resend those two myself if you want.
Thanks,
Vlastimil
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/4] mm/page_alloc: expands broken freepage to proper buddy list when steal
2015-01-27 7:35 ` Vlastimil Babka
@ 2015-01-27 8:34 ` Joonsoo Kim
2015-01-27 8:36 ` Vlastimil Babka
0 siblings, 1 reply; 20+ messages in thread
From: Joonsoo Kim @ 2015-01-27 8:34 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel
On Tue, Jan 27, 2015 at 08:35:17AM +0100, Vlastimil Babka wrote:
> On 12/10/2014 07:38 AM, Joonsoo Kim wrote:
> > After your patch is merged, I will resubmit these on top of it.
>
> Hi Joonsoo,
>
> my page stealing patches are now in -mm so are you planning to resubmit this? At
> least patch 1 is an obvious bugfix, and patch 4 a clear compaction overhead
> reduction. Those don't need to wait for the rest of the series. If you are busy
> with other stuff, I can also resend those two myself if you want.
Hello,
I've noticed that your patches are merged. :)
If you are in a hurry, feel free to resend them; I'd be glad if you
handled it. If not, I will resend them, probably at the end of this week.
In fact, I'm testing your stealing patches on my ad-hoc fragmentation
benchmark. It should be finished soon and, after that, I can resend this
patchset.
Thanks.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/4] mm/page_alloc: expands broken freepage to proper buddy list when steal
2015-01-27 8:34 ` Joonsoo Kim
@ 2015-01-27 8:36 ` Vlastimil Babka
0 siblings, 0 replies; 20+ messages in thread
From: Vlastimil Babka @ 2015-01-27 8:36 UTC (permalink / raw)
To: Joonsoo Kim
Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
linux-mm, linux-kernel
On 01/27/2015 09:34 AM, Joonsoo Kim wrote:
> On Tue, Jan 27, 2015 at 08:35:17AM +0100, Vlastimil Babka wrote:
>> On 12/10/2014 07:38 AM, Joonsoo Kim wrote:
>> > After your patch is merged, I will resubmit these on top of it.
>>
>> Hi Joonsoo,
>>
>> my page stealing patches are now in -mm so are you planning to resubmit this? At
>> least patch 1 is an obvious bugfix, and patch 4 a clear compaction overhead
>> reduction. Those don't need to wait for the rest of the series. If you are busy
>> with other stuff, I can also resend those two myself if you want.
>
> Hello,
>
> I've noticed that your patches are merged. :)
> If you are in a hurry, feel free to resend them; I'd be glad if you
> handled it. If not, I will resend them, probably at the end of this
> week.
Hi,
the end of the week is fine! I'm in no hurry, I just wanted to know the status.
> In fact, I'm testing your stealing patches on my ad-hoc fragmentation
> benchmark. It should be finished soon and, after that, I can resend
> this patchset.
Good to hear! Thanks.
> Thanks.
>
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread
Thread overview: 20+ messages
2014-12-08 7:16 [PATCH 0/4] enhance compaction success rate Joonsoo Kim
2014-12-08 7:16 ` [PATCH 1/4] mm/compaction: fix wrong order check in compact_finished() Joonsoo Kim
2014-12-08 9:06 ` Vlastimil Babka
2014-12-08 7:16 ` [PATCH 2/4] mm/page_alloc: expands broken freepage to proper buddy list when steal Joonsoo Kim
2014-12-08 9:29 ` Vlastimil Babka
2014-12-10 6:38 ` Joonsoo Kim
2014-12-10 9:55 ` Vlastimil Babka
2015-01-27 7:35 ` Vlastimil Babka
2015-01-27 8:34 ` Joonsoo Kim
2015-01-27 8:36 ` Vlastimil Babka
2014-12-08 7:16 ` [PATCH 3/4] mm/compaction: enhance compaction finish condition Joonsoo Kim
2014-12-08 9:34 ` Vlastimil Babka
2014-12-10 6:46 ` Joonsoo Kim
2014-12-08 7:16 ` [PATCH 4/4] mm/compaction: stop the isolation when we isolate enough freepage Joonsoo Kim
2014-12-08 9:59 ` Vlastimil Babka
2014-12-10 7:00 ` Joonsoo Kim
2014-12-10 15:19 ` Vlastimil Babka
2014-12-11 3:09 ` Joonsoo Kim
2014-12-08 9:16 ` [PATCH 0/4] enhance compaction success rate Vlastimil Babka
2014-12-10 6:36 ` Joonsoo Kim