From: Vlastimil Babka <vbabka@suse.cz>
To: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
David Rientjes <rientjes@google.com>
Cc: linux-kernel@vger.kernel.org, Vlastimil Babka <vbabka@suse.cz>,
Minchan Kim <minchan@kernel.org>, Mel Gorman <mgorman@suse.de>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Michal Nazarewicz <mina86@mina86.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Subject: [RFC PATCH V4 15/15] mm, compaction: do not migrate pages when that cannot satisfy page fault allocation
Date: Wed, 16 Jul 2014 15:48:23 +0200
Message-ID: <1405518503-27687-16-git-send-email-vbabka@suse.cz>
In-Reply-To: <1405518503-27687-1-git-send-email-vbabka@suse.cz>
In direct compaction for a page fault, we want to allocate the high-order page
as soon as possible, so migrating from a cc->order aligned block of pages that
also contains unmigratable pages just adds to page fault latency.
This patch therefore makes the migration scanner skip to the next cc->order
aligned block of pages as soon as it cannot isolate a non-free page. Everything
isolated up to that point is put back.
In this mode, the limit of COMPACT_CLUSTER_MAX isolated pages is not observed,
allowing the scanner to scan the whole block at once, instead of migrating
COMPACT_CLUSTER_MAX pages and then finding an unmigratable page in the next
call. This might, however, have implications for direct reclaimers through
too_many_isolated().
In tests with the stress-highalloc benchmark where __GFP_NO_KSWAPD was not
used, the patch did not affect the benchmark itself, only the kernel
compilations occurring in parallel. Even so, it increased the allocation
success rates of the benchmark by a few percent, and pages scanned by
compaction by almost 20%. The compaction successes in vmstat increased by 16%,
so this is probably due to more successes translating to less deferring.
The compaction successes (and attempts) did increase for processes other than
the benchmark, which would be explained by those processes faulting THP pages
with __GFP_NO_KSWAPD. However, THP faults in vmstat did not increase, so
either there is some problem in that area, or the direct compactions were
triggered by different allocations.
In tests where __GFP_NO_KSWAPD was used by the benchmark, the allocation
success rates improved from 15% to 20% in the first phase, and from 19% to 33%
in the second phase. Again, the amount of work done increased due to less
deferring.
[rientjes@google.com: skip_on_failure based on THP page faults]
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
---
mm/compaction.c | 50 ++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 40 insertions(+), 10 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 4fe091c..6271bf7 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -574,11 +574,20 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
struct page *page = NULL, *valid_page = NULL;
unsigned long capture_pfn = 0; /* current candidate for capturing */
unsigned long next_capture_pfn = 0; /* next candidate for capturing */
+ bool skip_on_failure = false; /* skip block when isolation fails */
if (cc->order > 0 && cc->order <= pageblock_order && capture) {
/* This may be outside the zone, but we check that later */
capture_pfn = low_pfn & ~((1UL << cc->order) - 1);
next_capture_pfn = ALIGN(low_pfn + 1, (1UL << cc->order));
+ /*
+ * It is too expensive for compaction to migrate pages from a
+ * cc->order block of pages on page faults, unless the entire
+ * block can become free. But khugepaged should try anyway for
+ * THP so that general defragmentation happens.
+ */
+ skip_on_failure = (cc->gfp_mask & __GFP_NO_KSWAPD)
+ && !(current->flags & PF_KTHREAD);
}
/*
@@ -633,7 +642,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
break;
if (!pfn_valid_within(low_pfn))
- continue;
+ goto isolation_failed;
nr_scanned++;
page = pfn_to_page(low_pfn);
@@ -676,7 +685,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
goto isolate_success;
}
}
- continue;
+ goto isolation_failed;
}
/*
@@ -699,7 +708,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
if (next_capture_pfn)
next_capture_pfn =
ALIGN(low_pfn + 1, (1UL << cc->order));
- continue;
+ goto isolation_failed;
}
/*
@@ -709,7 +718,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
*/
if (!page_mapping(page) &&
page_count(page) > page_mapcount(page))
- continue;
+ goto isolation_failed;
/* If we already hold the lock, we can skip some rechecking */
if (!locked) {
@@ -720,13 +729,13 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
/* Recheck PageLRU and PageTransHuge under lock */
if (!PageLRU(page))
- continue;
+ goto isolation_failed;
if (PageTransHuge(page)) {
low_pfn += (1 << compound_order(page)) - 1;
if (next_capture_pfn)
next_capture_pfn = ALIGN(low_pfn + 1,
(1UL << cc->order));
- continue;
+ goto isolation_failed;
}
}
@@ -734,7 +743,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
/* Try isolate the page */
if (__isolate_lru_page(page, isolate_mode) != 0)
- continue;
+ goto isolation_failed;
VM_BUG_ON_PAGE(PageTransCompound(page), page);
@@ -747,11 +756,32 @@ isolate_success:
cc->nr_migratepages++;
nr_isolated++;
- /* Avoid isolating too much */
- if (cc->nr_migratepages == COMPACT_CLUSTER_MAX) {
+ /*
+ * Avoid isolating too much, except if we try to capture a
+ * free page and want to find out at once if it can be done
+ * or we should skip to the next block.
+ */
+ if (!skip_on_failure &&
+ cc->nr_migratepages == COMPACT_CLUSTER_MAX) {
++low_pfn;
break;
}
+
+ continue;
+
+isolation_failed:
+ if (skip_on_failure) {
+ if (nr_isolated) {
+ if (locked) {
+ spin_unlock_irqrestore(&zone->lru_lock,
+ flags);
+ locked = false;
+ }
+ putback_movable_pages(migratelist);
+ nr_isolated = 0;
+ }
+ low_pfn = next_capture_pfn - 1;
+ }
}
/*
@@ -770,7 +800,7 @@ isolate_success:
* Update the pageblock-skip information and cached scanner pfn,
* if the whole pageblock was scanned without isolating any page.
*/
- if (low_pfn == end_pfn)
+ if (low_pfn == end_pfn && !skip_on_failure)
update_pageblock_skip(cc, valid_page, nr_isolated, true);
trace_mm_compaction_isolate_migratepages(nr_scanned, nr_isolated);
--
1.8.4.5