[RFC 04/13] mm, page_alloc: restructure direct compaction handling in slowpath

From: Vlastimil Babka <vbabka@suse.cz>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Rik van Riel <riel@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>
Subject: [RFC 04/13] mm, page_alloc: restructure direct compaction handling in slowpath
Date: Tue, 10 May 2016 09:35:54 +0200
Message-ID: <1462865763-22084-5-git-send-email-vbabka@suse.cz>
In-Reply-To: <1462865763-22084-1-git-send-email-vbabka@suse.cz>

The retry loop in __alloc_pages_slowpath is supposed to keep trying reclaim
and compaction (and OOM) until the allocation either succeeds or returns
with failure. Success here is more probable when reclaim precedes compaction,
as certain watermarks have to be met for compaction to even try, and more free
pages increase the probability of compaction success. On the other hand,
starting with light async compaction (if the watermarks allow it) can be
more efficient, especially for smaller orders, if there's enough free memory
that is merely fragmented.

Thus, the current code starts with compaction before reclaim, and to make sure
that the last reclaim is always followed by a final compaction, there's another
direct compaction call at the end of the loop. This makes the code hard to
follow and adds some duplicated handling of migration_mode decisions. It's also
somewhat inefficient that even if reclaim or compaction decides not to retry,
the final compaction is still attempted. Some gfp flag combinations also
shortcut these retry decisions with "goto noretry;", making the code even
harder to follow.

This patch attempts to restructure the code with only minimal functional
changes. The call to the first compaction and THP-specific checks are now
placed above the retry loop, and the "noretry" direct compaction is removed.

The initial compaction is additionally restricted to costly orders, as we can
expect smaller orders to be held back by watermarks, and only larger orders
to suffer primarily from fragmentation. This better matches the checks in
reclaim's shrink_zones().
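
To make the resulting order of operations easier to see, here is a minimal
userspace sketch, not kernel code. It assumes that the caller may enter direct
reclaim and that every attempt fails to produce a page. All sketch_* names are
hypothetical, and the 'retries' parameter merely stands in for the real retry
decisions, which this model does not reproduce.

```c
#include <assert.h>

/* PAGE_ALLOC_COSTLY_ORDER is 3 in the kernel; renamed to mark this as a model. */
#define SKETCH_COSTLY_ORDER 3
#define SKETCH_MAX_STEPS 32

enum sketch_step {
	SKETCH_INITIAL_ASYNC_COMPACT,	/* compaction attempted before any reclaim */
	SKETCH_RECLAIM,			/* direct reclaim inside the retry loop */
	SKETCH_COMPACT,			/* direct compaction inside the retry loop */
};

struct sketch_trace {
	enum sketch_step steps[SKETCH_MAX_STEPS];
	int len;
};

static void sketch_record(struct sketch_trace *t, enum sketch_step s)
{
	assert(t->len < SKETCH_MAX_STEPS);
	t->steps[t->len++] = s;
}

/*
 * Hypothetical model of the restructured slowpath in which every attempt
 * fails. The loop body runs retries + 1 times.
 */
static void sketch_slowpath(unsigned int order, int retries,
			    struct sketch_trace *t)
{
	int i;

	t->len = 0;

	/* Only costly orders get a compaction attempt ahead of reclaim. */
	if (order > SKETCH_COSTLY_ORDER)
		sketch_record(t, SKETCH_INITIAL_ASYNC_COMPACT);

	/* Inside the loop, reclaim always precedes compaction. */
	for (i = 0; i <= retries; i++) {
		sketch_record(t, SKETCH_RECLAIM);
		sketch_record(t, SKETCH_COMPACT);
	}

	/* Nothing follows the loop: the old "noretry" compaction is gone. */
}
```

Calling sketch_slowpath(9, 1, &t) records the initial async compaction followed
by two reclaim/compaction pairs, while sketch_slowpath(2, 1, &t) records only
the two pairs. In both cases the last recorded step is the in-loop compaction.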

There are two other smaller functional changes. One is that the upgrade from
async migration to light sync migration will always occur after the initial
compaction. This is how it worked until the recent patch "mm, oom: protect
!costly allocations some more", which introduced upgrading the mode based on
the COMPACT_COMPLETE result but kept the final compaction always upgraded,
making it even more special. It's better to return to the simpler handling for
now, as migration modes will be further modified later in the series.
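
The decision taken right after a failed initial compaction can be sketched
as a pure function. This is an illustrative userspace model only: the sketch_*
names and the three-valued result enum are hypothetical simplifications of
enum migrate_mode and enum compact_result, and the function assumes the initial
compaction ran (costly order, direct reclaim allowed) without yielding a page.

```c
#include <stdbool.h>

enum sketch_mode { SKETCH_ASYNC, SKETCH_SYNC_LIGHT };
enum sketch_result { SKETCH_DEFERRED, SKETCH_CONTENDED, SKETCH_OTHER };

struct sketch_decision {
	bool fail_now;			/* models "goto nopage" */
	enum sketch_mode loop_mode;	/* mode used inside the retry loop */
};

/* Hypothetical model of the checks that follow the initial compaction. */
static struct sketch_decision
sketch_after_initial_compaction(bool is_thp, bool is_kthread,
				enum sketch_result result)
{
	/* Default after the initial attempt: upgrade to light sync. */
	struct sketch_decision d = {
		.fail_now = false,
		.loop_mode = SKETCH_SYNC_LIGHT,
	};

	if (!is_thp)
		return d;

	/* THP requests back off rather than disrupt the system. */
	if (result == SKETCH_DEFERRED || result == SKETCH_CONTENDED) {
		d.fail_now = true;
		return d;
	}

	/* THP at fault stays async; kernel threads tolerate light sync. */
	if (!is_kthread)
		d.loop_mode = SKETCH_ASYNC;

	return d;
}
```

In this model a non-THP request always continues with SKETCH_SYNC_LIGHT, a THP
request from a user task continues with SKETCH_ASYNC unless compaction was
deferred or contended, and a THP request from a kernel thread (the khugepaged
case) continues with SKETCH_SYNC_LIGHT.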

The second change is that once both reclaim and compaction declare it's not
worth retrying the reclaim/compact loop, there is no final compaction attempt.
As argued above, this is intentional. If that final compaction were to succeed,
it would be due to a wrong retry decision, or simply a race with somebody else
freeing memory for us.

The main outcome of this patch should be simpler code. Logically, the initial
compaction without reclaim is the exceptional case to the reclaim/compaction
scheme, but prior to the patch, it was the last loop iteration that was
exceptional. Now the code matches the logic better. The change also enables
the following patches.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/page_alloc.c | 107 +++++++++++++++++++++++++++++---------------------------
 1 file changed, 55 insertions(+), 52 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7249949d65ca..88d680b3e7b6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3555,7 +3555,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	struct page *page = NULL;
 	unsigned int alloc_flags;
 	unsigned long did_some_progress;
-	enum migrate_mode migration_mode = MIGRATE_ASYNC;
+	enum migrate_mode migration_mode = MIGRATE_SYNC_LIGHT;
 	enum compact_result compact_result;
 	int compaction_retries = 0;
 	int no_progress_loops = 0;
@@ -3598,6 +3598,50 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto got_pg;

+	/*
+	 * For costly allocations, try direct compaction first, as it's likely
+	 * that we have enough base pages and don't need to reclaim.
+	 */
+	if (can_direct_reclaim && order > PAGE_ALLOC_COSTLY_ORDER) {
+		page = __alloc_pages_direct_compact(gfp_mask, order,
+						alloc_flags, ac,
+						MIGRATE_ASYNC,
+						&compact_result);
+		if (page)
+			goto got_pg;
+
+		/* Checks for THP-specific high-order allocations */
+		if (is_thp_gfp_mask(gfp_mask)) {
+			/*
+			 * If compaction is deferred for high-order allocations,
+			 * it is because sync compaction recently failed. If
+			 * this is the case and the caller requested a THP
+			 * allocation, we do not want to heavily disrupt the
+			 * system, so we fail the allocation instead of entering
+			 * direct reclaim.
+			 */
+			if (compact_result == COMPACT_DEFERRED)
+				goto nopage;
+
+			/*
+			 * Compaction is contended so rather back off than cause
+			 * excessive stalls.
+			 */
+			if (compact_result == COMPACT_CONTENDED)
+				goto nopage;
+
+			/*
+			 * It can become very expensive to allocate transparent
+			 * hugepages at fault, so use asynchronous memory
+			 * compaction for THP unless it is khugepaged trying to
+			 * collapse. All other requests should tolerate at
+			 * least light sync migration.
+			 */
+			if (!(current->flags & PF_KTHREAD))
+				migration_mode = MIGRATE_ASYNC;
+		}
+	}
+
 retry:
 	/* Ensure kswapd doesn't accidentaly go to sleep as long as we loop */
 	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
@@ -3646,55 +3690,33 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
 		goto nopage;

-	/*
-	 * Try direct compaction. The first pass is asynchronous. Subsequent
-	 * attempts after direct reclaim are synchronous
-	 */
+
+	/* Try direct reclaim and then allocating */
+	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
+							&did_some_progress);
+	if (page)
+		goto got_pg;
+
+	/* Try direct compaction and then allocating */
 	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
 					migration_mode,
 					&compact_result);
 	if (page)
 		goto got_pg;

-	/* Checks for THP-specific high-order allocations */
-	if (is_thp_gfp_mask(gfp_mask)) {
-		/*
-		 * If compaction is deferred for high-order allocations, it is
-		 * because sync compaction recently failed. If this is the case
-		 * and the caller requested a THP allocation, we do not want
-		 * to heavily disrupt the system, so we fail the allocation
-		 * instead of entering direct reclaim.
-		 */
-		if (compact_result == COMPACT_DEFERRED)
-			goto nopage;
-
-		/*
-		 * Compaction is contended so rather back off than cause
-		 * excessive stalls.
-		 */
-		if(compact_result == COMPACT_CONTENDED)
-			goto nopage;
-	}
-
 	if (order && compaction_made_progress(compact_result))
 		compaction_retries++;

-	/* Try direct reclaim and then allocating */
-	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
-							&did_some_progress);
-	if (page)
-		goto got_pg;
-
 	/* Do not loop if specifically requested */
 	if (gfp_mask & __GFP_NORETRY)
-		goto noretry;
+		goto nopage;

 	/*
 	 * Do not retry costly high order allocations unless they are
 	 * __GFP_REPEAT
 	 */
 	if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
-		goto noretry;
+		goto nopage;

 	/*
 	 * Costly allocations might have made a progress but this doesn't mean
@@ -3733,25 +3755,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		goto retry;
 	}

-noretry:
-	/*
-	 * High-order allocations do not necessarily loop after direct reclaim
-	 * and reclaim/compaction depends on compaction being called after
-	 * reclaim so call directly if necessary.
-	 * It can become very expensive to allocate transparent hugepages at
-	 * fault, so use asynchronous memory compaction for THP unless it is
-	 * khugepaged trying to collapse. All other requests should tolerate
-	 * at least light sync migration.
-	 */
-	if (is_thp_gfp_mask(gfp_mask) && !(current->flags & PF_KTHREAD))
-		migration_mode = MIGRATE_ASYNC;
-	else
-		migration_mode = MIGRATE_SYNC_LIGHT;
-	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags,
-					    ac, migration_mode,
-					    &compact_result);
-	if (page)
-		goto got_pg;
 nopage:
 	warn_alloc_failed(gfp_mask, order, NULL);
 got_pg:
-- 
2.8.2
