linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@techsingularity.net>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Shantanu Goel <sgoel01@yahoo.com>, Chris Mason <clm@fb.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Mel Gorman <mgorman@techsingularity.net>
Subject: [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep
Date: Wed, 15 Feb 2017 09:22:45 +0000	[thread overview]
Message-ID: <20170215092247.15989-2-mgorman@techsingularity.net> (raw)
In-Reply-To: <20170215092247.15989-1-mgorman@techsingularity.net>

From: Shantanu Goel <sgoel01@yahoo.com>

The check in prepare_kswapd_sleep needs to match the one in balance_pgdat
since the latter will return as soon as any one of the zones in the
classzone is above the watermark.  This is specially important for higher
order allocations since balance_pgdat will typically reset the order to
zero relying on compaction to create the higher order pages.  Without this
patch, prepare_kswapd_sleep fails to wake up kcompactd since the zone
balance check fails.

On 4.9.7 kswapd is failing to wake up kcompactd due to a mismatch in the
zone balance check between balance_pgdat() and prepare_kswapd_sleep().
balance_pgdat() returns as soon as a single zone satisfies the allocation
but prepare_kswapd_sleep() requires all zones to do +the same.  This causes
prepare_kswapd_sleep() to never succeed except in the order == 0 case and
consequently, wakeup_kcompactd() is never called.  On my machine prior to
apply this patch, the state of compaction from /proc/vmstat looked this
way after a day and a half +of uptime:

compact_migrate_scanned 240496
compact_free_scanned 76238632
compact_isolated 123472
compact_stall 1791
compact_fail 29
compact_success 1762
compact_daemon_wake 0

After applying the patch and about 10 hours of uptime the state looks
like this:

compact_migrate_scanned 59927299
compact_free_scanned 2021075136
compact_isolated 640926
compact_stall 4
compact_fail 2
compact_success 2
compact_daemon_wake 5160

Further notes from Mel that motivated him to pick this patch up and
resend it;

It was observed for the simoop workload (pressures the VM similar to HADOOP)
that kswapd was failing to keep ahead of direct reclaim. The investigation
noted that there was a need to rationalise kswapd decisions to reclaim
with kswapd decisions to sleep. With this patch on a 2-socket box, there
was a 43% reduction in direct reclaim scanning.

However, the impact otherwise is extremely negative. Kswapd reclaim
efficiency dropped from 98% to 76%. simoop has three latency-related
metrics for read, write and allocation (an anonymous mmap and fault).

                                         4.10.0-rc7            4.10.0-rc7
                                     mmots-20170209           fixcheck-v1
Amean    p50-Read             22325202.49 (  0.00%) 20026926.55 ( 10.29%)
Amean    p95-Read             26102988.80 (  0.00%) 27023360.00 ( -3.53%)
Amean    p99-Read             30935176.53 (  0.00%) 30994432.00 ( -0.19%)
Amean    p50-Write                 976.44 (  0.00%)     1905.28 (-95.12%)
Amean    p95-Write               15471.29 (  0.00%)    36210.09 (-134.05%)
Amean    p99-Write               35108.62 (  0.00%)   479494.96 (-1265.75%)
Amean    p50-Allocation          76382.61 (  0.00%)    87603.20 (-14.69%)
Amean    p95-Allocation         127777.39 (  0.00%)   244491.38 (-91.34%)
Amean    p99-Allocation         187937.39 (  0.00%)  1745237.33 (-828.63%)

There are also more allocation stalls. One of the largest impacts was due
to pages written back from kswapd context rising from 0 pages to 4516642
pages during the hour the workload ran for. By and large, the patch has very
bad behaviour but easily missed as the impact on a UMA machine is negligible.

This patch is included with the data in case a bisection leads to this area.
This patch is also a pre-requisite for the rest of the series.

Signed-off-by: Shantanu Goel <sgoel01@yahoo.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/vmscan.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 26c3b405ef34..92fc66bd52bc 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3140,11 +3140,11 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, int classzone_idx)
 		if (!managed_zone(zone))
 			continue;
 
-		if (!zone_balanced(zone, order, classzone_idx))
-			return false;
+		if (zone_balanced(zone, order, classzone_idx))
+			return true;
 	}
 
-	return true;
+	return false;
 }
 
 /*
-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-02-15  9:22 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-15  9:22 [PATCH 0/3] Reduce amount of time kswapd sleeps prematurely Mel Gorman
2017-02-15  9:22 ` Mel Gorman [this message]
2017-02-16  2:50   ` [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep Hillf Danton
2017-02-22  7:00   ` Minchan Kim
2017-02-23 15:05     ` Mel Gorman
2017-02-24  1:17       ` Minchan Kim
2017-02-24  9:11         ` Mel Gorman
2017-02-27  6:16           ` Minchan Kim
2017-02-15  9:22 ` [PATCH 2/3] mm, vmscan: Only clear pgdat congested/dirty/writeback state when balanced Mel Gorman
2017-02-15  9:22 ` [PATCH 3/3] mm, vmscan: Prevent kswapd sleeping prematurely due to mismatched classzone_idx Mel Gorman
2017-02-16  6:23   ` Hillf Danton
2017-02-16  8:10     ` Mel Gorman
2017-02-16  8:21       ` Hillf Danton
2017-02-16  9:32         ` Mel Gorman
2017-02-20 16:34         ` Vlastimil Babka
2017-02-21  4:10           ` Hillf Danton
2017-02-20 16:42   ` Vlastimil Babka
2017-02-23 15:01     ` Mel Gorman
2017-03-01  9:04       ` Vlastimil Babka
2017-02-15 20:30 ` [PATCH 0/3] Reduce amount of time kswapd sleeps prematurely Andrew Morton
2017-02-15 21:29   ` Mel Gorman
2017-02-15 21:56     ` Andrew Morton
2017-02-15 22:15       ` Mel Gorman
2017-02-15 22:00     ` Vlastimil Babka
2017-03-09  7:56 [PATCH 0/3] Reduce amount of time kswapd sleeps prematurely v2 Mel Gorman
2017-03-09  7:56 ` [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep Mel Gorman
2017-03-10  9:06   ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170215092247.15989-2-mgorman@techsingularity.net \
    --to=mgorman@techsingularity.net \
    --cc=akpm@linux-foundation.org \
    --cc=clm@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=sgoel01@yahoo.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox