From: Mel Gorman <mel@csn.ul.ie>
To: Andrew Morton <akpm@linux-foundation.org>,
Frans Pop <elendil@planet.nl>, Jiri Kosina <jkosina@suse.cz>,
Sven Geggus <lists@fuchsschwanzdomain.de>,
Karol Lewandowski <karol.k.lewandowski@gmail.com>,
Tobias Oetiker <tobi@oetiker.ch>
Cc: linux-kernel@vger.kernel.org,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Pekka Enberg <penberg@cs.helsinki.fi>,
Rik van Riel <riel@redhat.com>,
Christoph Lameter <cl@linux-foundation.org>,
Stephan von Krawczynski <skraw@ithnet.com>,
"Rafael J. Wysocki" <rjw@sisk.pl>,
Kernel Testers List <kernel-testers@vger.kernel.org>,
Mel Gorman <mel@csn.ul.ie>
Subject: [PATCH 3/5] page allocator: Wait on both sync and async congestion after direct reclaim
Date: Thu, 12 Nov 2009 19:30:33 +0000
Message-ID: <1258054235-3208-4-git-send-email-mel@csn.ul.ie>
In-Reply-To: <1258054235-3208-1-git-send-email-mel@csn.ul.ie>

Testing by Frans Pop indicated that, in the 2.6.30..2.6.31 window at least,
the commits 373c0a7e 8aa7e847 dramatically increased the number of
GFP_ATOMIC failures that were occurring within a wireless driver. Reverting
the change seemed to help a lot even though it was pointed out that the
congestion changes were very far away from high-order atomic allocations.

The reason the revert makes such a big difference comes down to timing and
how long direct reclaimers wait versus kswapd. With the patch reverted,
the congestion_wait() is on the SYNC queue instead of the ASYNC queue. As a
significant part of the workload involved reads, it makes sense that the
SYNC list is what was truly congested and that, with the revert, processes
were waiting on congestion as expected. Hence, direct reclaimers stalled
properly and kswapd was able to do its job with fewer stalls.

This patch aims to fix the congestion_wait() behaviour for SYNC and ASYNC
for direct reclaimers. Instead of making the congestion_wait() on the SYNC
queue, which would only fix a particular type of workload, this patch adds a
third type of congestion_wait(), BLK_RW_BOTH, which first waits on the ASYNC
queue and then on the SYNC queue if the timeout has not been reached. In
tests, this counter-intuitively results in kswapd stalling less and freeing
up pages, resulting in fewer allocation failures and fewer
direct-reclaim-orientated stalls.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
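
As a rough illustration of the BLK_RW_BOTH behaviour described in the
changelog, the following user-space sketch models how a single timeout
budget is shared between the two queues. It is not kernel code:
model_wait(), wake_after[] and waits_on[] are hypothetical stand-ins for
io_schedule_timeout() and the congestion wait queues, assumed here only
for illustration.

```c
#include <assert.h>

enum { BLK_RW_ASYNC = 0, BLK_RW_SYNC = 1, BLK_RW_BOTH = 2 };

/* Hypothetical model state: jiffies until each queue would be woken,
 * and how often each queue has been waited on. */
static long wake_after[2];
static int waits_on[2];

/* Stand-in for prepare_to_wait() + io_schedule_timeout() + finish_wait().
 * Returns the jiffies left of @timeout, or 0 if the wait timed out. */
static long model_wait(int queue, long timeout)
{
	long used = wake_after[queue] < timeout ? wake_after[queue] : timeout;

	waits_on[queue]++;
	return timeout - used;
}

/* Models the patched congestion_wait(): for BLK_RW_BOTH, wait on ASYNC
 * first, then spend whatever is left of the budget on SYNC. */
static long model_congestion_wait(int sync_request, long timeout)
{
	int sync = (sync_request == BLK_RW_BOTH) ? BLK_RW_ASYNC : sync_request;
	long ret;

	assert(timeout >= 0);
	ret = model_wait(sync, timeout);

	/* Only fall through to the SYNC queue if time remains. */
	if (sync_request == BLK_RW_BOTH && ret)
		ret = model_wait(BLK_RW_SYNC, ret);

	return ret;
}
```

In this model the SYNC queue is only visited when the ASYNC wait returns
early, so the total stall never exceeds the caller's original timeout.
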
include/linux/backing-dev.h | 1 +
mm/backing-dev.c | 25 ++++++++++++++++++++++---
mm/page_alloc.c | 4 ++--
mm/vmscan.c | 2 +-
4 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index b449e73..b35344c 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -276,6 +276,7 @@ static inline int bdi_rw_congested(struct backing_dev_info *bdi)
 enum {
 	BLK_RW_ASYNC = 0,
 	BLK_RW_SYNC = 1,
+	BLK_RW_BOTH = 2,
 };
 void clear_bdi_congested(struct backing_dev_info *bdi, int sync);
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 1065b71..ea9ffc3 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -736,22 +736,41 @@ EXPORT_SYMBOL(set_bdi_congested);
 /**
  * congestion_wait - wait for a backing_dev to become uncongested
- * @sync: SYNC or ASYNC IO
+ * @sync: SYNC, ASYNC or BOTH IO
  * @timeout: timeout in jiffies
  *
  * Waits for up to @timeout jiffies for a backing_dev (any backing_dev) to exit
  * write congestion. If no backing_devs are congested then just wait for the
  * next write to be completed.
  */
-long congestion_wait(int sync, long timeout)
+long congestion_wait(int sync_request, long timeout)
 {
 	long ret;
 	DEFINE_WAIT(wait);
-	wait_queue_head_t *wqh = &congestion_wqh[sync];
+	int sync;
+	wait_queue_head_t *wqh;
+
+	/* If requested to sync both, wait on ASYNC first, then SYNC */
+	if (sync_request == BLK_RW_BOTH)
+		sync = BLK_RW_ASYNC;
+	else
+		sync = sync_request;
+
+again:
+	wqh = &congestion_wqh[sync];
 	prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
 	ret = io_schedule_timeout(timeout);
 	finish_wait(wqh, &wait);
+
+	if (sync_request == BLK_RW_BOTH) {
+		sync_request = 0;
+		sync = BLK_RW_SYNC;
+		timeout = ret;
+		if (timeout)
+			goto again;
+	}
+
 	return ret;
 }
 EXPORT_SYMBOL(congestion_wait);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2bc2ac6..f6ed41c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1727,7 +1727,7 @@ __alloc_pages_high_priority(gfp_t gfp_mask, unsigned int order,
 			preferred_zone, migratetype);
 		if (!page && gfp_mask & __GFP_NOFAIL)
-			congestion_wait(BLK_RW_ASYNC, HZ/50);
+			congestion_wait(BLK_RW_BOTH, HZ/50);
 	} while (!page && (gfp_mask & __GFP_NOFAIL));
 	return page;
@@ -1898,7 +1898,7 @@ rebalance:
 	pages_reclaimed += did_some_progress;
 	if (should_alloc_retry(gfp_mask, order, pages_reclaimed)) {
 		/* Wait for some write requests to complete then retry */
-		congestion_wait(BLK_RW_ASYNC, HZ/50);
+		congestion_wait(BLK_RW_BOTH, HZ/50);
 		goto rebalance;
 	}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 777af57..190bae1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1793,7 +1793,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		/* Take a nap, wait for some writeback to complete */
 		if (sc->nr_scanned && priority < DEF_PRIORITY - 2)
-			congestion_wait(BLK_RW_ASYNC, HZ/10);
+			congestion_wait(BLK_RW_BOTH, HZ/10);
 	}
 	/* top priority shrink_zones still had more to do? don't OOM, then */
 	if (!sc->all_unreclaimable && scanning_global_lru(sc))
--
1.6.5