From: David Rientjes <rientjes@google.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux.com>,
	Shaohua Li <shaohua.li@intel.com>,
	linux-mm@kvack.org
Subject: Re: zone state overhead
Date: Tue, 28 Sep 2010 21:02:59 -0700 (PDT)
Message-ID: <alpine.DEB.2.00.1009282024570.31551@chino.kir.corp.google.com>
In-Reply-To: <20100928133059.GL8187@csn.ul.ie>
On Tue, 28 Sep 2010, Mel Gorman wrote:
> This is true. It's helpful to remember why this patch exists. Under heavy
> memory pressure, large machines run the risk of live-locking because the
> NR_FREE_PAGES gets out of sync. The test case mentioned above is under
> memory pressure so it is potentially at risk. Ordinarily, we would be less
> concerned with performance under heavy memory pressure and more concerned with
> correctness of behaviour. The percpu_drift_mark is set at a point where the
> risk is "real". Lowering it will help performance but increase risk. Reducing
> stat_threshold shifts the cost elsewhere by increasing the frequency the
> vmstat counters are updated which I considered to be worse overall.
>
> Which of these is better or is there an alternative suggestion on how
> this livelock can be avoided?
>
I don't think the risk is quite real based on the calculation of
percpu_drift_mark using the high watermark instead of the min watermark.
For Shaohua's 64 cpu system:
	Node 3, zone   Normal
	  pages free     2055926
	        min      1441
	        low      1801
	        high     2161
	        scanned  0
	        spanned  2097152
	        present  2068480
	    vm stats threshold: 98
It's possible that we'll be 98 pages/cpu * 64 cpus = 6272 pages off in the
NR_FREE_PAGES accounting at any given time. So to avoid depleting memory
reserves down to the min watermark, which is the livelock case, without
unnecessarily spending time doing reclaim, percpu_drift_mark only needs to
be the low watermark plus that drift: 1801 + 6272 = 8073 pages. Instead,
we're currently using the high watermark, so percpu_drift_mark is
2161 + 6272 = 8433 pages.
It's plausible that we never reclaim enough memory to get above the high
watermark, since we only trigger reclaim when we can't allocate above low,
so we may be stuck calling zone_page_state_snapshot() constantly.
I'd be interested to see if this patch helps.
---
diff --git a/mm/vmstat.c b/mm/vmstat.c
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -154,7 +154,7 @@ static void refresh_zone_stat_thresholds(void)
 		tolerate_drift = low_wmark_pages(zone) - min_wmark_pages(zone);
 		max_drift = num_online_cpus() * threshold;
 		if (max_drift > tolerate_drift)
-			zone->percpu_drift_mark = high_wmark_pages(zone) +
+			zone->percpu_drift_mark = low_wmark_pages(zone) +
 					max_drift;
 	}
 }