[PATCH] mm/vmstat: retrieve more accurate vmstat value

linux-mm.kvack.org archive mirror

* [PATCH] mm/vmstat: retrieve more accurate vmstat value
@ 2015-11-24  6:22 Joonsoo Kim
  2015-11-24 15:36 ` Christoph Lameter
  2015-11-25 12:00 ` Michal Hocko
  0 siblings, 2 replies; 16+ messages in thread
From: Joonsoo Kim @ 2015-11-24  6:22 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Christoph Lameter, linux-mm, Joonsoo Kim

When I tested compaction in a low-memory condition, I found that
my benchmark gets stuck in congestion_wait() in shrink_inactive_list().
The stall lasts for 1 sec, after which the task escapes. Further
investigation shows that it is due to a stale vmstat value. vmstat is
updated every 1 sec, so the task is stuck for 1 sec.

I guess that it is caused by the NR_ISOLATED_XXX updates. Direct
reclaim/compaction isolates some pages. After some processing,
they are returned to the lru or freed and NR_ISOLATED_XXX is adjusted, so
it should recover to zero. But it is possible that some
updates are applied to the global counter while others are applied only
to the per-cpu variable. In this case, zone_page_state() returns a stale
value, so the task can get stuck.

This problem can be solved by adjusting zone_page_state() with this
cpu's vmstat diff. It is sub-optimal because a task on another cpu
can still be stuck due to a stale vmstat value but, at least, it solves
some use cases without adding much overhead, so I think that it is worth
doing. With this change, I no longer see any stall in my test.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/vmstat.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 62af0f8..7c84896 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -133,6 +133,9 @@ static inline unsigned long zone_page_state(struct zone *zone,
 {
 	long x = atomic_long_read(&zone->vm_stat[item]);
 #ifdef CONFIG_SMP
+	long diff = this_cpu_read(zone->pageset->vm_stat_diff[item]);
+
+	x += diff;
 	if (x < 0)
 		x = 0;
 #endif
-- 
1.9.1
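
The staleness described in this patch can be reproduced outside the
kernel. The following is a minimal user-space sketch, not kernel code:
struct toy_counter, toy_mod(), toy_read(), toy_read_this_cpu(), NR_CPUS = 4
and THRESHOLD = 32 are all names and values assumed for illustration. It
models one counter as a global value plus per-cpu deltas that are folded
into the global value only when a delta exceeds the threshold.

```c
#include <assert.h>

#define NR_CPUS   4	/* hypothetical cpu count for the model */
#define THRESHOLD 32	/* hypothetical per-cpu batching threshold */

/* Toy model of one vmstat item: a global counter plus per-cpu deltas. */
struct toy_counter {
	long global;		/* stands in for zone->vm_stat[item] */
	long diff[NR_CPUS];	/* stands in for pageset->vm_stat_diff[item] */
};

/*
 * Apply a delta on 'cpu'. The delta is folded into the global counter
 * only once the accumulated local value exceeds the threshold.
 */
static void toy_mod(struct toy_counter *c, int cpu, long delta)
{
	long x;

	assert(cpu >= 0 && cpu < NR_CPUS);
	x = c->diff[cpu] + delta;
	if (x > THRESHOLD || x < -THRESHOLD) {
		c->global += x;
		x = 0;
	}
	c->diff[cpu] = x;
}

/* Read the global value only, as the unpatched helper does. */
static unsigned long toy_read(const struct toy_counter *c)
{
	long x = c->global;

	return x < 0 ? 0 : (unsigned long)x;
}

/* Read the global value plus the reading cpu's pending delta. */
static unsigned long toy_read_this_cpu(const struct toy_counter *c, int cpu)
{
	long x;

	assert(cpu >= 0 && cpu < NR_CPUS);
	x = c->global + c->diff[cpu];
	return x < 0 ? 0 : (unsigned long)x;
}
```

Starting from a zeroed counter, toy_mod(&c, 0, 40), toy_mod(&c, 0, -33)
and toy_mod(&c, 0, -7) leave the true total at 0, yet toy_read(&c) reports
7 while toy_read_this_cpu(&c, 0) reports 0. A reader on cpu 1 still sees 7
in this model, which is the sub-optimal case mentioned in the changelog.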

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-24  6:22 [PATCH] mm/vmstat: retrieve more accurate vmstat value Joonsoo Kim
@ 2015-11-24 15:36 ` Christoph Lameter
  2015-11-25  2:57   ` Joonsoo Kim
  2015-11-25 12:00 ` Michal Hocko
  1 sibling, 1 reply; 16+ messages in thread
From: Christoph Lameter @ 2015-11-24 15:36 UTC (permalink / raw)
  To: Joonsoo Kim; +Cc: Andrew Morton, linux-kernel, linux-mm, Joonsoo Kim

On Tue, 24 Nov 2015, Joonsoo Kim wrote:

> When I tested compaction in low memory condition, I found that
> my benchmark is stuck in congestion_wait() at shrink_inactive_list().
> This stuck last for 1 sec and after then it can escape. More investigation
> shows that it is due to stale vmstat value. vmstat is updated every 1 sec
> so it is stuck for 1 sec.

vmstat values are not designed to be accurate and are not guaranteed to be
accurate. Comparing to specific values should not be done. If you need an
accurate counter then please use another method of accounting like an
atomic.


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-24 15:36 ` Christoph Lameter
@ 2015-11-25  2:57   ` Joonsoo Kim
  2015-11-25 16:04     ` Christoph Lameter
  0 siblings, 1 reply; 16+ messages in thread
From: Joonsoo Kim @ 2015-11-25  2:57 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Andrew Morton, linux-kernel, linux-mm

On Tue, Nov 24, 2015 at 09:36:09AM -0600, Christoph Lameter wrote:
> On Tue, 24 Nov 2015, Joonsoo Kim wrote:
> 
> > When I tested compaction in low memory condition, I found that
> > my benchmark is stuck in congestion_wait() at shrink_inactive_list().
> > This stuck last for 1 sec and after then it can escape. More investigation
> > shows that it is due to stale vmstat value. vmstat is updated every 1 sec
> > so it is stuck for 1 sec.
> 
> vmstat values are not designed to be accurate and are not guaranteed to be
> accurate. Comparing to specific values should not be done. If you need an
> accurate counter then please use another method of accounting like an
> atomic.

I think that maintaining a duplicate counter to guarantee accuracy isn't
a reasonable solution. It would add more overhead to the system.

Although vmstat values aren't designed for accuracy, they are already
used in some sensitive places, so it is better for them to be more
accurate. All this patch does is add the current cpu's diff to the global
value on retrieval in order to get a more accurate value, and this would
not be expensive. I think that it doesn't break any design principle of
vmstat.

Thanks.


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-25  2:57   ` Joonsoo Kim
@ 2015-11-25 16:04     ` Christoph Lameter
  2015-11-25 18:03       ` Michal Hocko
  2015-11-26  1:52       ` Joonsoo Kim
  0 siblings, 2 replies; 16+ messages in thread
From: Christoph Lameter @ 2015-11-25 16:04 UTC (permalink / raw)
  To: Joonsoo Kim; +Cc: Andrew Morton, linux-kernel, linux-mm

On Wed, 25 Nov 2015, Joonsoo Kim wrote:

> I think that maintaining duplicate counter to guarantee accuracy isn't
> reasonable solution. It would cause more overhead to the system.

Simply remove the counter from the vmstat handling and do it differently
then.

> Although vmstat values aren't designed for accuracy, these are already
> used by some sensitive places so it is better to be more accurate.

The design is to sacrifice accuracy and the time the updates occur for
performance reasons. This is not the purpose the counters were designed
for. If you put these demands on the vmstat then you will get complex
convoluted code and compromise performance.

> What this patch does is just adding current cpu's diff to global value
> when retrieving in order to get more accurate value and this would not be
> expensive. I think that it doesn't break any design principle of vmstat.

There have been a number of expectations recently regarding the accuracy
of vmstat. We are on the wrong track here.


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-25 16:04     ` Christoph Lameter
@ 2015-11-25 18:03       ` Michal Hocko
  2015-11-25 18:26         ` Christoph Lameter
  2015-11-26  1:52       ` Joonsoo Kim
  1 sibling, 1 reply; 16+ messages in thread
From: Michal Hocko @ 2015-11-25 18:03 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Joonsoo Kim, Andrew Morton, linux-kernel, linux-mm

On Wed 25-11-15 10:04:44, Christoph Lameter wrote:
> On Wed, 25 Nov 2015, Joonsoo Kim wrote:
> 
> > I think that maintaining duplicate counter to guarantee accuracy isn't
> > reasonable solution. It would cause more overhead to the system.
> 
> Simply remove the counter from the vmstat handling and do it differently
> then.

We definitely do not want yet another set of counters. vmstat counters
are not only exported to userspace. We have in-kernel users
as well. I do agree that there are users who can cope with some level of
imprecision, though, and those which depend on accuracy can use
zone_page_state_snapshot, which doesn't impose any overhead on others.
[...]
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-25 18:03       ` Michal Hocko
@ 2015-11-25 18:26         ` Christoph Lameter
  0 siblings, 0 replies; 16+ messages in thread
From: Christoph Lameter @ 2015-11-25 18:26 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Joonsoo Kim, Andrew Morton, linux-kernel, linux-mm

On Wed, 25 Nov 2015, Michal Hocko wrote:

> > Simply remove the counter from the vmstat handling and do it differently
> > then.
>
> We definitely do not want yet another set of counters. vmstat counters
> are not only to be exported into the userspace. We have in kernel users
> as well. I do agree that there are users who can cope with some level of
> imprecision though and those which depend on the accuracy can use
> zone_page_state_snapshot which doesn't impose any overhead on others.
> [...]

Ok then the proper patch would be to use zone_page_state_snapshot()
instead of zone_page_state() here, rather than modifying zone_page_state().
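
To make the difference between the reads discussed in this thread
concrete, here is a small user-space sketch, not kernel code. The names
struct toy_counter, toy_read(), toy_read_this_cpu(), toy_read_snapshot()
and NR_CPUS = 4 are assumptions for illustration. The snapshot-style read
adds every cpu's pending delta to the global value, which is the idea
behind zone_page_state_snapshot().

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical cpu count for the model */

/* Toy model of one vmstat item: a global counter plus per-cpu deltas. */
struct toy_counter {
	long global;
	long diff[NR_CPUS];
};

/* Cheapest read: the global value only. */
static unsigned long toy_read(const struct toy_counter *c)
{
	assert(c != NULL);
	return c->global < 0 ? 0 : (unsigned long)c->global;
}

/* Global value plus the reading cpu's pending delta. */
static unsigned long toy_read_this_cpu(const struct toy_counter *c, int cpu)
{
	long x;

	assert(c != NULL && cpu >= 0 && cpu < NR_CPUS);
	x = c->global + c->diff[cpu];
	return x < 0 ? 0 : (unsigned long)x;
}

/* Snapshot-style read: global value plus every cpu's pending delta. */
static unsigned long toy_read_snapshot(const struct toy_counter *c)
{
	long x;
	int cpu;

	assert(c != NULL);
	x = c->global;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		x += c->diff[cpu];
	return x < 0 ? 0 : (unsigned long)x;
}
```

With global = 7 and pending deltas of -3 on cpu 0 and -4 on cpu 1, the
three reads from cpu 0 give 7, 4 and 0 respectively. Only the snapshot
read sees the true total, at the cost of touching every cpu's delta.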


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-25 16:04     ` Christoph Lameter
  2015-11-25 18:03       ` Michal Hocko
@ 2015-11-26  1:52       ` Joonsoo Kim
  2015-12-03  4:14         ` Joonsoo Kim
  2016-01-27 23:13         ` David Rientjes
  1 sibling, 2 replies; 16+ messages in thread
From: Joonsoo Kim @ 2015-11-26  1:52 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Andrew Morton, linux-kernel, linux-mm

On Wed, Nov 25, 2015 at 10:04:44AM -0600, Christoph Lameter wrote:
> > Although vmstat values aren't designed for accuracy, these are already
> > used by some sensitive places so it is better to be more accurate.
> 
> The design is to sacrifice accuracy and the time the updates occur for
> performance reasons. This is not the purpose the counters were designed
> for. If you put these demands on the vmstat then you will get complex
> convoluted code and compromise performance.

I understand the design decision, but it is better to get a value that is
as accurate as possible if there is no performance problem. My patch would
not cause much performance degradation because it just adds one
this_cpu_read().

Consider the following example. The current implementation returns
an interesting result if someone does the following.

v1 = zone_page_state(XXX);
mod_zone_page_state(XXX, 1);
v2 = zone_page_state(XXX);

v2 would be the same as v1 in most cases even though we have already
updated it.

This situation can occur in the page allocation path and others. If
a task tries to allocate many pages, the watermark check returns the
same value until vmstat is updated, even if some free pages have been
allocated. There are some adjustments for this imprecision, but why not
make it accurate? I think that this change is a reasonable trade-off.
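
The watermark case can be sketched in user space as follows. This is a
toy model under stated assumptions, not the kernel's watermark code:
THRESHOLD = 32, the single-cpu state in nr_free_global / nr_free_diff and
all toy_* helpers are hypothetical. It counts how many pages a task takes
before a watermark check fails, once with a global-only read and once with
this cpu's delta added.

```c
#include <assert.h>
#include <stdbool.h>

#define THRESHOLD 32		/* hypothetical per-cpu batching threshold */

static long nr_free_global;	/* stands in for the zone-wide free count */
static long nr_free_diff;	/* this cpu's not-yet-folded delta */

/* Take one page; fold the local delta once it passes the threshold. */
static void toy_alloc_page(void)
{
	long x = nr_free_diff - 1;

	if (x < -THRESHOLD) {
		nr_free_global += x;
		x = 0;
	}
	nr_free_diff = x;
}

/* Watermark check against the global value only. */
static bool toy_watermark_ok(long mark)
{
	return nr_free_global > mark;
}

/* Same check with this cpu's pending delta added to the read. */
static bool toy_watermark_ok_this_cpu(long mark)
{
	return nr_free_global + nr_free_diff > mark;
}

/* Allocate while 'ok' passes; return the number of pages taken. */
static int toy_alloc_until_low(long start, long mark, bool (*ok)(long))
{
	int n = 0;

	assert(ok != 0);
	nr_free_global = start;
	nr_free_diff = 0;
	while (ok(mark) && n < 1000) {
		toy_alloc_page();
		n++;
	}
	return n;
}
```

Starting with 100 free pages and a mark of 90, the global-only check lets
33 pages through in this model before it notices, while the check that
includes the local delta stops after 10.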

Thanks.


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-26  1:52       ` Joonsoo Kim
@ 2015-12-03  4:14         ` Joonsoo Kim
  2016-01-27 23:13         ` David Rientjes
  1 sibling, 0 replies; 16+ messages in thread
From: Joonsoo Kim @ 2015-12-03  4:14 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Andrew Morton, linux-kernel, linux-mm

On Thu, Nov 26, 2015 at 10:52:52AM +0900, Joonsoo Kim wrote:
> On Wed, Nov 25, 2015 at 10:04:44AM -0600, Christoph Lameter wrote:
> > > Although vmstat values aren't designed for accuracy, these are already
> > > used by some sensitive places so it is better to be more accurate.
> > 
> > The design is to sacrifice accuracy and the time the updates occur for
> > performance reasons. This is not the purpose the counters were designed
> > for. If you put these demands on the vmstat then you will get complex
> > convoluted code and compromise performance.
> 
> I understand design decision, but, it is better to get value as much
> as accurate if there is no performance problem. My patch would not
> cause much performance degradation because it is just adding one
> this_cpu_read().
> 
> Consider about following example. Current implementation returns
> interesting output if someone do following things.
> 
> v1 = zone_page_state(XXX);
> mod_zone_page_state(XXX, 1);
> v2 = zone_page_state(XXX);
> 
> v2 would be same with v1 in most of cases even if we already update
> it.
> 
> This situation could occurs in page allocation path and others. If
> some task try to allocate many pages, then watermark check returns
> same values until updating vmstat even if some freepage are allocated.
> There are some adjustments for this imprecision but why not do it become
> accurate? I think that this change is reasonable trade-off.
> 
Christoph, any comment?

Thanks.


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-26  1:52       ` Joonsoo Kim
  2015-12-03  4:14         ` Joonsoo Kim
@ 2016-01-27 23:13         ` David Rientjes
  2016-01-28  5:08           ` Joonsoo Kim
  1 sibling, 1 reply; 16+ messages in thread
From: David Rientjes @ 2016-01-27 23:13 UTC (permalink / raw)
  To: Joonsoo Kim; +Cc: Christoph Lameter, Andrew Morton, linux-kernel, linux-mm

On Thu, 26 Nov 2015, Joonsoo Kim wrote:

> I understand design decision, but, it is better to get value as much
> as accurate if there is no performance problem. My patch would not
> cause much performance degradation because it is just adding one
> this_cpu_read().
> 
> Consider about following example. Current implementation returns
> interesting output if someone do following things.
> 
> v1 = zone_page_state(XXX);
> mod_zone_page_state(XXX, 1);
> v2 = zone_page_state(XXX);
> 
> v2 would be same with v1 in most of cases even if we already update
> it.
> 
> This situation could occurs in page allocation path and others. If
> some task try to allocate many pages, then watermark check returns
> same values until updating vmstat even if some freepage are allocated.
> There are some adjustments for this imprecision but why not do it become
> accurate? I think that this change is reasonable trade-off.
> 

I'm not sure that NR_ISOLATED_* should be vmstats in the first place.  The
most important callers that depend on their accuracy are
zone_reclaimable_pages() and the too_many_isolated() loop in both
shrink_inactive_list() and memory compaction.  If zlc's are updated every
1s, the HZ/10 in those loops doesn't really matter, it may as well be
HZ/2.

I think memory compaction updates the counters in the most appropriate 
way, by incrementing a counter and then finally doing 
mod_zone_page_state() for the counter.  The other updaters are thp 
collapse and page migration.

I discount user-visible vmstats here because the trade-off has already 
been made that they may be stale for up to 1s and userspace isn't 
affected.

So what happens if we simply convert NR_ISOLATED_* into per-zone 
atomic64_t?
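
As a rough illustration of this alternative, the sketch below keeps the
isolated-page count in a single shared atomic so that every update is
visible to all readers immediately. It is user-space C11 rather than
kernel code: atomic_llong stands in for atomic64_t, and struct toy_zone
and the toy_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical zone with an exact, shared isolated-page counter. */
struct toy_zone {
	atomic_llong nr_isolated;
};

static void toy_zone_init(struct toy_zone *z)
{
	atomic_init(&z->nr_isolated, 0);
}

/* Account pages taken off the lru; no per-cpu batching is involved. */
static void toy_isolate(struct toy_zone *z, long long nr)
{
	assert(nr >= 0);
	atomic_fetch_add(&z->nr_isolated, nr);
}

/* Account pages put back on the lru or freed. */
static void toy_putback(struct toy_zone *z, long long nr)
{
	assert(nr >= 0);
	atomic_fetch_sub(&z->nr_isolated, nr);
}

/* Exact read: there are no pending per-cpu deltas to miss. */
static long long toy_nr_isolated(struct toy_zone *z)
{
	return atomic_load(&z->nr_isolated);
}
```

After toy_isolate(&z, 40), toy_putback(&z, 33) and toy_putback(&z, 7), the
read returns 0 with no delay. The trade-off is that every update writes to
one shared location, which is the cost the per-cpu vmstat scheme is
designed to avoid.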


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2016-01-27 23:13         ` David Rientjes
@ 2016-01-28  5:08           ` Joonsoo Kim
  0 siblings, 0 replies; 16+ messages in thread
From: Joonsoo Kim @ 2016-01-28  5:08 UTC (permalink / raw)
  To: David Rientjes; +Cc: Christoph Lameter, Andrew Morton, linux-kernel, linux-mm

On Wed, Jan 27, 2016 at 03:13:12PM -0800, David Rientjes wrote:
> On Thu, 26 Nov 2015, Joonsoo Kim wrote:
> 
> > I understand design decision, but, it is better to get value as much
> > as accurate if there is no performance problem. My patch would not
> > cause much performance degradation because it is just adding one
> > this_cpu_read().
> > 
> > Consider about following example. Current implementation returns
> > interesting output if someone do following things.
> > 
> > v1 = zone_page_state(XXX);
> > mod_zone_page_state(XXX, 1);
> > v2 = zone_page_state(XXX);
> > 
> > v2 would be same with v1 in most of cases even if we already update
> > it.
> > 
> > This situation could occurs in page allocation path and others. If
> > some task try to allocate many pages, then watermark check returns
> > same values until updating vmstat even if some freepage are allocated.
> > There are some adjustments for this imprecision but why not do it become
> > accurate? I think that this change is reasonable trade-off.
> > 
> 
> I'm not sure that NR_ISOLATED_* should be vmstats in the first place.  The 
> most important callers that depend on its accuracy is 
> zone_reclaimable_pages() and the too_many_isolated() loop in both 
> shrink_inactive_list() and memory compaction.  If zlc's are updated every 
> 1s, the HZ/10 in those loops don't really matter, they may as well be 
> HZ/2.
> 
> I think memory compaction updates the counters in the most appropriate 
> way, by incrementing a counter and then finally doing 
> mod_zone_page_state() for the counter.  The other updaters are thp 
> collapse and page migration.
> 
> I discount user-visible vmstats here because the trade-off has already 
> been made that they may be stale for up to 1s and userspace isn't 
> affected.
> 
> So what happens if we simply convert NR_ISOLATED_* into per-zone 
> atomic64_t?

One small thing that makes me uncomfortable is that the calculation would
be done with different kinds of metrics. For example, comparing vmstat
values (NR_INACTIVE_*, NR_ACTIVE_*) with a per-zone atomic NR_ISOLATED_*
looks ugly and error-prone because their accuracy is different.

Thanks.


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-24  6:22 [PATCH] mm/vmstat: retrieve more accurate vmstat value Joonsoo Kim
  2015-11-24 15:36 ` Christoph Lameter
@ 2015-11-25 12:00 ` Michal Hocko
  2015-11-25 13:43   ` Vlastimil Babka
  2015-11-26  1:56   ` Joonsoo Kim
  1 sibling, 2 replies; 16+ messages in thread
From: Michal Hocko @ 2015-11-25 12:00 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, linux-kernel, Christoph Lameter, linux-mm, Joonsoo Kim

On Tue 24-11-15 15:22:03, Joonsoo Kim wrote:
> When I tested compaction in low memory condition, I found that
> my benchmark is stuck in congestion_wait() at shrink_inactive_list().
> This stuck last for 1 sec and after then it can escape. More investigation
> shows that it is due to stale vmstat value. vmstat is updated every 1 sec
> so it is stuck for 1 sec.

Wouldn't it be sufficient to use zone_page_state_snapshot in
too_many_isolated?
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-25 12:00 ` Michal Hocko
@ 2015-11-25 13:43   ` Vlastimil Babka
  2015-11-25 13:47     ` Michal Hocko
  2015-11-26  1:56   ` Joonsoo Kim
  1 sibling, 1 reply; 16+ messages in thread
From: Vlastimil Babka @ 2015-11-25 13:43 UTC (permalink / raw)
  To: Michal Hocko, Joonsoo Kim
  Cc: Andrew Morton, linux-kernel, Christoph Lameter, linux-mm, Joonsoo Kim

On 11/25/2015 01:00 PM, Michal Hocko wrote:
> On Tue 24-11-15 15:22:03, Joonsoo Kim wrote:
>> When I tested compaction in low memory condition, I found that
>> my benchmark is stuck in congestion_wait() at shrink_inactive_list().
>> This stuck last for 1 sec and after then it can escape. More investigation
>> shows that it is due to stale vmstat value. vmstat is updated every 1 sec
>> so it is stuck for 1 sec.
> 
> Wouldn't it be sufficient to use zone_page_state_snapshot in
> too_many_isolated?

That sounds better than the ad-hoc half-solution, yeah.
I don't know how performance sensitive the callers are, but maybe it could do a
non-snapshot check first, and only repeat with _snapshot when it's about to wait
(the result is true), just to make sure?

OTOH, how big an issue is this? I suspect the system has been genuinely
too_many_isolated(), or very close, in order to hit the condition in the first
place, and the inaccuracy just delays the recovery a bit?
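
The two-stage check suggested above might look roughly like the following
user-space sketch. It is not the kernel's too_many_isolated(): struct
toy_counter, NR_CPUS = 4 and the toy_* helpers are assumptions, and the
comparison of isolated against inactive pages is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical cpu count for the model */

/* Toy model of one vmstat item: a global counter plus per-cpu deltas. */
struct toy_counter {
	long global;
	long diff[NR_CPUS];
};

/* Cheap read: the global value only. */
static unsigned long toy_read(const struct toy_counter *c)
{
	assert(c != NULL);
	return c->global < 0 ? 0 : (unsigned long)c->global;
}

/* Precise read: global value plus every cpu's pending delta. */
static unsigned long toy_read_snapshot(const struct toy_counter *c)
{
	long x;
	int cpu;

	assert(c != NULL);
	x = c->global;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		x += c->diff[cpu];
	return x < 0 ? 0 : (unsigned long)x;
}

/* Returns true only if the caller should really wait. */
static bool toy_too_many_isolated(const struct toy_counter *isolated,
				  const struct toy_counter *inactive)
{
	/* Stage 1: cheap check, enough for the common "no" answer. */
	if (toy_read(isolated) <= toy_read(inactive))
		return false;

	/* Stage 2: about to wait, so confirm with the precise read. */
	return toy_read_snapshot(isolated) > toy_read_snapshot(inactive);
}
```

With an isolated counter whose global value is a stale 7 but whose per-cpu
deltas sum to -7, the cheap check alone would ask the caller to wait,
while the second stage sees the real value of 0 and lets it proceed.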


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-25 13:43   ` Vlastimil Babka
@ 2015-11-25 13:47     ` Michal Hocko
  0 siblings, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2015-11-25 13:47 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Joonsoo Kim, Andrew Morton, linux-kernel, Christoph Lameter,
	linux-mm, Joonsoo Kim

On Wed 25-11-15 14:43:38, Vlastimil Babka wrote:
> On 11/25/2015 01:00 PM, Michal Hocko wrote:
> > On Tue 24-11-15 15:22:03, Joonsoo Kim wrote:
> >> When I tested compaction in low memory condition, I found that
> >> my benchmark is stuck in congestion_wait() at shrink_inactive_list().
> >> This stuck last for 1 sec and after then it can escape. More investigation
> >> shows that it is due to stale vmstat value. vmstat is updated every 1 sec
> >> so it is stuck for 1 sec.
> > 
> > Wouldn't it be sufficient to use zone_page_state_snapshot in
> > too_many_isolated?
> 
> That sounds better than the ad-hoc half-solution, yeah.
> I don't know how performance sensitive the callers are, but maybe it could do a
> non-snapshot check first, and only repeat with _snapshot when it's about to wait
> (the result is true), just to make sure?

I am not sure this is worth bothering with. We are in reclaim, which is
not a hot path.

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-25 12:00 ` Michal Hocko
  2015-11-25 13:43   ` Vlastimil Babka
@ 2015-11-26  1:56   ` Joonsoo Kim
  2015-11-26  5:03     ` vinayak menon
  2015-11-26 15:03     ` Michal Hocko
  1 sibling, 2 replies; 16+ messages in thread
From: Joonsoo Kim @ 2015-11-26  1:56 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, linux-kernel, Christoph Lameter, linux-mm

On Wed, Nov 25, 2015 at 01:00:22PM +0100, Michal Hocko wrote:
> On Tue 24-11-15 15:22:03, Joonsoo Kim wrote:
> > When I tested compaction in low memory condition, I found that
> > my benchmark is stuck in congestion_wait() at shrink_inactive_list().
> > This stuck last for 1 sec and after then it can escape. More investigation
> > shows that it is due to stale vmstat value. vmstat is updated every 1 sec
> > so it is stuck for 1 sec.
> 
> Wouldn't it be sufficient to use zone_page_state_snapshot in
> too_many_isolated?

Yes, it would work in this case. But, I prefer this patch because
all zone_page_state() users get this benefit.

Thanks.


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-26  1:56   ` Joonsoo Kim
@ 2015-11-26  5:03     ` vinayak menon
  2015-11-26 15:03     ` Michal Hocko
  1 sibling, 0 replies; 16+ messages in thread
From: vinayak menon @ 2015-11-26  5:03 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Michal Hocko, Andrew Morton, linux-kernel, Christoph Lameter, linux-mm

On Thu, Nov 26, 2015 at 7:26 AM, Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:
> On Wed, Nov 25, 2015 at 01:00:22PM +0100, Michal Hocko wrote:
>> On Tue 24-11-15 15:22:03, Joonsoo Kim wrote:
>> > When I tested compaction in low memory condition, I found that
>> > my benchmark is stuck in congestion_wait() at shrink_inactive_list().
>> > This stuck last for 1 sec and after then it can escape. More investigation
>> > shows that it is due to stale vmstat value. vmstat is updated every 1 sec
>> > so it is stuck for 1 sec.
>>
>> Wouldn't it be sufficient to use zone_page_state_snapshot in
>> too_many_isolated?
>
This was done by this patch I believe,
http://lkml.iu.edu/hypermail/linux/kernel/1501.2/00001.html, though
the original  issue (wait of more than 1 sec) was fixed by the vmstat
changes.


* Re: [PATCH] mm/vmstat: retrieve more accurate vmstat value
  2015-11-26  1:56   ` Joonsoo Kim
  2015-11-26  5:03     ` vinayak menon
@ 2015-11-26 15:03     ` Michal Hocko
  1 sibling, 0 replies; 16+ messages in thread
From: Michal Hocko @ 2015-11-26 15:03 UTC (permalink / raw)
  To: Joonsoo Kim; +Cc: Andrew Morton, linux-kernel, Christoph Lameter, linux-mm

On Thu 26-11-15 10:56:12, Joonsoo Kim wrote:
> On Wed, Nov 25, 2015 at 01:00:22PM +0100, Michal Hocko wrote:
> > On Tue 24-11-15 15:22:03, Joonsoo Kim wrote:
> > > When I tested compaction in low memory condition, I found that
> > > my benchmark is stuck in congestion_wait() at shrink_inactive_list().
> > > This stuck last for 1 sec and after then it can escape. More investigation
> > > shows that it is due to stale vmstat value. vmstat is updated every 1 sec
> > > so it is stuck for 1 sec.
> > 
> > Wouldn't it be sufficient to use zone_page_state_snapshot in
> > too_many_isolated?
> 
> Yes, it would work in this case. But, I prefer this patch because
> all zone_page_state() users get this benefit.

Just to make it clear, I am not against your patch in general. I am just
not sure it would help in the too_many_isolated case, where a significant
drift might occur on remote cpus as well, so I am not really sure it is
appropriate for the issue you are seeing.
-- 
Michal Hocko
SUSE Labs


Thread overview: 16+ messages
2015-11-24  6:22 [PATCH] mm/vmstat: retrieve more accurate vmstat value Joonsoo Kim
2015-11-24 15:36 ` Christoph Lameter
2015-11-25  2:57   ` Joonsoo Kim
2015-11-25 16:04     ` Christoph Lameter
2015-11-25 18:03       ` Michal Hocko
2015-11-25 18:26         ` Christoph Lameter
2015-11-26  1:52       ` Joonsoo Kim
2015-12-03  4:14         ` Joonsoo Kim
2016-01-27 23:13         ` David Rientjes
2016-01-28  5:08           ` Joonsoo Kim
2015-11-25 12:00 ` Michal Hocko
2015-11-25 13:43   ` Vlastimil Babka
2015-11-25 13:47     ` Michal Hocko
2015-11-26  1:56   ` Joonsoo Kim
2015-11-26  5:03     ` vinayak menon
2015-11-26 15:03     ` Michal Hocko
