* [RFC] VM batching patch problems?
From: Nick Piggin @ 2004-02-28  0:19 UTC
To: Andrew Morton; +Cc: linux-mm

Hi,
Here are a couple of things I think are wrong with the balancing-batching
patch. Comments? Maybe I'm wrong?

1. kswapd always balances all zones. With your patch, balance_pgdat from
kswapd will continue to balance all zones while any one of them is above
the pages_high watermark. Strictly, this means it puts exactly equal
pressure on all zones (disregarding problem #2). But it somewhat defeats
the purpose of zones, which is to be able to selectively put more pressure
on one area when needed.

Example: 8MB cache in ZONE_DMA, 8000MB cache in ZONE_HIGHMEM, and we'll
assume pages are reclaimed from each zone at the same rate. Now suppose
ZONE_DMA goes 10 pages under the watermark. kswapd will free 10000 pages
from highmem while freeing the 10 ZONE_DMA pages.

2. "batch pressure" is proportional to the total size of the zone. It
should be proportional to the size of the freeable LRU cache.

Example: 1GB ZONE_NORMAL, 1GB ZONE_HIGHMEM. Both will have the same batch
pressure, so the scanners will attempt to free the same amount of LRU cache
per run from each zone. Now say ZONE_NORMAL is filled with 750MB of slab
cache and pinned memory. The 250MB of LRU cache will be scanned at the
same rate as the 1GB of highmem LRU cache.

3. try_to_free_pages is now too lazy or sleepy. This seems to be what is
causing the low-end kbuild problems. There is a significant and very
repeatable performance drop of around 40% when moving from mm2 -> mm3 for
medium and heavy swapping kbuild.

I have a patch that addresses these problems and others. It needs to be a
bit smarter about problem #3 though. My patch gets better -j10 performance
than mm2 but still slightly worse -j15 performance, though still much
better than mm3.
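A small user-space model of the arithmetic in problem #2 makes the claim
concrete. It is only a sketch: the BATCH_SHIFT scaling factor and the
pages-per-megabyte conversion are assumed values for illustration, not
figures taken from the -mm patch.

#include <stdio.h>

#define PAGES_PER_MB	256UL		/* assuming 4K pages */
#define BATCH_SHIFT	6		/* assumed scaling factor */

struct zone_model {
	const char *name;
	unsigned long present_pages;	/* total size of the zone */
	unsigned long lru_pages;	/* freeable LRU cache in the zone */
};

int main(void)
{
	/* 1GB ZONE_NORMAL with ~750MB slab/pinned, 1GB ZONE_HIGHMEM all LRU */
	struct zone_model zones[] = {
		{ "ZONE_NORMAL",  1024 * PAGES_PER_MB,  256 * PAGES_PER_MB },
		{ "ZONE_HIGHMEM", 1024 * PAGES_PER_MB, 1024 * PAGES_PER_MB },
	};

	for (int i = 0; i < 2; i++) {
		/* batch target derived from total zone size (the behaviour
		 * problem #2 complains about) */
		unsigned long batch = zones[i].present_pages >> BATCH_SHIFT;

		/* effective pressure: what fraction of this zone's LRU is
		 * asked for per run */
		printf("%-12s batch=%5lu pages -> 1/%lu of its LRU per run\n",
		       zones[i].name, batch, zones[i].lru_pages / batch);
	}
	return 0;
}

Both zones end up with the same batch target (4096 pages here), but that is
1/16th of ZONE_NORMAL's remaining LRU versus 1/64th of highmem's, i.e.
roughly four times the relative pressure on the smaller LRU.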
* Re: [RFC] VM batching patch problems?
From: Andrew Morton @ 2004-02-28  0:52 UTC
To: Nick Piggin; +Cc: linux-mm

Nick Piggin <piggin@cyberone.com.au> wrote:
>
> Hi,
> Here are a couple of things I think are wrong with the balancing-batching
> patch. Comments? Maybe I'm wrong?
>
>
> 1. kswapd always balances all zones. With your patch, balance_pgdat from
> kswapd will continue to balance all zones while any one of them is above
> the pages_high watermark. Strictly, this means it puts exactly equal
> pressure on all zones (disregarding problem #2). But it somewhat defeats
> the purpose of zones, which is to be able to selectively put more pressure
> on one area when needed.
>
> Example: 8MB cache in ZONE_DMA, 8000MB cache in ZONE_HIGHMEM, and we'll
> assume pages are reclaimed from each zone at the same rate. Now suppose
> ZONE_DMA goes 10 pages under the watermark. kswapd will free 10000 pages
> from highmem while freeing the 10 ZONE_DMA pages.

Maybe. It's still a relatively small amount of the zone.

With an understanding of a problem it should be possible to construct a
testcase which demonstrates that problem, no?

> 2. "batch pressure" is proportional to the total size of the zone. It
> should be proportional to the size of the freeable LRU cache.
>
> Example: 1GB ZONE_NORMAL, 1GB ZONE_HIGHMEM. Both will have the same batch
> pressure, so the scanners will attempt to free the same amount of LRU cache
> per run from each zone. Now say ZONE_NORMAL is filled with 750MB of slab
> cache and pinned memory. The 250MB of LRU cache will be scanned at the
> same rate as the 1GB of highmem LRU cache.

hmm, yes. But the increased ZONE_NORMAL scanning will cause increased slab
reclaim. Bear in mind that the batching does not balance the amount of
scanning across the zones (maybe it should). It balances the amount of
reclaiming across the zones. So we can, if we want, incorporate the number
of reclaimed slab pages into the arithmetic.

> 3. try_to_free_pages is now too lazy or sleepy. This seems to be what is
> causing the low-end kbuild problems.

What makes you think that?

> I have a patch that addresses these problems and others.

Sorry, I dropped most of your patches. I was reorganising things and it
just became impossible to work with them. One of them did eight unrelated
things! It's impossible to select different parts, to understand what the
different parts are doing, to instrument them, etc.

Some of the changes (such as in vm-tune-throttle) looked like random
empirical hacks. Without any description of what they're doing and of what
experimentation/testing/instrumentation went into them, I'm not able to
work with them.

So I picked out a couple of bits which were useful and obvious and dropped
the rest. Please, no more megahumongopatches. One concept per patch.

I probably screwed up a few things and may have lost some bugfixes in the
process.

The one patch which I don't particularly like is
vm-dont-rotate-active-list.patch. The theory makes sense, but all that
mucking with false marker pages is a bit yuk. As I am unable to
demonstrate any improvement from that patch (slight regressions, in fact)
based on admittedly inexhaustive testing, I'd prefer that we be able to
demonstrate a good reason for bringing it back.
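Andrew's suggestion to "incorporate the number of reclaimed slab pages into
the arithmetic" could look something like the toy calculation below. The
numbers and the idea of crediting slab reclaim against the zone's batch
target are assumptions for illustration, not the actual -mm code.

#include <stdio.h>

int main(void)
{
	unsigned long batch      = 4096;	/* reclaim target for the zone, in pages */
	unsigned long slab_freed = 3000;	/* slab pages freed on this zone's behalf */

	/* current arithmetic: only LRU reclaim counts toward the batch */
	unsigned long lru_now = batch;

	/* suggested: slab pages freed for this zone also count, so a
	 * slab-heavy ZONE_NORMAL needs far less LRU scanning per run */
	unsigned long lru_credited = batch > slab_freed ? batch - slab_freed : 0;

	printf("LRU pages still to reclaim: %lu now, %lu with slab credited\n",
	       lru_now, lru_credited);
	return 0;
}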
* Re: [RFC] VM batching patch problems?
From: Nick Piggin @ 2004-02-28  1:39 UTC
To: Andrew Morton; +Cc: linux-mm

Andrew Morton wrote:

> Nick Piggin <piggin@cyberone.com.au> wrote:
>
>> Hi,
>> Here are a couple of things I think are wrong with the balancing-batching
>> patch. Comments? Maybe I'm wrong?
>>
>>
>> 1. kswapd always balances all zones. With your patch, balance_pgdat from
>> kswapd will continue to balance all zones while any one of them is above
>> the pages_high watermark. Strictly, this means it puts exactly equal
>> pressure on all zones (disregarding problem #2). But it somewhat defeats
>> the purpose of zones, which is to be able to selectively put more pressure
>> on one area when needed.
>>
>> Example: 8MB cache in ZONE_DMA, 8000MB cache in ZONE_HIGHMEM, and we'll
>> assume pages are reclaimed from each zone at the same rate. Now suppose
>> ZONE_DMA goes 10 pages under the watermark. kswapd will free 10000 pages
>> from highmem while freeing the 10 ZONE_DMA pages.
>
> Maybe. It's still a relatively small amount of the zone.

But that is with a relatively small amount of ZONE_DMA. Either way, kswapd
will now bring ZONE_DMA above water 1000 times slower than it would if it
didn't need to scan ZONE_HIGHMEM. Doing it this way has 0 benefits that I
can see.

> With an understanding of a problem it should be possible to construct a
> testcase which demonstrates that problem, no?

I can tell you that kswapd will be much less responsive in keeping small
zones with free memory available in the presence of larger zones (eg
ZONE_DMA vs ZONE_NORMAL, or ZONE_NORMAL vs ZONE_HIGHMEM). I guess this
could be measured with a test case: keep track of the total time each zone
is under water.

>> 2. "batch pressure" is proportional to the total size of the zone. It
>> should be proportional to the size of the freeable LRU cache.
>>
>> Example: 1GB ZONE_NORMAL, 1GB ZONE_HIGHMEM. Both will have the same batch
>> pressure, so the scanners will attempt to free the same amount of LRU cache
>> per run from each zone. Now say ZONE_NORMAL is filled with 750MB of slab
>> cache and pinned memory. The 250MB of LRU cache will be scanned at the
>> same rate as the 1GB of highmem LRU cache.
>
> hmm, yes. But the increased ZONE_NORMAL scanning will cause increased slab
> reclaim. Bear in mind that the batching does not balance the amount of
> scanning across the zones (maybe it should). It balances the amount of
> reclaiming across the zones. So we can, if we want, incorporate the number
> of reclaimed slab pages into the arithmetic.

No it doesn't increase ZONE_NORMAL scanning. The scanning is the same rate,
but because ZONE_NORMAL is 1/4 the size, it has quadruple the pressure. If
you don't like logic, just pretend it is pinned by mem_map and other
things. Your batching patch is conceptually wrong and it adds complexity.

And *reclaiming* across zones is a silly notion; it applies more pressure
to zones with hotter pages. You want to balance *scanning*, and reclaim
will take care of itself. This notion is embedded all through the scanner
though, not just in your patch. I have ripped it out.

>> 3. try_to_free_pages is now too lazy or sleepy. This seems to be what is
>> causing the low-end kbuild problems.
>
> What makes you think that?
Because if I make it sleep less and reclaim more, I drop kbuild time from
15 minutes with your scheme to 10 minutes. I don't know what more I can
say. Your patch causes big regressions here in an area that I am trying to
fix up. I am not making this up.

>> I have a patch that addresses these problems and others.
>
> Sorry, I dropped most of your patches. I was reorganising things and it
> just became impossible to work with them. One of them did eight unrelated
> things! It's impossible to select different parts, to understand what the
> different parts are doing, to instrument them, etc.

OK, most of them did one thing, but one did a number of pretty trivial
things which I guess should have been broken out.

> Some of the changes (such as in vm-tune-throttle) looked like random
> empirical hacks. Without any description of what they're doing and of what
> experimentation/testing/instrumentation went into them, I'm not able to
> work with them.

I have been testing and posting results for the last month or two.
vm-tune-throttle is actually one of those patches that does one thing, and
it is documented exactly what it does at the top of the patch. I admit this
is the one patch I wasn't able to measure an improvement from, but I like
it anyway and I'll keep it.

> So I picked out a couple of bits which were useful and obvious and dropped
> the rest. Please, no more megahumongopatches. One concept per patch.
>
> I probably screwed up a few things and may have lost some bugfixes in the
> process.
>
> The one patch which I don't particularly like is
> vm-dont-rotate-active-list.patch. The theory makes sense, but all that
> mucking with false marker pages is a bit yuk. As I am unable to
> demonstrate any improvement from that patch (slight regressions, in fact)
> based on admittedly inexhaustive testing, I'd prefer that we be able to
> demonstrate a good reason for bringing it back.

I already have, multiple times, which is why I sent it to you. I really
have done quite a lot of testing. From the feedback I have got through your
being in your tree I can say I'm on the right track with the patches, so
I'll have to maintain my own rollup.

I don't understand why your tests aren't showing any improvement. I know
better than to question your testing methodology, but I don't believe the
patches are so fragile that they only work for me. In fact, there has been
quite a bit of feedback about how they help in real usage as well as
benchmarks.
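Nick's "balance scanning, not reclaiming" point from the message above can
be illustrated with another small user-space model. The reclaim-efficiency
figures are invented purely for illustration.

#include <stdio.h>

int main(void)
{
	/* assumed fraction of scanned pages that turn out to be reclaimable */
	const char *name[2]  = { "hot zone", "cold zone" };
	double efficiency[2] = { 0.10, 0.50 };
	unsigned long target = 1024;	/* pages per zone per run */

	for (int i = 0; i < 2; i++) {
		/* balance reclaiming: keep scanning until 'target' pages are freed */
		unsigned long scanned = (unsigned long)(target / efficiency[i]);

		/* balance scanning: scan 'target' pages, free whatever falls out */
		unsigned long freed = (unsigned long)(target * efficiency[i]);

		printf("%-9s equal-reclaim scans %5lu pages; "
		       "equal-scan frees %4lu pages\n",
		       name[i], scanned, freed);
	}
	return 0;
}

Forcing equal reclaim makes the zone full of hot pages get scanned five
times as hard in this model, which is the extra pressure on hot zones being
objected to; equal scanning lets the reclaim rate fall out of how
referenced the pages actually are.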
* Re: [RFC] VM batching patch problems?
From: Nick Piggin @ 2004-02-28  1:49 UTC
To: Andrew Morton; +Cc: linux-mm

Nick Piggin wrote:

> I already have, multiple times, which is why I sent it to you.
> I really have done quite a lot of testing. From the feedback I
> have got through your being in your tree I can say I'm on the
> right track with the patches, so I'll have to maintain my own
> rollup.

s/through your being/through them being/
* Re: [RFC] VM batching patch problems?
From: Nick Piggin @ 2004-02-28  2:29 UTC
To: Andrew Morton; +Cc: linux-mm

Nick Piggin wrote:

> No it doesn't increase ZONE_NORMAL scanning. The scanning is the same
> rate, but because ZONE_NORMAL is 1/4 the size, it has quadruple the
> pressure. If you don't like logic, just pretend it is pinned by mem_map
> and other things. Your batching patch is conceptually wrong and it adds
> complexity.

An unfortunate typo. I was supposed to say "if you don't like *that*
logic", not "if you don't like logic"...