* [RFC] VM batching patch problems?
From: Nick Piggin @ 2004-02-28  0:19 UTC
To: Andrew Morton; +Cc: linux-mm

Hi,
Here are a couple of things I think are wrong with the balancing-batching
patch. Comments? Maybe I'm wrong?

1. kswapd always balances all zones. With your patch, balance_pgdat from
kswapd will continue to balance all zones while any one of them is above
the pages_high watermark. Strictly, this means it puts exactly equal
pressure on all zones (disregarding problem #2). But it somewhat defeats
the purpose of zones, which is to be able to selectively put more pressure
on one area when needed.

Example: 8MB cache in ZONE_DMA, 8000MB cache in ZONE_HIGHMEM, and we'll
assume pages are reclaimed from each zone at the same rate. Now suppose
ZONE_DMA goes 10 pages under the watermark. kswapd will free 10000 pages
from highmem while freeing the 10 ZONE_DMA pages.

2. "batch pressure" is proportional to the total size of the zone. It
should be proportional to the size of the freeable LRU cache.

Example: 1GB ZONE_NORMAL, 1GB ZONE_HIGHMEM. Both will have the same batch
pressure, so the scanners will attempt to free the same amount of LRU cache
per run from each zone. Now say ZONE_NORMAL is filled with 750MB of slab
cache and pinned memory. The 250MB of LRU cache will be scanned at the
same rate as the 1GB of highmem LRU cache.

3. try_to_free_pages is now too lazy or sleepy. This seems to be what is
causing the low-end kbuild problems. There is a significant and very
repeatable performance drop of around 40% when moving from mm2 -> mm3 for
medium and heavy swapping kbuild.

I have a patch that addresses these problems and others. It needs to be a
bit smarter about problem #3 though. My patch gets better -j10 performance
than mm2 but still slightly worse -j15 performance, though still much
better than mm3.
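A small user-space model of the arithmetic in problem #2 makes the claim
concrete. It is only a sketch: the BATCH_SHIFT scaling factor and the
pages-per-megabyte conversion are assumed values for illustration, not
figures taken from the -mm patch.

#include <stdio.h>

#define PAGES_PER_MB	256UL		/* assuming 4K pages */
#define BATCH_SHIFT	6		/* assumed scaling factor */

struct zone_model {
	const char *name;
	unsigned long present_pages;	/* total size of the zone */
	unsigned long lru_pages;	/* freeable LRU cache in the zone */
};

int main(void)
{
	/* 1GB ZONE_NORMAL with ~750MB slab/pinned, 1GB ZONE_HIGHMEM all LRU */
	struct zone_model zones[] = {
		{ "ZONE_NORMAL",  1024 * PAGES_PER_MB,  256 * PAGES_PER_MB },
		{ "ZONE_HIGHMEM", 1024 * PAGES_PER_MB, 1024 * PAGES_PER_MB },
	};

	for (int i = 0; i < 2; i++) {
		/* batch target derived from total zone size (the behaviour
		 * problem #2 complains about) */
		unsigned long batch = zones[i].present_pages >> BATCH_SHIFT;

		/* effective pressure: what fraction of this zone's LRU is
		 * asked for per run */
		printf("%-12s batch=%5lu pages -> 1/%lu of its LRU per run\n",
		       zones[i].name, batch, zones[i].lru_pages / batch);
	}
	return 0;
}

Both zones end up with the same batch target (4096 pages here), but that is
1/16th of ZONE_NORMAL's remaining LRU versus 1/64th of highmem's, i.e.
roughly four times the relative pressure on the smaller LRU.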
* Re: [RFC] VM batching patch problems?
From: Andrew Morton @ 2004-02-28  0:52 UTC
To: Nick Piggin; +Cc: linux-mm

Nick Piggin <piggin@cyberone.com.au> wrote:
>
> Hi,
> Here are a couple of things I think are wrong with the balancing-batching
> patch. Comments? Maybe I'm wrong?
>
>
> 1. kswapd always balances all zones. With your patch, balance_pgdat from
> kswapd will continue to balance all zones while any one of them is above
> the pages_high watermark. Strictly, this means it puts exactly equal
> pressure on all zones (disregarding problem #2). But it somewhat defeats
> the purpose of zones, which is to be able to selectively put more pressure
> on one area when needed.
>
> Example: 8MB cache in ZONE_DMA, 8000MB cache in ZONE_HIGHMEM, and we'll
> assume pages are reclaimed from each zone at the same rate. Now suppose
> ZONE_DMA goes 10 pages under the watermark. kswapd will free 10000 pages
> from highmem while freeing the 10 ZONE_DMA pages.

Maybe. It's still a relatively small amount of the zone.

With an understanding of a problem it should be possible to construct a
testcase which demonstrates that problem, no?

> 2. "batch pressure" is proportional to the total size of the zone. It
> should be proportional to the size of the freeable LRU cache.
>
> Example: 1GB ZONE_NORMAL, 1GB ZONE_HIGHMEM. Both will have the same batch
> pressure, so the scanners will attempt to free the same amount of LRU cache
> per run from each zone. Now say ZONE_NORMAL is filled with 750MB of slab
> cache and pinned memory. The 250MB of LRU cache will be scanned at the
> same rate as the 1GB of highmem LRU cache.

hmm, yes. But the increased ZONE_NORMAL scanning will cause increased slab
reclaim. Bear in mind that the batching does not balance the amount of
scanning across the zones (maybe it should). It balances the amount of
reclaiming across the zones. So we can, if we want, incorporate the number
of reclaimed slab pages into the arithmetic.

> 3. try_to_free_pages is now too lazy or sleepy. This seems to be what is
> causing the low-end kbuild problems.

What makes you think that?

> I have a patch that addresses these problems and others.

Sorry, I dropped most of your patches. I was reorganising things and it
just became impossible to work with them. One of them did eight unrelated
things! It's impossible to select different parts, to understand what the
different parts are doing, to instrument them, etc.

Some of the changes (such as in vm-tune-throttle) looked like random
empirical hacks. Without any description of what they're doing and of what
experimentation/testing/instrumentation went into them, I'm not able to
work with them.

So I picked out a couple of bits which were useful and obvious and dropped
the rest. Please, no more megahumongopatches. One concept per patch.

I probably screwed up a few things and may have lost some bugfixes in the
process.

The one patch which I don't particularly like is
vm-dont-rotate-active-list.patch. The theory makes sense, but all that
mucking with false marker pages is a bit yuk. As I am unable to
demonstrate any improvement from that patch (slight regressions, in fact)
based on admittedly inexhaustive testing, I'd prefer that we be able to
demonstrate a good reason for bringing it back.
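Andrew's suggestion to "incorporate the number of reclaimed slab pages into
the arithmetic" could look something like the toy calculation below. The
numbers and the idea of crediting slab reclaim against the zone's batch
target are assumptions for illustration, not the actual -mm code.

#include <stdio.h>

int main(void)
{
	unsigned long batch      = 4096;	/* reclaim target for the zone, in pages */
	unsigned long slab_freed = 3000;	/* slab pages freed on this zone's behalf */

	/* current arithmetic: only LRU reclaim counts toward the batch */
	unsigned long lru_now = batch;

	/* suggested: slab pages freed for this zone also count, so a
	 * slab-heavy ZONE_NORMAL needs far less LRU scanning per run */
	unsigned long lru_credited = batch > slab_freed ? batch - slab_freed : 0;

	printf("LRU pages still to reclaim: %lu now, %lu with slab credited\n",
	       lru_now, lru_credited);
	return 0;
}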
* Re: [RFC] VM batching patch problems?
From: Nick Piggin @ 2004-02-28  1:39 UTC
To: Andrew Morton; +Cc: linux-mm

Andrew Morton wrote:

> Nick Piggin <piggin@cyberone.com.au> wrote:
>
>> Hi,
>> Here are a couple of things I think are wrong with the balancing-batching
>> patch. Comments? Maybe I'm wrong?
>>
>>
>> 1. kswapd always balances all zones. With your patch, balance_pgdat from
>> kswapd will continue to balance all zones while any one of them is above
>> the pages_high watermark. Strictly, this means it puts exactly equal
>> pressure on all zones (disregarding problem #2). But it somewhat defeats
>> the purpose of zones, which is to be able to selectively put more pressure
>> on one area when needed.
>>
>> Example: 8MB cache in ZONE_DMA, 8000MB cache in ZONE_HIGHMEM, and we'll
>> assume pages are reclaimed from each zone at the same rate. Now suppose
>> ZONE_DMA goes 10 pages under the watermark. kswapd will free 10000 pages
>> from highmem while freeing the 10 ZONE_DMA pages.
>
> Maybe. It's still a relatively small amount of the zone.

But that is with a relatively small amount of ZONE_DMA. Either way, kswapd
will now bring ZONE_DMA above water 1000 times slower than it would if it
didn't need to scan ZONE_HIGHMEM. Doing it this way has 0 benefits that I
can see.

> With an understanding of a problem it should be possible to construct a
> testcase which demonstrates that problem, no?

I can tell you that kswapd will be much less responsive in keeping small
zones with free memory available in the presence of larger zones (eg
ZONE_DMA vs ZONE_NORMAL, or ZONE_NORMAL vs ZONE_HIGHMEM). I guess this
could be measured with a test case: keep track of the total time each zone
is under water.

>> 2. "batch pressure" is proportional to the total size of the zone. It
>> should be proportional to the size of the freeable LRU cache.
>>
>> Example: 1GB ZONE_NORMAL, 1GB ZONE_HIGHMEM. Both will have the same batch
>> pressure, so the scanners will attempt to free the same amount of LRU cache
>> per run from each zone. Now say ZONE_NORMAL is filled with 750MB of slab
>> cache and pinned memory. The 250MB of LRU cache will be scanned at the
>> same rate as the 1GB of highmem LRU cache.
>
> hmm, yes. But the increased ZONE_NORMAL scanning will cause increased slab
> reclaim. Bear in mind that the batching does not balance the amount of
> scanning across the zones (maybe it should). It balances the amount of
> reclaiming across the zones. So we can, if we want, incorporate the number
> of reclaimed slab pages into the arithmetic.

No it doesn't increase ZONE_NORMAL scanning. The scanning is the same rate,
but because ZONE_NORMAL is 1/4 the size, it has quadruple the pressure. If
you don't like logic, just pretend it is pinned by mem_map and other
things. Your batching patch is conceptually wrong and it adds complexity.

And *reclaiming* across zones is a silly notion; it applies more pressure
to zones with hotter pages. You want to balance *scanning*, and reclaim
will take care of itself. This notion is embedded all through the scanner
though, not just in your patch. I have ripped it out.

>> 3. try_to_free_pages is now too lazy or sleepy. This seems to be what is
>> causing the low-end kbuild problems.
>
> What makes you think that?
Because if I make it sleep less and reclaim more, I drop kbuild time from
15 minutes with your scheme to 10 minutes. I don't know what more I can
say. Your patch causes big regressions here in an area that I am trying to
fix up. I am not making this up.

>> I have a patch that addresses these problems and others.
>
> Sorry, I dropped most of your patches. I was reorganising things and it
> just became impossible to work with them. One of them did eight unrelated
> things! It's impossible to select different parts, to understand what the
> different parts are doing, to instrument them, etc.

OK, most of them did one thing, but one did a number of pretty trivial
things which I guess should have been broken out.

> Some of the changes (such as in vm-tune-throttle) looked like random
> empirical hacks. Without any description of what they're doing and of what
> experimentation/testing/instrumentation went into them, I'm not able to
> work with them.

I have been testing and posting results for the last month or two.
vm-tune-throttle is actually one of those patches that does one thing, and
it is documented exactly what it does at the top of the patch. I admit this
is the one patch I wasn't able to measure an improvement from, but I like
it anyway and I'll keep it.

> So I picked out a couple of bits which were useful and obvious and dropped
> the rest. Please, no more megahumongopatches. One concept per patch.
>
> I probably screwed up a few things and may have lost some bugfixes in the
> process.
>
> The one patch which I don't particularly like is
> vm-dont-rotate-active-list.patch. The theory makes sense, but all that
> mucking with false marker pages is a bit yuk. As I am unable to
> demonstrate any improvement from that patch (slight regressions, in fact)
> based on admittedly inexhaustive testing, I'd prefer that we be able to
> demonstrate a good reason for bringing it back.

I already have, multiple times, which is why I sent it to you. I really
have done quite a lot of testing. From the feedback I have got through your
being in your tree I can say I'm on the right track with the patches, so
I'll have to maintain my own rollup.

I don't understand why your tests aren't showing any improvement. I know
better than to question your testing methodology, but I don't believe the
patches are so fragile that they only work for me. In fact, there has been
quite a bit of feedback about how they help in real usage as well as
benchmarks.
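Nick's "balance scanning, not reclaiming" point from the message above can
be illustrated with another small user-space model. The reclaim-efficiency
figures are invented purely for illustration.

#include <stdio.h>

int main(void)
{
	/* assumed fraction of scanned pages that turn out to be reclaimable */
	const char *name[2]  = { "hot zone", "cold zone" };
	double efficiency[2] = { 0.10, 0.50 };
	unsigned long target = 1024;	/* pages per zone per run */

	for (int i = 0; i < 2; i++) {
		/* balance reclaiming: keep scanning until 'target' pages are freed */
		unsigned long scanned = (unsigned long)(target / efficiency[i]);

		/* balance scanning: scan 'target' pages, free whatever falls out */
		unsigned long freed = (unsigned long)(target * efficiency[i]);

		printf("%-9s equal-reclaim scans %5lu pages; "
		       "equal-scan frees %4lu pages\n",
		       name[i], scanned, freed);
	}
	return 0;
}

Forcing equal reclaim makes the zone full of hot pages get scanned five
times as hard in this model, which is the extra pressure on hot zones being
objected to; equal scanning lets the reclaim rate fall out of how
referenced the pages actually are.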
* Re: [RFC] VM batching patch problems?
From: Nick Piggin @ 2004-02-28  1:49 UTC
To: Andrew Morton; +Cc: linux-mm

Nick Piggin wrote:

> I already have, multiple times, which is why I sent it to you.
> I really have done quite a lot of testing. From the feedback I
> have got through your being in your tree I can say I'm on the
> right track with the patches, so I'll have to maintain my own
> rollup.

s/through your being/through them being/
* Re: [RFC] VM batching patch problems?
From: Nick Piggin @ 2004-02-28  2:29 UTC
To: Andrew Morton; +Cc: linux-mm

Nick Piggin wrote:

> No it doesn't increase ZONE_NORMAL scanning. The scanning is the same
> rate, but because ZONE_NORMAL is 1/4 the size, it has quadruple the
> pressure. If you don't like logic, just pretend it is pinned by mem_map
> and other things. Your batching patch is conceptually wrong and it adds
> complexity.

An unfortunate typo. I was supposed to say "if you don't like *that*
logic", not "if you don't like logic"...