linux-mm.kvack.org archive mirror
* Re: kswapd eating too much CPU on ac16/ac18
@ 2000-06-17 19:43 Cesar Eduardo Barros
  2000-06-17 21:34 ` Roger Larsson
  0 siblings, 1 reply; 16+ messages in thread
From: Cesar Eduardo Barros @ 2000-06-17 19:43 UTC (permalink / raw)
  To: Roger Larsson; +Cc: 'linux-mm@kvack.org', linux-kernel

> Please try to remove only this test to get a comparable result.

I nuked the whole block:

                /*
                 * Page is from a zone we don't care about.
                 * Don't drop page cache entries in vain.
                 */
                if (page->zone->free_pages > page->zone->pages_high) {
                        /* the page from the wrong zone doesn't count */
                        count++;
                        goto unlock_continue;
                }

Commenting it out made ac19 perform almost as well as ac4 (it looked a bit
faster).

I don't know how it would affect boxes with more than one zone, but my gut
feeling is that it won't hurt and might make them even a bit faster.

-- 
Cesar Eduardo Barros
cesarb@nitnet.com.br
cesarb@dcc.ufrj.br
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/


* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-17 19:43 kswapd eating too much CPU on ac16/ac18 Cesar Eduardo Barros
@ 2000-06-17 21:34 ` Roger Larsson
  0 siblings, 0 replies; 16+ messages in thread
From: Roger Larsson @ 2000-06-17 21:34 UTC (permalink / raw)
  To: Cesar Eduardo Barros; +Cc: linux-mm, linux-kernel

Hi,

I asked you to remove it because there are two problems related to this
code snippet (reported earlier on linux-mm):

* If no zone is under pressure, we will loop "forever" since no pages
  will pass this test. (On a 16 MB machine this is the likely scenario.)

* If no pages from a zone under pressure are on the LRU, we will
  loop...

And since there is no guarantee that shrink_mmap is not called in these
circumstances...

I have released patches (on linux-mm) that try to handle these
situations:
* do_try_to_free_pages avoids calling shrink_mmap when there is no pressure.
* shrink_mmap tries to detect the bad situation (not in my latest patch).
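
The two failure modes listed above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not the
kernel's code: the LRU is modeled as an array scanned round-robin, the
names scan_lru, zone_under_pressure and any_zone_under_pressure are
hypothetical, and max_steps exists only so the model terminates. The
budget refund on a skipped page mirrors the count++ in the block quoted
below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct zone { long free_pages; long pages_high; };
struct page { struct zone *zone; };

/* A zone is "under pressure" when free_pages is not above pages_high. */
static int zone_under_pressure(const struct zone *z)
{
	return z->free_pages <= z->pages_high;
}

/*
 * Scan the modeled LRU with a budget of `count` pages.
 * If skip_wrong_zone is set, a page whose zone has enough free memory
 * is skipped and the budget is refunded, so it "doesn't count".
 * max_steps is a safety cap that the real loop does not have.
 * Returns the number of pages examined; *freed gets the pages reclaimed.
 */
static long scan_lru(struct page *lru, size_t n, long count,
		     int skip_wrong_zone, long max_steps, long *freed)
{
	long steps = 0;
	size_t i = 0;

	assert(n > 0 && count >= 0 && max_steps >= 0);
	*freed = 0;
	while (count > 0 && steps < max_steps) {
		struct page *page = &lru[i];

		i = (i + 1) % n;
		steps++;
		count--;
		if (skip_wrong_zone && !zone_under_pressure(page->zone)) {
			count++;	/* refund: budget never shrinks */
			continue;
		}
		(*freed)++;		/* pretend the page was reclaimed */
	}
	return steps;
}

/* Guard in the spirit of "do not call shrink_mmap with no pressure". */
static int any_zone_under_pressure(const struct zone *zones, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (zone_under_pressure(&zones[i]))
			return 1;
	return 0;
}
```

In this model, a single zone with plenty of free pages makes the scan run
until the safety cap without freeing anything, and the guard would have
prevented the call. The second case (a pressured zone with no pages on the
LRU) passes the guard but still spins, which matches the need for a
separate check inside the scan itself.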

/RogerL


Cesar Eduardo Barros wrote:
> 
> > Please try to remove only this test to get a comparable result.
> 
> I nuked the whole block:
> 
>                 /*
>                  * Page is from a zone we don't care about.
>                  * Don't drop page cache entries in vain.
>                  */
>                 if (page->zone->free_pages > page->zone->pages_high) {
>                         /* the page from the wrong zone doesn't count */
>                         count++;
>                         goto unlock_continue;
>                 }
> 
> Commenting it out made ac19 perform almost as well as ac4 (it looked a bit
> faster).
> 
> I don't know how it would affect boxes with more than one zone, but my gut
> feeling is that it won't hurt and might make them even a bit faster.
> 
> --
> Cesar Eduardo Barros
> cesarb@nitnet.com.br
> cesarb@dcc.ufrj.br

--
Home page:
  http://www.norran.net/nra02596/

* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-17 15:33         ` Rik van Riel
@ 2000-06-19 21:22           ` Goswin Brederlow
  0 siblings, 0 replies; 16+ messages in thread
From: Goswin Brederlow @ 2000-06-19 21:22 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Cesar Eduardo Barros, Mike Galbraith, Alan Cox, linux-kernel, linux-mm

>>>>> " " == Rik van Riel <riel@conectiva.com.br> writes:

     > I think the phenomenon you're seeing is not at all related to
     > deferred/non-deferred swapout. That doesn't have anything to do
     > with kswapd CPU usage.

     > The changed feedback loop in do_try_to_free_pages, however may
     > have something to do with this. It works well on machines with
     > more than 1 memory zone, but I can envision it breaking on
     > machines with just one zone...

     > I'm thinking of a way to fix this cleanly, I'll keep you
     > posted.

I have two boxes with 2.4.0-test1 kernels:

First one is a Celeron 466 with 128 MB RAM:
BIOS-provided physical RAM map:
 e820: 000000000009f000 @ 0000000000000000 (usable)
 e820: 0000000007f00000 @ 0000000000100000 (usable)
On node 0 totalpages: 32768
zone(0): 4096 pages.
zone(1): 28672 pages.
zone(2): 0 pages.

Second one is a P120 with 16 MB RAM (probably in one zone, but it's not within
reach at the moment).
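
The zone sizes in the boot log above follow from simple arithmetic. The
sketch below assumes 4 KiB pages and a 16 MB ZONE_DMA limit, as on i386,
and ignores highmem and any reserved holes; split_zones and zone_split are
hypothetical names used only for illustration.

```c
#include <assert.h>

#define PAGE_SIZE_BYTES	4096UL			/* i386 page size */
#define DMA_LIMIT_BYTES	(16UL * 1024 * 1024)	/* ISA DMA boundary */

struct zone_split {
	unsigned long dma_pages;	/* zone(0) */
	unsigned long normal_pages;	/* zone(1) */
};

/* Split a RAM size into DMA and normal zone page counts. */
static struct zone_split split_zones(unsigned long ram_bytes)
{
	struct zone_split s;
	unsigned long total = ram_bytes / PAGE_SIZE_BYTES;
	unsigned long dma_max = DMA_LIMIT_BYTES / PAGE_SIZE_BYTES;

	assert(ram_bytes % PAGE_SIZE_BYTES == 0);
	s.dma_pages = total < dma_max ? total : dma_max;
	s.normal_pages = total - s.dma_pages;
	return s;
}
```

Under these assumptions a 128 MB box yields 4096 + 28672 pages, matching
the log, while a 16 MB or 8 MB box fits entirely into the first zone, which
is consistent with the "probably in one zone" guess.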

On the Celeron, 2.4.0-test1 runs fine (responsiveness is a bit low, but
kswapd usage is fine).

On the P120, kswapd needs 95-99% CPU time and the system is really,
really slow. I tested plain 2.4.0-test1 through 2.4.0-test1-ac19 with
various steps in between. The disk behaviour (how often the IDE LED blinks)
differs and the amount of swap used is different, but kswapd always
uses all the CPU time.

This could really be a "number of zones" problem, so please think about
it.

MfG
	Goswin

PS: I will add a zone mapping for the P120 when I get to it next time.

* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-16 15:08     ` Rik van Riel
  2000-06-17  3:05       ` Cesar Eduardo Barros
@ 2000-06-18  6:26       ` Mike Galbraith
  1 sibling, 0 replies; 16+ messages in thread
From: Mike Galbraith @ 2000-06-18  6:26 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Alan Cox, Cesar Eduardo Barros, linux-kernel, linux-mm, Roger Larsson

On Fri, 16 Jun 2000, Rik van Riel wrote:

> On Fri, 16 Jun 2000, Mike Galbraith wrote:
> > On Wed, 14 Jun 2000, Alan Cox wrote:
> > 
> > > I'm interested to know if ac9/ac10 is the slow->fast change point
> > 
> > ac5 is definitely the breaking point.  ac5 doesn't survive make
> > -j30.. starts swinging its VM machete at everything in sight.
> > Reversing the VM changes to ac4 restores throughput to test1
> > levels (11 minute build vs 21-26 minutes for everything
> > forward).
> > 
> > Exact tested reversals below.  FWIW, page aging doesn't seem to
> > be the problem.  I disabled that in ac17 and saw zero
> > difference.  (What may or may not be a hint is that the /* Let
> > shrink_mmap handle this swapout. */ bit in vmscan.c does make a
> > consistent difference.  Reverting that bit alone takes a minimum
> > of 4 minutes off build time)
> 
> Interesting. Not delaying the swapout IO completely broke
> performance under the tests I did here...
> 
> Delayed swapout vs. non-delayed swapouts was the difference
> between 300 swapouts/s vs. 700 swapouts/s  (under a load
> with 400 swapins/s).
> 
> OTOH, I can imagine it being better if you have a very small
> LRU cache, something like less than 1/2 MB.

Removing only the hunk identified by Roger Larsson brought ac20 performance
beyond 99-pre5 :)  Reverting deferred swap also no longer helps at all
and in fact hurts slightly (30 sec difference on make -j30 build times)

	-Mike

(shoot.. if it kicks butt now, I wonder what adding Juan's patch will do:)


* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-17  3:05       ` Cesar Eduardo Barros
  2000-06-17  4:04         ` Mike Galbraith
@ 2000-06-17 15:33         ` Rik van Riel
  2000-06-19 21:22           ` Goswin Brederlow
  1 sibling, 1 reply; 16+ messages in thread
From: Rik van Riel @ 2000-06-17 15:33 UTC (permalink / raw)
  To: Cesar Eduardo Barros; +Cc: Mike Galbraith, Alan Cox, linux-kernel, linux-mm

On Sat, 17 Jun 2000, Cesar Eduardo Barros wrote:
> On Fri, Jun 16, 2000 at 12:08:06PM -0300, Rik van Riel wrote:
> > On Fri, 16 Jun 2000, Mike Galbraith wrote:
> > > On Wed, 14 Jun 2000, Alan Cox wrote:
> > > 
> > > > I'm interested to know if ac9/ac10 is the slow->fast change point
> > > 
> > > ac5 is definitely the breaking point.  ac5 doesn't survive make
> > > -j30.. starts swinging its VM machete at everything in sight.
> > > Reversing the VM changes to ac4 restores throughput to test1
> > > levels (11 minute build vs 21-26 minutes for everything
> > > forward).
> > > 
> > > Exact tested reversals below.  FWIW, page aging doesn't seem to
> > > be the problem.  I disabled that in ac17 and saw zero
> > > difference.  (What may or may not be a hint is that the /* Let
> > > shrink_mmap handle this swapout. */ bit in vmscan.c does make a
> > > consistent difference.  Reverting that bit alone takes a minimum
> > > of 4 minutes off build time)
> > 
> > Interesting. Not delaying the swapout IO completely broke
> > performance under the tests I did here...
> > 
> > Delayed swapout vs. non-delayed swapouts was the difference
> > between 300 swapouts/s vs. 700 swapouts/s  (under a load
> > with 400 swapins/s).
> 
> I can understand it... When you wake up kswapd you need more
> memory, and if you don't free it you will be called again. And
> again. And again. (leaf is a slow box; both top and vmstat eat
> 20% CPU each with 1 second updates all the time). So it does
> waste more time.
> 
> With ac4 I get the HDD light full on during the worst moments;
> with ac16/18 it just sits there in kswapd and the light blinks
> at about 1Hz.

I think the phenomenon you're seeing is not at all related
to deferred/non-deferred swapout. That doesn't have anything
to do with kswapd CPU usage.

The changed feedback loop in do_try_to_free_pages, however
may have something to do with this. It works well on machines
with more than 1 memory zone, but I can envision it breaking
on machines with just one zone...

I'm thinking of a way to fix this cleanly, I'll keep you posted.

> > OTOH, I can imagine it being better if you have a very small
> > LRU cache, something like less than 1/2 MB.
> 
> You can imagine it being better in some random rare condition I
> don't care about. People have been noticing speed problems
> related to kswapd. This is not microkernel research.

Please read my email before flaming. I am telling you I can
imagine non-deferred swapout (like what we had before) being
better when you have very little LRU cache, like on 8MB machines.

But now that you've told me you're not interested in 8MB machines
and value a flamewar more than a nicely running Linux box.  ;))

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

Wanna talk about the kernel?  irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/		http://www.surriel.com/


* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-17 14:06           ` Cesar Eduardo Barros
@ 2000-06-17 15:25             ` Mike Galbraith
  0 siblings, 0 replies; 16+ messages in thread
From: Mike Galbraith @ 2000-06-17 15:25 UTC (permalink / raw)
  To: Cesar Eduardo Barros; +Cc: Rik van Riel, Alan Cox, linux-kernel, linux-mm

On Sat, 17 Jun 2000, Cesar Eduardo Barros wrote:

> On Sat, Jun 17, 2000 at 06:04:21AM +0200, Mike Galbraith wrote:
> > On Sat, 17 Jun 2000, Cesar Eduardo Barros wrote:
> > 
> > > > OTOH, I can imagine it being better if you have a very small
> > > > LRU cache, something like less than 1/2 MB.
> > > 
> > > You can imagine it being better in some random rare condition I don't care
> > > about. People have been noticing speed problems related to kswapd. This is not
> > > microkernel research.
> > 
> > ahem.
> > 
> > If you can do better, please do.   If not, give the man the feedback
> > he needs to find/fix the problems and spare us such useless comments.
> 
> I gave the feedback before the part you quoted. What's the problem with adding
> useless comments in the end of a message?

...nah.

> Let's not start a war here, EOT.

War?  I don't have time for that.  Your comment merely sounded excessively
snide and non-informative for my taste, so I thought I'd mention that fact.

EOT.

	-Mike


* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-17  4:04         ` Mike Galbraith
  2000-06-17 14:06           ` Cesar Eduardo Barros
@ 2000-06-17 15:23           ` Rik van Riel
  1 sibling, 0 replies; 16+ messages in thread
From: Rik van Riel @ 2000-06-17 15:23 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Cesar Eduardo Barros, Alan Cox, linux-kernel, linux-mm

On Sat, 17 Jun 2000, Mike Galbraith wrote:
> On Sat, 17 Jun 2000, Cesar Eduardo Barros wrote:
> 
> > > OTOH, I can imagine it being better if you have a very small
> > > LRU cache, something like less than 1/2 MB.
> > 
> > You can imagine it being better in some random rare condition I don't care
> > about. People have been noticing speed problems related to kswapd. This is not
> > microkernel research.
> 
> ahem.
> 
> If you can do better, please do.   If not, give the man the feedback
> he needs to find/fix the problems and spare us such useless comments.

Nah, all he wrote down was that I shouldn't care about his
situation because he doesn't care about it either ;)

(read the thread carefully ... this is just about what he
said)

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

Wanna talk about the kernel?  irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/		http://www.surriel.com/


* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-17  4:04         ` Mike Galbraith
@ 2000-06-17 14:06           ` Cesar Eduardo Barros
  2000-06-17 15:25             ` Mike Galbraith
  2000-06-17 15:23           ` Rik van Riel
  1 sibling, 1 reply; 16+ messages in thread
From: Cesar Eduardo Barros @ 2000-06-17 14:06 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Cesar Eduardo Barros, Rik van Riel, Alan Cox, linux-kernel, linux-mm

On Sat, Jun 17, 2000 at 06:04:21AM +0200, Mike Galbraith wrote:
> On Sat, 17 Jun 2000, Cesar Eduardo Barros wrote:
> 
> > > OTOH, I can imagine it being better if you have a very small
> > > LRU cache, something like less than 1/2 MB.
> > 
> > You can imagine it being better in some random rare condition I don't care
> > about. People have been noticing speed problems related to kswapd. This is not
> > microkernel research.
> 
> ahem.
> 
> If you can do better, please do.   If not, give the man the feedback
> he needs to find/fix the problems and spare us such useless comments.

I gave the feedback before the part you quoted. What's the problem with adding
useless comments in the end of a message?

Let's not start a war here, EOT.

-- 
Cesar Eduardo Barros
cesarb@nitnet.com.br
cesarb@dcc.ufrj.br

* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-17  3:05       ` Cesar Eduardo Barros
@ 2000-06-17  4:04         ` Mike Galbraith
  2000-06-17 14:06           ` Cesar Eduardo Barros
  2000-06-17 15:23           ` Rik van Riel
  2000-06-17 15:33         ` Rik van Riel
  1 sibling, 2 replies; 16+ messages in thread
From: Mike Galbraith @ 2000-06-17  4:04 UTC (permalink / raw)
  To: Cesar Eduardo Barros; +Cc: Rik van Riel, Alan Cox, linux-kernel, linux-mm

On Sat, 17 Jun 2000, Cesar Eduardo Barros wrote:

> > OTOH, I can imagine it being better if you have a very small
> > LRU cache, something like less than 1/2 MB.
> 
> You can imagine it being better in some random rare condition I don't care
> about. People have been noticing speed problems related to kswapd. This is not
> microkernel research.

ahem.

If you can do better, please do.   If not, give the man the feedback
he needs to find/fix the problems and spare us such useless comments.

	-Mike


* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-16 15:08     ` Rik van Riel
@ 2000-06-17  3:05       ` Cesar Eduardo Barros
  2000-06-17  4:04         ` Mike Galbraith
  2000-06-17 15:33         ` Rik van Riel
  2000-06-18  6:26       ` Mike Galbraith
  1 sibling, 2 replies; 16+ messages in thread
From: Cesar Eduardo Barros @ 2000-06-17  3:05 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Mike Galbraith, Alan Cox, Cesar Eduardo Barros, linux-kernel, linux-mm

On Fri, Jun 16, 2000 at 12:08:06PM -0300, Rik van Riel wrote:
> On Fri, 16 Jun 2000, Mike Galbraith wrote:
> > On Wed, 14 Jun 2000, Alan Cox wrote:
> > 
> > > I'm interested to know if ac9/ac10 is the slow->fast change point
> > 
> > ac5 is definitely the breaking point.  ac5 doesn't survive make
> > -j30.. starts swinging its VM machete at everything in sight.
> > Reversing the VM changes to ac4 restores throughput to test1
> > levels (11 minute build vs 21-26 minutes for everything
> > forward).
> > 
> > Exact tested reversals below.  FWIW, page aging doesn't seem to
> > be the problem.  I disabled that in ac17 and saw zero
> > difference.  (What may or may not be a hint is that the /* Let
> > shrink_mmap handle this swapout. */ bit in vmscan.c does make a
> > consistent difference.  Reverting that bit alone takes a minimum
> > of 4 minutes off build time)
> 
> Interesting. Not delaying the swapout IO completely broke
> performance under the tests I did here...
> 
> Delayed swapout vs. non-delayed swapouts was the difference
> between 300 swapouts/s vs. 700 swapouts/s  (under a load
> with 400 swapins/s).

I can understand it... When you wake up kswapd you need more memory, and if you
don't free it you will be called again. And again. And again. (leaf is a slow
box; both top and vmstat eat 20% CPU each with 1 second updates all the time).
So it does waste more time.

Worst case (dpkg --install) in ac4 gets kswapd at about 5%, which, considering
that top or vmstat use 20%, is low enough. It also gets more throughput because
it has no need to waste time thinking.

With ac4 I get the HDD light full on during the worst moments; with ac16/18 it
just sits there in kswapd and the light blinks at about 1Hz.

> OTOH, I can imagine it being better if you have a very small
> LRU cache, something like less than 1/2 MB.

You can imagine it being better in some random rare condition I don't care
about. People have been noticing speed problems related to kswapd. This is not
microkernel research.

-- 
Cesar Eduardo Barros
cesarb@nitnet.com.br
cesarb@dcc.ufrj.br

* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-16  5:45   ` Mike Galbraith
@ 2000-06-16 15:08     ` Rik van Riel
  2000-06-17  3:05       ` Cesar Eduardo Barros
  2000-06-18  6:26       ` Mike Galbraith
  0 siblings, 2 replies; 16+ messages in thread
From: Rik van Riel @ 2000-06-16 15:08 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Alan Cox, Cesar Eduardo Barros, linux-kernel, linux-mm

On Fri, 16 Jun 2000, Mike Galbraith wrote:
> On Wed, 14 Jun 2000, Alan Cox wrote:
> 
> > I'm interested to know if ac9/ac10 is the slow->fast change point
> 
> ac5 is definitely the breaking point.  ac5 doesn't survive make
> -j30.. starts swinging its VM machete at everything in sight.
> Reversing the VM changes to ac4 restores throughput to test1
> levels (11 minute build vs 21-26 minutes for everything
> forward).
> 
> Exact tested reversals below.  FWIW, page aging doesn't seem to
> be the problem.  I disabled that in ac17 and saw zero
> difference.  (What may or may not be a hint is that the /* Let
> shrink_mmap handle this swapout. */ bit in vmscan.c does make a
> consistent difference.  Reverting that bit alone takes a minimum
> of 4 minutes off build time)

Interesting. Not delaying the swapout IO completely broke
performance under the tests I did here...

Delayed swapout vs. non-delayed swapouts was the difference
between 300 swapouts/s vs. 700 swapouts/s  (under a load
with 400 swapins/s).

OTOH, I can imagine it being better if you have a very small
LRU cache, something like less than 1/2 MB.

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

Wanna talk about the kernel?  irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/		http://www.surriel.com/


* Re: kswapd eating too much CPU on ac16/ac18
@ 2000-06-16  9:56 Roger Larsson
  0 siblings, 0 replies; 16+ messages in thread
From: Roger Larsson @ 2000-06-16  9:56 UTC (permalink / raw)
  To: 'mikeg@weiden.de'; +Cc: 'linux-mm@kvack.org'

Ohh,

This code was new at that time...

I have found out that most pages are not freed due to this check.
See "instrumentation patch for shrink_mmap to find cause of failures - it did :-)"

Please try to remove only this test to get a comparable result.
It might lead to infinite loops...

/RogerL

@@ -317,28 +326,34 @@
                        goto cache_unlock_continue;
 
                /*
+                * Page is from a zone we don't care about.
+                * Don't drop page cache entries in vain.
+                */
+               if (page->zone->free_pages > page->zone->pages_high)
+                       goto cache_unlock_continue;


* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-14  0:00 ` Alan Cox
  2000-06-14  0:10   ` Cesar Eduardo Barros
@ 2000-06-16  5:45   ` Mike Galbraith
  2000-06-16 15:08     ` Rik van Riel
  1 sibling, 1 reply; 16+ messages in thread
From: Mike Galbraith @ 2000-06-16  5:45 UTC (permalink / raw)
  To: Alan Cox; +Cc: Rik van Riel, Cesar Eduardo Barros, linux-kernel, linux-mm

On Wed, 14 Jun 2000, Alan Cox wrote:

> > ac4 was faster than ever, it looked like it wasn't swapping at all
> > 
> > ac16 and ac18 are both awful, dpkg takes an infinite time, all of it dominated
> 
> I'm interested to know if ac9/ac10 is the slow->fast change point

ac5 is definitely the breaking point.  ac5 doesn't survive make -j30..
starts swinging its VM machete at everything in sight.  Reversing the
VM changes to ac4 restores throughput to test1 levels (11 minute build
vs 21-26 minutes for everything forward).

Exact tested reversals below.  FWIW, page aging doesn't seem to be the
problem.  I disabled that in ac17 and saw zero difference.  (What may or
may not be a hint is that the /* Let shrink_mmap handle this swapout. */ bit
in vmscan.c does make a consistent difference.  Reverting that bit alone
takes a minimum of 4 minutes off build time)

	-Mike

diff -urN linux-2.4.0-ac4/include/linux/mm.h linux-2.4.0-ac5/include/linux/mm.h
--- linux-2.4.0-ac4/include/linux/mm.h	Fri Jun 16 06:09:48 2000
+++ linux-2.4.0-ac5/include/linux/mm.h	Fri Jun 16 06:17:29 2000
@@ -153,6 +153,7 @@
 	struct buffer_head * buffers;
 	unsigned long virtual; /* nonzero if kmapped */
 	struct zone_struct *zone;
+	unsigned int age;
 } mem_map_t;
 
 #define get_page(p)		atomic_inc(&(p)->count)
@@ -169,7 +170,7 @@
 #define PG_dirty		 4
 #define PG_decr_after		 5
 #define PG_unused_01		 6
-#define PG__unused_02		 7
+#define PG_active		 7
 #define PG_slab			 8
 #define PG_swap_cache		 9
 #define PG_skip			10
@@ -185,6 +186,7 @@
 #define ClearPageUptodate(page)	clear_bit(PG_uptodate, &(page)->flags)
 #define PageDirty(page)		test_bit(PG_dirty, &(page)->flags)
 #define SetPageDirty(page)	set_bit(PG_dirty, &(page)->flags)
+#define ClearPageDirty(page)	clear_bit(PG_dirty, &(page)->flags)
 #define PageLocked(page)	test_bit(PG_locked, &(page)->flags)
 #define LockPage(page)		set_bit(PG_locked, &(page)->flags)
 #define TryLockPage(page)	test_and_set_bit(PG_locked, &(page)->flags)
@@ -192,6 +194,9 @@
 					clear_bit(PG_locked, &(page)->flags); \
 					wake_up(&page->wait); \
 				} while (0)
+#define PageActive(page)	test_bit(PG_active, &(page)->flags)
+#define SetPageActive(page)	set_bit(PG_active, &(page)->flags)
+#define ClearPageActive(page)	clear_bit(PG_active, &(page)->flags)
 #define PageError(page)		test_bit(PG_error, &(page)->flags)
 #define SetPageError(page)	set_bit(PG_error, &(page)->flags)
 #define ClearPageError(page)	clear_bit(PG_error, &(page)->flags)
diff -urN linux-2.4.0-ac4/include/linux/swap.h linux-2.4.0-ac5/include/linux/swap.h
--- linux-2.4.0-ac4/include/linux/swap.h	Wed Jun 14 11:52:13 2000
+++ linux-2.4.0-ac5/include/linux/swap.h	Fri Jun 16 06:17:30 2000
@@ -168,12 +168,15 @@
 	spin_lock(&pagemap_lru_lock);		\
 	list_add(&(page)->lru, &lru_cache);	\
 	nr_lru_pages++;				\
+	page->age = 2;				\
+	SetPageActive(page);			\
 	spin_unlock(&pagemap_lru_lock);		\
 } while (0)
 
 #define	__lru_cache_del(page)			\
 do {						\
 	list_del(&(page)->lru);			\
+	ClearPageActive(page);			\
 	nr_lru_pages--;				\
 } while (0)
 
diff -urN linux-2.4.0-ac4/mm/filemap.c linux-2.4.0-ac5/mm/filemap.c
--- linux-2.4.0-ac4/mm/filemap.c	Wed May 24 06:23:09 2000
+++ linux-2.4.0-ac5/mm/filemap.c	Fri Jun 16 06:15:32 2000
@@ -264,7 +264,16 @@
 		page = list_entry(page_lru, struct page, lru);
 		list_del(page_lru);
 
-		if (PageTestandClearReferenced(page))
+		if (PageTestandClearReferenced(page)) {
+			page->age += 3;
+			if (page->age > 10)
+				page->age = 10;
+			goto dispose_continue;
+		}
+		if (page->age)
+			page->age--;
+
+		if (page->age)
 			goto dispose_continue;
 
 		count--;
@@ -317,28 +326,34 @@
 			goto cache_unlock_continue;
 
 		/*
+		 * Page is from a zone we don't care about.
+		 * Don't drop page cache entries in vain.
+		 */
+		if (page->zone->free_pages > page->zone->pages_high)
+			goto cache_unlock_continue;
+
+		/*
 		 * Is it a page swap page? If so, we want to
 		 * drop it if it is no longer used, even if it
 		 * were to be marked referenced..
 		 */
 		if (PageSwapCache(page)) {
-			spin_unlock(&pagecache_lock);
-			__delete_from_swap_cache(page);
-			goto made_inode_progress;
-		}	
-
-		/*
-		 * Page is from a zone we don't care about.
-		 * Don't drop page cache entries in vain.
-		 */
-		if (page->zone->free_pages > page->zone->pages_high)
+			if (!PageDirty(page)) {
+				spin_unlock(&pagecache_lock);
+				__delete_from_swap_cache(page);
+				goto made_inode_progress;
+			}
+			/* PageDeferswap -> we swap out the page now. */
+			if (gfp_mask & __GFP_IO)
+				goto async_swap;
 			goto cache_unlock_continue;
+		}
 
 		/* is it a page-cache page? */
 		if (page->mapping) {
 			if (!PageDirty(page) && !pgcache_under_min()) {
-				__remove_inode_page(page);
 				spin_unlock(&pagecache_lock);
+				__remove_inode_page(page);
 				goto made_inode_progress;
 			}
 			goto cache_unlock_continue;
@@ -351,6 +366,14 @@
 unlock_continue:
 		spin_lock(&pagemap_lru_lock);
 		UnlockPage(page);
+		page_cache_release(page);
+		goto dispose_continue;
+async_swap:
+		spin_unlock(&pagecache_lock);
+		/* Do NOT unlock the page ... that is done after IO. */
+		ClearPageDirty(page);
+		rw_swap_page(WRITE, page, 0);
+		spin_lock(&pagemap_lru_lock);
 		page_cache_release(page);
 dispose_continue:
 		list_add(page_lru, &lru_cache);
diff -urN linux-2.4.0-ac4/mm/page_alloc.c linux-2.4.0-ac5/mm/page_alloc.c
--- linux-2.4.0-ac4/mm/page_alloc.c	Sat May 13 07:12:42 2000
+++ linux-2.4.0-ac5/mm/page_alloc.c	Fri Jun 16 06:15:32 2000
@@ -93,6 +93,8 @@
 		BUG();
 	if (PageDecrAfter(page))
 		BUG();
+	if (PageDirty(page))
+		BUG();
 
 	zone = page->zone;
 
diff -urN linux-2.4.0-ac4/mm/swap_state.c linux-2.4.0-ac5/mm/swap_state.c
--- linux-2.4.0-ac4/mm/swap_state.c	Wed May 24 06:23:09 2000
+++ linux-2.4.0-ac5/mm/swap_state.c	Fri Jun 16 06:15:32 2000
@@ -73,6 +73,7 @@
 		PAGE_BUG(page);
 
 	PageClearSwapCache(page);
+	ClearPageDirty(page);
 	remove_inode_page(page);
 }
 
diff -urN linux-2.4.0-ac4/mm/vmscan.c linux-2.4.0-ac5/mm/vmscan.c
--- linux-2.4.0-ac4/mm/vmscan.c	Wed May 24 06:23:09 2000
+++ linux-2.4.0-ac5/mm/vmscan.c	Fri Jun 16 06:15:32 2000
@@ -62,6 +62,10 @@
 		goto out_failed;
 	}
 
+	/* Can only do this if we age all active pages. */
+	if (PageActive(page) && page->age > 1)
+		goto out_failed;
+
 	if (TryLockPage(page))
 		goto out_failed;
 
@@ -74,6 +78,8 @@
 	 * memory, and we should just continue our scan.
 	 */
 	if (PageSwapCache(page)) {
+		if (pte_dirty(pte))
+			SetPageDirty(page);
 		entry.val = page->index;
 		swap_duplicate(entry);
 		set_pte(page_table, swp_entry_to_pte(entry));
@@ -181,7 +187,10 @@
 	vmlist_access_unlock(vma->vm_mm);
 
 	/* OK, do a physical asynchronous write to swap.  */
-	rw_swap_page(WRITE, page, 0);
+	// rw_swap_page(WRITE, page, 0);
+	/* Let shrink_mmap handle this swapout. */
+	SetPageDirty(page);
+	UnlockPage(page);
 
 out_free_success:
 	page_cache_release(page);
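
The page-aging hunk in mm/filemap.c above can be read as a per-page rule:
a referenced page gains age (capped), an unreferenced page loses age, and
only a page whose age has reached zero goes on to the reclaim checks. The
following user-space sketch restates that rule under simplifying
assumptions; struct aged_page, age_page and passes_until_candidate are
hypothetical names, and the constants are taken from the diff (new pages
start at age 2, a reference adds 3, the cap is 10).

```c
#include <assert.h>

#define AGE_START	2	/* set when a page is added to the LRU */
#define AGE_BUMP	3	/* added when the page was referenced */
#define AGE_MAX		10	/* upper bound on page age */

struct aged_page {
	unsigned int age;
	int referenced;
};

/*
 * Apply one scan pass to a page.
 * Returns 1 if the page becomes a reclaim candidate, 0 if it is kept.
 */
static int age_page(struct aged_page *p)
{
	assert(p);
	if (p->referenced) {
		p->referenced = 0;	/* test and clear the referenced bit */
		p->age += AGE_BUMP;
		if (p->age > AGE_MAX)
			p->age = AGE_MAX;
		return 0;
	}
	if (p->age)
		p->age--;
	return p->age == 0;
}

/* Count scan passes until a page that is never touched again is a candidate. */
static int passes_until_candidate(struct aged_page p)
{
	int passes = 0;

	do {
		passes++;
	} while (!age_page(&p));
	return passes;
}
```

Under this reading, a fresh page that is never referenced survives one
pass and becomes a candidate on the second, while a single reference buys
several extra passes. That extra scanning per page is one plausible reason
aging could cost CPU on a small LRU, although the thread reports that
disabling aging made no measurable difference.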



* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-14  0:00 ` Alan Cox
@ 2000-06-14  0:10   ` Cesar Eduardo Barros
  2000-06-16  5:45   ` Mike Galbraith
  1 sibling, 0 replies; 16+ messages in thread
From: Cesar Eduardo Barros @ 2000-06-14  0:10 UTC (permalink / raw)
  To: Alan Cox; +Cc: Cesar Eduardo Barros, linux-kernel, linux-mm

On Wed, Jun 14, 2000 at 01:00:09AM +0100, Alan Cox wrote:
> > ac4 was faster than ever, it looked like it wasn't swapping at all
> > 
> > ac16 and ac18 are both awful, dpkg takes an infinite time, all of it dominated
> 
> I'm interested to know if ac9/ac10 is the slow->fast change point
> 

I didn't compile that... I jumped from ac4 to ac16. Maybe I'll compile it
tomorrow. Maybe later (exams).

-- 
Cesar Eduardo Barros
cesarb@nitnet.com.br
cesarb@dcc.ufrj.br

* Re: kswapd eating too much CPU on ac16/ac18
  2000-06-13 23:51 Cesar Eduardo Barros
@ 2000-06-14  0:00 ` Alan Cox
  2000-06-14  0:10   ` Cesar Eduardo Barros
  2000-06-16  5:45   ` Mike Galbraith
  0 siblings, 2 replies; 16+ messages in thread
From: Alan Cox @ 2000-06-14  0:00 UTC (permalink / raw)
  To: Cesar Eduardo Barros; +Cc: linux-kernel, linux-mm

> ac4 was faster than ever, it looked like it wasn't swapping at all
> 
> ac16 and ac18 are both awful, dpkg takes an infinite time, all of it dominated

I'm interested to know if ac9/ac10 is the slow->fast change point


* kswapd eating too much CPU on ac16/ac18
@ 2000-06-13 23:51 Cesar Eduardo Barros
  2000-06-14  0:00 ` Alan Cox
  0 siblings, 1 reply; 16+ messages in thread
From: Cesar Eduardo Barros @ 2000-06-13 23:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-mm

I've seen this behavior both in ac16 and in ac18. ac4 worked fine (and was the
fastest kernel I've ever seen on that box)

The box is a 386SX25, with 8MB RAM. The problem is that kswapd eats 99.0% of
the CPU while running dpkg (I also made it happen with X). dpkg uses 10MB of
memory in a particularly awful access pattern (so it swaps a lot).

ac4 was faster than ever, it looked like it wasn't swapping at all

ac16 and ac18 are both awful, dpkg takes an infinite time, all of it dominated
by kswapd (running top -s and vmstat 1 at the same time). When the problem
happens everything seems to hang (vmstat lumps some seconds into one, as I can
see in the interrupt count), no disk activity happens (as if it was lost
thinking what to do next), and on the next update I can see kswapd ate an awful
amount of CPU (ok, top eats 20% CPU on that box, but why would ac4 remain
pretty responsive when ac16/ac18 grind to a halt?)

It's not zone related (only 8MB of memory)

To reproduce: use mem=8M (or use a box like mine ;) ) and run dpkg --list (or
even better, try to install something using dpkg)

I think new VM ideas should always be tested with mem=8M and a dpkg run...

-- 
Cesar Eduardo Barros
cesarb@nitnet.com.br
cesarb@dcc.ufrj.br

Thread overview: 16+ messages
2000-06-17 19:43 kswapd eating too much CPU on ac16/ac18 Cesar Eduardo Barros
2000-06-17 21:34 ` Roger Larsson
  -- strict thread matches above, loose matches on Subject: below --
2000-06-16  9:56 Roger Larsson
2000-06-13 23:51 Cesar Eduardo Barros
2000-06-14  0:00 ` Alan Cox
2000-06-14  0:10   ` Cesar Eduardo Barros
2000-06-16  5:45   ` Mike Galbraith
2000-06-16 15:08     ` Rik van Riel
2000-06-17  3:05       ` Cesar Eduardo Barros
2000-06-17  4:04         ` Mike Galbraith
2000-06-17 14:06           ` Cesar Eduardo Barros
2000-06-17 15:25             ` Mike Galbraith
2000-06-17 15:23           ` Rik van Riel
2000-06-17 15:33         ` Rik van Riel
2000-06-19 21:22           ` Goswin Brederlow
2000-06-18  6:26       ` Mike Galbraith
