linux-mm.kvack.org archive mirror
* Summary of recent VM behavior [2.3.99-pre8]
@ 2000-05-14  9:48 Craig Kulesa
  2000-05-18 10:17 ` PATCH: Possible solution to VM problems (take 2) Craig Kulesa
  0 siblings, 1 reply; 13+ messages in thread
From: Craig Kulesa @ 2000-05-14  9:48 UTC (permalink / raw)
  To: linux-mm, linux-kernel


Greetings...

Below is a summary of the issues I've encountered in the pre7 and pre8
kernels (at least on mid-range hardware).  I'd appreciate comments, any
enlightening information or pointers to documentation so I can answer the
questions myself. :) Also consider me a guinea pig for patches... 


1)  Unnecessary OOM situations, killing of processes
    (pathological)

Example:  On a 64 MB box, dd'ing >64 MB from /dev/zero to a file
on disk runs the kernel aground, usually killing a large RSS process 
like X11. This has been a consistent problem since pre6(-7?). This
behavior seems quite broken.  

I assume this is in the mmap code.  Cache increases as the file is written
but when the limit of physical memory is reached, problems ensue.  The CPU
is consumed ("hijacked") by kswapd or other internal kernel operations; as
though mmap'ed allocations can't be shrunk effectively (or quickly).

Not a problem w/ classzone.


2)  What's in the cache anyways?
    (puzzling)

Example: Play mp3's on an otherwise unloaded 64 MB system until cache
fills the rest of physical RAM. Then open an xterm (or GNU emacs,
or...).  After less than 10 MB of mp3 data goes by, close the
xterm. Open a new one. The xterm code is not in cache but is loaded from
scratch from disk, with a flurry of disk I/O (but no swapped pages). 
Why? The cache allocation is almost 50 MB -- *why* isn't it in there
somewhere?

One might imagine that the previous mp3's are solidly in cache, but
loading an mp3 that played only 15 MB earlier in the queue comes from
disk and not from cache!  Why?

Another example on a 40 MB system: Open a lightweight X11/WindowMaker
session. Open Netscape 4.72 (Navigator). Close it. Log out. Login again,
load Netscape. X, the window manager, and Netscape all seem to come
straight from disk, with no swapped pages.  But the buffer cache is 
25 MB!  What's in there if the applications aren't? 

This is also seen on a 32 MB system by simply opening Navigator, closing
it, and opening it again. In kernel 2.2.xx and 2.3.99-pre5 (or with
classzone), it comes quickly out of cache.  In pre8, there's substantial
disk I/O, and about half of the pages are read from disk and not the
cache.  (??)

Before pre6 and with AA's classzone patch, a 25 MB cache seemed to contain
the "last" 25 MB of mmap'd files or I/O buffers. This doesn't seem true
anymore (?!), and it's an impediment to performance on at least
lower-end hardware.


3) Slow I/O performance

Disk access seems to incur large CPU overhead once physical memory must be
shared between "application" memory and cache.  kswapd is invoked
excessively, applications that stream data from disk hesitate, even the
mouse pointer becomes jumpy. The system load is ~50% higher under heavy
disk access than in earlier 2.2 and 2.3 kernels. 

Untarring the kernel source is a good example of this. Even a 128 MB
system doesn't do this smoothly in pre8. 

The overall memory usage in pre6 and later seems good -- there is no
gratuitous swapping as seen in pre5 (and earlier in pre2-3 etc). But the
general impression is that in the mmap code (or somewhere else?), a LOT
of pages are moved around or scanned, which incurs expensive system
overhead. 

Before moving to an "improved" means of handling VM pages (like the
active/inactive lists that Rik is working on), surely the current code in
vmscan and filemap (etc.) should first be shown to be fast and not
conducive to this puzzling, even pathological, behavior?  


4)  Confusion about inode_cache and dentry_cache

I'm surely confused here, but in kernel 2.3 the inode_cache and
dentry_cache are not as limited as in kernel 2.2.  Thus,
sample applications like Redhat's 'slocate' daemon or any global use of
the "find" command will cause these slab caches to fill quickly. These
caches are effectively released under memory pressure. No problem.

But why do these "caches" show up as "used app memory" and not cache in
common tools like 'free' (or /proc/meminfo)?  This looks like a recipe for
lots of confused souls once kernel 2.4 is adopted by major distributions. 

Thoughts?


Craig Kulesa
Steward Observatory, Tucson AZ
ckulesa@as.arizona.edu

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/


* Re: PATCH: Possible solution to VM problems (take 2)
  2000-05-14  9:48 Summary of recent VM behavior [2.3.99-pre8] Craig Kulesa
@ 2000-05-18 10:17 ` Craig Kulesa
  2000-05-18 10:59   ` Jan Niehusmann
  0 siblings, 1 reply; 13+ messages in thread
From: Craig Kulesa @ 2000-05-18 10:17 UTC (permalink / raw)
  To: linux-mm, linux-kernel


[Regarding Juan Quintela's wait_buffers_02.patch against pre9-2]

Wow. Much better!

The system doesn't hang itself in a CPU-thrashing knot every time an app
runs the used+cache allocations up to the limit of physical memory.  Cache
relinquishes gracefully, disk activity is dramatically less.  kswapd is
quiet again, whereas in pre8 it was at times eating 1/4 as much
cumulative CPU time as X11.

I'm not having the "cache content problems" I wrote about a few days
ago either.  Netscape, for example, is now perfectly content to load from
cache in 32 MB of RAM with room to spare.  General VM behavior has a
pretty decent "feel" from 16 MB to 128 MB on 4 systems, from a 486DX2/66
to a PIII/500, under normal development load. 

In contrast, doing _anything_ while building a kernel on a 32 MB
Pentium/75 with pre8 was nothing short of a hair-pulling
experience.  [20 seconds for a bloody xterm?!]  It's smooth and
responsive now, even when assembling 40 MB RPM packages. Paging remains
gentle and not too distracting. Good. 

A stubborn problem that remains is the behavior when lots of
dirty pages pile up quickly.  Doing a giant 'dd' from /dev/zero to a
file on disk still causes gaps of unresponsiveness.  Here's a short vmstat
session on a 128 MB PIII system performing a 'dd if=/dev/zero of=dummy.dat
bs=1024k count=256':

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0   1392 100844    320  14000   0   0     0     0  186   409   0   0 100
 1  0  1   1392  53652    420  60080   0   0    12  3195  169   133   1  30  69
 0  1  1   1392  27572    444  85324   0   0     0  3487  329   495   0  18  82
 0  1  1   1392  15376    456  97128   0   0     0  3251  314   468   0   9  91
 0  1  2   1392   2332    472 109716   0   0    17  3089  308   466   0  11  89
 2  1  1   2820   2220    144 114392 380 1676   663 26644 20977 31578   0  10  90
 1  2  0   3560   2796    160 114220 284 792   303  9168 6542  7826   0  11  89
 4  2  1   3948   2824    168 114748 388 476   536 12975 9753 14203   1  11  88
 0  5  0   3944   2744    244 114496 552  88   791  4667 3827  4721   1   3  96
 2  0  0   3944   1512    416 115544  72   0   370     0  492  1417   0   3  97
 0  2  0   3916   2668    556 113800 132  36   330     9  415  1845   6   8  86
 1  0  0   3916   1876    720 114172   0   0   166     0  308  1333  14   6  80
 1  0  0   3912   2292    720 114244  76   0    19     0  347  1126   2   2  96
 2  0  0   3912   2292    720 114244   0   0     0     0  136   195   0   0 100

Guess the line when UI responsiveness was lost. :)

Yup.  Nothing abnormal happens until freemem decreases to zero, and then
the excrement hits the fan (albeit fairly briefly in this test).  After
the first wave of dirty pages are written out and the cache stabilizes,
user responsiveness seems to smooth out again. 

On the plus side...
It's relevant to note that this test caused rather reliable OOM
terminations of XFree86 from pre7-x (if not earlier) until this patch. I
haven't been able to generate any OOM process kills yet. And I've tried to
be very imaginative. :)

There's still some work needed, but Juan's patch seems to be resulting in
behavior that is clearly on the right track.  Great job guys, and thanks! 


Respectfully,
Craig Kulesa


* Re: PATCH: Possible solution to VM problems (take 2)
  2000-05-18 10:17 ` PATCH: Possible solution to VM problems (take 2) Craig Kulesa
@ 2000-05-18 10:59   ` Jan Niehusmann
  2000-05-18 13:41     ` Rik van Riel
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Niehusmann @ 2000-05-18 10:59 UTC (permalink / raw)
  To: Craig Kulesa; +Cc: linux-mm

On Thu, May 18, 2000 at 03:17:25AM -0700, Craig Kulesa wrote:
> A stubborn problem that remains is the behavior when lots of
> dirty pages pile up quickly.  Doing a giant 'dd' from /dev/zero to a
> file on disk still causes gaps of unresponsiveness.  Here's a short vmstat
> session on a 128 MB PIII system performing a 'dd if=/dev/zero of=dummy.dat
> bs=1024k count=256':

While 'dd if=/dev/zero of=file' can, of course, generate dirty pages at
an insane rate, I see the same unresponsiveness when doing a cp -a from
one filesystem to another (even from a slow hard disk to a faster one).

Shouldn't the writing of dirty pages occur at least at the same rate 
as reading data from the slower hard disk? 

(My system: linux-2.3.99pre9-2, wait_buffers_02.patch, 
truncate_inode_pages_01.patch, lvm, PII/333MHz, 256MB, IDE & SCSI hard disks)


Jan


* Re: PATCH: Possible solution to VM problems (take 2)
  2000-05-18 10:59   ` Jan Niehusmann
@ 2000-05-18 13:41     ` Rik van Riel
  2000-05-18 13:49       ` Stephen C. Tweedie
  0 siblings, 1 reply; 13+ messages in thread
From: Rik van Riel @ 2000-05-18 13:41 UTC (permalink / raw)
  To: Jan Niehusmann; +Cc: Craig Kulesa, linux-mm

On Thu, 18 May 2000, Jan Niehusmann wrote:
> On Thu, May 18, 2000 at 03:17:25AM -0700, Craig Kulesa wrote:
> > A stubborn problem that remains is the behavior when lots of
> > dirty pages pile up quickly.  Doing a giant 'dd' from /dev/zero to a
> > file on disk still causes gaps of unresponsiveness.  Here's a short vmstat
> > session on a 128 MB PIII system performing a 'dd if=/dev/zero of=dummy.dat
> > bs=1024k count=256':
> 
> While 'dd if=/dev/zero of=file' can, of course, generate dirty pages at
> an insane rate, I see the same unresponsiveness when doing cp -a from 
> one filesystem to another. (and even from a slow harddisk to a faster one).

I think I have this mostly figured out. I'll work on
making some small improvements over Quintela's patch
that will make the system behave decently in this
situation too.

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

Wanna talk about the kernel?  irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/		http://www.surriel.com/


* Re: PATCH: Possible solution to VM problems (take 2)
  2000-05-18 13:41     ` Rik van Riel
@ 2000-05-18 13:49       ` Stephen C. Tweedie
  0 siblings, 0 replies; 13+ messages in thread
From: Stephen C. Tweedie @ 2000-05-18 13:49 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Jan Niehusmann, Craig Kulesa, linux-mm

Hi,

On Thu, May 18, 2000 at 10:41:05AM -0300, Rik van Riel wrote:
> 
> I think I have this mostly figured out. I'll work on
> making some small improvements over Quintela's patch
> that will make the system behave decently in this
> situation too.

Good, because apart from the write performance, Juan's patch seems to
work really well for the stress tests I've thrown at it so far.

--Stephen

* Re: PATCH: Possible solution to VM problems (take 2)
  2000-05-21 17:15             ` Linus Torvalds
@ 2000-05-21 19:02               ` Rik van Riel
  0 siblings, 0 replies; 13+ messages in thread
From: Rik van Riel @ 2000-05-21 19:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Juan J. Quintela, linux-mm

On Sun, 21 May 2000, Linus Torvalds wrote:
> On Sun, 21 May 2000, Rik van Riel wrote:
> > 
> > The only change we may want to do is completely drop
> > the priority argument from swap_out since:
> > - if we fall through to swap_out we *must* unmap some pages
> 
> Getting rid of the priority argument to swap_out() would mean
> that swap_out() can no longer make any decisions of its own.
> Suddenly swap_out() is a slave to shrink_mmap(), and is not
> allowed to say "there's a lot of pressure on the VM system right
> now, I can't free anything up at this moment, maybe there could
> be some dirty buffers you could write out instead?".

OK, you're right here.

> > - we really want do_try_to_free_pages to succeed every time
> 
> Well, we do want that, but at the same time we also do want it to
> recognize when it really isn't making any progress. 
> 
> When our priority level turns to "Give me some pages or I'll
> rape your wife and kill your children", and _still_ nobody gives
> us memory, we should just realize that we should give up.

The problem is that the current code seems to give up way
before that. We should be able to free memory from mmap002
no matter what, because we *can* (the backing store for
the data exists).

IMHO it is not acceptable that do_try_to_free_pages() can
fail on mmap002, but you are completely right that my
quick and dirty idea is wrong.

(I'll steal davem's code and split the current lru queue
in active, inactive and laundry, then the system will
know which page to steal, how to do effective async IO
- don't wait for pages if we have inactive pages left,
but wait for laundry pages instead of stealing active
ones - and when it *has* to call swap_out)

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

Wanna talk about the kernel?  irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/		http://www.surriel.com/


* Re: PATCH: Possible solution to VM problems (take 2)
  2000-05-21 16:01           ` Rik van Riel
@ 2000-05-21 17:15             ` Linus Torvalds
  2000-05-21 19:02               ` Rik van Riel
  0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2000-05-21 17:15 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Juan J. Quintela, linux-mm


On Sun, 21 May 2000, Rik van Riel wrote:
> 
> The only change we may want to do is completely drop
> the priority argument from swap_out since:
> - if we fall through to swap_out we *must* unmap some pages

I don't agree.

It's a balancing act. We go from door to door, and we say "can you spare a
dime?" The fact that shrink_mmap() said "I don't have anything for you
right now" doesn't mean that swap_out() _has_ to give us memory. If nobody
gives us anything the first time through, we should just try again. A bit
more forcefully this time.

> - swap_out isn't balanced against anything else, so failing
>   it doesn't make much sense (IMHO)

This is not how I see the balancing act at all.

Think of the priority as something everybody we ask uses to judge how
badly he wants to release memory. NOBODY balances against "somebody else".
Everybody balances its own heap of memory, and there is no "global"
balance. Think of it as the same thing as "per-zone" and "class-aware"
logic all over again.

A global balance would take the other allocators into account, and say "I
only have X pages, and they have Y pages, so _they_ should pay". A global
balancing algorithm is based on envy of each others pages.

The local balance is more a "Oh, since he asks me with priority 10, I'll
just see if I can quickly look through 1% of my oldest pages, and if I
find something that I'm comfortable giving you, I'll make it available".
It doesn't take other memory users into account - it is purely selfless,
and knows that somebody asks for help.

Getting rid of the priority argument to swap_out() would mean that
swap_out() can no longer make any decisions of its own. Suddenly
swap_out() is a slave to shrink_mmap(), and is not allowed to say "there's
a lot of pressure on the VM system right now, I can't free anything up at
this moment, maybe there could be some dirty buffers you could write out
instead?".

> - we really want do_try_to_free_pages to succeed every time

Well, we do want that, but at the same time we also do want it to
recognize when it really isn't making any progress. 

When our priority level turns to "Give me some pages or I'll rape your
wife and kill your children", and _still_ nobody gives us memory, we
should just realize that we should give up.

			Linus


* Re: PATCH: Possible solution to VM problems (take 2)
  2000-05-21  8:14         ` Linus Torvalds
@ 2000-05-21 16:01           ` Rik van Riel
  2000-05-21 17:15             ` Linus Torvalds
  0 siblings, 1 reply; 13+ messages in thread
From: Rik van Riel @ 2000-05-21 16:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Juan J. Quintela, linux-mm

On Sun, 21 May 2000, Linus Torvalds wrote:

> The mm patches in particular didn't apply any more, because my
> tree did some of the same stuff, so I did only a very very
> partial merge, much of it to just make a full merge later
> simpler. I made it available under testing as pre9-3, would you
> mind taking a look?

Looking good (well, I've only *read* the code, not
booted it).

The only change we may want to do is completely drop
the priority argument from swap_out since:
- if we fall through to swap_out we *must* unmap some pages
- swap_out isn't balanced against anything else, so failing
  it doesn't make much sense (IMHO)
- we really want do_try_to_free_pages to succeed every time

Of course I may have overlooked something ... please tell me
what :)

BTW, I'll soon go to work with some of davem's code and will
try to make a system with active/inactive lists. I believe the
fact that we don't have those now is responsible for the 
fragility of the current "balance" between the different memory
freeing functions... (but to be honest this too is mostly a
hunch)

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

Wanna talk about the kernel?  irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/		http://www.surriel.com/


* Re: PATCH: Possible solution to VM problems (take 2)
  2000-05-18  0:12       ` Juan J. Quintela
  2000-05-18  1:07         ` Rik van Riel
@ 2000-05-21  8:14         ` Linus Torvalds
  2000-05-21 16:01           ` Rik van Riel
  1 sibling, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2000-05-21  8:14 UTC (permalink / raw)
  To: Juan J. Quintela; +Cc: Rik van Riel, linux-mm

I'm back from Canada, and finally have DSL at home, so I tried to sync up
with the patches I had in my in-queue. 

The mm patches in particular didn't apply any more, because my tree did
some of the same stuff, so I did only a very very partial merge, much of
it to just make a full merge later simpler. I made it available under
testing as pre9-3, would you mind taking a look?

		Linus


* Re: PATCH: Possible solution to VM problems (take 2)
@ 2000-05-18  5:58 Neil Schemenauer
  0 siblings, 0 replies; 13+ messages in thread
From: Neil Schemenauer @ 2000-05-18  5:58 UTC (permalink / raw)
  To: linux-mm; +Cc: quintela, riel

Rik van Riel:
> I am now testing the patch on my small test machine and must
> say that things look just *great*. I can start up a gimp while
> bonnie is running without having much impact on the speed of
> either.
> 
> Interactive performance is nice and stability seems to be
> great as well.

Are we using the same patch?  I applied wait_buffers_02.patch from
Juan's site to pre9-2.  Running "Bonnie -s 250" on a 128 MB
machine causes extremely poor interactive performance.  The
machine is totally unresponsive for up to a minute at a time.

    Neil

* Re: PATCH: Possible solution to VM problems (take 2)
  2000-05-18  0:12       ` Juan J. Quintela
@ 2000-05-18  1:07         ` Rik van Riel
  2000-05-21  8:14         ` Linus Torvalds
  1 sibling, 0 replies; 13+ messages in thread
From: Rik van Riel @ 2000-05-18  1:07 UTC (permalink / raw)
  To: Juan J. Quintela
  Cc: linux-mm, Linus Torvalds, Stephen C. Tweedie, linux-kernel

On 18 May 2000, Juan J. Quintela wrote:

>         after some more testing we found that:
> 1- the patch also works with mem=32MB (i.e. it is a winner for
>    low-memory machines too)
> 2- Interactive performance looks great: I can run an mmap002 of size
>    96MB on a 32MB machine and use an ssh session on the same machine
>    to do ls/vi/... without dropouts; there is no way I could do that
>    with previous pre-* kernels
> 3- The system looks really stable now: no more processes killed with
>    OOM errors, and we no longer see failures in do_try_to_free_pages.

I am now testing the patch on my small test machine and must
say that things look just *great*. I can start up a gimp while
bonnie is running without having much impact on the speed of
either.

Interactive performance is nice and stability seems to be
great as well.

I'll test it on my 512MB test machine as well and will have
more test results tomorrow. This patch is most likely good
enough to include in the kernel this night ;)

(and even if it isn't, it's a hell of a lot better than
anything we had before)

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

Wanna talk about the kernel?  irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/		http://www.surriel.com/


* Re: PATCH: Possible solution to VM problems (take 2)
  2000-05-17 23:31     ` PATCH: Possible solution to VM problems (take 2) Juan J. Quintela
@ 2000-05-18  0:12       ` Juan J. Quintela
  2000-05-18  1:07         ` Rik van Riel
  2000-05-21  8:14         ` Linus Torvalds
  0 siblings, 2 replies; 13+ messages in thread
From: Juan J. Quintela @ 2000-05-18  0:12 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-mm, Linus Torvalds, Stephen C. Tweedie, linux-kernel

Hi
        after some more testing we found that:
1- the patch also works with mem=32MB (i.e. it is a winner for
   low-memory machines too)
2- Interactive performance looks great: I can run an mmap002 of size
   96MB on a 32MB machine and use an ssh session on the same machine
   to do ls/vi/... without dropouts; there is no way I could do that
   with previous pre-* kernels
3- The system looks really stable now: no more processes killed with
   OOM errors, and we no longer see failures in do_try_to_free_pages.

Later, Juan.

PS. I will comment on the patch tomorrow; I have no more time today,
    sorry about that.


-- 
In theory, practice and theory are the same, but in practice they 
are different -- Larry McVoy

* Re: PATCH: Possible solution to VM problems (take 2)
  2000-05-17 20:45   ` PATCH: Possible solution to VM problems Juan J. Quintela
@ 2000-05-17 23:31     ` Juan J. Quintela
  2000-05-18  0:12       ` Juan J. Quintela
  0 siblings, 1 reply; 13+ messages in thread
From: Juan J. Quintela @ 2000-05-17 23:31 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-mm, Linus Torvalds, Stephen C. Tweedie, linux-kernel

Hi
        after discussions with Rik, we have arrived at the following
conclusions relative to the previous patch:

1- nr_dirty should be initialised with the priority value. That means
   that at high priorities we start *quite a lot* of async writes
   before waiting for one page, while at low priorities we wait for
   any page -- we need memory at any cost.

2- We changed do_try_to_free_pages to return success if it has freed
   some pages, not only when it has freed the full count; that keeps
   mmap002 from ever getting killed (30-minute test).

The interactive response of the system looks better, but I need to
do more testing on that.  System time has been reduced as well.

Please, could somebody with a highmem machine test this patch?  I am very
interested to know whether the default values here work well there too.
They should, but who knows.

As always, comments are welcome.

Later, Juan.

PS. You can get my kernel patches from: 
    http://carpanta.dc.fi.udc.es/~quintela/kernel/

diff -urN --exclude-from=/home/lfcia/quintela/work/kernel/exclude pre9-2/fs/buffer.c testing/fs/buffer.c
--- pre9-2/fs/buffer.c	Fri May 12 23:46:45 2000
+++ testing/fs/buffer.c	Wed May 17 19:17:27 2000
@@ -1324,7 +1324,7 @@
 	 * instead.
 	 */
 	if (!offset) {
-		if (!try_to_free_buffers(page)) {
+		if (!try_to_free_buffers(page, 0)) {
 			atomic_inc(&buffermem_pages);
 			return 0;
 		}
@@ -2121,14 +2121,14 @@
  * This all is required so that we can free up memory
  * later.
  */
-static void sync_page_buffers(struct buffer_head *bh)
+static void sync_page_buffers(struct buffer_head *bh, int wait)
 {
-	struct buffer_head * tmp;
-
-	tmp = bh;
+	struct buffer_head * tmp = bh;
 	do {
 		struct buffer_head *p = tmp;
 		tmp = tmp->b_this_page;
+		if (buffer_locked(p) && wait)
+			__wait_on_buffer(p);
 		if (buffer_dirty(p) && !buffer_locked(p))
 			ll_rw_block(WRITE, 1, &p);
 	} while (tmp != bh);
@@ -2151,7 +2151,7 @@
  *       obtain a reference to a buffer head within a page.  So we must
  *	 lock out all of these paths to cleanly toss the page.
  */
-int try_to_free_buffers(struct page * page)
+int try_to_free_buffers(struct page * page, int wait)
 {
 	struct buffer_head * tmp, * bh = page->buffers;
 	int index = BUFSIZE_INDEX(bh->b_size);
@@ -2201,7 +2201,7 @@
 	spin_unlock(&free_list[index].lock);
 	write_unlock(&hash_table_lock);
 	spin_unlock(&lru_list_lock);	
-	sync_page_buffers(bh);
+	sync_page_buffers(bh, wait);
 	return 0;
 }
 
diff -urN --exclude-from=/home/lfcia/quintela/work/kernel/exclude pre9-2/include/linux/fs.h testing/include/linux/fs.h
--- pre9-2/include/linux/fs.h	Wed May 17 19:11:51 2000
+++ testing/include/linux/fs.h	Thu May 18 00:44:24 2000
@@ -900,7 +900,7 @@
 
 extern int fs_may_remount_ro(struct super_block *);
 
-extern int try_to_free_buffers(struct page *);
+extern int try_to_free_buffers(struct page *, int);
 extern void refile_buffer(struct buffer_head * buf);
 
 #define BUF_CLEAN	0
diff -urN --exclude-from=/home/lfcia/quintela/work/kernel/exclude pre9-2/mm/filemap.c testing/mm/filemap.c
--- pre9-2/mm/filemap.c	Fri May 12 23:46:46 2000
+++ testing/mm/filemap.c	Thu May 18 01:00:39 2000
@@ -246,12 +246,13 @@
 
 int shrink_mmap(int priority, int gfp_mask)
 {
-	int ret = 0, count;
+	int ret = 0, count, nr_dirty;
 	LIST_HEAD(old);
 	struct list_head * page_lru, * dispose;
 	struct page * page = NULL;
 	
 	count = nr_lru_pages / (priority + 1);
+	nr_dirty = priority;
 
 	/* we need pagemap_lru_lock for list_del() ... subtle code below */
 	spin_lock(&pagemap_lru_lock);
@@ -303,8 +304,10 @@
 		 * of zone - it's old.
 		 */
 		if (page->buffers) {
-			if (!try_to_free_buffers(page))
-				goto unlock_continue;
+			int wait = ((gfp_mask & __GFP_IO) && (nr_dirty < 0));
+			nr_dirty--;
+			if (!try_to_free_buffers(page, wait))
+					goto unlock_continue;
 			/* page was locked, inode can't go away under us */
 			if (!page->mapping) {
 				atomic_dec(&buffermem_pages);
diff -urN --exclude-from=/home/lfcia/quintela/work/kernel/exclude pre9-2/mm/vmscan.c testing/mm/vmscan.c
--- pre9-2/mm/vmscan.c	Tue May 16 00:36:11 2000
+++ testing/mm/vmscan.c	Thu May 18 01:20:20 2000
@@ -363,7 +363,7 @@
 	 * Think of swap_cnt as a "shadow rss" - it tells us which process
 	 * we want to page out (always try largest first).
 	 */
-	counter = (nr_threads << 1) >> (priority >> 1);
+	counter = (nr_threads << 2) >> (priority >> 2);
 	if (counter < 1)
 		counter = 1;
 
@@ -435,11 +435,12 @@
 {
 	int priority;
 	int count = FREE_COUNT;
+	int swap_count;
 
 	/* Always trim SLAB caches when memory gets low. */
 	kmem_cache_reap(gfp_mask);
 
-	priority = 6;
+	priority = 64;
 	do {
 		while (shrink_mmap(priority, gfp_mask)) {
 			if (!--count)
@@ -471,12 +472,10 @@
 		 * put in the swap cache), so we must not count this
 		 * as a "count" success.
 		 */
-		{
-			int swap_count = SWAP_COUNT;
-			while (swap_out(priority, gfp_mask))
-				if (--swap_count < 0)
-					break;
-		}
+		swap_count = SWAP_COUNT;
+		while (swap_out(priority, gfp_mask))
+			if (--swap_count < 0)
+				break;
 	} while (--priority >= 0);
 
 	/* Always end on a shrink_mmap.. */
@@ -485,7 +484,7 @@
 			goto done;
 	}
 
-	return 0;
+	return (count != FREE_COUNT);
 
 done:
 	return 1;


-- 
In theory, practice and theory are the same, but in practice they 
are different -- Larry McVoy

end of thread, other threads:[~2000-05-21 19:02 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
2000-05-14  9:48 Summary of recent VM behavior [2.3.99-pre8] Craig Kulesa
2000-05-18 10:17 ` PATCH: Possible solution to VM problems (take 2) Craig Kulesa
2000-05-18 10:59   ` Jan Niehusmann
2000-05-18 13:41     ` Rik van Riel
2000-05-18 13:49       ` Stephen C. Tweedie
2000-05-16 19:32 [dirtypatch] quickhack to make pre8/9 behave (fwd) Rik van Riel
2000-05-17  0:28 ` PATCH: less dirty (Re: [dirtypatch] quickhack to make pre8/9 behave (fwd)) Juan J. Quintela
2000-05-17 20:45   ` PATCH: Possible solution to VM problems Juan J. Quintela
2000-05-17 23:31     ` PATCH: Possible solution to VM problems (take 2) Juan J. Quintela
2000-05-18  0:12       ` Juan J. Quintela
2000-05-18  1:07         ` Rik van Riel
2000-05-21  8:14         ` Linus Torvalds
2000-05-21 16:01           ` Rik van Riel
2000-05-21 17:15             ` Linus Torvalds
2000-05-21 19:02               ` Rik van Riel
2000-05-18  5:58 Neil Schemenauer
