* [PATCH *] VM patch for 2.4.0-test8
From: Rik van Riel @ 2000-09-14 4:30 UTC (permalink / raw)
To: linux-mm; +Cc: linux-kernel, Linus Torvalds
Hi,
The new VM patch has received a major amount of
code cleanup, performance tuning and stability improvement
over the last few days and is now almost production
quality, with the following items left for 2.4:
- improve streaming IO performance
- out of memory handling
- integrate Ben LaHaise's readahead on the VMA level
(and make drop_behind() work for that), which fixes kswapd eating CPU
- (maybe) make drop_behind() work better for some cases
- testing, testing, testing, testing ...
The post-2.4 TODO list contains these items:
- physical page based aging (reduce kswapd cpu use more and
do better/more fair page aging)
- much much better IO clustering (neatly abstracted away?)
- page->mapping->flush() callback for journaling and network
filesystems (maybe later in 2.4)
- thrashing control (like process suspension?)
The new VM already seems to be more stable under load than the
old VM and tuning has taken it so far that I'm already running
into bottlenecks in /other/ places (eg. the elevator code)
when putting the system under ridiculously heavy load...
I haven't had much time to do things like dbench and tiobench
testing though, which is why I'm sending this email and asking
enthusiastic benchmarkers to give the patch a try and tell
me about the results.
Oh, and please don't restrict yourself to just the synthetic
benchmarks. The VM is there to give the best results for
applications that have something like a working set and has
not been tuned yet to give good performance for benchmarks
(which seem to run very differently from any application
I've ever seen).
regards,
Rik
--
"What you're running that piece of shit Gnome?!?!"
-- Miguel de Icaza, UKUUG 2000
http://www.conectiva.com/ http://www.surriel.com/
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/
* Re: [PATCH *] VM patch for 2.4.0-test8
From: David S. Miller @ 2000-09-14 5:25 UTC (permalink / raw)
To: riel; +Cc: linux-mm, linux-kernel, torvalds
In page_launder() about halfway down there is this sequence of tests
on LRU pages:
    if (!clearedbuf) {
        ...
    } else if (!page->mapping) {
        ...
    } else if (page_count(page) > 1) {
    } else /* page->mapping && page_count(page) == 1 */ {
        ...
    }
Above this sequence we've done a page_cache_get. For the final case
in the tests above (page->mapping != NULL && page_count(page) == 1)
have you checked if this ever happens or is even possible?
If the page is a page cache page (ie. page->mapping != NULL) the
page cache should hold a reference to it. Adding in our reference,
the count should thus always be > 1.
What did I miss?
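The reference-count argument above can be checked with a small user-space model. This is only a sketch: `struct toy_page`, `classify()` and `cached_page_after_get()` are hypothetical stand-ins, not kernel code. It assumes the page cache holds one reference and that the caller has just taken one more, as described in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct page: only the two fields the argument uses. */
struct toy_page {
    void *mapping;   /* non-NULL means the page is in the page cache */
    int   count;     /* reference count */
};

enum launder_branch {
    BRANCH_BUFFERS_BUSY,   /* !clearedbuf */
    BRANCH_NO_MAPPING,     /* !page->mapping */
    BRANCH_STILL_IN_USE,   /* page_count(page) > 1 */
    BRANCH_FREEABLE        /* page->mapping && page_count(page) == 1 */
};

/* Mirrors the order of the quoted tests, evaluated after the caller's
 * extra reference has already been taken. */
static enum launder_branch classify(const struct toy_page *page, int clearedbuf)
{
    assert(page != NULL);
    if (!clearedbuf)
        return BRANCH_BUFFERS_BUSY;
    else if (!page->mapping)
        return BRANCH_NO_MAPPING;
    else if (page->count > 1)
        return BRANCH_STILL_IN_USE;
    else
        return BRANCH_FREEABLE;
}

/* An otherwise unused page cache page: one reference held by the page
 * cache, plus the one that models page_cache_get() in the caller. */
static struct toy_page cached_page_after_get(void *mapping)
{
    struct toy_page page = { mapping, 1 };  /* page cache's reference */
    page.count++;                           /* models page_cache_get() */
    return page;
}
```

Under these assumptions a page with a mapping always arrives with a count of at least 2, so `classify()` returns `BRANCH_STILL_IN_USE` for it and never reaches `BRANCH_FREEABLE`.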
Later,
David S. Miller
davem@redhat.com
* Re: [PATCH *] VM patch for 2.4.0-test8
From: Juan J. Quintela @ 2000-09-14 6:11 UTC (permalink / raw)
To: David S. Miller; +Cc: riel, linux-mm, linux-kernel, torvalds
>>>>> "david" == David S Miller <davem@redhat.com> writes:
david> In page_launder() about halfway down there is this sequence of tests
david> on LRU pages:
david> if (!clearedbuf) {
david> ...
david> } else if (!page->mapping) {
david> ...
david> } else if (page_count(page) > 1) {
david> } else /* page->mapping && page_count(page) == 1 */ {
david> ...
david> }
david> Above this sequence we've done a page_cache_get. For the final case
david> in the tests above (page->mapping != NULL && page_count(page) == 1)
david> have you checked if this ever happens or is even possible?
david> If the page is a page cache page (ie. page->mapping != NULL) it
david> should hold a reference. Adding in our reference, the count should
david> always thus be > 1.
david> What did I miss?
Nothing, I think. I suppose that riel means > 2 and == 2; if we arrive
there with a page count of 1 we are in trouble, as you have said.
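One possible reading of the "> 2 and == 2" suggestion is sketched below. The names `BASELINE_REFS`, `page_has_other_users()` and `page_is_freeable()` are made up for illustration, and the fix that actually went into the updated patch is not shown in this thread, so treat this as an assumption rather than the real change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constant: the page cache's reference plus the one the
 * caller took just before the tests. */
#define BASELINE_REFS 2

/* Someone besides the page cache and the caller still holds the page. */
static bool page_has_other_users(int count)
{
    return count > BASELINE_REFS;
}

/* Only the page cache and the caller hold the page, so it could be freed. */
static bool page_is_freeable(bool has_mapping, int count)
{
    return has_mapping && count == BASELINE_REFS;
}
```

With the threshold moved from 1 to 2, the final branch becomes reachable for page cache pages that nobody else is using.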
/me does some grepping ...... <some time later>
I can only see one place where we add a page to the page cache and we
don't increase its page count, and it is in grow_buffers(). Could
somebody explain to me _why_ we don't need to do a page_cache_get(page)
in that function?
Later, Juan.
--
In theory, practice and theory are the same, but in practice they
are different -- Larry McVoy
* Re: [PATCH *] VM patch for 2.4.0-test8
From: David S. Miller @ 2000-09-14 8:18 UTC (permalink / raw)
To: quintela; +Cc: riel, linux-mm, linux-kernel, torvalds
> I can only see one place where we add a page to the page cache and
> we don't increase its page count, and it is in grow_buffers().
> Could somebody explain to me _why_ we don't need to do a
> page_cache_get(page) in that function?
It's being added only to the LRU lists, not the page cache hashes. It
is a buffer-cache page, not a page-cache page (ie. page->mapping ==
NULL).
The alloc_page() call returns the page with a single reference, which thus
represents the reference to the page held by the buffer heads
grow_buffers() attaches to it.
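The ownership rule described here can be modeled in a few lines. This is a user-space sketch with hypothetical names (`struct buf_page`, `toy_alloc_page()`, `toy_attach_buffers()`); it only illustrates the counting, not the real grow_buffers() code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct page, with just the fields discussed above. */
struct buf_page {
    void *mapping;   /* stays NULL: not a page-cache page */
    void *buffers;   /* buffer heads attached to the page */
    int   count;     /* reference count */
};

/* Models alloc_page(): the new page comes back with a single reference. */
static struct buf_page toy_alloc_page(void)
{
    struct buf_page page = { NULL, NULL, 1 };
    return page;
}

/* Models attaching buffer heads: no extra reference is taken, because
 * the one from allocation is now the one owned by the buffer heads. */
static void toy_attach_buffers(struct buf_page *page, void *bh)
{
    assert(page != NULL && page->count == 1);
    page->buffers = bh;
}
```

After both steps the count is still 1 and `mapping` is still NULL, which is why such a page takes the `!page->mapping` branch in the tests discussed earlier in the thread.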
Later,
David S. Miller
davem@redhat.com
* Re: [PATCH *] VM patch for 2.4.0-test8
From: Rik van Riel @ 2000-09-14 16:53 UTC (permalink / raw)
To: David S. Miller; +Cc: linux-mm, linux-kernel, torvalds
On Wed, 13 Sep 2000, David S. Miller wrote:
> In page_launder() about halfway down there is this sequence of tests
> on LRU pages:
>
> if (!clearedbuf) {
> ...
> } else if (!page->mapping) {
> ...
> } else if (page_count(page) > 1) {
> } else /* page->mapping && page_count(page) == 1 */ {
> ...
> }
>
> Above this sequence we've done a page_cache_get.
Indeed, you're right. This bug certainly explains some
of the performance things I've seen in the stress test
last night...
Btw, in case you're wondering ... the box /survived/
a stress test that would get programs killed on quite
a few "stable" kernels we've been shipping lately. ;)
regards,
Rik
--
"What you're running that piece of shit Gnome?!?!"
-- Miguel de Icaza, UKUUG 2000
http://www.conectiva.com/ http://www.surriel.com/
* Re: [PATCH *] VM patch for 2.4.0-test8
From: Rik van Riel @ 2000-09-14 17:49 UTC (permalink / raw)
To: David S. Miller; +Cc: linux-mm, linux-kernel, torvalds
On Thu, 14 Sep 2000, Rik van Riel wrote:
> On Wed, 13 Sep 2000, David S. Miller wrote:
>
> > In page_launder() about halfway down there is this sequence of tests
> > on LRU pages:
> >
> > } else if (page_count(page) > 1) {
> > } else /* page->mapping && page_count(page) == 1 */ {
>
> Indeed, you're right. This bug certainly explains some
> of the performance things I've seen in the stress test
> last night...
A new patch with Davem's bugfix has been uploaded and
performance seems to be quite a bit better now...
http://www.surriel.com/patches/
Unless somebody else manages to find a bug in this patch,
this will be the last patch at this feature level and the
next patch will contain a new feature. The new feature in
question will be either the out of memory killer, or Ben
LaHaise's readahead-on-VMA-level code.
(probably the OOM killer since that is a stability-related
thing and the other is "just" a performance tweak)
regards,
Rik
--
"What you're running that piece of shit Gnome?!?!"
-- Miguel de Icaza, UKUUG 2000
http://www.conectiva.com/ http://www.surriel.com/
* Re: [PATCH *] VM patch for 2.4.0-test8
From: Martin Josefsson @ 2000-09-15 17:28 UTC (permalink / raw)
To: Rik van Riel; +Cc: David S. Miller, linux-mm, linux-kernel, torvalds
On Thu, 14 Sep 2000, Rik van Riel wrote:
> On Wed, 13 Sep 2000, David S. Miller wrote:
>
> > In page_launder() about halfway down there is this sequence of tests
> > on LRU pages:
> >
> > if (!clearedbuf) {
> > ...
> > } else if (!page->mapping) {
> > ...
> > } else if (page_count(page) > 1) {
> > } else /* page->mapping && page_count(page) == 1 */ {
> > ...
> > }
> >
> > Above this sequence we've done a page_cache_get.
>
> Indeed, you're right. This bug certainly explains some
> of the performance things I've seen in the stress test
> last night...
>
> Btw, in case you're wondering ... the box /survived/
> a stress test that would get programs killed on quite
> a few "stable" kernels we've been shipping lately. ;)
Here comes a success report.
I've been using 2.4.0test8+2.4.0-t8-vmpatch2 for about a day now and the
performance is great.
I've just bought a new hard drive and I was copying a _lot_ of data to the
new drive and didn't notice anything except the HDD LED flashing :)
And now I helped a friend back up his data while he converts to reiserfs.
I had a stream of 7-9MB/s down to my hard drive for quite a while and still
didn't notice anything. Everything ended up on the inactive list.
I've been trying to get my machine to swap but that seems hard with this
new patch :) I have 0kB of swap used after 8h uptime, and I have been
compiling, moving files between partitions and running md5sum on files
(that was a big problem before, everything ended up on the active list and
the swapping started and brought my machine down to a crawl)
I can mention that while backing up my friend's data I had 7000-9000
interrupts per second and 10 000 - 12 000 context switches per second.
I was really impressed that I didn't notice anything. I remember that my
machine was terribly slow when it did over 5000 context switches with
vanilla test6.
(My machine is a pIII 700 with 256MB ram)
If anyone wants more info or anything please feel free to mail me.
(Hopefully my mailserver is up, we've been experiencing some power
problems)
/Martin
* Re: [PATCH *] VM patch for 2.4.0-test8
From: Jamie Lokier @ 2000-09-15 19:37 UTC (permalink / raw)
To: Martin Josefsson
Cc: Rik van Riel, David S. Miller, linux-mm, linux-kernel, torvalds
Martin Josefsson wrote:
> I've been trying to get my machine to swap but that seems hard with this
> new patch :) I have 0kB of swap used after 8h uptime, and I have been
> compiling, moving files between partitions and running md5sum on files
> (that was a big problem before, everything ended up on the active list and
> the swapping started and brought my machine down to a crawl)
No preemptive page-outs?
0kB swap means if you suddenly need a lot of memory, inactive
application pages have to be written to disk first. There are always
inactive application pages.
Maybe the stats are inaccurate.
-- Jamie
* Re: [PATCH *] VM patch for 2.4.0-test8
From: David Ford @ 2000-09-15 21:07 UTC (permalink / raw)
To: Jamie Lokier
Cc: Martin Josefsson, Rik van Riel, David S. Miller, linux-mm,
linux-kernel, torvalds
Jamie Lokier wrote:
> Martin Josefsson wrote:
> > I've been trying to get my machine to swap but that seems hard with this
> > new patch :) I have 0kB of swap used after 8h uptime, and I have been
> > compiling, moving files between partitions and running md5sum on files
> > (that was a big problem before, everything ended up on the active list and
> > the swapping started and brought my machine down to a crawl)
>
> No preemptive page-outs?
>
> 0kB swap means if you suddenly need a lot of memory, inactive
> application pages have to be written to disk first. There are always
> inactive application pages.
>
> Maybe the stats are inaccurate.
Perhaps, but I run most of my machines without swap. They are between 64 and
256M. Servers are pretty constant in their mem usage, I use about 75%. The
workstations sometimes run down to a few megs free (read 'using netscape') and
I then turn on a swapfile. But all in all they generally do dandy without swap
for days on some, months on others.
-d
--
"The difference between 'involvement' and 'commitment' is like an
eggs-and-ham breakfast: the chicken was 'involved' - the pig was
'committed'."
* Re: [PATCH *] VM patch for 2.4.0-test8
From: Rik van Riel @ 2000-09-15 22:04 UTC (permalink / raw)
To: Jamie Lokier
Cc: Martin Josefsson, David S. Miller, linux-mm, linux-kernel, torvalds
On Fri, 15 Sep 2000, Jamie Lokier wrote:
> Martin Josefsson wrote:
> > I've been trying to get my machine to swap but that seems hard with this
> > new patch :) I have 0kB of swap used after 8h uptime, and I have been
> > compiling, moving files between partitions and running md5sum on files
> > (that was a big problem before, everything ended up on the active list and
> > the swapping started and brought my machine down to a crawl)
>
> No preemptive page-outs?
Yes. The system tries to keep about 1 second worth of
allocations on the inactive list (+ freepages.high).
If you're allocating lots of memory very fast, the
system /will/ try to swap out things beforehand...
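As a rough illustration of that target, the arithmetic could look like the sketch below. The function and parameter names are hypothetical and the real patch may compute the allocation rate quite differently; the only thing taken from the message is "about one second of allocations plus freepages.high".

```c
#include <assert.h>

/* Pages the inactive list should hold: roughly one second's worth of
 * allocations plus the freepages.high watermark (both in pages). */
static unsigned long inactive_target_pages(unsigned long allocs_per_sec,
                                           unsigned long freepages_high)
{
    return allocs_per_sec + freepages_high;
}

/* How many more pages would have to be moved to the inactive list to
 * reach the target; 0 when the list is already long enough. */
static unsigned long inactive_shortage_pages(unsigned long nr_inactive,
                                             unsigned long allocs_per_sec,
                                             unsigned long freepages_high)
{
    unsigned long target = inactive_target_pages(allocs_per_sec,
                                                 freepages_high);

    return nr_inactive < target ? target - nr_inactive : 0;
}
```

A faster allocation rate raises the target, so a sudden burst of allocations makes the shortage non-zero and pushes the system to deactivate (and eventually swap out) pages ahead of time.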
> 0kB swap means if you suddenly need a lot of memory, inactive
> application pages have to be written to disk first. There are
> always inactive application pages.
Indeed there are, but since we don't do physical page
based scanning yet, we don't deactivate pages from the
RSS of processes in the background yet ...
(that's a 2.5 issue)
> Maybe the stats are inaccurate.
Nope. The stats are just fine ;)
regards,
Rik
--
"What you're running that piece of shit Gnome?!?!"
-- Miguel de Icaza, UKUUG 2000
http://www.conectiva.com/ http://www.surriel.com/