* [CFT][PATCH] smoother VM for -ac
From: Rik van Riel @ 2001-10-10 20:25 UTC
To: kernelnewbies; +Cc: linux-mm, linux-kernel, Alan Cox
Hi,
over the last week I've created a small patch which seems
to drastically improve VM performance and interactivity for
2.4.10-ac{9,10}. Initial test results suggest that the system
runs a lot smoother for desktop use and doesn't start
thrashing until the working set _really_ exceeds the size
of RAM.
People have already asked to have this patch integrated into
the -ac kernel, but it would be nice to have a few more test
results from this combined eatcache + stophog patch before
having it integrated ...
The patch implements the following things:
1) bypass page aging entirely for unused objects in
the cache
2) increase the distance between inactive_shortage
and inactive_plenty, so kswapd should spend less
time shuffling random pages around ... shouldn't
make a difference for most loads, but should add
some robustness in worst cases
3) do page aging _before_ the zone_inactive_plenty()
test, so old referenced bits get cleared
[not a big cpu eater, since the code won't run unless
we have a free or inactive shortage somewhere]
4) in page_alloc.c, the "slowdown" reschedule has been
made stronger by turning it into a try_to_free_pages(),
under memory load, this results in allocators calling
try_to_free_pages() when the amount of work to be done
isn't too bad yet and pretty much guarantees them they'll
get to do their allocation immediately afterwards ...
statistics make sure that the memory hogs are slowed down
much more than well-behaved programs
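To put rough numbers on item 2, here is a minimal userspace sketch (not kernel code) that evaluates the old and new constants from the vmscan.c and swap.h hunks of this patch. The helper names are made up for illustration. It assumes a machine modelled as a single zone covering all of RAM, with vm_static_inactive_target unset, and it assumes the swap.h value is the target that inactive_shortage is measured against.

```c
#include <assert.h>
#include <limits.h>

/* Illustrative helpers only: these names do not exist in the kernel. */

/* zone_inactive_plenty() threshold in mm/vmscan.c, before the patch. */
static unsigned long plenty_old(unsigned long zone_size)
{
	return zone_size / 3;
}

/* The same threshold after the patch: two fifths of the zone. */
static unsigned long plenty_new(unsigned long zone_size)
{
	assert(zone_size <= ULONG_MAX / 2);	/* keep "* 2" from overflowing */
	return zone_size * 2 / 5;
}

/* Default inactive target in include/linux/swap.h, before the patch. */
static unsigned long target_old(unsigned long num_physpages)
{
	return num_physpages / 4;
}

/* The same default after the patch. */
static unsigned long target_new(unsigned long num_physpages)
{
	return num_physpages / 5;
}

/* Distance in pages between "plenty" and the target (one zone == all RAM). */
static unsigned long gap_old(unsigned long pages)
{
	return plenty_old(pages) - target_old(pages);
}

static unsigned long gap_new(unsigned long pages)
{
	return plenty_new(pages) - target_new(pages);
}
```

For 65536 pages (256 MB if pages are 4 KB), gap_old() gives 5461 pages and gap_new() gives 13107 pages, so under these assumptions the distance between the two thresholds more than doubles.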
Please test this patch and tell Alan and me how it works for
you and whether there are loads where the system performs
worse with this patch than without...
regards,
Rik
--
DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ (volunteers needed)
http://www.surriel.com/ http://distro.conectiva.com/
--- linux-2.4.10-ac10/mm/page_alloc.c.orig Mon Oct 8 18:22:51 2001
+++ linux-2.4.10-ac10/mm/page_alloc.c Wed Oct 10 14:08:54 2001
@@ -346,22 +346,15 @@
* We wake up kswapd, in the hope that kswapd will
* resolve this situation before memory gets tight.
*
- * We also yield the CPU, because that:
- * - gives kswapd a chance to do something
- * - slows down allocations, in particular the
- * allocations from the fast allocator that's
- * causing the problems ...
- * - ... which minimises the impact the "bad guys"
- * have on the rest of the system
- * - if we don't have __GFP_IO set, kswapd may be
- * able to free some memory we can't free ourselves
+ * We'll also help a bit trying to free pages, this
+ * way statistics will make sure really fast allocators
+ * are slowed down more than slow allocators and other
+ * programs in the system shouldn't be impacted as much
+ * by the hogs.
*/
wakeup_kswapd();
- if (gfp_mask & __GFP_WAIT) {
- __set_current_state(TASK_RUNNING);
- current->policy |= SCHED_YIELD;
- schedule();
- }
+ if (gfp_mask & __GFP_WAIT)
+ try_to_free_pages(gfp_mask);
/*
* After waking up kswapd, we try to allocate a page
--- linux-2.4.10-ac10/mm/vmscan.c.orig Mon Oct 8 18:22:51 2001
+++ linux-2.4.10-ac10/mm/vmscan.c Mon Oct 8 19:18:12 2001
@@ -50,7 +50,7 @@
inactive += zone->inactive_clean_pages;
inactive += zone->free_pages;
- return (inactive > (zone->size / 3));
+ return (inactive > (zone->size * 2 / 5));
}
#define FREE_PLENTY_FACTOR 2
@@ -97,6 +97,24 @@
return pagecache > limit;
}
+static inline int page_mapping_notused(struct page * page)
+{
+ struct address_space * mapping = page->mapping;
+
+ if (!mapping)
+ return 0;
+
+ /* This mapping is really large and would monopolise the pagecache. */
+ if (mapping->nrpages > atomic_read(&page_cache_size) / 20)
+ return 1;
+
+ /* File is mmaped by somebody */
+ if (mapping->i_mmap || mapping->i_mmap_shared)
+ return 0;
+
+ return 1;
+}
+
/*
* The swap-out function returns 1 if it successfully
* scanned all the pages it was asked to (`count').
@@ -826,14 +844,14 @@
}
/*
- * Don't deactivate pages from zones which have
- * plenty inactive pages.
+ * Do aging on the pages. Every time a page is referenced,
+ * page->age gets incremented. If it wasn't referenced, we
+ * decrement page->age. The page gets moved to the inactive
+ * list when one of the following is true:
+ * - the page age reaches 0
+ * - the object the page belongs to isn't in active use
+ * - the object the page belongs to is hogging the cache
*/
- if (zone_inactive_plenty(page->zone)) {
- goto skip_page;
- }
-
- /* Do aging on the pages. */
if (PageTestandClearReferenced(page)) {
age_page_up(page);
} else {
@@ -843,20 +861,26 @@
}
/*
- * If the amount of buffer cache pages is too
- * high we just move every buffer cache page we
- * find to the inactive list. Eventually they'll
- * be reclaimed there...
+ * Don't deactivate pages from zones which have
+ * plenty inactive pages.
+ */
+ if (zone_inactive_plenty(page->zone)) {
+ goto skip_page;
+ }
+
+ /*
+ * If the buffer cache is large, don't do page aging.
+ * If this page really is used, it'll be referenced
+ * again while on the inactive list.
*/
if (page->buffers && !page->mapping && too_many_buffers())
deactivate_page_nolock(page);
/*
- * If the page cache is too large, we deactivate all
- * page cache pages which are not in use by a process.
+ * Deactivate pages from files which aren't in use, busy
+ * pages will be referenced while on the inactive list.
*/
- if (pagecache_too_large() && page->mapping &&
- page_count(page) <= (page->buffers ? 2 : 1))
+ if (page_mapping_notused(page))
deactivate_page_nolock(page);
/*
--- linux-2.4.10-ac10/include/linux/swap.h.orig Mon Oct 8 18:23:03 2001
+++ linux-2.4.10-ac10/include/linux/swap.h Mon Oct 8 19:15:09 2001
@@ -261,7 +261,7 @@
if (vm_static_inactive_target)
return vm_static_inactive_target;
- return num_physpages / 4;
+ return num_physpages / 5;
}
/*
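The per-page decision described in the new comment block of the vmscan.c hunk (age first, then the zone_inactive_plenty() test, then deactivate if the age reached 0, the object is not in active use, or the object is hogging the cache) can be sketched in userspace as follows. This is a simplified model, not the kernel code: struct page_model, struct mapping_model and scan_page() are hypothetical names, aging is a plain +1/-1 as the comment describes, and the buffer cache branch is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct address_space. */
struct mapping_model {
	unsigned long nrpages;	/* pages this object holds in the page cache */
	int mmapped;		/* models i_mmap || i_mmap_shared */
};

/* Simplified stand-in for struct page. */
struct page_model {
	struct mapping_model *mapping;	/* NULL for pages without a mapping */
	int referenced;			/* models the referenced bit */
	int age;
};

enum scan_result { KEEP_ACTIVE, DEACTIVATE };

/* Non-zero when the object behind the page is unused or hogging the cache. */
static int mapping_notused(const struct page_model *page,
			   unsigned long page_cache_size)
{
	const struct mapping_model *m = page->mapping;

	if (!m)
		return 0;
	if (m->nrpages > page_cache_size / 20)	/* more than 5% of the cache */
		return 1;
	if (m->mmapped)				/* somebody has it mapped */
		return 0;
	return 1;
}

static enum scan_result scan_page(struct page_model *page,
				  int zone_has_plenty_inactive,
				  unsigned long page_cache_size)
{
	assert(page != NULL);

	/* Step 1: aging always runs, so the referenced bit is cleared
	 * even when the page is skipped in step 2. */
	if (page->referenced) {
		page->referenced = 0;
		page->age++;
	} else if (page->age > 0) {
		page->age--;
	}

	/* Step 2: leave the page alone if the zone has plenty inactive pages. */
	if (zone_has_plenty_inactive)
		return KEEP_ACTIVE;

	/* Step 3: deactivate old pages and pages of unused or hogging objects. */
	if (page->age == 0 || mapping_notused(page, page_cache_size))
		return DEACTIVATE;

	return KEEP_ACTIVE;
}
```

In this model a referenced page in a zone with plenty of inactive pages stays active but still has its referenced bit cleared and its age bumped, which is the effect item 3 of the description is after.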
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
* Re: [CFT][PATCH] smoother VM for -ac
From: Benjamin LaHaise @ 2001-10-10 20:48 UTC
To: Rik van Riel; +Cc: kernelnewbies, linux-mm, linux-kernel, Alan Cox
On Wed, Oct 10, 2001 at 05:25:30PM -0300, Rik van Riel wrote:
> 4) in page_alloc.c, the "slowdown" reschedule has been
> made stronger by turning it into a try_to_free_pages(),
> under memory load, this results in allocators calling
> try_to_free_pages() when the amount of work to be done
> isn't too bad yet and pretty much guarantees them they'll
> get to do their allocation immediately afterwards ...
> statistics make sure that the memory hogs are slowed down
> much more than well-behaved programs
There's a small problem with this one: I know that during testing of
earlier 2.4 kernels we saw a livelock which was caused by the vm
subsystem spinning without scheduling. This can happen in a couple of
cases like NFS where another task has to be allowed to run in order to
make progress in clearing pages.
-ben
* Re: [CFT][PATCH] smoother VM for -ac
From: Rik van Riel @ 2001-10-10 21:25 UTC
To: Benjamin LaHaise; +Cc: kernelnewbies, linux-mm, linux-kernel, Alan Cox
On Wed, 10 Oct 2001, Benjamin LaHaise wrote:
> On Wed, Oct 10, 2001 at 05:25:30PM -0300, Rik van Riel wrote:
> > 4) in page_alloc.c, the "slowdown" reschedule has been
> > made stronger by turning it into a try_to_free_pages(),
> There's a small problem with this one: I know that during
> testing of earlier 2.4 kernels we saw a livelock which was
> caused by the vm subsystem spinning without scheduling. This
> can happen in a couple of cases like NFS where another task has
> to be allowed to run in order to make progress in clearing
> pages.
OK, I'll add back the reschedule to fix this case.
I don't like it too much, but I wouldn't know of an
easier way to fix the NFS thing. I guess we could delay
it to the zone->pages_min point though ... should cut
down on the number of reschedules ;)
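As a rough illustration of the idea above (always help with reclaim, but only give up the CPU once a zone is down to pages_min), here is a userspace sketch. It is not the revised patch, which is not included in this thread. struct zone_model, slow_path() and the counter that stands in for try_to_free_pages() are hypothetical, sched_yield() stands in for the kernel reschedule, and whether the real test is "<" or "<=" is a guess.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for the two zone fields the idea depends on. */
struct zone_model {
	unsigned long free_pages;
	unsigned long pages_min;
};

/* Counts calls to the stand-in for try_to_free_pages(). */
static unsigned long reclaim_calls;

static void model_try_to_free_pages(void)
{
	reclaim_calls++;
}

/*
 * Model of the allocator slow path for one zone.
 * Returns 1 if the caller yielded the CPU, 0 otherwise.
 */
static int slow_path(const struct zone_model *zone, int can_wait)
{
	assert(zone != NULL);

	/* Callers that may not sleep do neither reclaim nor yield. */
	if (!can_wait)
		return 0;

	/* Every waiting allocator helps free pages, as in the patch. */
	model_try_to_free_pages();

	/* Only yield when the zone is really low, so other tasks
	 * (an NFS daemon, say) get a chance to run and clear pages. */
	if (zone->free_pages <= zone->pages_min) {
		sched_yield();
		return 1;
	}
	return 0;
}
```

With free_pages above pages_min the caller only does reclaim work; at or below pages_min it also yields, which keeps the number of reschedules down while still letting other tasks make progress in the tight case.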
regards,
Rik
* Re: [CFT][PATCH] smoother VM for -ac
From: Rik van Riel @ 2001-10-10 21:44 UTC
To: Benjamin LaHaise; +Cc: kernelnewbies, linux-mm, linux-kernel, Alan Cox
On Wed, 10 Oct 2001, Rik van Riel wrote:
> On Wed, 10 Oct 2001, Benjamin LaHaise wrote:
> > There's a small problem with this one: I know that during
> > testing of earlier 2.4 kernels we saw a livelock which was
> > caused by the vm subsystem spinning without scheduling.
I added back the reschedule at the zone->pages_min limit
and have documented this piece of black magic. New patch
can be found at:
http://www.surriel.com/patches/
regards,
Rik
* Re: [CFT][PATCH] smoother VM for -ac
From: Andrea Arcangeli @ 2001-10-12 5:54 UTC
To: John L. Males; +Cc: Rik van Riel, linux-mm, linux-kernel, Alan Cox
On Fri, Oct 12, 2001 at 01:33:54AM -0500, John L. Males wrote:
> Andrea,
>
> I can do that. I see the VM is of keen interest. Question for you:
> to really compare apples to apples, I could spider a web site or two
> just fine. The challenge then is to replay the "test" on the GUI,
> say KDE for example. Do you know of any good tools that would allow
> me to do a GUI record/playback? I can then do an A vs B comparison.
For testing the responsiveness I usually check the startup time of
applications like netscape with a cold cache; later I start a high
VM load on my desktop and see how long I can keep working without
being hurt too much. The first is certainly a measurable test; the
second isn't reliable, since it doesn't generate raw numbers and
depends too much on human perception, but it shows very well any
pathological problem in the code. They may not be the best tests, though.
> Also, remind me: can I find your kernel to test on the SuSE FTP site
> or via kernel.org? I tried a few of the SuSE 2.4 kernels a few
> levels back, and I seem to recollect going to the people directory
> of the FTP site and getting them from mantel.
That's still a fine procedure; just make sure to pick the latest 2.4.12
one, based on 2.4.12aa1, before running the tests. Thanks,
> I will search about on the internet to see if I can find a
> record/playback tool to get some sort of good A vs B comparison.
>
>
> Regards,
>
> John L. Males
> Willowdale, Ontario
> Canada
> 12 October 2001 01:33
> mailto:jlmales@softhome.net
Andrea
* Re: [CFT][PATCH] smoother VM for -ac
From: Lorenzo Allegrucci @ 2001-10-11 8:46 UTC
To: Rik van Riel; +Cc: kernelnewbies, linux-mm, linux-kernel, Alan Cox
At 17.25 10/10/01 -0300, Rik van Riel wrote:
>Please test this patch and tell Alan and me how it works for
>you and whether there are loads where the system performs
>worse with this patch than without...
qsbench results,
Linux-2.4.10-ac9:
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
seed = 140175100
71.370u 2.560s 3:17.94 37.3% 0+0k 0+0io 11773pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
seed = 140175100
71.760u 3.170s 4:02.93 30.8% 0+0k 0+0io 15487pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
seed = 140175100
71.090u 3.080s 4:07.94 29.9% 0+0k 0+0io 15856pf+0w
kswapd CPU time: 0:23
Linux-2.4.10-ac9 + Rik's smooth patch:
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
seed = 140175100
71.090u 6.260s 3:21.65 38.3% 0+0k 0+0io 12868pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
seed = 140175100
72.460u 6.030s 3:58.10 32.9% 0+0k 0+0io 14637pf+0w
lenstra:~/src/qsort> time ./qsbench -n 90000000 -p 1 -s 140175100
seed = 140175100
71.630u 7.400s 4:00.86 32.8% 0+0k 0+0io 14894pf+0w
kswapd CPU time: 0:21
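To compare the two sets of runs at a glance, the following small sketch averages the elapsed and system times listed above and reports the relative change. The numbers are copied from the runs in this message; mean() and percent_change() are illustrative helpers, not part of qsbench.

```c
#include <assert.h>
#include <stddef.h>

#define RUNS 3

/* Elapsed wall-clock times in seconds, from the m:ss.cc column above. */
static const double elapsed_base[RUNS] = {
	3 * 60 + 17.94, 4 * 60 + 2.93, 4 * 60 + 7.94
};
static const double elapsed_patched[RUNS] = {
	3 * 60 + 21.65, 3 * 60 + 58.10, 4 * 60 + 0.86
};

/* System CPU times in seconds, from the "s" column above. */
static const double sys_base[RUNS]    = { 2.560, 3.170, 3.080 };
static const double sys_patched[RUNS] = { 6.260, 6.030, 7.400 };

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative change from "before" to "after", in percent. */
static double percent_change(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}
```

On these numbers the mean elapsed time goes from about 229.6 s to about 226.9 s (roughly 1% lower), while the mean system time goes from about 2.9 s to about 6.6 s. With only three runs per kernel, the elapsed-time difference is small enough that it may well be noise.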
--
Lorenzo