* use-once-cleanup testing
@ 2006-01-14 0:05 Marcelo Tosatti
2006-01-14 4:52 ` Nick Piggin
0 siblings, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2006-01-14 0:05 UTC
To: akpm, Nick Piggin, Peter Zijlstra, Rik van Riel; +Cc: linux-mm
Hi folks,
Rik's use-once cleanup patch (1) gets rid of a nasty problem. The
use-once logic does not work for mmap()ed files, due to the questionable
assumption that any referenced pages of such files should be held in
memory:
1 - http://lwn.net/Articles/134387/
static int shrink_list(struct list_head *page_list, struct scan_control *sc)
{
	...
		referenced = page_referenced(page, 1);
		/* In active use or really unfreeable? Activate it. */
		if (referenced && page_mapping_inuse(page))
			goto activate_locked;
The page activation scheme relies on mark_page_accessed() (an exported
function) to do the list move itself; this is the only way for in-cache,
non-mapped pages to be promoted to the active list.
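As a rough illustration of that behaviour, here is a user-space toy model of the two-touch rule. It is a sketch, not the kernel source: struct toy_page and toy_mark_page_accessed() are names made up for the example, and the real function also checks that the page is on the LRU.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page: only the two bits that matter here. */
struct toy_page {
	bool referenced;	/* models PG_referenced */
	bool active;		/* models PG_active (page is on the active list) */
};

/*
 * Models the vanilla two-touch rule: the first access sets the
 * referenced bit, a second access to an inactive page moves it to
 * the active list right away.
 */
void toy_mark_page_accessed(struct toy_page *page)
{
	assert(page);
	if (!page->active && page->referenced) {
		page->active = true;		/* the "list move" */
		page->referenced = false;
	} else if (!page->referenced) {
		page->referenced = true;
	}
}
```

In this model a page is promoted by its second access no matter how far apart the two accesses are, and without the reclaim scan being involved at all.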
Rik's patch instead only sets the referenced bit in
mark_page_accessed(), changing the use-once logic to work by means
of a newly created PG_new flag. The flag, set at add_to_page_cache()
time, gives pages a second round on the inactive list if they
get referenced. Page activation is then performed if the page is
referenced again.
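The PG_new scheme described above can be sketched as a small user-space model. This is one reading of the description in this message, not code from the patch; the pgnew_* names and the verdict enum are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum pgnew_verdict { PGNEW_RECLAIM, PGNEW_KEEP_INACTIVE, PGNEW_ACTIVATE };

struct pgnew_page {
	bool is_new;		/* models PG_new, set when the page enters the page cache */
	bool referenced;	/* models PG_referenced */
};

void pgnew_add_to_page_cache(struct pgnew_page *page)
{
	assert(page);
	page->is_new = true;
	page->referenced = false;
}

/* With the patch, an access only records the reference; no list move. */
void pgnew_mark_page_accessed(struct pgnew_page *page)
{
	assert(page);
	page->referenced = true;
}

/* Decision taken when the page reaches the tail of the inactive list. */
enum pgnew_verdict pgnew_scan_inactive(struct pgnew_page *page)
{
	bool referenced;

	assert(page);
	referenced = page->referenced;
	page->referenced = false;	/* test and clear the reference */

	if (!referenced)
		return PGNEW_RECLAIM;
	if (page->is_new) {
		page->is_new = false;	/* first reference: one more round */
		return PGNEW_KEEP_INACTIVE;
	}
	return PGNEW_ACTIVATE;		/* referenced again: promote */
}
```

A page touched once survives one extra trip around the inactive list before being reclaimed; only a reference seen on that second trip promotes it.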
Another clear advantage of not doing the list move at mark_page_accessed()
time is decreased zone->lru_lock contention and cache thrashing in
general (profiling on SMP machines would be interesting).
A possibly negative side effect of PG_new, already mentioned by Nikita
on this list, is that used-once pages lurk around longer in cache, which
can slow down particular workloads (it should not be hard to create such
loads).
However, the ongoing non-resident bookkeeping implementation makes it
possible to completely get rid of the "second chance" behaviour: re-accessed
evicted pages are automatically promoted to the active list.
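The idea of promoting re-accessed evicted pages can be illustrated with a toy table of recently evicted page identities. This is a sketch under simplifying assumptions (a tiny FIFO table, integer page ids); it does not reflect the data structures of the actual non-resident patches, and all names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NONRES_SLOTS 8	/* arbitrary size for the example */

struct toy_nonresident {
	unsigned long slot[TOY_NONRES_SLOTS];	/* identities of evicted pages */
	bool used[TOY_NONRES_SLOTS];
	size_t next;				/* next slot to overwrite (FIFO) */
};

/* Remember that a page was evicted, overwriting the oldest entry. */
void toy_remember_eviction(struct toy_nonresident *nr, unsigned long page_id)
{
	assert(nr);
	nr->slot[nr->next] = page_id;
	nr->used[nr->next] = true;
	nr->next = (nr->next + 1) % TOY_NONRES_SLOTS;
}

/*
 * Called when a page is read back in. Returns true if the page was
 * evicted recently, meaning it should start on the active list; the
 * entry is dropped because the page is resident again.
 */
bool toy_recently_evicted(struct toy_nonresident *nr, unsigned long page_id)
{
	size_t i;

	assert(nr);
	for (i = 0; i < TOY_NONRES_SLOTS; i++) {
		if (nr->used[i] && nr->slot[i] == page_id) {
			nr->used[i] = false;
			return true;
		}
	}
	return false;
}
```

With such a table, a page does not need an extra round on the inactive list to prove itself: if it is evicted too early, the next fault on it finds the entry and activates it immediately.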
For example, here is a real scenario where use-once mmap() access is
performed:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0109.2/0078.html
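For readers who want to reproduce the access pattern by hand, a use-once mmap() pass over a file can be sketched as follows. This is a minimal POSIX example, not taken from the linked report or from iozone; touch_file_once() is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map a file and touch one byte per page, front to back, exactly once.
 * Returns the number of pages touched, or -1 on error.
 */
long touch_file_once(const char *path)
{
	struct stat st;
	volatile const unsigned char *map;
	long page_size, pages = 0;
	off_t off;
	void *addr;
	int fd;

	assert(path);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return -1;
	}
	addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after close() */
	if (addr == MAP_FAILED)
		return -1;

	page_size = sysconf(_SC_PAGESIZE);
	map = addr;
	for (off = 0; off < st.st_size; off += page_size) {
		(void)map[off];	/* one read access per page */
		pages++;
	}
	munmap(addr, (size_t)st.st_size);
	return pages;
}
```

Running this over a file larger than RAM references every mapped page exactly once, which is the case where the vanilla shrink_list() check quoted earlier activates pages that will never be used again.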
The patch being used for the tests is:
http://programming.kicks-ass.net/kernel-patches/page-replace/2.6.16-rc1/use_once-cleanup.patch
And here are results of larger-than-RAM sequential access with mmap():
2.6-git-jan-12:
Command being timed: "iozone -B -s 143360 -i 1 -i 1 -i 1 -i 1 -w"
Percent of CPU this job got: 6%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:34.74
2.6-git-jan-12+useonce:
Command being timed: "iozone -B -s 143360 -i 1 -i 1 -i 1 -i 1 -w"
Percent of CPU this job got: 13%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.22
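To put the two runs side by side, the arithmetic can be written out as below. The helper names are made up for the example, and the CPU percentages reported by time are rounded, so the CPU-seconds figures are only approximate.

```c
#include <assert.h>

/* Ratio of baseline to patched wall-clock time; > 1.0 means faster. */
double speedup(double baseline_secs, double patched_secs)
{
	assert(patched_secs > 0.0);
	return baseline_secs / patched_secs;
}

/* Approximate CPU seconds from "Percent of CPU" and elapsed time. */
double cpu_seconds(double cpu_percent, double elapsed_secs)
{
	assert(cpu_percent >= 0.0);
	return cpu_percent / 100.0 * elapsed_secs;
}
```

speedup(34.74, 16.22) is about 2.14, while cpu_seconds(6, 34.74) and cpu_seconds(13, 16.22) both come out near 2.1 seconds. That suggests the patched kernel does roughly the same amount of CPU work and spends less time waiting, though the rounding of the percentages makes this a rough estimate.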
And a few graphs of the active/inactive sizes with both read and mmap
mode, with the vanilla and use-once patched kernels:
http://hera.kernel.org/~marcelo/mm/iozone_useonce/iozone_useonce.html
Note that even with read(), the vanilla VM moves used-once pages to the
active list (i.e. the logic is not working as expected).
I would vote for inclusion of the first version of use-once-cleanup
(without the arguable refill_inactive_zone() page_referenced change)
into -mm.
Comments?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: use-once-cleanup testing
2006-01-14 0:05 use-once-cleanup testing Marcelo Tosatti
@ 2006-01-14 4:52 ` Nick Piggin
2006-01-14 4:53 ` Marcelo Tosatti
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Nick Piggin @ 2006-01-14 4:52 UTC
To: Marcelo Tosatti; +Cc: akpm, Peter Zijlstra, Rik van Riel, linux-mm
Marcelo Tosatti wrote:
>Hi folks,
>
>Rik's use-once cleanup patch (1) gets rid of a nasty problem. The
>use-once logic does not work for mmap()ed files, due to the questionable
>assumption that any referenced pages of such files should be held in
>memory:
>
>1 - http://lwn.net/Articles/134387/
>
>static int shrink_list(struct list_head *page_list, struct scan_control *sc)
>{
>...
> referenced = page_referenced(page, 1);
> /* In active use or really unfreeable? Activate it. */
> if (referenced && page_mapping_inuse(page))
> goto activate_locked;
>
>The page activation scheme relies on mark_page_accessed() (an exported
>function) to do the list move itself; this is the only way for in-cache,
>non-mapped pages to be promoted to the active list.
>
>Rik's patch instead only sets the referenced bit in
>mark_page_accessed(), changing the use-once logic to work by means
>of a newly created PG_new flag. The flag, set at add_to_page_cache()
>time, gives pages a second round on the inactive list if they
>get referenced. Page activation is then performed if the page is
>referenced again.
>
>
This is what I've done too (though I prefer a PG_useonce flag
which gets set after they're first seen referenced).
I think Wu may also be doing something like it for adaptive readahead.
Basically: it has been reinvented so many times that it *has* to be a
good idea ;)
>Another clear advantage of not doing the list move at mark_page_accessed()
>time is decreased zone->lru_lock contention and cache thrashing in
>general (profiling on SMP machines would be interesting).
>
>
It also allows one to get rid of the dirty hacks in mark_page_accessed
callers and means read() based useonce actually works properly in cases
where userspace isn't working in blocks of PAGE_SIZE (rsync I think was
one that did this, with fairly horrible results).
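The sub-page read problem can be shown with a toy model of the vanilla two-touch rule. This sketch deliberately leaves out whatever workarounds the mark_page_accessed callers apply, so it shows the raw rule only; the function name and TOY_PAGE_SIZE are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u

/*
 * Simulate one sequential pass over a single page using read() calls of
 * chunk bytes each, under the rule "second touch of an inactive page
 * activates it". Returns true if the page ends up active.
 */
bool toy_page_activated_by_one_pass(size_t chunk)
{
	bool referenced = false, active = false;
	size_t pos;

	assert(chunk > 0);
	for (pos = 0; pos < TOY_PAGE_SIZE; pos += chunk) {
		/* each read() call touches the page once */
		if (referenced && !active)
			active = true;
		else
			referenced = true;
	}
	return active;
}
```

With chunk equal to 4096 the page is touched once and stays inactive; with 1024 it is touched four times in a single pass and gets activated, even though the data is used only once.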
>A possibly negative side effect of PG_new, already mentioned by Nikita
>on this list, is that used-once pages lurk around longer in cache, which
>can slow down particular workloads (it should not be hard to create such
>loads).
>
>
Yes, I found that also doing use-once on mapped pages caused fairly huge
slowdowns in some cases. File IO could much more easily cause X and its
applications to get swapped out.
>However, the ongoing non-resident bookkeeping implementation makes it
>possible to completely get rid of the "second chance" behaviour: re-accessed
>evicted pages are automatically promoted to the active list.
>
>
Possibly. I think moving unmapped use-once over to PG_useonce first, and
tidying the weird warts and special cases (that don't make sense) from
vmscan is a good first step.
Unfortunately I don't think Andrew wants a bar of any of it. Nor would
a crazy rewrite-pagereclaim tree really get any sort of testing at all,
realistically :(
Ideas?
* Re: use-once-cleanup testing
2006-01-14 4:52 ` Nick Piggin
@ 2006-01-14 4:53 ` Marcelo Tosatti
2006-01-14 8:44 ` Peter Zijlstra
2006-01-16 13:05 ` Rik van Riel
2 siblings, 0 replies; 7+ messages in thread
From: Marcelo Tosatti @ 2006-01-14 4:53 UTC
To: Nick Piggin; +Cc: akpm, Peter Zijlstra, Rik van Riel, linux-mm
Hi Nick,
On Sat, Jan 14, 2006 at 03:52:58PM +1100, Nick Piggin wrote:
>
> Marcelo Tosatti wrote:
>
> >Hi folks,
> >
> >Rik's use-once cleanup patch (1) gets rid of a nasty problem. The
> >use-once logic does not work for mmap()ed files, due to the questionable
> >assumption that any referenced pages of such files should be held in
> >memory:
> >
> >1 - http://lwn.net/Articles/134387/
> >
> >static int shrink_list(struct list_head *page_list, struct scan_control
> >*sc)
> >{
> >...
> > referenced = page_referenced(page, 1);
> > /* In active use or really unfreeable? Activate it. */
> > if (referenced && page_mapping_inuse(page))
> > goto activate_locked;
> >
> >The page activation scheme relies on mark_page_accessed() (an exported
> >function) to do the list move itself; this is the only way for in-cache,
> >non-mapped pages to be promoted to the active list.
> >
> >Rik's patch instead only sets the referenced bit in
> >mark_page_accessed(), changing the use-once logic to work by means
> >of a newly created PG_new flag. The flag, set at add_to_page_cache()
> >time, gives pages a second round on the inactive list if they
> >get referenced. Page activation is then performed if the page is
> >referenced again.
> >
> >
>
> This is what I've done too (though I prefer a PG_useonce flag
> which gets set after they're first seen referenced).
>
> I think Wu may also be doing something like it for adaptive readahead.
>
> Basically: it has been reinvented so many times that it *has* to be a
> good idea ;)
For most mixed loads, I think so. But certainly not for all.
> >Another clear advantage of not doing the list move at mark_page_accessed()
> >time is decreased zone->lru_lock contention and cache thrashing in
> >general (profiling on SMP machines would be interesting).
> >
> >
>
> It also allows one to get rid of the dirty hacks in mark_page_accessed
> callers and means read() based useonce actually works properly in cases
> where userspace isn't working in blocks of PAGE_SIZE (rsync I think was
> one that did this, with fairly horrible results).
>
> >A possibly negative side effect of PG_new, already mentioned by Nikita
> >on this list, is that used-once pages lurk around longer in cache, which
> >can slow down particular workloads (it should not be hard to create such
> >loads).
> >
> >
>
> Yes, I found that also doing use-once on mapped pages caused fairly huge
> slowdowns in some cases. File IO could much more easily cause X and its
> applications to get swapped out.
>
> >However, the ongoing non-resident bookkeeping implementation makes it
> >possible to completely get rid of the "second chance" behaviour: re-accessed
> >evicted pages are automatically promoted to the active list.
> >
> >
> Possibly. I think moving unmapped use-once over to PG_useonce first, and
> tidying the weird warts and special cases (that don't make sense) from
> vmscan is a good first step.
>
> Unfortunately I don't think Andrew wants a bar of any of it. Nor would
> a crazy rewrite-pagereclaim tree really get any sort of testing at all,
> realistically :(
>
> Ideas?
I think that creating a page replacement interface used by the VM to
hide the details of the reclaim specifics is an important step forward,
allowing coexistence of different replacement policies.
It opens up many possibilities.
Peter started the abstraction of the page reclaim code for his CLOCK-Pro
implementation, and I've been working with him to improve it.
The current code is logically glued together: there is no distinction
between the reclaim cache interface and the LRU; they are the same.
Please take a look at
http://programming.kicks-ass.net/kernel-patches/page-replace/2.6.16-rc1/page-replace-documentation.patch
and the related patches in that directory.
It basically separates the actions invoked by the generic VM:
- bookkeeping of page information (insertion, deletion, reference, and
so on).
- selection of pagecache candidates for eviction
And:
- balancing between slab/pagecache eviction
- page eviction
- page writeout
IMO they should all be separate, with shared helper functions, as the
document and patches suggest.
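One way to picture the separation listed above is a table of policy hooks that the generic VM calls without knowing which policy sits behind it. The sketch below is purely illustrative: the struct, the hook names, and the trivial FIFO policy are all hypothetical and are not taken from the page-replace patches, which may be organised quite differently.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PAGES 4

struct toy_policy;

/* Hypothetical policy hooks; none of these names come from the patches. */
struct toy_policy_ops {
	void (*insert)(struct toy_policy *p, int page_id);
	void (*reference)(struct toy_policy *p, int page_id);
	int (*select_victim)(struct toy_policy *p);	/* -1 if empty */
};

struct toy_policy {
	const struct toy_policy_ops *ops;
	int fifo[TOY_MAX_PAGES];
	size_t count;
};

/* A deliberately trivial FIFO policy plugged in behind the interface. */
static void fifo_insert(struct toy_policy *p, int page_id)
{
	assert(p->count < TOY_MAX_PAGES);
	p->fifo[p->count++] = page_id;
}

static void fifo_reference(struct toy_policy *p, int page_id)
{
	(void)p;
	(void)page_id;	/* FIFO ignores references */
}

static int fifo_select_victim(struct toy_policy *p)
{
	int victim;
	size_t i;

	if (p->count == 0)
		return -1;
	victim = p->fifo[0];
	for (i = 1; i < p->count; i++)
		p->fifo[i - 1] = p->fifo[i];
	p->count--;
	return victim;
}

const struct toy_policy_ops toy_fifo_ops = {
	.insert = fifo_insert,
	.reference = fifo_reference,
	.select_victim = fifo_select_victim,
};
```

A caller only ever goes through pol.ops->insert(), pol.ops->reference() and pol.ops->select_victim(), so a 2-queue LRU or a CLOCK-Pro implementation could be swapped in by providing a different ops table.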
The current set makes the traditional LRU 2-queue and CLOCK-Pro policies
co-exist (at the moment it also contains several patches which change
behaviour, such as Rik's PG_new and Wu's zone scanning balancing, but they
are not necessarily related to this).
* Re: use-once-cleanup testing
2006-01-14 4:52 ` Nick Piggin
2006-01-14 4:53 ` Marcelo Tosatti
@ 2006-01-14 8:44 ` Peter Zijlstra
2006-01-14 8:51 ` Andrew Morton
2006-01-16 13:05 ` Rik van Riel
2 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2006-01-14 8:44 UTC
To: Nick Piggin
Cc: Marcelo Tosatti, akpm, Rik van Riel, linux-mm, Bob Picco,
Christoph Lameter
On Sat, 2006-01-14 at 15:52 +1100, Nick Piggin wrote:
> Unfortunately I don't think Andrew wants a bar of any of it. Nor would
> a crazy rewrite-pagereclaim tree really get any sort of testing at all,
> realistically :(
>
Both HP and SGI have shown interest in getting these patches in shape
and testing them, so I do think there is quite some interest in them.
I admit that there is still a lot of work to do, like getting the CART
policies into the new tree and NUMAfying the CLOCK-Pro and CART
policies. And of course rigorous testing.
Andrew, what would you need on top of that to start being interested?
Kind regards,
PeterZ
* Re: use-once-cleanup testing
2006-01-14 8:44 ` Peter Zijlstra
@ 2006-01-14 8:51 ` Andrew Morton
2006-01-16 13:06 ` Rik van Riel
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2006-01-14 8:51 UTC
To: Peter Zijlstra
Cc: piggin, marcelo.tosatti, riel, linux-mm, bob.picco, clameter
Peter Zijlstra <peter@programming.kicks-ass.net> wrote:
>
> Andrew, what would you need on top of that to start being interested?
>
A demonstration that the code will make sufficient improvement to justify
its inclusion, naturally ;)
Speedups should outweigh the slowdowns, no really bad corner cases.
* Re: use-once-cleanup testing
2006-01-14 4:52 ` Nick Piggin
2006-01-14 4:53 ` Marcelo Tosatti
2006-01-14 8:44 ` Peter Zijlstra
@ 2006-01-16 13:05 ` Rik van Riel
2 siblings, 0 replies; 7+ messages in thread
From: Rik van Riel @ 2006-01-16 13:05 UTC
To: Nick Piggin; +Cc: Marcelo Tosatti, akpm, Peter Zijlstra, linux-mm
On Sat, 14 Jan 2006, Nick Piggin wrote:
> Yes, I found that also doing use-once on mapped pages caused fairly huge
> slowdowns in some cases. File IO could much more easily cause X and its
> applications to get swapped out.
We can get rid of that effect easily by adding reclaim_mapped
logic to the inactive list scan. The zone previous_priority
will keep track of what to do when we start a scan...
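For reference, the reclaim_mapped decision that 2.6-era refill_inactive_zone() makes for the active list can be approximated as below. This is a user-space paraphrase from memory, not the kernel source: toy_reclaim_mapped() and its parameters are names made up for the example. Applying the same test to the inactive list scan is the suggestion made above, which this sketch does not implement.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * prev_priority: scan priority the zone last needed (lower = more trouble)
 * nr_mapped:     number of mapped pages
 * total_pages:   total memory in pages
 * swappiness:    the vm.swappiness tunable (0..100)
 */
bool toy_reclaim_mapped(int prev_priority, unsigned long nr_mapped,
			unsigned long total_pages, int swappiness)
{
	long distress, mapped_ratio, swap_tendency;

	assert(total_pages > 0);
	assert(prev_priority >= 0 && prev_priority < 31);

	distress = 100 >> prev_priority;	/* 0 = no trouble, 100 = great trouble */
	mapped_ratio = (long)(nr_mapped * 100 / total_pages);
	swap_tendency = mapped_ratio / 2 + distress + swappiness;

	/* Only start unmapping pages once the tendency reaches 100. */
	return swap_tendency >= 100;
}
```

With a relaxed prev_priority of 12, 25% of memory mapped and swappiness 60, the tendency is 72 and mapped pages are left alone; at prev_priority 0 the distress term alone reaches 100 and mapped pages become fair game.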
> Possibly. I think moving unmapped use-once over to PG_useonce first, and
> tidying the weird warts and special cases (that don't make sense) from
> vmscan is a good first step.
Agreed, cleaning up the code first will make it a lot easier
to make improvements bit by bit.
--
All Rights Reversed
* Re: use-once-cleanup testing
2006-01-14 8:51 ` Andrew Morton
@ 2006-01-16 13:06 ` Rik van Riel
0 siblings, 0 replies; 7+ messages in thread
From: Rik van Riel @ 2006-01-16 13:06 UTC
To: Andrew Morton
Cc: Peter Zijlstra, piggin, marcelo.tosatti, linux-mm, bob.picco, clameter
On Sat, 14 Jan 2006, Andrew Morton wrote:
> Speedups should outweigh the slowdowns, no really bad corner cases.
When it comes to corner cases, clock-pro gets my vote over
any of the alternative algorithms.
--
All Rights Reversed