* use-once-cleanup testing
@ 2006-01-14 0:05 Marcelo Tosatti
2006-01-14 4:52 ` Nick Piggin
0 siblings, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2006-01-14 0:05 UTC (permalink / raw)
To: akpm, Nick Piggin, Peter Zijlstra, Rik van Riel; +Cc: linux-mm
Hi folks,
Rik's use-once cleanup patch (1) gets rid of a nasty problem. The
use-once logic does not work for mmaped() files, due to the questionable
assumption that any referenced pages of such files should be held in
memory:
1 - http://lwn.net/Articles/134387/
static int shrink_list(struct list_head *page_list, struct scan_control *sc)
{
...
        referenced = page_referenced(page, 1);
        /* In active use or really unfreeable? Activate it. */
        if (referenced && page_mapping_inuse(page))
                goto activate_locked;
The page activation scheme relies on mark_page_accessed() (an exported
function) to do the list move itself, which is the only way for in-cache,
non-mapped pages to be promoted to the active list.
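For illustration, the pre-patch behaviour described above can be modelled in a few lines of userspace C. This is a toy sketch, not kernel code: `struct mpa_page` and `mpa_mark_page_accessed()` are hypothetical stand-ins, and the real function moves the page between LRU lists under zone->lru_lock rather than flipping a boolean.

```c
#include <stdbool.h>

/* Toy stand-in for struct page: only the state relevant to activation. */
struct mpa_page {
        bool lru;        /* page is on an LRU list */
        bool active;     /* page is on the active list */
        bool referenced; /* page has been seen referenced once */
};

/*
 * Models the two-step promotion: the first access only sets the
 * referenced bit, the second access on an inactive LRU page performs
 * the "list move" (here: setting the active flag) directly.
 */
void mpa_mark_page_accessed(struct mpa_page *page)
{
        if (!page->active && page->referenced && page->lru) {
                page->active = true;      /* stands in for the list move */
                page->referenced = false;
        } else if (!page->referenced) {
                page->referenced = true;
        }
}
```

Calling the function twice on a fresh inactive page leaves it active with the referenced bit cleared, which is why every caller of mark_page_accessed() can trigger a list move.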
Rik's patch instead only sets the referenced bit at
mark_page_accessed(), changing the use-once logic to work by means
of a newly created PG_new flag. The flag, set at add_to_pagecache()
time, gives pages a second round on the inactive list in case they
get referenced. Page activation is then performed if the page is
re-referenced.
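A minimal sketch of the PG_new idea as described above, not the patch itself. All `pgnew_` names are hypothetical, and treating an unreferenced page as immediately reclaimable is an assumption of the sketch rather than something taken from the patch.

```c
#include <stdbool.h>

enum pgnew_verdict { PGNEW_RECLAIM, PGNEW_KEEP_INACTIVE, PGNEW_ACTIVATE };

struct pgnew_page {
        bool is_new;     /* models PG_new */
        bool referenced; /* models the referenced bit */
};

/* Pages enter the cache flagged as new and unreferenced. */
void pgnew_add_to_page_cache(struct pgnew_page *page)
{
        page->is_new = true;
        page->referenced = false;
}

/* With the patch, an access only sets the referenced bit: no list move. */
void pgnew_mark_page_accessed(struct pgnew_page *page)
{
        page->referenced = true;
}

/* Decision taken when the inactive-list scan reaches the page. */
enum pgnew_verdict pgnew_scan_inactive(struct pgnew_page *page)
{
        if (!page->referenced)
                return PGNEW_RECLAIM;        /* assumption: no reference, no reprieve */
        page->referenced = false;
        if (page->is_new) {
                page->is_new = false;        /* first reference: second round only */
                return PGNEW_KEEP_INACTIVE;
        }
        return PGNEW_ACTIVATE;               /* referenced again: promote */
}
```

A page touched once survives one extra pass on the inactive list and is then reclaimed; only a page referenced again during that second pass is activated.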
Another clear advantage of not doing the list move at mark_page_accessed()
time is decreased zone->lru_lock contention and cache thrashing in
general (profiling on SMP machines would be interesting).
A possibly negative side-effect of PG_new, already mentioned by Nikita
on this list, is that used-once pages linger longer in the cache, which
can slow down particular workloads (it should not be hard to create such
loads).
However, the ongoing non-resident bookkeeping implementation makes it
possible to completely get rid of the "second chance" behaviour: re-accessed
evicted pages are automatically promoted to the active list.
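To make the non-resident idea concrete, here is a toy sketch: remember the identity of evicted pages, and promote a page straight to the active list if it faults back in while still remembered. The table layout, its size and all `nonres_` names are assumptions made for this example and do not reflect the actual bookkeeping patches.

```c
#include <stdbool.h>
#include <stddef.h>

#define NONRES_SLOTS 64 /* arbitrary size chosen for the sketch */

struct nonres_table {
        unsigned long key[NONRES_SLOTS]; /* identity of an evicted page, 0 = empty */
};

static size_t nonres_slot(unsigned long key)
{
        return key % NONRES_SLOTS;
}

/* At eviction time: remember the page. A collision overwrites older history. */
void nonres_remember(struct nonres_table *t, unsigned long key)
{
        t->key[nonres_slot(key)] = key;
}

/*
 * At fault time: returns true if the page was evicted recently, in which
 * case the caller would insert it on the active list instead of the
 * inactive one. The entry is consumed by the lookup.
 */
bool nonres_recall(struct nonres_table *t, unsigned long key)
{
        size_t slot = nonres_slot(key);

        if (key != 0 && t->key[slot] == key) {
                t->key[slot] = 0;
                return true;
        }
        return false;
}
```

With such a history, a used-once page can be dropped on its first pass through the inactive list, because a wrong guess is corrected when the page is faulted in again.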
For example, here is a real scenario where use-once mmap() access is
performed:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0109.2/0078.html
Patch being used for the tests is:
http://programming.kicks-ass.net/kernel-patches/page-replace/2.6.16-rc1/use_once-cleanup.patch
And here are results of larger than RAM sequential access with mmap():
2.6-git-jan-12:
Command being timed: "iozone -B -s 143360 -i 1 -i 1 -i 1 -i 1 -w"
Percent of CPU this job got: 6%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:34.74
2.6-git-jan-12+useonce:
Command being timed: "iozone -B -s 143360 -i 1 -i 1 -i 1 -i 1 -w"
Percent of CPU this job got: 13%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.22
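The two runs above can be compared with a little arithmetic. The helpers below are only a sketch (the `bench_` names are made up here); the CPU-seconds estimate multiplies the reported CPU percentage by the elapsed time, so it inherits the rounding of the percentage.

```c
/* Ratio of elapsed times: values above 1.0 mean the patched run was faster. */
double bench_speedup(double base_s, double patched_s)
{
        return base_s / patched_s;
}

/* Wall-clock time saved by the patched run, as a percentage of the baseline. */
double bench_time_saved_pct(double base_s, double patched_s)
{
        return 100.0 * (base_s - patched_s) / base_s;
}

/* Approximate CPU seconds consumed, from the CPU percentage and elapsed time. */
double bench_cpu_seconds(double cpu_pct, double elapsed_s)
{
        return cpu_pct / 100.0 * elapsed_s;
}
```

With 34.74 s and 16.22 s this gives a speedup of about 2.14x, or roughly 53% less wall-clock time. The estimated CPU time is about 2.1 s in both runs, which suggests the higher CPU percentage on the patched kernel reflects the same work being done in half the time rather than extra overhead.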
And a few graphs of the active/inactive sizes with both read and mmap
mode, with the vanilla and use-once patched kernels:
http://hera.kernel.org/~marcelo/mm/iozone_useonce/iozone_useonce.html
It is possible to note that, even when using read(), the vanilla VM moves
used-once pages to the active list (i.e. the logic is not working as
expected).
I would vote for inclusion of the first version of use-once-cleanup
(without the arguable refill_inactive_zone() page_referenced change)
into -mm.
Comments?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: use-once-cleanup testing
From: Nick Piggin @ 2006-01-14 4:52 UTC
To: Marcelo Tosatti; +Cc: akpm, Peter Zijlstra, Rik van Riel, linux-mm

Marcelo Tosatti wrote:

>Hi folks,
>
>Rik's use-once cleanup patch (1) gets rid of a nasty problem. The
>use-once logic does not work for mmaped() files, due to the questionable
>assumption that any referenced pages of such files should be held in
>memory:
>
>1 - http://lwn.net/Articles/134387/
>
>static int shrink_list(struct list_head *page_list, struct scan_control *sc)
>{
>...
>        referenced = page_referenced(page, 1);
>        /* In active use or really unfreeable? Activate it. */
>        if (referenced && page_mapping_inuse(page))
>                goto activate_locked;
>
>The page activation scheme relies on mark_page_accessed() (exported
>function) to do the list move itself, which is the only way for in-cache
>non mapped pages to be promoted to the active list.
>
>Rik's patch instead only sets the referenced bit at
>mark_page_accessed(), changing the use-once logic to work by means
>of a newly created PG_new flag. The flag, set at add_to_pagecache()
>time, gives pages a second round on the inactive list in case they
>get referenced. Page activation is then performed if the page is
>re-referenced.
>

This is what I've done too (though I prefer a PG_useonce flag
which gets set after they're first seen referenced).

I think Wu may also be doing something like it for adaptive readahead.

Basically: it has been reinvented so many times that it *has* to be a
good idea ;)

>Another clear advantage of not doing the list move at mark_page_accessed()
>time is decreased zone->lru_lock contention and cache thrashing in
>general (profiling on SMP machines would be interesting).
>

It also allows one to get rid of the dirty hacks in mark_page_accessed
callers and means read() based useonce actually works properly in cases
where userspace isn't working in blocks of PAGE_SIZE (rsync I think was
one that did this, with fairly horrible results).

>A possibly negative side-effect of PG_new, already mentioned by Nikita
>in this list, is that used-once pages lurk around longer in cache, which
>can slow down particular workloads (it should not be hard to create such
>loads).
>

Yes, I found that also doing use-once on mapped pages caused fairly huge
slowdowns in some cases. File IO could much more easily cause X and its
applications to get swapped out.

>However, the ongoing non-resident book keeping implementation makes it
>possible to completely get rid of "second chance" behaviour: re-accessed
>evicted pages are automatically promoted to the active list.
>

Possibly. I think moving unmapped use-once over to PG_useonce first, and
tidying the weird warts and special cases (that don't make sense) from
vmscan is a good first step.

Unfortunately I don't think Andrew wants a bar of any of it. Nor would
a crazy rewrite-pagereclaim tree really get any sort of testing at all,
realistically :(

Ideas?
* Re: use-once-cleanup testing
From: Marcelo Tosatti @ 2006-01-14 4:53 UTC
To: Nick Piggin; +Cc: akpm, Peter Zijlstra, Rik van Riel, linux-mm

Hi Nick,

On Sat, Jan 14, 2006 at 03:52:58PM +1100, Nick Piggin wrote:
> Marcelo Tosatti wrote:
>
> >Hi folks,
> >
> >Rik's use-once cleanup patch (1) gets rid of a nasty problem. The
> >use-once logic does not work for mmaped() files, due to the questionable
> >assumption that any referenced pages of such files should be held in
> >memory:
> >
> >1 - http://lwn.net/Articles/134387/
> >
> >static int shrink_list(struct list_head *page_list, struct scan_control
> >*sc)
> >{
> >...
> >        referenced = page_referenced(page, 1);
> >        /* In active use or really unfreeable? Activate it. */
> >        if (referenced && page_mapping_inuse(page))
> >                goto activate_locked;
> >
> >The page activation scheme relies on mark_page_accessed() (exported
> >function) to do the list move itself, which is the only way for in-cache
> >non mapped pages to be promoted to the active list.
> >
> >Rik's patch instead only sets the referenced bit at
> >mark_page_accessed(), changing the use-once logic to work by means
> >of a newly created PG_new flag. The flag, set at add_to_pagecache()
> >time, gives pages a second round on the inactive list in case they
> >get referenced. Page activation is then performed if the page is
> >re-referenced.
> >
>
> This is what I've done too (though I prefer a PG_useonce flag
> which gets set after they're first seen referenced).
>
> I think Wu may also be doing something like it for adaptive readahead.
>
> Basically: it has been reinvented so many times that it *has* to be a
> good idea ;)

For most mixed loads, I think so. But certainly not for all.

> >Another clear advantage of not doing the list move at mark_page_accessed()
> >time is decreased zone->lru_lock contention and cache thrashing in
> >general (profiling on SMP machines would be interesting).
> >
>
> It also allows one to get rid of the dirty hacks in mark_page_accessed
> callers and means read() based useonce actually works properly in cases
> where userspace isn't working in blocks of PAGE_SIZE (rsync I think was
> one that did this, with fairly horrible results).
>
> >A possibly negative side-effect of PG_new, already mentioned by Nikita
> >in this list, is that used-once pages lurk around longer in cache, which
> >can slow down particular workloads (it should not be hard to create such
> >loads).
> >
>
> Yes, I found that also doing use-once on mapped pages caused fairly huge
> slowdowns in some cases. File IO could much more easily cause X and its
> applications to get swapped out.
>
> >However, the ongoing non-resident book keeping implementation makes it
> >possible to completely get rid of "second chance" behaviour: re-accessed
> >evicted pages are automatically promoted to the active list.
> >
>
> Possibly. I think moving unmapped use-once over to PG_useonce first, and
> tidying the weird warts and special cases (that don't make sense) from
> vmscan is a good first step.
>
> Unfortunately I don't think Andrew wants a bar of any of it. Nor would
> a crazy rewrite-pagereclaim tree really get any sort of testing at all,
> realistically :(
>
> Ideas?

I think that creating a page replacement interface used by the VM to
hide the details of the reclaim specifics is an important step forward,
allowing co-existence of different replacement policies. It opens up
many possibilities.

Peter started the abstraction of the page reclaim code for his CLOCK-Pro
implementation, and I've been working with him to improve it.

The current code is logically glued together: there is no distinction
between the reclaim cache interface and LRU, they are the same.

Please take a look at

http://programming.kicks-ass.net/kernel-patches/page-replace/2.6.16-rc1/page-replace-documentation.patch

and the related patches in that directory.

It is basically separating the actions invoked by the generic VM:

- book keeping of page information (insertion, deletion, reference,
  and so on).
- selection of pagecache candidates for eviction

And:

- balancing between slab/pagecache eviction
- page eviction
- page writeout

IMO they should all be separate, with shared helper functions, as the
document and patches suggest.

The current set makes the traditional LRU 2-queue and CLOCK-Pro policies
co-exist (at the very moment it contains several patches which change
behaviour such as Rik's PG_new, Wu's zone scanning balancing, but they
are not necessarily related to this).
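As a rough illustration of the separation proposed above, a replacement policy can be expressed as a table of operations that the generic VM calls without knowing which policy sits behind it. The sketch below is hypothetical throughout: `struct pr_policy`, its members and the trivial FIFO policy are invented for this example and are not the interface from the page-replace patches.

```c
#include <stddef.h>

struct pr_page {
        struct pr_page *next;
        int referenced;
};

/* Operations the generic VM would invoke; the policy owns the list layout. */
struct pr_policy {
        const char *name;
        void (*insert)(struct pr_page **list, struct pr_page *page);
        void (*reference)(struct pr_page *page);
        struct pr_page *(*candidate)(struct pr_page **list);
};

/* Book keeping: a FIFO policy appends new pages at the tail. */
static void pr_fifo_insert(struct pr_page **list, struct pr_page *page)
{
        page->next = NULL;
        page->referenced = 0;
        while (*list)
                list = &(*list)->next;
        *list = page;
}

/* Book keeping: record the reference; FIFO ignores it when selecting. */
static void pr_fifo_reference(struct pr_page *page)
{
        page->referenced = 1;
}

/* Candidate selection: the oldest page, or NULL if the list is empty. */
static struct pr_page *pr_fifo_candidate(struct pr_page **list)
{
        struct pr_page *victim = *list;

        if (victim) {
                *list = victim->next;
                victim->next = NULL;
        }
        return victim;
}

const struct pr_policy pr_fifo = {
        .name = "fifo",
        .insert = pr_fifo_insert,
        .reference = pr_fifo_reference,
        .candidate = pr_fifo_candidate,
};
```

A second policy would provide its own `struct pr_policy` instance, while eviction, writeout and slab balancing stay in shared code that only ever calls through the table.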
* Re: use-once-cleanup testing
From: Peter Zijlstra @ 2006-01-14 8:44 UTC
To: Nick Piggin; +Cc: Marcelo Tosatti, akpm, Rik van Riel, linux-mm, Bob Picco, Christoph Lameter

On Sat, 2006-01-14 at 15:52 +1100, Nick Piggin wrote:
> Unfortunately I don't think Andrew wants a bar of any of it. Nor would
> a crazy rewrite-pagereclaim tree really get any sort of testing at all,
> realistically :(
>

Both HP and SGI have shown interest in getting these patches in shape
and testing them, so I do think there is quite some interest in them.

I admit that there is still a lot of work to do, like getting the CART
policies into the new tree and NUMAfying the CLOCK-Pro and CART policies.
And of course rigorous testing.

Andrew, what would you need on top of that to start being interested?

Kind regards,
PeterZ
* Re: use-once-cleanup testing
From: Andrew Morton @ 2006-01-14 8:51 UTC
To: Peter Zijlstra; +Cc: piggin, marcelo.tosatti, riel, linux-mm, bob.picco, clameter

Peter Zijlstra <peter@programming.kicks-ass.net> wrote:
>
> Andrew, what would you need on top of that to start being interested?
>

A demonstration that the code will make sufficient improvement to justify
its inclusion, naturally ;)

Speedups should outweigh the slowdowns, no really bad corner cases.
* Re: use-once-cleanup testing
From: Rik van Riel @ 2006-01-16 13:06 UTC
To: Andrew Morton; +Cc: Peter Zijlstra, piggin, marcelo.tosatti, linux-mm, bob.picco, clameter

On Sat, 14 Jan 2006, Andrew Morton wrote:

> Speedups should outweigh the slowdowns, no really bad corner cases.

When it comes to corner cases, clock-pro gets my vote over
any of the alternative algorithms.

--
All Rights Reversed
* Re: use-once-cleanup testing
From: Rik van Riel @ 2006-01-16 13:05 UTC
To: Nick Piggin; +Cc: Marcelo Tosatti, akpm, Peter Zijlstra, linux-mm

On Sat, 14 Jan 2006, Nick Piggin wrote:

> Yes, I found that also doing use-once on mapped pages caused fairly huge
> slowdowns in some cases. File IO could much more easily cause X and its
> applications to get swapped out.

We can get rid of that effect easily by adding reclaim_mapped
logic to the inactive list scan. The zone previous_priority
will keep track of what to do when we start a scan...

> Possibly. I think moving unmapped use-once over to PG_useonce first, and
> tidying the weird warts and special cases (that don't make sense) from
> vmscan is a good first step.

Agreed, cleaning up the code first will make it a lot easier
to make improvements bit by bit.

--
All Rights Reversed