* Preswapping
@ 2005-08-18 21:58 Gregory Maxwell
2005-08-19 18:39 ` Preswapping Christoph Lameter
2005-08-22 22:48 ` Preswapping Marcelo Tosatti
0 siblings, 2 replies; 4+ messages in thread
From: Gregory Maxwell @ 2005-08-18 21:58 UTC (permalink / raw)
To: linux-mm
With the ability to measure something approximating least frequently
used inactive pages now, would it not make sense to begin more
aggressive nonevicting preswapping?
For example, if the swap disks are not busy, we scan the least
frequently used inactive pages, and write them out in nice large
chunks. The pages are moved to another list, but not evicted from
memory. The normal swapping algorithm is used to decide when/if to
actually evict these pages from memory. If they are used prior to
being evicted, they can be remarked active (and their blocks on swap
marked as unused) without a disk seek.
This approach makes sense because swapping performance is often
limited by seeks rather than disk throughput or capacity. Under
memory pressure, a system with preswapping has a substantial head start
on other systems because it is likely that the majority of the unneeded
pages are already on disk; all that is needed is to evict
them. Also, this process allows us to be very aggressive in what we
write to disk so that the truly useless pages get out, without running the
risk of overswapping on a system with plenty of free memory.
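A minimal userspace sketch of the page lifecycle described above may
make the idea easier to discuss. Everything in it (the state names,
struct sim_page, the three functions) is hypothetical and invented for
illustration; it is not kernel code and it models no real IO, it only
counts when a write would be needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page states for this sketch; not kernel definitions. */
enum page_state {
	PG_ACTIVE,	/* recently used */
	PG_INACTIVE,	/* reclaim candidate, no copy on swap */
	PG_PRESWAPPED,	/* still resident, copy already written to swap */
	PG_EVICTED	/* no longer resident, lives only on swap */
};

struct sim_page {
	enum page_state state;
	long swap_slot;		/* -1 when no swap block is held */
};

struct sim_stats {
	unsigned long writes;	/* pages written to swap so far */
};

/* Idle disk: write an inactive page to swap but keep it in memory. */
bool preswap_page(struct sim_page *p, long slot, struct sim_stats *st)
{
	if (p->state != PG_INACTIVE)
		return false;
	assert(p->swap_slot == -1);
	p->swap_slot = slot;
	p->state = PG_PRESWAPPED;
	st->writes++;
	return true;
}

/* Page used before eviction: reactivate it and release its swap block. */
void touch_page(struct sim_page *p)
{
	if (p->state == PG_PRESWAPPED || p->state == PG_INACTIVE) {
		p->swap_slot = -1;
		p->state = PG_ACTIVE;
	}
}

/* Memory pressure: a preswapped page is dropped without any new write. */
bool evict_page(struct sim_page *p, long slot, struct sim_stats *st)
{
	if (p->state == PG_INACTIVE) {
		/* Not preswapped: the write has to happen now. */
		p->swap_slot = slot;
		st->writes++;
	} else if (p->state != PG_PRESWAPPED) {
		return false;	/* active or already evicted */
	}
	p->state = PG_EVICTED;
	return true;
}
```

The point of the sketch is that evict_page() on a preswapped page
changes state without touching st->writes, which is where the head
start under memory pressure would come from.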
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: Preswapping
2005-08-18 21:58 Preswapping Gregory Maxwell
@ 2005-08-19 18:39 ` Christoph Lameter
2005-08-19 20:20 ` Preswapping Gregory Maxwell
2005-08-22 22:48 ` Preswapping Marcelo Tosatti
1 sibling, 1 reply; 4+ messages in thread
From: Christoph Lameter @ 2005-08-19 18:39 UTC (permalink / raw)
To: Gregory Maxwell; +Cc: linux-mm
On Thu, 18 Aug 2005, Gregory Maxwell wrote:
> With the ability to measure something approximating least frequently
> used inactive pages now, would it not make sense to begin more
> aggressive nonevicting preswapping?
Maybe. What would be the overhead for cases in which swapping is not
needed?
> For example, if the swap disks are not busy, we scan the least
> frequently used inactive pages, and write them out in nice large
> chunks. The pages are moved to another list, but not evicted from
> memory. The normal swapping algorithm is used to decide when/if to
> actually evict these pages from memory. If they are used prior to
> being evicted, they can be remarked active (and their blocks on swap
> marked as unused) without a disk seek.
If you write out the pages, then you could simply mark them as clean and
note their location in swap space.
* Re: Preswapping
2005-08-19 18:39 ` Preswapping Christoph Lameter
@ 2005-08-19 20:20 ` Gregory Maxwell
0 siblings, 0 replies; 4+ messages in thread
From: Gregory Maxwell @ 2005-08-19 20:20 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
On 8/19/05, Christoph Lameter <clameter@engr.sgi.com> wrote:
> On Thu, 18 Aug 2005, Gregory Maxwell wrote:
>
> > With the ability to measure something approximating least frequently
> > used inactive pages now, would it not make sense to begin more
> > aggressive nonevicting preswapping?
>
> Maybe. What would be the overhead for cases in which swapping is not
> needed?
Extraneous disk IO, perhaps a little extra overhead in having another
list to walk, and oddball additional allocations on the swap partition.
I think none of these would be insurmountable obstacles. Write-out
should heed the laptop mode setting to avoid spinning up the disk, and
the activity should be suppressed whenever the disk is busy.
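As a rough sketch of those two gating rules, assuming the needed
information is available at all: the struct and its fields below are
hypothetical names made up for this example, not existing kernel
interfaces, and reading "heed laptop mode" as "never wake a sleeping
disk for preswap" is my interpretation.

```c
#include <stdbool.h>

/* Hypothetical inputs; none of these are real kernel interfaces. */
struct preswap_conditions {
	bool laptop_mode;	/* user wants the disk kept spun down */
	bool disk_spinning;	/* disk is already active */
	unsigned int inflight;	/* requests queued on the swap device */
};

/* Decide whether background preswap write-out may run right now. */
bool may_preswap(const struct preswap_conditions *c)
{
	/* Never compete with real IO. */
	if (c->inflight > 0)
		return false;
	/* In laptop mode, do not wake a sleeping disk just for preswap. */
	if (c->laptop_mode && !c->disk_spinning)
		return false;
	return true;
}
```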
This also puts things in potentially better shape so that things can
be swapped out in nice contiguous runs and swapped-in with nice
contiguous runs.
A further step might be to arrange things so that preemptive swapping
and swsusp share many of the same structures, so a preemptive
swapping box would just be perpetually setting up its freeze, and a
suspend to disk would just require quiescing the processes and pushing
out the (hopefully few) remaining pages.
* Re: Preswapping
2005-08-18 21:58 Preswapping Gregory Maxwell
2005-08-19 18:39 ` Preswapping Christoph Lameter
@ 2005-08-22 22:48 ` Marcelo Tosatti
1 sibling, 0 replies; 4+ messages in thread
From: Marcelo Tosatti @ 2005-08-22 22:48 UTC (permalink / raw)
To: Gregory Maxwell; +Cc: linux-mm
On Thu, Aug 18, 2005 at 05:58:57PM -0400, Gregory Maxwell wrote:
> With the ability to measure something approximating least frequently
> used inactive pages now, would it not make sense to begin more
> aggressive nonevicting preswapping?
I think that some kinds of applications might benefit while others
can be hurt. One factor is whether or not there is locality.
If the accesses are very random, increasing readahead might hurt?
Why don't you do some testing? The default readahead is (1 << page_cluster)
pages, set in mm/swap.c:

	/* Use a smaller cluster for small-memory machines */
	if (megs < 16)
		page_cluster = 2;
	else
		page_cluster = 3;

Which is 8 pages (32 KB with 4 KB pages) on machines with 16 MB or more.
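To spell out the arithmetic, here is a small standalone sketch. The
function names are made up for this example, and the 4096-byte page
size passed in is an assumption (it is the i386 value); only the
threshold and the two page_cluster values come from the snippet above.

```c
#include <assert.h>

/* Mirrors the selection in the mm/swap.c snippet quoted above. */
int pick_page_cluster(unsigned long megs)
{
	return megs < 16 ? 2 : 3;
}

/* Pages per swap readahead cluster: 1 << page_cluster. */
unsigned long cluster_pages(int page_cluster)
{
	assert(page_cluster >= 0 && page_cluster < 16);
	return 1UL << page_cluster;
}

/* Bytes covered by one cluster for a given page size. */
unsigned long cluster_bytes(int page_cluster, unsigned long page_size)
{
	return cluster_pages(page_cluster) * page_size;
}
```

With 4 KB pages, page_cluster = 3 gives 8 pages or 32768 bytes per
cluster, and page_cluster = 2 gives 4 pages or 16384 bytes.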
The qsbench test should be pretty random (it does a quick sort on
large amounts of data). And then you could use a workload where locality
is more significant (a few parallel fillmem instances, for example).
> For example, if the swap disks are not busy, we scan the least
> frequently used inactive pages, and write them out in nice large
> chunks.
Yes, that could be done for every pagecache page on the VM reclaim path
(and probably the pdflush path too, which controls the dirty limits
and buffer age).
And you can relatively easily find contiguous dirty pages in the per-inode
mapping via the radix tree with radix_tree_gang_lookup().
Hopefully the pages are contiguous on disk too; the IO could be discarded
otherwise.
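To illustrate the grouping step only: assuming a gang lookup on the
radix tree has already produced the page indices in ascending order,
splitting them into contiguous runs could look roughly like the
userspace sketch below. struct page_run and find_runs() are
hypothetical names for this example, not kernel functions, and nothing
here checks whether the runs are also contiguous on disk.

```c
#include <assert.h>
#include <stddef.h>

/* One contiguous range of page indices: [start, start + len). */
struct page_run {
	unsigned long start;
	unsigned long len;
};

/*
 * Split a sorted, duplicate-free array of page indices into contiguous
 * runs. Returns the number of runs written, at most max_runs; indices
 * that would need a further run are left out.
 */
size_t find_runs(const unsigned long *idx, size_t n,
		 struct page_run *runs, size_t max_runs)
{
	size_t nr = 0;

	for (size_t i = 0; i < n; i++) {
		/* Input must be strictly ascending. */
		assert(i == 0 || idx[i] > idx[i - 1]);

		/* Extend the current run if this index follows it directly. */
		if (nr > 0 && runs[nr - 1].start + runs[nr - 1].len == idx[i]) {
			runs[nr - 1].len++;
			continue;
		}
		if (nr == max_runs)
			break;
		runs[nr].start = idx[i];
		runs[nr].len = 1;
		nr++;
	}
	return nr;
}
```

Each resulting run would then be a candidate for a single large write,
with short runs skipped if they are not worth a seek.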
> The pages are moved to another list, but not evicted from
> memory. The normal swapping algorithm is used to decide when/if to
> actually evict these pages from memory. If they are used prior to
> being evicted, they can be remarked active (and their blocks on swap
> marked as unused) without a disk seek.
>
> This approach makes sense because swapping performance is often
> limited by seeks rather than disk throughput or capacity. While under
> memory pressure a system with preswapping has a substantial head start
> on other systems because it is likely that majority of the unneeded
> pages are going to already be on disk, all that is needed is to evict
> them. Also, this process allows us to be very aggressive in what we
> write to disk so that the truly useless pages get out, but not run the
> risk of overswapping on a system with plenty of free memory.
Yes it probably helps - you should try it out.
end of thread, other threads:[~2005-08-22 22:48 UTC | newest]
Thread overview: 4+ messages
2005-08-18 21:58 Preswapping Gregory Maxwell
2005-08-19 18:39 ` Preswapping Christoph Lameter
2005-08-19 20:20 ` Preswapping Gregory Maxwell
2005-08-22 22:48 ` Preswapping Marcelo Tosatti