* [patch] page aging and deferred swapping for 2.4.0-test1
@ 2000-05-25 23:03 Rik van Riel
2000-05-25 23:48 ` Neil Schemenauer
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Rik van Riel @ 2000-05-25 23:03 UTC (permalink / raw)
To: linux-mm
Hi,
the attached patch attempts to implement the following two
things (which we'll probably want in the active/inactive
design later on):
- page aging (for active pages)
- deferred swap IO, with only unmapping in try_to_swap_out()
The patch still crashes, but maybe one of you has an idea
on what's wrong and/or even how to fix it ;)
regards,
Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.
Wanna talk about the kernel? irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/ http://www.surriel.com/
--- linux-2.4.0-test1/mm/filemap.c.orig Thu May 25 12:27:47 2000
+++ linux-2.4.0-test1/mm/filemap.c Thu May 25 19:54:06 2000
@@ -264,7 +264,15 @@
page = list_entry(page_lru, struct page, lru);
list_del(page_lru);
- if (PageTestandClearReferenced(page))
+ if (PageTestandClearReferenced(page)) {
+ page->age += 3;
+ if (page->age > 10)
+ page->age = 0;
+ goto dispose_continue;
+ }
+ page->age--;
+
+ if (page->age)
goto dispose_continue;
count--;
@@ -317,23 +325,30 @@
goto cache_unlock_continue;
/*
+ * Page is from a zone we don't care about.
+ * Don't drop page cache entries in vain.
+ */
+ if (page->zone->free_pages > page->zone->pages_high)
+ goto cache_unlock_continue;
+
+ /*
* Is it a page swap page? If so, we want to
* drop it if it is no longer used, even if it
* were to be marked referenced..
*/
if (PageSwapCache(page)) {
- spin_unlock(&pagecache_lock);
+ /* Write dirty swap cache page to swap. */
+ if (PageDeferswap(page)) {
+ if (gfp_mask & __GFP_IO) {
+ goto async_swap;
+ }
+ else
+ goto cache_unlock_continue;
+ }
__delete_from_swap_cache(page);
goto made_inode_progress;
}
- /*
- * Page is from a zone we don't care about.
- * Don't drop page cache entries in vain.
- */
- if (page->zone->free_pages > page->zone->pages_high)
- goto cache_unlock_continue;
-
/* is it a page-cache page? */
if (page->mapping) {
if (!PageDirty(page) && !pgcache_under_min()) {
@@ -351,6 +366,14 @@
unlock_continue:
spin_lock(&pagemap_lru_lock);
UnlockPage(page);
+ page_cache_release(page);
+ goto dispose_continue;
+async_swap:
+ page->flags &= ~((1 << PG_defer_swap) | (1 << PG_dirty));
+ spin_unlock(&pagecache_lock);
+ /* Do NOT unlock the page ... that is done after IO. */
+ rw_swap_page(WRITE, page, 0);
+ spin_lock(&pagemap_lru_lock);
page_cache_release(page);
dispose_continue:
list_add(page_lru, &lru_cache);
--- linux-2.4.0-test1/mm/page_alloc.c.orig Thu May 25 12:27:47 2000
+++ linux-2.4.0-test1/mm/page_alloc.c Thu May 25 18:37:44 2000
@@ -94,6 +94,8 @@
if (PageDecrAfter(page))
BUG();
+ page->age = 2;
+
zone = page->zone;
mask = (~0UL) << order;
--- linux-2.4.0-test1/mm/vmscan.c.orig Thu May 25 12:27:47 2000
+++ linux-2.4.0-test1/mm/vmscan.c Thu May 25 19:32:17 2000
@@ -62,6 +62,10 @@
goto out_failed;
}
+ /* Can only do this if we age all active pages. */
+ // if (page->age > 1)
+ // goto out_failed;
+
if (TryLockPage(page))
goto out_failed;
@@ -181,7 +185,11 @@
vmlist_access_unlock(vma->vm_mm);
/* OK, do a physical asynchronous write to swap. */
- rw_swap_page(WRITE, page, 0);
+ // rw_swap_page(WRITE, page, 0);
+ /* Let shrink_mmap handle this swapout. */
+ SetPageDirty(page);
+ SetPageDeferswap(page);
+ UnlockPage(page);
out_free_success:
page_cache_release(page);
--- linux-2.4.0-test1/include/linux/mm.h.orig Thu May 25 12:28:10 2000
+++ linux-2.4.0-test1/include/linux/mm.h Thu May 25 19:24:04 2000
@@ -153,6 +153,7 @@
struct buffer_head * buffers;
unsigned long virtual; /* nonzero if kmapped */
struct zone_struct *zone;
+ unsigned int age;
} mem_map_t;
#define get_page(p) atomic_inc(&(p)->count)
@@ -168,8 +169,8 @@
#define PG_uptodate 3
#define PG_dirty 4
#define PG_decr_after 5
-#define PG_unused_01 6
-#define PG__unused_02 7
+#define PG_defer_swap 6
+#define PG_active 7
#define PG_slab 8
#define PG_swap_cache 9
#define PG_skip 10
@@ -185,6 +186,7 @@
#define ClearPageUptodate(page) clear_bit(PG_uptodate, &(page)->flags)
#define PageDirty(page) test_bit(PG_dirty, &(page)->flags)
#define SetPageDirty(page) set_bit(PG_dirty, &(page)->flags)
+#define ClearPageDirty(page) clear_bit(PG_dirty, &(page)->flags)
#define PageLocked(page) test_bit(PG_locked, &(page)->flags)
#define LockPage(page) set_bit(PG_locked, &(page)->flags)
#define TryLockPage(page) test_and_set_bit(PG_locked, &(page)->flags)
@@ -192,6 +194,12 @@
clear_bit(PG_locked, &(page)->flags); \
wake_up(&page->wait); \
} while (0)
+#define PageDeferswap(page) test_bit(PG_defer_swap, &(page)->flags)
+#define SetPageDeferswap(page) set_bit(PG_defer_swap, &(page)->flags)
+#define ClearPageDeferswap(page) clear_bit(PG_defer_swap, &(page)->flags)
+#define PageActive(page) test_bit(PG_active, &(page)->flags)
+#define SetPageActive(page) set_bit(PG_active, &(page)->flags)
+#define ClearPageActive(page) clear_bit(PG_active, &(page)->flags)
#define PageError(page) test_bit(PG_error, &(page)->flags)
#define SetPageError(page) set_bit(PG_error, &(page)->flags)
#define ClearPageError(page) clear_bit(PG_error, &(page)->flags)
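The deferred-swap half of the patch can be read as a two-stage handoff: try_to_swap_out() only marks the page, and shrink_mmap() starts the write later, and only when the caller allows IO. The following is a minimal userspace sketch of that flow, not kernel code. The bit numbers are taken from the mm.h hunk; `struct toy_page`, `toy_swap_out()`, `toy_shrink()`, `writes_issued` and the value of `TOY_GFP_IO` are all hypothetical stand-ins introduced for illustration.

```c
#define PG_dirty       4      /* bit numbers as in the mm.h hunk */
#define PG_defer_swap  6
#define TOY_GFP_IO     0x1u   /* assumed stand-in for __GFP_IO, not the real value */

/* Hypothetical, stripped-down page: just the flag word and a write counter. */
struct toy_page {
	unsigned long flags;
	int writes_issued;
};

/* try_to_swap_out() side: no IO, only mark the page dirty and deferred. */
static void toy_swap_out(struct toy_page *p)
{
	p->flags |= (1UL << PG_dirty) | (1UL << PG_defer_swap);
}

/*
 * shrink_mmap() side: returns 1 if a write was started, 0 if the page
 * was skipped (nothing deferred, or the caller may not do IO).
 */
static int toy_shrink(struct toy_page *p, unsigned int gfp_mask)
{
	if (!(p->flags & (1UL << PG_defer_swap)))
		return 0;
	if (!(gfp_mask & TOY_GFP_IO))
		return 0;

	/* Same mask as the async_swap label: clear both bits before the write. */
	p->flags &= ~((1UL << PG_defer_swap) | (1UL << PG_dirty));
	p->writes_issued++;	/* stands in for rw_swap_page(WRITE, page, 0) */
	return 1;
}
```

In this model a page marked by `toy_swap_out()` stays untouched for callers without `TOY_GFP_IO` and is written exactly once by the first caller that has it. The sketch leaves out locking and reference counting, which is where a crash in the real patch would more plausibly hide.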
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] page aging and deferred swapping for 2.4.0-test1
2000-05-25 23:03 [patch] page aging and deferred swapping for 2.4.0-test1 Rik van Riel
@ 2000-05-25 23:48 ` Neil Schemenauer
2000-05-26 13:32 ` Roger Larsson
2000-05-26 14:59 ` Roger Larsson
2 siblings, 0 replies; 6+ messages in thread
From: Neil Schemenauer @ 2000-05-25 23:48 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-mm
On Thu, May 25, 2000 at 08:03:42PM -0300, Rik van Riel wrote:
> + if (PageTestandClearReferenced(page)) {
> + page->age += 3;
> + if (page->age > 10)
> + page->age = 0;
Why this test? Something like:
if (page->age < 10) {
page->age += 3;
}
makes more sense to me.
Neil
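The difference between the two variants is easy to check in isolation. Below is a small userspace sketch, assuming the constants 3 and 10 from the patch; the function names and the `age_after_hits()` helper are hypothetical and exist only for this comparison.

```c
#define AGE_STEP 3	/* increment used in the posted patch */
#define AGE_MAX  10	/* ceiling used in the posted patch */

/* Posted patch: overshooting the ceiling resets the age to 0. */
static unsigned int age_up_posted(unsigned int age)
{
	age += AGE_STEP;
	if (age > AGE_MAX)
		age = 0;
	return age;
}

/* Suggested variant: only add while the age is below the ceiling. */
static unsigned int age_up_guarded(unsigned int age)
{
	if (age < AGE_MAX)
		age += AGE_STEP;
	return age;
}

/* Hypothetical helper: apply one of the above for a number of references. */
static unsigned int age_after_hits(unsigned int (*up)(unsigned int),
				   unsigned int age, int hits)
{
	while (hits-- > 0)
		age = up(age);
	return age;
}
```

Starting from the age of 2 that the page_alloc.c hunk assigns, three references take the posted version through 5 and 8 and then back to 0, so the most frequently referenced page ends up looking the coldest. The guarded version stops growing once it reaches 10 or more, although it can still overshoot the nominal ceiling by up to 2.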
--
'Slashdot, with its uncontrolled content and participants' poor
impulse control, remains Internet culture's answer to "Lord of
the Flies."' - Salon
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] page aging and deferred swapping for 2.4.0-test1
2000-05-25 23:03 [patch] page aging and deferred swapping for 2.4.0-test1 Rik van Riel
2000-05-25 23:48 ` Neil Schemenauer
@ 2000-05-26 13:32 ` Roger Larsson
2000-05-26 13:41 ` Rik van Riel
2000-05-26 14:59 ` Roger Larsson
2 siblings, 1 reply; 6+ messages in thread
From: Roger Larsson @ 2000-05-26 13:32 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-mm
Rik van Riel wrote:
>
> Hi,
>
> the attached patch attempts to implement the following two
> things (which we'll probably want in the active/inactive
> design later on):
> - page aging (for active pages)
> - deferred swap IO, with only unmapping in try_to_swap_out()
>
> The patch still crashes, but maybe one of you has an idea
> on what's wrong and/or even how to fix it ;)
>
> regards,
>
> Rik
> --
> The Internet is not a network of computers. It is a network
> of people. That is its real strength.
>
> Wanna talk about the kernel? irc.openprojects.net / #kernelnewbies
> http://www.conectiva.com/ http://www.surriel.com/
>
The aging code can not be correct.
if (PageTestandClearReferenced(page)) {
page->age += 3;
if (page->age > 10)
page->age = 0;
goto dispose_continue;
}
page->age--;
if (page->age)
goto dispose_continue;
I would say it should be:
if (PageTestandClearReferenced(page)) {
page->age += 3;
if (page->age > 10)
page->age = 10;
goto dispose_continue;
}
if (page->age && priority) // at zero priority ignore age
goto dispose_continue;
page->age--;
/RogerL
Home page:
http://www.norran.net/nra02596/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] page aging and deferred swapping for 2.4.0-test1
2000-05-26 13:32 ` Roger Larsson
@ 2000-05-26 13:41 ` Rik van Riel
0 siblings, 0 replies; 6+ messages in thread
From: Rik van Riel @ 2000-05-26 13:41 UTC (permalink / raw)
To: Roger Larsson; +Cc: linux-mm
On Fri, 26 May 2000, Roger Larsson wrote:
> Rik van Riel wrote:
> > the attached patch attempts to implement the following two
> > things (which we'll probably want in the active/inactive
> > design later on):
> > - page aging (for active pages)
> > - deferred swap IO, with only unmapping in try_to_swap_out()
> The aging code can not be correct.
> if (PageTestandClearReferenced(page)) {
> page->age += 3;
> if (page->age > 10)
> page->age = 0;
> goto dispose_continue;
> }
> page->age--;
>
> if (page->age)
> goto dispose_continue;
True, there is one obvious error here...
> I would say it should be:
>
> if (PageTestandClearReferenced(page)) {
> page->age += 3;
> if (page->age > 10)
> page->age = 10;
> goto dispose_continue;
> }
>
> if (page->age && priority) // at zero priority ignore age
> goto dispose_continue;
>
> page->age--;
This is wrong too. It would mean that we'd never decrease the
page age unless priority == 0 ;)
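To make the objection concrete, here is a userspace sketch of the quoted variant, with the early exit placed before the decrement exactly as proposed. `struct toy_page` and `scan_proposed()` are hypothetical names, and `priority` is modelled as a plain int.

```c
/* Hypothetical, stripped-down page: age plus a referenced flag. */
struct toy_page {
	unsigned int age;
	int referenced;
};

/*
 * One scan step in the proposed order. Returns 1 if the page falls
 * through to the reclaim path, 0 if it is put back on the list.
 */
static int scan_proposed(struct toy_page *p, int priority)
{
	if (p->referenced) {
		p->referenced = 0;
		p->age += 3;
		if (p->age > 10)
			p->age = 10;
		return 0;
	}

	if (p->age && priority)	/* at zero priority ignore age */
		return 0;	/* leaves before the decrement is reached */

	p->age--;		/* note: still wraps if age is already 0 */
	return 1;
}
```

With any non-zero priority, an unreferenced page with a non-zero age is put back unchanged on every pass, so it never gets older.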
The fix is this:
}
- page->age--;
+ if (page->age)
+ page->age--;
(so we cannot get into a near-infinite loop when we decrease
the unsigned age when it's zero and have it wrap to infinite)
regards,
Rik
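The wrap is ordinary unsigned arithmetic and can be reproduced outside the kernel. This is a small sketch under the assumption that `age` is a plain `unsigned int`, as in the mm.h hunk; the function names are hypothetical.

```c
#include <limits.h>

/* Posted patch: unconditional decrement, which wraps when age is 0. */
static unsigned int age_down_posted(unsigned int age)
{
	return age - 1;
}

/* Fixed version: only decrement a non-zero age. */
static unsigned int age_down_fixed(unsigned int age)
{
	if (age)
		age--;
	return age;
}

/*
 * Number of scan passes before an unreferenced page reaches age 0
 * with the fixed decrement (assumes nobody references it meanwhile).
 */
static unsigned int passes_until_candidate(unsigned int age)
{
	unsigned int passes = 0;

	do {
		age = age_down_fixed(age);
		passes++;
	} while (age);
	return passes;
}
```

With the posted decrement a page at age 0 jumps to `UINT_MAX` and would need on the order of four billion further passes to come back down, which is the near-infinite loop described above. With the guard, a page at age 2 becomes a candidate after two passes.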
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] page aging and deferred swapping for 2.4.0-test1
2000-05-25 23:03 [patch] page aging and deferred swapping for 2.4.0-test1 Rik van Riel
2000-05-25 23:48 ` Neil Schemenauer
2000-05-26 13:32 ` Roger Larsson
@ 2000-05-26 14:59 ` Roger Larsson
2000-05-26 15:16 ` Rik van Riel
2 siblings, 1 reply; 6+ messages in thread
From: Roger Larsson @ 2000-05-26 14:59 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-mm
Hi,
Shouldn't lru_cache_add in swap.h initialize age?
#define lru_cache_add(page) \
do { \
spin_lock(&pagemap_lru_lock); \
list_add(&(page)->lru, &lru_cache); \
(page)->age = 5; \
nr_lru_pages++; \
spin_unlock(&pagemap_lru_lock); \
} while (0)
Rik van Riel wrote:
> --- linux-2.4.0-test1/mm/page_alloc.c.orig Thu May 25 12:27:47 2000
> +++ linux-2.4.0-test1/mm/page_alloc.c Thu May 25 18:37:44 2000
> @@ -94,6 +94,8 @@
> if (PageDecrAfter(page))
> BUG();
>
> + page->age = 2;
> +
hmm...
If this is a page that has been used much, isn't it penalized too
much, and don't we lose information...??? (all freed pages are the
same)
how about:
page->age /= 2;
Ok, it could race (read/write)...
and in try_to_swap_out (mm/vmscan.c) we could change to
/* Don't look at this pte if it's been accessed recently. */
if (pte_young(pte)) {
/*
* Transfer the "accessed" bit from the page
* tables to the global page map.
*/
set_pte(page_table, pte_mkold(pte));
page->age += 3;
goto out_failed;
}
/* Can only do this if we age all active pages. */
// if (page->age > 1)
// goto out_failed;
this would free a bit (PG_referenced would not be needed).
But it can race (read, write)
The races when updating page->age should not be critical.
(statistics...)
/RogerL
--
Home page:
http://www.norran.net/nra02596/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [patch] page aging and deferred swapping for 2.4.0-test1
2000-05-26 14:59 ` Roger Larsson
@ 2000-05-26 15:16 ` Rik van Riel
0 siblings, 0 replies; 6+ messages in thread
From: Rik van Riel @ 2000-05-26 15:16 UTC (permalink / raw)
To: Roger Larsson; +Cc: linux-mm
On Fri, 26 May 2000, Roger Larsson wrote:
> Shouldn't lru_cache_add in swap.h initialize age?
It doesn't particularly matter where that is done, but I guess
lru_cache_add is a good place for it when you take readability
into account.
> > --- linux-2.4.0-test1/mm/page_alloc.c.orig Thu May 25 12:27:47 2000
> > +++ linux-2.4.0-test1/mm/page_alloc.c Thu May 25 18:37:44 2000
> > @@ -94,6 +94,8 @@
> > if (PageDecrAfter(page))
> > BUG();
> >
> > + page->age = 2;
> > +
>
> hmm...
> If this is a page that has been used much, isn't it penalized too
> much, and don't we lose information...??? (all freed pages are the
> same)
You may want to read the code to see what __free_pages_ok
is actually used for.
> and in try_to_swap_out (mm/vmscan.c) we could change to
> /* Don't look at this pte if it's been accessed recently. */
> if (pte_young(pte)) {
> /*
> * Transfer the "accessed" bit from the page
> * tables to the global page map.
> */
> set_pte(page_table, pte_mkold(pte));
> page->age += 3;
> goto out_failed;
> }
This is dead wrong. Suppose the page isn't in the lru queue ...
its age would get upped to infinite values.
Also, if a page is shared between multiple ptes, we don't want
to bump its age once for every pte we scan. We are using the
PG_referenced bit exactly to avoid this bug.
regards,
Rik
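The shared-page argument can be illustrated with a toy model: one page mapped by several ptes, scanned once under each scheme. Everything here is a hypothetical userspace stand-in (`struct toy_page`, the three functions), and the clamp to 10 in `lru_scan()` is an assumption borrowed from the variants discussed in this thread, not a statement of what the final code does.

```c
/* Hypothetical, stripped-down page: age plus a referenced flag. */
struct toy_page {
	unsigned int age;
	int referenced;
};

/* Direct scheme: every young pte that maps the page bumps its age. */
static void pte_scan_direct(struct toy_page *p, int young_ptes)
{
	while (young_ptes-- > 0)
		p->age += 3;	/* unbounded: nothing clamps it here */
}

/* Flag scheme: young ptes only set the referenced flag. */
static void pte_scan_flag(struct toy_page *p, int young_ptes)
{
	if (young_ptes > 0)
		p->referenced = 1;
}

/* The LRU scan consumes the flag once, however many ptes set it. */
static void lru_scan(struct toy_page *p)
{
	if (p->referenced) {
		p->referenced = 0;
		p->age += 3;
		if (p->age > 10)	/* assumed clamp, see lead-in */
			p->age = 10;
	}
}
```

For a page at age 2 that is mapped by eight young ptes, the direct scheme ends at 26 after a single pass over the page tables, while the flag scheme followed by one LRU scan ends at 5. A page that is not on the LRU list at all would keep growing under the direct scheme, since nothing ever ages it back down.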
^ permalink raw reply [flat|nested] 6+ messages in thread