linux-mm.kvack.org archive mirror
* [patch] page aging and deferred swapping for 2.4.0-test1
@ 2000-05-25 23:03 Rik van Riel
  2000-05-25 23:48 ` Neil Schemenauer
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Rik van Riel @ 2000-05-25 23:03 UTC (permalink / raw)
  To: linux-mm

Hi,

the attached patch attempts to implement the following two
things (which we'll probably want in the active/inactive
design later on):
- page aging (for active pages)
- deferred swap IO, with only unmapping in try_to_swap_out()

The patch still crashes, but maybe one of you has an idea
on what's wrong and/or even how to fix it ;)

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

Wanna talk about the kernel?  irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/		http://www.surriel.com/




--- linux-2.4.0-test1/mm/filemap.c.orig	Thu May 25 12:27:47 2000
+++ linux-2.4.0-test1/mm/filemap.c	Thu May 25 19:54:06 2000
@@ -264,7 +264,15 @@
 		page = list_entry(page_lru, struct page, lru);
 		list_del(page_lru);
 
-		if (PageTestandClearReferenced(page))
+		if (PageTestandClearReferenced(page)) {
+			page->age += 3;
+			if (page->age > 10)
+				page->age = 0;
+			goto dispose_continue;
+		}
+		page->age--;
+
+		if (page->age)
 			goto dispose_continue;
 
 		count--;
@@ -317,23 +325,30 @@
 			goto cache_unlock_continue;
 
 		/*
+		 * Page is from a zone we don't care about.
+		 * Don't drop page cache entries in vain.
+		 */
+		if (page->zone->free_pages > page->zone->pages_high)
+			goto cache_unlock_continue;
+
+		/*
 		 * Is it a page swap page? If so, we want to
 		 * drop it if it is no longer used, even if it
 		 * were to be marked referenced..
 		 */
 		if (PageSwapCache(page)) {
-			spin_unlock(&pagecache_lock);
+			/* Write dirty swap cache page to swap. */
+			if (PageDeferswap(page)) {
+				if (gfp_mask & __GFP_IO) {
+					goto async_swap;
+				}
+				else
+					goto cache_unlock_continue;
+			}
 			__delete_from_swap_cache(page);
 			goto made_inode_progress;
 		}	
 
-		/*
-		 * Page is from a zone we don't care about.
-		 * Don't drop page cache entries in vain.
-		 */
-		if (page->zone->free_pages > page->zone->pages_high)
-			goto cache_unlock_continue;
-
 		/* is it a page-cache page? */
 		if (page->mapping) {
 			if (!PageDirty(page) && !pgcache_under_min()) {
@@ -351,6 +366,14 @@
 unlock_continue:
 		spin_lock(&pagemap_lru_lock);
 		UnlockPage(page);
+		page_cache_release(page);
+		goto dispose_continue;
+async_swap:
+		page->flags &= ~((1 << PG_defer_swap) | (1 << PG_dirty));
+		spin_unlock(&pagecache_lock);
+		/* Do NOT unlock the page ... that is done after IO. */
+		rw_swap_page(WRITE, page, 0);
+		spin_lock(&pagemap_lru_lock);
 		page_cache_release(page);
 dispose_continue:
 		list_add(page_lru, &lru_cache);
--- linux-2.4.0-test1/mm/page_alloc.c.orig	Thu May 25 12:27:47 2000
+++ linux-2.4.0-test1/mm/page_alloc.c	Thu May 25 18:37:44 2000
@@ -94,6 +94,8 @@
 	if (PageDecrAfter(page))
 		BUG();
 
+	page->age = 2;
+
 	zone = page->zone;
 
 	mask = (~0UL) << order;
--- linux-2.4.0-test1/mm/vmscan.c.orig	Thu May 25 12:27:47 2000
+++ linux-2.4.0-test1/mm/vmscan.c	Thu May 25 19:32:17 2000
@@ -62,6 +62,10 @@
 		goto out_failed;
 	}
 
+	/* Can only do this if we age all active pages. */
+	// if (page->age > 1)
+	//	goto out_failed;
+
 	if (TryLockPage(page))
 		goto out_failed;
 
@@ -181,7 +185,11 @@
 	vmlist_access_unlock(vma->vm_mm);
 
 	/* OK, do a physical asynchronous write to swap.  */
-	rw_swap_page(WRITE, page, 0);
+	// rw_swap_page(WRITE, page, 0);
+	/* Let shrink_mmap handle this swapout. */
+	SetPageDirty(page);
+	SetPageDeferswap(page);
+	UnlockPage(page);
 
 out_free_success:
 	page_cache_release(page);
--- linux-2.4.0-test1/include/linux/mm.h.orig	Thu May 25 12:28:10 2000
+++ linux-2.4.0-test1/include/linux/mm.h	Thu May 25 19:24:04 2000
@@ -153,6 +153,7 @@
 	struct buffer_head * buffers;
 	unsigned long virtual; /* nonzero if kmapped */
 	struct zone_struct *zone;
+	unsigned int age;
 } mem_map_t;
 
 #define get_page(p)		atomic_inc(&(p)->count)
@@ -168,8 +169,8 @@
 #define PG_uptodate		 3
 #define PG_dirty		 4
 #define PG_decr_after		 5
-#define PG_unused_01		 6
-#define PG__unused_02		 7
+#define PG_defer_swap		 6
+#define PG_active		 7
 #define PG_slab			 8
 #define PG_swap_cache		 9
 #define PG_skip			10
@@ -185,6 +186,7 @@
 #define ClearPageUptodate(page)	clear_bit(PG_uptodate, &(page)->flags)
 #define PageDirty(page)		test_bit(PG_dirty, &(page)->flags)
 #define SetPageDirty(page)	set_bit(PG_dirty, &(page)->flags)
+#define ClearPageDirty(page)	clear_bit(PG_dirty, &(page)->flags)
 #define PageLocked(page)	test_bit(PG_locked, &(page)->flags)
 #define LockPage(page)		set_bit(PG_locked, &(page)->flags)
 #define TryLockPage(page)	test_and_set_bit(PG_locked, &(page)->flags)
@@ -192,6 +194,12 @@
 					clear_bit(PG_locked, &(page)->flags); \
 					wake_up(&page->wait); \
 				} while (0)
+#define PageDeferswap(page)	test_bit(PG_defer_swap, &(page)->flags)
+#define SetPageDeferswap(page)	set_bit(PG_defer_swap, &(page)->flags)
+#define ClearPageDeferswap(page) clear_bit(PG_defer_swap, &(page)->flags)
+#define PageActive(page)	test_bit(PG_active, &(page)->flags)
+#define SetPageActive(page)	set_bit(PG_active, &(page)->flags)
+#define ClearPageActive(page)	clear_bit(PG_active, &(page)->flags)
 #define PageError(page)		test_bit(PG_error, &(page)->flags)
 #define SetPageError(page)	set_bit(PG_error, &(page)->flags)
 #define ClearPageError(page)	clear_bit(PG_error, &(page)->flags)
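
The deferred-swap handoff in the patch can be modelled in user space roughly as follows. This is only a sketch: the helper names (test_flag, set_flag, mark_deferred, start_deferred_io) are made up for illustration, the bit numbers are copied from the mm.h hunk above, and plain non-atomic bit arithmetic stands in for the kernel's set_bit()/test_bit().

```c
#include <assert.h>	/* for the checks mentioned in the note below */

/* Bit numbers as defined in the mm.h hunk of the patch. */
#define PG_dirty	4
#define PG_defer_swap	6

/* Hypothetical helpers; the kernel uses atomic set_bit()/test_bit(). */
static int test_flag(unsigned long flags, int bit)
{
	return (int)((flags >> bit) & 1UL);
}

static unsigned long set_flag(unsigned long flags, int bit)
{
	return flags | (1UL << bit);
}

/* try_to_swap_out() side: mark the page instead of starting the write. */
static unsigned long mark_deferred(unsigned long flags)
{
	flags = set_flag(flags, PG_dirty);
	flags = set_flag(flags, PG_defer_swap);
	return flags;
}

/* shrink_mmap() side (async_swap label): clear both bits before the IO. */
static unsigned long start_deferred_io(unsigned long flags)
{
	return flags & ~((1UL << PG_defer_swap) | (1UL << PG_dirty));
}
```

In this model a page marked by mark_deferred() has both bits set, and start_deferred_io() clears exactly those two bits while leaving any other flag untouched.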

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/


* Re: [patch] page aging and deferred swapping for 2.4.0-test1
  2000-05-25 23:03 [patch] page aging and deferred swapping for 2.4.0-test1 Rik van Riel
@ 2000-05-25 23:48 ` Neil Schemenauer
  2000-05-26 13:32 ` Roger Larsson
  2000-05-26 14:59 ` Roger Larsson
  2 siblings, 0 replies; 6+ messages in thread
From: Neil Schemenauer @ 2000-05-25 23:48 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-mm

On Thu, May 25, 2000 at 08:03:42PM -0300, Rik van Riel wrote:
> +		if (PageTestandClearReferenced(page)) {
> +			page->age += 3;
> +			if (page->age > 10)
> +				page->age = 0;

Why this test?  Something like:

    if (page->age < 10) {
        page->age += 3;
    }

makes more sense to me.
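
To make the difference concrete, here is a small user-space sketch of the two variants. The function names are hypothetical and nothing here is kernel code; it only reproduces the arithmetic of the quoted hunk and of the suggestion above.

```c
#include <assert.h>	/* for the checks mentioned in the note below */

/* As posted in the patch: an age that goes past 10 is reset to 0. */
static unsigned int bump_as_posted(unsigned int age)
{
	age += 3;
	if (age > 10)
		age = 0;
	return age;
}

/* As suggested above: only add while the age is still below 10. */
static unsigned int bump_suggested(unsigned int age)
{
	if (age < 10)
		age += 3;
	return age;
}
```

Starting from an age of 8, the posted version drops the page to 0 on its next reference, while the suggested version moves it to 11 and then leaves it there. Note that the suggested form bounds the age rather than clamping it to 10: from 9 it can still reach 12.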

    Neil

-- 
'Slashdot, with its uncontrolled content and participants' poor
impulse control, remains Internet culture's answer to "Lord of
the Flies."' - Salon

* Re: [patch] page aging and deferred swapping for 2.4.0-test1
  2000-05-25 23:03 [patch] page aging and deferred swapping for 2.4.0-test1 Rik van Riel
  2000-05-25 23:48 ` Neil Schemenauer
@ 2000-05-26 13:32 ` Roger Larsson
  2000-05-26 13:41   ` Rik van Riel
  2000-05-26 14:59 ` Roger Larsson
  2 siblings, 1 reply; 6+ messages in thread
From: Roger Larsson @ 2000-05-26 13:32 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-mm

Rik van Riel wrote:
> 
> Hi,
> 
> the attached patch attempts to implement the following two
> things (which we'll probably want in the active/inactive
> design later on):
> - page aging (for active pages)
> - deferred swap IO, with only unmapping in try_to_swap_out()
> 
> The patch still crashes, but maybe one of you has an idea
> on what's wrong and/or even how to fix it ;)
> 
> regards,
> 
> Rik


The aging code cannot be correct:
		if (PageTestandClearReferenced(page)) {
			page->age += 3;
			if (page->age > 10)
				page->age = 0;
			goto dispose_continue;
		}
		page->age--;

		if (page->age)
			goto dispose_continue;

I would say it should be:

		if (PageTestandClearReferenced(page)) {
			page->age += 3;
			if (page->age > 10)
				page->age = 10;
			goto dispose_continue;
		}

		if (page->age && priority)  // at zero priority ignore age
			goto dispose_continue;

		page->age--;

/RogerL


Home page:
  http://www.norran.net/nra02596/

* Re: [patch] page aging and deferred swapping for 2.4.0-test1
  2000-05-26 13:32 ` Roger Larsson
@ 2000-05-26 13:41   ` Rik van Riel
  0 siblings, 0 replies; 6+ messages in thread
From: Rik van Riel @ 2000-05-26 13:41 UTC (permalink / raw)
  To: Roger Larsson; +Cc: linux-mm

On Fri, 26 May 2000, Roger Larsson wrote:
> Rik van Riel wrote:

> > the attached patch attempts to implement the following two
> > things (which we'll probably want in the active/inactive
> > design later on):
> > - page aging (for active pages)
> > - deferred swap IO, with only unmapping in try_to_swap_out()

> The aging code can not be correct.
> 		if (PageTestandClearReferenced(page)) {
> 			page->age += 3;
> 			if (page->age > 10)
> 				page->age = 0;
> 			goto dispose_continue;
> 		}
> 		page->age--;
> 
> 		if (page->age)
> 			goto dispose_continue;

True, there is one obvious error here...

> I would say it should be:
> 
> 		if (PageTestandClearReferenced(page)) {
> 			page->age += 3;
> 			if (page->age > 10)
> 				page->age = 10;
> 			goto dispose_continue;
> 		}
> 
> 		if (page->age && priority)  // at zero priority ignore age
> 			goto dispose_continue;
> 
> 		page->age--;

This is wrong too. It would mean that we'd never decrease the
page age unless priority == 0 ;)
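
A user-space sketch of the unreferenced path in the quoted proposal shows the effect. The names pass_proposed and run_passes are made up for this illustration, and the priority value used in the note below is arbitrary.

```c
#include <assert.h>	/* for the checks mentioned in the note below */

/*
 * One pass over an unreferenced page, following the quoted proposal:
 * the decrement is only reached when age == 0 or priority == 0.
 */
static unsigned int pass_proposed(unsigned int age, int priority)
{
	if (age && priority)
		return age;	/* goto dispose_continue: age untouched */
	age--;
	return age;
}

/* Apply the pass repeatedly and return the resulting age. */
static unsigned int run_passes(unsigned int age, int priority, int passes)
{
	for (int i = 0; i < passes; i++)
		age = pass_proposed(age, priority);
	return age;
}
```

With a nonzero priority, a page that starts at age 5 is still at age 5 after 100 passes; only with priority 0 does it go down (two passes take it from 5 to 3).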

The fix is this:

}
-	page->age--;
+	if (page->age)
+		page->age--;


(so that decrementing the unsigned age when it is already zero
cannot wrap it around to a huge value and leave us in a
near-infinite loop)
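
The wraparound is easy to reproduce in user space. A minimal sketch, with hypothetical function names, relying only on the fact that unsigned arithmetic in C wraps modulo UINT_MAX + 1:

```c
#include <assert.h>	/* for the checks mentioned in the note below */
#include <limits.h>

/* Unguarded decrement, as in the posted patch. */
static unsigned int age_down_unguarded(unsigned int age)
{
	age--;		/* an age of 0 wraps to UINT_MAX */
	return age;
}

/* Guarded decrement, as in the fix above. */
static unsigned int age_down_guarded(unsigned int age)
{
	if (age)
		age--;
	return age;
}
```

The unguarded version turns an age of 0 into UINT_MAX, so the page would need that many further passes before `if (page->age)` lets it through; the guarded version leaves 0 at 0.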

regards,

Rik


* Re: [patch] page aging and deferred swapping for 2.4.0-test1
  2000-05-25 23:03 [patch] page aging and deferred swapping for 2.4.0-test1 Rik van Riel
  2000-05-25 23:48 ` Neil Schemenauer
  2000-05-26 13:32 ` Roger Larsson
@ 2000-05-26 14:59 ` Roger Larsson
  2000-05-26 15:16   ` Rik van Riel
  2 siblings, 1 reply; 6+ messages in thread
From: Roger Larsson @ 2000-05-26 14:59 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-mm

Hi,

Shouldn't lru_cache_add in swap.h initialize age?


#define	lru_cache_add(page)			\
do {						\
	spin_lock(&pagemap_lru_lock);		\
	list_add(&(page)->lru, &lru_cache);	\
        (page)->age = 5;                        \
	nr_lru_pages++;				\
	spin_unlock(&pagemap_lru_lock);		\
} while (0)


Rik van Riel wrote:

> --- linux-2.4.0-test1/mm/page_alloc.c.orig      Thu May 25 12:27:47 2000
> +++ linux-2.4.0-test1/mm/page_alloc.c   Thu May 25 18:37:44 2000
> @@ -94,6 +94,8 @@
>         if (PageDecrAfter(page))
>                 BUG();
> 
> +       page->age = 2;
> +

hmm...
If this is a page that has been used much, isn't it penalized too
much, and don't we lose information...??? (all freed pages are the
same)

how about:
	page->age /= 2;

Ok, it could race (read/write)...


and in try_to_swap_out (mm/vmscan.c) we could change to
	/* Don't look at this pte if it's been accessed recently. */
	if (pte_young(pte)) {
		/*
		 * Transfer the "accessed" bit from the page
		 * tables to the global page map.
		 */
		set_pte(page_table, pte_mkold(pte));
                page->age += 3;
		goto out_failed;
	}

	/* Can only do this if we age all active pages. */
	// if (page->age > 1)
	//	goto out_failed;

this would free a bit (PG_referenced would not be needed).
But it can race (read, write)

The races when updating page->age should not be critical.
(statistics...)

/RogerL
 
--
Home page:
  http://www.norran.net/nra02596/

* Re: [patch] page aging and deferred swapping for 2.4.0-test1
  2000-05-26 14:59 ` Roger Larsson
@ 2000-05-26 15:16   ` Rik van Riel
  0 siblings, 0 replies; 6+ messages in thread
From: Rik van Riel @ 2000-05-26 15:16 UTC (permalink / raw)
  To: Roger Larsson; +Cc: linux-mm

On Fri, 26 May 2000, Roger Larsson wrote:

> Shouldn't lru_cache_add in swap.h initialize age?

It doesn't particularly matter where that is done, but I guess
lru_cache_add is a good place for it when you take readability
into account.

> > --- linux-2.4.0-test1/mm/page_alloc.c.orig      Thu May 25 12:27:47 2000
> > +++ linux-2.4.0-test1/mm/page_alloc.c   Thu May 25 18:37:44 2000
> > @@ -94,6 +94,8 @@
> >         if (PageDecrAfter(page))
> >                 BUG();
> > 
> > +       page->age = 2;
> > +
> 
> hmm...
> If this is a page that has been used much, isn't it penalized too
> much, and don't we lose information...??? (all freed pages are the
> same)

You may want to read the code to see what __free_pages_ok
is actually used for.

> and in try_to_swap_out (mm/vmscan.c) we could change to
> 	/* Don't look at this pte if it's been accessed recently. */
> 	if (pte_young(pte)) {
> 		/*
> 		 * Transfer the "accessed" bit from the page
> 		 * tables to the global page map.
> 		 */
> 		set_pte(page_table, pte_mkold(pte));
>                 page->age += 3;
> 		goto out_failed;
> 	}

This is dead wrong. Suppose the page isn't in the lru queue ...
its age would get upped to infinite values.

Also, if a page is shared between multiple ptes, we don't want
to bump its age once for every pte we scan. We are using the
PG_referenced bit exactly to avoid this bug.
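
The shared-page problem can be sketched in user space as follows. Everything here is illustrative: struct toy_page and the function names are made up, and the clamp to 10 in lru_pass() is an assumption borrowed from the suggestion earlier in the thread, not something the patch does.

```c
#include <assert.h>	/* for the checks mentioned in the note below */
#include <stdbool.h>

/* Toy stand-in for the two struct page fields that matter here. */
struct toy_page {
	unsigned int age;
	bool referenced;
};

/* Quoted proposal: bump the age once for every young pte found. */
static void scan_ptes_bump_age(struct toy_page *page, int young_ptes)
{
	for (int i = 0; i < young_ptes; i++)
		page->age += 3;
}

/* Referenced-bit approach: setting the bit again changes nothing. */
static void scan_ptes_set_referenced(struct toy_page *page, int young_ptes)
{
	for (int i = 0; i < young_ptes; i++)
		page->referenced = true;
}

/* One LRU pass: consume the bit and age the page exactly once. */
static void lru_pass(struct toy_page *page)
{
	if (page->referenced) {
		page->referenced = false;
		page->age += 3;
		if (page->age > 10)	/* assumed clamp, see lead-in */
			page->age = 10;
	} else if (page->age) {
		page->age--;
	}
}
```

For a page at age 2 that is mapped by four young ptes, the per-pte variant ends up at age 14 after a single scan, whereas the referenced-bit variant followed by one LRU pass ends up at age 5, the same as for a page mapped only once.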

regards,

Rik
