Re: [2x PATCH] page map aging & improved kswap logic

* Re: [2x PATCH] page map aging & improved kswap logic
       [not found] <199802270929.KAA28081@boole.fs100.suse.de>
@ 1998-02-27  9:58 ` Rik van Riel
  1998-02-27 19:52   ` Stephen C. Tweedie
  0 siblings, 1 reply; 4+ messages in thread
From: Rik van Riel @ 1998-02-27  9:58 UTC (permalink / raw)
  To: Dr. Werner Fink; +Cc: linux-mm, linux-kernel

On Fri, 27 Feb 1998, Dr. Werner Fink wrote:

> > The kswapd logic is almost completely redone. Basically,
> > kswapd tries (free_pages_high - nr_free_pages) times to
> > free a page, but when memory becomes tighter, the number
> > of tries becomes even higher.
> 
> Is the explicit call of run_task_queue(&tq_disk) really needed?
> Maybe setting of the __GFP_WAIT flag would work in the same manner:
> 
>         gfp_mask = __GFP_IO;
>         if (atomic_read(&nr_async_pages) >= SWAP_CLUSTER_MAX)
>                 gfp_mask |= __GFP_WAIT;

Wouldn't that just mean that the pages that are
swapped out from now on will be done synchronously?

What I wanted kswapd to do, was to select SWAP_CLUSTER_MAX
pages and swap them out in _one_ I/O operation. Because
this should save head movement, it might give us an improvement
over syncing each swapped page separately.
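
A minimal userspace sketch of the batching idea, not the kernel code:
pages are queued and a single flush is issued once a full cluster has
accumulated. The struct, the function names and the value of CLUSTER_MAX
are assumptions made for illustration; the flush is modelled as a counter
rather than a real run_task_queue(&tq_disk) call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; none of these names are kernel APIs. */
#define CLUSTER_MAX 32   /* stand-in for SWAP_CLUSTER_MAX, value assumed */

struct swap_batch {
    unsigned long pending[CLUSTER_MAX]; /* page numbers queued for write */
    size_t count;                       /* how many are currently queued */
    unsigned long flushes;              /* number of I/O submissions so far */
};

/* Submit everything queued as one I/O operation (modelled as a counter). */
static void batch_flush(struct swap_batch *b)
{
    if (b->count == 0)
        return;
    b->flushes++;
    b->count = 0;
}

/* Queue one page; flush only when a full cluster has accumulated. */
static void batch_add(struct swap_batch *b, unsigned long page)
{
    assert(b->count < CLUSTER_MAX);
    b->pending[b->count++] = page;
    if (b->count >= CLUSTER_MAX)
        batch_flush(b);
}
```

With this model, queueing 64 pages results in two submissions instead of
64, which is the saving the paragraph above is after.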

Rik.
+-----------------------------+------------------------------+
| For Linux mm-patches, go to | "I'm busy managing memory.." |
| my homepage (via LinuxHQ).  | H.H.vanRiel@fys.ruu.nl       |
| ...submissions welcome...   | http://www.fys.ruu.nl/~riel/ |
+-----------------------------+------------------------------+

* Re: [2x PATCH] page map aging & improved kswap logic
  1998-02-27  9:58 ` [2x PATCH] page map aging & improved kswap logic Rik van Riel
@ 1998-02-27 19:52   ` Stephen C. Tweedie
  1998-02-27 22:28     ` Benjamin C.R. LaHaise
  0 siblings, 1 reply; 4+ messages in thread
From: Stephen C. Tweedie @ 1998-02-27 19:52 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Dr. Werner Fink, linux-mm, linux-kernel

Hi,

On Fri, 27 Feb 1998 10:58:34 +0100 (MET), Rik van Riel
<H.H.vanRiel@fys.ruu.nl> said:

> What I wanted kswapd to do, was to select SWAP_CLUSTER_MAX pages and
> swap them out in _one_ I/O operation. Because this should save head
> movement, it might give us an improvement over syncing each swapped
> page separately.

I'm working towards it, and yes, this is a very important thing to have.
It's more than just head movement --- disk requests, especially on SCSI,
simply go much faster if you can amalgamate a number of physically
adjacent IO requests into a single operation (scatter-gather allows you
to do this even if the memory for the data is not physically
contiguous).  
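
As a rough illustration of what amalgamating adjacent requests means,
here is a userspace sketch that coalesces sector runs which touch each
other. The io_req struct and merge_adjacent() are hypothetical names, not
the block layer's interface, and the input is assumed to be sorted by
starting sector already.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request descriptor: a run of sectors on disk. */
struct io_req {
    unsigned long sector;   /* first sector of the run */
    unsigned long nsect;    /* number of sectors in the run */
};

/*
 * Coalesce physically adjacent requests in place. The array must already
 * be sorted by starting sector. Returns the number of requests left.
 */
static size_t merge_adjacent(struct io_req *reqs, size_t n)
{
    size_t out = 0;

    assert(reqs != NULL || n == 0);
    if (n == 0)
        return 0;

    for (size_t i = 1; i < n; i++) {
        struct io_req *last = &reqs[out];

        if (last->sector + last->nsect == reqs[i].sector)
            last->nsect += reqs[i].nsect;   /* adjacent: extend the run */
        else
            reqs[++out] = reqs[i];          /* gap: start a new request */
    }
    return out + 1;
}
```

Five 8-sector requests at sectors 8, 16, 24, 100 and 108 collapse into
two requests this way.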

The biggest problem is avoiding blocking while we do the work in
try_to_swap_out().  That is a rather tricky piece of code, since it has
to deal with the fact that the process it is swapping can actually be
killed if we sleep for any reason, so it will not necessarily still be
there when we wake up again.  We've really got to do the entire
clustering operation for write within try_to_swap_out() and then start up
the IO for those pages.

However, at least with the new swap cache stuff we can make things
easier, since it is now possible to set up swap cache associations
atomically on all the pages we want to swapout, and then take as much
time as we want performing the actual writes.  All we need to do is make
sure that we lock all the pages for IO without the risk of blocking.
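
The two-phase shape described above could look roughly like this in a
userspace model: a first pass that only uses a non-blocking lock attempt
and skips pages it cannot get, then a second pass that does the slow
writes. All names here (page_model, claim_pages, write_batch) are
hypothetical, and the swap cache association is reduced to a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct page_model {
    bool locked;         /* models the page lock bit */
    bool in_swap_cache;  /* models the swap cache association */
    bool written;        /* set once the write has been done */
};

/* Non-blocking lock attempt: fails instead of sleeping. */
static bool page_trylock(struct page_model *p)
{
    if (p->locked)
        return false;
    p->locked = true;
    return true;
}

/* Phase 1: never sleeps. Returns how many pages were put into batch[]. */
static size_t claim_pages(struct page_model *pages, size_t n,
                          struct page_model **batch)
{
    size_t claimed = 0;

    for (size_t i = 0; i < n; i++) {
        if (!page_trylock(&pages[i]))
            continue;                  /* busy page: skip, do not wait */
        pages[i].in_swap_cache = true;
        batch[claimed++] = &pages[i];
    }
    return claimed;
}

/* Phase 2: may take as long as it likes; every page is already locked. */
static void write_batch(struct page_model **batch, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(batch[i]->locked);
        batch[i]->written = true;
        batch[i]->locked = false;
    }
}
```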

Cheers,
 Stephen.

* Re: [2x PATCH] page map aging & improved kswap logic
  1998-02-27 19:52   ` Stephen C. Tweedie
@ 1998-02-27 22:28     ` Benjamin C.R. LaHaise
  0 siblings, 0 replies; 4+ messages in thread
From: Benjamin C.R. LaHaise @ 1998-02-27 22:28 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Rik van Riel, Dr. Werner Fink, linux-mm, linux-kernel

On Fri, 27 Feb 1998, Stephen C. Tweedie wrote:
...
> The biggest problem is avoiding blocking while we do the work in
> try_to_swap_out().  That is a rather tricky piece of code, since it has
> to deal with the fact that the process it is swapping can actually be
> killed if we sleep for any reason, so it will not necessarily still be
> there when we wake up again.  We've really got to do the entire
> clustering operation for write within try_to_swap_out() and then start up
> the IO for those pages.

The code I'm hoping to complete this weekend should solve this problem
nicely -- vm_ops->swapout is now completely integrated within the swapper
for 'normal' shared/private pages and won't sleep until all ptes that
reference a page have been replaced with the swap entry.  So it's just a
small step to batch up the pages to be written out.

> However, at least with the new swap cache stuff we can make things
> easier, since it is now possible to set up swap cache associations
> atomically on all the pages we want to swapout, and then take as much
> time as we want performing the actual writes.  All we need to do is make
> sure that we lock all the pages for IO without the risk of blocking.

At your suggestion, my work in progress now includes a per private vma
inode, which essentially makes the swap-cache disappear since all pages
are now in the page cache.  There is a concern with this: on swapin, each
pte that pointed to the page on disk has to be replaced with the page's
entry.  Unfortunately this means that the swap entry is now lost!  I'm
tempted to revert back to the old swap_cache_entry, and will have to
unless someone has an ingenious idea about where the swap entry could be
stored.  (The inode, offset pair can't be used for the swap cache as
they're used to find the appropriate pte in the page tables.)

One possibility is to store the swap entries in a structure attached to
the inode - right now affs is using a whopping ~80 longs for its private
inode data.  Or the data could just be stored in swap-cache entries tied
to the inode - actually that might work well as a page would need to be
allocated on swapin of an entry.  Hmmm...

		-ben

* [2x PATCH] page map aging & improved kswap logic
@ 1998-02-26 21:00 Rik van Riel
  0 siblings, 0 replies; 4+ messages in thread
From: Rik van Riel @ 1998-02-26 21:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Stephen C. Tweedie, linux-mm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1451 bytes --]

Hi Linus,

Here are the two patches I sent you earlier today,
this time against 2.1.89-pre2.

The kswapd logic is almost completely redone. Basically,
kswapd tries (free_pages_high - nr_free_pages) times to
free a page, but when memory becomes tighter, the number
of tries becomes even higher.

Since the code is compiling as I write, I don't know if
the aggression factor is right, but we can adjust that
later...

A nice side effect of this is that when memory is being
allocated slowly, kswapd will behave itself much better
than when memory is allocated faster. In the latter case,
kswapd will also become _far_ more aggressive. It's kinda
self-tuning, but just not yet :-)
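
For reference, the try-count logic from the vmscan.c patch in this mail
can be pulled out into a standalone function like the sketch below. The
function name kswapd_tries() and passing the thresholds as parameters are
my assumptions for illustration; in the patch they are kernel globals.

```c
#include <assert.h>

/*
 * Userspace copy of the try-count calculation. kswapd_tries() is a
 * hypothetical name; the arithmetic follows the patch.
 */
static int kswapd_tries(int nr_free_pages, int min_free_pages,
                        int free_pages_low, int free_pages_high)
{
    int tries;

    assert(min_free_pages <= free_pages_low);
    assert(free_pages_low <= free_pages_high);

    tries = free_pages_high - nr_free_pages;
    if (tries < min_free_pages) {
        /* Plenty of memory: still do a minimum amount of work. */
        tries = min_free_pages;
    } else if (nr_free_pages < (free_pages_high + free_pages_low) / 2) {
        tries <<= 1;                    /* below the midpoint: x2 */
        if (nr_free_pages < free_pages_low) {
            tries <<= 1;                /* below low: x4 */
            if (nr_free_pages <= min_free_pages)
                tries <<= 1;            /* at or below min: x8 */
        }
    }
    return tries;
}
```

With made-up thresholds of min = 16, low = 64, high = 128, this gives 16
tries at 120 free pages, 28 at 100, 96 at 80, 312 at 50 and 944 at 10.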

As for the other patch, it simply copies kswapd's page-aging
behaviour for page cache (and swap cache) pages. Buffer
pages, and possibly other ones, are still thrown out as soon
as they're not used any more.
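
A small userspace model of the aging decision for cache pages, following
the filemap.c patch in this mail: a referenced page is touched and kept,
an unreferenced page is aged and only freed once its age reaches zero.
The constants and the *_model names are assumptions for illustration; the
real touch_page() and age_page() come from <linux/swapctl.h> and may use
different values.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative constants only; the kernel's values may differ. */
#define AGE_MAX     20
#define AGE_ADVANCE 3
#define AGE_DECLINE 1

/* Hypothetical stand-in for the relevant fields of a page. */
struct cache_page {
    int age;
    bool referenced;   /* models the PG_referenced bit */
};

/* Model of touch_page(): make the page younger, up to a ceiling. */
static void touch_page_model(struct cache_page *p)
{
    p->age += AGE_ADVANCE;
    if (p->age > AGE_MAX)
        p->age = AGE_MAX;
}

/* Model of age_page(): make the page older, down to zero. */
static void age_page_model(struct cache_page *p)
{
    p->age -= AGE_DECLINE;
    if (p->age < 0)
        p->age = 0;
}

/* One scan of a cache page. Returns true if the page may be freed now. */
static bool scan_cache_page(struct cache_page *p)
{
    assert(p->age >= 0 && p->age <= AGE_MAX);
    if (p->referenced) {
        p->referenced = false;  /* test-and-clear, as in the patch */
        touch_page_model(p);
        return false;
    }
    age_page_model(p);
    return p->age == 0;
}
```

In this model a page that was referenced once survives three more scans
before it becomes freeable on the fourth.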

OK... wait and see if it compiles <wait...wait...wait...wait>:
Yup, it compiles without a hitch (only mprotect.c gives a
warning about an ambiguous else ... gcc-2.8.0).
And since there's no new code in action, I leave the rebooting
to you guys.

Rik.
+-----------------------------+------------------------------+
| For Linux mm-patches, go to | "I'm busy managing memory.." |
| my homepage (via LinuxHQ).  | H.H.vanRiel@fys.ruu.nl       |
| ...submissions welcome...   | http://www.fys.ruu.nl/~riel/ |
+-----------------------------+------------------------------+

[-- Attachment #2: Type: TEXT/PLAIN, Size: 2207 bytes --]

--- vmscan.pre89-2	Thu Feb 26 21:10:33 1998
+++ vmscan.c	Thu Feb 26 21:57:53 1998
@@ -539,7 +539,7 @@
 	init_swap_timer();
 	add_wait_queue(&kswapd_wait, &wait);
 	while (1) {
-		int async;
+		int tries;
 
 		kswapd_awake = 0;
 		flush_signals(current);
@@ -549,32 +549,45 @@
 		kswapd_awake = 1;
 		swapstats.wakeups++;
 		/* Do the background pageout: 
-		 * We now only swap out as many pages as needed.
-		 * When we are truly low on memory, we swap out
-		 * synchronously (WAIT == 1).  -- Rik.
-		 * If we've had too many consecutive failures,
-		 * go back to sleep to let other tasks run.
+		 * When we've got loads of memory, we try
+		 * (free_pages_high - nr_free_pages) times to
+		 * free memory. As memory gets tighter, kswapd
+		 * gets more and more agressive. -- Rik.
 		 */
-		async = 1;
-		for (;;) {
+		tries = free_pages_high - nr_free_pages;
+		if (tries < min_free_pages) {
+			tries = min_free_pages;
+		}
+		else if (nr_free_pages < (free_pages_high + free_pages_low) / 2) {
+			tries <<= 1;
+			if (nr_free_pages < free_pages_low) {
+				tries <<= 1;
+				if (nr_free_pages <= min_free_pages) {
+					tries <<= 1;
+				}
+			}
+		}
+		while (tries--) {
 			int gfp_mask;
 
 			if (free_memory_available())
 				break;
 			gfp_mask = __GFP_IO;
-			if (!async)
-				gfp_mask |= __GFP_WAIT;
-			async = try_to_free_page(gfp_mask);
-			if (!(gfp_mask & __GFP_WAIT) || async)
-				continue;
-
+			try_to_free_page(gfp_mask);
 			/*
-			 * Not good. We failed to free a page even though
-			 * we were synchronous. Complain and give up..
+			 * Syncing large chunks is faster than swapping
+			 * synchronously (less head movement). -- Rik.
 			 */
-			printk("kswapd: failed to free page\n");
-			break;
+			if (atomic_read(&nr_async_pages) >= SWAP_CLUSTER_MAX)
+				run_task_queue(&tq_disk);
+
 		}
+	/*
+	 * Report failure if we couldn't even reach min_free_pages.
+	 */
+	if (nr_free_pages < min_free_pages)
+		printk("kswapd: failed, got %d of %d\n",
+			nr_free_pages, min_free_pages);
 	}
 	/* As if we could ever get here - maybe we want to make this killable */
 	remove_wait_queue(&kswapd_wait, &wait);

[-- Attachment #3: Type: TEXT/PLAIN, Size: 1224 bytes --]

--- linux/mm/filemap.pre89-2	Thu Feb 26 21:10:44 1998
+++ linux/mm/filemap.c	Thu Feb 26 21:19:52 1998
@@ -25,6 +25,7 @@
 #include <linux/smp.h>
 #include <linux/smp_lock.h>
 #include <linux/blkdev.h>
+#include <linux/swapctl.h>
 
 #include <asm/system.h>
 #include <asm/pgtable.h>
@@ -158,12 +159,15 @@
 
 		switch (atomic_read(&page->count)) {
 			case 1:
-				/* If it has been referenced recently, don't free it */
-				if (test_and_clear_bit(PG_referenced, &page->flags))
-					break;
-
 				/* is it a swap-cache or page-cache page? */
 				if (page->inode) {
+					if (test_and_clear_bit(PG_referenced, &page->flags)) {
+						touch_page(page);
+						break;
+					}
+					age_page(page);
+					if (page->age)
+						break;
 					if (PageSwapCache(page)) {
 						delete_from_swap_cache(page);
 						return 1;
@@ -173,6 +177,10 @@
 					__free_page(page);
 					return 1;
 				}
+				/* It's not a cache page, so we don't do aging.
+				 * If it has been referenced recently, don't free it */
+				if (test_and_clear_bit(PG_referenced, &page->flags))
+					break;
 
 				/* is it a buffer cache page? */
 				if ((gfp_mask & __GFP_IO) && bh && try_to_free_buffer(bh, &bh, 6))
