* [patch] improve streaming I/O [bug in shrink_mmap()]
@ 2000-06-12 21:46 Zlatko Calusic
  2000-06-12 22:29 ` Stephen C. Tweedie
  0 siblings, 1 reply; 28+ messages in thread
From: Zlatko Calusic @ 2000-06-12 21:46 UTC
  To: alan; +Cc: Linux MM List, Linux Kernel List

Hi!

This simple one-liner solves a long-standing problem in the Linux VM.
While searching for a discardable page in shrink_mmap() Linux was too
easily failing and subsequently falling back to swapping. The problem
was that shrink_mmap() counted pages from the wrong zone, and in case
of balancing a relatively smaller zone (e.g. DMA zone on a 128MB
computer) "count" would be mistakenly spent dealing with pages from
the wrong zone. The net effect of all this was spurious swapping that
hurt performance greatly.

I tested this patch very thoroughly here and it doesn't reveal any bad
behavior. I think that applying the patch is the first and most
important step towards a faster and better-balanced kernel. Stay tuned
for more improvements.

Benchmarking reveals a nice improvement for streaming I/O
applications:

    -------Sequential Output-------- ---Sequential Input-- --Random--
    -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
 MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU

*** ac-16:

400 17380 74.4 13887 14.9  6203  6.8 14452 46.4 15743 12.3 129.9  1.0
400 15134 65.3 15085 15.9  5872  6.5 13281 40.8 18943 14.4 124.4  1.0

*** ac-16 with patch applied:

400 17426 75.8 17919 18.6  6518  7.7 16294 50.0 21038 16.8 132.0  0.8
400 16915 73.3 17502 17.9  6515  7.2 16499 51.4 21148 15.7 131.0  1.4
               ^^^^^       ^^^^                 ^^^^^
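
The size of the gain in the marked columns (block output, rewrite and
block input) can be worked out by averaging the two runs of each kernel
and comparing the means. The helper below is only a sketch of that
arithmetic; the names mean_kps() and percent_gain() are made up for this
illustration and are not part of the benchmark. Fed with the figures
from the table, it gives roughly 22% for block output
(13887, 15085 -> 17919, 17502), roughly 8% for rewrite
(6203, 5872 -> 6518, 6515) and roughly 22% for block input
(15743, 18943 -> 21038, 21148).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: arithmetic mean of n throughput samples (K/sec). */
double mean_kps(const double *runs, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(runs != NULL && n > 0);
	for (i = 0; i < n; i++)
		sum += runs[i];
	return sum / (double)n;
}

/* Hypothetical helper: relative change from 'before' to 'after', in percent. */
double percent_gain(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

With only two runs per kernel these means are indicative at best; the
two unpatched runs differ from each other by several percent in some
columns.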

Index: 24001.23/mm/filemap.c
--- 24001.23/mm/filemap.c Mon, 12 Jun 2000 21:03:48 +0200 zcalusic (linux/F/b/16_filemap.c 1.6.1.3.2.4.1.1.2.2.2.1.1.21.1.1.3.2.3.1.3.1.2.1 644)
+++ 24001.24/mm/filemap.c Mon, 12 Jun 2000 21:51:53 +0200 zcalusic (linux/F/b/16_filemap.c 1.6.1.3.2.4.1.1.2.2.2.1.1.21.1.1.3.2.3.1.3.1.2.2 644)
@@ -365,8 +365,11 @@
 		 * Page is from a zone we don't care about.
 		 * Don't drop page cache entries in vain.
 		 */
-		if (page->zone->free_pages > page->zone->pages_high)
+		if (page->zone->free_pages > page->zone->pages_high) {
+			/* the page from the wrong zone doesn't count */
+			count++;
 			goto unlock_continue;
+		}
 
 		/* Take the pagecache_lock spinlock held to avoid
 		   other tasks to notice the page while we are looking at its
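
The effect of the hunk above on the scan budget can be modelled in user
space. What follows is a deliberately simplified sketch, not the kernel
code: struct zone_model, struct page_model and scan_lru_model() are
hypothetical names, the LRU list is a plain array, and the model assumes
the scan stops as soon as one page has been freed. The
refund_wrong_zone flag switches between the old behaviour (a page from
an unpressured zone still costs one unit of "count") and the patched
behaviour (the unit is given back).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two zone fields the patch looks at. */
struct zone_model {
	long free_pages;
	long pages_high;
};

/* Hypothetical stand-in for a page on the LRU list. */
struct page_model {
	struct zone_model *zone;
	int freeable;	/* 1 if the page could be dropped from the cache */
};

/*
 * Scan up to n pages with a budget of 'count'.
 * Returns 1 if a freeable page from a zone under pressure was found,
 * 0 if the budget (or the list) ran out first.
 */
int scan_lru_model(const struct page_model *lru, size_t n,
		   int count, int refund_wrong_zone)
{
	size_t i;

	assert(lru != NULL || n == 0);
	for (i = 0; i < n && count > 0; i++) {
		const struct page_model *page = &lru[i];

		count--;
		if (page->zone->free_pages > page->zone->pages_high) {
			/* Zone is not under pressure: skip the page. */
			if (refund_wrong_zone)
				count++;	/* patched: it doesn't count */
			continue;
		}
		if (page->freeable)
			return 1;
	}
	return 0;
}
```

With three pages from an unpressured zone ahead of one freeable page
from a pressured zone and a budget of 2, the model fails without the
refund and succeeds with it, which is the situation the patch
description attributes the spurious swapping to.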

Regards,
-- 
Zlatko
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

* Re: [patch] improve streaming I/O [bug in shrink_mmap()]
@ 2000-06-13  8:10 Roger Larsson
  0 siblings, 0 replies; 28+ messages in thread
From: Roger Larsson @ 2000-06-13  8:10 UTC
  To: linux-mm

> On Mon, 12 Jun 2000, Stephen C. Tweedie wrote:
> > On Mon, Jun 12, 2000 at 11:46:09PM +0200, Zlatko Calusic wrote:
> > > 
> > > This simple one-liner solves a long standing problem in Linux VM.
> > > While searching for a discardable page in shrink_mmap() Linux was too
> > > easily failing and subsequently falling back to swapping. The problem
> > > was that shrink_mmap() counted pages from the wrong zone, and in case
> > > of balancing a relatively smaller zone (e.g. DMA zone on a 128MB
> > > computer) "count" would be mistakenly spent dealing with pages from
> > > the wrong zone. The net effect of all this was spurious swapping that
> > > hurt performance greatly.
> > 
> > Nice --- it might also explain some of the excessive kswap CPU 
> > utilisation we've seen reported now and again.
> 
> Indeed. And to be honest, the patch can be made even simpler.
> 
> We can simply move the test up to above the count--, so we won't
> start IO for the "wrong" zones either.
> 
> There's only one serious bug left with the current shrink_mmap,
> a bug which appears to be easy to trigger with this patch, but
> still there without it.
> 
> Consider the case where only one zone has free_pages < pages_high,
> but all the pages in the LRU queue are from the other zone or not
> freeable (ie. with pagetable mapping)...
> 
> In those cases shrink_mmap() can loop forever. We probably want to
> add a "maxscan" variable, initialised to nr_lru_pages, which is
> decremented on every iteration through the loop to prevent us from
> triggering this bug.
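
The two suggestions quoted above (testing the zone before "count" is
decremented, and bounding the loop with "maxscan") can be sketched
together in user space. This is a hypothetical model and not the
patch that was posted: scan_lru_bounded(), struct zone_model and struct
page_model are invented names, the LRU list is modelled as an array
that is walked circularly to mimic skipped pages being put back on the
list, and the model assumes the scan stops after the first freed page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct zone_model {
	long free_pages;
	long pages_high;
};

struct page_model {
	struct zone_model *zone;
	int freeable;
};

/*
 * Returns 1 if a page was freed, 0 otherwise. If 'iterations' is not
 * NULL it receives the number of loop iterations performed.
 */
int scan_lru_bounded(const struct page_model *lru, size_t nr_lru_pages,
		     int count, size_t *iterations)
{
	size_t maxscan = nr_lru_pages;	/* hard bound on loop iterations */
	size_t pos = 0;
	size_t steps = 0;
	int freed = 0;

	assert(lru != NULL || nr_lru_pages == 0);
	while (count > 0 && maxscan > 0) {
		const struct page_model *page = &lru[pos];

		pos = (pos + 1) % nr_lru_pages;	/* page rotates back */
		maxscan--;
		steps++;

		/* Zone test first: wrong-zone pages never touch count. */
		if (page->zone->free_pages > page->zone->pages_high)
			continue;

		count--;
		if (page->freeable) {
			freed = 1;
			break;
		}
	}
	if (iterations != NULL)
		*iterations = steps;
	return freed;
}
```

Without the maxscan test, a list made up only of pages from unpressured
zones would never decrement "count" in this model and the loop would not
terminate; with it, the scan gives up after one pass over the list.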


And I have already released such a patch.
See "reduce swap due to shrink_mmap failures".

But we should probably clean pages (i.e. start I/O) even in zones
with no pressure, as Rajagopal reported.

/RogerL

Thread overview: 28+ messages
2000-06-12 21:46 [patch] improve streaming I/O [bug in shrink_mmap()] Zlatko Calusic
2000-06-12 22:29 ` Stephen C. Tweedie
2000-06-12 23:04   ` Rik van Riel
2000-06-13 15:08   ` Andrea Arcangeli
2000-06-13 17:08     ` Juan J. Quintela
2000-06-13 19:09       ` Andrea Arcangeli
2000-06-13 19:32         ` Rik van Riel
2000-06-13 23:07           ` Andrea Arcangeli
2000-06-13 23:34             ` Rik van Riel
2000-06-14  0:12               ` Andrea Arcangeli
2000-06-14  0:58                 ` Rik van Riel
2000-06-14  1:18                   ` Andrea Arcangeli
2000-06-14  1:33                     ` Rik van Riel
2000-06-14  2:10                       ` Andrea Arcangeli
2000-06-14  2:46                         ` Rik van Riel
2000-06-14 13:01                           ` Andrea Arcangeli
2000-06-14 13:44                             ` Rik van Riel
2000-06-14 13:57                               ` Andrea Arcangeli
2000-06-14 16:48                                 ` Rik van Riel
2000-06-14 17:14                                   ` Andrea Arcangeli
2000-06-14 17:33                                     ` Rik van Riel
2000-06-14 18:37                                       ` Andrea Arcangeli
2000-06-13 23:41             ` Juan J. Quintela
2000-06-14  0:21               ` Andrea Arcangeli
2000-06-13 19:20     ` Rik van Riel
2000-06-13 21:49       ` Andrea Arcangeli
2000-06-13  8:10 Roger Larsson
     [not found] <8i3qe8$lltbv$1@fido.engr.sgi.com>
2000-06-14  6:17 ` Rajagopal Ananthanarayanan
