Re: Linux-2.1.129.. - Rik van Riel

From: Rik van Riel <H.H.vanRiel@phys.uu.nl>
To: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Linus Torvalds <torvalds@transmeta.com>,
	"Dr. Werner Fink" <werner@suse.de>,
	Kernel Mailing List <linux-kernel@vger.rutgers.edu>,
	linux-mm <linux-mm@kvack.org>
Subject: Re: Linux-2.1.129..
Date: Mon, 23 Nov 1998 21:12:20 +0100 (CET)
Message-ID: <Pine.LNX.3.96.981123204943.417I-100000@mirkwood.dummy.home>
In-Reply-To: <199811231713.RAA17361@dax.scot.redhat.com>

On Mon, 23 Nov 1998, Stephen C. Tweedie wrote:

> So, I have still seen no cases where overall performance with no
> page cache aging was better than performance with it.  However, with
> the swap aging removed as well, we seem to have a page/swap balance
> which doesn't work well on 64MB.  To be honest, I just haven't spent
> much time playing with swap page aging since the early kswap work,
> and that was all done before the page cache was added. 

Which way does the balance go? Too much cache/buffer memory
can be 'fixed' by adjusting the settings in /proc/sys/vm/*
(yes, I know it goes against your principles, but some folks
need special behaviour for special-purpose systems anyway).
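
As a rough illustration of that kind of tuning, here is a minimal
sketch of reading and writing a sysctl-style text file from C. The
helper names read_tunable() and write_tunable() are made up for this
example, and no particular file under /proc/sys/vm is assumed, since
the available entries and their formats differ between kernel versions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the first line of a sysctl-style file
 * into buf, without the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or is empty. */
int read_tunable(const char *path, char *buf, size_t len)
{
    FILE *f;

    assert(path != NULL && buf != NULL);
    if (len == 0)
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Hypothetical helper: overwrite a sysctl-style file with a new value.
 * Writing below /proc/sys normally requires root.
 * Returns 0 on success, -1 on any error. */
int write_tunable(const char *path, const char *value)
{
    FILE *f;
    int ok;

    assert(path != NULL && value != NULL);
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fputs(value, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A caller would pass the path of one entry below /proc/sys/vm, read the
current value, and write back an adjusted one in the same format.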

> On Thu, 19 Nov 1998 22:58:30 +0100 (CET), Rik van Riel
> <H.H.vanRiel@phys.uu.nl> said:
> 
> > It was certainly a huge win when page aging was implemented, but we
> > mainly felt that because there used to be an obscure bug in vmscan.c,
> > causing the kernel to always start scanning at the start of the
> > process' address space.
> 
> Rik, you keep asserting this but I have never understood it.  I have
> asked you several times for a precise description of what benchmarks
> improved when page cache aging was added,

I mean the addition of page aging in kernel version 1.2.x.

Back then there certainly was a big improvement vs 1.1.x,
but unfortunately I was not really into kernel hacking
at the time (I didn't even have a Net connection), so I
might have misunderstood things...

> And the "obscure bug" you describe was never there: I've said to you
> more than once that you were misreading the source, and that the
> field you pointed to which was being reset to zero at the start of
> the swapout loop was *guaranteed* to be overwritten with the last
> address scanned before we exited that loop. 

Nevertheless, I observed a much more stable and less
thrash-prone system with my small patch included.

> swap_out_pmd(), there is a line
> 
> 		tsk->swap_address = address + PAGE_SIZE;

Hmm, this means that it should work as you say. The
system seemed to be much more thrash-prone, however...(?)
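
To make the point about the saved address concrete, here is a toy
model of a scanner that resets its resume pointer on entry and then
overwrites it for every page it examines. This is only a sketch under
simplifying assumptions (a single flat address range, a fixed page
size); struct toy_task, toy_scan() and TOY_PAGE_SIZE are invented for
the example and are not the vmscan.c code. Only the field name
swap_address comes from the quoted line.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096UL   /* illustrative page size */

/* Toy stand-in for the per-task scan state. */
struct toy_task {
    unsigned long swap_address;  /* where the next scan resumes */
    unsigned long vm_end;        /* end of the single toy address range */
};

/* Examine up to `budget` pages, resuming at the saved address.
 * Returns the number of pages examined. */
int toy_scan(struct toy_task *tsk, int budget)
{
    unsigned long address = tsk->swap_address;
    int examined = 0;

    assert(tsk != 0 && budget >= 0);
    tsk->swap_address = 0;   /* reset on entry ... */
    while (examined < budget && address < tsk->vm_end) {
        /* ... but overwritten as soon as one page is examined */
        tsk->swap_address = address + TOY_PAGE_SIZE;
        address += TOY_PAGE_SIZE;
        examined++;
    }
    return examined;
}
```

In this model the zero only survives when no page was examined at all,
i.e. when the previous scan had already reached the end of the range,
so successive calls walk the whole range before wrapping around.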

> > This gives the process a chance of reclaiming the page without
> > incurring any I/O and it gives the kernel the possibility of keeping a
> > lot of easily-freeable pages around.
> 
> That would be true if we didn't do the free_page_and_swap_cache trick.
> However, doing that would require two passes: once by the swapper, and
> once by shrink_mmap(): before actually freeing a page.  This actually
> sounds like a *very* good idea to explore, since it means that vmscan.c
> will be concerned exclusively with returning mapped and anonymous pages
> to the page cache.

It is also what *BSD and OSF/1 seem to do. They have been
tuning and balancing this scheme for the last 15 years, so
it should be rather well tuned...

> > Maybe we even want to keep a 3:1 ratio or something like that for
> > mapped:swap_cached pages and a semi- FIFO reclamation of swap cached
> > pages so we can simulate a bit of (very cheap) page aging.
> 
> I will just restate my profound conviction that any VM balancing which
> works by imposing precalculated limits on resources is fundamentally
> wrong.

The reason for a ratio like this is to ensure that:
- there are enough pages that can be freed at any time
  without us needing to scan the page tables; this also
  serves as a 'buffer' for high-pressure moments
- pages will spend enough time in 'unmapped' mode to have
  some serious aging imposed on them; not doing this might
  cancel out the effect we want (multi-queue semantics)
- pages that are used semi-often will have some soft faults,
  always-used pages won't; keeping the soft-fault stats will
  enable us to make better pageout decisions cheaply
- when a page soft-faults (is remapped in from the unmapped
  state) we can get below the wanted ratio and push out
  something else; this gives a nice, slow and uniform page
  aging system (especially when we use a second-chance FIFO
  algorithm for reclaiming the page-/swap-cached and buffer
  pages, only breaking the FIFO style when memory is fragmented)
- keeping 25% of memory in the unmapped state allows us to easily
  'fix' memory fragmentation, solving that problem as well --
  without having to give up the fast & cheap memory allocator
  we use now
- the easy-free buffer will allow us to keep less free memory;
  a few higher-order buffers should be all we need, since we can
  free cached pages (shrink_mmap()) immediately
- this in turn might slightly reduce swapping, especially on
  smaller machines
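
The ratio and the second-chance FIFO from the list above can be
sketched roughly as follows. This is a user-space toy, not kernel
code: struct fifo, QCAP, fifo_init(), fifo_push(), fifo_reclaim() and
pages_to_unmap() are all names invented for the illustration, pages
are plain integer ids, and the "referenced" flag stands in for
whatever cheap usage information the real system would keep.

```c
#include <assert.h>
#include <stdbool.h>

#define QCAP 16   /* capacity of the toy queue */

/* Ring buffer of unmapped-but-still-cached pages, oldest at head. */
struct fifo {
    int  page[QCAP];
    bool referenced[QCAP];
    int  head, count;
};

void fifo_init(struct fifo *q)
{
    q->head = 0;
    q->count = 0;
}

/* Append a page at the tail. Returns 0, or -1 if the queue is full. */
int fifo_push(struct fifo *q, int page, bool referenced)
{
    int tail;

    if (q->count == QCAP)
        return -1;
    tail = (q->head + q->count) % QCAP;
    q->page[tail] = page;
    q->referenced[tail] = referenced;
    q->count++;
    return 0;
}

/* Second-chance reclaim: a referenced page loses its flag and goes
 * back to the tail; the first unreferenced page is freed.
 * Returns the freed page id, or -1 if the queue is empty. */
int fifo_reclaim(struct fifo *q)
{
    while (q->count > 0) {
        int  page = q->page[q->head];
        bool ref  = q->referenced[q->head];

        q->head = (q->head + 1) % QCAP;
        q->count--;
        if (!ref)
            return page;
        fifo_push(q, page, false);   /* cannot fail: a slot was just freed */
    }
    return -1;
}

/* How many mapped pages must be unmapped so that at least a quarter
 * of all pages are unmapped (mapped:unmapped of at most 3:1). */
int pages_to_unmap(int mapped, int unmapped)
{
    int target = (mapped + unmapped + 3) / 4;   /* ceil(total / 4) */

    assert(mapped >= 0 && unmapped >= 0);
    return target > unmapped ? target - unmapped : 0;
}
```

With 10 mapped and 2 unmapped pages, pages_to_unmap() asks for one
more page to be unmapped, which restores 9:3. A soft fault would take
a page back out of the queue, and the next call would then push
something else out to make up for it.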

cheers,

Rik -- slowly getting used to dvorak kbd layout...
+-------------------------------------------------------------------+
| Linux memory management tour guide.        H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader.      http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+

