From: Matthew Dillon <dillon@apollo.backplane.com>
To: Rik van Riel <riel@conectiva.com.br>
Subject: Re: [RFC] 2.3/4 VM queues idea
Date: Wed, 24 May 2000 09:16:45 -0700 (PDT)
Message-ID: <200005241616.JAA75488@apollo.backplane.com>

     All right!  I think your spec is coming together nicely!   The
     multi-queue approach is the right way to go (for the same reason
     FBsd took that approach).  The most important aspect of using
     a multi-queue design is to *not* blow off the page weighting tests
     within each queue.  Having N queues alone does not give fine enough
     granularity, but having N queues and locating the lowest-weighted
     pages within a queue (in FreeBSD's case, weight 0) is the magic of
     making it work well.

     I actually tried to blow off the weighting tests in FreeBSD, even just
     a little, but when I did, FreeBSD immediately started to stall as the
     load increased.  Needless to say I threw away that patchset :-).
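
     As a rough illustration of the "lowest-weighted page within a queue"
     idea, here is a minimal sketch.  It is not FreeBSD or Linux code:
     the struct page_desc type, its weight field, and pick_victim() are
     hypothetical names made up for this example, and a real kernel
     would also deal with locking and page state.

```c
#include <stddef.h>

/* Hypothetical page descriptor; the field names are illustrative only. */
struct page_desc {
    int weight;               /* activity weight; 0 means least recently used */
    struct page_desc *next;   /* next page on the same queue */
};

/*
 * Return the lowest-weighted page on one queue, or NULL if the queue is
 * empty.  The scan stops early once a weight-0 page is found, since
 * nothing can rank lower.  Ties go to the page nearest the head.
 */
static struct page_desc *pick_victim(struct page_desc *head)
{
    struct page_desc *best = NULL;

    for (struct page_desc *p = head; p != NULL; p = p->next) {
        if (best == NULL || p->weight < best->weight)
            best = p;
        if (best->weight == 0)
            break;
    }
    return best;
}
```

     The point of the sketch is only that queue membership is a coarse
     filter and the per-page weight is the fine one; dropping the weight
     comparison would reduce this to "take whatever is at the head".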


     I have three comments:

     * On the laundry list.  In FreeBSD 3.x we laundered pages as we went
       through the inactive queue.   In 4.x I changed this to a two-pass
       algorithm (vm_pageout_scan() line 674 vm/vm_pageout.c around the 
       rescan0: label).  It tries to locate clean inactive pages in pass1,
       and if there is still a page shortage (line 927 vm/vm_pageout.c,
       the launder_loop conditional) we go back up and try again, this 
       time laundering pages.

       There is also a heuristic prior to the first loop, around line 650
       ('Figure out what to do with dirty pages...'), where it tries to 
       figure out whether it is worth doing two passes or whether it should
       just start laundering pages immediately.

     * On page aging.   This is going to be the most difficult item for you
       to implement under linux.  In FreeBSD the PV entry mmu tracking 
       structures make it fairly easy to scan *physical* pages then check
       whether they've been used or not by locating all the pte's mapping them,
       via the PV structures.  

       In linux this is harder to do, but I still believe it is the right
       way to do it - that is, have the main page scan loop scan physical 
       pages rather than virtual pages, for reasons I've outlined in previous
       emails (fairness in the weighting calculation).

       (I am *not* advocating a PV tracking structure for linux.  I really 
       hate the PV stuff in FBsd).

     * On write clustering.  In a completely fair aging design, the pages
       you extract for laundering will tend to appear to be 'random'.  
       Flushing them to disk can be expensive due to seeking.

       Two things can be done:  First, you collect a bunch of pages to be
       laundered before issuing the I/O, allowing you to sort the I/O
       (this is what you suggest in your design ideas email).  (p.p.s.
       don't launder more than 64 or so pages at a time; doing so will just
       stall other processes trying to do normal I/O).

       Second, you can locate other pages near the ones you've decided to
       launder and launder them as well, getting the most out of the disk
       seeking you have to do anyway.

       The first item is important.  The second item will help extend the
       life of the system in a heavy-load environment by being able to
       sustain a higher pageout rate.

       In tests with FBsd, the nearby-write-clustering doubled the pageout
       rate capability under high disk load situations.  This is one of the
       main reasons why we do 'the weird two-level page scan' stuff.
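
       The two clustering steps above (batch and sort, then pull in
       neighbours) could be sketched roughly as follows.  This is an
       illustration under assumptions, not FreeBSD's code: struct
       dirty_page, build_launder_batch(), is_nearby() and the window
       parameter are hypothetical, and the only number taken from the
       text is the cap of about 64 pages per batch.

```c
#include <stddef.h>
#include <stdlib.h>

#define LAUNDER_MAX 64   /* batch cap of "64 or so" pages, from the email */

/* Hypothetical record: the disk block that backs one dirty page. */
struct dirty_page {
    unsigned long block;
};

/* qsort comparator: ascending disk block, written to avoid overflow. */
static int cmp_block(const void *a, const void *b)
{
    unsigned long x = ((const struct dirty_page *)a)->block;
    unsigned long y = ((const struct dirty_page *)b)->block;
    return (x > y) - (x < y);
}

/*
 * Copy at most LAUNDER_MAX candidates into batch[] and sort them by disk
 * block so the writes can be issued in seek-friendly order.
 * batch[] must have room for LAUNDER_MAX entries.  Returns the batch size.
 */
static size_t build_launder_batch(const struct dirty_page *cand, size_t n,
                                  struct dirty_page *batch)
{
    size_t count = n < LAUNDER_MAX ? n : LAUNDER_MAX;

    for (size_t i = 0; i < count; i++)
        batch[i] = cand[i];
    qsort(batch, count, sizeof(batch[0]), cmp_block);
    return count;
}

/*
 * Nearby-page test for the second step: is block b within `window`
 * blocks of block a?  The window size is an assumption of this sketch.
 */
static int is_nearby(unsigned long a, unsigned long b, unsigned long window)
{
    unsigned long dist = a > b ? a - b : b - a;
    return dist <= window;
}
```

       A caller would build the batch, then walk it and add any other
       dirty page for which is_nearby() holds before issuing the I/O.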

       (ok to reprint this email too!)
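
       For readers who want the two-pass laundering scheme from the
       first comment in concrete form, here is a minimal sketch.  It is
       not the vm_pageout_scan() code: struct ipage, scan_inactive() and
       the flag names are hypothetical, "laundering" is reduced to
       clearing a flag, and the heuristic that decides whether to skip
       straight to laundering is left out.

```c
#include <stddef.h>

/* Hypothetical inactive-queue entry. */
struct ipage {
    int dirty;   /* nonzero if the page must be written before reuse */
    int freed;   /* nonzero once this sketch has reclaimed the page */
};

/*
 * Reclaim up to `shortage` pages from q[0..n-1].
 * Pass 0 takes only clean pages.  Pass 1 runs only if there is still a
 * shortage, and this time also launders dirty pages.
 * Returns the number of pages reclaimed.
 */
static size_t scan_inactive(struct ipage *q, size_t n, size_t shortage)
{
    size_t reclaimed = 0;

    for (int pass = 0; pass < 2 && reclaimed < shortage; pass++) {
        for (size_t i = 0; i < n && reclaimed < shortage; i++) {
            if (q[i].freed)
                continue;
            if (q[i].dirty) {
                if (pass == 0)
                    continue;      /* first pass: leave dirty pages alone */
                q[i].dirty = 0;    /* stands in for writing the page out */
            }
            q[i].freed = 1;
            reclaimed++;
        }
    }
    return reclaimed;
}
```

       When enough clean pages exist, the second pass never runs and no
       write I/O is issued at all, which is the benefit of splitting the
       scan.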

						-Matt

